
Large language overkill: How SLMs can beat their bigger, resource-intensive cousins



Two years after the public release of ChatGPT, conversations about AI are unavoidable as companies across every industry look to harness large language models (LLMs) to transform their business processes. Yet for all of the LLM’s power and promise, many business and IT leaders have come to rely on it too heavily and to overlook its limitations. That is why I foresee a future in which specialized language models (SLMs) play a larger, complementary role in enterprise IT.

SLMs are commonly referred to as “small language models” because they require less data and training time and are “more streamlined versions of LLMs.” But I prefer the word “specialized,” because it better conveys the ability of these purpose-built models to perform highly specialized tasks with greater precision, consistency, and transparency than LLMs. By complementing an LLM with SLMs, organizations can create solutions that leverage the strengths of each model.

Trust and the “black box” problem of LLMs

As powerful as LLMs are, they are also known to sometimes “lose the plot” and produce output that veers off course, a result of their generalist training on vast datasets. That tendency is made even more problematic by the fact that OpenAI’s ChatGPT and other LLMs are essentially “black boxes”: it is not clear how they arrive at an answer.

That opacity will become an even bigger problem going forward, especially for enterprises and business-critical applications where accuracy, consistency, and compliance are paramount. Think of healthcare, financial services, and the law: professions where a wrong answer can have serious financial or even life-or-death consequences. Regulators are already taking notice and will likely start demanding explainable AI solutions, especially in industries that depend on data privacy and accuracy.

Companies often put human-in-the-loop safeguards in place to mitigate these issues, but over-reliance on LLMs can create a false sense of security. Over time, complacency sets in and mistakes slip through unnoticed.

SLMs = improved explainability

Fortunately, SLMs are well suited to addressing many of the limitations of LLMs. Rather than being designed for general-purpose tasks, SLMs are developed with a narrower focus and trained on domain-specific data. This specificity lets them handle nuanced linguistic requirements in areas where precision is paramount. Instead of relying on vast, heterogeneous datasets, an SLM is trained on targeted information, giving it the contextual intelligence to deliver more consistent, predictable, and relevant responses.

This has several advantages. First, it is easier to explain and understand the sources and rationale behind an SLM’s output, which is critical in regulated industries where decisions must be traceable back to a source.

Second, their smaller size means SLMs can often run faster than LLMs, an important factor for real-time applications. Third, SLMs give businesses more control over data privacy and security, especially when they are deployed in-house or built specifically for the enterprise.

Additionally, while an SLM may require specialized training up front, it reduces the risks that come with relying on a third-party LLM managed by an external provider. That control is invaluable in applications with strict data-handling and compliance requirements.

Focus on developing expertise (be wary of vendors who overpromise)

To be clear, LLMs and SLMs are not mutually exclusive. In fact, SLMs can augment LLMs, creating hybrid solutions in which the LLM provides broader context and the SLM ensures precise execution. It is still early days for LLMs, so I always advise technology leaders to keep exploring their many possibilities and benefits.
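To make the hybrid idea concrete, here is a minimal sketch of one way to route requests between the two models. It assumes a domain-specific SLM served locally through the Hugging Face transformers library and a general-purpose LLM behind an external API; the model name, the keyword heuristic and the call_general_llm() stub are hypothetical placeholders rather than a reference implementation.

    # Hypothetical router: narrow, high-stakes queries go to a fine-tuned SLM,
    # everything else falls back to a general-purpose LLM.
    from transformers import pipeline

    # Hypothetical in-house SLM fine-tuned to classify contract clauses.
    slm = pipeline("text-classification", model="your-org/contract-clause-slm")

    DOMAIN_KEYWORDS = {"indemnification", "liability", "termination", "warranty"}

    def call_general_llm(text: str) -> str:
        # Placeholder for a call to whichever LLM provider you use.
        return f"[LLM] general-purpose answer for: {text!r}"

    def route_query(text: str) -> str:
        """Send specialized queries to the SLM; everything else to the LLM."""
        if any(keyword in text.lower() for keyword in DOMAIN_KEYWORDS):
            result = slm(text)[0]  # predictable, auditable output
            return f"[SLM] {result['label']} (confidence {result['score']:.2f})"
        return call_general_llm(text)

    print(route_query("Does this clause limit our liability for data loss?"))

In practice the routing step could itself be a small classifier rather than a keyword list, but the division of labor is the point: the SLM handles the narrow, auditable decision while the LLM covers everything else.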

Also bear in mind that while LLMs scale well across a wide variety of problems, an SLM may not transfer well beyond its intended use case, so it is important to understand up front which use cases you are addressing.

It is also important that business and IT leaders devote more time and attention to building the skills needed to train, fine-tune, and test SLMs. Fortunately, a wealth of free information and training is available through popular sources such as Coursera, YouTube, and Huggingface.co. As the battle for AI expertise intensifies, leaders must make sure their developers have enough time to learn and experiment with SLMs.
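For teams starting that experimentation, here is a minimal fine-tuning sketch built on the Hugging Face transformers and datasets libraries. The base model (distilbert-base-uncased) is just one commonly used small encoder, and the CSV files of labeled support tickets are hypothetical stand-ins for your own domain data; treat this as a sketch of the workflow, not a production recipe.

    # Minimal sketch: fine-tune a small encoder model on domain-specific,
    # labeled text (hypothetical CSVs with "text" and "label" columns).
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    model_name = "distilbert-base-uncased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=4)

    # Hypothetical in-house dataset of labeled support tickets.
    data = load_dataset("csv", data_files={"train": "tickets_train.csv",
                                           "test": "tickets_test.csv"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, padding="max_length")

    data = data.map(tokenize, batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="slm-tickets",
                               num_train_epochs=3,
                               per_device_train_batch_size=16),
        train_dataset=data["train"],
        eval_dataset=data["test"],
    )

    trainer.train()            # fine-tune the SLM on the targeted data
    print(trainer.evaluate())  # held-out metrics: the "test often" part

The particular lines matter less than the habits they encourage: small, well-labeled data, a held-out test set and metrics you can explain to an auditor or regulator.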

I also advise leaders to vet their partners carefully. I recently spoke with a company that asked for my opinion on claims made by a technology provider. In my view, the provider was either overstating its claims or simply did not understand how the technology worked.

The company wisely took a step back and ran a controlled proof of concept to test the vendor’s claims. As I suspected, the solution was not ready for prime time, and the company was able to walk away having invested relatively little time and money.

Whether a company starts with a proof of concept or a live implementation, I recommend starting small, testing often and building on early successes. I have personally seen a model behave well on a small set of instructions and inputs, only to watch the results veer off course as more information was fed in. That is why slow and steady is the wise approach.

In summary, LLMs continue to offer ever more valuable capabilities, but their limitations are becoming increasingly apparent as companies expand their reliance on AI. Complementing them with SLMs offers a path forward, especially in high-stakes areas that demand accuracy and explainability. By investing in SLMs, companies can future-proof their AI strategies and ensure that their tools not only drive innovation but also meet demands for trust, reliability, and control.

AJ Sunder is the co-founder, CIO, and CPO of Responsive.


