The role of large language models in the enterprise

ChatGPT has captured the public’s imagination, and other products built on large language models, like Bing and Bard, are hot on its heels. Dataiku’s Kurt Muehmel looks at how enterprises can seize the opportunities offered by LLMs.

April 27, 2023

ChatGPT, Bing, Bard. From relative obscurity, these platforms have launched themselves into the public eye in the last few months. Each is a specific product developed by a different company, but they are all built on the same class of technology: Large Language Models (LLMs).

What is an LLM? And what makes it “large”? An LLM is a neural network built around a specific component called a “transformer.” The transformer gives the LLM the ability to identify how words relate to each other in their context and to generate new answers to questions, rather than “looking up” responses. A vast number of “neurons” make up this neural network, and the weights on the connections between these neurons are called “parameters.” These parameters define the strength of the signal passed between neurons.
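To make the idea of a parameter concrete, here is a minimal sketch that counts them for a toy, fully connected network. It is a simplification: a real transformer is organised differently, and the function name and layer sizes are illustrative assumptions. The sketch also counts one bias value per receiving neuron, which is conventionally included in a parameter total.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of neurons in each layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection between the two layers
        total += n_out         # one bias per neuron in the receiving layer
    return total


# Toy network (illustrative sizes): 4 inputs, 8 hidden neurons, 2 outputs.
toy_parameters = count_parameters([4, 8, 2])
print(toy_parameters)  # 58
```

Even this tiny network has 58 parameters. The total grows with the product of adjacent layer sizes, which is how large models reach the billions.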

To put the size of LLMs into perspective, one of the models behind ChatGPT has 175 billion parameters. Clearly, these models can be incredibly large, and can therefore offer impressive performance capabilities for the enterprise. The flip side is that they can also become very complex and costly.
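A rough calculation shows why size translates into cost. The sketch below estimates the memory needed just to hold a model's parameters, assuming each is stored at a common numeric precision. The byte sizes are standard for those formats, but the function name is illustrative, and a real deployment needs additional memory beyond the weights.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAMETER = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_parameters, dtype="float16"):
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_parameters * BYTES_PER_PARAMETER[dtype] / 1e9


PARAMETERS = 175_000_000_000  # the 175 billion figure quoted above

for dtype in BYTES_PER_PARAMETER:
    print(f"{dtype}: {weight_memory_gb(PARAMETERS, dtype):.0f} GB")
```

At 16-bit precision, the weights alone come to about 350 GB, far more than a single typical GPU can hold.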

Therefore, considering the size and capability of an LLM is key to deciding how best to use it. So, what are the options?

Using LLMs in the Enterprise

There are two ways you can utilise an LLM in the enterprise beyond the simple web interface.

1 – You can make an API call to a model provided as-a-service

These services are generally offered by providers such as OpenAI, Amazon Web Services and Microsoft Azure, which provide public APIs that you can connect to your software. This approach has several advantages, including:

– Low barrier to entry – calling an API is a straightforward task that a junior developer can do in a matter of minutes

– Higher sophistication – you can leverage some of the largest and most sophisticated models available, providing more accurate responses on a wide range of topics

– Speed – generally these models provide quick responses, so you can use them in real-time
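As a rough illustration of how low that barrier is, the sketch below packages a prompt as an authenticated HTTP request using only Python's standard library. The endpoint URL, API key and payload fields are hypothetical placeholders, not any particular provider's API. Each provider documents its own schema.

```python
import json
import urllib.request

# Hypothetical values: substitute the endpoint, key and payload schema
# documented by your chosen provider.
API_URL = "https://llm.example.com/v1/generate"
API_KEY = "replace-with-your-key"


def build_request(prompt, max_tokens=100):
    """Package a prompt as an authenticated JSON POST request."""
    payload = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def ask(prompt):
    """Send the request and return the provider's decoded JSON response."""
    with urllib.request.urlopen(build_request(prompt), timeout=30) as response:
        return json.load(response)


# Building the request does not touch the network; calling ask() would.
request = build_request("Summarise this contract clause in one sentence.")
print(request.get_method(), request.full_url)
```

In practice, most providers also publish client libraries that wrap these steps.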

But while convenient and powerful, these public models are not suited to certain enterprise applications. Because they are public, data from a query may be retained and used for further development of the model. So, enterprises need to check whether the service’s architecture respects their data residency and privacy obligations for each use case.

Additionally, costs can mount, as most public APIs charge according to the number of queries and the length of the text processed. You can usually get cost estimates and use smaller, cheaper models for narrower tasks. Finally, though it is rare, the provider of an API can choose to stop the service at any moment; it’s risky to depend on a pipeline whose flow you don’t control.
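A back-of-the-envelope estimate can show how quickly usage-based fees add up. The sketch below assumes text length is billed in "tokens" (word fragments), as is common for these services. The workload and prices are hypothetical figures for illustration only.

```python
def estimate_monthly_cost(queries_per_day, tokens_per_query, price_per_1k_tokens, days=30):
    """Estimate a monthly bill when fees scale with query volume and text length."""
    total_tokens = queries_per_day * tokens_per_query * days
    return total_tokens / 1000 * price_per_1k_tokens


# Hypothetical workload and prices, not taken from any provider's price list.
workload = {"queries_per_day": 10_000, "tokens_per_query": 1_500}
large_model = estimate_monthly_cost(**workload, price_per_1k_tokens=0.03)
small_model = estimate_monthly_cost(**workload, price_per_1k_tokens=0.002)

print(f"Larger model:  ${large_model:,.2f} per month")
print(f"Smaller model: ${small_model:,.2f} per month")
```

Running the same workload through both price points makes the case for smaller models on narrow tasks easy to quantify.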

2 – You can download and run an open-source model in an environment you manage

Given some of the limitations of harnessing a public model via an API, companies may be better off hosting and running an open-source model themselves.

There’s a whole range of open-source models available, each with strengths and weaknesses that make it more or less suited to a company’s needs. A smaller model, while limited in application, can often deliver the desired performance on a specific use case at far lower cost than a very large model. Moreover, by running and maintaining open-source models themselves, organisations are not dependent on a third-party API service.

But this approach might not be every organisation’s cup of tea. First, the process can be complex: setting up and maintaining your own LLM requires data science and engineering expertise beyond that needed for simpler models. Companies need to assess honestly whether they have sufficient expertise and time to run and maintain such a model in the long term.

Second, there is the issue of narrower performance. Open-source community models are generally smaller and more focused in their application, whereas the huge models on offer via public APIs can cover an astonishing breadth and variety of topics.

Choosing an Approach

Taking into account the tradeoffs of each approach, you might ask: does one outweigh the other? In simple terms, no. In fact, there is no one-size-fits-all approach that could work enterprise-wide. Even within a single company, the choice of model and architecture should be made use case by use case.

Both options allow you to choose from smaller and larger models, with tradeoffs in terms of the breadth of their potential applications, the sophistication of the language generated, and the cost and complexity of using the model. For many enterprises, both methods may suit different use cases at different times, depending on fluctuating budgets, capacity, and resources.
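One way to keep that choice open is to put a thin routing layer between applications and models, so each use case can be pointed at a different backend without changing the calling code. The sketch below is a hypothetical design, not a prescribed architecture. The class names, use-case labels and stub functions are all illustrative stand-ins for real API clients or self-hosted models.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ModelRoute:
    name: str
    hosting: str  # "api" or "self-hosted"
    generate: Callable[[str], str]


# Stubs standing in for a real API client and a real self-hosted model.
def hosted_large_model(prompt: str) -> str:
    return f"[hosted large model] {prompt}"


def local_small_model(prompt: str) -> str:
    return f"[self-hosted small model] {prompt}"


class ModelRouter:
    """Map each use case to a model, falling back to a default."""

    def __init__(self, default: ModelRoute) -> None:
        self._default = default
        self._routes: Dict[str, ModelRoute] = {}

    def register(self, use_case: str, route: ModelRoute) -> None:
        self._routes[use_case] = route

    def run(self, use_case: str, prompt: str) -> str:
        route = self._routes.get(use_case, self._default)
        return route.generate(prompt)


router = ModelRouter(default=ModelRoute("general", "api", hosted_large_model))
router.register("ticket-triage", ModelRoute("triage", "self-hosted", local_small_model))

print(router.run("ticket-triage", "Classify: printer offline"))
print(router.run("market-research", "Summarise trends in retail banking"))
```

Moving a use case to a different model then means changing one registration rather than rewriting the application.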

The companies that will have the greatest success using LLMs are those that can take an agile approach that allows them to opt for the right model for any given application. Innovation with LLMs is rapidly advancing. Companies that can be flexible and adapt to these changes will reap the greatest benefits from them.

ABOUT THE AUTHOR

Kurt Muehmel, everyday AI strategic director, Dataiku