Open source AI is booming, but OpenAI’s GPT-4 is still the big winner with corporate customers—for now

A year ago, a San Francisco gathering many called the “Woodstock of AI” brought together 5,000 devotees of “open source” AI models—that is, models whose underlying code, and sometimes model weights and training methods, are publicly available for researchers and developers to build on.

The event, hosted by open source AI hub Hugging Face and featuring live llamas (a nod to Meta’s Llama model), kicked off an open source AI boom that hasn’t let up since. The landscape now includes unicorn startups such as Mistral and Together AI, and boasts a constant barrage of new open source AI models that are getting ever closer to beating OpenAI’s flagship GPT-4 on various performance benchmarks. In just the past couple of weeks, there have been open source LLM releases from top companies including Databricks, Cerebras, AI21, and Cohere.

However, a recent survey by venture capital firm a16z found that for large companies adopting generative AI, OpenAI’s closed, proprietary models remain the most popular by far—particularly for use cases actually put into production. But it also showed signs of change: Six months ago, most organizations were experimenting with just one model—usually from OpenAI—and stuck to common use cases in areas like marketing, coding, and customer support. In 2024, they are opening up to experimenting with a wider range of AI models, many of them open source.

More organizations are experimenting with open source models

Sarah Wang, an a16z general partner who co-authored the survey, said OpenAI's biggest advantage up until now has been that of the first mover. It has also been tough to knock off its top perch, she explained, because for most of the past year GPT-4 has been both widely considered the best model available and easy to access, whether directly through an API or via Microsoft Azure.

“I think it was the easiest to plug and play and to say, this model is the best, let's just see what use cases come out of it,” she said. The survey estimated the 2023 market share of closed source models at 80%–90%, with the majority going to OpenAI. It did not provide an updated market share for this year, but 46% of respondents said they prefer or strongly prefer open source models.

“Every single enterprise said they were testing more than one model family,” Wang said, pointing out that two of the top six model families by usage were open source: Llama and Mistral. “So certainly it's still early, but I think it is maybe a leading indicator for usage down the road,” she said.

OpenAI has touted ‘tremendous growth’ in its enterprise offering

Perhaps with its eye in the rear-view mirror and its foot on the gas, OpenAI, led by CEO Sam Altman, appears to be working hard to solidify its lead with corporate customers. In a blog post last week, it announced new features for its “self-serve fine-tuning API”—which lets customers adapt OpenAI’s models to their own data—and shared case studies of companies like SK Telecom that have customized and fine-tuned OpenAI models. The post also trumpeted an expanded “assisted fine-tuning offering” for companies to “collaborate with OpenAI technical teams to leverage techniques beyond the fine-tuning API.”
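
For context on what “self-serve” means here, the sketch below shows the general shape of submitting a fine-tuning job through OpenAI’s Python SDK: upload a file of training examples, then create a job against a fine-tunable base model. The file name and base model are illustrative assumptions, not details from OpenAI’s announcement.

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Upload a JSONL file of example conversations to fine-tune on
    # (file name is a placeholder for illustration)
    training_file = client.files.create(
        file=open("support_examples.jsonl", "rb"),
        purpose="fine-tune",
    )

    # Start a fine-tuning job against a fine-tunable base model
    job = client.fine_tuning.jobs.create(
        training_file=training_file.id,
        model="gpt-3.5-turbo",
    )

    print(job.id, job.status)

Once the job finishes, the resulting model can be called through the regular chat API under its own model name, which is the sense in which the offering is “self-serve.”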

In addition, OpenAI COO Brad Lightcap touted “tremendous growth” in the corporate version of ChatGPT in a Bloomberg interview on Friday, claiming there are now more than 600,000 people signed up to use ChatGPT Enterprise, up from around 150,000 in January, and calling 2024 “the year of adoption for AI in the enterprise.”

Still, according to Kjell Carlsson, head of AI strategy at Domino Data Lab and a former Forrester analyst, the market for generative AI models is fracturing based on enterprise use cases.

“OpenAI has a dominant position thanks to its head start but, even more so, its relationship with Microsoft and their powerful sales teams,” he said, referring to the partnership under which Microsoft offers OpenAI’s models through its cloud arm, Azure. But he added that companies are primarily using OpenAI’s models for generic use cases like ad-hoc user queries or customer service chatbots. When companies are looking to create differentiated generative AI applications—like a biotech business doing AI-powered drug discovery—and want to protect their data due to regulatory or security concerns, they often turn to other vendors and open source models.

“I have yet to speak with a company that says they use OpenAI’s models because of any inherent technological advantages they have today,” he said.

Cost, control, and customization

While cost has often been cited as a reason to turn to open source—Meta's Llama 2, for example, has been shown to be 10 to 20 times cheaper than OpenAI's GPT-4 per 1 million tokens generated—Wang said that respondents widely cited other reasons for choosing AI models. Those reasons include control (keeping proprietary data secure and understanding why models produce certain outputs) and customization (the ability to effectively fine-tune a model for a given use case).

"The fact that you can self-host a model fine-tuned on your own data with an open source model was very attractive to a lot of the enterprises," she said.

Ali Ghodsi, CEO of data and AI platform Databricks (which recently released a powerful new open source large language model called DBRX), agreed with that assessment, calling the wholesale move to open source AI in 2024 an "underreported trend." He added that enterprises want to customize AI models on their specific data and tasks and, as a result, own the intellectual property.

“I think that's going to continue regardless if there are really smart models that come out by the proprietary vendors,” he said. “Enterprises want to be competitive in their market. They want to have their own recipes.”
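
As Wang and Ghodsi describe it, the appeal is running a model tuned on your own data entirely inside your own infrastructure. Below is a minimal sketch of what self-hosting looks like with the open source Hugging Face Transformers library, assuming a Llama-style checkpoint that has already been fine-tuned and saved locally (the directory name and prompt are hypothetical):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load a locally saved, fine-tuned checkpoint; nothing leaves your servers
    model_dir = "./my-finetuned-llama"  # hypothetical local path
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)

    # Run inference against the self-hosted model
    inputs = tokenizer("Summarize this quarter's support tickets:", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Because both the weights and the serving stack stay in-house, a company keeps control of its data and, as Ghodsi notes, the resulting intellectual property.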

The potential impact of GPT-5

Of course, none of the current predictions about corporate adoption of generative AI takes into account the release of OpenAI's next large language model, the highly anticipated GPT-5, which Wang says will come very soon. However, she pointed out that in her conversations with enterprise companies, the cost of switching models is very low—so it's likely that organizations will continue experimenting with a mix of closed and open source models.

"They can swap out models pretty easily in the back end," she said, so there isn't the same vendor lock-in issues companies might have with something like a database business. In addition, the landscape is getting a lot more crowded with new entrants, so people are open to testing.

That said, GPT-5 could come out and "blow everyone away," Wang admitted, allowing OpenAI to maintain its market share. "It could throw everything off. It's hard to predict."

This story was originally featured on Fortune.com