U.S. tech companies dominate the generative AI boom—and the cost of model training explains why, a new Stanford University report shows


Hello and welcome to Eye on AI.

We hope some of you got to enjoy all the fascinating discussions, professional networking, and fun at Fortune Brainstorm AI London earlier this week. If you weren’t able to attend, you can get a taste of what took place by reading Fortune’s coverage of the event here. And it’s not too soon to start making plans for the next Fortune Brainstorm AI conference in Singapore in July. You can find out more, including how to register for that event here.

The rise of multimodal foundation models, increasing investment in generative AI, an influx of regulations, and shifting opinions on AI around the globe—it’s all discussed in the Stanford Institute for Human-Centered Artificial Intelligence (HAI) 2024 AI Index, a 500-page report covering AI development and trends. For this year’s edition, the seventh HAI has published, the Institute said it broadened its scope to include more original data than ever, including new estimates of AI training costs and an entirely new chapter dedicated to AI’s impact on science and medicine.

Overall, the report paints a picture of a rapidly growing and increasingly complex (and expensive) AI landscape dominated by commercial entities, particularly U.S. tech giants. The number of new LLMs released globally in 2023 doubled compared to the previous year, according to the report. Investment in generative AI also skyrocketed, and so did global mentions of AI in legislative proceedings and regulations—in the U.S. alone last year, the total number of AI-related regulations increased by 56.3%.

One of the biggest takeaways from the report, however, is the dominance of U.S. tech companies. While two-thirds of models released last year were open-source, the highest-performing models came from commercial entities with closed systems. Private industry accounted for 72% of foundation models released last year, putting out 108 compared to 28 from academia and just four from government. Google alone released a whopping 18 foundation models in 2023. For comparison, OpenAI released seven, Meta released 11, and Hugging Face released four. Overall, U.S. companies released 62 notable machine learning models last year compared to 15 from China, eight from France, five from Germany, and four from Canada.

(U.S. dominance also comes across clearly in data on funding rounds for generative AI companies that Fortune published earlier this week.)

The main reason for this dominance is made crystal clear by new cost estimates for model training included in the report.

“The training costs of state-of-the-art AI models have reached unprecedented levels,” it reads, citing the exponential increase as a reason academia and governments have been edged out of AI development.

According to the report, Google’s Gemini Ultra cost an estimated $191 million worth of compute to train, and OpenAI’s GPT-4 cost an estimated $78 million, which is actually slightly lower than some previous estimates of how much that model cost. (Now imagine how much more it’d be if these companies had to pay for all the training data they scraped from the internet.) For comparison, the report notes that the original 2017 Transformer model, which introduced the architecture underlying all of today’s LLMs, cost only around $900.

On the achievements and potential of AI, the report discusses how AI systems have surpassed human performance on several benchmarks—including some in image classification, visual reasoning, and English understanding—and how AI is turbocharging scientific discovery. While AI started to accelerate scientific discovery in 2022, 2023 saw the launch of even more significant science-related AI applications, the report says. Examples include Google DeepMind’s GNoME (an AI tool that facilitates the process of materials discovery—although some chemists have accused the company of overstating the model’s impact on the field), EVEscape (an AI tool developed by Harvard researchers that can predict viral variants and enhance pandemic prediction), and AlphaMissense (which assists in AI-driven mutation classification).

AI systems have also demonstrated rapid improvement on the MedQA benchmark test for assessing AI’s clinical knowledge. GPT-4 Medprompt, which the report calls “the standout model of 2023” in the clinical area, reached an accuracy rate of 90.2%—marking a 22.6% increase from the highest score in 2022. What’s more, the FDA is approving more and more AI-related medical devices, and AI is increasingly being used for real-world medical purposes.

Of course, AI progress is not a straight line, and there are many significant challenges, lingering questions, and legitimate concerns.

“Robust and standardized evaluations for LLM responsibility are seriously lacking,” the report authors wrote, noting that leading AI developers each test their models against different responsible AI benchmarks, which complicates efforts to systematically compare the risks and limitations of the top models.

The report highlights many other issues surrounding the technology: Political deepfakes are simple to create but difficult to detect; the most extreme AI risks are difficult to analyze; there is a lack of transparency around the data used to train LLMs and around key aspects of their specific designs; researchers are finding more complex vulnerabilities in LLMs; ChatGPT is politically biased (toward Democrats in the U.S. and the Labour Party in the U.K.); and LLMs can output copyrighted material. Additionally, AI is leaving businesses vulnerable to new privacy, security, reliability, and legal risks, and the number of incidents involving the misuse of AI is rising—2023 saw a 32.3% increase over 2022.

Clocking in at over 500 pages, the report is a doozy. But it’s unquestionably the deepest and most thorough overview of the current state of AI available at the moment. If you want to dive deeper but don’t have time for the full report, HAI has also published some handy charts and will be presenting the findings and answering questions in a webinar on May 1.

And with that, here’s more AI news.

Sage Lazzaro
sage.lazzaro@consultant.fortune.com
sagelazzaro.com

This story was originally featured on Fortune.com
