Will these new, efficient AI models send Nvidia's stock tumbling again?
New AI models from the likes of Google are showing that the technology can be run on fewer Nvidia chips. Will it cause another DeepSeek selloff?
- New AI models are emerging that can run on just a handful of Nvidia chips.
- Google is one company following DeepSeek in making more powerful AI that requires less compute.
- It doesn't look like a DeepSeek-level problem for Nvidia — but there are caveats.
A new generation of AI models is squeezing more power out of fewer chips. Whether they spark another DeepSeek-scale panic for Nvidia is another matter.
Google led the charge this week with a collection of smaller models, Gemma 3, that appear to pack a serious punch, with one standout feat: they run smoothly on just a single Nvidia chip, known as a GPU.
Unveiling the models Wednesday, Google CEO Sundar Pichai highlighted their efficiency in an X post, writing that "you'd need at least 10x the compute to get similar performance from other models."
Cohere, a Toronto-headquartered startup led by former Googler Aidan Gomez, also released a new model on Thursday, Command A, which it describes as a "state-of-the-art" model that runs on just two GPUs. (Business Insider, alongside other publishers, has sued Cohere over copyright infringement.)
One of the key lessons DeepSeek taught the world when it released an AI model in January was that it's possible to do more with less. The Chinese startup said its R1 model was competitive with OpenAI's o1 model while claiming it needed fewer chips.
The claim triggered the biggest single-day loss of market value for any company in US stock market history, with Nvidia shedding close to $600 billion. The market wondered whether more efficient AI would reduce demand for Nvidia chips, the same demand that helped the company post record full-year revenue of $130.5 billion in 2024.
At first glance, this new wave of AI models seems to pose an even greater threat, as they claim to be state-of-the-art while only needing a handful of GPUs to run.
A chart shared by Pichai of Gemma's performance on the industry leaderboard Chatbot Arena, for instance, showed the model outperforming models from DeepSeek, OpenAI, and Meta while running on fewer GPUs.
But even as more companies learn to squeeze more performance out of their AI with fewer chips, it's not a given that these more efficient models will pose a DeepSeek-style risk to Nvidia.
For one, as the DeepSeek saga unfolded, tech CEOs were quick to invoke the Jevons paradox, the economic principle that as a technology becomes more efficient, consumption of it increases rather than decreases.
It might help explain why Google itself has said it plans to increase its AI-related capital expenditure to $75 billion this year, spending that typically includes the GPUs housed in the data centers critical for AI.
Google has been one of the main buyers of the latest generation of Nvidia GPUs, the Blackwell chips introduced last year, so it's plausible to expect the company to be among those ready to spend on the new GPU Nvidia is expected to unveil at its GTC event next week.
So far, the market does not appear to be worried by the latest chip efficiency developments — Nvidia's share price is up about 6% since Tuesday.
There is a small caveat to this.
While Google's new Gemma models can run on a single Nvidia GPU, it appears they were trained on Google's own chips, known as tensor processing units, or TPUs.
Tech giants like Google have spent years working on their own silicon to reduce their dependence on Nvidia, so Gemma presents a curious situation in which Google has produced a competitive AI model without using any Nvidia GPUs for training.
Still, it's unlikely that a company like Google will reduce its dependence on Nvidia GPUs in a significant way anytime soon — and for a simple reason.
The company's push to produce more efficient models is happening alongside its development of more powerful, large-scale Gemini AI models that aim to push the boundaries of intelligence. These models' success, for now, depends on getting access to as much compute as possible.
This approach was recently validated by the release of Grok 3, the latest frontier AI model from Elon Musk's startup. Its release came with a note saying a future version of the model would be trained on a larger cluster of 200,000 GPUs.
The future of AI development, then, looks like one in which two paths run in tandem: smaller, more efficient models that run on fewer GPUs, and large-scale models that continue to have as many GPUs thrown at them as possible.