OpenAI says it could rebuild GPT-4 from scratch with just 5 to 10 people, thanks to breakthroughs from its latest model
Sam Altman also said OpenAI is no longer "compute-constrained" on the best models it can produce.
- Retraining GPT-4 from scratch would now take just five to 10 people, OpenAI engineers said.
- GPT-4.5, launched in February, was OpenAI's most powerful model yet, the company said.
- Lessons from GPT-4.5's development would make rebuilding GPT-4 much easier.
Building GPT-4 took a lot of manpower. Now, OpenAI says it could rebuild GPT-4 with as few as five people, all because of what it learned from its latest model, GPT-4.5.
In a company podcast episode published Friday, OpenAI CEO Sam Altman posed a question to three of the key engineers behind GPT-4.5: What's the smallest OpenAI team that could retrain GPT-4 from scratch today?
Altman said building GPT-4 took "hundreds of people, almost all of OpenAI's effort" — but things get much easier once a model is no longer at the frontier.
Alex Paino, who led pre-training machine learning for GPT-4.5, said retraining GPT-4 now would "probably" take just five to 10 people.
"We trained GPT-4o, which was a GPT-4-caliber model that we retrained using a lot of the same stuff coming out of the GPT-4.5 research program," Paino said. "Doing that run itself actually took a much smaller number of people."
Daniel Selsam, a researcher at OpenAI working on data efficiency and algorithms, agreed that rebuilding GPT-4 would now be far easier.
"Just finding out someone else did something — it becomes immensely easier," he said. "I feel like just the fact that something is possible is a huge cheat code."
In February, OpenAI released GPT-4.5, saying it was the company's largest and most powerful model to date.
Altman described it in a post on X as "the first model that feels like talking to a thoughtful person."
Paino said GPT-4.5 is designed to be "10x smarter" than GPT-4, which was released in March 2023.
"We're scaling 10x beyond what we did before with these GPT pre-training runs," Paino said.
"No longer compute-constrained"
Altman also said OpenAI is no longer "compute-constrained" on the best models it can produce — a shift he thinks the world hasn't really understood yet.
For many AI companies, the biggest hurdle to building better models is simply having enough computing power.
"It is a crazy update," Altman said. "For so long, we lived in a world where compute was always the limiting factor," he added.
Big Tech has been pouring billions into AI infrastructure. Microsoft, Amazon, Google, and Meta are expected to spend a collective $320 billion in capital expenditures this year to broaden their AI capabilities.
OpenAI announced in March that it had closed the largest private tech funding round on record, including $30 billion from SoftBank and $10 billion from other investors, bringing the company's valuation to $300 billion.
The fresh capital will help OpenAI scale its computing power even further, the company said in a statement at the time.
Nvidia CEO Jensen Huang said on an earnings call in February that demand for AI compute will only grow.
"Reasoning models can consume 100x more compute. Future reasoning can consume much more compute," Huang said on the call.
As for what's needed to hit the next 10x or 100x jump in scale, Selsam, the OpenAI researcher, said it's data efficiency.
The GPT models are very efficient at processing information, but each has a "ceiling to how deep of an insight it can gain from the data," he said.
"At some point, as the compute keeps growing and growing, the data grows much less quickly," he said, adding that "the data becomes the bottleneck."
Pushing beyond that, he said, will require "some algorithmic innovations" to squeeze more value from the same amount of data.