Ilya Sutskever, Yann LeCun and the End of “Just Add GPUs”
When two of the most influential people in AI both say that today’s large language models are hitting their limits, it’s worth paying attention.
In a recent long-form interview, Ilya Sutskever – co-founder of OpenAI and now head of Safe Superintelligence Inc. – argued that the industry is moving from an “age of scaling” to an “age of research”. At the same time, Yann LeCun, VP & Chief AI Scientist at Meta, has been loudly insisting that LLMs are not the future of AI at all and that we need a completely different path based on “world models” and architectures like JEPA.
As developers and founders, we’re building products right in the middle of that shift.
This article breaks down Sutskever’s and LeCun’s viewpoints and what they mean for people actually shipping software.
1. Sutskever’s Timeline: From Research → Scaling → Research Again
Sutskever splits the history of modern AI, roughly from 2012 to today, into three phases:
1.1. 2012–2020: The first age of research
This is the era of “try everything”:
convolutional nets for vision
sequence models and attention
early reinforcement learning breakthroughs
lots of small experiments, new architectures, and weird ideas
There were big models, but compute and data were still limited. The progress came from new concepts, not massive clusters.
1.2. 2020–2025: The age of scaling
Then scaling laws changed everything.
The recipe became:
More data + more compute + bigger models = better results.
You didn’t have to be extremely creative to justify a multi-billion-dollar GPU bill. You could point to a curve: as you scale up parameters and tokens, performance climbs smoothly.
This gave us:
GPT-3/4 class models
state-of-the-art multimodal systems
the current wave of AI products everyone is building on
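That "point to a curve" argument is worth making concrete. Below is a minimal sketch of a Chinchilla-style power-law loss model; the functional form is the standard one from the scaling-law literature, but every constant in it is an invented placeholder, not a value from any published fit.

```python
# Minimal sketch of a scaling-law-style loss curve (illustrative only).
# L(N, D) = E + A / N**alpha + B / D**beta is the usual parameterization;
# all constants below are made-up placeholders, not published numbers.

def predicted_loss(params: float, tokens: float,
                   E: float = 1.7, A: float = 400.0, B: float = 1800.0,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    """Predicted pre-training loss for a model with `params` parameters
    trained on `tokens` tokens."""
    return E + A / params**alpha + B / tokens**beta

# Scaling both parameters and tokens keeps improving loss,
# but by ever-smaller amounts at each step up.
for scale in [1e9, 1e10, 1e11, 1e12]:
    print(f"{scale:.0e} params / {20 * scale:.0e} tokens -> "
          f"loss {predicted_loss(scale, 20 * scale):.3f}")
```

The smooth, predictable shape of that curve is exactly what made huge GPU bills easy to justify during the scaling era.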
1.3. 2025 onward: Back to an age of research (but with huge computers)
Now Sutskever is saying that scaling alone is no longer enough:
The industry is already operating at insane scale.
The internet is finite, so you can’t just keep scraping higher-quality, diverse text forever.
The returns from “just make it 10× bigger” are getting smaller and more unpredictable.
We’re moving into a phase where:
The clusters stay huge, but progress depends on new ideas, not only new GPUs.
2. Why the Current LLM Recipe Is Hitting Limits
Sutskever keeps circling back to three core issues.
2.1. Benchmarks vs. real-world usefulness
Models look god-tier on paper:
they pass exams
solve benchmark coding tasks
reach crazy scores on reasoning evals
But everyday users still run into:
hallucinations
brittle behavior on messy input
surprisingly dumb mistakes in practical workflows
So there’s a gap between benchmark performance and actual reliability when someone uses the model as a teammate or co-pilot.
2.2. Pre-training is powerful, but opaque
The big idea of this era was: pre-train on enormous amounts of text and images, and the model will learn "everything".
It worked incredibly well… but it has downsides:
you don’t fully control what the model learns
when it fails, it’s hard to tell if the issue is data, architecture, or something deeper
pushing performance often means more of the same, not better understanding
That’s why there’s so much focus now on post-training tricks: RLHF, reward models, system prompts, fine-tuning, tool usage, etc. We’re papering over the limits of the pre-training recipe.
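To give one concrete flavor of those post-training tricks, here is a minimal sketch of the pairwise preference loss commonly used when training a reward model for RLHF; the reward scores in the example are invented, and a real pipeline would of course operate on batches of model outputs.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry-style pairwise loss for reward-model training:
    push the reward of the human-preferred answer above the rejected one."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical reward-model scores on two candidate answers to the same prompt.
print(preference_loss(reward_chosen=2.3, reward_rejected=0.9))  # small loss: ranking is right
print(preference_loss(reward_chosen=0.4, reward_rejected=1.8))  # large loss: ranking is wrong
```

None of this changes what the base model learned during pre-training; it only reshapes which behaviors get surfaced.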
2.3. The real bottleneck: generalization
For Sutskever, the biggest unsolved problem is generalization.
Humans can:
learn a new concept from a handful of examples
transfer knowledge between domains
keep learning continuously without forgetting everything
Models, by comparison, still need:
huge amounts of data
careful evals to avoid weird corner-case failures
extensive guardrails and fine-tuning
Even the best systems today generalize much worse than people. Fixing that is not a matter of another 10,000 GPUs; it needs new theory and new training methods.
3. Safe Superintelligence Inc.: Betting on New Recipes
Sutskever’s new company, Safe Superintelligence Inc. (SSI), is built around a simple thesis:
scaling was the driver of the last wave
research will drive the next one
SSI is not rushing out consumer products. Instead, it positions itself as:
focused on long-term research into superintelligence
trying to invent new training methods and architectures
putting safety and controllability at the core from day one
Instead of betting that “GPT-7 but bigger” will magically become AGI, SSI is betting that a different kind of model, trained with different objectives, will be needed.
4. Have Tech Companies Overspent on GPUs?
Listening to Sutskever, it’s hard not to read between the lines:
Huge amounts of money have gone into GPU clusters on the assumption that scale alone would keep delivering step-function gains.
We’re discovering that the marginal gains from scaling are getting smaller, and progress is less predictable.
That doesn’t mean the GPU arms race was pointless. Without it, we wouldn’t have today’s LLMs at all.
But it does mean:
The next major improvements will likely come from smarter algorithms, not merely more expensive hardware.
Access to H100s is slowly becoming a commodity, while genuine innovation moves back to ideas and data.
For founders planning multi-year product strategies, that’s a big shift.
5. Yann LeCun’s Counterpoint: LLMs Aren’t the Future at All
If Sutskever is saying “scaling is necessary but insufficient,” Yann LeCun goes further:
LLMs, as we know them, are not the path to real intelligence.
He’s been very explicit about this in talks, interviews and posts.
5.1. What LeCun doesn’t like about LLMs
LeCun’s core criticisms can be summarized in three points:
Limited understanding
LLMs are great at manipulating text but have a shallow grasp of the physical world.
They don’t truly “understand” objects, physics or causality – all the things you need for real-world reasoning and planning.
A product-driven dead end
He sees LLMs as an amazing product technology (chatbots, assistants, coding helpers) but believes they are approaching their natural limits.
Each new model is larger and more expensive, yet delivers smaller improvements.
Simplicity of token prediction
Under the hood, an LLM is just predicting the next token. LeCun argues this is a very narrow, simplistic proxy for intelligence.
For him, real reasoning can’t emerge from next-word prediction alone.
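To make "just predicting the next token" concrete, here is a minimal sketch of that training objective; the tiny vocabulary and probabilities are invented for illustration.

```python
import math

# An LLM's training objective, stripped to its core: maximize the probability
# of the actual next token given the context (minimize cross-entropy).
def next_token_loss(probs: dict[str, float], actual_next: str) -> float:
    return -math.log(probs[actual_next])

# Invented model output for the context "the cat sat on the".
probs = {"mat": 0.62, "sofa": 0.21, "dog": 0.15, "moon": 0.02}
print(next_token_loss(probs, "mat"))  # low loss when the model ranks the right token highly
```

LeCun's point is that everything an LLM appears to do, from coding to reasoning, is squeezed through this one narrow objective.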
5.2. World models and JEPA
Instead of LLMs, LeCun pushes the idea of world models – systems that:
learn by watching the world (especially video)
build an internal representation of objects, space and time
can predict what will happen next in that world, not just what word comes next
One of the architectures he’s working on is JEPA – Joint Embedding Predictive Architecture:
it learns representations by predicting future embeddings rather than raw pixels or text
it’s designed to scale to complex, high-dimensional input like video
the goal is a model that can support persistent memory, reasoning and planning
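In code terms, the difference is what the loss is computed over. Below is a heavily simplified, illustrative sketch of a JEPA-style objective that uses random projections as stand-ins for learned encoders; a real JEPA model is far more sophisticated, but the shape of the loss is the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "encoders" and "predictor": in a real JEPA these are learned networks.
def encode(x: np.ndarray, W: np.ndarray) -> np.ndarray:
    return np.tanh(W @ x)

W_context = rng.normal(size=(16, 64))
W_target = rng.normal(size=(16, 64))
W_predictor = rng.normal(size=(16, 16))

context_frame = rng.normal(size=64)  # e.g. the current video frame, flattened
future_frame = rng.normal(size=64)   # the frame a moment later

# JEPA-style loss: predict the *embedding* of the future observation,
# not its raw pixels, so the model can ignore unpredictable low-level detail.
predicted_embedding = W_predictor @ encode(context_frame, W_context)
target_embedding = encode(future_frame, W_target)
loss = np.mean((predicted_embedding - target_embedding) ** 2)
print(loss)
```

Compare this with the next-token loss above: one operates on raw tokens, the other on learned representations of the world.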
5.3. Four pillars of future AI
LeCun often describes four pillars any truly intelligent system needs:
Understanding of the physical world
Persistent memory
Reasoning
Planning
His argument is that today’s LLM-centric systems mostly hack around these requirements instead of solving them directly. That’s why he’s increasingly focused on world-model architectures instead of bigger text models.
6. Sutskever vs. LeCun: Same Diagnosis, Different Cure
What’s fascinating is that Sutskever and LeCun agree on the problem:
current LLMs and scaling strategies are hitting limits
simply adding more parameters and data is delivering diminishing returns
new ideas are required
Where they differ is how radical the change needs to be:
Sutskever seems to believe that the next breakthroughs will still come from the same general family of models – big neural nets trained on massive datasets – but with better objectives, better generalization, and much stronger safety work.
LeCun believes we need a new paradigm: world models that learn from interaction with the environment, closer to how animals and humans learn.
For people building on today’s models, that tension is actually good news: it means there is still a lot of frontier left.
7. What All This Means for Developers and Founders
So what should you do if you’re not running an AI lab, but you are building products on top of OpenAI, Anthropic, Google, Meta, etc.?
7.1. Hardware is becoming less of a moat
If the next big gains won’t come from simply scaling, then:
the advantage of “we have more GPUs than you” decreases over time
your real edge comes from use cases, data, UX and integration, not raw model size
This is good for startups and agencies: you can piggyback on the big models and still differentiate.
7.2. Benchmarks are not your product
Both Sutskever’s and LeCun’s critiques are a warning against obsessing over leaderboards.
Ask yourself:
Does this improvement meaningfully change what my users can do?
Does it reduce hallucinations in their workflows?
Does it make the system more reliable, debuggable and explainable?
User-centric metrics matter more than another +2% on some synthetic reasoning benchmark.
7.3. Expect more diversity in model types
If LeCun’s world models, JEPA-style architectures, or other alternatives start to work, we’ll likely see:
specialized models for physical reasoning and robotics
LLMs acting as a language interface over deeper systems that actually handle planning and environment modeling
more hybrid stacks, where multiple models collaborate
For developers, that means learning to orchestrate multiple systems instead of just calling one chat completion endpoint.
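As a rough illustration of what that orchestration could look like, here is a sketch with hypothetical stand-in functions (plan_with_world_model, answer_with_llm); neither corresponds to a real SDK, they just mark where different models would sit in the stack.

```python
from dataclasses import dataclass

@dataclass
class Step:
    action: str
    detail: str

# Hypothetical stand-ins: in a real system these would wrap different models/services.
def plan_with_world_model(goal: str) -> list[Step]:
    """Pretend planner: a specialized model decomposes a goal into actions."""
    return [Step("inspect", f"gather state relevant to: {goal}"),
            Step("act", f"execute the plan for: {goal}")]

def answer_with_llm(prompt: str) -> str:
    """Pretend LLM call: the language model narrates the plan for the user."""
    return f"Here is what I will do:\n{prompt}"

def handle_request(goal: str) -> str:
    plan = plan_with_world_model(goal)  # a deeper system handles planning
    summary = "\n".join(f"- {s.action}: {s.detail}" for s in plan)
    return answer_with_llm(summary)     # the LLM is the language interface

print(handle_request("restock the warehouse shelf"))
```

The skill shifts from prompt-tuning a single endpoint to deciding which model owns which part of the job.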
7.4. Data, workflows and feedback loops are where you win
No matter who is right about the far future, one thing is clear for product builders:
Owning high-quality domain data
Designing tight feedback loops between users and models
Building evaluations that match your use case
…will matter more than anything else.
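On the evaluation point in particular, a tiny harness built from real user cases goes a long way. Here is a minimal sketch; call_model and the single test case are placeholders you would replace with your own stack and your own data.

```python
# Minimal sketch of a use-case-specific eval harness.
# call_model and EVAL_CASES are placeholders for your own provider and data.

def call_model(prompt: str) -> str:
    # Placeholder: swap in your real call (OpenAI, Anthropic, a local model, ...).
    return "We issue a refund within 14 days of purchase; contact support to start one."

EVAL_CASES = [
    {"prompt": "Summarize this refund policy in two sentences: ...",
     "must_include": ["refund"],
     "must_not_include": ["lifetime guarantee"]},
    # ...add cases drawn from real user sessions, not synthetic benchmarks
]

def run_evals() -> float:
    passed = 0
    for case in EVAL_CASES:
        output = call_model(case["prompt"])
        ok = all(s in output for s in case["must_include"]) and \
             not any(s in output for s in case["must_not_include"])
        passed += ok
    return passed / len(EVAL_CASES)  # track this per model/prompt version over time

print(run_evals())
```

Run something like this on every model or prompt change, and you will learn far more than from any leaderboard delta.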
You don’t need to solve world modeling or superintelligence yourself. You need to:
pick the right model(s) for the job
wrap them in workflows that make sense for your users
keep improving based on real-world behavior
8. A Quiet Turning Point
In 2019–2021, the story of AI was simple: “scale is all you need.” Bigger models, more data, more GPUs.
Now, two of the field’s most influential figures are effectively saying:
scaling is not enough (Sutskever)
LLMs themselves may be a dead end for real intelligence (LeCun)
We’re entering a new phase where research, theory and new architectures matter again as much as infrastructure.
For builders, that doesn’t mean you should stop using LLMs or pause your AI roadmap. It means:
focus less on chasing the next parameter count
focus more on how intelligence shows up inside your product: reliability, reasoning, planning, and how it fits into real human workflows
The GPU race gave us today’s tools. The next decade will be defined by what we do with them – and by the new ideas that finally move us beyond “predict the next token.”