The Most Expensive Disagreement in AI
The AI industry has a consensus problem. Nearly every major lab — OpenAI, Anthropic, Google DeepMind, xAI — is betting billions on the same fundamental architecture: large language models trained on text, scaled to trillions of parameters, generating tokens one at a time. The market has spoken. Products built on LLMs are generating real revenue. The trajectory feels inevitable.
And then there is Yann LeCun.
One of three researchers awarded the 2018 ACM Turing Award — alongside Yoshua Bengio and Geoffrey Hinton — for foundational work on deep learning, LeCun spent over a decade at Meta building what became one of the world’s most prestigious AI research labs, FAIR. But for the past three years, he has been making a case that almost nobody in Silicon Valley wants to hear: large language models are a dead end. Not a stepping stone. Not a foundation to build on. A dead end.
In November 2025, LeCun put his career where his critique is. He left Meta after twelve years to found AMI Labs (Advanced Machine Intelligence Labs), a Paris-headquartered startup seeking a $3.5 billion valuation before it has shipped a single product. His mission: build “world models” — AI systems that understand physics, maintain persistent memory, and plan complex actions rather than simply predicting the next word.
The path to human-level AI, LeCun argues, does not run through bigger models predicting the next token. It runs through something fundamentally different — machines that learn to understand the physical world the way humans and animals do. His proposed architecture is called JEPA: Joint Embedding Predictive Architecture.
The Core Argument: Why Token Prediction Is Not Understanding
LeCun’s critique of LLMs is not about performance benchmarks or cherry-picked failure cases. It is architectural. His argument attacks the fundamental mechanism by which these systems operate.
LLMs are autoregressive models. They predict the next token in a sequence based on all previous tokens. This is how GPT-4 writes essays, how Claude generates code, and how Gemini answers questions. The approach works remarkably well for language tasks. But LeCun’s contention is that predicting text tokens is categorically different from understanding the world that text describes.
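The autoregressive loop can be sketched with a toy bigram model — an illustrative stand-in that is vastly simpler than a transformer, but uses the same one-token-at-a-time generation loop the paragraph describes (the tiny corpus and greedy decoding here are invented for illustration):

```python
# Minimal autoregressive sketch: predict the next token from counts of what
# followed the same token in training text. A toy bigram model, not a
# transformer -- but the generation loop (condition on the prefix, emit one
# token, repeat) is the same.
from collections import Counter, defaultdict

corpus = "the ball rolls off the table the ball falls".split()

# "Training": count which token follows each token.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_token(prev):
    # Greedy decoding: pick the most frequent continuation.
    return follows[prev].most_common(1)[0][0]

# Generate tokens one at a time, each conditioned on what came before.
tok, out = "the", ["the"]
for _ in range(4):
    tok = next_token(tok)
    out.append(tok)
print(" ".join(out))  # -> "the ball rolls off the"
```

The model fluently reproduces patterns from its corpus, yet it contains no representation of balls, tables, or gravity — which is exactly the gap LeCun points at, writ small.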
Consider a simple physical scenario: a ball rolling off a table. A human toddler understands what will happen next. The ball will fall. Gravity is not a concept the child has been taught explicitly — it is something they have learned through thousands of hours of sensory experience, watching objects interact with the world. They have built what cognitive scientists call a world model: an internal representation of how physical reality works.
An LLM, by contrast, has processed millions of descriptions of balls falling off tables. It can generate a perfectly coherent paragraph about gravity. But LeCun argues it has no internal model of physical reality. It is pattern-matching over text. The knowledge is linguistic, not grounded. The distinction matters because language covers a vanishingly small fraction of human knowledge. Most of what humans know about the world — spatial relationships, physical dynamics, cause and effect, social intuition — was never written down. It was learned through direct sensory experience.
LeCun quantifies this gap provocatively. A human child, he estimates, receives approximately 10^14 bytes of sensory data through their eyes alone during their first four years of life. The entire text corpus used to train the largest LLMs represents roughly 10^13 bytes. A four-year-old has processed more raw information about the world than any language model ever trained. And critically, the child’s data is multimodal, grounded, and interactive — not static text scraped from the internet.
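A back-of-envelope check shows the order of magnitude is plausible. The figures below (an assumed ~2 MB/s optic-nerve data rate and 16 waking hours per day) are illustrative assumptions, not LeCun’s published numbers:

```python
# Rough sanity check on the ~10^14 bytes claim.
# Both constants below are illustrative assumptions, not LeCun's exact figures.
BYTES_PER_SECOND = 2e6        # assumed visual data rate (~2 MB/s)
WAKING_HOURS_PER_DAY = 16     # assumed waking hours for a young child
YEARS = 4

seconds = WAKING_HOURS_PER_DAY * 3600 * 365 * YEARS
total_bytes = BYTES_PER_SECOND * seconds
print(f"{total_bytes:.1e}")   # lands on the order of 10^14
```

Under these assumptions the estimate comes out around 1.7 × 10^14 bytes — the same ballpark as the figure LeCun cites, and roughly ten times the size of the largest text corpora.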
JEPA: Prediction in Embedding Space
LeCun’s proposed alternative, JEPA, represents a fundamentally different approach to learning. First laid out in his June 2022 position paper “A Path Towards Autonomous Machine Intelligence,” JEPA predicts abstract representations in a learned embedding space rather than exact sequences of tokens.
The distinction is significant. Autoregressive models must predict every detail of their output. When generating an image pixel by pixel or a sentence word by word, the model has to commit to specifics at every step. This forces it to model uncertainty about irrelevant details — the exact shade of a pixel, the precise word choice in a paraphrase — consuming model capacity on noise rather than structure.
JEPA sidesteps this by operating in a compressed representation space. Rather than predicting “the ball will hit the floor, bounce twice, and roll under the couch,” a JEPA system predicts the abstract trajectory — the ball will move downward, impact a surface, lose energy — without committing to low-level details. This is closer to how humans think. When you imagine throwing a ball, you do not mentally render every frame of its trajectory at retinal resolution. You predict the abstracted outcome.
The architecture has two key components. First, an encoder that maps raw inputs (images, video, audio) into a high-dimensional embedding space. Second, a predictor that operates entirely within that embedding space, forecasting future states without ever decoding back into raw data. The model learns by comparing its predicted embeddings against actual embeddings of future states.
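The two components above can be sketched in a few lines. This is a deliberately minimal toy — the random-projection "encoder" and identity "predictor" stand in for the deep networks a real JEPA would learn, and all dimensions are invented — but it shows the defining property: the loss is computed between embeddings, never by decoding back to raw data.

```python
# Toy JEPA-style sketch: encode raw inputs, predict in embedding space,
# compare predicted vs. actual embeddings. The fixed random projection and
# identity predictor are illustrative stand-ins for learned networks.
import numpy as np

rng = np.random.default_rng(0)
D_RAW, D_EMB = 64, 8  # toy sizes: raw input (e.g. a flattened patch), embedding

# Hypothetical encoder: maps raw inputs into the embedding space.
W_enc = rng.normal(size=(D_EMB, D_RAW)) / np.sqrt(D_RAW)

def encode(x):
    return W_enc @ x

# Hypothetical predictor: operates entirely within the embedding space.
W_pred = np.eye(D_EMB)

def predict(z):
    return W_pred @ z

def jepa_loss(x_now, x_future):
    """Compare the predicted embedding of the future state against the
    actual embedding of the future state -- never decoding to raw data."""
    z_hat = predict(encode(x_now))
    z_future = encode(x_future)  # target embedding
    return float(np.mean((z_hat - z_future) ** 2))

x_t = rng.normal(size=D_RAW)    # current observation
x_t1 = rng.normal(size=D_RAW)   # next observation
print(jepa_loss(x_t, x_t1))
```

In a trained system both the encoder and predictor are learned jointly, and the encoder is free to discard unpredictable low-level detail — which is precisely the capacity saving the previous paragraphs describe.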
The research results have been building steadily. I-JEPA, published in 2023 and presented at CVPR, showed that predicting image patch embeddings rather than pixels produces representations that transfer well to downstream tasks. V-JEPA, released in 2024, extended this to video, learning temporal dynamics from unlabeled video data and achieving 82.1% accuracy on Kinetics-400 and 71.2% on Something-Something v2 — surpassing prior state-of-the-art video models. V-JEPA 2, released by Meta in June 2025, scaled to 1.2 billion parameters trained on over one million hours of video, achieving state-of-the-art performance on physical reasoning benchmarks. Most recently, VL-JEPA extended the architecture to vision-language tasks.
These are still research systems, not production products. But the trajectory from image patches to video to physical reasoning to language represents a coherent research program gaining momentum.

The System 1 / System 2 Problem
LeCun’s framework gains additional depth when viewed through the lens of Daniel Kahneman’s dual-process theory — a connection LeCun has explored publicly, including at a 2020 AAAI panel alongside Kahneman, Hinton, and Bengio.
Kahneman’s framework divides human cognition into System 1 (fast, intuitive, automatic) and System 2 (slow, deliberate, logical). Recognizing a face is System 1. Solving a novel math problem is System 2. Current LLMs, LeCun argues, are purely System 1 machines. They generate responses through fast pattern-matching without any mechanism for deliberate, step-by-step reasoning about novel problems.
Chain-of-thought prompting and extended thinking modes might appear to add System 2 capabilities, but LeCun views these as cosmetic. The model is still generating tokens autoregressively. It is not actually planning, searching through a problem space, or reasoning causally. It is producing text that resembles reasoning because it was trained on text written by humans who were reasoning. The imitation is sophisticated but brittle — which is why LLMs fail on problems that require genuine multi-step planning, particularly problems whose structure differs from anything in their training data.
LeCun’s proposed cognitive architecture envisions AI systems with explicit System 2 capabilities: a world model that maintains an internal state, a cost function that evaluates potential actions, and an optimization process that searches for action sequences minimizing that cost. This is closer to model-based reinforcement learning than to language modeling. And it is, by LeCun’s own admission, nowhere close to working at the scale and generality required. AMI Labs exists to close that gap.
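The loop LeCun describes — world model, cost function, search over actions — can be illustrated with a toy planner. Everything here (the one-dimensional dynamics, the goal, the brute-force search) is invented for illustration and bears no resemblance to any AMI Labs system; it simply shows deliberation as optimization rather than token generation:

```python
# Toy model-based planning loop: a world model predicts the next state, a
# cost function scores states, and an optimizer searches action sequences.
# All dynamics and parameters are invented for illustration.
import itertools

def world_model(state, action):
    # Predict the next state: action adjusts velocity, velocity moves position.
    pos, vel = state
    vel += action
    return (pos + vel, vel)

def cost(state, goal=10.0):
    # Penalize distance of the final position from the goal.
    return abs(state[0] - goal)

def plan(state, horizon=4, actions=(-1, 0, 1)):
    # System-2-style deliberation: simulate every action sequence with the
    # world model and keep the one minimizing the cost. (A real system would
    # use gradient-based or learned search, not brute force.)
    best_seq, best_cost = None, float("inf")
    for seq in itertools.product(actions, repeat=horizon):
        s = state
        for a in seq:
            s = world_model(s, a)
        c = cost(s)
        if c < best_cost:
            best_seq, best_cost = seq, c
    return best_seq, best_cost

seq, final_cost = plan((0.0, 0.0))
print(seq, final_cost)  # -> (1, 1, 1, 1) 0.0
```

The answer emerges from search against an internal model, not from pattern-matching over past text — the qualitative distinction LeCun draws between planning and autoregressive generation.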
The Counterarguments: Why LeCun Might Be Wrong
LeCun’s critique is intellectually serious, but the counterarguments are substantial.
The most powerful one is empirical: LLMs keep getting better. Every year brings new capabilities that critics said token prediction could not produce. Mathematical reasoning that early critics considered impossible. Production-quality code generation. Multimodal models that process images and video alongside text. The goalposts for what constitutes “real understanding” keep moving because LLMs keep clearing the bar.
Scaling law proponents argue that many of LeCun’s objections dissolve with sufficient scale. The inability to reason causally? Train on more data with more parameters and reasoning emerges. The inability to understand physics? Train on video and simulation data and physical intuition emerges. From this perspective, JEPA is a solution to a problem that may solve itself as LLMs incorporate more modalities and scale further.
There is also a practical argument. LLMs work today. They power products generating billions in revenue. They write code, translate languages, summarize documents, and assist with research. JEPA, by contrast, has produced promising research papers but no commercial products. No company has deployed a JEPA-based system at scale. The gap between “interesting research direction” and “viable alternative to transformers” is measured in years and billions of dollars — billions that AMI Labs is now raising.
Some researchers occupy a middle ground, suggesting that the future likely involves hybrid architectures — systems that combine the language fluency of autoregressive models with the grounded world understanding that JEPA-style architectures aim to provide. LeCun himself has acknowledged that LLMs will likely remain useful for language-specific tasks. His argument is about what cannot be achieved through token prediction alone, not that token prediction is useless.
AMI Labs: The Bet Becomes a Company
What makes LeCun’s position more than an academic debate is that he has now put institutional weight behind it.
AMI Labs, headquartered in Paris with Alexandre LeBrun (previously co-founder and CEO at Nabla) serving as CEO and LeCun as Executive Chairman, plans to develop world models for high-stakes applications including healthcare, robotics, automation, and industrial systems. Meta, despite losing its Chief AI Scientist, has agreed to partner with AMI Labs — though it will not invest directly.
The $3.5 billion pre-launch valuation is extraordinary for a company built on an architecture that has not yet produced a commercial product. It signals that investors are treating LeCun’s critique as more than academic contrarianism. If LLMs continue scaling smoothly — if the next generation of models demonstrates increasingly robust reasoning, physical understanding, and genuine planning ability — AMI Labs will face an uphill battle justifying its valuation.
But if scaling hits diminishing returns — if the next generation requires dramatically more compute for marginal capability improvements, if reasoning benchmarks plateau, if physical understanding remains stubbornly brittle despite multimodal training — then AMI Labs and its world model approach will look prescient. The increasing reliance on inference-time compute (chain-of-thought, tree search, extended thinking) rather than pure model scaling suggests that the easy gains from scaling may already be tapering. But these are early signals, not proof.
What is clear is that LeCun is asking the right question, even if his answer turns out to be wrong. The AI industry’s near-unanimous commitment to a single paradigm — predicting the next token — is historically unusual and potentially risky. Having a serious alternative under active development, backed by a Turing Award laureate and billions in funding, is not a distraction. It is insurance.
🧭 Decision Radar (Algeria Lens)
| Dimension | Assessment |
|---|---|
| Relevance for Algeria | Medium — the LLM vs. world models debate will determine which AI architectures Algerian developers and researchers should invest in learning over the next decade; AMI Labs’ Paris headquarters also creates proximity for Algerian-French AI talent |
| Infrastructure Ready? | Partial — Algerian universities have deep learning courses but no research groups working on JEPA or world model architectures; LLM fine-tuning is accessible through cloud providers |
| Skills Available? | No — self-supervised learning and embedding-space prediction require advanced ML research skills not widely available in Algeria; LLM application skills are more accessible |
| Action Timeline | 12-24 months — monitor the paradigm debate; invest in foundational ML education that transcends any single architecture |
| Key Stakeholders | AI researchers, university CS departments, ML engineers, startup CTOs evaluating which AI stack to build on |
| Decision Type | Educational |
Quick Take: Algerian AI teams should not pick sides in this debate — but they should understand it deeply. Organizations building products on LLMs today should continue, while researchers and graduate students should study both paradigms. AMI Labs’ Paris headquarters creates a rare proximity opportunity for Algerian AI talent. The worst outcome would be training an entire generation of Algerian AI practitioners exclusively on prompt engineering for a paradigm that may plateau within five years.
Sources & Further Reading
- A Path Towards Autonomous Machine Intelligence — Yann LeCun (2022) — LeCun’s foundational position paper outlining the JEPA architecture and his vision for world models.
- I-JEPA: The First AI Model Based on Yann LeCun’s Vision for More Human-like AI — Meta FAIR (2023) — Meta’s first implementation of JEPA for image understanding.
- V-JEPA 2: Introducing the V-JEPA 2 World Model and New Benchmarks for Physical Reasoning — Meta (2025) — V-JEPA 2 scaled to 1.2 billion parameters with state-of-the-art physical reasoning.
- Meta Chief AI Scientist Yann LeCun Is Leaving the Company — CNBC (November 2025) — Reporting on LeCun’s departure from Meta to found AMI Labs.
- Yann LeCun’s New Venture Is a Contrarian Bet Against Large Language Models — MIT Technology Review (January 2026) — In-depth profile of AMI Labs and its world model strategy.
- Fathers of the Deep Learning Revolution Receive ACM A.M. Turing Award — ACM (2018) — Official Turing Award announcement for LeCun, Bengio, and Hinton.