AMI – Advanced Machine Intelligence – The End of Scaling: Why Yann LeCun No Longer Believes in LLMs
Published on: November 23, 2025 / Updated on: November 23, 2025 – Author: Konrad Wolfenstein

Dead end instead of superintelligence: Why Meta's chief visionary is now quitting
600 billion for a misguided approach? The "Godfather of AI" bets against LLaMA, ChatGPT & Co.
The announcement came like a thunderbolt through the technology industry in November 2025. Yann LeCun, one of the three founding fathers of deep learning and chief scientist at Meta, announced his departure after twelve years with the company to found his own startup. This decision is far more than a personal career choice by a single scientist. It marks a fundamental turning point in the global artificial intelligence industry and reveals the growing gap between short-term market interests and long-term scientific vision.
LeCun, who received the Turing Award in 2018 along with Geoffrey Hinton and Yoshua Bengio, is considered the architect of convolutional neural networks, which today form the foundation of modern image processing systems. His departure comes at a time when the entire industry is investing hundreds of billions of dollars in large language models, a technology LeCun has described for years as a fundamental dead end. With his new company, the now 65-year-old scientist intends to pursue what he calls Advanced Machine Intelligence, a radically different approach based on world models and starting with physical perception, not text.
The economic implications of this development are immense. Meta itself has invested over $600 billion in AI infrastructure in the past three years. OpenAI has reached a valuation of half a trillion dollars, despite annual revenue of only ten billion dollars. The entire industry has moved in a direction that one of its most important pioneers has now publicly described as a dead end. To understand the economic consequences of this shift, one must delve deeply into the technical, organizational, and financial structures of the current AI revolution.
The architecture of a bubble
The Transformer architecture, introduced by researchers at Google in 2017, has transformed the AI landscape at an unprecedented pace. This approach made it possible for the first time to efficiently process massive amounts of text and train language models with previously unattainable capabilities. OpenAI built upon this foundation with its GPT series, which, with ChatGPT in November 2022, demonstrated to a mass audience for the first time what these technologies could achieve. The response was explosive. Within a few months, tens of billions of dollars flowed into the sector.
However, since the end of 2024, there have been increasing signs that this exponential development is reaching its limits. OpenAI has been developing the successor to GPT-4, internally referred to as Orion or GPT-5, for over 18 months. The company has reportedly conducted at least two large training runs, each costing approximately $500 million. The results have been sobering. While GPT-4 represented a massive performance leap over GPT-3, Orion's improvements over GPT-4 are marginal. In some areas, particularly programming, the model shows virtually no progress.
This development fundamentally contradicts the scaling laws, those empirical principles that until recently guided the entire industry. The basic idea was simple: if you make a model larger, use more data for training, and invest more computing power, the performance increase follows a predictable power function. This principle seemed to hold true universally and justified the astronomical investments of recent years. Now it turns out that these curves are flattening. The next doubling of investment no longer yields the expected doubling of performance.
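To make the logic concrete, the sketch below evaluates a Chinchilla-style scaling law of the form L(N, D) = E + A/N^α + B/D^β. The coefficients are illustrative placeholders, not values estimated by any lab, but they show why each further doubling of parameters and data buys a smaller improvement than the last.

```python
# Minimal sketch of a Chinchilla-style scaling law: L(N, D) = E + A/N^alpha + B/D^beta.
# The coefficients are illustrative placeholders, not values estimated by any lab.
def expected_loss(params: float, tokens: float,
                  E: float = 1.7, A: float = 400.0, B: float = 410.0,
                  alpha: float = 0.34, beta: float = 0.28) -> float:
    """Predicted pre-training loss for a model of `params` parameters trained on `tokens` tokens."""
    return E + A / params ** alpha + B / tokens ** beta

# Each further doubling of model size and data yields a smaller absolute gain,
# because the loss approaches the irreducible term E.
previous = None
for i in range(6):
    n, d = 1e9 * 2 ** i, 1e10 * 2 ** i
    loss = expected_loss(n, d)
    gain = "" if previous is None else f"  (improvement: {previous - loss:.3f})"
    print(f"{2 ** i:>2}x scale -> predicted loss {loss:.3f}{gain}")
    previous = loss
```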
The reasons for this are numerous and technically complex. A key problem is the data wall. GPT-4 was trained with approximately 13 trillion tokens, which is essentially the entire publicly available internet. For GPT-5, there simply isn't enough new, high-quality data. OpenAI has responded by hiring software developers, mathematicians, and theoretical physicists to generate new data by writing code and solving mathematical problems. However, even if 1,000 people produced 5,000 words a day, it would take months to generate just one billion tokens. Scaling using human-generated data simply doesn't work.
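The order of magnitude behind that claim can be checked in a few lines; the tokens-per-word factor below is a rough assumption for English text.

```python
# Rough check of the human-data bottleneck described above.
writers = 1_000
words_per_day = 5_000
tokens_per_word = 1.3                         # rough assumption for English text

daily_tokens = writers * words_per_day * tokens_per_word      # ~6.5 million tokens/day
print(f"~{1e9 / daily_tokens:.0f} days for one billion tokens")
print(f"~{13e12 / daily_tokens / 365:.0f} years for a 13-trillion-token corpus")
```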
As an alternative, companies are increasingly relying on synthetic data, that is, data generated by other AI models. But here lurks a new danger: model collapse. When models are recursively trained on the output of other models, small errors amplify over generations. The result is models that become increasingly detached from reality and that lose the tails of the original data distribution first, so rare and minority content disappears disproportionately. A study published in Nature in 2024 showed that this process occurs surprisingly quickly. Synthetic data is therefore not a panacea, but carries significant risks.
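A toy simulation illustrates the mechanism, if not the scale, of that result: each generation fits a simple distribution to samples drawn from the previous generation's model, and the estimation error compounds. This is a deliberately simplified Gaussian stand-in, not a reproduction of the study.

```python
# Toy illustration of model collapse: each "generation" refits a Gaussian to a
# small sample drawn from the previous generation's model. Finite-sample
# estimation error compounds, the variance drifts toward zero, and rare
# regions of the original distribution are the first to vanish.
import random, statistics

random.seed(0)
mean, std = 0.0, 1.0                                           # generation 0: the real data
for gen in range(1, 31):
    samples = [random.gauss(mean, std) for _ in range(25)]     # "synthetic" training data
    mean, std = statistics.fmean(samples), statistics.pstdev(samples)
    if gen % 5 == 0:
        print(f"generation {gen:2d}: mean={mean:+.3f}  std={std:.3f}")
```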
The energy wall and the limits to growth
Besides the data barrier, there is a second, even more fundamental barrier: the energy barrier. Training GPT-3 consumed approximately 1,300 megawatt-hours of electricity, equivalent to the annual consumption of 130 American households. GPT-4 required an estimated 50 times that amount, or 65,000 megawatt-hours. The computing power required to train large AI models doubles roughly every 100 days. This exponential curve quickly leads to physical limitations.
Data centers that train and operate these models already consume as much electricity as small towns. The International Energy Agency predicts that data center electricity consumption will increase by 80 percent by 2026, from 20 terawatt-hours in 2022 to 36 terawatt-hours in 2026. AI is the primary driver of this growth. For comparison, a single ChatGPT query consumes about ten times more energy than a Google search. With billions of queries per day, this adds up to enormous amounts.
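A back-of-the-envelope calculation shows how quickly the per-query difference compounds; the per-query figures and the query volume below are rough, commonly cited estimates rather than measured values.

```python
# Back-of-the-envelope inference-energy arithmetic with rough, commonly cited
# per-query estimates (assumptions, not measured values).
wh_per_google_search = 0.3
wh_per_chatgpt_query = 3.0            # roughly ten times a search, as stated above
queries_per_day = 1_000_000_000       # assumed: one billion queries per day

gwh_per_day = queries_per_day * wh_per_chatgpt_query / 1e9
twh_per_year = gwh_per_day * 365 / 1_000
twh_if_search = twh_per_year * wh_per_google_search / wh_per_chatgpt_query
print(f"~{gwh_per_day:.1f} GWh/day, ~{twh_per_year:.1f} TWh/year at this load")
print(f"~{twh_if_search:.2f} TWh/year if each query only cost as much as a search")
```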
This development is forcing technology companies to take drastic measures. Microsoft has already signed contracts with nuclear energy providers. Meta, Amazon, and Google are investing a combined total of over $1.3 trillion in the coming years to build the necessary infrastructure. But these investments are running up against physical and political limits. The US simply doesn't have enough energy infrastructure to power the planned AI data centers. Analysts estimate that projects worth $750 billion could be delayed by 2030 due to energy infrastructure bottlenecks.
Added to this is the geopolitical dimension. The AI industry's energy demands intensify competition for resources and increase dependence on fossil fuels. While policymakers demand climate neutrality, the AI industry is driving up energy consumption. This tension will worsen in the coming years and may lead to regulatory interventions that limit the industry's growth.
The architectural wall and LeCun's alternative
The third barrier is perhaps the most fundamental: the architectural wall. Yann LeCun has argued for years that the Transformer architecture has inherent limitations that cannot be overcome simply by scaling. His critique focuses on the fundamental way Large Language Models work. These systems are trained to predict the next word in a sequence. They learn statistical patterns in massive text corpora, but they don't develop a true understanding of causality, physical laws, or long-term planning.
LeCun likes to illustrate the problem with a comparison: a four-year-old child has absorbed more information about the world through visual perception than the largest language models have through text. A child intuitively understands that objects don't simply disappear, that heavy things fall, and that actions have consequences. It has developed a world model, an internal representation of physical reality, which it uses to make predictions and plan actions. LLMs lack this fundamental ability. They can generate impressively coherent text, but they do not understand the world.
This limitation becomes apparent time and again in practical applications. If you ask GPT-4 to visualize a rotating cube, it fails at a task that any child can easily accomplish. With complex tasks requiring multi-step planning, the models regularly fail. They cannot reliably learn from errors because every token prediction error potentially cascades and amplifies itself. Autoregressive models have a fundamental fragility: an error early in the sequence can ruin the entire result.
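LeCun has often condensed this fragility into a simple back-of-the-envelope argument: if each generated token has a small, roughly independent chance of drifting outside the set of acceptable continuations, the probability that a long answer stays correct decays exponentially with its length. The independence assumption is crude, but the sketch below shows the shape of the argument.

```python
# Crude version of the error-compounding argument: assume each generated token
# independently "derails" with probability e. Independence is an oversimplification,
# but it shows why long autoregressive outputs are fragile.
e = 0.01                                   # assumed per-token error probability
for n in (10, 100, 1_000, 10_000):
    p_correct = (1 - e) ** n               # probability the whole sequence stays on track
    print(f"{n:>6} tokens: P(still correct) ≈ {p_correct:.4f}")
```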
LeCun's alternative is world models based on Joint Embedding Predictive Architecture. The basic idea is that AI systems should not learn through text prediction, but rather by predicting abstract representations of future states. Instead of generating pixel by pixel or token by token, the system learns a compressed, structured representation of the world and can use this to mentally simulate different scenarios before acting.
Under LeCun's leadership, Meta has already developed several implementations of this approach. I-JEPA for images and V-JEPA for videos show promising results. These models learn high-level representations of objects and their spatial relationships without relying on hand-crafted data augmentations. They are also significantly more energy-efficient to train than conventional models. The vision is to combine these approaches into hierarchical systems that can operate at different levels of abstraction and timescales.
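In rough outline, a JEPA-style training step can be sketched as follows: one encoder embeds the visible part of an input, a predictor tries to hit the embedding of the hidden part, and the loss is computed in representation space rather than in pixel or token space. The PyTorch sketch below is a schematic illustration of that idea, not Meta's actual I-JEPA or V-JEPA code; the toy "views", module sizes, and the handling of the target encoder are simplified assumptions.

```python
# Schematic sketch of a JEPA-style objective (not Meta's actual I-JEPA/V-JEPA code):
# predict the *representation* of a hidden part of the input from the visible part,
# instead of reconstructing pixels or predicting tokens.
import torch
import torch.nn as nn

dim, emb = 64, 32
context_encoder = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, emb))
target_encoder = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, emb))
predictor = nn.Sequential(nn.Linear(emb, 64), nn.ReLU(), nn.Linear(64, emb))
target_encoder.load_state_dict(context_encoder.state_dict())   # the real system keeps an EMA copy

opt = torch.optim.AdamW(
    list(context_encoder.parameters()) + list(predictor.parameters()), lr=1e-3
)

# Toy stand-in for two views of the same scene (e.g. visible vs. masked patches).
x = torch.randn(16, dim)
visible, hidden = x, x + 0.1 * torch.randn_like(x)

for step in range(3):
    with torch.no_grad():                                # no gradients into the target branch
        target = target_encoder(hidden)
    prediction = predictor(context_encoder(visible))
    loss = nn.functional.mse_loss(prediction, target)    # the loss lives in embedding space
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"step {step}: representation-prediction loss {loss.item():.4f}")
```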
The crucial difference lies in the nature of the learning process. While LLMs essentially perform pattern matching on steroids, world models aim to grasp the structure and causality of reality. A system with a robust world model could anticipate the consequences of its actions without actually having to carry them out. It could learn from a few examples because it understands the underlying principles, not just superficial correlations.
Organizational dysfunction and Meta's existential crisis
LeCun's departure, however, is not solely a scientific decision, but also the result of organizational dysfunction at Meta. In June 2025, CEO Mark Zuckerberg announced a massive restructuring of the AI divisions. He founded Meta Superintelligence Labs, a new unit with the stated goal of developing Artificial General Intelligence. It was headed by Alexandr Wang, the 28-year-old former CEO of Scale AI, a data preparation company. Meta invested $14.3 billion in Scale AI and recruited over 50 engineers and researchers from competitors.
This decision turned the existing structure upside down. LeCun's Fundamental AI Research Team, which had spent years developing PyTorch and the first Llama models, was marginalized. FAIR was geared towards fundamental research with a five- to ten-year time horizon, while the new superintelligence labs focused on short-term product development. Sources report increasing chaos in Meta's AI departments. Newly hired top talent expressed frustration with the bureaucracy of a large corporation, while established teams saw their influence waning.
The situation worsened through several restructurings within just six months. In August 2025, Superintelligence Labs was reorganized again, this time into four subunits: a mysterious TBD Lab for new models, a product team, an infrastructure team, and FAIR. Another wave of layoffs followed in October, affecting approximately 600 employees. The stated reason: reducing organizational complexity and accelerating AI development.
These constant restructurings stand in stark contrast to the relative stability of competitors like OpenAI, Google, and Anthropic. They point to a fundamental uncertainty at Meta regarding the right strategic direction. Zuckerberg has recognized that Meta is falling behind in the race for AI dominance. Llama 4, launched in April 2025, was a disappointment. While the Maverick model demonstrated good efficiency, it failed dramatically in longer contexts. Allegations surfaced that Meta optimized for benchmarks by specifically training models on common test questions, artificially inflating performance.
For LeCun, the situation became untenable. His vision of long-term fundamental research clashed with the pressure to deliver short-term product successes. The fact that he was effectively subordinate to the considerably younger Wang likely contributed to his decision. In his farewell memo, LeCun emphasizes that Meta will remain a partner in his new company, but the message is clear: the independent research he considers essential is no longer possible within the corporate structures.
From hype to reality: The looming reassessment of the AI industry
The economic anatomy of bubble formation
The developments at Meta are symptomatic of a broader economic dynamic in the AI industry. Since ChatGPT's breakthrough in late 2022, an unprecedented investment boom has unfolded. In the first quarter of 2025 alone, $73.1 billion flowed into AI startups, representing 58 percent of all venture capital investments. OpenAI reached a valuation of $500 billion, making it the first private company to cross this threshold without ever having been profitable.
The valuations are wildly disproportionate to actual revenues. OpenAI generated $10 billion in annual revenue in 2025 against a valuation of $500 billion, a price-to-sales ratio of 50. For comparison, even at the height of the dot-com bubble, few companies achieved such multiples. Anthropic is valued at $170 billion with revenues of $2.2 billion, a price-to-sales ratio of approximately 77. These figures indicate massive overvaluation.
Particularly problematic is the circular financing structure that has developed. Nvidia is investing $100 billion in OpenAI, which in turn is obligated to purchase tens of billions of dollars' worth of Nvidia chips. OpenAI made similar deals with AMD worth tens of billions of dollars. Microsoft has invested over $13 billion in OpenAI and hosts its infrastructure on Azure. Amazon invested $8 billion in Anthropic, which in return uses AWS as its primary cloud platform and employs Amazon's own AI chips.
These arrangements are eerily reminiscent of the circular financing of the late 1990s, when technology companies sold equipment to each other and booked the transactions as revenue without generating any real economic value. Analysts speak of an increasingly complex and opaque web of business relationships fueling a trillion-dollar boom. The parallels to the dot-com bubble and the 2008 financial crisis are unmistakable: opaque and unconventional financing mechanisms that are difficult for investors to understand and assess.
Added to this is the concentration of capital. The Magnificent Seven, the seven largest US technology companies, increased their energy consumption by 19 percent in 2023, while the median consumption of S&P 500 companies stagnated. Approximately 80 percent of stock market gains in the US in 2025 were attributable to AI-related companies. Nvidia alone became the most-bought stock by retail investors, who invested almost $30 billion in the chipmaker in 2024.
This extreme concentration carries systemic risks. If return expectations prove unrealistic, a market crash could have far-reaching consequences. JPMorgan estimates that AI-related investment-grade bond issuances alone could reach $1.5 trillion by 2030. Much of this debt is based on the assumption that AI systems will generate massive productivity gains. Should this expectation fail to materialize, a credit crisis looms.
Suitable for:
- Meta is betting everything on superintelligence: billion-dollar investments, mega data centers, and a risky AI race
The talent war and the social upheavals
The economic tensions are also manifesting themselves in the labor market. The ratio of open AI positions to qualified candidates is 3.2 to 1. There are 1.6 million open positions, but only 518,000 qualified applicants. This extreme shortage is driving salaries to astronomical heights. AI specialists can add tens of thousands of dollars to their annual income by acquiring skills in Python, TensorFlow, or specialized AI frameworks.
The competition is brutal. Large tech companies, well-funded startups, and even governments are vying for the same small group of experts. OpenAI has experienced an exodus of executives in recent months, including co-founder Ilya Sutskever and Chief Technology Officer Mira Murati. Many of these talented individuals are launching their own startups or moving to competitors. Meta is aggressively recruiting from OpenAI, Anthropic, and Google. Anthropic is recruiting from Meta and OpenAI.
This dynamic has several consequences. First, it fragments the research landscape. Instead of working toward common goals, small teams in different organizations compete for the same breakthroughs. Second, it drives up costs. The enormous salaries for AI specialists are only sustainable for well-capitalized companies, which excludes smaller players from the market. Third, it delays projects. Companies report that open positions remain unfilled for months, disrupting development timelines.
The societal implications extend far beyond the technology sector. If AI truly represents the next industrial revolution, then a massive upheaval of the labor market is imminent. Unlike the first industrial revolution, which primarily affected physical labor, AI targets cognitive tasks. Not only are simple data entry and customer service threatened, but potentially also highly skilled professions such as programmers, designers, lawyers, and journalists.
A study of the investment management industry predicts a five percent decline in the labor-based income share due to AI and big data. This is comparable to the shifts during the industrial revolution, which caused a decline of five to 15 percent. The crucial difference: the current transformation is taking place over years, not decades. Societies have little time to adapt.
Test-Time Compute and the Paradigm Shift
While scaling laws for pre-training are reaching their limits, a new paradigm has emerged: test-time compute scaling. OpenAI's o1 models demonstrated that significant performance gains are possible by investing more computing power during inference. Instead of simply increasing the model size, these systems allow the model to think about a query for longer, pursue multiple approaches to solving it, and self-verify its answers.
However, research shows that this paradigm also has limitations. Sequential scaling, in which a model iterates over the same problem multiple times, does not lead to continuous improvement. Studies on models like DeepSeek's R1 and QwQ show that longer thought processes do not automatically produce better results. Often, the model revises correct answers into incorrect ones rather than the other way around. The self-revision capacity needed for effective sequential scaling is not yet sufficiently developed.
Parallel scaling, where multiple solutions are generated simultaneously and the best one is selected, shows better results. However, here too, the marginal benefit decreases with each doubling of the invested computing power. Cost efficiency drops rapidly. For commercial applications that need to answer millions of queries per day, the costs are prohibitive.
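Conceptually, parallel scaling is little more than sampling several candidate answers and letting them vote. The sketch below uses a hypothetical generate_answer placeholder instead of a real model call; it shows both the gain from more samples and the diminishing returns per doubling.

```python
# Minimal sketch of parallel test-time scaling via self-consistency: sample N
# candidate answers and return the most common one. `generate_answer` is a
# hypothetical stand-in for a stochastic model call, not a real API.
from collections import Counter
import random

def generate_answer(question: str) -> str:
    # Toy "model" that returns the right answer 60% of the time.
    return "42" if random.random() < 0.6 else str(random.randint(0, 9))

def self_consistency(question: str, n_samples: int) -> str:
    votes = Counter(generate_answer(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

# Accuracy rises with N, but each doubling of sampled compute buys less than the last.
for n in (1, 4, 16, 64):
    trials = 500
    correct = sum(self_consistency("toy question", n) == "42" for _ in range(trials))
    print(f"N={n:<3} accuracy ≈ {correct / trials:.0%}")
```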
The real breakthrough could lie in combining different approaches. Hybrid architectures that combine Transformers with State Space Models promise to unite the strengths of both. State Space Models like Mamba offer linear scaling behavior in inference, while Transformers excel at capturing long-range dependencies. Such hybrid systems could rebalance the cost-quality equation.
Alternative architectures and the future beyond Transformers
Alongside world models, a number of alternative architectures are emerging that could challenge the dominance of Transformers. State-space models have made significant progress in recent years. S4, Mamba, and Hyena demonstrate that efficient long-context reasoning with linear complexity is possible. While Transformers scale quadratically with sequence length, SSMs achieve linear scaling in both training and inference.
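The difference in scaling behavior is easiest to see as a rough count of operations per layer; constants and architectural details are ignored below, so the numbers only illustrate the asymptotics.

```python
# Rough per-layer operation counts as a function of sequence length n:
# self-attention scales on the order of n^2 * d, an SSM/linear recurrence on
# the order of n * d. Constants are ignored; only the trend is meaningful.
d = 4_096                                         # assumed model width
for n in (1_000, 10_000, 100_000, 1_000_000):
    attention_ops = n * n * d
    ssm_ops = n * d
    print(f"n = {n:>9,}: attention/SSM cost ratio ≈ {attention_ops // ssm_ops:,}x")
```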
These efficiency gains could be crucial when AI systems are deployed in production environments. The cost of inference has often been underestimated. Training is a one-time investment, but inference runs continuously. ChatGPT is never offline. With billions of daily queries, even small efficiency improvements add up to massive cost savings. A model that requires half the computing power for the same quality has a tremendous competitive advantage.
The challenge lies in the maturation of these technologies. Transformers have a head start of almost eight years and a vast ecosystem of tools, libraries, and expertise. Alternative architectures must not only be technically superior but also practically usable. The history of technology is full of technically superior solutions that failed in the market because the ecosystem was lacking.
Interestingly, the Chinese competition is also relying on alternative approaches. DeepSeek V3, an open-source model with 671 billion parameters, uses a mixture-of-experts architecture in which only 37 billion parameters are activated per token. The model achieves comparable performance to Western competitors in benchmarks, but was trained at a fraction of the cost. The training time was only 2.788 million H800 GPU hours, significantly less than comparable models.
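The arithmetic behind that efficiency is simple: a router selects a few experts per token, so only a small fraction of the total parameters is involved in any single forward pass. The sketch below uses illustrative sizes, not DeepSeek V3's actual configuration.

```python
# Toy mixture-of-experts routing: a router picks the top-k experts per token,
# so only a fraction of all expert parameters is touched for any single token.
# Sizes are illustrative assumptions, not DeepSeek V3's real configuration.
import torch
import torch.nn as nn

d_model, n_experts, top_k = 256, 16, 2
experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_experts))
router = nn.Linear(d_model, n_experts)

tokens = torch.randn(8, d_model)                       # a batch of 8 token embeddings
weights, chosen = router(tokens).softmax(dim=-1).topk(top_k, dim=-1)

out = torch.zeros_like(tokens)
for i, token in enumerate(tokens):                     # route each token to its experts
    for w, e in zip(weights[i], chosen[i]):
        out[i] += w * experts[int(e)](token)           # weights left unnormalized for simplicity

total_params = sum(p.numel() for p in experts.parameters())
active_params = total_params * top_k // n_experts
print(f"expert parameters: {total_params:,} total, ~{active_params:,} active per token")
```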
This development shows that technological leadership doesn't necessarily reside with the most financially powerful players. Clever architectural decisions and optimizations can compensate for resource advantages. For the global AI landscape, this means increasing multipolarity. China, Europe, and other regions are developing their own approaches that are not simply copies of Western models.
The reassessment and the inevitable hangover
The convergence of all these factors suggests an impending reassessment of the AI industry. Current valuations are based on the assumption of continuous exponential growth, both in model performance and commercial adoption. Both assumptions are becoming increasingly questionable. Model performance is stagnating, while costs continue to skyrocket. Although commercial adoption is growing, monetization remains challenging.
OpenAI, with its half-trillion-dollar valuation, would need to grow to at least $100 billion in annual revenue and become profitable in the coming years to justify its valuation. That would mean a tenfold increase in just a few years. By comparison, it took Google over a decade to grow from $10 billion to $100 billion in revenue. Expectations for AI companies are unrealistically high.
Analysts are warning of a potential bursting of the AI bubble. The parallels to the dot-com bubble are obvious. Then, as now, there is revolutionary technology with enormous potential. Then, as now, there are irrationally inflated valuations and circular financing structures. Then, as now, investors justify absurd valuations by arguing that the technology will change everything and that traditional valuation metrics are no longer applicable.
The crucial difference: Unlike many dot-com companies, today's AI firms actually have working products with real value. ChatGPT isn't vaporware, but a technology used by millions of people daily. The question isn't whether AI is valuable, but whether it's valuable enough to justify current valuations. The answer is most likely no.
When the revaluation comes, it will be painful. Venture capital funds have invested 70 percent of their capital in AI. Pension funds and institutional investors are massively exposed. A significant drop in AI valuations would have far-reaching financial consequences. Companies that rely on cheap financing would suddenly struggle to raise capital. Projects would be halted, and staff would be laid off.
The long-term perspective and the way forward
Despite these bleak short-term prospects, the long-term potential of artificial intelligence remains immense. The current hype does not change the fundamental importance of the technology. The question is not whether, but how and when AI will deliver on its promise. LeCun's shift from short-term product development to long-term fundamental research points the way.
The next generation of AI systems will likely look different from today's LLMs. It will combine elements of world models, alternative architectures, and new training paradigms. It will rely less on brute-force scaling and more on efficient, structured representations. It will learn from the physical world, not just text. And it will understand causality, not just correlations.
This vision, however, requires time, patience, and the freedom to conduct fundamental research. These very conditions are hard to find in the current market environment. The pressure to deliver rapid commercial success is immense. Quarterly reports and evaluation rounds dominate the agenda. Long-term research programs, which may take years to produce results, are difficult to justify.
LeCun's decision to found a startup at 65 is a remarkable statement. He could have retired with full honors and a guaranteed place in history. Instead, he has chosen the rocky road of pursuing a vision rejected by the industry mainstream. Meta will remain a partner, meaning his company will have resources, at least initially. Real success, however, will depend on whether he can demonstrate in the coming years that Advanced Machine Intelligence is indeed superior.
The transformation will take years. Even if LeCun is right and world models are fundamentally superior, they still need to be developed, optimized, and industrialized. The ecosystem needs to be built. Developers need to learn how to use the new tools. Companies need to migrate from LLMs to the new systems. These transition phases have historically always been painful.
From hype to reality: The long-term course of action in AI
Yann LeCun's departure from Meta marks more than just a personnel change. It symbolizes the fundamental tension between scientific vision and commercial pragmatism, between long-term innovation and short-term market demands. The current AI revolution is at a turning point. The easy successes of scaling have been exhausted. The next steps will be more difficult, expensive, and uncertain.
For investors, this means that the exorbitant valuations of current AI champions need to be critically examined. For companies, it means that the hope for rapid productivity miracles through AI may be disappointed. For society, it means that the transformation will be slower and more uneven than the hype wave suggests.
At the same time, the foundation remains robust. AI is not a passing fad, but a fundamental technology that will transform virtually all sectors of the economy in the long term. The parallels to the industrial revolution are apt. As then, there will be winners and losers, excesses and corrections, upheavals and adjustments. The question is not whether the transformer architecture has reached the end of its capabilities, but what the next phase will look like and who will shape it.
LeCun's bet on advanced machine intelligence and world models is bold, but it could prove to be farsighted. In five years, we will know whether breaking away from the mainstream was the right decision or whether the industry has stayed the course. The coming years will be crucial for the long-term development of artificial intelligence and, consequently, for the economic and societal future.