
AI chip hype meets reality: The future of data centers – in-house development versus market saturation – Image: Xpert.Digital
Nvidia's monopoly is faltering: Tech giants ignite the next stage in the chip war - a billion-dollar poker game over AI chips
The big showdown in the data center: In-house development meets looming market saturation
The world of artificial intelligence is experiencing an unprecedented boom, driven by an almost insatiable demand for computing power. At the heart of this hype are AI chips, above all the GPUs from market leader Nvidia, which have become the gold of the digital age. But behind the scenes, a strategic shift is taking place that could reshape the power structure of the entire tech industry. The biggest buyers of these chips—hyperscalers like Microsoft, Google, and Amazon—no longer want to be mere customers. With investments worth billions, they are developing their own custom-designed semiconductors, such as Microsoft's Maia, Google's TPUs, and Amazon's Trainium.
The motivation is clear: cut costs, reduce dependence on individual vendors, and perfectly tailor the entire infrastructure, from the chip to the cooling system, to their own AI models. What begins as a pragmatic business decision to optimize performance is igniting fundamental competition and seriously challenging Nvidia's dominance for the first time. But while an arms race for the most powerful AI infrastructure rages, with hundreds of billions of dollars being invested, warnings of overheating are growing ever louder. Experts are drawing comparisons to previous speculative bubbles and warning of impending market saturation and overcapacity in the coming years.
This article delves deep into the AI chip hype and sheds light on the reality behind it: Why are the tech giants relying on in-house development? How far along are they really? And what happens when exponential demand suddenly collapses and the dream of endless AI growth collides with the harsh reality of an economic correction?
What motivates hyperscalers to develop their own chips?
The major cloud providers, also known as hyperscalers, face a fundamental strategic decision: Should they continue to rely on chips from established manufacturers like Nvidia and AMD, or should they increasingly shift to their own semiconductor developments? Microsoft CTO Kevin Scott recently brought this issue into focus when he stated that Microsoft intends to primarily rely on its own Maia chips in the long term. This strategy is not new – both Google with its TPUs and Amazon with its Trainium chips are already pursuing similar approaches.
The main reason for this development lies in cost optimization. For hyperscalers, the price-performance ratio is the decisive factor, as Scott emphasizes: “We are not dogmatic about the chips we use. This means that Nvidia has been the best price-performance solution for many years. We are open to all options that ensure we have sufficient capacity to meet demand.” This statement clarifies that this is not a fundamental rejection of established providers, but rather a pragmatic business decision.
Developing their own chips also allows hyperscalers to optimize their entire system architecture. Microsoft, for example, can use its Maia chips not only to adjust computing power but also to tailor cooling, networking, and other infrastructure elements specifically to its own requirements. Scott explains: “It’s about the entire system design. It’s the networks and the cooling, and you want the freedom to make the decisions you need to make to truly optimize computing for the workload.”
How far along are the various hyperscalers with their in-house developments?
The three major cloud providers are at different stages of developing their custom silicon strategies. Amazon Web Services is the pioneer in this area, having laid the foundation in 2018 with its first Graviton chip. AWS is now in its fourth generation of Graviton processors, designed for general-purpose compute workloads. In parallel, Amazon has developed specialized AI chips: Trainium for training and Inferentia for inferencing machine learning models.
The numbers speak for the success of this strategy: Over the last two years, Graviton processors accounted for more than 50 percent of all CPU capacity newly installed in AWS data centers. AWS also reports that more than 50,000 customers are using Graviton-based services. The practical deployment is particularly impressive: During Prime Day 2024, Amazon deployed a quarter of a million Graviton chips and 80,000 of its custom AI chips.
Google has taken a different approach with its Tensor Processing Units (TPUs), focusing early on AI-specific hardware. The TPUs are already in their seventh generation and are offered exclusively through Google Cloud. Google also recently unveiled its first Arm-based general-purpose processor, Axion, which the company claims offers up to 30 percent better performance than comparable Arm-based instances from other cloud providers.
Microsoft is the latecomer in this race. The company only unveiled its first in-house designed chips at the end of 2023: the Azure Maia AI Accelerator and the Azure Cobalt CPU. The Cobalt CPU has been generally available since October 2024 and is based on a 64-bit architecture with 128 cores, manufactured on a 5-nanometer process by TSMC. Microsoft claims that Cobalt delivers up to 40 percent better performance than previous Arm-based offerings in Azure.
Why can't in-house chips cover the entire demand?
Despite progress in in-house development, all hyperscalers are still far from meeting their entire demand with homegrown chips. The main reason lies in the sheer size of the market and the rapid increase in demand. Kevin Scott of Microsoft puts it bluntly: “To call it a massive shortage of computing capacity is probably an understatement. Since the launch of ChatGPT, it has been virtually impossible to scale capacity quickly enough.”
The figures illustrate the scale of the challenge: Global data center capacity is projected to increase by 50 percent by 2027, driven by AI demand. Large tech companies alone plan to invest over $300 billion in AI infrastructure in 2025. At this rate of growth, it is practically impossible to meet the entire demand through internal chip development.
Additionally, there are technical limitations in manufacturing. The most advanced chips are produced by only a few foundries, such as TSMC, and production capacity is limited. Microsoft, Google, and Amazon have to share this production capacity with other customers, which restricts the quantities available for their own chips. Another factor is development time: while demand is exploding, developing a new chip takes several years.
Hyperscalers are therefore pursuing a mixed strategy. They develop their own chips for specific workloads where they see the greatest advantage and supplement these with chips from Nvidia, AMD, and Intel for other use cases. Scott explains: “We’re not dogmatic about the names on the chips. It’s about the best price-performance ratio.”
What economic advantages do custom silicon solutions offer?
The economic incentives for developing in-house chips are substantial. Studies show that AWS Trainium and Google TPU v5e are 50 to 70 percent cheaper per token for large language models than high-end Nvidia H100 clusters. In some analyses, TPU implementations proved to be four to ten times more cost-effective than GPU solutions for training large language models.
These cost savings result from several factors. First, the chips can be precisely tailored to the specific requirements of the workloads, enabling efficiency gains. Second, the chip vendor's margin is eliminated, which leads to significant savings given the enormous volumes that hyperscalers deploy. Third, vertical integration allows for better control over the entire supply chain.
Amazon, for example, reports that SAP achieves a 35 percent performance increase in analytical workloads with Graviton-based EC2 instances. Google states that its TPU v5e delivers three times the inference throughput per dollar compared to the previous TPU generation through continuous batching. Microsoft claims that its Cobalt CPUs offer up to 1.5 times better performance in Java workloads and twice the performance in web servers.
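The per-token cost comparisons and throughput-per-dollar claims in this section rest on simple arithmetic: an hourly price divided by the number of tokens processed per hour. The following is a minimal sketch of that calculation, not a benchmark. The function names and all input figures are hypothetical placeholders chosen for illustration; they are not measured values for any Nvidia, AWS, or Google product.

```python
def cost_per_million_tokens(hourly_price_usd: float, tokens_per_second: float) -> float:
    """Cost in USD to process one million tokens at a sustained throughput."""
    if hourly_price_usd < 0 or tokens_per_second <= 0:
        raise ValueError("price must be >= 0 and throughput must be > 0")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_price_usd / tokens_per_hour * 1_000_000


def relative_saving(baseline_cost: float, alternative_cost: float) -> float:
    """Fraction by which the alternative is cheaper than the baseline."""
    if baseline_cost <= 0:
        raise ValueError("baseline cost must be > 0")
    return 1 - alternative_cost / baseline_cost


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (assumptions, not vendor data).
    gpu_cost = cost_per_million_tokens(hourly_price_usd=4.00, tokens_per_second=2000)
    custom_cost = cost_per_million_tokens(hourly_price_usd=1.50, tokens_per_second=1500)

    print(f"GPU instance:    ${gpu_cost:.4f} per million tokens")
    print(f"Custom silicon:  ${custom_cost:.4f} per million tokens")
    print(f"Relative saving: {relative_saving(gpu_cost, custom_cost):.0%}")
```

The sketch shows why a chip with lower raw throughput can still win on cost: what matters is the ratio of price to throughput, not either number alone. Real comparisons would also need to account for utilization, energy, networking, and software porting effort, which this sketch ignores.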
The long-term financial implications are considerable. With investments totaling hundreds of billions of dollars, even small efficiency improvements can lead to enormous cost savings. Experts estimate that the market for custom silicon in cloud environments could reach a volume of $60 billion by 2035.
How is the competitive situation developing in the chip market?
The increasing in-house development of hyperscalers is fundamentally changing the traditional chip industry. Nvidia, long the undisputed market leader in AI accelerators, is facing serious competition for the first time. Analysts at Kearney predict that hyperscaler-developed silicon such as Google's TPU, AWS Trainium, and Microsoft's Maia could capture a market share of 15 to 20 percent through internal deployments.
This development is forcing traditional chip manufacturers to reposition themselves. AMD, for example, is attempting to directly challenge Nvidia with its MI300 series while simultaneously offering strengthened partnerships with cloud providers. Intel, although less strongly positioned in AI chips, continues to benefit from custom Xeon processors for hyperscalers, as demonstrated by the R8i instances recently announced by AWS.
The competitive dynamics are further intensified by the differing strategies of the hyperscalers. While Google does not sell its TPUs as hardware and offers them only through Google Cloud, other providers could market their chips externally in the future. This diversification of providers leads to healthier competition and can accelerate innovation cycles.
Another important aspect is the geopolitical dimension. Given the tensions between the US and China, American hyperscalers are increasingly investing in their own chip manufacturing capabilities to become less dependent on Asian suppliers. At the same time, Chinese companies like Baidu with its Kunlun chips are emerging as domestic champions.
AI boom vs. chip shortage: When does the data center bubble become a threat?
What does the current demand trend mean for the market?
The demand for computing power, especially for AI applications, is currently showing exponential growth. Nvidia estimates that responses from reasoning models require more than 100 times more computing resources than previous generations. This development is leading to a structural shortage of advanced chips and data center capacity.
McKinsey's analysis shows that global demand for data center capacity could triple by 2030, with an annual growth rate of approximately 22 percent. In the US, demand could even grow by 20 to 25 percent annually. Around 70 percent of this projected 2030 demand will come from hyperscalers.
This surge in demand is leading to a paradigm shift in the industry. Synergy Research Group predicts that hyperscalers will control 61 percent of global data center capacity by 2030, up from 44 percent today. At the same time, the share of on-premises data centers is expected to decline from 34 percent today to 22 percent by 2030.
High demand is also leading to bottlenecks throughout the supply chain. High-bandwidth memory, advanced packaging technologies like CoWoS, and specialized substrates have been sold out for months. Nvidia, for example, reports that its next-generation Blackwell GPUs are already sold out for a year or more.
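The growth figures quoted in this section (a tripling of demand, annual rates of 20 to 25 percent) can be sanity-checked with compound-growth arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the base year of the forecast is not stated in this article, so the time spans passed in are assumptions.

```python
import math


def growth_multiple(annual_rate: float, years: float) -> float:
    """Factor by which demand grows after compounding for the given years."""
    return (1 + annual_rate) ** years


def years_to_multiple(annual_rate: float, multiple: float) -> float:
    """Years of compounding needed to reach the given multiple."""
    if annual_rate <= 0 or multiple <= 0:
        raise ValueError("rate and multiple must be positive")
    return math.log(multiple) / math.log(1 + annual_rate)


def implied_annual_rate(multiple: float, years: float) -> float:
    """Constant annual rate that produces the given multiple over the years."""
    if multiple <= 0 or years <= 0:
        raise ValueError("multiple and years must be positive")
    return multiple ** (1 / years) - 1


if __name__ == "__main__":
    # Rates taken from the text; the horizons are assumptions for illustration.
    for rate in (0.20, 0.22, 0.25):
        print(f"{rate:.0%} per year: tripling takes "
              f"{years_to_multiple(rate, 3):.1f} years")
    print(f"22% per year over 5 years: x{growth_multiple(0.22, 5):.2f}")
    print(f"Tripling in 5 years implies {implied_annual_rate(3, 5):.1%} per year")
```

At 22 percent a year, demand triples in roughly five and a half years, so the two figures are broadly consistent with each other. The same functions show how sensitive the outcome is to the rate: small changes in the assumed annual growth shift the tripling point by a year or more.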
When might overcapacity occur?
The question of potential overcapacity in data centers is highly controversial. Various experts are already warning of an AI bubble that could be larger than the dot-com bubble of the 1990s. The MacroStrategy Partnership, an independent research firm, claims that the current AI bubble is 17 times larger than the dot-com bubble and four times larger than the 2008 housing bubble.
Goldman Sachs CEO David Solomon warned of a stock market drawdown in the coming years due to the enormous sums of money flowing into AI projects. He explained: “I think a lot of capital is being deployed that will prove unprofitable, and when that happens, people aren’t going to feel good.” Amazon founder Jeff Bezos said at the same conference that there is a bubble in the AI industry.
Warning signs are mounting: Julien Garran of MacroStrategy Partnership points out that the adoption of large language models by companies has already begun to decline. He also argues that ChatGPT may have “hit a wall,” as the latest version costs ten times more but doesn't perform noticeably better than previous versions.
On the other hand, recent market data shows that demand continues to exceed supply. CBRE reports that vacancy rates in primary data center markets in North America fell to a record low of 2.8 percent at the beginning of 2024. This occurred despite the largest annual increase in data center supply, suggesting that the fundamentals remain strong.
What timeframes are realistic for a potential market consolidation?
Accurately predicting the timing of a potential market consolidation is extremely difficult, as it depends on many unknown factors. However, analysts have identified several key periods in which market dynamics could change.
The first critical period lies between 2026 and 2027. Several factors suggest that growth rates could slow during this time. Hyperscalers are already planning a 20 to 30 percent reduction in their investments for 2026, indicating a degree of market saturation or reassessment.
The semiconductor industry expects demand for AI chips to reach an initial plateau between 2026 and 2027. The annual growth rate for wafers could normalize from the current 14 to 17 percent to around 4 percent. This would represent a significant turning point in capacity planning.
A second critical period lies around 2028 to 2030. By this time, the first generation of large-scale AI infrastructure investments will need to show a return on investment. If not enough profitable use cases have emerged by then, a correction could occur. McKinsey predicts that demand for data center capacity will triple by 2030, but these forecasts are based on assumptions about AI adoption that could prove overly optimistic.
The crucial factor will be whether AI applications prove to be sustainably profitable. Dario Perkins of TS Lombard warns that technology companies are taking on massive debt to build AI data centers without regard for returns, driven by competition. This situation is reminiscent of past bubbles and could lead to a correction if returns fail to meet expectations.
What would be the effects of overcapacity?
Overcapacity in data centers would have far-reaching consequences for the entire technology industry. Initially, it would lead to a drastic drop in cloud service prices. While this would be beneficial for customers in the short term, it could significantly impact the profitability of hyperscalers and lead to market consolidation.
The impact on employment would be significant. As early as 2025, more than 250,000 workers in the technology sector were expected to face layoffs, and a market correction would exacerbate these trends. Data center operations, chip development, and related areas would be particularly affected.
For the semiconductor industry, overcapacity would be particularly painful. The enormous investments in manufacturing capacity for advanced chips could prove to be excessive. Samsung already reported a 39 percent drop in profits in the second quarter of 2025 due to weaker demand for AI chips, which could be a harbinger of things to come.
Market consolidation would likely lead to a concentration of power among the strongest providers. Smaller cloud providers and data center operators could be acquired by larger companies or forced out of the market. In the long run, this could lead to less competition and higher prices.
On the other hand, a correction could also have positive effects. It would eliminate inefficient capacities and redirect resources to more productive uses. The surviving companies would likely be stronger and more sustainably positioned. Furthermore, consolidation could promote the development of standards and interoperability.
How are companies preparing for different scenarios?
Given the uncertainty surrounding future market developments, hyperscalers and other companies are pursuing various strategies to minimize risk. The most important is diversifying their chip strategies. As Microsoft CTO Kevin Scott emphasizes, they remain “open to all options” to ensure sufficient capacity is available.
Microsoft not only develops its own chips but also continues to invest in partnerships with Nvidia, AMD, and other suppliers. This multi-vendor strategy reduces the risk of dependence on a single supplier and allows it to react quickly to market changes. Amazon and Google pursue similar approaches, although they each have different priorities.
Another important aspect is geographic diversification. Given local opposition to new facilities (so-called NIMBY resistance) in established markets like Northern Virginia, hyperscalers are increasingly shifting their investments to secondary markets and overseas. This reduces not only costs but also regulatory risks.
Hyperscalers are also increasingly investing in energy efficiency and sustainable technologies. With data center energy consumption potentially doubling by 2028, this is both an economic and a regulatory necessity. Liquid cooling, more efficient chips, and renewable energy sources are becoming standard features.
Finally, many companies are developing more flexible business models. Instead of relying solely on their own facilities, they are increasingly using hybrid models with colocation providers and other partners. This allows them to scale capacity up or down more quickly, depending on market conditions.
What role do regulatory factors play?
Regulatory developments could play a crucial role in the future development of the data center market. In the US, there is growing support for stricter regulation of data center energy consumption. Some states are already considering moratoria for new large-scale consumers or stricter auditing procedures.
Environmental impacts are increasingly coming into focus. Data centers could account for 20 percent of global energy consumption by 2028, which could lead to stricter environmental regulations. In Europe, the Climate Neutral Data Centre Pact, a self-regulatory industry initiative aligned with the EU's climate goals, has already been joined by over 40 data center operators.
Geopolitical tensions also affect the industry. Potential tariffs on semiconductors could increase chip costs and disrupt supply chains. This could force hyperscalers to rethink their procurement strategies and rely more heavily on regional suppliers.
Data privacy and data sovereignty are also becoming important factors. Various countries require that certain data be processed locally, which limits the global scaling of data centers. This could lead to market fragmentation and reduce efficiency gains through economies of scale.
Regulation could also provide positive impetus. Investments in sustainable technologies and renewable energies are often subsidized by governments. Furthermore, regulatory requirements could drive forward standards that, in the long term, increase the efficiency of the entire industry.
Navigating between growth and risk
The data center industry is at a critical turning point. The development of proprietary chips by hyperscalers like Microsoft, Google, and Amazon is a logical response to skyrocketing costs and the limited availability of off-the-shelf solutions. This strategy offers significant economic advantages and enables greater control over the entire infrastructure.
At the same time, the risks of overcapacity are real and could lead to a significant market correction between 2026 and 2030. Warning signs are mounting, ranging from the slowing adoption of AI technologies to warnings from prominent industry representatives about a bubble. A potential consolidation would present both opportunities and challenges.
The future of the industry will hinge on whether the enormous investments in AI infrastructure prove to be sustainably profitable. Hyperscalers are preparing for various scenarios through diversification, geographic spread, and flexible business models. Regulatory developments, particularly in the environmental and energy sectors, will add further complexity.
For companies and investors, this means they must keep a close eye on both the enormous growth opportunities and the considerable risks. The winners will be those who can react flexibly to market changes while continuously increasing the efficiency of their operations. The next few years will show whether the current expansion rests on solid foundations or whether the warnings of a bubble prove true.

