China & DeepSeek | Artificial Intelligence: How a new architecture is shaking up the chip market
Published on: January 11, 2026 / Updated on: January 11, 2026 – Author: Konrad Wolfenstein

The boomerang effect: How US sanctions enabled China's AI breakthrough
$294,000 instead of $100 million: The truth about DeepSeek's price war
The latest release from Chinese AI firm DeepSeek raises fundamental questions about the future of artificial intelligence. At the end of December 2025, the company presented a new training method (called Manifold-Constrained Hyper-Connections) that has the potential to reshape the entire industry. While Western tech giants are investing hundreds of billions of dollars in massive data centers and specialized chips, DeepSeek is demonstrating an alternative path based on architectural sophistication rather than sheer capital investment. This development could shake the economic foundations of the AI industry and usher in a transformation where success or failure is determined not by the mere availability of resources, but by engineering expertise.
The Chinese approach didn't arise from choice, but from necessity. Export restrictions imposed by the United States prevented Chinese companies from accessing Nvidia's most powerful AI chips. What initially appeared to be a strategic disadvantage became an accelerator for alternative development paths. DeepSeek had to achieve maximum performance with limited hardware, creating methods that now challenge the cost structure of the entire industry. The January 2025 release of the R1 model, which rivaled top-of-the-line American models but was developed at a fraction of the cost, sent shockwaves through the stock markets and forced analysts worldwide to rethink their valuation models.
From hyper-connections to mathematical stability
The technical basis of the new DeepSeek method lies in how information is routed between the layers of a neural network. Traditional neural networks use so-called residual connections, a kind of "shortcut" that passes a layer's input on unchanged and adds it to that layer's output. These bridges make it possible to train deeper networks by preventing learning signals from fading along the way. So-called "hyper-connections" extend this concept by widening the information flow into several parallel streams and letting the network learn how to mix them between layers. This leads to performance improvements, but has a crucial drawback: the additional complexity compromises stability, as information is no longer passed through as reliably as with classical connections.
With traditional shortcuts, information remains largely unchanged as it travels through the network, resulting in stable training. The new hyper-connections sacrifice this characteristic for greater learning capability, but this leads to significant fluctuations when training large models. DeepSeek observed in experiments that the training error unexpectedly rose after approximately 12,000 training steps, a clear sign of instability. The gradients, the signals that steer the learning process, behaved erratically, making scaling up to more powerful models virtually impossible. At the same time, the wider connections increased data traffic, as more information had to be moved between memory and the processor.
DeepSeek's solution projects these complex connections into a controlled mathematical space (a "manifold") with fixed rules. This mathematical trick restores stability while preserving the benefits of richer information exchange. This space is defined by special matrices where the values balance out to maintain overall stability. While this constraint may sound technical, it has far-reaching practical consequences: it guarantees that signals are neither lost nor grow uncontrollably as they flow through the network.
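The balancing idea can be illustrated with a small sketch. It assumes that the "special matrices" are doubly stochastic, meaning every row and every column sums to 1, and that they are obtained by alternately normalizing rows and columns (Sinkhorn-style normalization). Both are assumptions made for illustration; the function names `sinkhorn` and `mix` and the toy values are hypothetical and are not DeepSeek's code.

```python
import math
import random


def sinkhorn(matrix, iterations=50):
    """Alternately normalize rows and columns so that both sum to about 1."""
    n = len(matrix)
    m = [[math.exp(v) for v in row] for row in matrix]  # make all entries positive
    for _ in range(iterations):
        row_totals = [sum(row) for row in m]
        m = [[m[i][j] / row_totals[i] for j in range(n)] for i in range(n)]
        col_totals = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_totals[j] for j in range(n)] for i in range(n)]
    return m


def mix(matrix, streams):
    """Mix parallel streams: out[i] = sum over j of matrix[i][j] * streams[j]."""
    return [sum(w * s for w, s in zip(row, streams)) for row in matrix]


random.seed(0)
raw = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
balanced = sinkhorn(raw)
unconstrained = [[math.exp(v) for v in row] for row in raw]

streams = [1.0, 2.0, 0.5, 3.0]  # toy signal carried by four parallel streams
stable, wild = streams, streams
for _ in range(100):  # pass the signal through 100 mixing steps
    stable = mix(balanced, stable)
    wild = mix(unconstrained, wild)

row_sums = [sum(row) for row in balanced]
col_sums = [sum(balanced[i][j] for i in range(4)) for j in range(4)]

print("row sums:", [round(s, 6) for s in row_sums])
print("column sums:", [round(s, 6) for s in col_sums])
print("total signal, balanced:", round(sum(stable), 6))
print("largest value, unconstrained:", max(wild))
```

Because the columns of the balanced matrix sum to 1, the total signal is preserved at every step, and because the rows sum to 1 with non-negative entries, no stream can exceed the largest input value. The unconstrained matrix has neither property, so the same signal grows without bound over 100 steps.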
Practical trials with a model of 27 billion parameters confirmed its effectiveness. Both the unconstrained and the stabilized hyper-connections outperformed the baseline model, but the stabilized version consistently achieved the best results. Training stability improved dramatically. While the model with unconstrained hyper-connections became markedly unstable after 12,000 steps, training with the new method proceeded smoothly and closely followed the behavior of the stable baseline model. The learning signals remained within the normal range throughout the entire process, indicating a fundamental solution to the stability problem.
The performance gains don't come without a price, but the cost is surprisingly moderate. The method increases the computational effort by about 6.7 percent compared to the standard. This modest additional effort is negligible compared to the massive performance improvements, making the method one of the most efficient strategies in current research. DeepSeek also implemented rigorous infrastructure optimizations to reduce the load on data transmission paths. These optimizations are crucial because, with large models, the bottleneck is often not the computing power itself, but rather the speed of data transfer between memory and the processor.
The economic reality behind the headlines
The public discussion surrounding DeepSeek's costs was fraught with misunderstandings from the outset. When the company unveiled its R1 model in January 2025, figures circulating suggested training costs of less than six million dollars for the V3 base model. This was often compared to the estimated one hundred million dollars for OpenAI's GPT-4, creating the impression that DeepSeek had achieved a cost advantage of roughly seventeen-fold. In September 2025, DeepSeek published a paper in the journal Nature stating that the training costs for R1 were only $294,000. This figure once again dominated the media coverage and reinforced the perception of a fundamental cost advantage.
A closer analysis, however, reveals a more complex picture. The $294,000 refers exclusively to the so-called post-training phase, in which an already intelligent model is refined through practice and feedback. The actual total costs exceed $5.87 million for computing time alone, in addition to hardware investments of approximately $51 million. These figures still do not include the costs for research, data preparation, personnel, and failed experiments. When these factors are taken into account, the actual development costs are in a range that, while lower than comparable figures in the West, does not reach the dramatic magnitude of the often-cited numbers.
The cost structure of AI development is inherently difficult to grasp. OpenAI has never published precise figures for GPT-4. The often-cited estimate of $100 million goes back to Sam Altman, who in 2023 indicated that training the base model had cost more than that. Estimates for newer models like GPT-4o suggest that costs have decreased considerably thanks to techniques such as specialized expert networks, more efficient training methods, and optimized infrastructure. Some analyses put the training costs for GPT-4o at between $5 and $16 million, which would mean that the cost difference to DeepSeek is considerably smaller than publicly perceived.
Nevertheless, DeepSeek's achievement remains remarkable. The company trained its V3 model with nearly 2.8 million GPU hours on 2,048 H800 chips over a two-month period. The H800 is a throttled version of Nvidia's H100 for the Chinese market, with its data transfer rate drastically reduced to comply with US export regulations. These chips are significantly less powerful than the originals used in Western data centers or the even newer Blackwell processors. The fact that DeepSeek was able to develop competitive models with this limited hardware is the real breakthrough.
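The figures in this paragraph can be checked against each other with simple arithmetic. The sketch below uses only the numbers quoted above; 2.8 million is the rounded value from the text, so the result is an approximation.

```python
gpu_hours = 2_800_000  # "nearly 2.8 million GPU hours", rounded as in the text
gpus = 2_048           # number of H800 chips

hours_per_gpu = gpu_hours / gpus
days = hours_per_gpu / 24

print(f"{hours_per_gpu:.0f} hours per GPU, about {days:.0f} days")
# prints: 1367 hours per GPU, about 57 days
```

Roughly 57 days of continuous operation is consistent with the stated two-month training period.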
The "mixture-of-experts" architecture plays a central role. DeepSeek V3 has a total of 671 billion parameters, but activates only 37 billion of them for each token it processes. This means that only a fraction of the model is actually working on each query. The model consists of many specialized "experts" and a shared knowledge pool, with only a few specialists selected for each step. This design makes it possible to massively increase the model's knowledge without proportionally increasing computational costs. Each expert can specialize in specific topics, resulting in better performance and greater efficiency.
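A minimal sketch of the routing idea: a router scores all experts, only the top-scoring few are run, and a shared expert always contributes. This is a generic top-k softmax router written for illustration, not DeepSeek's actual gating function; the names `route` and `moe_layer`, the toy experts, and the scores are hypothetical.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(scores, top_k):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    probs = softmax(scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(token, experts, shared_expert, router_scores, top_k=2):
    """Shared expert always runs; only top_k routed experts are evaluated."""
    out = shared_expert(token)
    for idx, weight in route(router_scores, top_k):
        out += weight * experts[idx](token)
    return out


# Toy setup: the "token" is a single number, each expert just scales it.
experts = [lambda x, k=k: k * x for k in range(4)]
shared = lambda x: x
scores = [0.1, 2.0, -1.0, 1.5]  # hypothetical router scores for one token

picked = route(scores, top_k=2)
result = moe_layer(2.0, experts, shared, scores, top_k=2)
active_share = 37 / 671  # active vs. total parameters, figures from the text

print("experts used:", [i for i, _ in picked])
print(f"share of parameters active per token: {active_share:.1%}")
```

With the figures from the text, about 5.5 percent of the parameters are active for any single token, which is why total model size and per-token computing cost can be decoupled.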
The challenge with this expert approach lies in load balancing. If some experts are constantly in demand while others remain idle, efficiency problems arise. Traditional approaches use so-called "penalty functions" that force the model to utilize all experts equally. However, this method often leads to poorer answers, as the best expert is not always selected. DeepSeek implemented a clever load-balancing strategy without such artificial penalties, ensuring a balanced expert utilization without compromising quality. This innovation was crucial for the successful scaling of the model.
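How balancing without a penalty term might work can be sketched as follows. The article does not describe the mechanism, so this is an assumption: each expert gets a bias that is used only when selecting experts, and after every batch the bias of overloaded experts is lowered slightly while that of underused experts is raised. All names, the skewed scores, and the step size are hypothetical.

```python
import random


def pick_experts(scores, bias, top_k):
    """Select top_k experts by score + bias; the bias only influences selection."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i] + bias[i], reverse=True)
    return ranked[:top_k]


def simulate(use_bias, num_experts=8, top_k=2, tokens_per_batch=256,
             batches=100, step=0.01, seed=0):
    """Return the per-expert load of the final batch."""
    rng = random.Random(seed)
    # Hypothetical skew: lower-numbered experts score systematically higher.
    skew = [0.5 - 0.1 * i for i in range(num_experts)]
    bias = [0.0] * num_experts
    target = tokens_per_batch * top_k / num_experts  # ideal load per expert
    load = [0] * num_experts
    for _ in range(batches):
        load = [0] * num_experts
        for _ in range(tokens_per_batch):
            scores = [skew[i] + rng.gauss(0, 0.3) for i in range(num_experts)]
            for i in pick_experts(scores, bias, top_k):
                load[i] += 1
        if use_bias:
            # Nudge overloaded experts down and underused experts up.
            for i in range(num_experts):
                bias[i] += -step if load[i] > target else step
    return load


unbalanced = simulate(use_bias=False)
balanced = simulate(use_bias=True)

print("without bias:", unbalanced)
print("with bias:   ", balanced)
```

In this toy setup the gap between the busiest and the idlest expert shrinks markedly once the bias adjustment is switched on, without any extra term being added to the training objective.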
China's strategic imperative to innovate
The development of DeepSeek cannot be understood in isolation from the geopolitical context. In October 2022, the United States dramatically tightened its export controls on AI chips and manufacturing equipment to China. These measures aimed to limit China's ability to develop advanced AI systems and their military applications. Nvidia was forced to develop chips specifically modified for the Chinese market. The A800 and H800 emerged as scaled-down versions of the top-of-the-line models, with reduced speeds just enough to comply with the US export restrictions.
In 2023, the US tightened controls again, blocking even these interim solutions. At the same time, export restrictions were imposed on high-performance memory, a critical component of modern AI chips. These measures forced Chinese companies to develop alternatives or resort to older, less efficient hardware. Huawei, once a global powerhouse in telecommunications, was effectively cut off from access to Western chip technology and forced to develop its own solutions. While Huawei's Ascend processors achieve only a fraction of the performance per chip compared to Nvidia, they can partially compensate for this through sheer volume.
Production figures illustrate the challenge. Huawei is expected to produce around 200,000 AI chips in 2025, while China has been able to legally import roughly one million modified Nvidia chips during the same period. Furthermore, the performance gap is widening. Analyses show that the best American chips are currently about five times more powerful than Huawei's best offerings, and this gap is expected to increase dramatically by 2027. Even if Huawei were to massively increase its production, the company would still not come close to matching the computing power that Nvidia delivers worldwide by 2027.
These restrictions forced Chinese developers to become radically efficient. DeepSeek founder Liang Wenfeng recognized this need early on and, as early as 2021, before the tightening of controls, purchased ten thousand Nvidia A100 GPUs. This forward-thinking investment gave DeepSeek a crucial advantage over competitors who later only had access to inferior hardware. The former hedge fund manager applied the same strategic foresight that had made him successful in the financial sector. His fund, High-Flyer, managed billions and was among the most technologically advanced financial companies in China.
The founding of DeepSeek in July 2023 was more than just an experiment. Liang saw the development of artificial general intelligence as the key technology project of the century and wanted to position China at the forefront of it. In an interview, he explained that young AI startups were well-positioned to compete with established corporations because the market was undergoing a fundamental transformation. The decisive factor, he argued, was not following old rules, but rather the ability to adapt flexibly and respond to change.
This philosophy was reflected in DeepSeek's development approach. From the outset, the company focused on achieving maximum results with limited resources. While Western companies like OpenAI and Anthropic invested billions in ever-larger models and massive data centers, DeepSeek optimized architecture, training, and application for efficiency. The R1 model impressively demonstrated this strategy. It achieved results on mathematical tasks comparable to the best US models, but with an architecture that consumed significantly less computing power per answer.
The end of AI dominance: How a startup is thwarting the plans of Nvidia and OpenAI
Systemic disruptions and market reactions
The release of DeepSeek R1 in January 2025 sent shockwaves far beyond technical circles. The stock market reacted with losses for companies that had invested heavily in AI infrastructure. Nvidia, whose value was largely based on the assumption that demand for its expensive chips would continue to explode, lost value within days. Investors questioned whether the announced expenditure of hundreds of billions of dollars was even necessary if a Chinese startup could achieve comparable results with a fraction of that amount.
The reaction from the Chinese tech giants was immediate and decisive. ByteDance, Tencent, Baidu, and Alibaba drastically reduced the prices of their AI services. ByteDance's Doubao model became almost 99 percent cheaper year-over-year. These price cuts led to a massive surge in usage. Daily queries jumped from 120 billion to over 500 billion within a few months. The overall market for AI services in China was valued at relatively small sums, suggesting extremely low margins given the enormous volume of usage.
These figures illustrate a problem: Competition is shifting from the quality of AI to infrastructure efficiency and price. Alibaba Cloud, the market leader in China, nevertheless announced billions in investments in AI infrastructure. ByteDance is also planning massive chip purchases. Tencent, which lagged somewhat behind in chip procurement, is compensating for this through leased computing capacity and the use of DeepSeek's efficient technology.
The market consolidation is accelerating. Experts predict that the field of Chinese AI providers will narrow to a few major players. The winners will be those who make their technology the standard by combining performance with practical applications. This process mirrors developments in other technology sectors, where a period of rapid innovation is followed by consolidation, with only companies possessing the best combination of technology, scale, and market power surviving.
A similar trend is unfolding in the West. OpenAI's dominance is measurably waning. ChatGPT's market share has fallen significantly, while Google Gemini has gained ground. This shift is more than just a statistical fluctuation. It signals that the advantage of being "first to market" is diminishing, while competitors with established platforms are catching up. Google can integrate its AI directly into Search and Android, which represents a structural advantage over a pure AI provider.
Pricing reflects this dynamic. Western providers like Anthropic and OpenAI have also lowered their prices and introduced more efficient model variants. The price per million processed tokens has fallen dramatically in the last two years. This development suggests that AI is becoming a mass-market commodity. Once several providers offer similar quality, price will become the decisive factor, reducing profits and making scale even more important.
Limits of the Reasoning Revolution
Parallel to the increase in efficiency, a development took place that initially seemed like the next major breakthrough. So-called "reasoning models," which take more time to think about problems and explicitly work through their steps, achieved spectacular results. OpenAI's o1, DeepSeek's R1, and similar models demonstrated impressive capabilities in mathematics and programming. The idea is simple: if you give the model more time to "think" and allow it to formulate the solution path, the answers should improve.
However, in June 2025, Apple published a study that revealed limitations. Researchers tested state-of-the-art models with logic puzzles whose difficulty could be precisely controlled. The results were sobering: The models exhibited contradictory behavior. Their reasoning effort initially increased with complexity, but beyond a certain point it decreased again, even though the models still had computing budget available, and their solutions became incorrect.
The study identified three phases. For simple problems, normal language models were often better and more economical than the "thinking" models. For moderately difficult problems, the thinking processes offered clear advantages. However, for highly complex problems, both types of models completely broke down. They not only failed by a narrow margin, but were incapable of finding even remotely correct solutions.
What was particularly worrying was that even providing the correct solution formula hardly helped. The models still failed at similar levels of difficulty. This suggests that the problems run deeper: The models struggle to strictly execute logical steps and to check their own reasoning.
The analysis of the "thought protocols" revealed patterns. For simple problems, the models found the solution early on, but then continued to delve into unnecessary details. With high complexity, they often got lost on the wrong path. Beyond a certain level of difficulty, they were no longer able to generate correct approaches at all. They often fixated on early, incorrect ideas and wasted their computing time justifying them instead of correcting the error.
Another study warned that the improvement of these models could soon stagnate. While they achieve better scores in tests due to massive computational effort, this makes them slow and expensive. The economic consequences are significant: "Thinking" models cost many times more to operate than standard versions. If these models fail to deliver the expected breakthroughs and reach their limits, the question arises whether the high investments are justified. The finding that simpler models are often more efficient suggests that in the future, it will be necessary to choose more precisely which tool is best suited to which task.
Infrastructure race and energy hunger
Despite more efficient software, the industry's resource consumption is increasing. Forecasts indicate that data center electricity demand will rise dramatically by the end of the decade. The share of AI applications in global data center electricity consumption could double. Gigantic sums, trillions of dollars worldwide, are being invested to meet this demand. Initiatives like "Stargate" from OpenAI and its partners, or European investment programs, reflect the sheer scale of the challenge.
The regional distribution is shifting. While Asia and North America are currently leading the way, the majority of new capacity will be built in the USA. Europe is also planning massive expansions, which could significantly increase the continent's electricity demand.
At the same time, the power density in data centers is increasing. Since AI chips generate an enormous amount of heat in a small space, cooling is becoming an ever greater challenge. Conventional air conditioning systems are often no longer sufficient, which is why sophisticated liquid cooling systems are needed, which in turn are expensive and complex.
The market is showing signs of overheating. Data center utilization is increasing, driving up prices. This is not expected to ease until more construction projects are completed or the growth in AI demand slows. However, if efficient methods like DeepSeek's become widespread, the need for new data centers could be lower than anticipated. This would call into question the planned massive investments and lead to overcapacity – a risk for anyone who has bet on steadily increasing hardware demand.
National strategies and technological sovereignty
DeepSeek's development is closely linked to China's pursuit of independence. Five-year plans have prioritized semiconductors, and the goal of self-sufficiency is being pursued with enormous effort. New regulations are forcing Chinese chip manufacturers to use more domestically produced machinery. A state-owned fund is investing the equivalent of nearly $50 billion in the local chip industry to reduce dependence on the West.
This policy is having an effect, in some cases not as intended. Previously, Chinese factories favored US equipment. However, due to US sanctions, they no longer had a choice and had to work with domestic suppliers, which accelerated their development. China could soon control a large share of global production of simpler chips used in cars and household appliances.
However, the gap remains significant when it comes to top-tier AI. Huawei's chips cannot compete with Nvidia's in terms of performance, and production volumes are far too low. Even massive increases in production would not close the gap for years. Since the demand for computing power is growing faster than Chinese production, the shortage is only likely to worsen.
This necessitates creative solutions. DeepSeek's success is also based on its timely acquisition of Nvidia chips. Others resort to smuggling routes or indirect methods. The government is responding with countermeasures, such as export restrictions on rare earth elements and investigations into Western tech companies. The pressure on Chinese corporations to purchase domestically produced chips is growing, even if these are technically inferior.
Regulatory landscape and global governance
While the US and China are engaged in a technology race, the EU is focusing on regulation. The "AI Act" is the world's first comprehensive AI law. It prohibits particularly risky applications and establishes strict rules for powerful AI models. Violations are subject to heavy fines.
The European approach attempts to set ethical standards without stifling innovation. Critics fear disadvantages for European companies, while proponents see a long-term advantage in terms of trust and security. Globally, however, regulation remains a patchwork. The US relies on voluntary commitments, while China prioritizes state control. This fragmentation makes it difficult to establish common standards.
The issue of AI security is coming into focus. Experts warn of the risks posed by superhuman intelligence. The timelines for achieving such "artificial general intelligence" (AGI) have shortened. Leading developers are no longer talking about decades, but just a few years. Whether this is realistic or merely marketing hype remains to be seen, but the industry is preparing for it.
Failed models and strategic realignment
The delay of R2, DeepSeek's successor model, shows that success is not guaranteed. Originally planned for an earlier release, the model ran into problems. Attempts to train it on Huawei chips apparently failed despite assistance from Huawei engineers.
The company therefore continues to use its existing Nvidia stock for training, but increasingly has to rely on Huawei for the application of the models – a politically mandated compromise. The delays caused user interest to temporarily plummet, as the competition was not idle.
Another problem is data. Reaching the next level requires more and better training data. In English-speaking countries, this is readily available online. In China, access to high-quality data is more difficult, partly due to censorship and partly because much content is not publicly accessible. Combined with inferior hardware, this slows development. If training takes longer and becomes more challenging, the cost advantage diminishes.
Structural change in the AI industry
The industry is facing a transformation. The previous motto of "more is better"—more data, more chips, more money—is reaching its limits or becoming prohibitively expensive. DeepSeek has demonstrated that intelligent architecture can be more important than raw power.
This has consequences for investors. Those who have poured billions into hardware could face problems if more efficient software reduces demand. At the same time, new players have a chance because you no longer necessarily need a fortune to participate.
As AI performance becomes cheaper and more uniform across providers, the model itself is no longer the only factor; what matters is how well it's integrated into products. Google and Microsoft have an advantage here because they already have users. Pure AI startups face greater challenges. Open source, or freely available software, is playing an increasingly important role. Models like those from DeepSeek or Meta are accessible to everyone, which accelerates innovation.
At the same time, investors are wondering when the money will start flowing back. ChatGPT has many users, but costs a fortune. Big profits are still a long way off. New jobs for AI experts are emerging in the labor market, while simple office tasks are being automated – a societal challenge for which there are still no easy solutions.
After the AI hype: Now the real battle for monetization begins
DeepSeek's innovations mark a turning point. They prove that world-class technology can be built even with limited resources. This challenges the assumption that only the wealthiest US corporations can win. It shifts the competition from "Who has the most money?" to "Who has the best engineers?".
Geopolitically, it's clear that sanctions can slow progress, but they can also force innovation. China is building its own industry under pressure. Economically, we're only at the beginning. Prices are falling, and the models are becoming everyday commodities. Those who want to win in the future must not only build good AI, but also be able to make money with it.
Technical hurdles remain. Current methods are reaching their limits, and whether we will truly see human-like intelligence anytime soon is uncertain. The next few years will show whether the industry overcomes these obstacles or whether the hype fizzles out. Perhaps DeepSeek's most important lesson isn't technical at all, but strategic: there's always another way if you're forced to find it.