DeepSeek V3.1 – Alarm for OpenAI & Co: Chinese open-source AI poses new challenges for established providers

Konrad Wolfenstein

12 months ago

DeepSeek V3.1 – Alarm for OpenAI & Co.: Chinese open-source AI poses new challenges for established providers – Image: Xpert.Digital

New AI model from China: This free model is 27 times cheaper and directly challenges ChatGPT

### Alarm for OpenAI & Co.: China's new AI is just as powerful – but dirt cheap. What's behind it? ### DeepSeek V3.1: The silent AI attack that's now turning the tech world upside down ### Forget expensive AI: Why this Chinese open-source model is changing everything ### China's new super-AI: How Beijing is putting pressure on the West with a radical free strategy ### Better and cheaper than the competition? What China's new wonder AI can really do ###

DeepSeek V3.1 revolutionizes (once again) the AI landscape

Chinese artificial intelligence is becoming a serious challenge to American tech giants. The Hangzhou-based startup DeepSeek has achieved a significant breakthrough with its latest model, V3.1, fundamentally challenging traditional assumptions about AI development and funding. This open-source model achieves the performance of leading proprietary systems at a fraction of the development costs and points the way to a new future for artificial intelligence.

Related to this:

DeepSeek: China's AI revolution under the shadow of surveillance – Serious allegations from Washington

Technical innovation with hybrid architecture

DeepSeek V3.1 is based on an advanced Mixture of Experts architecture with a total of 685 billion parameters, of which 37 billion are activated per token. This technology enables significantly more efficient resource utilization than traditional models without compromising performance.

The outstanding feature of the new model is its hybrid inference architecture, which can switch between a "think mode" and a "non-think mode." In think mode, the system develops deeper internal reasoning processes and is ideally suited for complex problem-solving that requires multi-stage logical thinking. In contrast, non-think mode delivers direct and concise answers for tasks where speed is crucial.

Another technical advancement is the expanded context window of 128,000 tokens, which corresponds to approximately 96,000 words or two 200-page novels. This capacity enables the processing of extremely long documents, the understanding of entire code repositories, and multi-stage dialogue scenarios.

The further development was achieved through a two-phase approach to context expansion. The 32,000-token phase was expanded tenfold to 630 billion tokens, while the 128,000-token phase was increased 3.3-fold to 209 billion tokens. Additionally, the model utilizes the UE8M0 FP8 data format for optimal compatibility with modern hardware architectures.

Impressive performance parameters and benchmarks

DeepSeek V3.1 achieves remarkable results in standardized tests. In the renowned Aider Coding Benchmark, the model scored 71.6 percent – a score that rivals leading models from OpenAI and Anthropic. This performance is particularly impressive given its significantly lower cost.

In mathematical tasks, DeepSeek V3.1 even surpasses established competitors. In the Math-500 test, the model achieves 90.2 percent, while GPT-4o only manages 74.6 percent. In the MMLU-Pro test, the system improved by 5.3 points to 81.2, and in the GPQA benchmark by a remarkable 9.3 points to 68.4.

Of particular note is the improvement in multi-stage reasoning tasks, where version 3.1 performs 43 percent better than its predecessor. The model's programming capabilities allow it to generate error-free code of up to 700 lines in length – a performance that rivals expensive proprietary solutions.

Revolutionary cost efficiency

DeepSeek V3.1's cost structure completely upends previous assumptions about AI development. While a programming task with V3.1 costs about one dollar, comparable systems charge almost 70 dollars for similar tasks. This dramatic cost reduction makes advanced AI technology accessible to smaller companies and developers.

According to the company, the development costs for the underlying V3 model amounted to only about $5.6 million – a fraction of the hundreds of millions of dollars that American companies spend on comparable projects. This efficiency was achieved through innovative training methods and the use of less powerful, but less expensive, hardware.

DeepSeek's API pricing significantly undercuts the competition. The chat model costs $0.07 per million input tokens for cache hits and $1.10 per million output tokens. The reasoning model costs $0.14 for input tokens and $2.19 for output tokens. In comparison, OpenAI charges around $2 to $2.50 per million output tokens, while DeepSeek charges only $0.014.

Strategic importance for global AI competition

DeepSeek's successes have far-reaching implications for the global AI landscape. The company demonstrates that advanced AI performance no longer requires the massive resources and proprietary approaches that have characterized American AI development to date. This development challenges the foundations of current business models.

China's leadership attaches high strategic importance to DeepSeek, as evidenced by the meeting between founder Liang Wenfeng and Premier Li Qiang. The company is seen as a key component in China's ambition to become a global leader in artificial intelligence by 2030.

DeepSeek's open-source strategy allows other companies and researchers worldwide to build on its advancements and develop their own innovations. This promotes a decentralized development of AI technology and reduces dependence on individual tech giants.

Background and company structure

DeepSeek was founded in Hangzhou in 2023 by Liang Wenfeng and is fully funded by the Chinese hedge fund High-Flyer. Wenfeng, born in 1985 as the son of a primary school teacher, developed an interest in the application of AI in the financial sector while studying at Zhejiang University.

In 2016, Wenfeng founded High-Flyer, a hedge fund that uses machine learning for quantitative trading strategies. By 2021, the company had fully transitioned to AI-powered trading approaches and become one of China's leading quant funds with over 100 billion RMB in assets under management.

Even before founding DeepSeek, Wenfeng began buying thousands of Nvidia GPUs – initially ridiculed as the eccentric hobby of a billionaire. This far-sighted investment in hardware later enabled the company to develop competitive AI models despite US export restrictions.

EU/DE Data Security | Integration of an independent and cross-data-source AI platform for all business needs

Independent AI platforms as a strategic alternative for European companies - Image: Xpert.Digital

AI Game Changer: The most flexible AI platform - Tailor-made solutions that reduce costs, improve your decisions and increase efficiency

Independent AI platform: Integrates all relevant company data sources

Rapid AI integration: Tailor-made AI solutions for businesses in hours or days, instead of months
Flexible infrastructure: Cloud-based or hosting in your own data center (Germany, Europe, free choice of location)

Maximum data security: its use in law firms is irrefutable proof
Deployment across a wide variety of enterprise data sources
Choice of own or different AI models (DE, EU, USA, CN)

More information here:

Independent AI platforms vs. hyperscalers: Which solution is the right fit?

Chips, algorithms, innovation: DeepSeek's path to the top of the world

Impact of US export controls

DeepSeek's success is particularly remarkable given the US export restrictions on high-performance AI chips to China. The sanctions were intended to limit China's ability to develop advanced AI systems, but DeepSeek demonstrates that innovative software approaches and efficient resource utilization can overcome these limitations.

The company used less powerful H800 chips, which are approved for export to China, but still achieved top performance through optimized algorithms and efficient training methods. This approach challenges the effectiveness of technological sanctions and demonstrates alternative paths to AI development.

Experts see DeepSeek's breakthrough as a turning point that could fundamentally change existing estimates of China's AI capabilities and potential. The development suggests that innovations in software optimization may be more important than sheer hardware superiority.

Related to this:

China's catch-up in artificial intelligence: The DeepSeek case and the strategic use of data

Open Source as a competitive advantage

DeepSeek's open-source strategy offers several strategic advantages. Developers and businesses worldwide can run, customize, and integrate the model locally into their own projects without relying on cloud services. This is particularly important for data-sensitive applications and companies that want to maintain control over their information.

Community-based development enables faster bug fixing, continuous improvements, and a broad base of contributors. At the same time, the open-source approach democratizes access to advanced AI technology and fosters innovation, including in smaller companies and developing countries.

Unlike proprietary models that are only accessible via APIs or cloud platforms, open-source AI offers long-term availability and independence from individual vendors. Users don't have to worry about price increases, access restrictions, or service discontinuations.

Technological breakthroughs and innovations

DeepSeek V3.1 integrates several groundbreaking technologies that enable its exceptional efficiency. The multi-head Latent Attention architecture compresses key-value caches using latent vectors, reducing memory consumption and computational overhead during inference.

The multi-token prediction method allows each token to predict multiple future tokens simultaneously. This overcomes a significant bottleneck of traditional autoregressive models and improves both accuracy and inference speed.

Using 8-bit training significantly reduces memory requirements and costs without compromising accuracy. This technique was long considered problematic, but DeepSeek demonstrates that, when implemented correctly, it yields results comparable to traditional methods.

Market reactions and impacts

The announcement of DeepSeek V3.1 triggered a fierce reaction in the financial markets. Nvidia lost over $600 billion in market capitalization – the largest single loss in the history of the US stock market. Other AI hardware companies also experienced significant share price declines.

Investors and analysts are rethinking their assessments of the AI industry. The assumption that massive investments in hardware and proprietary development are necessary prerequisites for cutting-edge AI is being challenged by DeepSeek's success.

Western companies are already testing DeepSeek models in their workflows. A prominent example is Merck, whose Chief Data Officer publicly demonstrated the integration of DeepSeek as one of several AI options in internal processes.

Future developments and outlook

DeepSeek positions version 3.1 as the first step towards the "agent age" of AI. The model has been specifically optimized for improved tool usage and multi-step agent tasks. The post-training optimizations have resulted in significant improvements in the use of external tools and complex search tasks.

DeepSeek's development speed suggests that a V4 model might be released before OpenAI's next R2 version. This dynamic could accelerate traditional AI industry development cycles and set new standards for update frequencies.

DeepSeek's successes are already inspiring other Chinese AI companies and researchers worldwide. Open-source models are increasingly seen as a valid alternative to proprietary solutions, which could lead to a more diversified and competitive AI landscape.

Challenges and criticisms

Despite its impressive achievements, DeepSeek has also drawn criticism. Like other Chinese AI models, DeepSeek is subject to certain censorship measures, which can be applied to politically sensitive topics. However, these restrictions can often be circumvented through technical adjustments.

Transparency regarding training data and methods is limited. There is speculation that the training is partly based on responses from ChatGPT, as DeepSeek occasionally claims to be ChatGPT itself. These ambiguities raise questions about originality and potential copyright issues.

The rapid development and low price of deepseeking models also raise concerns about the sustainability of the business model. Critics question whether the extremely low prices can be maintained in the long term or whether they are part of a strategic market penetration strategy.

Global implications for the AI industry

DeepSeek V3.1 marks a turning point in global AI development. The model proves that innovative software approaches and efficient resource utilization can be more important than massive capital investments and access to the latest hardware. This finding will influence the strategies of all major AI companies.

The democratization of advanced AI technology through open-source models could lead to a more even distribution of AI capabilities worldwide. Countries and companies previously excluded by high costs or technical barriers would gain access to cutting-edge technology.

At the same time, DeepSeek's success calls into question the effectiveness of technological sanctions and export controls. Its ability to achieve world-class performance with limited resources could encourage other countries to pursue similar approaches and develop their own AI ecosystems.

DeepSeek V3.1 represents more than just another AI model – it symbolizes a fundamental shift in how AI is developed, funded, and deployed. The combination of technological innovation, cost-effective development, and open-source availability creates new opportunities and poses serious challenges to established market leaders. Future developments will show whether this approach will shape the future of the AI industry.

We are here for you - Consulting - Planning - Implementation - Project Management

☑️ SME support in strategy, consulting, planning and implementation

☑️ Creation or realignment of the AI strategy

☑️ Pioneer Business Development

Konrad Wolfenstein

I would be happy to serve as your personal advisor.

You can contact me by filling out the contact form below or simply call me on +49 7348 4088 965 .

I'm looking forward to our joint project.

Write to me

➡️ Video call request 👩👱

Xpert.Digital - Konrad Wolfenstein

Xpert.Digital is a hub for industry focusing on digitalization, mechanical engineering, logistics/intralogistics and photovoltaics.

With our 360° Business Development solution, we support renowned companies from new business to after-sales.

Market intelligence, smarketing, marketing automation, content development, PR, mail campaigns, personalized social media and lead nurturing are part of our digital tools.

You can find more information at: www.xpert.digital - www.xpert.solar - www.xpert.plus

Keep in touch