DeepSeek-R1-0528: DeepSeek Update brings Chinese AI model back on par with Western industry leaders

Konrad Wolfenstein

1 year ago

DeepSeek-R1-0528: DeepSeek update brings Chinese AI model back on par with Western industry leaders – Image: Xpert.Digital

Open-source AI at its limit: DeepSeek overshadows OpenAI and Google

From 60 to 68: DeepSeek catapults Chinese AI back to the top

Chinese AI startup DeepSeek reached a significant milestone with the release of DeepSeek-R1-0528 on May 28, 2025, redefining the global AI landscape. The update to the open-source reasoning model demonstrates dramatic performance improvements, positioning DeepSeek for the first time on par with OpenAI's o3 and Google Gemini 2.5 Pro. Particularly noteworthy is that this peak performance is achieved at a fraction of the cost and with completely open model weights, raising fundamental questions about the future of proprietary AI systems. The independent rating platform Artificial Analysis scored the new model 68 points – a jump from 60 to 68 points that corresponds to the performance difference between OpenAI o1 and o3.

Related to this:

DeepSeek and Alibaba: A breakthrough at the specialist level? The Chinese AI push in healthcare.

The update and its technical improvements

DeepSeek-R1-0528 represents a substantial enhancement that achieves significant performance improvements through algorithmic optimizations and increased use of computational resources in post-training, without altering the underlying architecture. The update primarily focuses on improving reasoning capabilities, enabling, according to DeepSeek, "significantly deeper thinking processes." A particularly impressive example of this improvement is seen in the AIME 2025 mathematics test, where accuracy increased from 70 percent to 87.5 percent. Simultaneously, the average number of tokens per question increased from 12,000 to 23,000 tokens, indicating more intensive processing.

In addition to reasoning improvements, the update introduces important new functionalities, including JSON output and function calls, an optimized user interface, and reduced hallucinations. These enhancements make the model significantly more practical for developers and considerably expand its scope. Availability remains unchanged: Existing API users will receive the update automatically, while the model weights will continue to be available under the open MIT license on Hugging Face.

Benchmark performance and performance comparisons

The benchmark results for DeepSeek-R1-0528 show impressive improvements across all evaluation categories. In mathematical tasks, the AIME-2024 score increased from 79.8 to 91.4 percent, HMMT-2025 from 41.7 to 79.4 percent, and CNMO-2024 from 78.8 to 86.9 percent. These results position the model as one of the most powerful AI systems for mathematical problem-solving worldwide.

DeepSeek-R1-0528 also shows significant progress in programming benchmarks. LiveCodeBench improved from 63.5 to 73.3 percent, Aider-Polyglot from 53.3 to 71.6 percent, and SWE Verified from 49.2 to 57.6 percent. The Codeforces rating climbed from 1,530 to 1,930 points, placing the model among the top algorithmic problem solvers. Compared to competing models, DeepSeek-R1 achieves 49.2 percent in SWE Verified, placing it just ahead of OpenAI o1-1217 with 48.9 percent, while in Codeforces, with 96.3 percentiles and an Elo rating of 2,029 points, it comes very close to OpenAI's leading model.

General knowledge and logic tests confirm the broad performance improvement: GPQA-Diamond increased from 71.5 to 81.0 percent, Humanity's Last Exam from 8.5 to 17.7 percent, MMLU-Pro from 84.0 to 85.0 percent, and MMLU-Redux from 92.9 to 93.4 percent. Only OpenAI's SimpleQA showed a slight decline from 30.1 to 27.8 percent. These comprehensive improvements demonstrate that DeepSeek-R1-0528 is competitive not only in specialized areas but across the entire spectrum of cognitive tasks.

Technical architecture and innovations

The technical foundation of DeepSeek-R1-0528 is based on a sophisticated MoE (Mixture of Experts) architecture with 37 billion active parameters out of a total of 671 billion parameters and a context length of 128,000 tokens. The model implements advanced reinforcement learning to achieve self-verification, multi-stage reflection, and human-like reasoning capabilities. This architecture enables the model to tackle complex reasoning tasks through iterative thinking processes, which distinguishes it from traditional language models.

A particularly innovative aspect is the development of a distilled variant, DeepSeek-R1-0528-Qwen3-8B, which was created by distilling the thought process of DeepSeek-R1-0528 for post-training Qwen3-8B-Base. This smaller version achieves impressive performance with significantly lower resource requirements and runs on GPUs with 8-12 GB of VRAM. In the AIME 2024 test, the model achieved state-of-the-art performance among open-source models, with a 10 percent improvement over Qwen3-8B and comparable performance to Qwen3-235B-Thinking.

The development methodology shows that DeepSeek increasingly relies on post-training with reinforcement learning, which led to a 40% increase in token consumption during the evaluation – from 71 to 99 million tokens. This suggests that the model is generating longer and deeper answers without requiring fundamental architectural changes.

Market position and competitive dynamics

DeepSeek-R1-0528 is establishing itself as a serious competitor to the leading proprietary models of Western technology companies. According to Artificial Analysis, the model scores 68 points, placing it on par with Google's Gemini 2.5 Pro and ahead of models like xAI's Grok 3 mini, Meta's Llama 4 Maverick, and Nvidia's Nemotron Ultra. In the code category, DeepSeek-R1-0528 achieves a level just below OpenAI's o4-mini and o3.

The release of the update has had a significant impact on the global AI landscape. The initial release of DeepSeek-R1 in January 2025 already led to a slump in technology stocks outside of China and challenged the assumption that scaling AI requires enormous computing power and investment. Western competitors responded swiftly: Google introduced discounted access rates for Gemini, while OpenAI lowered prices and introduced an o3 Mini model that requires less computing power.

Interestingly, text style analyses from EQBench show that DeepSeek-R1's style is more strongly influenced by Google than by OpenAI, suggesting that more synthetic Gemini outputs may have been used in its development. This observation underscores the complex influences and technology transfers between different AI developers.

Cost efficiency and availability

A key competitive advantage of DeepSeek-R1-0528 lies in its exceptional cost efficiency. Its pricing structure is significantly more favorable than OpenAI's: Input tokens cost $0.14 per million tokens for cache hits and $0.55 for cache misses, while output tokens cost $2.19 per million tokens. In comparison, OpenAI o1 charges $15 for input tokens and $60 for output tokens per million, making DeepSeek-R1 90-95 percent cheaper.

Microsoft Azure also offers DeepSeek-R1 at competitive prices: The global version costs $0.00135 for input tokens and $0.0054 for output tokens per 1,000 tokens, while the regional version has slightly higher prices. This pricing makes the model particularly attractive for companies and developers who want to leverage high-quality AI functionalities without the high costs of proprietary solutions.

Its availability as an open-source model under the MIT license also allows for commercial use and modification without license fees. Developers can run the model locally or use it via various APIs, offering flexibility and control over the implementation. For users with limited resources, a distilled 8-billion-parameter version is available, which runs on consumer hardware with 24 GB of memory.

Related to this:

China's catch-up in artificial intelligence: The DeepSeek case and the strategic use of data

China's AI catch-up: What DeepSeek's success means

DeepSeek-R1-0528 marks a turning point in global AI development, demonstrating that Chinese companies can develop models that compete with the best Western systems despite US export restrictions. The update proves that significant performance improvements are possible without fundamental architectural changes when post-training optimizations and reinforcement learning are effectively employed. The combination of peak performance, drastically reduced costs, and open-source availability fundamentally challenges established business models in the AI industry.

The reactions of Western competitors to DeepSeek's success are already showing initial market changes: price reductions from OpenAI and Google, as well as the development of more resource-efficient models. With the anticipated release of DeepSeek-R2, originally planned for May 2025, this competitive pressure could intensify further. The success story of DeepSeek-R1-0528 illustrates that innovation in AI does not necessarily require massive investments and computing resources, but can be achieved through clever algorithms and efficient development methods.

Related to this:

Your AI transformation, AI integration and AI platform industry expert

☑️ Our business language is English or German

☑️ NEW: Correspondence in your native language!

Konrad Wolfenstein

I and my team are happy to be available to you as your personal advisor.

You can contact me by filling out the contact form here wolfenstein@xpert.digital:or simply call me at +49 7348 4088 965. My email address is

I'm looking forward to our joint project.

DeepSeek-R1-0528: DeepSeek Update brings Chinese AI model back on par with Western industry leaders

Open-source AI at its limit: DeepSeek overshadows OpenAI and Google

From 60 to 68: DeepSeek catapults Chinese AI back to the top

The update and its technical improvements