

Who are the AI pioneers? A comprehensive analysis of the deep learning revolution

Published on: August 2, 2025 / Updated on: August 2, 2025 – Author: Konrad Wolfenstein


Who are the AI pioneers? A comprehensive analysis of the deep learning revolution – Image: Xpert.Digital

Forget ChatGPT: The 2017 Google paper 'Attention Is All You Need' is the real reason for the AI explosion

What is the Deep Learning Era?

The Deep Learning Era refers to the period since 2010 in which the development of artificial intelligence has fundamentally accelerated due to several technological breakthroughs. This era marks a turning point in AI history, as the necessary prerequisites for training complex neural networks came together for the first time: sufficient computing power, large amounts of data, and improved algorithms.

The term deep learning refers to multi-layered neural networks that can automatically extract abstract features from data. Unlike earlier approaches, these systems do not rely on manually engineered features that tell them what to look for; they learn the relevant patterns directly from the training data.
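To make the idea of stacked layers concrete, here is a minimal sketch in plain NumPy (the layer sizes and random weights are made up for illustration, not taken from any real model) of how an input is passed through several layers, each of which re-describes the data in more abstract terms:

```python
import numpy as np

def relu(x):
    # ReLU keeps positive values and zeroes out the rest
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)

# Illustrative layer sizes: raw input -> two hidden "feature" layers -> class scores
layer_sizes = [784, 256, 64, 10]
weights = [rng.normal(0, 0.01, (m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    """Pass an input vector through all layers; each layer re-describes the
    data in terms learned from training rather than hand-coded rules."""
    activation = x
    for w, b in zip(weights[:-1], biases[:-1]):
        activation = relu(activation @ w + b)
    # Final layer produces raw class scores (no ReLU)
    return activation @ weights[-1] + biases[-1]

scores = forward(rng.random(784))
print(scores.shape)  # (10,)
```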


Why did the deep learning revolution begin in 2010?

The year 2010 was pivotal, as three critical developments converged. First, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) was launched, built on Fei-Fei Li's ImageNet database of more than 10 million labeled images and offering a standardized benchmark across 1,000 categories. For the first time, a dataset large enough to train deep neural networks was widely available.

Second, graphics processing units (GPUs) had become powerful enough to enable parallel processing of large amounts of data. NVIDIA's CUDA platform, introduced in 2007, allowed researchers to perform the intensive computations required for deep learning.

Third, algorithmic improvements, particularly the use of the ReLU activation function instead of traditional sigmoid functions, significantly accelerated training. This convergence finally made it possible to implement the theoretical foundations from the 1980s in practice.
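A small numerical sketch of the activation-function point: for inputs far from zero, the sigmoid's gradient collapses toward zero (the vanishing-gradient problem), while ReLU keeps a constant gradient of 1 for positive inputs, which is one reason training deep networks became much faster. The values below are purely illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # peaks at 0.25 and vanishes for |x| >> 0

def relu_grad(x):
    return float(x > 0)           # constant 1 for positive inputs

for x in [0.5, 2.0, 5.0, 10.0]:
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  relu'={relu_grad(x):.1f}")
# sigmoid' shrinks from ~0.235 toward ~0.000045, while relu' stays at 1.0
```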

Which breakthrough marked the beginning of the deep learning revolution?

The decisive breakthrough came on September 30, 2012, with AlexNet's victory in the ImageNet competition. The convolutional neural network developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton achieved a top-5 error rate of 15.3 percent, more than 10 percentage points better than the second-place algorithm.
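For readers unfamiliar with the metric: a top-5 prediction counts as correct if the true class appears anywhere among a model's five highest-scoring classes. A minimal sketch with randomly generated scores (purely illustrative, not ImageNet data):

```python
import numpy as np

def top5_error(scores, true_labels):
    """Fraction of examples whose true label is NOT among the 5 highest-scoring classes."""
    top5 = np.argsort(scores, axis=1)[:, -5:]          # indices of the 5 largest scores per row
    hits = [label in row for row, label in zip(top5, true_labels)]
    return 1.0 - np.mean(hits)

rng = np.random.default_rng(1)
scores = rng.random((1000, 1000))       # 1,000 fake images x 1,000 ImageNet-style classes
labels = rng.integers(0, 1000, 1000)    # fake ground-truth labels
print(top5_error(scores, labels))       # random guessing gives roughly 0.995 (99.5% error)
```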

AlexNet was the first to successfully combine deep neural networks, large datasets, and GPU computing. Remarkably, training took place on just two NVIDIA graphics cards in Krizhevsky's bedroom. This success proved to the scientific community that deep learning was not only theoretically interesting, but practically superior.

The success of AlexNet triggered a cascade of developments. By 2015, models such as Microsoft's ResNet had already pushed the ImageNet top-5 error below the estimated human error rate of around 5 percent, and in 2017 the SENet model reduced it further to 2.25 percent. This dramatic improvement within just a few years demonstrated the enormous potential of deep learning technology.

What role did the Transformer architecture play?

In 2017, a Google team published the groundbreaking paper "Attention Is All You Need," which introduced the Transformer architecture. This architecture revolutionized natural language processing by relying entirely on attention mechanisms and eliminating the need for recurrent neural networks.

What's special about Transformers is their ability to process data in parallel: while previous models had to work sequentially, word by word, Transformers can process entire sentences simultaneously. The self-attention mechanism allows the model to understand the relationships between all words in a sentence, regardless of their position.
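A minimal NumPy sketch of the scaled dot-product self-attention at the core of the Transformer; the toy dimensions and random matrices below are assumptions for illustration, and real Transformers add multiple heads, output projections, and positional encodings on top of this operation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product attention: every position attends to every other
    position via one matrix multiplication, so the whole sequence is
    processed in parallel rather than word by word."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))   # attention weights between all word pairs
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 6, 16, 8                # toy sizes: 6 "words", 16-dim embeddings
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)      # (6, 8): one updated vector per word
```

Because the attention weights for all word pairs come out of a single matrix product, the whole sentence is handled in one parallel step, which is exactly what makes the architecture so efficient on GPUs.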

The Transformer architecture became the foundation for all modern large-scale language models, from BERT to GPT to Gemini. The original paper was cited more than 173,000 times by 2025 and is considered one of the most influential scientific works of the 21st century.

Why is Google the leading AI pioneer?

According to Epoch AI's analysis, Google leads the field by a wide margin with 168 "notable" AI models. This dominance can be explained by several strategic decisions the company made early on.

Google invested heavily in AI research as early as the 2000s and recognized the potential of neural networks before most competitors. The acquisition of DeepMind in 2014 brought additional expertise into the company. The 2015 release of the TensorFlow framework as open source was also crucial, accelerating AI development worldwide.

Google's contribution to the Transformer architecture was particularly significant. The paper, published in 2017 by Google researchers, laid the foundation for today's generative AI. Building on this, Google developed BERT (2018), which revolutionized natural language processing, and later the Gemini models.

The close integration of research and product development at Google also contributed to this high visibility. AI models are integrated directly into Google services such as Search, YouTube, and Android, which drives practical use and thereby satisfies one of the criteria for "notable" models.


How did Microsoft, OpenAI and Meta develop?

Microsoft ranks second with 43 notable AI models. The company benefited from its strategic partnership with OpenAI, in which Microsoft invested several billion dollars. This collaboration enabled Microsoft to integrate GPT models early on into products such as Bing and Copilot.

OpenAI ranks third with 40 models, despite only being founded in 2015. The development of the GPT series, from GPT-1 (2018) to current models such as GPT-4 and o3, established OpenAI as a leading developer of large language models. ChatGPT, released in 2022, reached one million users within five days and brought AI into the public eye.

Meta (Facebook) follows with 35 notable models and developed the LLaMA series as an open-source alternative to closed models. The LLaMA models, especially LLaMA 3 and the newer LLaMA 4, demonstrated that open-source models can compete with proprietary solutions.


What makes an AI model "notable"?

Epoch AI defines an AI model as "notable" if it meets at least one of four criteria. First, it achieves a state-of-the-art improvement on a recognized benchmark. Second, it is cited very frequently, with more than 1,000 citations. Third, it has historical relevance, even if the model is now technically outdated. Fourth, it sees significant practical use.
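The "at least one of four criteria" rule can be written down as a simple check. The function and parameter names below are illustrative assumptions, not Epoch AI's actual schema:

```python
def is_notable(sota_improvement: bool,
               citations: int,
               historically_relevant: bool,
               significant_use: bool) -> bool:
    """Illustrative reading of Epoch AI's rule: a model counts as notable
    if it satisfies at least one of the four criteria."""
    return (sota_improvement
            or citations > 1000
            or historically_relevant
            or significant_use)

# Example: a technically outdated but historically important model still qualifies
print(is_notable(False, 400, True, False))  # True
```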

This definition focuses not only on technological advancement, but also on actual impact and relevance in the scientific and economic environment. Thus, a model can be considered noteworthy if it finds widespread practical application, even if it is not necessarily the most technically advanced.

The Epoch AI database includes over 2,400 machine learning models from 1950 to the present, making it the largest publicly available collection of its kind. This comprehensive database enables in-depth analysis of AI development over more than 70 years.

How did AI develop before the deep learning era?

The history of artificial intelligence before 2010 was characterized by cycles of optimism and disappointment. The 1950s and 1960s saw great optimism, symbolized by Frank Rosenblatt's perceptron (1957). These early neural networks sparked hopes for the imminent advent of artificial intelligence.

The first AI winter began in the early 1970s, triggered by Marvin Minsky and Seymour Papert's book on the limits of perceptrons (1969). The 1973 Lighthill Report to the British Parliament led to drastic cuts in research funding. This phase lasted until around 1980 and significantly slowed AI research.

The 1980s saw a resurgence thanks to expert systems such as MYCIN, a medical diagnostic system. At the same time, Geoffrey Hinton, David Rumelhart, and Ronald Williams developed the backpropagation algorithm in 1986, which made neural networks trainable. Yann LeCun developed LeNet, an early convolutional neural network for handwriting recognition, as early as 1989.

The second AI winter followed in the late 1980s, when the high expectations for expert systems and LISP machines were dashed. This phase lasted until the 1990s and was characterized by skepticism toward neural networks.

What technological foundations made deep learning possible?

Three key breakthroughs enabled the deep learning revolution. The development of powerful GPUs was fundamental, as they enabled the parallel processing of large amounts of data. NVIDIA's CUDA platform in 2007 made GPU computing accessible for machine learning.

Large, high-quality datasets were the second prerequisite. ImageNet, created by Fei-Fei Li's team and first presented in 2009, was the first dataset of its scale, with over 10 million labeled images, and the associated ILSVRC benchmark launched in 2010. This amount of data was necessary to train deep neural networks effectively.

Algorithmic improvements formed the third pillar. Using the ReLU activation function instead of sigmoid functions significantly accelerated training. Improved optimization procedures and regularization techniques such as dropout helped solve the overfitting problem.
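As an illustration of the dropout technique mentioned above: during training a random fraction of activations is zeroed out so the network cannot lean on any single feature, and the surviving activations are rescaled so their expected value stays unchanged at inference time. A minimal sketch (inverted dropout, with an arbitrary drop rate):

```python
import numpy as np

def dropout(activations, drop_rate=0.5, training=True, rng=np.random.default_rng(0)):
    """Inverted dropout: randomly zero a fraction of activations during training
    and rescale the rest so expected values match test-time behavior."""
    if not training or drop_rate == 0.0:
        return activations
    mask = rng.random(activations.shape) >= drop_rate
    return activations * mask / (1.0 - drop_rate)

h = np.ones(10)
print(dropout(h))                   # roughly half the entries zeroed, the rest scaled to 2.0
print(dropout(h, training=False))   # unchanged at inference time
```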

How did the computing costs for AI training develop?

The cost of training AI models has risen exponentially. The original Transformer model cost only $930 to train in 2017. BERT-Large cost $3,300 in 2018, while GPT-3 cost approximately $4.3 million in 2020.

Modern models are far more expensive still: GPT-4 cost an estimated $78.4 million to train, while Google's Gemini Ultra, at approximately $191.4 million, may be the most expensive model trained to date. This trend reflects the increasing complexity and size of the models.

According to Epoch AI, the computing power required for training doubles approximately every five months. This development far exceeds Moore's Law and demonstrates the rapid scaling of AI research. At the same time, it leads to a concentration of AI development in the hands of a few companies with the necessary resources.
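To put that doubling time into perspective, here is a quick back-of-the-envelope calculation using the article's five-month figure (Epoch AI's published estimates vary somewhat over time):

```python
# Doubling every 5 months compounds to roughly 5x per year,
# versus about 2x every 24 months under the classic Moore's Law framing.
doubling_months = 5
per_year = 2 ** (12 / doubling_months)
per_decade = 2 ** (12 * 10 / doubling_months)
print(f"growth per year:   ~{per_year:.1f}x")     # ~5.3x
print(f"growth per decade: ~{per_decade:.2e}x")   # ~1.7e7x
```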


What challenges exist for further AI development?

AI development faces several significant challenges. Reasoning models optimized for complex logical reasoning could reach their scaling limits as early as 2026. The enormous computational costs limit the circle of actors who can participate in cutting-edge AI research.

Technical problems such as hallucinations, where AI systems generate false information, have not yet been fully resolved. At the same time, ethical questions arise from the possibility of generating deceptively realistic content, as demonstrated by the viral AI-generated image of the Pope in a puffer jacket.

The availability of high-quality training data is becoming an increasing bottleneck. Many models have already been trained using a large portion of available internet data, requiring new approaches to data generation.

How does AI development affect society?

The deep learning revolution is already having a massive societal impact. AI systems are being used in critical areas such as medical diagnostics, finance, and autonomous vehicles. The potential for positive change is enormous, from accelerating scientific discoveries to personalizing education.

At the same time, new risks are emerging. The ability to create realistic fake content threatens information integrity. Jobs could be jeopardized by automation, with the German Federal Ministry of Labor expecting that by 2035, no job will be without AI software.

The concentration of AI power in a few technology companies raises questions about democratic control of this powerful technology. Experts like Geoffrey Hinton, one of the pioneers of deep learning, have warned of the potential dangers of future AI systems.

The AI pioneers of the Deep Learning Era have created a technology that has the potential to fundamentally transform humanity. Google's leadership in the development of 168 notable AI models, followed by Microsoft, OpenAI, and Meta, demonstrates the concentration of innovation power among a few players. The Deep Learning Revolution, which has been ongoing since 2010 and initiated by breakthroughs such as AlexNet and the Transformer architecture, has already transformed our daily lives and will do so even more in the future. The challenge is to harness this powerful technology for the benefit of humanity while minimizing its risks.
