Who are the AI pioneers? A comprehensive analysis of the deep learning revolution

Published on: August 2, 2025 / Updated on: August 2, 2025 – Author: Konrad Wolfenstein

Forget ChatGPT: The 2017 Google paper 'Attention Is All You Need' is the real reason for the AI explosion

What is meant by the Deep Learning Era?

The Deep Learning Era refers to the period since 2010 in which the development of artificial intelligence has fundamentally accelerated due to several technological breakthroughs. This era marks a turning point in AI history, as for the first time the necessary prerequisites for training complex neural networks came together: sufficient computing power, large datasets, and improved algorithms.

The term deep learning refers to multi-layered neural networks that can automatically extract abstract features from data. Unlike previous approaches, these systems no longer need to be manually programmed to recognize specific features; instead, they learn these patterns independently from the training data.

Why did the Deep Learning Revolution begin in 2010?

The year 2010 was pivotal, as three critical developments converged. First, the ImageNet database, first presented in 2009, became the basis of an annual image recognition competition starting in 2010. It contains over 10 million labeled images, and the competition subset spans 1,000 categories, thus providing for the first time a sufficiently large dataset for training deep neural networks.

Second, graphics processing units (GPUs) had become powerful enough to enable the parallel processing of large amounts of data. NVIDIA's CUDA platform, introduced in 2007, allowed researchers to run the intensive calculations required for deep learning on GPUs.

Third, algorithmic improvements, particularly the use of the ReLU activation function instead of traditional sigmoid functions, had significantly accelerated training. This convergence finally made it possible to put the theoretical foundations from the 1980s into practice.
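One commonly cited reason why ReLU helps is the vanishing gradient problem: the derivative of the sigmoid function never exceeds 0.25, so a gradient passed back through many sigmoid layers shrinks rapidly, while the derivative of ReLU is exactly 1 for positive inputs. The following Python sketch illustrates this under simplifying assumptions (all weights are treated as 1 and every layer sees the same input; the function names are chosen for illustration only).

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    # Derivative of the sigmoid; its maximum value is 0.25 at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


def relu(x: float) -> float:
    return max(0.0, x)


def relu_grad(x: float) -> float:
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise.
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, pre_activation: float, depth: int) -> float:
    # Simplifying assumption: every layer has weight 1 and the same input,
    # so the backpropagated gradient is the local derivative raised to `depth`.
    return grad_fn(pre_activation) ** depth


if __name__ == "__main__":
    for depth in (1, 5, 10, 20):
        sig = chained_gradient(sigmoid_grad, 0.0, depth)  # best case for sigmoid
        rel = chained_gradient(relu_grad, 1.0, depth)     # any positive input
        print(f"depth={depth:2d}  sigmoid={sig:.3e}  relu={rel:.1f}")
```

Even in the best case for the sigmoid, the gradient after ten layers is below one millionth of its original size, whereas the ReLU path passes it through unchanged.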

What breakthrough marked the beginning of the Deep Learning Revolution?

The decisive breakthrough came on September 30, 2012, with AlexNet's victory in the ImageNet competition. The convolutional neural network, developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, achieved a top-5 error rate of 15.3 percent, more than 10 percentage points better than the second-place algorithm.
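The top-5 error rate counts a prediction as correct if the true label is among the five classes the model scores highest. A minimal Python sketch of the metric, using made-up toy scores rather than real ImageNet data, might look like this:

```python
def top_k_error(scores_per_image, true_labels, k=5):
    """Fraction of images whose true label is NOT among the k highest-scoring classes."""
    misses = 0
    for scores, label in zip(scores_per_image, true_labels):
        # Class indices sorted from highest to lowest score, truncated to k.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(true_labels)


if __name__ == "__main__":
    # Toy example with 7 classes and 2 images (illustrative values only).
    scores = [
        [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.01],
        [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.01],
    ]
    labels = [4, 6]  # class 4 is ranked second, class 6 is ranked last
    print("top-5 error:", top_k_error(scores, labels, k=5))
    print("top-1 error:", top_k_error(scores, labels, k=1))
```

The metric is more forgiving than top-1 accuracy, which matters for ImageNet because many images plausibly contain several of the 1,000 categories.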

AlexNet was the first widely recognized success in combining deep neural networks, large datasets, and GPU computing. Remarkably, the training took place on just two NVIDIA graphics cards in Krizhevsky's bedroom. This success proved to the scientific community that deep learning was not only theoretically interesting but also practically superior.

The success of AlexNet triggered a cascade of developments. As early as 2015, the ResNet model, with a top-5 error rate of 3.57 percent, fell below the estimated human error rate of around 5 percent on ImageNet. By 2017, the SENet model had reduced the error rate further to 2.25 percent. This dramatic improvement within just a few years demonstrated the enormous potential of deep learning technology.

What role did the Transformer architecture play?

In 2017, a Google team published the groundbreaking paper “Attention Is All You Need,” which introduced the Transformer architecture. This architecture revolutionized natural language processing by relying entirely on attention mechanisms and eliminating the need for recurrent neural networks.

What makes transformers special is their capacity for parallel processing: whereas earlier models had to work sequentially, word by word, transformers can process all the words of a sentence at once. The self-attention mechanism lets the model weigh the relationship between every pair of words in a sentence, regardless of how far apart they are.
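The core of self-attention is the scaled dot-product attention described in the 2017 paper: softmax(QK^T / sqrt(d_k)) V. The following is a minimal single-head sketch in plain Python, without the multi-head splitting, masking, or positional encoding of a real Transformer. The helper names and the toy input are assumptions made for illustration.

```python
import math


def softmax(row):
    # Subtract the maximum for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    # Plain nested-list matrix product: (n x m) times (m x p) gives (n x p).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over token vectors `x`."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    # One score per (query token, key token) pair, computed for all pairs at once.
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


if __name__ == "__main__":
    # Three toy "tokens" with two features each; identity projections for clarity.
    x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    output, weights = self_attention(x, identity, identity, identity)
    for row in weights:
        print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and states how strongly one token attends to every other token. Because all rows come from the same matrix products, nothing forces the tokens to be processed one after another.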

The Transformer architecture became the foundation for all modern large language models, from BERT and GPT to Gemini. The original paper had been cited more than 173,000 times by 2025 and is considered one of the most influential scientific works of the 21st century.

Why is Google the leading AI pioneer?

According to an analysis by Epoch AI, Google leads the field by a wide margin with 168 "noteworthy" AI models. This dominance can be explained by several strategic decisions the company made early on.

Google invested heavily in AI research as early as the 2000s and recognized the potential of neural networks early on. The acquisition of DeepMind in 2014 brought additional expertise to the company. Crucially, the release of the TensorFlow framework as open source in 2015 accelerated AI development worldwide.

Google's contribution to the Transformer architecture was particularly significant. The paper, published in 2017 by Google researchers, laid the foundation for today's generative AI. Building on this, Google developed BERT (2018), which revolutionized natural language processing, and later the Gemini models.

The close integration of research and product development at Google further contributed to its high visibility. AI models are integrated directly into Google services such as Search, YouTube, and Android, which leads to widespread practical use and thereby helps them meet the criteria for "noteworthy" models.

How did Microsoft, OpenAI and Meta develop?

Microsoft ranks second with 43 noteworthy AI models. The company benefited from its strategic partnership with OpenAI, in which Microsoft invested several billion dollars. This collaboration enabled Microsoft to integrate GPT models early into products like Bing and Copilot.

OpenAI, with 40 models, ranks third despite being founded only in 2015. The development of the GPT series, from GPT-1 (2018) to current models like GPT-4 and o3, established OpenAI as a leading developer of large language models. ChatGPT, released in 2022, reached one million users within five days, bringing AI to the public eye.

Meta (Facebook) follows with 35 noteworthy models and developed the LLaMA series as an open-source alternative to proprietary models. The LLaMA models, especially LLaMA 3 and the more recent LLaMA 4, demonstrated that open-source models can compete with proprietary solutions.

What makes an AI model “noteworthy”?

Epoch AI defines an AI model as “noteworthy” if it meets at least one of four criteria. First, it demonstrates a technical improvement on a recognized benchmark. Second, it has been cited more than 1,000 times. Third, it is historically relevant, even if the model is now technically obsolete. Fourth, it sees significant practical use.

This definition focuses not only on technological advancements but also on actual impact and relevance in the scientific and economic spheres. A model can therefore be considered noteworthy if it finds widespread practical application, even if it is not necessarily the most technologically advanced.

The Epoch AI database comprises over 2,400 machine learning models from 1950 to the present day, making it the largest publicly available collection of its kind. This comprehensive dataset enables a well-founded analysis of AI development over more than 70 years.

How did AI develop before the Deep Learning era?

The history of artificial intelligence before 2010 was characterized by cycles of optimism and disappointment. In the 1950s and 1960s, there was great optimism, symbolized by Frank Rosenblatt's Perceptron (1957). These early neural networks raised hopes for the imminent arrival of artificial intelligence.

The first AI winter began in the early 1970s, triggered in part by Marvin Minsky and Seymour Papert's book on the limits of perceptrons (1969). The 1973 Lighthill Report, commissioned by the British Science Research Council, led to drastic cuts in research funding. This period lasted until around 1980 and significantly slowed down AI research.

The 1980s saw a recovery through expert systems like MYCIN, a medical diagnostic system. At the same time, in 1986, David Rumelhart, Geoffrey Hinton, and Ronald Williams popularized the backpropagation algorithm, which made multi-layered neural networks trainable in practice. As early as 1989, Yann LeCun developed LeNet, an early convolutional neural network for handwriting recognition.

The second AI winter followed in the late 1980s, when high expectations for expert systems and LISP machines were disappointed. This phase lasted into the 1990s and was characterized by skepticism towards neural networks.

What technological foundations enabled deep learning?

Three crucial breakthroughs enabled the deep learning revolution. The development of powerful GPUs was fundamental, as these enabled the parallel processing of large amounts of data. NVIDIA's CUDA platform from 2007 made GPU computing accessible for machine learning.

Large, high-quality datasets were the second requirement. ImageNet, first presented by Fei-Fei Li's team in 2009 and used for an annual competition from 2010 onward, was the first dataset to offer over 10 million labeled images. This amount of data was necessary to effectively train deep neural networks.

Algorithmic improvements formed the third pillar. Using the ReLU activation function instead of sigmoid functions significantly accelerated training. Improved optimization methods and regularization techniques such as dropout helped to solve the overfitting problem.
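Dropout counters overfitting by randomly switching off a share of the units during training so that the network cannot rely on any single one. The sketch below shows the widely used "inverted" variant, which rescales the surviving activations during training. This is an assumption made for illustration, since the original formulation rescales at test time instead, and the function name and parameters are hypothetical.

```python
import random


def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout: zero each value with probability p_drop, rescale the rest."""
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    if not training or p_drop == 0.0:
        # At inference time every unit stays active and nothing is rescaled.
        return list(activations)
    keep = 1.0 - p_drop
    # Dividing by `keep` leaves the expected value of each activation unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    layer_output = [1.0] * 10
    print(dropout(layer_output, p_drop=0.5, rng=rng))
    print(dropout(layer_output, p_drop=0.5, rng=rng, training=False))
```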

How have the computing costs for AI training developed?

The training costs for AI models have risen exponentially. Training the original Transformer model cost only an estimated $930 in 2017. BERT-Large cost about $3,300 in 2018, while GPT-3 consumed approximately $4.3 million in 2020.

Modern models reach even more extreme costs: GPT-4 cost an estimated $78.4 million, while Google's Gemini Ultra, at approximately $191.4 million, could be the most expensive model trained to date. This trend reflects the increasing complexity and size of the models.
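To put these figures in proportion, the short Python sketch below computes how many times more expensive each model was than the original Transformer, plus the average yearly growth factor implied by the 2017 and 2020 data points. It uses only the estimates quoted above; the helper names are chosen for illustration.

```python
# Estimated training costs in US dollars, as quoted in the text above.
COSTS_USD = {
    "Transformer (2017)": 930,
    "BERT-Large (2018)": 3_300,
    "GPT-3 (2020)": 4_300_000,
    "GPT-4": 78_400_000,
    "Gemini Ultra": 191_400_000,
}


def growth_factors(costs, baseline_name):
    """Cost of each model relative to the chosen baseline model."""
    baseline = costs[baseline_name]
    return {name: cost / baseline for name, cost in costs.items()}


def implied_annual_factor(cost_start, cost_end, years):
    """Constant yearly multiplier that turns cost_start into cost_end after `years`."""
    return (cost_end / cost_start) ** (1.0 / years)


if __name__ == "__main__":
    for name, factor in growth_factors(COSTS_USD, "Transformer (2017)").items():
        print(f"{name:20s} {factor:12,.0f}x")
    yearly = implied_annual_factor(
        COSTS_USD["Transformer (2017)"], COSTS_USD["GPT-3 (2020)"], years=3
    )
    print(f"Implied yearly cost growth 2017-2020: about {yearly:.1f}x")
```

On these estimates, GPT-3 cost several thousand times as much to train as the original Transformer three years earlier.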

According to Epoch AI, the computing power required for training doubles approximately every five months. This pace far exceeds Moore's Law, under which transistor counts double roughly every two years, and demonstrates the rapid scaling of AI research. At the same time, it concentrates AI development in the hands of a few companies that possess the necessary resources.
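A doubling time translates into a yearly growth factor of 2^(12 / doubling time in months). The following Python sketch compares the five-month doubling quoted above with a 24-month doubling, which is an assumed, commonly used approximation of Moore's Law.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Yearly multiplier for a quantity that doubles every `doubling_months` months."""
    return 2 ** (12 / doubling_months)


if __name__ == "__main__":
    ai_training = annual_growth_factor(5)   # figure attributed to Epoch AI in the text
    moores_law = annual_growth_factor(24)   # assumption: doubling every two years
    print(f"Training compute: about {ai_training:.2f}x per year")
    print(f"Moore's Law:      about {moores_law:.2f}x per year")
```

Doubling every five months corresponds to growth of more than fivefold per year, compared with roughly 1.4-fold per year for a two-year doubling time.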

What challenges exist for further AI development?

AI development faces several significant challenges. Reasoning models optimized for complex logical thinking could reach their scaling limits as early as 2026. The enormous computing costs limit the pool of players who can participate in cutting-edge AI research.

Technical problems such as hallucinations, where AI systems generate false information, have not yet been fully solved. At the same time, ethical questions arise from the possibility of generating deceptively realistic content, as demonstrated by the viral AI image of the Pope in a down coat.

The availability of high-quality training data is increasingly becoming a bottleneck. Many models have already been trained using a large portion of the available internet data, necessitating new approaches to data generation.

How does AI development affect society?

The deep learning revolution is already having a massive societal impact. AI systems are being used in critical areas such as medical diagnostics, finance, and autonomous vehicles. The potential for positive change is enormous, ranging from accelerating scientific discoveries to personalizing education.

At the same time, new risks arise. The ability to create realistic fake content threatens information integrity. Automation could put jobs at risk, and Germany's Federal Ministry of Labor expects that by 2035 there will be no job left that does not involve AI software.

The concentration of AI power in the hands of a few technology companies raises questions about democratic control of this powerful technology. Experts like Geoffrey Hinton, one of the pioneers of deep learning, have warned of the potential dangers of future AI systems.

The AI pioneers of the Deep Learning era have created a technology with the potential to fundamentally transform humanity. Google's lead with 168 noteworthy AI models, followed by Microsoft, OpenAI, and Meta, demonstrates the concentration of innovative power in the hands of a few key players. The Deep Learning revolution, which began in 2010 and was driven by breakthroughs such as AlexNet and the Transformer architecture, has already changed our daily lives and will do so even more profoundly in the future. The challenge lies in harnessing this powerful technology for the benefit of humanity while minimizing its risks.
