The robotics AI system “Helix” by Figure AI for humanoid robots: a Vision-Language-Action (VLA) model
Published on: February 28, 2025 / update from: February 28, 2025 - Author: Konrad Wolfenstein
Helix: The AI system that brings humanoid robots to a new level
Short version: Vision, language, movement: Helix as a milestone in robotics
Helix is an AI system for humanoid robots developed by Figure AI. It is a Vision-Language-Action (VLA) model that combines visual perception, language understanding and precise motor control in a single system. Helix marks significant progress in the development of flexible robot systems for unstructured environments such as households. With its ability to carry out complex tasks without task-specific training, it could change how humans and machines interact.
Helix capabilities
- Real-time control of the entire upper body of a humanoid robot, with 35 degrees of freedom
- Processing of voice input and visual information to execute complex tasks
- Detection and handling of unknown objects without specific training
- Cooperation between several robots in the execution of tasks
- Execution of household tasks such as the clearing of a refrigerator
Technical details
Helix consists of two main components:
- A multimodal language model with 7 billion parameters (7-9 Hz)
- A movement AI with 80 million parameters (200 Hz)
Further details:
- Trained with only 500 hours of supervised training data
- Runs on energy-efficient embedded GPUs
Main competitors
- Google DeepMind: Developed VLA models such as RT-2
- Meta: Working on advanced humanoid robots
- Apple: Also in the race to develop advanced AI humanoids
- OpenAI: Former partner of Figure AI, now a competitor in AI development
Google DeepMind
With RT-2 (Robotics Transformer 2), Google DeepMind has presented a Vision-Language-Action (VLA) model of its own. RT-2 enables robots to carry out new tasks without specific training by learning concepts from text and image data from the Internet and translating them into robotic actions. In tests, RT-2 performed significantly better on new tasks than its predecessor, RT-1.
Meta
Meta is investing heavily in the development of AI-controlled humanoid robots. The company has founded a new team within its Reality Labs division that focuses on the research and development of robots for consumers. Meta plans to develop AI systems, sensors and software platforms that can also be used by other manufacturers.
Apple
Apple is also researching both humanoid and non-humanoid robot designs. However, the company is still in an early development phase. The analyst Ming-Chi Kuo predicts that mass production could start in 2028 at the earliest. Apple is focusing particularly on the interaction between humans and robots.
OpenAI
OpenAI, a former partner of Figure AI, is building up its own robotics department and treats robots as an embodiment of artificial intelligence in the real world. The company now competes directly with Google DeepMind and others in AI development for robotics.
Helix: Differentiation compared to other AI systems for robots
Innovative VLA model: Helix combines perception, language and movement
The recent introduction of Helix by Figure AI marks significant progress in the robotics AI landscape. This Vision-Language-Action (VLA) model differs from existing systems in several important respects and sets new standards for the control of humanoid robots. Helix combines visual perception, language understanding and precise movement control in an integrated system that was designed specifically for the challenges of physical robotics.
Unique dual system architecture
The most significant difference between Helix and other AI systems for robots lies in its two-component architecture. This dual-system structure addresses a fundamental problem of robotics AI: the trade-off between generality and speed.
System 1 and System 2: Complementary intelligence
In contrast to conventional approaches, Helix uses two complementary systems that together strike a balance between generality and speed. System 2 (S2) is a multimodal language model with 7 billion parameters that runs at a frequency of 7-9 Hz and acts as the analytical “brain” of the robot. It processes visual data and voice commands, interprets the environment and decides which actions should be carried out.
System 1 (S1) is a fast, reactive visuomotor control unit with 80 million parameters. This component translates the semantic information provided by S2 into precise, continuous robot actions at a frequency of 200 Hz. Figure AI explains that earlier approaches failed because they lacked either generality or speed: vision-language models (VLMs) are general but not fast, while visuomotor policies for robots are fast but not general. Helix overcomes this dichotomy through its dual structure.
This architecture differs fundamentally from other known VLA models such as Google DeepMind's RT-2, which also combines visual data and voice commands but has no comparable split into two systems.
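The two-rate split described in this section can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Figure AI's implementation: the names `Latent`, `slow_system`, `fast_system` and `run` are hypothetical, S2 is assumed to run at 8 Hz (inside the 7-9 Hz range given above), and both systems are replaced by placeholder logic. It shows only the scheduling idea: the fast loop acts on every tick and reuses the most recent output of the slow loop.

```python
from dataclasses import dataclass

S1_HZ = 200                      # fast control rate, from the article
S2_HZ = 8                        # assumed value inside the 7-9 Hz range
TICKS_PER_S2 = S1_HZ // S2_HZ    # 25 fast ticks per slow update


@dataclass
class Latent:
    """Hypothetical semantic summary that S2 hands to S1."""
    goal: str
    created_at_tick: int


def slow_system(command: str, tick: int) -> Latent:
    """Placeholder for S2: interpret the command and the scene."""
    return Latent(goal=command, created_at_tick=tick)


def fast_system(latent: Latent, tick: int) -> dict:
    """Placeholder for S1: emit one action per tick from the latest latent."""
    return {
        "tick": tick,
        "goal": latent.goal,
        "latent_age": tick - latent.created_at_tick,
    }


def run(command: str, seconds: float = 1.0) -> list[dict]:
    """Simulate the two loops on a shared 200 Hz tick counter."""
    actions = []
    latent = slow_system(command, 0)
    for tick in range(int(seconds * S1_HZ)):
        if tick % TICKS_PER_S2 == 0:
            latent = slow_system(command, tick)   # slow loop refreshes the goal
        actions.append(fast_system(latent, tick))  # fast loop acts every tick
    return actions


actions = run("put the apple in the bowl")
print(len(actions), max(a["latent_age"] for a in actions))
```

In this toy setup the fast loop produces 200 actions per simulated second and never works with a latent older than 24 ticks. A real system would run the two loops asynchronously on hardware rather than on a shared tick counter.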
Comprehensive control skills
Control over 35 degrees of freedom
Another distinguishing feature of Helix is its ability to coordinate 35 degrees of freedom at the same time. This enables precise, high-speed control of the entire humanoid upper body, including wrists, torso, head and individual fingers. This control capacity exceeds that of most existing systems and allows complex manipulation tasks that require a high degree of fine motor skill.
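A rough back-of-the-envelope calculation shows what these figures imply for the controller. It assumes, purely for illustration, that the 200 Hz movement component emits one target value per degree of freedom on every control step; the article does not specify the actual action format.

```python
DOF = 35            # degrees of freedom, from the article
CONTROL_HZ = 200    # output rate of the movement component, from the article

# Time available to compute one full upper-body action.
step_budget_ms = 1000 / CONTROL_HZ

# Assumption: one target value per degree of freedom per control step.
setpoints_per_second = DOF * CONTROL_HZ

print(step_budget_ms, setpoints_per_second)
```

Under that assumption, the controller has 5 ms per step and produces 7,000 joint targets per second.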
Object generalization and learning
Universal object recognition without specific training
An outstanding quality of Helix is its ability to recognize and handle practically any small household object without having been trained on its specific properties beforehand. This far-reaching generalization ability enables the system to handle thousands of objects with different shapes, sizes, colors and material properties.
In contrast to many other AI robot systems, which have to be reprogrammed or retrained for every new task or object type, Helix can adapt to different situations and react to natural voice commands. This represents a paradigm shift, since the system uses a single neural network to learn all behaviors, such as picking up and placing objects, using drawers and refrigerators, and interacting with other robots, without task-specific fine-tuning.
Multi-robot coordination
Unique collaboration skills
Helix is the first VLA model that is able to control two robots at the same time and enable them to work together. This ability allows robots to solve complex tasks together, handing objects to each other and coordinating their movements. The almost human-like communication between the robots through nods and eye contact is particularly remarkable.
This form of coordination represents significant progress compared to conventional systems, in which each robot is typically controlled individually or must be trained specifically for certain roles. With Helix, both robots use the same model weights without the need for individual adjustments.
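The idea of two robots sharing one set of weights can be sketched as follows. This is a toy illustration, not Figure AI's code: `PolicyWeights`, `Robot` and `act` are hypothetical names, the prompts and observations are made up, and the "policy" only formats a string. The point is that both robot objects reference the same weights and differ only in what they observe and are asked to do.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PolicyWeights:
    """Hypothetical stand-in for one set of trained model weights."""
    version: str


@dataclass
class Robot:
    name: str
    weights: PolicyWeights              # shared reference, not a copy
    log: list = field(default_factory=list)

    def act(self, observation: str, prompt: str) -> str:
        """Placeholder policy call: behavior depends only on the inputs."""
        action = f"{self.name}: '{prompt}' given {observation}"
        self.log.append(action)
        return action


shared = PolicyWeights(version="demo")
left = Robot("robot_left", shared)
right = Robot("robot_right", shared)

# Illustrative prompts; each robot gets its own observation and instruction.
left.act("item on the counter", "hand the item to the robot on your right")
right.act("item offered by the other robot", "place the item in the drawer")

print(left.weights is right.weights)
```

Because no per-robot adjustment exists in this sketch, swapping the two prompts would swap the roles without touching the weights.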
Training efficiency and implementation
Minimal training needs, maximum performance
Another significant difference lies in the efficiency of the training process. Helix was developed with just 500 hours of high-quality teleoperated training data, considerably less than comparable approaches, which often need thousands of hours of task-specific demonstrations. This efficiency underlines not only the technical sophistication of the system but also its economic feasibility for commercial applications.
Embedded-capable processing
Unlike many robotics AI systems that rely on powerful external servers, Helix runs entirely on embedded, energy-efficient GPUs within the robot. This on-board processing eliminates the need for a constant connection to external computing resources and makes the robot more autonomous and flexible in various environments.
Strategic differentiation
Vertical integration instead of generic AI models
Figure AI has strategically set itself apart from other companies by ending its cooperation with OpenAI and pursuing a vertically integrated strategy in which both hardware and software are developed internally. CEO Brett Adcock said that generic AI models are not sufficient to meet the requirements of “embodied AI”, that is, AI in physical robots. This decision underlines the approach of developing tailor-made solutions for the specific challenges of robotics instead of relying on general AI models.
Application orientation
Focus on household use
While many players in the industry are currently focusing on industrial or workplace robot applications, Figure AI is pursuing a strategically surprising approach with Helix by focusing on household robotics. The robot's ability to carry out everyday activities such as sorting food, using the refrigerator or handling a wide variety of household items targets a market that other players often consider too complex to enter.
Multi-robot coordination: The key to the next robotic generation
With its dual-system architecture, Helix stands out clearly from other AI systems for robots. With its efficient training process, embedded processing and strategic focus on household applications, it represents significant progress in the development of humanoid robots. While other systems such as Google DeepMind's RT-2 pursue similar approaches to combining visual data and voice commands, Helix offers differentiating advantages through its architecture and its integrated development approach, which make it a pioneer in the next generation of AI-controlled robots.