Train your humanoid robot with a prompt. 1X NEO now learns tasks through video generation

Revolutionizing Home Robotics: How 1X’s NEO is Changing the Game

In recent years, robotics demonstrations have dazzled audiences worldwide, yet 1X’s latest breakthrough signals a transformative leap in how robots will integrate into everyday household life.

Introducing NEO: A Robot That Learns Through Imagination

The Norwegian innovator 1X has unveiled its humanoid robot, NEO, which now possesses the remarkable ability to learn and execute intricate tasks by generating training videos from simple prompts. Whether receiving a spoken command or a text instruction, NEO internally visualizes the task’s completion before physically performing the required actions.

This contrasts sharply with traditional robot training methods, which rely heavily on teleoperation: human operators painstakingly guide the robot through the same task hundreds or thousands of times.

Leveraging a Vast World Model for Autonomous Learning

NEO’s new approach is powered by a pre-trained world model, developed from an extensive dataset of internet-scale videos. This enables the robot to grasp human interactions and physical dynamics without needing explicit demonstrations for every object it encounters.

“With the latest update to the 1X world model, NEO can autonomously convert any prompt into action.”
Eric Jang, VP of AI, 1X

This advancement takes aim at the data bottleneck that has historically constrained robotics development: the need to collect fresh demonstrations for every new task.

From Visualization to Physical Execution

NEO observes its surroundings, creates a mental video of the successful task, and then employs an inverse dynamics model to translate this visual plan into real-world movements. Unlike conventional AI systems that often falter under varying lighting or cluttered environments, NEO’s model is robust and adaptable.

By tapping into a vast repository of human knowledge sourced from the web, NEO comprehends the context of a room and the expected behavior of objects within it.

“The world model allows NEO to generate limitless data of itself performing real-world tasks, creating a self-reinforcing cycle of self-teaching.”
Eric Jang, VP of AI, 1X

Scaling Capabilities Beyond Pre-Programmed Tasks

The scalability of this technology is immense. NEO is no longer confined to a fixed set of pre-recorded tasks. Instead, it can attempt any task a human can describe, “hallucinating” the correct sequence of physical actions before execution.

However, this “dream-to-action” process is not without its challenges. Currently, generating high-fidelity, physics-grounded video data takes time. Demonstrations show NEO pausing momentarily to process commands and visualize the task before initiating movement.

For straightforward actions like picking up an object, a brief delay is manageable, but in scenarios demanding rapid response, this latency could be problematic. Additionally, the computational expense of running such extensive world models is significant, akin to the costs associated with querying large language models like ChatGPT or Gemini.

“At inference, the system receives a text prompt and an initial frame. The world model predicts the future sequence, the inverse dynamics model extracts the necessary trajectory, and the robot executes the plan in reality.”

This means that each new action requires fresh computational resources, potentially leading to ongoing operational costs that might resemble subscription fees for early users.
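
The inference loop quoted above can be sketched in a few lines of Python. This is only a toy illustration of the data flow (prompt and initial frame in, predicted frames, extracted trajectory, execution), not 1X's implementation. Every name here (`predict_future_frames`, `inverse_dynamics`, `execute`, `prompt_to_action`) is hypothetical, a "frame" is reduced to a short list of numbers, and the "world model" is a stand-in that simply drifts toward a prompt-derived target.

```python
from dataclasses import dataclass
from typing import List

Frame = List[float]   # hypothetical stand-in for an image: a flat feature vector
Action = List[float]  # hypothetical stand-in for a motor command


@dataclass
class Plan:
    frames: List[Frame]
    actions: List[Action]


def predict_future_frames(prompt: str, initial_frame: Frame, horizon: int) -> List[Frame]:
    """Toy 'world model': move each feature halfway toward a prompt-derived target."""
    target = float(len(prompt) % 10)  # arbitrary placeholder for prompt conditioning
    frames = [list(initial_frame)]
    for _ in range(horizon):
        frames.append([x + 0.5 * (target - x) for x in frames[-1]])
    return frames


def inverse_dynamics(frames: List[Frame]) -> List[Action]:
    """Toy 'inverse dynamics model': the action is the change between consecutive frames."""
    return [[b - a for a, b in zip(f0, f1)] for f0, f1 in zip(frames, frames[1:])]


def execute(initial_state: Frame, actions: List[Action]) -> Frame:
    """Toy 'robot': apply each action to the current state in order."""
    state = list(initial_state)
    for action in actions:
        state = [s + a for s, a in zip(state, action)]
    return state


def prompt_to_action(prompt: str, initial_frame: Frame, horizon: int = 8):
    frames = predict_future_frames(prompt, initial_frame, horizon)  # imagine the task
    actions = inverse_dynamics(frames)                              # extract the trajectory
    final_state = execute(initial_frame, actions)                   # act in "reality"
    return Plan(frames, actions), final_state


plan, final_state = prompt_to_action("pick up the cup", [0.0, 1.0], horizon=4)
```

Note that nothing is cached between calls: each prompt triggers a fresh prediction pass, which is the per-action compute cost described above.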

Ongoing Improvements and Real-World Challenges

1X is actively enhancing the duration and reliability of NEO’s generated skills to ensure consistent performance in everyday environments, which are far less controlled than laboratory settings. While the robot has demonstrated tasks like manipulating a toilet seat or handling a lunchbox, real homes present unpredictable variables.

Despite these hurdles, the concept of a robot that evolves alongside advances in generative AI video models is highly promising.

The Future of Robotics: Self-Improving and Context-Aware

As generative models such as Sora and Kling advance in their understanding of three-dimensional spaces, robots like NEO will gain enhanced capabilities without requiring hardware upgrades. This creates a virtuous cycle where the robot continuously generates its own training data, learns from its experiences, and refines its physical skills over time.
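
The self-teaching cycle described here can be illustrated with a toy simulation. This is a sketch under loud assumptions, not a model of NEO: `skill` is a made-up success probability, each successful attempt is counted as one new self-generated training example, and the update rule (each success closes 1% of the remaining gap to perfect skill) is invented purely for illustration.

```python
import random
from typing import List, Tuple


def attempt_task(skill: float, rng: random.Random) -> bool:
    """Hypothetical rollout: succeeds with probability equal to the current skill."""
    return rng.random() < skill


def flywheel(initial_skill: float, rounds: int, attempts_per_round: int,
             seed: int = 0) -> List[Tuple[int, float]]:
    """Run generate -> filter -> retrain rounds; return (dataset_size, skill) per round."""
    rng = random.Random(seed)
    skill = initial_skill
    dataset_size = 0
    history = []
    for _ in range(rounds):
        # Generate attempts and keep only the successful ones as training data.
        successes = sum(attempt_task(skill, rng) for _ in range(attempts_per_round))
        dataset_size += successes
        # Assumed update rule: each success closes 1% of the gap to skill = 1.0.
        skill = skill + (1.0 - skill) * min(1.0, 0.01 * successes)
        history.append((dataset_size, skill))
    return history


history = flywheel(initial_skill=0.2, rounds=5, attempts_per_round=20)
for dataset_size, skill in history:
    print(f"examples so far: {dataset_size:3d}  skill: {skill:.3f}")
```

In this toy, better skill yields more usable examples per round, which in turn raises skill further. How quickly a real system improves depends on how well generated data transfers to physical performance, which the simulation does not capture.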

For markets like Australia, known for embracing smart home innovations early, the vision of a versatile, general-purpose robotic assistant is becoming increasingly tangible. Although local pricing for the NEO Beta remains unconfirmed, 1X aims to make its robots accessible to the mass market eventually.

For context, research-grade humanoid robots often cost several hundred thousand dollars, whereas consumer models are expected to be priced closer to a compact car. Initial estimates suggest a premium humanoid could cost between AUD 30,000 and AUD 50,000, though this remains speculative.

From “If-Then” Programming to Autonomous Problem Solving

The most exciting aspect of 1X’s innovation is the “flywheel” effect: because NEO can generate its own training data, each improvement in the model can speed up the next round of skill acquisition. This marks a departure from rigid, rule-based programming toward an era where users simply specify desired outcomes, and the robot autonomously determines how to achieve them.

This shift could be likened to an “iPhone moment” for robotics: transformative and foundational, even as the ecosystem of applications continues to develop. The 1X world model envisions a future where the physical environment becomes just another interface for AI-driven prompt-to-action interactions.

Bridging the Gap Between Digital Intelligence and Physical Labor

Watching NEO perform complex tasks reveals the diminishing divide between virtual cognition and tangible work. Challenges remain, such as improving battery longevity, enhancing motor durability, and speeding up inference, but the groundwork for a new generation of adaptable robots has been established.

Achieving the ability to generalize across any task and object is often described as the “Holy Grail” of robotics, and 1X appears to have discovered an ingenious shortcut toward this goal.

As this technology matures, it will be fascinating to observe how these “world models” manage the unpredictability inherent in human environments.
