Presented by DAIMON Robotics.
In April, DAIMON Robotics, headquartered in Hong Kong, unveiled Daimon-Infinity, heralded as the most extensive omni-modal dataset for physical AI to date. This dataset integrates ultra-high-resolution tactile sensing and encompasses a diverse array of tasks, from household chores like folding clothes to complex industrial assembly line operations. The initiative is a collaborative effort involving global leaders such as Google DeepMind, Northwestern University, and the National University of Singapore.
DAIMON Robotics, a pioneering company just over two years old, is renowned for its sophisticated tactile sensor technology. Their flagship product is a monochromatic, vision-based tactile sensor that compresses more than 110,000 sensing units into a fingertip-sized module. Leveraging this advanced sensor and a distributed data collection network that operates beyond traditional lab environments, DAIMON generates millions of hours of tactile data annually. This vast repository supports the creation of comprehensive robot manipulation datasets, which the company has partially open-sourced, releasing 10,000 hours of data to accelerate embodied AI development worldwide.
Prof. Michael Yu Wang, co-founder and chief scientist at DAIMON Robotics, has been instrumental in developing the Vision-Tactile-Language-Action (VTLA) framework, positioning tactile sensing alongside vision as a fundamental modality.
Revolutionizing Robot Manipulation with Tactile Data
At the heart of DAIMON’s innovation is Prof. Michael Yu Wang, a distinguished roboticist with a PhD from Carnegie Mellon University, where he studied manipulation under Matt Mason. He later founded the Robotics Institute at the Hong Kong University of Science and Technology and is an IEEE Fellow and former Editor-in-Chief of IEEE Transactions on Automation Science and Engineering. Prof. Wang’s mission is to overcome the limitations of current robot manipulation models, which predominantly rely on Vision-Language-Action (VLA) frameworks that lack tactile sensitivity. His team’s breakthrough is the Vision-Tactile-Language-Action (VTLA) architecture, which elevates tactile sensing to equal importance with vision, enabling robots to interact with their environment more naturally and effectively.
We discussed with Prof. Wang how tactile feedback is transforming dexterous manipulation, the significance of the Daimon-Infinity dataset in enhancing robotic hand performance in real-world settings, and the initial sectors, ranging from hospitality to retail in China, where tactile-enabled robots are making tangible impacts.
Daimon-Infinity stands as the globe’s largest omni-modal dataset for Physical AI, featuring multimodal data spanning millions of hours, ultra-high-resolution tactile feedback, over 80 real-world scenarios, and more than 2,000 human skills.
Building the World’s Largest Robotic Manipulation Dataset
Earlier this month, DAIMON Robotics launched the most comprehensive robotic manipulation dataset to date, developed in partnership with leading academic institutions and industry players. When asked why the company prioritized releasing this dataset over continuing product development, Prof. Wang emphasized the critical role of data in advancing embodied AI.
“Data scarcity, especially in physical interaction data, remains a significant bottleneck in robot learning,” he explained. “Our vision-based tactile sensors capture rich, multimodal data: not just contact forces but also deformation, slip, friction, material characteristics, and surface textures. This enables a detailed reconstruction of physical interactions essential for real-world robot operation.”
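To make the kinds of signals Prof. Wang lists concrete, here is a minimal sketch of how one sample from such a sensor might be organized. The class, field names, and helper below are entirely hypothetical illustrations, not DAIMON's actual data schema:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TactileFrame:
    """One sample from a vision-based tactile sensor (hypothetical schema)."""
    timestamp_s: float                    # capture time, in seconds
    normal_force_n: float                 # estimated normal contact force (newtons)
    shear_force_n: Tuple[float, float]    # tangential force components (newtons)
    deformation_img: List[List[float]]    # per-pixel gel deformation map
    slip_detected: bool                   # whether incipient slip was flagged
    friction_coeff: Optional[float] = None  # optional friction estimate


def mean_deformation(frame: TactileFrame) -> float:
    """Average deformation across the sensing array."""
    pixels = [v for row in frame.deformation_img for v in row]
    return sum(pixels) / len(pixels)
```

A downstream learning pipeline would consume sequences of such frames alongside camera images and language instructions; the point of the sketch is simply that a single tactile sample already bundles several distinct physical quantities.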
DAIMON’s unique approach involves a distributed, out-of-lab data collection network that gathers millions of hours of tactile data annually from diverse environments. This scalable system contrasts with traditional centralized data factories, allowing for more varied and practical datasets.
“By open-sourcing 10,000 hours of this dataset, we aim to empower the entire embodied AI community and accelerate the deployment of general-purpose robotic foundation models,” Prof. Wang added.
Collaborative Development and Industry Impact
The dataset’s creation involved collaboration with top-tier partners worldwide, including Northwestern University, the National University of Singapore, Google DeepMind, and China Mobile. These collaborators contribute by deploying DAIMON’s tactile sensors in various real-world scenarios, from research labs to manufacturing floors, helping to collect application-driven data that informs tailored model training.
“Our partners’ involvement validates the dataset’s value and ensures it addresses practical challenges,” Prof. Wang noted. “This synergy between data collection and model development is crucial for advancing embodied AI.”
DAIMON’s visuotactile sensor enables robotic grippers to delicately sense contact and precisely modulate force, allowing them to handle fragile objects like eggshells with care.
From Vision-Language-Action to Vision-Tactile-Language-Action
While the Vision-Language-Action (VLA) model has dominated robotics, DAIMON’s team advocates for the inclusion of tactile sensing, forming the Vision-Tactile-Language-Action (VTLA) framework. Prof. Wang explained why tactile feedback is indispensable for advanced manipulation tasks.
“Dexterous manipulation requires more than just vision and language understanding. Robots must sense contact states, detect slip, and control force precisely to handle delicate or complex objects,” he said. “Without tactile input, robots struggle in low-light conditions, risk dropping fragile items, and often fail in tasks requiring nuanced force application.”
DAIMON’s vision-based tactile sensors capture sequential images of fingertip deformation, encoding rich contact information that integrates seamlessly with visual data. This compatibility allows tactile sensing to be incorporated naturally into existing VLA frameworks, enhancing robot perception and control.
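The compatibility point above, that tactile readings arrive as images and can therefore enter a model the same way camera frames do, can be illustrated with a toy token-fusion sketch. Everything here is a hypothetical stand-in: a real VTLA policy would use learned encoders, not the naive pooling shown.

```python
from typing import List


def encode_patch(patch: List[float], dim: int = 8) -> List[float]:
    """Stand-in encoder: map a flattened image patch to a fixed-size embedding.
    (Toy mean-pooling; a real model would use a learned projection.)"""
    mean = sum(patch) / len(patch)
    return [mean] * dim


def fuse_tokens(vision_patches: List[List[float]],
                tactile_patches: List[List[float]],
                dim: int = 8) -> List[List[float]]:
    """Build one token sequence: vision tokens followed by tactile tokens.
    Because tactile data is image-like, the same patch-encode-concatenate
    recipe used for camera input applies unchanged."""
    return ([encode_patch(p, dim) for p in vision_patches] +
            [encode_patch(p, dim) for p in tactile_patches])
```

The fused sequence could then be fed to any transformer-style policy alongside language tokens, which is the structural sense in which tactile sensing slots into an existing VLA framework.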
DAIMON’s vision-based tactile sensors boast over 110,000 sensing units, delivering unparalleled resolution for robotic fingertips.
Innovating with Monochromatic Vision-Based Tactile Sensors
DAIMON’s choice to develop a monochromatic vision-based tactile sensor stems from a desire to replicate the human fingertip’s sensory capabilities. Prof. Wang highlighted the importance of mimicking the human sense of touch, which discerns material properties, force distribution, and object movement.
“We evaluated existing tactile technologies and opted for a monochromatic vision-based approach that balances sensitivity, reliability, and cost-effectiveness,” he explained. “This engineering-driven solution leverages foundational research while delivering practical performance for real-world applications.”
Last year, DAIMON introduced a multi-dimensional, high-frequency tactile sensor that surpasses traditional sensors in sensing density and dynamic response. This technology is poised to revolutionize industries requiring delicate manipulation, such as retail automation, where robots must navigate cramped spaces and handle diverse objects.
“For example, in convenience stores with tightly packed shelves, robots need slim, dexterous fingers to grasp items without disturbing adjacent products,” Prof. Wang said. “Our tactile sensors enable such nuanced control, which conventional two-jaw grippers cannot achieve.”
From Academic Roots to Industry Leadership
After a distinguished academic career spanning four decades, including founding the HKUST Robotics Institute and earning IEEE Fellow recognition, Prof. Wang co-founded DAIMON Robotics to translate research breakthroughs into commercial impact.
“My PhD work at Carnegie Mellon under Matt Mason laid the foundation for my focus on dexterous manipulation,” he reflected. “Despite progress in locomotion robots, manipulation remained a challenge. Founding DAIMON allowed us to harness emerging talent and capital to push tactile sensing technology forward.”
Co-founder Dr. Duan Jianghua’s entrepreneurial vision complemented this mission, helping DAIMON grow into a global player with a vibrant community spanning Asia, the U.S., and Europe.
DAIMON’s tactile sensing technology is already deployed in factory environments, bridging the gap between robotic perception and physical interaction.
Strategic Vision: Devices, Data, and Deployment
DAIMON’s business model revolves around a “3D” strategy: Devices, Data, and Deployment. Initially focused on producing advanced tactile sensors, the company has expanded to encompass large-scale data collection and real-world application deployment.
“We recognize that success requires an integrated technology chain, from hardware to high-quality data to effective model training and deployment,” Prof. Wang explained. “Our distributed data collection network and partnerships enable closed-loop validation, ensuring our solutions meet practical needs.”
Embodied Skills: The Next Frontier in Humanoid Robotics
Prof. Wang introduced the concept of “embodied skills” as critical for humanoid robots to transcend mere AI cognition and achieve physical competence.
“Advances in mechatronics and electronics have made fully electric, high-torque robots feasible,” he said. “Coupled with AI breakthroughs, especially in large language and world models, we are approaching a convergence where robots can autonomously perceive, decide, and act in unstructured environments.”
He envisions human-sized robots becoming commonplace in homes, capable of safe, reliable, and cost-effective operation that benefits society.
Pathways to Real-World Robot Integration
Despite impressive demonstrations, widespread deployment of generalist robots remains a work in progress. Prof. Wang identified specific sectors where robots are already making headway.
“In China, nearly every major hotel employs delivery robots that autonomously navigate lobbies, elevators, and corridors to deliver food to guests,” he noted. “These robots, though lacking arms, demonstrate the feasibility of autonomous service robots in controlled environments.”
He anticipates similar adoption in restaurants, drugstores, and convenience stores, where tactile-enabled humanoid robots can perform complex tasks in confined spaces. Gradual expansion into other industries will follow as capabilities mature.
“Our ultimate goal is for robots to develop robust manipulation skills and become dependable partners in daily life, seamlessly integrating into homes and workplaces to enhance human well-being,” Prof. Wang concluded.
Interview content has been edited for clarity and brevity.