NVIDIA has launched a range of AI and simulation tools, including the open-source Isaac Lab framework and Project GR00T workflows, to accelerate the development of humanoid and other AI-enabled robots, alongside new data processing tools for robot learning and world-model creation. (Source: Image by RR)

Robot Developers Gain Speed and Efficiency with NVIDIA’s Latest AI Frameworks

NVIDIA has unveiled a suite of new AI tools and workflows at the Conference on Robot Learning (CoRL) in Munich, designed to speed up the development of AI-enabled robots, including humanoids. The centerpiece, as noted in blogs.nvidia.com, is the general availability of the open-source NVIDIA Isaac Lab, a robot learning framework built on NVIDIA Omniverse that supports training at scale for robot embodiments such as humanoids and quadrupeds. It is complemented by six new Project GR00T workflows aimed at accelerating humanoid robot development, which provide blueprints for generative 3D environments, motion generation, dexterity, whole-body control, navigation, and perception.

NVIDIA also introduced world-model development tools to support video data curation and processing, including the NVIDIA Cosmos tokenizer and NeMo Curator. The Cosmos tokenizer offers high-quality visual tokenization with high compression rates and runs up to 12 times faster than current tokenizers, which helps developers build the visual representations that predictive world models need. NeMo Curator’s new video processing pipeline streamlines data handling, reducing processing times and scaling across multi-node, multi-GPU systems, which aids in building data-intensive AI world models for robotics.

Several leading robotics companies and developers, such as 1X Technologies and XPENG Robotics, are already adopting NVIDIA’s new tools. The Cosmos tokenizer, for example, is integrated into 1X’s updated World Model Challenge dataset, allowing for efficient training of world models with long-horizon video generation while maintaining high visual fidelity. NeMo Curator enables developers to curate extensive text, image, and video data effectively, tackling challenges associated with the size of video data and ensuring faster processing and lower costs.

NVIDIA’s contributions to CoRL also included 23 research papers and nine workshops covering advances in integrating vision-language models, temporal navigation, long-horizon planning, and skill acquisition from human demonstrations. Notable projects included SkillGen, which focuses on synthetic data generation for training robots, and HOVER, a foundation model for humanoid control. These developments, along with guides and resources available on platforms like GitHub and Hugging Face, aim to support robotics researchers and developers in creating more advanced and efficient robotic systems.

Read more at blogs.nvidia.com