How could a humanoid robot, trained entirely in simulation, far outperform human operators on a real-world door-opening task with zero post-deployment tuning? The answer comes from NVIDIA’s latest breakthrough: the DoorMan system.

DoorMan was designed to address a deceptively complex problem: opening a wide range of doors, from knobs to levers, quickly and precisely. Tested on the $16,000 Unitree G1 humanoid robot, the system requires only RGB camera input: no depth sensors, motion-capture markers, or specialized tactile hardware. The robot’s policy was trained entirely in NVIDIA Isaac Lab, a simulation environment designed for scalable reinforcement learning, then deployed zero-shot in physical tests. In head-to-head trials, DoorMan opened doors up to 31% faster than expert human teleoperators and achieved an 83% success rate, outperforming both expert (80%) and non-expert (60%) human control.
The system’s design directly addresses two stubborn challenges in reinforcement learning for manipulation. The first is exploration: robots learning from scratch rarely discover the right sequence of actions on their own. DoorMan uses a “staged-reset” mechanism that saves the simulation state whenever the robot succeeds at grasping a handle, so subsequent episodes can start from that point. This greatly speeds up learning of later phases, such as pushing or pulling the door open. The second is visibility: as the robot approaches a door, the handle drops out of view. The team applied Group Relative Policy Optimization to encourage behaviors that keep key features in view, producing very subtle adjustments such as stepping back or tilting the head.
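A staged-reset mechanism of this kind can be sketched as an environment wrapper that caches post-grasp states and resumes from them on reset. The sketch below is illustrative only: the `save_state`/`load_state` methods, the `grasp_succeeded` flag, and the toy environment are hypothetical stand-ins, not the actual Isaac Lab API.

```python
import random

class StagedResetWrapper:
    """Cache states reached after a successful grasp so later episodes can
    resume from them, letting the policy practice the pull/push phase
    without re-exploring the grasp phase every time.

    Assumes a hypothetical env interface: save_state()/load_state() and a
    'grasp_succeeded' flag in the step info dict (illustrative names).
    """

    def __init__(self, env, resume_prob=0.5, max_cache=100):
        self.env = env
        self.resume_prob = resume_prob
        self.max_cache = max_cache
        self.grasp_states = []  # snapshots taken at the moment of grasp success

    def reset(self):
        # With some probability, restart from a cached post-grasp state.
        if self.grasp_states and random.random() < self.resume_prob:
            return self.env.load_state(random.choice(self.grasp_states))
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if info.get("grasp_succeeded") and len(self.grasp_states) < self.max_cache:
            self.grasp_states.append(self.env.save_state())
        return obs, reward, done, info

# Minimal toy environment to demonstrate the wrapper (also illustrative).
class ToyDoorEnv:
    def __init__(self):
        self.t = 0
    def reset(self):
        self.t = 0
        return self.t
    def step(self, action):
        self.t += 1
        info = {"grasp_succeeded": self.t == 3}  # pretend grasp succeeds at step 3
        return self.t, 0.0, self.t >= 5, info
    def save_state(self):
        return self.t
    def load_state(self, state):
        self.t = state
        return self.t

env = StagedResetWrapper(ToyDoorEnv(), resume_prob=1.0)
env.reset()
for _ in range(5):
    env.step(0)
# After one full episode the cache holds the post-grasp state, so the next
# reset resumes from step 3 instead of step 0.
assert env.reset() == 3
```

The design choice is a trade-off: resuming too often starves the grasp phase of practice, so a resume probability below 1.0 keeps both phases in the training distribution.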
With its success, DoorMan marks a significant advance in loco-manipulation: tasks requiring coherent locomotion, perception, and manipulation. This fits the broader trend in sim-to-real humanoid robotics, in which policies optimized in simulation must generalize to the variability of the real world. Here, domain randomization in Isaac Lab produced millions of door permutations, continuously varying hinge stiffness, damping, handle geometry, and textures. By deployment time, the real-world door was “just another instance” within the model’s training range.
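This style of domain randomization can be sketched as a per-episode parameter sampler. The ranges and parameter names below are illustrative placeholders, not the values used in DoorMan.

```python
import random

def sample_door_params(rng=random):
    """Sample one randomized door instance for a training episode.

    All ranges are illustrative placeholders, not DoorMan's actual values.
    """
    return {
        "hinge_stiffness": rng.uniform(0.0, 50.0),    # N·m/rad
        "hinge_damping": rng.uniform(0.1, 10.0),      # N·m·s/rad
        "handle_type": rng.choice(["knob", "lever"]),
        "handle_height": rng.uniform(0.85, 1.10),     # meters
        "opens_inward": rng.random() < 0.5,
        "texture_id": rng.randrange(1000),            # visual randomization
    }

# Each episode draws a fresh door, so the policy never overfits to one
# mechanical or visual configuration.
params = sample_door_params()
```

Because every real door's properties fall somewhere inside these sampled ranges, the physical test door is statistically indistinguishable from a training sample.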
Complementary work on closed-loop, haptic-aware control shows how much richer this interaction is than raw visual appearance alone. Whereas DoorMan relies exclusively on vision, active haptic feedback systems, which use inexpensive motor-current data in place of expensive force/torque sensors, have demonstrated dynamic adaptation to unexpected resistive forces, real-time push/pull configuration detection, and real-time grasp error correction. These methods achieve success rates of up to 90% on previously unseen doors, suggesting that combining DoorMan’s vision-based policy with low-cost active tactile sensing could further improve robustness.
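The core idea of current-based haptics is that motor current is a cheap proxy for applied torque: a spike of measured current over the model-predicted current implies unexpected resistance. A minimal sketch of such a classifier, with an invented threshold and function name (not from any real controller):

```python
def classify_door_interaction(measured_amps, expected_amps, threshold=0.3):
    """Classify a door interaction from the motor-current residual.

    Returns "resisting" when measured current exceeds the expected
    (model-predicted) current by more than the threshold, e.g. when a push
    door is being pulled or the handle is still latched; otherwise "free".
    The 0.3 A threshold is an illustrative placeholder.
    """
    residual = measured_amps - expected_amps
    return "resisting" if residual > threshold else "free"

# e.g. classify_door_interaction(2.4, 1.0) -> "resisting"
```

A real controller would filter the residual over a short window to reject current spikes from normal accelerations, but the detection principle is the same.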
DoorMan’s approach thus draws on the broader sim-to-real strategies emerging in humanoid robotics. Actuator modeling, nonparametric noise injection, and curriculum learning have been essential techniques for transferring both walking and manipulation skills to hardware without degradation. The training infrastructure centers on Isaac Lab within the NVIDIA ecosystem, which supports large-scale policy development and validation. The Unitree G1’s performance in the DoorMan trials reflected these advantages: robust zero-shot transfer, adaptability to diverse mechanical properties, and efficient policy scaling.
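Of these techniques, curriculum learning is the most mechanical to sketch: randomization ranges start narrow and widen as the policy’s success rate improves. The class below is a generic illustration of that schedule, with made-up ranges and thresholds; it is not DoorMan’s actual curriculum.

```python
class RandomizationCurriculum:
    """Widen a domain-randomization range as the policy improves.

    Starts from an easy, narrow parameter range and interpolates toward the
    full range each time the recent success rate clears a promotion
    threshold. All numbers here are illustrative placeholders.
    """

    def __init__(self, base_range=(0.0, 5.0), full_range=(0.0, 50.0),
                 promote_at=0.8, step=0.2):
        self.base = base_range
        self.full = full_range
        self.promote_at = promote_at
        self.step = step
        self.level = 0.0  # 0.0 = easiest, 1.0 = full randomization

    def update(self, recent_success_rate):
        # Promote only when the policy is reliably succeeding at this level.
        if recent_success_rate >= self.promote_at:
            self.level = min(1.0, self.level + self.step)

    def current_range(self):
        # Linear interpolation between the base and full ranges.
        lo = self.base[0] + self.level * (self.full[0] - self.base[0])
        hi = self.base[1] + self.level * (self.full[1] - self.base[1])
        return (lo, hi)

curriculum = RandomizationCurriculum()
curriculum.update(recent_success_rate=0.9)  # promoted: range widens one step
```

Gating difficulty on measured success keeps the policy inside its zone of learnability, which is what lets hard parameter regimes (stiff hinges, heavy doors) be reached without destabilizing training.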
The implications go well beyond opening doors. The same visual reinforcement learning pipeline could apply to other articulated-object interactions, such as drawers, cabinets, and machinery, wherever loco-manipulation is critical. As humanoid platforms gain generalist capabilities through systems such as NVIDIA Isaac GR00T, which integrates multimodal perception and imitation learning, combining them with specialized policies like DoorMan may yield robots that can perform complex, human-centric tasks in unstructured environments. By showing that a vision-only, simulation-trained policy can outperform human teleoperation on a high-variability, contact-rich task, DoorMan sets a new bar for sim-to-real deployment. For robotics engineers and AI researchers, it demonstrates how targeted policy design, large-scale simulated diversity, and careful handling of perception challenges can unlock real-world performance without costly fine-tuning.
