ZeroHSI is a new technique that uses neural rendering and video generation models to create lifelike 4D human-scene interactions
ZeroHSI uses video models trained on human motion, together with differentiable rendering, to build interactions in 3D environments, including ones with dynamic objects
Generating human-scene interaction (HSI) is essential for robotics, virtual reality, and embodied AI applications
ZeroHSI is a new method that combines video generation with neural human rendering to enable zero-shot 4D human-scene interaction synthesis
By conditioning on a series of text prompts, ZeroHSI can also generate long-horizon interactions
ZeroHSI performs well in synthetic 3D environments that contain objects, and it generates better interactions than the prior methods LINGO and CHOIS
ZeroHSI takes a 3D scene, an interactable object, a text description, and initial states as input, and produces motion sequences for both the human and the dynamic object
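
The inputs and outputs listed above, and the idea of conditioning on a series of text prompts, can be sketched as plain data structures. This is a minimal illustration, not the ZeroHSI implementation: every name here (`HSIState`, `HSIRequest`, `HSIMotion`, `synthesize_long_horizon`, `dummy_synthesizer`) is hypothetical, poses are placeholder tuples of floats, and chaining segments by reusing the last state of one segment as the initial state of the next is an assumption about how long-horizon generation could work

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Placeholder for a pose parameter vector; the real representation is not
# specified in the text above.
Pose = Tuple[float, ...]


@dataclass(frozen=True)
class HSIState:
    """Human and object state at one time step (hypothetical layout)."""
    human_pose: Pose
    object_pose: Pose


@dataclass(frozen=True)
class HSIRequest:
    """The four inputs named above: scene, object, text, initial states."""
    scene_path: str
    object_path: str
    prompt: str
    initial_state: HSIState


@dataclass
class HSIMotion:
    """Output: one state per frame for both the human and the object."""
    states: List[HSIState]


# Any function that maps a request to a motion segment (hypothetical type).
Synthesizer = Callable[[HSIRequest], HSIMotion]


def synthesize_long_horizon(
    synthesize: Synthesizer,
    scene_path: str,
    object_path: str,
    prompts: Sequence[str],
    initial_state: HSIState,
) -> HSIMotion:
    """Chain one segment per prompt (assumed scheme, not from the source)."""
    if not prompts:
        raise ValueError("at least one prompt is required")
    states = [initial_state]
    current = initial_state
    for prompt in prompts:
        request = HSIRequest(scene_path, object_path, prompt, current)
        segment = synthesize(request)
        if not segment.states:
            raise ValueError(f"empty segment for prompt: {prompt!r}")
        states.extend(segment.states)
        # The end of this segment becomes the start of the next one.
        current = segment.states[-1]
    return HSIMotion(states)


def dummy_synthesizer(request: HSIRequest) -> HSIMotion:
    """Stand-in generator: slides the human 1 unit along x for 3 frames."""
    x, *rest = request.initial_state.human_pose
    frames = [
        HSIState((x + step, *rest), request.initial_state.object_pose)
        for step in (1.0, 2.0, 3.0)
    ]
    return HSIMotion(frames)


if __name__ == "__main__":
    start = HSIState(human_pose=(0.0, 0.0, 0.0), object_pose=(2.0, 0.0, 0.0))
    motion = synthesize_long_horizon(
        dummy_synthesizer,
        "scene.ply",   # hypothetical file names
        "chair.obj",
        ["walk to the chair", "sit on the chair"],
        start,
    )
    print(len(motion.states), motion.states[-1].human_pose)
```

In a real pipeline, `dummy_synthesizer` would be replaced by the actual generation step; the sketch only shows how the inputs, outputs, and prompt sequence relate to one another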