ZeroHSI: Zero-Shot 4D Human-Scene Interaction

ZeroHSI is a new technique that uses neural rendering and video generation models to create lifelike 4D human-scene interactions.

ZeroHSI uses video models trained on human movement, together with differentiable rendering, to build interactions in 3D environments, including ones with dynamic objects.
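To make the role of differentiable rendering concrete, here is a minimal toy sketch, not the ZeroHSI implementation. It assumes that a generated video frame acts as the optimization target and that the quantity being recovered is a single 2D position. The Gaussian-blob renderer, the names `render`, `loss_and_grad`, and `fit_position`, and all constants are hypothetical, and the gradient is written out by hand rather than obtained from an autodiff library.

```python
import math

SIZE = 48      # the toy image is SIZE x SIZE pixels
SIGMA = 5.0    # width of the rendered blob (hypothetical constant)

def render(cx, cy):
    """Toy differentiable renderer: a Gaussian blob centred at (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * SIGMA ** 2))
             for x in range(SIZE)] for y in range(SIZE)]

def loss_and_grad(cx, cy, target):
    """Sum-of-squares image loss and its analytic gradient w.r.t. (cx, cy)."""
    image = render(cx, cy)
    loss = gx = gy = 0.0
    for y in range(SIZE):
        for x in range(SIZE):
            diff = image[y][x] - target[y][x]
            loss += diff * diff
            # d(image)/d(cx) = image * (x - cx) / sigma^2, and likewise for cy
            gx += 2 * diff * image[y][x] * (x - cx) / SIGMA ** 2
            gy += 2 * diff * image[y][x] * (y - cy) / SIGMA ** 2
    return loss, gx, gy

def fit_position(target, cx, cy, lr=0.2, steps=60):
    """Plain gradient descent on the rendering loss."""
    for _ in range(steps):
        _, gx, gy = loss_and_grad(cx, cy, target)
        cx -= lr * gx
        cy -= lr * gy
    return cx, cy

# The target stands in for one frame of a generated video (an assumption).
target_frame = render(28.0, 26.0)
initial_loss, _, _ = loss_and_grad(20.0, 20.0, target_frame)
cx, cy = fit_position(target_frame, 20.0, 20.0)
final_loss, _, _ = loss_and_grad(cx, cy, target_frame)
```

Because the renderer is differentiable, the image error can be pushed back onto the position directly. A real system would optimize far richer parameters, such as body pose and object pose, but the loop would have the same shape.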

Generating human-scene interactions (HSI) is essential for robotics, virtual reality, and embodied AI applications.

ZeroHSI is a new method that combines neural human rendering with video generation to enable zero-shot 4D human-scene interaction synthesis.

By conditioning on a sequence of text prompts, ZeroHSI can also generate long-horizon interactions.
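One plausible way to read prompt-by-prompt conditioning is as a chain of segments, where each segment starts from the state the previous one ended in. The sketch below illustrates only that chaining idea. The hand-off of the final state is an assumption rather than a detail stated here, and `stub_segment`, `chain_prompts`, the scalar `State`, and the example prompts are all hypothetical stand-ins.

```python
from typing import Callable, List, Sequence

State = float  # toy stand-in for a full human + object state

def stub_segment(prompt: str, start: State, num_frames: int) -> List[State]:
    """Hypothetical generator: advances the toy state by one unit per frame."""
    return [start + i for i in range(1, num_frames + 1)]

def chain_prompts(
    prompts: Sequence[str],
    start: State,
    generate: Callable[[str, State, int], List[State]] = stub_segment,
    frames_per_prompt: int = 3,
) -> List[State]:
    """Generate one segment per prompt, seeding each with the previous end state."""
    timeline: List[State] = []
    state = start
    for prompt in prompts:
        segment = generate(prompt, state, frames_per_prompt)
        timeline.extend(segment)
        state = segment[-1]  # assumed hand-off between consecutive segments
    return timeline

# Illustrative prompts only; they are not taken from the ZeroHSI work.
timeline = chain_prompts(["walk to the chair", "sit on the chair"], start=0.0)
```

Swapping `stub_segment` for a real single-prompt generator would leave the chaining loop unchanged.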

ZeroHSI performs well in synthetic 3D environments that contain objects, and it generates better interactions than LINGO and CHOIS.

ZeroHSI takes a 3D scene, an interactable object, a language description, and initial states as input, and produces motion sequences for both the human and the dynamic object.
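The inputs and outputs listed above can be written down as a small data structure. This is a sketch of the interface shape only: every class, field, and file name below is hypothetical, a single rigid `Pose` is a heavy simplification of a full human body state, and `placeholder_generate` merely holds the initial states still so that the example runs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Pose:
    position: Vec3
    rotation: Vec3 = (0.0, 0.0, 0.0)  # axis-angle; hypothetical convention

@dataclass
class HSIRequest:
    scene_path: str        # the 3D scene
    object_path: str       # the interactable object
    prompt: str            # the language description
    initial_human: Pose    # initial human state
    initial_object: Pose   # initial object state

@dataclass
class HSIResult:
    human_motion: List[Pose] = field(default_factory=list)
    object_motion: List[Pose] = field(default_factory=list)

def placeholder_generate(request: HSIRequest, num_frames: int = 4) -> HSIResult:
    """Stand-in for the real method: repeats the initial states for every frame."""
    return HSIResult(
        human_motion=[request.initial_human] * num_frames,
        object_motion=[request.initial_object] * num_frames,
    )

request = HSIRequest(
    scene_path="scene.ply",    # hypothetical file names
    object_path="chair.obj",
    prompt="pick up the chair",
    initial_human=Pose(position=(0.0, 0.0, 0.0)),
    initial_object=Pose(position=(1.0, 0.0, 0.0)),
)
result = placeholder_generate(request)
```

Keeping the human and object trajectories as two parallel per-frame lists mirrors the description: both are outputs, and they share one timeline.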