ByteDance researchers have developed OmniHuman, an AI system that can turn a single photograph into a realistic video of a person speaking and moving naturally.
OmniHuman generates full-body videos in which gestures and movements match the speech, going beyond previous AI models that could animate only faces or upper bodies.
The system was trained on over 18,700 hours of human video data using an innovative "omni-conditions" training approach.
This advancement in AI-generated media has the potential to reshape digital entertainment, educational content creation, and communications.