Meta Platforms Inc. has introduced a new AI model, V-JEPA 2, that improves AI understanding of the physical world through video interpretation.
Building on the original V-JEPA, the model helps AI agents and robots "think" before acting by interpreting video of their surroundings.
Yann LeCun, Meta's chief AI scientist, emphasizes that AI needs a form of common sense: the ability to predict how the world will evolve in an abstract representation of space, rather than at the level of raw pixels.
V-JEPA 2 is a state-of-the-art world model trained on video, designed to let robots and other AI agents understand the physical world and predict how interactions within it will unfold.
World models allow AI agents to plan actions by simulating their consequences internally before acting in the real world.
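To make that planning idea concrete, here is a minimal, hypothetical sketch of how an agent might use a learned world model to choose an action. The function names (plan_action, encoder, world_model) and the latent-distance scoring are illustrative assumptions, not Meta's released interface.

    # Hypothetical sketch, not Meta's API: a world model predicts the next
    # latent state from the current state and a candidate action, and the
    # agent picks the action whose imagined outcome lands closest to a goal.
    import torch

    def plan_action(encoder, world_model, observation, candidate_actions, goal_embedding):
        state = encoder(observation)  # embed current video frames into latent space
        best_action, best_distance = None, float("inf")
        for action in candidate_actions:
            predicted_state = world_model(state, action)  # imagined consequence
            distance = torch.dist(predicted_state, goal_embedding)
            if distance < best_distance:
                best_action, best_distance = action, distance
        return best_action

The key design choice in this kind of loop is that prediction happens in an abstract latent space rather than pixel space, which matches LeCun's emphasis above.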
This predictive capability could, for example, help prevent workplace accidents by steering robots along safe paths around human workers.
V-JEPA 2 learns how the physical world behaves by observing patterns in how objects move and interact in video.
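The learning principle behind joint-embedding predictive architectures can be sketched in a few lines. The following is an illustrative simplification, where the encoders, predictor, and masking scheme are assumed names rather than the released code; it shows the core idea of predicting the representations of hidden video patches instead of reconstructing their raw pixels.

    # Simplified, assumed sketch of a JEPA-style training objective: predict
    # the embeddings of masked video patches from the visible context, so
    # the model learns dynamics without reconstructing every pixel.
    import torch
    import torch.nn.functional as F

    def jepa_step(context_encoder, target_encoder, predictor, video_patches, mask):
        visible = video_patches * (~mask).unsqueeze(-1)      # zero out hidden patches
        context = context_encoder(visible)                   # embed what is visible
        with torch.no_grad():
            targets = target_encoder(video_patches)          # full-clip embeddings
        predictions = predictor(context, mask)               # guess hidden embeddings
        return F.mse_loss(predictions[mask], targets[mask])  # match in latent space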
The model has been tested on robots in labs, demonstrating proficiency in tasks like picking up and relocating objects.
Meta believes world models will revolutionize robotics, making it possible for AI agents to perform physical tasks with minimal training data.
Alongside V-JEPA 2, Meta has released new benchmarks for assessing how well AI models reason about the physical world from video.
The benchmarks are intended to measure how effectively existing models use video information to build a coherent understanding of the world.