Image Credit: Arxiv

V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

  • Researchers have developed a self-supervised approach named V-JEPA 2 to understand, predict, and plan in the physical world.
  • V-JEPA 2 was pre-trained on over 1 million hours of internet video data and achieves top performance in motion understanding and human action anticipation tasks.
  • When integrated with a large language model, V-JEPA 2 also excels at large-scale video question-answering tasks.
  • The researchers further demonstrate self-supervised learning for robotic planning by training V-JEPA 2-AC, an action-conditioned variant, on unlabeled robot videos and using it for object manipulation tasks.
  • V-JEPA 2-AC enables picking and placing objects with Franka arms in different lab environments by planning toward image goals (a sketch of this kind of planner follows the list).
  • This is achieved without task-specific training, rewards, or robot data collection in the target environments.
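
The summary does not spell out how "planning with image goals" works, but a common recipe for an action-conditioned world model like V-JEPA 2-AC is model-predictive control: encode the current and goal images, sample candidate action sequences, roll them out in latent space, and keep the sequences whose predicted final latent lands closest to the goal. The sketch below illustrates that recipe with toy NumPy stand-ins; the encoder, predictor, dimensions, and L1 latent cost are assumptions for illustration, not the released V-JEPA 2-AC implementation.

```python
# Illustrative sketch of goal-image planning with an action-conditioned
# world model, in the spirit of V-JEPA 2-AC. The encoder and predictor
# below are toy stand-ins, not the actual V-JEPA 2 networks; names and
# dimensions are assumptions.
import numpy as np

LATENT_DIM, ACTION_DIM = 32, 7      # toy latent size; assumed 7-DoF arm action
HORIZON, N_SAMPLES, N_ELITE, N_ITERS = 8, 256, 32, 5

rng = np.random.default_rng(0)
W_ENC = rng.standard_normal((LATENT_DIM, 64 * 64))           # toy "encoder" weights
W_ACT = rng.standard_normal((LATENT_DIM, ACTION_DIM)) * 0.1  # toy latent dynamics

def encode(image):
    """Toy encoder: map a 64x64 image to a latent vector."""
    return W_ENC @ image.reshape(-1)

def predict_next(z, action):
    """Toy action-conditioned predictor: linear latent dynamics (z_t, a_t) -> z_{t+1}."""
    return z + W_ACT @ action

def plan_action(current_image, goal_image):
    """Cross-entropy-method planning: sample action sequences, roll them out
    in latent space, score them by L1 distance between the final predicted
    latent and the encoded goal image, and return the first planned action."""
    z0, z_goal = encode(current_image), encode(goal_image)
    mean = np.zeros((HORIZON, ACTION_DIM))
    std = np.ones((HORIZON, ACTION_DIM))
    for _ in range(N_ITERS):
        # Sample candidate action sequences around the current distribution.
        actions = mean + std * rng.standard_normal((N_SAMPLES, HORIZON, ACTION_DIM))
        costs = np.empty(N_SAMPLES)
        for i in range(N_SAMPLES):
            z = z0
            for t in range(HORIZON):
                z = predict_next(z, actions[i, t])
            costs[i] = np.abs(z - z_goal).mean()
        # Refit the sampling distribution to the lowest-cost (elite) sequences.
        elite = actions[np.argsort(costs)[:N_ELITE]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean[0]   # execute only the first action, observe, then replan

if __name__ == "__main__":
    current, goal = rng.random((64, 64)), rng.random((64, 64))
    print("next action:", plan_action(current, goal))
```

In practice only the first action of the plan is executed before replanning from the new observation, which is what makes this usable zero-shot in new environments without rewards or task-specific training.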
