<ul><li>Meta AI and Stanford have introduced Apollo, a family of video-based large multimodal models (LMMs) designed to understand video content efficiently and accurately.</li><li>Apollo models excel at video tasks by addressing key design challenges: how videos are sampled and encoded, and how the models are trained.</li><li>Apollo-3B outperforms larger 7B models with a score of 68.7 on the MLVU benchmark, while Apollo-7B achieves 70.9, surpassing some 30B models.</li><li>Apollo marks a significant leap in video AI, opening doors to applications like content analysis and autonomous systems.</li></ul>

Meta’s Apollo Models Set New Benchmarks for Video Understanding in AI
