<ul><li>Meta AI has released Apollo, a family of video-focused Large Multimodal Models (LMMs) designed to improve video understanding.</li><li>Apollo addresses challenges in video-based models through efficient design choices, such as frames-per-second (fps) sampling and dual vision encoders.</li><li>The Apollo models come in three sizes (1.5B, 3B, and 7B parameters), offering flexibility for different computational constraints and real-world needs.</li><li>Apollo achieves strong performance on video-language tasks and introduces ideas such as scaling consistency, where design decisions validated on smaller models carry over to larger ones, and token resampling.</li></ul>

Meta AI Releases Apollo: A New Family of Video-LMMs (Large Multimodal Models) for Video Understanding
