Apple has released AIMv2, a family of state-of-the-art open-set vision encoders. AIMv2 improves on existing models in multimodal understanding and object recognition tasks. It is trained with a multimodal autoregressive pre-training framework that pairs a Vision Transformer (ViT) encoder with a causal multimodal decoder. AIMv2 combines strong performance, scalability, and versatility across a range of applications, setting a new standard for open-set visual encoders.
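
To make the "causal multimodal decoder" idea concrete, here is a minimal sketch of autoregressive prediction over a combined image-and-text sequence. It is not Apple's implementation: the function names, the placeholder patch tokens, and the image-first ordering are assumptions made for illustration, and the model and its loss functions are left out entirely.

```python
# Illustrative sketch only; names and ordering are assumptions, not AIMv2's API.

def build_sequence(num_patches, text_tokens):
    """Concatenate image-patch placeholders and text tokens into one sequence.

    Assumption: image patches come first, followed by the text tokens.
    """
    patches = [f"<patch_{k}>" for k in range(num_patches)]
    return patches + list(text_tokens)


def causal_mask(seq_len):
    """Return a seq_len x seq_len mask; mask[i][j] is True if position i may attend to j.

    A causal decoder lets each position see itself and earlier positions only.
    """
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def next_token_targets(sequence):
    """Pair each element with the element that follows it in the sequence.

    An autoregressive decoder is trained to predict the second item of each
    pair from the context ending at the first item.
    """
    return list(zip(sequence[:-1], sequence[1:]))


seq = build_sequence(num_patches=3, text_tokens=["a", "cat"])
mask = causal_mask(len(seq))
pairs = next_token_targets(seq)

for context, target in pairs:
    print(f"{context} -> {target}")
```

In this toy setup, the last image patch is paired with the first text token, which is the point at which a single decoder moves from predicting image content to predicting text.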