Nomic has released the 'Nomic Embed Multimodal' model, achieving state-of-the-art performance on visual document retrieval tasks.
The model processes interleaved text, images, and screenshots, setting a new high score on the Vidore-v2 benchmark for visual document retrieval.
Nomic Embed Multimodal eliminates the need for complex, error-prone processing pipelines and captures the full richness of documents by embedding both text and visual components.
Nomic now offers a complete embedding ecosystem, featuring models for text, multimodal, and code embeddings.