techminis

A naukri.com initiative


Arxiv



MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

  • Researchers introduce MMIE, a large-scale benchmark for evaluating multimodal comprehension and generation in Large Vision-Language Models (LVLMs).
  • MMIE consists of 20K curated multimodal queries spanning 3 categories, 12 fields, and 102 subfields, including mathematics, coding, physics, literature, health, and arts.
  • The benchmark supports interleaved inputs and outputs, evaluating competencies through multiple-choice and open-ended questions.
  • The authors also propose an automated evaluation metric that aims to reduce bias and improve evaluation accuracy.
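
To make the idea of interleaved queries concrete, here is a minimal Python sketch of how one such item might be represented and how the multiple-choice subset could be scored. All names here (Text, Image, Query, multiple_choice_accuracy) are hypothetical and chosen for illustration; this is not MMIE's actual data format or its proposed metric, and open-ended answers would need a separate scorer that this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Text:
    content: str


@dataclass
class Image:
    path: str  # hypothetical file reference, not a real MMIE field


# An interleaved query is an ordered mix of text and image segments.
Segment = Union[Text, Image]


@dataclass
class Query:
    segments: List[Segment]
    choices: Optional[List[str]] = None  # present only for multiple-choice items
    answer: Optional[str] = None         # gold label for multiple-choice items

    @property
    def is_multiple_choice(self) -> bool:
        return self.choices is not None


def multiple_choice_accuracy(queries: List[Query], predictions: List[str]) -> float:
    """Exact-match accuracy over the multiple-choice items only.

    Open-ended items are skipped; returns 0.0 if there is nothing to score.
    """
    scored = [(q, p) for q, p in zip(queries, predictions) if q.is_multiple_choice]
    if not scored:
        return 0.0
    correct = sum(
        1 for q, p in scored
        if q.answer is not None and p.strip().upper() == q.answer.strip().upper()
    )
    return correct / len(scored)


# Illustrative data only: one multiple-choice item and one open-ended item.
queries = [
    Query(
        segments=[Text("Which shape is shown?"), Image("shape.png")],
        choices=["A", "B", "C"],
        answer="B",
    ),
    Query(segments=[Image("scene.png"), Text("Continue the story.")]),
]
predictions = ["b", "The traveller walked on toward the hills."]
print(multiple_choice_accuracy(queries, predictions))
```

Keeping the segments in a single ordered list preserves the position of each image relative to the surrounding text, which is the property that distinguishes interleaved inputs from a plain image-plus-caption pair.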
