Source: arXiv

Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution

  • Human preference alignment can greatly enhance Multimodal Large Language Models (MLLMs), but collecting high-quality preference data is costly.
  • A novel multimodal self-evolution framework is proposed to autonomously generate high-quality questions and answers using only unannotated images.
  • The framework combines three components: an image-driven self-questioning mechanism, an answer self-enhancement technique, and an image content alignment loss function.
  • Experiments show that the framework performs competitively with methods that rely on external information, offering a more efficient approach to aligning MLLMs.
