Arxiv

Image Credit: Arxiv

Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs

  • AURELIA is a novel actor-critic-based audio-visual reasoning framework that improves how audio-visual large language models (AVLLMs) process complex multi-modal inputs, without additional training.
  • AVReasonBench is a challenging benchmark with 4500 audio-visual questions and detailed step-by-step reasoning, evaluating the reasoning skills of AVLLMs.
  • Evaluation of 18 AVLLMs on AVReasonBench reveals limitations in their multi-modal reasoning capabilities.
  • With AURELIA, models achieve a relative improvement of up to 100%, which highlights the potential of reasoning-enhanced data generation for advancing AVLLMs.
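
To make the "relative improvement of up to 100%" figure concrete, here is a minimal sketch of how relative improvement is usually computed. The summary does not give the underlying scores, so the function name and the numbers below are hypothetical and serve only to illustrate the arithmetic: a 100% relative improvement means the score doubled over its baseline.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the relative improvement of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical scores, not taken from the paper: a baseline accuracy of 20
# that rises to 40 is a 100% relative improvement (the score doubled),
# even though the absolute gain is only 20 points.
print(relative_improvement(20.0, 40.0))
```

Relative improvement can look large when the baseline is low, so it is worth reading it alongside the absolute scores reported in the full article.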
