Source: arXiv

Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers

  • Large Language Models (LLMs) achieve strong performance on reranking tasks in information retrieval, but their computational demands make them hard to deploy.
  • Existing efficiency studies of LLM-based rerankers rely on proxy metrics such as latency and token count, which do not adequately account for model size or hardware variations.
  • A new metric family called E^2R-FLOPs is proposed to evaluate LLM-based rerankers. It consists of ranking metrics per PetaFLOP (RPP), which measures relevance per unit of compute, and queries per PetaFLOP (QPP), which measures hardware-agnostic throughput.
  • Comprehensive experiments with the new metrics assess the efficiency-effectiveness trade-off across a range of LLM-based rerankers and aim to draw the research community's attention to this trade-off.
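
To make the two metrics concrete, here is a minimal Python sketch. It rests on assumptions that the summary above does not spell out: RPP is taken as a ranking metric (for example NDCG@10) divided by the compute spent per query in PetaFLOPs, and QPP as the number of queries that fit into one PetaFLOP. The helper `estimate_forward_flops` is hypothetical and uses the common rule of thumb of about 2 FLOPs per parameter per token for a forward pass; it is a stand-in, not the estimator used in the paper.

```python
PETAFLOP = 1e15  # floating-point operations in one PetaFLOP


def estimate_forward_flops(num_params: float, num_tokens: float) -> float:
    """Rough forward-pass cost: about 2 FLOPs per parameter per token.

    Assumption for illustration only; it ignores attention's quadratic
    term and any decoding overhead.
    """
    return 2.0 * num_params * num_tokens


def rpp(metric_value: float, flops_per_query: float) -> float:
    """Ranking metric per PetaFLOP (assumed form: metric / PetaFLOPs per query)."""
    if flops_per_query <= 0:
        raise ValueError("flops_per_query must be positive")
    return metric_value / (flops_per_query / PETAFLOP)


def qpp(flops_per_query: float) -> float:
    """Queries per PetaFLOP (assumed form: 1 / PetaFLOPs per query)."""
    if flops_per_query <= 0:
        raise ValueError("flops_per_query must be positive")
    return PETAFLOP / flops_per_query


if __name__ == "__main__":
    # Hypothetical pointwise reranker: 7B parameters, 100 candidate
    # passages per query, roughly 500 tokens per (query, passage) prompt.
    flops = estimate_forward_flops(num_params=7e9, num_tokens=100 * 500)
    print(f"FLOPs per query: {flops:.3e}")
    print(f"RPP at NDCG@10 = 0.70: {rpp(0.70, flops):.3f}")
    print(f"QPP: {qpp(flops):.3f}")
```

Because both quantities depend only on a FLOPs count and a ranking score, two rerankers can be compared without reference to the GPU, batch size, or serving stack they were measured on, which is the hardware-agnostic property the summary highlights.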
