Arxiv

Cost-aware LLM-based Online Dataset Annotation

  • Recent advances in large language models (LLMs) have enabled automated dataset labeling with minimal human supervision.
  • A novel online framework called Cost-aware Majority Voting (CaMVo) is proposed for efficient and accurate LLM-based dataset annotation.
  • CaMVo adaptively selects a subset of LLMs for each data instance based on contextual embeddings to balance confidence and cost without pre-training or ground-truth labels.
  • Empirical evaluation on the MMLU and IMDB Movie Review datasets shows that CaMVo achieves accuracy comparable to or better than full majority voting while significantly reducing labeling costs, making it a practical and robust option for cost-efficient annotation.
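
The idea of choosing a cheap subset of LLMs per instance and then taking a majority vote can be illustrated with a minimal sketch. This is not the CaMVo algorithm itself: the summary does not give its details, so the code below assumes binary labels, independent model errors, and fixed per-model accuracy estimates (CaMVo instead estimates confidence online from contextual embeddings). The names `majority_correct_prob`, `select_subset`, `majority_vote`, and the example accuracies and costs are all hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def majority_correct_prob(accs):
    """Probability that a strict majority of voters is correct.

    Assumes binary labels and independent errors (an illustrative
    simplification). Ties count as failures, which is conservative.
    """
    n = len(accs)
    total = 0.0
    for outcome in product([True, False], repeat=n):
        if sum(outcome) * 2 > n:  # strict majority of correct votes
            p = 1.0
            for ok, acc in zip(outcome, accs):
                p *= acc if ok else 1.0 - acc
            total += p
    return total


def select_subset(est_acc, cost, threshold):
    """Return the cheapest subset of models whose estimated majority-vote
    confidence reaches `threshold`; fall back to all models if none does.

    `est_acc` and `cost` are hypothetical dicts keyed by model name.
    Exhaustive search is fine for a handful of models.
    """
    names = list(est_acc)
    best = None  # (subset, total_cost)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            conf = majority_correct_prob([est_acc[m] for m in subset])
            if conf >= threshold:
                total_cost = sum(cost[m] for m in subset)
                if best is None or total_cost < best[1]:
                    best = (subset, total_cost)
    return best[0] if best is not None else tuple(names)


def majority_vote(labels):
    """Most common label among the queried models' answers."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical accuracy estimates and per-query costs for three models.
est_acc = {"small": 0.7, "medium": 0.8, "large": 0.9}
cost = {"small": 1, "medium": 3, "large": 10}

chosen = select_subset(est_acc, cost, threshold=0.75)
label = majority_vote(["pos", "neg", "pos"])
```

With these made-up numbers, a modest confidence target is met by a single mid-priced model, while a target no subset can reach falls back to querying every model, which mirrors the trade-off between confidence and cost described above.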
