Source: Arxiv
Your Pre-trained LLM is Secretly an Unsupervised Confidence Calibrator

  • Post-training is essential for adapting pre-trained language models (PLMs) to human preferences and downstream tasks.
  • Post-trained language models (PoLMs) often suffer from over-confidence, assigning high confidence to both correct and incorrect outputs, which can undermine reliability in critical applications.
  • Disagreement-Aware Confidence Alignment (DACA) is a novel unsupervised method that optimizes the parameters of post-hoc confidence calibration, addressing the scarcity of labeled calibration data for individual downstream tasks.
  • DACA improves calibration by selectively using only agreement examples, where the pre-trained and post-trained models produce the same prediction; extensive experiments show it improves the average expected calibration error (ECE) of LLMs such as GPT-4o by up to 15.08% on common benchmarks (see the sketches after this list).
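
The brief measures calibration with ECE. For reference, a standard expected calibration error computation looks like the sketch below (equal-width confidence bins; the toy numbers are illustrative, not from the paper):

```python
import numpy as np

def expected_calibration_error(confidences, correctness, n_bins=10):
    """ECE: bin predictions by confidence, then average the
    |accuracy - confidence| gap per bin, weighted by bin size."""
    confidences = np.asarray(confidences, dtype=float)
    correctness = np.asarray(correctness, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correctness[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# An overconfident model: high confidence on correct *and* incorrect answers.
conf = np.array([0.95, 0.93, 0.97, 0.94, 0.96])
hit  = np.array([1, 0, 1, 0, 1])   # only 60% accurate
print(expected_calibration_error(conf, hit))   # ~0.35, a large gap
```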

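And a minimal sketch of the agreement-filtered calibration idea itself, assuming temperature scaling as the post-hoc calibrator and a simple confidence-matching objective against the pre-trained model. Both choices, and all names below, are illustrative assumptions; the paper's exact parameterization and objective may differ.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def temperature_scale(logits, T):
    """Softmax at temperature T; T > 1 softens (lowers) confidence."""
    z = logits / T
    z -= z.max(axis=1, keepdims=True)   # shift for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

def fit_agreement_temperature(polm_logits, polm_preds, plm_preds, plm_conf):
    """Fit a temperature for the post-trained model (PoLM) without labels.

    Per the brief, only agreement examples -- where the pre-trained model
    (PLM) and the PoLM predict the same answer -- enter the objective.
    Matching the PoLM's confidence to the PLM's on those examples is an
    assumed target for illustration, not the paper's stated loss.
    """
    agree = polm_preds == plm_preds              # the agreement filter
    rows = np.arange(agree.sum())

    def loss(T):
        probs = temperature_scale(polm_logits[agree], T)
        polm_conf = probs[rows, polm_preds[agree]]   # confidence in own answer
        return np.mean((polm_conf - plm_conf[agree]) ** 2)

    return minimize_scalar(loss, bounds=(0.05, 20.0), method="bounded").x
```

Restricting the fit to agreement examples mirrors the brief; plausibly, disagreement examples (where at least one of the two models must be wrong) would otherwise distort the fitted temperature.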