Source: arXiv

LogProber: Disentangling confidence from contamination in LLM responses

  • In machine learning, contamination means test data leaking into the training set. It undermines the evaluation of Large Language Models (LLMs), which are trained on large, opaque text corpora.
  • Tools that detect contamination are crucial for fairly tracking how LLM performance evolves, especially since these models are trained on web-scraped text.
  • Previous studies have quantified contamination in short text sequences, but their methods have limitations that can make them impractical.
  • LogProber is introduced as an efficient algorithm for detecting contamination in a black-box setting. It focuses on how familiar the model is with the question rather than with the answer.
  • LogProber aims to address the drawbacks of existing methods, and the work highlights how much the design of a detection algorithm matters for identifying different forms of contamination.
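
The idea of scoring "question familiarity" from log-probabilities can be illustrated with a minimal sketch. This is not the LogProber algorithm itself, whose details the summary does not give. It assumes the per-token log-probabilities of the question text have already been obtained from the model, and the function names and the threshold value are hypothetical, chosen only for illustration.

```python
from typing import List


def cumulative_logprob(token_logprobs: List[float]) -> List[float]:
    """Running sum of per-token log-probabilities over the question."""
    total = 0.0
    curve = []
    for lp in token_logprobs:
        total += lp
        curve.append(total)
    return curve


def familiarity_score(token_logprobs: List[float]) -> float:
    """Mean per-token log-probability (hypothetical score).

    Values closer to 0 mean the model found the question tokens
    highly predictable, which may hint that it has seen the text.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return sum(token_logprobs) / len(token_logprobs)


def looks_familiar(token_logprobs: List[float], threshold: float = -0.5) -> bool:
    """Flag a question as possibly seen before.

    The threshold is an arbitrary illustrative value, not a
    calibrated or published one.
    """
    return familiarity_score(token_logprobs) > threshold


# Made-up log-probabilities, for illustration only.
seen_question = [-0.05, -0.02, -0.01, -0.03]
novel_question = [-3.2, -2.7, -4.1, -1.9]

print(looks_familiar(seen_question))
print(looks_familiar(novel_question))
```

A score like this looks only at the question tokens, so it reflects familiarity with the question and says nothing about whether the model is confident in its answer. In practice any threshold would need calibration on text known to be unseen.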
