techminis

A naukri.com initiative

Image Credit: Arxiv

Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance

  • Recent advancements in large language models (LLMs) have allowed the augmentation of information retrieval (IR) pipelines with synthetic data.
  • The traditional contrastive training paradigm, which pairs binary relevance labels with the InfoNCE loss, treats every document not explicitly annotated as relevant as an equally irrelevant negative, regardless of its actual degree of relevance.
  • In this work, open-source LLMs generate synthetic documents that form a fully synthetic ranking context with multiple, graduated levels of relevance for training dense retrievers.
  • Experiments show that this approach outperforms conventional training and achieves comparable performance to retrievers trained on real, labeled training documents.
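The contrast between the two training paradigms can be sketched in a few lines. The snippet below is illustrative only and not the paper's exact loss: it compares standard InfoNCE (one positive, all other documents treated as equally irrelevant) with a simple list-wise cross-entropy against a target distribution derived from graded relevance labels. The similarity values, grades, and temperature are made-up toy inputs.

```python
import numpy as np

def info_nce(sim, pos_idx, temp=0.05):
    # Standard InfoNCE with binary relevance: one annotated positive,
    # every other document in the batch is an equally weighted negative.
    logits = sim / temp
    log_probs = logits - np.log(np.sum(np.exp(logits)))
    return -log_probs[pos_idx]

def listwise_graded_loss(sim, grades, temp=0.05):
    # Illustrative list-wise alternative (not the paper's exact loss):
    # soft cross-entropy against a distribution over graded relevance
    # labels, so partially relevant documents are penalized less
    # harshly than clear negatives.
    logits = sim / temp
    log_probs = logits - np.log(np.sum(np.exp(logits)))
    target = np.exp(grades) / np.sum(np.exp(grades))  # softmax over grades
    return -np.sum(target * log_probs)

# Toy ranking context: similarities of 4 documents to one query,
# with hypothetical graded relevance from 3 (perfect) to 0 (irrelevant).
sim = np.array([0.9, 0.7, 0.4, 0.1])
grades = np.array([3.0, 2.0, 1.0, 0.0])
print(info_nce(sim, pos_idx=0))
print(listwise_graded_loss(sim, grades))
```

Under InfoNCE the partially relevant documents (grades 2 and 1) contribute to the loss exactly like the irrelevant one; the graded target distribution instead spreads probability mass across them in proportion to their relevance.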
