
arXiv


Paying Alignment Tax with Contrastive Learning

  • Debiasing methods often degrade model capabilities such as accuracy and knowledge retention, the cost the title refers to as the alignment tax.
  • Existing debiasing methods force trade-offs: reduced truthfulness, knowledge loss, or unintelligible outputs, with smaller models affected most.
  • The proposed contrastive learning framework addresses these limitations by learning from both positive and negative examples, and it introduces contrast computation and dynamic loss scaling.
  • Experiments show that the approach improves toxicity reduction and faithfulness preservation at the same time, without the capability degradation seen in current methods.
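
To make the idea of learning from positive and negative examples more concrete, here is a minimal sketch in plain Python. It is not the paper's formulation: the summary does not give the loss, so the InfoNCE-style contrast term, the `1 - exp(-contrast)` weighting, and the names `contrastive_loss` and `combined_loss` are all assumptions chosen for illustration. The scores stand in for how strongly a model prefers a faithful, non-toxic response (positive) or a toxic or unfaithful one (negative).

```python
import math


def contrastive_loss(pos_score, neg_scores, temperature=1.0):
    """InfoNCE-style term (an assumption, not the paper's exact loss).

    The value is small when the positive example scores higher than
    every negative example, and large when a negative is preferred.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


def combined_loss(nll, pos_score, neg_scores, temperature=1.0):
    """Likelihood loss plus a dynamically scaled contrast term.

    Hypothetical scaling rule: the weight grows toward 1 while the model
    still prefers negatives and shrinks toward 0 once the contrast is
    satisfied, so the likelihood term (which preserves capabilities)
    dominates late in training.
    """
    contrast = contrastive_loss(pos_score, neg_scores, temperature)
    weight = 1.0 - math.exp(-contrast)
    return nll + weight * contrast


# A model that already prefers the positive response pays a small penalty,
# while one that prefers the negative response pays a larger one.
good = combined_loss(nll=1.0, pos_score=2.0, neg_scores=[0.0])
bad = combined_loss(nll=1.0, pos_score=0.0, neg_scores=[2.0])
print(good < bad)
```

In a real training loop the scores would come from model log-probabilities and the weight would typically be computed without gradient flow; both details are left out here to keep the sketch self-contained.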
