menu
techminis

A naukri.com initiative

google-web-stories
Home

>

ML News

>

Cascading ...
source image

Arxiv

3d

read

4

img
dot

Image Credit: Arxiv

Cascading Adversarial Bias from Injection to Distillation in Language Models

  • Model distillation is crucial for creating smaller language models while maintaining performance, but there are concerns about vulnerability to adversarial bias injection.
  • Adversaries can introduce biases into teacher models through data poisoning, which then amplify in student models, leading to biased responses.
  • Two propagation modes are identified: Untargeted Propagation affecting multiple tasks and Targeted Propagation focusing on specific tasks.
  • The study highlights security vulnerabilities in distilled models and suggests the need for specialized safeguards and mitigation strategies.

Read Full Article

like

Like

For uninterrupted reading, download the app