Source: Arxiv
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs

  • A new study presents the Junk DNA Hypothesis, focusing on the small-magnitude pre-trained weights of large language models such as GPT-3.
  • The hypothesis challenges the common belief that pruning small weights in LLMs leaves performance unaffected, arguing that these weights encode information vital for challenging downstream tasks (a minimal magnitude-pruning sketch follows this list).
  • Removing these seemingly insignificant weights causes an irreversible loss of knowledge: performance on difficult tasks degrades monotonically with sparsity and is not recovered by continued training.
  • According to the study, quantization as a compression method does not expose task-difficulty information in the way weight pruning does. Extensive experiments support the Junk DNA Hypothesis.
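
The pruning operation at issue is plain magnitude pruning: the weights with the smallest absolute values are zeroed out. Below is a minimal, illustrative sketch in PyTorch; the `magnitude_prune` helper, the tensor shape, and the 50% sparsity level are assumptions for demonstration, not taken from the paper.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude fraction of entries in a weight tensor.

    Illustrative only: real LLM pruning is typically applied per layer or
    globally across the model, often followed by further fine-tuning.
    """
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight.clone()
    # Threshold = magnitude of the k-th smallest entry
    threshold = weight.abs().flatten().kthvalue(k).values
    # Keep only entries strictly larger than the threshold
    mask = weight.abs() > threshold
    return weight * mask

# Example: prune roughly 50% of a small random weight matrix
w = torch.randn(4, 4)
w_pruned = magnitude_prune(w, sparsity=0.5)
print((w_pruned == 0).float().mean())  # fraction of zeroed entries, ~0.5
```

The hypothesis is that the entries removed by such a threshold, despite being small, carry information that difficult downstream tasks depend on.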
