Finetuning large pretrained neural networks can be costly in both memory and computation.
Researchers have observed a correlation between large gradients and small-magnitude weights during finetuning.
NANOADAM is proposed as a method that dynamically updates only the small-magnitude weights during finetuning; because the updated subset is chosen from the weight magnitudes themselves, it can be determined without computing gradients, and the method achieves better generalization performance in experiments.
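The gradient-free subset selection can be illustrated with a short sketch. The quantile-based threshold below is a hypothetical choice for illustration, not necessarily the criterion used by NANOADAM; the point is that the mask depends only on current weight values, not on gradients.

```python
import torch

def small_magnitude_mask(param: torch.Tensor, quantile: float = 0.5) -> torch.Tensor:
    """Boolean mask marking the smallest-magnitude entries of `param`.

    Selection uses only the current weight values, so no gradient computation
    is needed. The median (quantile=0.5) threshold is a placeholder choice,
    not the paper's criterion.
    """
    magnitudes = param.detach().abs()
    threshold = magnitudes.flatten().quantile(quantile)
    return magnitudes <= threshold
```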
NANOADAM shows benefits on both NLP and vision tasks: it preserves the large-magnitude weights that encode critical features learned during pretraining, and it enables the use of larger learning rates.
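A minimal sketch of how such a masked update could be wired into a standard Adam optimizer is given below. The class `MaskedAdam` and its gradient-masking strategy are assumptions made for illustration; the actual NANOADAM update rule is not reproduced here.

```python
import torch


class MaskedAdam(torch.optim.Adam):
    """Adam that updates only a pre-selected subset of each parameter tensor.

    Hypothetical sketch: gradients of entries outside the mask (the
    large-magnitude, pretraining-critical weights) are zeroed before the
    standard Adam step, so those weights stay frozen while the small-magnitude
    subset is trained, potentially with a larger learning rate. Assumes
    weight decay is zero; otherwise frozen entries would still be modified.
    """

    def __init__(self, params, masks, lr=1e-3, **kwargs):
        params = list(params)
        super().__init__(params, lr=lr, **kwargs)
        # One boolean mask per parameter, aligned with `params`.
        self.masks = {id(p): m for p, m in zip(params, masks)}

    def step(self, closure=None):
        with torch.no_grad():
            for group in self.param_groups:
                for p in group["params"]:
                    if p.grad is not None:
                        # Zero gradients on large-magnitude entries; their Adam
                        # moment estimates then stay at zero and they are never updated.
                        p.grad.mul_(self.masks[id(p)].to(p.grad.dtype))
        return super().step(closure)


# Example usage (hypothetical): freeze large-magnitude weights, train the rest.
# masks = [small_magnitude_mask(p) for p in model.parameters()]
# optimizer = MaskedAdam(model.parameters(), masks, lr=5e-4)
```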