Low-rank adaptation (LoRA) is a widely used parameter-efficient fine-tuning method that adds a trainable low-rank product $AB$ to the frozen pretrained weights; conventionally, one of the two adapter matrices, $A$ or $B$, is initialized to zero so that the product vanishes and fine-tuning starts exactly from the pretrained model.
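The conventional setup can be illustrated with a minimal PyTorch sketch (names, shapes, and the scaling convention below are illustrative assumptions, not the paper's code): because one adapter factor starts at zero, the initial update $AB$ is zero and the layer reproduces the pretrained weight.

```python
import math
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA linear layer with the conventional zero initialization."""

    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        # Frozen pretrained weight (random here as a stand-in for a real checkpoint).
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Trainable low-rank factors; the update is A @ B.
        self.A = nn.Parameter(torch.zeros(out_features, rank))   # zero init -> A @ B = 0 at step 0
        self.B = nn.Parameter(torch.empty(rank, in_features))
        nn.init.kaiming_uniform_(self.B, a=math.sqrt(5))          # the other factor is random
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Effective weight = pretrained W + scaled low-rank update A @ B.
        return x @ (self.weight + self.scaling * (self.A @ self.B)).T
```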
A new study investigates how non-zero initialization affects LoRA's fine-tuning dynamics, finding that initializing both $A$ and $B$ to non-zero values improves robustness to suboptimal learning rates, particularly smaller ones.
Further analysis shows that with non-zero initialization the product $AB$ is non-zero at the start, which effectively adds random noise to the pretrained weights; this generally does not harm fine-tuning performance, suggesting that fine-tuning does not have to begin strictly from the pretrained model.
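A hedged, self-contained sketch of this non-zero variant is shown below; the Gaussian distribution and its scale are illustrative assumptions rather than the paper's exact scheme. Since both factors are non-zero, the effective weight $W + AB$ starts from a randomly perturbed copy of the pretrained weights instead of $W$ itself.

```python
import torch
import torch.nn as nn

rank, d_in, d_out = 8, 64, 64
W = torch.randn(d_out, d_in)                 # stand-in for a frozen pretrained weight
A = nn.Parameter(torch.empty(d_out, rank))   # both factors are initialized non-zero
B = nn.Parameter(torch.empty(rank, d_in))
nn.init.normal_(A, std=0.02)                 # illustrative scale, not the paper's choice
nn.init.normal_(B, std=0.02)

with torch.no_grad():
    perturbation = A @ B                     # non-zero -> W + A @ B != W at step 0
print(perturbation.norm())                   # small random offset from the pretrained weights
```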
These findings are supported by extensive experiments across various models and datasets, with code available at https://github.com/Leopold1423/non_zero_lora-icml25.