Pruning is a common technique to compress large language models by removing unimportant weights, but it often leads to performance degradation, especially under semi-structured sparsity constraints.
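As a rough illustration of what semi-structured sparsity means in practice, the sketch below shows 2:4 magnitude pruning in PyTorch, where only the two largest-magnitude entries in every group of four consecutive weights are kept. This is a minimal, assumed example, not code from the paper; the function name and tensor shapes are hypothetical.

```python
import torch

def prune_2_4_by_magnitude(weight: torch.Tensor) -> torch.Tensor:
    """Zero out the 2 smallest-magnitude weights in every group of 4
    along the input dimension (a common 2:4 semi-structured pattern)."""
    out_features, in_features = weight.shape
    assert in_features % 4 == 0, "input dim must be divisible by 4"
    groups = weight.abs().reshape(out_features, in_features // 4, 4)
    # Keep the 2 largest-magnitude entries in each group of 4.
    keep_idx = groups.topk(k=2, dim=-1).indices
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, keep_idx, 1.0)
    return weight * mask.reshape(out_features, in_features)

# Example: prune a random linear layer's weight to 2:4 sparsity.
w = torch.randn(8, 16)
w_sparse = prune_2_4_by_magnitude(w)
assert (w_sparse.reshape(8, -1, 4) != 0).sum(-1).max() <= 2
```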
A new approach called DenoiseRotator is proposed to enhance pruning robustness by redistributing parameter importance to make the model more amenable to pruning.
DenoiseRotator minimizes the information entropy of normalized importance scores, concentrating importance onto a smaller subset of weights, thus improving pruning effectiveness.
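For intuition, the entropy objective can be read as the Shannon entropy of importance scores normalized into a probability distribution: the lower the entropy, the more importance is concentrated on a few weights, leaving clearly unimportant weights to remove. The helper below is a hypothetical sketch, assuming nonnegative per-weight importance scores (e.g., magnitudes), and is not the paper's implementation.

```python
import torch

def importance_entropy(importance: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Shannon entropy of importance scores normalized into a
    probability distribution. Lower entropy means importance is
    concentrated on fewer weights, making pruning less damaging."""
    p = importance.flatten() / (importance.sum() + eps)
    return -(p * (p + eps).log()).sum()

# A uniform importance profile has higher entropy than a concentrated one,
# so the concentrated profile is the easier one to prune.
uniform = torch.ones(8)
concentrated = torch.tensor([8.0] + [1e-3] * 7)
assert importance_entropy(concentrated) < importance_entropy(uniform)
```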
Evaluation on various models shows that DenoiseRotator consistently improves perplexity and zero-shot accuracy compared to existing pruning techniques such as Magnitude, SparseGPT, and Wanda.