techminis

A naukri.com initiative


Source: arXiv
Image Credit: arXiv

RILQ: Rank-Insensitive LoRA-based Quantization Error Compensation for Boosting 2-bit Large Language Model Accuracy

  • Researchers propose Rank-Insensitive LoRA-based Quantization Error Compensation (RILQ) to diagnose and address the limitations of LoRA-based quantization error compensation (LQEC) in sub-4-bit scenarios.
  • RILQ employs a model-wise activation discrepancy loss to adjust adapters cooperatively across layers, enabling robust error compensation even with low-rank adapters.
  • Evaluations on LLaMA-2 and LLaMA-3 demonstrate RILQ's consistent improvements in 2-bit quantized inference across various quantizers and enhanced accuracy in task-specific fine-tuning.
  • RILQ enables adapter-merged weight-quantized Large Language Model (LLM) inference with significantly enhanced accuracy, making it a promising approach for boosting 2-bit LLM performance.
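The model-wise loss in the bullets above can be illustrated with a minimal NumPy sketch. Everything here is an illustrative assumption, not the paper's implementation: the toy two-layer network, the uniform fake quantizer, and the SVD-based adapter fit (RILQ actually trains adapters by optimizing the model-level discrepancy, whereas the SVD fit here only minimizes per-layer weight error). The sketch shows the key idea: the discrepancy is measured on the final activations of the full model, not layer by layer in isolation.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_quantize(W, bits=2):
    # Uniform symmetric quantization -- a crude stand-in for a real 2-bit quantizer.
    scale = np.abs(W).max() / (2 ** (bits - 1) - 0.5)
    return np.round(W / scale) * scale

def model_wise_discrepancy(x, layers_fp, layers_q, adapters):
    # Propagate the same input through the full-precision model and the
    # quantized model with merged adapters (Wq + B @ A), then compare the
    # *final* activations: a model-wise loss, not a per-layer one.
    h_fp, h_q = x, x
    for W, Wq, (B, A) in zip(layers_fp, layers_q, adapters):
        h_fp = np.tanh(h_fp @ W.T)
        h_q = np.tanh(h_q @ (Wq + B @ A).T)
    return float(np.mean((h_fp - h_q) ** 2))

def svd_adapter(err, r):
    # Rank-r least-squares fit to the per-layer quantization error.
    # This is only an initialization-style heuristic for the demo.
    U, s, Vt = np.linalg.svd(err)
    return U[:, :r] * s[:r], Vt[:r]

d, rank, n = 32, 4, 256
layers_fp = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(2)]
layers_q = [fake_quantize(W) for W in layers_fp]
x = rng.normal(size=(n, d))

zero_adapters = [(np.zeros((d, rank)), np.zeros((rank, d))) for _ in layers_fp]
adapters = [svd_adapter(W - Wq, rank) for W, Wq in zip(layers_fp, layers_q)]

loss_before = model_wise_discrepancy(x, layers_fp, layers_q, zero_adapters)
loss_after = model_wise_discrepancy(x, layers_fp, layers_q, adapters)
print(f"discrepancy without adapters: {loss_before:.4f}")
print(f"discrepancy with rank-{rank} adapters: {loss_after:.4f}")
```

Note that the adapters are merged into the quantized weights (`Wq + B @ A`) before the forward pass, mirroring the adapter-merged inference described in the last bullet.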
