techminis

A naukri.com initiative


Source: arXiv
Image Credit: arXiv

RILQ: Rank-Insensitive LoRA-based Quantization Error Compensation for Boosting 2-bit Large Language Model Accuracy

  • Researchers propose Rank-Insensitive LoRA-based Quantization Error Compensation (RILQ) to diagnose and address the limitations of LoRA-based quantization error compensation (LQEC) in sub-4-bit scenarios.
  • RILQ employs a model-wise activation discrepancy loss to adjust adapters cooperatively across layers, enabling robust error compensation even with low-rank adapters.
  • Evaluations on LLaMA-2 and LLaMA-3 demonstrate RILQ's consistent improvements in 2-bit quantized inference across various quantizers and enhanced accuracy in task-specific fine-tuning.
  • RILQ enables adapter-merged weight-quantized Large Language Model (LLM) inference with significantly enhanced accuracy, making it a promising approach for boosting 2-bit LLM performance.
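The model-wise loss in the bullets above can be illustrated with a minimal NumPy sketch. Everything here is an illustrative assumption, not the paper's implementation: the toy two-layer network, the uniform fake quantizer, and the SVD-based adapter fit (RILQ actually trains adapters by optimizing the model-level discrepancy, whereas the SVD fit here only minimizes per-layer weight error). The sketch shows the key idea: the discrepancy is measured on the final activations of the full model, not layer by layer in isolation.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_quantize(W, bits=2):
    # Uniform symmetric quantization -- a crude stand-in for a real 2-bit quantizer.
    scale = np.abs(W).max() / (2 ** (bits - 1) - 0.5)
    return np.round(W / scale) * scale

def model_wise_discrepancy(x, layers_fp, layers_q, adapters):
    # Propagate the same input through the full-precision model and the
    # quantized model with merged adapters (Wq + B @ A), then compare the
    # *final* activations: a model-wise loss, not a per-layer one.
    h_fp, h_q = x, x
    for W, Wq, (B, A) in zip(layers_fp, layers_q, adapters):
        h_fp = np.tanh(h_fp @ W.T)
        h_q = np.tanh(h_q @ (Wq + B @ A).T)
    return float(np.mean((h_fp - h_q) ** 2))

def svd_adapter(err, r):
    # Rank-r least-squares fit to the per-layer quantization error.
    # This is only an initialization-style heuristic for the demo.
    U, s, Vt = np.linalg.svd(err)
    return U[:, :r] * s[:r], Vt[:r]

d, rank, n = 32, 4, 256
layers_fp = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(2)]
layers_q = [fake_quantize(W) for W in layers_fp]
x = rng.normal(size=(n, d))

zero_adapters = [(np.zeros((d, rank)), np.zeros((rank, d))) for _ in layers_fp]
adapters = [svd_adapter(W - Wq, rank) for W, Wq in zip(layers_fp, layers_q)]

loss_before = model_wise_discrepancy(x, layers_fp, layers_q, zero_adapters)
loss_after = model_wise_discrepancy(x, layers_fp, layers_q, adapters)
print(f"discrepancy without adapters: {loss_before:.4f}")
print(f"discrepancy with rank-{rank} adapters: {loss_after:.4f}")
```

Note that the adapters are merged into the quantized weights (`Wq + B @ A`) before the forward pass, mirroring the adapter-merged inference described in the last bullet.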
