Source: Arxiv
Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning

  • Model quantization reduces the bit-width of weights and activations, improving memory efficiency and inference speed in diffusion models.
  • The paper identifies three obstacles to 4-bit quantization: asymmetric activation distributions, the temporal complexity of the denoising process during fine-tuning, and misalignment between the fine-tuning loss and the quantization error.
  • To overcome these challenges, the authors propose a mixup-sign floating-point quantization (MSFP) framework that introduces unsigned FP quantization, timestep-aware LoRA (TALoRA), and denoising-factor loss alignment (DFA) for precise and stable fine-tuning (a rough sketch of the unsigned FP idea follows this list).
  • Extensive experiments show superior 4-bit FP quantization of diffusion models, surpassing even existing post-training quantization fine-tuning methods that use 4-bit integer quantization.
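The summary names the techniques only at a high level. As a rough illustration (not the paper's MSFP implementation), the sketch below simulates low-bit floating-point quantization in NumPy and shows why an unsigned FP format can help for non-negative activations: dropping the sign bit frees one bit for the mantissa at the same 4-bit budget. The exponent/mantissa splits, the exponent bias of 1, and the per-tensor max scaling used here are illustrative assumptions, not details taken from the paper.

# Minimal sketch of simulated ("fake") low-bit FP quantization.
# Not the paper's MSFP framework; format and scaling choices are assumptions.
import numpy as np

def fp_grid(exp_bits: int, man_bits: int, signed: bool) -> np.ndarray:
    """Enumerate the values representable by a tiny FP format (bias assumed to be 1)."""
    levels = [0.0]
    for e in range(2 ** exp_bits):
        for m in range(2 ** man_bits):
            if e == 0:
                val = (m / 2 ** man_bits) * 2.0 ** 0          # subnormal values
            else:
                val = (1 + m / 2 ** man_bits) * 2.0 ** (e - 1)  # normal values
            levels.append(val)
    grid = np.unique(np.array(levels))
    return np.concatenate([-grid[::-1], grid]) if signed else grid

def fake_quantize(x: np.ndarray, exp_bits: int, man_bits: int, signed: bool) -> np.ndarray:
    """Quantize-dequantize x onto the FP grid using a per-tensor max scale."""
    grid = fp_grid(exp_bits, man_bits, signed)
    scale = max(float(np.abs(x).max()) / float(np.abs(grid).max()), 1e-12)
    idx = np.abs(x[..., None] / scale - grid).argmin(axis=-1)  # nearest representable level
    return grid[idx] * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.random((4, 8)).astype(np.float32)  # non-negative activations
    signed_q = fake_quantize(acts, exp_bits=2, man_bits=1, signed=True)    # 4-bit signed FP (1 sign bit)
    unsigned_q = fake_quantize(acts, exp_bits=2, man_bits=2, signed=False)  # same 4-bit budget, no sign bit
    print("signed FP4 error:  ", np.abs(acts - signed_q).mean())
    print("unsigned FP4 error:", np.abs(acts - unsigned_q).mean())

On non-negative inputs the unsigned variant typically reports a lower mean error, since all of its codes land on the usable range; this mirrors the motivation the summary gives for applying unsigned FP quantization to asymmetric activation distributions.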
