Post-training quantization (PTQ) is a method to reduce a model's memory footprint without retraining.
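For reference, a minimal sketch of the round-to-nearest uniform quantization that PTQ methods typically start from (NumPy; the function name and 3-bit default are illustrative, not taken from the paper):

```python
import numpy as np

def uniform_quantize(w: np.ndarray, bits: int = 3) -> np.ndarray:
    """Round-to-nearest uniform quantization of a weight tensor.

    Maps weights onto 2**bits evenly spaced levels spanning the tensor's
    range, then maps them back to floats (dequantization), which is what
    the model actually runs with after PTQ.
    """
    levels = 2 ** bits - 1
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / levels
    q = np.round((w - w_min) / scale)   # integer grid indices
    return q * scale + w_min            # dequantized weights

# Example: quantize a random "weight matrix" to 3 bits.
w = np.random.randn(4, 4).astype(np.float32)
w_q = uniform_quantize(w, bits=3)
print(np.abs(w - w_q).max())            # worst-case rounding error
```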
A new mixed-precision PTQ approach called Task-Circuit Quantization (TaCQ) conditions the quantization process on specific weight circuits associated with downstream task performance.
TaCQ preserves these task-specific weights by contrasting the unquantized model weights with a uniformly quantized version of the same weights.
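As a rough illustration of that idea, the sketch below (reusing `uniform_quantize` from above) scores each weight by contrasting it with its quantized value, weighted by a task-loss gradient, and keeps the top-scoring fraction in full precision. The saliency formula, function names, and the 5% keep fraction are assumptions made for illustration, not details from the paper.

```python
def task_saliency(w: np.ndarray, w_quant: np.ndarray, task_grad: np.ndarray) -> np.ndarray:
    """Hypothetical first-order saliency: how much each weight is expected
    to hurt task loss if replaced by its uniformly quantized value.

    The contrast (w - w_quant) measures the damage quantization does to each
    weight; weighting it by a task-loss gradient ties the score to downstream
    task performance. This is an assumed proxy, not TaCQ's exact metric.
    """
    return np.abs(task_grad * (w - w_quant))

def mixed_precision_mask(saliency: np.ndarray, keep_frac: float = 0.05) -> np.ndarray:
    """Boolean mask of the top `keep_frac` most salient weights, which would
    stay in 16-bit while everything else is quantized to the low-bit grid."""
    k = max(1, int(keep_frac * saliency.size))
    threshold = np.partition(saliency.ravel(), -k)[-k]
    return saliency >= threshold

# Toy usage with random stand-ins for real weights and task gradients.
w = np.random.randn(128, 128).astype(np.float32)
w_q = uniform_quantize(w, bits=2)            # from the sketch above
grad = np.random.randn(128, 128).astype(np.float32)
mask = mixed_precision_mask(task_saliency(w, w_q, grad))
w_mixed = np.where(mask, w, w_q)             # salient weights keep full precision
```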
Experimental results show that TaCQ outperforms existing mixed-precision quantization methods, with especially large improvements in the low-bit 2- and 3-bit regime.