MixLLM is a new optimization approach for quantizing LLMs. It explores mixed-precision quantization across output features, assigning precision based on each feature's salience viewed globally. By giving larger bit-widths to the output features that need them most, MixLLM achieves good accuracy with low memory consumption. Compared with existing quantization solutions, MixLLM demonstrates superior accuracy together with state-of-the-art system efficiency.
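The core idea can be illustrated with a minimal sketch of salience-driven bit allocation. Everything here is an assumption for illustration: the salience scores, the 4-bit/8-bit split, and the top-fraction threshold are placeholders, not MixLLM's actual metric or configuration.

```python
# Illustrative sketch (not MixLLM's exact method): rank output features by a
# given salience score and keep the most salient fraction at a higher bit-width.

def assign_bitwidths(salience, high_bit_fraction=0.1, high_bits=8, low_bits=4):
    """Return a per-output-feature bit-width list: the top `high_bit_fraction`
    most salient features get `high_bits`; all others get `low_bits`."""
    n = len(salience)
    n_high = max(1, int(n * high_bit_fraction))
    # Indices of features, most salient first.
    ranked = sorted(range(n), key=lambda i: salience[i], reverse=True)
    bits = [low_bits] * n
    for i in ranked[:n_high]:
        bits[i] = high_bits
    return bits

# Hypothetical salience scores for 8 output features.
salience = [0.2, 3.1, 0.5, 0.9, 2.4, 0.1, 0.3, 1.8]
print(assign_bitwidths(salience, high_bit_fraction=0.25))
# → [4, 8, 4, 4, 8, 4, 4, 4]  (features 1 and 4 are most salient)
```

Memory cost then scales with the average assigned bit-width, so keeping only a small salient fraction at higher precision preserves accuracy while staying close to the low-bit memory footprint.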