Knowledge Distillation is a widely used compression method for Deep Neural Networks (DNNs) that typically preserves overall generalization performance.
Even on balanced image classification datasets, distillation produces statistically significant changes in class-wise accuracy for as many as 41% of the classes.
Increasing the distillation temperature improves the distilled student model's fairness, and at high temperatures the student can even surpass the fairness of the teacher model.
Distillation can thus affect certain classes unevenly and play a significant role in fairness, so distilled models should be used with caution in sensitive applications.
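The distillation temperature referenced above enters through the standard Hinton-style distillation loss, sketched below in PyTorch. This is a minimal illustrative sketch, not the paper's exact setup; the temperature `T`, the weighting `alpha`, and the function name are assumptions chosen for clarity.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Standard KD loss: temperature-softened KL term plus hard-label term.

    T and alpha here are illustrative defaults, not values from the paper.
    """
    # Soft-target term: KL divergence between temperature-softened
    # student and teacher distributions. Higher T flattens the teacher's
    # distribution, exposing more of its inter-class similarity structure.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T ** 2)  # T**2 keeps gradient magnitudes comparable across temperatures

    # Hard-target term: ordinary cross-entropy with ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

Because raising `T` softens the teacher's output distribution, the student receives more information about how the teacher relates classes to one another, which is one plausible mechanism for the temperature-fairness effect described above.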