ICE-Pruning is an iterative pruning pipeline designed to compress deep neural networks (DNNs) efficiently.
It reduces the computational cost of fine-tuning while maintaining model accuracy similar to that of existing pruning pipelines.
ICE-Pruning comprises automatic mechanisms for determining when to perform fine-tuning, a freezing strategy for faster fine-tuning, and a custom pruning-aware learning rate scheduler.
Evaluations on various DNN models and datasets show that ICE-Pruning can accelerate the pruning process by up to 9.61x.
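
The idea of fine-tuning only when needed, rather than after every pruning step, can be illustrated with a minimal sketch. The summary above does not state the actual trigger criterion, so this sketch assumes one: fine-tune when accuracy falls more than a threshold below the unpruned baseline. The names `iterative_prune`, `max_drop`, and `ToyModel` are hypothetical, the toy model only simulates accuracy changes, and the freezing strategy and learning rate scheduler are left out.

```python
from typing import Callable, List


def iterative_prune(
    num_steps: int,
    prune_step: Callable[[int], None],
    evaluate: Callable[[], float],
    fine_tune: Callable[[int], None],
    max_drop: float = 0.01,
) -> List[int]:
    """Run num_steps pruning steps, fine-tuning only when accuracy drops too far.

    ASSUMPTION: the trigger is an accuracy drop larger than max_drop relative
    to the unpruned baseline. Returns the steps at which fine-tuning ran.
    """
    baseline = evaluate()
    tuned_steps: List[int] = []
    for step in range(num_steps):
        prune_step(step)
        if baseline - evaluate() > max_drop:
            fine_tune(step)
            tuned_steps.append(step)
    return tuned_steps


class ToyModel:
    """Stand-in for a real network: it only tracks a simulated accuracy."""

    def __init__(self, accuracy: float = 0.90) -> None:
        self.initial_accuracy = accuracy
        self.accuracy = accuracy

    def prune(self, step: int) -> None:
        # Simulated effect: each pruning step costs 0.4 points of accuracy.
        self.accuracy -= 0.004

    def evaluate(self) -> float:
        return self.accuracy

    def fine_tune(self, step: int) -> None:
        # Simulated effect: fine-tuning fully recovers the initial accuracy.
        self.accuracy = self.initial_accuracy


def run_toy_demo() -> List[int]:
    model = ToyModel()
    return iterative_prune(
        num_steps=6,
        prune_step=model.prune,
        evaluate=model.evaluate,
        fine_tune=model.fine_tune,
        max_drop=0.01,
    )


print(run_toy_demo())  # [2, 5]
```

In this toy run, fine-tuning is triggered at two of the six pruning steps instead of all six, which is where the savings in fine-tuning cost would come from under the assumed criterion.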