AdaDeDup is a hybrid data pruning framework designed to enhance the efficiency of training large-scale object detection models by integrating density-based pruning with model-informed feedback.
The framework partitions the data into clusters, applies an initial density-based pruning pass, and then uses a proxy model to adapt each cluster's pruning threshold according to how pruning affects the losses within that cluster.
Extensive experiments on the Waymo, COCO, and nuScenes datasets with standard models show that AdaDeDup outperforms existing methods and largely preserves model performance while pruning 20% of the data.
By achieving near-original model performance with significantly less data, AdaDeDup improves data efficiency for large-scale model training.
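
The three stages described above (partitioning, density-based pruning, and proxy-model feedback) can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not specify the similarity measure, the feedback signal, or the threshold update rule, so everything below is an assumption made for illustration. In particular, the function names `density_prune` and `adapt_thresholds`, the use of cosine similarity on nonzero embeddings, the "loss gap" between pruned and kept samples, and the fixed `step` update are all hypothetical. Cluster labels and per-sample proxy-model losses are taken as precomputed inputs.

```python
import numpy as np


def density_prune(emb, tau):
    """Greedy near-duplicate removal (assumed form of density-based pruning).

    A sample is kept only if its cosine similarity to every already-kept
    sample is below tau. Assumes every embedding has nonzero norm.
    """
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    kept = []
    for i in range(len(unit)):
        if not kept or np.max(unit[kept] @ unit[i]) < tau:
            kept.append(i)
    return np.array(kept, dtype=int)


def adapt_thresholds(emb, labels, losses, tau0=0.9, step=0.05,
                     tau_min=0.5, tau_max=0.99):
    """Per-cluster threshold adaptation (hypothetical update rule).

    For each cluster: prune at tau0, compare the mean proxy loss of the
    pruned samples with that of the kept samples, then raise the threshold
    (prune less) if the pruned samples look harder, or lower it (prune
    more) otherwise, and prune again with the adjusted threshold.
    Returns the per-cluster thresholds and the sorted kept indices.
    """
    taus, kept_all = {}, []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        kept = density_prune(emb[idx], tau0)
        pruned = np.ones(len(idx), dtype=bool)
        pruned[kept] = False
        tau = tau0
        if pruned.any():
            # Assumed impact signal: loss gap between pruned and kept samples.
            gap = losses[idx][pruned].mean() - losses[idx][kept].mean()
            tau = tau0 + step if gap > 0 else tau0 - step
            tau = float(np.clip(tau, tau_min, tau_max))
            kept = density_prune(emb[idx], tau)
        taus[int(c)] = tau
        kept_all.append(idx[kept])
    return taus, np.sort(np.concatenate(kept_all))
```

In this sketch a higher threshold means fewer samples count as duplicates, so a cluster whose pruned samples carry high proxy loss is pruned more conservatively. A real pipeline would also need a global budget (for example, the 20% pruning ratio reported above), which this sketch does not enforce.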