StatsMerging is a lightweight, learning-based model merging method that combines multiple large task-specific models into a single model so that they fit within memory constraints.
It leverages singular values from singular value decomposition (SVD) to capture task-specific weight distributions and predict task coefficients.
A small learner, StatsMergeLearner, models the weight distributions of the task-specific pre-trained models, which improves generalization.
The method introduces Task-Specific Teacher Distillation for merging vision models with different architectures, achieving improved accuracy, generalization, and robustness in experiments across eight tasks.
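
The idea of predicting task coefficients from weight statistics can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`weight_stats`, `predict_coefficients`, `merge`), the choice of mean, standard deviation, and top-k singular values as features, the linear learner, the softmax normalization, and the plain weighted sum of weights are all assumptions made here for illustration.

```python
import numpy as np


def weight_stats(w, k=2):
    """Summarize a 2-D weight matrix (assumed feature set: mean, std, top-k singular values)."""
    # compute_uv=False returns only the singular values, sorted in descending order.
    s = np.linalg.svd(w, compute_uv=False)
    return np.concatenate(([w.mean(), w.std()], s[:k]))


def predict_coefficients(stats, learner_w, learner_b):
    """Hypothetical linear learner: one score per task model, softmax-normalized."""
    scores = stats @ learner_w + learner_b  # shape: (num_tasks,)
    scores = scores - scores.max()          # subtract the max for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()


def merge(weights, coeffs):
    """Coefficient-weighted sum of same-shaped weight matrices (assumed merge rule)."""
    return sum(c * w for c, w in zip(coeffs, weights))


# Two toy "task-specific" weight matrices of the same shape.
task_weights = [2.0 * np.eye(3), 4.0 * np.eye(3)]

# One statistics vector per task model, stacked into shape (num_tasks, 2 + k).
stats = np.stack([weight_stats(w) for w in task_weights])

# An untrained learner (all-zero parameters) gives every task the same score.
learner_w = np.zeros(stats.shape[1])
learner_b = 0.0

coeffs = predict_coefficients(stats, learner_w, learner_b)
merged = merge(task_weights, coeffs)
```

In this sketch only the learner's parameters would be trained. The statistics are computed once per weight matrix, so the learner's input is much smaller than the weights themselves.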