Mixture-of-experts (MoE) is a popular approach in parameter-efficient fine-tuning for balancing model capacity and computational overhead.
However, current MoE variants struggle on heterogeneous datasets because the capacity of the experts is underutilized.
CoMoE (Contrastive Representation for MoE) is introduced to enhance modularization and specialization within MoE by incorporating a contrastive objective into training.
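As a rough illustration only (not the authors' implementation), the sketch below shows one way a contrastive auxiliary objective can be attached to an MoE layer in PyTorch: tokens routed to the same expert are treated as positives and all others as negatives, which pushes experts toward distinct, specialized representations. The class name `ContrastiveMoE`, the top-1 positive/negative construction, and hyperparameters such as `temperature` are assumptions made for exposition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContrastiveMoE(nn.Module):
    """Toy MoE layer whose forward pass also returns an InfoNCE-style auxiliary
    loss (an assumption for illustration): tokens routed to the same expert act
    as positives, all others as negatives, encouraging expert specialization."""

    def __init__(self, d_model: int, num_experts: int = 4, d_hidden: int = 16,
                 temperature: float = 0.1):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(d_model, num_experts)
        self.temperature = temperature

    def forward(self, x: torch.Tensor):
        # x: (batch, d_model) -- one representation per token/example
        gates = F.softmax(self.router(x), dim=-1)                    # (B, E)
        expert_outs = torch.stack([e(x) for e in self.experts], 1)   # (B, E, D)
        y = torch.einsum("be,bed->bd", gates, expert_outs)           # soft mixture
        aux_loss = self._contrastive_loss(expert_outs, gates.argmax(dim=-1))
        return y, aux_loss

    def _contrastive_loss(self, expert_outs, assignment):
        batch = expert_outs.size(0)
        # Representation each token receives from its top-1 expert.
        reps = F.normalize(expert_outs[torch.arange(batch), assignment], dim=-1)
        sim = reps @ reps.t() / self.temperature                     # (B, B)
        eye = torch.eye(batch, dtype=torch.bool, device=sim.device)
        sim = sim.masked_fill(eye, float("-inf"))                    # drop self-pairs
        positives = (assignment.unsqueeze(0) == assignment.unsqueeze(1)) & ~eye
        has_pos = positives.any(dim=-1)
        if not has_pos.any():                                        # no same-expert pair
            return sim.new_zeros(())
        log_prob = F.log_softmax(sim, dim=-1)
        # Mean log-probability of positive pairs for each anchor token.
        pos_log_prob = (log_prob.masked_fill(~positives, 0.0).sum(-1)
                        / positives.sum(-1).clamp(min=1))
        return -pos_log_prob[has_pos].mean()


# Usage sketch: the auxiliary loss would be added, with a small weight,
# to the ordinary task loss during fine-tuning.
layer = ContrastiveMoE(d_model=32)
out, aux = layer(torch.randn(8, 32))
total_loss = out.pow(2).mean() + 0.1 * aux   # placeholder task loss
total_loss.backward()
```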
Experiments across various benchmarks and multi-task settings show that CoMoE improves MoE's capacity and promotes modularization among the experts.