Task-specific fine-tuning is widely used to improve the performance of large language models (LLMs) on downstream tasks.
Model merging strategies have emerged to reduce the memory footprint at inference time by combining multiple fine-tuned LLMs into a single model.
To mitigate parameter conflicts and improve storage efficiency, a novel strategy called Channel Merging is introduced.
Channel Merging clusters and merges channel parameters based on their similarity, reducing conflicts while preserving specialized knowledge.
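The idea of grouping the same channel across task-specific models by similarity, merging similar channels while keeping conflicting ones separate, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the greedy cosine-similarity grouping, the `threshold` parameter, and the per-channel list output format are all assumptions made here for clarity.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two channel vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def channel_merge(weights, threshold=0.8):
    """Hypothetical channel-wise merge across task-specific models.

    weights: list of (out_channels, in_dim) arrays, one per fine-tuned model.
    For each channel index, channel vectors from different models are
    greedily grouped by cosine similarity; vectors within a group are
    averaged (merged), while dissimilar (conflicting) vectors are kept
    as separate groups to preserve specialized knowledge.
    Returns, per channel, the list of merged group vectors.
    """
    out_ch = weights[0].shape[0]
    merged = []
    for c in range(out_ch):
        rows = [w[c] for w in weights]
        groups = []  # each group holds mutually similar channel vectors
        for r in rows:
            for g in groups:
                if cosine(r, g[0]) >= threshold:
                    g.append(r)  # similar enough: merge into this group
                    break
            else:
                groups.append([r])  # conflicting channel: keep separate
        merged.append([np.mean(g, axis=0) for g in groups])
    return merged
```

With identical channels across models, each channel collapses to one stored vector; with strongly conflicting channels (low cosine similarity), both vectors are retained, trading a little storage for reduced interference.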