S'MoRE is a framework that integrates the efficiency of low-rank adaptation with the flexibility of Mixture-of-Experts (MoE) architectures.
S'MoRE applies a hierarchical low-rank decomposition to expert weights, yielding residuals interconnected in a multi-layer structure. By assembling these residuals, S'MoRE emulates the capacity of many experts while instantiating only a small number of low-rank matrices.
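To make the residual-composition idea concrete, the following is a minimal sketch, not the authors' implementation: a two-level hierarchy of low-rank residuals routed per token, where picking one residual per level composes one of n1 * n2 implicit experts while only n1 + n2 small matrices are stored. The class name, the argmax routing, and all shapes are illustrative assumptions.

```python
# Illustrative sketch (assumed names and shapes), not the S'MoRE codebase.
import torch
import torch.nn as nn

class HierarchicalResidualAdapter(nn.Module):
    def __init__(self, d_in, d_out, rank, n1=4, n2=4):
        super().__init__()
        # Level-1 residuals: n1 candidate down-projections into rank-r space.
        self.down = nn.Parameter(torch.randn(n1, d_in, rank) * 0.02)
        # Level-2 residuals: n2 candidate up-projections back to the output.
        self.up = nn.Parameter(torch.zeros(n2, rank, d_out))
        # Simple per-level routers; real routing would be learned differently.
        self.router1 = nn.Linear(d_in, n1)
        self.router2 = nn.Linear(d_in, n2)

    def forward(self, x):
        # x: (batch, d_in). One residual is chosen per level; their composition
        # acts as one of n1 * n2 implicit low-rank experts.
        i = self.router1(x).argmax(dim=-1)                 # (batch,)
        j = self.router2(x).argmax(dim=-1)                 # (batch,)
        h = torch.einsum('bd,bdr->br', x, self.down[i])    # apply chosen down-proj
        return torch.einsum('br,bro->bo', h, self.up[j])   # apply chosen up-proj

# Usage: the adapter's output is added to a frozen base layer's output.
x = torch.randn(8, 512)
adapter = HierarchicalResidualAdapter(d_in=512, d_out=512, rank=8)
delta = adapter(x)  # (8, 512) low-rank residual update
```

With n1 = n2 = 4, this stores only 8 small factor matrices yet can realize 16 distinct low-rank updates, which is the combinatorial effect the multi-layer residual structure is meant to exploit.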
Theoretical analysis and empirical results demonstrate that S'MoRE achieves superior fine-tuning performance, offering an efficient approach for adapting LLMs.