Scaling Low-Rank Adaptation (LoRA)-based Mixture-of-Experts (MoE) architectures for large language models (LLMs) is hindered by the limited expressiveness of traditional gating mechanisms.
RadarGate, a new gating method, applies rotational operations to LoRA representations to enhance expressiveness and enable richer feature interactions among multiple LoRAs as the number of experts grows.
RadarGate first fuses the LoRA representations and then applies a rotation matrix whose learnable parameters define the relative angular relationships between representations, adding an extra degree of freedom for learning cross-LoRA synergies.
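To make the mechanism concrete, below is a minimal PyTorch sketch of a rotation-based gate, assuming mean-pooled fusion, per-plane Givens rotations with learnable angles, and cosine-similarity gating; the class name, parameterization, and fusion choice are illustrative assumptions, not RadarGate's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RotationalGate(nn.Module):
    """Illustrative rotation-based gate over multiple LoRA representations."""

    def __init__(self, num_experts: int, hidden_dim: int):
        super().__init__()
        assert hidden_dim % 2 == 0, "pairwise Givens rotations need an even hidden size"
        # One learnable angle per 2-D plane of the hidden space (assumed parameterization).
        self.angles = nn.Parameter(torch.zeros(hidden_dim // 2))

    def rotate(self, x: torch.Tensor) -> torch.Tensor:
        # Apply an independent Givens rotation to each consecutive coordinate pair.
        x1, x2 = x[..., 0::2], x[..., 1::2]
        cos, sin = torch.cos(self.angles), torch.sin(self.angles)
        rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
        return rotated.flatten(-2)

    def forward(self, lora_outputs: torch.Tensor) -> torch.Tensor:
        # lora_outputs: (batch, num_experts, hidden_dim), one representation per LoRA expert.
        fused = lora_outputs.mean(dim=1)          # fuse the LoRA representations (mean pooling)
        query = self.rotate(fused)                # learnable angular adjustment of the fused vector
        # Gate each LoRA by its angular alignment with the rotated fused representation.
        scores = F.cosine_similarity(query.unsqueeze(1), lora_outputs, dim=-1)
        weights = torch.softmax(scores, dim=-1)   # (batch, num_experts)
        # Return the gated combination of LoRA outputs.
        return torch.einsum("be,beh->bh", weights, lora_outputs)
```

Under these assumptions, `RotationalGate(num_experts=8, hidden_dim=64)(torch.randn(4, 8, 64))` yields a `(4, 64)` gated output; a richer parameterization could make the angles per-expert or input-conditioned.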
Experiments on multiple benchmarks demonstrate RadarGate's effectiveness in scaling LoRAs, and analysis suggests that contrastive rotations align semantically similar representations while separating dissimilar ones.
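If one wanted to encourage this contrastive behavior explicitly, a supervised-contrastive-style objective over the rotated representations might look like the sketch below; this is an assumed illustration, not RadarGate's reported training loss, and the `labels` tensor marking semantically similar samples is a hypothetical input.

```python
import torch
import torch.nn.functional as F

def contrastive_rotation_loss(rotated: torch.Tensor, labels: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """Pull rotated representations with shared labels together, push the rest apart.

    rotated: (N, hidden_dim) rotated LoRA representations; labels: (N,) semantic group ids.
    """
    z = F.normalize(rotated, dim=-1)
    sim = z @ z.t() / temperature                                # pairwise cosine similarities
    self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    # Log-probability of each pair under a softmax over non-self pairs.
    log_prob = sim - torch.logsumexp(sim.masked_fill(self_mask, float("-inf")),
                                     dim=1, keepdim=True)
    # Average over positive pairs per anchor (anchors without positives contribute zero).
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    return -(log_prob * pos_mask).sum(dim=1).div(pos_counts).mean()
```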