
Image Credit: Arxiv

Two Is Better Than One: Rotations Scale LoRAs

  • Scaling Low-Rank Adaptation (LoRA)-based Mixture-of-Experts (MoE) for large language models (LLMs) is limited by traditional gating mechanisms, which hinder scalability.
  • RadarGate, a newly proposed gating method, applies rotational operations to LoRA representations to increase expressiveness and enable richer feature interactions among multiple LoRAs in scalable LLMs.
  • RadarGate first fuses the LoRA representations, then feeds the fused output through a rotation matrix whose learnable parameters define relative angular relationships between representations, adding an extra degree of freedom for learning cross-LoRA synergies (see the sketch after this list).
  • Experiments on multiple benchmarks show that RadarGate scales LoRAs effectively, with analysis suggesting that contrastive rotations align semantically similar representations while separating semantically distant ones.
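
To make the "fuse, then rotate" pipeline above concrete, here is a minimal PyTorch sketch of a rotation-style gate over multiple LoRA outputs. The class name RotationGateSketch, the softmax fusion, and the pairwise (Givens-style) rotation parameterization are illustrative assumptions, not the paper's actual RadarGate implementation.

# Minimal "fuse, then rotate" gate over multiple LoRA outputs (PyTorch).
# All names and the exact parameterization are assumptions for illustration.
import torch
import torch.nn as nn


class RotationGateSketch(nn.Module):
    def __init__(self, num_loras: int, hidden_dim: int):
        super().__init__()
        assert hidden_dim % 2 == 0, "this sketch rotates pairs of channels"
        # Fusion: learned mixing weights over the LoRA experts.
        self.fuse_weights = nn.Parameter(torch.zeros(num_loras))
        # Rotation: one learnable angle per channel pair, standing in for the
        # learnable rotation matrix described in the summary.
        self.angles = nn.Parameter(torch.zeros(hidden_dim // 2))

    def forward(self, lora_outputs: torch.Tensor) -> torch.Tensor:
        # lora_outputs: (batch, num_loras, hidden_dim), one row per LoRA expert.
        # 1) Fuse the per-LoRA representations with softmax-normalized weights.
        w = torch.softmax(self.fuse_weights, dim=0)            # (num_loras,)
        fused = torch.einsum("n,bnd->bd", w, lora_outputs)     # (batch, hidden_dim)
        # 2) Apply a block-diagonal rotation with learnable angles to the
        #    fused output, giving the gate angular degrees of freedom.
        b, d = fused.shape
        pairs = fused.reshape(b, d // 2, 2)
        cos, sin = torch.cos(self.angles), torch.sin(self.angles)
        x, y = pairs[..., 0], pairs[..., 1]
        rotated = torch.stack([cos * x - sin * y, sin * x + cos * y], dim=-1)
        return rotated.reshape(b, d)


if __name__ == "__main__":
    gate = RotationGateSketch(num_loras=4, hidden_dim=8)
    lora_outputs = torch.randn(2, 4, 8)   # batch of 2, 4 LoRA experts, dim 8
    print(gate(lora_outputs).shape)       # torch.Size([2, 8])

The pairwise-channel rotation is just one simple way to give the gate learnable angular parameters; a different rotation parameterization (for example, one acting across the LoRA representations themselves) would leave the overall fuse-then-rotate structure unchanged.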

Read Full Article
