Source: Arxiv
Code-Switching Curriculum Learning for Multilingual Transfer in LLMs

  • Large language models (LLMs) suffer sharp performance drops outside a handful of high-resource languages because of pre-training data imbalance.
  • Inspired by second-language acquisition, code-switching curriculum learning (CSCL) is proposed to enhance cross-lingual transfer in LLMs.
  • CSCL mimics the stages of human language learning through a curriculum that progresses from token-level code-switching to sentence-level code-switching and finally to monolingual corpora training (see the sketch after this list).
  • Using the Qwen 2 model, CSCL shows significant gains in language transfer to Korean compared to monolingual pre-training.
  • Ablation studies confirm that both token- and sentence-level code-switching enhance cross-lingual transfer, with gains amplified by curriculum learning.
  • The study extends to languages such as Japanese and Indonesian using the Gemma 2 and Phi 3.5 models, demonstrating improved language transfer there as well.
  • CSCL helps mitigate spurious correlations between language resources and safety alignment, offering an efficient framework for equitable language transfer in LLMs.
  • CSCL also proves effective in low-resource settings that lack high-quality monolingual corpora for language transfer.
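
To make the staged curriculum concrete, here is a minimal, hypothetical Python sketch of how such a three-stage data pipeline could be assembled. This is not the authors' implementation; names such as `bilingual_dict`, `translate_sentence`, and `build_curriculum` are illustrative placeholders, and the toy lexicon stands in for a real bilingual dictionary.

```python
import random

# Toy EN -> KO lexicon; a real pipeline would use a full bilingual dictionary.
bilingual_dict = {"language": "언어", "model": "모델", "data": "데이터"}

def token_level_switch(sentence: str, switch_prob: float = 0.3) -> str:
    """Stage 1: replace individual source-language tokens with target-language equivalents."""
    tokens = sentence.split()
    switched = [
        bilingual_dict.get(tok.lower(), tok) if random.random() < switch_prob else tok
        for tok in tokens
    ]
    return " ".join(switched)

def sentence_level_switch(sentences: list[str], translate_sentence, switch_prob: float = 0.5) -> list[str]:
    """Stage 2: replace whole sentences with their target-language translations."""
    return [
        translate_sentence(sent) if random.random() < switch_prob else sent
        for sent in sentences
    ]

def build_curriculum(source_corpus: list[str], target_corpus: list[str], translate_sentence) -> list[list[str]]:
    """Order the stages as in the summary: token-level -> sentence-level -> monolingual target data."""
    stage1 = [token_level_switch(doc) for doc in source_corpus]
    stage2 = [
        " ".join(sentence_level_switch(doc.split(". "), translate_sentence))  # crude sentence split for the sketch
        for doc in source_corpus
    ]
    stage3 = list(target_corpus)  # pure target-language (e.g., Korean) monolingual corpus
    return [stage1, stage2, stage3]  # train on each stage sequentially
```

The sketch only prepares data; in practice each stage's output would be fed to a standard causal-LM training loop in sequence, which is the curriculum aspect described above.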
