ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention

  • Transformer models struggle to integrate positional information while preserving the flexibility of multi-head attention.
  • ComplexFormer introduces Complex Multi-Head Attention (CMHA), which models semantic and positional differences jointly in the complex plane, enhancing representational capacity.
  • Its key components are a per-head Euler transformation and an adaptive differential rotation mechanism, letting each head operate in its own complex subspace (see the sketch below).
  • In extensive experiments, ComplexFormer outperforms strong baselines such as RoPE-based Transformers across tasks, achieving lower generation perplexity and improved long-context coherence.
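
The summary gives only the high-level idea, so below is a minimal, hypothetical sketch of what one head of complex-vector attention could look like. It pairs adjacent query/key dimensions into complex coordinates, applies a per-head Euler rotation e^{iθ(m)} (here a fixed RoPE-style rotation scaled by a hypothetical head_rate parameter, standing in for the paper's adaptive differential rotation), and scores tokens by the real part of the complex inner product. The function and parameter names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def complex_head_attention(q, k, v, head_rate, base=10000.0):
    """Sketch of one attention head operating in the complex plane.

    q, k: (seq_len, d) real-valued projections; adjacent dims are paired
          into d/2 complex coordinates.
    v:    (seq_len, d_v) real-valued values.
    head_rate: scalar controlling this head's rotation speed -- a stand-in
               for the head-specific adaptive rotation described in the paper.
    """
    seq_len, d = q.shape
    half = d // 2

    # Pair adjacent real dims into complex coordinates: x[0::2] + i * x[1::2]
    qc = q[:, 0::2] + 1j * q[:, 1::2]
    kc = k[:, 0::2] + 1j * k[:, 1::2]

    # Per-head Euler rotation e^{i * theta(m)}: RoPE-like angles scaled by head_rate
    freqs = 1.0 / (base ** (np.arange(half) / half))        # (half,)
    pos = np.arange(seq_len)[:, None]                        # (seq_len, 1)
    rot = np.exp(1j * head_rate * pos * freqs[None, :])      # (seq_len, half)
    qc = qc * rot
    kc = kc * rot

    # Rotating both q and k makes the score depend on the positional angle
    # difference (m - n) plus the semantic angle between q and k.
    scores = np.real(qc @ np.conj(kc).T) / np.sqrt(half)     # (seq_len, seq_len)

    # Causal mask + softmax
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)

    return weights @ v
```

A full CMHA layer would presumably run several such heads, each with its own (learned, input-adaptive) rotation parameters, and concatenate their outputs as in standard multi-head attention; consult the paper for the actual formulation.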
