Source: arXiv
Subquadratic Algorithms and Hardness for Attention with Any Temperature

  • Researchers present subquadratic algorithms for computing Attention in Transformers when the head dimension is d = Theta(log n) (the standard quadratic baseline is sketched after this list).
  • In this regime, subquadratic Attention is achievable when the input entries are small, bounded by B = o(sqrt(log n)), or when the softmax is applied at high temperature.
  • The paper also studies efficient Attention without strong assumptions on the temperature, giving subquadratic algorithms for constant head dimension d = O(1).
  • The study concludes that in the complementary parameter regimes, the standard quadratic-time algorithm for Attention is optimal under fine-grained complexity assumptions.
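
For context, these results concern the standard softmax Attention map softmax(Q K^T / temperature) V, whose direct evaluation on n tokens takes quadratic time. The sketch below is a minimal NumPy illustration of that quadratic baseline, not code from the paper; the function name, array shapes, and the temperature argument are assumptions for exposition.

```python
import numpy as np

def attention(Q, K, V, temperature=1.0):
    """Standard softmax Attention: softmax(Q K^T / temperature) @ V.

    Q, K, V have shape (n, d). Forming the n x n score matrix explicitly
    makes this the O(n^2 * d) baseline that the hardness results refer to.
    """
    scores = Q @ K.T / temperature                 # (n, n): the quadratic bottleneck
    scores -= scores.max(axis=1, keepdims=True)    # numerical stabilization
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ V                             # (n, d)

# Toy usage in the regime the summary mentions: head dimension d ~ log n.
n, d = 1024, 10
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(Q, K, V, temperature=np.sqrt(d))   # common 1/sqrt(d)-style scaling
print(out.shape)                                   # (1024, 10)
```

Subquadratic algorithms in the paper avoid materializing the n x n score matrix; the code above is only meant to make the quadratic cost concrete.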
