Source: Arxiv

JoFormer (Journey-based Transformer): Theory and Empirical Analysis on the Tiny Shakespeare Dataset

  • JoFormer is a journey-based Transformer architecture that incorporates positional information through learnable directional transforms.
  • It represents relative positions using sequentially composed directional transforms and outperforms the RoFormer baseline on the Tiny Shakespeare character-level language modeling task (a minimal sketch of this mechanism follows the list).
  • JoFormer achieves lower perplexity and faster convergence, showcasing the benefits of its more expressive treatment of positional relationships.
  • The per-token JoFormer, despite being a conceptual variant, demonstrates strong performance, hinting at its potential for more complex architectures.

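The summary describes relative positions as compositions of learnable directional transforms along the "journey" between tokens, but does not give the exact parameterization. The sketch below is therefore an assumption-labeled illustration, not the authors' implementation: a single causal attention head in which each position step contributes a learnable 2x2 rotation (essentially RoPE's fixed angles made trainable), composed by cumulative summation so that the attention score between two positions depends only on the transforms between them. The class name JourneyAttentionHead, the angle parameterization, and all hyperparameters are hypothetical.

```python
# Hypothetical sketch (not the paper's released code): relative position encoded
# by composing learnable per-step 2x2 rotations along the journey between tokens.
import torch
import torch.nn as nn
import torch.nn.functional as F


class JourneyAttentionHead(nn.Module):
    def __init__(self, d_model: int, d_head: int, max_len: int):
        super().__init__()
        assert d_head % 2 == 0, "dimensions are rotated in pairs"
        self.q_proj = nn.Linear(d_model, d_head, bias=False)
        self.k_proj = nn.Linear(d_model, d_head, bias=False)
        self.v_proj = nn.Linear(d_model, d_head, bias=False)
        # One learnable rotation angle per position step and per dim pair.
        # RoPE fixes these angles; here they are free parameters.
        self.step_angles = nn.Parameter(torch.randn(max_len, d_head // 2) * 0.02)

    def _rotate(self, x: torch.Tensor, angles: torch.Tensor) -> torch.Tensor:
        # Apply a block-diagonal 2x2 rotation to each (even, odd) dim pair.
        x1, x2 = x[..., 0::2], x[..., 1::2]
        cos, sin = angles.cos(), angles.sin()
        return torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1).flatten(-2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); causal character-level language modeling.
        B, T, _ = x.shape
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Composing the per-step rotations from position 0 up to position t is a
        # cumulative sum of angles, so the score between positions i and j
        # depends only on the transforms along the journey between them.
        journey = torch.cumsum(self.step_angles[:T], dim=0)  # (T, d_head // 2)
        q, k = self._rotate(q, journey), self._rotate(k, journey)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), 1)
        scores = scores.masked_fill(mask, float("-inf"))
        return F.softmax(scores, dim=-1) @ v


# Example: a batch shaped like Tiny Shakespeare character sequences.
head = JourneyAttentionHead(d_model=128, d_head=64, max_len=256)
out = head(torch.randn(2, 256, 128))
print(out.shape)  # torch.Size([2, 256, 64])
```

The rotation special case keeps the sketch close to the RoFormer baseline the paper compares against; with general, non-commuting directional transforms, the composition along the journey would require explicit cumulative matrix products rather than a sum of angles.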