techminis

A naukri.com initiative


Arxiv


OmniDraft: A Cross-vocabulary, Online Adaptive Drafter for On-device Speculative Decoding

  • OmniDraft is a unified framework for speculative decoding in online deployment settings, where it addresses two challenges: vocabulary mismatch between the draft and target models, and the need to improve latency.
  • It lets a single draft model work with any target model and adapt dynamically to user data, using an online n-gram cache and hybrid distillation fine-tuning.
  • The framework is aimed at on-device large language model (LLM) applications where model cost, efficiency, and user customization are the main concerns.
  • OmniDraft is evaluated on online learning tasks in math reasoning, coding, and text generation, where it shows compatibility with various target models and speeds up decoding.
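
To make the idea of cross-vocabulary drafting concrete, here is a minimal Python sketch of the general pattern: a draft model proposes tokens, an n-gram cache maps runs of draft tokens onto single target-vocabulary tokens, and the target model checks the proposal greedily. This is an illustration under stated assumptions, not OmniDraft's implementation: the function names (`translate_draft`, `verify_greedy`), the cache layout, the string tokens, and the toy `target_next` model are all hypothetical, and the hybrid distillation fine-tuning step is not shown.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical cache layout: an n-gram of draft-vocabulary tokens maps to
# one target-vocabulary token. A real system would populate this online.
NGramCache = Dict[Tuple[str, ...], str]


def translate_draft(draft_tokens: List[str], cache: NGramCache, max_n: int = 3) -> List[str]:
    """Merge draft tokens into target tokens, preferring the longest cached n-gram."""
    out: List[str] = []
    i = 0
    while i < len(draft_tokens):
        # Try n-grams from longest to shortest (length 2 is the smallest merge).
        for n in range(min(max_n, len(draft_tokens) - i), 1, -1):
            key = tuple(draft_tokens[i:i + n])
            if key in cache:
                out.append(cache[key])
                i += n
                break
        else:
            # No merge found. Assumption: the token also exists in the target vocabulary.
            out.append(draft_tokens[i])
            i += 1
    return out


def verify_greedy(prefix: List[str], proposal: List[str],
                  target_next: Callable[[List[str]], str]) -> List[str]:
    """Accept proposed tokens while they match the target's greedy choice.

    On the first mismatch, emit the target's own token and stop. If every
    proposed token is accepted, append one extra token from the target.
    A real system scores all positions in one forward pass; the sequential
    calls here are only for readability.
    """
    accepted: List[str] = []
    for tok in proposal:
        expected = target_next(prefix + accepted)
        if tok != expected:
            accepted.append(expected)
            return accepted
        accepted.append(tok)
    accepted.append(target_next(prefix + accepted))
    return accepted


# Toy target model: always continues one fixed sentence (purely illustrative).
SENTENCE = ["spec", "ulative", " decoding", " is", " fast"]


def target_next(context: List[str]) -> str:
    return SENTENCE[len(context)] if len(context) < len(SENTENCE) else "<eos>"


cache: NGramCache = {("ul", "ative"): "ulative"}
draft = ["spec", "ul", "ative", " decoding", " was"]

proposal = translate_draft(draft, cache)
result = verify_greedy([], proposal, target_next)
print(proposal)
print(result)
```

In this toy run the cache merges the draft tokens "ul" and "ative" into the target token "ulative", so three proposed tokens are accepted before the target replaces the mismatched fourth one. Without the cache entry, the proposal would diverge from the target's tokenization at the second token and most of the draft would be discarded.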
