techminis

A naukri.com initiative


Arxiv


Image Credit: Arxiv

Earley-Driven Dynamic Pruning for Efficient Structured Decoding

  • Large Language Models (LLMs) have shown impressive capabilities, but ensuring their outputs adhere to strict structural or grammatical constraints remains a challenge.
  • Constrained decoding with a context-free grammar forces an LLM to produce output in a specific format: at each decoding step, a mask over the token logits blocks every token the grammar does not allow next.
  • The authors propose ZapFormat, a dynamic pruning strategy based on the Earley algorithm that identifies and eliminates invalid or redundant Earley states in real time, reducing memory usage and improving speed.
  • Experiments show that Formatron, a new constrained decoding engine built on ZapFormat, produces outputs that comply with the grammar with high precision and runs significantly faster than existing implementations.
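
To make the masking idea concrete, here is a minimal Python sketch of grammar-constrained decoding with a plain Earley recognizer. It is not ZapFormat or Formatron: the toy grammar, the vocabulary, and the names `closure`, `is_viable_prefix`, and `mask_logits` are all assumptions made up for illustration, and the recognizer does no pruning at all. It assumes a grammar without empty productions and re-parses the whole prefix for every candidate token, which is deliberately naive.

```python
import math

# Hypothetical toy grammar: nested lists of the digits 0 and 1, e.g. "[1,[0]]".
# Any symbol that is not a key in GRAMMAR is treated as a one-character terminal.
GRAMMAR = {
    "S": [("[", "]"), ("[", "ITEMS", "]")],
    "ITEMS": [("ITEM",), ("ITEM", ",", "ITEMS")],
    "ITEM": [("0",), ("1",), ("S",)],
}


def closure(items, chart, pos):
    """Run Earley prediction and completion until the state set at `pos` is stable.

    An Earley state is (head, body, dot, origin). This sketch assumes the
    grammar has no empty productions, so a completed state always has
    origin < pos and chart[origin] is already final.
    """
    state_set = set(items)
    work = list(items)
    while work:
        head, body, dot, origin = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            # Predict: expand the nonterminal right after the dot.
            new = [(body[dot], b, 0, pos) for b in GRAMMAR[body[dot]]]
        elif dot == len(body):
            # Complete: advance every state that was waiting for `head`.
            new = [(h, b, d + 1, o) for (h, b, d, o) in chart[origin]
                   if d < len(b) and b[d] == head]
        else:
            new = []  # Dot is before a terminal: wait for the scan step.
        for state in new:
            if state not in state_set:
                state_set.add(state)
                work.append(state)
    return state_set


def is_viable_prefix(text, start="S"):
    """Return True if `text` can still be extended to a string in the grammar."""
    chart = [closure({(start, b, 0, 0) for b in GRAMMAR[start]}, [], 0)]
    for pos, ch in enumerate(text, start=1):
        # Scan: keep only states whose next symbol is this character.
        scanned = {(h, b, d + 1, o) for (h, b, d, o) in chart[-1]
                   if d < len(b) and b[d] == ch}
        if not scanned:
            return False
        chart.append(closure(scanned, chart, pos))
    return True


def mask_logits(logits, vocab, prefix):
    """Set the logit of every token that would break the grammar to -inf."""
    return [logit if is_viable_prefix(prefix + token) else -math.inf
            for logit, token in zip(logits, vocab)]


if __name__ == "__main__":
    vocab = ["[", "]", "0", ",", "1]", "]]", ",,"]
    masked = mask_logits([0.0] * len(vocab), vocab, prefix="[0")
    # After "[0" the grammar only allows "]" or ",".
    print([tok for tok, logit in zip(vocab, masked) if logit != -math.inf])
```

Note that the chart in this sketch keeps every state set for every position, so it grows with the length of the output. That accumulation of states is the kind of cost a pruning strategy such as the one summarized above is meant to reduce.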
