
Source: arXiv


ETS: Efficient Tree Search for Inference-Time Scaling

  • Test-time compute scaling aims to improve model accuracy by using additional computation at inference time.
  • Efficient Tree Search (ETS) is proposed to address challenges in search methods for inference-time scaling.
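  • ETS builds on search against a process reward model: the model generates multiple candidates at each step of the search, and the reward model scores them to decide which trajectories to keep.
  • Retaining more diverse trajectories during tree search promotes exploration, but diverse trajectories share less of the KV cache and therefore consume more memory.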
  • ETS promotes KV sharing by pruning redundant trajectories while maintaining necessary diversity.
  • A linear programming cost model in ETS penalizes the number of retained nodes to promote KV cache sharing.
  • ETS achieves a 1.8× reduction in average KV cache size during search, leading to 1.4× higher throughput.
  • These efficiency gains are measured relative to prior state-of-the-art methods and come with minimal accuracy degradation.
  • No custom kernel implementation is required for ETS, and the code is available on GitHub.
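
The trade-off described in the points above, reward versus the number of tree nodes kept in the KV cache, can be illustrated with a small toy sketch. This is not the paper's formulation or code: the function names, the `node_penalty` parameter, the candidate paths, and the reward values are all hypothetical, and an exhaustive search over subsets stands in for the linear programming solver that ETS uses.

```python
from itertools import combinations


def kv_nodes(trajectories):
    """Return the set of unique tree nodes the given trajectories need cached.

    Each trajectory is a tuple of step labels from the root. Every prefix of a
    trajectory identifies one tree node, so shared prefixes are counted once.
    """
    nodes = set()
    for path in trajectories:
        for depth in range(1, len(path) + 1):
            nodes.add(path[:depth])
    return nodes


def select_trajectories(candidates, rewards, keep, node_penalty):
    """Choose `keep` candidates maximizing total reward minus a node-count penalty.

    Assumption: brute force over all subsets, which is only practical for toy
    sizes. It stands in for an LP/ILP solver.
    """
    best_score, best_subset = float("-inf"), ()
    for subset in combinations(range(len(candidates)), keep):
        chosen = [candidates[i] for i in subset]
        score = sum(rewards[i] for i in subset)
        score -= node_penalty * len(kv_nodes(chosen))
        if score > best_score:
            best_score, best_subset = score, subset
    return list(best_subset)


# Hypothetical candidates: 0 and 1 share the prefix ("a", "b"); 3 shares nothing.
candidates = [
    ("a", "b", "c"),
    ("a", "b", "d"),
    ("a", "e", "f"),
    ("g", "h", "i"),
]
rewards = [0.80, 0.78, 0.79, 0.81]  # made-up reward model scores

no_penalty = select_trajectories(candidates, rewards, keep=2, node_penalty=0.0)
with_penalty = select_trajectories(candidates, rewards, keep=2, node_penalty=0.05)

print(no_penalty, len(kv_nodes([candidates[i] for i in no_penalty])))      # [0, 3] 6
print(with_penalty, len(kv_nodes([candidates[i] for i in with_penalty])))  # [0, 1] 4
```

In this toy case the penalty shifts the selection from the two highest-scoring but unrelated trajectories (6 cached nodes) to two trajectories that share a prefix (4 cached nodes), at a small cost in total reward. Setting the penalty too high would collapse the search onto near-duplicate trajectories, which is why the summary stresses maintaining necessary diversity.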
