
Source: arXiv



Týr-the-Pruner: Unlocking Accurate 50% Structural Pruning for LLMs via Global Sparsity Distribution Optimization

  • Týr-the-Pruner is a newly proposed structural pruning method for large language models (LLMs), aimed at improving inference efficiency in a hardware-agnostic way.
  • It is an end-to-end, search-based global structural pruning framework that looks for the optimal sparsity distribution under a target overall sparsity ratio.
  • The framework builds a supernet using local pruning and expectation error accumulation, then applies an iterative prune-and-search strategy so that the search converges efficiently.
  • In the reported experiments, Týr-the-Pruner achieves state-of-the-art structural pruning, retaining 97% of the dense model's performance while removing 50% of Llama-3.1-70B's parameters.
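
To make the idea of a "sparsity distribution under a target overall sparsity ratio" concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: it swaps in plain exhaustive enumeration for the supernet-based search, and the function name `search_sparsity_distribution`, the per-layer error tables, and the layer sizes are all hypothetical values chosen for illustration.

```python
from itertools import product


def search_sparsity_distribution(layer_errors, layer_sizes, target_sparsity, tol=1e-9):
    """Pick one sparsity level per layer so that the summed error is lowest
    while the parameter-weighted overall sparsity equals target_sparsity.

    layer_errors: list of dicts, candidate sparsity -> estimated error
                  (hypothetical numbers; a real system would measure these).
    layer_sizes:  list of parameter counts, one per layer.
    """
    total_params = sum(layer_sizes)
    best_choice, best_error = None, float("inf")
    # Exhaustive enumeration: fine for a toy, infeasible for a real LLM.
    for choice in product(*(sorted(errors) for errors in layer_errors)):
        removed = sum(s * n for s, n in zip(choice, layer_sizes))
        if abs(removed / total_params - target_sparsity) > tol:
            continue  # this allocation misses the overall sparsity target
        error = sum(errors[s] for errors, s in zip(layer_errors, choice))
        if error < best_error:
            best_choice, best_error = choice, error
    return best_choice, best_error


# Hypothetical error tables: layer 0 is sensitive to pruning, layer 2 is robust.
layer_errors = [
    {0.25: 1.0, 0.50: 4.0, 0.75: 9.0},
    {0.25: 0.5, 0.50: 1.0, 0.75: 3.0},
    {0.25: 0.1, 0.50: 0.2, 0.75: 0.5},
]
layer_sizes = [100, 100, 100]

choice, error = search_sparsity_distribution(layer_errors, layer_sizes, 0.50)
uniform_error = sum(errors[0.50] for errors in layer_errors)
print("chosen per-layer sparsity:", choice)
print("searched error:", error, "vs uniform 50% error:", uniform_error)
```

In this toy setup the search prunes the robust layer more heavily and the sensitive layer less, which yields a lower total error than pruning every layer by the same 50% while still removing half of the parameters overall.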
