techminis

A naukri.com initiative

Marktechpost

DeepSeek AI Releases DeepGEMM: An FP8 GEMM Library that Supports both Dense and MoE GEMMs Powering V3/R1 Training and Inference

  • DeepSeek AI has released DeepGEMM, an FP8 GEMM library for efficient matrix multiplications in deep learning and high-performance computing.
  • DeepGEMM supports both standard (dense) and Mixture-of-Experts (MoE) grouped GEMMs, leveraging NVIDIA Hopper tensor cores.
  • The library uses fine-grained scaling and a two-level accumulation strategy to keep FP8 arithmetic accurate without compromising performance.
  • DeepGEMM reports speedups of up to 2.7x for normal GEMMs and 1.1x to 1.2x for grouped GEMMs.
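
To make the idea of fine-grained scaling with two-level accumulation more concrete, here is a minimal pure-Python sketch of a dot product. It is an illustration under stated assumptions, not DeepGEMM's kernel or API: the function names, the block size of 4, and the E4M3-like rounding (3 mantissa bits, maximum magnitude 448) are all chosen for the example. Each block of values gets its own scale factor, products are summed per block (level 1), and the rescaled partial sums are added to a separate higher-precision total (level 2). Python floats are 64-bit throughout, so the reduced precision of a real first-level accumulator is not modeled here.

```python
import math

FP8_E4M3_MAX = 448.0  # largest magnitude in the E4M3-like format assumed here


def round_to_e4m3(x):
    """Simulate rounding to an E4M3-like grid (1 implicit + 3 explicit mantissa bits)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), FP8_E4M3_MAX)      # saturate instead of overflowing
    _, e = math.frexp(a)               # a = m * 2**e with 0.5 <= m < 1
    e = max(e, -5)                     # below 2**-6 the grid spacing stops shrinking
    step = 2.0 ** (e - 4)              # spacing between neighbouring representable values
    return sign * min(round(a / step) * step, FP8_E4M3_MAX)


def quantize_blocks(values, block_size):
    """Split values into blocks; give each block its own scale (fine-grained scaling)."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        amax = max(abs(v) for v in chunk)
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
        blocks.append((scale, [round_to_e4m3(v / scale) for v in chunk]))
    return blocks


def scaled_dot(a, b, block_size=4):
    """Dot product of quantized vectors with per-block scales and two accumulators."""
    total = 0.0                         # level 2: high-precision running total
    for (sa, qa), (sb, qb) in zip(quantize_blocks(a, block_size),
                                  quantize_blocks(b, block_size)):
        partial = 0.0                   # level 1: sum of quantized products in one block
        for x, y in zip(qa, qb):
            partial += x * y
        total += partial * sa * sb      # undo both scales when promoting the partial sum
    return total


# Hypothetical data: a small-magnitude block followed by a large-magnitude block.
a = [0.0001, 0.0002, 0.0003, 0.0004, 100.0, 200.0, 300.0, 400.0]
b = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

exact = sum(x * y for x, y in zip(a, b))
fine = scaled_dot(a, b, block_size=4)     # one scale per 4 elements
coarse = scaled_dot(a, b, block_size=8)   # a single scale for the whole vector

print("exact :", exact)
print("fine  :", fine)
print("coarse:", coarse)
```

With a single scale for the whole vector, the small values in `a` are rounded away because the scale is set by the largest element. With one scale per block they keep their relative precision, which is the motivation for scaling at a finer granularity.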
