DeepSeek has released FlashMLA, a multi-head latent attention (MLA) decoding kernel for NVIDIA Hopper GPUs, to improve computational efficiency in AI applications.
The kernel supports variable-length sequences and reaches up to 3000 GB/s in memory-bound workloads and 580 TFLOPS in compute-bound workloads.
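To put the two throughput figures in context, here is a minimal roofline-style sketch in Python. It treats 3000 GB/s and 580 TFLOPS as a bandwidth ceiling and a compute ceiling on the same device, which is an assumption made for illustration only; the function names are hypothetical and are not part of FlashMLA.

```python
# Figures quoted in the text. Treating them as the two ceilings of a single
# roofline model is an assumption made for illustration.
BANDWIDTH_BYTES_PER_S = 3000e9   # 3000 GB/s
COMPUTE_FLOPS_PER_S = 580e12     # 580 TFLOPS


def ridge_point(compute: float, bandwidth: float) -> float:
    """Arithmetic intensity (FLOPs per byte) at which the two ceilings meet."""
    return compute / bandwidth


def attainable_flops(intensity: float, compute: float, bandwidth: float) -> float:
    """Roofline estimate: throughput is capped by whichever ceiling is lower."""
    return min(compute, intensity * bandwidth)


def regime(intensity: float, compute: float, bandwidth: float) -> str:
    """Classify a workload by comparing its intensity with the ridge point."""
    if intensity < ridge_point(compute, bandwidth):
        return "memory-bound"
    return "compute-bound"


if __name__ == "__main__":
    ridge = ridge_point(COMPUTE_FLOPS_PER_S, BANDWIDTH_BYTES_PER_S)
    print(f"ridge point: {ridge:.1f} FLOPs per byte")
    for intensity in (10.0, 500.0):  # hypothetical workloads
        label = regime(intensity, COMPUTE_FLOPS_PER_S, BANDWIDTH_BYTES_PER_S)
        print(f"intensity {intensity:.0f} FLOPs/byte is {label}")
```

Under this simplified model, a workload that performs fewer than roughly 193 floating-point operations per byte moved is limited by memory bandwidth, and one above that is limited by compute.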
FlashMLA is available on GitHub for exploration and use, and is expected to affect areas such as algorithmic cryptocurrency trading.
DeepSeek recently announced the launch of five open-source repositories and has a collection of 14 open-source models and repositories on Hugging Face.