California-based startup Inception Labs has introduced Mercury, claimed to be the first commercial-scale diffusion large language model.
According to the company, Mercury runs up to ten times faster than current models, generating over 1,000 tokens per second on NVIDIA H100 GPUs.
Unlike autoregressive models, which produce text one token at a time, Mercury uses a diffusion process: it drafts all tokens in parallel and refines them over multiple steps into the final output.
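The parallel-refinement idea can be illustrated with a toy sketch. This is not Mercury's actual algorithm (which is proprietary); it only shows the control flow of masked-diffusion decoding, where every position is updated at each step rather than left to right. The `denoise_step` stand-in and the tiny fixed vocabulary are hypothetical placeholders for a learned denoising network.

```python
import random

MASK = "_"
# Hypothetical target vocabulary; a real model predicts from its full vocabulary.
target_vocab = ["the", "cat", "sat", "on", "mat"]

def denoise_step(tokens, confidence):
    # Toy "denoiser": fill in a fraction of masked positions in parallel.
    # A real diffusion LM would predict every position with a neural network.
    out = list(tokens)
    for i, t in enumerate(out):
        if t == MASK and random.random() < confidence:
            out[i] = target_vocab[i % len(target_vocab)]
    return out

def diffusion_generate(length=5, steps=4):
    # Start from a fully masked sequence and refine ALL positions at every
    # step, in contrast to autoregressive left-to-right decoding.
    tokens = [MASK] * length
    for step in range(steps):
        tokens = denoise_step(tokens, confidence=(step + 1) / steps)
    return tokens

print(diffusion_generate())
```

Because all positions are updated per step, the number of model calls scales with the (small, fixed) step count rather than with sequence length, which is the basis of the speed claims above.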
Inception Labs' own evaluations show Mercury outperforming other small models in output speed, and the company argues that diffusion models also hold advantages in reasoning and in iteratively refining their output.