Cerebras announced that its systems outperformed NVIDIA's DGX B200, which uses 8 Blackwell GPUs, in output token speed on Meta's Llama 4 Maverick model.
Cerebras achieved an output token speed of over 2,500 tokens/sec, compared with NVIDIA's 1,000 tokens/sec.
Cerebras stated that it also outperformed systems from other vendors and emphasized that its solution was the only one to beat Blackwell.
Cerebras, a US-based AI hardware maker, specializes in hardware designed for AI inference; its Wafer-Scale Engine (WSE) technology offers faster output token speed than traditional GPUs.