<ul><li>Software engineer John Leimgruber managed to run the massive, 671-billion-parameter DeepSeek-R1 model without a GPU.</li><li>John used a quantised, non-distilled version of the model, which retained good quality despite the compression.</li><li>The model was natively trained in 8-bit precision, which makes it efficient by default and keeps the file size down.</li><li>John ran the model from a fast NVMe SSD, loading only the KV cache into RAM and using memory mapping to access the rest.</li></ul>
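
The memory-mapping idea in the last point can be illustrated with a minimal Python sketch. This is not John's actual setup: the helper names `make_demo_file` and `read_slice_mmap` are hypothetical, and a small temporary file stands in for the model weights. It only shows the general technique, where the operating system pages in the parts of a file that are actually touched instead of loading the whole file into RAM.

```python
import mmap
import os
import tempfile


def make_demo_file(size_bytes):
    """Create a temporary file standing in for a large weights file (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        # Fill the file with a repeating 0..255 byte pattern so slices are easy to check.
        f.write(bytes(i % 256 for i in range(size_bytes)))
    return path


def read_slice_mmap(path, offset, length):
    """Return `length` bytes starting at `offset` via a read-only memory map.

    Only the pages backing the requested range need to be brought into memory;
    the rest of the file stays on disk until it is accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


if __name__ == "__main__":
    demo_path = make_demo_file(4096)
    try:
        chunk = read_slice_mmap(demo_path, 256, 4)
        print(list(chunk))
    finally:
        os.remove(demo_path)
```

With a real model file the same pattern applies at a much larger scale: read speed then depends on how fast the SSD can serve the pages being touched, which is why a fast NVMe drive matters here.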

This Developer Ran the 671 Billion Parameter DeepSeek-R1 Model—Without a GPU
