Recent advances in inference-time compute have improved performance on complex tasks using Large Reasoning Models (LRMs). This improved accuracy comes at the cost of high inference latency, which stems from the length of the generated reasoning sequences and the autoregressive nature of decoding. SpecReason is a system that accelerates LRM inference by using a lightweight model to carry out simpler intermediate reasoning steps. SpecReason achieves a 1.5-2.5x speedup over vanilla LRM inference while improving accuracy by 1.0-9.9%.
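
The idea of delegating simpler intermediate steps to a lightweight model can be illustrated with a minimal sketch. The abstract does not say how the system decides which steps are simple enough to delegate, so everything below is an assumption made for illustration: the lightweight model drafts each step, a scoring function (hypothetically backed by the large model) rates the draft, and the large model regenerates the step only when the score falls below a threshold. The names `speculative_reasoning`, `small_step`, `large_step`, `score_step`, `threshold`, and `stop_marker` are hypothetical and are not taken from SpecReason itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interfaces (assumptions, not SpecReason's actual API).
StepFn = Callable[[str, List[str]], str]           # (question, steps so far) -> next step
ScoreFn = Callable[[str, List[str], str], float]   # (question, steps, draft) -> score in [0, 1]


@dataclass
class Trace:
    steps: List[str] = field(default_factory=list)    # accepted reasoning steps
    sources: List[str] = field(default_factory=list)  # "small" or "large" for each step


def speculative_reasoning(
    question: str,
    small_step: StepFn,
    large_step: StepFn,
    score_step: ScoreFn,
    threshold: float = 0.7,       # assumed acceptance threshold
    max_steps: int = 16,
    stop_marker: str = "ANSWER:",  # assumed marker for the final step
) -> Trace:
    """Draft each step with the small model; fall back to the large one on low scores."""
    trace = Trace()
    for _ in range(max_steps):
        draft = small_step(question, trace.steps)
        if score_step(question, trace.steps, draft) >= threshold:
            step, source = draft, "small"   # cheap draft accepted
        else:
            step, source = large_step(question, trace.steps), "large"  # expensive fallback
        trace.steps.append(step)
        trace.sources.append(source)
        if step.startswith(stop_marker):
            break
    return trace


# Toy stand-ins for real models, used only to exercise the control flow.
def toy_small(question: str, steps: List[str]) -> str:
    drafts = ["Restate the problem.", "guess??", "ANSWER: 4"]
    return drafts[min(len(steps), len(drafts) - 1)]


def toy_large(question: str, steps: List[str]) -> str:
    return "Compute 2 + 2 carefully."


def toy_score(question: str, steps: List[str], draft: str) -> float:
    return 0.1 if "??" in draft else 0.9


if __name__ == "__main__":
    result = speculative_reasoning("What is 2 + 2?", toy_small, toy_large, toy_score)
    for source, step in zip(result.sources, result.steps):
        print(f"[{source}] {step}")
```

In this toy run the large model is invoked for only one of three steps. Under these assumptions, any latency saving depends on how often drafts are accepted and on scoring a draft being cheaper than generating the step with the large model.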