Practitioners of large language models (LLMs) commonly observe that outputs can vary for the same inputs, even under settings expected to be deterministic.
This work presents a systematic investigation into the non-determinism of five LLMs configured to be deterministic.
Accuracy variations of up to 15% were observed across naturally occurring runs, and the gap between best-possible and worst-possible performance reached up to 70%.
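The two quantities above can be computed from per-run, per-item correctness records: the observed variation is the spread of per-run accuracies, while the best-possible/worst-possible gap compares items any run got right against items every run got right. A minimal sketch (function name and data layout are illustrative, not from the study):

```python
def determinism_metrics(runs):
    """runs[r][i] is True if item i was answered correctly in run r.

    Returns (observed accuracy variation, best-worst possible gap).
    """
    n_items = len(runs[0])
    # Accuracy of each naturally occurring run.
    per_run_acc = [sum(run) / n_items for run in runs]
    # Spread between the best and worst observed runs.
    accuracy_variation = max(per_run_acc) - min(per_run_acc)
    # Best possible: count an item if any run answered it correctly.
    best = sum(any(r[i] for r in runs) for i in range(n_items)) / n_items
    # Worst possible: count an item only if every run answered it correctly.
    worst = sum(all(r[i] for r in runs) for i in range(n_items)) / n_items
    return accuracy_variation, best - worst

# Example: 3 runs over 4 items, each run scoring 50%.
runs = [
    [True, True, False, False],
    [True, False, True, False],
    [False, True, True, False],
]
# Observed variation is 0, yet the best-worst gap is 0.75:
# three items are solvable by some run, none by all runs.
```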
Because non-determinism in LLMs stems from techniques considered essential to the efficient use of compute resources, this issue is likely to persist.