<ul><li>Speculative decoding is a popular technique for accelerating Large Language Model (LLM) inference while maintaining text generation performance.</li><li>BanditSpec, a training-free online learning framework, is proposed to adaptively choose hyperparameter configurations during text generation.</li><li>BanditSpec formulates hyperparameter selection as a Multi-Armed Bandit problem and introduces two bandit-based algorithms, UCBSpec and EXP3Spec.</li><li>Empirical experiments show that UCBSpec and EXP3Spec select hyperparameters effectively for LLMs, with performance close to that of the best hyperparameters in real-life scenarios.</li></ul>
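
To make the bandit formulation above concrete, here is a minimal sketch of the textbook UCB1 rule choosing among hyperparameter configurations. It is not the paper's UCBSpec algorithm. The function names `ucb1_select` and `run_bandit`, the simulated acceptance rates, and the 0/1 reward (whether a drafted token is accepted) are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Pick an arm with the standard UCB1 rule (illustrative, not UCBSpec)."""
    # Try every arm once before trusting any estimate.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Otherwise maximize the empirical mean plus an exploration bonus.
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(accept_rates, rounds, seed=0):
    """Simulate arm selection; each arm stands for one hypothetical configuration."""
    rng = random.Random(seed)
    k = len(accept_rates)
    counts = [0] * k    # number of times each configuration was used
    means = [0.0] * k   # running mean reward per configuration
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, means, t)
        # Assumed reward: 1 if the drafted token is accepted, else 0.
        reward = 1.0 if rng.random() < accept_rates[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    counts, means = run_bandit([0.2, 0.5, 0.9], rounds=5000)
    print(counts, [round(m, 3) for m in means])
```

In this simulation the selection concentrates on the configuration with the highest acceptance rate as rounds accumulate, which mirrors the idea of adapting the choice online without any training.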

BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms
