<ul><li>This paper reviews Contextual Multi-Armed Bandit (CMAB) methods and introduces an experimental framework for scalable and interpretable offer selection in retail.</li><li>The framework models context at the product category level, which allows offers to span multiple categories and improves learning efficiency in dynamic environments.</li><li>It extends CMAB methodology to support multi-category contexts and achieves scalability through efficient feature engineering and a modular design.</li><li>The prototype stays interpretable at scale by combining logistic regression models with a large language model interface that tracks and explains evolving preferences in real time.</li></ul>
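
The category-level idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the class names `CategoryModel` and `MultiCategoryBandit`, the epsilon-greedy exploration rule, the SGD update, and the choice to score a multi-category offer by averaging its per-category probabilities are all assumptions made for illustration.

```python
import math
import random


class CategoryModel:
    """Hypothetical online logistic regression for one product category."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # one inspectable weight per context feature
        self.lr = lr

    def predict(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, reward):
        # One stochastic gradient step on the log-loss for a 0/1 reward.
        err = reward - self.predict(x)
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]


class MultiCategoryBandit:
    """Epsilon-greedy bandit whose arms (offers) may span several categories."""

    def __init__(self, categories, n_features, epsilon=0.1, seed=0):
        self.models = {c: CategoryModel(n_features) for c in categories}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def score(self, offer_categories, x):
        # Assumption: an offer's score is the mean of its category probabilities.
        probs = [self.models[c].predict(x) for c in offer_categories]
        return sum(probs) / len(probs)

    def select(self, offers, x):
        names = sorted(offers)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(names)  # explore
        return max(names, key=lambda o: self.score(offers[o], x))  # exploit

    def update(self, offer_categories, x, reward):
        # Feedback on an offer updates every category the offer spans.
        for c in offer_categories:
            self.models[c].update(x, reward)


if __name__ == "__main__":
    offers = {"A": ["dairy"], "B": ["snacks", "bakery"]}  # illustrative offers
    bandit = MultiCategoryBandit(["dairy", "snacks", "bakery"], n_features=2)
    context = [1.0, 0.5]  # illustrative customer features
    chosen = bandit.select(offers, context)
    bandit.update(offers[chosen], context, reward=1)
    print(chosen, bandit.models[offers[chosen][0]].w)
```

Because each category keeps its own small weight vector, feedback on one offer is shared with every other offer that touches the same categories, and the weights can be read directly to see which context features drive acceptance in a given category.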

Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype
