Lusifer is a simulation environment based on Large Language Models (LLMs), designed to generate dynamic, realistic user feedback for training reinforcement learning (RL)-based recommender systems.
At each interaction step, Lusifer uses an LLM to update the user profile and provides a transparent explanation of how and why the user's preferences evolve.
By processing textual metadata, Lusifer creates context-aware user states and simulates feedback on new items, reducing reliance on extensive historical data and facilitating adaptation to out-of-distribution cases.
Lusifer captures dynamic user responses and yields explainable results, making it a scalable and ethically sound alternative to live user experiments in RL-based recommender systems.
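
To make the interaction loop described above concrete, the following is a minimal sketch of a single simulation step: an LLM is asked to rate an item given the current textual profile, and is then asked to rewrite the profile in light of that feedback. It is not Lusifer's actual implementation. The names `UserState`, `simulate_step`, and `fake_llm`, the prompt wording, the `rating | explanation` reply format, and the 1 to 5 rating scale are all assumptions made for illustration, and the LLM is replaced by a deterministic stub so that the example runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical interface: any function that maps a prompt to a text completion.
LLM = Callable[[str], str]


@dataclass
class UserState:
    """Textual user profile plus a log of (item, rating, explanation) tuples."""
    profile: str
    history: List[Tuple[str, int, str]] = field(default_factory=list)


def simulate_step(llm: LLM, state: UserState, item_text: str) -> Tuple[int, str]:
    """Simulate feedback on one item, then update the profile in place."""
    # 1) Ask the LLM for a rating and a short explanation.
    #    Assumed reply format: "<integer rating> | <explanation>".
    rate_prompt = (
        "RATE\n"
        f"User profile: {state.profile}\n"
        f"Item description: {item_text}\n"
        "Answer as '<rating 1-5> | <one-sentence explanation>'."
    )
    rating_text, _, explanation = llm(rate_prompt).partition("|")
    rating = max(1, min(5, int(rating_text.strip())))  # clamp to the assumed 1-5 scale
    explanation = explanation.strip()

    # 2) Ask the LLM to rewrite the profile given the new interaction.
    update_prompt = (
        "UPDATE\n"
        f"Current profile: {state.profile}\n"
        f"Item description: {item_text}\n"
        f"Rating given: {rating}\n"
        f"Explanation: {explanation}\n"
        "Return the updated profile as plain text."
    )
    state.profile = llm(update_prompt).strip()
    state.history.append((item_text, rating, explanation))
    return rating, explanation


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a real LLM call (illustration only)."""
    if prompt.startswith("RATE"):
        return "4 | Matches the user's interest in science fiction."
    return "Enjoys science fiction; recently liked a space opera."


if __name__ == "__main__":
    state = UserState(profile="Enjoys science fiction.")
    rating, why = simulate_step(fake_llm, state, "A space opera about a lost colony.")
    print(rating, why)
    print(state.profile)
```

In an RL training loop, the returned rating would play the role of the reward signal and the updated profile text would form part of the next state; the stored explanations are what would make each preference change inspectable.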