Large-scale pre-training in machine learning has enabled foundation models that can be adapted and fine-tuned for specific tasks; a similar framework is now being applied to reinforcement learning to address core challenges such as sample efficiency and robustness.
In this context, a probabilistic model called the intention-conditioned flow occupancy model (InFOM) has been developed to predict which states an agent will visit in the future; it combines flow matching with a latent variable that captures the user's intention, leading to improved returns and success rates on benchmark tasks.
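To make the flow-matching idea concrete, here is a minimal, self-contained sketch of training a conditional vector field to transport noise samples toward "future state" samples, conditioned on a discrete latent intention code. All names, the toy data, and the linear model are illustrative assumptions, not the authors' implementation; InFOM itself uses learned neural networks and real agent trajectories.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative assumption, not the paper's actual data):
# "future states" x1 drawn from a 2-D Gaussian whose mean depends on a
# hypothetical discrete intention code z, and base noise samples x0.
n, d, k = 512, 2, 4
z = rng.integers(0, k, size=n)                      # intention codes
x1 = rng.normal(loc=z[:, None].astype(float), scale=0.3, size=(n, d))
x0 = rng.normal(size=(n, d))                        # base (noise) samples
t = rng.uniform(size=(n, 1))                        # flow-matching time

# Linear interpolation path x_t and its flow-matching target velocity x1 - x0
xt = (1 - t) * x0 + t * x1
target = x1 - x0

# Features [x_t, t, one-hot(z), 1]; fit a linear vector field by least squares
# as a stand-in for the neural vector field used in practice.
zoh = np.eye(k)[z]
phi = np.concatenate([xt, t, zoh, np.ones((n, 1))], axis=1)
W, *_ = np.linalg.lstsq(phi, target, rcond=None)
pred = phi @ W

mse = float(np.mean((pred - target) ** 2))
baseline = float(np.mean(target ** 2))  # error of predicting zero velocity
print(f"flow-matching MSE: {mse:.3f} (zero-predictor baseline: {baseline:.3f})")
```

Conditioning on the intention code lets the fitted vector field specialize per intention, which is why the regression error drops well below the zero-velocity baseline; the same principle, with expressive networks, underlies predicting intention-dependent future state occupancies.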
In experiments on 36 state-based and 4 image-based benchmark tasks, InFOM outperformed alternative pre-training methods, achieving a 1.8 times median improvement in returns and a 36% increase in success rates.
More details are available on the project website at https://chongyi-zheng.github.io/infom, and the code is available at https://github.com/chongyi-zheng/infom.