<ul><li>Pre-collected offline data from a different (source) environment can improve the efficiency of reinforcement learning (RL), but discrepancies between the source and target transition dynamics limit that benefit.</li><li>Existing methods address this issue by penalizing or filtering out source transitions in regions with a high dynamics gap, but their estimates of that gap can be unreliable.</li><li>CompFlow, the proposed method, addresses these limitations by using flow matching and optimal transport principles to model the target dynamics.</li><li>It offers improved generalization when learning the target dynamics and a principled estimate of the dynamics gap, and it outperforms strong baselines on RL benchmarks with shifted dynamics.</li></ul>

Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data
