Model-based offline reinforcement learning (MORL) focuses on learning a policy using a dynamics model from an existing dataset.
Existing MORL approaches generate trajectories that mimic the real data distribution, but they can produce unreliable trajectories because they overlook historical information.
A new MORL algorithm called Reliability-guaranteed Transformer (RT) is introduced to eliminate unreliable trajectories by assessing cumulative reliability.
The RT algorithm also generates high-return trajectories efficiently by sampling actions with high predicted rewards, and it has been shown to be effective on benchmark tasks.
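The summary above does not give implementation details, so the following is only an illustrative sketch of the two ideas it describes: discarding model rollouts whose cumulative reliability falls below a threshold, and picking high-reward candidate actions at each step. The functions `dynamics_model`, `reward_model`, and `reliability_estimate`, along with the horizon and threshold values, are hypothetical placeholders, not the paper's actual components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for learned components (not the paper's models).
def dynamics_model(state, action):
    # Toy dynamics: small drift plus noise.
    return state + 0.1 * action + 0.01 * rng.normal(size=state.shape)

def reward_model(state, action):
    # Toy reward: higher when the predicted next state is near the origin.
    return float(-np.sum((state + 0.1 * action) ** 2))

def reliability_estimate(state, action):
    # Placeholder: reliability decays as the state drifts from the data region.
    return float(np.exp(-0.5 * np.linalg.norm(state)))

def rollout(init_state, horizon=10, n_candidates=8, reliability_threshold=0.1):
    """Generate one model rollout, greedily picking the highest-reward
    candidate action at each step and tracking cumulative reliability."""
    state = init_state.copy()
    traj, total_return, cum_reliability = [], 0.0, 1.0
    for _ in range(horizon):
        # Sample candidate actions; keep the one with the highest predicted reward.
        candidates = rng.normal(size=(n_candidates, state.shape[0]))
        rewards = [reward_model(state, a) for a in candidates]
        action = candidates[int(np.argmax(rewards))]

        # Accumulate per-step reliability; discard the rollout if it drops too low.
        cum_reliability *= reliability_estimate(state, action)
        if cum_reliability < reliability_threshold:
            return None

        total_return += reward_model(state, action)
        traj.append((state, action))
        state = dynamics_model(state, action)
    return traj, total_return

# Keep only rollouts that survive the cumulative-reliability filter.
kept = [r for s in rng.normal(size=(20, 3)) if (r := rollout(s)) is not None]
print(f"kept {len(kept)} of 20 rollouts after the reliability filter")
```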