Compounding error, where small prediction mistakes accumulate over time, presents a major challenge in learning-based control.
Mitigating compounding error is important in model-based reinforcement learning and imitation learning.
Training multi-step predictors directly can help reduce compounding error and improve performance.
For linear dynamical systems, a well-specified single-step model achieves lower asymptotic prediction error, whereas direct multi-step predictors perform better when the model is misspecified and the system is only partially observed.
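The misspecified, partially observed case can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the source: the matrices `A` and `C`, the noise scale, the horizon `h`, and the helper `fit_scalar` are hypothetical. A two-state linear system is observed through its first coordinate only, so a first-order model on the observation is misspecified. The sketch fits a single-step model and rolls it out `h` steps, then fits a direct `h`-step predictor, and compares their in-sample errors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stable 2-state linear system; only the first state is observed.
A = np.array([[0.9, 0.5],
              [0.0, 0.7]])
C = np.array([1.0, 0.0])
T, h = 5000, 5  # trajectory length and prediction horizon (assumed values)

# Simulate x_{t+1} = A x_t + w_t and record y_t = C x_t.
x = np.zeros(2)
ys = np.empty(T)
for t in range(T):
    ys[t] = C @ x
    x = A @ x + rng.normal(scale=0.1, size=2)


def fit_scalar(inputs, targets):
    """Least-squares slope for targets ~ coef * inputs (no intercept)."""
    return float(inputs @ targets / (inputs @ inputs))


# Aligned views of the trajectory: y_t, y_{t+1}, and y_{t+h}.
y_now, y_next, y_future = ys[:-h], ys[1:T - h + 1], ys[h:]

a = fit_scalar(y_now, y_next)    # single-step model: y_{t+1} ~ a * y_t
b = fit_scalar(y_now, y_future)  # direct h-step model: y_{t+h} ~ b * y_t

# Rolling the single-step model out h times gives the predictor a**h * y_t.
mse_rollout = np.mean((a**h * y_now - y_future) ** 2)
mse_direct = np.mean((b * y_now - y_future) ** 2)

print(f"rolled-out single-step MSE: {mse_rollout:.5f}")
print(f"direct {h}-step MSE:         {mse_direct:.5f}")
```

On the data used for fitting, the direct predictor cannot do worse than the rollout, because `a**h` is just one candidate slope and `b` is the least-squares optimum over all slopes. The sketch says nothing about the well-specified case, where the source states that the single-step model has the lower asymptotic error.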