In reinforcement learning, regularization, data augmentation, and sparse reward functions are commonly used to promote simple behavior and improve generalization.
A new approach is introduced that maximizes the total correlation within the trajectories induced by the agent, with the aim of promoting simple behavior throughout the episode.
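For concreteness (using notation not given in the abstract, and assuming a trajectory is summarized by its states $s_{1:T}$), the total correlation of a trajectory is $\mathrm{TC}(s_{1:T}) = \sum_{t=1}^{T} H(s_t) - H(s_{1:T}) = D_{\mathrm{KL}}\!\big(p(s_{1:T}) \,\|\, \prod_{t=1}^{T} p(s_t)\big)$; it is large when the states visited within an episode are strongly statistically dependent on one another.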
A practical algorithm is proposed that optimizes the policy and a state representation model using a lower-bound approximation of this objective, yielding greater robustness and improved performance in simulated robot environments.
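As a sketch of how such a bound can arise (a generic construction, not necessarily the paper's exact derivation): by the chain rule, $\mathrm{TC}(s_{1:T}) = \sum_{t=2}^{T} I(s_t; s_{1:t-1})$, and each mutual-information term admits the standard variational lower bound $I(s_t; s_{1:t-1}) \ge H(s_t) + \mathbb{E}\big[\log q_\phi(s_t \mid s_{1:t-1})\big]$ for any hypothetical decoder $q_\phi$, which makes the objective amenable to gradient-based optimization of the policy and representation.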
The method naturally produces policies with periodic and compressible trajectories, and it withstands noise and changes in dynamics better than existing methods.