Sliding Puzzles Gym (SPGym) is a benchmark for evaluating visual representation learning in reinforcement learning (RL) agents; it recasts the classic 8-tile puzzle as a visual RL task.
SPGym lets researchers isolate and scale the visual representation challenge independently of other learning components: adjustable grid sizes and image pools control the complexity of representation learning, while the environment dynamics stay fixed.
Experiments with model-free and model-based RL algorithms on SPGym reveal limitations in handling visual diversity: the performance of every algorithm tested degrades as the pool of possible images grows.
The study highlights the need for improved visual representation learning techniques in RL and positions SPGym as a valuable tool for advancing robust and generalizable decision-making systems.
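
To make the design described above concrete, the following is a minimal sketch of a sliding-puzzle environment in which the move dynamics are fixed while the grid size and the size of the image pool are adjustable. It is not the SPGym API: the class name `SlidingPuzzleSketch`, its parameters (`grid_size`, `pool_size`, `patch`), the random stand-in images, and the sparse reward are all assumptions made for illustration.

```python
import numpy as np


class SlidingPuzzleSketch:
    """Hypothetical illustration only; names and reward are assumptions, not SPGym's."""

    # Action index -> (row delta, col delta) applied to the blank cell.
    MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, grid_size=3, pool_size=1, patch=8, seed=0):
        self.n = grid_size
        self.patch = patch
        self.rng = np.random.default_rng(seed)
        side = grid_size * patch
        # Stand-in image pool: random RGB images (a real pool would hold photos).
        self.pool = self.rng.integers(
            0, 256, size=(pool_size, side, side, 3), dtype=np.uint8
        )
        self.image = self.pool[0]
        self.board = np.arange(self.n * self.n).reshape(self.n, self.n)

    def reset(self, shuffle_moves=50):
        # A larger pool means more visual diversity across episodes.
        self.image = self.pool[self.rng.integers(len(self.pool))]
        # Tile k belongs at flat index k; the last index is the blank.
        self.board = np.arange(self.n * self.n).reshape(self.n, self.n)
        # Shuffling with legal moves keeps the puzzle solvable.
        for _ in range(shuffle_moves):
            self._slide(int(self.rng.integers(4)))
        return self.render()

    def _slide(self, action):
        blank = self.n * self.n - 1
        r, c = (int(v) for v in np.argwhere(self.board == blank)[0])
        dr, dc = self.MOVES[action]
        nr, nc = r + dr, c + dc
        # Moves that would push the blank off the grid are no-ops.
        if 0 <= nr < self.n and 0 <= nc < self.n:
            self.board[r, c], self.board[nr, nc] = (
                self.board[nr, nc],
                self.board[r, c],
            )

    def solved(self):
        return bool(np.array_equal(self.board.ravel(), np.arange(self.n * self.n)))

    def step(self, action):
        self._slide(action)
        done = self.solved()
        # Assumed sparse reward: 1.0 when solved, 0.0 otherwise.
        return self.render(), (1.0 if done else 0.0), done

    def render(self):
        # Build the pixel observation by placing each tile's image patch
        # at the tile's current position; the blank cell stays black.
        p, n = self.patch, self.n
        obs = np.zeros_like(self.image)
        blank = n * n - 1
        for r in range(n):
            for c in range(n):
                t = int(self.board[r, c])
                if t == blank:
                    continue
                sr, sc = divmod(t, n)
                obs[r * p:(r + 1) * p, c * p:(c + 1) * p] = self.image[
                    sr * p:(sr + 1) * p, sc * p:(sc + 1) * p
                ]
        return obs
```

In this sketch, changing `pool_size` alters only what the agent sees, not how tiles move, which is the kind of separation between visual complexity and dynamics that the text describes.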