Unsupervised skill discovery in reinforcement learning aims to learn diverse behaviors efficiently.
Existing methods focus on diversity through exploration, mutual information optimization, and temporal representation learning.
A new regret-aware method is proposed, framing skill discovery as a min-max game between skill generation and policy learning.
Experimental results show that the method outperforms baselines in both efficiency and skill diversity, achieving a 15% zero-shot improvement in high-dimensional environments.
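The min-max game described above can be sketched as a toy adversarial loop: a skill generator (max player) proposes the skill on which the current policy has the highest regret, and the policy (min player) then improves on that skill. This is a minimal illustrative sketch, not the paper's algorithm; the names `true_value`, `policy_value`, and the bandit-style update are hypothetical stand-ins for the actual value estimators and policy-gradient updates.

```python
import random

# Toy regret-aware skill discovery as a min-max game (illustrative only).
random.seed(0)

N_SKILLS = 5
# Hypothetical oracle return per skill (stands in for the best achievable value).
true_value = [random.uniform(0.5, 1.0) for _ in range(N_SKILLS)]
# Current policy's return per skill, initially untrained.
policy_value = [0.0] * N_SKILLS
LR = 0.5  # learning rate for the policy's improvement step

def regret(z):
    # Regret of skill z: gap between best achievable and current policy return.
    return true_value[z] - policy_value[z]

for step in range(50):
    # Max player: the generator proposes the skill with the highest regret.
    z = max(range(N_SKILLS), key=regret)
    # Min player: the policy improves on that skill, shrinking its regret.
    policy_value[z] += LR * (true_value[z] - policy_value[z])

max_regret = max(regret(z) for z in range(N_SKILLS))
print(f"max regret after training: {max_regret:.4f}")
```

Because the generator always targets the worst-case skill, training pressure is spread across all skills and the maximum regret is driven toward zero, which is the intuition behind framing discovery as a min-max game.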