<ul><li>We consider the problem of a learning agent playing a general sum game against an unknown strategic opponent.</li><li>The agent knows their own payoff function but is uncertain about the opponent's payoff distribution.</li><li>A polynomial-time algorithm is presented to maximize the agent's utility within a small epsilon of optimal utility.</li><li>When the algorithm is constrained to be a no-regret algorithm, an optimal learning algorithm can be constructed in polynomial time.</li></ul>

Learning to Play Against Unknown Opponents

Discover more