Beneficial actions taken by others pose a challenge for multi-agent reinforcement learning (MARL), especially when those actions are hidden from the agents they benefit.
The impact of such hidden gifts was studied in a simple MARL task: agents in a grid-world environment must each unlock an individual door for an individual reward, and an agent must drop its key, unseen by the others, for the group to obtain a larger collective reward.
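The task's reward structure can be sketched as a minimal environment. The class below is an illustrative simplification, not the paper's implementation: names, action set, reward magnitudes, and the single shared key are all assumptions, and the collective reward here goes only to the acting agent for brevity.

```python
class HiddenGiftTask:
    """Minimal sketch of the key-and-doors task (illustrative assumptions,
    not the paper's actual environment)."""

    def __init__(self, n_agents=2):
        self.n_agents = n_agents
        self.key_holder = 0                 # assumption: agent 0 starts holding the key
        self.unlocked = [False] * n_agents

    def step(self, agent, action):
        """Apply one action ('unlock', 'drop', 'pickup', 'noop') for `agent`.

        Returns (reward, done). Dropping the key is the "hidden gift": other
        agents benefit from it but do not observe who dropped it.
        """
        reward = 0.0
        if action == "unlock" and self.key_holder == agent and not self.unlocked[agent]:
            self.unlocked[agent] = True
            reward += 1.0                   # individual reward for one's own door
        elif action == "drop" and self.key_holder == agent:
            self.key_holder = None          # key becomes free for anyone to pick up
        elif action == "pickup" and self.key_holder is None:
            self.key_holder = agent
        done = all(self.unlocked)
        if done:
            reward += 10.0                  # larger collective reward (magnitude assumed)
        return reward, done
```

The key point the sketch captures is that the optimal policy requires an agent to give up the key after using it, an act that yields it no immediate reward.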
State-of-the-art RL algorithms, including MARL algorithms, struggled to learn how to achieve the collective reward in the task.
Independent, model-free policy-gradient agents could solve the task when given information about their own action history, whereas MARL agents could not. A correction term inspired by learning-aware approaches helped the independent agents converge to collective success more reliably.
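As a hedged illustration of the kind of independent agent described above, the sketch below conditions a tabular softmax policy on the agent's own action history and updates it with REINFORCE. The learning-aware correction term is not shown, since its form is not specified here; the class name, tabular parameterization, and learning rate are all assumptions.

```python
import math
import random

class HistoryConditionedREINFORCE:
    """Sketch of an independent policy-gradient agent whose only extra input
    is its own action history (a hypothetical simplification; the
    learning-aware correction term is omitted)."""

    def __init__(self, n_actions, lr=0.1):
        self.n_actions = n_actions
        self.lr = lr
        self.theta = {}  # softmax logits, keyed by action-history tuple

    def _logits(self, hist):
        return self.theta.setdefault(hist, [0.0] * self.n_actions)

    def policy(self, hist):
        logits = self._logits(hist)
        m = max(logits)                      # stabilize the softmax
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        return [e / z for e in exps]

    def act(self, hist, rng=random):
        return rng.choices(range(self.n_actions), weights=self.policy(hist))[0]

    def update(self, trajectory, ret):
        """REINFORCE: theta += lr * return * grad log pi(a | hist)."""
        for hist, a in trajectory:
            probs = self.policy(hist)
            logits = self._logits(hist)
            for i in range(self.n_actions):
                logits[i] += self.lr * ret * ((1.0 if i == a else 0.0) - probs[i])
```

Because the policy is keyed on the agent's own action history, the agent can learn that episodes in which it previously dropped the key tend to end with the collective reward, even though it never observes the other agents directly.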