<ul data-eligibleForWebStory="true"><li>In decentralized cooperative multi-armed bandits, agents exchange information to select arms so as to minimize their regret.</li><li>Cooperating agents can outperform a single agent that selects arms independently.</li><li>The study focuses on recovering this cooperative advantage when some agents are Byzantine, that is, able to send incorrect information.</li><li>The framework can model attackers in networks, instigators of offensive content, or financial manipulators.</li><li>A decentralized resilient upper confidence bound (UCB) algorithm is developed to handle Byzantine agents.</li><li>The algorithm combines an information-mixing step among agents with trimming of inconsistent and extreme values.</li><li>Each normal agent's regret is no worse than that of the single-agent UCB1 algorithm, and the normal agents together do better than in the non-cooperative case.</li><li>Each agent needs at least 3f+1 neighbors, where f is the maximum number of Byzantine agents in each agent's neighborhood.</li><li>The results are extended to time-varying graphs, and minimax lower bounds on the achievable regret are established.</li><li>Experiments support the framework's effectiveness in practice.</li></ul>
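The mix-and-trim idea in the summary above can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a plain trimmed mean as the filtering rule, and the function names (`trimmed_mean`, `has_enough_neighbors`, `ucb_index`) and the sample values are hypothetical, chosen only for illustration.

```python
import math


def has_enough_neighbors(num_neighbors, f):
    """Check the neighborhood condition from the summary: at least 3f+1 neighbors."""
    return num_neighbors >= 3 * f + 1


def trimmed_mean(values, f):
    """Drop the f smallest and f largest values, then average the rest.

    Assumption: a simple trimmed mean stands in for the paper's
    truncation of inconsistent and extreme values.
    """
    if len(values) <= 2 * f:
        raise ValueError("need more than 2f values to trim f from each end")
    kept = sorted(values)[f:len(values) - f]
    return sum(kept) / len(kept)


def ucb_index(mean_estimate, pulls, t):
    """Classic UCB1 index: mean estimate plus sqrt(2 ln t / pulls)."""
    return mean_estimate + math.sqrt(2.0 * math.log(t) / pulls)


# Hypothetical reward-mean estimates for one arm, received from 7 neighbors.
# The last value plays the role of a Byzantine report.
neighbor_estimates = [0.52, 0.48, 0.50, 0.51, 0.49, 0.47, 9.99]
f = 1  # assumed bound on Byzantine neighbors

if has_enough_neighbors(len(neighbor_estimates), f):
    filtered = trimmed_mean(neighbor_estimates, f)
    naive = sum(neighbor_estimates) / len(neighbor_estimates)
    print(f"trimmed estimate: {filtered:.2f}, naive average: {naive:.2f}")
    print(f"UCB index after 10 pulls at t=100: {ucb_index(filtered, 10, 100):.3f}")
```

With one corrupted report among seven, the naive average is pulled far from the honest values, while the trimmed estimate stays close to 0.50. The check on `3 * f + 1` only mirrors the neighbor condition stated in the summary; the regret guarantees themselves depend on the full algorithm described in the paper.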

Byzantine-Resilient Decentralized Multi-Armed Bandits
