techminis

A naukri.com initiative

Image Credit: arXiv

The Actor-Critic Update Order Matters for PPO in Federated Reinforcement Learning

  • Proximal Policy Optimization (PPO) faces challenges in Federated Reinforcement Learning (FRL) due to the update order of its actor and critic.
  • The conventional update order in PPO may cause heterogeneous gradient directions among clients, hindering convergence to a globally optimal policy in FRL.
  • FedRAC proposes reversing PPO's update order (actor first, then critic) so that each client's actor update is computed against the shared critic, eliminating the divergence among clients' critics.
  • Theoretical analysis and empirical results support that FedRAC achieves higher cumulative rewards and faster convergence compared to the conventional PPO update order in FRL scenarios.
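To make the order dependence concrete, here is a minimal toy sketch (not the paper's implementation): a scalar "actor" and "critic" per client, FedAvg-style averaging, and a flag that switches between the conventional order (critic first) and the reversed, FedRAC-style order (actor first). The update rules, learning rate, and client data below are hypothetical stand-ins for real PPO updates.

```python
def local_update(policy, value, td_target, lr=0.1, actor_first=True):
    """One local step on a toy scalar actor/critic pair (illustrative only)."""
    if actor_first:
        # Reversed (FedRAC-style) order: the advantage is computed from the
        # critic as received from the server, so every client's actor step
        # is taken against the same value estimate.
        advantage = td_target - value
        policy = policy + lr * advantage
        value = value + lr * (td_target - value)
    else:
        # Conventional order: the critic moves first, so the actor step is
        # taken against a locally drifted critic, which differs per client.
        value = value + lr * (td_target - value)
        advantage = td_target - value
        policy = policy + lr * advantage
    return policy, value

def federated_round(policy, value, td_targets, actor_first=True):
    """One communication round: local updates, then FedAvg aggregation."""
    updates = [local_update(policy, value, tgt, actor_first=actor_first)
               for tgt in td_targets]
    n = len(updates)
    return (sum(p for p, _ in updates) / n,
            sum(v for _, v in updates) / n)

# Heterogeneous clients: each sees a different TD target.
print(federated_round(0.0, 0.0, [1.0, 3.0, -1.0], actor_first=True))
print(federated_round(0.0, 0.0, [1.0, 3.0, -1.0], actor_first=False))
```

Even in this toy, the two orders produce different averaged policies from identical data, which is the mechanism the bullets above point at: with the conventional order, each client's actor gradient is taken with respect to a critic that has already drifted toward local data.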

