Image Credit: Arxiv

How Ensembles of Distilled Policies Improve Generalisation in Reinforcement Learning

  • In the zero-shot policy transfer setting in reinforcement learning, an agent is trained on a fixed set of environments and must then generalise to similar but unseen environments without any further training.
  • Distilling the policy after training can improve performance in the testing environments; the paper's theory suggests distilling into an ensemble of policies and using diverse training data for the distillation.
  • The paper proves a generalisation bound for policy distillation after training, which yields these practical insights for improved generalisation in reinforcement learning.
  • Empirically, an ensemble of policies distilled on a diverse dataset can generalise significantly better than the original agent (a minimal sketch of this recipe follows the list).
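
To make the recipe concrete, here is a minimal, hypothetical sketch (not the paper's exact algorithm): a trained teacher policy is distilled into several student policies, each fitted on its own bootstrap resample of states gathered from the training environments, and at test time the students' action distributions are averaged. The network sizes, the random teacher, and the synthetic state buffer are all stand-ins.

```python
# Hypothetical sketch of post-training distillation into an ensemble.
# Teacher, state buffer, and hyperparameters are placeholders, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, N_ACTIONS, N_STUDENTS = 8, 4, 5

def make_policy():
    # Small policy network mapping states to action logits.
    return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                         nn.Linear(64, N_ACTIONS))

teacher = make_policy()                 # stand-in for the trained RL agent
states = torch.randn(1024, STATE_DIM)   # stand-in for states from training envs

students = [make_policy() for _ in range(N_STUDENTS)]
for student in students:
    # Diversify the distillation data: each student sees its own bootstrap sample.
    idx = torch.randint(0, len(states), (len(states),))
    batch = states[idx]
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for _ in range(100):
        with torch.no_grad():
            target = F.softmax(teacher(batch), dim=-1)
        # Distillation loss: cross-entropy from teacher to student action distributions.
        loss = -(target * F.log_softmax(student(batch), dim=-1)).sum(-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

def ensemble_act(state):
    # Act greedily with respect to the mean of the students' action probabilities.
    probs = torch.stack([F.softmax(s(state), dim=-1) for s in students]).mean(0)
    return probs.argmax(-1)

print(ensemble_act(torch.randn(1, STATE_DIM)))
```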
