Image Credit: Arxiv

Can In-Context Reinforcement Learning Recover From Reward Poisoning Attacks?

  • Researchers studied the corruption robustness of in-context reinforcement learning, focusing on the Decision-Pretrained Transformer (DPT).
  • They introduced the Adversarially Trained Decision-Pretrained Transformer (AT-DPT) framework to defend the DPT against reward poisoning attacks.
  • AT-DPT jointly trains an attacker to minimize the DPT's true reward by poisoning environment rewards, and the DPT model to infer optimal actions from the poisoned data.
  • In bandit settings, AT-DPT outperformed standard bandit algorithms and robust baselines, even against adaptive attackers, and it remained robust in more complex environments beyond bandits.
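The alternating scheme the bullets describe can be sketched as a minimax loop: an attacker best-responds to the current learner by poisoning rewards, and the learner keeps the setting with the best worst-case true reward. The sketch below is illustrative only, not the paper's method: it replaces the transformer with a tiny trimmed-mean bandit learner, and all arm means, budgets, and parameter names are assumptions.

```python
import random

random.seed(0)

# Hypothetical 3-armed bandit; arm 2 is truly optimal.
TRUE_MEANS = [0.2, 0.5, 0.8]
N = 20  # observations per arm in the in-context dataset

def sample_history():
    """Clean in-context data: N noisy reward samples per arm."""
    return [[m + random.uniform(-0.02, 0.02) for _ in range(N)]
            for m in TRUE_MEANS]

def attack(history, f):
    """Reward-poisoning attacker: corrupt a fraction f of the samples,
    zeroing the optimal arm's rewards and inflating a decoy arm's."""
    k = int(f * N)
    poisoned = [list(rewards) for rewards in history]
    poisoned[2][:k] = [0.0] * k   # suppress the optimal arm
    poisoned[0][:k] = [1.0] * k   # inflate a decoy arm
    return poisoned

def choose_arm(history, trim):
    """Learner: pick the arm with the highest trimmed-mean reward.
    trim is the fraction dropped from each tail (trim=0 -> plain mean)."""
    k = int(trim * N)
    means = []
    for rewards in history:
        kept = sorted(rewards)[k:N - k] if k else rewards
        means.append(sum(kept) / len(kept))
    return max(range(len(means)), key=means.__getitem__)

# Alternating minimax loop, in the spirit of adversarial training:
# for each learner setting, the attacker best-responds; the learner
# then keeps the setting with the best worst-case true reward.
history = sample_history()
attack_strengths = [0.0, 0.2, 0.4]
trim_options = [0.0, 0.2, 0.4]

worst_case = {}
for t in trim_options:
    worst_case[t] = min(TRUE_MEANS[choose_arm(attack(history, f), t)]
                        for f in attack_strengths)

best_trim = max(trim_options, key=worst_case.__getitem__)
print("worst-case true reward per trim:", worst_case)
print("robust setting:", best_trim)
```

With these numbers, the plain mean (trim = 0) is fooled by the strongest attack into picking the decoy arm, while a trimmed estimator keeps selecting the truly optimal arm, so the loop settles on a nonzero trim. The paper's actual defense trains the DPT itself against a learned attacker rather than tuning a scalar.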

Read Full Article
