
Arxiv

Causal Policy Learning in Reinforcement Learning: Backdoor-Adjusted Soft Actor-Critic

  • Hidden confounders can bias policy learning in reinforcement learning algorithms by influencing both states and actions.
  • The paper proposes DoSAC (Do-Calculus Soft Actor-Critic with Backdoor Adjustment), which corrects for hidden confounding by estimating the effect of causal interventions.
  • DoSAC estimates the interventional policy using the backdoor criterion without needing access to true confounders or causal labels.
  • Empirical results on continuous control benchmarks demonstrate that DoSAC outperforms baselines under confounded settings, with improved robustness, generalization, and policy reliability.
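
To make the backdoor adjustment mentioned above concrete, here is a minimal sketch on a toy discrete model. It is not the DoSAC implementation. Unlike DoSAC, which the summary says works without access to the true confounders, this sketch assumes the confounder Z is fully observed. All probabilities, coefficients, and function names are hypothetical, chosen only for illustration. It compares the naive conditional estimate E[Y | A=a] with the backdoor-adjusted estimate E[Y | do(A=a)] = sum over z of P(z) * E[Y | A=a, Z=z].

```python
# Toy structural model (all numbers are illustrative assumptions):
#   Z ~ Bernoulli(0.5)              confounder
#   A | Z ~ Bernoulli(0.2 or 0.8)   action depends on Z
#   E[Y | A, Z] = 1.0*A + 2.0*Z     true causal effect of A on Y is 1.0

P_Z = {0: 0.5, 1: 0.5}            # P(Z = z)
P_A1_GIVEN_Z = {0: 0.2, 1: 0.8}   # P(A = 1 | Z = z)


def p_a_given_z(a, z):
    """P(A = a | Z = z) for binary a."""
    p1 = P_A1_GIVEN_Z[z]
    return p1 if a == 1 else 1.0 - p1


def expected_y(a, z):
    """E[Y | A = a, Z = z] under the toy model."""
    return 1.0 * a + 2.0 * z


def naive_conditional_mean(a):
    """E[Y | A = a]: weights Z by P(z | a), so the confounder leaks in."""
    p_a = sum(P_Z[z] * p_a_given_z(a, z) for z in P_Z)
    return sum(
        expected_y(a, z) * P_Z[z] * p_a_given_z(a, z) / p_a for z in P_Z
    )


def backdoor_adjusted_mean(a):
    """E[Y | do(A = a)]: weights Z by its marginal P(z) instead."""
    return sum(P_Z[z] * expected_y(a, z) for z in P_Z)


naive_effect = naive_conditional_mean(1) - naive_conditional_mean(0)
adjusted_effect = backdoor_adjusted_mean(1) - backdoor_adjusted_mean(0)
print(f"naive: {naive_effect:.2f}, backdoor-adjusted: {adjusted_effect:.2f}")
```

In this toy model the naive estimate comes out larger than the true effect of 1.0 because Z pushes both the action and the outcome in the same direction. Reweighting by the marginal P(z) removes that bias.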
