Source: Arxiv

Interpreting learned search: finding a transition model and value function in an RNN that plays Sokoban

  • Researchers reverse-engineered a convolutional recurrent neural network (RNN) trained using model-free reinforcement learning to play the game Sokoban.
  • The RNN was found to solve more levels with increased test-time compute, resembling classic bidirectional search.
  • The RNN represents planned moves as activations tied to specific directions at each square of the board.
  • These state-action activations act like a value function, determining when the network backtracks and which candidate plans survive pruning (the second sketch after this list shows a toy version of such pruning).
  • Specialized kernels extend these activations forward and backward to create paths, acting as a transition model (the first sketch after this list shows the classical bidirectional search this resembles).
  • The RNN deviates from classical search methods as it does not have a unified state representation; it addresses each box individually.
  • Each layer in the network has its own plan representation and value function, increasing search depth.
  • Overall, the mechanisms by which the network leverages test-time compute, although learned purely through model-free training, can be understood in familiar search terms.
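
As a reference point for the findings above, here is a minimal sketch of the kind of classical bidirectional search the RNN's behaviour is compared to: a transition model (the neighbours function) extends partial paths forward from the start and backward from the goal until the two frontiers meet. The 4x4 board, the level layout, and the helper names are assumptions made for this illustration, not details recovered from the network.

```python
# Minimal sketch of classical bidirectional search on a toy 4x4 grid.
# The board size, level layout, and helper names are illustrative assumptions.
from collections import deque

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
SIZE = 4                                     # 4x4 board (assumption)

def neighbours(pos, walls):
    """Transition model: squares reachable from pos in one step."""
    r, c = pos
    for dr, dc in MOVES:
        nr, nc = r + dr, c + dc
        if 0 <= nr < SIZE and 0 <= nc < SIZE and (nr, nc) not in walls:
            yield (nr, nc)

def bidirectional_search(start, goal, walls):
    """Grow partial paths forward from start and backward from goal,
    one BFS layer at a time, until the two frontiers meet."""
    if start == goal:
        return [start]
    fwd, bwd = {start: None}, {goal: None}   # node -> parent maps
    fwd_q, bwd_q = deque([start]), deque([goal])
    while fwd_q and bwd_q:
        for parents, other, queue in ((fwd, bwd, fwd_q), (bwd, fwd, bwd_q)):
            for _ in range(len(queue)):      # expand one layer of this frontier
                cur = queue.popleft()
                for nxt in neighbours(cur, walls):
                    if nxt in parents:
                        continue
                    parents[nxt] = cur
                    if nxt in other:         # frontiers meet: stitch the path
                        return _stitch(nxt, fwd, bwd)
                    queue.append(nxt)
    return None                              # no path exists

def _stitch(meet, fwd, bwd):
    """Join the forward and backward half-paths at the meeting square."""
    path, node = [], meet
    while node is not None:                  # walk back to the start
        path.append(node)
        node = fwd[node]
    path.reverse()
    node = bwd[meet]
    while node is not None:                  # walk on to the goal
        path.append(node)
        node = bwd[node]
    return path

if __name__ == "__main__":
    walls = {(1, 1), (1, 2), (1, 3)}
    print(bidirectional_search((0, 0), (2, 2), walls))
    # -> [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
```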

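A second, equally rough sketch of the pruning behaviour: a pool of candidate plans is scored by a value estimate, and only the highest-valued plans survive each round; dropping the rest is the point at which a search would backtrack to a surviving alternative. The value() heuristic, the beam width, and the list-of-squares plan representation are assumptions for this example, not the quantities measured inside the network.

```python
# Toy illustration of value-based pruning of candidate plans.
# The value() heuristic and plan representation are assumptions for this sketch.
import heapq

def value(plan, goal):
    """Crude value estimate: negative Manhattan distance from the plan's
    endpoint to the goal, minus a small cost per step already taken."""
    r, c = plan[-1]
    gr, gc = goal
    return -(abs(r - gr) + abs(c - gc)) - 0.1 * len(plan)

def prune(candidate_plans, goal, beam_width=2):
    """Keep only the beam_width highest-value plans; the rest are dropped,
    which is when a search would backtrack to a surviving alternative."""
    return heapq.nlargest(beam_width, candidate_plans,
                          key=lambda plan: value(plan, goal))

if __name__ == "__main__":
    goal = (2, 2)
    plans = [[(0, 0), (0, 1)],
             [(0, 0), (1, 0)],
             [(0, 0), (0, 1), (0, 2)],
             [(0, 0), (1, 0), (2, 0), (2, 1)]]
    for p in prune(plans, goal):
        print(p, round(value(p, goal), 2))
```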