Model-based reinforcement learning (RL) offers a solution to the data inefficiency of model-free RL algorithms: by learning a predictive world model of the environment, the agent can generate much of its training experience in imagination instead of through costly real interactions.
Drama, a new world model built on the Mamba state space model (SSM), trains efficiently on long sequences because its computation and memory scale linearly with sequence length.
This sidesteps both the vanishing gradients that hamper recurrent neural network (RNN) world models and the quadratic attention cost that makes capturing long-term dependencies expensive for transformer-based world models.
Drama achieves competitive performance on the Atari100k benchmark with a world model of only 7 million parameters, small enough to train on standard hardware.
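
To ground the efficiency claim, the sketch below shows the sequential form of a Mamba-style selective SSM scan, the recurrence underlying this class of world model. It is a minimal illustration, not Drama's implementation: the function name and all shapes and projections (`selective_ssm_scan`, `W_delta`, `W_B`, `W_C`) are assumptions for this example. The key property is that the state update is an elementwise product with input-dependent coefficients, so cost grows linearly with sequence length and the decay factors are recomputed at every step rather than backpropagated through a fixed recurrent matrix.

```python
import numpy as np

def selective_ssm_scan(x, A, W_delta, W_B, W_C):
    """Sequential form of a Mamba-style selective SSM scan (illustrative only).

    x:       (T, d) input sequence
    A:       (d, n) fixed state-transition coefficients, negative real for stability
    W_delta: (d, d) projection giving the input-dependent step size Delta_t
    W_B:     (d, n) projection giving the input-dependent B_t
    W_C:     (d, n) projection giving the input-dependent C_t
    Returns y: (T, d) output sequence.
    """
    T, d = x.shape
    n = A.shape[1]
    h = np.zeros((d, n))                         # one n-dim linear state per channel
    y = np.empty((T, d))
    for t in range(T):
        xt = x[t]
        delta = np.logaddexp(0.0, xt @ W_delta)  # softplus keeps Delta_t positive
        B_t, C_t = xt @ W_B, xt @ W_C            # input-dependent ("selective") parameters
        A_bar = np.exp(delta[:, None] * A)       # ZOH discretisation; entries lie in (0, 1)
        # Linear-time update: elementwise decay plus input injection, O(1) memory per step.
        h = A_bar * h + (delta[:, None] * B_t[None, :]) * xt[:, None]
        y[t] = (h * C_t[None, :]).sum(axis=-1)   # read out the state through C_t
    return y

# Toy usage: a length-512 sequence of 32-dim latents.
rng = np.random.default_rng(0)
T, d, n = 512, 32, 16
A = -np.exp(rng.normal(size=(d, n)))             # negative real part keeps the scan stable
y = selective_ssm_scan(rng.normal(size=(T, d)), A,
                       0.1 * rng.normal(size=(d, d)),
                       0.1 * rng.normal(size=(d, n)),
                       0.1 * rng.normal(size=(d, n)))
print(y.shape)  # (512, 32)
```

In practice Mamba implements this recurrence as a hardware-efficient parallel scan rather than a Python loop, which is what makes training on long sequences practical.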