Traffic simulation aims to learn a policy for traffic agents that, when unrolled in closed loop, faithfully recovers the joint distribution of trajectories observed in the real world.
Tokenized multi-agent policies have become the state of the art in traffic simulation, but they suffer from covariate shift when executed in closed loop: small errors compound over the rollout and drive the policy into states that are absent from its training data.
Closest Among Top-K (CAT-K) rollouts are introduced as a fine-tuning strategy that mitigates covariate shift and thereby improves the performance of tokenized traffic simulation policies.
A policy fine-tuned with CAT-K outperforms larger models on the Waymo Sim Agent Challenge leaderboard and takes the top spot.
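
The name "Closest Among Top-K" suggests a selection rule that can be sketched in a few lines. The following is only an illustrative reading of that name, not the authors' implementation: it assumes each token encodes a 2D displacement, and that during a fine-tuning rollout the agent takes, among the K tokens the policy rates most likely, the one that lands nearest the ground-truth next position. The function `cat_k_select` and all of its parameters are hypothetical.

```python
import numpy as np


def cat_k_select(logits, token_displacements, current_pos, gt_next_pos, k=3):
    """Pick, among the k most likely tokens, the one closest to the ground truth.

    Hypothetical sketch: assumes every token is a 2D displacement that is
    added to the agent's current position.
    """
    k = min(k, logits.shape[0])
    # Indices of the k largest logits (their internal order does not matter).
    top_k = np.argpartition(logits, -k)[-k:]
    # Positions the agent would reach with each candidate token.
    candidates = current_pos + token_displacements[top_k]
    # Euclidean distance from each candidate position to the ground truth.
    dists = np.linalg.norm(candidates - gt_next_pos, axis=1)
    return int(top_k[np.argmin(dists)])


# Toy vocabulary of four displacement tokens (assumed values).
logits = np.array([0.1, 2.0, 1.5, -1.0])
token_displacements = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]])
current_pos = np.array([0.0, 0.0])
gt_next_pos = np.array([1.0, 0.0])

chosen = cat_k_select(logits, token_displacements, current_pos, gt_next_pos, k=2)
```

Under this reading, K = 1 reduces to always following the policy's most likely token, while K equal to the vocabulary size always picks the token nearest the ground truth regardless of what the policy prefers. Intermediate values keep the rollout both likely under the policy and close to the logged trajectory.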