Researchers propose RL-AVNS, a reinforcement-learning-based adaptive variable neighborhood search method for the Vehicle Routing Problem with Multiple Time Windows (VRPMTW).
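To make the "multiple time windows" constraint concrete, the sketch below computes when service can start at a customer that accepts deliveries in any one of several disjoint windows. The function name `earliest_service_start` and the `(open, close)` tuple representation are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: a window is an (open, close) pair of times.
Window = Tuple[float, float]


def earliest_service_start(arrival: float, windows: List[Window]) -> Optional[float]:
    """Return the earliest feasible service start, or None if every window is missed."""
    best: Optional[float] = None
    for open_t, close_t in windows:
        if arrival <= close_t:
            # The vehicle waits until the window opens if it arrives early.
            start = max(arrival, open_t)
            if best is None or start < best:
                best = start
    return best


windows = [(8.0, 10.0), (14.0, 16.0)]
print(earliest_service_start(9.0, windows))   # 9.0: arrival falls inside the first window
print(earliest_service_start(11.0, windows))  # 14.0: vehicle waits for the second window
print(earliest_service_start(17.0, windows))  # None: all windows have closed
```

With a single window per customer this reduces to the usual time-window check; with several windows, a late arrival may still be feasible at the cost of waiting, which is why the choice of neighborhood move interacts strongly with timing.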
The method integrates reinforcement learning to dynamically select neighborhood operators based on real-time solution states and learned experience.
A transformer-based neural policy network guides operator selection during local search.
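The idea of learning which neighborhood operator to apply can be illustrated with a much simpler stand-in. The sketch below is not the paper's method: a state-independent softmax over learned operator preferences (a gradient-bandit update) replaces the transformer policy, and a toy single-route distance objective replaces VRPMTW. All names (`adaptive_search`, `OPERATORS`, the three operators, the learning rate) are assumptions chosen for illustration.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Tour = List[int]


def tour_length(tour: Sequence[int], pts: Sequence[Point]) -> float:
    """Length of the closed tour that visits pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def swap(tour: Tour, rng: random.Random) -> Tour:
    """Exchange two randomly chosen positions."""
    i, j = rng.sample(range(len(tour)), 2)
    new = tour[:]
    new[i], new[j] = new[j], new[i]
    return new


def two_opt(tour: Tour, rng: random.Random) -> Tour:
    """Reverse a randomly chosen segment."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def relocate(tour: Tour, rng: random.Random) -> Tour:
    """Remove one node and reinsert it at a random position."""
    new = tour[:]
    node = new.pop(rng.randrange(len(new)))
    new.insert(rng.randrange(len(new) + 1), node)
    return new


# Hypothetical operator pool; the paper's actual operators are not listed here.
OPERATORS: List[Callable[[Tour, random.Random], Tour]] = [swap, two_opt, relocate]


def softmax(prefs: Sequence[float]) -> List[float]:
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_search(pts: Sequence[Point], iters: int = 2000, lr: float = 0.1,
                    seed: int = 0) -> Tuple[Tour, float, List[float]]:
    """Local search whose operator choice is learned from observed improvements."""
    rng = random.Random(seed)
    tour = list(range(len(pts)))
    cost = tour_length(tour, pts)
    prefs = [0.0] * len(OPERATORS)  # learned preference per operator
    baseline = 0.0                  # running average reward
    for _ in range(iters):
        probs = softmax(prefs)
        k = rng.choices(range(len(OPERATORS)), weights=probs)[0]
        cand = OPERATORS[k](tour, rng)
        cand_cost = tour_length(cand, pts)
        reward = max(0.0, cost - cand_cost)  # reward is the improvement, if any
        # Gradient-bandit update: raise the chosen operator's preference when
        # its reward beats the baseline, and lower the others accordingly.
        for a in range(len(prefs)):
            indicator = 1.0 if a == k else 0.0
            prefs[a] += lr * (reward - baseline) * (indicator - probs[a])
        baseline += 0.05 * (reward - baseline)
        if cand_cost < cost:  # accept improving moves only
            tour, cost = cand, cand_cost
    return tour, cost, softmax(prefs)


point_rng = random.Random(42)
points = [(point_rng.random(), point_rng.random()) for _ in range(12)]
tour, cost, probs = adaptive_search(points)
print(round(cost, 3), [round(p, 2) for p in probs])
```

The returned probabilities show which operators the search came to favor on this instance. A policy network such as the one described above would instead condition the choice on features of the current solution, so the preferred operator can change as the search progresses.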
Experiments show that RL-AVNS outperforms traditional methods, achieving significant improvements in solution quality and computational efficiency across various scenarios.