This paper introduces PROPEL, a framework that combines optimization with supervised and Deep Reinforcement Learning (DRL) for large-scale Supply Chain Planning (SCP) optimization problems.
PROPEL uses supervised learning to identify the variables that are fixed to zero in the optimal solution, and DRL to select which of these fixed variables must be relaxed to improve solution quality.
The framework has been applied to industrial SCP optimizations with millions of variables, leading to significant improvements in solution times and quality.
The computational results show a 60% reduction in primal integral and an 88% reduction in primal gap, with improvement factors of up to 13.57 and 15.92, respectively.
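
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how primal gap and primal integral are commonly computed, assuming the standard definitions from the mixed-integer programming literature (a gap normalized to [0, 1], integrated as a step function over the time limit). The exact variants used in the experiments may differ. The function names and the sample incumbent trajectory are hypothetical and serve only as illustration; they are not results from the paper.

```python
from typing import List, Tuple


def primal_gap(obj: float, best_known: float) -> float:
    """Normalized primal gap in [0, 1] between an incumbent and a reference value."""
    if obj == 0.0 and best_known == 0.0:
        return 0.0
    if obj * best_known < 0.0:
        # Opposite signs: the gap is defined as 1 by convention.
        return 1.0
    return abs(obj - best_known) / max(abs(obj), abs(best_known))


def primal_integral(
    incumbents: List[Tuple[float, float]],
    best_known: float,
    time_limit: float,
) -> float:
    """Integrate the primal gap as a step function over [0, time_limit].

    `incumbents` holds (time_found, objective) pairs sorted by time.
    The gap is taken to be 1 until the first incumbent is found.
    """
    integral = 0.0
    prev_time = 0.0
    prev_gap = 1.0
    for time_found, obj in incumbents:
        if time_found > time_limit:
            break
        integral += prev_gap * (time_found - prev_time)
        prev_time = time_found
        prev_gap = primal_gap(obj, best_known)
    # Last incumbent's gap holds until the time limit.
    integral += prev_gap * (time_limit - prev_time)
    return integral


# Hypothetical trajectory: incumbents found at 10 s, 40 s, and 70 s.
trajectory = [(10.0, 200.0), (40.0, 125.0), (70.0, 100.0)]
result = primal_integral(trajectory, best_known=100.0, time_limit=100.0)
```

A smaller primal integral means that good solutions were found earlier in the run, which is why it complements the final primal gap as a measure of solution quality over time.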