Boolean Satisfiability (SAT) solvers traditionally rely on hand-crafted heuristics for performance.
A new approach called Reinforcement Learning from Algorithm Feedback (RLAF) uses Graph Neural Networks (GNNs) to guide SAT solver branching heuristics.
RLAF leverages a GNN to assign variable weights and polarities in a single pass, training it through reinforcement learning using solver's computational cost as the reward.
Evaluations show that RLAF-trained policies reduce solve times of different SAT solvers, offering more than 2x speedup in some cases and outperforming expert-supervised methods.