Entropy-Reinforced Planning (ERP) is a new approach to drug discovery with large language models (LLMs).
LLMs are effective at generating molecules, but their reliance on prior experience (patterns learned during training) can yield invalid or suboptimal compounds.
ERP augments the Transformer decoding process with an entropy-reinforced planning algorithm that balances exploitation and exploration.
Experimental results show that ERP outperforms the current state-of-the-art algorithm and other baselines on multiple benchmarks, including SARS-CoV-2 and human cancer target proteins.
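To make the exploitation/exploration balance in the decoding step more concrete, the following is a minimal sketch of an entropy-augmented selection rule for tree-search decoding. It is not the ERP algorithm as published: the PUCT-style scoring form, the sign and weight of the entropy term, and every name here (`shannon_entropy`, `selection_score`, `select_child`, `c_explore`, `c_entropy`, and the dictionary fields) are hypothetical choices made for illustration only.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def selection_score(mean_reward, prior, parent_visits, child_visits,
                    child_entropy, c_explore=1.0, c_entropy=0.5):
    """Score one candidate token (assumed form, not taken from the source).

    exploit: average reward observed below this token so far.
    explore: PUCT-style bonus that shrinks as the token is visited more.
    entropy: bonus for continuations where the model is still uncertain.
    """
    exploit = mean_reward
    explore = c_explore * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return exploit + explore + c_entropy * child_entropy


def select_child(children, parent_visits):
    """Return the token whose child node has the highest selection score."""
    best = max(
        children,
        key=lambda ch: selection_score(
            ch["mean_reward"],
            ch["prior"],
            parent_visits,
            ch["visits"],
            shannon_entropy(ch["next_probs"]),
        ),
    )
    return best["token"]


# Two hypothetical candidate tokens with equal reward, prior, and visit count.
# They differ only in how uncertain the model is about what follows them.
children = [
    {"token": "C", "mean_reward": 0.5, "prior": 0.5, "visits": 1,
     "next_probs": [1.0]},
    {"token": "N", "mean_reward": 0.5, "prior": 0.5, "visits": 1,
     "next_probs": [0.25, 0.25, 0.25, 0.25]},
]
chosen = select_child(children, parent_visits=2)
```

In this toy setup the two candidates tie on the exploitation and visit-count terms, so the entropy term alone decides the choice. Raising `mean_reward` for one candidate far enough shifts the choice back toward exploitation.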