<ul><li>Large language models (LLMs) are increasingly used in molecular science to support scientific discovery.</li><li>A new framework, CLEANMOL, has been introduced to improve LLMs' understanding of molecular structures encoded as SMILES strings.</li><li>CLEANMOL formulates SMILES parsing as a set of clean, deterministic tasks designed to promote graph-level molecular comprehension.</li><li>Results show that pre-training LLMs on CLEANMOL tasks improves structural comprehension, and the resulting models perform competitively on the Mol-Instructions benchmark.</li></ul>
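
To make "clean and deterministic" concrete, here is a minimal sketch of what one such SMILES parsing task could look like. The summary above does not list CLEANMOL's actual tasks, so ring-closure counting is used purely as an illustrative assumption, and the names `count_ring_closures` and `make_task` are hypothetical. The parser handles only a simplified SMILES subset (single-digit and `%nn` ring labels, bracket atoms skipped); real pipelines would more likely rely on a cheminformatics toolkit.

```python
def count_ring_closures(smiles: str) -> int:
    """Count ring-closure bonds in a SMILES string (simplified subset).

    Hypothetical helper for illustration; not taken from CLEANMOL.
    """
    open_labels = set()  # ring labels that have been opened but not yet closed
    closures = 0
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            # Digits inside brackets are isotopes, H counts, or charges,
            # not ring labels, so skip the whole bracket atom.
            i = smiles.index("]", i) + 1
            continue
        if ch == "%":
            # Two-digit ring label written as %nn.
            label = smiles[i + 1 : i + 3]
            i += 3
        elif ch.isdigit():
            label = ch
            i += 1
        else:
            i += 1
            continue
        if label in open_labels:
            open_labels.remove(label)
            closures += 1
        else:
            open_labels.add(label)
    if open_labels:
        raise ValueError(f"unclosed ring labels: {sorted(open_labels)}")
    return closures


def make_task(smiles: str) -> dict:
    """Build a question/answer pair whose answer is fully determined by the input."""
    return {
        "question": f"How many ring closures does the SMILES {smiles} contain?",
        "answer": str(count_ring_closures(smiles)),
    }


if __name__ == "__main__":
    for s in ["CCO", "c1ccccc1", "c1ccc2ccccc2c1"]:
        print(make_task(s))
```

Because the answer is computed directly from the string, every example has exactly one correct label and needs no human annotation, which is the property the word "deterministic" points to.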

Improving Chemical Understanding of LLMs via SMILES Parsing
