Virtual screening plays a critical role in modern drug discovery by enabling the identification of promising candidate molecules for experimental validation.
Traditional machine learning methods such as support vector machines (SVM) and XGBoost rely on predefined molecular representations, often leading to information loss and potential bias.
In contrast, utilizing Graph Convolutional Networks (GCNs) and Large Language Models (LLMs) can provide a more expressive and unbiased alternative by operating directly on molecular graphs and capturing complex chemical patterns.
A hybrid architecture that integrates GCNs with LLM-derived embeddings has been proposed, achieving superior results and outperforming standalone GCN, XGBoost, and SVM baselines.