This study proposes a Graph Neural Network (GNN) model that is pre-trained on molecules without human annotations or prior knowledge.
Previous pre-training methods rely on functional groups, whereas this approach aims to capture graph-level distinctions.
The proposed method, called Subgraph-conditioned Graph Information Bottleneck (S-CGIB), generates well-distinguished graph-level representations and discovers functional groups.
Experiments show the superiority of the S-CGIB approach on molecule datasets across different domains.
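For orientation, the general information bottleneck principle that the method's name refers to can be written as below. This is only the generic textbook form, not the S-CGIB objective itself, which is not given in this summary. The symbols are assumptions introduced for illustration: $G$ is an input graph, $Z$ its learned representation, $Y$ the quantity the representation should remain informative about, $I(\cdot\,;\cdot)$ is mutual information, and $\beta > 0$ is a trade-off weight.

\[
\min_{p(Z \mid G)} \; I(G; Z) \;-\; \beta \, I(Z; Y)
\]

The first term compresses the input by penalizing information that $Z$ retains about $G$, while the second rewards information that $Z$ keeps about $Y$. How S-CGIB conditions this trade-off on subgraphs, and what plays the role of $Y$ when no annotations are available, are specified in the paper rather than here.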