NeuroLIP is a novel cross-modal contrastive learning framework that integrates functional magnetic resonance imaging (fMRI) connectivity data with phenotypic textual descriptors. The framework improves interpretability through token-level attention maps, which reveal associations between brain regions and diseases. NeuroLIP achieves superior fairness metrics while maintaining the best overall performance on standard metrics.
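
To make the idea of cross-modal contrastive learning concrete, the sketch below computes a generic symmetric contrastive (CLIP-style InfoNCE) loss between paired brain and text embeddings. It is an illustration only: the summary above does not specify NeuroLIP's actual objective, encoders, or hyperparameters, so the function names, the `temperature` value of 0.07, and the toy vectors are all assumptions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_contrastive_loss(brain_emb, text_emb, temperature=0.07):
    """Generic CLIP-style loss (an assumption, not NeuroLIP's confirmed objective).

    brain_emb[i] and text_emb[i] are treated as a matched pair, e.g. an
    embedding of one subject's fMRI connectivity and an embedding of that
    subject's phenotypic description. All other pairings in the batch act
    as negatives.
    """
    b = [l2_normalize(v) for v in brain_emb]
    t = [l2_normalize(v) for v in text_emb]
    n = len(b)

    # Pairwise cosine similarities, sharpened by the temperature.
    logits = [
        [sum(x * y for x, y in zip(bi, tj)) / temperature for tj in t]
        for bi in b
    ]

    # Brain-to-text direction: each brain embedding should pick its own text.
    loss_b2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Text-to-brain direction: same computation on the transposed matrix.
    logits_t = [list(col) for col in zip(*logits)]
    loss_t2b = -sum(log_softmax(logits_t[i])[i] for i in range(n)) / n

    return 0.5 * (loss_b2t + loss_t2b)


# Toy 2-D embeddings standing in for encoder outputs (hypothetical values).
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
mismatched = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                        [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss is lower when each brain embedding points in the same direction as its paired text embedding than when the pairs are swapped, which is the behavior a contrastive objective is meant to reward.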