High-content screening (HCS) assays based on Cell Painting enable the study of cells' responses to perturbations on a large scale.
Recent advancements in cross-modal contrastive learning can be used to align perturbations with their morphological effects in a unified latent space.
CellCLIP, a cross-modal contrastive learning framework for HCS data, leverages pre-trained image encoders together with a novel channel encoding scheme to better capture relationships among different microscopy channels, along with natural language encoders for representing perturbations.
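To make the cross-modal alignment concrete, the sketch below shows a generic CLIP-style symmetric contrastive (InfoNCE) objective over paired image and perturbation embeddings. This is a minimal illustration of the general technique, not CellCLIP's exact loss or architecture; the function and variable names are illustrative assumptions.

```python
# A minimal sketch (not CellCLIP's implementation) of a CLIP-style
# cross-modal contrastive objective that aligns image embeddings with
# perturbation (text) embeddings in a shared latent space.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor,
                          pert_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    image_emb: (batch, dim) embeddings from the image encoder
    pert_emb:  (batch, dim) embeddings from the perturbation (text) encoder
    """
    # L2-normalize so dot products become cosine similarities
    image_emb = F.normalize(image_emb, dim=-1)
    pert_emb = F.normalize(pert_emb, dim=-1)

    # Pairwise similarity matrix, scaled by the temperature
    logits = image_emb @ pert_emb.t() / temperature

    # Matching image/perturbation pairs lie on the diagonal
    targets = torch.arange(logits.size(0), device=logits.device)

    # Average the image->perturbation and perturbation->image losses
    loss_i2p = F.cross_entropy(logits, targets)
    loss_p2i = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_i2p + loss_p2i)

# Example usage with random placeholder embeddings
if __name__ == "__main__":
    imgs = torch.randn(8, 256)   # image-encoder outputs
    perts = torch.randn(8, 256)  # perturbation-encoder outputs
    print(clip_contrastive_loss(imgs, perts).item())
```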
CellCLIP outperforms current open-source models on both cross-modal retrieval and downstream biological tasks, while also achieving notable reductions in computation time.