Researchers propose COSMIC (Clique-Oriented Semantic Multi-space Integration for CLIP), a test-time adaptation framework for vision-language models (VLMs). COSMIC improves adaptability through multi-granular, cross-modal semantic caching and graph-based querying mechanisms. The framework introduces a Dual Semantics Graph (DSG) that captures rich semantic relationships by incorporating textual features, coarse-grained CLIP features, and fine-grained DINOv2 features. A second component, the Clique Guided Hyper-class, leverages structured class relationships to make predictions more robust.
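
To make the idea of graph-based querying over a feature cache more concrete, the following is a minimal toy sketch, not the COSMIC implementation. It assumes that all cached features already live in one shared, L2-normalized embedding space, that edges connect nodes whose cosine similarity exceeds a fixed threshold, and that a query is answered by the maximal clique whose centroid is closest to it. The function names (`build_similarity_graph`, `maximal_cliques`, `query_cliques`), the threshold of 0.8, and the toy vectors are all hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize(x):
    """L2-normalize vectors along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def build_similarity_graph(feats, threshold):
    """Boolean adjacency linking nodes whose cosine similarity exceeds threshold."""
    sims = feats @ feats.T
    adj = sims > threshold
    np.fill_diagonal(adj, False)  # no self-loops
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    n = adj.shape[0]
    neighbors = [set(np.flatnonzero(adj[i]).tolist()) for i in range(n)]
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & neighbors[v], x & neighbors[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(range(n)), set())
    return cliques


def query_cliques(feats, labels, cliques, q, num_classes):
    """Pick the clique whose centroid is closest to q, then score classes
    by the summed similarity of that clique's members to q."""
    centroids = normalize(np.stack([feats[c].mean(axis=0) for c in cliques]))
    best = int(np.argmax(centroids @ q))
    members = cliques[best]
    weights = feats[members] @ q
    scores = np.zeros(num_classes)
    for m, w in zip(members, weights):
        scores[labels[m]] += w
    return best, scores


# Toy cache: three nodes near the first axis (class 0), two near the second (class 1).
feats = normalize(np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.95, 0.0, 0.05],
    [0.0, 1.0, 0.0],
    [0.1, 0.9, 0.0],
]))
labels = [0, 0, 0, 1, 1]

adj = build_similarity_graph(feats, threshold=0.8)  # threshold is an assumed value
cliques = maximal_cliques(adj)

q = normalize(np.array([0.05, 1.0, 0.0]))  # a query close to the class-1 nodes
best, scores = query_cliques(feats, labels, cliques, q, num_classes=2)
predicted = int(np.argmax(scores))
```

In this toy setup the cache splits into two cliques, and the query is matched to the one containing the class-1 nodes. A framework like the one described would presumably populate such a graph from several feature sources at test time; how it reconciles the different feature spaces is not specified here, so the single-space simplification above should be read purely as an assumption.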