- GLM-4.1V-Thinking is a vision-language model (VLM) that aims to advance multimodal reasoning.
- The model uses Reinforcement Learning with Curriculum Sampling (RLCS) to enhance its capabilities across a broad range of tasks.
- GLM-4.1V-9B-Thinking, an open-source version, achieves state-of-the-art performance on multiple benchmarks.
- It surpasses similar-sized models and even outperforms larger models on tasks such as STEM reasoning and long document understanding.
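
To make the open-source release concrete, below is a minimal inference sketch using the Hugging Face transformers library. The repository id `THUDM/GLM-4.1V-9B-Thinking`, the use of `trust_remote_code`, the example image URL, and the generation settings are all assumptions based on common VLM release conventions, not details confirmed by this summary.

```python
# Minimal sketch, assuming the model is published on Hugging Face under
# an id like THUDM/GLM-4.1V-9B-Thinking with a chat-template-aware processor.
import torch
from transformers import AutoModelForCausalLM, AutoProcessor

MODEL_ID = "THUDM/GLM-4.1V-9B-Thinking"  # assumed repo id

processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# One multimodal chat turn: an image plus a STEM-style question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/diagram.png"},  # placeholder image
            {"type": "text", "text": "Explain the circuit in this diagram step by step."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(processor.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```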