<ul><li>Vision-Language Models (VLMs) have improved markedly in recent years, but their large size makes them hard to deploy in latency-sensitive real-world applications.</li><li>To address this, FREE (Fast and Robust Vision Language Models with Early Exits) applies Early Exit (EE) strategies to VLMs, training them adversarially within a GAN-based framework.</li><li>FREE performs input-adaptive inference, which increases inference speed with minimal performance drop. It trains exit classifiers inside the VLM to improve accuracy and robustness while reducing overthinking and mid-crisis instances.</li><li>Experiments show that FREE speeds up inference by more than 1.51x while maintaining comparable performance. The source code is available on GitHub.</li></ul>
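
The input-adaptive idea behind early exits can be illustrated with a minimal sketch of generic confidence-threshold exiting. This is not FREE's implementation: the summary does not specify its exit criterion, and the GAN-based adversarial training is not shown. The function names, the 0.9 threshold, and the layer-count speedup proxy are all assumptions made for illustration.

```python
from typing import Iterable, Sequence, Tuple


def early_exit_predict(
    exit_probs: Iterable[Sequence[float]],
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> Tuple[int, int]:
    """Return (predicted_class, exit_layer) for one input.

    exit_probs yields, layer by layer, the class-probability vector of a
    hypothetical exit classifier attached after that layer. Passing a
    generator means deeper layers are never evaluated after an exit.
    """
    layer = 0
    probs = None
    for layer, probs in enumerate(exit_probs, start=1):
        if max(probs) >= threshold:
            break  # confident enough: stop here and skip deeper layers
    if probs is None:
        raise ValueError("need at least one exit classifier output")
    # Either we broke out early or we ran out of layers; in both cases
    # `probs` holds the output of the last layer that was evaluated.
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted, layer


def layer_speedup(exit_layers: Sequence[int], total_layers: int) -> float:
    """Crude speedup proxy: total depth divided by mean exit depth."""
    return total_layers / (sum(exit_layers) / len(exit_layers))


if __name__ == "__main__":
    easy = [[0.05, 0.95], [0.02, 0.98], [0.01, 0.99]]
    hard = [[0.55, 0.45], [0.60, 0.40], [0.30, 0.70]]
    print(early_exit_predict(easy))  # exits at the first layer
    print(early_exit_predict(hard))  # falls through to the final layer
```

An easy input leaves at a shallow exit while a hard one runs the full depth, which is where the average speedup comes from. The layer-count ratio ignores the cost of the exit classifiers themselves, so it is only a rough proxy for wall-clock speedup.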

FREE: Fast and Robust Vision Language Models with Early Exits
