Image Credit: Arxiv

CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models

  • Vision-Language Models (VLMs) incur high inference costs in both time and memory.
  • Token sparsity (pruning redundant input tokens) and neuron sparsity (skipping weakly activated neurons) are two complementary ways to make VLM inference more efficient.
  • The study examines the interplay between Core Neurons and Core Tokens in VLMs.
  • It introduces CoreMatching, a framework that exploits token and neuron sparsity jointly, achieving significant inference speedups; a brief code sketch of the idea follows this list.
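
To make the mechanism concrete, below is a minimal, hypothetical sketch of joint neuron and token pruning in the spirit of CoreMatching. It is not the paper's algorithm: the scoring rules (mean absolute activation for neurons; overlap with the core-neuron set for tokens) and the keep_ratio values are illustrative assumptions.

```python
import numpy as np

def select_core_neurons(acts: np.ndarray, keep_ratio: float = 0.3) -> np.ndarray:
    """Keep the neurons with the highest mean absolute activation.

    acts: (num_tokens, num_neurons) FFN activations for one layer.
    The scoring rule here is an assumption, not the paper's exact criterion.
    """
    scores = np.abs(acts).mean(axis=0)
    k = max(1, int(keep_ratio * acts.shape[1]))
    return np.argsort(scores)[-k:]          # indices of the top-k neurons

def select_core_tokens(acts: np.ndarray, core_neurons: np.ndarray,
                       keep_ratio: float = 0.5) -> np.ndarray:
    """Keep the tokens that most strongly activate the core neurons,
    i.e. a simple "matching" between token and neuron sparsity."""
    token_scores = np.abs(acts[:, core_neurons]).sum(axis=1)
    k = max(1, int(keep_ratio * acts.shape[0]))
    return np.sort(np.argsort(token_scores)[-k:])  # kept tokens, in order

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # e.g. 196 image-patch tokens, 4096 FFN neurons (shapes are illustrative)
    acts = rng.standard_normal((196, 4096))
    neurons = select_core_neurons(acts)
    tokens = select_core_tokens(acts, neurons)
    print(f"kept {len(neurons)}/4096 neurons and {len(tokens)}/196 tokens")
```

In an actual VLM, the activations would come from a forward pass at inference time, and the kept indices would be used to drop tokens from the sequence and to slice the corresponding rows and columns of the FFN weight matrices.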
