techminis (a naukri.com initiative)

Source: arXiv

SparseSSM: Efficient Selective Structured State Space Models Can Be Pruned in One-Shot

  • State-space language models such as Mamba carry billions of parameters, which hinders deployment.
  • SparseSSM is a training-free, one-shot pruning framework for state-space architectures.
  • It extends the optimal brain surgeon (OBS) framework to state-space models.
  • The algorithm computes saliency scores that identify redundant parameters and guide pruning decisions.
  • A component sensitivity analysis locates where redundancy concentrates in the architecture.
  • The method also extends to semi-structured and structured sparsity patterns.
  • Empirically, 50% of the SSM weights can be pruned in one shot, without fine-tuning, while maintaining accuracy.
  • SparseSSM incurs no zero-shot accuracy loss, setting a new benchmark for pruning Mamba-based LLMs.
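The summary above names the key ingredients: OBS-style saliency scores, one-shot (training-free) pruning, and an extension to semi-structured sparsity. As a rough illustration only, the classic OBS saliency with a diagonal Hessian approximation can be sketched as follows; the function names, shapes, the diagonal approximation, and the 2:4 grouping are illustrative assumptions, not SparseSSM's actual algorithm, which is specified in the paper.

```python
import numpy as np

def obs_saliency(W, H_diag):
    """OBS-style saliency under a diagonal Hessian approximation.

    W: (out_dim, in_dim) weight matrix.
    H_diag: (in_dim,) per-input diagonal of the loss Hessian, e.g. estimated
    from calibration activations (an assumption here, not the paper's recipe).
    Saliency of w_ij ~ w_ij^2 * H_jj / 2: the estimated loss increase
    from zeroing that single weight.
    """
    return 0.5 * W**2 * H_diag[None, :]

def one_shot_prune(W, H_diag, sparsity=0.5):
    """Zero out the lowest-saliency fraction of weights in one shot."""
    s = obs_saliency(W, H_diag)
    k = int(sparsity * W.size)
    thresh = np.partition(s.ravel(), k)[k]  # k-th smallest saliency
    mask = s >= thresh                      # keep everything above it
    return W * mask, mask

def semistructured_2_4(W, H_diag):
    """Hardware-friendly 2:4 sparsity: keep the 2 highest-saliency weights
    in every contiguous group of 4 along the input dimension."""
    s = obs_saliency(W, H_diag)
    out_dim, in_dim = W.shape
    groups = s.reshape(out_dim, in_dim // 4, 4)
    order = np.argsort(groups, axis=-1)     # ascending saliency per group
    mask = np.ones_like(groups, dtype=bool)
    np.put_along_axis(mask, order[..., :2], False, axis=-1)  # drop lowest 2
    return W * mask.reshape(W.shape)

# Toy demo with random weights and stand-in calibration statistics.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
H_diag = rng.uniform(0.1, 1.0, size=16)
W_pruned, mask = one_shot_prune(W, H_diag, sparsity=0.5)
W_24 = semistructured_2_4(W, H_diag)
```

The unstructured variant mirrors the bullet about pruning 50% of weights without fine-tuning; the 2:4 variant illustrates what "semi-structured sparsity" refers to in practice.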
