Authors: (1) Albert Gu, Machine Learning Department, Carnegie Mellon University (equal contribution); (2) Tri Dao, Department of Computer Science, Princeton University (equal contribution).
This paper develops the mathematics of selective state space models (SSMs) and evaluates them across a range of sequence modeling tasks.
The authors propose a selection mechanism that makes the SSM parameters functions of the input, framing selection as a way to compress context into a fixed-size state and improving both efficiency and performance.
The paper also provides empirical evaluations and benchmarks for synthetic tasks, language modeling, DNA modeling, and audio generation.
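To make the selection idea concrete, below is a minimal NumPy sketch of a selective SSM recurrence for a single channel with a diagonal state matrix. The parameter names (W_delta, W_B, W_C) and the scalar step-size projection are illustrative assumptions, not the paper's exact parameterization, and the sequential loop stands in for the paper's hardware-aware parallel scan.

```python
# Minimal sketch of a selective SSM recurrence (one channel, diagonal A).
# Illustrative only: the step size, input matrix, and output matrix are
# recomputed from the input at every step, which is the "selection" idea.
import numpy as np

def selective_scan(x, A, W_delta, W_B, W_C):
    """x: (L,) input sequence; A: (N,) diagonal state matrix (negative reals).
    W_delta (scalar), W_B, W_C ((N,)): assumed projections that make the
    SSM parameters input-dependent."""
    L, N = x.shape[0], A.shape[0]
    h = np.zeros(N)                                # hidden state
    y = np.empty(L)
    for t in range(L):
        delta = np.log1p(np.exp(W_delta * x[t]))   # softplus: input-dependent step size
        B_t = W_B * x[t]                           # input-dependent input matrix
        C_t = W_C * x[t]                           # input-dependent output matrix
        A_bar = np.exp(delta * A)                  # zero-order-hold discretization of A
        B_bar = (A_bar - 1.0) / A * B_t            # ZOH discretization of B
        h = A_bar * h + B_bar * x[t]               # selective state update
        y[t] = C_t @ h                             # readout
    return y

# Example: a 32-step sequence with a 4-dimensional state.
rng = np.random.default_rng(0)
x = rng.standard_normal(32)
y = selective_scan(x, A=-np.arange(1.0, 5.0),
                   W_delta=0.5, W_B=rng.standard_normal(4),
                   W_C=rng.standard_normal(4))
print(y.shape)  # (32,)
```

Because delta, B_t, and C_t vary with the input, the recurrence can modulate how strongly each token is written into or read out of the fixed-size state, which is what lets the model compress context selectively rather than uniformly.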