Researchers introduce Centaurus, a class of networks composed of generalized state-space model (SSM) blocks.
The SSM operations can be treated as tensor contractions during training.
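As a rough illustration of this view, the sketch below assumes a single-channel diagonal SSM, which is a simplification and not the generalized Centaurus block. The names `ssm_recurrent` and `ssm_contracted` and the parameters `a`, `b`, `c` are hypothetical. It shows that stepping the recurrence gives the same output as building a convolution kernel with one tensor contraction and applying it to the input.

```python
import numpy as np

def ssm_recurrent(a, b, c, u):
    """Step the diagonal SSM x_t = a * x_{t-1} + b * u_t, y_t = c . x_t."""
    x = np.zeros_like(a)
    y = np.empty(len(u))
    for t, u_t in enumerate(u):
        x = a * x + b * u_t   # elementwise state update (diagonal transition)
        y[t] = c @ x          # read out the state
    return y

def ssm_contracted(a, b, c, u):
    """Same map, written as a tensor contraction plus a causal convolution."""
    length = len(u)
    # powers[n, l] = a[n] ** l, shape (N, L)
    powers = a[:, None] ** np.arange(length)[None, :]
    # kernel[l] = sum_n c[n] * a[n]**l * b[n]: one contraction over the state index n
    kernel = np.einsum("n,nl,n->l", c, powers, b)
    # Causal convolution: y[t] = sum_{l <= t} kernel[l] * u[t - l]
    return np.convolve(u, kernel)[:length]

# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
a = rng.uniform(0.1, 0.9, size=4)
b = rng.standard_normal(4)
c = rng.standard_normal(4)
u = rng.standard_normal(16)
y_rec = ssm_recurrent(a, b, c, u)
y_con = ssm_contracted(a, b, c, u)
```

The contracted form has no sequential loop over time, which is what makes it attractive during training.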
The optimal order of tensor contractions is determined for every SSM block to maximize training efficiency.
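To see why the contraction order matters, the sketch below searches all pairwise contraction orders for a small tensor network and counts multiply-adds. The exhaustive search, the function name `best_order`, and the example shapes are assumptions made for illustration; the summary does not say how Centaurus finds the optimal order, and the example network is a generic matrix chain, not an actual SSM block.

```python
from itertools import combinations
from math import prod

def best_order(tensors, sizes, output):
    """Exhaustively search pairwise contraction orders.

    tensors: index strings such as "ij"; sizes: index -> dimension;
    output: indices kept in the final result.
    Returns (total multiply-add count, list of contracted pairs).
    """
    output = frozenset(output)

    def search(remaining):
        if len(remaining) == 1:
            return 0, []
        best = None
        for i, j in combinations(range(len(remaining)), 2):
            rest = [t for k, t in enumerate(remaining) if k not in (i, j)]
            union = remaining[i] | remaining[j]
            # An index survives only if a later tensor or the output needs it.
            needed = output.union(*rest)
            result = union & needed
            # Cost of one pairwise contraction: product of all involved dimensions.
            step_cost = prod(sizes[x] for x in union)
            sub_cost, sub_order = search(rest + [result])
            total = step_cost + sub_cost
            if best is None or total < best[0]:
                pair = ("".join(sorted(remaining[i])), "".join(sorted(remaining[j])))
                best = (total, [pair] + sub_order)
        return best

    return search([frozenset(t) for t in tensors])

# Hypothetical example: two matrices W1[i,j], W2[j,k] applied to a vector v[k].
sizes = {"i": 64, "j": 64, "k": 256}
cost, order = best_order(["ij", "jk", "k"], sizes, output="i")
naive_cost = 64 * 64 * 256 + 64 * 256  # (W1 @ W2) first, then apply to v
```

Here contracting the vector into `W2` first costs 20,480 multiply-adds, against 1,064,960 when the two matrices are multiplied first. The search is exponential in the number of tensors, so it is only practical for small networks like a single block.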
The Centaurus network outperforms its counterparts in raw audio processing tasks and achieves competitive performance in automatic speech recognition (ASR).