The selection mechanism discussed in the article is inspired by concepts like gating, hypernetworks, and data-dependence.
The concept of gating in neural networks has broadened to mean any multiplicative interaction (often with an activation function), not only the gating mechanisms of RNNs such as the LSTM or GRU.
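To make the two senses of "gating" concrete, here is a minimal scalar sketch. The function names and the scalar weights are illustrative assumptions, not taken from the article: `gated_update` is the classic RNN-style gate that interpolates between the previous state and the input, while `multiplicative_gate` is the broader sense, a product of two branches of the same input.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def gated_update(h_prev: float, x: float, w_gate: float, b_gate: float) -> float:
    """RNN-style gate (GRU-like, scalar): interpolate between old state and input."""
    g = sigmoid(w_gate * x + b_gate)  # gate value in (0, 1), computed from the input
    return (1.0 - g) * h_prev + g * x


def multiplicative_gate(xs: list[float], w_a: float, w_b: float) -> list[float]:
    """Broader sense of gating: one linear branch multiplied by an activated branch."""
    return [(w_a * x) * sigmoid(w_b * x) for x in xs]


if __name__ == "__main__":
    print(gated_update(h_prev=0.0, x=1.0, w_gate=0.0, b_gate=0.0))
    print(multiplicative_gate([0.0, 2.0], w_a=1.0, w_b=0.0))
```

Both functions involve a multiplication by a data-derived quantity, but only the first acts along the sequence dimension by deciding how much of the previous state to keep.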
Hypernetworks involve neural networks whose parameters are generated by smaller networks, leading to more complex architectures.
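As a rough illustration of the hypernetwork idea, the sketch below uses a tiny linear map (the hypothetical `hypernetwork`) to generate the weights of a main linear layer from a conditioning vector `z`. The names, shapes, and the choice of a purely linear generator are assumptions made for brevity.

```python
def hypernetwork(z: list[float], H: list[list[float]]) -> list[float]:
    """Small generator network (here just a linear map): z -> flat weight vector."""
    return [sum(h * z_j for h, z_j in zip(row, z)) for row in H]


def main_layer(x: list[float], flat_w: list[float], n_out: int) -> list[float]:
    """Main linear layer whose weights are supplied externally, row-major."""
    n_in = len(x)
    assert len(flat_w) == n_out * n_in
    return [
        sum(flat_w[i * n_in + j] * x[j] for j in range(n_in))
        for i in range(n_out)
    ]


if __name__ == "__main__":
    # H maps a 2-dim conditioning vector to the 4 weights of a 2x2 main layer.
    H = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 3.0]]
    x = [1.0, 1.0]
    for z in ([1.0, 0.0], [0.0, 1.0]):
        weights = hypernetwork(z, H)
        print(z, main_layer(x, weights, n_out=2))
```

The main layer has no parameters of its own; changing `z` changes the function it computes.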
Data-dependence, like hypernetworks, refers to any setting in which some of the model's parameters depend on the data being processed.
The selection mechanism is viewed as a concept distinct from gating and hypernetworks, despite the similarities.
The most closely related work includes recurrent layers such as S4, S5, and quasi-RNNs, and end-to-end architectures such as H3, RetNet, and RWKV.
S4 introduced the first structured SSM, describing diagonal and diagonal-plus-low-rank (DPLR) structure, and focused on efficient convolutional algorithms for these models.
S5 independently discovered the diagonal SSM approximation and was the first S4 variant to be computed recurrently with a parallel scan; S6 shares the scan but differs in, among other things, adding a selection mechanism.
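The parallel scan works because a diagonal linear recurrence h_t = a_t * h_{t-1} + b_t can be written with an associative operator on pairs (a, b). The following is a minimal scalar sketch under that assumption, with h_0 = 0. The function names are illustrative, and the Hillis-Steele scan shown here runs sequentially in plain Python; it only demonstrates the structure a parallel implementation would exploit, not the kernels of S5 or S6.

```python
def combine(left: tuple[float, float], right: tuple[float, float]) -> tuple[float, float]:
    """Compose two recurrence steps h -> a*h + b (left applied first, then right)."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def sequential_recurrence(a: list[float], b: list[float]) -> list[float]:
    """Reference: step through h_t = a_t * h_{t-1} + b_t one element at a time."""
    h, out = 0.0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def scan_recurrence(a: list[float], b: list[float]) -> list[float]:
    """Inclusive Hillis-Steele scan: O(log n) passes, each pass parallelizable."""
    elems = list(zip(a, b))
    n = len(elems)
    step = 1
    while step < n:
        # Every position in this pass depends only on the previous pass.
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(n)
        ]
        step *= 2
    # The second component of each accumulated pair is h_t (since h_0 = 0).
    return [b_acc for _, b_acc in elems]


if __name__ == "__main__":
    a = [0.5, 0.25, 2.0, 1.0]  # coefficients may vary per time step
    b = [1.0, 2.0, 3.0, 4.0]
    print(sequential_recurrence(a, b))
    print(scan_recurrence(a, b))
```

Nothing in `combine` requires the coefficients `a_t` to be constant over time, which is why a scan remains applicable when the parameters become input-dependent.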
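Mega simplified S4 to be real-valued rather than complex-valued, giving it an interpretation as an exponential moving average (EMA), and was the first model to show that real-valued SSMs are effective in certain settings or when combined with different architectural components.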
Liquid S4 augments S4 with an input-dependent state transition but is still computed convolutionally, while methods such as SGConv and Hyena focus on different parameterizations of the convolutional representation of S4.
The structured SSMs known prior to this work are non-selective and usually strictly LTI (linear time-invariant).
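The practical meaning of LTI can be seen in a scalar sketch: with fixed parameters (a, b, c), the recurrence h_t = a * h_{t-1} + b * x_t, y_t = c * h_t is equivalent to a causal convolution with the fixed kernel K_k = c * a^k * b. Once a parameter depends on the input, no single kernel describes the model. The function names and the toy `a_fn` below are assumptions for illustration only.

```python
from typing import Callable


def lti_recurrent(x: list[float], a: float, b: float, c: float) -> list[float]:
    """LTI SSM in recurrent form: the same (a, b, c) at every time step."""
    h, y = 0.0, []
    for x_t in x:
        h = a * h + b * x_t
        y.append(c * h)
    return y


def lti_convolutional(x: list[float], a: float, b: float, c: float) -> list[float]:
    """Same LTI SSM as a causal convolution with kernel K_k = c * a**k * b."""
    kernel = [c * (a ** k) * b for k in range(len(x))]
    return [
        sum(kernel[k] * x[t - k] for k in range(t + 1))
        for t in range(len(x))
    ]


def selective_recurrent(
    x: list[float], a_fn: Callable[[float], float], b: float, c: float
) -> list[float]:
    """Time-varying variant: the transition a_t = a_fn(x_t) depends on the input."""
    h, y = 0.0, []
    for x_t in x:
        h = a_fn(x_t) * h + b * x_t
        y.append(c * h)
    return y


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    print(lti_recurrent(x, a=0.5, b=1.0, c=2.0))
    print(lti_convolutional(x, a=0.5, b=1.0, c=2.0))
    # Toy input-dependent transition: large inputs reset the state.
    print(selective_recurrent(x, lambda v: 0.0 if v > 2.5 else 0.5, b=1.0, c=2.0))
```

The first two outputs agree, which is what allows LTI models to be trained with convolutional algorithms; the third cannot be reproduced by any fixed kernel because its effective transition changes with the input.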