S4 paper: Efficiently Modeling Long Sequences with Structured State Spaces
HiPPO: Recurrent Memory with Optimal Polynomial Projections
A new family of SSMs (a fusion of CNNs, RNNs, and classical SSMs such as the Kalman filter): Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers
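To make the "fusion" concrete, here is a minimal NumPy sketch of a single-input, single-output linear state-space layer: a continuous-time system (A, B, C) is discretized (a bilinear step is assumed here; names and sizes are illustrative, not the S4/LSSL implementation), and the same layer can then be run either as an RNN-style recurrence or as a CNN-style convolution, with both views agreeing.

```python
# Minimal sketch of a linear state-space layer (illustrative only).
import numpy as np

def discretize(A, B, dt):
    """Bilinear (Tustin) discretization of x' = Ax + Bu."""
    I = np.eye(A.shape[0])
    left = np.linalg.inv(I - (dt / 2.0) * A)
    return left @ (I + (dt / 2.0) * A), left @ (dt * B)

def ssm_recurrent(Abar, Bbar, C, u):
    """RNN view: unroll x_k = Abar x_{k-1} + Bbar u_k, y_k = C x_k."""
    x = np.zeros(Abar.shape[0])
    ys = []
    for u_k in u:
        x = Abar @ x + Bbar[:, 0] * u_k
        ys.append(C @ x)
    return np.array(ys)

def ssm_convolutional(Abar, Bbar, C, u):
    """CNN view: y = K * u with kernel K_k = C Abar^k Bbar."""
    L = len(u)
    K = np.array([(C @ np.linalg.matrix_power(Abar, k) @ Bbar)[0] for k in range(L)])
    return np.convolve(u, K)[:L]

# The two views compute the same outputs (up to numerical error):
rng = np.random.default_rng(0)
N, L = 4, 32
A = -np.eye(N) + 0.1 * rng.standard_normal((N, N))
B, C = rng.standard_normal((N, 1)), rng.standard_normal(N)
u = rng.standard_normal(L)
Abar, Bbar = discretize(A, B, dt=0.1)
assert np.allclose(ssm_recurrent(Abar, Bbar, C, u),
                   ssm_convolutional(Abar, Bbar, C, u), atol=1e-6)
```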
Follow-up works have focused on understanding S4 models, as well as refining them and augmenting their capabilities (several simplify the state matrix to diagonal form; see the sketch after this list): [1, 2, 3, 4, 5]
Diagonal State Spaces are as Effective as Structured State Spaces
On the Parameterization and Initialization of Diagonal State Space Models
Long Range Language Modeling via Gated State Spaces
Simplifying and Understanding State Space Models with Diagonal Linear RNNs
Simplified State Space Layers for Sequence Modeling
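Several of these follow-ups (DSS, S4D, S5) restrict the state matrix to be diagonal, which collapses the convolution kernel K_k = C Abar^k Bbar into a sum of elementwise geometric sequences. The sketch below is only meant to show that structure; the names, the eigenvalue choice, and the real-part simplification are assumptions, not the parameterizations or initializations used in those papers.

```python
# Diagonal-SSM kernel sketch (in the spirit of DSS/S4D, not their actual code).
import numpy as np

def diagonal_ssm_kernel(lam, B, C, L):
    """K[k] = sum_n C[n] * lam[n]**k * B[n]  -- a Vandermonde-style computation."""
    powers = lam[None, :] ** np.arange(L)[:, None]     # (L, N) table of lam^k
    return (powers * (B * C)[None, :]).sum(axis=1).real  # .real is a simplification

rng = np.random.default_rng(0)
N, L = 8, 64
lam = 0.9 * np.exp(1j * rng.uniform(0, np.pi, N))      # assumed stable eigenvalues
B, C = rng.standard_normal(N), rng.standard_normal(N)

K = diagonal_ssm_kernel(lam, B, C, L)
u = rng.standard_normal(L)
y = np.convolve(u, K)[:L]                              # apply the SSM as a convolution
```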
A few recent methods combine SSMs with Transformers (see the sketch after this list): [1, 2, 3]
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
Block-State Transformers
Efficient Long Sequence Modeling via State Space Augmented Transformer
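A rough way to picture these hybrids is a block that interleaves an SSM sublayer (cheap long-range mixing) with a self-attention sublayer. The sketch below is only an illustration of that idea under assumed names and shapes; it is not the architecture of H3, Block-State Transformers, or SPADE.

```python
# Hypothetical SSM + attention hybrid block (illustrative only).
import numpy as np

def ssm_sublayer(x, a, B, C):
    """Per-channel diagonal SSM, unrolled as a linear-time recurrence. x: (L, D)."""
    h = np.zeros_like(a)                       # hidden state, (D, N)
    out = np.zeros_like(x)
    for k in range(len(x)):
        h = a * h + B * x[k][:, None]          # elementwise state update
        out[k] = (h * C).sum(-1)               # per-channel readout
    return out

def attention_sublayer(x, W_q, W_k, W_v):
    """Single-head causal self-attention. x: (L, D)."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(x.shape[-1])
    scores = np.where(np.tril(np.ones_like(scores)) > 0, scores, -np.inf)
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)
    return w @ v

def hybrid_block(x, ssm_params, attn_params):
    """One block: SSM sublayer, then attention, each with a residual connection."""
    x = x + ssm_sublayer(x, *ssm_params)
    return x + attention_sublayer(x, *attn_params)

rng = np.random.default_rng(0)
L, D, N = 32, 8, 4
ssm_params = (rng.uniform(0.5, 0.99, (D, N)),          # stable diagonal decay
              rng.standard_normal((D, N)), rng.standard_normal((D, N)))
attn_params = tuple(0.1 * rng.standard_normal((D, D)) for _ in range(3))
y = hybrid_block(rng.standard_normal((L, D)), ssm_params, attn_params)
```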
SSMs for time series
Effectively Modeling Time Series with Simple Discrete State Spaces
SSMs for RL
Mohammad’s work: Mastering Memory Tasks with World Models
Decision S4: Efficient Sequence-Based RL via State Spaces Layers
Meta-RL with S4: Structured State Space Models for In-Context Reinforcement Learning
Mamba: Linear-Time Sequence Modeling with Selective State Spaces & follow-ups (a sketch of the selective scan follows this list): [1, 2, 3, 4]
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
VMamba: Visual State Space Model
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
MambaTab: A Simple Yet Effective Approach for Handling Tabular Data
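Mamba's key change is making the SSM parameters input-dependent ("selective"), which gives up the fixed convolution kernel but keeps a linear-time recurrent scan. The sketch below only conveys that idea; the projections, shapes, and simplified discretization are assumptions for illustration, not the paper's hardware-aware parallel scan.

```python
# Hypothetical selective (input-dependent) SSM scan, in the spirit of Mamba.
import numpy as np

def selective_scan(u, A, W_B, W_C, W_dt):
    """u: (L, D) inputs; A: (D, N) diagonal state matrix; W_*: assumed projections."""
    L, D = u.shape
    x = np.zeros((D, A.shape[1]))
    ys = np.zeros((L, D))
    for k in range(L):
        u_k = u[k]
        dt = np.log1p(np.exp(u_k @ W_dt))        # softplus step size per channel, (D,)
        B_k = u_k @ W_B                          # input-dependent B, (N,)
        C_k = u_k @ W_C                          # input-dependent C, (N,)
        Abar = np.exp(dt[:, None] * A)           # ZOH-style discretization, (D, N)
        Bbar = dt[:, None] * B_k[None, :]        # simplified discretized B, (D, N)
        x = Abar * x + Bbar * u_k[:, None]       # selective recurrence
        ys[k] = x @ C_k                          # readout, (D,)
    return ys

rng = np.random.default_rng(0)
L, D, N = 16, 4, 8
u = rng.standard_normal((L, D))
A = -np.exp(rng.standard_normal((D, N)))         # negative entries => stable decay
y = selective_scan(u, A,
                   W_B=0.1 * rng.standard_normal((D, N)),
                   W_C=0.1 * rng.standard_normal((D, N)),
                   W_dt=0.1 * rng.standard_normal((D, D)))
```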