State Space Models

  1. S4 paper: Efficiently Modeling Long Sequences with Structured State Spaces

  2. HiPPO: Recurrent Memory with Optimal Polynomial Projections

  3. A new family of SSMs (a fusion of CNNs, RNNs, and classical SSMs such as the Kalman filter): Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers (a short recurrence-vs-convolution sketch follows this list)

  4. Follow-up work has focused on understanding S4 models, as well as refining them and augmenting their capabilities: [1, 2, 3, 4, 5]

    1. Diagonal State Spaces are as Effective as Structured State Spaces

    2. On the Parameterization and Initialization of Diagonal State Space Models

    3. Long Range Language Modeling via Gated State Spaces

    4. Simplifying and Understanding State Space Models with Diagonal Linear RNNs

    5. Simplified State Space Layers for Sequence Modeling

  5. A few recent methods optimize SSMs by integrating them with Transformers: [1, 2, 3]

    1. Hungry Hungry Hippos: Towards Language Modeling with State Space Models

    2. Block-State Transformers

    3. Efficient Long Sequence Modeling via State Space Augmented Transformer

  6. SSMs for time series

    1. Effectively Modeling Time Series with Simple Discrete State Spaces

  7. SSMs for RL 

    1. Mohammad’s work: Mastering Memory Tasks with World Models

    2. Decision S4: Efficient Sequence-Based RL via State Spaces Layers

    3. Meta-RL with S4: Structured State Space Models for In-Context Reinforcement Learning

  8. Mamba: Linear-Time Sequence Modeling with Selective State Spaces & follow-ups [1, 2, 3, 4] (a toy sketch of the selective mechanism also follows this list)

    1. Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

    2. VMamba: Visual State Space Model 

    3. U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation

    4. MambaTab: A Simple Yet Effective Approach for Handling Tabular Data
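
The first few items above describe SSM layers that can be computed either as a recurrence (like an RNN) or as a long convolution (like a CNN). Below is a minimal sketch of that equivalence for a discrete linear SSM; the names, shapes, and toy random parameters are illustrative and not taken from any of the papers listed.

import numpy as np

def ssm_recurrent(A, B, C, u):
    # RNN view: x_k = A x_{k-1} + B u_k,  y_k = C x_k
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k
        ys.append(C @ x)
    return np.array(ys)

def ssm_convolutional(A, B, C, u):
    # CNN view: y = K * u with kernel K = (CB, CAB, CA^2B, ...)
    L = len(u)
    K = np.array([C @ np.linalg.matrix_power(A, k) @ B for k in range(L)])
    return np.array([np.dot(K[:k + 1][::-1], u[:k + 1]) for k in range(L)])

# Both views give the same output for the same (A, B, C) and input u.
rng = np.random.default_rng(0)
n, L = 4, 16
A = 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)   # toy, roughly stable dynamics
B, C = rng.standard_normal(n), rng.standard_normal(n)
u = rng.standard_normal(L)
assert np.allclose(ssm_recurrent(A, B, C, u), ssm_convolutional(A, B, C, u))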

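The "selective" state spaces of item 8 let the SSM parameters depend on the current input, so the recurrence can choose per token what to keep in state and what to forget. The toy sketch below illustrates only that idea, using a diagonal A and scalar inputs; the parametrization and the hardware-aware scan in the Mamba paper are different.

import numpy as np

def selective_scan(u, A_diag, w_delta, W_B, W_C):
    # u: (L,) input sequence; A_diag: (n,) negative diagonal of A.
    # w_delta (scalar) and W_B, W_C ((n,) vectors) are hypothetical toy projections
    # that make the step size and the B/C matrices input-dependent.
    h = np.zeros(A_diag.shape[0])
    ys = np.zeros(len(u))
    for t, u_t in enumerate(u):
        delta_t = np.log1p(np.exp(w_delta * u_t))    # softplus -> positive step size
        B_t, C_t = W_B * u_t, W_C * u_t              # input-dependent input/readout
        A_bar = np.exp(delta_t * A_diag)             # discretized diagonal dynamics
        h = A_bar * h + delta_t * B_t * u_t          # state keeps or forgets per token
        ys[t] = C_t @ h
    return ys

rng = np.random.default_rng(1)
n, L = 8, 32
y = selective_scan(rng.standard_normal(L), -np.arange(1.0, n + 1),
                   0.5, rng.standard_normal(n), rng.standard_normal(n))
print(y.shape)  # (32,)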
