Steve Abreu | University of Groningen | s.abreu@rug.nl
Nicole Dumont | University of Waterloo | ns2dumon@uwaterloo.ca
Guido Zarrella | MITRE | guido@mitre.org
Ivana Kajić | Google DeepMind | kivana@google.com
Alessandro Pierro | Intel Labs | alessandro.pierro@intel.com
Anand Subramoney | RHUL | anand.subramoney@rhul.ac.uk
Emre Neftci | FZ Juelich | e.neftci@fz-juelich.de
Chris Eliasmith | University of Waterloo | celiasmith@uwaterloo.ca
Maxence Ernoult | Google DeepMind
Anya Ivanova | Georgia Institute of Technology
Robert M. Mok | Center for Information and Neural Networks
With this topic area, we explore how neuromorphic principles can enhance the performance, efficiency, and robustness of state-of-the-art machine learning models. While foundation models excel in many tasks, they face significant challenges, such as high computational cost, lack of continual learning and knowledge editing, and limited reasoning abilities. We aim to bridge disciplines and foster collaboration across fields - mainstream machine learning, robotics, neuromorphic engineering, neuroscience, cognitive science, and psychology - by focusing on sparsity and always-on reasoning as convergence points.
Neuromorphic hardware offers energy-efficient, adaptive alternatives to GPUs/TPUs, enabling scalable and sustainable AI systems for real-world tasks. Participants will collaborate to develop models leveraging neuromorphic hardware for efficient training, inference, and applications. We will also connect sparsity to interpretability, using sparse autoencoders as tools for mechanistic interpretability while drawing parallels to neuroscience methods for functional understanding of complex systems. Insights from cognitive science, such as human reasoning mechanisms, will complement these efforts, synergizing with neuromorphic techniques to create adaptable, robust AI whose inner workings we can analyze and intervene on.
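As a concrete reference point for the sparse-autoencoder tooling mentioned above, here is a minimal numpy sketch of the standard SAE objective (reconstruction of model activations plus an L1 sparsity penalty on an overcomplete feature code). The dimensions, penalty weight, and stand-in activations are illustrative assumptions, not a specific implementation from the workshop.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_dict = 64, 512                     # activation dim, overcomplete dictionary size

W_enc = rng.normal(size=(d_model, d_dict)) * 0.01
W_dec = rng.normal(size=(d_dict, d_model)) * 0.01
b_enc = np.zeros(d_dict)

def sae(x, l1=1e-3):
    """Sparse autoencoder on activations x: sparse feature code f, reconstruction x_hat,
    and the training objective (reconstruction error + L1 sparsity penalty)."""
    f = np.maximum(0.0, x @ W_enc + b_enc)    # non-negative, encouraged-to-be-sparse code
    x_hat = f @ W_dec
    loss = np.mean((x - x_hat) ** 2) + l1 * np.mean(np.abs(f))
    return f, x_hat, loss

acts = rng.normal(size=(32, d_model))         # stand-in for a batch of model activations
f, x_hat, loss = sae(acts)
print("fraction of active features:", float((f > 0).mean()))
```

The dictionary features (columns of W_dec) are the candidate interpretable directions one would then inspect or intervene on.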
Anand Subramoney
So far, scaling up deep learning has relied on ever faster and more parallel hardware, but this approach is hitting its limits in energy consumption. While neuromorphic computing provides a promising alternative paradigm, progress on developing neuromorphic-first primitives for deep learning has been slow: the focus is still on either biology or directly adapting mainstream DL algorithms developed for GPUs. In this talk, I will explain how principled neuromorphic-first algorithmic design is key to scaling deep learning on neuromorphic hardware. This is a neuromorphic version of “There’s Plenty of Room at the Top”, the 2020 paper by Leiserson et al. on how algorithms will drive computer performance after Moore’s law, which was itself a play on Feynman’s 1959 talk “There's Plenty of Room at the Bottom”, which foresaw how the miniaturization of hardware would drive computer performance.
Guillaume Pourcel
Recurrent Hamiltonian Echo Learning (RHEL) is a "forward-only" proxy of BPTT (backpropagation through time) that uses time-reversal symmetry breaking as a credit assignment mechanism. This approach enables training bespoke RNNs and stacked architectures (called "Hamiltonian SSMs") on long-range benchmarks, offering an alternative compute paradigm for model inference and training.
Chris Eliasmith
My lab is starting to build the next generation of our large-scale brain model, called Spaun 3.0. In this talk I will outline key methods we intend to integrate (including those related to LLMs) and our overarching goals for this model. A main goal of the talk is to start a conversation about what is most compelling to include in such a model. What kinds of functions and tasks would have the biggest impact and provide the most interesting behaviors and comparisons to neural data? What kinds of challenges are outstanding for large-scale brain modeling, and what would count as addressing them? Ideally this conversation will guide our work and make the result broadly useful to neuroscientists and engineers.
Emre Neftci
The causal decoder transformer is the workhorse of state-of-the-art large language models and sequence modeling. Its key enabling building block is self-attention, which acts as a history-dependent weighting of sequence elements. Self-attention can take a form strikingly similar to synaptic plasticity, which can be efficiently implemented in neuromorphic hardware.
So far, challenges in deep credit assignment have limited the use of synaptic plasticity to relatively shallow networks and simple tasks. By leveraging the equivalence between self-attention and plasticity, we explain how transformer inference is essentially a learning problem that can be addressed with local synaptic plasticity, thereby circumventing the online credit assignment problem. With this understanding, self-attention can be further improved using concepts inspired by computational neuroscience, such as continual learning and metaplasticity. Since causal transformers are notoriously inefficient on conventional hardware, neuromorphic principles for self-attention could hold the key to more efficient inference with transformer-like models.
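As a rough, self-contained illustration of the attention-plasticity correspondence sketched above (using unnormalized linear attention rather than the softmax variant, and not necessarily the formulation used in the talk), the two functions below compute identical causal outputs: one by explicitly attending over the token history, the other by accumulating a "fast weight" matrix with a local, Hebbian-style outer-product update.

```python
import numpy as np

def linear_attention_parallel(Q, K, V):
    """Causal (unnormalized) linear attention: each output attends to all past tokens."""
    out = np.zeros_like(V)
    for t in range(len(Q)):
        scores = Q[t] @ K[:t + 1].T          # history-dependent weighting of past tokens
        out[t] = scores @ V[:t + 1]
    return out

def linear_attention_plastic(Q, K, V):
    """Same computation as a running plastic weight matrix W, updated by a local
    Hebbian-style outer-product rule W <- W + v k^T and read out as W q."""
    W = np.zeros((V.shape[1], K.shape[1]))   # "synaptic" matrix, updated online
    out = np.zeros_like(V)
    for t in range(len(Q)):
        W += np.outer(V[t], K[t])            # local plasticity, no credit assignment needed
        out[t] = W @ Q[t]                    # readout equals attention over all past tokens
    return out

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8, 16))        # (tokens, features) for queries, keys, values
assert np.allclose(linear_attention_parallel(Q, K, V),
                   linear_attention_plastic(Q, K, V))
```

The second form needs only constant memory per step, which is the property that makes the plasticity view attractive for neuromorphic inference.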
Alessandro Pierro
Linear RNNs and State Space Models (SSMs) have emerged as powerful backbones for sequence modeling, offering a concrete alternative to full self-attention and a better fit for neuromorphic processors. In particular, their constant compute and memory requirements per step promise a path to bring advanced AI capabilities to low-power edge devices. In this tutorial, we will demonstrate a hardware-aware methodology to optimize the S5 SSM architecture for the Intel Loihi 2 neuromorphic processor, combining unstructured pruning, activity sparsification, and quantization-aware training. The resulting models exhibit a wide Pareto front compared to dense baselines on audio denoising and keyword spotting tasks. Moreover, when deployed on Loihi 2, our models demonstrate up to 42× lower latency and 149× lower energy consumption compared to a dense model on an edge GPU.
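For readers unfamiliar with the ingredients named above, the toy numpy sketch below runs a real-valued diagonal linear recurrence with post-hoc magnitude pruning, uniform weight quantization, and activation thresholding. It only illustrates the general idea of these operations under simplifying assumptions; it is not the S5 architecture, the paper's quantization-aware training, or the Loihi 2 toolchain.

```python
import numpy as np

rng = np.random.default_rng(0)
d_state, d_in = 64, 16

# Diagonal linear recurrence (S5-like in spirit), kept real-valued for simplicity.
A = rng.uniform(0.5, 0.99, size=d_state)               # stable diagonal state transition
B = rng.normal(size=(d_state, d_in)) / np.sqrt(d_in)
C = rng.normal(size=(d_in, d_state)) / np.sqrt(d_state)

def prune(w, sparsity=0.9):
    """Unstructured magnitude pruning: zero out the smallest-magnitude weights."""
    cutoff = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= cutoff, w, 0.0)

def quantize(w, bits=8):
    """Symmetric uniform quantization (a crude stand-in for quantization-aware training)."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

Bq, Cq = quantize(prune(B)), quantize(prune(C))

def run(u_seq, threshold=0.05):
    """Constant compute/memory per step; thresholding the output sparsifies activations."""
    x, ys = np.zeros(d_state), []
    for u in u_seq:
        x = A * x + Bq @ u                              # elementwise (diagonal) recurrence
        y = Cq @ x
        ys.append(np.where(np.abs(y) > threshold, y, 0.0))
    return np.stack(ys)

y = run(rng.normal(size=(100, d_in)))
print("activation sparsity:", float((y == 0).mean()))
```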
Varun Dhanraj
Large language models (LLMs) continue to face challenges in reliably solving reasoning tasks, particularly those that require precise rule following, as often found in mathematical reasoning. This paper introduces a novel neurosymbolic method that improves LLM reasoning by encoding hidden states into neurosymbolic vectors, enabling problem-solving within a neurosymbolic vector space. The results are decoded and merged with the original hidden state, significantly boosting the model's performance on numerical reasoning tasks. By offloading computation through neurosymbolic representations, this method enhances efficiency, reliability, and interpretability. Experimental results demonstrate an average of 88.6% lower cross-entropy loss and 15.4 times more problems correctly solved on a suite of mathematical reasoning tasks compared to chain-of-thought prompting and supervised fine-tuning (LoRA), without degrading performance on other tasks.
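One classical way to realize such neurosymbolic vector representations is a vector symbolic architecture with holographic reduced representations, where circular convolution binds role and filler vectors and an approximate inverse recovers them. The sketch below is a generic illustration of this kind of vector-space symbol manipulation, not the specific encoder/decoder used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024

def vec():
    """Random unit vector acting as an atomic symbol."""
    v = rng.normal(size=d)
    return v / np.linalg.norm(v)

def bind(a, b):
    """Circular convolution (HRR binding), computed in the Fourier domain."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=d)

def unbind(s, a):
    """Approximate unbinding: bind with the involution of a (circular correlation)."""
    a_inv = np.concatenate(([a[0]], a[1:][::-1]))
    return bind(s, a_inv)

# Encode a tiny structure by binding role vectors to filler vectors, e.g. "3 + 4".
NUM1, NUM2, three, four = vec(), vec(), vec(), vec()
s = bind(NUM1, three) + bind(NUM2, four)

# Query which filler occupies the NUM2 role by unbinding and comparing to a vocabulary.
query = unbind(s, NUM2)
vocab = {"three": three, "four": four}
print(max(vocab, key=lambda k: np.dot(query, vocab[k])))   # -> "four" for large d
```

Because binding, superposition, and unbinding are fixed algebraic operations on vectors, rule-like manipulations can be performed in this space and the result mapped back, which is the general flavor of the offloading described above.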
Rob Mok
In computational cognitive neuroscience, we aim to understand the brain by building theoretical models that can explain and predict brain function. However, many approaches do not help us understand how brains or models work. High accuracy on benchmarks (machine learning; ML), high correlations between brains and models (cf. Brain-Score), and brain blobs (fMRI activity) at best provide hints, not explanations. I propose a minimum of three elements required for good theoretical explanations of brains and models: Task (task-performing brains/models), Theory-based analysis (as opposed to purely exploratory analysis), and Interventions (as opposed to focusing only on metrics) - TTI. I will present three projects where we have used some or all of these to understand concept and category representations in the brain.
First, I will present a non-spatial account of place and grid cells in which a task-performing clustering algorithm explains key findings in both spatial and conceptual domains, opening up questions about the functional specificity of 'spatial' cells. Next, I question whether intuitively classified spatial cells are genuine cell types dedicated to spatial function. Using DNNs and VR with the TTI approach in a spatial environment, we analyze model units, their representations, and the model's spatial knowledge, and use unit ablation to assess how that spatial knowledge arises in the model, concluding that spatial cells are not genuine cell types unique to spatial cognition. Finally, I will present recent work using a DNN-TTI approach to model dedifferentiation in ageing, and propose a novel explanation for dedifferentiation based on age-related white matter neurodegeneration. In sum, I will argue that the combination of these three elements is essential to building good explanations of brains and models, and that each requires careful and deep consideration to make progress on understanding the brain.
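As a minimal illustration of the "Interventions" ingredient of TTI, the sketch below silences individual hidden units of a toy network and measures the resulting drop in task performance; the network, data, and metric are placeholders rather than the models used in these projects.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 10))                       # stand-in stimuli (e.g. positions/features)
W1, W2 = rng.normal(size=(10, 32)), rng.normal(size=(32, 1))
y = (x @ W1).clip(min=0) @ W2                        # "ground truth" behaviour of the intact model

def forward(x, ablate=()):
    """Forward pass; 'ablate' lists hidden units to silence (the intervention)."""
    h = np.maximum(0.0, x @ W1)
    h[:, list(ablate)] = 0.0
    return h @ W2

def performance(pred, target):
    """Negative mean squared error: higher is better."""
    return -np.mean((pred - target) ** 2)

# Performance drop when each unit is silenced; large drops flag units the behaviour depends on.
effects = [performance(forward(x), y) - performance(forward(x, ablate=[u]), y)
           for u in range(32)]
print("most critical unit:", int(np.argmax(effects)))
```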
Maxence Ernoult
This talk aims to be a pragmatic introduction to learning algorithms grounded in physics, which encompass a broad class of models from feedforward nets to physical systems, taking static or temporal data as inputs. Starting from first principles, we present a minimal hierarchy of independent concepts to circumvent some problems inherent to the hardware implementation of standard differentiation. This way, rather than naively listing existing algorithms, we avoid entangling essential ingredients with arbitrary design choices and instead propose the draft of a “cookbook” to help explore the many possible combinations of these independent mechanisms.
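One well-known member of this family of physics-grounded learning rules is equilibrium propagation, where parameter gradients are estimated from the difference between a free and a weakly nudged equilibrium of an energy function. The toy example below (with an assumed quadratic energy and linear readout, not necessarily how the talk will organize things) checks that this two-phase estimate matches the exact gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, beta = 5, 3, 1e-3
W = rng.normal(size=(d_out, d_in))
x, y = rng.normal(size=d_in), rng.normal(size=d_out)

# Toy "physical" system with energy E(s) = 0.5*||s||^2 - s . (W x);
# its free equilibrium (dE/ds = 0) is simply s* = W x.
s_free = W @ x

# Nudged phase: minimize E(s) + beta * C(s, y) with cost C = 0.5*||s - y||^2.
s_nudged = (W @ x + beta * y) / (1.0 + beta)

# Equilibrium-propagation estimate: (1/beta) * (dE/dW at nudged - dE/dW at free),
# where dE/dW = -s x^T for this energy.
grad_ep = (np.outer(-s_nudged, x) - np.outer(-s_free, x)) / beta

# Exact gradient of the loss L(W) = C(s*(W), y) = 0.5*||W x - y||^2.
grad_exact = np.outer(W @ x - y, x)
print(np.max(np.abs(grad_ep - grad_exact)))   # small, and shrinks as beta -> 0
```

The appeal for hardware is that both phases are obtained by letting the physical system relax, so no explicit backward differentiation pass is required.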
Anya Ivanova
Today’s large language models (LLMs) routinely generate coherent, grammatical, and seemingly meaningful paragraphs of text. This achievement has led to speculation that LLMs have become “thinking machines”, capable of performing tasks that require reasoning and/or world knowledge. In this talk, I will introduce a distinction between formal competence (knowledge of linguistic rules and patterns) and functional competence (understanding and using language in the world). This distinction is grounded in human neuroscience, which shows that formal and functional competence recruit different brain mechanisms. I will show that the word-in-context prediction objective has allowed LLMs to essentially master formal linguistic competence; however, pretrained LLMs still lag behind on many aspects of functional linguistic competence, prompting engineers to adopt specialized fine-tuning techniques and/or couple LLMs with external modules. I will then turn to world knowledge, a capability where the formal/functional distinction is less clear-cut, and discuss our efforts to leverage both cognitive science and NLP to develop systematic ways to probe LLMs’ world knowledge. Finally, I will discuss the dos and don’ts of cognitive evaluations in LLMs.
Extended reading list for state space models, from Telluride 2024: Intro to SSMs
[Neuromorphic SSM] Pierro, A., Abreu, S., Timcheck, J., Stratmann, P., Wild, A., & Shrestha, S. B. (2025). Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity. [pdf]
[SSM] Orvieto, A., Smith, S. L., Gu, A., Fernando, A., Gulcehre, C., Pascanu, R., & De, S. (2023). Resurrecting Recurrent Neural Networks for Long Sequences. ICML. [pdf]
[SSM-LLM] Gu, A., & Dao, T. (2023). Mamba: Linear-Time Sequence Modeling with Selective State Spaces. [pdf]
[SSM-LLM] De, S., (...), & Gulcehre, C. (2024). Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models. [pdf]
[SSM] Voelker, A., Kajić, I., & Eliasmith, C. (2019). Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks. NeurIPS. [pdf]
LLM Mechanistic Interpretability & Steering:
Ameisen, E., Lindsey, J., Pearce, A., Gurnee, W., Turner, N. L., Chen, B., ... & Batson, J. (2025). Circuit tracing: Revealing computational graphs in language models. [link]
Postmus, J., & Abreu, S. (2024). Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering. [pdf]
Dhanraj, V., & Eliasmith, C. (2025). Improving Rule-based Reasoning in LLMs via Neurosymbolic Representations. [pdf]
Tamkin, A., Taufeeque, M., & Goodman, N. D. (2023). Codebook features: Sparse and discrete interpretability for neural networks. [pdf]
Reasoning with LLMs:
Evaluations:
[Eval] Kajić, I., Wiles, O., Albuquerque, I., Bauer, M., Wang, S., Pont-Tuset, J., & Nematzadeh, A. (2024). Evaluating Numerical Reasoning in Text-to-Image Models. arXiv:2406.14774. [pdf]
[Eval] Frank, M. C. (2023). Baby steps in evaluating the capacities of large language models. Nature Reviews Psychology, 2(8), 451-452. [pdf]
[Eval] Chang, Y., Wang, X., Wang, J., Wu, Y., Yang, L., Zhu, K., Chen, H., et al. (2023). A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology. [pdf]