Online Speakers' Corner on Vector Symbolic Architectures and Hyperdimensional Computing

CHECK THE UPCOMING EVENTS AT THE END OF THIS PAGE!

If you want to give credit to this webinar series, use the following entry when citing (BibTeX).


Welcome to the Fall 2022 session of the online workshop on VSA and hyperdimensional computing. The next webinar of the fall session will take place on December 19, 2022, at 20:00 GMT.

USE THIS LINK TO ACCESS THE WEBINAR:
https://ltu-se.zoom.us/j/65564790287

Hyperdimensional computing using time-to-spike neuromorphic circuits. August 29, 2022. 20:00GMT

Graham Bent, Cardiff University

Abstract:  Vector Symbolic Architectures (VSA) can be used to encode complex objects, such as services and sensors, as hypervectors. Such hypervectors can be used to perform efficient distributed service discovery and workflow orchestration in communications-constrained environments typical of the Internet of Things (IoT). In these environments, energy efficiency is of great importance. However, most hypervector representations use dense i.i.d. element values, and performing energy-efficient hyperdimensional computing operations on such dense vectors is challenging. More recently, a sparse binary VSA scheme has been proposed based on a slot encoding having M slots with B bit positions per slot, in which only one bit per slot can be set. This paper shows for the first time that such sparse encoded hypervectors can be mapped into energy-efficient time-to-spike Spiking Neural Network (SNN) circuits, such that all the required VSA operations can be performed. Example VSA SNN circuits have been implemented in the Brian 2 SNN simulator, showing that all VSA binding, bundling, unbinding, and clean-up memory operations execute correctly. Based on these circuit implementations, the energy and processing time required to perform the different VSA operations on typical SNN neuromorphic devices are estimated. Recommendations for the design of future SNN neuromorphic processor hardware that can more efficiently perform VSA processing are also made.
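
To make the slot encoding concrete, here is a minimal NumPy sketch of sparse block-code hypervectors with M slots of B positions each, together with the binding, unbinding, bundling, and clean-up operations the abstract refers to. The sizes and helper names are illustrative, and this is ordinary CPU code rather than the authors' Brian 2 spiking implementation.

```python
import numpy as np

M, B = 100, 64   # M slots with B bit positions per slot (illustrative sizes)
rng = np.random.default_rng(0)

def random_hv():
    """A sparse block-code hypervector: the index of the single active bit in each slot."""
    return rng.integers(0, B, size=M)

def bind(a, b):
    """Binding: slot-wise modular addition of the active positions."""
    return (a + b) % B

def unbind(c, a):
    """Unbinding: the inverse of bind, recovers a noisy b from c = bind(a, b)."""
    return (c - a) % B

def bundle(hvs):
    """Bundling: slot-wise vote over the active positions (ties broken by index order)."""
    counts = np.zeros((M, B), dtype=int)
    for hv in hvs:
        counts[np.arange(M), hv] += 1
    return counts.argmax(axis=1)

def similarity(a, b):
    """Fraction of slots whose active positions agree."""
    return float(np.mean(a == b))

# Bind a role to a filler, bundle with a distractor, then clean up against an item memory.
role, filler, other = random_hv(), random_hv(), random_hv()
trace = bundle([bind(role, filler), other])
query = unbind(trace, role)
item_memory = {"filler": filler, "other": other}
print(max(item_memory, key=lambda k: similarity(query, item_memory[k])))  # expected: filler
```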

Presented slides: Download

Attention Approximates Sparse Distributed Memory. September 12, 2022. 20:00GMT

Trenton Bricken, Harvard University, USA. 

Abstract:  While Attention has come to be an important mechanism in deep learning, there remains limited intuition for why it works so well. Here, we show that Transformer Attention can, under certain data conditions, be closely related to Kanerva's Sparse Distributed Memory (SDM), a biologically plausible associative memory model. We confirm that these conditions are satisfied in pre-trained GPT2 Transformer models. We discuss the implications of the Attention-SDM map and provide new computational and biological interpretations of Attention.
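
As a toy illustration of the associative-memory view of attention that this talk formalizes, the sketch below writes softmax attention as an exponentially weighted read over stored key-value pairs. The Gaussian data and sizes are assumptions for the demo, and the paper's precise correspondence to SDM's circle-intersection read weights is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 32                                     # dimension and number of stored patterns
keys = rng.standard_normal((n, d))
values = rng.standard_normal((n, d))
query = keys[3] + 0.1 * rng.standard_normal(d)    # a noisy version of a stored key

def attention_read(q, K, V, beta):
    """Softmax attention viewed as an associative-memory read:
    each stored value is weighted by exp(beta * q.k), then normalized."""
    scores = beta * (K @ q)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

out = attention_read(query, keys, values, beta=1.0 / np.sqrt(d))
print(np.corrcoef(out, values[3])[0, 1])          # the read is dominated by the best-matching pattern
```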

Presented slides: Download

VSA-based Few-shot Continual Learning with a Demonstration on In-memory Computing Hardware. September 26, 2022. 20:00GMT

Michael Hersche and Geethan Karunaratne, IBM Research-Zurich

Abstract:  In this two-part talk, we study the role of VSAs in the challenging research problem of continually learning new classes from a few training examples without forgetting previously learned classes.

In the first part of the talk, we focus on continual learning algorithms that respect certain memory and computational constraints: (i) training samples are limited to only a few per class, (ii) the computational cost of learning a novel class remains constant, and (iii) the memory footprint of the model grows at most linearly with the number of classes observed. To meet these constraints, we propose C-FSCIL (Constrained Few-shot Class-incremental Learning), which is architecturally composed of a frozen meta-learned feature extractor, a trainable fixed-size fully connected layer, and a rewritable dynamically growing memory that stores as many vectors as the number of encountered classes. C-FSCIL provides three update modes that offer a trade-off between accuracy and the compute-memory cost of learning novel classes. C-FSCIL exploits the hyperdimensional embedding of VSAs, which allows it to continually express many more classes than the fixed dimensions in the vector space, with minimal interference. The quality of the class vector representations is further improved by aligning them quasi-orthogonally to each other by means of novel loss functions. Experiments on the CIFAR100, miniImageNet, and Omniglot datasets show that C-FSCIL outperforms the baselines with remarkable accuracy and compression benefits. It also scales up to the largest problem size ever tried in this few-shot setting by learning 423 novel classes on top of 1200 base classes with less than 1.6% accuracy drop. This part of the talk is based on our CVPR 2022 paper: https://openaccess.thecvf.com/content/CVPR2022/html/Hersche_Constrained_Few-Shot_Class-Incremental_Learning_CVPR_2022_paper.html
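
As a rough sketch of the memory-based learning described above: a frozen embedding, one prototype per class averaged from a few examples, and nearest-prototype classification. The random projection standing in for the meta-learned extractor, all names, and the dimensions are hypothetical; the actual C-FSCIL update modes and loss functions are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512                                     # embedding dimension (illustrative)
W = rng.standard_normal((64, d)) / 8.0      # stand-in for a frozen, meta-learned extractor

def embed(x):
    """Frozen feature extractor followed by the fixed projection layer (toy version)."""
    return np.tanh(x @ W)

class PrototypeMemory:
    """Grows by one vector per class; never revisits data from old classes."""
    def __init__(self):
        self.prototypes = []                # one d-dimensional class vector per class

    def learn_class(self, examples):
        """Average the few embedded examples of a novel class into a prototype."""
        z = embed(examples).mean(axis=0)
        self.prototypes.append(z / np.linalg.norm(z))

    def classify(self, x):
        """Nearest prototype by cosine similarity."""
        z = embed(x)
        return int(np.argmax(np.stack(self.prototypes) @ (z / np.linalg.norm(z))))

# Usage: learn two classes from five examples each, then classify a fresh sample of class 1.
memory = PrototypeMemory()
class_means = [2.0 * rng.standard_normal(64) for _ in range(2)]
for mu in class_means:
    memory.learn_class(mu + 0.3 * rng.standard_normal((5, 64)))
print(memory.classify(class_means[1] + 0.3 * rng.standard_normal(64)))  # expected: 1
```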


In the second part of the talk, we demonstrate how C-FSCIL can be naturally realized on in-memory computing (IMC) cores. In particular, we focus on the first update mode of C-FSCIL, which is composed of a stationary deep CNN (including both the feature extractor and the fixed-size fully connected layer) and a dynamically evolving explicit memory. As the centerpiece of this architecture, we propose an explicit memory unit that leverages energy-efficient IMC cores during the course of continual learning operations. We demonstrate for the first time how the explicit memory unit can physically superpose multiple training examples, expand to accommodate unseen classes, and perform similarity search during inference, using operations on an IMC core based on phase-change memory devices. Specifically, the physical superposition of a few encoded training examples is realized via in-situ progressive crystallization of the phase-change memory devices. The classification accuracy achieved on the IMC core remains within 1.28%–2.5% of the state-of-the-art full-precision baseline software model on both the CIFAR-100 and miniImageNet datasets when continually learning 40 novel classes (from only five examples per class) on top of 60 old classes. This part of the talk is based on our ESSDERC 2022 paper: https://arxiv.org/abs/2207.06810


Presented slides: Download Presentation 1. Download Presentation 2.

Torchhd: An Open-Source Python Library to Support Hyperdimensional Computing Research. October 10, 2022. 20:00GMT

Mike Heddes, UC Irvine, USA.

Abstract:  Hyperdimensional Computing (HDC) is a neuro-inspired computing framework that exploits high-dimensional random vector spaces. HDC uses extremely parallelizable arithmetic to provide computational solutions that balance accuracy, efficiency and robustness. This has proven especially useful in resource-limited scenarios such as embedded systems. The commitment of the scientific community to aggregate and disseminate research in this particularly multidisciplinary field has been fundamental for its advancement. Adding to this effort, we propose Torchhd, a high-performance open-source Python library for HDC. Torchhd seeks to make HDC more accessible and serves as an efficient foundation for research and application development. The easy-to-use library builds on top of PyTorch and features state-of-the-art HDC functionality, clear documentation and implementation examples from notable publications. Comparing publicly available code with their Torchhd implementation shows that experiments can run up to 10⁴× faster. Torchhd is available at: this https URL
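
For readers new to HDC, the snippet below sketches, in plain PyTorch, the kind of core functionality such a library packages: random hypervectors, binding, bundling, similarity, and an associative clean-up. It deliberately does not use the Torchhd API itself, so the function names and the bipolar multiply-add style model are illustrative choices only.

```python
import torch

torch.manual_seed(0)
d = 10_000                                   # hypervector dimensionality

def random_hv():
    """A random bipolar (+1/-1) hypervector."""
    return (torch.randint(0, 2, (d,)) * 2 - 1).float()

def bind(a, b):
    """Element-wise multiplication: binds two hypervectors (and is its own inverse)."""
    return a * b

def bundle(*hvs):
    """Element-wise majority (sign of the sum): superposes several hypervectors."""
    return torch.sign(sum(hvs))

def cos(a, b):
    """Cosine similarity of hypervector a against every row of b."""
    return (b @ a) / (b.norm(dim=-1) * a.norm())

# Encode the record {country: Sweden, capital: Stockholm} as a single hypervector.
country, capital = random_hv(), random_hv()        # role hypervectors
sweden, stockholm = random_hv(), random_hv()       # filler hypervectors
record = bundle(bind(country, sweden), bind(capital, stockholm))

# Query: what is bound to the role "capital"? Unbind, then clean up against an item memory.
query = bind(record, capital)
item_memory = torch.stack([sweden, stockholm])
print(cos(query, item_memory))                     # highest similarity at index 1 (stockholm)
```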

Presented slides: Download slides here.

Recasting Self-Attention with Holographic Reduced Representations. October 24, 2022. 20:00GMT

Edward Raff, Booz Allen Hamilton

Abstract:  Self-Attention has become a fundamentally new approach to set and sequence modeling, particularly within transformer-style architectures. Given a sequence of T items, standard self-attention has O(T²) memory and compute needs, leading to many recent works building approximations to self-attention with reduced computational or memory complexity. In this work, we instead re-cast self-attention using the neuro-symbolic approach of Holographic Reduced Representations (HRR). In doing so we follow the same logical strategy as standard self-attention. Implemented as a “Hrrformer” we obtain several benefits including faster compute (O(T log T) time complexity), less memory use per layer (O(T) space complexity), convergence in 10× fewer epochs, near state-of-the-art accuracy, and we are able to learn with just a single layer. Combined, these benefits make our Hrrformer up to 370× faster to train on the Long Range Arena benchmark.
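
The HRR primitive underlying this work is binding by circular convolution, which the FFT computes in O(d log d). The sketch below shows binding and approximate unbinding on random HRR vectors; the vector length and names are illustrative, and this is the basic HRR algebra rather than the Hrrformer itself.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024

def hrr_vec():
    """A random HRR vector with element variance 1/d, as in Plate's HRR."""
    return rng.standard_normal(d) / np.sqrt(d)

def bind(a, b):
    """Circular convolution, computed in O(d log d) via the FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=d)

def inverse(a):
    """Approximate inverse (involution): reverse all elements except the first."""
    return np.concatenate(([a[0]], a[:0:-1]))

def unbind(c, a):
    """Recover a noisy copy of b from c = bind(a, b)."""
    return bind(c, inverse(a))

a, b = hrr_vec(), hrr_vec()
b_hat = unbind(bind(a, b), a)
print(np.dot(b_hat, b) / (np.linalg.norm(b_hat) * np.linalg.norm(b)))  # well above 0; unrelated vectors give ~0
```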

Presented slides

HDC-MiniROCKET: Explicit Time Encoding in Time Series Classification with Hyperdimensional Computing. November 7, 2022. 20:00GMT

Kenny Schlegel, Chemnitz University, Germany.

Abstract:  Classification of time series data is an important task for many application domains. One of the best existing methods for this task, in terms of accuracy and computation time, is MiniROCKET. In this work, we extend this approach to provide better global temporal encodings using hyperdimensional computing (HDC) mechanisms. HDC (also known as Vector Symbolic Architectures, VSA) is a general method to explicitly represent and process information in high-dimensional vectors. It has previously been used successfully in combination with deep neural networks and other signal processing algorithms. We argue that the internal high-dimensional representation of MiniROCKET is well suited to be complemented by the algebra of HDC. This leads to a more general formulation, HDC-MiniROCKET, of which the original algorithm is only a special case. We will discuss and demonstrate that HDC-MiniROCKET can systematically overcome catastrophic failures of MiniROCKET on simple synthetic datasets. These results are confirmed by experiments on the 128 datasets from the UCR time series classification benchmark. The extension with HDC can achieve considerably better results on datasets with high temporal dependence without increasing the computational effort for inference.
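
One common way to build graded temporal codes in HDC is fractional power encoding: a fixed random phasor hypervector is raised to a real-valued power equal to the (normalized) time stamp, so nearby time stamps produce similar hypervectors and distant ones become nearly orthogonal. The sketch below illustrates that idea only; the dimensionality, the bandwidth parameter, and how closely this matches HDC-MiniROCKET's actual encoder are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2048
phases = rng.uniform(-np.pi, np.pi, d)     # fixed random phases of the base phasor vector

def time_hv(t, bandwidth=6.0):
    """Fractional power encoding of a normalized time stamp t in [0, 1]."""
    return np.exp(1j * phases * bandwidth * t)

def sim(a, b):
    """Real part of the normalized inner product of two phasor hypervectors."""
    return float(np.real(np.vdot(a, b)) / d)

# Similarity decays smoothly with the time difference, at a rate set by the bandwidth.
print(sim(time_hv(0.10), time_hv(0.12)))   # close time stamps: high similarity
print(sim(time_hv(0.10), time_hv(0.90)))   # distant time stamps: near zero
```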

Presented slides

Graph Embeddings via Tensor Products and Approximately Orthonormal Codes. November 14, 2022. 20:00GMT

Frank Qiu, UC Berkeley, USA

Abstract:  We introduce a method for embedding graphs as vectors in a structure-preserving manner. In this paper, we showcase its rich representational capacity and give some theoretical properties of our method. In particular, our procedure falls under the bind-and-sum approach, and we show that our binding operation -- the tensor product -- is the most general binding operation that respects the principle of superposition. Similarly, we show that the spherical code achieves optimal compression. We then establish some precise results characterizing the performance of our method, as well as experimental results showcasing how it can accurately perform various graph operations even when the number of edges is quite large. Finally, we conclude by establishing a link to adjacency matrices, showing that our method is, in some sense, a generalization of adjacency matrices with applications to large sparse graphs.
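
Below is a minimal sketch of the bind-and-sum idea with the tensor (outer) product as the binding operation: each directed edge becomes the outer product of its vertex codes, the graph embedding is the sum of these products, and an edge query reduces to a bilinear form. The sizes and the Gaussian vertex codes are illustrative stand-ins for the paper's approximately orthonormal spherical codes.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_nodes = 1024, 50
codes = rng.standard_normal((n_nodes, d)) / np.sqrt(d)   # roughly unit-norm, nearly orthogonal codes

def embed_graph(edges):
    """Bind-and-sum: each directed edge (u, v) is the outer product of its vertex codes."""
    G = np.zeros((d, d))
    for u, v in edges:
        G += np.outer(codes[u], codes[v])
    return G

def edge_score(G, u, v):
    """Edge query as a bilinear form: about 1 for stored edges, about 0 otherwise."""
    return float(codes[u] @ G @ codes[v])

edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
G = embed_graph(edges)
print(round(edge_score(G, 0, 1), 2))   # stored edge: close to 1
print(round(edge_score(G, 1, 0), 2))   # not stored (direction matters): close to 0
```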

Related publications: 

1. Statistical Comparison of Embedding Methods: https://arxiv.org/abs/2208.08769

2. Recipe Paper Introducing Method: https://arxiv.org/abs/2208.10917

Presented slides: Download.

Hyperdimensional Hashing: A Robust and Efficient Dynamic Hash Table. December 5, 2022. 20:00GMT

Mike Heddes, UC Irvine, USA.

Abstract:  Most cloud services and distributed applications rely on hashing algorithms that allow dynamic scaling of a robust and efficient hash table. Examples include AWS, Google Cloud and BitTorrent. Consistent and rendezvous hashing are algorithms that minimize key remapping as the hash table resizes. While memory errors in large-scale cloud deployments are common, neither algorithm offers both efficiency and robustness. Hyperdimensional Computing is an emerging computational model that is inherently efficient and robust and is well suited for vector or hardware acceleration. We propose Hyperdimensional (HD) hashing and show that it has the efficiency to be deployed in large systems. Moreover, a realistic level of memory errors causes more than 20% mismatches for consistent hashing, while HD hashing remains unaffected. As part of HD hashing, we introduce circularly correlated hypervectors, a novel hypervector set used to encode circular data.
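
As a rough illustration of hashing with hypervectors, the toy sketch below assigns each key to the node whose hypervector is most similar to the key's, rendezvous-style, so that growing the table remaps only about one in n keys. This is not the paper's HD hashing algorithm: the circularly correlated hypervectors and the robustness analysis are not reproduced, and all names and sizes are hypothetical.

```python
import hashlib
import numpy as np

d = 10_000

def hv(seed: str):
    """A deterministic bipolar hypervector derived from a string seed (a key or a node id)."""
    s = int(hashlib.sha256(seed.encode()).hexdigest(), 16) % 2**32
    return np.random.default_rng(s).integers(0, 2, d) * 2 - 1

def assign(key, node_hvs):
    """Map a key to the node whose hypervector is most similar to the key's hypervector."""
    return int(np.argmax(node_hvs @ hv(key)))

nodes8 = np.stack([hv(f"node-{i}") for i in range(8)])
nodes9 = np.stack([hv(f"node-{i}") for i in range(9)])   # the hash table grows by one node

keys = [f"user:{i}" for i in range(1000)]
moved = sum(assign(k, nodes8) != assign(k, nodes9) for k in keys)
print(moved, "of 1000 keys remapped after adding a node")  # expected around 1000/9 ~= 111
```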

Presented slides: Download

PANEL: Work-in-progress and future perspective directions in HDC/VSA. December 19, 2022. 20:00GMT

CLICK HERE to watch the recording of the panel session.

Slides summarising the discussion: Download (will be available soon)