Large sparse matrices are common in general, and especially in applied machine learning: they appear in data that contains counts, in encodings that map categories to counts, and across whole subfields such as natural language processing.
A sparse matrix is a matrix that contains very few non-zero elements. Storing and processing it as a dense matrix incurs space and time overheads.
There is no strict definition of how many elements need to be zero for a matrix to be considered sparse, but a common criterion is that the number of non-zero elements is roughly equal to the number of rows or columns.
The sparsity of a matrix can be quantified with a score, which is the number of zero values in the matrix divided by the total number of elements in the matrix.
sparsity = count of zero elements / total elements
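As a minimal sketch (the example matrix is invented for illustration), the score can be computed with NumPy:

```python
import numpy as np

# A small example matrix with many zero entries.
A = np.array([
    [1, 0, 0, 1, 0, 0],
    [0, 0, 2, 0, 0, 1],
    [0, 0, 0, 2, 0, 0],
])

# sparsity = count of zero elements / total elements
sparsity = 1.0 - np.count_nonzero(A) / A.size
print(sparsity)  # 0.7222... (13 zeros out of 18 elements)
```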
While they occur naturally in some data collection processes, more often they arise when applying certain data transformation techniques, such as one-hot encoding of categorical variables, count vectorization, and TF-IDF weighting of text (the one-hot case is sketched below).
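A hedged sketch of the one-hot case, building the encoding directly in CSR form with SciPy (the category values here are invented):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical categorical column: each value maps to one column index.
categories = ["red", "green", "blue", "green", "red", "red"]
vocab = {c: i for i, c in enumerate(sorted(set(categories)))}

rows = np.arange(len(categories))
cols = np.array([vocab[c] for c in categories])
data = np.ones(len(categories))

# One-hot encoding stored in sparse form: exactly one non-zero per row.
one_hot = csr_matrix((data, (rows, cols)), shape=(len(categories), len(vocab)))
print(one_hot.toarray())
```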
There are many data structures for storing sparse matrices; common choices include the coordinate (COO), compressed sparse row (CSR), and compressed sparse column (CSC) formats. MATLAB stores sparse matrices in compressed sparse column format.
It's usually best to use a library for sparse matrix calculations, since a library provides an efficient implementation that is suitable for most cases.
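For example, a minimal sketch using SciPy's scipy.sparse (the matrix size and sparsity level are chosen arbitrarily) that compares dense and compressed storage:

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

rng = np.random.default_rng(0)

# Dense random matrix where roughly 99% of the entries are zero.
dense = rng.random((1000, 1000))
dense[dense < 0.99] = 0.0

csr = csr_matrix(dense)  # compressed sparse row
csc = csc_matrix(dense)  # compressed sparse column (the format MATLAB uses)

print(dense.nbytes)  # ~8 MB for the dense array
print(csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes)  # a small fraction of that
```

As a rule of thumb, CSR is convenient for row slicing and matrix-vector products, while CSC favors column operations.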
In recommender systems, we typically work with very sparse matrices: the item universe is very large, while a single user typically interacts with only a small subset of it.
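A hedged sketch of building such a user-item matrix with SciPy from an invented interaction log:

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical interaction log: (user_id, item_id, rating) triples.
users   = np.array([0, 0, 1, 2, 2, 2])
items   = np.array([3, 7, 1, 0, 3, 9])
ratings = np.array([5.0, 3.0, 4.0, 2.0, 5.0, 1.0])

n_users, n_items = 3, 10  # in practice the item universe is far larger

# COO is convenient for construction; convert to CSR for arithmetic and slicing.
interactions = coo_matrix((ratings, (users, items)), shape=(n_users, n_items)).tocsr()
print(interactions.nnz, "observed interactions out of", n_users * n_items, "possible")
```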
In image processing, a sparse matrix is a compact way of representing an image in which most pixel values are zero, which avoids unnecessary processing of those zero-valued pixels. The paper on image interpolation using sparse matrix representation, linked below, discusses this in detail.
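As a toy illustration (the image and its pixel values are invented), only the non-zero pixels need to be stored and visited:

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical mostly-black image with a few bright pixels.
image = np.zeros((100, 100), dtype=np.uint8)
image[10, 20] = 255
image[50, 50] = 128
image[75, 5]  = 64

sparse_image = coo_matrix(image)

# Iterate over the non-zero pixels only.
for r, c, v in zip(sparse_image.row, sparse_image.col, sparse_image.data):
    print(f"pixel ({r}, {c}) = {v}")
```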
Sparse networks, that is, neural networks in which a large subset of the model parameters are zero, have emerged as one of the leading approaches for reducing model parameter count.
It has been shown empirically that deep neural networks can achieve state-of-the-art results under high levels of sparsity (Han et al., 2015; Louizos et al., 2017; Gale et al., 2019), and this property has been leveraged to significantly reduce the parameter footprint and inference complexity (Kalchbrenner et al., 2018) of densely connected neural networks. The arXiv paper and the article on Google's RigL algorithm linked below discuss this in detail.
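A minimal sketch of magnitude pruning, one common way such sparse networks are obtained; this is a simplification for illustration, not the exact procedure from the papers cited above:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights so that `sparsity` fraction are zero."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))            # a hypothetical dense weight matrix
w_sparse = magnitude_prune(w, sparsity=0.9)
print(1.0 - np.count_nonzero(w_sparse) / w_sparse.size)  # ~0.9
```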
For transformer NLP models, memory and time complexity are a big issue. OpenAI used sparse factorisations of the attention matrices, which reduced the network complexity from O(N²) to O(N√N). This allows OpenAI to "model sequences with tens of thousands of elements using hundreds of layers," compared to other networks that can only handle sequences of "a few thousand elements."
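A rough sketch of the idea (an assumed strided pattern, not OpenAI's actual kernels): with stride ≈ √N, each position attends to O(√N) others, so the total number of attended pairs is O(N√N) rather than O(N²):

```python
import numpy as np

def strided_attention_mask(n: int, stride: int) -> np.ndarray:
    """Causal mask where position i attends to the previous `stride` positions
    and to every `stride`-th earlier position."""
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        # Local window of recent positions.
        mask[i, max(0, i - stride):i + 1] = True
        # Strided "summary" positions further back.
        mask[i, np.arange(i % stride, i + 1, stride)] = True
    return mask

n = 1024
stride = int(np.sqrt(n))  # stride ~ sqrt(N)
mask = strided_attention_mask(n, stride)
print(mask.sum(), "attended pairs vs", n * n, "for dense attention")
```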
https://machinelearningmastery.com/sparse-matrices-for-machine-learning/
https://dziganto.github.io/Sparse-Matrices-For-Efficient-Machine-Learning/
https://en.wikipedia.org/wiki/Sparse_matrix
https://www.quora.com/What-is-the-fastest-sparse-matrix-data-structure-for-speeding-up-multiplication-of-a-1000-x-1000-there-about-double-valued-matrix
https://towardsdatascience.com/why-we-use-sparse-matrices-for-recommender-systems-2ccc9ab698a4
https://arxiv.org/pdf/1906.10732.pdf
https://analyticsindiamag.com/rigl-google-algorithm-neural-networks/
https://www.ijert.org/research/image-interpolation-using-sparse-matrix-representation-IJERTV4IS020434.pdf
https://www.researchgate.net/publication/258499564_Sparse_Matrix_Factorization
https://www.infoq.com/news/2019/05/openai-sparse-transformers/
https://matteding.github.io/2019/04/25/sparse-matrices/