Subspace learning (also called dimensionality reduction) seeks to find a low-dimensional subspace to facilitate data analysis by reducing the dimension of the data. Different adaptations of regular subspace learning methods have been developed in the literature to address their limitations in the presence of the challenges posed by large-scale modern data, as follows:
Robust subspace learning refers to the problem of subspace learning in the presence of outliers; it is both theoretically and practically a much harder problem than regular subspace learning. [Outliers]
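A widely studied formulation of robust subspace learning is principal component pursuit, which splits the data matrix into a low-rank part plus a sparse outlier part. The following is a minimal NumPy sketch of an alternating singular-value/soft-thresholding scheme for this decomposition; the default choices of lam and mu are common heuristics assumed here, not tuned values.

```python
import numpy as np

def robust_pca_pcp(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Minimal principal component pursuit sketch: M ~ L (low rank) + S (sparse outliers)."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else (m * n) / (4.0 * np.abs(M).sum())
    S = np.zeros_like(M)
    Y = np.zeros_like(M)          # scaled dual variable
    norm_M = np.linalg.norm(M, 'fro')
    for _ in range(max_iter):
        # Low-rank update by singular value thresholding
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Sparse update by entrywise soft thresholding
        T = M - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        # Dual ascent on the constraint M = L + S
        Y = Y + mu * (M - L - S)
        if np.linalg.norm(M - L - S, 'fro') < tol * norm_M:
            break
    return L, S
```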
Adversarial robust subspace learning refers to the problem of subspace learning on malicious data produced by powerful adversaries who can modify the whole data set, in addition to random noise or unintentionally corrupted data.
Dynamic subspace learning refers to the problem of subspace learning on incoming data that must be processed in an online manner.
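As a simple illustration (a sketch, not a specific published tracker), an online subspace estimate can be updated one sample at a time with an Oja-style gradient step followed by re-orthonormalization; the step size eta below is a hypothetical choice.

```python
import numpy as np

def online_subspace_update(U, x, eta=0.01):
    """One Oja-style update of an orthonormal basis U (d x k) with a new sample x (d,)."""
    x = x.reshape(-1, 1)
    U = U + eta * x @ (x.T @ U)      # gradient step toward the new sample's direction
    Q, _ = np.linalg.qr(U)           # re-orthonormalize to keep an orthonormal basis
    return Q

# Usage: stream samples and track a k-dimensional subspace.
rng = np.random.default_rng(0)
d, k = 20, 3
U = np.linalg.qr(rng.standard_normal((d, k)))[0]
for _ in range(1000):
    x = rng.standard_normal(d)
    U = online_subspace_update(U, x)
```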
Sparse subspace learning addresses the problem of high-dimensional data by finding linear combinations that contain just a few input variables. [High dimensions]
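For instance, scikit-learn's SparsePCA produces components with many exactly zero loadings, so each component involves only a few input variables; the penalty alpha below is an arbitrary illustrative value.

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 30))            # toy data: 200 samples, 30 features

spca = SparsePCA(n_components=5, alpha=1.0, random_state=0)
X_low = spca.fit_transform(X)                 # reduced representation (200 x 5)
print((spca.components_ != 0).sum(axis=1))    # number of active variables per component
```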
Distributed subspace learning can harness local communications and network connectivity to overcome the need to communicate and access the entire data set at a single location. Indeed, data tends to be distributed for various reasons. First, it can be inherently distributed, as in IoT or sensor networks. Second, it can be distributed due to storage and/or computational limitations. The ultimate goal of any distributed algorithm is to solve a common problem using data shared among the distributed entities, which communicate with each other so that all entities collectively reach a solution nearly as good as that of a centralized algorithm, for which the data is available at a single location. Distributed setups can be broadly classified into two categories: 1) centralized, having a central entity/server that coordinates with the other nodes in a master-slave architecture, and 2) decentralized, without any central entity, in which the nodes are connected in an arbitrary network. In the centralized setup, the central entity aggregates information from all the nodes and yields the final result. A decentralized architecture does not rely on any central entity; it is a more general setup and it lacks a single point of failure. [BigData]
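In the centralized setup described above, one simple scheme (a sketch, under the assumption that nodes may share local second-moment statistics) has every node send its local covariance to the server, which averages them and extracts the leading eigenvectors.

```python
import numpy as np

def distributed_pca_centralized(local_datasets, k):
    """Server-side aggregation sketch: average local covariances, then eigendecompose."""
    total_n = sum(X.shape[0] for X in local_datasets)
    # Each node would compute X.T @ X and its sample count locally and send them to the server.
    # (Mean centering is omitted for brevity.)
    C = sum(X.T @ X for X in local_datasets) / total_n
    eigvals, eigvecs = np.linalg.eigh(C)
    return eigvecs[:, ::-1][:, :k]            # top-k eigenvectors as columns

# Usage with three simulated nodes holding different data shards.
rng = np.random.default_rng(0)
shards = [rng.standard_normal((100, 10)) for _ in range(3)]
U = distributed_pca_centralized(shards, k=2)
```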
Exponential family subspace learning refers to a family of statistical methods for dimension reduction of large-scale data that are not real-valued, such as user ratings for items in e-commerce, and digital images in computer vision. [BigData]
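As a hedged illustration of the idea for binary data, the sketch below fits a low-rank matrix of natural parameters Theta = A @ B.T by gradient descent on the Bernoulli (logistic) log-likelihood; the rank, step size, and iteration count are arbitrary assumptions rather than recommended settings.

```python
import numpy as np

def bernoulli_pca(X, k=3, lr=0.05, n_iter=500, seed=0):
    """Low-rank natural-parameter fit for binary X: X_ij ~ Bernoulli(sigmoid((A @ B.T)_ij))."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    A = 0.01 * rng.standard_normal((n, k))
    B = 0.01 * rng.standard_normal((d, k))
    for _ in range(n_iter):
        P = 1.0 / (1.0 + np.exp(-(A @ B.T)))  # sigmoid of the natural parameters
        G = P - X                             # gradient of the negative log-likelihood in Theta
        A, B = A - lr * (G @ B), B - lr * (G.T @ A)
    return A, B

# Usage on a toy binary matrix (e.g., user-item interactions).
rng = np.random.default_rng(1)
X = (rng.random((100, 40)) < 0.3).astype(float)
A, B = bernoulli_pca(X)
```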
Fair subspace learning refers to dimensionality reduction which aims to represent two populations A and B with similar fidelity. Indeed, the classical subspace learning formulation does not take into account different sensitive groups when projecting the data. As a consequence, the reduced dataset may contain distinct representation errors for those different groups, which may introduce a bias in practical applications.
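The disparity described above can be made concrete by measuring the reconstruction error of a projection separately on each group; the sketch below does exactly that (the group labels and the synthetic data are illustrative assumptions, and this is only a diagnostic, not a specific published fair subspace learning algorithm).

```python
import numpy as np

def group_reconstruction_errors(X, groups, U):
    """Average reconstruction error of the projector U @ U.T, reported per sensitive group."""
    residual = X - X @ U @ U.T
    errors = {}
    for g in np.unique(groups):
        mask = groups == g
        errors[g] = np.mean(np.sum(residual[mask] ** 2, axis=1))
    return errors

# Usage: fit ordinary PCA, then compare how well it represents groups A and B.
rng = np.random.default_rng(0)
X_A = rng.standard_normal((150, 10))
X_B = rng.standard_normal((50, 10)) @ np.diag(np.linspace(0.5, 3.0, 10))
X = np.vstack([X_A, X_B]) - np.vstack([X_A, X_B]).mean(axis=0)
groups = np.array(['A'] * 150 + ['B'] * 50)
U = np.linalg.svd(X, full_matrices=False)[2][:2].T   # top-2 principal directions
print(group_reconstruction_errors(X, groups, U))
```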
Federated subspace learning refers to a distributed subspace learning scenario in which users/nodes keep their data private but only share intermediate locally computed iterates with the master node. The master, in turn, shares a global aggregate of these iterates with all the nodes at each iteration. In federated learning, the server trains a shared model based on data originating from remote clients such as smartphones and IoT devices, without the need to store and process the data in a centralized server. In this way, federated learning enables joint model training over privacy-sensitive data. [BigData]
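A common template matching this description is a federated orthogonal (power) iteration: each client multiplies the current basis by its local covariance and shares only that product, and the master averages the received iterates and re-orthonormalizes. The sketch below simulates this loop in NumPy; communication, privacy mechanisms, and client sampling are omitted, and the round count is an arbitrary assumption.

```python
import numpy as np

def federated_subspace_iteration(client_data, k, n_rounds=50, seed=0):
    """Simulated federated power iteration: clients share C_i @ V, never their raw data."""
    rng = np.random.default_rng(seed)
    d = client_data[0].shape[1]
    V = np.linalg.qr(rng.standard_normal((d, k)))[0]      # initial global basis
    for _ in range(n_rounds):
        # Each client computes its local covariance-times-basis product locally.
        updates = [(X.T @ (X @ V)) / X.shape[0] for X in client_data]
        # The master aggregates the iterates and re-orthonormalizes the result.
        V, _ = np.linalg.qr(np.mean(updates, axis=0))
    return V

clients = [np.random.default_rng(i).standard_normal((80, 12)) for i in range(5)]
V = federated_subspace_iteration(clients, k=3)
```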
Scalable subspace learning allows handling large-scale datasets instead of being limited to small-to-medium-sized datasets. [BigData]
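One standard ingredient for scaling subspace learning to large matrices is randomized sketching: project the data onto a small random test matrix, orthonormalize, and compute an SVD of the much smaller projected matrix. A minimal NumPy version is sketched below (the oversampling amount is an assumption).

```python
import numpy as np

def randomized_low_rank_svd(A, k, oversample=10, seed=0):
    """Approximate rank-k SVD of a large matrix A using a random range finder."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)            # approximate basis for the range of A
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k]

A = np.random.default_rng(0).standard_normal((5000, 300))
U, s, Vt = randomized_low_rank_svd(A, k=10)
```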
Probabilistic subspace learning interprets subspace learning in a probabilistic manner to address a notable weakness of regular subspace learning, namely the absence of an associated probabilistic model for the observed data and the noise. [noise model]
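Under the classical probabilistic PCA model (a Gaussian latent-variable model with isotropic noise), the maximum-likelihood loading matrix and noise variance have a closed form in terms of the sample covariance eigendecomposition; a minimal sketch:

```python
import numpy as np

def ppca_ml(X, k):
    """Closed-form maximum-likelihood probabilistic PCA: x ~ W z + mu + Gaussian noise."""
    n = X.shape[0]
    mu = X.mean(axis=0)
    C = (X - mu).T @ (X - mu) / n             # sample covariance
    eigvals, eigvecs = np.linalg.eigh(C)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    sigma2 = eigvals[k:].mean()               # ML noise variance: mean of discarded eigenvalues
    W = eigvecs[:, :k] @ np.diag(np.sqrt(np.maximum(eigvals[:k] - sigma2, 0.0)))
    return W, mu, sigma2

X = np.random.default_rng(0).standard_normal((500, 8))
W, mu, sigma2 = ppca_ml(X, k=2)
```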
Heterogeneous subspace learning addresses the case where data are collected from different sources with heterogeneous trends while still sharing some congruency. In this case, it is critical to extract shared knowledge while retaining the unique features of each source. Personalized PCA (perPCA) (Shi et al., 2024) is a method for distributedly recovering shared and unique features with strong convergence guarantees. However, perPCA is limited to handling symmetric covariance matrices and cannot be extended to asymmetric and incomplete observation settings, where only a small subset of data is available. An extensible algorithm named HMF (Shi et al., 2024) allows separating shared and unique factors with a provable convergence guarantee.
Kernel subspace learning allows addressing nonlinear dimensionality reduction. [nonlinearity]
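For example, scikit-learn's KernelPCA implements this nonlinear extension; the RBF kernel and gamma value below are illustrative choices on synthetic circular data.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 300)
X = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.05 * rng.standard_normal((300, 2))

kpca = KernelPCA(n_components=2, kernel='rbf', gamma=2.0)
X_kpca = kpca.fit_transform(X)    # nonlinear embedding of the circular data
```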
Multi-view subspace learning refers to the problem of subspace learning in the presence of data obtained from different views. It aims to seek a common space where features across different views could be aligned well.
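One classical way to seek such a common space for two views is canonical correlation analysis; scikit-learn's CCA is used below on synthetic paired views (the number of components and the latent-signal construction are arbitrary illustrative choices).

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
Z = rng.standard_normal((200, 3))                       # shared latent signal across views
X_view1 = Z @ rng.standard_normal((3, 10)) + 0.1 * rng.standard_normal((200, 10))
X_view2 = Z @ rng.standard_normal((3, 15)) + 0.1 * rng.standard_normal((200, 15))

cca = CCA(n_components=3)
X1_c, X2_c = cca.fit_transform(X_view1, X_view2)        # aligned representations of the two views
```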
Multilinear subspace learning refers to the problem of subspace learning in the presence of massive multidimensional data. Indeed, we encounter not only first-order tensors (vectors) and second-order tensors (matrices), but also a large number of higher-order tensors. A tensor representation is then employed instead of the vector or matrix representations. It can be considered as subspace learning based on tensors.
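A basic instance of tensor-based subspace learning is the truncated higher-order SVD (a Tucker-style decomposition), sketched below with plain NumPy: each mode gets its own factor matrix, and a small core tensor replaces the original data.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move the chosen axis to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated higher-order SVD: per-mode factor matrices plus a small core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        U = np.linalg.svd(unfold(T, mode), full_matrices=False)[0][:, :r]
        factors.append(U)
    core = T
    for mode, U in enumerate(factors):
        # Multiply the tensor by U.T along the given mode.
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

T = np.random.default_rng(0).standard_normal((10, 12, 8))   # a third-order data tensor
core, factors = hosvd(T, ranks=(3, 4, 2))                   # core has shape (3, 4, 2)
```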
The different existing subspace learning methods can be classified as reconstructive (also called generative) or discriminative methods, supervised or unsupervised methods, local or global methods, and matrix-based or tensor-based methods. In addition, they can be unified into a general graph embedding framework as developed in (Yan et al. 2007) (Yu et al. 2018) (Han et al. 2018).
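In the graph embedding view of (Yan et al. 2007), many such methods reduce to a generalized eigenvalue problem involving an intrinsic graph Laplacian L and a constraint matrix B. The sketch below illustrates this template with a hypothetical Gaussian-weighted nearest-neighbour graph and B taken as the degree matrix; the specific graph construction is an assumption for illustration only.

```python
import numpy as np
from scipy.linalg import eigh

def graph_embedding(X, k_dim=2, n_neighbors=5, sigma=1.0):
    """Graph-embedding sketch: minimize tr(Y.T @ L @ Y) subject to Y.T @ D @ Y = I."""
    n = X.shape[0]
    dist2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        neighbors = np.argsort(dist2[i])[1:n_neighbors + 1]       # skip the point itself
        W[i, neighbors] = np.exp(-dist2[i, neighbors] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                                        # symmetrize the affinity graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                                     # intrinsic graph Laplacian
    eigvals, eigvecs = eigh(L, D)                                 # generalized eigenvalue problem
    return eigvecs[:, 1:k_dim + 1]                                # drop the trivial constant vector

Y = graph_embedding(np.random.default_rng(0).standard_normal((100, 10)))
```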
Definitions
Supervised subspace learning: the subspace is learned from labeled examples.
Unsupervised subspace learning: the subspace is learned from unlabeled examples.
Reinforcement subspace learning: the subspace is learned through interaction with an environment.
References
S. Yan, D. Xu, B. Zhang, H. Zhang, Q. Yang, S. Lin, "Graph embedding and extensions: a general framework for dimensionality reduction", IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 29, No. 1, pages 40-51, 2007.
W. Yu, F. Nie, F. Wang, R. Wang, X. Li, "Fast and Flexible Large Graph Embedding based on Anchors", IEEE Journal of Selected Topics in Signal Processing, December 2018.
N. Han, J. Wu, Y. Liang, X. Fang, W. Wong, S. Teng, "Low-rank and sparse embedding for dimensionality reduction", Neural Networks, pages 202-216, 2018.
N. Shi, R. Kontar, S. Fattahi, "Heterogeneous matrix factorization: When features differ by datasets", Preprint, March 2024.