Sparse modeling is a rapidly developing area at the intersection of statistics, machine learning, and signal processing. The aim of this tutorial is to survey recent advances in sparse modeling, including sparse regression [1,5,7] and classification [8,9], sparse graphical model selection [25,15,26] and other sparse M-estimators [27,28], sparse dimensionality reduction, sparse signal recovery in signal processing, sparse optimization methods, and related areas, all driven by the objective of recovering a small number of the most relevant variables in high-dimensional data. Sparse modeling is particularly important in the many applications of machine learning and statistics where the main objective is to discover predictive patterns in data that enhance our understanding of the underlying physical, biological, and other natural processes, beyond just building accurate 'black-box' predictors. Common examples include biomarker selection in biological applications [1], finding brain areas predictive of 'brain states' based on fMRI data [2], and identifying network bottlenecks that best explain end-to-end performance [3,4], to name just a few. Moreover, efficient recovery of high-dimensional sparse signals from a relatively small number of observations is the main focus of compressed sensing [16-19], a rapidly growing and extremely popular area of signal processing.

Recent years have witnessed a flurry of research on algorithms and theory for sparse modeling and sparse signal recovery. Various types of convex relaxation, particularly L1-regularization, have proven very effective: examples include the LASSO [5], the Elastic Net [1], L1-regularized GLMs [7], sparse classifiers such as the sparse (1-norm) SVM [8,9], and sparse dimensionality reduction methods (e.g., sparse component analysis [10], and particularly sparse PCA [11,12] and sparse NMF [13,14]). Applications of these methods are wide-ranging, including computational biology, medicine, neuroscience, graphical model selection, and compressed sensing. Theoretical work has established conditions under which various relaxation methods can recover an underlying sparse signal, provided bounds on sample complexity, and investigated trade-offs between different choices of design matrix properties that guarantee good performance. A wide range of algorithms has been proposed, including greedy search methods, L1-regularized optimization, and Bayesian approaches [23,24], to name just a few.
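
As a concrete illustration of L1-regularized sparse recovery, the following minimal sketch solves a LASSO problem with a basic proximal gradient iteration (ISTA) on synthetic data. It is not taken from the tutorial itself: the function names (soft_threshold, lasso_ista, make_sparse_problem), the choice of ISTA as the solver, the problem sizes, and the regularization weight lam=0.1 are all assumptions made for illustration only.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=2000):
    """Minimize (1/(2n)) * ||y - X w||^2 + lam * ||w||_1 by proximal gradient (ISTA)."""
    n, p = X.shape
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # of the smooth least-squares term (squared spectral norm of X over n).
    L = np.linalg.norm(X, 2) ** 2 / n
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n          # gradient of the smooth part
        w = soft_threshold(w - grad / L, lam / L)  # proximal (shrinkage) step
    return w


def make_sparse_problem(n=100, p=200, k=5, noise=0.01, seed=0):
    """Hypothetical synthetic setup: Gaussian design, k nonzero coefficients of size 3."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    w_true = np.zeros(p)
    support = rng.choice(p, size=k, replace=False)
    w_true[support] = 3.0 * rng.choice([-1.0, 1.0], size=k)
    y = X @ w_true + noise * rng.standard_normal(n)
    return X, y, w_true


# Fewer observations (n=100) than variables (p=200), with only k=5 relevant ones.
X, y, w_true = make_sparse_problem()
w_hat = lasso_ista(X, y, lam=0.1)
print("true support:     ", np.sort(np.flatnonzero(w_true)))
print("largest estimates:", np.sort(np.argsort(np.abs(w_hat))[-5:]))
```

In this setting the number of observations is smaller than the number of variables, so ordinary least squares is underdetermined; the L1 penalty is what drives most coefficients to exactly zero. In practice one would more likely rely on a tuned solver (coordinate descent, accelerated proximal methods) and choose lam by cross-validation rather than fix it by hand.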

We hope to survey the key recent developments in the above fields while maintaining a balance among three aspects: the theoretical basis for sparse modeling, algorithmic approaches, and applications, particularly biological ones, including brain imaging using fMRI data.