Motivation and Goal

Improvements in engineering and data acquisition techniques have made high-dimensional data easily available. As a result, statistical analysis of high-dimensional data has become common in many scientific fields, ranging from biology, genomics and the health sciences to astronomy, economics and machine learning. Despite the high dimensionality and complexity of the data, many problems have structure that makes efficient statistical inference possible. Examples of such structure include sparsity, sparse conditional independence graphs, low-dimensional manifolds, low-rank factorizations, latent variables and semiparametric copulas. In the last decade, sparsity-inducing regularization methods have proven very useful in high-dimensional models, both for selecting a small set of highly predictive variables and for uncovering physical phenomena underlying many systems under scientific investigation. Today, sparsity is a major tool for handling statistical problems in high dimensions.

The machine learning and statistics communities have invested a great deal of effort in understanding the theoretical properties of l1-regularization procedures and in devising efficient algorithms for large-scale problems. As a result, we have a good understanding of the theory behind l1-regularization methods and are capable of fitting simple models, such as linear regression and Gaussian models, to large amounts of data. Unfortunately, theoretical results based on these oversimplified models often do not reflect the difficulties encountered in real-life problems. For example, it is hard (and often impossible) to check whether the model assumptions hold for a given data set. Furthermore, practitioners often have access to substantial prior knowledge about the problem, which should be incorporated into the model. On the other hand, many Bayesian procedures work well in practice and provide a flexible framework for incorporating prior knowledge. However, little or nothing can be said mathematically about their generalization performance.

Going beyond simple sparsity, there have been many extensions of the Lasso, such as the group Lasso, fused Lasso, multi-task Lasso, and elastic net. These extensions aim to incorporate additional structure into the model and to improve on the Lasso in settings where it fails. The structure may be given in advance or hidden in the data. Learning and exploiting such structure is a crucial first step toward better exploring and understanding complex datasets. This raises two key questions:

  • How can we automatically learn the hidden structure from the data?
  • Once the structure is learned or given in advance, how can we utilize it to conduct more effective inference?

Machine learning and statistics communities have addressed these two key questions from different perspectives: Bayesian vs. frequentist, parametric vs. nonparametric, optimization vs. integration. 

Below we describe some applications that benefit from exploiting complex structure:

a) Sparse conditional independence graphs: Sparse network models are typically learned by maximizing an l1-penalized likelihood or pseudo-likelihood. These approaches, while computationally efficient, ignore prior information about the system under consideration. For example, one biological application involves estimating gene regulatory networks, about which a great deal of information has been collected through experiments.
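As a minimal sketch of the l1-penalized likelihood approach (the graphical lasso, here via scikit-learn's GraphicalLasso; the chain-structured precision matrix and sample size are purely illustrative):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Illustrative ground truth: a sparse "chain" precision matrix on 5 variables,
# i.e. a conditional independence graph 1 - 2 - 3 - 4 - 5
p = 5
prec = np.eye(p)
for i in range(p - 1):
    prec[i, i + 1] = prec[i + 1, i] = 0.4
cov = np.linalg.inv(prec)

X = rng.multivariate_normal(np.zeros(p), cov, size=500)

# l1-penalized maximum likelihood estimate of the precision matrix;
# the penalty shrinks entries for non-edges toward zero
model = GraphicalLasso(alpha=0.05).fit(X)
est = model.precision_
print(np.round(est, 2))
```

Note that the penalty level alpha is applied uniformly here; the prior information mentioned above could, for instance, motivate penalizing known edges less than unknown ones.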

b) Multi-task learning: The premise of multi-task learning is that the efficiency of an estimation procedure can be improved by learning several related tasks jointly. Commonly, all tasks are assumed to share the same underlying structure, such as sparsity or a low-rank representation. In practice this is not necessarily the case, and the main question is how to incorporate additional knowledge about the relationships between tasks into an estimation procedure.
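A small sketch of the shared-sparsity assumption, using scikit-learn's MultiTaskLasso (the synthetic data, with all tasks sharing the same three relevant features, are an illustrative assumption):

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
n, p, tasks = 100, 30, 4
X = rng.standard_normal((n, p))

# Illustrative ground truth: all 4 tasks depend on the same 3 features
W = np.zeros((p, tasks))
W[:3, :] = 2.0
Y = X @ W + 0.1 * rng.standard_normal((n, tasks))

# The row-wise l2 penalty (l1/l2 mixed norm) couples the tasks:
# a feature is selected for all tasks or for none
mtl = MultiTaskLasso(alpha=0.5).fit(X, Y)
support = np.any(mtl.coef_ != 0, axis=0)   # coef_ has shape (tasks, p)
print("selected features:", np.flatnonzero(support))
```

When the tasks do not in fact share their support, this coupling hurts rather than helps, which is exactly the failure mode motivating more refined models of task relationships.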

c) Model-based compressed sensing: Under the sparsity assumption, compressed sensing theory guarantees that a signal can be recovered from a certain number of measurements. Given additional structure in the unknown signal, beyond sparsity, the number of measurements needed to recover the signal can be dramatically reduced without sacrificing robustness.
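The plain-sparsity baseline can be sketched as basis-pursuit-style recovery with a small-alpha Lasso (the dimensions, sensing matrix, and signal below are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
p, m, k = 200, 60, 5            # ambient dimension, measurements, sparsity

# Random Gaussian sensing matrix and a k-sparse signal (illustrative)
A = rng.standard_normal((m, p)) / np.sqrt(m)
x = np.zeros(p)
x[rng.choice(p, size=k, replace=False)] = 1.0
y = A @ x                       # m << p noiseless measurements

# l1-based recovery from the underdetermined system y = A x
est = Lasso(alpha=1e-3, fit_intercept=False, max_iter=100000).fit(A, y)
err = np.linalg.norm(est.coef_ - x) / np.linalg.norm(x)
print(f"relative recovery error: {err:.3f}")
```

Model-based variants exploit extra structure in the support (e.g. block or tree patterns), which is what allows m to shrink further than plain sparsity permits.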

The aim of the workshop is to bring together theory and practice in modeling and exploring structure in high-dimensional data. We invite researchers working on methodology, theory and applications, from both frequentist and Bayesian points of view, to participate. We encourage genuine interaction between proponents of different approaches, in the hope of better understanding the possibilities for modeling structure in high-dimensional data.

Past workshops

NIPS 2010 Practical Applications of Sparse Modeling: Open Issues and New Directions [link]
2010 Sparse structures: statistical theory and practice [link]
2010 Sparsity and Modern Mathematical Methods for High Dimensional Data [link]
NIPS 2009 Manifolds, sparsity, and structured models: When can low-dimensional geometry really help? [link]
2009 Sparsity in Machine Learning and Statistics [link]
ICML 2008 Sparse Optimization and Variable Selection [link]
2008 Sparsity and Inverse Problems in Statistical Theory and Econometrics [link]
2008 Workshop on Sparsity in High Dimensional Statistics and Learning Theory [link]
NIPS 2006 Causality and Feature Selection [link]
NIPS 2003 Feature extraction and feature selection challenge [link]
NIPS 2001 Variable and Feature Selection [link]