Improvements in engineering and data-acquisition techniques have made high-dimensional data readily available. As a result, statistical analysis of high-dimensional data has become common in scientific fields ranging from biology, genomics and the health sciences to astronomy, economics and machine learning. Despite the high dimensionality and complexity of the data, many problems exhibit structure that makes efficient statistical inference possible. Examples of such structure include sparsity, sparse conditional-independence graphs, low-dimensional manifolds, low-rank factorizations, latent variables and semiparametric copulas.

In the last decade, sparsity-inducing regularization methods have proven very useful in high-dimensional models, both for selecting a small set of highly predictive variables and for uncovering the physical phenomena underlying many systems under scientific investigation. Sparsity is now a major tool for handling statistical problems in high dimensions. The machine learning and statistics communities have invested considerable effort in understanding the theoretical properties of l1-regularization procedures and in devising efficient algorithms for large-scale problems. As a result, we have a good understanding of the theory behind l1-regularization methods and are capable of fitting simple models, such as linear regression and Gaussian models, to large amounts of data.

Unfortunately, theoretical results based on these oversimplified models often do not reflect the difficulties encountered in real-life problems. For example, it is hard (and often impossible) to check whether the model assumptions hold for a given data set. Furthermore, practitioners often have substantial prior knowledge about the problem that should be incorporated into the model. Bayesian procedures, on the other hand, work well in practice and provide a flexible framework for incorporating prior knowledge.
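To make the sparsity mechanism discussed above concrete, the following is a minimal sketch of l1-regularized linear regression (the Lasso) fit by coordinate descent with soft-thresholding. This is a textbook illustration, not a method developed in this text; the function name, problem sizes and regularization level are chosen purely for demonstration.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iters=200):
    """Minimize (1/2n)||y - Xw||^2 + lam*||w||_1 by coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # per-coordinate curvature
    for _ in range(n_iters):
        for j in range(p):
            # partial residual with coordinate j removed
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r / n
            # soft-thresholding: small correlations are set exactly to zero
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

# Sparse ground truth: only 2 of 10 coefficients are nonzero.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
w_true = np.zeros(10)
w_true[0], w_true[3] = 3.0, -2.0
y = X @ w_true + 0.1 * rng.standard_normal(100)

w_hat = lasso_cd(X, y, lam=0.1)
```

The soft-thresholding step is what induces sparsity: coordinates whose correlation with the residual falls below the regularization level are set exactly to zero, so the fitted model selects a small subset of variables.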
However, little or nothing can be said mathematically about the generalization performance of these Bayesian procedures. Going beyond simple sparsity, there have been many extensions of the Lasso, such as the group Lasso, the fused Lasso, the multi-task Lasso and the elastic net. These extensions aim to incorporate additional structure into the model and to improve on the Lasso in cases where it fails. The structure may be specified in advance or hidden in the data. Learning and exploiting such structure is a crucial first step towards better exploring and understanding complex datasets. This raises two key questions:
