## Workshop Schedule

9.00 -  9.10am    Introducing remarks
9.10 -  9.50am    Invited talk: Bin Yu (UC Berkeley Statistics)

Spectral Clustering and the High-dimensional Stochastic Block Model

In recent years network analysis have become the focus of much research in many fields including biology, communication studies, economics, information science, organizational studies, and social psychology. Communities or clusters of highly connected actors form an essential feature in the structure of several empirical networks. Spectral clustering is a popular and computationally feasible method to discover these communities.

The Stochastic Block Model is a social network model with well defined communities. This talk will give conditions for spectral clustering to correctly estimate the community membership of nearly all nodes. These asymptotic results are the first clustering results that allow the number of clusters in the model to grow with the number of nodes, hence the name high-dimensional. Moreover, I will present on-going work on directed spectral clustering for networks whose edges are directed, including the enron data as an example.

9.50 - 10.30am    Invited talk: Volkan Cevher (Ecole Polytechnique Federale de Lausanne EE)

The CLASH Operator [slides]

The least absolute shrinkage and selection operator (Lasso) for linear regression exploits the geometric interplay of the ell2-data error objective and the ell1-norm constraint to arbitrarily select sparse models. Guiding this uninformed selection process with sparsity models has been precisely the center of attention over the last decade in order to improve learning performance. To this end, we alter the selection process of Lasso in this talk to explicitly leverage combinatorial sparsity models (CSMs) via the combinatorial selection and least absolute shrinkage operator (CLASH). A highlight is the introduction of a new algorithmic definition of CSMs, which we dub as the Polynomial time Modular epsilon-Approximation Property (PMAP_epsilon). PMAP_epsilon enables us to determine the impact of approximate combinatorial projections within CLASH. We then provide concrete guidelines how to leverage sets with PMAP_epsilon within CLASH, and characterize CLASH’s estimation guarantees as a function of epsilon as well as the set restricted isometry constants of the regression matrix. Finally, we present experimental results using both simulated and real world data to demonstrate the effectiveness of CLASH.

10.30 - 11.00am    Coffee break (and putting up posters)

11.00 - 11.30am    Y. Jin, B. D. Rao. Sparse Signal Recovery: A Multiple-User Information Theoretic Viewpoint
11.30 - 12.10pm    Invited talk: Venkat Chandrasekaran (MIT EECS)

The Convex Geometry of Linear Inverse Problems

Deducing the state or structure of a system from partial, noisy measurements is a fundamental task throughout the sciences and engineering. The resulting inverse problems are often ill-posed because there are fewer measurements available than the ambient dimension of the model to be estimated. In practice, however, many interesting signals or models contain few degrees of freedom relative to their ambient dimension: a small number of genes may constitute the signature of a disease, very few parameters may specify the correlation structure of a time series, or a sparse collection of geometric constraints may determine a molecular configuration. Discovering, leveraging, or recognizing such low-dimensional structure plays an important role in making inverse problems well-posed. Examples of structured models include previously studied cases such as sparse signals and low-rank matrices, as well as others such as low-rank tensors, binary vectors, orthogonal matrices, and matrices formed as the sum of a few permutations. Inspired by the success of the L1-norm and nuclear-norm heuristics, we propose a general convex relaxation framework to recover such simple structured models from partial information.  We provide sharp estimates of the number of generic measurements required for exact and robust recovery in a variety of settings.  These estimates are based on computing certain Gaussian statistics related to the underlying model geometry. Thus our work extends the catalog of objects and structures (beyond sparse vectors and low-rank matrices) that can be recovered from limited information via tractable convex programming.

Joint work with Benjamin Recht, Pablo Parrilo, and Alan Willsky.

12.10 -  1.30pm    Lunch Break   (and poster sesssion)

1.30 -  2.10pm    Invited talk: Julien Mairal (UC Berkeley Statistics)

Network Flow Algorithms for Structured Sparsity [slides]

We consider a class of learning problems that involve a structured sparsity-inducing norm defined as the sum of $\ell_\infty$-norms over groups of variables. Whereas a lot of effort has been put in developing fast optimization methods when the groups are disjoint or embedded in a specific hierarchical structure, we address here the case of general overlapping groups. To this end, we show that the corresponding optimization problem is related to network flow optimization. More precisely, the proximal problem associated with the norm we consider is dual to a quadratic min-cost flow problem. We propose an efficient procedure which computes its solution exactly in polynomial time. Our algorithm scales up to millions of variables, and opens up a whole new range of applications for structured sparse models.  We present several experiments on image and video data, demonstrating the applicability and scalability of our approach for various problems.

2.10 -  2.50pm    Invited talk: Pradeep Ravikumar (UT Austin CS)

Dirty statistical models: M-estimators and high-dimensional analysis

In fields across science and engineering, we are increasingly faced with high-dimensional problems where the number of parameters dwarf the number of observations. For any hope of statistically consistent estimation, it becomes vital to leverage any potential low-dimensional structure in the problem such as sparsity, group-sparsity, low-rank structure, or sparse graphical model structure. In many problems however, any single structure might not capture the data, whereas a superposition of structural classes might. For instance, in a latent-variable Gaussian graphical model, the precision matrix of the observed variables is neither sparse nor low-rank, but nonetheless could be expressed as the sum of a sparse and a low-rank matrix. We thus study a new class of M-estimators that learn a superposition of parameters each with their own structure. We then provide a unified framework for establishing consistency and convergence rates for such regularized M-estimators under high-dimensional scaling.

2.50 -  3.30pm    Poster session

3.30 -  4.00pm    Coffee break   (and poster session)

4.00 -  4.30pm    Z. Szabo, B. Poczos, A. Lorincz: Online Dictionary Learning with Group Structure
Inducing Norms. [slides]
4.30 -  5.10pm    Invited talk: Tong Zhang (Rutgers University Statistics)

Regularized Greedy Forest using Structured Sparsity [slides]

The existing approach to this problem is Friedman's gradient boosting procedure using a regression tree base learner.
We apply the concept of structured sparsity to improve boosted decision trees with general loss functions. Although this method has led to many successful industrial applications, it suffers from several theoretical and practical drawbacks. By employing the idea of structured greedy search, we are able to design a regularized greedy forest procedure to address these issues. The resulting method constructs tree ensembles more effectively than gradient boosting, and achieves better performance on most datasets we have tested on. This work suggests that we can view boosted decision trees broadly as procedures that construct high order nonlinear interactions through the concept of structured sparsity regularization. This general view point allows us to design nonlinear learning algorithms that are more effective than existing methods.

5.10 -  5.30pm    Discussion

## Contributed papers

• Benjamin Hamner, Ricardo Chavarriaga, Jose del R. Mill.an. Learning Dictionaries of Spatial and Temporal EEG Primitives for Brain-Computer Interfaces. [PDF]
• Subhojit Som, Philip Schniter. Approximate Message Passing for Recovery of Sparse Signals with Markov-Random-Field Support Structure. [PDF][Slides]
• Zhilin Zhang, Bhaskar D. Rao. Exploiting Correlation in Sparse Signal Recovery Problems: Multiple Measurement Vectors, Block Sparsity, and Time-Varying Sparsity. [PDF]
• Jin Xu, Guang Yang, Hong Man. Sparse Representation for Classiﬁcation with Structure Preserving Dimension Reduction. [PDF]
• Zoltan SzaboBarnabas PoczosAndras LorinczOnline Dictionary Learning with Group Structure Inducing Norms. [PDF] [Poster PDF]
• Mingyuan Zhou, Hongxia Yang, Guillermo Sapiro, David Dunson, Lawrence Carin. Landmark-Dependent Hierarchical Beta Process for Robust Sparse Factor Analysis. [PDF]
• Yuzhe Jin, Bhaskar D. Rao. Sparse Signal Recovery: A Multiple-User Information Theoretic Viewpoint. [PDF]
• Arto Klami, Seppo Virtanen, Samuel Kaski. Group sparsity on PCA gives multiset CCA. [PDF]
• Viet Cuong Nguyen, Nan Ye, Wee Sun Lee, Hai Leong Chieu. Semi-Markov Conditional Random Field with High-Order Features. [PDF]

Ċ
Jun 11, 2011, 10:01 PM
Ċ
Jul 15, 2011, 9:11 AM
Ċ
Jun 11, 2011, 10:01 PM
Ċ
Jun 11, 2011, 10:05 PM
Ċ
Jun 11, 2011, 10:04 PM
Ċ
Jun 11, 2011, 10:00 PM
Ċ
Jun 11, 2011, 10:00 PM
Ċ
icml11.pdf
(110k)
Jul 26, 2011, 11:22 AM
Ċ
Jul 26, 2011, 11:22 AM
Ċ
Jun 16, 2011, 2:09 PM
Ċ