UT Simons Seminar - Spring 2016

Friday, 12:30pm, UTA 7.532

Organizers: Francois Baccelli, Anastasios Kyrillidis, and Jae Oh Woo.

This is a weekly working group hosted by the Departments of Mathematics and Electrical Engineering at UT Austin. It is open to anybody interested in topics around the themes of networks, random graphs, information theory, and optimization, with applications in operations research, machine learning, and computer science.

To be added to the mailing list, please email: anastasios@utexas.edu or jaeoh.woo@utexas.edu.

Schedule

DATE       SPEAKER             TITLE (abstracts below)
April 1    Ahmad Beirami       The geometry of guesswork
April 8    -                   No talk
April 15   Eric Price          Constant factors: Worth studying for space and sample complexity
April 22   -                   No talk
April 29   Luis Rademacher     Provably efficient, high-dimensional feature extraction
May 6      -                   No talk
May 13     Or Ordentlich       Integer-Forcing: From Simple Scalar Communication Problems to Complicated Networks
May 20     Sushant Sachdeva    Regression on graphs -- Lipschitz and isotonic
May 27     Chinmay Hegde       Nonconvex (but fast) algorithms for linear inverse problems

Titles and Abstracts

The geometry of guesswork

Ahmad Beirami (MIT)

The quantification of computational security is motivated by brute-force attacks on guessing passwords and secret keys in homomorphic encryption. Guesswork is the number of queries of the form “is x the identity of the string” that are required to identify a randomly selected string. Guesswork is a measure of computational security that is intimately related to famous problems in information theory, such as the length of one-shot source coding, the computational complexity of sequential decoding, and the probability of error in list decoding. Building on seminal work that began with J. Massey (1994) and E. Arıkan (1996), this talk provides new insights into the geometry of guesswork, leading to tight bounds and accurate approximations to its distribution, its large deviations performance, and scaling of its cumulant generating function.

Based on joint work with Robert Calderbank (Duke), Mark Christiansen (National University of Ireland Maynooth), Ken Duffy (National University of Ireland Maynooth), Ali Makhdoumi (MIT), and Muriel Médard (MIT).
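
To make the definition above concrete, here is a small sketch (mine, not from the talk) of the expected guesswork under the optimal guessing strategy, which queries strings in decreasing order of probability; the toy i.i.d.-bit source and all parameter values are illustrative assumptions.

```python
from itertools import product

# Toy source: strings of n i.i.d. bits with P(bit = 1) = 0.7.  The optimal
# guesser queries strings in decreasing order of probability, so the
# guesswork of a string is its rank in that order.

def string_probs(n, p1=0.7):
    probs = {}
    for bits in product("01", repeat=n):
        s = "".join(bits)
        k = s.count("1")
        probs[s] = (p1 ** k) * ((1 - p1) ** (n - k))
    return probs

def expected_guesswork(probs):
    # Rank i+1 in the sorted order is the number of queries for that string.
    ordered = sorted(probs.values(), reverse=True)
    return sum((i + 1) * p for i, p in enumerate(ordered))

if __name__ == "__main__":
    for n in (4, 8, 12):
        probs = string_probs(n)
        print(f"n={n:2d}: E[guesswork] = {expected_guesswork(probs):9.2f} "
              f"(out of {2 ** n} strings)")
```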

Constant factors: Worth studying for space and sample complexity

Eric Price (University of Texas at Austin)

Computer scientists have a tendency to ignore constant factors in the asymptotic complexity of algorithms. This tendency developed in the study of running times, where it is justified, for example, by the fact that the relative cost of multiplication and addition changes over time; but these justifications fall apart for other measures such as space or sample complexity. In some areas, such as coding theory or approximation algorithms, a great deal of interesting research has focused on constant factors in an asymptotic measure. I will argue that research in sublinear algorithms should also consider constant factors, and I will demonstrate a few problems in streaming algorithms and property testing where searching for an algorithm with a better leading constant forces us to develop new techniques.

Provably efficient, high-dimensional feature extraction

Luis Rademacher (Ohio State University)

The goal of inference is to extract information from data. A basic building block in high dimensional inference is feature extraction, that is, to compute functionals of given data that represent it in a way that highlights some underlying structure. For example, Principal Component Analysis is an algorithm that finds a basis to represent data that highlights the property of data being close to a low-dimensional subspace. A fundamental challenge in high dimensional inference is the design of algorithms that are provably efficient and accurate as the dimension grows. In this context, I will describe two well-established feature extraction techniques: column subset selection (CSS) and independent component analysis (ICA). I will also present work by my coauthors and myself on CSS with optimal approximation guarantees, new applications of ICA and ICA for heavy-tailed distributions.
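
As a concrete illustration of the PCA example mentioned above, the following sketch projects synthetic data lying near a low-dimensional subspace onto its top principal components via the singular value decomposition; the dimensions and noise level are made-up assumptions, and the snippet is not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data close to a 3-dimensional subspace of R^50.
n, d, k = 1000, 50, 3
basis = rng.standard_normal((d, k))                          # hidden subspace
coeffs = rng.standard_normal((n, k))
X = coeffs @ basis.T + 0.05 * rng.standard_normal((n, d))    # small noise

# PCA: center the data and keep the top-k right singular vectors.
Xc = X - X.mean(axis=0)
_, S, Vt = np.linalg.svd(Xc, full_matrices=False)
features = Xc @ Vt[:k].T                                     # k-dimensional representation

explained = (S[:k] ** 2).sum() / (S ** 2).sum()
print(f"top-{k} components explain {explained:.1%} of the variance")
```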

Integer-Forcing: From Simple Scalar Communication Problems to Complicated Networks

Or Ordentlich (MIT)

Over the past decades, coding for the additive white Gaussian noise (AWGN) channel has reached an advanced state, and capacity-approaching coding schemes with low computational complexity are now known. It is thus desirable to leverage such codes for communication over the more complicated channel models that arise in modern wireless communication.
I will describe a new framework, termed "integer-forcing", that exploits the linear structure of practical codes to arrive at near-optimal, low-complexity coding schemes for a multitude of challenging communication models. Examples include the inter-symbol interference (ISI) channel, the multiple-input multiple-output (MIMO) channel, the multiple access channel and the K-user interference channel. The performance of integer-forcing schemes will be analyzed for the different models, and explicit bounds will be provided on their gap from the information-theoretic limits. It will further be demonstrated that, in certain cases, the integer-forcing framework can be used to prove new achievability results, beyond the reach of all currently known random coding proof techniques.
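
For readers unfamiliar with the idea, here is a rough sketch of an integer-forcing linear receiver for a small real-valued MIMO channel. It assumes the standard effective-noise expression from the integer-forcing literature, sigma_m^2 = a_m^T (I + SNR * H^T H)^{-1} a_m, and finds the integer matrix A by brute-force enumeration; both the enumeration and the parameter choices are my own simplifications, not the schemes presented in the talk.

```python
import itertools
import numpy as np

def integer_forcing_rate(H, snr, max_coef=2):
    """Achievable sum rate of an integer-forcing linear receiver for a square,
    real-valued MIMO channel H, found by brute-force search over integer
    matrices A with entries in {-max_coef, ..., max_coef}.

    Per-stream effective noise (standard expression in the integer-forcing
    literature): sigma_m^2 = a_m^T (I + snr * H^T H)^{-1} a_m, giving the
    rate nt * min_m 1/2 * log2(1 / sigma_m^2) when all sigma_m^2 < 1.
    """
    nt = H.shape[1]
    G = np.linalg.inv(np.eye(nt) + snr * H.T @ H)          # effective-noise Gram matrix
    coeffs = range(-max_coef, max_coef + 1)
    best = 0.0
    for rows in itertools.product(itertools.product(coeffs, repeat=nt), repeat=nt):
        A = np.array(rows, dtype=float)
        if abs(np.linalg.det(A)) < 0.5:                    # A must be invertible
            continue
        sigma2 = np.einsum("mi,ij,mj->m", A, G, A)         # effective noise per stream
        if sigma2.max() >= 1.0:                            # rate would be non-positive
            continue
        best = max(best, nt * 0.5 * np.log2(1.0 / sigma2.max()))
    return best

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    H = rng.standard_normal((2, 2))
    snr = 100.0                                            # 20 dB
    capacity = 0.5 * np.log2(np.linalg.det(np.eye(2) + snr * H.T @ H))
    print(f"integer-forcing rate ~ {integer_forcing_rate(H, snr):.2f} bits/use")
    print(f"joint decoding (capacity) = {capacity:.2f} bits/use")
```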

Regression on graphs -- Lipschitz and isotonic

Sushant Sachdeva (Yale University)

In this talk, we'll discuss how to use observations on some vertices of a graph to draw inferences on the remaining vertices. 
Given real-valued observations on some vertices, we seek a smooth extension of these observations to all vertices. We propose the absolutely minimal Lipschitz extension, which is the limit of p-Laplacian regularization for large p. 
We'll present algorithmic results for computing these extensions efficiently. Our algorithms naturally apply to directed graphs, and run on graphs with millions of edges in a few minutes. These extensions are particularly suited for regularization and outlier removal, which is surprising since outlier removal is NP-hard for other natural extensions. We'll present some experimental results for detecting spam webpages using our algorithms. 
Finally, we'll demonstrate that these extensions are intimately connected to a classic problem in Statistics, Isotonic Regression: Given a directed acyclic graph G, and observations on all vertices, the goal is to compute the "closest" set of observations that are non-decreasing along the edges. We give algorithms for Isotonic Regression that achieve the best asymptotic running times on sparse graphs, and also run quickly in practice. 

This is joint work with Rasmus Kyng, Anup Rao, and Daniel Spielman.
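
The isotonic-regression part of the talk is easiest to picture on the simplest directed acyclic graph, a path. The sketch below is the classical pool-adjacent-violators algorithm for least-squares isotonic regression on a path; it is a standard textbook illustration, not the graph algorithms developed in the talk.

```python
def isotonic_regression_path(y):
    """Pool Adjacent Violators: least-squares fit of a non-decreasing sequence
    to y, i.e. isotonic regression on a directed path graph."""
    # Each block stores (sum of values, count); adjacent blocks are merged
    # whenever their means would violate monotonicity.
    blocks = []
    for v in y:
        blocks.append([float(v), 1])
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)      # every vertex in a block gets the block mean
    return fit

if __name__ == "__main__":
    y = [1.0, 3.0, 2.0, 2.0, 5.0, 4.0]
    print(isotonic_regression_path(y))   # non-decreasing, closest in squared error
```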

Nonconvex (but fast) algorithms for linear inverse problems

Chinmay Hegde (Iowa State University)

Estimation of an unknown object from noisy linear observations is a basic problem in machine learning and statistics. Standard approaches first assume that the unknown object admits a structured representation, and then develop convex optimization algorithms for recovering the parameters of the representation. Analysis reveals that several such algorithms are statistically optimal. However, despite these advances in statistical understanding, the role of computation is less well understood, and even the best convex methods incur significant running times for certain problems.

In this talk, we discuss a series of new algorithms for solving certain types of linear inverse problems. The algorithms share several common themes, some of which are borrowed from classical discrete optimization and approximation algorithms. Moreover, in contrast with traditional approaches, these algorithms are all inherently non-convex. Despite this feature, they are provably fast: they all enjoy an (almost) linear-time computational complexity, thereby enabling their use for analyzing very large datasets. We outline the benefits of these algorithms in applications such as sparse feature selection, compressive sensing, and source separation.
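
To give a flavor of this class of methods, here is a sketch of iterative hard thresholding, one of the best-known non-convex, projection-based algorithms for sparse linear inverse problems; the dimensions and sensing matrix below are illustrative assumptions, and this is not necessarily one of the specific algorithms discussed in the talk.

```python
import numpy as np

def iterative_hard_thresholding(A, y, k, iters=300, step=None):
    """Recover a k-sparse x from y = A @ x via the non-convex iteration
        x <- H_k( x + step * A.T @ (y - A @ x) ),
    where H_k keeps the k largest-magnitude entries and zeroes the rest."""
    n = A.shape[1]
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2        # conservative step size
    x = np.zeros(n)
    for _ in range(iters):
        x = x + step * A.T @ (y - A @ x)              # gradient step on ||y - Ax||^2 / 2
        keep = np.argsort(np.abs(x))[-k:]             # projection onto k-sparse vectors
        mask = np.zeros(n, dtype=bool)
        mask[keep] = True
        x[~mask] = 0.0
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    m, n, k = 128, 256, 8
    A = rng.standard_normal((m, n)) / np.sqrt(m)      # random Gaussian sensing matrix
    x_true = np.zeros(n)
    x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    y = A @ x_true
    x_hat = iterative_hard_thresholding(A, y, k)
    print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```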