UT Simons Seminar  Spring 2016

DATE  SPEAKER  TITLE (abstracts below)
April 1  Ahmad Beirami  The geometry of guesswork
April 8  No talk
April 15  Eric Price  Constant factors: Worth studying for space and sample complexity
April 22  No talk
April 29  Luis Rademacher  Provably efficient, high-dimensional feature extraction
May 6  No talk
May 13  Or Ordentlich  Integer-Forcing: From Simple Scalar Communication Problems to Complicated Networks
May 20  Sushant Sachdeva  Regression on graphs: Lipschitz and isotonic
May 27  Chinmay Hegde  Nonconvex (but fast) algorithms for linear inverse problems
Titles and Abstracts
The geometry of guesswork
MIT
The quantification of computational security is motivated by brute-force attacks for guessing passwords and secret keys in homomorphic encryption. Guesswork is the number of queries of the form “is x the identity of the string?” that are required to identify a randomly selected string. Guesswork is a measure of computational security that is intimately related to famous problems in information theory, such as the length of one-shot source coding, the computational complexity of sequential decoding, and the probability of error in list decoding. Building on seminal work that began with J. Massey (1994) and E. Arıkan (1996), this talk provides new insights into the geometry of guesswork, leading to tight bounds and accurate approximations to its distribution, its large-deviations performance, and the scaling of its cumulant generating function.
Based on joint work with Robert Calderbank (Duke), Mark Christiansen (National University of Ireland Maynooth), Ken Duffy (National University of Ireland Maynooth), Ali Makhdoumi (MIT), and Muriel Médard (MIT).
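To make the central quantity concrete: the optimal guesser queries candidate strings in decreasing order of probability, an observation going back to Massey (1994). A minimal sketch of the resulting expected guesswork (the function name is mine, for illustration only):

```python
def expected_guesswork(pmf):
    """Expected number of guesses to identify a sample from `pmf`
    when candidates are queried in decreasing order of probability
    (the optimal strategy)."""
    probs = sorted(pmf.values(), reverse=True)
    return sum((i + 1) * p for i, p in enumerate(probs))

# Uniform over 4 strings: every ordering needs (1+2+3+4)/4 = 2.5 guesses.
uniform = {s: 0.25 for s in ("00", "01", "10", "11")}
print(expected_guesswork(uniform))  # 2.5

# A skewed source is easier to guess:
skewed = {"00": 0.7, "01": 0.1, "10": 0.1, "11": 0.1}
print(expected_guesswork(skewed))  # 0.7·1 + 0.1·2 + 0.1·3 + 0.1·4 ≈ 1.6
```

The gap between the two sources is exactly what the geometric analysis in the talk quantifies at scale.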
Constant factors: Worth studying for space and sample complexity
University of Texas at Austin
Computer scientists have a tendency to ignore constant factors in the
asymptotic complexity of algorithms. This habit developed in the study
of running times, where it is justified, for example, by the relative
cost of multiplication and addition changing over time; but these
justifications fall apart for other measures, such as space and
sample complexity. In some areas, such as coding theory and
approximation algorithms, much interesting research has focused on
constant factors in an asymptotic measure.
I will argue that research in sublinear algorithms should also
consider constant factors, and demonstrate a few problems in streaming
algorithms and property testing where searching for an algorithm with
a better leading constant forces us to develop new techniques.
Provably efficient, high-dimensional feature extraction
Ohio State University
The goal of inference is to extract information from data. A basic building block in high-dimensional inference is feature extraction, that is, to compute functionals of given data that represent it in a way that highlights some underlying structure. For example, Principal Component Analysis is an algorithm that finds a basis to represent data that highlights the property of data being close to a low-dimensional subspace. A fundamental challenge in high-dimensional inference is the design of algorithms that are provably efficient and accurate as the dimension grows. In this context, I will describe two well-established feature extraction techniques: column subset selection (CSS) and independent component analysis (ICA). I will also present work by my coauthors and myself on CSS with optimal approximation guarantees, new applications of ICA and ICA for heavy-tailed distributions.
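Since the abstract uses Principal Component Analysis as its running example, here is a minimal PCA-by-SVD sketch showing how the top directions of a centered data matrix recover a hidden low-dimensional subspace (my own illustration, not the speaker's code; the synthetic data and function name are assumptions):

```python
import numpy as np

def pca(X, k):
    """Top-k principal directions of data matrix X (rows = samples)."""
    Xc = X - X.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]                                  # rows of Vt are directions

rng = np.random.default_rng(0)
# Data near a 1-dimensional subspace of R^3, plus small noise.
t = rng.normal(size=(200, 1))
X = t @ np.array([[3.0, 1.0, 0.5]]) + 0.01 * rng.normal(size=(200, 3))
direction = pca(X, 1)[0]
print(direction)  # proportional to [3, 1, 0.5], up to sign
```

The point of the talk is precisely what happens to guarantees like this one as the dimension grows and the model (e.g. heavy tails for ICA) departs from the friendly Gaussian setting.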
Integer-Forcing: From Simple Scalar Communication Problems to Complicated Networks
MIT
In the past decades, coding for the additive white Gaussian noise (AWGN) channel has reached an advanced state, and capacity-approaching coding schemes with low computational complexity are now known. It is thus desirable to leverage such codes for communication over the more complicated channel models arising in modern wireless communication.
I will describe a new framework, termed "integer-forcing", that exploits the linear structure of practical codes to arrive at near-optimal, low-complexity coding schemes for a multitude of challenging communication models. Examples include the intersymbol interference (ISI) channel, the multiple-input multiple-output (MIMO) channel, the multiple access channel, and the K-user interference channel. The performance of integer-forcing schemes will be analyzed for the different models, and explicit bounds will be provided on their gap from the information-theoretic limits. It will further be demonstrated that in certain cases, the integer-forcing framework can be used to prove new achievability results, beyond the reach of all currently known random coding proof techniques.
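The core computational step in an integer-forcing receiver is choosing integer coefficient vectors that are well matched to the channel matrix H, so that the decoded integer combinations see little effective noise. A brute-force sketch for a small real-valued channel, written against the standard MMSE formulation from the integer-forcing literature (this is my illustration under that assumption, not the speaker's code):

```python
import itertools
import numpy as np

def best_integer_vector(H, snr, coeff_range=5):
    """Search small integer vectors a for the one minimizing the
    effective-noise quadratic form a^T (I + snr * H^T H)^{-1} a,
    which governs the rate of decoding the combination a^T x."""
    M = np.linalg.inv(np.eye(H.shape[1]) + snr * H.T @ H)
    best, best_val = None, np.inf
    rng = range(-coeff_range, coeff_range + 1)
    for a in itertools.product(rng, repeat=H.shape[1]):
        a = np.array(a)
        if not a.any():
            continue                      # skip the all-zero vector
        val = a @ M @ a
        if val < best_val:
            best, best_val = a, val
    return best, 0.5 * np.log2(1.0 / best_val)

H = np.array([[1.0, 0.9], [0.9, 1.0]])   # nearly rank-deficient channel
a, rate = best_integer_vector(H, snr=100.0)
print(a, rate)  # an integer vector aligned with the channel's structure
```

For this ill-conditioned H, the search picks a = ±(1, 1): decoding the *sum* of the streams is far cheaper in noise than separating them, which is the intuition behind exploiting the codes' linear structure.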
Regression on graphs: Lipschitz and isotonic
Yale University
In this talk, we'll discuss how to use observations on some vertices
of a graph to draw inferences on the remaining vertices.
Given realvalued observations on some vertices, we seek a smooth
extension of these observations to all vertices. We propose the
absolutely minimal Lipschitz extension, which is the limit of
p-Laplacian regularization for large p.
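On an unweighted graph, this extension has a simple fixed-point characterization: each unlabeled vertex takes the midrange of its neighbors' values. A slow but transparent relaxation sketch (my illustration; the talk's algorithms are far more efficient and handle directed graphs):

```python
def amle(neighbors, labels, iters=2000):
    """Relax each unlabeled vertex to the midrange (min+max)/2 of its
    neighbors' values; the fixed point is the absolutely minimal
    Lipschitz (infinity-harmonic) extension of the labels."""
    u = {v: labels.get(v, 0.0) for v in neighbors}
    for _ in range(iters):
        for v in neighbors:
            if v not in labels:
                vals = [u[w] for w in neighbors[v]]
                u[v] = 0.5 * (min(vals) + max(vals))
    return u

# A path graph 0-1-2-3-4 labeled only at its endpoints.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
u = amle(path, {0: 0.0, 4: 1.0})
print([round(u[v], 3) for v in range(5)])  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

On the path the result is just linear interpolation, but unlike the 2-Laplacian (harmonic) extension, the midrange rule never overshoots the Lipschitz constant of the labels, which is what makes it robust to outliers.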
We'll present algorithmic results for computing these extensions
efficiently. Our algorithms naturally apply to directed graphs, and
run on graphs with millions of edges in a few minutes. These
extensions are particularly suited for regularization and outlier
removal, which is surprising since outlier removal is NP-hard for
other natural extensions. We'll present some experimental results for
detecting spam webpages using our algorithms.
Finally, we'll demonstrate that these extensions are intimately
connected to a classic problem in Statistics, Isotonic Regression:
Given a directed acyclic graph G, and observations on all vertices,
the goal is to compute the "closest" set of observations that are
nondecreasing along the edges. We give algorithms for Isotonic
Regression that achieve the best asymptotic running times on sparse
graphs, and also run quickly in practice.
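In the special case where the DAG is a single path (a total order), L2 isotonic regression is solved in linear time by the classic pool-adjacent-violators algorithm; a sketch for intuition (the algorithms in the talk handle general sparse DAGs, which is the hard part):

```python
def isotonic_l2(y):
    """Pool Adjacent Violators: closest nondecreasing sequence to y
    in squared error, for the path (totally ordered) case."""
    blocks = []                     # each block is (sum, count)
    for v in y:
        s, n = v, 1
        # Merge backwards while the previous block's mean violates order.
        while blocks and blocks[-1][0] / blocks[-1][1] >= s / n:
            ps, pn = blocks.pop()
            s, n = s + ps, n + pn
        blocks.append((s, n))
    out = []
    for s, n in blocks:
        out.extend([s / n] * n)     # each block levels to its mean
    return out

print(isotonic_l2([1.0, 3.0, 2.0, 4.0]))  # [1.0, 2.5, 2.5, 4.0]
```

The violating pair (3, 2) is pooled to its mean 2.5, which is the L2-optimal way to restore monotonicity.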
This is joint work with Rasmus Kyng, Anup Rao, and Daniel Spielman.
Nonconvex (but fast) algorithms for linear inverse problems
Iowa State University
Estimation of an unknown object from noisy linear observations is a basic problem in machine learning and statistics. Standard approaches first assume that the unknown object obeys a structured representation, and then develop convex optimization algorithms for recovering the parameters of the representation. Analysis reveals that several such algorithms are statistically optimal. However, despite these advances in statistical understanding, the role of computation is less well understood and even the best convex methods for certain problems incur significant running times.
In this talk, we discuss a series of new algorithms for solving certain types of linear inverse problems. The algorithms share several common themes, some of which are borrowed from classical discrete optimization and approximation algorithms. Moreover, in contrast with traditional approaches, these algorithms are all inherently nonconvex. Despite this feature, they are provably fast: they all enjoy (almost) linear-time computational complexity, thereby enabling their use for analyzing very large datasets. We outline the benefits of these algorithms in applications such as sparse feature selection, compressive sensing, and source separation.
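A well-known representative of this nonconvex family is iterative hard thresholding for compressive sensing: a gradient step on the least-squares objective followed by projection onto k-sparse vectors. A minimal sketch, offered as an example of the algorithm class rather than the speakers' exact methods (the synthetic problem sizes are my choices):

```python
import numpy as np

def iht(A, y, k, iters=300):
    """Iterative hard thresholding for the sparse inverse problem
    y = A x: gradient descent on ||y - A x||^2, then keep only the
    k largest-magnitude entries at each step (a nonconvex projection)."""
    m, n = A.shape
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # conservative step size
    x = np.zeros(n)
    for _ in range(iters):
        x = x + step * A.T @ (y - A @ x)          # gradient step
        keep = np.argsort(np.abs(x))[-k:]         # indices of k largest
        mask = np.zeros(n, dtype=bool)
        mask[keep] = True
        x[~mask] = 0.0                            # hard threshold
    return x

rng = np.random.default_rng(1)
n, m, k = 100, 40, 3
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[5, 17, 60]] = [2.0, -1.5, 1.0]
x_hat = iht(A, A @ x_true, k)
print(np.linalg.norm(x_hat - x_true))  # small recovery error
```

Each iteration is a matrix-vector multiply plus a partial sort, which is what makes methods of this kind (almost) linear-time per step despite the nonconvex projection.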