UT Simons Seminar - Fall 2015

Friday, 12:30pm, UTA 7.532

Organizers: Francois Baccelli, Anastasios Kyrillidis, Virag Shah and, Jae Oh Woo.

This is a weekly working group at the Department of Mathematics and Electrical Engineering at UT Austin. It is opened to anybody who is interested in topics around the themes of networks, random graphs, information theory and optimization, with applications in operations research, machine learning and computer science.

To be added to the mailing list, please email: anastasios@utexas.edu.

Schedule

DATE SPEAKER TITLE (click for abstract below)
September  18 Jae Oh Woo Discrete analogues of entropy power inequalities based on rearrangements
September  25 Lei Ying Source localization in networks: Trees and beyond
October  2 Anastasios Kyrillidis Composite self-concordant minimization and extensions to path-following schemes
October  9 Kristy Gardner Queues with redundancy: Exact analysis
October  16 Texas Summit 2015  No talk
October  23 -     No talk
October  30 Sreeram Kannan
RNA-sequence assembly: From information theory to software
November  6 Javad Ghaderi Simple high-performance algorithms for scheduling flows in datacenter networks
November  13 Nikhil Rao Fast structured matrix factorization and time series applications
November  20 Stephen Becker Applications of randomized sketching to subsampling, robust regression and linear algebra
November  27 Black Friday No talk
December  4 Armin Eftekhari What do super-resolution and time-series analysis have in common?


Titles and Abstracts

Discrete analogues of entropy power inequalities based on rearrangements

Jae Oh Woo
University of Texas, Austin

In this lecture, we suggest discrete analogues of entropy power inequalities over Z/pZ and a discrete entropy power inequality for uniform distributions over Z. To do that, we establish majorizations based on rearrangement inequalities over Z or Z/pZ and a strong Sperner property of some posets over Z. Then those majorizations immediately imply entropy power inequalities. As a Corollary, we discuss that the optimal solution of Littlewood-Offord problem can be interpreted as a minimizer of entropy. We also remark that some of majorization results can be applied to a certain stochastic control problems. 

This is a joint work with Liyao Wang and Mokshay Madiman.

Source localization in networks: Trees and beyond

Lei Ying
Arizona State University

Information diffusion in networks can be used to model many real-world phenomena, including rumor spreading on online social networks, epidemics in human beings, and malware on the Internet. Informally speaking, the source localization problem is to identify a node in the network that provides the best explanation of the observed diffusion. Despite significant efforts and successes over last few years, theoretical guarantees of source localization algorithms were established only for tree networks due to the complexity of the problem. In this talk, I will present a new source localization algorithm, called the Short-Fat Tree (SFT) algorithm. Loosely speaking, the algorithm selects the node such that the breadth-first search (BFS) tree from the node has the minimum depth but the maximum number of leaf nodes. I will then discuss the performance of SFT under the independent cascade (IC) model for both tree networks and the Erdos-Renyi (ER) random graph. On tree networks, SFT is the maximum a posterior (MAP) estimator. On the ER random graph, we established the following fundamental possibility and impossibility results: (i) when the infection duration <2/3t_u, SFT identifies the source with probability 1 asymptotically; (ii) when the infection duration >t_u, the probability of identifying the source approaches zero asymptotically under any algorithm; and (iii) when infection duration


Composite self-concordant minimization and extensions to path-following schemes

Anastasios Kyrillidis
University of Texas, Austin

We propose a variable metric framework for minimizing the sum of a self-concordant function and a possibly non-smooth convex function endowed with a computable proximal operator. We theoretically establish the convergence of our framework without relying on the usual Lipschitz gradient assumption on the smooth part. An important highlight of our work is a new set of analytic step-size selection and correction procedures based on the structure of the problem. We describe concrete algorithmic instances of our framework for several interesting large-scale applications and demonstrate them numerically on both synthetic and real data. (If there is time) We also discuss how such technique can be used in path-following optimization for constrained problems.  

This is a joint work with Quoc Tran Dinh and Volkan Cevher.

Queues with redundancy: Exact analysis

Kristy Gardner
Carnegie Mellon University

Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to run a single request on multiple servers and only wait for the first completion (discarding all remaining instances of the request). However no exact analysis of systems with redundancy exists. We present the first exact analysis of systems with redundancy. We allow for any number of classes of redundant requests, any number of classes of non-redundant requests, any degree of redundancy, and any number of heterogeneous servers. In all cases we derive the limiting distribution on the state of the system.

In small (two or three server) systems, we derive simple forms for the distribution of response time of both the redundant classes and non-redundant classes, and we quantify the “gain” to redundant classes and “pain” to non-redundant classes caused by redundancy. We find some surprising results. We also compare redundancy with other approaches for reducing latency, such as optimal probabilistic splitting of a class among servers (Opt-Split) and Join-the-Shortest-Queue (JSQ) routing of a class. We find that redundancy outperforms JSQ and Opt-Split with respect to overall response time, making it an attractive solution. We also investigate performance in scaled systems, and find that many of our counterintuitive results continue to hold as the number of servers increases.

Joint work with Samuel Zbarsky, Sherwin Doroudi, Mor Harchol-Balter, Esa Hyytia, and Alan Scheller-Wolf.

RNA-sequence assembly: From information theory to software

Sreeram Kannan
University of Washington, Seattle

Extraordinary advances in sequencing technology in the past decade have revolutionized biology and medicine. In this talk, we will discuss an information theoretic approach to a combinatorial problem that arises from RNA sequencing. RNA sequence assembly from short reads is still a largely unsolved problem, due to the complexity induced by alternative splicing, a mechanism by which the same gene generates multiple distinct RNA transcripts. We present the first RNA sequence assembly algorithm that has a provable guarantee of near-optimality, i.e., it recovers the sequence correctly whenever there is enough information in the reads to do so. The heart of the algorithm is a new iterative approach to solve the sparsest flow decomposition problem in directed acyclic graphs. We will demonstrate the practical relevance of this approach by showcasing the superior performance of an upcoming open-source software package built on these principles. We will also discuss an interesting interplay between informational complexity and computational complexity observed in this context.

Simple high-performance algorithms for scheduling flows in datacenter networks

Javad Ghaderi
Columbia University

We study the problem of load balancing in datacenter networks, namely, assigning the end-to-end data flows among the available paths in order to efficiently balance the load in the network. The solutions used today typically rely on ECMP, which essentially attempts to balance the load by hashing the flows to the available shortest paths, and it is known to perform poorly when there is asymmetry in either the network topology or the traffic. In this talk, we consider a general network topology where each link has a cost which is a convex function of the link utilization. We propose a low-complexity congestion-aware algorithm that assigns the flows to the available paths in an online fashion and without splitting, and prove that it asymptotically minimizes the network cost when the number of flows in the network is large. Extensive simulation results are presented that show that our algorithm not only outperforms ECMP significantly, but is also close to optimal cost, under a wide range of traffic conditions and under different datacenter architectures.

This is a joint work with Mehrnoosh Shafiee (PhD student).

Fast structured matrix factorization and time series applications

Nikhil Rao
University of Texas, Austin

Matrix Factorization (MF) methods have been applied to a variety of applications, such as recommender systems and multilabel learning. In several applications of interest, there is additional information available in the form of pairwise relationships between variables, such as a social network or product co-purchasing information. We develop a scalable, conjugate gradient based method for Graph Structured matrix factorization problems, and show that such methods can be cast as a weighted atomic norm regularized program. We then turn to applications in high dimensional time series forecasting, and show that such problems can be cast in the aforementioned framework.

This is joint (and ongoing) work with Hsiang-Fu Yu and Inderjit Dhillon

Applications of randomized sketching to subsampling, robust regression and linear algebra

Stephen Becker
University of Colorado at Boulder

The fast Johnson-Lindenstrauss transform has triggered a large amount of research into fast randomized transforms that reduce data dimensionality while approximately preserving geometry. We discuss uses of these fast transforms in three situations. In the first, we use the transform to precondition a data matrix before subsampling, and show how for huge data sets, this leads to great acceleration in algorithms such as PCA and K-means clustering. The second situation reconsiders the common problem of sketching for regression. We argue that it is more natural to switch to a robust model, and we further introduce a "partially sketched" variant. Finally, we present recent empirical work on using sketching to reduce the need for pivoting in the QR and LU decompositions.

This work is joint with Farhad Pourkamali-Anaraki, Marek Petrik , Ban KawasKarthikeyan Ramamurthy, and Laura Grigori

What do super-resolution and time-series analysis have in common?

Armin Eftekhari
University of Texas, Austin

I will discuss two topics in model-based signal processing. The first part of the talk is focused on a fast two-phase algorithm for super-resolution with strong theoretical guarantees. Given the low-frequency part of the spectrum of a sequence of impulses, Phase I consists of a greedy algorithm that roughly estimates the impulse positions. These estimates are then refined with a local optimization in Phase II. The backbone of our approach is the fundamental work of Slepian et al. involving prolate functions and their unique properties which I will briefly cover.
In the context of nonlinear time-series analysis, delay-coordinate mapping is arguably the main tool of the trade, the efficacy of which has been long established by the celebrated Takens' embedding theorem, in the absence of noise. In the second part of this talk, we study the delay-coordinate mapping by extending Takens' result to account for measurement noise. In doing so, we answer a longstanding question in nonlinear time-series analysis.

This is joint work with Mike Wakin, Zhihui Zhu, and Paul Martin at Colorado School of Mines, as well as Han Lun Yap and Chris Rozell at Georgia Tech.