Archives

Mladen Kolar

Professor

           USC

Adaptive Stochastic Optimization with Constraints

Video:

Abstract: Constrained stochastic optimization problems appear widely in numerous applications in statistics, machine learning, and engineering, including constrained maximum likelihood estimation, constrained deep neural networks, physics-informed machine learning, and optimal control. I will discuss our recent work on solving nonlinear optimization problems with stochastic objective and deterministic constraints. I will describe the development of adaptive algorithms based on sequential quadratic programming and their properties. The talk is based on joint work with Yuchen Fang, Ilgee Hong, Sen Na, Michael Mahoney, and Mihai Anitescu.
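
As a minimal sketch of the kind of subproblem involved (generic SQP notation, not the talk's exact formulation): with a stochastic estimate \bar{g}_k of the objective gradient at the iterate x_k, a Hessian approximation B_k, and deterministic constraints c(x) = 0 with Jacobian J_k, each iteration solves the quadratic program

    \min_{d} \; \bar{g}_k^{\top} d + \tfrac{1}{2} d^{\top} B_k d \quad \text{subject to} \quad c(x_k) + J_k d = 0,

and then sets x_{k+1} = x_k + \alpha_k d_k. The adaptive algorithms discussed in the talk concern, among other things, how the stepsize \alpha_k and the accuracy of the stochastic estimates are controlled.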

Bio:  Mladen Kolar is a professor in the Department of Data Sciences and Operations at the USC Marshall School of Business. Mladen earned his PhD in Machine Learning from Carnegie Mellon University in 2013. His research focuses on high-dimensional statistical methods, probabilistic graphical models, and scalable optimization methods, driven by the need to uncover interesting and scientifically meaningful structures from observational data. Mladen was selected as a recipient of the 2024 Junior Leo Breiman Award for his outstanding contributions to these areas. He currently serves as an associate editor for the Journal of Machine Learning Research, the Annals of Statistics, the Journal of Computational and Graphical Statistics, and the New England Journal of Statistics in Data Science.

Sébastien Motsch

Associate Professor

           ASU

Natural projected flow: A PDE solver using neural networks

Video:

Abstract: Solving PDEs with neural networks has attracted a lot of attention in recent years, especially with the introduction of Physics-Informed Neural Networks (PINNs). These methods typically utilize neural networks as approximate solutions and adjust their parameters to satisfy the PDE (approximately). Our method, called Natural Projected Flow, deviates from this approach by utilizing a semi-discrete formulation. This involves seeking a solution where the parameters of the neural network (representing the spatial variable) evolve over time. The crucial challenge lies in identifying the corresponding evolution equation for these parameters. Natural Projected Flow addresses this challenge by employing an L^2 projection of the flow of the PDE onto the manifold of neural networks. The effectiveness of our proposed numerical solver is demonstrated through applications to various classical PDEs, including diffusion and porous-media equations.
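
As a rough illustration of the semi-discrete idea (the notation here is generic, not taken from the talk): writing the approximate solution of a PDE \partial_t u = \mathcal{F}(u) as u_{\theta(t)}(x), the parameter evolution is obtained by an L^2 projection of the PDE flow onto the tangent space of the neural-network manifold,

    \dot{\theta}(t) \in \arg\min_{\eta} \left\| \textstyle\sum_i \partial_{\theta_i} u_{\theta(t)} \, \eta_i \;-\; \mathcal{F}\big(u_{\theta(t)}\big) \right\|_{L^2}^2,

so that time stepping happens in the parameter space while the spatial dependence is carried entirely by the network.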

Bio: Sébastien Motsch is an associate professor at Arizona State University. He received his PhD from the Institut de Mathematiques de Toulouse in 2009. Prof. Motsch’s research interests focus on the mathematical modeling of biological systems, especially those which exhibit self-organization such as bacterial colonies or flocks of birds. His work aims at connecting two levels of description for these systems: a microscopic viewpoint (describing each individual) and a macroscopic description (using partial differential equations). One of the many questions raised by such systems is to understand how local interactions among individuals lead to the formation of large structures. The derivation and analysis of macroscopic models give new insights into these phenomena. Dr. Motsch’s research can be divided into three themes: 1) derivation of macroscopic models from microscopic dynamics, 2) numerical and analytical study of the derived macroscopic models, and 3) modeling of complex systems based on experimental data.

Lalitha Vadlamani

Assistant Professor

           IIIT Hyderabad

Codes for Distributed Storage and Distributed Gradient Descent

Video:  Link 

Abstract: In a distributed storage system, due to the increase in storage capacity of a node, efficient repair of failed nodes is becoming increasingly important in addition to ensuring a given level of reliability and low storage overhead. Codes with locality are a class of codes designed for storage systems which have the characteristic that they trade off repair locality (number of nodes accessed to repair a failed node) for storage overhead. Maximally recoverable codes are a class of codes which correct the maximum possible number of erasure patterns given the locality constraints of the code, and are hence of interest. We will introduce three classes of maximally recoverable codes (MRC) based on the topology of the local parities: (i) MRC with locality, (ii) MRC with hierarchical locality, and (iii) MRC with product topologies. We will present various constructions and results dealing with these MRCs.

In a distributed gradient descent problem, a gradient computation job is divided into multiple parallel tasks, which are computed on different servers, and the job is finished when all the tasks are complete. In this framework, a subset of straggling servers forms a bottleneck to the efficient execution of the gradient descent. Gradient coding ensures efficient distributed gradient computation even in the presence of stragglers by utilizing coding-theoretic techniques. We will introduce two variants of gradient coding and present results in these settings: (i) delayed start of tasks corresponding to a subset of servers is allowed, and (ii) a form of approximate gradient coding where only the sum of a fraction of the gradients needs to be recovered.
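
To make the gradient coding idea concrete, here is a toy numerical check of the classic 3-worker, 1-straggler scheme of Tandon et al. (2017); it is included only as background and is not the delayed-start or approximate variants discussed in the talk.

    import numpy as np

    # Toy illustration of gradient coding: 3 workers, tolerant to 1 straggler.
    rng = np.random.default_rng(0)
    g1, g2, g3 = rng.standard_normal((3, 4))   # partial gradients
    full = g1 + g2 + g3                        # quantity the master needs

    # Each worker stores two partial gradients and sends one coded combination.
    f1 = 0.5 * g1 + g2      # worker 1 (holds g1, g2)
    f2 = g2 - g3            # worker 2 (holds g2, g3)
    f3 = 0.5 * g1 + g3      # worker 3 (holds g1, g3)

    # Any two of the three messages suffice to recover the full gradient.
    recov_12 = 2 * f1 - f2
    recov_23 = f2 + 2 * f3
    recov_13 = f1 + f3
    for r in (recov_12, recov_23, recov_13):
        assert np.allclose(r, full)
    print("full gradient recovered from any 2 of 3 workers")

Each worker computes two of the three partial gradients and transmits a single coded combination, and the master recovers the full gradient from any two workers, so one straggler never delays the job.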


Bio: Lalitha Vadlamani received her B.E. degree in Electronics and Communication Engineering from Osmania University, Hyderabad, in 2003 and her M.E. and Ph.D. degrees from the Indian Institute of Science (IISc), Bangalore, in 2005 and 2015, respectively. Since May 2015, she has been working as an Assistant Professor at IIIT Hyderabad, where she is affiliated with the Signal Processing and Communications Research Center. Her research interests include coding for distributed storage and computing, index coding, polar codes, learning-based codes and coded blockchains. She is a recipient of the Prof. I.S.N. Murthy Medal from IISc in 2005 and the TCS Research Scholarship for the year 2011. She is currently visiting the Simons Institute for the Theory of Computing, Berkeley.

Mayank Bakshi

Research Scientist

           ASU

Beyond Redundancy: Adversarial Mitigation in Distributed Systems through Authentication and Adversary Detection

Video: Link 

Abstract: The distributed nature of many modern applications, such as the Internet of Things (IoT) and Distributed Machine Learning, makes them inherently vulnerable to infiltration and disruption by unidentified adversarial agents. These agents may deviate arbitrarily from the protocol, disrupting the final outcome and potentially leading to privacy leakage from legitimate agents. Given the distributed nature of these problems, countering such deviations and minimizing the adversary's effects is challenging and costly — after all, any form of 'error correction' necessitates redundancies in the system. In this talk, we argue that such redundancy is not needed when adversaries are only sporadically present in the system. Instead, we advocate for an authentication-based approach to mitigate such adversarial risks in distributed systems. The philosophy here is to maintain efficiency in adversary-free scenarios while still safeguarding against malicious activities by 'validating' the outcome to ensure minimal adversarial influence. We show that for two different classes of problems—decentralized learning and multiple access communications—the authentication-based approach performs essentially as efficiently as a non-authenticated approach, with the added advantage that the presence of adversaries can be detected. In contrast to error correction-based approaches, which require significant overhead in terms of communication and cost, our approach validates system outcomes through suitable 'checks' at the end of the protocol to detect adversarial presence. Lastly, we will also explore interesting connections to adversarial hypothesis testing and active learning problems.

 

Bio: Dr. Mayank Bakshi received his B.Tech. and M.Tech. degrees from the Indian Institute of Technology, Kanpur, in 2003 and 2005, respectively, and his Ph.D. degree from the California Institute of Technology in 2011. He then served as a postdoctoral scholar and a research assistant professor at the Chinese University of Hong Kong from 2012 to 2019, and as a principal researcher at Theory Lab, Huawei Hong Kong from 2019 to 2021. Currently, he is a research scientist at Arizona State University. His research interests include physical layer security, adversarially robust communications and learning, and sparse recovery.


Yao Xie

Professor

           GA Tech

Generative Models for Statistical Inference

Video: Link 

Abstract: We consider the problem of learning a continuous probability density function from data, a fundamental problem in statistics known as density estimation. It also arises in distributionally robust optimization (DRO), where the goal is to find the worst-case distribution to represent scenario departure from observations. Such a problem is known to be hard in high dimensions and incurs a significant computational challenge. In this talk, I will present a machine learning approach to tackle these challenges, leveraging recent advances in neural-networks-based generative models, which have become popular recently due to their competitive performance in high-dimensional data. We develop a neural ODE flow network called JKO-iFlow, inspired by the Jordan-Kinderlehrer-Otto (JKO) scheme, which unfolds the discrete-time dynamic of the Wasserstein gradient flow. Our method can greatly reduce computational costs while achieving competitive performance relative to existing generative models. The connection of our JKO-iFlow method with proximal gradient descent in the Wasserstein space enables us to prove a density learning guarantee with an exponential convergence rate. Besides density estimation, we also demonstrate that the JKO-iFlow generative model can be used in various applications, including adversarial learning, robust hypothesis testing, and data-driven differential privacy.
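
For context, a minimal statement of the JKO scheme referred to above (standard form, not the talk's exact formulation): for a target density \pi, step size h > 0, and functional F(\rho) = \mathrm{KL}(\rho \,\|\, \pi), the scheme iterates

    \rho_{k+1} = \arg\min_{\rho} \; F(\rho) + \frac{1}{2h} W_2^2(\rho, \rho_k),

where W_2 is the Wasserstein-2 distance; as h \to 0 the iterates follow the Wasserstein gradient flow of F, and, roughly speaking, JKO-iFlow implements each such step as one block of a neural ODE flow.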

 

Bio: Yao Xie is the Coca-Cola Foundation Chair, Professor at Georgia Institute of Technology in the H. Milton Stewart School of Industrial and Systems Engineering, and Associate Director of the Machine Learning Center. From September 2017 until May 2023, she was the Harold R. and Mary Anne Nash Early Career Professor. She received her Ph.D. in Electrical Engineering (minor in Mathematics) from Stanford University in 2012 and was a Research Scientist at Duke University. Her research lies at the intersection of statistics, machine learning, and optimization in providing theoretical guarantees and developing computationally efficient and statistically powerful methods for problems motivated by real-world applications. She received the National Science Foundation (NSF) CAREER Award in 2017, the INFORMS Wagner Prize Finalist in 2021, and the INFORMS Gaver Early Career Award for Excellence in Operations Research in 2022. She is currently an Associate Editor for IEEE Transactions on Information Theory, Journal of the American Statistical Association-Theory and Methods, Operations Research, Sequential Analysis: Design Methods and Applications, INFORMS Journal on Data Science, and an Area Chair of NeurIPS, ICML, and ICLR.


Massimo Franceschetti

Professor

           UCSD

Electromagnetic Information Theory

Video: 

Abstract: Theoretical analysis of the performance limits of next-generation communication systems requires a deeper understanding of the effect of the propagation channel in the computation of relevant information-theoretic bounds. Most of the literature, however, abstracts out the physics, treating these problems as purely mathematical or engineering ones. Although abstractions are certainly necessary in the design of systems, much can be lost in understanding the fundamental limits of emerging technologies such as holographic MIMO, super-resolution, high-frequency, and quantum communications. In this talk, we illustrate how fundamental limits can be studied by merging classic results in functional analysis and electromagnetics. Specifically, we will consider the degrees of freedom, entropy, and capacity of radiated signals. We will recall classic results in communication theory and signal analysis, draw connections with electromagnetics, and discuss some recent advancements.

 

Bio: Massimo Franceschetti received the Laurea degree (with honors) in computer engineering from the University of Naples, Naples, Italy, in 1997, and the M.S. and Ph.D. degrees in electrical engineering from the California Institute of Technology, Pasadena, CA, in 1999 and 2003, respectively. He is Professor of Electrical and Computer Engineering at the University of California at San Diego (UCSD). Before joining UCSD, he was a postdoctoral scholar at the University of California at Berkeley for two years. He is coauthor of the book “Random Networks for Communication” and author of the book “Wave Theory of Information,” both published by Cambridge University Press. He was awarded the C. H. Wilts Prize in 2003 for best doctoral thesis in electrical engineering at Caltech; the S.A. Schelkunoff Award in 2005 for best paper in the IEEE Transactions on Antennas and Propagation; a National Science Foundation (NSF) CAREER award in 2006; an Office of Naval Research (ONR) Young Investigator Award in 2007; the IEEE Communications Society Best Tutorial Paper Award in 2010; and the IEEE Control Systems Society Ruberti Young Researcher Award in 2012. He became an IEEE Fellow in 2018 and a Guggenheim Fellow for Natural Sciences, Engineering, in 2019.


Ravi Tandon

 Associate Professor

          University of Arizona

Amicable Perturbations

Video:  Link 

Abstract: Machine learning based classifiers have achieved incredible success in a variety of sectors such as college admissions, hiring, banking and other domains. However, their ability to make classifications has not been fully exploited to understand how to improve undesirable classifications. In this talk, I will present a new framework for finding the most efficient changes that could be made in the real world to achieve a more favorable classification, and term these changes Amicable Perturbations.  We present a principled methodology for creating amicable perturbations and demonstrate their effectiveness on data sets from a variety of fields. Amicable perturbations differ from counterfactuals in that they are better suited to balance the effort-reward trade-off and lead to the most efficient plan of action. Unlike adversarial examples, which fool a classifier into making false predictions, amicable perturbations are intended to affect the true class of the data.  To this end, we develop a novel method for verifying that amicable perturbations change the true class probabilities. We also compare our results to those achieved by previous approaches such as counterfactuals and adversarial attacks.

 

This talk is based on joint work with Jesse Friedbaum (Ph.D. student at the University of Arizona).  

Bio: Ravi Tandon is the Litton Industries John M. Leonis Distinguished Associate Professor in the Department of ECE at the University of Arizona. He received the B.Tech. degree in Electrical Engineering from the Indian Institute of Technology, Kanpur (IIT Kanpur) in 2004 and the Ph.D. degree in Electrical and Computer Engineering from the University of Maryland, College Park (UMCP) in 2010. From 2010 to 2012, he was a post-doctoral research associate at Princeton University, and was a research assistant professor at Virginia Tech, with positions in the ECE department, Hume Center and the Department of Computer Science. He is a recipient of the 2018 Keysight Early Career Professor Award, NSF CAREER Award in 2017, and a Best Paper Award at IEEE GLOBECOM 2011. He is a Senior Member of IEEE and currently serves as an Editor for IEEE Transactions on Information Theory and the IEEE Transactions on Communications. His current research interests include information theory and its applications to wireless networks, signal processing, communications, security and privacy, machine learning and data mining.

Genevera Allen

 Associate Professor

               Rice University

Graph Learning for Functional Neuronal Connectivity

Video: Link 

Abstract: Understanding how large populations of neurons communicate in the brain at rest, in response to stimuli, or to produce behavior as well as how brain function relates to structure are fundamental open questions in neuroscience. Many approach this by estimating the intrinsic functional neuronal connectivity using probabilistic graphical models. But there remain major statistical and computational hurdles to estimating graphical models from new large-scale calcium imaging technologies and from huge projects which image up to one hundred thousand neurons across multiple sessions in the active mouse brain.  In this talk, I will highlight a number of new graph learning strategies my group has developed to address many critical unsolved challenges arising with large-scale neuroscience data. Specifically, we will focus on Graph Quilting, in which we derive a method and theoretical guarantees for graph learning from non-simultaneously recorded neurons. We will also highlight theory and methods for graph learning with latent variables induced by unrecorded neurons via thresholding, graph learning for spikey neuronal activity data via Subbotin graphical models, and computational approaches for graph learning from enormous numbers of neurons via minipatch learning. Finally, we will demonstrate the utility of all approaches on synthetic data as well as real calcium imaging data for the task of estimating functional neuronal connectivity.


Bio: Genevera Allen is an Associate Professor of Electrical and Computer Engineering, Statistics, and Computer Science at Rice University and an investigator at the Jan and Dan Duncan Neurological Research Institute at Texas Children’s Hospital and Baylor College of Medicine. She is also the Founding Director of the Rice Center for Transforming Data to Knowledge, informally called the Rice D2K Lab. Dr. Allen’s research develops new statistical machine learning tools to help people make reliable data-driven discoveries. She is known for her methods and theory work in the areas of unsupervised learning, interpretable machine learning, data integration, graphical models, and high-dimensional statistics.  Her work is motivated by solving real scientific problems, especially in the areas of neuroscience and bioinformatics. Dr. Allen is also a leader in data science education.  In 2018, she founded the Rice D2K Lab, a campus hub for experiential learning and data science education. Through her leadership of the D2K Lab, Dr. Allen developed new interdisciplinary data science degree programs, established a novel capstone program in data science and machine learning, and led Rice’s engagement with corporate and community partners in data science. Dr. Allen is the recipient of several honors for both her research and educational efforts including a National Science Foundation Career Award, Rice University’s Duncan Achievement Award for Outstanding Faculty, the Curriculum Innovation Award, and the School of Engineering’s Research and Teaching Excellence Award.  In 2014, she was named to the “Forbes ’30 under 30′: Science and Healthcare” list.  She is also an elected member of the International Statistics Institute and an elected fellow of the American Statistical Association.  Dr. Allen currently serves as an Action Editor for the Journal of Machine Learning Research, an Associate Editor for the Journal of the American Statistical Association: Theory and Methods, and a Series Editor for Springer Texts in Statistics.  Dr. Allen received her Ph.D. in statistics from Stanford University, under the mentorship of Prof. Robert Tibshirani, and her bachelors, also in statistics, from Rice University. 

Malena Español

 Assistant Professor

       Arizona State University

Computational Methods for Solving Inverse Problems in Imaging

Video: Link 

Abstract: Discrete linear and nonlinear inverse problems arise from many different imaging systems. These problems are ill-posed, which means, in most cases, that the solution is very sensitive to the data. Because the data usually contain errors produced by different imaging system parts (e.g., cameras, sensors, etc.), robust and reliable regularization methods need to be developed for computing meaningful solutions. In some imaging systems, massive amounts of data are produced, making the data storage and computational cost of the inversion process intractable. In this talk, we will look at different imaging systems, formulate the corresponding mathematical models, introduce regularization methods, and show some numerical results.
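
As one standard instance of the regularization being described (an illustrative example, not necessarily the specific methods in the talk): a discrete linear inverse problem b = Ax + noise is commonly stabilized by Tikhonov regularization,

    \min_{x} \; \|Ax - b\|_2^2 + \lambda^2 \|Lx\|_2^2,

where L is, for example, the identity or a discrete derivative operator, and the regularization parameter \lambda trades data fit against sensitivity to noise; much of the practical difficulty lies in choosing \lambda and in making the solver scale to massive data.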


Bio: Malena Español is an Assistant Professor in the School of Mathematical and Statistical Sciences at Arizona State University. She has a Bachelor’s in Applied Mathematics from the University of Buenos Aires and a Master’s and PhD in Mathematics from Tufts University. Dr. Español was a Postdoctoral Fellow at the California Institute of Technology before starting a faculty position at The University of Akron, where she received tenure and was promoted to associate professor in 2018. Her research interests are in the development, analysis, and application of mathematical models and numerical methods for solving problems arising in science and engineering, with a focus on problems related to materials science, image processing, and medical applications. In 2018, she co-organized the Women in Math of Materials (WIMM) workshop and helped to create a research community for WIMM. Dr. Español was one of the editors of the Springer AWM Series volume “Research in the Mathematics of Materials Science,” which highlights the research work of women in this area and was published last year. Dr. Español is the lead organizer of AMIGAS, a new summer program for graduate students in applied and computational mathematics. Dr. Español was awarded the 2022 Karen EDGE Fellowship.

Duy H. N. Nguyen

 Associate Professor

              SDSU

Large-Scale Approximate Inference: Theory, Algorithms, and Applications

Video: 

Abstract:   A central task in statistical inference is the evaluation of the posterior distribution p(x|y) of the latent variables x given the observed data variables y. For many models of practical interest, it will be infeasible to evaluate the posterior distribution, because the dimensionality of the latent space is too high to work with directly or because the posterior distribution has a highly complex form. In these situations, we need to resort to approximate schemes. In this talk, I will review a range of deterministic approximation schemes, including approximate message passing (AMP) and variational Bayes (VB) inference, which scale well to large applications. These are based on analytical approximations to the posterior distribution, for example by assuming that it factorizes in a particular way. I will present the theory behind AMP and VB and some related computationally efficient algorithms in these schemes. Finally, I will present the applications of approximate inference in (sparse) linear regression, logistic regression, probit regression, compressed sensing, and MIMO signal estimation.
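
As a pointer to what "assuming the posterior factorizes" means in variational Bayes (standard mean-field notation, not specific to the talk): the posterior p(x|y) is approximated by q(x) = \prod_i q_i(x_i), and coordinate ascent updates each factor via

    \log q_i^{*}(x_i) = \mathbb{E}_{q_{-i}}\big[ \log p(x, y) \big] + \text{const},

where the expectation is taken over all factors except q_i; iterating these updates monotonically increases the evidence lower bound.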

Bio: Duy H. N. Nguyen (Senior Member, IEEE) is an Associate Professor in the Department of Electrical and Computer Engineering, San Diego State University. He received the B.Eng. from Swinburne University of Technology, Hawthorn, VIC, Australia in 2005, the M.Sc. from the University of Saskatchewan, Saskatoon, SK, Canada in 2009, and the Ph.D. from McGill University, Montreal, QC, Canada in 2013. He was a postdoctoral research fellow at INRS-EMT (University of Quebec), The University of Houston, and the University of Texas at Austin. He joined SDSU in 2016. His current research interests include resource allocation in wireless networks, signal processing for communications, optimization, game theory and machine learning. He received the NSF CAREER award in 2022.


Weizhi Li

Post-doc

Arizona State  University

     

Designing Two-Sample Tests with Local Information

Video: Link 

Abstract: Two-sample hypothesis testing is a fundamental tool to determine whether there is a significant difference between two groups of data. It is widely applied in decision-making contexts ranging from drug efficacy evaluation in drug discovery to policy assessment in government policymaking. Standard two-sample tests construct a statistic that is a global characterization of the data in order to make a decision; this can adversely impact the utility of a two-sample test when the “signal-to-noise ratio” of the dataset is small or the available data is insufficient. In this talk, I will address how to integrate local information of the data into the design of two-sample tests, in the settings of batch and sequential active learning (i.e., features are inexpensive, but labels are costly) and the standard batch setting (both features and labels can be readily accessed), to increase the utility of the two-sample tests. Throughout this talk, I will highlight two-sample tests that are based on graph, information-theoretic, and betting-theoretic measures.
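
For readers who want a concrete baseline to contrast with the local-information tests in the talk, below is a minimal sketch of a standard permutation two-sample test built around a single global statistic (the difference of sample means); the statistic and the synthetic data are illustrative choices only.

    import numpy as np

    def permutation_two_sample_test(x, y, n_perm=10_000, seed=0):
        """Permutation p-value for H0: x and y come from the same distribution,
        using the difference of sample means as a (global) test statistic."""
        rng = np.random.default_rng(seed)
        pooled = np.concatenate([x, y])
        n_x = len(x)
        observed = abs(x.mean() - y.mean())
        count = 0
        for _ in range(n_perm):
            perm = rng.permutation(pooled)
            stat = abs(perm[:n_x].mean() - perm[n_x:].mean())
            count += stat >= observed
        return (count + 1) / (n_perm + 1)

    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=100)
    y = rng.normal(0.3, 1.0, size=100)
    print(permutation_two_sample_test(x, y))

The tests highlighted in the talk replace this single global characterization with statistics that exploit local structure in the data and, in the active-learning settings, decide which labels to query.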

Bio: Weizhi Li is a post-doc at Arizona State University, working with Dr. Visar Berisha and Dr. Gautam Dasarathy. He obtained his M.S. in Electrical Engineering in 2017 from Texas A&M University and his Ph.D. in Computer Engineering in 2022 from Arizona State University. He was supervised by Dr. Visar Berisha and Dr. Gautam Dasarathy in his Ph.D. study. His research interests include active learning and statistical hypothesis testing with the aim of designing data-efficient decision-making algorithms.


Bane Vasic

Professor

 University of Arizona 

     

Quantum Error Correction: Introduction and Latest Developments in QLDPC Codes 

Video: Link 

Abstract: A fundamental concept of the mathematical theory of information laid down by Shannon is that of error correcting codes. Error correction codes play a vital role in ensuring the integrity of data in systems exposed to noise or errors. Classical error correcting codes were crucial to the success of modern communications and data storage systems (from the Internet to mobile, satellite and deep-space communications, and from disk to flash memory storage) and found applications in other areas, such as pattern recognition, group testing, cryptography, or fault-tolerant (FT) computing. Likewise, quantum error correction (QEC) codes are at the heart of all quantum information processing, from FT quantum computing to reconciliation in quantum key distribution, quantum sensing, and reliable optical communications. However, unlike classical coding theory, which is a mature and established discipline, quantum codes are still a subject of extensive research. The importance of QEC is that it is the only presently known gateway to reap the benefits of computational quantum algorithms, but a robust and scalable QEC has not yet been demonstrated experimentally. Arguably, QEC is the only technology still lacking to realize a vision of useful large-scale quantum computation, and its development is pursued by many research groups in academia, national labs, and industry. One of the most promising solutions is based on quantum low-density parity check (QLDPC) codes, which are the only known class of quantum codes in the stabilizer family with asymptotically nonzero rates and that support fault-tolerant operation using noisy quantum gates. This talk will present an overview of the research in QLDPC codes. It is prepared for classical communications theory researchers, and no background in quantum mechanics or error correction is required.
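
As a small piece of background connecting classical and quantum codes (standard material, not specific to the talk): in the CSS construction, a stabilizer code is specified by two classical binary parity-check matrices H_X and H_Z, one handling bit-flip and the other phase-flip errors, and the stabilizer generators commute provided

    H_X H_Z^{\top} = 0 \pmod{2}.

QLDPC codes are stabilizer codes in which these parity-check matrices are sparse, which is what enables low-complexity iterative decoding reminiscent of classical LDPC codes.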


Bio: Dr. Bane Vasic is a Professor of Electrical and Computer Engineering and Mathematics at the University of Arizona and a Director of the Error Correction Laboratory. He is an inventor of the soft error-event decoding algorithm for intersymbol interference channels with correlated noise, and the key architect of a detector/decoder for Bell Labs data storage read channel chips which were regarded as the best in industry. His pioneering work on structured low-density parity check (LDPC) error correcting codes based on combinatorial designs has enabled low-complexity iterative decoder implementations. Structured LDPC codes are today adopted in a number of communications standards and data storage systems. Dr. Vasic’s work on codes on graphs, trapping sets and error floor of iterative decoding algorithms has led to decoders for the binary symmetric channel with best error-floor performance known today. Dr. Vasic is a PI on a Department of Energy multi-university $115M project led by Fermi National Laboratory to establish a Center for Superconducting Materials and Systems. He is a co-PI of the $52M NSF Center for Quantum Networks hosted at the University of Arizona, and is involved in the University of Arizona Quantum Hub, a group of researchers leading the effort to establish a graduate program in quantum information science and engineering at UArizona. He is also funded by NASA-Jet Propulsion Laboratory through the Strategic University Partnership Program for the development of quantum codes and error correction algorithms for NASA space missions, and is a PI on seven research grants funded by the National Science Foundation. He is a founder of Codelucida, a company developing advanced error correction solutions for communications and data storage. Recently, Codelucida has received numerous innovation awards including the 2017 Arizona Innovation Challenge Award from the Arizona Commerce Authority, the 2018 I-Squared Startup of the Year from Tech Launch Arizona, and the Best of Show Award for the Most Innovative Flash Memory Technology at the 2019 Flash Memory Summit, the world's largest exhibition of flash memory technologies. Codelucida is a Xilinx Partner providing LDPC Codec IP cores for flash memory controllers. He is an IEEE Fellow, Fulbright Scholar, da Vinci Fellow, and a past Chair of the IEEE Data Storage Technical Committee.


Angelia Nedic

Professor

     Arizona State University

     

Characterizing Trust and Resilience in Distributed Optimization for Cyberphysical Systems 

Video: Link 

Abstract: This talk considers the problem of resilient distributed multi-agent optimization for cyberphysical systems in the presence of malicious or non-cooperative agents. It is assumed that stochastic values of trust between agents are available, which allows agents to learn their trustworthy neighbors simultaneously with performing updates to minimize their own local objective functions. The development of this trustworthy computational model combines tools from statistical learning and distributed consensus-based optimization. Specifically, we derive a unified mathematical framework to characterize convergence, deviation of the consensus from the true consensus value, and expected convergence rate, when there exists additional information of trust between agents. We show that under certain conditions on the stochastic trust values and consensus protocol: 1) almost sure convergence to a common limit value is possible even when malicious agents constitute more than half of the network, 2) the deviation of the converged limit from the nominal no-attack case, i.e., the true consensus value, can be bounded with probability that approaches 1 exponentially, and 3) correct classification of malicious and legitimate agents can be attained in finite time almost surely. Further, the expected convergence rate decays exponentially with the quality of the trust observations between agents. We then combine this trust-learning model within a distributed gradient-based method for solving a multi-agent optimization problem and characterize its performance.
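
A schematic form of the update being analyzed may help fix ideas (generic notation; the precise trust model and weights are as defined in the talk and the underlying paper): each legitimate agent i maintains a trusted neighbor set \mathcal{N}_i^k learned from accumulated stochastic trust observations and performs

    x_i^{k+1} = \sum_{j \in \mathcal{N}_i^k \cup \{i\}} w_{ij}^k \, x_j^k \;-\; \alpha_k \nabla f_i\big(x_i^k\big),

so that consensus weights w_{ij}^k are placed only on neighbors currently deemed trustworthy while each agent also descends its own local objective f_i.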


Bio: Angelia Nedić has a Ph.D. from Moscow State University, Moscow, Russia, in Computational Mathematics and Mathematical Physics (1994), and a Ph.D. from the Massachusetts Institute of Technology, Cambridge, USA, in Electrical Engineering and Computer Science (2002). She has worked as a senior engineer at BAE Systems North America, Advanced Information Technology Division, in Burlington, MA. Currently, she is a faculty member of the School of Electrical, Computer and Energy Engineering at Arizona State University, Tempe. Prior to joining Arizona State University, she was a Willard Scholar faculty member at the University of Illinois at Urbana-Champaign. She is a recipient (jointly with her co-authors) of the Best Paper Award at the Winter Simulation Conference 2013 and the Best Paper Award at the International Symposium on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt) 2015. Her general research interest is in optimization, large scale complex systems dynamics, variational inequalities and games.

Yuan Wang

Assistant Professor

     University of South Carolina

     

Topological Clustering and Inference of Brain Networks

Video: 

Abstract:  Topological data analysis (TDA) has motivated the exploration of mesoscale features in brain signals and networks. Persistent homology is a key TDA algorithm for decoding and representing these features via persistence descriptors such as persistence diagram (PD), whose statistical significance is often revealed through permutation testing. But testing on PDs is challenging due to the heterogeneity of points in the diagrams that encode the birth and death times of features through a dynamic filtration of the subnetworks. This talk will showcase a topological clustering and transposition-based permutation testing framework based on heat-diffusion estimates of PDs to resolve computational bottlenecks of heavy permutations, with applications to the comparison of brain networks constructed from neuroimaging data.


Bio: Dr. Wang is an assistant professor of biostatistics at the University of South Carolina. Her research has been focused on developing statistical inference methods for topological signal processing and network analysis, with applications to neuroimaging data in epilepsy and post-stroke aphasia. She is also collaborating on NIH-funded projects decoding patterns in wearable sensor signals.

Christian Arenz

Assistant Professor

     Arizona State University

        29th April 2023

On the convergence of hybrid quantum-classical algorithms

 Video: Link 

Abstract: The development of prototype quantum computers has sparked a wave of research into algorithms to best leverage these new resources. A major focus has been on hybrid quantum-classical algorithms, which have been designed for applications including combinatorial optimization and quantum simulation. The overarching premise of hybrid quantum-classical algorithms is to use quantum and classical computing resources in tandem to solve a problem that is formulated as an optimization problem. A quantum computer is used to evaluate the associated objective function by executing a parameterized quantum circuit, and a classical computer is used to adjust the circuit parameters so as to optimize the objective function and solve the problem at hand.

The performance and utility of hybrid quantum-classical algorithms are tightly linked to our ability to solve the classical optimization problem, and this can be a major challenge: the optimization problem is typically non-convex, and convergence to suboptimal solutions can be a significant concern. In this talk, I present several ways forward for addressing this challenge. In particular, I show that overparameterization can improve convergence to the global optimum and argue that randomized adaptive strategies, where quantum circuits are grown according to local gradient information, converge to the global optimum almost surely. I will conclude by outlining further challenges and opportunities related to the existence of barren plateaus in the optimization landscape, where gradients can become exponentially small due to the concentration of measure phenomenon.
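
To make the hybrid loop concrete, here is a schematic sketch in plain Python; estimate_objective is a hypothetical placeholder standing in for executing the parameterized circuit on hardware or a simulator, and the finite-difference gradient is only one of many possible classical update rules.

    import numpy as np

    def estimate_objective(theta):
        """Placeholder: in a real hybrid algorithm this would prepare the
        parameterized circuit with angles `theta`, measure, and return an
        estimate of the objective (e.g., an energy or cost expectation)."""
        return float(np.sum(np.sin(theta) ** 2))   # toy stand-in landscape

    def hybrid_loop(theta, lr=0.1, eps=1e-3, n_iter=200):
        """Classical outer loop: estimate gradients of the quantum-evaluated
        objective by finite differences and take gradient-descent steps."""
        for _ in range(n_iter):
            grad = np.zeros_like(theta)
            for i in range(len(theta)):
                shift = np.zeros_like(theta)
                shift[i] = eps
                grad[i] = (estimate_objective(theta + shift)
                           - estimate_objective(theta - shift)) / (2 * eps)
            theta = theta - lr * grad
        return theta, estimate_objective(theta)

    theta0 = np.random.default_rng(0).uniform(0, np.pi, size=4)
    print(hybrid_loop(theta0))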


Bio: Christian Arenz joined Arizona State University as an assistant professor in the School of Electrical, Computer and Energy Engineering in January 2022. Prior to joining ASU he was a lecturer and an associate research scholar at Princeton University. Previously, he completed his PhD in applied mathematics at Aberystwyth University in 2016, where he focused on the control of open and noisy quantum systems. He completed his master’s degree equivalent in theoretical physics from Saarland University in 2012, where he studied quantum optical systems. Christian’s current research centers on using tools from control theory to advance quantum information science. His work targets applications such as the design of robust and efficient controls for quantum computing, and the development of quantum algorithms for optimization and machine learning tasks.

Christos Thrampoulidis

Assistant Professor

         University of British Columbia

        22nd April 2023

Implicit Geometry of Deep-net Classifiers: Imbalance Trouble and How to Fix it

 Video: Link 

Abstract: What are the unique structural properties of models learned by deep nets? Is there an implicit bias towards solutions of a certain geometry? How does this vary across training instances, architectures, and data? Towards answering these questions, the recently discovered Neural Collapse phenomenon formalizes simple geometric properties of the learned embeddings and of the classifiers, which appear to be “cross-situational invariant” across architectures and different balanced classification datasets. But what happens when classes are imbalanced? Is there an (ideally equally simple) description of the geometry that is invariant across class-imbalanced datasets? By characterizing the global optima of an unconstrained-features model, we formalize a new geometry that remains invariant across different imbalance levels. Importantly, it, too, has a simple description despite the asymmetries imposed by data imbalances on the geometric properties of different classes. Overall, we show that it is possible to extend the scope of the neural-collapse phenomenon to a richer class of geometric structures. We also motivate further investigations into the impact of class imbalances on the implicit bias of first-order methods and into the potential connections between such geometric structures and generalization.

This is joint work with Tina Behnia, Ganesh Kini and Vala Vakilian.
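
For reference, the balanced-data geometry that Neural Collapse describes (a standard formulation, included only as background for the imbalanced case discussed above): after centering at the global mean and normalizing, the class-mean embeddings \tilde{\mu}_1, \dots, \tilde{\mu}_K form a simplex equiangular tight frame,

    \|\tilde{\mu}_i\| = 1, \qquad \langle \tilde{\mu}_i, \tilde{\mu}_j \rangle = -\frac{1}{K-1} \;\; \text{for } i \neq j,

with the classifier vectors aligned to the class means; the talk characterizes how this picture deforms, yet remains simply describable, under class imbalance.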


Bio: Dr. Thrampoulidis is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of British Columbia. Previously, he was an Assistant Professor at the University of California, Santa Barbara and a Postdoctoral Researcher at MIT. He received an M.Sc. and a Ph.D. in Electrical Engineering in 2012 and 2016, respectively, both from Caltech, with a minor in Applied and Computational Mathematics. In 2011, he received a Diploma in ECE from the University of Patras, Greece. His research is on machine learning, high-dimensional statistics and optimization.

Wei Yu

 Professor

University of Toronto

15th April 2023

Active Learning for Communication and Sensing

 Video: Link 

Abstract: Machine learning will play an important role in the optimization of future-generation physical-layer wireless communication systems, for the following two reasons. First, traditional wireless communication design always relies on the channel model, but models are only an approximation to reality. In wireless environments where the modelling task is complex and the channels are costly to estimate, a learning-based approach can significantly outperform the traditional model-based approaches. Second, modern wireless communication design often involves optimization problems that are high-dimensional, nonconvex, and difficult to solve efficiently. By exploiting the availability of training data, a neural network may be able to learn the solution of an optimization problem directly, which can lead to a more efficient way to solve nonconvex optimization problems. In this talk, I will use examples from the active sensing and localization problems for the reconfigurable intelligent surface (RIS) system and the initial beam alignment problem for the mmWave massive MIMO system to illustrate the benefit of learning-based physical-layer communication system design.


Bio: Wei Yu is a Professor in the Electrical and Computer Engineering Department at the University of Toronto, where he holds a Canada Research Chair in Information Theory and Wireless Communications. He received the B.A.Sc. degree in Computer Engineering and Mathematics from the University of Waterloo, and the M.S. and Ph.D. degrees in Electrical Engineering from Stanford University. He is a Fellow of the IEEE and a Fellow of the Canadian Academy of Engineering. He received the Steacie Memorial Fellowship in 2015, the IEEE Marconi Prize Paper Award in Wireless Communications in 2019, the IEEE Communications Society Award for Advances in Communication in 2019, the IEEE Signal Processing Society Best Paper Award in 2008, 2017 and 2021, and the IEEE Communications Society Best Tutorial Paper Award in 2015. Prof. Wei Yu served as the President of the IEEE Information Theory Society in 2021.

Rong Pan

 Associate Professor

          ASU

7th April 2023

Physics-Informed Neural Network and its Applications to Digital Twin

 Video:  Link 

Abstract: Dynamics of physical phenomena are modeled by ordinary differential equations and partial differential equations, whose solutions are typically not easy to find. A recent trend is to solve these equations by data-driven approaches through machine learning, such as Gaussian processes or neural networks. Physics-informed neural networks (PINNs) are a type of neural network that incorporates physical laws or principles into its architecture, thus forcing the neural network prediction to abide by physical laws. In this talk, I will discuss a 3D printing process called direct ink writing (DIW) and how PINNs can be used for predicting the fluid flow inside the printer nozzle, which directly affects the quality of the 3D printing process. I will discuss several neural network architectures and compare their performance.
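
A minimal sketch of the PINN idea in PyTorch, written for a generic 1D diffusion equation u_t = nu * u_xx rather than the DIW nozzle-flow model from the talk; the network size, collocation points, and coefficient are illustrative assumptions.

    import torch

    # Small network u_theta(x, t): input (x, t), output u.
    net = torch.nn.Sequential(
        torch.nn.Linear(2, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1),
    )

    def pde_residual(xt, nu=0.01):
        """Residual of u_t - nu * u_xx at collocation points xt = [x, t]."""
        xt = xt.clone().requires_grad_(True)
        u = net(xt)
        du = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
        u_x, u_t = du[:, 0:1], du[:, 1:2]
        u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, 0:1]
        return u_t - nu * u_xx

    # One training step on the physics loss over random collocation points.
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    xt_colloc = torch.rand(256, 2)                 # random (x, t) in [0,1]^2
    loss = pde_residual(xt_colloc).pow(2).mean()   # physics-informed term
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(float(loss))

A data-fit term on any available measurements, plus initial and boundary terms, would be added to the same loss, which is what ties the network to a particular physical scenario.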


Bio: Dr. Rong Pan is an associate professor of Industrial Engineering in the School of Computing and Augmented Intelligence at Arizona State University. His research interests include failure time data analysis, design of experiments, multivariate statistical process control, time series analysis, and computational Bayesian methods. His research has been supported by NSF, Arizona Science Foundation, Air Force Research Lab, etc. He has published over 90 journal papers and 50+ refereed conference papers. Dr. Pan is a senior member of ASQ, IIE and IEEE, and a lifetime member of SRE. He serves on the editorial boards of Journal of Quality Technology and Quality Engineering.

Zelda Mariet

  Senior Research Scientist

            Google Brain

  31st March 2023

Ensembles of Classifiers: a Bias-Variance Perspective

 Video: Link 

Abstract: Ensembles are a straightforward, remarkably effective method for improving the accuracy, calibration, and robustness of neural networks on classification tasks. Yet, the reasons underlying their success remain an active area of research. Building upon (Pfau, 2013), we turn to the bias-variance decomposition of Bregman divergences in order to gain insight into the behavior of ensembles under classification losses. Introducing a dual reparameterization of the bias-variance decomposition, we first derive generalized laws of total expectation and variance, then discuss how bias and variance terms can be estimated empirically. Next, we show that the dual reparameterization naturally introduces a way of constructing ensembles which reduces the variance and leaves the bias unchanged. Conversely, we show that ensembles that directly average model outputs can arbitrarily increase or decrease the bias. Empirically, we see that such ensembles of neural networks may reduce the bias.
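
As background, the familiar squared-loss special case of the decomposition reads

    \mathbb{E}_{D}\big[(y - f_D(x))^2\big] = (y - \bar{f}(x))^2 + \mathbb{E}_{D}\big[(\bar{f}(x) - f_D(x))^2\big], \qquad \bar{f}(x) = \mathbb{E}_{D}[f_D(x)],

where D denotes the randomness across ensemble members (data, initialization, and so on), the first term is the (squared) bias and the second the variance. For general Bregman divergences, such as the cross-entropy losses used in classification, an analogous decomposition holds when the central model \bar{f} is formed by averaging in dual coordinates, which is what the dual reparameterization in the talk makes precise.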

Bio: Zelda Mariet is a senior research scientist at Google Brain. She got her PhD at MIT, working with Suvrit Sra as a member of the Machine Learning and Learning and Intelligent Systems groups. Her research focuses on identifying precise mathematical definitions of diversity to understand the behavior of ML models, e.g., under distribution shift. Her PhD work focused on negatively dependent measures, which use Strongly Rayleigh polynomials to encode desirable properties for diversity modeling.

Anand Sarwate

  Professor

     Rutgers University

  24th March 2023

Federated Learning and Privacy in Collaborative Research Systems

 Video:  Link 

Abstract: Decentralized learning has been rebranded as “federated learning” with the advent of large-scale ML/AI models used in commercial technologies like mobile phones and smart home devices. In these systems, devices send summaries of locally collected data to a central aggregator, which then updates the ML model and sends the updated model back to the devices. In such a system, depending on the trust model, sites can locally perturb their summaries to guarantee some form of differential privacy, a statistical framework for measuring privacy risk. Many analyses and algorithms focus on scenarios with a large number of data collectors (such as phones), where the large sample size can compensate for the privacy-preserving perturbation. In the “cross-silo” model for federated learning, a small number of sites with moderate amounts of data can collaborate to learn a model while leaving the data at the sites. This model fits situations in human health research, where a consortium of research teams or centers may wish to collaborate. In this talk I describe ongoing work in building a collaborative research system for neuroimaging data and show the kinds of studies which can be performed using this system. After an introduction to the basics of differential privacy, I will describe how it can be applied in these settings and some strategies for mitigating the privacy loss under additional assumptions for the trust model. I will also discuss the major challenges in deploying privacy for these systems from a technical and policy perspective.

Joint work with: H. Imtiaz, J. Mohammadi, Y. Tao, B. Baker, A. Abrol, R.F. Silva, E. Damaraju, V.D. Calhoun, and S.M. Plis
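
To make the local perturbation step concrete, here is a minimal sketch of one round of cross-silo aggregation in which each site clips its model update and adds Gaussian noise before sharing; the clipping norm, noise scale, and update shapes are illustrative, and calibrating the noise to a formal (epsilon, delta) guarantee requires the usual Gaussian-mechanism accounting.

    import numpy as np

    def privatize_update(update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
        """Clip an update to bounded L2 norm, then add Gaussian noise
        (local perturbation before sending to the aggregator)."""
        rng = rng or np.random.default_rng()
        norm = np.linalg.norm(update)
        clipped = update * min(1.0, clip_norm / (norm + 1e-12))
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
        return clipped + noise

    rng = np.random.default_rng(0)
    site_updates = [rng.standard_normal(10) for _ in range(5)]   # 5 sites
    noisy = [privatize_update(u, rng=rng) for u in site_updates]
    aggregated = np.mean(noisy, axis=0)   # aggregator averages noisy updates
    print(aggregated)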


Bio: Anand D. Sarwate is an Associate Professor in the Department of Electrical and Computer Engineering at Rutgers, the State University of New Jersey. He received a B.S. degree in Electrical Science and Engineering and a B.S. degree in Mathematics from MIT in 2002, an M.S. in Electrical Engineering from UC Berkeley in 2005 and a PhD in Electrical Engineering from UC Berkeley in 2008. From 2008-2011 he was a postdoctoral researcher at the Information Theory and Applications Center at UC San Diego and from 2011-2013 he was a Research Assistant Professor at the Toyota Technological Institute at Chicago. Prof. Sarwate received the Rutgers Board of Trustees Research Fellowship for Scholarly Excellence in 2020, the A. Walter Tyson Assistant Professor Award from the Rutgers School of Engineering in 2018, and an NSF CAREER award in 2015. His interests are in information theory, machine learning, and signal processing, with applications to distributed systems, privacy and security, and biomedical research.

Jeffrey Andrews

  Professor

        UT Austin

    17th March 2023

Unlocking new capacity in 6G cellular systems via site-specific ML-aided design

 Video:  Link 

Abstract: 6G cellular networks will be extremely complex systems that must meet many competing requirements in a large variety of environments and use cases. A key enabling 6G technology will be deep learning, which can unlock previously hidden system-level gains, particularly by effectively learning custom site-specific communication techniques that are optimally adapted for each cell site. I will present a short summary of some of my group’s recent discoveries and technologies based on deep learning (DL) that demonstrate a large potential impact in 6G. The first is a novel and practical approach for beam alignment in millimeter wave and THz bands that can achieve a phenomenal speed-up — at least 10x and in some cases approaching 1000x — by sensing and exploiting unique aspects of the environment. The second is a new architecture for estimating high-dimensional channels by harnessing the expressive power of deep generative networks to develop a custom model of each cell’s channel distribution. The third is a site-specific multicell optimization that rapidly learns near-optimal global settings for each base station's antenna arrays, a problem that is completely intractable using conventional techniques. It is trained and tested on AT&T’s commercial simulation platform and shows large gains over existing 3GPP approaches.


Bio: Jeffrey Andrews is the Truchard Family Endowed Chair in Engineering at the University of Texas at Austin, where he is Director of 6G@UT. He received the B.S. in Engineering with High Distinction from Harvey Mudd College, and the M.S. and Ph.D. in Electrical Engineering from Stanford University. Dr. Andrews is an IEEE Fellow and ISI Highly Cited Researcher and has been co-recipient of 15 best paper awards including the 2016 IEEE Communications Society & Information Theory Society Joint Paper Award, the 2014 IEEE Stephen O. Rice Prize, the 2014 and 2018 IEEE Leonard G. Abraham Prize, the 2011 and 2016 IEEE Heinrich Hertz Prize, and the 2010 IEEE ComSoc Best Tutorial Paper Award. His other major awards include the 2015 Terman Award, the NSF CAREER Award, the 2021 Gordon Lepley Memorial Teaching Award, the 2021 IEEE ComSoc Joe LoCicero Service Award, the IEEE ComSoc Wireless Communications Technical Committee Recognition Award, and the 2019 IEEE Kiyo Tomiyasu Technical Field Award. His former PhD students include five IEEE Fellows, several professors at top universities in the USA, Asia, and Europe, and industry leaders on LTE and 5G systems, on which they collectively hold thousands of US patents.

Sean Meyn

  Professor

  University of Florida

                  3rd March 2023

Who is Q?

 Video:    Link 

Abstract: One theoretical foundation of reinforcement learning is optimal control, usually the Markovian variety known as Markov decision processes (MDPs). The MDP model consists of a state process, an action (or input) process, and a one-step cost function that is a function of state and action. The goal is to obtain a policy (function from states to actions) that is optimal in some predefined sense. Chris Watkins introduced the Q-function in the 1980s as part of a methodology for reinforcement learning. Given its importance for over three decades, it is not surprising that the question of the true meaning of Q was a hot topic for discussion during the Simons Institute's Fall 2020 program on Theory of Reinforcement Learning. In this lecture we discover the truth about Q’s origins, and what has happened since.

We’ve all heard about the magic of Q: consider AlphaZero and ChatGPT. As we review the foundations of the reinforcement universe, you may share the speaker's amazement that Q-learning is ever successful! This invites many research questions: why does Q-learning result in successful solutions for decision and control? How can we create new approaches to reinforcement learning that are efficient in terms of training, and also provide some estimate of policy performance?

The lecture draws on Chapters 5 and 9 of the new monograph,  Control Systems and Reinforcement Learning,  as well as recent papers on Convex Q-Learning and Logistic Q-Learning.  
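
For readers meeting Q for the first time: in a discounted MDP with reward r and discount factor \gamma, Watkins' tabular Q-learning performs the stochastic-approximation update

    Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha_t \Big[ r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \Big].

This is the reward-maximization convention; control-theoretic treatments often work with one-step costs and minimization instead, but the structure of the update is the same.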


Bio: Sean Meyn was raised by the beach in Southern California. Following his BA in mathematics at UCLA, he moved on to pursue a PhD with Peter Caines at McGill University. After about 20 years as a professor of ECE at the University of Illinois, in 2012 he moved to beautiful Gainesville. He is now Professor and Robert C. Pittman Eminent Scholar Chair in the Department of Electrical and Computer Engineering at the University of Florida, and director of the Laboratory for Cognition and Control. He also holds an Inria International Chair to support research with colleagues in France. His interests span many aspects of stochastic control, stochastic processes, information theory, and optimization. For the past decade, his applied research has focused on engineering, markets, and policy in energy systems.

Waheed U. Bajwa

  Professor

  Rutgers University

    24th February 2023

FAST-PCA: A Scalable Algorithm for Distributed Principal Component Analysis

 Video:   Link 

Abstract: Principal Component Analysis (PCA) is considered to be a quintessential data preprocessing tool in many machine learning applications. But the high dimensionality and massive scale of data in several of these applications mean that traditional centralized PCA solutions are fast becoming irrelevant for them. Distributed PCA, in which a multitude of interconnected computing devices collaborate among themselves in order to obtain the principal components of the data, is a typical approach to overcome the limitations of the centralized PCA solutions. The focus in this talk is on the distributed PCA problem when the data are distributed among computing devices whose interconnections correspond to an ad-hoc topology. Such a setup, which corresponds to the Internet-of-Things, vehicular networks, mobile edge computing, etc., has been considered in a few recent works on distributed PCA. But the resulting solutions either overlook the uncorrelated feature learning aspect of the PCA problem, tend to have high communications overhead that makes them unscalable, and/or lack 'exact' or 'global' convergence guarantees. In order to overcome these limitations, this talk introduces two closely related variants of a new and scalable distributed PCA algorithm, termed FAST-PCA (Fast and exAct diSTributed PCA), that is efficient in terms of communications because of its one time-scale nature. The proposed FAST-PCA algorithm is theoretically shown to converge linearly and exactly to the principal components, leading to dimension reduction as well as uncorrelated features for machine learning, while extensive numerical experiments on both synthetic and real data highlight its superiority over existing distributed PCA algorithms.


Bio: Waheed U. Bajwa, whose research interests include statistical signal processing, high-dimensional statistics, machine learning, inverse problems, and networked systems, is currently a professor and graduate program director in the Department of Electrical and Computer Engineering and a member of the graduate faculty of the Department of Statistics at Rutgers University. Additionally, he has held positions at Princeton University, Duke University, and different technology startups.


Dr. Bajwa has received several research and teaching awards including the Army Research Office Young Investigator Award (2014), the National Science Foundation CAREER Award (2015), Rutgers Presidential Merit Award (2016), Rutgers Presidential Fellowship for Teaching Excellence (2017), Rutgers Engineering Governing Council ECE Professor of the Year Award (2016, 2017, 2019), Rutgers Warren I. Susman Award for Excellence in Teaching (2021), and Rutgers Presidential Outstanding Faculty Scholar Award (2022). He is a co-investigator on a work that received the Cancer Institute of New Jersey’s Gallo Award for Scientific Excellence in 2017, a co-author on papers that received Best Student Paper Awards at IEEE IVMSP 2016 and IEEE CAMSAP 2017 workshops, and a Member of the Class of 2015 National Academy of Engineering Frontiers of Engineering Education Symposium.

Parv Venkitasubramaniam

  Professor

  Lehigh University

    10th February 2023

Domain Agnostic Extraction of Sensitive Information

 Video:   Link 

Abstract: In this talk, we’ll discuss a paradigm for the protection of sensitive inferences drawn from data streams, with an added emphasis on extracting these inferences with limited domain knowledge. From a privacy perspective, we will discuss an alternative to end-to-end encryption of entire data streams, or noise-addition based privatization mechanisms. It relies on the notion that the shared raw data are not themselves sensitive except for the inferences that can be drawn from them. If these inferences can be “extracted” from the data, then a high-rate irrelevant stream is guaranteed to provide perfect privacy for the underlying inference without any additional protection. We will present theoretical results, and practical demonstrations on different datasets where trained deep learning based classifiers are shown to fail on the unprotected data. We will discuss further the role of quantization as a means to gain privacy, with novel variations on Lloyd’s algorithm for privacy-sensitive quantization. In the last part of the talk we will focus on a non-private setting, where inferences need to be derived when underlying system models are sensitive. These inferences could be critical to the functioning of sensitive systems, and our methods will focus on semi-supervised learning as a means to derive inferences without any prior knowledge of the system or the possible inferences that exist therein.


Bio: Parv Venkitasubramaniam is a Professor in the Electrical and Computer Engineering Department at Lehigh University and a Founding Director of the new MS Program in Data Science offered by the P. C. Rossin College of Engineering and Applied Sciences. His current research interests are in developing theoretical foundations for information security in cyber physical systems, and data driven inference and control of engineering systems. His research spans different topics under this umbrella including privacy in inferential learning, interactive and dynamical systems, cybersecurity in the smart electric grid and privacy from timing analysis in networked systems. He is a recipient of the CAREER Award from the National Science Foundation and a Leonard G. Abraham Award from the IEEE Communications Society.


He received a B.Tech in electrical engineering from the Indian Institute of Technology, Madras in 2002, and his M.S. and Ph.D. degrees in electrical engineering in 2005 and 2007, respectively, from Cornell University. From 2007 to 2009, he was a visiting post-doctoral fellow at the University of California, Berkeley.



Nicholas Lanchier

  Professor

            ASU

    3rd February

                    2023

Consensus and discordance in the Axelrod model for the dynamics of cultures

 Video:  Link 

Abstract: The Axelrod model is a spatial stochastic model for the dynamics of cultures which includes two important social components: homophily, the tendency of individuals to interact more frequently with individuals who are more similar, and social influence, the tendency of individuals to become more similar when they interact. Each individual is characterized by a collection of opinions about different issues, and pairs of neighbors interact at a rate equal to the number of issues for which they agree, which results in the interacting pair agreeing on one more issue. This model has been extensively studied during the past 20 years based on numerical simulations and heuristic arguments while there is a lack of analytical results. This talk gives rigorous fluctuation and fixation results for the one-dimensional system that sometimes confirm and sometimes refute some of the conjectures formulated by applied scientists.
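For readers unfamiliar with the dynamics, the interaction rule described above is simple enough to simulate directly; here is a toy one-dimensional simulation (all parameter values and the function name are hypothetical, and the talk's results are analytical rather than simulation-based).

    import numpy as np

    def axelrod_1d(n=100, issues=5, opinions=3, steps=100_000, seed=0):
        # Toy jump-chain simulation of the 1D Axelrod model: neighboring pairs
        # interact at a rate equal to the number of issues they agree on, and an
        # interaction makes them agree on one more issue.  Fully agreeing pairs
        # are skipped since their interactions have no effect.
        rng = np.random.default_rng(seed)
        state = rng.integers(opinions, size=(n, issues))
        for _ in range(steps):
            agree = np.array([(state[i] == state[i + 1]).sum() for i in range(n - 1)])
            active = (agree > 0) & (agree < issues)
            if not active.any():
                break                                      # absorbing configuration
            rates = np.where(active, agree, 0).astype(float)
            i = rng.choice(n - 1, p=rates / rates.sum())   # pick a pair ∝ its rate
            f = rng.choice(np.flatnonzero(state[i] != state[i + 1]))
            if rng.random() < 0.5:                         # one neighbor adopts the
                state[i, f] = state[i + 1, f]              # other's opinion on one
            else:                                          # disagreeing issue
                state[i + 1, f] = state[i, f]
        return state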


Bio: Nicholas Lanchier has been a Professor of Mathematics in the School of Mathematical and Statistical Sciences and Honors Faculty at Arizona State University since 2021. He obtained his PhD from the University of Rouen (France) in 2005. He did a postdoc at the University of Minnesota from 2005 to 2007 and joined the school of math in 2007 as an Assistant Professor. He created a YouTube channel during COVID with 250+ videos in probability theory at various levels. The following is the link to the probability course: Link

R. Srikant

  Professor

            UIUC

    11th November

                    2022

The Role of Lookahead and Approximate Policy Evaluation in Reinforcement Learning

 Video:  Link 

Abstract: When the sizes of the state and action spaces are large, solving MDPs can be computationally prohibitive even if the probability transition matrix is known. So in practice, a number of techniques are used to approximately solve the dynamic programming problem, including lookahead, approximate policy evaluation using an m-step return, and function approximation. Efroni et al. (2019) studied the impact of lookahead on the convergence rate of approximate dynamic programming. However, these convergence results can change dramatically when function approximation is used. Specifically, we show that when linear function approximation is used to represent the value function, a certain minimum amount of lookahead and multi-step return is needed for the algorithm to even converge. And when this condition is met, we characterize the finite-time performance of policies obtained using approximate policy iteration. Our results are presented for two different procedures to compute the function approximation: linear least-squares regression and gradient descent. Joint work with Anna Winnicki.
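In standard notation (not specific to the speaker's results), the two ingredients can be written schematically as an m-step return used for approximate policy evaluation and an H-step lookahead used for policy improvement, with a linear value-function approximation V_\theta(s) = \theta^{\top} \phi(s):

    \hat{V}_m(s) = \mathbb{E}\Big[\textstyle\sum_{t=0}^{m-1} \gamma^t r_t + \gamma^m V_\theta(s_m) \,\Big|\, s_0 = s\Big],
    \qquad
    \pi_{k+1}(s) \in \arg\max_a \, \mathbb{E}\Big[\textstyle\sum_{t=0}^{H-1} \gamma^t r_t + \gamma^H \hat{V}_m(s_H) \,\Big|\, s_0 = s,\ a_0 = a\Big].

The results described in the abstract quantify how large the lookahead H and the return length m must be for such iterations to even converge under linear function approximation.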

Bio: R. Srikant is a Grainger Distinguished Chair in Engineering and Professor of Electrical and Computer Engineering and the Coordinated Science Lab at the University of Illinois at Urbana-Champaign. His research interests include machine learning and communication networks. He is a winner of the ACM SIGMETRICS Achievement Award, the IEEE Koji Kobayashi Computers and Communication Award, and the IEEE INFOCOM Achievement Award. He has won several best paper awards including the Applied Probability Society’s Best Publication Award, the IEEE INFOCOM Best Paper Award, and the WiOpt Best Paper Award.

Guido Montufar

  Associate Professor

            UCLA

    4th November

                    2022

Geometry and Convergence of Natural Policy Gradient Methods

 Video:  Link 

Abstract: We study the convergence of different natural policy gradient (NPG) methods in discounted infinite-horizon Markov decision processes with memoryless stochastic policies. We show that the trajectories in state-action space are solutions of gradient flows with respect to Hessian geometries, and obtain global convergence guarantees and convergence rates for a variety of NPG flows. In particular, we show linear convergence for unregularized (and regularized) NPG flows with the Riemannian metrics proposed by Kakade and by Morimura et al., which we interpret as the Hessian geometries of conditional entropy and entropy, respectively. Further, we obtain sub-linear convergence rates for Hessian geometries arising from other convex functions like log-barriers. Finally, we interpret the time-discrete NPG methods with regularized rewards as inexact Newton methods if the NPG is defined with respect to the Hessian geometry of the regularizer (up to scaling). This yields local quadratic convergence rates of these methods for step size equal to the penalization strength. This is joint work with Johannes Mueller.
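Schematically (and suppressing the specifics of the talk), a natural policy gradient flow in parameter space preconditions the vanilla gradient of the objective R(\theta) by the Gram matrix G(\theta) of the chosen metric, and a Hessian geometry corresponds to G = \nabla^2 \Phi for a convex potential \Phi (conditional entropy, entropy, or a log-barrier):

    \dot{\theta}_t = G(\theta_t)^{+}\, \nabla_\theta R(\theta_t),
    \qquad
    \theta_{k+1} = \theta_k + \eta\, G(\theta_k)^{+}\, \nabla_\theta R(\theta_k),

with the Kakade metric recovering the classical Fisher-information preconditioner.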


Bio: Guido Montúfar is an Associate Professor of Mathematics and Statistics at UCLA and Head of the Math Machine Learning Group at the Max Planck Institute for Mathematics in the Sciences, working on deep learning theory and mathematical machine learning more broadly. He studied mathematics and theoretical physics at TU Berlin, obtained the Dr. rer. nat. in 2012 as an IMPRS fellow in Leipzig, and was a postdoc at Penn State and MPI MiS. Guido's research is supported in part by the ERC, DFG, and NSF, and he is a 2022 Alfred P. Sloan Research Fellow.

Pulkit Grover

      Professor

            CMU

    28th October

                    2022

Information-theoretic techniques for examining and intervening on neural circuits

 Video:  Link 

Abstract: Motivated by networks of clinical and basic neuroscience, this talk delves into the question of how to examine the computational system of our brain, and influence it to attain desirable outcomes. One part of my talk is theoretical: I will draw from existing neuroscience literature and simple examples to arrive at what we call the “M-Information Flow” framework, which provides the first formal definition of information flow in the brain. With this, I will show that it is possible to verifiably track information flow about a given message, and, further, obtain finer-grained information than existing tools. Through examples, I will also illustrate how information theory can suggest efficient interventions on a network for desirable outcomes, and use this result as a motivation to discuss the formalization of a new way of thinking about reverse engineering the brain: a difficult problem of finding minimal interventions. In another part, I'll quickly overview translation work in my lab in noninvasive sensing and stimulation of the brain, including i) designing the first neural sensing systems that work with Black hair types; ii) a new problem of localizing “silences” in the brain; and iii) the first noninvasive detection of “cortical spreading depolarizations” in brain injury patients. I will also illustrate how information, optimization, and control-theoretic approaches are beginning to influence neuromodulation, in particular, focusing on how the non-linear dynamics of neural membrane potentials can be harnessed to design novel neuro-stimulation strategies. In this context, I will discuss an exciting case study with biology collaborators. The main message of this part is that neural engineering and neuroscience provide a set of engaging problems for LIONS, where practical issues motivate entirely new problems in established fields, with exciting applications and an immense potential for positive and widespread impact. The talk is largely based on joint works led by Sara Caldas-Martinez, Alireza Chamanzar, Sanghamitra Dutta, Arnelle Etienne, Mats Forssell, Chaitanya Goswami, Vishal Jain, Jasmine Kwasa, Neil Mehta, and Praveen Venkatesh.


Bio: Pulkit Grover (Ph.D. UC Berkeley '10; B.Tech, M.Tech IIT Kanpur) is the Angel Jordan Professor at CMU. His current work involves thought and laboratory experiments to expand and develop a science of information for neural sensing and stimulation, with increasing focus on identifying and eliminating racial biases these systems can have, and improving accessibility by examining limits of non-invasive systems. To bring these to practice, his lab works extensively with data scientists, system and device engineers, neuroscientists, and clinicians. Specifically, his lab's work is focused on a) fair and explainable AI at the algorithm, theory, and hardware levels; b) tools (theoretical, computational, and hardware) for understanding the healthy brain, and understanding, diagnosing, and treating disorders such as epilepsy, stroke, and traumatic brain injuries. Pulkit received the 2010 best student paper award at IEEE Conference on Decision and Control; the 2011 Eli Jury Dissertation Award from UC Berkeley; the 2012 Leonard G. Abraham best journal paper award (IEEE ComSoc); a 2014 NSF CAREER award; a 2015 Google Research Award; a 2018 inaugural award from the Chuck Noll Foundation for Brain Injury Research; the 2018 Spira Excellence in Teaching Award (CMU), and the 2019 best tutorial paper award (IEEE ComSoc). He co-founded Precision Neuroscopics, Inc., a startup translating his lab's work on accessible, high-resolution neural sensing solutions to the real world. He's the PI of the SharpFocus award, a multi-institution effort aimed at mm- and msec-scale noninvasive brain sensing and stimulation, and is a distinguished lecturer for the IEEE Information Theory Society for 2022-23.

Ardhendu Tripathy

     Assistant Professor

         Missouri S&T

                 22nd October

                      2022

Chernoff Sampling for Active Testing and Extension to Active Regression

 Video:   Link 

Abstract: Commonly used machine learning models are often trained on large datasets. As a result, a natural stumbling block in applying them more widely is the human effort required to label or annotate the data. Active learning is a framework that can reduce the number of labelled data needed to achieve a desired performance. In this talk, I will explain the benefit of active learning in two problem settings: active testing and active regression. In active testing, the sequential design of experiments developed by Chernoff in 1959 is widely used and known to be asymptotically optimal. We obtained a novel non-asymptotic bound on the number of labelled data needed for Chernoff’s algorithm. We then extend Chernoff sampling and apply it in active regression. In addition to obtaining a theoretical performance guarantee, we find that our extension requires fewer labelled data compared to existing methods in both simulated and real-world datasets. This is joint work with Subhojyoti Mukherjee and Robert Nowak.


Bio: Ardhendu Tripathy has been an assistant professor in the Computer Science Department at Missouri University of Science & Technology, Rolla, since November 2020. Prior to that, he was a postdoctoral research associate with Robert Nowak at the University of Wisconsin-Madison, and he received his PhD from the Electrical & Computer Engineering Department at Iowa State University, Ames. His research interests are in sequential decision-making, active learning, and multi-armed bandits.

Ahmed Alkhateeb

     Assistant Professor

ASU

                 14th October

                      2022

Multi-Modal Sensing Aided Communications and the Role of Machine Learning

 Video:  Link 

Abstract: Wireless communication systems are moving to higher frequency bands (mmWave in 5G and above 100GHz in 6G and beyond) and deploying large antenna arrays at the infrastructure and mobile users (massive MIMO, mmWave/terahertz MIMO, reconfigurable intelligent surfaces, etc.). While using large antenna arrays and migrating to higher frequency bands enable satisfying the increasing demand in data rate, they also introduce new challenges that make it hard for these systems to support mobility and maintain high reliability and low latency. In this talk, I will first motivate the use of sensory data and machine learning to address these challenges. Then, I will present DeepSense 6G, the world's first large-scale real-world multi-modal sensing and communication dataset that enables the research in a wide range of integrated sensing and communication applications. After that, I will go over a few machine learning tasks enabled by the dataset such as radar, LiDAR, camera, and position aided beam and blockage prediction. Finally, I will discuss some future research directions in the interplay of communications, sensing, and positioning.


Bio: Ahmed Alkhateeb received his B.S. and M.S. degrees in Electrical Engineering from Cairo University, Egypt, in 2008 and 2012, and his Ph.D. degree in Electrical Engineering from The University of Texas at Austin, USA, in 2016. After the Ph.D., he spent some time as a Wireless Communications Researcher at the Connectivity Lab, Facebook, before joining Arizona State University (ASU) in Spring 2018, where he is currently an Assistant Professor in the School of Electrical, Computer, and Energy Engineering. His research interests are in the broad areas of wireless communications, signal processing, machine learning, and applied math. Dr. Alkhateeb is the recipient of the 2012 MCD Fellowship from The University of Texas at Austin, the 2016 IEEE Signal Processing Society Young Author Best Paper Award for his work on hybrid precoding and channel estimation in millimeter-wave communication systems, and the NSF CAREER Award 2021 to support his research on leveraging machine learning for large-scale MIMO systems.

Sudipto Mukherjee

      Microsoft


        23rd  September

                  2022

Classifier-based Information Estimation: Formulation, Applications and Extensions

 Video:  Link 

Abstract: Mutual Information (MI) and its conditional variant (CMI) are well-known information-theoretic measures to quantify the amount of information in a system. While MI measures information between two random variables, CMI extends it to a system where, in addition, we are given a set of conditioning random variables. CMI is more interesting in this regard, yet challenging, as the conditioning set grows in dimension. While nearest-neighbor based estimators of MI and CMI (Kraskov et al., 2003) have long been known and extensively studied, they struggle to estimate MI and CMI in high dimensions. In this talk, we take a different approach and use classifiers for MI and CMI estimation. We study their applications in conditional independence testing and explore the recent research extensions of MI and CMI estimators in multifarious areas such as inferring functional connectivity in neural data or measuring time-series dependencies.
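As a minimal sketch of the classifier-based idea (not the exact estimator from the talk), one can label joint samples against "product of marginals" samples, train any probabilistic classifier, and read off a likelihood ratio whose average log over the joint samples estimates I(X; Y); x and y are assumed to be (n, d) arrays, and the function name is hypothetical.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def classifier_mi(x, y, seed=0):
        # Joint samples (x_i, y_i) get label 1; permuting y breaks the dependence
        # and yields approximate samples from the product of marginals (label 0).
        rng = np.random.default_rng(seed)
        y_perm = y[rng.permutation(len(y))]
        joint, prod = np.hstack([x, y]), np.hstack([x, y_perm])
        feats = np.vstack([joint, prod])
        labels = np.r_[np.ones(len(joint)), np.zeros(len(prod))]
        clf = LogisticRegression(max_iter=1000).fit(feats, labels)
        p = np.clip(clf.predict_proba(joint)[:, 1], 1e-6, 1 - 1e-6)
        # E_joint[log p/(1-p)] approximates KL(P_XY || P_X P_Y) = I(X; Y).
        return float(np.mean(np.log(p / (1 - p))))

For CMI the same trick applies with the conditioning variables appended to the features and the permutation done within (approximate) conditional strata, which is where the difficulty of a growing conditioning set shows up.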


Bio: Dr. Sudipto Mukherjee received his Ph.D. from the Department of Electrical and Computer Engineering, University of Washington (Seattle). After his Ph.D., he joined Microsoft Corporation and is currently a Senior Applied Scientist in the Research and Incubations division of the company. Dr. Mukherjee’s research interests span multiple areas of machine learning and AI, namely unsupervised representation learning, multi-modal representation learning combining graphs and language data, as well as bringing together the advances in AI to create novel applications for the tech industry. Some of his noteworthy works include “ClusterGAN: Latent space clustering in Generative Adversarial Networks”, “CCMI: Classifier-based conditional mutual information estimation”, and “Smart ToDo: Automatic generation of To-Do List from Emails”.

   Flavio Calmon

     Assistant Professor

                    Harvard

        16th  September

                  2022

Information-Theoretic Tools for Responsible Machine Learning

 Video:  Link 

Abstract: We introduce information-theoretic results for fair machine learning. First, we study the problem of finding the element within a convex set of conditional distributions with the smallest f-divergence to a reference distribution. Motivated by applications in machine learning, we refer to this problem as model projection since any probabilistic classification model (e.g., logistic regression, random forests) can be viewed as a conditional distribution. The new, projected classifier is given by a tilting (i.e., post-processing) of the outputs of the original classifier. We show that the parameters of this tilting can be computed at scale (e.g., on a GPU) and that the projected classifier has provable performance guarantees. We apply model projection to create group-fair probabilistic classifiers by projecting an (unfair) classifier onto the set determined by fairness constraints. Our numerical results demonstrate that this approach achieves a state-of-the-art fairness-accuracy trade-off while scaling to datasets with millions of samples.

In the second part of the talk, we investigate the group fairness concerns of training a machine learning model using data with missing values. Most fairness interventions require a complete training set as input. In practice, data can have missing values, and data missing patterns can depend on group attributes. We theoretically analyze different sources of discrimination risk when training models with an imputed dataset. We then propose a classification approach based on decision trees that integrates classification and imputation, thus circumventing fairness risks that may appear when performing data imputation and classification separately.
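Schematically, the model projection problem in the first part has the form

    \min_{Q \in \mathcal{C}} \; D_f(Q \,\|\, P),
    \qquad
    D_f(Q \,\|\, P) = \mathbb{E}_{P}\!\left[ f\!\left( \tfrac{dQ}{dP} \right) \right],

where P is the conditional distribution defined by the original classifier, \mathcal{C} is a convex set of conditional distributions (e.g., those satisfying group-fairness constraints), and the minimizer is the "tilted" post-processing of P referred to above; the exact formulation and divergence direction follow the talk.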


Bio: Flavio P. Calmon is an Assistant Professor of Electrical Engineering at the Harvard John A. Paulson School of Engineering and Applied Sciences. Before joining Harvard, he was the inaugural Data Science for Social Good Post-Doctoral Fellow at IBM Research in Yorktown Heights, New York. He received his Ph.D. in  Electrical Engineering and Computer Science at MIT. His research develops information-theoretic tools for responsible, reliable, and rigorous machine learning. 

Prof. Calmon has received the NSF CAREER award, faculty awards from Google, IBM, Oracle, and Amazon, the NSF-Amazon Fairness in AI award, the Harvard Data Science Initiative Bias2 award, and the Harvard Dean of Undergraduate Studies Commendation for "Extraordinary Teaching during Extraordinary Times." He also received the inaugural Título de Honra ao Mérito (Honor to the Merit Title) given to alumni from the Universidade de Brasília (Brazil), being the first awardee in the areas of engineering, computer science, mathematics, and statistics.

   Christina Yu

     Assistant Professor

                    Cornell

        9th  September

                  2022

Sequential Fair Allocation - Achieving the Optimal Envy-Efficiency Tradeoff Curve

 Video:  Link 

Abstract: We consider the problem of dividing limited resources to individuals arriving over T rounds with a goal of achieving fairness across individuals. In general there may be multiple resources and multiple types of individuals with different utilities. A standard definition of 'fairness' requires an allocation to simultaneously satisfy envy-freeness and pareto efficiency. However, in the online sequential setting, the social planner must decide on a current allocation before the downstream demand is realized, such that no policy can guarantee these desiderata simultaneously with probability 1, requiring a modified metric of measuring fairness for online policies. We show that in the online setting, the two desired properties (envy-freeness and efficiency) are in direct contention, in that any algorithm achieving additive counterfactual envy-freeness up to a factor of L_T necessarily suffers an efficiency loss of at least 1 / L_T.  We complement this uncertainty principle with a simple algorithm, HopeGuardrail, which allocates resources based on an adaptive threshold policy and is able to achieve any fairness-efficiency point on this frontier.  Our result is the first to provide guarantees for fair online resource allocation with high probability for multiple resource and multiple type settings.  In simulation results, our algorithm provides allocations close to the optimal fair solution in hindsight, motivating its use in practical applications as the algorithm is able to adapt to any desired fairness efficiency trade-off. This is joint work with Sean Sinclair and Siddhartha Banerjee.


Bio: Christina Lee Yu is an Assistant Professor at Cornell University in the School of Operations Research and Information Engineering. Prior to Cornell, she was a postdoc at Microsoft Research New England. She received her PhD and MS in Electrical Engineering and Computer Science from Massachusetts Institute of Technology, and she received her BS in Computer Science from California Institute of Technology. She received honorable mention for the 2018 INFORMS Dantzig Dissertation Award, and she is a recipient of the 2021 Intel Rising Stars Award and 2021 JPMorgan Faculty Research Award. Her research interests include algorithm design and analysis, high dimensional statistics, inference over networks, sequential decision making under uncertainty, online learning, and network causal inference.

    Michelle Effros

            Professor 

                    Caltech

           29th  April

               2022

New Tools for Random Access Communication

 Video:  Link 

Abstract: The random access channel model captures scenarios experienced by WiFi hotspots and cell phone towers, where a single receiver is tasked with decoding the transmissions from an unknown number of independent transmitters. The fact that the number of transmitters is unknown to both the transmitters and the receiver makes coding difficult since the best possible codes and even the rates at which those codes communicate vary with the number of active transmitters.  Current strategies for communicating over such channels typically either sacrifice performance for simplicity or pay a heavy price in overhead to eliminate transmitter-set uncertainty.  As more and more people and devices connect to the internet wirelessly, random access channels become an increasingly critical bottleneck to efficient and reliable communication.

This talk considers new methods for tackling random access communication, focusing on the competing goals of building practical codes and achieving the best possible performance. Central results include new coding strategies and bounds to capture some of their underlying complexity-performance tradeoffs.


Bio: Michelle Effros is the Vice Provost and George Van Osdol Professor of Electrical Engineering at the California Institute of Technology. She received her Ph.D. in Electrical Engineering from Stanford University in 1994 and joined Caltech the same year. Her research interests are primarily in the area of information theory for networks of communicating devices -- with particular interest in developing tools for understanding large networks traditionally considered impenetrable to information theoretic techniques. Prof. Effros has received a number of awards and fellowships including the NSF CAREER Award, the Charles Lee Powell Foundation Award, the Richard Feynman-Hughes Fellowship, an Okawa Research Grant, a citation by Technology Review as one of the world's top young innovators, and a Communication and Information Theory Society Joint Paper Award. She is a fellow of IEEE and a member of Tau Beta Pi, Phi Beta Kappa, and Sigma Xi. She served as President of the IEEE Information Theory Society in 2015 and has served on a large number of publications committees, technical program committees, and advisory boards.

    Pratik Chaudhari

            Professor 

                    UPenn

           22nd  April

               2022

Does the Data Induce Capacity Control in Deep Learning?

 Video:  Link 

Abstract: Accepted statistical wisdom suggests that the larger the model class, the more likely it is to overfit the training data. And yet, deep networks generalize extremely well. The larger the deep network, the better its accuracy on new data. This talk seeks to shed light upon this apparent paradox. We will argue that deep networks are successful because of a characteristic structure in the space of learning tasks. The input correlation matrix for typical tasks has a peculiar (“sloppy”) eigenspectrum where, in addition to a few large eigenvalues (salient features), there are a large number of small eigenvalues that are distributed uniformly over exponentially large ranges. This structure in the input data is strongly mirrored in the representation learned by the network. A number of quantities such as the Hessian, the Fisher Information Matrix, as well as others such as activation correlations and Jacobians, are also sloppy. Even if the model class for deep networks is very large, there is an exponentially small subset of models (in the number of data) that fit such sloppy tasks. This talk will demonstrate the first analytical non-vacuous generalization bound for deep networks. We will also discuss an application of these concepts that gives new algorithms for semi-supervised learning.
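The "sloppiness" claim is easy to probe empirically: for a dataset X of shape (n_samples, n_features), one can inspect the eigenvalues of the input correlation matrix and check whether, beyond a few salient directions, they spread roughly uniformly across many decades on a log scale (the function name is hypothetical).

    import numpy as np

    def input_eigenspectrum(X):
        # Eigenvalues of the input correlation matrix, largest first; plot them
        # on a log scale to see the few salient features plus the long sloppy tail.
        C = X.T @ X / X.shape[0]
        return np.sort(np.linalg.eigvalsh(C))[::-1]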

Bio: Pratik Chaudhari is an Assistant Professor of Electrical and Systems Engineering and Computer and Information Science at the University of Pennsylvania. He is a member of the GRASP Laboratory. From 2018--19, he was a Senior Applied Scientist at Amazon Web Services and a Postdoctoral Scholar in Computing and Mathematical Sciences at Caltech. Pratik received his PhD (2018) in Computer Science from UCLA, his Master's (2012) and Engineer's (2014) degrees in Aeronautics and Astronautics from MIT. He was a part of NuTonomy Inc. (now Hyundai-Aptiv Motional) from 2014--16. He received the NSF CAREER award in 2022.

    Ekram Hossain

            Professor 

         U of Manitoba

           15th  April

               2022

Stochastic Multi-Armed Bandits with Knapsack and Its Application in Edge Computing Networks

 Video:  Link 

Abstract: Multi-armed bandits (MAB) is a popular sequential decision making technique ideal for decision making under uncertainty given no prior knowledge of the environment. It uses the history of previous decisions and observations as well as side information, if available, to arrive at the current decision. Classic MAB algorithms such as the upper confidence bound (UCB) algorithm are concerned with learning the single optimal action among a set of candidate actions with unknown rewards. Different from traditional bandits, bandits with knapsacks (BwK) can model more sophisticated distributed decision-making problems under global constraints. Starting with the basics of stochastic MAB models and the UCB algorithm, in this talk, I shall discuss a BwK model and show its application to the server selection problem in an edge computing system. Time permitting, I will also discuss a linear contextual bandit with knapsack model for the same problem.
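As background for the talk, the classic UCB rule mentioned above can be sketched in a few lines; the reward interface pull(a) and all constants are hypothetical, and the knapsack (BwK) and contextual variants discussed in the talk add global resource constraints on top of this.

    import numpy as np

    def ucb1(pull, n_arms, horizon):
        # Pull each arm once, then always pull the arm with the largest
        # empirical mean plus an exploration bonus.
        counts = np.zeros(n_arms)
        means = np.zeros(n_arms)
        for t in range(1, horizon + 1):
            if t <= n_arms:
                a = t - 1
            else:
                a = int(np.argmax(means + np.sqrt(2 * np.log(t) / counts)))
            r = pull(a)                                  # observed reward in [0, 1]
            counts[a] += 1
            means[a] += (r - means[a]) / counts[a]       # running average
        return means, counts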

Bio:  Ekram Hossain (IEEE Fellow) is a Professor and Associate Head of Graduate Studies in the Department of Electrical and Computer Engineering at University of Manitoba, Winnipeg, Canada. He is a Member (Class of 2016) of the College of the Royal Society of Canada, a Fellow of the Canadian Academy of Engineering, and a Fellow of the Engineering Institute of Canada. Dr. Hossain's current research interests include design, analysis, and optimization of wireless communication networks (with emphasis on beyond 5G/6G cellular wireless networks), applied machine learning, game theory, and network economics (http://home.cc.umanitoba.ca/~hossaina). He was elevated to an IEEE Fellow “for contributions to spectrum management and resource allocation in cognitive and cellular radio networks". He was listed as a Clarivate Analytics Highly Cited Researcher in Computer Science in 2017, 2018, 2019, 2020, and 2021. Dr. Hossain has won several research awards including the “2017 IEEE Communications Society Best Survey Paper Award” and the “2011 IEEE Communications Society Fred Ellersick Prize Paper Award”. He received the 2017 IEEE ComSoc TCGCC (Technical Committee on Green Communications & Computing) Distinguished Technical Achievement Recognition Award “for outstanding technical leadership and achievement in green wireless communications and networking”. He served as the Editor-in-Chief of IEEE Press (2018-2021), the IEEE Communications Society (ComSoc) Director of Magazines (2020-2021), and the Editor-in-Chief of the IEEE Communications Surveys and Tutorials (2012-2016). He was an elected Member of the Board of Governors of the IEEE ComSoc (2018-2020). Currently, he serves as the Technical Program Committee Chair for the IEEE International Conference on Communications 2022 (ICC'22), an Editor of the IEEE Transactions on Mobile Computing, and the Director of Online Content (2022-2023) for the IEEE ComSoc.

        Piya Pal

   Associate Professor 

                  UCSD

           8th  April

               2022

Super-resolution with Binary Constraints: Theory, Algorithm and Sensing Strategies

 Video:  Link 

Abstract: The problem of super-resolving the “details” of a signal or image of interest from low-resolution measurements (where the high frequency contents of the signal/image are severely attenuated) has been extensively studied both theoretically and from an application perspective. It is well-known that even in the absence of noise, the problem of super-resolution is an “ill-posed” inverse problem and the forward model cannot be simply ‘inverted’ to recover the desired image. The mathematical theory of super-resolution therefore focuses on leveraging prior knowledge about the signal/image class (often described in terms of sparsity) in order to develop theoretical guarantees under which it is possible to solve this problem, and attain stable reconstruction (modulo noise amplification factors).

In this work, we explore the role of finite-valued (specifically, binary-valued) priors in super-resolution. The study of finite-valued signals is primarily inspired by neural spike deconvolution (where the underlying high-rate spiking activity is binary-valued), but also applies to a wider range of applications such as discrete tomography, medical imaging, astronomical imaging, and image segmentation. We will show that binary constraints offer surprisingly stronger identifiability guarantees than ‘sparsity’, even allowing us to operate in “extreme compression” regimes, where the number of measurements can be much smaller than the sparsity level. Instead of ‘relaxing’ the binary constraints, we advocate “no-relaxation” strategies for super-resolution, which can operate in such extreme compressive regimes by explicitly imposing binary constraints. A central idea in overcoming the computational challenges associated with enforcing such binary constraints is the design of certain structured filter-dependent sampling/sensing strategies. This gives rise to a new idea of algorithm-measurement co-design, where the measurement matrix is designed as a function of the ‘filtering kernel’ such that the recovery of binary signals with arbitrary sparsity is possible by using computationally efficient algorithms. Finally, we demonstrate the benefits of binary constraints in a concrete application of spike deconvolution from real calcium imaging data.

(Joint work with my Ph.D. student, Pulak Sarangi)


Bio: Piya Pal is an Associate Professor of Electrical and Computer Engineering at the University of California, San Diego, where she is also a founding faculty member of the Halicioglu Data Science Institute (HDSI). She obtained her B.Tech in Electronics and Electrical Communication Engineering from IIT Kharagpur, India in 2007, and her Ph.D. in Electrical Engineering from Caltech in 2013, supervised by Prof. P. P. Vaidyanathan. Her Ph.D. thesis was awarded the 2014 Charles and Ellen Wilts Prize for Outstanding Doctoral Thesis in Electrical Engineering at Caltech. Her research interests include signal representation and sampling techniques for high-dimensional signal/data processing, (sparse) sensor arrays and computational sensing, mathematical foundations of super-resolution imaging, and optimization and machine learning for inverse problems. Her research has been recognized by several awards, including the 2020 IEEE Signal Processing Society Pierre-Simon Laplace Early Career Technical Achievement Award, 2019 US Presidential Early Career Award for Scientists and Engineers (PECASE), 2019 Office of Naval Research Young Investigator Program (ONR YIP) award, 2016 NSF CAREER Award, and several Student Paper Awards including the Best Student Paper Award for her Ph.D. students at the 2017 IEEE ICASSP and 2019 IEEE CAMSAP conferences. For her contributions to teaching, she received the ECE Best Graduate Teaching Award at UC San Diego in 2017 and 2018. She has served on the IEEE SAM and SPTM Technical Committees, and EURASIP Signal Processing for Multisensor Systems Technical Committee. She is currently serving as an Associate Editor for the IEEE Signal Processing Magazine.

  Chinmay Hegde

   Assistant Professor 

                  NYU

           1st  April

               2022

Designing Neural Networks for Efficient Encrypted Inference

 Video:  Link 

Abstract: As deep neural networks become ever more pervasive, so too are concerns surrounding users' data privacy. Curiously, standard cryptographic encryption approaches for guaranteeing data privacy do not interact well with traditional neural network models. In this talk, I will (a) outline why standard networks are not encryption-efficient, (b) suggest two new approaches for designing deep networks that do support efficient and secure inference, (c) show results instantiating these approaches on real-world use cases, and (d) discuss theoretical approaches for understanding the limits of private inference. 

Bio: Chinmay Hegde is an Assistant Professor at NYU, jointly appointed with the CSE and ECE Departments. His research focuses on foundational aspects of machine learning (such as reliability, robustness, and computational efficiency). He also works on applications including computational imaging, materials design, and cybersecurity. He is a recipient of the NSF CAREER and CRII awards, the Black and Veatch Faculty Fellowship, multiple teaching awards, and best paper awards at ICML, SPARS, and MMLS.

     Laura Balzano

   Associate Professor 

                  UMich

25th March
2022

Preference Modeling with Context-Dependent Salient Features

 Video:  Link 

Abstract: This talk considers the preference modeling problem and addresses the fact that pairwise comparison data often reflects irrational choice, e.g. intransitivity. Our key observation is that two items compared in isolation from other items may be compared based on only a salient subset of features. Formalizing this idea, I will introduce our proposal for a “salient feature preference model” and discuss sample complexity results for learning the parameters of our model and the underlying ranking with maximum likelihood estimation. I will also provide empirical results that support our theoretical bounds, illustrate how our model explains systematic intransitivity, and show in this setting that our model is able to recover both pairwise comparisons and rankings for unseen pairs or items. Finally I will share results on two data sets: the UT Zappos50K data set and comparison data about the compactness of legislative districts in the US.

This is joint work with Amanda Bower, who is now at Twitter.

Bio: Laura Balzano is an associate professor of Electrical Engineering and Computer Science at the University of Michigan. She has a PhD from the University of Wisconsin in ECE. She is a recipient of the NSF CAREER Award, the ARO Young Investigator Award, the AFOSR Young Investigator Award, and faculty fellowships from Intel and 3M. She is currently serving as associate editor of the IEEE Open Journal of Signal Processing and the SIAM Journal on Mathematics of Data Science. Her main research focus is on modeling and optimization with big, messy data — highly incomplete or corrupted data, uncalibrated data, and heterogeneous data — and its applications in a wide range of scientific problems.

     Marco Mondelli

         Assistant Professor 

                   IST Austria

18th March
2022

Understanding Gradient Descent for Over-parameterized Deep Neural Networks

 Video:  Link 

Abstract: Training a neural network is a non-convex problem that exhibits spurious and disconnected local minima. Yet, in practice neural networks with millions of parameters are successfully optimized using gradient descent methods. In this talk, I will give some theoretical insights on why this is possible and discuss two approaches to study the behavior of gradient descent. The first one takes a mean-field view and it relates the dynamics of stochastic gradient descent (SGD) to a certain Wasserstein gradient flow in probability space. I will show how this idea allows to study the connectivity, convergence and implicit bias of the solutions found by SGD. The second approach consists in the analysis of the Neural Tangent Kernel. I will present tight bounds on its smallest eigenvalue and show their implications on memorization and optimization in deep networks.
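For reference, the two objects mentioned above can be written (in standard notation, not the talk's) as follows: in the mean-field view a wide two-layer network is described by a measure \rho over neuron weights, and SGD corresponds to a Wasserstein gradient flow of the risk over \rho, while the Neural Tangent Kernel is the Gram matrix of parameter gradients whose smallest eigenvalue governs memorization and optimization:

    f(x; \rho) = \int \sigma(x; w)\, \rho(dw),
    \qquad
    K_\theta(x, x') = \big\langle \nabla_\theta f(x; \theta),\ \nabla_\theta f(x'; \theta) \big\rangle .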

Based on joint work with Adel Javanmard, Vyacheslav Kungurtsev, Andrea Montanari, Guido Montufar, Quynh Nguyen, and Alexander Shevchenko.

Bio: Marco Mondelli received the B.S. and M.S. degrees in Telecommunications Engineering from the University of Pisa, Italy, in 2010 and 2012, respectively. In 2016, he obtained his Ph.D. degree in Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. He is currently an Assistant Professor at the Institute of Science and Technology Austria (IST Austria). Prior to that, he was a Postdoctoral Scholar in the Department of Electrical Engineering at Stanford University, USA, from February 2017 to August 2019. He was also a Research Fellow with the Simons Institute for the Theory of Computing, UC Berkeley, USA, for the program on Foundations of Data Science from August to December 2018. His research interests include data science, machine learning, information theory, wireless communication systems, and modern coding theory. He was the recipient of a number of fellowships and awards, including the Jack K. Wolf ISIT Student Paper Award in 2015, the STOC Best Paper Award in 2016, the EPFL Doctorate Award in 2018, the Simons-Berkeley Research Fellowship in 2018, the Lopez-Loreta Prize in 2019, and the Information Theory Society Best Paper Award in 2021.

     Thinh T. Doan

    Assistant Professor 

             Virginia Tech

4th March
2022

Two-Time-Scale Stochastic Optimization and Its Application in Reinforcement Learning

 Video:  Link 

Abstract: Online policy gradient algorithms for reinforcement learning (RL), commonly referred to as “actor-critic” algorithms, can be re-cast as a two-time-scale stochastic approximation with a specific type of stochastic oracle for gradient evaluations. In this talk, I will present our recent work, where we consider a simple actor-critic-like algorithm solving general optimization problems with this same form, and give convergence guarantees for different types of assumed structural properties of the function being optimized. Our abstraction unifies the analysis of actor-critic method in RL, and we show how our main results reproduce the best-known convergence rates for the general policy optimization problem and how they can be used to derive a state-of-the-art rate for the online linear-quadratic regulator (LQR) controllers.
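In standard form, a two-time-scale stochastic approximation couples a fast iterate (the critic-like estimate) and a slow iterate (the actor-like decision variable), with noisy updates

    x_{k+1} = x_k + \alpha_k\, f(x_k, y_k; \xi_k),
    \qquad
    y_{k+1} = y_k + \beta_k\, g(x_k, y_k; \zeta_k),
    \qquad
    \beta_k / \alpha_k \to 0,

so the slow iterate effectively sees the fast one near its equilibrium; the convergence rates in the talk are stated under different structural assumptions on the function being optimized.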

Bio: Thinh T. Doan is an Assistant Professor in the Department of Electrical and Computer Engineering at Virginia Tech. Before joining Virginia Tech, he was an NSF TRIAD postdoctoral fellow at Georgia Tech. He obtained his Ph.D. degree at the University of Illinois, Urbana-Champaign, his master's degree at the University of Oklahoma, and his bachelor's degree at Hanoi University of Science and Technology, Vietnam, all in Electrical Engineering. His research interests span the intersection of control theory, optimization, game theory, machine learning, reinforcement learning, and applied probability theory.

Gesualdo Scutari

   Professor 

                 Purdue

25th February
2022

Bringing Statistical Thinking into Distributed Optimization: Vignettes from Statistical Inference over Networks

 Video: Link 

Abstract: There is growing interest in solving large-scale statistical machine learning problems over decentralized networks, where data are distributed across the nodes of the network and no centralized coordination is present (we termed these systems “mesh” networks). Modern massive datasets create a fundamental problem at the intersection of the computational and statistical sciences: how to provide guarantees on the quality of statistical inference given bounds on computational resources, such as time and communication efforts? While statistical-computation tradeoffs have been largely explored in the centralized setting, our understanding over mesh networks is limited: (i) distributed schemes, designed and performing well in the classical low-dimensional regime, can break down in the high-dimensional case; and (ii) existing convergence studies may fail to predict algorithmic behaviors; some are in fact confuted by experiments. This is mainly due to the fact that the majority of distributed algorithms  have been designed and studied only from the optimization perspective, lacking the statistical dimension. This talk will discuss some vignettes from  high-dimensional statistical inference suggesting  new analyses aiming at bringing statistical thinking in distributed optimization.

Bio: Gesualdo Scutari is the Thomas and Jane Schmidt Rising Star Professor with the School of Industrial Engineering and Electrical and Computer Engineering (by courtesy) at Purdue University, West Lafayette, IN, USA. His research interests include continuous and distributed optimization, equilibrium programming, and their applications to signal processing and machine learning. Dr. Scutari is a Senior Area Editor for the IEEE Transactions on Signal Processing, and an Associate Editor of SIAM Journal on Optimization. Among others, he received the 2013 NSF CAREER Award, the 2015 IEEE Signal Processing Society Young Author Best Paper Award, and the 2021 IEEE Signal Processing Society Best Paper Award. He is a Fellow of the IEEE.

Maxim Raginsky

   Associate Professor 

                    UIUC

18th February
2022

On Some Information-Theoretic Aspects of Generative Adversarial Models 

 Video: Link 

Abstract: The term ‘probabilistic generative model’ refers to any process by which a sample from a target probability measure on some high-dimensional space is produced by applying a deterministic transformation to a sample from a fixed probability measure on some latent space. Generative adversarial models have been proposed recently as a methodology for learning this transformation (referred to as the generator) on the basis of samples from the target measure, where the goodness of the generator is assessed by another deterministic transformation (the discriminator). Both the generator and the discriminator are learned jointly, in the min-max fashion, where the generator attempts to minimize some empirical measure of closeness between the target and the model, while the discriminator attempts to optimally distinguish between the two. In this talk, I will show that one can examine the capabilities of such models through an information-theoretic lens: Consider a binary-input channel, where the transmission of 0 produces a sample from the target measure, while the transmission of 1 produces a sample from the generator, and the discriminator acts as a decoder. One can then characterize the capabilities of the generator using information-theoretic converses, while the performance of the discriminator can be quantified using achievability arguments. I will present several illustrations of the utility of this approach in the context of generative adversarial nets (GANs).
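For concreteness, the standard GAN objective referred to here is the min-max problem

    \min_{G} \max_{D} \; \mathbb{E}_{x \sim P_{\mathrm{target}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim P_Z}\big[\log\big(1 - D(G(z))\big)\big],

and the channel analogy sends bit 0 through a sample from P_target and bit 1 through a sample from the generator's distribution, with the discriminator playing the role of the decoder; its error probability can then be bounded by converse and achievability arguments.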

Bio:  Maxim Raginsky received the B.S., M.S., and Ph.D. degrees in electrical engineering from Northwestern University, Evanston, IL, USA, in 2000, 2000, and 2002, respectively. He has held research positions with Northwestern University, the University of Illinois at Urbana–Champaign, Urbana, IL, USA, where he was a Beckman Foundation Fellow, from 2004 to 2007, and Duke University, Durham, NC, USA. In 2012, he returned to UIUC, where he is currently a William L. Everitt Fellow and an Associate Professor with the Department of Electrical and Computer Engineering and the Coordinated Science Laboratory. Dr. Raginsky was the recipient of the Faculty Early Career Development (CAREER) Award from the National Science Foundation in 2013. He has served on editorial boards of IEEE Transactions on Information Theory, Foundations and Trends in Communications and Information Theory, and IEEE Transactions on Network Science and Engineering. He is currently a member of the editorial boards of Journal of Machine Learning Research, SIAM Journal on Mathematics of Data Science, and Mathematics of Control, Signals, and Systems. His research interests are in probability and stochastic processes, deterministic and stochastic control, machine learning, optimization, and information theory.

Vincent Tan

Professor 

                           NUS

11th February
2022

Towards Minimax Optimal Best Arm Identification in Linear Bandits

 Video: Link 

Abstract: We study the problem of best arm identification in linear bandits in the fixed-budget setting. By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm, Optimal Design-based Linear Best Arm Identification (OD-LinBAI). We provide a theoretical analysis of the failure probability of OD-LinBAI. While the performances of existing methods (e.g., BayesGap) depend on all the optimality gaps, OD-LinBAI depends on the gaps of the top d arms, where d is the effective dimension of the linear bandit instance. Furthermore, we present a minimax lower bound for this problem. The upper and lower bounds show that OD-LinBAI is minimax optimal up to multiplicative factors in the exponent. Finally, numerical experiments corroborate our theoretical findings.
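For readers unfamiliar with it, the G-optimal design used in the allocation rule is (in standard notation) the distribution over arms

    \lambda^\star \in \arg\min_{\lambda \in \Delta(\mathcal{A})} \; \max_{a \in \mathcal{A}} \; \|a\|^2_{A(\lambda)^{-1}},
    \qquad
    A(\lambda) = \sum_{x \in \mathcal{A}} \lambda_x\, x x^{\top},

which minimizes the worst-case variance of least-squares estimates of the arms' mean rewards; OD-LinBAI allocates its fixed budget according to (a rounded version of) such a design, as detailed in the talk.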

This is joint work with Junwen Yang (Institute of Operations Research and Analytics, NUS).

Bio:  Vincent Y. F. Tan (S'07-M'11-SM'15) was born in Singapore in 1981. He received the B.A. and M.Eng. degrees in electrical and information science from Cambridge University in 2005, and the Ph.D. degree in electrical engineering and computer science (EECS) from the Massachusetts Institute of Technology (MIT) in 2011. He is currently a Dean’s Chair Associate Professor with the Department of Electrical and Computer Engineering and the Department of Mathematics, National University of Singapore (NUS). His research interests include information theory, machine learning, and statistical signal processing.


Dr. Tan is a member of the IEEE Information Theory Society Board of Governors. He was an IEEE Information Theory Society Distinguished Lecturer from 2018 to 2019. He received the MIT EECS Jin-Au Kong Outstanding Doctoral Thesis Prize in 2011, the NUS Young Investigator Award in 2014, the Singapore National Research Foundation (NRF) Fellowship (Class of 2018), and the NUS Young Researcher Award in 2019. A dedicated educator, he was also awarded the Engineering Educator Award in 2020 and 2021. He is currently serving as an Associate Editor for the IEEE Transactions on Signal Processing and as an Associate Editor in Machine Learning and Statistics for the IEEE Transactions on Information Theory.

Ken Duffy

Professor 

       Hamilton Institute

4th February
2022

Guessing Random Additive Noise Decoding

 Video: Link 

Abstract: Shannon's 1948 opus established that the highest rate that a noisy channel can support is achieved as error correcting codes become long. Since 1978 it has been known that Maximum Likelihood (ML) decoding of linear codes is NP-complete. Those results drove the paradigm of co-designing restricted classes of codebooks with code-specific methods that exploit code-structure to enable computationally efficient approximate-ML decoding for long, high-redundancy codes. Contemporary applications, including augmented reality, vehicle-to-vehicle communications, and the Internet of Things, are driving demand for Ultra-Reliable Low-Latency Communication (URLLC). Realizing URLLC technologies requires shorter codes, vacating the computational complexity issues associated with long codes and motivating revisiting the possibility of creating practical, accurate universal decoders.

In this talk, we introduce Guessing Random Additive Noise Decoding (GRAND), a universal ML decoder suitable for use with any moderate redundancy code of any length. Mathematically, GRAND offers a new approach to establishing capacity and error exponent results. In practice, despite being first published in 2018, it has already resulted in circuit designs and a taped-out chip that demonstrate its suitability and efficiency in hardware. We explain the theoretical rationale behind GRAND, recent hard- and soft-detection developments, and future possibilities.
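A minimal hard-detection sketch of the guessing idea, for a binary linear code with parity-check matrix H and assuming a binary symmetric channel so that lower-weight noise patterns are more likely (the abandonment threshold and function name are hypothetical):

    import numpy as np
    from itertools import combinations

    def grand_decode(y, H, max_weight=4):
        # Test putative noise patterns z in order of decreasing likelihood
        # (here: increasing Hamming weight) and return the first y XOR z that
        # satisfies the parity checks, i.e. the ML codeword on a BSC.
        n = len(y)
        for w in range(max_weight + 1):
            for idx in combinations(range(n), w):
                z = np.zeros(n, dtype=int)
                z[list(idx)] = 1
                c = (y + z) % 2
                if not ((H @ c) % 2).any():       # zero syndrome => codeword
                    return c, z
        return None, None                         # abandon guessing

Soft-detection variants reorder the noise guesses using reliability information, which is part of what the talk covers.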

The talk is based on joint work with Muriel Medard (MIT), with the circuits work performed in collaboration with Rabia Yazicigil (BU).


Bio:  Ken R. Duffy is a Professor of Applied Probability and the Director of the Hamilton Institute, an interdisciplinary research centre with 40 affiliated faculty at the National University of Ireland Maynooth. He is one of three co-Directors of the Science Foundation Ireland Centre for Research Training in Foundations of Data Science, which is supported by 16 thematically diverse enterprise alliance partners and funds more than 90 PhD students. 


He obtained a B.A. in mathematics in 1996 and a Ph.D. in probability theory in 2000, both awarded by Trinity College Dublin. He works in highly collaborative, multi-disciplinary teams to design, analyse, and realise algorithms for communication systems and the life sciences using tools from probability, statistics, and machine learning. Algorithms he has developed have been implemented in digital circuits and in DNA.

 

He is a co-founder of the Royal Statistical Society's Applied Probability Section (2011), co-authored a cover article of Trends in Cell Biology (2012), is a winner of a best paper award at the IEEE International Conference on Communications (2015), the best paper award from IEEE Transactions on Network Science and Engineering (2019), and the best research demo award from COMSNETS (2022).

Ashif Iquebal

Assistant Professor 

                                    ASU

3rd December
2021

Active Learning for Regression Problems

Video: Link 

Abstract: The past decade has witnessed widespread adoption of machine learning by the manufacturing and materials community. However, most of the existing works have relied on vast amounts of experimental data available from past experiments. With the emphasis shifting towards novel materials and new manufacturing processes, traditional passive experimental design methods are not suited to exploring high-dimensional search spaces. Active learning, in contrast, provides an adaptive approach to select the most informative experiments, guide the search of high-dimensional spaces, and reduce experimental costs.

In this work, we will focus on active learning in regression problems, where the objective is to learn the underlying black-box function with as few samples as possible. In the first half, we will discuss the exploration-exploitation issues in the regression setting and present a hierarchical Bayesian approach to dynamically balance the trade-off as more samples are collected iteratively. Results on simulated case studies and a real-world materials problem are presented. In the second half, we will talk about the application of active learning in approximating the value function for finite horizon partially observable Markov decision processes using an active point-based algorithm. We will discuss the convergence rates and some results on benchmark datasets.


Bio: Ashif Iquebal is an assistant professor of Industrial Engineering in the School of Computing and Augmented Intelligence at ASU. Prior to this, he obtained his Ph.D. from the Department of Industrial and Systems Engineering at Texas A&M University. His research is focused on developing methodological foundations in data science and machine learning, particularly on statistical representation and quantification of high-dimensional data, active learning, and graphical models. He received the Pritsker Doctoral Dissertation Award from the Institute of Industrial and Systems Engineers (IISE) in 2021. In the past, his research papers were recognized as winners/finalists for five best student paper/poster awards at INFORMS, IISE, and the American Statistical Association conferences.

Duong Nguyen

Assistant Professor 

                                    ASU

19th November
2021

Robust service placement and workload allocation in edge computing

Video: Link 

Abstract: Edge computing promises to offer low-latency and ubiquitous computation to numerous mobile and Internet of Things devices. Thus, it can complement the cloud to deliver a superior user experience, reduce network traffic, and enable various IoT applications. How to jointly optimize the service placement, sizing, and workload allocation decisions in an edge-computing system is an important and challenging problem, which becomes even more complicated when considering numerous system uncertainties. In this talk, we will study this problem from the perspective of a service provider, who can procure resources from numerous edge nodes to improve the user experience while minimizing its cost. We propose and formulate a novel two-stage adaptive robust optimization model to help the service provider optimally determine the placement and sizing decisions that can hedge against any possible realization of the uncertain demand and unpredictable node failures. We then extend it to a two-stage multi-period robust model with integer recourse to examine the benefits of considering dynamic service placement as well as spatial-temporally correlated uncertainties.
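Schematically, the two-stage adaptive robust model has the familiar min-max-min form

    \min_{x \in \mathcal{X}} \; c^{\top} x \;+\; \max_{u \in \mathcal{U}} \; \min_{y \in \mathcal{Y}(x, u)} \; d^{\top} y,

where x collects the here-and-now service placement and sizing decisions, the uncertainty set \mathcal{U} models possible demand realizations and node failures, and y is the recourse workload-allocation decision; the multi-period extension with integer recourse mentioned above enriches both stages.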

Bio: Duong Nguyen is an assistant professor of electrical engineering at Arizona State University. He received his doctorate in electrical and computer engineering from the University of British Columbia in 2020. His research lies at the intersection of operations research, AI, economics, and engineering, with a specific focus on developing new mathematical models and techniques for decision-making and economic analysis of large-scale networked systems such as cloud/edge computing, EV charging networks, intelligent transportation, and crowdsourcing.

Lalit Jain

Assistant Professor 

                              U Wash

12th November
2021

Planning for Best Arm Identification

Video: Link 

Abstract: Scientific discovery is driven by the researcher's ability to collect high-quality data relevant to either verifying or disproving a hypothesis as quickly as possible. In recent years, a paradigm addressing this problem known as adaptive experimental design (AED) has been gaining traction. AED uses past measurements to inform the researcher what future measurements they should collect in a closed loop. In this talk, we show how AED can be applied to provide matching upper and lower bounds for the problem of best arm identification for linear bandits. As we will discuss, best-arm identification is a general framework that encapsulates problems such as stochastic shortest path, active classification, and linear dueling bandits. The AED approach to best-arm identification leads to improved results in all of these problem settings.

Bio: Lalit K Jain is an assistant professor in the Foster School of Business. His research is focused on the theory and implementation of machine learning algorithms for large-scale data collection, with an emphasis on "human in the loop" and crowdsourcing applications. His work has been applied to a variety of applications including optimizing crowdfunding and microlending platforms, measuring conceptual perception in cognitive psychology, and detecting humor. Prior to joining the Foster School, he did a postdoc at the University of Washington with Professor Kevin Jamieson, a postdoc at the University of Michigan with Professor Anna Gilbert, and a PhD in Mathematics at the University of Wisconsin-Madison advised by Professor Jordan Ellenberg.

Facebook

5th November
2021

Methods for responsible and reliable neural conversational AI

Video: Link 

Abstract: We first give an overview of CAIRaoke, an effort to build neural conversational AI models to power the next generation of task-oriented virtual digital assistants. We then continue with some of the challenges that we faced in training CAIRaoke dialog models, namely noisy data and lack of variations in dialog flow, which prompted us to create new public benchmarks for robustness in task-oriented dialog. We continue with presenting two methods that we have developed to solve these challenges. The first method (TERM) is a simple tweak to the widely used empirical risk minimization framework that can promote noise robustness by suppressing the influence of individual noisy outlier samples. The second method (DAIR) is a simple regularization add-on promoting performance consistency on data augmentation to better generalize to unseen examples.
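
The abstract describes TERM only at a high level. One published form of tilted empirical risk minimization replaces the average loss with a log-sum-exp tilt, where a negative tilt parameter damps large outlier losses; the sketch below assumes that form and uses made-up per-sample losses.

```python
import numpy as np
from scipy.special import logsumexp

def tilted_risk(losses, t):
    """Tilted empirical risk: (1/t) * log(mean_i exp(t * loss_i)).
    t -> 0 recovers the ordinary average; t < 0 down-weights large
    (outlier) losses, while t > 0 emphasizes them."""
    losses = np.asarray(losses, dtype=float)
    if abs(t) < 1e-12:
        return losses.mean()
    return (logsumexp(t * losses) - np.log(len(losses))) / t

per_sample = [0.1, 0.2, 0.15, 5.0]        # one corrupted / outlier sample
print(tilted_risk(per_sample, t=0.0))     # plain ERM: dominated by the outlier
print(tilted_risk(per_sample, t=-2.0))    # negative tilt suppresses the outlier
```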

Bio: Ahmad Beirami is a research scientist at Facebook AI, leading research to power the next generation of virtual digital assistants with AR/VR capabilities. His research broadly involves learning models with robustness and fairness considerations in large-scale systems. Prior to that, he led the AI agent research program for automated playtesting of video games at Electronic Arts. Before moving to industry in 2018, he held a joint postdoctoral fellow position at Harvard & MIT, focused on problems in the intersection of core machine learning and information theory. He is the recipient of the 2015 Sigma Xi Best PhD Thesis Award from Georgia Tech, for his work on the fundamental limits of efficient communication over IoT networks.

Assistant Professor
ASU

29th October
2021

Secure Computation for Contact Tracing and Heatmap Detection

Video: Link 

Abstract: Contact tracing is an essential tool in containing infectious diseases such as COVID-19. It enables users to search over released tokens in order to learn if they have recently been in the proximity of an infectious user. However, prior approaches have several weaknesses, including that they (1) do not provide end-to-end privacy in the collection and querying of tokens, (2) do not utilize the huge volume of data stored in business databases and individual digital devices, or (3) impose heavy bandwidth or computational demands on client devices. In this talk, I will discuss the existing cryptographic attacks and privacy concerns in deploying contact tracing applications, and how secure computation can tackle these problems. Moreover, I will describe our system design to improve the security guarantee and performance of existing contact tracing frameworks. Along with this, I will present our new protocol for delegated private set intersection cardinality (PSI-CA) that allows clients to delegate their contact tracing computation to cloud servers without compromising privacy. I will also show how to use PSI-CA to implement a secure dot product, which is a core building block of heatmap detection.

Bio: Ni Trieu is an Assistant Professor of computer science at Arizona State University. Her research interests are in the area of cryptography and security, with a specific focus on secure computation and its applications such as private set intersection, secure bio-computing, and privacy-preserving machine learning. Before joining ASU, she was a postdoctoral researcher at UC Berkeley. She received her Ph.D. degree from Oregon State University.

Associate Professor
Boston University

22nd October
2021

Detecting Structural Changes in Networks

Video: Link 

Abstract: Consider a probability distribution defined over a graph, and a dataset comprised of either node observations (e.g., a Markov random field) or edge observations (e.g., a stochastic block model). Such distributions are used in a plethora of scenarios to model social, biological, or other network phenomena. For instance, the nodes may represent individual neurons for which we observe noisy versions of their spiking activity. Exciting recent work has sharply characterized the requirements for learning the underlying graph structures from noisy observations, as well as proposed efficient inference algorithms that can approach these information-theoretic limits.

While learning the full network structure is sometimes useful, we are often interested in changes in network structure in response to external stimuli (e.g., changes in neuronal connectivity as a subject learns a task). Moreover, we often do not have sufficient data to estimate the network before and after the stimuli, in order to compare the differences. Is it possible to directly detect changes in network structure? When is this easier than learning the network structure itself? This talk examines this question via several case studies from the perspective of minimax risk. At a high level, we uncover the following phase transition: testing is statistically easier than recovery when the number of changes is large, but comparable to recovery when the number of changes is small, relative to the size of the network.

Joint work with Aditya Gangrade, Praveen Venkatesh, Zeynep Kahraman, and Venkatesh Saligrama.

Bio: Bobak Nazer is an Associate Professor in the ECE Department and a Distinguished Faculty Fellow in the College of Engineering at Boston University. He received his Ph.D. in 2009 and M.S. in 2005 from the University of California, Berkeley, and his B.S. in 2003 from Rice University, all in electrical engineering. He is the recipient of the IEEE Communications Society and Information Theory Society Joint Paper Award, the NSF CAREER award in 2013, and the 2009 Eli Jury Award from the Berkeley EECS Department.

PhD
MIT

15th October
2021

Streaming Estimation with Markovian Data: Limits and Algorithms

Video: Link 

Abstract:  Standard results in machine learning theory provide near optimal algorithms for learning under the assumption of i.i.d. data in many problems of interest. However, data with temporal dependence occur often in practice in the analysis of time series, system identification and reinforcement learning. In these settings, the data is often assumed to be derived from a mixing Markov process and it is important to learn-on-the-go.

Currently, the theoretical analyses of many algorithms reduce learning from these data sets to learning from independent data by keeping only one sample per mixing time and dropping the rest. In this talk, we will first see that this is in fact tight in the worst case, and that naively implementing SGD in these scenarios suffers from slow convergence due to dependencies present in the data. However, in well-specified cases, we can design algorithms that carefully untangle this dependency structure in order to obtain SGD-style streaming algorithms with sample complexity matching the i.i.d. case. To this end, we consider the specific tasks of least squares regression with Markovian data and non-linear system identification, and introduce parallel SGD and SGD with Reverse Experience Replay (SGD-RER), a rigorous form of the popular Experience Replay heuristic used in practical RL. We then sketch an application to the widely used Q-learning algorithm.
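
As a rough schematic of the reverse-replay idea (not the exact algorithm or analysis from the talk), the sketch below runs least-squares SGD over buffers of a Markovian data stream, replaying each buffer in reverse order. The AR(1) covariate model, buffer size, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, buffer_size, lr = 5, 20_000, 50, 0.01
w_true = rng.normal(size=d)

# Markovian covariates: an AR(1) chain, so consecutive samples are dependent.
x, stream = np.zeros(d), []
for _ in range(T):
    x = 0.9 * x + np.sqrt(1 - 0.9 ** 2) * rng.normal(size=d)
    stream.append((x.copy(), x @ w_true + 0.1 * rng.normal()))

w = np.zeros(d)
for start in range(0, T, buffer_size):
    buf = stream[start:start + buffer_size]
    for xi, yi in reversed(buf):          # replay the buffer in reverse order
        w -= lr * (xi @ w - yi) * xi      # plain least-squares SGD step
print("parameter error:", np.linalg.norm(w - w_true))
```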

Bio: Dheeraj Nagaraj is a sixth-year graduate student at the Laboratory for Information and Decision Systems (LIDS) at MIT, advised by Prof. Guy Bresler. His work involves various problems in theoretical machine learning, applied probability, and statistics. His current work focuses on the representation power of deep neural networks, random graphs with latent geometric structure, and stochastic optimization algorithms.

Assistant Professor
UIUC

1st October
2021

Risk-Sensitive Optimization for Electricity Markets

Video: Link 

Abstract: Power system operation is fraught with uncertainties. Electricity markets must evolve to model such uncertainties and optimize available resources against them. In this talk, I will explore algorithm design motivated to tackle risk-sensitive electricity market clearing formulations, where power delivery risk is modeled via the conditional value at risk (CVaR) measure. I will discuss algorithmic architectures and their convergence properties to solve these risk-sensitive optimization problems. The first half of the talk will focus on an optimization problem that can be cast as a large linear program. For this problem, I will discuss an algorithm that shares parallels and differences with Benders’ decomposition. In the second half of this talk, I will consider another risk-sensitive problem for which I will present sample complexity guarantees of a stochastic primal-dual algorithm.
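
For reference, the CVaR measure mentioned above typically enters such formulations through the Rockafellar-Uryasev variational form (generic notation, not taken from the talk):

\mathrm{CVaR}_{\alpha}(Z) \;=\; \min_{t \in \mathbb{R}} \Big\{ t + \tfrac{1}{1-\alpha}\, \mathbb{E}\big[(Z - t)_{+}\big] \Big\}.

This form is jointly convex in t and the decision variables whenever Z depends on them convexly, and with finitely many scenarios the expectation becomes a sum, which is one way such a risk-sensitive clearing problem can be cast as a large linear program.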

Bio: Subhonmesh Bose is an Assistant Professor in the Department of Electrical and Computer Engineering at UIUC. His research focuses on facilitating the integration of renewable and distributed energy resources into the grid edge, leveraging tools from optimization, control and game theory. Before joining UIUC, he was a postdoctoral fellow at the Atkinson Center for Sustainability at Cornell University. Prior to that, he received his MS and Ph.D. degrees from Caltech in 2012 and 2014, respectively. He received the NSF CAREER Award in 2021. His research projects have been supported by grants from NSF, PSERC, Siebel Energy Institute and C3.ai, among others.

Fei Wei

Postdoc
ASU

24th September
2021

Network coding, group network codes, and the edge removal problem

Video: Link 

Abstract: Network coding is a network communication scheme which deploys encoding at intermediate nodes and can significantly improve the network throughput. Beyond the conventional linear network codes, group network codes are a special family of network coding schemes that are equipped with an algebraic (group) structure and are known to be (approximately) optimal. However, not much is known about this special family of codes. In this talk, we will introduce an intriguing problem in the context of network coding, the edge removal problem, which studies the impact of removing a communication edge from a network. We will view this problem through the lens of group network codes, and discuss interesting results regarding this problem.

Bio: Fei Wei is a postdoc at Arizona State University working with Prof. Lalitha Sankar and Prof. Oliver Kosut. He obtained his PhD in Electrical Engineering from the University at Buffalo, the State University of New York, advised by Prof. Michael Langberg, and collaborated with Prof. Michelle Effros from Caltech. He is broadly interested in topics related to network information theory, communication, data security, and privacy.

Professor
Caltech

17th September
2021

OPF Hardness: Theory and Practice

Abstract: Optimal power flow (OPF) problems underlie numerous power system applications. It is well known that OPF is non-convex and NP-hard. It can be solved approximately either via convex relaxations or local algorithms. Even though OPF is hard in theory, it seems "easy" in practice in that, empirically, semidefinite relaxations are often exact and local algorithms often yield global solutions. We summarize several sufficient conditions for exact relaxations in both single- and three-phase radial networks. We describe sufficient or necessary conditions for non-convex problems to simultaneously have exact relaxations and no spurious local optima. These conditions help explain the widespread empirical experience that local algorithms for OPF problems often work well.

Bio: Steven Low is the F. J. Gilloon Professor of the Department of Computing & Mathematical Sciences and the Department of Electrical Engineering at Caltech, and an Honorary Professor of the Electrical & Electronic Engineering Department at Melbourne University. He is a co-recipient of IEEE best paper awards, an awardee of the IEEE INFOCOM Achievement Award and the ACM SIGMETRICS Test of Time Award, and a Fellow of IEEE, ACM, and CSEE. He is well known for his work on Internet congestion control and the semidefinite relaxation of optimal power flow problems in smart grids. His research on networks has been accelerating more than 1TB of Internet traffic every second since 2014. His research on smart grids is providing large-scale, cost-effective electric vehicle charging to workplaces. He received his B.S. from Cornell and PhD from Berkeley, both in EE.

This was a joint talk with IE Decision Systems Engineering Fall ’21 Seminar Series


Regents Professor
Texas A & M University

3rd September
2021

Revisiting Exploration versus Exploitation: Multi-Armed Bandits and Adaptive Control

Abstract: We consider the central problem of exploration versus exploitation that lies at the heart of several dynamic learning problems. We revisit the problem of regret in adaptive control and examine it in light of recent interest in solving large-scale bandit problems. For bandit problems, we present a family of schemes that admits simple index policies whose regret performance appears to be near the best currently available, at low computational complexity per decision. [Joint work with Ping-Chun Hsieh, Yu-Heng Hung, Xi Liu, Akshay Mete, Rahul Singh, Anirban Bhattacharya, and Le Xie].

Bio: P. R. Kumar is a Regents Professor, a University Distinguished Professor, Holder of the O’Donnell Foundation Chair I, and Professor in the Department of Electrical and Computer Engineering at Texas A&M University, and Franklin W. Woeltge Professor Emeritus in the Department of Electrical and Computer Engineering at University of Illinois, Urbana-Champaign. He is an Honorary Professor at IIT Hyderabad.

His current focus includes 5G, Wireless Networks, Cybersecurity, Cyberphysical Systems, Privacy, Unmanned Aerial System Traffic Management, Reinforcement Learning, Machine Learning, and Power Systems.


Assistant Professor
Arizona State University

23rd April
2021

Learning and Adaptation in Millimeter-Wave Networks: a Dual Timescale approach

 Video: Link

Abstract: Mobile broadband data traffic is expected to increase tremendously over the next decade, and cannot be accommodated by current sub-6GHz systems. The millimeter-wave frequencies in the 28-100 GHz range promise to overcome these limitations. Yet, millimeter-wave systems require highly directional beams at transmitter and receiver to overcome the severe pathloss and achieve the promised capacity increase, entailing a large signaling and control overhead. Thus, the design of schemes that learn the propagation environment and mobility of users and adapt to these features to achieve communication-efficient beam-alignment protocols is of utmost importance. In this talk, I will present a learning and adaptation scheme to address this problem, in which the dynamics of the communication beams are learned and then exploited to design adaptive beam-training procedures. Specifically, a dual timescale approach is proposed: on a large timescale, a recurrent deep variational autoencoder (R-VAE) uses noisy beam-training observations to learn a probabilistic model of beam dynamics; on a short timescale, an adaptive beam-training procedure is formulated as a partially observable (PO-) Markov decision process (MDP) and optimized using point-based value iteration (PBVI) by leveraging beam-training feedback and a probabilistic knowledge of beam pairs provided by the R-VAE. In turn, beam-training observations are used to refine the R-VAE via stochastic gradient descent in a continuous process of learning and adaptation. I will conclude this talk by presenting numerical evaluations and comparisons with state-of-the-art algorithms.


Bio: Dr. Nicolò Michelusi (Senior Member, IEEE) received the B.Sc, M.Sc. (both with honors), and Ph.D. degrees from the University of Padova, Italy, in 2006, 2009, and 2013, respectively, and the M.Sc. degree in telecommunications engineering from the Technical University of Denmark in 2009, as part of the T.I.M.E. double degree program. From 2013 to 2015, he was a Postdoctoral Research Fellow at the Ming-Hsieh Department of Electrical Engineering, University of Southern California, and from 2016 to 2020, he was an Assistant Professor at the School of Electrical and Computer Engineering, Purdue University. He is currently an Assistant Professor at the School of Electrical, Computer, and Energy Engineering, Arizona State University. His research, funded by the National Science Foundation and by DARPA, focuses on the design and analysis of distributed wirelessly connected systems using methods from stochastic optimization and machine learning. He authored 25 IEEE Journal papers and more than 50 conference papers. He is an Associate Editor for the IEEE Transactions on Wireless Communications, and a Reviewer for several IEEE journals. He was the Co-Chair for the Distributed Machine Learning and Fog Network workshop at IEEE INFOCOM 2021, the Wireless Communications Symposium at the IEEE Globecom 2020, the IoT, M2M, Sensor Networks, and Ad-Hoc Networking track at IEEE VTC 2020, and the Cognitive Computing and Networking symposium at ICNC 2018. He received the NSF CAREER award in 2021.

Postdoc

Arizona State University

16th April
2021

Network Theoretic Analysis of Maximum a Posteriori Detectors for Optimal Input Detection

 Video: Link

Abstract: In this talk, I will discuss maximum-a-posteriori (MAP) detectors for detecting unknown stochastic inputs (e.g., malicious attacks) driving specific nodes of a network, using noisy measurements from sensors non-collocated with the input nodes. Starting from a brief introduction to statistical hypothesis testing on state-space systems, I will discuss the key ideas that we leveraged from the theory of Toeplitz operators to obtain closed-form expressions for the performance of MAP detectors. Next, I will discuss how these expressions help us study the qualitative behavior of the detectors' performance as a function of network topology and the locations of the input and sensor nodes. Finally, I will highlight a counterintuitive result: for some classes of networks, the detectors' performance deteriorates as the graphical distance between the input nodes and the sensors increases. Our results provide structural insights into sensor placement from a detection-theoretic viewpoint.

Bio: Rajasekhar Anguluri is a post-doc at Arizona State University, working with Dr. Lalitha Sankar, Dr. Oliver Kosut, and Dr. Gautam Dasarathy. He obtained his M.S. in statistics and Ph.D. in Mechanical Engineering in 2019 from the University of California-Riverside under the supervision of Dr. Fabio Pasqualetti. His research interests include statistical signal processing, systems and control, and power systems. He likes to read about the history of sciences and mathematics and biographies of eccentric scientists.

Assistant Professor

Arizona State University

9th April
2021

Scaling Up Bayesian Uncertainty Quantification for Inverse Problems using Deep Neural Networks

 Video: Link

Abstract: Due to the importance of uncertainty quantification (UQ), the Bayesian approach to inverse problems has recently gained popularity in applied mathematics, physics, and engineering. However, traditional Bayesian inference methods based on Markov Chain Monte Carlo (MCMC) tend to be computationally intensive and inefficient for such high-dimensional problems. To address this issue, several methods based on surrogate models have been proposed to speed up the inference process. More specifically, the calibration-emulation-sampling (CES) scheme has been proven to be successful in large-dimensional UQ problems. In this work, we propose a novel CES approach for Bayesian inference based on deep neural network (DNN) models for the emulation phase. The resulting algorithm is not only computationally more efficient, but also less sensitive to the training set. Further, by using an autoencoder (AE) for dimension reduction, we have been able to speed up our Bayesian inference method by up to three orders of magnitude. Overall, our method, henceforth called the Dimension-Reduced Emulative Autoencoder Monte Carlo (DREAM) algorithm, is able to scale Bayesian UQ up to thousands of dimensions in physics-constrained inverse problems. Using two low-dimensional (linear and nonlinear) inverse problems, we illustrate the validity of this approach. Next, we apply our method to two high-dimensional numerical examples (elliptic and advection-diffusion) to demonstrate its computational advantage over existing algorithms.
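
The calibration-emulation-sampling structure can be sketched on a toy problem: evaluate the expensive forward model on a modest design, fit a neural-network emulator, then run MCMC entirely on the emulator. The sketch below shows only that generic CES loop (the autoencoder-based dimension reduction of DREAM is omitted); the toy forward map, prior box, and step sizes are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def forward(u):                       # stands in for an expensive forward model
    return np.array([np.sin(u[0]) + u[1] ** 2, u[0] * u[1]])

u_true = np.array([0.7, -0.3])
sigma = 0.05
y_obs = forward(u_true) + sigma * rng.normal(size=2)

# Calibration: evaluate the true model on a modest design of parameters.
U = rng.uniform(-1.5, 1.5, size=(300, 2))
G = np.array([forward(u) for u in U])

# Emulation: fit a small neural-network surrogate of the forward map.
emulator = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000,
                        random_state=0).fit(U, G)

def log_post(u):                      # Gaussian likelihood + flat prior on a box
    if np.any(np.abs(u) > 1.5):
        return -np.inf
    r = emulator.predict(u[None, :])[0] - y_obs
    return -0.5 * np.sum(r ** 2) / sigma ** 2

# Sampling: random-walk Metropolis driven entirely by the cheap emulator.
u, lp, samples = np.zeros(2), log_post(np.zeros(2)), []
for _ in range(5000):
    prop = u + 0.1 * rng.normal(size=2)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:
        u, lp = prop, lp_prop
    samples.append(u.copy())
print("posterior mean:", np.mean(samples[1000:], axis=0), "truth:", u_true)
```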

Bio: Dr. Shiwei Lan is an assistant professor at the School of Mathematical and Statistical Sciences, Arizona State University. His research interests include statistical computing, Bayesian modeling, and uncertainty quantification. He obtained his Ph.D. from the Department of Statistics at the University of California, Irvine in 2014. After graduation, he did a postdoc at the University of Warwick in the UK, working on functional inference methods and uncertainty quantification. He then returned to the US for another postdoc at the California Institute of Technology. Dr. Lan joined ASU in 2019.

Assistant Professor

Univ. of Washington

2nd April
2021

Blockchain protocols made efficient and scalable

 Video: Link

Abstract: Blockchain protocols such as Bitcoin have created the possibility of highly decentralized computing. However, existing blockchain protocols suffer from various problems: (1) energy inefficiency, (2) large confirmation latency (order of hours), and (3) lack of scalability (performance does not improve as more nodes are added to the system). In this talk, we highlight our work in solving these bottlenecks. A primary contribution is the abstraction of the blockchain using tree-processes, which have both a randomized component as well as an adversarial component. We then use this abstraction to prove sharp phase-transitions of these processes yielding security theorems for the corresponding blockchain protocols. We then show how to use this abstraction to achieve (1) energy efficiency and (2) optimal confirmation latency. Finally, we show that (3) the scalability bottleneck of blockchains can be solved using an interesting connection to the classical result of Blackwell in dynamic game theory.

Bio: Sreeram Kannan is an assistant professor at University of Washington, Seattle, where he runs the information theory lab focusing on information theory and its applications in communication networks, machine learning and blockchain systems. He was a postdoctoral scholar at University of California, Berkeley and a visiting postdoc at Stanford University between 2012-2014 before which he received his Ph.D. in Electrical and Computer Engineering and M.S. in Mathematics from the University of Illinois Urbana Champaign. For more details, please visit Professor Sreeram Kannan's website.

Anirbit Mukherjee

Postdoc

Wharton, Statistics

19th March
2021

Some Recent Progresses in Mathematics of Neural Training

 Video: Link

Abstract: One of the paramount mathematical mysteries of our times is to be able to explain the phenomenon of deep-learning. Neural nets can be made to paint while imitating classical art styles or play chess better than any machine or human ever and they seem to be the closest we have ever come to achieving "artificial intelligence". But trying to reason about these successes quickly lands us into a plethora of extremely challenging mathematical questions - typically about discrete stochastic processes. In this talk we will describe two of the most recent directions of our work in this quest.

Firstly we will explain how under mild distributional conditions we can construct iterative algorithms which can train a ReLU gate in the realizable setting in linear time while also keeping track of mini-batching. We will show how this algorithm does approximate training when there is a data-poisoning attack on the training labels. Such convergence proofs remain unknown for S.G.D and we will show in experiments that our algorithm very closely mimics the behaviour of S.G.D. 

In the second half of the talk, we will review this very new concept of "local elasticity" of a learning process and demonstrate how it appears to reveal certain universal phase changes during neural training. Then we will introduce a mathematical model which reproduces some of these key properties in a semi-analytic way. We will end by delineating various open questions in this theme of macroscopic phenomenology with neural nets.

This is joint work with Prof. Weijie Su (Wharton, Statistics), Prof. Sayar Karmakar (U Florida, Statistics), and Phani Deep (Amazon, India).

Bio: Anirbit Mukherjee obtained his Ph.D. in applied mathematics at Johns Hopkins University, advised by Prof. Amitabh Basu. He is now a postdoc in the Statistics Department at Wharton (UPenn) with Prof. Weijie Su. He specializes in deep-learning theory and has been awarded two fellowships from JHU for this research: the Walter L. Robb Fellowship and the inaugural Mathematical Institute for Data Science Fellowship. Earlier, he was a researcher in quantum field theory while completing his undergraduate degree in physics at the Chennai Mathematical Institute (CMI) and his master's in theoretical physics at the Tata Institute of Fundamental Research (TIFR).

Professor, Arizona State

12th March
2021

A User Guide to Low-Pass Graph Signal Processing and its Applications

 Video: Link

Abstract: The notion of graph filters can be used to define generative models for graph data. In fact, the data obtained from many examples of network dynamics may be viewed as the output of a graph filter. With this interpretation, classical signal processing tools such as frequency analysis have been successfully applied, with analogous interpretations, to graph data, generating new insights for data science. What follows is a user guide on a specific class of graph data, where the generating graph filters are low-pass, i.e., the filter attenuates contents in the higher graph frequencies while retaining contents in the lower frequencies. Our choice is motivated by the prevalence of low-pass models in application domains such as social networks, financial markets, and power systems. We illustrate how to leverage properties of low-pass graph filters to learn the graph topology or identify its community structure; to efficiently represent graph data through sampling, recover missing measurements, and de-noise graph data; and to use the low-pass property as a baseline for detecting anomalies.
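
As a small illustration of what "low-pass" means here, the sketch below applies the generic filter (I + alpha*L)^{-1} to a path graph and prints the signal energy at low versus high graph frequencies; the graph, filter, and alpha are illustrative assumptions rather than the application-specific models from the talk.

```python
import numpy as np

# Path graph on n nodes: combinatorial Laplacian L = D - A.
n = 20
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1
L = np.diag(A.sum(axis=1)) - A

# A simple low-pass graph filter: H = (I + alpha * L)^{-1}. Its frequency
# response 1 / (1 + alpha * lambda) shrinks high graph frequencies (large
# Laplacian eigenvalues) and keeps low ones nearly intact.
alpha = 2.0
H = np.linalg.inv(np.eye(n) + alpha * L)

lam, V = np.linalg.eigh(L)            # graph frequencies and graph Fourier basis
x = V @ np.ones(n)                    # signal with unit energy at every frequency
energy = (V.T @ (H @ x)) ** 2         # spectrum after low-pass filtering
print("lowest frequencies :", np.round(energy[:5], 3))
print("highest frequencies:", np.round(energy[-5:], 3))
```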

Bio: Anna Scaglione (M.Sc. '95, Ph.D. '99) is currently a Professor of Electrical, Computer and Energy Engineering at Arizona State University. She was previously Professor of Electrical and Computer Engineering at the University of California at Davis (2008-2014) and at Cornell University (2001-2008), where she became Associate Professor with tenure in 2006. Prior to joining the engineering faculty at Cornell, Scaglione was an Assistant Professor at the University of New Mexico (2000-2001). Dr. Scaglione's expertise is in the broad area of statistical signal processing with applications to communication networks, electric power systems/intelligent infrastructure, and network science. Dr. Scaglione was elected an IEEE Fellow in 2011. She is the recipient of the 2000 IEEE Signal Processing Transactions Best Paper Award and the 2013 IEEE Donald G. Fink Prize Paper Award for the best review paper of that year among all IEEE publications. Also, her work with her student earned the 2013 IEEE Signal Processing Society Young Author Best Paper Award (Lin Li), as well as several best conference paper awards. She was an SPS Distinguished Lecturer for 2019-2020 and is the recipient of the 2020 Technical Achievement Award from the IEEE Communications Society Technical Committee on Smart Grid Communications.

Asst. Professor, Columbia Univ.

5th March
2021

Learning in structured MDPs with convex cost functions: Improved regret bounds for inventory management

 Video: TBU

Abstract: The stochastic inventory control problem under censored demands is a fundamental problem in revenue and supply chain management. A simple class of policies called "base-stock policies" is known to be asymptotically optimal for this problem in certain settings, and further, the convexity of the long-run average cost under such policies has been established. In this work, we present a learning algorithm for the stochastic inventory control problem under a lost-sales penalty and positive lead times, when the demand distribution is a priori unknown. Our main result is a bound of O(L\sqrt{T}+D) on the regret against the best base-stock policy. Here T is the time horizon, L is the fixed and known lead time, and D is an unknown parameter of the demand distribution, described roughly as the number of time steps needed to generate enough demand to deplete one unit of inventory. Our results significantly improve the existing regret bounds for this problem. Notably, even though the state space of the underlying Markov Decision Process (MDP) in this problem is continuous and L-dimensional, our regret bounds depend linearly on L. Our techniques utilize the convexity of the long-run average cost and a newly derived bound on the 'bias' of base-stock policies to establish an almost black-box connection between the problem of learning and optimization in such MDPs and stochastic convex bandit optimization. The techniques presented here may be of independent interest for other settings that involve large structured MDPs but with convex cost functions.
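
For readers unfamiliar with the policy class, the sketch below simulates the long-run average cost of a base-stock policy with lost sales and a positive lead time; the demand distribution and cost parameters are illustrative assumptions, and the learning algorithm and regret analysis from the talk are not implemented here. The convexity of this cost in the base-stock level S is the structure the learning algorithm exploits.

```python
import numpy as np
from collections import deque

def avg_cost(S, lead=2, T=50_000, hold=1.0, lost=4.0, seed=0):
    """Long-run average cost of a base-stock policy with lost sales:
    each period, order up to level S counting stock on hand plus
    orders already in the pipeline; orders arrive after `lead` periods."""
    rng = np.random.default_rng(seed)
    on_hand, pipeline, cost = S, deque([0] * lead, maxlen=lead), 0.0
    for _ in range(T):
        on_hand += pipeline.popleft()             # order placed `lead` periods ago arrives
        d = rng.poisson(3.0)                      # illustrative demand distribution
        sold = min(d, on_hand)
        cost += hold * (on_hand - sold) + lost * (d - sold)
        on_hand -= sold
        pipeline.append(max(S - on_hand - sum(pipeline), 0))   # order up to S
    return cost / T

for S in (6, 9, 12, 15):
    print(S, round(avg_cost(S), 3))
```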


Bio: Shipra Agrawal is Cyrus Derman Assistant Professor of the Department of Industrial Engineering and Operations Research. She is also affiliated with the Department of Computer Science and the Data Science Institute, at Columbia University. Her research spans several areas of optimization and machine learning, including online optimization, multiarmed bandits, online learning, and reinforcement learning. Shipra serves as an associate editor for Management Science, Mathematics of Operations Research, and INFORMS Journal on Optimization. Her research is supported by an NSF CAREER award and faculty research awards from Google and Amazon.

Postdoc,
Arizona State Univ.

26th February
2021

Geometric Constraints for Learning Representations for Visual Data 

 

Abstract: Representation of visual data is a connecting link between the perceptual world and machine-based processing. Over the decades, the computer vision community has been dedicated to improving these representations so that they can assist humans in a wide range of applications, from medical imaging to visual search and face recognition systems. In this talk, I will present how geometric constraints can be used to aid in learning representations for various computer vision applications that have access to only a limited amount of labeled training data, abundant unlabeled training data, or a combination of the two. The talk will cover two types of geometric constraints: manifold and semantic. The first part of the talk will cover the application of manifold constraints in the unsupervised learning of disentangled representations, which improves the interpretability of deep networks. The second part of the talk will cover an interesting application of semantic constraints to visual animal biometrics for wildlife conservation. We will see that such constraints result in improved robustness and generalization of the representations for primate face recognition as well as tiger re-identification problems.

Asst. Professor, UIUC

12th February
2021

The Measurement and Mismeasurement of Trustworthy ML

 Video: Link

Abstract: Across healthcare, science, and engineering, we increasingly employ machine learning (ML) to automate decision-making that, in turn, affects our lives in profound ways. However, ML can fail, with significant and long-lasting consequences. Reliably measuring such failures is the first step towards building robust and trustworthy learning machines. Consider algorithmic fairness, where widely-deployed fairness metrics can exacerbate group disparities and result in discriminatory outcomes. Moreover, existing metrics are often incompatible. Hence, selecting fairness metrics is an open problem. Measurement is also crucial for robustness, particularly in federated learning with error-prone devices. Here, once again, models constructed using well-accepted robustness metrics can fail. Across ML applications, the dire consequences of mismeasurement are a recurring theme. This talk will outline emerging strategies for addressing the measurement gap in ML and how this impacts trustworthiness.


Bio: Sanmi (Oluwasanmi) Koyejo is an Assistant Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Koyejo's research interests are in developing the principles and practice of trustworthy machine learning. Additionally, Koyejo focuses on applications to neuroscience and healthcare. Koyejo completed his Ph.D. in Electrical Engineering at the University of Texas at Austin, advised by Joydeep Ghosh, and completed postdoctoral research at Stanford University. His postdoctoral research was primarily with Russell A. Poldrack and Pradeep Ravikumar. Koyejo has been the recipient of several awards, including a best paper award from the conference on uncertainty in artificial intelligence (UAI), a Kavli Fellowship, an IJCAI early career spotlight, and a trainee award from the Organization for Human Brain Mapping (OHBM). Koyejo serves on the board of the Black in AI organization.

Arizona State University

20th November
2020

The Multiple-Access Channel Is Stranger Than You Think

 Video: Link

Abstract: The multiple-access channel (MAC) is the network information theory problem in which multiple transmitters each send a message to a common receiver. While generally considered to be a straightforward extension of the point-to-point channel, in fact the MAC is much stranger than that. In this talk I will present three stories of MAC strangeness. The first story is about the difference between average probability of error and maximal probability of error. The second concerns the so-called cooperation facilitator model, in which a small amount of cooperation between transmitters has a disproportionate effect on achievable rates. The final story is about my recent work characterizing the second-order behavior of the MAC via a new measure of dependence called wringing dependence.

Arizona State University

13th November
2020

On Rate-optimal Uniform Concentration Inequalities for Shannon Entropies

 Video: Link

Abstract: We present a new type of exponential-decay concentration inequality that bounds the tail probability of the difference between the log-likelihood of discrete random variables and the negative entropy. In contrast to the classical Bernstein and Hoeffding inequalities when applied to log-likelihoods, the new bound is independent of the parameters and therefore does not blow up as the parameters approach 0 or 1. We further present a refined inequality that achieves the optimal rate in terms of the sample size and the number of possible values of the discrete variable. The key step in the proof is to bound the moment generating function. We prove the bound by viewing it as a non-convex optimization problem and showing that the duality gaps are zero using techniques from real analysis. The new inequalities strengthen certain theoretical results on likelihood-based methods for community detection in networks and can be applied to other likelihood-based methods for binary data.
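
In symbols, for i.i.d. draws X_1, ..., X_n from a pmf p on a finite alphabet, the quantity being concentrated is the deviation of the empirical log-likelihood from its mean, the negative entropy:

\frac{1}{n}\sum_{i=1}^{n} \log p(X_i) \;-\; \big(-H(p)\big), \qquad H(p) = -\sum_{x} p(x)\log p(x).

Since \mathbb{E}[\log p(X_i)] = -H(p), this difference has mean zero; the talk's inequalities bound its tails with constants that, unlike Bernstein- or Hoeffding-type bounds, do not degrade as individual probabilities p(x) approach 0 or 1. (The exact rates are stated in the paper and not reproduced here.)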

Adaptive Experimental Design for Best Identification and Multiple Testing

 Video: Link

Abstract: Adaptive experimental design (AED), or active learning, leverages already-collected data to guide future measurements, in a closed loop, to collect the most informative data for the learning problem at hand. In both theory and practice, AED can extract considerably richer insights than any measurement plan fixed in advance, using the same statistical budget. Unfortunately, the same mechanism of feedback that can aid an algorithm in collecting data can also mislead it: a data collection heuristic can become overconfident in an incorrect belief, then collect data based on that belief, yet give little indication to the practitioner that anything went wrong. Consequently, it is critical that AED algorithms are provably robust with transparent guarantees. In this talk I will present my group’s recent work on near-optimal approaches to adaptive testing with false discovery control and the best-arm identification problem for linear bandits, and how these approaches relate to, and leverage, ideas from non-adaptive optimal linear experimental design.

Bio: Kevin Jamieson is an Assistant Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington and is the Guestrin Endowed Professor in Artificial Intelligence and Machine Learning. He received his B.S. in 2009 from the University of Washington, his M.S. in 2010 from Columbia University, and his Ph.D. in 2015 from the University of Wisconsin - Madison under the advisement of Robert Nowak, all in electrical engineering. He returned to the University of Washington as faculty in 2017 after a postdoc with Benjamin Recht at the University of California, Berkeley. Jamieson’s research explores how to leverage already-collected data to inform what future measurements to make next, in a closed loop. His work ranges from theory to practical algorithms with guarantees to open-source machine learning systems and has been adopted in a range of applications, including measuring human perception in psychology studies, adaptive A/B/n testing in dynamic web-environments, numerical optimization, and efficient tuning of hyperparameters for deep neural networks.

Arizona State University

23rd October
2020

Shared Spectrum / Radar-communications coexistence: Recent results

 Video: Link

Abstract: The limited availability of frequency spectrum requires greater spectral efficiency to meet the increasing demands for communication (comm.) and data services. Thus, we explore the possibility of diverse RF systems coexisting within the same frequency band as a means of improving spectral efficiency. Specifically, radars coexisting in the same frequency band as comm. systems are of interest, as this presents a set of new challenges for system design and analysis. We first develop a performance bound for a cooperative radar and comm. system coexisting within the same frequency band. Second, we develop new theory for predicting the receiver operating characteristics of a radar receiver cooperating with an in-band comm. system.

Arizona State University

16th October
2020

Bayesian Optimization: Challenges and Opportunities

Video: Link

Abstract: Bayesian Optimization is attracting increasing attention for its ability to be applied to arbitrarily complex problems. In this talk, we focus on continuous optimization problems of black-box functions, and we look into three challenges:

(i) Accelerating Bayesian optimization

(ii) Making use of multiple input sources to evaluate the expensive objective function

(iii) Scaling algorithms to high-dimensional cases.

In view of the first challenge, we present some of the speaker's research on integrating Local Optimization with Bayesian optimization within the general framework GLOBO. In view of challenge (ii), we present the work on Multi-Fidelity Bayesian Optimization and the associated challenges. Finally, we introduce BOFiP, a new approach to scaling Bayesian optimization that makes use of game theory.
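
For context on the baseline being accelerated and scaled, a vanilla Bayesian optimization loop (a Gaussian-process surrogate plus an expected-improvement acquisition) looks roughly as follows. This is not GLOBO, the multi-fidelity method, or BOFiP; the objective, kernel, and budget are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x) + 0.3 * x ** 2        # black-box objective (minimize)

X = rng.uniform(-2, 2, size=(3, 1))               # small initial design
y = f(X).ravel()
grid = np.linspace(-2, 2, 400).reshape(-1, 1)

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                  normalize_y=True).fit(X, y)
    mu, sd = gp.predict(grid, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement
    x_next = grid[np.argmax(ei)]                        # most promising point
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next)[0])

print("best point:", X[np.argmin(y)].item(), "value:", y.min())
```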

Georgia Institute of Technology

9th October
2020

Mitigating the Impact of Bias in Selection Algorithms

 

Abstract: The introduction of automation into the hiring process has put a spotlight on a persistent problem: discrimination in hiring on the basis of protected-class status. Left unchecked, algorithmic applicant-screening can exacerbate pre-existing societal inequalities and even introduce new sources of bias; if designed with bias-mitigation in mind, however, automated methods have the potential to produce fairer decisions than non-automated methods. In this work, we focus on selection algorithms used in the hiring process (e.g., resume-filtering algorithms) given access to a "biased evaluation metric". That is, we assume that the method for numerically scoring applications is inaccurate in a way that adversely impacts certain demographic groups.

We analyze the classical online secretary algorithms under two models of bias or inaccuracy in evaluations: (i) first, we assume that the candidates belong to disjoint groups (e.g., race, gender, nationality, age), with unknown true utility Z, and “observed” utility Z/\beta for some unknown \beta that is group-dependent, (ii) second, we propose a “poset” model of bias, wherein certain pairs of candidates can be declared incomparable.  We show that in the biased setting, group-agnostic algorithms for online secretary problem are suboptimal, often causing starvation of jobs for groups with \beta>1. We bring in techniques from matroid secretary literature and order theory to develop group-aware algorithms that are able to achieve certain “fair” properties, while obtaining near-optimal competitive ratios for maximizing true utility of hired candidates in a variety of adversarial and stochastic settings. Keeping in mind the requirements of U.S. anti-discrimination law, however, certain group-aware interventions can be construed as illegal, and we will conclude the talk by partially addressing tensions with the law and ways to argue legal feasibility of our proposed interventions. This talk is based on work with Jad Salem and Deven R. Desai.
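
A toy simulation of the starvation phenomenon described above: run the classical group-agnostic secretary rule on observed scores in which one group's utilities are divided by \beta > 1, and count how often that group is hired. The group proportions, \beta, and utility distribution are illustrative assumptions, and the group-aware algorithms from the talk are not implemented.

```python
import numpy as np

rng = np.random.default_rng(0)

def secretary_hire(observed):
    """Classical group-agnostic secretary rule: watch the first n/e candidates,
    then hire the first one beating everything seen so far."""
    n = len(observed)
    k = int(n / np.e)
    threshold = observed[:k].max()
    for i in range(k, n):
        if observed[i] > threshold:
            return i
    return n - 1                                   # forced to take the last candidate

beta, n_trials, hires_from_B = 1.5, 5000, 0
for _ in range(n_trials):
    true_util = rng.uniform(size=100)
    group_B = rng.random(100) < 0.5                # half the candidates are in group B
    observed = np.where(group_B, true_util / beta, true_util)
    hires_from_B += group_B[secretary_hire(observed)]
print("fraction of hires from group B:", hires_from_B / n_trials)
```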

Arizona State University

25th September
2020

Alpha-loss: A Tunable Class of Loss Functions for Robust Learning

Video: Link

Abstract: In this talk, we introduce alpha-loss as a parameterized class of loss functions that resulted from operationally motivating information-theoretic measures. Tuning the parameter alpha from 0 to infinity yields a class of loss functions that admit continuous interpolation between log-loss (alpha=1), exponential loss (alpha=1/2), and 0-1 loss (alpha=infinity). We discuss how different regimes of the parameter alpha enable the practitioner to tune the sensitivity of their algorithm towards two emerging challenges in learning: robustness and fairness. We discuss classification properties of the class, information-theoretic interpretations, and the optimization landscape of the average loss as viewed through the lens of Strict-Local-Quasi-Convexity under the logistic regression model. Finally, we comment on ongoing and future work on different applications of alpha-loss, including deep neural networks, federated learning, and boosting.
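
For a classifier that assigns probability p to the true label, the parameterization used in the alpha-loss line of work is commonly written as follows (stated here for reference; the talk's exact notation may differ):

\ell_{\alpha}(p) \;=\; \frac{\alpha}{\alpha-1}\Big(1 - p^{\,1-1/\alpha}\Big), \qquad \alpha \in (0,\infty],

which recovers the log-loss -\log p as \alpha \to 1, the exponential loss 1/p - 1 at \alpha = 1/2, and 1 - p, a smooth surrogate for the 0-1 loss, as \alpha \to \infty.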

Arizona State University

18th September
2020

Distributed Algorithms for Optimization in Networks

Video: Link

Abstract: We will give an overview of distributed optimization algorithms, starting with the basic underlying idea illustrated on a prototype problem in machine learning. In particular, we will focus on a convex minimization problem where the objective function is given as the sum of convex functions, each of which is known by an agent in a network. The agents communicate over the network with the task of jointly determining a minimum of the sum of their objective functions. The communication network can vary over time, which is modeled through a sequence of graphs over a static set of nodes (representing the agents in the system). In this setting, we will discuss distributed first-order methods that make use of an agreement protocol, a mechanism that replaces the role of a coordinator. We will discuss some refinements of the basic method and conclude with more recent developments of fast methods that can match the performance of centralized methods.
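
The basic method described above can be sketched in a few lines: each agent mixes its iterate with its neighbors' (the agreement protocol) and then takes a gradient step on its own local objective. The ring network, quadratic local objectives, and constant step size below are illustrative assumptions; with a constant step the agents reach only a neighborhood of the optimum, which is part of what the refinements and fast methods mentioned at the end address.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, d, lr, T = 5, 3, 0.05, 500

# Each agent i privately holds f_i(x) = 0.5 * ||A_i x - b_i||^2;
# the network's goal is to minimize the sum of the f_i.
A = rng.normal(size=(n_agents, 4, d))
b = rng.normal(size=(n_agents, 4))

# Doubly stochastic mixing matrix for a fixed ring network (agreement protocol).
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

x = np.zeros((n_agents, d))                  # every agent keeps its own iterate
for _ in range(T):
    grads = np.array([A[i].T @ (A[i] @ x[i] - b[i]) for i in range(n_agents)])
    x = W @ x - lr * grads                   # consensus step + local gradient step

# Compare with the centralized solution of the same least-squares problem.
x_star = np.linalg.lstsq(A.reshape(-1, d), b.reshape(-1), rcond=None)[0]
print("max agent error:", np.abs(x - x_star).max())
```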

Arizona State University

11th September
2020

Clinical speech analytics: algorithms, applications, and information limits

Video: Link

Abstract: The ability to share our thoughts and ideas through spoken communication is fragile. Even the simplest verbal response requires a complex sequence of events. It requires thinking of the words that best convey your message; sequencing these words appropriately; and then sending signals to the muscles required to produce speech. The slightest damage to the brain areas that orchestrate these events can manifest in speech and language problems. These disturbances offer a window into brain functioning and have gained popularity as digital biomarkers in clinical applications in neurology. In the first part of this presentation, I will present an overview of several projects where we use interpretable measures of speech and language production as proxies for motor and cognitive health. I will provide an overview of how these algorithms are validated and what clinical questions they can help answer. If time permits, in the second part of the talk, I will discuss recent results from a project that aims to characterize the information limits inherent in speech as a diagnostic. This work provides a first look at how well we can answer fundamental questions like “What are the limits of how well I can detect a disturbance in neurological health from only recorded speech?”