Archives

Mladen Kolar

Professor

           USC

Adaptive Stochastic Optimization with Constraints

Video:

Abstract: Constrained stochastic optimization problems appear widely in numerous applications in statistics, machine learning, and engineering, including constrained maximum likelihood estimation, constrained deep neural networks, physics-informed machine learning, and optimal control. I will discuss our recent work on solving nonlinear optimization problems with stochastic objective and deterministic constraints. I will describe the development of adaptive algorithms based on sequential quadratic programming and their properties. The talk is based on joint work with Yuchen Fang, Ilgee Hong, Sen Na, Michael Mahoney, and Mihai Anitescu.
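
As a minimal sketch of the kind of subproblem involved (generic SQP notation, not the talk's exact formulation): with a stochastic estimate \bar{g}_k of the objective gradient at the iterate x_k, a Hessian approximation B_k, and deterministic constraints c(x) = 0 with Jacobian J_k, each iteration solves the quadratic program

    \min_{d} \; \bar{g}_k^{\top} d + \tfrac{1}{2} d^{\top} B_k d \quad \text{subject to} \quad c(x_k) + J_k d = 0,

and then sets x_{k+1} = x_k + \alpha_k d_k. The adaptive algorithms discussed in the talk concern, among other things, how the stepsize \alpha_k and the accuracy of the stochastic estimates are controlled.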

Bio:  Mladen Kolar is a professor in the Department of Data Sciences and Operations at the USC Marshall School of Business. Mladen earned his PhD in Machine Learning from Carnegie Mellon University in 2013. His research focuses on high-dimensional statistical methods, probabilistic graphical models, and scalable optimization methods, driven by the need to uncover interesting and scientifically meaningful structures from observational data. Mladen was selected as a recipient of the 2024 Junior Leo Breiman Award for his outstanding contributions to these areas. He currently serves as an associate editor for the Journal of Machine Learning Research, the Annals of Statistics, the Journal of Computational and Graphical Statistics, and the New England Journal of Statistics in Data Science.

Sébastien Motsch

Associate Professor

           ASU

Natural projected flow: A PDE solver using neural networks

Video:

Abstract: Solving PDEs with neural networks has attracted a lot of attention in recent years, especially with the introduction of Physics-Informed Neural Networks (PINNs). These methods typically utilize neural networks as approximate solutions and adjust their parameters to satisfy the PDE (approximately). Our method, called Natural Projected Flow, deviates from this approach by utilizing a semi-discrete formulation. This involves seeking a solution where the parameters of the neural network (representing the spatial variable) evolve over time. The crucial challenge lies in identifying the corresponding evolution equation for these parameters. Natural Projected Flow addresses this challenge by employing an L^2 projection of the flow of the PDE onto the manifold of neural networks. The effectiveness of our proposed numerical solver is demonstrated through applications to various classical PDEs, including diffusion and porous-media equations.
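
As a rough illustration of the semi-discrete idea (the notation here is generic, not taken from the talk): writing the approximate solution of a PDE \partial_t u = \mathcal{F}(u) as u_{\theta(t)}(x), the parameter evolution is obtained by an L^2 projection of the PDE flow onto the tangent space of the neural-network manifold,

    \dot{\theta}(t) \in \arg\min_{\eta} \left\| \textstyle\sum_i \partial_{\theta_i} u_{\theta(t)} \, \eta_i \;-\; \mathcal{F}\big(u_{\theta(t)}\big) \right\|_{L^2}^2,

so that time stepping happens in the parameter space while the spatial dependence is carried entirely by the network.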

Bio: Sébastien Motsch is an associate professor at Arizona State University. He received his PhD from the Institut de Mathematiques de Toulouse in 2009. Prof. Motsch’s research interests focus on the mathematical modeling of biological systems, especially those which exhibit self-organization such as bacterial colonies or flocks of birds. His work aims at connecting two levels of description for these systems: a microscopic viewpoint (describing each individual) and a macroscopic description (using partial differential equations). One of the many questions raised by such systems is to understand how local interactions among individuals lead to the formation of large structures. The derivation and analysis of macroscopic models give new insights into these phenomena. Dr. Motsch’s research can be divided into three themes: 1) derivation of macroscopic models from microscopic dynamics, 2) numerical and analytical study of the derived macroscopic models, and 3) modeling of complex systems based on experimental data.

Lalitha Vadlamani

Assistant Professor

           IIIT Hyderabad

Codes for Distributed Storage and Distributed Gradient Descent

Video:  Link 

Abstract: In a distributed storage system, due to the increase in storage capacity of a node, efficient repair of failed nodes is becoming increasingly important in addition to ensuring a given level of reliability and low storage overhead. Codes with locality are a class of codes designed for storage systems which have the characteristic that they trade off repair locality (number of nodes accessed to repair a failed node) for storage overhead. Maximally recoverable codes are a class of codes which correct the maximum possible number of erasure patterns given the locality constraints of the code, and are hence of interest. We will introduce three classes of maximally recoverable codes (MRC) based on the topology of the local parities: (i) MRC with locality, (ii) MRC with hierarchical locality, and (iii) MRC with product topologies. We will present various constructions and results dealing with these MRCs.

In a distributed gradient descent problem, a gradient computation job is divided into multiple parallel tasks, which are computed on different servers, and the job is finished when all the tasks are complete. In this framework, a subset of straggling servers forms a bottleneck to the efficient execution of the gradient descent. Gradient coding ensures efficient distributed gradient computation even in the presence of stragglers by utilizing coding-theoretic techniques. We will introduce two variants of gradient coding and present results in these settings: (i) delayed start of tasks corresponding to a subset of servers is allowed, and (ii) a form of approximate gradient coding where only the sum of a fraction of the gradients needs to be recovered.
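
To make the gradient coding idea concrete, here is a toy numerical check of the classic 3-worker, 1-straggler scheme of Tandon et al. (2017); it is included only as background and is not the delayed-start or approximate variants discussed in the talk.

    import numpy as np

    # Toy illustration of gradient coding: 3 workers, tolerant to 1 straggler.
    rng = np.random.default_rng(0)
    g1, g2, g3 = rng.standard_normal((3, 4))   # partial gradients
    full = g1 + g2 + g3                        # quantity the master needs

    # Each worker stores two partial gradients and sends one coded combination.
    f1 = 0.5 * g1 + g2      # worker 1 (holds g1, g2)
    f2 = g2 - g3            # worker 2 (holds g2, g3)
    f3 = 0.5 * g1 + g3      # worker 3 (holds g1, g3)

    # Any two of the three messages suffice to recover the full gradient.
    recov_12 = 2 * f1 - f2
    recov_23 = f2 + 2 * f3
    recov_13 = f1 + f3
    for r in (recov_12, recov_23, recov_13):
        assert np.allclose(r, full)
    print("full gradient recovered from any 2 of 3 workers")

Each worker computes two of the three partial gradients and transmits a single coded combination, and the master recovers the full gradient from any two workers, so one straggler never delays the job.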


Bio: Lalitha Vadlamani received her B.E. degree in Electronics and Communication Engineering from Osmania University, Hyderabad, in 2003 and her M.E. and Ph.D. degrees from the Indian Institute of Science (IISc), Bangalore, in 2005 and 2015, respectively. Since May 2015, she has been working as an Assistant Professor at IIIT Hyderabad, where she is affiliated with the Signal Processing and Communications Research Center. Her research interests include coding for distributed storage and computing, index coding, polar codes, learning-based codes and coded blockchains. She is a recipient of the Prof. I.S.N. Murthy Medal from IISc in 2005 and the TCS Research Scholarship for the year 2011. She is currently visiting the Simons Institute for the Theory of Computing, Berkeley.

Mayank Bakshi

Research Scientist

           ASU

Beyond Redundancy: Adversarial Mitigation in Distributed Systems through Authentication and Adversary Detection

Video: Link 

Abstract: The distributed nature of many modern applications, such as the Internet of Things (IoT) and Distributed Machine Learning, makes them inherently vulnerable to infiltration and disruption by unidentified adversarial agents. These agents may deviate arbitrarily from the protocol, disrupting the final outcome and potentially leading to privacy leakage from legitimate agents. Given the distributed nature of these problems, countering such deviations and minimizing the adversary's effects is challenging and costly — after all, any form of 'error correction' necessitates redundancies in the system. In this talk, we argue that such redundancy is not needed when adversaries are only sporadically present in the system. Instead, we advocate for an authentication-based approach to mitigate such adversarial risks in distributed systems. The philosophy here is to maintain efficiency in adversary-free scenarios while still safeguarding against malicious activities by 'validating' the outcome to ensure minimal adversarial influence. We show that for two different classes of problems—decentralized learning and multiple access communications—the authentication-based approach performs essentially as efficiently as a non-authenticated approach, with the added advantage that the presence of adversaries can be detected. In contrast to error correction-based approaches, which require significant overhead in terms of communication and cost, our approach validates system outcomes through suitable 'checks' at the end of the protocol to detect adversarial presence. Lastly, we will also explore interesting connections to adversarial hypothesis testing and active learning problems.

 

Bio: Dr. Mayank Bakshi received his B.Tech. and M.Tech. degrees from the Indian Institute of Technology, Kanpur, in 2003 and 2005, respectively, and his Ph.D. degree from the California Institute of Technology in 2011. He then served as a postdoctoral scholar and a research assistant professor at the Chinese University of Hong Kong from 2012 to 2019, and as a principal researcher at Theory Lab, Huawei Hong Kong from 2019 to 2021. Currently, he is a research scientist at Arizona State University. His research interests include physical layer security, adversarially robust communications and learning, and sparse recovery.


Yao Xie

Professor

           GA Tech

Generative Models for Statistical Inference

Video: Link 

Abstract: We consider the problem of learning a continuous probability density function from data, a fundamental problem in statistics known as density estimation. It also arises in distributionally robust optimization (DRO), where the goal is to find the worst-case distribution to represent scenario departure from observations. Such a problem is known to be hard in high dimensions and incurs a significant computational challenge. In this talk, I will present a machine learning approach to tackle these challenges, leveraging recent advances in neural-networks-based generative models, which have become popular recently due to their competitive performance in high-dimensional data. We develop a neural ODE flow network called JKO-iFlow, inspired by the Jordan-Kinderlehrer-Otto (JKO) scheme, which unfolds the discrete-time dynamic of the Wasserstein gradient flow. Our method can greatly reduce computational costs while achieving competitive performance relative to existing generative models. The connection of our JKO-iFlow method with proximal gradient descent in the Wasserstein space enables us to prove a density learning guarantee with an exponential convergence rate. Besides density estimation, we also demonstrate that the JKO-iFlow generative model can be used in various applications, including adversarial learning, robust hypothesis testing, and data-driven differential privacy.
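
For context, a minimal statement of the JKO scheme referred to above (standard form, not the talk's exact formulation): for a target density \pi, step size h > 0, and functional F(\rho) = \mathrm{KL}(\rho \,\|\, \pi), the scheme iterates

    \rho_{k+1} = \arg\min_{\rho} \; F(\rho) + \frac{1}{2h} W_2^2(\rho, \rho_k),

where W_2 is the Wasserstein-2 distance; as h \to 0 the iterates follow the Wasserstein gradient flow of F, and, roughly speaking, JKO-iFlow implements each such step as one block of a neural ODE flow.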

 

Bio: Yao Xie is the Coca-Cola Foundation Chair, Professor at Georgia Institute of Technology in the H. Milton Stewart School of Industrial and Systems Engineering, and Associate Director of the Machine Learning Center. From September 2017 until May 2023, she was the Harold R. and Mary Anne Nash Early Career Professor. She received her Ph.D. in Electrical Engineering (minor in Mathematics) from Stanford University in 2012 and was a Research Scientist at Duke University. Her research lies at the intersection of statistics, machine learning, and optimization in providing theoretical guarantees and developing computationally efficient and statistically powerful methods for problems motivated by real-world applications. She received the National Science Foundation (NSF) CAREER Award in 2017, the INFORMS Wagner Prize Finalist in 2021, and the INFORMS Gaver Early Career Award for Excellence in Operations Research in 2022. She is currently an Associate Editor for IEEE Transactions on Information Theory, Journal of the American Statistical Association-Theory and Methods, Operations Research, Sequential Analysis: Design Methods and Applications, INFORMS Journal on Data Science, and an Area Chair of NeurIPS, ICML, and ICLR.


Massimo Franceschetti

Professor

           UCSD

Electromagnetic Information Theory

Video: 

Abstract: Theoretical analysis of the performance limits of next-generation communication systems requires a deeper understanding of the effect of the propagation channel in the computation of relevant information-theoretic bounds. Most of the literature, however, abstracts out the physics, treating these problems as purely mathematical or engineering ones. Although abstractions are certainly necessary in the design of systems, much can be lost in understanding the fundamental limits of emerging technologies such as holographic MIMO, super-resolution, high-frequency, and quantum communications. In this talk, we illustrate how fundamental limits can be studied by merging classic results in functional analysis and electromagnetics. Specifically, we will consider the degrees of freedom, entropy, and capacity of radiated signals. We will recall classic results in communication theory and signal analysis, draw connections with electromagnetics, and discuss some recent advancements.

 

Bio: Massimo Franceschetti received the Laurea degree (with honors) in computer engineering from the University of Naples, Naples, Italy, in 1997, and the M.S. and Ph.D. degrees in electrical engineering from the California Institute of Technology, Pasadena, CA, in 1999 and 2003, respectively. He is Professor of Electrical and Computer Engineering at the University of California at San Diego (UCSD). Before joining UCSD, he was a postdoctoral scholar at the University of California at Berkeley for two years. He is coauthor of the book “Random Networks for Communication” and author of the book “Wave Theory of Information,” both published by Cambridge University Press. He was awarded the C. H. Wilts Prize in 2003 for best doctoral thesis in electrical engineering at Caltech; the S.A. Schelkunoff Award in 2005 for best paper in the IEEE Transactions on Antennas and Propagation; a National Science Foundation (NSF) CAREER award in 2006; an Office of Naval Research (ONR) Young Investigator Award in 2007; the IEEE Communications Society Best Tutorial Paper Award in 2010; and the IEEE Control Systems Society Ruberti Young Researcher Award in 2012. He became an IEEE Fellow in 2018 and a Guggenheim Fellow for Natural Sciences, Engineering, in 2019.


Ravi Tandon

 Associate Professor

          University of Arizona

Amicable Perturbations

Video:  Link 

Abstract: Machine learning based classifiers have achieved incredible success in a variety of sectors such as college admissions, hiring, banking and other domains. However, their ability to make classifications has not been fully exploited to understand how to improve undesirable classifications. In this talk, I will present a new framework for finding the most efficient changes that could be made in the real world to achieve a more favorable classification, and term these changes Amicable Perturbations.  We present a principled methodology for creating amicable perturbations and demonstrate their effectiveness on data sets from a variety of fields. Amicable perturbations differ from counterfactuals in that they are better suited to balance the effort-reward trade-off and lead to the most efficient plan of action. Unlike adversarial examples, which fool a classifier into making false predictions, amicable perturbations are intended to affect the true class of the data.  To this end, we develop a novel method for verifying that amicable perturbations change the true class probabilities. We also compare our results to those achieved by previous approaches such as counterfactuals and adversarial attacks.

 

This talk is based on joint work with Jesse Friedbaum (Ph.D. student at the University of Arizona).  

Bio: Ravi Tandon is the Litton Industries John M. Leonis Distinguished Associate Professor in the Department of ECE at the University of Arizona. He received the B.Tech. degree in Electrical Engineering from the Indian Institute of Technology, Kanpur (IIT Kanpur) in 2004 and the Ph.D. degree in Electrical and Computer Engineering from the University of Maryland, College Park (UMCP) in 2010. From 2010 to 2012, he was a post-doctoral research associate at Princeton University, and was a research assistant professor at Virginia Tech, with positions in the ECE department, Hume Center and the Department of Computer Science. He is a recipient of the 2018 Keysight Early Career Professor Award, NSF CAREER Award in 2017, and a Best Paper Award at IEEE GLOBECOM 2011. He is a Senior Member of IEEE and currently serves as an Editor for IEEE Transactions on Information Theory and the IEEE Transactions on Communications. His current research interests include information theory and its applications to wireless networks, signal processing, communications, security and privacy, machine learning and data mining.

Genevera Allen

 Associate Professor

               Rice University

Graph Learning for Functional Neuronal Connectivity

Video: Link 

Abstract: Understanding how large populations of neurons communicate in the brain at rest, in response to stimuli, or to produce behavior as well as how brain function relates to structure are fundamental open questions in neuroscience. Many approach this by estimating the intrinsic functional neuronal connectivity using probabilistic graphical models. But there remain major statistical and computational hurdles to estimating graphical models from new large-scale calcium imaging technologies and from huge projects which image up to one hundred thousand neurons across multiple sessions in the active mouse brain.  In this talk, I will highlight a number of new graph learning strategies my group has developed to address many critical unsolved challenges arising with large-scale neuroscience data. Specifically, we will focus on Graph Quilting, in which we derive a method and theoretical guarantees for graph learning from non-simultaneously recorded neurons. We will also highlight theory and methods for graph learning with latent variables induced by unrecorded neurons via thresholding, graph learning for spikey neuronal activity data via Subbotin graphical models, and computational approaches for graph learning from enormous numbers of neurons via minipatch learning. Finally, we will demonstrate the utility of all approaches on synthetic data as well as real calcium imaging data for the task of estimating functional neuronal connectivity.


Bio: Genevera Allen is an Associate Professor of Electrical and Computer Engineering, Statistics, and Computer Science at Rice University and an investigator at the Jan and Dan Duncan Neurological Research Institute at Texas Children’s Hospital and Baylor College of Medicine. She is also the Founding Director of the Rice Center for Transforming Data to Knowledge, informally called the Rice D2K Lab. Dr. Allen’s research develops new statistical machine learning tools to help people make reliable data-driven discoveries. She is known for her methods and theory work in the areas of unsupervised learning, interpretable machine learning, data integration, graphical models, and high-dimensional statistics.  Her work is motivated by solving real scientific problems, especially in the areas of neuroscience and bioinformatics. Dr. Allen is also a leader in data science education.  In 2018, she founded the Rice D2K Lab, a campus hub for experiential learning and data science education. Through her leadership of the D2K Lab, Dr. Allen developed new interdisciplinary data science degree programs, established a novel capstone program in data science and machine learning, and led Rice’s engagement with corporate and community partners in data science. Dr. Allen is the recipient of several honors for both her research and educational efforts including a National Science Foundation Career Award, Rice University’s Duncan Achievement Award for Outstanding Faculty, the Curriculum Innovation Award, and the School of Engineering’s Research and Teaching Excellence Award.  In 2014, she was named to the “Forbes ’30 under 30′: Science and Healthcare” list.  She is also an elected member of the International Statistics Institute and an elected fellow of the American Statistical Association.  Dr. Allen currently serves as an Action Editor for the Journal of Machine Learning Research, an Associate Editor for the Journal of the American Statistical Association: Theory and Methods, and a Series Editor for Springer Texts in Statistics.  Dr. Allen received her Ph.D. in statistics from Stanford University, under the mentorship of Prof. Robert Tibshirani, and her bachelors, also in statistics, from Rice University. 

Malena Español

 Assistant Professor

       Arizona State University

Computational Methods for Solving Inverse Problems in Imaging

Video: Link 

Abstract: Discrete linear and nonlinear inverse problems arise from many different imaging systems. These problems are ill-posed, which means, in most cases, that the solution is very sensitive to the data. Because the data usually contain errors produced by different imaging system parts (e.g., cameras, sensors, etc.), robust and reliable regularization methods need to be developed for computing meaningful solutions. In some imaging systems, massive amounts of data are produced, making the data storage and computational cost of the inversion process intractable. In this talk, we will look at different imaging systems, formulate the corresponding mathematical models, introduce regularization methods, and show some numerical results.
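
As one standard instance of the regularization being described (an illustrative example, not necessarily the specific methods in the talk): a discrete linear inverse problem b = Ax + noise is commonly stabilized by Tikhonov regularization,

    \min_{x} \; \|Ax - b\|_2^2 + \lambda^2 \|Lx\|_2^2,

where L is, for example, the identity or a discrete derivative operator, and the regularization parameter \lambda trades data fit against sensitivity to noise; much of the practical difficulty lies in choosing \lambda and in making the solver scale to massive data.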


Bio: Malena Español is an Assistant Professor in the School of Mathematical and Statistical Sciences at Arizona State University. She has a Bachelor’s in Applied Mathematics from the University of Buenos Aires and a Master’s and PhD in Mathematics from Tufts University. Dr. Español was a Postdoctoral Fellow at the California Institute of Technology before starting a faculty position at The University of Akron, where she received tenure and was promoted to associate professor in 2018. Her research interests are in the development, analysis, and application of mathematical models and numerical methods for solving problems arising in science and engineering, with a focus on problems related to materials science, image processing, and medical applications. In 2018, she co-organized the Women in Math of Materials (WIMM) workshop and helped to create a research community for WIMM. Dr. Español was one of the editors of the Springer AWM Series volume “Research in the Mathematics of Materials Science,” which highlights the research work of women in this area and was published last year. Dr. Español is the lead organizer of AMIGAS, a new summer program for graduate students in applied and computational mathematics. Dr. Español was awarded the 2022 Karen EDGE Fellowship.

Duy H. N. Nguyen

 Associate Professor

              SDSU

Large-Scale Approximate Inference: Theory, Algorithms, and Applications

Video: 

Abstract:   A central task in statistical inference is the evaluation of the posterior distribution p(x|y) of the latent variables x given the observed data variables y. For many models of practical interest, it will be infeasible to evaluate the posterior distribution, because the dimensionality of the latent space is too high to work with directly or because the posterior distribution has a highly complex form. In these situations, we need to resort to approximate schemes. In this talk, I will review a range of deterministic approximation schemes, including approximate message passing (AMP) and variational Bayes (VB) inference, which scale well to large applications. These are based on analytical approximations to the posterior distribution, for example by assuming that it factorizes in a particular way. I will present the theory behind AMP and VB and some related computationally efficient algorithms in these schemes. Finally, I will present the applications of approximate inference in (sparse) linear regression, logistic regression, probit regression, compressed sensing, and MIMO signal estimation.
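
As a pointer to what "assuming the posterior factorizes" means in variational Bayes (standard mean-field notation, not specific to the talk): the posterior p(x|y) is approximated by q(x) = \prod_i q_i(x_i), and coordinate ascent updates each factor via

    \log q_i^{*}(x_i) = \mathbb{E}_{q_{-i}}\big[ \log p(x, y) \big] + \text{const},

where the expectation is taken over all factors except q_i; iterating these updates monotonically increases the evidence lower bound.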

Bio: Duy H. N. Nguyen (Senior Member, IEEE) is an Associate Professor in the Department of Electrical and Computer Engineering, San Diego State University. He received the B.Eng. from Swinburne University of Technology, Hawthorn, VIC, Australia in 2005, the M.Sc. from the University of Saskatchewan, Saskatoon, SK, Canada in 2009, and the Ph.D. from McGill University, Montreal, QC, Canada in 2013. He was a postdoctoral research fellow at INRS-EMT (University of Quebec), The University of Houston, and the University of Texas at Austin. He joined SDSU in 2016. His current research interests include resource allocation in wireless networks, signal processing for communications, optimization, game theory and machine learning. He received the NSF CAREER award in 2022.


Weizhi Li

Post-doc

Arizona State  University

     

Designing Two-Sample Tests with Local Information

Video: Link 

Abstract: Two-sample hypothesis testing is a fundamental tool to determine whether there is a significant difference between two groups of data. It is widely applied in decision-making contexts ranging from drug efficacy evaluation in drug discovery to policy assessment in government policymaking. Standard two-sample tests construct a statistic that is a global characterization of the data in order to make a decision; this can adversely impact the utility of a two-sample test when the “signal-to-noise ratio” of the dataset is small or the available data is insufficient. In this talk, I will address how to integrate local information of the data into the design of two-sample tests, in the settings of batch and sequential active learning (i.e., features are inexpensive, but labels are costly) and the standard batch setting (both features and labels can be readily accessed), to increase the utility of the two-sample tests. Throughout this talk, I will highlight two-sample tests that are based on graph, information-theoretic, and betting-theoretic measures.
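
For readers who want a concrete baseline to contrast with the local-information tests in the talk, below is a minimal sketch of a standard permutation two-sample test built around a single global statistic (the difference of sample means); the statistic and the synthetic data are illustrative choices only.

    import numpy as np

    def permutation_two_sample_test(x, y, n_perm=10_000, seed=0):
        """Permutation p-value for H0: x and y come from the same distribution,
        using the difference of sample means as a (global) test statistic."""
        rng = np.random.default_rng(seed)
        pooled = np.concatenate([x, y])
        n_x = len(x)
        observed = abs(x.mean() - y.mean())
        count = 0
        for _ in range(n_perm):
            perm = rng.permutation(pooled)
            stat = abs(perm[:n_x].mean() - perm[n_x:].mean())
            count += stat >= observed
        return (count + 1) / (n_perm + 1)

    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=100)
    y = rng.normal(0.3, 1.0, size=100)
    print(permutation_two_sample_test(x, y))

The tests highlighted in the talk replace this single global characterization with statistics that exploit local structure in the data and, in the active-learning settings, decide which labels to query.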

Bio: Weizhi Li is a post-doc at Arizona State University, working with Dr. Visar Berisha and Dr. Gautam Dasarathy. He obtained his M.S. in Electrical Engineering in 2017 from Texas A&M University and his Ph.D. in Computer Engineering in 2022 from Arizona State University. He was supervised by Dr. Visar Berisha and Dr. Gautam Dasarathy in his Ph.D. study. His research interests include active learning and statistical hypothesis testing with the aim of designing data-efficient decision-making algorithms.


Bane Vasic

Professor

 University of Arizona 

     

Quantum Error Correction: Introduction and Latest Developments in QLDPC Codes 

Video: Link 

Abstract: A fundamental concept of the mathematical theory of information laid down by Shannon is that of error correcting codes. Error correction codes play a vital role in ensuring the integrity of data in systems exposed to noise or errors. Classical error correcting codes were crucial to the success of modern communications and data storage systems (from the Internet to mobile, satellite and deep-space communications, and from disk to flash memory storage) and found applications in other areas, such as pattern recognition, group testing, cryptography, or fault-tolerant (FT) computing. Likewise, quantum error correction (QEC) codes are at the heart of all quantum information processing, from FT quantum computing to reconciliation in quantum key distribution, quantum sensing, and reliable optical communications. However, unlike classical coding theory, which is a mature and established discipline, quantum codes are still a subject of extensive research. The importance of QEC is that it is the only presently known gateway to reap the benefits of computational quantum algorithms, but a robust and scalable QEC has not yet been demonstrated experimentally. Arguably, QEC is the only technology still lacking to realize a vision of useful large-scale quantum computation, and its development is pursued by many research groups in academia, national labs, and industry. One of the most promising solutions is based on quantum low-density parity check (QLDPC) codes, which are the only known class of quantum codes in the stabilizer family with asymptotically nonzero rates and that support fault-tolerant operation using noisy quantum gates. This talk will present an overview of the research in QLDPC codes. It is prepared for classical communications theory researchers, and no background in quantum mechanics or error correction is required.
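
As a small piece of background connecting classical and quantum codes (standard material, not specific to the talk): in the CSS construction, a stabilizer code is specified by two classical binary parity-check matrices H_X and H_Z, one handling bit-flip and the other phase-flip errors, and the stabilizer generators commute provided

    H_X H_Z^{\top} = 0 \pmod{2}.

QLDPC codes are stabilizer codes in which these parity-check matrices are sparse, which is what enables low-complexity iterative decoding reminiscent of classical LDPC codes.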


Bio: Dr. Bane Vasic is a Professor of Electrical and Computer Engineering and Mathematics at the University of Arizona and a Director of the Error Correction Laboratory. He is an inventor of the soft error-event decoding algorithm for intersymbol interference channels with correlated noise, and the key architect of a detector/decoder for Bell Labs data storage read channel chips which were regarded as the best in industry. His pioneering work on structured low-density parity check (LDPC) error correcting codes based on combinatorial designs has enabled low-complexity iterative decoder implementations. Structured LDPC codes are today adopted in a number of communications standards and data storage systems. Dr. Vasic’s work on codes on graphs, trapping sets and error floor of iterative decoding algorithms has led to decoders for the binary symmetric channel with best error-floor performance known today. Dr. Vasic is a PI on a Department of Energy multi-university $115M project led by Fermi National Laboratory to establish a Center for Superconducting Materials and Systems. He is a co-PI of the $52M NSF Center for Quantum Networks hosted at the University of Arizona, and is involved in the University of Arizona Quantum Hub, a group of researchers leading the effort to establish a graduate program in quantum information science and engineering at UArizona. He is also funded by NASA-Jet Propulsion Laboratory through the Strategic University Partnership Program for the development of quantum codes and error correction algorithms for NASA space missions, and is a PI on seven research grants funded by the National Science Foundation. He is a founder of Codelucida, a company developing advanced error correction solutions for communications and data storage. Recently, Codelucida has received numerous innovation awards including the 2017 Arizona Innovation Challenge Award from the Arizona Commerce Authority, the 2018 I-Squared Startup of the Year from Tech Launch Arizona, and the Best of Show Award for the Most Innovative Flash Memory Technology at the 2019 Flash Memory Summit, the world's largest exhibition of flash memory technologies. Codelucida is a Xilinx Partner providing LDPC Codec IP cores for flash memory controllers. He is an IEEE Fellow, Fulbright Scholar, da Vinci Fellow, and a past Chair of the IEEE Data Storage Technical Committee.


Angelia Nedic

Professor

     Arizona State University

     

Characterizing Trust and Resilience in Distributed Optimization for Cyberphysical Systems 

Video: Link 

Abstract: This talk considers the problem of resilient distributed multi-agent optimization for cyberphysical systems in the presence of malicious or non-cooperative agents. It is assumed that stochastic values of trust between agents are available, which allows agents to learn their trustworthy neighbors simultaneously with performing updates to minimize their own local objective functions. The development of this trustworthy computational model combines tools from statistical learning and distributed consensus-based optimization. Specifically, we derive a unified mathematical framework to characterize convergence, deviation of the consensus from the true consensus value, and expected convergence rate, when there exists additional information of trust between agents. We show that under certain conditions on the stochastic trust values and consensus protocol: 1) almost sure convergence to a common limit value is possible even when malicious agents constitute more than half of the network, 2) the deviation of the converged limit from the nominal no-attack case, i.e., the true consensus value, can be bounded with probability that approaches 1 exponentially, and 3) correct classification of malicious and legitimate agents can be attained in finite time almost surely. Further, the expected convergence rate decays exponentially with the quality of the trust observations between agents. We then combine this trust-learning model within a distributed gradient-based method for solving a multi-agent optimization problem and characterize its performance.
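
A schematic form of the update being analyzed may help fix ideas (generic notation; the precise trust model and weights are as defined in the talk and the underlying paper): each legitimate agent i maintains a trusted neighbor set \mathcal{N}_i^k learned from accumulated stochastic trust observations and performs

    x_i^{k+1} = \sum_{j \in \mathcal{N}_i^k \cup \{i\}} w_{ij}^k \, x_j^k \;-\; \alpha_k \nabla f_i\big(x_i^k\big),

so that consensus weights w_{ij}^k are placed only on neighbors currently deemed trustworthy while each agent also descends its own local objective f_i.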


Bio: Angelia Nedić has a Ph.D. from Moscow State University, Moscow, Russia, in Computational Mathematics and Mathematical Physics (1994), and a Ph.D. from the Massachusetts Institute of Technology, Cambridge, USA, in Electrical Engineering and Computer Science (2002). She has worked as a senior engineer at BAE Systems North America, Advanced Information Technology Division, in Burlington, MA. Currently, she is a faculty member of the School of Electrical, Computer and Energy Engineering at Arizona State University, Tempe. Prior to joining Arizona State University, she was a Willard Scholar faculty member at the University of Illinois at Urbana-Champaign. She is a recipient (jointly with her co-authors) of the Best Paper Award at the Winter Simulation Conference 2013 and the Best Paper Award at the International Symposium on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt) 2015. Her general research interest is in optimization, large scale complex systems dynamics, variational inequalities and games.

Yuan Wang

Assistant Professor

     University of South Carolina

     

Topological Clustering and Inference of Brain Networks

Video: 

Abstract:  Topological data analysis (TDA) has motivated the exploration of mesoscale features in brain signals and networks. Persistent homology is a key TDA algorithm for decoding and representing these features via persistence descriptors such as persistence diagram (PD), whose statistical significance is often revealed through permutation testing. But testing on PDs is challenging due to the heterogeneity of points in the diagrams that encode the birth and death times of features through a dynamic filtration of the subnetworks. This talk will showcase a topological clustering and transposition-based permutation testing framework based on heat-diffusion estimates of PDs to resolve computational bottlenecks of heavy permutations, with applications to the comparison of brain networks constructed from neuroimaging data.


Bio: Dr. Wang is an assistant professor of biostatistics at the University of South Carolina. Her research has been focused on developing statistical inference methods for topological signal processing and network analysis, with applications to neuroimaging data in epilepsy and post-stroke aphasia. She is also collaborating on NIH-funded projects decoding patterns in wearable sensor signals.

Christian Arenz

Assistant Professor

     Arizona State University

        29th April 2023

On the convergence of hybrid quantum-classical algorithms

 Video: Link 

Abstract: The development of prototype quantum computers has sparked a wave of research into algorithms to best leverage these new resources. A major focus has been on hybrid quantum-classical algorithms, which have been designed for applications including combinatorial optimization and quantum simulation. The overarching premise of hybrid quantum-classical algorithms is to use quantum and classical computing resources in tandem to solve a problem that is formulated as an optimization problem. A quantum computer is used to evaluate the associated objective function by executing a parameterized quantum circuit, and a classical computer is used to adjust the circuit parameters so as to optimize the objective function and solve the problem at hand.

The performance and utility of hybrid quantum-classical algorithms are tightly linked to our ability to solve the classical optimization problem, and this can be a major challenge: the optimization problem is typically non-convex, and convergence to suboptimal solutions can be a significant concern. In this talk, I present several ways forward for addressing this challenge. In particular, I show that overparameterization can improve convergence to the global optimum and argue that randomized adaptive strategies, where quantum circuits are grown according to local gradient information, converge to the global optimum almost surely. I will conclude by outlining further challenges and opportunities related to the existence of barren plateaus in the optimization landscape, where gradients can become exponentially small due to the concentration of measure phenomenon.
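
To make the hybrid loop concrete, here is a schematic sketch in plain Python; estimate_objective is a hypothetical placeholder standing in for executing the parameterized circuit on hardware or a simulator, and the finite-difference gradient is only one of many possible classical update rules.

    import numpy as np

    def estimate_objective(theta):
        """Placeholder: in a real hybrid algorithm this would prepare the
        parameterized circuit with angles `theta`, measure, and return an
        estimate of the objective (e.g., an energy or cost expectation)."""
        return float(np.sum(np.sin(theta) ** 2))   # toy stand-in landscape

    def hybrid_loop(theta, lr=0.1, eps=1e-3, n_iter=200):
        """Classical outer loop: estimate gradients of the quantum-evaluated
        objective by finite differences and take gradient-descent steps."""
        for _ in range(n_iter):
            grad = np.zeros_like(theta)
            for i in range(len(theta)):
                shift = np.zeros_like(theta)
                shift[i] = eps
                grad[i] = (estimate_objective(theta + shift)
                           - estimate_objective(theta - shift)) / (2 * eps)
            theta = theta - lr * grad
        return theta, estimate_objective(theta)

    theta0 = np.random.default_rng(0).uniform(0, np.pi, size=4)
    print(hybrid_loop(theta0))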


Bio: Christian Arenz joined Arizona State University as an assistant professor in the School of Electrical, Computer and Energy Engineering in January 2022. Prior to joining ASU he was a lecturer and an associate research scholar at Princeton University. Previously, he completed his PhD in applied mathematics at Aberystwyth University in 2016, where he focused on the control of open and noisy quantum systems. He completed his master’s degree equivalent in theoretical physics from Saarland University in 2012, where he studied quantum optical systems. Christian’s current research centers on using tools from control theory to advance quantum information science. His work targets applications such as the design of robust and efficient controls for quantum computing, and the development of quantum algorithms for optimization and machine learning tasks.

Christos Thrampoulidis

Assistant Professor

         University of British Columbia

        22nd April 2023

Implicit Geometry of Deep-net Classifiers: Imbalance Trouble and How to Fix it

 Video: Link 

Abstract: What are the unique structural properties of models learned by deep nets? Is there an implicit bias towards solutions of a certain geometry? How does this vary across training instances, architectures, and data? Towards answering these questions, the recently discovered Neural Collapse phenomenon formalizes simple geometric properties of the learned embeddings and of the classifiers, which appear to be “cross-situational invariant” across architectures and different balanced classification datasets. But what happens when classes are imbalanced? Is there an (ideally equally simple) description of the geometry that is invariant across class-imbalanced datasets? By characterizing the global optima of an unconstrained-features model, we formalize a new geometry that remains invariant across different imbalance levels. Importantly, it, too, has a simple description despite the asymmetries imposed by data imbalances on the geometric properties of different classes. Overall, we show that it is possible to extend the scope of the neural-collapse phenomenon to a richer class of geometric structures. We also motivate further investigations into the impact of class imbalances on the implicit bias of first-order methods and into the potential connections between such geometric structures and generalization.

This is joint work with Tina Behnia, Ganesh Kini and Vala Vakilian.
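
For reference, the balanced-data geometry that Neural Collapse describes (a standard formulation, included only as background for the imbalanced case discussed above): after centering at the global mean and normalizing, the class-mean embeddings \tilde{\mu}_1, \dots, \tilde{\mu}_K form a simplex equiangular tight frame,

    \|\tilde{\mu}_i\| = 1, \qquad \langle \tilde{\mu}_i, \tilde{\mu}_j \rangle = -\frac{1}{K-1} \;\; \text{for } i \neq j,

with the classifier vectors aligned to the class means; the talk characterizes how this picture deforms, yet remains simply describable, under class imbalance.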


Bio: Dr. Thrampoulidis is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of British Columbia. Previously, he was an Assistant Professor at the University of California, Santa Barbara and a Postdoctoral Researcher at MIT. He received an M.Sc. and a Ph.D. in Electrical Engineering in 2012 and 2016, respectively, both from Caltech, with a minor in Applied and Computational Mathematics. In 2011, he received a Diploma in ECE from the University of Patras, Greece. His research is on machine learning, high-dimensional statistics and optimization.

Wei Yu

 Professor

University of Toronto

15th April 2023

Active Learning for Communication and Sensing

 Video: Link 

Abstract: Machine learning will play an important role in the optimization of future-generation physical-layer wireless communication systems, for the following two reasons. First, traditional wireless communication design always relies on the channel model, but models are only an approximation to reality. In wireless environments where the modelling task is complex and the channels are costly to estimate, a learning-based approach can significantly outperform the traditional model-based approaches. Second, modern wireless communication design often involves optimization problems that are high-dimensional, nonconvex, and difficult to solve efficiently. By exploiting the availability of training data, a neural network may be able to learn the solution of an optimization problem directly, which can lead to a more efficient way to solve nonconvex optimization problems. In this talk, I will use examples from the active sensing and localization problems for the reconfigurable intelligent surface (RIS) system and the initial beam alignment problem for the mmWave massive MIMO system to illustrate the benefit of learning-based physical-layer communication system design.


Bio: Wei Yu is a Professor in the Electrical and Computer Engineering Department at the University of Toronto, where he holds a Canada Research Chair in Information Theory and Wireless Communications. He received the B.A.Sc. degree in Computer Engineering and Mathematics from the University of Waterloo, and the M.S. and Ph.D. degrees in Electrical Engineering from Stanford University. He is a Fellow of the IEEE and a Fellow of the Canadian Academy of Engineering. He received the Steacie Memorial Fellowship in 2015, the IEEE Marconi Prize Paper Award in Wireless Communications in 2019, the IEEE Communications Society Award for Advances in Communication in 2019, the IEEE Signal Processing Society Best Paper Award in 2008, 2017 and 2021, and the IEEE Communications Society Best Tutorial Paper Award in 2015. Prof. Wei Yu served as the President of the IEEE Information Theory Society in 2021.

Rong Pan

 Associate Professor

          ASU

7th April 2023

Physics-Informed Neural Network and its Applications to Digital Twin

 Video:  Link 

Abstract: Dynamics of physical phenomena are modeled by ordinary differential equations and partial differential equations, whose solutions are typically not easy to find. A recent trend is to solve these equations by data-driven approaches through machine learning, such as Gaussian processes or neural networks. Physics-informed neural networks (PINNs) are a type of neural network that incorporates physical laws or principles into its architecture, thus forcing the neural network prediction to abide by physical laws. In this talk, I will discuss a 3D printing process called direct ink writing (DIW) and how PINNs can be used for predicting the fluid flow inside the printer nozzle, which directly affects the quality of the 3D printing process. I will discuss several neural network architectures and compare their performance.
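
A minimal sketch of the PINN idea in PyTorch, written for a generic 1D diffusion equation u_t = nu * u_xx rather than the DIW nozzle-flow model from the talk; the network size, collocation points, and coefficient are illustrative assumptions.

    import torch

    # Small network u_theta(x, t): input (x, t), output u.
    net = torch.nn.Sequential(
        torch.nn.Linear(2, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1),
    )

    def pde_residual(xt, nu=0.01):
        """Residual of u_t - nu * u_xx at collocation points xt = [x, t]."""
        xt = xt.clone().requires_grad_(True)
        u = net(xt)
        du = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
        u_x, u_t = du[:, 0:1], du[:, 1:2]
        u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, 0:1]
        return u_t - nu * u_xx

    # One training step on the physics loss over random collocation points.
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    xt_colloc = torch.rand(256, 2)                 # random (x, t) in [0,1]^2
    loss = pde_residual(xt_colloc).pow(2).mean()   # physics-informed term
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(float(loss))

A data-fit term on any available measurements, plus initial and boundary terms, would be added to the same loss, which is what ties the network to a particular physical scenario.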


Bio: Dr. Rong Pan is an associate professor of Industrial Engineering in the School of Computing and Augmented Intelligence at Arizona State University. His research interests include failure time data analysis, design of experiments, multivariate statistical process control, time series analysis, and computational Bayesian methods. His research has been supported by NSF, Arizona Science Foundation, Air Force Research Lab, etc. He has published over 90 journal papers and 50+ refereed conference papers. Dr. Pan is a senior member of ASQ, IIE and IEEE, and a lifetime member of SRE. He serves on the editorial boards of Journal of Quality Technology and Quality Engineering.

Zelda Mariet

  Senior Research Scientist

            Google Brain

  31st March 2023

Ensembles of Classifiers: a Bias-Variance Perspective

 Video: Link 

Abstract: Ensembles are a straightforward, remarkably effective method for improving the accuracy, calibration, and robustness of neural networks on classification tasks. Yet, the reasons underlying their success remain an active area of research. Building upon (Pfau, 2013), we turn to the bias-variance decomposition of Bregman divergences in order to gain insight into the behavior of ensembles under classification losses. Introducing a dual reparameterization of the bias-variance decomposition, we first derive generalized laws of total expectation and variance, then discuss how bias and variance terms can be estimated empirically. Next, we show that the dual reparameterization naturally introduces a way of constructing ensembles which reduces the variance and leaves the bias unchanged. Conversely, we show that ensembles that directly average model outputs can arbitrarily increase or decrease the bias. Empirically, we see that such ensembles of neural networks may reduce the bias.
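
As background, the familiar squared-loss special case of the decomposition reads

    \mathbb{E}_{D}\big[(y - f_D(x))^2\big] = (y - \bar{f}(x))^2 + \mathbb{E}_{D}\big[(\bar{f}(x) - f_D(x))^2\big], \qquad \bar{f}(x) = \mathbb{E}_{D}[f_D(x)],

where D denotes the randomness across ensemble members (data, initialization, and so on), the first term is the (squared) bias and the second the variance. For general Bregman divergences, such as the cross-entropy losses used in classification, an analogous decomposition holds when the central model \bar{f} is formed by averaging in dual coordinates, which is what the dual reparameterization in the talk makes precise.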

Bio: Zelda Mariet is a senior research scientist at Google Brain. She got her PhD at MIT, working with Suvrit Sra as a member of the Machine Learning and Learning and Intelligent Systems groups. Her research focuses on identifying precise mathematical definitions of diversity to understand the behavior of ML models, e.g., under distribution shift. Her PhD work focused on negatively dependent measures, which use Strongly Rayleigh polynomials to encode desirable properties for diversity modeling.

Anand Sarwate

  Professor

     Rutgers University

  24th March 2023

Federated Learning and Privacy in Collaborative Research Systems

 Video:  Link 

Abstract: Decentralized learning has been rebranded as “federated learning” with the advent of large-scale ML/AI models used in commercial technologies like mobile phones and smart home devices. In these systems, devices send summaries of locally collected data to a central aggregator, which then updates the ML model and sends the updated model back to the devices. In such a system, depending on the trust model, sites can locally perturb their summaries to guarantee some form of differential privacy, a statistical framework for measuring privacy risk. Many analyses and algorithms focus on scenarios with a large number of data collectors (such as phones), where the large sample size can compensate for the privacy-preserving perturbation. In the “cross-silo” model for federated learning, a small number of sites with moderate amounts of data can collaborate to learn a model while leaving the data at the sites. This model fits situations in human health research, where a consortium of research teams or centers may wish to collaborate. In this talk I describe ongoing work in building a collaborative research system for neuroimaging data and show the kinds of studies which can be performed using this system. After an introduction to the basics of differential privacy, I will describe how it can be applied in these settings and some strategies for mitigating the privacy loss under additional assumptions for the trust model. I will also discuss the major challenges in deploying privacy for these systems from a technical and policy perspective.

Joint work with: H. Imtiaz, J. Mohammadi, Y. Tao, B. Baker, A. Abrol, R.F. Silva, E. Damaraju, V.D. Calhoun, and S.M. Plis
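
To make the local perturbation step concrete, here is a minimal sketch of one round of cross-silo aggregation in which each site clips its model update and adds Gaussian noise before sharing; the clipping norm, noise scale, and update shapes are illustrative, and calibrating the noise to a formal (epsilon, delta) guarantee requires the usual Gaussian-mechanism accounting.

    import numpy as np

    def privatize_update(update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
        """Clip an update to bounded L2 norm, then add Gaussian noise
        (local perturbation before sending to the aggregator)."""
        rng = rng or np.random.default_rng()
        norm = np.linalg.norm(update)
        clipped = update * min(1.0, clip_norm / (norm + 1e-12))
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
        return clipped + noise

    rng = np.random.default_rng(0)
    site_updates = [rng.standard_normal(10) for _ in range(5)]   # 5 sites
    noisy = [privatize_update(u, rng=rng) for u in site_updates]
    aggregated = np.mean(noisy, axis=0)   # aggregator averages noisy updates
    print(aggregated)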


Bio: Anand D. Sarwate is an Associate Professor in the Department of Electrical and Computer Engineering at Rutgers, the State University of New Jersey. He received a B.S. degree in Electrical Science and Engineering and a B.S. degree in Mathematics from MIT in 2002, an M.S. in Electrical Engineering from UC Berkeley in 2005 and a PhD in Electrical Engineering from UC Berkeley in 2008. From 2008-2011 he was a postdoctoral researcher at the Information Theory and Applications Center at UC San Diego and from 2011-2013 he was a Research Assistant Professor at the Toyota Technological Institute at Chicago. Prof. Sarwate received the Rutgers Board of Trustees Research Fellowship for Scholarly Excellence in 2020, the A. Walter Tyson Assistant Professor Award from the Rutgers School of Engineering in 2018, and an NSF CAREER award in 2015. His interests are in information theory, machine learning, and signal processing, with applications to distributed systems, privacy and security, and biomedical research.

Jeffrey Andrews

  Professor

        UT Austin

    17th March 2023

Unlocking new capacity in 6G cellular systems via site-specific ML-aided design

 Video:  Link 

Abstract: 6G cellular networks will be extremely complex systems that must meet many competing requirements in a large variety of environments and use cases. A key enabling 6G technology will be deep learning, which can unlock previously hidden system-level gains, particularly by effectively learning custom site-specific communication techniques that are optimally adapted for each cell site. I will present a short summary of some of my group’s recent discoveries and technologies based on deep learning (DL) that demonstrate a large potential impact in 6G. The first is a novel and practical approach for beam alignment in millimeter wave and THz bands that can achieve a phenomenal speed-up — at least 10x and in some cases approaching 1000x — by sensing and exploiting unique aspects of the environment. The second is a new architecture for estimating high-dimensional channels by harnessing the expressive power of deep generative networks to develop a custom model of each cell’s channel distribution. The third is a site-specific multicell optimization that rapidly learns near-optimal global settings for each base station's antenna arrays, a problem that is completely intractable using conventional techniques. It is trained and tested on AT&T’s commercial simulation platform and shows large gains over existing 3GPP approaches.


Bio: Jeffrey Andrews is the Truchard Family Endowed Chair in Engineering at the University of Texas at Austin, where he is Director of 6G@UT. He received the B.S. in Engineering with High Distinction from Harvey Mudd College, and the M.S. and Ph.D. in Electrical Engineering from Stanford University. Dr. Andrews is an IEEE Fellow and ISI Highly Cited Researcher and has been co-recipient of 15 best paper awards including the 2016 IEEE Communications Society & Information Theory Society Joint Paper Award, the 2014 IEEE Stephen O. Rice Prize, the 2014 and 2018 IEEE Leonard G. Abraham Prize, the 2011 and 2016 IEEE Heinrich Hertz Prize, and the 2010 IEEE ComSoc Best Tutorial Paper Award. His other major awards include the 2015 Terman Award, the NSF CAREER Award, the 2021 Gordon Lepley Memorial Teaching Award, the 2021 IEEE ComSoc Joe LoCicero Service Award, the IEEE ComSoc Wireless Communications Technical Committee Recognition Award, and the 2019 IEEE Kiyo Tomiyasu Technical Field Award. His former PhD students include five IEEE Fellows, several professors at top universities in the USA, Asia, and Europe, and industry leaders on LTE and 5G systems, on which they collectively hold thousands of US patents.

Sean Meyn

  Professor

  University of Florida

                  3rd March 2023

Who is Q?

 Video:    Link 

Abstract: One theoretical foundation of reinforcement learning is optimal control, usually the Markovian variety known as Markov decision processes (MDPs). The MDP model consists of a state process, an action (or input) process, and a one-step cost function that is a function of state and action. The goal is to obtain a policy (function from states to actions) that is optimal in some predefined sense. Chris Watkins introduced the Q-function in the 1980s as part of a methodology for reinforcement learning. Given its importance for over three decades, it is not surprising that the question of the true meaning of Q was a hot topic for discussion during the Simons Institute's Fall 2020 program on Theory of Reinforcement Learning. In this lecture we discover the truth about Q’s origins, and what has happened since.

We’ve all heard about the magic of Q: consider AlphaZero and ChatGPT. As we review the foundations of the reinforcement universe, you may share the speaker's amazement that Q-learning is ever successful! This invites many research questions: why does Q-learning result in successful solutions for decision and control? How can we create new approaches to reinforcement learning that are efficient in terms of training, and also provide some estimate of policy performance?

The lecture draws on Chapters 5 and 9 of the new monograph,  Control Systems and Reinforcement Learning,  as well as recent papers on Convex Q-Learning and Logistic Q-Learning.  
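
For readers meeting Q for the first time: in a discounted MDP with reward r and discount factor \gamma, Watkins' tabular Q-learning performs the stochastic-approximation update

    Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha_t \Big[ r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \Big].

This is the reward-maximization convention; control-theoretic treatments often work with one-step costs and minimization instead, but the structure of the update is the same.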


Bio: Sean Meyn was raised by the beach in Southern California. Following his BA in mathematics at UCLA, he moved on to pursue a PhD with Peter Caines at McGill University. After about 20 years as a professor of ECE at the University of Illinois, in 2012 he moved to beautiful Gainesville. He is now Professor and Robert C. Pittman Eminent Scholar Chair in the Department of Electrical and Computer Engineering at the University of Florida, and director of the Laboratory for Cognition and Control. He also holds an Inria International Chair to support research with colleagues in France. His interests span many aspects of stochastic control, stochastic processes, information theory, and optimization. For the past decade, his applied research has focused on engineering, markets, and policy in energy systems.

Waheed U. Bajwa

  Professor

  Rutgers University

    24th February 2023

FAST-PCA: A Scalable Algorithm for Distributed Principal Component Analysis

 Video:   Link 

Abstract: Principal Component Analysis (PCA) is considered to be a quintessential data preprocessing tool in many machine learning applications. But the high dimensionality and massive scale of data in several of these applications mean that traditional centralized PCA solutions are fast becoming irrelevant for them. Distributed PCA, in which a multitude of interconnected computing devices collaborate among themselves in order to obtain the principal components of the data, is a typical approach to overcome the limitations of the centralized PCA solutions. The focus in this talk is on the distributed PCA problem when the data are distributed among computing devices whose interconnections correspond to an ad-hoc topology. Such a setup, which corresponds to the Internet-of-Things, vehicular networks, mobile edge computing, etc., has been considered in a few recent works on distributed PCA. But the resulting solutions either overlook the uncorrelated feature learning aspect of the PCA problem, tend to have high communications overhead that makes them unscalable, and/or lack 'exact' or 'global' convergence guarantees. In order to overcome these limitations, this talk introduces two closely related variants of a new and scalable distributed PCA algorithm, termed FAST-PCA (Fast and exAct diSTributed PCA), that is efficient in terms of communications because of its one time-scale nature. The proposed FAST-PCA algorithm is theoretically shown to converge linearly and exactly to the principal components, leading to dimension reduction as well as uncorrelated features for machine learning, while extensive numerical experiments on both synthetic and real data highlight its superiority over existing distributed PCA algorithms.


Bio: Waheed U. Bajwa, whose research interests include statistical signal processing, high-dimensional statistics, machine learning, inverse problems, and networked systems, is currently a professor and graduate program director in the Department of Electrical and Computer Engineering and a member of the graduate faculty of the Department of Statistics at Rutgers University. Additionally, he has held positions at Princeton University, Duke University, and different technology startups.


Dr. Bajwa has received several research and teaching awards including the Army Research Office Young Investigator Award (2014), the National Science Foundation CAREER Award (2015), Rutgers Presidential Merit Award (2016), Rutgers Presidential Fellowship for Teaching Excellence (2017), Rutgers Engineering Governing Council ECE Professor of the Year Award (2016, 2017, 2019), Rutgers Warren I. Susman Award for Excellence in Teaching (2021), and Rutgers Presidential Outstanding Faculty Scholar Award (2022). He is a co-investigator on a work that received the Cancer Institute of New Jersey’s Gallo Award for Scientific Excellence in 2017, a co-author on papers that received Best Student Paper Awards at IEEE IVMSP 2016 and IEEE CAMSAP 2017 workshops, and a Member of the Class of 2015 National Academy of Engineering Frontiers of Engineering Education Symposium.

Parv Venkitasubramaniam

  Professor

  Lehigh University

    10th February 2023

Domain Agnostic Extraction of Sensitive Information

 Video:   Link 

Abstract: In this talk, we’ll discuss a paradigm for the protection of sensitive inferences drawn from data streams, with an added emphasis on extracting these inferences with limited domain knowledge. From a privacy perspective, we will discuss an alternative to end-to-end encryption of entire data streams, or noise-addition based privatization mechanisms. It relies on the notion that the shared raw data are not themselves sensitive except for the inferences that can be drawn from them. If these inferences can be “extracted” from the data, then a high-rate irrelevant stream is guaranteed to provide perfect privacy for the underlying inference without any additional protection. We will present theoretical results, and practical demonstrations on different datasets where trained deep learning based classifiers are shown to fail on the unprotected data. We will discuss further the role of quantization as a means to gain privacy, with novel variations on Lloyd’s algorithm for privacy-sensitive quantization. In the last part of the talk we will focus on a non-private setting, where inferences need to be derived when underlying system models are sensitive. These inferences could be critical to the functioning of sensitive systems, and our methods will focus on semi-supervised learning as a means to derive inferences without any prior knowledge of the system or the possible inferences that exist therein.


Bio: Parv Venkitasubramaniam is a Professor in the Electrical and Computer Engineering Department at Lehigh University and a Founding Director of the new MS Program in Data Science offered by the P. C. Rossin College of Engineering and Applied Sciences. His current research interests are in developing theoretical foundations for information security in cyber physical systems, and data driven inference and control of engineering systems. His research spans different topics under this umbrella including privacy in inferential learning, interactive and dynamical systems, cybersecurity in the smart electric grid and privacy from timing analysis in networked systems. He is a recipient of the CAREER Award from the National Science Foundation and a Leonard G. Abraham Award from the IEEE Communications Society.


He received a B.Tech in electrical engineering from the Indian Institute of Technology, Madras in 2002, and his M.S. and Ph.D. degrees in electrical engineering in 2005 and 2007, respectively, from Cornell University. From 2007 to 2009, he was a visiting post-doctoral fellow at the University of California, Berkeley.



Nicholas Lanchier

  Professor

            ASU

    3rd February

                    2023

Consensus and discordance in the Axelrod model for the dynamics of cultures

 Video:  Link 

Abstract: The Axelrod model is a spatial stochastic model for the dynamics of cultures which includes two important social components: homophily, the tendency of individuals to interact more frequently with individuals who are more similar, and social influence, the tendency of individuals to become more similar when they interact. Each individual is characterized by a collection of opinions about different issues, and pairs of neighbors interact at a rate equal to the number of issues for which they agree, which results in the interacting pair agreeing on one more issue. This model has been extensively studied during the past 20 years based on numerical simulations and heuristic arguments while there is a lack of analytical results. This talk gives rigorous fluctuation and fixation results for the one-dimensional system that sometimes confirm and sometimes refute some of the conjectures formulated by applied scientists.
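For readers unfamiliar with the dynamics, the interaction rule described above is simple enough to simulate directly; here is a toy one-dimensional simulation (all parameter values and the function name are hypothetical, and the talk's results are analytical rather than simulation-based).

    import numpy as np

    def axelrod_1d(n=100, issues=5, opinions=3, steps=100_000, seed=0):
        # Toy jump-chain simulation of the 1D Axelrod model: neighboring pairs
        # interact at a rate equal to the number of issues they agree on, and an
        # interaction makes them agree on one more issue.  Fully agreeing pairs
        # are skipped since their interactions have no effect.
        rng = np.random.default_rng(seed)
        state = rng.integers(opinions, size=(n, issues))
        for _ in range(steps):
            agree = np.array([(state[i] == state[i + 1]).sum() for i in range(n - 1)])
            active = (agree > 0) & (agree < issues)
            if not active.any():
                break                                      # absorbing configuration
            rates = np.where(active, agree, 0).astype(float)
            i = rng.choice(n - 1, p=rates / rates.sum())   # pick a pair ∝ its rate
            f = rng.choice(np.flatnonzero(state[i] != state[i + 1]))
            if rng.random() < 0.5:                         # one neighbor adopts the
                state[i, f] = state[i + 1, f]              # other's opinion on one
            else:                                          # disagreeing issue
                state[i + 1, f] = state[i, f]
        return state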


Bio: Nicholas Lanchier has been a Professor of Mathematics in the School of Mathematical and Statistical Sciences and Honors Faculty at Arizona State University since 2021. He obtained his PhD from the University of Rouen (France) in 2005. He did a postdoc at the University of Minnesota from 2005 to 2007 and joined the school of math in 2007 as an Assistant Professor. He created a YouTube channel during COVID with 250+ videos in probability theory at various levels. The following is the link to the probability course: Link

R. Srikant

  Professor

            UIUC

    11th November

                    2022

The Role of Lookahead and Approximate Policy Evaluation in Reinforcement Learning

 Video:  Link 

Abstract: When the sizes of the state and action spaces are large, solving MDPs can be computationally prohibitive even if the probability transition matrix is known. So in practice, a number of techniques are used to approximately solve the dynamic programming problem, including lookahead, approximate policy evaluation using an m-step return, and function approximation. Efroni et al. (2019) studied the impact of lookahead on the convergence rate of approximate dynamic programming. However, these convergence results can change dramatically when function approximation is used. Specifically, we show that when linear function approximation is used to represent the value function, a certain minimum amount of lookahead and multi-step return is needed for the algorithm to even converge. And when this condition is met, we characterize the finite-time performance of policies obtained using approximate policy iteration. Our results are presented for two different procedures to compute the function approximation: linear least-squares regression and gradient descent. Joint work with Anna Winnicki.
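In standard notation (not specific to the speaker's results), the two ingredients can be written schematically as an m-step return used for approximate policy evaluation and an H-step lookahead used for policy improvement, with a linear value-function approximation V_\theta(s) = \theta^{\top} \phi(s):

    \hat{V}_m(s) = \mathbb{E}\Big[\textstyle\sum_{t=0}^{m-1} \gamma^t r_t + \gamma^m V_\theta(s_m) \,\Big|\, s_0 = s\Big],
    \qquad
    \pi_{k+1}(s) \in \arg\max_a \, \mathbb{E}\Big[\textstyle\sum_{t=0}^{H-1} \gamma^t r_t + \gamma^H \hat{V}_m(s_H) \,\Big|\, s_0 = s,\ a_0 = a\Big].

The results described in the abstract quantify how large the lookahead H and the return length m must be for such iterations to even converge under linear function approximation.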

Bio: R. Srikant is a Grainger Distinguished Chair in Engineering and Professor of Electrical and Computer Engineering and the Coordinated Science Lab at the University of Illinois at Urbana-Champaign. His research interests include machine learning and communication networks. He is a winner of the ACM SIGMETRICS Achievement Award, the IEEE Koji Kobayashi Computers and Communication Award, and the IEEE INFOCOM Achievement Award. He has won several best paper awards including the Applied Probability Society’s Best Publication Award, the IEEE INFOCOM Best Paper Award, and the WiOpt Best Paper Award.

Guido Montufar

  Associate Professor

            UCLA

    4th November

                    2022

Geometry and Convergence of Natural Policy Gradient Methods

 Video:  Link 

Abstract: We study the convergence of different natural policy gradient (NPG) methods in discounted infinite-horizon Markov decision processes with memoryless stochastic policies. We show that the trajectories in state-action space are solutions of gradient flows with respect to Hessian geometries, and obtain global convergence guarantees and convergence rates for a variety of NPG flows. In particular, we show linear convergence for unregularized (and regularized) NPG flows with the Riemannian metrics proposed by Kakade and by Morimura et al., which we interpret as the Hessian geometries of conditional entropy and entropy, respectively. Further, we obtain sub-linear convergence rates for Hessian geometries arising from other convex functions like log-barriers. Finally, we interpret the time-discrete NPG methods with regularized rewards as inexact Newton methods if the NPG is defined with respect to the Hessian geometry of the regularizer (up to scaling). This yields local quadratic convergence rates of these methods for step size equal to the penalization strength. This is joint work with Johannes Mueller.
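Schematically (and suppressing the specifics of the talk), a natural policy gradient flow in parameter space preconditions the vanilla gradient of the objective R(\theta) by the Gram matrix G(\theta) of the chosen metric, and a Hessian geometry corresponds to G = \nabla^2 \Phi for a convex potential \Phi (conditional entropy, entropy, or a log-barrier):

    \dot{\theta}_t = G(\theta_t)^{+}\, \nabla_\theta R(\theta_t),
    \qquad
    \theta_{k+1} = \theta_k + \eta\, G(\theta_k)^{+}\, \nabla_\theta R(\theta_k),

with the Kakade metric recovering the classical Fisher-information preconditioner.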


Bio: Guido Montúfar is an Associate Professor of Mathematics and Statistics at UCLA and Head of the Math Machine Learning Group at the Max Planck Institute for Mathematics in the Sciences, working on deep learning theory and mathematical machine learning more broadly. He studied mathematics and theoretical physics at TU Berlin, obtained the Dr. rer. nat. in 2012 as an IMPRS fellow in Leipzig, and was a postdoc at Penn State and MPI MiS. Guido's research is supported in part by the ERC, DFG, and NSF, and he is a 2022 Alfred P. Sloan Research Fellow.

Pulkit Grover

      Professor

            CMU

    28th October

                    2022

Information-theoretic techniques for examining and intervening on neural circuits

 Video:  Link 

Abstract: Motivated by networks of clinical and basic neuroscience, this talk delves into the question of how to examine the computational system of our brain, and influence it to attain desirable outcomes. One part of my talk is theoretical: I will draw from existing neuroscience literature and simple examples to arrive at what we call the “M-Information Flow” framework, which provides the first formal definition of information flow in the brain. With this, I will show that it is possible to verifiably track information flow about a given message, and, further, obtain finer-grained information than existing tools. Through examples, I will also illustrate how information theory can suggest efficient interventions on a network for desirable outcomes, and use this result as a motivation to discuss the formalization of a new way of thinking about reverse engineering the brain: a difficult problem of finding minimal interventions. In another part, I'll quickly overview translation work in my lab in noninvasive sensing and stimulation of the brain, including i) designing the first neural sensing systems that work with Black hair types; ii) a new problem of localizing “silences” in the brain; and iii) the first noninvasive detection of “cortical spreading depolarizations” in brain injury patients. I will also illustrate how information, optimization, and control-theoretic approaches are beginning to influence neuromodulation, in particular, focusing on how the non-linear dynamics of neural membrane potentials can be harnessed to design novel neuro-stimulation strategies. In this context, I will discuss an exciting case study with biology collaborators. The main message of this part is that neural engineering and neuroscience provide a set of engaging problems for LIONS, where practical issues motivate entirely new problems in established fields, with exciting applications and an immense potential for positive and widespread impact. The talk is largely based on joint works led by Sara Caldas-Martinez, Alireza Chamanzar, Sanghamitra Dutta, Arnelle Etienne, Mats Forssell, Chaitanya Goswami, Vishal Jain, Jasmine Kwasa, Neil Mehta, and Praveen Venkatesh.


Bio: Pulkit Grover (Ph.D. UC Berkeley '10; B.Tech, M.Tech IIT Kanpur) is the Angel Jordan Professor at CMU. His current work involves thought and laboratory experiments to expand and develop a science of information for neural sensing and stimulation, with increasing focus on identifying and eliminating racial biases these systems can have, and improving accessibility by examining limits of non-invasive systems. To bring these to practice, his lab works extensively with data scientists, system and device engineers, neuroscientists, and clinicians. Specifically, his lab's work is focused on a) fair and explainable AI at the algorithm, theory, and hardware levels; b) tools (theoretical, computational, and hardware) for understanding the healthy brain, and understanding, diagnosing, and treating disorders such as epilepsy, stroke, and traumatic brain injuries. Pulkit received the 2010 best student paper award at IEEE Conference on Decision and Control; the 2011 Eli Jury Dissertation Award from UC Berkeley; the 2012 Leonard G. Abraham best journal paper award (IEEE ComSoc); a 2014 NSF CAREER award; a 2015 Google Research Award; a 2018 inaugural award from the Chuck Noll Foundation for Brain Injury Research; the 2018 Spira Excellence in Teaching Award (CMU), and the 2019 best tutorial paper award (IEEE ComSoc). He co-founded Precision Neuroscopics, Inc., a startup translating his lab's work on accessible, high-resolution neural sensing solutions to the real world. He's the PI of the SharpFocus award, a multi-institution effort aimed at mm- and msec-scale noninvasive brain sensing and stimulation, and is a distinguished lecturer for the IEEE Information Theory Society for 2022-23.

Ardhendu Tripathy

     Assistant Professor

         Missouri S&T

                 22nd October

                      2022

Chernoff Sampling for Active Testing and Extension to Active Regression

 Video:   Link 

Abstract: Commonly used machine learning models are often trained on large datasets. As a result, a natural stumbling block in applying them more widely is the human effort required to label or annotate the data. Active learning is a framework that can reduce the number of labelled data needed to achieve a desired performance. In this talk, I will explain the benefit of active learning in two problem settings: active testing and active regression. In active testing, the sequential design of experiments developed by Chernoff in 1959 is widely used and known to be asymptotically optimal. We obtained a novel non-asymptotic bound on the number of labelled data needed for Chernoff’s algorithm. We then extend Chernoff sampling and apply it in active regression. In addition to obtaining a theoretical performance guarantee, we find that our extension requires fewer labelled data compared to existing methods in both simulated and real-world datasets. This is joint work with Subhojyoti Mukherjee and Robert Nowak.


Bio: Ardhendu Tripathy has been an assistant professor in the Computer Science Department at Missouri University of Science & Technology, Rolla, since November 2020. Prior to that, he was a postdoctoral research associate with Robert Nowak at the University of Wisconsin-Madison, and he received his PhD from the Electrical & Computer Engineering Department at Iowa State University, Ames. His research interests are in sequential decision-making, active learning, and multi-armed bandits.

Ahmed Alkhateeb

     Assistant Professor

ASU

                 14th October

                      2022

Multi-Modal Sensing Aided Communications and the Role of Machine Learning

 Video:  Link 

Abstract: Wireless communication systems are moving to higher frequency bands (mmWave in 5G and above 100GHz in 6G and beyond) and deploying large antenna arrays at the infrastructure and mobile users (massive MIMO, mmWave/terahertz MIMO, reconfigurable intelligent surfaces, etc.). While using large antenna arrays and migrating to higher frequency bands enable satisfying the increasing demand in data rate, they also introduce new challenges that make it hard for these systems to support mobility and maintain high reliability and low latency. In this talk, I will first motivate the use of sensory data and machine learning to address these challenges. Then, I will present DeepSense 6G, the world's first large-scale real-world multi-modal sensing and communication dataset that enables the research in a wide range of integrated sensing and communication applications. After that, I will go over a few machine learning tasks enabled by the dataset such as radar, LiDAR, camera, and position aided beam and blockage prediction. Finally, I will discuss some future research directions in the interplay of communications, sensing, and positioning.


Bio: Ahmed Alkhateeb received his B.S. and M.S. degrees in Electrical Engineering from Cairo University, Egypt, in 2008 and 2012, and his Ph.D. degree in Electrical Engineering from The University of Texas at Austin, USA, in 2016. After the Ph.D., he spent some time as a Wireless Communications Researcher at the Connectivity Lab, Facebook, before joining Arizona State University (ASU) in Spring 2018, where he is currently an Assistant Professor in the School of Electrical, Computer, and Energy Engineering. His research interests are in the broad areas of wireless communications, signal processing, machine learning, and applied math. Dr. Alkhateeb is the recipient of the 2012 MCD Fellowship from The University of Texas at Austin, the 2016 IEEE Signal Processing Society Young Author Best Paper Award for his work on hybrid precoding and channel estimation in millimeter-wave communication systems, and the NSF CAREER Award 2021 to support his research on leveraging machine learning for large-scale MIMO systems.

Sudipto Mukherjee

      Microsoft


        23rd  September

                  2022

Classifier-based Information Estimation: Formulation, Applications and Extensions

 Video:  Link 

Abstract: Mutual Information (MI) and its conditional variant (CMI) are well-known information-theoretic measures to quantify the amount of information in a system. While MI measures information between two random variables, CMI extends it to a system where, in addition, we are given a set of conditioning random variables. CMI is more interesting in this regard, yet challenging, as the conditioning set grows in dimension. While nearest-neighbor based estimators of MI and CMI (Kraskov et al., 2003) have long been known and extensively studied, they struggle to estimate MI and CMI in high dimensions. In this talk, we take a different approach and use classifiers for MI and CMI estimation. We study their applications in conditional independence testing and explore the recent research extensions of MI and CMI estimators in multifarious areas such as inferring functional connectivity in neural data or measuring time-series dependencies.
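As a minimal sketch of the classifier-based idea (not the exact estimator from the talk), one can label joint samples against "product of marginals" samples, train any probabilistic classifier, and read off a likelihood ratio whose average log over the joint samples estimates I(X; Y); x and y are assumed to be (n, d) arrays, and the function name is hypothetical.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def classifier_mi(x, y, seed=0):
        # Joint samples (x_i, y_i) get label 1; permuting y breaks the dependence
        # and yields approximate samples from the product of marginals (label 0).
        rng = np.random.default_rng(seed)
        y_perm = y[rng.permutation(len(y))]
        joint, prod = np.hstack([x, y]), np.hstack([x, y_perm])
        feats = np.vstack([joint, prod])
        labels = np.r_[np.ones(len(joint)), np.zeros(len(prod))]
        clf = LogisticRegression(max_iter=1000).fit(feats, labels)
        p = np.clip(clf.predict_proba(joint)[:, 1], 1e-6, 1 - 1e-6)
        # E_joint[log p/(1-p)] approximates KL(P_XY || P_X P_Y) = I(X; Y).
        return float(np.mean(np.log(p / (1 - p))))

For CMI the same trick applies with the conditioning variables appended to the features and the permutation done within (approximate) conditional strata, which is where the difficulty of a growing conditioning set shows up.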


Bio: Dr. Sudipto Mukherjee received his Ph.D. from the Department of Electrical and Computer Engineering, University of Washington (Seattle). After his Ph.D., he joined Microsoft Corporation and is currently a Senior Applied Scientist in the Research and Incubations division of the company. Dr. Mukherjee’s research interests span multiple areas of machine learning and AI, namely unsupervised representation learning, multi-modal representation learning combining graphs and language data, as well as bringing together the advances in AI to create novel applications for the tech industry. Some of his noteworthy works include “ClusterGAN: Latent space clustering in Generative Adversarial Networks”, “CCMI: Classifier-based conditional mutual information estimation”, and “Smart ToDo: Automatic generation of To-Do List from Emails”.

   Flavio Calmon

     Assistant Professor

                    Harvard

        16th  September

                  2022

Information-Theoretic Tools for Responsible Machine Learning

 Video:  Link 

Abstract: We introduce information-theoretic results for fair machine learning. First, we study the problem of finding the element within a convex set of conditional distributions with the smallest f-divergence to a reference distribution. Motivated by applications in machine learning, we refer to this problem as model projection since any probabilistic classification model (e.g., logistic regression, random forests) can be viewed as a conditional distribution. The new, projected classifier is given by a tilting (i.e., post-processing) of the outputs of the original classifier. We show that the parameters of this tilting can be computed at scale (e.g., on a GPU) and that the projected classifier has provable performance guarantees. We apply model projection to create group-fair probabilistic classifiers by projecting an (unfair) classifier onto the set determined by fairness constraints. Our numerical results demonstrate that this approach achieves a state-of-the-art fairness-accuracy trade-off while scaling to datasets with millions of samples.

In the second part of the talk, we investigate the group fairness concerns of training a machine learning model using data with missing values. Most fairness interventions require a complete training set as input. In practice, data can have missing values, and data missing patterns can depend on group attributes. We theoretically analyze different sources of discrimination risk when training models with an imputed dataset. We then propose a classification approach based on decision trees that integrates classification and imputation, thus circumventing fairness risks that may appear when performing data imputation and classification separately.
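Schematically, the model projection problem in the first part has the form

    \min_{Q \in \mathcal{C}} \; D_f(Q \,\|\, P),
    \qquad
    D_f(Q \,\|\, P) = \mathbb{E}_{P}\!\left[ f\!\left( \tfrac{dQ}{dP} \right) \right],

where P is the conditional distribution defined by the original classifier, \mathcal{C} is a convex set of conditional distributions (e.g., those satisfying group-fairness constraints), and the minimizer is the "tilted" post-processing of P referred to above; the exact formulation and divergence direction follow the talk.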


Bio: Flavio P. Calmon is an Assistant Professor of Electrical Engineering at the Harvard John A. Paulson School of Engineering and Applied Sciences. Before joining Harvard, he was the inaugural Data Science for Social Good Post-Doctoral Fellow at IBM Research in Yorktown Heights, New York. He received his Ph.D. in  Electrical Engineering and Computer Science at MIT. His research develops information-theoretic tools for responsible, reliable, and rigorous machine learning. 

Prof. Calmon has received the NSF CAREER award, faculty awards from Google, IBM, Oracle, and Amazon, the NSF-Amazon Fairness in AI award, the Harvard Data Science Initiative Bias2 award, and the Harvard Dean of Undergraduate Studies Commendation for "Extraordinary Teaching during Extraordinary Times." He also received the inaugural Título de Honra ao Mérito (Honor to the Merit Title) given to alumni from the Universidade de Brasília (Brazil), being the first awardee in the areas of engineering, computer science, mathematics, and statistics.

   Christina Yu

     Assistant Professor

                    Cornell

        9th  September

                  2022

Sequential Fair Allocation - Achieving the Optimal Envy-Efficiency Tradeoff Curve

 Video:  Link 

Abstract: We consider the problem of dividing limited resources to individuals arriving over T rounds with a goal of achieving fairness across individuals. In general there may be multiple resources and multiple types of individuals with different utilities. A standard definition of 'fairness' requires an allocation to simultaneously satisfy envy-freeness and pareto efficiency. However, in the online sequential setting, the social planner must decide on a current allocation before the downstream demand is realized, such that no policy can guarantee these desiderata simultaneously with probability 1, requiring a modified metric of measuring fairness for online policies. We show that in the online setting, the two desired properties (envy-freeness and efficiency) are in direct contention, in that any algorithm achieving additive counterfactual envy-freeness up to a factor of L_T necessarily suffers an efficiency loss of at least 1 / L_T.  We complement this uncertainty principle with a simple algorithm, HopeGuardrail, which allocates resources based on an adaptive threshold policy and is able to achieve any fairness-efficiency point on this frontier.  Our result is the first to provide guarantees for fair online resource allocation with high probability for multiple resource and multiple type settings.  In simulation results, our algorithm provides allocations close to the optimal fair solution in hindsight, motivating its use in practical applications as the algorithm is able to adapt to any desired fairness efficiency trade-off. This is joint work with Sean Sinclair and Siddhartha Banerjee.


Bio: Christina Lee Yu is an Assistant Professor at Cornell University in the School of Operations Research and Information Engineering. Prior to Cornell, she was a postdoc at Microsoft Research New England. She received her PhD and MS in Electrical Engineering and Computer Science from Massachusetts Institute of Technology, and she received her BS in Computer Science from California Institute of Technology. She received honorable mention for the 2018 INFORMS Dantzig Dissertation Award, and she is a recipient of the 2021 Intel Rising Stars Award and 2021 JPMorgan Faculty Research Award. Her research interests include algorithm design and analysis, high dimensional statistics, inference over networks, sequential decision making under uncertainty, online learning, and network causal inference.

    Michelle Effros

            Professor 

                    Caltech

           29th  April

               2022

New Tools for Random Access Communication

 Video:  Link 

Abstract: The random access channel model captures scenarios experienced by WiFi hotspots and cell phone towers, where a single receiver is tasked with decoding the transmissions from an unknown number of independent transmitters. The fact that the number of transmitters is unknown to both the transmitters and the receiver makes coding difficult since the best possible codes and even the rates at which those codes communicate vary with the number of active transmitters.  Current strategies for communicating over such channels typically either sacrifice performance for simplicity or pay a heavy price in overhead to eliminate transmitter-set uncertainty.  As more and more people and devices connect to the internet wirelessly, random access channels become an increasingly critical bottleneck to efficient and reliable communication.

This talk considers new methods for tackling random access communication, focusing on the competing goals of building practical codes and achieving the best possible performance. Central results include new coding strategies and bounds to capture some of their underlying complexity-performance tradeoffs.


Bio: Michelle Effros is the Vice Provost and George Van Osdol Professor of Electrical Engineering at the California Institute of Technology. She received her Ph.D. in Electrical Engineering from Stanford University in 1994 and joined Caltech the same year. Her research interests are primarily in the area of information theory for networks of communicating devices -- with particular interest in developing tools for understanding large networks traditionally considered impenetrable to information theoretic techniques. Prof. Effros has received a number of awards and fellowships including the NSF CAREER Award, the Charles Lee Powell Foundation Award, the Richard Feynman-Hughes Fellowship, an Okawa Research Grant, a citation by Technology Review as one of the world's top young innovators, and a Communication and Information Theory Society Joint Paper Award. She is a fellow of IEEE and a member of Tau Beta Pi, Phi Beta Kappa, and Sigma Xi. She served as President of the IEEE Information Theory Society in 2015 and has served on a large number of publications committees, technical program committees, and advisory boards.

    Pratik Chaudhari

            Professor 

                    UPenn

           22nd  April

               2022

Does the Data Induce Capacity Control in Deep Learning?

 Video:  Link 

Abstract: Accepted statistical wisdom suggests that the larger the model class, the more likely it is to overfit the training data. And yet, deep networks generalize extremely well. The larger the deep network, the better its accuracy on new data. This talk seeks to shed light upon this apparent paradox. We will argue that deep networks are successful because of a characteristic structure in the space of learning tasks. The input correlation matrix for typical tasks has a peculiar (“sloppy”) eigenspectrum where, in addition to a few large eigenvalues (salient features), there are a large number of small eigenvalues that are distributed uniformly over exponentially large ranges. This structure in the input data is strongly mirrored in the representation learned by the network. A number of quantities such as the Hessian, the Fisher Information Matrix, as well as others such as activation correlations and Jacobians, are also sloppy. Even if the model class for deep networks is very large, there is an exponentially small subset of models (in the number of data) that fit such sloppy tasks. This talk will demonstrate the first analytical non-vacuous generalization bound for deep networks. We will also discuss an application of these concepts that gives new algorithms for semi-supervised learning.
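The "sloppiness" claim is easy to probe empirically: for a dataset X of shape (n_samples, n_features), one can inspect the eigenvalues of the input correlation matrix and check whether, beyond a few salient directions, they spread roughly uniformly across many decades on a log scale (the function name is hypothetical).

    import numpy as np

    def input_eigenspectrum(X):
        # Eigenvalues of the input correlation matrix, largest first; plot them
        # on a log scale to see the few salient features plus the long sloppy tail.
        C = X.T @ X / X.shape[0]
        return np.sort(np.linalg.eigvalsh(C))[::-1]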

Bio: Pratik Chaudhari is an Assistant Professor of Electrical and Systems Engineering and Computer and Information Science at the University of Pennsylvania. He is a member of the GRASP Laboratory. From 2018--19, he was a Senior Applied Scientist at Amazon Web Services and a Postdoctoral Scholar in Computing and Mathematical Sciences at Caltech. Pratik received his PhD (2018) in Computer Science from UCLA, his Master's (2012) and Engineer's (2014) degrees in Aeronautics and Astronautics from MIT. He was a part of NuTonomy Inc. (now Hyundai-Aptiv Motional) from 2014--16. He received the NSF CAREER award in 2022.

    Ekram Hossain

            Professor 

         U of Manitoba

           15th  April

               2022

Stochastic Multi-Armed Bandits with Knapsack and Its Application in Edge Computing Networks

 Video:  Link 

Abstract: Multi-armed bandits (MAB) is a popular sequential decision making technique ideal for decision making under uncertainty given no prior knowledge of the environment. It uses the history of previous decisions and observations as well as side information, if available, to arrive at the current decision. Classic MAB algorithms such as the upper confidence bound (UCB) algorithm are concerned with learning the single optimal action among a set of candidate actions with unknown rewards. Different from traditional bandits, bandits with knapsacks (BwK) can model more sophisticated distributed decision-making problems under global constraints. Starting with the basics of stochastic MAB models and the UCB algorithm, in this talk, I shall discuss a BwK model and show its application to the server selection problem in an edge computing system. Time permitting, I will also discuss a linear contextual bandit with knapsack model for the same problem.
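As background for the talk, the classic UCB rule mentioned above can be sketched in a few lines; the reward interface pull(a) and all constants are hypothetical, and the knapsack (BwK) and contextual variants discussed in the talk add global resource constraints on top of this.

    import numpy as np

    def ucb1(pull, n_arms, horizon):
        # Pull each arm once, then always pull the arm with the largest
        # empirical mean plus an exploration bonus.
        counts = np.zeros(n_arms)
        means = np.zeros(n_arms)
        for t in range(1, horizon + 1):
            if t <= n_arms:
                a = t - 1
            else:
                a = int(np.argmax(means + np.sqrt(2 * np.log(t) / counts)))
            r = pull(a)                                  # observed reward in [0, 1]
            counts[a] += 1
            means[a] += (r - means[a]) / counts[a]       # running average
        return means, counts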

Bio:  Ekram Hossain (IEEE Fellow) is a Professor and Associate Head of Graduate Studies in the Department of Electrical and Computer Engineering at University of Manitoba, Winnipeg, Canada. He is a Member (Class of 2016) of the College of the Royal Society of Canada, a Fellow of the Canadian Academy of Engineering, and a Fellow of the Engineering Institute of Canada. Dr. Hossain's current research interests include design, analysis, and optimization of wireless communication networks (with emphasis on beyond 5G/6G cellular wireless networks), applied machine learning, game theory, and network economics (http://home.cc.umanitoba.ca/~hossaina). He was elevated to an IEEE Fellow “for contributions to spectrum management and resource allocation in cognitive and cellular radio networks". He was listed as a Clarivate Analytics Highly Cited Researcher in Computer Science in 2017, 2018, 2019, 2020, and 2021. Dr. Hossain has won several research awards including the “2017 IEEE Communications Society Best Survey Paper Award” and the “2011 IEEE Communications Society Fred Ellersick Prize Paper Award”. He received the 2017 IEEE ComSoc TCGCC (Technical Committee on Green Communications & Computing) Distinguished Technical Achievement Recognition Award “for outstanding technical leadership and achievement in green wireless communications and networking”. He served as the Editor-in-Chief of IEEE Press (2018-2021), the IEEE Communications Society (ComSoc) Director of Magazines (2020-2021), and the Editor-in-Chief of the IEEE Communications Surveys and Tutorials (2012-2016). He was an elected Member of the Board of Governors of the IEEE ComSoc (2018-2020). Currently, he serves as the Technical Program Committee Chair for the IEEE International Conference on Communications 2022 (ICC'22), an Editor of the IEEE Transactions on Mobile Computing, and the Director of Online Content (2022-2023) for the IEEE ComSoc.

        Piya Pal

   Associate Professor 

                  UCSD

           8th  April

               2022

Super-resolution with Binary Constraints: Theory, Algorithm and Sensing Strategies

 Video:  Link 

Abstract: The problem of super-resolving the “details” of a signal or image of interest from low-resolution measurements (where the high frequency contents of the signal/image are severely attenuated) has been extensively studied both theoretically and from an application perspective. It is well-known that even in the absence of noise, the problem of super-resolution is an “ill-posed” inverse problem and the forward model cannot be simply ‘inverted’ to recover the desired image. The mathematical theory of super-resolution therefore focuses on leveraging prior knowledge about the signal/image class (often described in terms of sparsity) in order to develop theoretical guarantees under which it is possible to solve this problem, and attain stable reconstruction (modulo noise amplification factors).

In this work, we explore the role of finite-valued (specifically, binary-valued) priors in super-resolution. The study of finite-valued signals is primarily inspired by neural spike deconvolution (where the underlying high-rate spiking activity is binary-valued), but also applies to a wider range of applications such as discrete tomography, medical imaging, astronomical imaging, and image segmentation. We will show that binary constraints offer surprisingly stronger identifiability guarantees than ‘sparsity’, even allowing us to operate in “extreme compression” regimes, where the number of measurements can be much smaller than the sparsity level. Instead of ‘relaxing’ the binary constraints, we advocate “no-relaxation” strategies for super-resolution, which can operate in such extreme compressive regimes by explicitly imposing binary constraints. A central idea in overcoming the computational challenges associated with enforcing such binary constraints is the design of certain structured filter-dependent sampling/sensing strategies. This gives rise to a new idea of algorithm-measurement co-design, where the measurement matrix is designed as a function of the ‘filtering kernel’ such that the recovery of binary signals with arbitrary sparsity is possible by using computationally efficient algorithms. Finally, we demonstrate the benefits of binary constraints in a concrete application of spike deconvolution from real calcium imaging data.

(Joint work with my Ph.D. student, Pulak Sarangi)


Bio: Piya Pal is an Associate Professor of Electrical and Computer Engineering at the University of California, San Diego, where she is also a founding faculty member of the Halicioglu Data Science Institute (HDSI). She obtained her B.Tech in Electronics and Electrical Communication Engineering from IIT Kharagpur, India in 2007, and her Ph.D. in Electrical Engineering from Caltech in 2013, supervised by Prof. P. P. Vaidyanathan. Her Ph.D. thesis was awarded the 2014 Charles and Ellen Wilts Prize for Outstanding Doctoral Thesis in Electrical Engineering at Caltech. Her research interests include signal representation and sampling techniques for high-dimensional signal/data processing, (sparse) sensor arrays and computational sensing, mathematical foundations of super-resolution imaging, and optimization and machine learning for inverse problems. Her research has been recognized by several awards, including the 2020 IEEE Signal Processing Society Pierre-Simon Laplace Early Career Technical Achievement Award, 2019 US Presidential Early Career Award for Scientists and Engineers (PECASE), 2019 Office of Naval Research Young Investigator Program (ONR YIP) award, 2016 NSF CAREER Award, and several Student Paper Awards including the Best Student Paper Award for her Ph.D. students at the 2017 IEEE ICASSP and 2019 IEEE CAMSAP conferences. For her contributions to teaching, she received the ECE Best Graduate Teaching Award at UC San Diego in 2017 and 2018. She has served on the IEEE SAM and SPTM Technical Committees, and EURASIP Signal Processing for Multisensor Systems Technical Committee. She is currently serving as an Associate Editor for the IEEE Signal Processing Magazine.

  Chinmay Hegde

   Assistant Professor 

                  NYU

           1st  April

               2022

Designing Neural Networks for Efficient Encrypted Inference

 Video:  Link 

Abstract: As deep neural networks become ever more pervasive, so too are concerns surrounding users' data privacy. Curiously, standard cryptographic encryption approaches for guaranteeing data privacy do not interact well with traditional neural network models. In this talk, I will (a) outline why standard networks are not encryption-efficient, (b) suggest two new approaches for designing deep networks that do support efficient and secure inference, (c) show results instantiating these approaches on real-world use cases, and (d) discuss theoretical approaches for understanding the limits of private inference. 

Bio: Chinmay Hegde is an Assistant Professor at NYU, jointly appointed with the CSE and ECE Departments. His research focuses on foundational aspects of machine learning (such as reliability, robustness, and computational efficiency). He also works on applications including computational imaging, materials design, and cybersecurity. He is a recipient of the NSF CAREER and CRII awards, the Black and Veatch Faculty Fellowship, multiple teaching awards, and best paper awards at ICML, SPARS, and MMLS.

     Laura Balzano

   Associate Professor 

                  UMich

25th March
2022

Preference Modeling with Context-Dependent Salient Features

 Video:  Link 

Abstract: This talk considers the preference modeling problem and addresses the fact that pairwise comparison data often reflects irrational choice, e.g. intransitivity. Our key observation is that two items compared in isolation from other items may be compared based on only a salient subset of features. Formalizing this idea, I will introduce our proposal for a “salient feature preference model” and discuss sample complexity results for learning the parameters of our model and the underlying ranking with maximum likelihood estimation. I will also provide empirical results that support our theoretical bounds, illustrate how our model explains systematic intransitivity, and show in this setting that our model is able to recover both pairwise comparisons and rankings for unseen pairs or items. Finally I will share results on two data sets: the UT Zappos50K data set and comparison data about the compactness of legislative districts in the US.

This is joint work with Amanda Bower, who is now at Twitter.

Bio: Laura Balzano is an associate professor of Electrical Engineering and Computer Science at the University of Michigan. She has a PhD from the University of Wisconsin in ECE. She is a recipient of the NSF CAREER Award, the ARO Young Investigator Award, the AFOSR Young Investigator Award, and faculty fellowships from Intel and 3M. She is currently serving as associate editor of the IEEE Open Journal of Signal Processing and the SIAM Journal on Mathematics of Data Science. Her main research focus is on modeling and optimization with big, messy data — highly incomplete or corrupted data, uncalibrated data, and heterogeneous data — and its applications in a wide range of scientific problems.

     Marco Mondelli

         Assistant Professor 

                   IST Austria

18th March
2022

Understanding Gradient Descent for Over-parameterized Deep Neural Networks

 Video:  Link 

Abstract: Training a neural network is a non-convex problem that exhibits spurious and disconnected local minima. Yet, in practice neural networks with millions of parameters are successfully optimized using gradient descent methods. In this talk, I will give some theoretical insights on why this is possible and discuss two approaches to study the behavior of gradient descent. The first one takes a mean-field view and it relates the dynamics of stochastic gradient descent (SGD) to a certain Wasserstein gradient flow in probability space. I will show how this idea allows to study the connectivity, convergence and implicit bias of the solutions found by SGD. The second approach consists in the analysis of the Neural Tangent Kernel. I will present tight bounds on its smallest eigenvalue and show their implications on memorization and optimization in deep networks.
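For reference, the two objects mentioned above can be written (in standard notation, not the talk's) as follows: in the mean-field view a wide two-layer network is described by a measure \rho over neuron weights, and SGD corresponds to a Wasserstein gradient flow of the risk over \rho, while the Neural Tangent Kernel is the Gram matrix of parameter gradients whose smallest eigenvalue governs memorization and optimization:

    f(x; \rho) = \int \sigma(x; w)\, \rho(dw),
    \qquad
    K_\theta(x, x') = \big\langle \nabla_\theta f(x; \theta),\ \nabla_\theta f(x'; \theta) \big\rangle .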

Based on joint work with Adel Javanmard, Vyacheslav Kungurtsev, Andrea Montanari, Guido Montufar, Quynh Nguyen, and Alexander Shevchenko.

Bio: Marco Mondelli received the B.S. and M.S. degrees in Telecommunications Engineering from the University of Pisa, Italy, in 2010 and 2012, respectively. In 2016, he obtained his Ph.D. degree in Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. He is currently an Assistant Professor at the Institute of Science and Technology Austria (IST Austria). Prior to that, he was a Postdoctoral Scholar in the Department of Electrical Engineering at Stanford University, USA, from February 2017 to August 2019. He was also a Research Fellow with the Simons Institute for the Theory of Computing, UC Berkeley, USA, for the program on Foundations of Data Science from August to December 2018. His research interests include data science, machine learning, information theory, wireless communication systems, and modern coding theory. He was the recipient of a number of fellowships and awards, including the Jack K. Wolf ISIT Student Paper Award in 2015, the STOC Best Paper Award in 2016, the EPFL Doctorate Award in 2018, the Simons-Berkeley Research Fellowship in 2018, the Lopez-Loreta Prize in 2019, and the Information Theory Society Best Paper Award in 2021.

     Thinh T. Doan

    Assistant Professor 

             Virginia Tech

4th March
2022

Two-Time-Scale Stochastic Optimization and Its Application in Reinforcement Learning

 Video:  Link 

Abstract: Online policy gradient algorithms for reinforcement learning (RL), commonly referred to as “actor-critic” algorithms, can be re-cast as a two-time-scale stochastic approximation with a specific type of stochastic oracle for gradient evaluations. In this talk, I will present our recent work, where we consider a simple actor-critic-like algorithm solving general optimization problems with this same form, and give convergence guarantees for different types of assumed structural properties of the function being optimized. Our abstraction unifies the analysis of actor-critic method in RL, and we show how our main results reproduce the best-known convergence rates for the general policy optimization problem and how they can be used to derive a state-of-the-art rate for the online linear-quadratic regulator (LQR) controllers.
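In standard form, a two-time-scale stochastic approximation couples a fast iterate (the critic-like estimate) and a slow iterate (the actor-like decision variable), with noisy updates

    x_{k+1} = x_k + \alpha_k\, f(x_k, y_k; \xi_k),
    \qquad
    y_{k+1} = y_k + \beta_k\, g(x_k, y_k; \zeta_k),
    \qquad
    \beta_k / \alpha_k \to 0,

so the slow iterate effectively sees the fast one near its equilibrium; the convergence rates in the talk are stated under different structural assumptions on the function being optimized.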

Bio: Thinh T. Doan is an Assistant Professor in the Department of Electrical and Computer Engineering at Virginia Tech. Before joining Virginia Tech, he was an NSF TRIAD postdoctoral fellow at Georgia Tech. He obtained his Ph.D. degree at the University of Illinois, Urbana-Champaign, his master's degree at the University of Oklahoma, and his bachelor's degree at Hanoi University of Science and Technology, Vietnam, all in Electrical Engineering. His research interests span the intersection of control theory, optimization, game theory, machine learning, reinforcement learning, and applied probability theory.

Gesualdo Scutari

   Professor 

                 Purdue

25th February
2022

Bringing Statistical Thinking into Distributed Optimization: Vignettes from Statistical Inference over Networks

 Video: Link 

Abstract: There is growing interest in solving large-scale statistical machine learning problems over decentralized networks, where data are distributed across the nodes of the network and no centralized coordination is present (we termed these systems “mesh” networks). Modern massive datasets create a fundamental problem at the intersection of the computational and statistical sciences: how to provide guarantees on the quality of statistical inference given bounds on computational resources, such as time and communication efforts? While statistical-computation tradeoffs have been largely explored in the centralized setting, our understanding over mesh networks is limited: (i) distributed schemes, designed and performing well in the classical low-dimensional regime, can break down in the high-dimensional case; and (ii) existing convergence studies may fail to predict algorithmic behaviors; some are in fact confuted by experiments. This is mainly due to the fact that the majority of distributed algorithms  have been designed and studied only from the optimization perspective, lacking the statistical dimension. This talk will discuss some vignettes from  high-dimensional statistical inference suggesting  new analyses aiming at bringing statistical thinking in distributed optimization.

Bio: Gesualdo Scutari is the Thomas and Jane Schmidt Rising Star Professor with the School of Industrial Engineering and Electrical and Computer Engineering (by courtesy) at Purdue University, West Lafayette, IN, USA. His research interests include continuous and distributed optimization, equilibrium programming, and their applications to signal processing and machine learning. Dr. Scutari is a Senior Area Editor for the IEEE Transactions on Signal Processing, and an Associate Editor of SIAM Journal on Optimization. Among others, he received the 2013 NSF CAREER Award, the 2015 IEEE Signal Processing Society Young Author Best Paper Award, and the 2021 IEEE Signal Processing Society Best Paper Award. He is a Fellow of the IEEE.

Maxim Raginsky

   Associate Professor 

                    UIUC

18th February
2022

On Some Information-Theoretic Aspects of Generative Adversarial Models 

 Video: Link 

Abstract: The term ‘probabilistic generative model’ refers to any process by which a sample from a target probability measure on some high-dimensional space is produced by applying a deterministic transformation to a sample from a fixed probability measure on some latent space. Generative adversarial models have been proposed recently as a methodology for learning this transformation (referred to as the generator) on the basis of samples from the target measure, where the goodness of the generator is assessed by another deterministic transformation (the discriminator). Both the generator and the discriminator are learned jointly, in the min-max fashion, where the generator attempts to minimize some empirical measure of closeness between the target and the model, while the discriminator attempts to optimally distinguish between the two. In this talk, I will show that one can examine the capabilities of such models through an information-theoretic lens: Consider a binary-input channel, where the transmission of 0 produces a sample from the target measure, while the transmission of 1 produces a sample from the generator, and the discriminator acts as a decoder. One can then characterize the capabilities of the generator using information-theoretic converses, while the performance of the discriminator can be quantified using achievability arguments. I will present several illustrations of the utility of this approach in the context of generative adversarial nets (GANs).
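For concreteness, the standard GAN objective referred to here is the min-max problem

    \min_{G} \max_{D} \; \mathbb{E}_{x \sim P_{\mathrm{target}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim P_Z}\big[\log\big(1 - D(G(z))\big)\big],

and the channel analogy sends bit 0 through a sample from P_target and bit 1 through a sample from the generator's distribution, with the discriminator playing the role of the decoder; its error probability can then be bounded by converse and achievability arguments.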

Bio:  Maxim Raginsky received the B.S., M.S., and Ph.D. degrees in electrical engineering from Northwestern University, Evanston, IL, USA, in 2000, 2000, and 2002, respectively. He has held research positions with Northwestern University, the University of Illinois at Urbana–Champaign, Urbana, IL, USA, where he was a Beckman Foundation Fellow, from 2004 to 2007, and Duke University, Durham, NC, USA. In 2012, he returned to UIUC, where he is currently a William L. Everitt Fellow and an Associate Professor with the Department of Electrical and Computer Engineering and the Coordinated Science Laboratory. Dr. Raginsky was the recipient of the Faculty Early Career Development (CAREER) Award from the National Science Foundation in 2013. He has served on editorial boards of IEEE Transactions on Information Theory, Foundations and Trends in Communications and Information Theory, and IEEE Transactions on Network Science and Engineering. He is currently a member of the editorial boards of Journal of Machine Learning Research, SIAM Journal on Mathematics of Data Science, and Mathematics of Control, Signals, and Systems. His research interests are in probability and stochastic processes, deterministic and stochastic control, machine learning, optimization, and information theory.

Vincent Tan

Professor 

                           NUS

11th February
2022

Towards Minimax Optimal Best Arm Identification in Linear Bandits

 Video: Link 

Abstract: We study the problem of best arm identification in linear bandits in the fixed-budget setting. By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm, Optimal Design-based Linear Best Arm Identification (OD-LinBAI). We provide a theoretical analysis of the failure probability of OD-LinBAI. While the performances of existing methods (e.g., BayesGap) depend on all the optimality gaps, OD-LinBAI depends on the gaps of the top d arms, where d is the effective dimension of the linear bandit instance. Furthermore, we present a minimax lower bound for this problem. The upper and lower bounds show that OD-LinBAI is minimax optimal up to multiplicative factors in the exponent. Finally, numerical experiments corroborate our theoretical findings.
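For readers unfamiliar with it, the G-optimal design used in the allocation rule is (in standard notation) the distribution over arms

    \lambda^\star \in \arg\min_{\lambda \in \Delta(\mathcal{A})} \; \max_{a \in \mathcal{A}} \; \|a\|^2_{A(\lambda)^{-1}},
    \qquad
    A(\lambda) = \sum_{x \in \mathcal{A}} \lambda_x\, x x^{\top},

which minimizes the worst-case variance of least-squares estimates of the arms' mean rewards; OD-LinBAI allocates its fixed budget according to (a rounded version of) such a design, as detailed in the talk.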

This is joint work with Junwen Yang (Institute of Operations Research and Analytics, NUS).

Bio:  Vincent Y. F. Tan (S'07-M'11-SM'15) was born in Singapore in 1981. He received the B.A. and M.Eng. degrees in electrical and information science from Cambridge University in 2005, and the Ph.D. degree in electrical engineering and computer science (EECS) from the Massachusetts Institute of Technology (MIT) in 2011. He is currently a Dean’s Chair Associate Professor with the Department of Electrical and Computer Engineering and the Department of Mathematics, National University of Singapore (NUS). His research interests include information theory, machine learning, and statistical signal processing.


Dr. Tan is a member of the IEEE Information Theory Society Board of Governors. He was an IEEE Information Theory Society Distinguished Lecturer from 2018 to 2019. He received the MIT EECS Jin-Au Kong Outstanding Doctoral Thesis Prize in 2011, the NUS Young Investigator Award in 2014, the Singapore National Research Foundation (NRF) Fellowship (Class of 2018), and the NUS Young Researcher Award in 2019. A dedicated educator, he was also awarded the Engineering Educator Award in 2020 and 2021. He is currently serving as an Associate Editor for the IEEE Transactions on Signal Processing and as an Associate Editor in Machine Learning and Statistics for the IEEE Transactions on Information Theory.

Ken Duffy

Professor 

       Hamilton Institute

4th February
2022

Guessing Random Additive Noise Decoding

 Video: Link 

Abstract: Shannon's 1948 opus established that the highest rate that a noisy channel can support is achieved as error correcting codes become long. Since 1978 it has been known that Maximum Likelihood (ML) decoding of linear codes is NP-complete. Those results drove the paradigm of co-designing restricted classes of codebooks with code-specific methods that exploit code-structure to enable computationally efficient approximate-ML decoding for long, high-redundancy codes. Contemporary applications, including augmented reality, vehicle-to-vehicle communications, and the Internet of Things, are driving demand for Ultra-Reliable Low-Latency Communication (URLLC). Realizing URLLC technologies requires shorter codes, vacating the computational complexity issues associated with long codes and motivating revisiting the possibility of creating practical, accurate universal decoders.

In this talk, we introduce Guessing Random Additive Noise Decoding (GRAND), a universal ML decoder suitable for use with any moderate redundancy code of any length. Mathematically, GRAND offers a new approach to establishing capacity and error exponent results. In practice, despite being first published in 2018, it has already resulted in circuit designs and a taped-out chip that demonstrate its suitability and efficiency in hardware. We explain the theoretical rationale behind GRAND, recent hard- and soft-detection developments, and future possibilities.
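A minimal hard-detection sketch of the guessing idea, for a binary linear code with parity-check matrix H and assuming a binary symmetric channel so that lower-weight noise patterns are more likely (the abandonment threshold and function name are hypothetical):

    import numpy as np
    from itertools import combinations

    def grand_decode(y, H, max_weight=4):
        # Test putative noise patterns z in order of decreasing likelihood
        # (here: increasing Hamming weight) and return the first y XOR z that
        # satisfies the parity checks, i.e. the ML codeword on a BSC.
        n = len(y)
        for w in range(max_weight + 1):
            for idx in combinations(range(n), w):
                z = np.zeros(n, dtype=int)
                z[list(idx)] = 1
                c = (y + z) % 2
                if not ((H @ c) % 2).any():       # zero syndrome => codeword
                    return c, z
        return None, None                         # abandon guessing

Soft-detection variants reorder the noise guesses using reliability information, which is part of what the talk covers.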

The talk is based on joint work with Muriel Medard (MIT), with the circuits work performed in collaboration with Rabia Yazicigil (BU).


Bio:  Ken R. Duffy is a Professor of Applied Probability and the Director of the Hamilton Institute, an interdisciplinary research centre with 40 affiliated faculty at the National University of Ireland Maynooth. He is one of three co-Directors of the Science Foundation Ireland Centre for Research Training in Foundations of Data Science, which is supported by 16 thematically diverse enterprise alliance partners and funds more than 90 PhD students. 


He obtained a B.A. in mathematics in 1996 and a Ph.D. in probability theory in 2000, both awarded by Trinity College Dublin. He works in highly collaborative, multi-disciplinary teams to design, analyse, and realise algorithms for communication systems and the life sciences using tools from probability, statistics, and machine learning. Algorithms he has developed have been implemented in digital circuits and in DNA.

 

He is a co-founder of the Royal Statistical Society's Applied Probability Section (2011), co-authored a cover article of Trends in Cell Biology (2012), is a winner of a best paper award at the IEEE International Conference on Communications (2015), the best paper award from IEEE Transactions on Network Science and Engineering (2019), and the best research demo award from COMSNETS (2022).

Ashif Iquebal

Assistant Professor 

                                    ASU

3rd December
2021

Active Learning for Regression Problems

Video: Link 

Abstract: The past decade has witnessed widespread adoption of machine learning by the manufacturing and materials community. However, most of the existing works have relied on vast amounts of experimental data available from past experiments. With the emphasis shifting towards novel materials and new manufacturing processes, traditional passive experimental design methods are not suited to exploring high-dimensional search spaces. Active learning, in contrast, provides an adaptive approach to select the most informative experiments, guide the search of high-dimensional spaces, and reduce experimental costs.

In this work, we will focus on active learning in regression problems, where the objective is to learn the underlying black-box function with as few samples as possible. In the first half, we will discuss the exploration-exploitation issues in the regression setting and present a hierarchical Bayesian approach to dynamically balance the trade-off as more samples are collected iteratively. Results on simulated case studies and a real-world materials problem are presented. In the second half, we will talk about the application of active learning in approximating the value function for finite horizon partially observable Markov decision processes using an active point-based algorithm. We will discuss the convergence rates and some results on benchmark datasets.


Bio: Ashif Iquebal is an assistant professor of Industrial Engineering in the School of Computing and Augmented Intelligence at ASU. Prior to this, he obtained his Ph.D. from the Department of Industrial and Systems Engineering at Texas A&M University. His research is focused on developing methodological foundations in data science and machine learning, particularly on statistical representation and quantification of high-dimensional data, active learning, and graphical models. He received the Pritsker Doctoral Dissertation Award from the Institute of Industrial and Systems Engineers (IISE) in 2021. In the past, his research papers were recognized as winners/finalists for five best student paper/poster awards at INFORMS, IISE, and the American Statistical Association conferences.

Duong Nguyen

Assistant Professor 

                                    ASU

19th November
2021

Robust service placement and workload allocation in edge computing

Video: Link 

Abstract: Edge computing promises to offer low-latency and ubiquitous computation to numerous mobile and Internet of Things devices. Thus, it can complement the cloud to deliver a superior user experience, reduce network traffic, and enable various IoT applications. How to jointly optimize the service placement, sizing, and workload allocation decisions in an edge-computing system is an important and challenging problem, which becomes even more complicated when considering numerous system uncertainties. In this talk, we will study this problem from the perspective of a service provider, who can procure resources from numerous edge nodes to improve the user experience while minimizing its cost. We propose and formulate a novel two-stage adaptive robust optimization model to help the service provider optimally determine the placement and sizing decisions that can hedge against any possible realization of the uncertain demand and unpredictable node failures. We then extend it to a two-stage multi-period robust model with integer recourse to examine the benefits of considering dynamic service placement as well as spatial-temporally correlated uncertainties.
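Schematically, the two-stage adaptive robust model has the familiar min-max-min form

    \min_{x \in \mathcal{X}} \; c^{\top} x \;+\; \max_{u \in \mathcal{U}} \; \min_{y \in \mathcal{Y}(x, u)} \; d^{\top} y,

where x collects the here-and-now service placement and sizing decisions, the uncertainty set \mathcal{U} models possible demand realizations and node failures, and y is the recourse workload-allocation decision; the multi-period extension with integer recourse mentioned above enriches both stages.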

Bio: Duong Nguyen is an assistant professor of electrical engineering at Arizona State University. He received his doctorate in electrical and computer engineering from the University of British Columbia in 2020. His research lies at the intersection of operations research, AI, economics, and engineering, with a specific focus on developing new mathematical models and techniques for decision-making and economic analysis of large-scale networked systems such as cloud/edge computing, EV charging networks, intelligent transportation, and crowdsourcing.

Lalit Jain

Assistant Professor 

                              U Wash

12th November
2021

Planning for Best Arm Identification

Video: Link 

Abstract: Scientific discovery is driven by the researcher's ability to collect high-quality data relevant to either verifying or disproving a hypothesis as quickly as possible. In recent years, a paradigm addressing this problem known as adaptive experimental design (AED) has been gaining traction. AED uses past measurements to inform the researcher what future measurements they should collect in a closed loop. In this talk, we show how AED can be applied to provide matching upper and lower bounds for the problem of best arm identification for linear bandits. As we will discuss, best-arm identification is a general framework that encapsulates problems such as stochastic shortest path, active classification, and linear dueling bandits. The AED approach to best-arm identification leads to improved results in all of these problem settings.

Bio: Lalit K Jain is an assistant professor in the Foster School of Business. His research is focused on the theory and implementation of machine learning algorithms for large-scale data collection, with an emphasis on "human in the loop" and crowdsourcing applications. His work has been applied to a variety of applications including optimizing crowdfunding and microlending platforms, measuring conceptual perception in cognitive psychology, and detecting humor. Prior to joining the Foster School, he did a postdoc at the University of Washington with Professor Kevin Jamieson, a postdoc at the University of Michigan with Professor Anna Gilbert, and a PhD in Mathematics at the University of Wisconsin-Madison advised by Professor Jordan Ellenberg.

Facebook

5th November
2021

Methods for responsible and reliable neural conversational AI

Video: Link 

Abstract: We first give an overview of CAIRaoke, an effort to build neural conversational AI models to power the next generation of task-oriented virtual digital assistants. We then continue with some of the challenges that we faced in training CAIRaoke dialog models, namely noisy data and lack of variations in dialog flow, which prompted us to create new public benchmarks for robustness in task-oriented dialog. We continue with presenting two methods that we have developed to solve these challenges. The first method (TERM) is a simple tweak to the widely used empirical risk minimization framework that can promote noise robustness by suppressing the influence of individual noisy outlier samples. The second method (DAIR) is a simple regularization add-on promoting performance consistency on data augmentation to better generalize to unseen examples.
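
The abstract describes TERM only at a high level. One published form of tilted empirical risk minimization replaces the average loss with a log-sum-exp tilt, where a negative tilt parameter damps large outlier losses; the sketch below assumes that form and uses made-up per-sample losses.

```python
import numpy as np
from scipy.special import logsumexp

def tilted_risk(losses, t):
    """Tilted empirical risk: (1/t) * log(mean_i exp(t * loss_i)).
    t -> 0 recovers the ordinary average; t < 0 down-weights large
    (outlier) losses, while t > 0 emphasizes them."""
    losses = np.asarray(losses, dtype=float)
    if abs(t) < 1e-12:
        return losses.mean()
    return (logsumexp(t * losses) - np.log(len(losses))) / t

per_sample = [0.1, 0.2, 0.15, 5.0]        # one corrupted / outlier sample
print(tilted_risk(per_sample, t=0.0))     # plain ERM: dominated by the outlier
print(tilted_risk(per_sample, t=-2.0))    # negative tilt suppresses the outlier
```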

Bio: Ahmad Beirami is a research scientist at Facebook AI, leading research to power the next generation of virtual digital assistants with AR/VR capabilities. His research broadly involves learning models with robustness and fairness considerations in large-scale systems. Prior to that, he led the AI agent research program for automated playtesting of video games at Electronic Arts. Before moving to industry in 2018, he held a joint postdoctoral fellow position at Harvard & MIT, focused on problems in the intersection of core machine learning and information theory. He is the recipient of the 2015 Sigma Xi Best PhD Thesis Award from Georgia Tech, for his work on the fundamental limits of efficient communication over IoT networks.

Assistant Professor
ASU

29th October
2021

Secure Computation for Contact Tracing and Heatmap Detection

Video: Link 

Abstract: Contact tracing is an essential tool in containing infectious diseases such as COVID-19. It enables users to search over released tokens in order to learn if they have recently been in the proximity of an infectious user. However, prior approaches have several weaknesses, including that they (1) do not provide end-to-end privacy in the collection and querying of tokens, (2) do not utilize the huge volume of data stored in business databases and individual digital devices, or (3) impose heavy bandwidth or computational demands on client devices. In this talk, I will discuss the existing cryptographic attacks and privacy concerns in deploying contact tracing applications, and how secure computation can tackle these problems. Moreover, I will describe our system design to improve the security guarantee and performance of existing contact tracing frameworks. Along with this, I will present our new protocol for delegated private set intersection cardinality (PSI-CA) that allows clients to delegate their contact tracing computation to cloud servers without compromising privacy. I will also show how to use PSI-CA to implement a secure dot product, which is a core building block of heatmap detection.

Bio: Ni Trieu is an Assistant Professor of computer science at Arizona State University. Her research interests are in the area of cryptography and security, with a specific focus on secure computation and its applications such as private set intersection, secure bio-computing, and privacy-preserving machine learning. Before joining ASU, she was a postdoctoral researcher at UC Berkeley. She received her Ph.D. degree from Oregon State University.

Associate Professor
Boston University

22nd October
2021

Detecting Structural Changes in Networks

Video: Link 

Abstract: Consider a probability distribution defined over a graph, and a dataset comprised of either node observations (e.g., a Markov random field) or edge observations (e.g., a stochastic block model). Such distributions are used in a plethora of scenarios to model social, biological, or other network phenomena. For instance, the nodes may represent individual neurons for which we observe noisy versions of their spiking activity. Exciting recent work has sharply characterized the requirements for learning the underlying graph structures from noisy observations, as well as proposed efficient inference algorithms that can approach these information-theoretic limits.

While learning the full network structure is sometimes useful, we are often interested in changes in network structure in response to external stimuli (e.g., changes in neuronal connectivity as a subject learns a task). Moreover, we often do not have sufficient data to estimate the network before and after the stimuli, in order to compare the differences. Is it possible to directly detect changes in network structure? When is this easier than learning the network structure itself? This talk examines this question via several case studies from the perspective of minimax risk. At a high level, we uncover the following phase transition: testing is statistically easier than recovery when the number of changes is large, but comparable to recovery when the number of changes is small, relative to the size of the network.

Joint work with Aditya Gangrade, Praveen Venkatesh, Zeynep Kahraman, and Venkatesh Saligrama.

Bio: Bobak Nazer is an Associate Professor in the ECE Department and a Distinguished Faculty Fellow in the College of Engineering at Boston University. He received his Ph.D. in 2009 and M.S. in 2005 from the University of California, Berkeley, and his B.S. in 2003 from Rice University, all in electrical engineering. He is the recipient of the IEEE Communications Society and Information Theory Society Joint Paper Award, the NSF CAREER award in 2013, and the 2009 Eli Jury Award from the Berkeley EECS Department.

PhD
MIT

15th October
2021

Streaming Estimation with Markovian Data: Limits and Algorithms

Video: Link 

Abstract:  Standard results in machine learning theory provide near optimal algorithms for learning under the assumption of i.i.d. data in many problems of interest. However, data with temporal dependence occur often in practice in the analysis of time series, system identification and reinforcement learning. In these settings, the data is often assumed to be derived from a mixing Markov process and it is important to learn-on-the-go.

Currently, the theoretical analyses of many algorithms reduce learning from these data sets to learning from independent data by keeping only one sample per mixing time and dropping the rest. In this talk, we will first see that this is in fact tight in the worst case, and that naively implementing SGD in these scenarios suffers from slow convergence due to dependencies present in the data. However, in well-specified cases, we can design algorithms that carefully untangle this dependency structure in order to obtain SGD-style streaming algorithms with sample complexity matching the i.i.d. case. To this end, we consider the specific tasks of least squares regression with Markovian data and non-linear system identification, and introduce parallel SGD and SGD with Reverse Experience Replay (SGD-RER), a rigorous form of the popular Experience Replay heuristic used in practical RL. We then sketch an application to the widely used Q-learning algorithm.
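
As a rough schematic of the reverse-replay idea (not the exact algorithm or analysis from the talk), the sketch below runs least-squares SGD over buffers of a Markovian data stream, replaying each buffer in reverse order. The AR(1) covariate model, buffer size, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, buffer_size, lr = 5, 20_000, 50, 0.01
w_true = rng.normal(size=d)

# Markovian covariates: an AR(1) chain, so consecutive samples are dependent.
x, stream = np.zeros(d), []
for _ in range(T):
    x = 0.9 * x + np.sqrt(1 - 0.9 ** 2) * rng.normal(size=d)
    stream.append((x.copy(), x @ w_true + 0.1 * rng.normal()))

w = np.zeros(d)
for start in range(0, T, buffer_size):
    buf = stream[start:start + buffer_size]
    for xi, yi in reversed(buf):          # replay the buffer in reverse order
        w -= lr * (xi @ w - yi) * xi      # plain least-squares SGD step
print("parameter error:", np.linalg.norm(w - w_true))
```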

Bio: Dheeraj Nagaraj is a sixth-year graduate student at the Laboratory for Information and Decision Systems (LIDS) at MIT, advised by Prof. Guy Bresler. His work involves various problems in theoretical machine learning, applied probability, and statistics. His current work focuses on the representation power of deep neural networks, random graphs with latent geometric structure, and stochastic optimization algorithms.

Assistant Professor
UIUC

1st October
2021

Risk-Sensitive Optimization for Electricity Markets

Video: Link 

Abstract: Power system operation is fraught with uncertainties. Electricity markets must evolve to model such uncertainties and optimize available resources against them. In this talk, I will explore algorithm design motivated to tackle risk-sensitive electricity market clearing formulations, where power delivery risk is modeled via the conditional value at risk (CVaR) measure. I will discuss algorithmic architectures and their convergence properties to solve these risk-sensitive optimization problems. The first half of the talk will focus on an optimization problem that can be cast as a large linear program. For this problem, I will discuss an algorithm that shares parallels and differences with Benders’ decomposition. In the second half of this talk, I will consider another risk-sensitive problem for which I will present sample complexity guarantees of a stochastic primal-dual algorithm.
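
For reference, the CVaR measure mentioned above typically enters such formulations through the Rockafellar-Uryasev variational form (generic notation, not taken from the talk):

\mathrm{CVaR}_{\alpha}(Z) \;=\; \min_{t \in \mathbb{R}} \Big\{ t + \tfrac{1}{1-\alpha}\, \mathbb{E}\big[(Z - t)_{+}\big] \Big\}.

This form is jointly convex in t and the decision variables whenever Z depends on them convexly, and with finitely many scenarios the expectation becomes a sum, which is one way such a risk-sensitive clearing problem can be cast as a large linear program.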

Bio: Subhonmesh Bose is an Assistant Professor in the Department of Electrical and Computer Engineering at UIUC. His research focuses on facilitating the integration of renewable and distributed energy resources into the grid edge, leveraging tools from optimization, control and game theory. Before joining UIUC, he was a postdoctoral fellow at the Atkinson Center for Sustainability at Cornell University. Prior to that, he received his MS and Ph.D. degrees from Caltech in 2012 and 2014, respectively. He received the NSF CAREER Award in 2021. His research projects have been supported by grants from NSF, PSERC, Siebel Energy Institute and C3.ai, among others.

Fei Wei

Postdoc
ASU

24th September
2021

Network coding, group network codes, and the edge removal problem

Video: Link 

Abstract: Network coding is a network communication scheme which deploys encoding at intermediate nodes and can significantly improve the network throughput. Beyond the conventional linear network codes, group network codes are a special family of network coding schemes that are equipped with an algebraic (group) structure and are known to be (approximately) optimal. However, not much is known about this special family of codes. In this talk, we will introduce an intriguing problem in the context of network coding, the edge removal problem, which studies the impact of removing a communication edge from a network. We will view this problem through the lens of group network codes, and discuss interesting results regarding this problem.

Bio: Fei Wei is a postdoc at Arizona State University working with Prof. Lalitha Sankar and Prof. Oliver Kosut. He obtained his PhD in Electrical Engineering from the University at Buffalo, the State University of New York, advised by Prof. Michael Langberg, and collaborated with Prof. Michelle Effros from Caltech. He is broadly interested in topics related to network information theory, communication, data security, and privacy.

Professor
Caltech

17th September
2021

OPF Hardness: Theory and Practice

Abstract: Optimal power flow (OPF) problems underlie numerous power system applications. It is well known that OPF is non-convex and NP-hard. It can be solved approximately either via convex relaxations or local algorithms. Even though OPF is hard in theory, it seems "easy" in practice in that, empirically, semidefinite relaxations are often exact and local algorithms often yield global solutions. We summarize several sufficient conditions for exact relaxations in both single- and three-phase radial networks. We describe sufficient or necessary conditions for non-convex problems to simultaneously have exact relaxations and no spurious local optima. These conditions help explain the widespread empirical experience that local algorithms for OPF problems often work well.

Bio: Steven Low is the F. J. Gilloon Professor of the Department of Computing & Mathematical Sciences and the Department of Electrical Engineering at Caltech, and an Honorary Professor of the Electrical & Electronic Engineering Department at Melbourne University. He is a co-recipient of IEEE best paper awards, an awardee of the IEEE INFOCOM Achievement Award and the ACM SIGMETRICS Test of Time Award, and a Fellow of IEEE, ACM, and CSEE. He is well known for his work on Internet congestion control and the semidefinite relaxation of optimal power flow problems in smart grids. His research on networks has been accelerating more than 1TB of Internet traffic every second since 2014. His research on smart grids is providing large-scale, cost-effective electric vehicle charging to workplaces. He received his B.S. from Cornell and PhD from Berkeley, both in EE.

This was a joint talk with IE Decision Systems Engineering Fall ’21 Seminar Series


Regents Professor
Texas A & M University

3rd September
2021

Revisiting Exploration versus Exploitation: Multi-Armed Bandits and Adaptive Control

Abstract: We consider the central problem of exploration versus exploitation that lies at the heart of several dynamic learning problems. We revisit the problem of regret in adaptive control and examine it in light of recent interest in solving large-scale bandit problems. For bandit problems, we present a family of schemes that admits simple index policies whose regret performance appears to be near the best currently available, at low computational complexity per decision. [Joint work with Ping-Chun Hsieh, Yu-Heng Hung, Xi Liu, Akshay Mete, Rahul Singh, Anirban Bhattacharya, and Le Xie].

Bio: P. R. Kumar is a Regents Professor, a University Distinguished Professor, Holder of the O’Donnell Foundation Chair I, and Professor in the Department of Electrical and Computer Engineering at Texas A&M University, and Franklin W. Woeltge Professor Emeritus in the Department of Electrical and Computer Engineering at University of Illinois, Urbana-Champaign. He is an Honorary Professor at IIT Hyderabad.

His current focus includes 5G, Wireless Networks, Cybersecurity, Cyberphysical Systems, Privacy, Unmanned Aerial System Traffic Management, Reinforcement Learning, Machine Learning, and Power Systems.


Assistant Professor
Arizona State University

23rd April
2021

Learning and Adaptation in Millimeter-Wave Networks: a Dual Timescale approach

 Video: Link

Abstract: Mobile broadband data traffic is expected to increase tremendously over the next decade, and cannot be accommodated by current sub-6GHz systems. The millimeter-wave frequencies in the 28-100 GHz range promise to overcome these limitations. Yet, millimeter-wave systems require highly directional beams at transmitter and receiver to overcome the severe pathloss and achieve the promised capacity increase, entailing a large signaling and control overhead. Thus, the design of schemes that learn the propagation environment and mobility of users and adapt to these features to achieve communication-efficient beam-alignment protocols is of utmost importance. In this talk, I will present a learning and adaptation scheme to address this problem, in which the dynamics of the communication beams are learned and then exploited to design adaptive beam-training procedures. Specifically, a dual timescale approach is proposed: on a large timescale, a recurrent deep variational autoencoder (R-VAE) uses noisy beam-training observations to learn a probabilistic model of beam dynamics; on a short timescale, an adaptive beam-training procedure is formulated as a partially observable (PO-) Markov decision process (MDP) and optimized using point-based value iteration (PBVI) by leveraging beam-training feedback and a probabilistic knowledge of beam pairs provided by the R-VAE. In turn, beam-training observations are used to refine the R-VAE via stochastic gradient descent in a continuous process of learning and adaptation. I will conclude this talk by presenting numerical evaluations and comparisons with state-of-the-art algorithms.


Bio: Dr. Nicolò Michelusi (Senior Member, IEEE) received the B.Sc, M.Sc. (both with honors), and Ph.D. degrees from the University of Padova, Italy, in 2006, 2009, and 2013, respectively, and the M.Sc. degree in telecommunications engineering from the Technical University of Denmark in 2009, as part of the T.I.M.E. double degree program. From 2013 to 2015, he was a Postdoctoral Research Fellow at the Ming-Hsieh Department of Electrical Engineering, University of Southern California, and from 2016 to 2020, he was an Assistant Professor at the School of Electrical and Computer Engineering, Purdue University. He is currently an Assistant Professor at the School of Electrical, Computer, and Energy Engineering, Arizona State University. His research, funded by the National Science Foundation and by DARPA, focuses on the design and analysis of distributed wirelessly connected systems using methods from stochastic optimization and machine learning. He authored 25 IEEE Journal papers and more than 50 conference papers. He is an Associate Editor for the IEEE Transactions on Wireless Communications, and a Reviewer for several IEEE journals. He was the Co-Chair for the Distributed Machine Learning and Fog Network workshop at IEEE INFOCOM 2021, the Wireless Communications Symposium at the IEEE Globecom 2020, the IoT, M2M, Sensor Networks, and Ad-Hoc Networking track at IEEE VTC 2020, and the Cognitive Computing and Networking symposium at ICNC 2018. He received the NSF CAREER award in 2021.

Postdoc

Arizona State University

16th April
2021

Network Theoretic Analysis of Maximum a Posteriori Detectors for Optimal Input Detection

 Video: Link

Abstract: In this talk, I will discuss maximum-a-posteriori (MAP) detectors for detecting unknown stochastic inputs (e.g., malicious attacks) driving specific nodes of a network, using noisy measurements from sensors non-collocated with the input nodes. Starting from a brief introduction to statistical hypothesis testing on state-space systems, I will discuss the key ideas that we leveraged from the theory of Toeplitz operators to obtain closed-form expressions for the performance of MAP detectors. Next, I will discuss how these expressions help us study the qualitative behavior of the detectors' performance as a function of network topology and the locations of the input and sensor nodes. Finally, I will highlight a counterintuitive result: for some classes of networks, the detectors' performance deteriorates as the graphical distance between the input nodes and the sensors increases. Our results provide structural insights into sensor placement from a detection-theoretic viewpoint.

Bio: Rajasekhar Anguluri is a post-doc at Arizona State University, working with Dr. Lalitha Sankar, Dr. Oliver Kosut, and Dr. Gautam Dasarathy. He obtained his M.S. in statistics and Ph.D. in Mechanical Engineering in 2019 from the University of California-Riverside under the supervision of Dr. Fabio Pasqualetti. His research interests include statistical signal processing, systems and control, and power systems. He likes to read about the history of sciences and mathematics and biographies of eccentric scientists.

Assistant Professor

Arizona State University

9th April
2021

Scaling Up Bayesian Uncertainty Quantification for Inverse Problems using Deep Neural Networks

 Video: Link

Abstract: Due to the importance of uncertainty quantification (UQ), the Bayesian approach to inverse problems has recently gained popularity in applied mathematics, physics, and engineering. However, traditional Bayesian inference methods based on Markov Chain Monte Carlo (MCMC) tend to be computationally intensive and inefficient for such high-dimensional problems. To address this issue, several methods based on surrogate models have been proposed to speed up the inference process. More specifically, the calibration-emulation-sampling (CES) scheme has been proven to be successful in large-dimensional UQ problems. In this work, we propose a novel CES approach for Bayesian inference based on deep neural network (DNN) models for the emulation phase. The resulting algorithm is not only computationally more efficient, but also less sensitive to the training set. Further, by using an autoencoder (AE) for dimension reduction, we have been able to speed up our Bayesian inference method by up to three orders of magnitude. Overall, our method, henceforth called the Dimension-Reduced Emulative Autoencoder Monte Carlo (DREAM) algorithm, is able to scale Bayesian UQ up to thousands of dimensions in physics-constrained inverse problems. Using two low-dimensional (linear and nonlinear) inverse problems, we illustrate the validity of this approach. Next, we apply our method to two high-dimensional numerical examples (elliptic and advection-diffusion) to demonstrate its computational advantage over existing algorithms.
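
The calibration-emulation-sampling structure can be sketched on a toy problem: evaluate the expensive forward model on a modest design, fit a neural-network emulator, then run MCMC entirely on the emulator. The sketch below shows only that generic CES loop (the autoencoder-based dimension reduction of DREAM is omitted); the toy forward map, prior box, and step sizes are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def forward(u):                       # stands in for an expensive forward model
    return np.array([np.sin(u[0]) + u[1] ** 2, u[0] * u[1]])

u_true = np.array([0.7, -0.3])
sigma = 0.05
y_obs = forward(u_true) + sigma * rng.normal(size=2)

# Calibration: evaluate the true model on a modest design of parameters.
U = rng.uniform(-1.5, 1.5, size=(300, 2))
G = np.array([forward(u) for u in U])

# Emulation: fit a small neural-network surrogate of the forward map.
emulator = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000,
                        random_state=0).fit(U, G)

def log_post(u):                      # Gaussian likelihood + flat prior on a box
    if np.any(np.abs(u) > 1.5):
        return -np.inf
    r = emulator.predict(u[None, :])[0] - y_obs
    return -0.5 * np.sum(r ** 2) / sigma ** 2

# Sampling: random-walk Metropolis driven entirely by the cheap emulator.
u, lp, samples = np.zeros(2), log_post(np.zeros(2)), []
for _ in range(5000):
    prop = u + 0.1 * rng.normal(size=2)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:
        u, lp = prop, lp_prop
    samples.append(u.copy())
print("posterior mean:", np.mean(samples[1000:], axis=0), "truth:", u_true)
```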

Bio: Dr. Shiwei Lan is an assistant professor at the School of Mathematical and Statistical Sciences, Arizona State University. His research interests include statistical computing, Bayesian modeling, and uncertainty quantification. He obtained his Ph.D. from the Department of Statistics at the University of California, Irvine in 2014. After graduation, he did a postdoc at the University of Warwick in the UK, working on functional inference methods and uncertainty quantification. He then returned to the US for another postdoc at the California Institute of Technology. Dr. Lan joined ASU in 2019.

Assistant Professor

Univ. of Washington

2nd April
2021

Blockchain protocols made efficient and scalable

 Video: Link

Abstract: Blockchain protocols such as Bitcoin have created the possibility of highly decentralized computing. However, existing blockchain protocols suffer from various problems: (1) energy inefficiency, (2) large confirmation latency (order of hours), and (3) lack of scalability (performance does not improve as more nodes are added to the system). In this talk, we highlight our work in solving these bottlenecks. A primary contribution is the abstraction of the blockchain using tree-processes, which have both a randomized component as well as an adversarial component. We then use this abstraction to prove sharp phase-transitions of these processes yielding security theorems for the corresponding blockchain protocols. We then show how to use this abstraction to achieve (1) energy efficiency and (2) optimal confirmation latency. Finally, we show that (3) the scalability bottleneck of blockchains can be solved using an interesting connection to the classical result of Blackwell in dynamic game theory.

Bio: Sreeram Kannan is an assistant professor at University of Washington, Seattle, where he runs the information theory lab focusing on information theory and its applications in communication networks, machine learning and blockchain systems. He was a postdoctoral scholar at University of California, Berkeley and a visiting postdoc at Stanford University between 2012-2014 before which he received his Ph.D. in Electrical and Computer Engineering and M.S. in Mathematics from the University of Illinois Urbana Champaign. For more details, please visit Professor Sreeram Kannan's website.

Anirbit Mukherjee

Postdoc

Wharton, Statistics

19th March
2021

Some Recent Progresses in Mathematics of Neural Training

 Video: Link

Abstract: One of the paramount mathematical mysteries of our times is to be able to explain the phenomenon of deep-learning. Neural nets can be made to paint while imitating classical art styles or play chess better than any machine or human ever and they seem to be the closest we have ever come to achieving "artificial intelligence". But trying to reason about these successes quickly lands us into a plethora of extremely challenging mathematical questions - typically about discrete stochastic processes. In this talk we will describe two of the most recent directions of our work in this quest.

Firstly we will explain how under mild distributional conditions we can construct iterative algorithms which can train a ReLU gate in the realizable setting in linear time while also keeping track of mini-batching. We will show how this algorithm does approximate training when there is a data-poisoning attack on the training labels. Such convergence proofs remain unknown for S.G.D and we will show in experiments that our algorithm very closely mimics the behaviour of S.G.D. 

In the second half of the talk, we will review this very new concept of "local elasticity" of a learning process and demonstrate how it appears to reveal certain universal phase changes during neural training. Then we will introduce a mathematical model which reproduces some of these key properties in a semi-analytic way. We will end by delineating various open questions in this theme of macroscopic phenomenology with neural nets.

This is joint work with Prof. Weijie Su (Wharton, Statistics), Prof. Sayar Karmakar (U Florida, Statistics), and Phani Deep (Amazon, India).

Bio: Anirbit Mukherjee obtained his Ph.D. in applied mathematics at Johns Hopkins University, advised by Prof. Amitabh Basu. He is now a postdoc in the Statistics Department at Wharton (UPenn) with Prof. Weijie Su. He specializes in deep-learning theory and has been awarded two fellowships from JHU for this research: the Walter L. Robb Fellowship and the inaugural Mathematical Institute for Data Science Fellowship. Earlier, he was a researcher in quantum field theory while completing his undergraduate degree in physics at the Chennai Mathematical Institute (CMI) and his master's in theoretical physics at the Tata Institute of Fundamental Research (TIFR).

Professor, Arizona State

12th March
2021

A User Guide to Low-Pass Graph Signal Processing and its Applications

 Video: Link

Abstract: The notion of graph filters can be used to define generative models for graph data. In fact, the data obtained from many examples of network dynamics may be viewed as the output of a graph filter. With this interpretation, classical signal processing tools such as frequency analysis have been successfully applied, with analogous interpretations, to graph data, generating new insights for data science. What follows is a user guide on a specific class of graph data, where the generating graph filters are low-pass, i.e., the filter attenuates contents in the higher graph frequencies while retaining contents in the lower frequencies. Our choice is motivated by the prevalence of low-pass models in application domains such as social networks, financial markets, and power systems. We illustrate how to leverage properties of low-pass graph filters to learn the graph topology or identify its community structure; to efficiently represent graph data through sampling, recover missing measurements, and de-noise graph data; and to use the low-pass property as a baseline for detecting anomalies.
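
As a small illustration of what "low-pass" means here, the sketch below applies the generic filter (I + alpha*L)^{-1} to a path graph and prints the signal energy at low versus high graph frequencies; the graph, filter, and alpha are illustrative assumptions rather than the application-specific models from the talk.

```python
import numpy as np

# Path graph on n nodes: combinatorial Laplacian L = D - A.
n = 20
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1
L = np.diag(A.sum(axis=1)) - A

# A simple low-pass graph filter: H = (I + alpha * L)^{-1}. Its frequency
# response 1 / (1 + alpha * lambda) shrinks high graph frequencies (large
# Laplacian eigenvalues) and keeps low ones nearly intact.
alpha = 2.0
H = np.linalg.inv(np.eye(n) + alpha * L)

lam, V = np.linalg.eigh(L)            # graph frequencies and graph Fourier basis
x = V @ np.ones(n)                    # signal with unit energy at every frequency
energy = (V.T @ (H @ x)) ** 2         # spectrum after low-pass filtering
print("lowest frequencies :", np.round(energy[:5], 3))
print("highest frequencies:", np.round(energy[-5:], 3))
```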

Bio: Anna Scaglione (M.Sc. '95, Ph.D. '99) is currently a Professor of Electrical, Computer and Energy Engineering at Arizona State University. She was previously Professor of Electrical and Computer Engineering at the University of California at Davis (2008-2014) and at Cornell University (2001-2008), where she became Associate Professor with tenure in 2006. Prior to joining the engineering faculty at Cornell, Scaglione was an Assistant Professor at the University of New Mexico (2000-2001). Dr. Scaglione's expertise is in the broad area of statistical signal processing with applications to communication networks, electric power systems/intelligent infrastructure, and network science. Dr. Scaglione was elected an IEEE Fellow in 2011. She is the recipient of the 2000 IEEE Signal Processing Transactions Best Paper Award and the 2013 IEEE Donald G. Fink Prize Paper Award for the best review paper of that year among all IEEE publications. Also, her work with her student earned the 2013 IEEE Signal Processing Society Young Author Best Paper Award (Lin Li), as well as several best conference paper awards. She was an SPS Distinguished Lecturer for 2019-2020 and is the recipient of the 2020 Technical Achievement Award from the IEEE Communications Society Technical Committee on Smart Grid Communications.

Asst. Professor, Columbia Univ.

5th March
2021

Learning in structured MDPs with convex cost functions: Improved regret bounds for inventory management

 Video: TBU

Abstract: The stochastic inventory control problem under censored demands is a fundamental problem in revenue and supply chain management. A simple class of policies called "base-stock policies" is known to be asymptotically optimal for this problem in certain settings, and further, the convexity of the long-run average cost under such policies has been established. In this work, we present a learning algorithm for the stochastic inventory control problem under a lost-sales penalty and positive lead times, when the demand distribution is a priori unknown. Our main result is a bound of O(L\sqrt{T}+D) on the regret against the best base-stock policy. Here T is the time horizon, L is the fixed and known lead time, and D is an unknown parameter of the demand distribution, described roughly as the number of time steps needed to generate enough demand to deplete one unit of inventory. Our results significantly improve the existing regret bounds for this problem. Notably, even though the state space of the underlying Markov Decision Process (MDP) in this problem is continuous and L-dimensional, our regret bounds depend linearly on L. Our techniques utilize the convexity of the long-run average cost and a newly derived bound on the 'bias' of base-stock policies to establish an almost black-box connection between the problem of learning and optimization in such MDPs and stochastic convex bandit optimization. The techniques presented here may be of independent interest for other settings that involve large structured MDPs but with convex cost functions.
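
For readers unfamiliar with the policy class, the sketch below simulates the long-run average cost of a base-stock policy with lost sales and a positive lead time; the demand distribution and cost parameters are illustrative assumptions, and the learning algorithm and regret analysis from the talk are not implemented here. The convexity of this cost in the base-stock level S is the structure the learning algorithm exploits.

```python
import numpy as np
from collections import deque

def avg_cost(S, lead=2, T=50_000, hold=1.0, lost=4.0, seed=0):
    """Long-run average cost of a base-stock policy with lost sales:
    each period, order up to level S counting stock on hand plus
    orders already in the pipeline; orders arrive after `lead` periods."""
    rng = np.random.default_rng(seed)
    on_hand, pipeline, cost = S, deque([0] * lead, maxlen=lead), 0.0
    for _ in range(T):
        on_hand += pipeline.popleft()             # order placed `lead` periods ago arrives
        d = rng.poisson(3.0)                      # illustrative demand distribution
        sold = min(d, on_hand)
        cost += hold * (on_hand - sold) + lost * (d - sold)
        on_hand -= sold
        pipeline.append(max(S - on_hand - sum(pipeline), 0))   # order up to S
    return cost / T

for S in (6, 9, 12, 15):
    print(S, round(avg_cost(S), 3))
```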


Bio: Shipra Agrawal is Cyrus Derman Assistant Professor of the Department of Industrial Engineering and Operations Research. She is also affiliated with the Department of Computer Science and the Data Science Institute, at Columbia University. Her research spans several areas of optimization and machine learning, including online optimization, multiarmed bandits, online learning, and reinforcement learning. Shipra serves as an associate editor for Management Science, Mathematics of Operations Research, and INFORMS Journal on Optimization. Her research is supported by an NSF CAREER award and faculty research awards from Google and Amazon.

Postdoc,
Arizona State Univ.

26th February
2021

Geometric Constraints for Learning Representations for Visual Data 

 

Abstract: Representation of visual data is a connecting link between the perceptual world and machine-based processing. Over the decades, the computer vision community has been dedicated to improving these representations so that they can assist humans in a wide range of applications, from medical imaging to visual search and face recognition systems. In this talk, I will present how geometric constraints can be used to aid in learning representations for various computer vision applications that have access to only a limited amount of labeled training data, abundant unlabeled training data, or a combination of the two. The talk will cover two types of geometric constraints: manifold and semantic. The first part of the talk will cover the application of manifold constraints in the unsupervised learning of disentangled representations, which improves the interpretability of deep networks. The second part of the talk will cover an interesting application of semantic constraints to visual animal biometrics for wildlife conservation. We will see that such constraints result in improved robustness and generalization of the representations for primate face recognition as well as tiger re-identification problems.

Asst. Professor, UIUC

12th February
2021

The Measurement and Mismeasurement of Trustworthy ML

 Video: Link

Abstract: Across healthcare, science, and engineering, we increasingly employ machine learning (ML) to automate decision-making that, in turn, affects our lives in profound ways. However, ML can fail, with significant and long-lasting consequences. Reliably measuring such failures is the first step towards building robust and trustworthy learning machines. Consider algorithmic fairness, where widely-deployed fairness metrics can exacerbate group disparities and result in discriminatory outcomes. Moreover, existing metrics are often incompatible. Hence, selecting fairness metrics is an open problem. Measurement is also crucial for robustness, particularly in federated learning with error-prone devices. Here, once again, models constructed using well-accepted robustness metrics can fail. Across ML applications, the dire consequences of mismeasurement are a recurring theme. This talk will outline emerging strategies for addressing the measurement gap in ML and how this impacts trustworthiness.


Bio: Sanmi (Oluwasanmi) Koyejo is an Assistant Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Koyejo's research interests are in developing the principles and practice of trustworthy machine learning. Additionally, Koyejo focuses on applications to neuroscience and healthcare. Koyejo completed his Ph.D. in Electrical Engineering at the University of Texas at Austin, advised by Joydeep Ghosh, and completed postdoctoral research at Stanford University. His postdoctoral research was primarily with Russell A. Poldrack and Pradeep Ravikumar. Koyejo has been the recipient of several awards, including a best paper award from the conference on uncertainty in artificial intelligence (UAI), a Kavli Fellowship, an IJCAI early career spotlight, and a trainee award from the Organization for Human Brain Mapping (OHBM). Koyejo serves on the board of the Black in AI organization.

Arizona State University

20th November
2020

The Multiple-Access Channel Is Stranger Than You Think

 Video: Link

Abstract: The multiple-access channel (MAC) is the network information theory problem in which multiple transmitters each send a message to a common receiver. While generally considered to be a straightforward extension of the point-to-point channel, in fact the MAC is much stranger than that. In this talk I will present three stories of MAC strangeness. The first story is about the difference between average probability of error and maximal probability of error. The second concerns the so-called cooperation facilitator model, in which a small amount of cooperation between transmitters has a disproportionate effect on achievable rates. The final story is about my recent work characterizing the second-order behavior of the MAC via a new measure of dependence called wringing dependence.

Arizona State University

13th November
2020

On Rate-optimal Uniform Concentration Inequalities for Shannon Entropies

 Video: Link

Abstract: We present a new type of exponential-decay concentration inequality that bounds the tail probability of the difference between the log-likelihood of discrete random variables and the negative entropy. In contrast to the classical Bernstein and Hoeffding inequalities when applied to log-likelihoods, the new bound is independent of the parameters and therefore does not blow up as the parameters approach 0 or 1. We further present a refined inequality that achieves the optimal rate in terms of the sample size and the number of possible values of the discrete variable. The key step in the proof is to bound the moment generating function. We prove the bound by viewing it as a non-convex optimization problem and showing that the duality gaps are zero using techniques from real analysis. The new inequalities strengthen certain theoretical results on likelihood-based methods for community detection in networks and can be applied to other likelihood-based methods for binary data.
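
In symbols, for i.i.d. draws X_1, ..., X_n from a pmf p on a finite alphabet, the quantity being concentrated is the deviation of the empirical log-likelihood from its mean, the negative entropy:

\frac{1}{n}\sum_{i=1}^{n} \log p(X_i) \;-\; \big(-H(p)\big), \qquad H(p) = -\sum_{x} p(x)\log p(x).

Since \mathbb{E}[\log p(X_i)] = -H(p), this difference has mean zero; the talk's inequalities bound its tails with constants that, unlike Bernstein- or Hoeffding-type bounds, do not degrade as individual probabilities p(x) approach 0 or 1. (The exact rates are stated in the paper and not reproduced here.)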

Adaptive Experimental Design for Best Identification and Multiple Testing

 Video: Link

Abstract: Adaptive experimental design (AED), or active learning, leverages already-collected data to guide future measurements, in a closed loop, to collect the most informative data for the learning problem at hand. In both theory and practice, AED can extract considerably richer insights than any measurement plan fixed in advance, using the same statistical budget. Unfortunately, the same mechanism of feedback that can aid an algorithm in collecting data can also mislead it: a data collection heuristic can become overconfident in an incorrect belief, then collect data based on that belief, yet give little indication to the practitioner that anything went wrong. Consequently, it is critical that AED algorithms are provably robust with transparent guarantees. In this talk I will present my group’s recent work on near-optimal approaches to adaptive testing with false discovery control and the best-arm identification problem for linear bandits, and how these approaches relate to, and leverage, ideas from non-adaptive optimal linear experimental design.

Bio: Kevin Jamieson is an Assistant Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington and is the Guestrin Endowed Professor in Artificial Intelligence and Machine Learning. He received his B.S. in 2009 from the University of Washington, his M.S. in 2010 from Columbia University, and his Ph.D. in 2015 from the University of Wisconsin - Madison under the advisement of Robert Nowak, all in electrical engineering. He returned to the University of Washington as faculty in 2017 after a postdoc with Benjamin Recht at the University of California, Berkeley. Jamieson’s research explores how to leverage already-collected data to inform what future measurements to make next, in a closed loop. His work ranges from theory to practical algorithms with guarantees to open-source machine learning systems and has been adopted in a range of applications, including measuring human perception in psychology studies, adaptive A/B/n testing in dynamic web-environments, numerical optimization, and efficient tuning of hyperparameters for deep neural networks.

Arizona State University

23rd October
2020

Shared Spectrum / Radar-communications coexistence: Recent results

 Video: Link

Abstract: The limited availability of frequency spectrum requires greater spectral efficiency to meet the increasing demands for communication (comm.) and data services. Thus, we explore the possibility of diverse RF systems coexisting within the same frequency band as a means of improving spectral efficiency. Specifically, radars coexisting in the same frequency band as comm. systems are of interest, as this presents a set of new challenges for system design and analysis. We first develop a performance bound for a cooperative radar and comm. system coexisting within the same frequency band. Second, we develop new theory for predicting the receiver operating characteristics of a radar receiver cooperating with an in-band comm. system.

Arizona State University

16th October
2020

Bayesian Optimization: Challenges and Opportunities

Video: Link

Abstract: Bayesian Optimization is attracting increasing attention for its ability to be applied to arbitrarily complex problems. In this talk, we focus on continuous optimization problems of black-box functions, and we look into three challenges:

(i) Accelerating Bayesian optimization

(ii) Making use of multiple input sources to evaluate the expensive objective function

(iii) Scaling algorithms to high-dimensional cases.

In view of the first challenge, we present some of the speaker's research on integrating Local Optimization with Bayesian optimization within the general framework GLOBO. In view of challenge (ii), we present the work on Multi-Fidelity Bayesian Optimization and the associated challenges. Finally, we introduce BOFiP, a new approach to scaling Bayesian optimization that makes use of game theory.
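
For context on the baseline being accelerated and scaled, a vanilla Bayesian optimization loop (a Gaussian-process surrogate plus an expected-improvement acquisition) looks roughly as follows. This is not GLOBO, the multi-fidelity method, or BOFiP; the objective, kernel, and budget are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x) + 0.3 * x ** 2        # black-box objective (minimize)

X = rng.uniform(-2, 2, size=(3, 1))               # small initial design
y = f(X).ravel()
grid = np.linspace(-2, 2, 400).reshape(-1, 1)

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                  normalize_y=True).fit(X, y)
    mu, sd = gp.predict(grid, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement
    x_next = grid[np.argmax(ei)]                        # most promising point
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next)[0])

print("best point:", X[np.argmin(y)].item(), "value:", y.min())
```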

Georgia Institute of Technology

9th October
2020

Mitigating the Impact of Bias in Selection Algorithms

 

Abstract: The introduction of automation into the hiring process has put a spotlight on a persistent problem: discrimination in hiring on the basis of protected-class status. Left unchecked, algorithmic applicant-screening can exacerbate pre-existing societal inequalities and even introduce new sources of bias; if designed with bias-mitigation in mind, however, automated methods have the potential to produce fairer decisions than non-automated methods. In this work, we focus on selection algorithms used in the hiring process (e.g., resume-filtering algorithms) given access to a "biased evaluation metric". That is, we assume that the method for numerically scoring applications is inaccurate in a way that adversely impacts certain demographic groups.

We analyze the classical online secretary algorithms under two models of bias or inaccuracy in evaluations: (i) first, we assume that the candidates belong to disjoint groups (e.g., race, gender, nationality, age), with unknown true utility Z, and “observed” utility Z/\beta for some unknown \beta that is group-dependent, (ii) second, we propose a “poset” model of bias, wherein certain pairs of candidates can be declared incomparable.  We show that in the biased setting, group-agnostic algorithms for online secretary problem are suboptimal, often causing starvation of jobs for groups with \beta>1. We bring in techniques from matroid secretary literature and order theory to develop group-aware algorithms that are able to achieve certain “fair” properties, while obtaining near-optimal competitive ratios for maximizing true utility of hired candidates in a variety of adversarial and stochastic settings. Keeping in mind the requirements of U.S. anti-discrimination law, however, certain group-aware interventions can be construed as illegal, and we will conclude the talk by partially addressing tensions with the law and ways to argue legal feasibility of our proposed interventions. This talk is based on work with Jad Salem and Deven R. Desai.
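
A toy simulation of the starvation phenomenon described above: run the classical group-agnostic secretary rule on observed scores in which one group's utilities are divided by \beta > 1, and count how often that group is hired. The group proportions, \beta, and utility distribution are illustrative assumptions, and the group-aware algorithms from the talk are not implemented.

```python
import numpy as np

rng = np.random.default_rng(0)

def secretary_hire(observed):
    """Classical group-agnostic secretary rule: watch the first n/e candidates,
    then hire the first one beating everything seen so far."""
    n = len(observed)
    k = int(n / np.e)
    threshold = observed[:k].max()
    for i in range(k, n):
        if observed[i] > threshold:
            return i
    return n - 1                                   # forced to take the last candidate

beta, n_trials, hires_from_B = 1.5, 5000, 0
for _ in range(n_trials):
    true_util = rng.uniform(size=100)
    group_B = rng.random(100) < 0.5                # half the candidates are in group B
    observed = np.where(group_B, true_util / beta, true_util)
    hires_from_B += group_B[secretary_hire(observed)]
print("fraction of hires from group B:", hires_from_B / n_trials)
```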

Arizona State University

25th September
2020

Alpha-loss: A Tunable Class of Loss Functions for Robust Learning

Video: Link

Abstract: In this talk, we introduce alpha-loss as a parameterized class of loss functions that resulted from operationally motivating information-theoretic measures. Tuning the parameter alpha from 0 to infinity yields a class of loss functions that admit continuous interpolation between log-loss (alpha=1), exponential loss (alpha=1/2), and 0-1 loss (alpha=infinity). We discuss how different regimes of the parameter alpha enable the practitioner to tune the sensitivity of their algorithm towards two emerging challenges in learning: robustness and fairness. We discuss classification properties of the class, information-theoretic interpretations, and the optimization landscape of the average loss as viewed through the lens of Strict-Local-Quasi-Convexity under the logistic regression model. Finally, we comment on ongoing and future work on different applications of alpha-loss, including deep neural networks, federated learning, and boosting.
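
For a classifier that assigns probability p to the true label, the parameterization used in the alpha-loss line of work is commonly written as follows (stated here for reference; the talk's exact notation may differ):

\ell_{\alpha}(p) \;=\; \frac{\alpha}{\alpha-1}\Big(1 - p^{\,1-1/\alpha}\Big), \qquad \alpha \in (0,\infty],

which recovers the log-loss -\log p as \alpha \to 1, the exponential loss 1/p - 1 at \alpha = 1/2, and 1 - p, a smooth surrogate for the 0-1 loss, as \alpha \to \infty.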

Arizona State University

18th September
2020

Distributed Algorithms for Optimization in Networks

Video: Link

Abstract: We will give an overview of distributed optimization algorithms, starting with the basic underlying idea illustrated on a prototype problem in machine learning. In particular, we will focus on a convex minimization problem where the objective function is given as the sum of convex functions, each of which is known by an agent in a network. The agents communicate over the network with the task of jointly determining a minimum of the sum of their objective functions. The communication network can vary over time, which is modeled through a sequence of graphs over a static set of nodes (representing the agents in the system). In this setting, we will discuss distributed first-order methods that make use of an agreement protocol, a mechanism that replaces the role of a coordinator. We will discuss some refinements of the basic method and conclude with more recent developments of fast methods that can match the performance of centralized methods.
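
The basic method described above can be sketched in a few lines: each agent mixes its iterate with its neighbors' (the agreement protocol) and then takes a gradient step on its own local objective. The ring network, quadratic local objectives, and constant step size below are illustrative assumptions; with a constant step the agents reach only a neighborhood of the optimum, which is part of what the refinements and fast methods mentioned at the end address.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, d, lr, T = 5, 3, 0.05, 500

# Each agent i privately holds f_i(x) = 0.5 * ||A_i x - b_i||^2;
# the network's goal is to minimize the sum of the f_i.
A = rng.normal(size=(n_agents, 4, d))
b = rng.normal(size=(n_agents, 4))

# Doubly stochastic mixing matrix for a fixed ring network (agreement protocol).
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

x = np.zeros((n_agents, d))                  # every agent keeps its own iterate
for _ in range(T):
    grads = np.array([A[i].T @ (A[i] @ x[i] - b[i]) for i in range(n_agents)])
    x = W @ x - lr * grads                   # consensus step + local gradient step

# Compare with the centralized solution of the same least-squares problem.
x_star = np.linalg.lstsq(A.reshape(-1, d), b.reshape(-1), rcond=None)[0]
print("max agent error:", np.abs(x - x_star).max())
```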

Arizona State University

11th September
2020

Clinical speech analytics: algorithms, applications, and information limits

Video: Link

Abstract: The ability to share our thoughts and ideas through spoken communication is fragile. Even the simplest verbal response requires a complex sequence of events. It requires thinking of the words that best convey your message; sequencing these words appropriately; and then sending signals to the muscles required to produce speech. The slightest damage to the brain areas that orchestrate these events can manifest in speech and language problems. These disturbances offer a window into brain functioning and have gained popularity as digital biomarkers in clinical applications in neurology. In the first part of this presentation, I will present an overview of several projects where we use interpretable measures of speech and language production as proxies for motor and cognitive health. I will provide an overview of how these algorithms are validated and what clinical questions they can help answer. If time permits, in the second part of the talk, I will discuss recent results from a project that aims to characterize the information limits inherent in speech as a diagnostic. This work provides a first look at how well we can answer fundamental questions like “What are the limits of how well I can detect a disturbance in neurological health from only recorded speech?”