Machine Learning Seminar Series


As of Sep 2022, this seminar series has been rebranded as the "CSE DSI Machine Learning Seminar Series", fully sponsored and supported by the UMN CSE Data Science Initiative (DSI). All future seminar info will be posted on the DSI website.

This seminar series is organized by … and sponsored by the College of Science and Engineering to bring together faculty, students, and industrial partners who are interested in the theoretical, computational, and applied aspects of machine learning, to pose problems, exchange ideas, and foster collaborations.

Seminars happen every Wednesday, 12:00–1:00 pm, in 2022. Interested in giving a talk? Please contact any of us on the organizing team. Web master: Le Peng (peng0347@umn.edu).

To subscribe to our email list:

UMN students: Log into your UMN web email (i.e., Gmail), go to Groups and search for "Machine Learning and Optimization Student". You should be able to join the group directly from there.

UMN researchers: please contact the organization team to be added to the ml-researcher list.

2022 Spring

Upcoming Events


Jun 22 2022
Jun 29 2022
Jul 6 2022
Jul 13 2022
Jul 20 2022
Jul 27 2022
Aug 3 2022
Aug 10 2022
Aug 17 2022
Aug 24 2022
Aug 31 2022: Dan Hendrycks (UC Berkeley)



Add the event to your Google calendar by choosing the event and clicking "copy to my calendar»"

(if that does not work, try "+Google Calendar")

**This is restricted to UMN accounts**

Aug. 10th

Speaker: Sam Park

  • CSAIL, MIT

  • Bio : Sung Min (Sam) Park is a PhD student at MIT, advised by Prof. Aleksander Madry. His research interests are in robust and reliable machine learning, with a recent focus on understanding models through the lens of data.

Talk information

  • Title: Datamodels: Predicting Predictions from Training Data

  • Time: Wednesday, Aug. 10th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join)

Abstract

Machine learning models tend to rely on an abundance of training data. Yet, understanding the underlying structure of this data, and models' exact dependence on it, remains a challenge.

In this talk, we will present a framework for directly modeling predictions as functions of training data. This framework, given a dataset and a learning algorithm, pinpoints, at varying levels of granularity, the relationships between train and test point pairs through the lens of the corresponding model class. Even in its most basic version, our framework enables many applications, including discovering data subpopulations, quantifying model brittleness via counterfactuals, and identifying train-test leakage.

Based on joint work with Andrew Ilyas, Logan Engstrom, Guillaume Leclerc, and Aleksander Madry.
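The core construction can be sketched in a few lines. Below is a toy illustration of the idea, not the paper's implementation: a fixed test point's output is regressed on indicator vectors of random training subsets, so large coefficients flag influential training examples. The retraining function here is a simulated stand-in; in the real framework it retrains the model on each subset.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_train, n_subsets, frac = 200, 500, 0.5

# Stand-in for retraining: in the real framework, train_and_eval(S) retrains
# the model on training subset S and returns its output (e.g., margin) on one
# test point. Here we simulate a ground truth where a few points matter.
true_influence = np.zeros(n_train)
true_influence[:5] = [2.0, -1.5, 1.0, 0.8, -0.5]
def train_and_eval(mask):
    return mask @ true_influence + 0.1 * rng.standard_normal()

masks = (rng.random((n_subsets, n_train)) < frac).astype(float)
outputs = np.array([train_and_eval(m) for m in masks])

# Sparse linear datamodel: outputs ≈ masks @ theta + b. Large |theta_i|
# flags training points that strongly influence this test prediction.
datamodel = Lasso(alpha=0.01).fit(masks, outputs)
top = np.argsort(-np.abs(datamodel.coef_))[:5]
print("most influential training points:", top)
```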

July 27th

Speaker: Bradley Erickson

  • Mayo Clinic

  • Bio : Brad Erickson, MD PhD, received his MD and PhD degrees from Mayo Clinic. He went on to train in radiology, followed by a Neuroradiology fellowship at Mayo, and has been on staff at Mayo for more than 20 years. He practices clinical Neuroradiology, has been chair of the Radiology Informatics Division, and was previously Associate Chair for Research. He has also been vice chair of Information Technology for Mayo Clinic. He has been awarded multiple external grants, including NIH grants on MS, brain tumors, polycystic kidney disease, and medical image processing. He is a former president of the Society for Imaging Informatics in Medicine, was the Chair of the Board of Directors for the American Board of Imaging Informatics, and is on the Board of IHE USA. Dr. Erickson has received numerous awards, including the SIIM Gold Medal and the Academic Radiology Distinguished Investigator award. He holds several patents and has been involved in three startup companies: he founded TeraMedica, the first commercially successful vendor-neutral archive, and FlowSIGMA, the first company focused on applying Intelligent Process Automation to the clinical side of medicine.

Talk information

  • Title: AI in Medical Imaging

  • Time: Wednesday, July 27th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join)

Abstract

There has been a huge wave of excitement about applying deep learning to medical images. The initial hype has led to many disappointments and painful lessons. But there have been a few successful applications, and the potential for gaining important clinical value from deep learning still exists. The lesson is that one must be very vigilant for subtle biases or unexpected associations that can lead to poor outcomes. But when properly executed, deep learning can also lead to valuable insights and improved medical care. This talk will provide a brief overview of deep learning applied to medical images, and then discuss both the painful lessons and the great potential of this technology.


July 20th

Speaker: Sören Laue

  • University of Kaiserslautern (Germany)

  • Bio : Soeren Laue studied mathematics, physics, and computer science at the University of Leipzig and the University of Saarbruecken, Germany. He obtained his PhD from the Max-Planck-Institute for Informatics in the area of approximation algorithms for geometric optimization problems. He became a senior researcher with a focus on efficient optimization algorithms and machine learning. In 2022, he became a professor at the University of Kaiserslautern, where he heads the Algorithms for Machine Learning group. His research interests include efficient optimization algorithms for machine learning problems and efficient computation of matrix and tensor derivatives. He created the MatrixCalculus.org web service.

Talk information

  • Title: GENO -- Optimization for Classical Machine Learning Made Fast and Easy

  • Time: Wednesday, July 20th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join)

Abstract

Most problems from classical machine learning can be cast as an optimization problem. I will present GENO (GENeric Optimization), a framework that lets the user specify a constrained or unconstrained optimization problem in an easy-to-read modeling language. GENO then generates a solver that can solve this class of optimization problems. The generated solver is usually as fast as hand-written, problem-specific, and well-engineered solvers. Often the solvers generated by GENO are faster by a large margin compared to recently developed solvers that are tailored to a specific problem class. I will dig into some of the algorithmic details, e.g., computing derivatives of matrix and tensor expressions, the optimization methods used in GENO, and their implementation in Python.
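To make the setup concrete, here is a hand-written analogue of what a generated solver does for one problem class; this is not GENO's actual interface, just ridge regression cast as unconstrained minimization with the analytic matrix-calculus gradient that tools like MatrixCalculus.org automate.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X, y, lam = rng.standard_normal((100, 20)), rng.standard_normal(100), 0.1

def f(w):  # objective: 0.5*||Xw - y||^2 + 0.5*lam*||w||^2
    r = X @ w - y
    return 0.5 * r @ r + 0.5 * lam * w @ w

def grad(w):  # analytic gradient: X^T (Xw - y) + lam*w, the kind of
    return X.T @ (X @ w - y) + lam * w  # derivative GENO computes symbolically

# A quasi-Newton solver with exact gradients, as a generated solver would use.
w = minimize(f, np.zeros(20), jac=grad, method="L-BFGS-B").x
print("solution norm:", np.linalg.norm(w))
```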


July 13th

Speaker: Efrat Shimron

  • Department of Electrical Engineering and Computer Sciences, UC Berkeley

  • Bio : Efrat is a postdoc in the Department of Electrical Engineering and Computer Sciences (EECS) at UC Berkeley, working with Prof. Miki Lustig. Her research focuses on developing machine learning techniques for MRI, with an emphasis on dynamic body imaging. She previously obtained a PhD from the Technion - Israel Institute of Technology, where she developed Compressed Sensing techniques for rapid MRI. Efrat's research has received many international excellence awards, and her work on identifying “data crimes” in medical AI algorithms received wide media coverage.

Talk information

  • Title: Implicit Data Crimes: Machine Learning Bias Arising from Misuse of Public Data

  • Time: Wednesday, July 13th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join)

Abstract

Although open-access databases are an important resource in the current deep learning (DL) era, they are sometimes used in an “off label” manner: data published for one task are used during training of algorithms for a different task. In this seminar I will show that this leads to biased, overly optimistic results of well-known inverse problem solvers, focusing on algorithms developed for magnetic resonance imaging (MRI) reconstruction. I will show that when such algorithms are trained using off-label data, they yield biased results, with up to 48% artificial improvement. The underlying cause is that public databases are often preprocessed using hidden pipelines, which change the data features and improve the inverse problem conditioning. My work shows that Compressed Sensing, Dictionary Learning, and Deep Learning algorithms are all prone to this form of bias. Furthermore, once trained, these algorithms exhibit poor generalization to real-world data. To raise awareness of the growing problem of naïve use of public databases, the study refers to the publication of biased results as “data crimes”.
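The conditioning effect is easy to demonstrate on synthetic data. The sketch below is my own toy setup, not the paper's experiments: an image that has been quietly low-pass filtered looks far easier to reconstruct from undersampled frequency data than a raw one, since the discarded frequencies carried little energy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 128
raw = rng.standard_normal((n, n))            # stand-in for a raw image

def lowpass(img, keep=24):                   # "hidden" preprocessing step
    F = np.fft.fftshift(np.fft.fft2(img))
    m = np.zeros_like(F); c = n // 2
    m[c-keep:c+keep, c-keep:c+keep] = 1
    return np.fft.ifft2(np.fft.ifftshift(F * m)).real

def zero_fill_recon(img, keep=32):           # undersample k-space, zero-fill
    F = np.fft.fftshift(np.fft.fft2(img))
    m = np.zeros_like(F); c = n // 2
    m[c-keep:c+keep, c-keep:c+keep] = 1
    return np.fft.ifft2(np.fft.ifftshift(F * m)).real

# The preprocessed image reconstructs almost perfectly; the raw one does not.
for name, img in [("raw", raw), ("preprocessed", lowpass(raw))]:
    err = np.linalg.norm(img - zero_fill_recon(img)) / np.linalg.norm(img)
    print(f"{name}: relative recon error {err:.3f}")
```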


Jun. 15th

Speaker: Alex Dimakis

  • UT Austin

  • Bio : Alexandros G. Dimakis (Alex Dimakis) is a UT Austin Professor and the co-director of the National AI Institute on the Foundations of Machine Learning (IFML). He received his Ph.D. from UC Berkeley and his Diploma degree from the National Technical University of Athens, Greece. He has received several awards, including the James Massey Award, an NSF CAREER Award, a Google research award, the UC Berkeley Eli Jury dissertation award, and several best paper awards. He served as an Associate Editor for IEEE Transactions on Information Theory, as an Area Chair for major machine learning conferences (NeurIPS, ICML, AAAI), and as the chair of the Technical Committee for MLSys 2021. His research interests include information theory and machine learning, with a current focus on unsupervised learning and inverse problems. He is an IEEE Fellow for contributions to distributed coding and learning.

Talk information

  • Title: Deep Generative Models and Inverse Problems

  • Time: Wednesday, Jun. 15th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) (video)

Abstract

Sparsity has given us MP3, JPEG, MPEG, Faster MRI and many fun mathematical problems. Deep generative models like GANs, VAEs, invertible flows and Score-based models are modern data-driven generalizations of sparse structure. We will start by presenting the CSGM framework by Bora et al. to solve inverse problems like denoising, filling missing data, and recovery from linear projections using an unsupervised method that relies on a pre-trained generator. We generalize compressed sensing theory beyond sparsity, extending Restricted Isometries to sets created by deep generative models. Our recent results include establishing theoretical results for Langevin sampling from full-dimensional generative models, generative models for MRI reconstruction and fairness guarantees for inverse problems.
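As a concrete reference point, here is a minimal CSGM-style recovery loop under my own toy assumptions: a tiny untrained MLP stands in for a pretrained generator, and the latent code is optimized so the generated signal matches the linear measurements.

```python
import torch

torch.manual_seed(0)
d_latent, d_signal, m = 10, 100, 30
G = torch.nn.Sequential(                 # stand-in for a pretrained generator
    torch.nn.Linear(d_latent, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, d_signal))
for p in G.parameters():
    p.requires_grad_(False)

x_true = G(torch.randn(d_latent))        # a signal in the generator's range
A = torch.randn(m, d_signal) / m**0.5    # random linear measurement matrix
y = A @ x_true

# CSGM: min_z ||A G(z) - y||^2, then report G(z) as the reconstruction.
z = torch.zeros(d_latent, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)
for _ in range(2000):
    opt.zero_grad()
    loss = ((A @ G(z) - y) ** 2).sum()
    loss.backward()
    opt.step()

rel_err = torch.norm(G(z) - x_true) / torch.norm(x_true)
print("relative error:", rel_err.item())
```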


Jun. 8th

Speaker: Sewoong Oh

  • University of Washington

  • Bio : Sewoong Oh is an Associate Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Prior to joining the University of Washington in 2019, he had been an Assistant Professor in the Department of Industrial and Enterprise Systems Engineering at the University of Illinois at Urbana-Champaign since 2012. He received his PhD from the Department of Electrical Engineering at Stanford University in 2011, under the supervision of Andrea Montanari. Following his PhD, he worked as a postdoctoral researcher at the Laboratory for Information and Decision Systems (LIDS) at MIT, under the supervision of Devavrat Shah.

Talk information

  • Title: Differential privacy meets robust statistics

  • Time: Wednesday, Jun. 8th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) (video)

Abstract

Consider a scenario where we are training a model or performing statistical analyses on a shared dataset with entries collected from several contributing individuals. Differential privacy provides protection against membership inference attacks that try to reveal sensitive information. Robust estimators provide protection against data poisoning attacks where malicious contributors inject corrupted data. Even though both types of attacks are powerful and easy to launch in practice, there is no practical algorithm providing protection against both simultaneously. In the first half of this talk, I will present the first efficient algorithm that guarantees both differential privacy and robustness to the corruption of a fraction of the data. I will focus on the canonical problem of mean estimation, which is a critical building block in many algorithms including stochastic gradient descent for training deep neural networks. In the second half of this talk, I will present a new framework that bridges differential privacy and robust statistics, which we call High-dimensional Propose-Test-Release (HPTR). This is a computationally intractable approach, but universally applicable to several statistical estimation problems including mean estimation, linear regression, covariance estimation, and principal component analysis. In most of these cases, HPTR achieves a near-optimal sample complexity by exploiting robust statistics in the algorithm, thus characterizing the minimax error rate of the corresponding private estimation problems for the first time. This talk is based on two papers: https://arxiv.org/abs/2102.09159 and https://arxiv.org/abs/2111.06578 .
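For context, the sketch below shows the standard non-robust baseline that the talk's algorithms improve upon: clip-and-noise private mean estimation via the Gaussian mechanism. It protects privacy but offers no guarantee against corrupted samples; the constants follow the textbook mechanism, with my own toy data.

```python
import numpy as np

def private_mean(X, clip=1.0, eps=1.0, delta=1e-5, seed=0):
    """(eps, delta)-DP mean via per-sample norm clipping + Gaussian noise."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xc = X * np.minimum(1.0, clip / norms)       # clip each sample to norm <= clip
    sens = 2 * clip / n                          # L2 sensitivity of the clipped mean
    sigma = sens * np.sqrt(2 * np.log(1.25 / delta)) / eps  # Gaussian mechanism
    return Xc.mean(axis=0) + rng.normal(0.0, sigma, size=d)

X = np.random.default_rng(1).normal(0.5, 1.0, size=(10000, 5))
print(private_mean(X, clip=3.0))
```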



Jun. 1st

Speaker: Francesco Croce

  • University of Tübingen

  • Bio : Francesco Croce is a Ph.D. student in the Machine Learning group at the University of Tübingen, Germany. He received his BS in Mathematics for Finance and Insurance and his MS in Mathematics from the University of Torino, Italy. His research focuses on adversarial attacks in different threat models and provable robustness.

Talk information

  • Title: Towards standardized and accurate evaluation of the robustness of image classifiers against adversarial attacks

  • Time: Wednesday, Jun. 1st, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) (video)

Abstract

It is well known that image classifiers are vulnerable to adversarial perturbations, and many defenses have been suggested to mitigate this phenomenon. However, testing the effectiveness of a defense is not straightforward. We propose a protocol for standardized and accurate evaluation of a large class of adversarial defenses, which makes it possible to benchmark and track the progress of adversarial robustness in several threat models. Finally, we discuss the current limitations of standardized evaluations, and in which cases adaptive attacks might still be necessary.
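The standardized protocol in this line of work is built from strong, parameter-free attack ensembles (AutoAttack is the reference implementation). As a self-contained illustration of the basic building block such protocols strengthen, here is a minimal L-infinity PGD attack in PyTorch, under my own simplified assumptions (inputs in [0, 1], cross-entropy loss).

```python
import torch

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-inf PGD: iterated signed-gradient ascent on the loss, projected
    back into the eps-ball around the clean input x."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = torch.nn.functional.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascent step
            delta.clamp_(-eps, eps)              # project into the eps-ball
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep image valid
        delta.grad.zero_()
    return (x + delta).detach()
```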

May 25th

Speaker: Gauri Joshi

  • ECE Dept, Carnegie Mellon University

  • Bio : Gauri Joshi has been an assistant professor in the ECE department at Carnegie Mellon University since September 2017. Previously, she worked as a Research Staff Member at IBM T. J. Watson Research Center. Gauri completed her Ph.D. at MIT EECS in June 2016. She received her B.Tech and M.Tech in Electrical Engineering from the Indian Institute of Technology (IIT) Bombay in 2010. Her awards and honors include the NSF CAREER Award (2021), the ACM Sigmetrics Best Paper Award (2020), the NSF CRII Award (2018), an IBM Faculty Research Award (2017), the Best Thesis Prize in Computer Science at MIT (2012), and the Institute Gold Medal of IIT Bombay (2010).

Talk information

  • Title: Tackling Computational Heterogeneity in Federated Learning

  • Time: Wednesday, May 25th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) (video)

Abstract

The future of machine learning lies in moving both data collection as well as model training to the edge. The emerging area of federated learning seeks to achieve this goal by orchestrating distributed model training using a large number of resource-constrained mobile devices that collect data from their environment. Due to limited communication capabilities as well as privacy concerns, the data collected by these devices cannot be sent to the cloud for centralized processing. Instead, the nodes perform local training updates and only send the resulting model to the cloud. A key aspect that sets federated learning apart from data-center-based distributed training is the inherent heterogeneity in data and local computation at the edge clients. In this talk, I will present our recent work on tackling computational heterogeneity in federated optimization, firstly in terms of heterogeneous local updates made by the edge clients, and secondly in terms of intermittent client availability.
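To fix ideas, here is a toy FedAvg round with heterogeneous local computation, a simplified sketch of the setting rather than the talk's algorithms: each client runs a different number of local SGD steps on its own data, and the server averages the resulting models weighted by client data size.

```python
import numpy as np

def local_sgd(w, X, y, steps, lr=0.01):
    for _ in range(steps):                      # local least-squares SGD
        g = X.T @ (X @ w - y) / len(y)
        w = w - lr * g
    return w

rng = np.random.default_rng(0)
d = 5
w_global = np.zeros(d)
# Clients with different data sizes AND different local step budgets.
clients = [(rng.standard_normal((n, d)), rng.standard_normal(n), steps)
           for n, steps in [(50, 1), (200, 5), (80, 20)]]

for rnd in range(10):                           # communication rounds
    updates, sizes = [], []
    for X, y, steps in clients:
        updates.append(local_sgd(w_global.copy(), X, y, steps))
        sizes.append(len(y))
    w_global = np.average(updates, axis=0, weights=sizes)  # FedAvg step
```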

Apr 20th

Speaker: Ruoqing Zhu

  • Statistics at the University of Illinois at Urbana-Champaign

  • Bio : Dr. Ruoqing Zhu is an Assistant Professor of Statistics at the University of Illinois at Urbana-Champaign. He received his B.S. in Mathematics from Nanjing University in 2006, M.A. in Statistics from Bowling Green State University in 2008, and Ph.D. in Biostatistics from the University of North Carolina at Chapel Hill in 2013. He worked as a postdoctoral associate in the Department of Biostatistics at Yale University from 2013 to 2015 before joining the Department of Statistics at UIUC. At UIUC, he is a course co-director and Biostat thread lead at the newly founded Carle Illinois College of Medicine. He is also affiliated with the National Center for Supercomputing Applications and the Carl R. Woese Institute for Genomic Biology.

Talk information

  • Title: Proximal Temporal Consistent Learning for Estimating Infinite Horizon Dynamic Treatment Regimes

  • Time: Wednesday, Apr. 20th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) (video)

Abstract

Recent advances in mobile health (mHealth) technology provide an effective way to monitor individuals' health statuses and deliver just-in-time personalized interventions. However, the practical use of mHealth technology raises unique challenges to existing methodologies on learning an optimal dynamic treatment regime. Many mHealth applications involve decision-making with large numbers of intervention options and under an infinite time horizon setting where the number of decision stages diverges to infinity. In addition, temporary medication shortages may cause optimal treatments to be unavailable, while it is unclear what alternatives can be used. To address these challenges, we propose a Proximal Temporal consistency Learning (pT-Learning) framework to estimate an optimal regime that is adaptively adjusted between deterministic and stochastic sparse policy models. The resulting minimax estimator avoids the double sampling issue in the existing algorithms. It can be further simplified and can easily incorporate off-policy data without mismatched distribution corrections. We study theoretical properties of the sparse policy and establish finite-sample bounds on the excess risk and performance error. The proposed method is implemented by our proximalDTR package and is evaluated through extensive simulation studies and the OhioT1DM mHealth dataset.

Apr 13th

Speaker: Lu Wei

  • Department of Computer Science at Texas Tech University

  • Bio : Lu Wei is currently an assistant professor in the Department of Computer Science at Texas Tech University. He received a Ph.D. degree from Aalto University, Finland. His current research interests lie in quantum information theory with applications to quantum information processing.

Talk information

  • Title: Recent progress on entanglement estimation

  • Time: Wednesday, Apr. 13th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) (video)

Abstract

Quantum entanglement is the physical phenomenon, the medium, and, most importantly, the resource that enables quantum technologies. In this talk, we survey recent results in estimating the degree of entanglement of the quantum bipartite model over different random state models and entropies. The problem may also be recast as a data mining problem involving symbolic data.

Apr 6th

Speaker: Emre Kıcıman

  • Microsoft Research

  • Bio : Emre Kıcıman is a Senior Principal Researcher at Microsoft Research, where his research interests span causal inference, machine learning, and AI’s implications for people and society.

Talk information

  • Title: Challenges in Causal Learning and Its Applications

  • Time: Wednesday, Apr. 6th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) (video)

Abstract

Causal inference and causal learning methods promise improved generalizability and robustness as compared to conventional machine learning approaches by relying on patterns generated by stable and robust causal mechanisms, rather than potentially spurious correlational patterns. However, causal approaches require making crucial assumptions about a system or data-generating process that may be unverifiable in the absence of interventional (experimental) data. We find that eliciting causal assumptions from domain experts and validating or refuting these assumptions are key challenges to the practical application of these methods. This talk describes our research efforts to address these challenges (e.g., by seeking new sources of causal assumptions), as well as experiences with DoWhy, our open-source causal inference library, and its third-party usage.
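The DoWhy library mentioned above makes this assumption-centric workflow explicit. The sketch below follows its documented four-step quickstart (model the causal graph, identify the estimand, estimate, refute); treat the exact API details as approximate, since they vary across versions.

```python
import dowhy.datasets
from dowhy import CausalModel

# Synthetic data with a known effect, generated by DoWhy's own helper.
data = dowhy.datasets.linear_dataset(
    beta=10, num_common_causes=3, num_samples=5000, treatment_is_binary=True)

# 1) Model: encode the assumed causal graph explicitly.
model = CausalModel(data=data["df"], treatment=data["treatment_name"],
                    outcome=data["outcome_name"], graph=data["gml_graph"])
# 2) Identify: derive an estimand from the graph assumptions.
estimand = model.identify_effect()
# 3) Estimate: fit a concrete estimator for that estimand.
estimate = model.estimate_effect(
    estimand, method_name="backdoor.linear_regression")
# 4) Refute: stress-test the assumptions, as emphasized in the talk.
refutation = model.refute_estimate(
    estimand, estimate, method_name="random_common_cause")
print(estimate.value)
print(refutation)
```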

Mar 23rd

Speaker: Mihaela van der Schaar

  • University of Cambridge

  • Bio : Mihaela van der Schaar is the John Humphrey Plummer Professor of Machine Learning, Artificial Intelligence and Medicine at the University of Cambridge and a Fellow at The Alan Turing Institute in London. In addition to leading the van der Schaar Lab, Mihaela is founder and director of the Cambridge Centre for AI in Medicine (CCAIM).

Talk information

  • Title: Using ML to discover the underlying models of medicine

  • Time: Wednesday, Mar. 23rd, 2022 12:00–1:00 pm

  • Location: Online via zoom (join)

Abstract

Not Provided

Feb 23rd

Speaker: Jingbo Liu

  • Department of Statistics, UIUC

  • Bio : Jingbo Liu received the B.S. degree in Electrical Engineering from Tsinghua University, Beijing, China in 2012, and the M.A. and Ph.D. degrees from Princeton University, Princeton, NJ, USA, in 2014 and 2017, all in electrical engineering. After two years as a postdoc at MIT IDSS, he joined the Department of Statistics at the University of Illinois Urbana-Champaign as an assistant professor. His research interests include signal processing, information theory, coding theory, high dimensional statistics, and related fields. His undergraduate thesis received the best undergraduate thesis award at Tsinghua University (2012). He gave a semi-plenary presentation at the 2015 IEEE Int. Symposium on Information Theory, Hong Kong, China. He was a recipient of the Princeton University Wallace Memorial Honorific Fellowship in 2016. His Ph.D. thesis received the Bede Liu Best Dissertation Award of Princeton and the Thomas M. Cover Dissertation Award of the IEEE Information Theory Society (2018).


Talk information

  • Title: A few interactions improve distributed nonparametric estimation

  • Time: Wednesday, Feb. 23rd, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) (video)

Abstract

In recent years, the fundamental limits of distributed/federated learning have been studied under many statistical models, but often in the setting of horizontal partitioning, where data sets share the same feature space but differ in samples. Nevertheless, vertical federated learning, where data sets differ in features, has been in use in finance and medical care. In this talk, we consider a natural distributed nonparametric estimation problem with vertically partitioned datasets. Under a given budget of communication cost or information leakage constraint, we determine the minimax rates for estimating the density at a given point, which reveals that interactive protocols strictly improve over one-way protocols. Our novel estimation scheme in the interactive setting is constructed by carefully identifying a set of auxiliary random variables. The result also implies that interactive protocols strictly improve over one-way protocols for biased binary sequences in the Gap-Hamming problem. (arXiv 2107.00211)

Feb 16th

Speaker: Hamed Hassani

  • Electrical and Systems Eng, U Penn

  • Bio : Hamed Hassani is currently an assistant professor in the Department of Electrical and Systems Engineering, as well as the Computer and Information Science and Statistics departments, at the University of Pennsylvania. Prior to that, he was a research fellow at the Simons Institute for the Theory of Computing (UC Berkeley), affiliated with the program on Foundations of Machine Learning, and a post-doctoral researcher in the Institute of Machine Learning at ETH Zurich. He received a Ph.D. degree in Computer and Communication Sciences from EPFL, Lausanne. He is the recipient of the 2014 IEEE Information Theory Society Thomas M. Cover Dissertation Award, the 2015 IEEE International Symposium on Information Theory Student Paper Award, a 2017 Simons-Berkeley Fellowship, a 2018 NSF-CRII Research Initiative Award, the 2020 Air Force Office of Scientific Research (AFOSR) Young Investigator Award, the 2020 National Science Foundation (NSF) CAREER Award, and a 2020 Intel Rising Star award. He has recently been selected as a distinguished lecturer of the IEEE Information Theory Society for 2022-2023.


Talk information

  • Title: Learning in the Presence of Distribution Shifts: How does the Geometry of Perturbations Play a Role?

  • Time: Wednesday, Feb. 16th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) [slides] [video]

Abstract

In this talk, we will focus on the emerging field of (adversarially) robust machine learning. The talk will be self-contained and no particular background on robust learning will be needed. Recent progress in this field has been accelerated by the observation that despite unprecedented performance on clean data, modern learning models remain fragile to seemingly innocuous changes such as small, norm-bounded additive perturbations. Moreover, recent work in this field has looked beyond norm-bounded perturbations and has revealed that various other types of distributional shifts in the data can significantly degrade performance. However, in general our understanding of such shifts is in its infancy and several key questions remain unaddressed.

Feb 2nd

Speaker: Alberto Bietti

  • NYU Center for Data Science

  • Bio : Alberto Bietti received his PhD in 2019 from Inria Grenoble, where he worked under the supervision of Julien Mairal. He was a postdoc at Inria Paris in 2020 and is currently a Faculty Fellow at the NYU Center for Data Science. His main research focus is on the theoretical foundations of deep learning, particularly through the lens of kernel methods.

Talk information

  • Title: Benefits of Convolutional Models

  • Time: Wednesday, Feb. 2nd, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) [slides] [video]

Abstract

Many supervised learning problems involve high-dimensional data such as images, text, or graphs. In order to make efficient use of data, it is often useful to leverage priors in the problem at hand, such as invariance to certain transformations or stability to small deformations. Empirically, deep convolutional architectures have been very successful on such problems, raising the question of how they are able to capture the structure of these problems for efficient learning. I study this question from a theoretical perspective using kernel methods, in particular convolutional kernels, which are constructed following similar architectural principles, and provide good empirical performance on standard vision benchmarks such as CIFAR-10. I will present three contributions that highlight the benefits of (deep) convolutional architectures in terms of stability to deformations and sample complexity.
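As a crude illustration of what a one-layer convolutional kernel computes (my own simplification, not the kernels from the talk): compare all pairs of image patches with a Gaussian kernel and pool the results. Deep versions stack such layers with pooling, mirroring convolutional architectures.

```python
import numpy as np

def patches(img, p=3):
    """Extract all overlapping p x p patches, flattened to vectors."""
    H, W = img.shape
    return np.array([img[i:i+p, j:j+p].ravel()
                     for i in range(H - p + 1) for j in range(W - p + 1)])

def conv_kernel(img1, img2, gamma=0.5):
    """Mean-pooled Gaussian kernel over all pairs of patches."""
    P1, P2 = patches(img1), patches(img2)
    d2 = ((P1[:, None, :] - P2[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    return np.exp(-gamma * d2).mean()

rng = np.random.default_rng(0)
a, b = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))
print(conv_kernel(a, a), conv_kernel(a, b))  # self-similarity vs. cross
```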

Feb 9th

Speaker: Qizhi He

  • CEGE, University of Minnesota

  • Bio : Qizhi He is an Assistant Professor in the Department of Civil, Environmental, and Geo-Engineering at the University of Minnesota. He received his M.A. in applied mathematics and Ph.D. in structural engineering and computational science from UC San Diego, in 2016 and 2018, respectively. Afterwards, he was a Postdoctoral Research Associate at Pacific Northwest National Laboratory (PNNL), where he developed scientific machine learning methods for modeling flow and transport processes in porous media. His current research interests lie at the intersection of computational mechanics, materials modeling, and data-driven computing, with a focus on advancing data-driven machine learning enabled computational tools to predict the mechanics of complex multiphysical processes and improve our fundamental understanding of multiscale materials and structures in engineered and natural systems.

Talk information

  • Title: Machine Learning Enhanced Computational Mechanics for Materials Modeling

  • Time: Wednesday, Feb. 9th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join)

Abstract

Not Provided

Jan. 26th

Speaker: Victor Zavala

  • Department of Chemical and Biological Engineering, University of Wisconsin-Madison

  • Bio : Victor M. Zavala is the Baldovin-DaPra Professor in the Department of Chemical and Biological Engineering at the University of Wisconsin-Madison and a computational mathematician in the Mathematics and Computer Science Division at Argonne National Laboratory. He holds a B.Sc. degree from Universidad Iberoamericana and a Ph.D. degree from Carnegie Mellon University, both in chemical engineering. He is on the editorial board of the Journal of Process Control, Mathematical Programming Computation, and Computers & Chemical Engineering. He is a recipient of NSF and DOE Early Career awards and of the Presidential Early Career Award for Scientists and Engineers. His research interests include computational modeling, statistics, control, and optimization.

Talk information

  • Title: The Euler Characteristic: A General Topological Descriptor for Complex Data

  • Time: Wednesday, Jan. 26th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) [video]

Abstract

Datasets are mathematical objects (e.g., point clouds, matrices, graphs, images, fields/functions) that have shape. This shape encodes important knowledge about the system under study. Topology is an area of mathematics that provides diverse tools to characterize the shape of data objects. In this work, we study a specific tool known as the Euler characteristic (EC). The EC is a general, low-dimensional, and interpretable descriptor of topological spaces defined by data objects. We review the mathematical foundations of the EC and highlight its connections with statistics, linear algebra, field theory, and graph theory. We discuss advantages offered by the use of the EC in the characterization of complex datasets; to do so, we illustrate its use in different applications of interest in chemical engineering such as process monitoring, flow cytometry, and microscopy. We show that the EC provides a descriptor that effectively reduces complex datasets and that this reduction facilitates tasks such as visualization, regression, classification, and clustering.
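For a binary image, the EC can be computed exactly from the cubical complex of foreground pixels: count the distinct vertices, edges, and pixel faces once each, then take the alternating sum V - E + F. A short self-contained sketch of this standard construction:

```python
import numpy as np

def euler_characteristic(img):
    """EC of a binary image: each foreground pixel is a unit square face;
    its four edges and four vertices are deduplicated across the image."""
    verts, edges = set(), set()
    faces = 0
    for i, j in zip(*np.nonzero(img)):
        faces += 1
        for di in (0, 1):
            for dj in (0, 1):
                verts.add((i + di, j + dj))
        edges.update({((i, j), (i, j + 1)), ((i + 1, j), (i + 1, j + 1)),
                      ((i, j), (i + 1, j)), ((i, j + 1), (i + 1, j + 1))})
    return len(verts) - len(edges) + faces

disk = np.ones((3, 3), dtype=int)      # filled square: EC = 1
ring = disk.copy(); ring[1, 1] = 0     # square with a hole: EC = 0
print(euler_characteristic(disk), euler_characteristic(ring))
```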

Jan. 19th

Speaker: Yihe Dong

  • Google Research

  • Bio : Yihe Dong is a machine learning researcher and engineer at Google, with interests in geometric deep learning and natural language processing.

Talk information

  • Title: Attention is not all you need

  • Time: Wednesday, Jan. 19th, 2022 12:00–1:00 pm

  • Location: Online via zoom (join) [slides] [video]

Abstract

I will be talking about our recent work on better understanding attention. Attention-based architectures have become ubiquitous in machine learning, yet our understanding of the reasons for their effectiveness remains limited. We show that self-attention possesses a strong inductive bias towards "token uniformity". Specifically, without skip connections or multi-layer perceptrons (MLPs), the output converges doubly exponentially to a rank-1 matrix. On the other hand, skip connections and MLPs stop the output from degeneration. Along the way, we develop a useful decomposition of attention architectures. This is joint work with Jean-Baptiste Cordonnier and Andreas Loukas.

Our paper and code are available online:

https://arxiv.org/abs/2103.03404;

https://github.com/twistedcubic/attention-rank-collapse.
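The headline effect is easy to reproduce numerically. Below is a toy experiment of my own construction, not the paper's code: repeatedly applying single-head softmax self-attention without skip connections or MLPs drives the token matrix toward rank one, measured by the singular-value mass outside the top component.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d = 16, 32
X = rng.standard_normal((n_tokens, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

def softmax(A):
    A = A - A.max(axis=1, keepdims=True)
    E = np.exp(A)
    return E / E.sum(axis=1, keepdims=True)

for layer in range(12):
    attn = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(d))
    X = attn @ (X @ Wv)                 # no skip connection, no MLP
    s = np.linalg.svd(X, compute_uv=False)
    # Fraction of singular-value mass outside the top component: a proxy
    # for distance to rank one, which shrinks rapidly with depth.
    print(f"layer {layer:2d}: rank-1 residual {s[1:].sum() / s.sum():.2e}")
```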