Machine Learning Seminar Series

May. 6th

Speaker: Yi Zhou

  • Department of ECE at the University of Utah

  • Bio : Yi Zhou is an Assistant Professor in the Department of ECE at the University of Utah. Before joining the University of Utah in 2019, he received a Ph.D. in Electrical and Computer Engineering from The Ohio State University in 2018 and worked as a post-doctoral fellow at the Information Initiative at Duke University. Dr. Zhou's research interests are nonconvex & distributed optimization, reinforcement learning, deep learning, statistical machine learning, and signal processing.

Talk information

  • Title: On the intrinsic potential and convergence of nonconvex minimax and bi-level optimization

  • Time: Thursday, May. 6th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Many emerging machine learning applications are formulated as either minimax or bi-level optimization problems. This includes, for example, problems in adversarial learning and invariant representation learning that impose robustness and invariance on the model via a minimax game, and problems in few-shot learning that train bi-level models for accomplishing different tasks. In this talk, we will provide a unified analysis of the popular optimization algorithms used for solving nonconvex minimax and bi-level optimization problems, based on a novel perspective of identifying the intrinsic potential function of these algorithms. In particular, our analysis establishes the model parameter convergence of these algorithms and characterizes the impact of local function geometry on their convergence rates. Our study reveals that, under some conditions, the dynamics of minimax and bi-level optimization algorithms are similar to those of gradient descent for nonconvex minimization.
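
As a toy illustration of the kind of algorithm dynamics analyzed in the talk, the sketch below runs two-timescale gradient descent-ascent on a simple nonconvex-strongly-concave minimax objective; the objective and step sizes are invented for illustration and are not taken from the speaker's work.

```python
# Toy sketch (illustrative objective and step sizes, not from the talk):
# two-timescale gradient descent-ascent (GDA) on a nonconvex-strongly-concave
# minimax problem  min_x max_y f(x, y) = (x^2 - 1)^2 + x*y - y^2.

def grad_x(x, y):
    return 4.0 * x * (x**2 - 1.0) + y    # df/dx

def grad_y(x, y):
    return x - 2.0 * y                   # df/dy

x, y = 2.0, 0.5
eta_x, eta_y = 0.01, 0.1                 # slower step on the min variable
for _ in range(5000):
    x -= eta_x * grad_x(x, y)            # descent on x
    y += eta_y * grad_y(x, y)            # ascent on y

print(f"approximate stationary point: x = {x:.3f}, y = {y:.3f}")
```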

May. 3rd (Cross-listed from Computer Science Colloquium )

Speaker: Rebecca Willett

  • University of Chicago, Department of Statistics and Computer Science

  • Bio : Rebecca Willett is a Professor of Statistics and Computer Science at the University of Chicago. Her research is focused on machine learning, signal processing, and large-scale data science. Willett received the National Science Foundation CAREER Award in 2007, was a member of the DARPA Computer Science Study Group, received an Air Force Office of Scientific Research Young Investigator Program award in 2010, and was named a Fellow of the Society of Industrial and Applied Mathematics in 2021. She is a co-principal investigator and member of the Executive Committee for the Institute for the Foundations of Data Science, helps direct the Air Force Research Lab University Center of Excellence on Machine Learning, and currently leads the University of Chicago’s AI+Science Initiative. She serves on advisory committees for the National Science Foundation’s Institute for Mathematical and Statistical Innovation, the AI for Science Committee for the US Department of Energy’s Advanced Scientific Computing Research program, the Sandia National Laboratories Computing and Information Sciences Program, and the University of Tokyo Institute for AI and Beyond. She completed her PhD in Electrical and Computer Engineering at Rice University in 2005 and was an Assistant then tenured Associate Professor of Electrical and Computer Engineering at Duke University from 2005 to 2013. She was an Associate Professor of Electrical and Computer Engineering, Harvey D. Spangler Faculty Scholar, and Fellow of the Wisconsin Institutes for Discovery at the University of Wisconsin-Madison from 2013 to 2018.

Talk information

  • Title: Machine Learning and Inverse Problems in Imaging

  • Time: Monday, May. 3rd, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Many challenging image processing tasks can be described by an ill-posed linear inverse problem: deblurring, deconvolution, inpainting, compressed sensing, and superresolution all lie in this framework. Recent advances in machine learning and image processing have illustrated that it is often possible to learn inverse problem solvers from training data that can outperform more traditional approaches by large margins. These promising initial results lead to a myriad of mathematical and computational challenges and opportunities at the intersection of optimization theory, signal processing, and inverse problem theory.

In this talk, we will explore several of these challenges and the foundational tradeoffs that underlie them. First, we will examine how knowledge of the forward model can be incorporated into learned solvers and its impact on the amount of training data necessary for accurate solutions. Second, we will see how the convergence properties of many common approaches can be improved, leading to substantial empirical improvements in reconstruction accuracy. Finally, we will consider mechanisms that leverage learned solvers for one inverse problem to develop improved solvers for related inverse problems.

This is joint work with Davis Gilton and Greg Ongie.
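
A minimal sketch of the first theme, incorporating the forward model into an iterative learned solver: the loop below alternates a data-consistency gradient step that uses a known forward operator A with a denoising step (plain soft-thresholding standing in for a learned network). The operator, signal model, and parameters are illustrative assumptions.

```python
# Minimal sketch (illustrative operator, signal model, and parameters): solving
# an ill-posed linear inverse problem y = A x + noise by alternating a
# data-consistency gradient step that uses the known forward operator A with a
# "denoising" step. Here soft-thresholding stands in for a learned denoiser,
# the basic template behind many unrolled / plug-and-play learned solvers.
import numpy as np

rng = np.random.default_rng(0)
n = 100
A = rng.normal(size=(60, n)) / np.sqrt(n)         # underdetermined forward model
x_true = np.zeros(n)
x_true[rng.choice(n, 8, replace=False)] = rng.normal(size=8)   # sparse signal
y = A @ x_true + 0.01 * rng.normal(size=60)

def denoise(z, lam):
    # placeholder for a learned denoiser / regularizer
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

x = np.zeros(n)
step = 1.0 / np.linalg.norm(A, 2) ** 2            # step size set by the forward model
for _ in range(300):
    x = x - step * A.T @ (A @ x - y)              # data-consistency step uses A
    x = denoise(x, lam=0.01)                      # a learned prior would go here

print("relative reconstruction error:",
      round(np.linalg.norm(x - x_true) / np.linalg.norm(x_true), 3))
```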

Apr. 2nd (Cross-listed from Computer Science Colloquium )

Speaker: Qiaomin Xie

  • School of Operations Research and Information Engineering at Cornell University

  • Bio : Qiaomin Xie is a visiting assistant professor in the School of Operations Research and Information Engineering (ORIE) at Cornell. Prior to that, she was a postdoctoral researcher with LIDS at MIT, and was a research fellow at the Simons Institute during Fall 2016. Qiaomin received her Ph.D. degree in Electrical and Computer Engineering from the University of Illinois Urbana-Champaign, and her B.E. degree in Electronic Engineering from Tsinghua University. Her research interests lie in the fields of stochastic networks, reinforcement learning, and computer and network systems. She is the recipient of the Google System Research Award 2020, the UIUC CSL PhD Thesis Award 2017, and the best paper award from the IFIP Performance Conference 2011.

Talk information

  • Title: Reinforcement Learning for Complex Environments: Tree Search, Function Approximators and Markov Games

  • Time: Friday, Apr. 2nd, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Recent literature has witnessed much progress on the algorithmic and theoretical foundations of Reinforcement Learning (RL), particularly for single-agent problems with small state/action spaces. Our understanding and algorithmic toolbox for RL in complex environments, however, remain relatively limited. In this talk, I will discuss some of my work on scalable and provably efficient RL for the challenging settings with large spaces and multiple strategic agents.

First, I will focus on simulation-based methods, as exemplified by Monte-Carlo Tree Search (MCTS). MCTS is a powerful paradigm for online planning that enjoys remarkable empirical success, but lacks theoretical understanding. We provide a complete and rigorous non-asymptotic analysis of MCTS. Our analysis develops a general framework based on a hierarchy of bandits, and highlights the importance of using a non-standard confidence bound (also used by AlphaGo) for convergence. I will further discuss combining MCTS with supervised learning and its generalization to continuous action space.
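
For readers unfamiliar with the bandit view of MCTS, the sketch below shows the standard UCB1 rule that tree search applies recursively at each node; the reward model and constants are invented for illustration, and the talk's point is precisely that a non-standard confidence bound (as used by AlphaGo) is needed for rigorous convergence guarantees.

```python
# Illustrative sketch: the UCB1 rule that MCTS applies recursively at each tree
# node to trade off exploration and exploitation. Rewards and constants are
# invented; the talk argues that a non-standard (polynomial) confidence bound,
# as used by AlphaGo, is what actually yields convergence guarantees for MCTS.
import math
import random

random.seed(0)
true_means = [0.3, 0.5, 0.7]       # hypothetical mean rewards of three actions
counts = [0, 0, 0]
sums = [0.0, 0.0, 0.0]

for t in range(1, 2001):
    if 0 in counts:                # play every action once first
        a = counts.index(0)
    else:                          # then pick the action with the best UCB score
        a = max(range(3), key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]))
    reward = 1.0 if random.random() < true_means[a] else 0.0
    counts[a] += 1
    sums[a] += reward

print("pull counts per action:", counts)   # the best action dominates over time
```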

In the second part of the talk, I will discuss on-policy RL for zero-sum Markov games, which generalize Markov decision processes to multi-agent settings. We consider function approximation to deal with continuous and unbounded state spaces. Based on a fruitful marriage with algorithmic game theory, we develop the first computationally efficient algorithm for this setting, with a provable regret bound that is independent of the cardinality and ambient dimension of the state space.

Apr. 1st

Speaker: Rajdeep Chatterjee

  • School of Physics and Astronomy at University of Minnesota

  • Bio :

Talk information

  • Title: Applications of Machine Learning Methods by CMS Experiment at the CERN Large Hadron Collider

  • Time: Thursday, Apr. 1st, 2021, 10:00-11:00 am

  • Location: Online via zoom (join)

Abstract

The Compact Muon Solenoid (CMS) experiment is one of the two large, general-purpose detectors at the European Organization for Nuclear Research (CERN) Large Hadron Collider (LHC). The CMS Collaboration is a multinational scientific collaboration, supported by 44 funding agencies from around the world. The LHC began operation in 2010 and will continue to run with significant upgrades into the next decade. The CMS experiment has been operational and collecting data for seven of the past ten years.

The data are collected from approximately 130 million detector channels, with events of roughly 2 MBytes produced at a rate of 40 MHz. After two stages of real-time online selection, in which only those events that are of direct interest to the physics program are kept, one thousand events, or 2 GBytes of data, are stored per second for future offline processing. Using these data, the properties of the recently discovered Higgs boson and many other physics processes at the very highest particle energies are studied. Managing and analyzing the tens of PetaBytes of data and metadata is a major challenge for the CMS Collaboration, one that is only expected to become more demanding later in the decade, when the High Luminosity-LHC becomes operational with a collision rate that is a factor of five larger.
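
The quoted rates imply an enormous online data reduction; a quick back-of-the-envelope check, using only the figures quoted above, is shown below.

```python
# Back-of-the-envelope check of the data rates quoted above.
event_size_bytes = 2e6        # ~2 MBytes per event
collision_rate_hz = 40e6      # 40 MHz
stored_events_per_s = 1000    # kept after the two-stage online selection

raw_rate = event_size_bytes * collision_rate_hz       # bytes/s off the detector
stored_rate = event_size_bytes * stored_events_per_s  # bytes/s written to storage

print(f"raw rate:    {raw_rate / 1e12:.0f} TB/s")     # ~80 TB/s
print(f"stored rate: {stored_rate / 1e9:.0f} GB/s")   # ~2 GB/s, as quoted
print(f"online selection keeps 1 in {collision_rate_hz / stored_events_per_s:,.0f} events")
```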

In order to cope with these challenges, the CMS Collaboration has relied on an extensive and diverse set of supervised and unsupervised machine learning methods at every stage of the experiment, from the online selection of events to the offline analysis of the recorded data. This has resulted both in more efficient detector operations and in a significant improvement in the quality of the physics results delivered. Examples include the use of Boosted Decision Trees (BDTs) implemented in Field Programmable Gate Arrays (FPGAs) to select only those events that have the characteristics of the physics processes under study, with the requisite decision made within hundreds of nanoseconds. Autoencoders have been used to monitor the quality of the data being recorded and to flag anomalous detector channels. To optimize the identification and reconstruction of physics objects such as electrons, photons, and jets of particles, a variety of BDT- and Deep Neural Network-based algorithms have been developed. Many of the flagship physics results reported by the CMS Collaboration, such as the measurement of the properties of the recently discovered Higgs boson, have been made possible by the use of dedicated machine learning algorithms.

In this seminar, we will review the machine learning paradigm employed by the CMS Collaboration through specific illustrative examples and also indicate directions for the future. As the LHC approaches the end of its two-year shutdown and prepares for Run 3, the collaboration has a golden opportunity to further explore novel machine learning methods that will continue to revolutionize the physics output of the CMS experiment.

Mar. 26th (Cross-listed from Computer Science Colloquium )

Speaker: Biwei Huang

  • Carnegie Mellon University

  • Bio : Biwei Huang is a Ph.D. candidate in the program of Logic, Computation and Methodology at Carnegie Mellon University. Her research interests are mainly in three aspects: (1) automated causal discovery in complex environments with theoretical guarantees, (2) advancing machine learning from the causal perspective, and (3) scientific applications of causal discovery approaches. On the causality side, her research has delivered more reliable and practical causal discovery algorithms by considering the property of distribution shifts and allowing nonlinear relationships, general data distributions, selection bias, and latent confounders. On the machine learning side, her work has shown how the causal view helps in understanding and solving machine learning problems, including classification, clustering, forecasting in nonstationary environments, reinforcement learning, and domain adaptation. Her research contributions have been published in JMLR, ICML, NeurIPS, KDD, AAAI, IJCAI, and UAI. She recently successfully led a NeurIPS’20 workshop on causal discovery and causality-inspired machine learning. She is a recipient of the Presidential Fellowship at CMU and the Apple PhD fellowship in AI/ML.

Talk information

  • Title: Learning and Using Causal Knowledge: A Further Step Towards Machine Intelligence

  • Time: Friday, Mar. 26th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Causality has recently attracted much interest across research communities in machine learning, computer science, and statistics. One of the fundamental problems in causality is how to find the underlying causal structure or causal model. One focus of this talk, accordingly, is how to find causal relationships from observational data, known as causal discovery. Specifically, I will show recent methodological developments in causal discovery in the presence of distribution shifts, together with their theoretical guarantees and other related issues in causal discovery in complex environments. Besides learning causality, another problem of interest is how causality is able to help understand and advance machine learning. I will show how a causal perspective benefits domain adaptation and forecasting in nonstationary environments. With causal representations, one can naturally make predictions under active interventions and achieve the goal by changing the system properly. Even without interventions involved, they help characterize how the data distribution changes from passively observed data, so that knowledge can be transferred in an interpretable, principled, and efficient way.

Mar. 22nd (Cross-listed from Computer Science Colloquium)

Speaker: Hao Li

  • Amazon Web Services AI

  • Bio : Hao Li is an Applied Scientist at Amazon Web Services AI, Seattle, where he researches and develops efficient and automatic machine learning for the cloud-based image and video analysis service, Rekognition. He has contributed to the launch of new vertical services including Custom Labels, Content Moderation, and Lookout. Before joining AWS, he received his PhD in Computer Science from the University of Maryland, College Park, advised by Prof. Hanan Samet and Prof. Tom Goldstein. His research lies at the intersection of machine learning, computer vision, and distributed computing, with a focus on efficient, interpretable, and automatic machine learning on platforms ranging from high-performance clusters to edge devices. His notable research contributions include the first filter pruning method for accelerating CNNs/ResNets and the loss surface visualization for understanding the generalization of neural nets. His work on the trainability of quantized networks received the Best Student Paper Award at the ICML’17 Workshop on Principled Approaches to Deep Learning.

Talk information

  • Title: Seeking Efficiency and Interpretability in Deep Learning

  • Time: Monday, Mar. 22nd, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

The empirical success of deep learning in many fields, especially Convolutional Neural Networks (CNNs) for computer vision tasks, is often accompanied by significant computational cost for training, inference, and hyperparameter optimization (HPO). Meanwhile, the mystery of why deep neural networks can be effectively trained with good generalization has not been fully unveiled. With the pervasive application of deep neural networks for critical applications on both cluster and edge devices, there is a surge in demand for efficient, automatic, and interpretable model inference, optimization, and adaptation.

In this talk, I will present techniques we developed for reducing neural nets’ inference and training cost, together with a better understanding of their training dynamics and generalization ability. I will begin by introducing the filter pruning approach for accelerating the inference of CNNs and exploring the possibility of training quantized networks on devices with hardware constraints. Then, I will present how the loss surface of neural networks can be properly visualized with filter-normalized directions, which enables meaningful side-by-side comparisons of the generalization ability of neural nets trained with different architectures or hyperparameters. Finally, I will revisit the common practices of HPO for transfer learning tasks. By identifying the correlation among hyperparameters and the connection between task similarity and optimal hyperparameters, the black-box hyperparameter search process can be whitened and expedited.
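
A minimal sketch of the magnitude-based filter-pruning idea mentioned above: rank a convolutional layer's filters by L1 norm and drop the smallest ones. The layer shape and pruning ratio are illustrative assumptions.

```python
# Minimal sketch (illustrative layer shape and pruning ratio): magnitude-based
# filter pruning. Filters of a conv layer are ranked by L1 norm and the smallest
# are dropped, shrinking the layer's output channels and its inference cost.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32, 3, 3))     # (out_channels, in_channels, k, k)
prune_ratio = 0.5

l1 = np.abs(W).reshape(W.shape[0], -1).sum(axis=1)           # L1 norm per filter
keep = np.sort(np.argsort(l1)[int(prune_ratio * len(l1)):])  # keep the largest half
W_pruned = W[keep]

print("kept filters:", W_pruned.shape[0], "of", W.shape[0])
# Note: removing output channels here also removes the corresponding input
# channels of the next layer in a real network.
```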

Mar. 19th (Cross-listed from Computer Science Colloquium )

Speaker: Tao Yu

  • Computer Science at Yale University

  • Bio : Tao Yu is a fourth-year Ph.D. candidate in Computer Science at Yale University. His research aims to build conversational natural language interfaces (NLIs) that can help humans explore and reason over data in any application (e.g., relational databases and mobile apps) in a robust and trusted manner. Tao’s work has been published at top-tier conferences in NLP and Machine Learning (ACL, EMNLP, NAACL, and ICLR). Tao introduced and organized multiple popular shared tasks for building conversational NLIs, which have attracted more than 100 submissions from top research labs and which have become the standard evaluation benchmarks in the field. He designed and developed language models that achieve new state-of-the-art results for seven representative tasks on semantic parsing, dialogue, and question answering. He has worked closely with and mentored over 15 students and collaborated with about 20 researchers from Salesforce Research, Microsoft Research, Columbia University, UC Berkeley, the University of Michigan, and Cornell University. He has been on the program committee for about ten NLP and Machine Learning conferences and workshops, and was one of the main organizers of the Workshop on Interactive and Executable Semantic Parsing at EMNLP 2020. For more details, see Tao’s website: https://taoyds.github.io.

Talk information

  • Title: Learning to Build Conversational Natural Language Interfaces

  • Time: Friday, Mar. 19th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Natural language is a fundamental form of information and communication and is becoming the next frontier in computer interfaces. Natural Language Interfaces (NLIs) connect the data and the user, significantly promoting the possibility and efficiency of information access for many users besides data experts. All consumer-facing software will one day have a dialogue interface, the next vital leap in the evolution of search engines. Such intelligent dialogue systems should be able to understand the meaning of language grounded in various contexts and generate effective language responses in different formats for information requests and human-computer communication.

In this talk, I will cover three key developments that present opportunities and challenges for the development of deep learning technologies for conversational natural language interfaces. First, I will discuss the design and curation of large datasets to drive advancements towards neural-based conversational NLIs. Second, I will describe the development of scalable algorithms to parse complex and sequential questions to formal programs (e.g. mapping questions to SQL queries that can execute against databases). Third, I will discuss the general advances of language model pre-training methods to understand the meaning of language grounded in various contexts (e.g. databases and knowledge graphs). Finally, I will conclude my talk by proposing future directions towards human-centered, universal, and trustworthy conversational NLIs.
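
To make the semantic parsing step concrete, a hypothetical text-to-SQL pair (schema and query invented for illustration) looks like this:

```python
# Hypothetical text-to-SQL example (schema and query invented for illustration):
# the parser maps a natural-language question to a program that can be executed
# against a database.
example = {
    "question": "Which departments have more than 50 enrolled students?",
    "sql": (
        "SELECT d.name FROM departments AS d "
        "JOIN students AS s ON s.dept_id = d.id "
        "GROUP BY d.name HAVING COUNT(*) > 50"
    ),
}
print(example["question"])
print(example["sql"])
```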

Mar. 15th (Cross-listed from Computer Science Colloquium )

Speaker: Kayhan Batmanghelich

  • University of Pittsburgh School of Medicine

  • Bio : Kayhan Batmanghelich is an Assistant Professor in the Department of Biomedical Informatics and the Intelligent Systems Program, with a secondary appointment in the Computer Science Department, at the University of Pittsburgh, and adjunct faculty in the Machine Learning Department at Carnegie Mellon University. He received his Ph.D. from the University of Pennsylvania (UPenn) under the supervision of Prof. Ben Taskar and Prof. Christos Davatzikos. He spent three years as a postdoc in the Computer Science and Artificial Intelligence Lab (CSAIL) at MIT, working with Prof. Polina Golland. His research is at the intersection of medical vision, machine learning, and bioinformatics. His group develops machine learning methods that address the interesting challenges of AI in medicine, such as explainability, learning with limited and weak data, and integrating medical image data with other biomedical data modalities. His research is supported by awards from the NIH and NSF, as well as industry-sponsored projects.

Talk information

  • Title: Incorporating Medical Insight into Machine Learning Algorithms for Learning, Inference, and Model Explanation

  • Time: Monday, Mar. 15th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

The healthcare industry is arriving at a new era in which the medical communities increasingly employ computational medicine and machine learning. Despite significant progress in the modern machine learning literature, adoption of the new approaches has been slow in the biomedical and clinical research communities due to the lack of explainability and the limited data available. Such challenges present new opportunities to develop novel methods that address AI's unique challenges in medicine.

In this talk, we show examples of incorporating medical insight to improve the statistical power of association between various data modalities, to design a novel self-supervised learning algorithm, and to develop a context-specific model explainer. This general strategy can be employed to integrate other biomedical data, an exciting future research direction that will be discussed briefly.

Mar. 12th (Cross-listed from Computer Science Colloquium )

Speaker: Ali Anwar

  • IBM Research - Almaden

  • Bio : Ali Anwar is a Research Staff Member at IBM Research Almaden Center. He holds a Ph.D. degree in Computer Science from Virginia Tech. In his earlier years he worked as an open-source tools developer (GNU GDB) at Mentor Graphics. His research interest lies at the intersection of systems and machine learning. The overarching goal of his research is to enable efficient and flexible systems for the growing data demands of modern high-end applications running on existing as well as emerging computing platforms. His current ongoing work focuses on distributed machine/federated learning systems and platforms, serverless and microservice-based systems, and efficient storage for Docker containers.

His research has appeared in a number of premier conferences and workshops in computer systems, AI/ML, and high-performance computing, including USENIX FAST, ATC, HotStorage, ACM/IEEE SC, ACM HPDC, SoCC, AISec [Best Paper Award], and AAAI. He regularly performs professional community services and has served as a program committee member for conferences such as SC, HPDC, ICDCS, CCGrid, and a reviewer for journals like ToS, TPDS, TKDE, TCC and JPDC. He is also an associate editor for Neural Processing Letters. At IBM, he has been recognized as a 2019 Outstanding Research Accomplishment winner for Advancing Adversarial Robustness in AI Models. In 2020, he received two Research Accomplishment awards for his research on Enterprise-Strength Federated Learning for Hybrid Cloud and Edge, and Container Storage. He is also a recipient of Pratt Fellowship awarded by Dept. of Computer Science at Virginia Tech.

Talk information

  • Title: Analyze and rebuild: Redesigning distributed computing systems for the next killer app

  • Time: Friday, Mar. 12th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Modern data applications such as distributed machine learning are revolutionizing all aspects of computing-based scientific discovery. As new applications, algorithms, and techniques are invented, the underlying distributed system platforms supporting these uses face fundamentally new challenges. One such challenge is workload dynamicity, which renders static, design-time system decisions impractical in supporting ever-changing application needs. Studying the workload characteristics of these applications and making informed design decisions can significantly improve the efficiency of the underlying distributed system or platform that enables such applications. Similarly, resource and data heterogeneity also play an important role in determining the overall performance of these applications.

This talk covers two of my projects in which performing workload and resource usage analysis enabled us to design better systems. First, I will show how studying the workload characteristics of Docker, the de facto standard for data center container management, at enterprise scale using IBM production systems enabled us to better deal with workload dynamicity and to create a number of optimizations that improve application performance. Second, I will present how we enhanced the powerful Federated Learning approach in distributed machine learning by making it aware of the underlying platform characteristics, such as resource and data heterogeneity, and show how that heterogeneity can affect the robustness of trained models under adversarial attacks. I will conclude with a discussion of plans for my future research.
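
For readers unfamiliar with the federated setting mentioned above, the sketch below shows the basic federated-averaging step with clients holding heterogeneous (non-IID, differently sized) data; the data, model, and hyperparameters are invented for illustration and are not the systems discussed in the talk.

```python
# Minimal sketch (invented data, model, and hyperparameters): federated
# averaging with heterogeneous clients. Each client holds a different amount of
# non-IID linear-regression data, trains locally, and the server averages the
# local models weighted by dataset size.
import numpy as np

rng = np.random.default_rng(0)
d = 5
w_global = np.zeros(d)
client_sizes = [200, 50, 10]                     # heterogeneous data volumes
clients = []
for m in client_sizes:
    X = rng.normal(size=(m, d))
    w_local_true = rng.normal(size=d)            # non-IID local targets
    y = X @ w_local_true + 0.1 * rng.normal(size=m)
    clients.append((X, y))

for _ in range(20):                              # communication rounds
    updates, weights = [], []
    for X, y in clients:
        w = w_global.copy()
        for _ in range(5):                       # local (full-batch) gradient steps
            w -= 0.1 * X.T @ (X @ w - y) / len(y)
        updates.append(w)
        weights.append(len(y))
    w_global = np.average(updates, axis=0, weights=weights)

print("global model after FedAvg:", np.round(w_global, 3))
```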

Mar. 11th

Speaker: Shashi Shekhar

  • Department of Computer Science & Engineering at University of Minnesota

  • Bio : Shashi Shekhar is a McKnight Distinguished University Professor at the University of Minnesota (Computer Science faculty). For contributions to geographic information systems (GIS), spatial databases, and spatial data mining, he was elected an IEEE Fellow as well as an AAAS Fellow, and received the IEEE-CS Technical Achievement Award and the UCGIS Education Award. He was also named a key difference-maker for the field of GIS by the most popular GIS textbook. He has a distinguished academic record that includes 300+ refereed papers, a popular textbook on Spatial Databases (Prentice Hall, 2003), and an authoritative Encyclopedia of GIS (Springer, 2008).

Shashi is serving as a co-Editor-in-Chief of Geo-Informatica: An International Journal on Advances in Computer Sciences for GIS (Springer), and a series editor for the Springer-Briefs on GIS. Earlier, he served on the Computing Community Consortium Council (2012-15), and multiple National Academies' committees including Models of the World for USDOD-NGA (2015), Geo-targeted Disaster Alerts and Warning (2013), Future Workforce for Geospatial Intelligence (2011), Mapping Sciences (2004-2009) and Priorities for GEOINT Research (2004-2005). He also served as a general or program co-chair for the Intl. Conference on Geographic Information Science (2012), the Intl. Symposium on Spatial and Temporal Databases (2011) and ACM Intl. Conf. on Geographic Information Systems (1996). He also served on the Board of Directors of University Consortium on GIS (2003-4), as well as the editorial boards of IEEE Transactions on Knowledge and Data Eng. and IEEE-CS Computer Sc. & Eng. Practice Board.

In the early 1990s, Shashi's research developed core technologies behind in-vehicle navigation devices as well as web-based routing services, which revolutionized outdoor navigation in urban environments over the last decade. His recent research results played a critical role in evacuation route planning for homeland security and received multiple recognitions, including the CTS Partnership Award for significant impact on transportation. He pioneered the research area of spatial data mining via pattern families (e.g., collocation, mixed-drove co-occurrence, cascade), keynote speeches, survey papers, and workshop organization.

Shashi received a Ph.D. degree in Computer Science from the University of California (Berkeley, CA). More details are available from http://www.cs.umn.edu/~shekhar.

Talk information

  • Title: What is special about spatial and spatio-temporal data science?

  • Time: Thursday, Mar. 11th, 2021, 10:00-11:00 am

  • Location: Online via zoom (join)

Abstract

The importance of spatial and spatio-temporal data science is growing with the increasing incidence and importance of large datasets such as trajectories, maps, remote-sensing images, census and geo-social media. Applications include Public Health (e.g., monitoring spread of disease, spatial disparity, food deserts), Public Safety (e.g., crime hot spots), Public Security (e.g., common operational picture), Environment and Climate (change detection, land-cover classification), M(obile)-commerce (e.g., location-based services), etc.

Classical data science and machine learning techniques often perform poorly when applied to spatial and spatio-temporal data sets for several reasons. First, these datasets are embedded in continuous space with implicit relationships (e.g., distance), which are important. Second, the cost of spurious patterns is often high in many spatial application domains, which calls for guardrails (e.g., statistical significance tests) to reduce false positives and chance patterns. In addition, one of the common assumptions in classical statistical analysis and machine learning is that data samples are independently generated from identical distributions. However, this assumption is generally false due to spatio-temporal auto-correlation and variability. Ignoring autocorrelation and variability when analyzing data with spatial and spatio-temporal characteristics may produce hypotheses or models that are inaccurate or inconsistent with the data.
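
The autocorrelation point can be made concrete with Moran's I, a standard measure of spatial autocorrelation; the grid, field, and neighbor weights below are illustrative assumptions.

```python
# Illustrative sketch (invented grid, field, and neighbor weights): Moran's I,
# a standard measure of spatial autocorrelation. Values near +1 indicate
# spatially clustered (non-IID) data; values near 0 indicate spatial randomness.
import numpy as np

rng = np.random.default_rng(0)
side = 20
xx, yy = np.meshgrid(np.arange(side), np.arange(side))
values = np.sin(xx / 4.0) + np.cos(yy / 4.0) + 0.1 * rng.normal(size=(side, side))
x = values.ravel()                       # a smooth, autocorrelated field

# rook-contiguity spatial weights: 1 for grid neighbors, 0 otherwise
n = side * side
W = np.zeros((n, n))
for i in range(side):
    for j in range(side):
        for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            ni, nj = i + di, j + dj
            if 0 <= ni < side and 0 <= nj < side:
                W[i * side + j, ni * side + nj] = 1.0

z = x - x.mean()
morans_i = (n / W.sum()) * (z @ W @ z) / (z @ z)
print(f"Moran's I of the smooth field: {morans_i:.3f}")   # close to 1
```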

Thus, new methods are needed to analyze spatial and spatio-temporal data. This talk surveys common and emerging methods for spatial classification and prediction (e.g., spatial autoregression, GWR), as well as techniques for discovering interesting, useful and non-trivial patterns such as hotspots (e.g., circular, linear, arbitrary shapes), spatiotemporal interactions (e.g., co-locations, cascade, tele-connections), spatial outliers, and their spatio-temporal counterparts.

Mar. 8th (Cross-listed from Computer Science Colloquium )

Speaker: Raymond Yeh

  • Department of Computer Science at University of Illinois at Urbana-Champaign

  • Bio :

Talk information

  • Title: Extracting structures from data: The black-box, the manual and the discovered

  • Time: Monday, Mar. 8th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Representing structure in data is at the heart of computer vision and machine learning, i.e., the act of converting raw data into a useful mathematical form. In this talk, I will discuss solutions that are broadly characterized into three themes: the black-box, the manual, and the discovered. First, I will discuss how to use deep generative models to learn structures for face images and their application to image inpainting. Going beyond black-box models, I will explain how to manually impose structures in deep-nets for human pose regression. Specifically, I will introduce chirality nets, a family of deep-nets that respects the left/right symmetry of human poses. Lastly, I will illustrate how to discover pairwise word-to-object structures in the context of textual grounding and discuss current efforts towards discovering general structures.

Mar. 5th

Speaker: Neeraja Yadwadkar

  • Computer Science Department at Stanford University

  • Bio : Neeraja Yadwadkar is a post-doctoral research fellow in the Computer Science Department at Stanford University, working with Christos Kozyrakis. She is a Cloud Computing Systems researcher, with a strong background in Machine Learning (ML). Neeraja's research focuses on using and developing ML techniques for systems, and building systems for ML. Neeraja graduated with a PhD in Computer Science from the RISE Lab at University of California, Berkeley, where she was advised by Randy Katz and Joseph Gonzalez. Before starting her PhD, she received her masters in Computer Science from the Indian Institute of Science, Bangalore, India, and her bachelors from the Government College of Engineering, Pune.

Talk information

  • Title: Overcoming the User-Provider Divide in Cloud Computing

  • Time: Friday, Mar. 5th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Today, even after more than a decade of the cloud computing revolution, users still do not have predictable performance for their applications, and the providers continue to suffer loss of revenue due to poorly utilized resources. Moreover, the environmental implications of these inefficiencies are dire: Cloud-hosted data centers consume as much power as a city of a million people and emit roughly as much CO2 as the airline industry. Fighting these implications, especially in the post Moore's law era, is crucial.

My work points out that the root of these inefficiencies is the gap between the users and the providers. To overcome this divide, my research brings out two key insights for building systems that render the cloud smart: a cloud that is easy to use, adaptive, and efficient. First, we must design interfaces to these systems that are intuitive and expressive for users. Such interfaces should open a dialog between users and providers, allowing users to specify high-level application goals and transferring the responsibility of making low-level resource management decisions to the providers. This gives providers an opportunity to optimize the use of their resources while still best aligning with user goals. Second, to make resource management decisions in an adaptive manner in increasingly complex cloud systems, we must leverage data-driven or Machine Learning (ML) models. In doing so, my work uses and develops ML algorithms and studies the challenges that such data-driven models raise in the context of systems: modeling uncertainty, cost of training, and generalizability. In this talk, I will present two systems, INFaaS and PARIS, designed to demonstrate the efficacy of these two key insights. These systems represent key steps towards building a smart cloud: they significantly simplify the use of the cloud and improve resource efficiency while meeting user goals.

Mar. 4th

Speaker: Ying Cui

  • Department of Industrial and Systems Engineering at the University of Minnesota

  • Bio : Ying Cui is currently an assistant professor in the Department of Industrial and Systems Engineering at the University of Minnesota. Her research focuses on the mathematical foundation of data science, with an emphasis on optimization techniques for operations research, machine learning, and statistical estimation. Prior to UMN, she was a postdoctoral research associate at the University of Southern California. She received her Ph.D. from the Department of Mathematics at the National University of Singapore.

Talk information

  • Title: Solving Non-(Clarke)-Regular Optimization Problems in Statistical and Machine Learning

  • Time: Thursday, Mar. 4th, 2021, 10:00-11:00 am

  • Location: Online via zoom (join) (slides)

Abstract

Although we have witnessed growing interest from the continuous optimization community in solving nonconvex and nonsmooth optimization problems, most existing work focuses on Clarke-regular objectives or constraints, for which many friendly properties hold. In this talk, we will discuss the pervasiveness of non-(Clarke)-regularity in modern operations research and statistical estimation problems, which arises from the complicated composition of nonconvex and nonsmooth functions. Emphasis will be put on the difficulties brought by the non-regularity, both in terms of computation and statistical inference, and on our initial attempts to overcome them.

Mar. 1st (Cross-listed from Computer Science Colloquium )

Speaker: Yunan Luo

  • Department of Computer Science at University of Illinois at Urbana-Champaign

  • Bio : Yunan Luo is a Ph.D. student advised by Prof. Jian Peng in the Department of Computer Science, University of Illinois at Urbana-Champaign. Previously, he received his Bachelor’s degree in Computer Science from Tsinghua University in 2016. His research interests are in computational biology and machine learning. His research has been recognized by a Baidu Ph.D. Fellowship and a CompGen Ph.D. Fellowship.

Talk information

  • Title: Machine learning for large- and small-data biomedical discovery

  • Time: Monday, Mar. 1st, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

In modern biomedicine, the role of computation becomes more crucial in light of the ever-increasing growth of biological data, which requires effective computational methods to integrate the data in a meaningful way and unveil previously undiscovered biological insights. In this talk, I will discuss my research on machine learning for large- and small-data biomedical discovery. First, I will describe a representation learning algorithm for the integration of large-scale heterogeneous data that disentangles non-redundant information from noise and represents it in a way amenable to comprehensive analyses; this algorithm has enabled several successful applications in drug repurposing. Next, I will present a deep learning model that utilizes evolutionary data and unlabeled data to guide protein engineering in a small-data scenario; the model has been integrated into lab workflows and enabled the engineering of new protein variants with enhanced properties. I will conclude my talk with future directions for using data science methods to assist biological design and to support decision making in biomedicine.

Feb. 26th (Cross-listed from Computer Science Colloquium )

Speaker: Dongyeop Kang

  • University of California, Berkeley

  • Bio : Dongyeop Kang is a postdoctoral scholar at the University of California, Berkeley. He obtained his Ph.D. in the Language Technologies Institute of the School of Computer Science at Carnegie Mellon University. His Ph.D. study has been supported by Allen Institute for AI (AI2) fellowship, CMU presidential fellowship, and ILJU graduate fellowship. During the study, he interned at Facebook AI research, AI2, and Microsoft Research.

Talk information

  • Title: Toward human-centric language generation systems

  • Time: Friday, Feb. 26th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Natural language generation (NLG) is a key component of many language technology applications such as dialogue systems, question-answering systems, automatic email replies, and story generation. Despite the recent advances of massive language models like GPT-3, the text predicted by such systems is still far from human-like language. In fact, these systems most often produce text that is nonfactual, incoherent, or pragmatically inappropriate. Also, the lack of interaction with real users makes such systems less controllable and less practical. My research is focused on developing linguistically informed computational models for a wide range of generation tasks and building real-world NLG systems that can interact with humans. In this talk, I propose three steps to develop human-centric language generation systems: (i) studying linguistic theories, (ii) developing theory-informed models, and (iii) building human-machine cooperative systems. My research lies at the intersection of three fields: computational linguistics as a theoretical basis, modern machine learning as a powerful technical tool, and human-computer interaction as a robust, reliable interactive testbed.

Feb. 22nd

Speaker: Arvind Narayanan

  • Department of Computer Science & Engineering at the University of Minnesota

  • Bio : Arvind Narayanan is a Ph.D. Candidate in the Department of Computer Science & Engineering at the University of Minnesota, advised by Professor Zhi-Li Zhang and Professor Feng Qian. His research interests are broadly in the areas of emerging scalable network architectures (such as NFVs), 5G mobile networking, network data science, and content distribution networks (CDNs). He has published papers in several top venues such as WWW, IMC, SIGCOMM, CoNEXT, APNET, Journal of Voice, GLOBECOM, ICDCS, etc. Arvind's recent work on 5G (including the publicly released datasets) has become the de facto baseline for understanding and evaluating the evolution of commercial 5G network performance. His work on DeepCache received the Best Paper Award at the SIGCOMM Workshop NetAI'18. Arvind completed his M.S. in Computer Science in the same department, and graduated with a B.E. in Computer Engineering with highest distinction from the University of Mumbai, where he was also awarded Best Overall Student in his batch (1 out of 120).

Talk information

  • Title: Measuring, Mapping and Predicting Commercial 5G Performance: A UE Perspective

  • Time: Monday, Feb. 22nd, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

With its touted ultra-high bandwidth and low latency, 5th generation (5G) wireless technology is envisaged to usher in a new smart and connected world. The goals of our research on 5G are twofold. First, we want to gain empirical insights into the network and application performance of commercial 5G under several realistic settings, and compare them with its predecessor (4G/LTE). Second, we want to identify novel challenges that are 5G-specific and propose mechanisms to overcome them.

My talk consists of three parts. In the first part, I will describe our measurement study of commercial 5G networks, with a special focus on millimeter wave (mmWave) 5G. It is, to our knowledge, the first comprehensive characterization of 5G network performance on smartphones, closely examining the 5G service of three carriers (two mmWave carriers, one mid-band carrier) in three U.S. cities. This study finds that commercial mmWave 5G can achieve an impressive throughput of 2 Gbps. However, due to the known poor signal propagation characteristics of mmWave, 5G throughput perceived by the user equipment (UE) is highly sensitive to user mobility and obstructions, resulting in a high number of 4G-5G handoffs. Such characteristics of mmWave 5G can make the throughput fluctuate frequently and wildly (between 0 and 2 Gbps), which may confuse applications (e.g., video bitrate adaptation) and lead to highly inconsistent user experiences. Motivated by such insights, the second part of my talk will go beyond the basic measurement and describe Lumos5G, a novel and composable ML-based 5G throughput prediction framework that judiciously considers features and their combinations to make context-aware 5G throughput predictions. Through extensive on-field experiments and statistical analysis, we identify key UE-side factors affecting mmWave 5G performance. Besides geolocation, we quantitatively reveal several other UE-side contextual factors (such as geometric features between the UE and the 5G panel, mobility speed/mode, etc.) that impact 5G throughput, far more sophisticated than those impacting 4G/LTE. Instead of affecting performance independently, these factors may cause a complex interplay that is difficult to model analytically. We demonstrate that, compared to existing approaches, Lumos5G is able to achieve a 1.37x to 4.84x reduction in prediction error. This work can be viewed as a feasibility study for building what we envisage as a dynamic 5G performance map (akin to a Google traffic map). In the third part, I will use our 18 months of experience conducting field experiments on commercial 5G to give my thoughts on the current 5G landscape and highlight both the research opportunities and challenges offered by the 5G ecosystem.

For more information, visit us @ https://5gophers.umn.edu
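
To make the prediction task concrete, the toy sketch below maps invented UE-side context features (distance to the 5G panel, mobility speed, line-of-sight) to throughput with a simple least-squares model; Lumos5G itself is a far more sophisticated, composable ML framework, so treat this only as an illustration of context-aware throughput prediction.

```python
# Toy sketch (invented features, data, and model; not Lumos5G itself):
# context-aware 5G throughput prediction from UE-side features such as distance
# to the 5G panel, mobility speed, and a line-of-sight indicator.
import numpy as np

rng = np.random.default_rng(0)
m = 500
distance_m = rng.uniform(10, 200, m)
speed_mps = rng.uniform(0, 15, m)
line_of_sight = rng.integers(0, 2, m)

# hypothetical ground truth (Mbps) with noise
throughput = np.clip(
    1800 * line_of_sight * np.exp(-distance_m / 150)
    - 20 * speed_mps + 200 + 100 * rng.normal(size=m), 0, None)

X = np.column_stack([distance_m, speed_mps, line_of_sight, np.ones(m)])
coef, *_ = np.linalg.lstsq(X, throughput, rcond=None)   # simple context-aware model
mae = np.abs(X @ coef - throughput).mean()

print("coefficients [distance, speed, LoS, bias]:", np.round(coef, 2))
print(f"mean absolute error: {mae:.0f} Mbps")
```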

Feb. 19th (Cross-listed from Computer Science Colloquium )

Speaker: Michał Dereziński

  • Department of Statistics at the University of California, Berkeley

  • Bio : Michał Dereziński is a postdoctoral fellow in the Department of Statistics at the University of California, Berkeley. Previously, he was a research fellow at the Simons Institute for the Theory of Computing (Fall 2018, Foundations of Data Science program). He obtained his Ph.D. in Computer Science at the University of California, Santa Cruz, advised by professor Manfred Warmuth, where he received the Best Dissertation Award for his work on sampling methods in statistical learning. Michał's current research is focused on developing scalable randomized algorithms with robust statistical guarantees for machine learning, data science and optimization. His work on reducing the cost of interpretability in dimensionality reduction received the Best Paper Award at the Thirty-fourth Conference on Neural Information Processing Systems. More information is available at: https://users.soe.ucsc.edu/~mderezin/.

Talk information

  • Title: Bridging algorithmic and statistical randomness in machine learning

  • Time: Friday, Feb. 19th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Randomness is a key resource in designing efficient algorithms, and it is also a fundamental modeling framework in statistics and machine learning. Methods that lie at the intersection of algorithmic and statistical randomness are at the forefront of modern data science. In this talk, I will discuss how statistical assumptions affect the bias-variance trade-offs and performance characteristics of randomized algorithms for, among others, linear regression, stochastic optimization, and dimensionality reduction. I will also present an efficient algorithmic framework, called joint sampling, which is used to both predict and improve the statistical performance of machine learning methods, by injecting carefully chosen correlations into randomized algorithms.

In the first part of the talk, I will focus on the phenomenon of inversion bias, which is a systematic bias caused by inverting random matrices. Inversion bias is a significant bottleneck in parallel and distributed approaches to linear regression, second order optimization, and a range of statistical estimation tasks. Here, I will introduce a joint sampling technique called Volume Sampling, which is the first method to eliminate inversion bias in model averaging. In the second part, I will demonstrate how the spectral properties of data distributions determine the statistical performance of machine learning algorithms, going beyond worst-case analysis and revealing new phase transitions in statistical learning. Along the way, I will highlight a class of joint sampling methods called Determinantal Point Processes (DPPs), popularized in machine learning over the past fifteen years as a tractable model of diversity. In particular, I will present a new algorithmic technique called Distortion-Free Intermediate Sampling, which drastically reduced the computational cost of DPPs, turning them into a practical tool for large-scale data science.
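
The inversion bias mentioned above is easy to observe numerically: with plain uniform subsampling, the average of inverses of subsampled covariance matrices does not match the inverse of the full covariance. The sizes and sampling scheme below are illustrative assumptions.

```python
# Numerical illustration of inversion bias (sizes and sampling scheme are
# illustrative): under plain uniform subsampling, the average of inverses of
# subsampled covariance matrices is a biased estimate of the inverse of the
# full covariance.
import numpy as np

rng = np.random.default_rng(0)
n, d, k, trials = 2000, 5, 200, 500
X = rng.normal(size=(n, d))
full_inv = np.linalg.inv(X.T @ X / n)

avg_of_invs = np.zeros((d, d))
for _ in range(trials):
    S = X[rng.choice(n, size=k, replace=False)]    # uniform subsample of rows
    avg_of_invs += np.linalg.inv(S.T @ S / k) / trials

bias = np.linalg.norm(avg_of_invs - full_inv) / np.linalg.norm(full_inv)
print(f"relative inversion bias under uniform subsampling: {bias:.1%}")
# Volume Sampling injects correlations into the subset choice precisely so that
# this kind of averaged inverse becomes unbiased.
```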

Feb. 15th (Cross-listed from Computer Science Colloquium )

Speaker: Sarah Dean

  • Department of Electrical Engineering and Computer Science at UC Berkeley

  • Bio : Sarah is a PhD candidate in the Department of Electrical Engineering and Computer Science at UC Berkeley, advised by Ben Recht. She received her MS in EECS from Berkeley and BSE in Electrical Engineering and Math from the University of Pennsylvania. Sarah is interested in the interplay between optimization, machine learning, and dynamics in real-world systems. Her research focuses on developing principled data-driven methods for control and decision-making, inspired by applications in robotics, recommendation systems, and developmental economics. She is a co-founder of a transdisciplinary student group, Graduates for Engaged and Extended Scholarship in computing and Engineering, and the recipient of a Berkeley Fellowship and a NSF Graduate Research Fellowship.

Talk information

  • Title: Reliable Machine Learning in Feedback Systems

  • Time: Monday, Feb. 15th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Machine learning techniques have been successful for processing complex information, and thus they have the potential to play an important role in data-driven decision-making and control. However, ensuring the reliability of these methods in feedback systems remains a challenge, since classic statistical and algorithmic guarantees do not always hold.

In this talk, I will provide rigorous guarantees of safety and discovery in dynamical settings relevant to robotics and recommendation systems. I take a perspective based on reachability, to specify which parts of the state space the system avoids (safety) or can be driven to (discovery). For data-driven control, we show finite-sample performance and safety guarantees which highlight relevant properties of the system to be controlled. For recommendation systems, we introduce a novel metric of discovery and show that it can be efficiently computed. In closing, I discuss how the reachability perspective can be used to design social-digital systems with a variety of important values in mind.

Feb. 12th (Cross-listed from Computer Science Colloquium )

Speaker: Tianyi Zhou

  • Paul G. Allen School of Computer Science and Engineering at University of Washington

  • Bio : Tianyi Zhou is a Ph.D. candidate in the Paul G. Allen School of Computer Science and Engineering at University of Washington, advised by Professor Jeff A. Bilmes. His research interests are in machine learning, optimization, and natural language processing. His recent research focuses on transferring human learning strategies to machine learning in the wild, especially when the data are unlabeled, redundant, noisy, biased, or are collected via interaction, e.g., how to automatically generate a curriculum of data/tasks during the course of training. The studied problems cover supervised/semi-supervised/self-supervised learning, robust learning with noisy data, reinforcement learning, meta-learning, ensemble method, spectral method, etc. He has published ~50 papers at NeurIPS, ICML, ICLR, AISTATS, NAACL, COLING, KDD, AAAI, IJCAI, Machine Learning (Springer), IEEE TIP, IEEE TNNLS, IEEE TKDE, etc., with ~2000 citations. He is the recipient of the Best Student Paper Award at ICDM 2013 and the 2020 IEEE Computer Society Technical Committee on Scalable Computing (TCSC) Most Influential Paper Award.

Talk information

  • Title: Learning Like a Human: How, Why, and When

  • Time: Friday, Feb. 12th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Machine learning (ML) can surpass humans on certain complicated yet specific tasks. However, most ML methods treat samples/tasks equally, e.g., by taking a random batch per step and repeating many epochs of training on all data. This may work promisingly on well-processed data with sufficient computation, but it is extraordinarily suboptimal and inefficient from a human perspective, since we would never teach children or students in such a way. On the contrary, human learning is more strategic and smarter in selecting/generating the training contents for different learning stages, via experienced teachers, collaboration of learners, curiosity and diversity in exploration, tracking of learned knowledge and progress, distributing a task into sub-tasks, etc., all of which have been underexplored in ML. The selection and scheduling of data/tasks is another type of intelligence, as important as the optimization of model parameters on given data/tasks. My recent work aims to bridge this gap between human and machine intelligence. As we enter a new era of hybrid intelligence between humans and machines, it is important to make AI not only perform like humans in its outcomes but also benefit from human-like strategies in its training.

In this talk, I will present several curriculum learning techniques we developed for improving supervised/semi-supervised/self-supervised learning, robust learning with noisy data, reinforcement learning, ensemble learning, etc., especially when the data are imperfect and a curriculum can thus make a big difference. First, I will show how to translate human strategies for curriculum generation into discrete-continuous hybrid optimizations, which are challenging to solve in general but for which we can develop efficient and provable algorithms using techniques from submodular and convex/non-convex optimization. Curiosity and diversity play important roles in these formulations. Second, we build both empirical and theoretical connections between curriculum learning and the training dynamics of ML models on individual samples. Empirically, we find that deep neural networks are fast in memorizing some data but also fast in forgetting some others, so we can accurately identify the easily forgotten data from the training dynamics at very early stages and make future training focus only on them. Moreover, we find that the consistency of a model's output over time for an unlabeled sample is a reliable indicator of its prediction correctness and delineates the forgetting effects on previously learned data. In addition, the learning speed on samples/tasks provides critical information for future exploration. These discoveries are consistent with human learning strategies and lead to more efficient curricula for a rich class of ML problems. Theoretically, we derive a data selection criterion solely from the optimization of learning dynamics in continuous time. Interestingly, the resulting curriculum matches the previous empirical observations and has a natural connection to the neural tangent kernel in recent deep learning theories.
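
A minimal sketch of the bookkeeping behind the "fast memorized, fast forgotten" observation above: record per-sample correctness across epochs and count forgetting events (correct-to-incorrect transitions). The correctness matrix is simulated here purely for illustration; in practice it would be logged from the model's predictions during training.

```python
# Minimal sketch: counting per-sample "forgetting events" (correct-to-incorrect
# transitions across epochs). The correctness matrix is simulated here for
# illustration; in real training it would be logged from model predictions.
import numpy as np

rng = np.random.default_rng(0)
num_samples, num_epochs = 1000, 30
p_correct = rng.beta(5, 2, size=num_samples)                  # per-sample difficulty
correct = rng.random((num_epochs, num_samples)) < p_correct   # (epoch, sample)

flips = correct[:-1] & ~correct[1:]            # correct at epoch t, wrong at t+1
forgetting_events = flips.sum(axis=0)

# curriculum idea: concentrate later training on frequently forgotten samples
hard_subset = np.argsort(forgetting_events)[-100:]
print("max forgetting events for a single sample:", forgetting_events.max())
print("size of the frequently-forgotten subset:", hard_subset.size)
```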

Feb. 8th (Cross-listed from Computer Science Colloquium )

Speaker: Alex Lamb

  • University of Montreal

  • Bio : Alex Lamb is a PhD student at the University of Montreal advised by Yoshua Bengio and a recipient of the Twitch PhD Fellowship 2020. His research is on the intersection of developing new algorithms for machine learning and new applications. In the area of algorithms, he is particularly interested in (1) making deep networks more modular and richly structured and (2) improving the generalization performance of deep networks, especially across shifting domains. He is particularly interested in techniques which use functional inspiration from the brain and psychology to improve performance on real tasks. In terms of applications of Machine Learning, his most recent work has been on historical Japanese documents and has resulted in KuroNet, a publicly released service which generates automatic analysis and annotations to make classical Japanese documents (more) understandable to readers of modern Japanese.

Talk information

  • Title: Latent Data Augmentation and Modular Structure for Improved Generalization

  • Time: Monday, Feb. 8th, 2021, 11:15-12:15 pm

  • Location: Online via zoom (join)

Abstract

Deep neural networks have seen dramatic improvements in performance, with much of this improvement driven by new architectures and training algorithms with better inductive biases. At the same time, the future of AI lies in systems that run in an open-ended way, on data unlike what was seen during training and possibly drawn from a changing or adversarial distribution. These problems also require a greater scale and time horizon for reasoning, as well as consideration of a complex world system with many reused structures and subsystems. This talk will survey some areas where deep networks can improve their biases, as well as my research in this direction. These algorithms dramatically change the behavior of deep networks, yet they are highly practical and easy to use, conforming to simple interfaces that allow them to be easily dropped into existing codebases.

Feb. 5th (Cross-listed from Computer Science Colloquium )

Speaker: Wei Hu

  • Department of Computer Science at Princeton University

  • Bio : Wei Hu is a PhD candidate in the Department of Computer Science at Princeton University, advised by Sanjeev Arora. Previously, he obtained his B.E. in Computer Science from Tsinghua University. He has also spent time as a research intern at research labs of Google and Microsoft. His current research interest is broadly in the theoretical foundations of modern machine learning. In particular, his main focus is on obtaining solid theoretical understanding of deep learning, as well as using theoretical insights to design practical and principled machine learning methods.

Talk information

  • Title: On the Foundations of Deep Learning: Over-parameterization, Generalization, and Representation Learning

  • Time: Friday, Feb. 5th, 2021 11:15–12:15 pm

  • Location: Online via zoom (join)

Abstract

Despite the phenomenal empirical successes of deep learning in many application domains, its underlying mathematical mechanisms remain poorly understood. Mysteriously, deep neural networks in practice can often fit training data perfectly and generalize remarkably well to unseen test data, despite highly non-convex optimization landscapes and significant over-parameterization. Moreover, deep neural networks show an extraordinary ability to perform representation learning: the feature representations extracted from a neural network can be useful for other related tasks.

In this talk, I will present our recent progress on the theoretical foundations of deep learning. First, I will show that gradient descent on deep linear neural networks induces an implicit regularization effect towards low rank, which explains the surprising generalization behavior of deep linear networks on the low-rank matrix completion problem. Next, turning to nonlinear deep neural networks, I will talk about a line of studies on wide neural networks, where, by drawing a connection to neural tangent kernels, we can answer various questions such as how the training loss is minimized, why the trained network can generalize, and why certain components of the network architecture are useful; we also use these theoretical insights to design a new, simple, and effective method for training on noisily labeled datasets. Finally, I will analyze the statistical aspects of representation learning and identify conditions that enable efficient use of training data, bypassing a known hurdle in the i.i.d. tasks setting.
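
For readers unfamiliar with the deep linear network result mentioned above, it concerns deep matrix factorization for matrix completion. A schematic of the setup, with notation introduced here only for illustration:

```latex
% Deep linear network: the end-to-end matrix is an L-layer product,
% trained only on the observed entries \Omega of a ground-truth matrix M.
\[
  W_{\mathrm{e2e}} = W_L W_{L-1} \cdots W_1 , \qquad
  \min_{W_1,\dots,W_L} \;
  \frac{1}{2} \sum_{(i,j) \in \Omega} \bigl( (W_{\mathrm{e2e}})_{ij} - M_{ij} \bigr)^2 .
\]
% The result surveyed in the talk: gradient descent (from small initialization)
% implicitly biases W_{e2e} toward low rank, which is why the unobserved entries
% of a low-rank M are recovered well despite over-parameterization.
```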

Feb. 2nd (Cross-listed from Computer Science Colloquium )

Speaker: Hongyang Zhang

  • Toyota Technological Institute at Chicago

  • Bio : Hongyang Zhang is a postdoctoral fellow at the Toyota Technological Institute at Chicago, hosted by Avrim Blum and Greg Shakhnarovich. He obtained his Ph.D. from the CMU Machine Learning Department in 2019, advised by Maria-Florina Balcan and David P. Woodruff. His research interests lie at the intersection of theory and practice of machine learning, robustness, and AI security. His methods won the championship or ranked at the top in various competitions, such as the NeurIPS’18 Adversarial Vision Challenge (all three tracks), the Unrestricted Adversarial Examples Challenge hosted by Google, and the NeurIPS’20 Challenge on Predicting Generalization of Deep Learning. He also authored a book in 2017.

Talk information

  • Title: New Advances in (Adversarially) Robust and Secure Machine Learning

  • Time: Tuesday, Feb. 2nd, 2021 11:15–12:15 pm

  • Location: Online via zoom (join)

Abstract

Deep learning models are often vulnerable to adversarial examples. In this talk, we will focus on robustness and security of machine learning against adversarial examples. There are two types of defenses against such attacks: 1) empirical and 2) certified adversarial robustness.

In the first part of the talk, we will see the foundation of our winning system, TRADES, from the NeurIPS’18 Adversarial Vision Challenge, in which we won 1st place out of 400 teams and 3,000 submissions. Our study is motivated by an intrinsic trade-off between robustness and accuracy: we provide a differentiable and tight surrogate loss for this trade-off using the theory of classification-calibrated losses. TRADES achieves record-breaking performance on various standard benchmarks and challenges, including the adversarial benchmark RobustBench, the NLP benchmark GLUE, and the Unrestricted Adversarial Examples Challenge hosted by Google, and has motivated many new attack methods powered by our TRADES benchmark.
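
For reference, the published TRADES objective trades off a natural-accuracy term against a robustness regularizer over an epsilon-ball around each input. The schematic below uses notation chosen here for illustration; in practice the inner maximization is approximated with a few projected-gradient steps and the second loss is a KL divergence between softmax outputs.

```latex
\[
  \min_{f} \; \mathbb{E}_{(x,y)} \Bigl[
    \underbrace{\mathcal{L}\bigl(f(x),\, y\bigr)}_{\text{natural accuracy}}
    \; + \;
    \beta \max_{x' \in \mathbb{B}(x,\epsilon)}
    \underbrace{\mathcal{L}\bigl(f(x),\, f(x')\bigr)}_{\text{robustness regularizer}}
  \Bigr] ,
\]
% where \beta controls the trade-off between robustness and accuracy.
```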

In the second part of the talk, to complement empirical robustness with certification, we study certified adversarial robustness via random smoothing in the L_infty threat model. On the one hand, we show that random smoothing applied to a TRADES-trained classifier achieves state-of-the-art certified robustness when the L_infty perturbation radius is small. On the other hand, when the perturbation is large, i.e., does not scale with the inverse of the input dimension, we show that random smoothing is provably unable to certify L_infty robustness for any random noise distribution. The intuition behind our theory reveals an intrinsic difficulty of achieving certified robustness with “random noise based methods”, and it inspires new directions for future work.
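
For background on why the perturbation size matters, recall the standard Gaussian randomized-smoothing certificate: it yields an L2 radius that converts to an L_infty radius only at a sqrt(d) cost, so L_infty radii that do not shrink with the input dimension d are out of reach for this construction. A schematic summary, with notation introduced here for illustration:

```latex
% Smoothed classifier under Gaussian noise and its certified radii
% (p_A and p_B are the top-two class probabilities under the noise):
\[
  g(x) = \arg\max_{c} \; \Pr_{\delta \sim \mathcal{N}(0,\sigma^{2} I)} \bigl[ f(x+\delta) = c \bigr] ,
  \qquad
  R_{\ell_2} = \frac{\sigma}{2} \bigl( \Phi^{-1}(p_A) - \Phi^{-1}(p_B) \bigr) ,
  \qquad
  R_{\ell_\infty} \ge \frac{R_{\ell_2}}{\sqrt{d}} .
\]
```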

Jan. 28th

Speaker: Miryung Kim

  • Department of Computer Science at the University of California, Los Angeles

  • Bio : Miryung Kim is a Full Professor in the Department of Computer Science at the University of California, Los Angeles. She is known for her research on code clones---code duplication detection, management, and removal solutions. Recently, she has taken a leadership role in defining the emerging area of software engineering for data science. She received her B.S. in Computer Science from Korea Advanced Institute of Science and Technology and her M.S. and Ph.D. in Computer Science and Engineering from the University of Washington. She received various awards including an NSF CAREER award, Google Faculty Research Award, Okawa Foundation Research Award, and Alexander von Humboldt Foundation Fellowship. She was previously an assistant professor at the University of Texas at Austin and also spent time as a visiting researcher at Microsoft Research. She is the lead organizer of a Dagstuhl Seminar on SE4ML---Software Engineering for AI-ML based Systems. She is a Keynote Speaker at ASE 2019, a Program Co-Chair of ESEC/FSE 2022, and an Associate Editor of IEEE Transactions on Software Engineering.

Talk information

  • Title: Software Engineering for Data Analytics (SE4DA)

  • Time: Thursday, Jan. 28th, 2021 10:00–11:00 pm

  • Location: Online via zoom (join)

Abstract

We are at an inflection point where software engineering meets the data-centric world of big data, machine learning, and artificial intelligence. As software development gradually shifts toward the development of data analytics with AI and ML technologies, existing software engineering techniques must be re-imagined to provide the productivity gains that developers desire. We conducted a large-scale study of almost 800 professional data scientists in the software industry to investigate what a data scientist is, what data scientists do, and what challenges they face. This study found that ensuring correctness is a huge problem in data analytics.

We argue for re-targeting software engineering research to address new challenges in the era of data-centric software development. We showcase a few examples of our group's research on debugging and testing data-intensive applications, e.g., data provenance, symbolic-execution-based test generation, and fuzz testing in Apache Spark. We then conclude with open problems in software engineering to meet the needs of the AI and ML workforce.
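
To make the fuzz-testing example concrete, here is a toy, pure-Python sketch (our own illustration, not the group's tools or workflow) that throws random strings at a record-level transformation of the kind typically run inside a Spark map() and collects the inputs that crash it.

```python
import random
import string

def parse_record(line: str) -> tuple:
    """A toy record-level transformation of the kind used inside a Spark
    map(): parse 'name,age' into (name, age). Purely illustrative."""
    name, age = line.split(",")
    return name.strip(), int(age)

def fuzz(udf, trials=1000, seed=0):
    """Feed random strings to a record-level UDF and collect crashing inputs:
    the basic loop behind fuzz testing of data-intensive pipelines."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits + ", .-"
    failures = []
    for _ in range(trials):
        line = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 20)))
        try:
            udf(line)
        except Exception as exc:
            failures.append((line, type(exc).__name__))
    return failures

# Inputs without a comma, or with a non-numeric age, surface as ValueError.
print(fuzz(parse_record)[:3])
```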

Jan. 21st

Speaker: Jie Ding

  • School of Statistics at the University of Minnesota

  • Bio : Jie Ding is an Assistant Professor in Statistics at the University of Minnesota and a graduate faculty member of the ECE Department and the Data Science program. Before joining the University of Minnesota in 2018, he received a Ph.D. in Engineering Sciences in 2017 from Harvard University and worked as a postdoctoral fellow at the Information Initiative at Duke University. Jie's recent research interests are in new principles and methodologies in machine learning, with a particular focus on collaborative AI, privacy, and streaming data.

Talk information

  • Title: New Directions in Privacy-Preserving Machine Learning

  • Time: Thursday, Jan. 21st, 2021 10:00–11:00 pm

  • Location: Online via zoom (join)

Abstract

In a number of emerging AI tasks, collaborations among different organizations or agents (e.g., humans and robots, mobile units, and smart devices) are often essential for resolving challenging problems that are otherwise impossible for a single agent to handle. However, to avoid leaking useful and possibly proprietary information, agents typically enforce stringent security measures, which significantly limits such collaboration. This talk will introduce new research directions in privacy-preserving learning beyond the state of the art. A particular focus is a new learning paradigm named Assisted Learning, which enables agents to assist each other in a decentralized, personalized, and private manner. The talk will also introduce a new data privacy framework and a vision of future privacy-preserving machine learning.
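
To give a flavor of collaboration without sharing raw data, the deliberately simplified sketch below has two agents, each holding its own private feature columns of the same samples, alternately fit the other's residuals while exchanging only residual vectors. It is a generic, boosting-style illustration under assumptions of our own (linear models, two agents, no formal privacy guarantee), not the actual Assisted Learning protocol.

```python
import numpy as np

def fit_ls(X, r):
    """Least-squares fit of a residual vector r on an agent's private features X."""
    w, *_ = np.linalg.lstsq(X, r, rcond=None)
    return w

def assisted_round(X_a, X_b, y, rounds=3):
    """Two agents hold different private feature columns of the same samples.
    They alternate fitting each other's residuals, exchanging only residual
    vectors and never the raw features. A toy illustration of decentralized,
    privacy-aware collaboration."""
    residual = y.astype(float).copy()
    models_a, models_b = [], []
    for _ in range(rounds):
        w_a = fit_ls(X_a, residual)       # agent A fits the current residual
        residual = residual - X_a @ w_a   # only this vector is shared
        models_a.append(w_a)
        w_b = fit_ls(X_b, residual)       # agent B fits what A could not explain
        residual = residual - X_b @ w_b
        models_b.append(w_b)
    return models_a, models_b, residual

# Toy usage with synthetic data split across the two agents.
rng = np.random.default_rng(0)
X_a, X_b = rng.normal(size=(100, 3)), rng.normal(size=(100, 2))
y = X_a @ np.array([1.0, -2.0, 0.5]) + X_b @ np.array([3.0, 1.0]) + 0.1 * rng.normal(size=100)
_, _, final_residual = assisted_round(X_a, X_b, y)
print(np.linalg.norm(final_residual))
```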