Talk 1
1pm - 2pm
Title: How to Handle Data Shifts? Challenges, Research Progress and Path Forward
Abstract: The real world is open and full of unknowns, presenting significant challenges for machine learning systems that must reliably handle diverse and sometimes anomalous inputs. Out-of-distribution (OOD) uncertainty arises when a machine learning model sees a test-time input that differs from its training data and therefore should not be predicted confidently by the model. As machine learning reaches more safety-critical domains, the ability to handle out-of-distribution data is central to building open-world learning systems. In this talk, I will discuss challenges, research progress, and future opportunities in detecting OOD samples for safe and reliable predictions in an open world.
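As a concrete illustration of the problem setting (a generic sketch of two widely used post-hoc OOD scores computed from a classifier's logits, not the specific method presented in the talk):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits):
    """Maximum softmax probability: higher suggests in-distribution."""
    return max(softmax(logits))

def energy_score(logits, t=1.0):
    """Negative free energy (temperature-scaled log-sum-exp of logits):
    higher suggests in-distribution."""
    m = max(z / t for z in logits)
    return t * (m + math.log(sum(math.exp(z / t - m) for z in logits)))

# A confident prediction scores higher than a flat, uncertain one.
confident = [10.0, 0.0, 0.0]
flat = [1.0, 1.0, 1.0]
```

At test time, inputs whose score falls below a threshold chosen on validation data would be flagged as OOD instead of being predicted.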
Bio: Sharon Yixuan Li is an Assistant Professor in the Department of Computer Sciences at the University of Wisconsin-Madison. Her broad research interests are in deep learning and machine learning. Her research focuses on learning and inference under distributional shifts and open-world machine learning. Previously, she was a postdoctoral fellow in the Computer Science Department at the Stanford AI Lab. She completed her Ph.D. at Cornell University in 2017, where she was advised by John E. Hopcroft. She led the organization of the ICML Workshop on Uncertainty and Robustness in Deep Learning in 2019 and 2020. She is the recipient of several awards, including the Facebook Research Award and the Amazon Research Award, and was named to the Forbes 30 Under 30 in Science list.
Talk 2
1pm - 2pm
Video: https://mediasite.osu.edu/Mediasite/Play/64d0d46879244acb930be202b0e08d861d
Title: Event-Centric Multimedia Understanding
Abstract: Human memories can be viewed as repositories of historical events. An event is a semantic structure encapsulating the fundamental questions of who, what, where, when, and why, which are of primary concern to humans. The enormous volume of data requires machines to automatically obtain such factual knowledge, including events and their arguments (participants), and to perform reasoning by synthesizing a wide variety of unstructured data, such as text, images, and videos. However, current event understanding is text-only, local, and lacking in reasoning. This talk focuses on constructing event graphs to deal with real-world events that are multimedia, interconnected, probabilistic, and span a long time period. We discover the local event structures (i.e., who, what, where, and when) by transferring the semantic understanding ability from text to vision during vision-language pretraining. To perform global reasoning across events (i.e., what is likely to occur, and why), we further investigate the interactions between events, such as their temporal order and their evolution patterns over a long period, and leverage historical events to discover such event schema knowledge globally. Our structural event graph modeling is able to represent the global inter-dependencies of events and long-distance interactions via arguments, leading to a comprehensive understanding of events and effective forecasting of future events.
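To make the event structure concrete, here is a minimal sketch of an event with its arguments and a temporal edge linking two interconnected events (the event names, roles, and graph layout are hypothetical illustrations, not the speaker's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    """An event as a semantic structure over who / what / where / when."""
    trigger: str                                # what happened
    args: dict = field(default_factory=dict)    # role -> participant

# A tiny event graph: two events connected by a temporal edge,
# meaning e1 happens before e2.
events = {
    "e1": Event("protest", {"who": "crowd", "where": "city hall"}),
    "e2": Event("arrest", {"who": "police", "where": "city hall"}),
}
temporal_edges = [("e1", "e2")]
```

Real event graphs additionally link events through shared arguments (here, the common "where" value), which is what enables the long-distance reasoning the abstract describes.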
Bio: Manling Li is a Ph.D. candidate in the Computer Science Department at the University of Illinois Urbana-Champaign. Her work on multimedia knowledge extraction won the ACL'20 Best Demo Paper Award and the NAACL'21 Best Demo Paper Award. She was selected as a DARPA Riser (nominated by DARPA) and an EECS Rising Star in 2022. She was a recipient of the Microsoft Research PhD Fellowship in 2021. She was awarded the C.L. Dave and Jane W.S. Liu Award and has been selected as a Mavis Future Faculty Fellow. She has more than 30 publications on multimedia knowledge extraction and reasoning, and has given tutorials on multimedia information extraction at AAAI'21, ACL'21, and NAACL'22. Additional information is available at https://limanling.github.io/ .
Talk 3
1pm - 2pm
Speaker: Rishi Bommasani
Video: https://mediasite.osu.edu/Mediasite/Play/d22584bcd05449138c7313bdafca85e51d
Title: Castles in the sky: Towards sturdy foundation models
Abstract: Increasingly, AI systems across different domains, modalities, and industries are built by adapting models that function as a shared foundation. These foundation models have demonstrated remarkable promise: they are being deployed rapidly, spawning a new wave of startups, and have even been used to co-author testimony before the US Senate! Alongside their clear potential, they pose both oft-discussed and oft-neglected societal risks, especially if their development is unbridled and uncritical. In this talk, I will begin with my perspective on foundation models, to hopefully provide some conceptual clarity on this emerging paradigm. With this lens, I will then introduce several of our recent efforts to ensure foundation models are beneficial to society (e.g., increasing transparency via benchmarking, building community norms and standards, and articulating systemic harms to individuals). Overall, these are steps toward a mature ecosystem for foundation models.
Bio: Rishi Bommasani is a third-year CS PhD student at Stanford, co-advised by Percy Liang and Dan Jurafsky. His research centers on the societal impact of AI, with a special focus on foundation models. He helped build the Stanford Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered AI (HAI). Prior to Stanford, Rishi completed his bachelor's at Cornell, advised by Claire Cardie. Rishi is currently supported by the NSF GRFP.
Talk 4
1pm - 2pm
Video: https://mediasite.osu.edu/Mediasite/Play/96e24497d88f40faa893cafe744af06b1d
Title: Large Language Models: Will they keep getting bigger? And, how will we use them if they do?
Abstract: The trend of building ever larger language models has dominated much research in NLP over the last few years. In this talk, I will discuss our recent efforts to (at least partially) answer two key questions in this area: Will we be able to keep scaling? And, how will we actually use the models, if we do? I will cover our recent efforts on learning new types of sparse mixtures of experts (MoEs) models. Unlike model-parallel algorithms for learning dense models, which are very difficult to further scale with existing hardware, our sparse approaches have significantly reduced cross-node communication costs and could possibly provide the next big leap in performance, although finding a version that scales well in practice remains an open challenge. I will also present our recent work on prompting language models that better controls for surface form variation, to improve performance of models that are so big we can only afford to do inference, with little to no task-specific fine-tuning. Finally, time permitting, I will discuss work on new forms of supervision for language model training, including learning from the hypertext and multi-modal structure of web pages to provide new signals for both learning and prompting the model. Together, these methods present our best guesses for how to keep the scaling trend alive as we move forward to the next generation of NLP models. This talk describes work done at the University of Washington and Meta, primarily led by Armen Aghajanyan, Suchin Gururangan, Ari Holtzman, Mike Lewis, Margaret Li, Sewon Min, and Peter West.
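As a rough illustration of why sparse MoE layers scale differently from dense layers, here is a toy sketch of top-1 token routing (the experts, the deterministic stand-in for the learned gating network, and all names are hypothetical, not the speakers' actual architecture):

```python
class Top1MoE:
    """Toy top-1 token routing: each token is dispatched to exactly one
    expert, so per-token compute stays constant as experts are added.
    In distributed training, each node holds a subset of experts, and
    only the routed tokens cross node boundaries."""

    def __init__(self, num_experts):
        # Each "expert" is a simple scalar function standing in for an FFN.
        self.experts = [lambda x, s=s: x * s
                        for s in range(1, num_experts + 1)]

    def route(self, token):
        # Stand-in for a learned gate: bucket tokens deterministically.
        return int(token) % len(self.experts)

    def forward(self, tokens):
        out = []
        for tok in tokens:
            e = self.route(tok)              # top-1: one expert per token
            out.append(self.experts[e](tok))
        return out
```

A real MoE layer replaces the deterministic bucketing with a learned softmax gate and adds load-balancing losses so tokens spread evenly across experts.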
Bio: Luke Zettlemoyer is a Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, and a Research Director at Meta. His research focuses on empirical methods for natural language semantics, and involves designing machine learning algorithms, introducing new tasks and datasets, and, most recently, studying how to best develop self-supervision signals for pre-training. His honors include being named an ACL Fellow as well as winning a PECASE award, an Allen Distinguished Investigator award, and multiple best paper awards. Luke received his PhD from MIT and was a postdoc at the University of Edinburgh.
Talk 5
1pm - 2pm
Speaker: Jing Qian
Video: https://mediasite.osu.edu/Mediasite/Play/d5335ea937c349aa8e6dfc438b2f07181d
Title: Language Model Detoxification in Natural Language Processing
Abstract: The rapid rise in user-generated web content has not only yielded a vast increase in information accessibility but has also given individuals an easy platform on which to share their beliefs and publicly communicate with others. Unfortunately, this has also led to improper use of online spaces, such as the propagation of toxic speech. Worse yet, we have recently observed that user-generated toxic speech can propagate beyond online social platforms. Pre-training on large amounts of web text enables large language models (LMs) to generate coherent text, but it also results in toxic degeneration and biased behavior. In this talk, we investigate how to tackle toxic content in model-generated text using natural language processing techniques. In addition to developing automatic post-processing tools, such as hate speech detection, a complementary solution is to detoxify the pre-trained LMs by reducing the likelihood that the model will generate toxic content. Our work effectively lowers the toxic content rate in both freeform text generation and dialogue response generation.
Bio: Jing Qian is a researcher on the Microsoft AI Platform team. She obtained her PhD in Computer Science from UCSB in 2022. Her research interests lie in Natural Language Processing and Machine Learning. Primarily, she works on Natural Language Generation and Computational Social Science.
Talk 6
1pm - 2pm
Video: https://mediasite.osu.edu/Mediasite/Play/204eaab844004cd1acd23c9f91ab528a1d
Title: Interaction and Natural Language Learning
Abstract: This talk focuses on the challenges and opportunities that interactions provide for natural language learning. It is divided into two parts. First, I will show how collaborative interactions enable continual learning, where agentive systems improve over time through interaction. I will describe an interactive game-like environment that instantiates collaborative interactions with natural language coordination, and show how it creates a contextual bandit learning scenario for language production (i.e., generation). I will also briefly show how this formulation generalizes to language comprehension. In the second part, I will discuss language-conditioned reinforcement learning (RL). Although the pairing of natural language and RL is promising for both research and application, it is a research avenue that remains challenging to pursue. I will describe a new approach for language-conditioned RL benchmarking that strikes a balance between research accessibility and retaining the complexities and nuances of natural language.
Bio: Yoav Artzi is an Associate Professor in the Department of Computer Science and Cornell Tech at Cornell University. His research focuses on developing learning methods for natural language understanding and generation in automated interactive systems. He received an NSF CAREER award, and his work was acknowledged by awards and honorable mentions at ACL, EMNLP, NAACL, and IROS. Yoav holds a B.Sc. from Tel Aviv University and a Ph.D. from the University of Washington.
Talk 7
1pm - 2pm
Video: https://mediasite.osu.edu/Mediasite/Play/b4325527ba604f32b082930e073a72071d
Title: Text Mining: An Embedding-Based, Annotation-Free Approach
Abstract: Real-world big data is largely dynamic, interconnected, and unstructured text. It is important to transform such massive unstructured text into structured knowledge. Many researchers rely on labor-intensive labeling and curation to extract knowledge from text data. Such approaches, however, are not scalable. We envision that massive text itself may disclose a large body of hidden structures and knowledge. Equipped with pretrained language models and data mining/machine learning methods, it is promising to transform unstructured text into structured knowledge without extensive human annotation. In this talk, we give an overview of a set of annotation-free text mining methods recently developed in our group for such an exploration, including discriminative topic mining, taxonomy construction, text classification, and taxonomy-guided text analysis. We show that a weakly supervised, annotation-free approach can be promising for transforming massive text into structured knowledge.
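As a toy illustration of the weak-supervision idea, here is a seed-word classifier, with literal word counting standing in for the embedding-based scoring that modern annotation-free methods use (all labels and seed words are hypothetical):

```python
def seed_word_classify(doc, seed_words):
    """Label a document by counting hits of each class's seed words.
    Real weakly supervised methods instead score candidate words with
    pretrained language model embeddings, which generalizes far beyond
    exact matches; this sketch only shows the supervision signal."""
    tokens = doc.lower().split()
    scores = {label: sum(tokens.count(w) for w in words)
              for label, words in seed_words.items()}
    return max(scores, key=scores.get)

seeds = {
    "sports": ["game", "team", "score"],
    "politics": ["vote", "senate", "policy"],
}
```

The appeal of this style of supervision is that a user supplies only a few seed words per class, rather than thousands of labeled documents.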
Bio: Jiawei Han is the Michael Aiken Chair Professor in the Department of Computer Science, University of Illinois at Urbana-Champaign. He received the ACM SIGKDD Innovation Award (2004), the IEEE Computer Society Technical Achievement Award (2005), the IEEE Computer Society W. Wallace McDowell Award (2009), and Japan's Funai Achievement Award (2018). He is a Fellow of the ACM and of the IEEE. He served as Director of the Information Network Academic Research Center (INARC) (2009-2016), supported by the Network Science-Collaborative Technology Alliance (NS-CTA) program of the U.S. Army Research Lab, and as co-Director of KnowEnG, a Center of Excellence in Big Data Computing (2014-2019), funded by the NIH Big Data to Knowledge (BD2K) Initiative. Currently, he serves on the executive committees of two NSF-funded research centers: MMLI (the Molecule Maker Lab Institute), one of the NSF-funded national AI institutes, since 2020, and I-GUIDE, the NSF Institute for Geospatial Understanding through an Integrative Discovery Environment, since 2021.
Talk 8
1pm - 2pm
Video: https://mediasite.osu.edu/Mediasite/Play/f0c4149f6957487b8da175e71e1b37021d
Title: A data-centric view on reliable generalization
Abstract: Researchers have proposed many methods to make neural networks more reliable under distribution shift, yet there is still large room for improvement. Are better training algorithms or training data the more promising way forward? In this talk, we study this question in the context of computer vision and OpenAI’s CLIP model for learning from image-text data.
First, we survey the current robustness landscape based on a large-scale experimental study involving more than 200 different models and test conditions. The CLIP models stand out with unprecedented robustness gains on multiple challenging distribution shifts. To further improve CLIP, we then introduce new methods for reliably fine-tuning models by interpolating the weights of multiple models. Finally, we investigate the cause of CLIP's robustness via controlled experiments that disentangle the influence of language supervision and training distribution. While CLIP leveraged large-scale language supervision for the first time, its robustness actually comes from the pre-training dataset.
Based on our findings, we will conclude with initial experiments to improve the pre-training datasets for image-text models.
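The weight-interpolation idea mentioned in the abstract can be sketched in a few lines, assuming two checkpoints of the same architecture stored as parameter dictionaries (a simplification of a framework's state dict; the function name is illustrative):

```python
def interpolate_weights(zero_shot, fine_tuned, alpha=0.5):
    """Linearly interpolate two checkpoints with identical parameter sets.
    alpha=0 keeps the original (e.g. zero-shot) model; alpha=1 keeps the
    fine-tuned one; values in between often trade off in-distribution
    accuracy against robustness under distribution shift."""
    assert zero_shot.keys() == fine_tuned.keys()
    return {name: (1 - alpha) * zero_shot[name] + alpha * fine_tuned[name]
            for name in zero_shot}
```

In practice, alpha is chosen on validation data; because the two models share an architecture, the merged dictionary loads directly into the same network.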
Bio: Ludwig Schmidt is an assistant professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Ludwig’s research interests revolve around the empirical and theoretical foundations of machine learning, often with a focus on datasets, evaluation, and reliable methods. Ludwig completed his PhD at MIT under the supervision of Piotr Indyk and was a postdoc at UC Berkeley hosted by Benjamin Recht and Moritz Hardt. Ludwig’s research received a new horizons award at EAAMO, a best paper award at ICML, a best paper finalist at CVPR, and the Sprowls dissertation award from MIT.
Talk 9
1pm - 2pm
Video: https://mediasite.osu.edu/Mediasite/Play/14ed922ab78644d3a9ecacf7d381fff01d
Title: Capturing the Dynamic World in 3D
Abstract: Cameras allow us to effortlessly capture the visual world around us and share memorable moments of our lives. While current computer vision systems work remarkably well on recognizing 2D patterns in images, they often have difficulty recovering the complete 3D geometry of dynamic scenes. On the other hand, humans can perceive complex dynamic scenes in terms of physical surfaces, objects, and scenes in 3D and imagine plausible scene appearances from novel viewpoints. In this talk, I will present my research on reconstructing and rendering our 3D world.
Bio: Jia-Bin Huang is an Associate Professor in the Department of Computer Science at the University of Maryland College Park. He received his Ph.D. degree from the Department of Electrical and Computer Engineering at the University of Illinois, Urbana-Champaign. His research interests include computer vision, computer graphics, and machine learning with a focus on visual analysis and synthesis with physically grounded constraints. His research received the best student paper award at IAPR International Conference on Pattern Recognition (ICPR) and the best paper award at the ACM Symposium on Eye Tracking Research & Applications (ETRA). Huang is the recipient of the Dissertation Completion Fellowships, Thomas & Margaret Huang award from UIUC, NSF CRII award, Samsung Global Outreach Award, 3M non-tenured faculty award, and a Google faculty research award.
Link: https://jbhuang0604.github.io/
Talk 10
1pm - 2pm
Video: https://mediasite.osu.edu/Mediasite/Play/e8330f8f7232496abbe3e47a07289ebc1d
Title: Tilted Losses in Machine Learning: Theory and Applications
Abstract: Exponential tilting is a technique commonly used to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this talk, I discuss a simple extension to ERM, tilted empirical risk minimization (TERM), which uses tilting to flexibly tune the impact of individual losses. I make connections between TERM and related approaches, such as Value-at-Risk, Conditional Value-at-Risk, and distributionally robust optimization (DRO), and present batch and stochastic first-order optimization methods for solving TERM at scale. Finally, I show that this baseline can be used for a multitude of applications in machine learning, such as enforcing fairness between subgroups, mitigating the effect of outliers, and handling class imbalance, delivering state-of-the-art performance relative to more complex, bespoke solutions for these problems.
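The tilted objective itself is compact. A minimal sketch of the TERM risk, (1/t) log(mean(exp(t * l_i))), using the log-sum-exp trick for numerical stability (a generic implementation of the published objective, not the speaker's code):

```python
import math

def tilted_loss(losses, t):
    """Tilted empirical risk over per-example losses l_i:
        (1/t) * log( mean( exp(t * l_i) ) ).
    t -> 0 recovers the ordinary mean (ERM); large positive t emphasizes
    the worst losses (approaching the max, as in robust/fair objectives);
    large negative t suppresses them (approaching the min, useful for
    mitigating outliers)."""
    if t == 0:
        return sum(losses) / len(losses)
    m = max(t * l for l in losses)  # log-sum-exp shift for stability
    return (m + math.log(sum(math.exp(t * l - m) for l in losses)
                         / len(losses))) / t
```

Because the objective is a smooth reweighting of the individual losses, standard first-order methods apply, which is what makes TERM usable as a drop-in baseline.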
Bio: Virginia Smith is an assistant professor in the Machine Learning Department at Carnegie Mellon University. Her research spans machine learning, optimization, and distributed systems. Virginia’s current work addresses challenges related to optimization, privacy, fairness, and robustness in distributed settings in order to make federated learning safe, efficient, and reliable. Virginia’s work has been recognized by several awards, including an NSF CAREER Award, MIT TR35 Innovator Award, Intel Rising Star Award, and faculty awards from Google, Apple, and Meta. Prior to CMU, Virginia was a postdoc at Stanford University and received a Ph.D. in Computer Science from UC Berkeley.