The Speakers

 

Ali Arsanjani

Director of AI/ML Engineering, Head of AI Center of Excellence, Google Cloud

Causal Reasoning for Responsible AI: Challenges and Opportunities in the Age of Large Language Models

Abstract: Large language models (LLMs) have rapidly advanced our ability to generate text, translate languages, write different kinds of creative content, and answer questions in informative ways. However, the lack of robust causal reasoning abilities within these models raises significant concerns about their potential for spreading misinformation, amplifying biases, lacking transparency, and causing unintended real-world harm.


This talk will introduce the concept of causal reasoning, explain why developing causal capabilities in LLMs is crucial for mitigating these ethical risks, and cover ten best practices for Generative Causal Models.


Causal Generative Models (CGMs) and the Path Towards AGI


We'll examine ten key Causal Generative Model patterns, analyzing each through the lens of causation. This analysis will highlight both the potential benefits of CGMs and the ethical considerations surrounding the development of Artificial General Intelligence (AGI).



Conclusion

This talk aims to highlight the importance of incorporating causal reasoning into the next generation of LLMs. It will underscore the need for responsible AI development to ensure that these powerful technologies remain trustworthy, reliable, and beneficial to society. While CGMs offer concrete steps towards a more causal AI, the development of AGI demands careful ethical deliberation and rigorous safeguards.




Dr. Arsanjani is the Director of AI/ML Partner Engineering at Google Cloud and Head of the Global AI Center of Excellence. His work includes research into AI best practices that advance the state of the art and the development of strategic co-innovation assets and partnerships, focusing on Generative AI, Data/Analytics, and Predictive AI/ML. Academically, he is an Adjunct Professor in the Master's in Data Science program at San Jose State University and at the University of California, San Diego, where he mentors capstones for the Halicioglu Data Science Institute.

Associate Professor, UC Berkeley

Causal Inference in Network Experiments: Regression-based Analysis and Design-based Properties


Abstract: Investigating interference or spillover effects among units is a central task in many social science problems. Network experiments are powerful tools for this task: by randomly assigning treatments to units over networks, they avoid endogeneity. However, it is non-trivial to analyze network experiments properly without imposing strong modeling assumptions. Many researchers have proposed sophisticated point estimators and standard errors for causal effects under network experiments. We show that regression-based point estimators and standard errors can have strong theoretical guarantees if the regression functions and robust standard errors are carefully specified to accommodate the interference patterns under network experiments. We first recall a well-known result that the Hajek estimator is numerically identical to the coefficient from the weighted-least-squares fit based on the inverse probability of the exposure mapping. Moreover, we demonstrate that the regression-based approach offers three notable advantages: ease of implementation, the ability to derive standard errors through the same weighted-least-squares fit, and the capacity to integrate covariates into the analysis, thereby enhancing estimation efficiency. Furthermore, we analyze the asymptotic bias of the regression-based network-robust standard errors. Recognizing that the covariance estimator can be anti-conservative, we propose an adjusted covariance estimator to improve the empirical coverage rates. Although we focus on regression-based point estimators and standard errors, our theory holds under the design-based framework, which assumes that the randomness comes solely from the design of network experiments and allows for arbitrary misspecification of the regression models.
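The Hajek/weighted-least-squares equivalence recalled in the abstract is easy to check numerically. The sketch below is a toy illustration only, with a binary exposure and assumed-known exposure probabilities; the numbers and the simple setup are invented, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
e = rng.integers(0, 2, n)            # observed binary exposure for each unit
pi = np.where(e == 1, 0.3, 0.7)      # P(unit receives its observed exposure); assumed known
y = 1.0 + 2.0 * e + rng.normal(size=n)
w = 1.0 / pi                         # inverse-probability weights

# Hajek estimator: difference of weighted means within each exposure level
hajek = (w[e == 1] @ y[e == 1]) / w[e == 1].sum() \
      - (w[e == 0] @ y[e == 0]) / w[e == 0].sum()

# Weighted-least-squares fit of y on [1, e] with weights w
X = np.column_stack([np.ones(n), e])
beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

# The slope coefficient coincides with the Hajek estimator
assert np.isclose(hajek, beta[1])
```

The equivalence holds because, with a binary regressor and an intercept, the WLS slope is exactly the difference of weight-normalized group means.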

I am an Associate Professor in the Department of Statistics at UC Berkeley. I obtained my Ph.D. from the Department of Statistics, Harvard University in May 2015, and worked as a postdoctoral researcher in the Department of Epidemiology, Harvard T. H. Chan School of Public Health until December 2015. Previously, I received my B.S. in Mathematics, B.A. in Economics, and M.S. in Statistics from Peking University.

Assistant Professor, Columbia University

Identifiable Deep Generative Models for Rich Data Types with Discrete Latent Layers

Abstract: We propose a class of identifiable deep generative models for rich and flexible data types. The key features of the proposed models include (a) discrete latent layers and (b) a shrinking ladder-shaped deep architecture as a sparse probabilistic graphical model. We establish a new identifiability theory for these models by developing transparent conditions on the sparsity structure of the deep generative graph. The proposed identifiability conditions can ensure estimation consistency in both the Bayesian and frequentist senses. As an illustration, we consider the two-latent-layer model and propose shrinkage estimation methods to recover the latent structure and model parameters. Simulation results empirically corroborate the identifiability theory and also demonstrate the excellent performance of our estimation algorithms. Applications of the methodology to a DNA nucleotide sequence dataset and an educational test response time dataset both give interpretable results. The proposed framework provides a recipe for identifiable, interpretable, and reliable deep generative modeling. The new identifiability results have useful implications for causal structure discovery and causal representation learning with highly expressive and nonlinear discrete latent layers.
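As a rough illustration of the model class (not the authors' specification), the sketch below samples from a hypothetical two-latent-layer binary model with a shrinking ladder shape; the sparse connection graphs and the logistic link are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Shrinking ladder: 2 deep latents -> 4 shallow latents -> 10 observed variables.
# The sparse 0/1 graph matrices below are illustrative, not from the talk.
B = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])                     # deep -> shallow edges
A = (rng.random((4, 10)) < 0.4).astype(int)      # shallow -> observed edges

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

z2 = (rng.random((n, 2)) < 0.5).astype(int)                       # deep binary layer
z1 = (rng.random((n, 4)) < sigmoid(2 * z2 @ B - 1)).astype(int)   # shallow binary layer
x = (rng.random((n, 10)) < sigmoid(2 * z1 @ A - 1)).astype(int)   # observed layer

assert x.shape == (n, 10)
```

The identifiability conditions in the talk concern when the graphs (here `B` and `A`) and parameters can be recovered from the observed `x` alone, given sparsity of this kind.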

I am an Assistant Professor in the Department of Statistics at Columbia University. I am also a member of the Data Science Institute. Prior to joining Columbia in 2021, I spent a year as a postdoc at Duke University, mentored by David B. Dunson. In 2020 I received a Ph.D. in Statistics from the University of Michigan, advised by Gongjun Xu. In 2015 I received a B.S. in Mathematics from Tsinghua University.

I am generally interested in latent variable models, statistical machine learning, and psychometrics. I have been working on the methods, theory, and applications of graphical models with latent variables, mixture models, spectral methods, tensor decompositions, and deep generative models.

Senior Principal Researcher, Microsoft Research

A New Frontier at the Intersection of Causality and LLMs


Abstract: Correct causal reasoning requires domain knowledge beyond observed data.  Consequently, the first step to correctly frame and answer cause-and-effect questions in medicine, science, law, and engineering requires working closely with domain experts and capturing their (human) understanding of system dynamics and mechanisms.  This is a labor-intensive practice, limited by expert availability, and a significant bottleneck to widespread application of causal methods.

In this talk, we will delve into the causal capabilities of large language models (LLMs), discussing recent studies and benchmarks of their ability to retrieve and apply causal knowledge, as well as the limitations of their causal reasoning capabilities.  Most notably, LLMs present the first instance of general-purpose assistance for constructing causal arguments, including generating causal graphical models and identifying contextual information from natural language.  This promises to reduce the necessary human effort and error in end-to-end causal inference and reasoning, broadening their practical usage.  Ultimately, by capturing common sense and domain knowledge, we believe LLMs are a catalyst for a new frontier facilitating translation between real world scenarios and causal questions, and formal and data-driven methods to answer them.

I am a Senior Principal Researcher at Microsoft Research.  My research interests span causal inference, machine learning, and AI’s implications for people and society.

I am working to broaden the use of causal methods for decision-making across many application domains; and, in the broad area of AI’s implications for society, my projects include work at the intersection of security and machine learning. I have a strong interest in computational social science questions and social media analyses, especially those that require causal understanding of phenomena in health and mental health; issues of data bias; and understanding how new technologies affect our awareness of the world and enable new kinds of information discovery and retrieval.

Associate Professor, Stanford

Treatment Effects in Market Equilibrium 

Abstract: When randomized trials are run in a marketplace equilibrated by prices, interference arises. To analyze this, we build a stochastic model of treatment effects in equilibrium. We characterize the average direct effect (ADE) and average indirect effect (AIE) asymptotically. A standard RCT can consistently estimate the ADE, but confidence intervals and AIE estimation require price elasticity estimates, which we provide using a novel experimental design. We define heterogeneous treatment effects and derive an optimal targeting rule that meets an equilibrium stability condition. We illustrate our results using a freelance labor market simulation and data from a cash transfer experiment.
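The spillover-through-prices mechanism can be seen in a deliberately tiny linear market. The demand, supply, and per-buyer outcome functions below are invented for illustration; they are not the stochastic model from the talk:

```python
# Toy linear market: demand D(p) = a + tau*frac_treated - b*p, supply S(p) = c*p.
# Treating some buyers shifts demand, the clearing price rises, and untreated
# buyers are affected too -- an indirect (spillover) effect.
a, b, c, tau = 10.0, 1.0, 1.0, 2.0

def eq_price(frac_treated):
    # Market clearing: a + tau*frac - b*p = c*p  =>  p = (a + tau*frac) / (b + c)
    return (a + tau * frac_treated) / (b + c)

def outcome(treated, p):
    # Stylized per-buyer surplus: willingness-to-pay minus price (scaled)
    return (a / 10 + tau * treated / 10) - p / 10

p0 = eq_price(0.0)       # equilibrium price with no one treated
p_half = eq_price(0.5)   # equilibrium price with half the buyers treated

# Direct effect: treated vs. untreated buyer at the same equilibrium price
ade = outcome(1, p_half) - outcome(0, p_half)

# Indirect effect on an untreated buyer: driven purely by the price shift
aie = outcome(0, p_half) - outcome(0, p0)

assert ade > 0
assert aie < 0   # higher demand raises the price, hurting untreated buyers
```

This is why a standard RCT contrast (treated minus untreated at the realized price) misses the AIE: the spillover operates through the equilibrium price, which requires elasticity estimates to quantify.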

I am an associate professor of Operations, Information, and Technology at the Stanford Graduate School of Business, and an associate professor of Statistics (by courtesy). My research lies at the intersection of causal inference, optimization, and statistical learning. I am particularly interested in developing new solutions to problems in statistics, economics and decision making that leverage recent advances in machine learning.

I gratefully acknowledge support from the National Science Foundation (Methodology, Measurement, and Statistics program) and the Prime Early Climate Infrastructure. I am currently serving as an associate editor for Biometrika, Management Science (Stochastic Models and Simulation), the Journal of Econometrics, the Journal of the American Statistical Association (Theory and Methods), and the Journal of the Royal Statistical Society (Series B).

Associate Professor, Mohamed bin Zayed University of Artificial Intelligence & Carnegie Mellon University

Causal Representation Learning: Discovery of the Hidden World


Abstract: Causality is a fundamental notion in science, engineering, and even in machine learning. Causal representation learning aims to reveal the underlying high-level hidden causal variables and causal relations.  The modularity property of a causal system implies properties of conditional independence among the variables, independent noise, minimal changes, and independent changes in causal representations.  In this talk, we show how those properties make it possible to recover the underlying causal structure, including hidden variables, from observational data with identifiability guarantees: under appropriate assumptions, the learned representations are consistent with the underlying causal process. Various problem settings are considered, involving independent and identically distributed (i.i.d.) data, temporal data, or data with distribution shift as input. We demonstrate when identifiable causal representation learning can benefit from flexible deep learning and when suitable parametric assumptions have to be imposed on the causal process, with various examples and applications.

Kun Zhang is currently on leave from Carnegie Mellon University (CMU), where he is an associate professor of philosophy and an affiliate faculty member in the machine learning department; he is working as a professor, the acting chair of the machine learning department, and the director of the Center for Integrative AI at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI). He develops methods for making causality transparent by torturing various kinds of data, and investigates machine learning problems, including transfer learning, representation learning, and reinforcement learning, from a causal perspective. He has frequently served as a senior area chair, area chair, or senior program committee member for major conferences in machine learning and artificial intelligence, including UAI, NeurIPS, ICML, IJCAI, AISTATS, and ICLR. He was a co-founder and general & program co-chair of the first Conference on Causal Learning and Reasoning (CLeaR 2022), a program co-chair of the 38th Conference on Uncertainty in Artificial Intelligence (UAI 2022), and a general co-chair of UAI 2023.

Assistant Professor, CMU

Leveraging Causal Inference to Elucidate the Relationship between Genes and Disease

Abstract: Identifying genes and cell types underlying human diseases and complex traits is critical for our understanding of disease etiology and will inform the development of therapeutic treatments. Differential expression analysis based on case-control single-cell RNA-seq (scRNA-seq) data suffers from unobserved confounding and the intrinsically variable nature of single-cell data (Park 2021 Genome Biol). Genetic approaches prioritize genes from SNPs identified in genome-wide association studies (GWASs), but have limited overlap with orthogonal approaches (Smillie 2019 Cell). In this talk, I will discuss ongoing projects that utilize causal inference to improve accuracy in genetic studies, potentially leading to better convergence of discoveries from multiple sources. First, I will introduce csde (Causal Single-cell Differential Expression), a method that accurately identifies differentially expressed genes in cell types by estimating unobserved confounding. csde outperforms existing methods in simulations with unobserved confounding. We applied csde to a case-control scRNA-seq dataset containing 1.3M PBMCs from 261 healthy and lupus individuals. csde identifies substantially more differentially expressed genes across cell types than baseline methods (median improvement of 71% across 8 PBMC cell types) and reveals novel pathways (such as the “response to interferon-beta” pathway for CD4 T cells, FDR<1e-6). csde further estimates individual treatment effects (ITEs) for control cells in the dataset, revealing two cell populations with distinct disease responses: one consisting of T cells and B cells, the other primarily monocytes.
Second, I will discuss the differences between differentially expressed genes identified in case-control scRNA-seq and disease genes implicated by GWAS: a) GWAS genes have limited overlap with differentially expressed genes, b) integrating GWAS with eQTL data improves the overlap, and c) differentially expressed genes overlap more with known drug targets than GWAS genes do. Taken together, these results provide comprehensive insights into the causal relationships between genes and disease.

Hello! I am Martin, an assistant professor in the Computational Biology Department at CMU. I recently completed my role as a research associate at the Harvard School of Public Health, working with Prof. Alkes Price on statistical genetics. Currently, I am interested in topics at the intersection of causal inference and statistical genetics.

I did my PhD at Stanford with Prof. David Tse and Prof. James Zou, working on statistics, machine learning, and computational biology. My PhD work spans a wide range of topics, from theory to algorithm design to applications. I have worked on covariate-adaptive multiple hypothesis testing, algorithm acceleration via multi-armed bandits, optimal design for single-cell RNA-seq experiments, and analysis of single-cell RNA-seq data sets.