Responsible AI in the Natural Sciences

a mini workshop at Carnegie Mellon University and online, Monday, May 8th, 2023


(Update: Thank you to everyone who participated. Links to the workshop recordings are below.)


Motivation for the (mini) Workshop:

AI is now routinely used in fundamental scientific research, from medical science to mathematics and particle physics. How this can be done responsibly is itself a topic of current study. We invite participants to an afternoon workshop exploring issues related to the use of AI in basic science, including explainability, reproducibility, bias, and ethics. An example topic is the use of large language models in scientific writing. The talks and discussion are intended for a broad audience, and we hope for an entertaining meeting (see, for example, the "AI vs Human" contest: try it now to see if you can tell apart human- and AI-generated entries). We aim in particular to spark discussion and thought about these topics among researchers in the Natural Sciences who have started using AI in the last few years but would like to know more about Responsible AI.

The workshop is hosted jointly by the Carnegie Mellon University NSF AI Planning Institute for Data-Driven Discovery in Physics and the Block Center for Technology and Society, also at CMU.

Invited Speakers

Hoda Heidari

Carnegie Mellon University

Co-Leader, CMU Responsible AI Initiative

Ahmad P. Tafti

University of Pittsburgh

Director, Pitt HexAI

Hima Lakkaraju

Harvard University

Lead, AI4LIFE and TrustML

Steinn Sigurdsson

Penn State University

Scientific Director, arXiv

Savannah Thais

Columbia University

Founding Editor, Springer AI and Ethics

Format/Location/Time

Schedule details

12:15 pm Light Lunch (catered by Salem's Grill)

1:00 pm    Welcome: Rupert Croft (CMU AI Planning Institute)

1:05 pm    Invited talk: Hoda Heidari  (Responsible AI and the Need for a Stronger Governance Ecosystem)

Abstract: In this talk, I will provide an overview of the field that has come to be known as "Responsible AI" by some and "AI Ethics" and "FAccT" (for Fairness, Accountability, and Transparency) by others. Centering the ethical and societal concerns around Large Language Models (LLMs), I will argue that the responsible development and use of AI will require strengthening the governance ecosystem around these technologies. I will unpack what that could look like and how we, as researchers and educators, can contribute to it positively.


Bio: Hoda Heidari is an Assistant Professor in Machine Learning and Societal Computing at the School of Computer Science, Carnegie Mellon University. Her research is broadly concerned with the social, ethical, and economic implications of Artificial Intelligence. In particular, her research addresses issues of unfairness and opacity in Machine Learning. Her work in this area has won a best paper award at the ACM Conference on Fairness, Accountability, and Transparency (FAccT), an exemplary track award at the ACM Conference on Economics and Computation (EC), and a best paper award at the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML). She has organized several scholarly events on topics related to Responsible and Trustworthy AI, including a tutorial at the Web Conference (WWW) and several workshops at the Neural Information Processing Systems (NeurIPS) conference. Dr. Heidari completed her doctoral studies in Computer and Information Science at the University of Pennsylvania. She holds an M.Sc. degree in Statistics from the Wharton School of Business. Before joining Carnegie Mellon as a faculty member, she was a postdoctoral scholar at the Machine Learning Institute of ETH Zurich, followed by a year at the Artificial Intelligence, Policy, and Practice (AIPP) initiative at Cornell University.


1:25 pm    Invited talk: Hima Lakkaraju (Regulating Explainable AI: Technical Challenges, Solutions, and Opportunities)

Abstract: As predictive and generative models are increasingly being deployed in various high-stakes applications in critical domains including healthcare, law, policy, and finance, it becomes important to ensure that relevant stakeholders understand the behaviors and outputs of these models so that they can determine if and when to intervene. To this end, several techniques have been proposed in recent literature to explain these models. In addition, multiple regulatory frameworks (e.g., GDPR, CCPA) introduced in recent years have also emphasized the importance of enforcing the key principle of “Right to Explanation” to ensure that individuals who are adversely impacted by algorithmic outcomes are provided with an actionable explanation. In this talk, I will discuss the gaps that exist between regulations and state-of-the-art technical solutions when it comes to the explainability of predictive and generative models. I will then present some of our latest research that attempts to address some of these gaps. I will conclude the talk by discussing bigger challenges that arise as we think about enforcing the right to explanation in the context of large language models and other large generative models.


Bio: Himabindu (Hima) Lakkaraju is an assistant professor at Harvard University focusing on the algorithmic, theoretical, and applied aspects of explainability, fairness, and robustness of machine learning models. Hima has been named one of the world’s top innovators under 35 by both MIT Tech Review and Vanity Fair. She has also received several prestigious awards including the NSF CAREER award and multiple best paper awards at top-tier ML conferences, as well as grants from NSF, Google, Amazon, JP Morgan, and Bayer. Hima has given keynote talks at various top ML conferences and associated workshops including CIKM, ICML, NeurIPS, ICLR, AAAI, and CVPR, and her research has also been showcased by popular media outlets including the New York Times, MIT Tech Review, TIME magazine, and Forbes. More recently, she co-founded the Trustworthy ML Initiative to enable easy access to resources on trustworthy ML and to build a community of researchers and practitioners working on the topic.

1:45 pm    Invited talk: Savannah Thais (AI as an object of study: implications for science and society)

Abstract: AI is increasingly considered a powerful tool for scientific inquiry, study, and discovery; similarly, AI is now pervasive in nearly all facets of society, including many high-stakes and life-altering systems. However, despite the ubiquity of AI, many questions remain around the capabilities, reliability, and robustness of AI systems. Moreover, many publicly deployed AI systems are not rigorously designed or evaluated in ways that enable trustworthiness and performance guarantees. This approach to AI research and development has substantial implications for AI's utility as a scientific tool as well as its impact on society. In this talk, I will explore reframing AI systems as objects of scientific study themselves and discuss ways that slow-science approaches to AI R&D can enable scientific advances and perhaps mitigate some of the adverse societal impacts of 'unscientific' or under-evaluated AI systems. I will also present a scientific framework for developing and evaluating AI models and highlight some ways that scientists can contribute to a more just, equitable, and participatory AI future.


Bio: Savannah Thais is a Research Scientist at the Columbia University Data Science Institute, where she focuses on machine learning (ML). She is interested in complex system modeling and in understanding what types of information are measurable or modelable, and what impacts designing and performing measurements have on systems and societies. This work is informed by her background in high energy particle physics and incorporates traditional scientific experiment design components such as uncertainty quantification, experimental blinding, and decorrelation/de-biasing methods. Her recent work has focused on geometric deep learning, methods to incorporate physics-based inductive biases into ML models, regulation of emerging technology, social determinants of health, and community education. She is the founder and Research Director of Community Insight and Impact, a non-profit organization focused on data-driven community needs assessments for vulnerable populations and effective resource allocation. She is passionate about the impacts of science and technology on society and is a strong advocate for improving access to scientific education and literacy, community-centered technology development, and equitable data practices.

2:05 pm    Break (20 mins) 

2:25 pm    Invited talk: Ahmad P. Tafti (Explainable AI: What, why, and the next big jump)

Abstract: Artificial Intelligence (AI) has made a big leap in a variety of scientific disciplines, ranging from mathematics, biochemistry, and biological sciences to medical and health sciences. It is no longer enough to build highly accurate AI models; there is also a pressing need to make sure that those computational models are explainable and interpretable. Explainable AI (XAI) refers to a collection of computational methods that aim to make AI models and their decision-making processes more transparent, comprehensible, and understandable to humans. The rationale behind XAI is rooted in the complexity of many AI models, which can make them challenging to interpret and can lead to biased or erroneous decisions. By providing explanations for the reasoning behind AI models' decisions or predictions, XAI can help foster trust and confidence in these systems, enhance their accountability, and ensure compliance with ethical and regulatory standards. In this talk, we will discuss the effectiveness of XAI in high-stakes applications such as the health sciences, where the consequences of AI errors or biases can be severe. We, together, will explore what XAI components are, what they do, and how.

Bio: Ahmad P. Tafti is an Assistant Professor of Health Informatics in the Department of Health Information Management within the School of Health and Rehabilitation Sciences at the University of Pittsburgh, with a secondary appointment in the Intelligent Systems Program (ISP) at the University of Pittsburgh’s School of Computing and Information. He leads the Pitt HexAI Research Laboratory and is affiliated with the Center for AI Innovation in Medical Imaging (CAIIMI) and the UPMC Hillman Cancer Therapeutics Program. Ahmad P. Tafti is an advisory board member of the Stanford Deep Data Research Center at Stanford University, and he serves our community as the Vice Chair of the IEEE Computer Society at Pittsburgh. Ahmad has a deep passion for AI-powered healthcare informatics and health data science aimed at better patient diagnosis, prognosis, and treatment using large-scale clinical data from multiple sources and advanced computational algorithms. Ahmad P. Tafti is the 2021 SIIM Imaging Informatics Innovator awardee, an Oracle for Research Project awardee, a University of Pittsburgh CTSI awardee, a Mayo Clinic Benefactor-funded CDAs Orthopedics Career Development awardee, a Mayo Clinic Transform the Practice awardee, an NVIDIA GPU awardee, and a GE Healthcare Honorable Mention awardee. To date, he has authored 45+ peer-reviewed publications, organized numerous workshops on intelligent health systems, and served on the program committees of 15+ conferences, symposia, and journals in AI and health data science.



2:45 pm    Contributed talks (Przemyslaw Grabowicz, Katherine Stange, Stefanus Christian Relmasira, Soo Kyung Kim: 10 mins each)

2:45-3:00 pm Speaker: Przemyslaw Grabowicz (University of Massachusetts)

Title: Towards fair and explainable automated decision-making for hiring and student admissions

Abstract: Legal and judicial systems prevent discrimination by establishing whether there was unjustified disparate impact or treatment. Explanations of a consequential decision-making process are crucial in establishing the corresponding justifications. We introduce methods marrying fairness and explainability to remove the impact of protected attributes, such as race and gender, from automated decision-making, while avoiding discrimination via proxies. At a time when the Supreme Court is leaning against affirmative action in student admissions, we argue that these methods provide a path towards legally admissible, fair, and explainable automated hiring and student admission systems.

3:00-3:10 pm Speaker: Katherine Stange (University of Colorado)

Title: Can large language models prove theorems?

Abstract: I will briefly discuss the current capabilities of ChatGPT, Bard and Bing at mathematical reasoning, at least in the modest personal experience of this one curious mathematical researcher, and speculate on what role they may play in mathematical research and writing, now or in the future.

3:10-3:20 pm Speaker:  Stefanus Christian Relmasira (The Education University of Hong Kong and Satya Wacana Christian University)

Title:  Developing AI Literacy for the Next Generation of Scientists

Abstract: The rapid advancement of AI technology will greatly impact society, including children's lives. As AI becomes increasingly integrated into industries, education, and all aspects of our lives, it is crucial to develop AI literacy among the younger generation. In response to this need, we have developed an AI literacy model based on four main constructs: recognition; explanation and evaluation; interaction; and ethics. We have also created a pedagogical network map of elements of the model and alignments with principles from learning theories. We designed and implemented an intervention based on this model in three elementary schools in Indonesia. This mini-workshop talk aims to introduce the AI literacy model and highlight the significance of cultivating AI literacy among those who will become the next generation of scientists.

3:20-3:30pm Speaker: Soo Kyung Kim (Palo Alto Research Center [PARC])

Title: Towards Physically Reliable Molecular Representation Learning 

Abstract: Estimating the energetic properties of molecular systems is a critical task in material design. Machine learning has shown remarkable promise on this task over classical force fields, but a fully data-driven approach suffers from limited labeled data; not only is the amount of available data small, the distribution of labeled examples is also highly skewed toward stable states. In this work, we propose a molecular representation learning method that extrapolates well beyond the training distribution, powered by physics-driven parameter estimation from classical energy equations and self-supervised learning inspired by masked language modeling. To ensure the reliability of the proposed model, we introduce a series of novel, multifaceted evaluation schemes that go beyond the energy and force accuracy metrics that have dominated prior evaluations. Through extensive experiments, we demonstrate that the proposed method is effective in discovering molecular structures, outperforming other baselines. Furthermore, we extend it to chemical reaction pathways beyond stable states, taking a step towards physically reliable molecular representation learning.

3:30 pm    Breakout activity: AI vs Human (see below).

Conference participants are invited to act as "discriminators" in the second phase of the AI vs Human contest (see below). This phase opened shortly after the May 4th deadline for submitting real and generated content, and will end at 4:00 pm on the day of the workshop.

4:00 pm   Invited talk: Steinn Sigurdsson (The Return of Clippy or the End of Science: guiderails for use and abuse of LLMs in Science)

Abstract: The current rapid developments in the applications of Large Language Models to producing coherent structured text and iteratively improving output through “conversations” with users are impacting science writing at all levels, and are likely to be severely disruptive to many traditional expectations of authorship, literature research, and the effort required to communicate academic information in general, and science in particular. Even if there are diminishing returns in the rate of improvement of future generations of LLM-driven chatbots, their use as core modules in more general guided AI services in the near future is likely to dramatically extend the applicability of AI services in the making of science. I will briefly discuss some current guidelines on how chatbots may be responsibly used in the delivery of science results, and speculate on what may come next. We are not done yet.


Bio: Steinn Sigurðsson is a Professor in the Department of Astronomy & Astrophysics at the Pennsylvania State University. He received his PhD in physics in 1991 from the California Institute of Technology, and completed postdoctoral fellowships at the University of California at Santa Cruz and Cambridge University. He does research in theoretical astrophysics. Steinn is a member of the Center for Exoplanets and Habitable Worlds at Penn State; the Institute for Gravitation and the Cosmos at Penn State; and the Penn State Astrobiology Research Center. Steinn is a Science Editor of the AAS Journals, a Member of the Aspen Center for Physics, and the Scientific Director of arXiv at Cornell University.

4:20 pm  Break (10 mins)

4:30 pm   Discussion panel.

5:30 pm  Dinner Reception (catered by Salem's Grill: Lamb curry, veggie curry, chicken kebob.)

Contest: AI vs Human

As part of the workshop there will be a Human vs Generative AI contest. The aim of the contest is to raise awareness of generative AI capabilities in a lighthearted fashion and to initiate debate among workshop participants. There are modest prizes for the top "generators" and "discriminators".


Registered participants were invited to submit entries for the "generator" phase of the contest. The "discriminator" phase is open now: if you click on the competition link below, you will be asked to try your hand at telling apart human- and AI-generated content. The answers will be revealed after the 4 pm deadline on the day of the workshop. Post-workshop addition: although the official contest is over, you are free to try out the questions for fun (the link below still works).

Organizing Committee

Kasun Amarasinghe

Rupert Croft

Yesukhei Jagvaral

Patrick Shaw