How do we develop machine learning models and systems that take fairness, accuracy, explainability, robustness, and privacy into account? How do we operationalize models in production and address their governance, management, and monitoring? Model validation, monitoring, and governance are essential for building trust in and adoption of AI systems in high-stakes domains such as hiring, lending, and healthcare. In this talk, we will highlight the challenges various stakeholders face when operationalizing AI/ML models, and emphasize the need to adopt responsible AI practices not only during model validation but also post-deployment as part of model monitoring. Please refer to our FAccT'22 tutorial for a detailed overview of techniques and tools for monitoring deployed ML models, industry case studies, key takeaways, and open challenges.
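One post-deployment monitoring check of the kind this talk is concerned with can be sketched in a few lines. The sketch below computes a population stability index (PSI) between a reference score distribution and live production scores; the 0.2 alert threshold, the synthetic data, and the function name are illustrative assumptions, not the tooling covered in the tutorial.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a reference (training-time) score distribution
    and a live (production) score distribution."""
    # Bin edges come from the reference distribution.
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Avoid log(0) / division by zero with a small epsilon.
    eps = 1e-6
    expected_pct = np.clip(expected_pct, eps, None)
    actual_pct = np.clip(actual_pct, eps, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Hypothetical scores: reference from validation, live from production traffic.
rng = np.random.default_rng(0)
reference_scores = rng.beta(2, 5, size=10_000)
live_scores = rng.beta(2.5, 4.5, size=10_000)   # slightly shifted population

psi = population_stability_index(reference_scores, live_scores)
# A commonly used (but application-dependent) rule of thumb: PSI > 0.2 flags drift.
if psi > 0.2:
    print(f"ALERT: score drift detected (PSI={psi:.3f}); trigger model review")
else:
    print(f"OK: PSI={psi:.3f}")
```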
Common practice in Responsible AI is to develop metrics that represent the concerns we want to better understand, such as fairness, safety, and privacy. These metrics are critical: they hold us accountable, help us find gaps in our product development, and drive meaningful, measurable change. Even so, our metrics have limitations, especially in the ways we define who we are evaluating and improving for. Through a set of short case studies, Tulsee will highlight some of the nuances of metric development for fairness evaluation, show how the ways we define demographics can critically change our understanding, and emphasize the importance of clarity and accountability when discussing these goals.
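The point that demographic definitions can change what a fairness metric reports can be made concrete with a minimal sketch. The data, group labels, and metric below are hypothetical: the same predictions are scored with a demographic parity gap under a coarse grouping and under a finer-grained grouping of the same individuals, and the measured disparity differs substantially.

```python
import pandas as pd

# Hypothetical evaluation data: model decisions plus two alternative
# demographic codings of the same individuals (coarse vs. fine-grained).
df = pd.DataFrame({
    "approved":     [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1],
    "group_coarse": ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"],
    "group_fine":   ["A1", "A1", "A2", "A2", "A2", "A2", "B1", "B1", "B1", "B2", "B2", "B2"],
})

def demographic_parity_gap(data, group_col, outcome_col="approved"):
    """Largest difference in positive-outcome rate across the groups in group_col."""
    rates = data.groupby(group_col)[outcome_col].mean()
    return float(rates.max() - rates.min()), rates

gap_coarse, rates_coarse = demographic_parity_gap(df, "group_coarse")
gap_fine, rates_fine = demographic_parity_gap(df, "group_fine")

print("Coarse grouping rates:\n", rates_coarse, "\ngap =", round(gap_coarse, 3))
print("Fine grouping rates:\n", rates_fine, "\ngap =", round(gap_fine, 3))
# The same predictions can look acceptable under one demographic definition
# and reveal a much larger disparity under another.
```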
This talk will explore how existing algorithmic fairness approaches are often ill-suited to capturing fairness-related harms arising from language technologies, suggesting the need for language-focused measurement approaches to support NLP practitioners. At the same time, the talk will illustrate persistent challenges in developing benchmark datasets, one such emerging measurement approach, and call for FATE research that addresses measurement appropriate to language-related harms and, more broadly, NLP practitioners' needs in their ethical work.
Many industries have centered their business development on AI-based innovation. However, trust in AI output is crucial for the broad adoption of AI systems. To ensure reliability, industry practice relies on testing and debugging of applications. In this talk, we discuss unsolved problems in the automated testing and debugging of AI models. Specifically, we emphasize incorporating user-driven specifications to create realistic test data and mapping mispredictions back to the data.
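A minimal sketch of the two ideas the talk emphasizes, generating test data from a user-driven specification and mapping mispredictions back to the data, is shown below. The feature ranges, the oracle rule, and the deliberately flawed model are all hypothetical stand-ins, not the methods presented in the talk.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical user-driven specification: realistic value ranges for each
# feature of a loan-scoring model (names and ranges are illustrative only).
SPEC = {
    "income":      (20_000, 150_000),
    "loan_amount": (1_000, 50_000),
    "age":         (18, 80),
}

def generate_tests(spec, n=1_000):
    """Sample test inputs uniformly within the user-specified ranges."""
    return {f: rng.uniform(lo, hi, size=n) for f, (lo, hi) in spec.items()}

def oracle(tests):
    """Expected behaviour derived from the spec: approve when the loan is
    at most 30% of income (a stand-in for a user-provided rule)."""
    return (tests["loan_amount"] <= 0.3 * tests["income"]).astype(int)

def model_under_test(tests):
    """Placeholder for the deployed model: a deliberately flawed rule
    that rejects all young applicants regardless of income."""
    pred = (tests["loan_amount"] <= 0.3 * tests["income"]).astype(int)
    pred[tests["age"] < 25] = 0
    return pred

tests = generate_tests(SPEC)
expected, predicted = oracle(tests), model_under_test(tests)
mispredicted = expected != predicted

# Map mispredictions back to the data: summarize the failing slice per feature.
print(f"Misprediction rate on spec-generated tests: {mispredicted.mean():.1%}")
for feature, values in tests.items():
    bad = values[mispredicted]
    if len(bad):
        print(f"  failing {feature} range: [{bad.min():.0f}, {bad.max():.0f}]")
```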