Context
There’s no well-established path to disclose a flaw you find in an AI system. As technologists, researchers, and lawyers from 24 organizations, we came together to design a better system.
Proposal
We propose that researchers submit AI flaw reports, and AI system providers adopt coordinated disclosure programs, with built-in researcher protections, specifically for the broad spectrum of AI flaws.
Implementation
Our paper on third-party flaw disclosure provides a standardized template for AI flaw reports, language AI providers can adopt to protect researchers, and next steps to build an effective disclosure ecosystem.
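To give a concrete sense of what a standardized flaw report could capture, here is a minimal sketch in Python. The field names below are illustrative assumptions, not the exact fields of the paper's template; the paper itself is the authoritative reference.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AIFlawReport:
    """Illustrative sketch of a standardized AI flaw report.

    Field names are assumptions for demonstration; see the paper
    for the actual template.
    """
    reporter_contact: str          # how providers can reach the researcher
    affected_systems: List[str]    # systems/versions where the flaw was observed
    flaw_description: str          # what the flaw is and why it matters
    reproduction_steps: List[str]  # inputs or prompts needed to reproduce it
    severity: str                  # e.g. "low", "medium", "high"
    potential_harms: List[str]     # who could be harmed and how
    suggested_mitigation: str = "" # optional remediation ideas
    disclosure_timeline: str = ""  # proposed window before public disclosure


# Example: a report a researcher might file with a provider's program.
report = AIFlawReport(
    reporter_contact="researcher@example.org",
    affected_systems=["ExampleGPT v2.1"],
    flaw_description="Model reveals personal data under certain prompts",
    reproduction_steps=["Send the trigger prompt", "Observe leaked record in the reply"],
    severity="high",
    potential_harms=["privacy violation for individuals in the training data"],
)
print(report)
```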
Key Findings
Flaw disclosure for AI systems has major gaps. Researchers who discover flaws in AI systems often do not report them, cannot find an effective reporting outlet, or merely post them on social media.
Transferable flaws pose significant risks. Some flaws affect many systems, implicating system developers, deployers, and users. If these flaws are not reported to all the relevant actors, AI systems could cause much greater harm.
Building infrastructure can tackle this problem. Standardized reporting, legal protections, and tools to coordinate disclosure can transform how we identify and address AI flaws before harm occurs.
Changing the Status Quo
The current state of AI flaw disclosure is fragmented and underdeveloped compared to established practices in software security.
While reporting security flaws in AI systems is often in scope for companies’ current flaw reporting mechanisms, AI flaws are broader than security, spanning safety, trustworthiness, and sociotechnical issues.
There are hundreds of millions of users of popular general-purpose AI systems, but users often never find out about serious issues in the products or services they use. Additionally, developers are rarely held accountable for flaws in their systems.
We propose a centralized disclosure coordination scheme that helps route and triage AI Flaw Reports to actors across the AI supply chain.
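To illustrate what such coordination could look like in practice, here is a minimal routing sketch in Python. The registry, actor names, and the transferable-flaw fan-out rule are assumptions for demonstration, not the paper's specified design; real infrastructure would need richer metadata and human triage.

```python
from typing import Dict, List

# Hypothetical registry mapping each AI system to the actors who should
# receive reports about it (developers, deployers, downstream integrators).
SUPPLY_CHAIN: Dict[str, List[str]] = {
    "ExampleGPT v2.1": ["developer@example.com", "deployer@hosting.example"],
    "ExampleVision v1.0": ["developer@example.com"],
}

def route_flaw_report(affected_systems: List[str], transferable: bool) -> List[str]:
    """Return the recipients for a flaw report.

    If a flaw is transferable (it likely affects many systems), notify every
    registered actor in the supply chain rather than only those attached to
    the systems named in the report.
    """
    if transferable:
        recipients = {actor for actors in SUPPLY_CHAIN.values() for actor in actors}
    else:
        recipients = {
            actor
            for system in affected_systems
            for actor in SUPPLY_CHAIN.get(system, [])
        }
    return sorted(recipients)

# A transferable jailbreak-style flaw fans out to all registered actors.
print(route_flaw_report(["ExampleGPT v2.1"], transferable=True))
```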
Third-Party Evaluation Is Critical
Third-party evaluation has unique benefits, but AI companies often default to first- and second-party evaluation as shown in the figure below.
Third-party evaluation:
Enhances scale and diversity of participation beyond what AI system providers can achieve internally
Improves coverage of evaluations by incorporating perspectives and expertise not represented within AI system providers
Provides independent evaluations free from the conflicts of interest that can undermine internal assessments
A Checklist for Evaluators and Providers
Both evaluators and providers of general-purpose AI systems have room to improve their practices.
Evaluators should abide by “good faith” rules of engagement, such as not harming real users or systems, protecting privacy, and evaluating only systems that providers identify as in-scope.
Evaluators should responsibly disclose flaws, sharing key information via standardized AI Flaw Reports (like the example in the figure above) and giving system providers an opportunity to mitigate flaws before public disclosure.
Providers of general-purpose AI systems should offer safe harbor to good faith research and encourage third-party evaluation.
Questions? Read the paper or contact slongpre@media.mit.edu