  • MIT, Stanford, and Princeton:

    • Open Letter (350+ co-signers) on the need for AI safe harbors

    • A Safe Harbor for AI Evaluation and Red Teaming

    • AIR-Bench

    • AI Risk Repository

  • The Hacking Policy Council:

    • AI red teaming - Legal clarity and protections needed

    • Comments to NIST’s AI Safety Institute on Managing Misuse Risk for Dual-Use Foundation Models

  • HackerOne:

    • An Emerging Playbook for AI Red Teaming With HackerOne

    • Anthropic Expands Their Model Safety Bug Bounty Program

  • Disclose.io:

    • VDP Policy Generator

  • CMU / Computer Emergency Response Team (CERT) / Artificial Intelligence Security Incident Response Team (AISIRT):

    • Lessons Learned in Coordinated Disclosure for Artificial Intelligence and Machine Learning Systems

    • CERT Guide to Coordinated Vulnerability Disclosure

    • On managing vulnerabilities in AI/ML systems

  • Cybersecurity and Infrastructure Security Agency (CISA): 

    • Software Must Be Secure by Design, and Artificial Intelligence Is No Exception

    • Secure-by-Design

    • CISA and Joint-Seal AI Publications

    • Cyber Incident Reporting for Critical Infrastructure Act (CIRCIA) Reporting Requirements

  • Center for Security and Emerging Technology (CSET): 

    • Adversarial Machine Learning and Cybersecurity: Risks, Challenges, and Legal Implications

  • Data and Society: 

    • AI Red-Teaming Is Not a One-Stop Solution to AI Harms: Recommendations for Using Red-Teaming for AI Accountability

  • Humane Intelligence:

    • Generative AI Red Teaming Challenge: Transparency Report

    • Algorithmic Risk Assessments, Audits, & Human Rights Report (GNI)

    • Red teaming large language models (LLMs) for resilience to scientific disinformation

  • DeepMind:

    • Frontier Safety Framework

    • Ethical and social risks of harm from Language Models

    • Taxonomy of Risks posed by Language Models

  • OpenAI:

    • Preparedness Framework

    • Response to NIST Executive Order on AI

  • Microsoft:

    • How Microsoft Approaches AI Red Teaming

    • Microsoft AI Red Team building future of safer AI

  • Bugcrowd:

    • Vulnerability Rating Taxonomy

    • Ultimate Guide to Vulnerability Disclosure

  • OpenPolicy:

    • Innovative Leaders' Perspectives on AI Red Teaming

    • Comments on National Institute of Standards and Technology (NIST) SP 800-218A: Secure Software Development Practices for Generative AI and Dual-Use Foundation Models

  • Academic Communities:

    • Identifying and Mitigating the Security Risks of Generative AI

    • Adversarial Nibbler

  • DEFCON AI Village (and Hugging Face):

    • Coordinated Flaw Disclosure for AI: Beyond Security Vulnerabilities

  • Hugging Face:

    • Red-Teaming Large Language Models

    • LLM Evaluation Guidebook

  • Algorithmic Justice League:

    • Change From the Outside: Towards Credible Third-Party Audits of AI Systems

  • Queer in AI:

    • Bound by the Bounty

  • Knight Columbia First Amendment Institute: 

    • A Safe Harbor for Platform Research

  • Federation of American Scientists:

    • A Safe Harbor For AI Researchers: Promoting Safety And Trustworthiness Through Good-Faith Research

  • The U.S. Department of State:

    • Risk Management Profile for Artificial Intelligence and Human Rights

  • National Institute of Standards and Technology (NIST): 

    • NIST AI 600-1: Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile

    • NIST Artificial Intelligence Risk Management Framework (AI RMF 1.0)

    • NIST AI 800-1 (Initial Public Draft): Managing Misuse Risk for Dual-Use Foundation Models

    • NIST SP 800-218A Secure Software Development Practices for Generative AI and Dual-Use Foundation Models: An SSDF Community Profile

    • NIST AI 100-2 E2023: Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations

    • NIST Cybersecurity Framework (CSF) 2.0

    • NIST Vulntology

  • US Legislative Proposals:

    • AI Incident Reporting and Security Enhancement Act

  • Mozilla:

    • External researcher access to closed models

  • DEFCON AI Village, Generative AI Red Teaming 2:

    • To Err is AI: A Case Study Informing LLM Flaw Reporting Practices

  • MITRE:

    • MITRE Adversarial Threat Landscape for Artificial-Intelligence Systems (ATLAS)

    • MITRE Common Weakness Enumeration (CWE)

  • OECD:

    • OECD AI Incidents Monitor (AIM)

  • AIID:

    • AI Incident Database (AIID)

  • AI Risk and Vulnerability Alliance:

    • AI Vulnerability Database (AVID)

  • Allen Institute for AI/UW:

    • WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

    • WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
