  • MIT, Stanford, and Princeton:

    • Open Letter (350+ co-signers) on the need for AI safe harbors

    • A Safe Harbor for AI Evaluation and Red Teaming

    • AIR-Bench

    • AI Risk Repository

  • The Hacking Policy Council:

    • AI red teaming - Legal clarity and protections needed

    • Comments to NIST’s AI Safety Institute on Managing Misuse Risk for Dual-Use Foundation Models

  • HackerOne:

    • An Emerging Playbook for AI Red Teaming With HackerOne

    • Anthropic Expands Their Model Safety Bug Bounty Program

  • Disclose.io:

    • VDP Policy Generator

  • CMU / Computer Emergency Response Team (CERT) / Artificial Intelligence Security Incident Response Team (AISIRT):

    • Lessons Learned in Coordinated Disclosure for Artificial Intelligence and Machine Learning Systems

    • CERT Guide to Coordinated Vulnerability Disclosure

    • On managing vulnerabilities in AI/ML systems

  • Cybersecurity and Infrastructure Security Agency (CISA): 

    • Software Must Be Secure by Design, and Artificial Intelligence Is No Exception

    • Secure-by-Design

    • CISA and Joint-Seal AI Publications

    • Cyber Incident Reporting for Critical Infrastructure Act (CIRCIA) Reporting Requirements

  • Center for Security and Emerging Technology (CSET): 

    • Adversarial Machine Learning and Cybersecurity: Risks, Challenges, and Legal Implications

  • Data and Society: 

    • AI Red-Teaming Is Not a One-Stop Solution to AI Harms: Recommendations for Using Red-Teaming for AI Accountability

  • Humane Intelligence:

    • Generative AI Red Teaming Challenge: Transparency Report

    • Algorithmic Risk Assessments, Audits, & Human Rights Report (GNI)

    • Red teaming large language models (LLMs) for resilience to scientific disinformation

  • DeepMind:

    • Frontier Safety Framework

    • Ethical and social risks of harm from Language Models

    • Taxonomy of Risks posed by Language Models

  • OpenAI:

    • Preparedness Framework

    • Response to NIST Executive Order on AI

  • Microsoft:

    • How Microsoft Approaches AI Red Teaming

    • Microsoft AI Red Team building future of safer AI

  • Bugcrowd:

    • Vulnerability Rating Taxonomy

    • Ultimate Guide to Vulnerability Disclosure

  • OpenPolicy:

    • Innovative Leaders' Perspectives on AI Red Teaming

    • Comments on National Institute of Standards and Technology (NIST) SP 800-218A: Secure Software Development Practices for Generative AI and Dual-Use Foundation Models

  • Academic Communities:

    • Identifying and Mitigating the Security Risks of Generative AI

    • Adversarial Nibbler

  • DEFCON AI Village (and Hugging Face):

    • Coordinated Flaw Disclosure for AI: Beyond Security Vulnerabilities

  • Hugging Face:

    • Red-Teaming Large Language Models

    • LLM Evaluation Guidebook

  • Algorithmic Justice League:

    • Change From the Outside: Towards Credible Third-Party Audits of AI Systems

  • Queer in AI:

    • Bound by the Bounty

  • Knight Columbia First Amendment Institute: 

    • A Safe Harbor for Platform Research

  • Federation of American Scientists:

    • A Safe Harbor For AI Researchers: Promoting Safety And Trustworthiness Through Good-Faith Research

  • The U.S. Department of State:

    • Risk Management Profile for Artificial Intelligence and Human Rights

  • National Institute of Standards and Technology (NIST): 

    • NIST AI 600-1: Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile

    • NIST Artificial Intelligence Risk Management Framework (AI RMF 1.0)

    • NIST AI 800-1 (Initial Public Draft): Managing Misuse Risk for Dual-Use Foundation Models

    • NIST SP 800-218A Secure Software Development Practices for Generative AI and Dual-Use Foundation Models: An SSDF Community Profile

    • NIST AI 100-2 E2023: Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations

    • NIST Cybersecurity Framework (CSF) 2.0

    • NIST Vulntology

  • US Legislative Proposals:

    • AI Incident Reporting and Security Enhancement Act

  • Mozilla:

    • External researcher access to closed models

  • DEFCON AI Village, Generative AI Red Teaming 2:

    • To Err is AI: A Case Study Informing LLM Flaw Reporting Practices

  • MITRE:

    • MITRE Adversarial Threat Landscape for Artificial-Intelligence Systems (ATLAS)

    • MITRE Common Weakness Enumeration (CWE)

  • OECD:

    • OECD AI Incidents Monitor (AIM)

  • AIID:

    • AI Incident Database (AIID)

  • AI Risk and Vulnerability Alliance:

    • AI Vulnerability Database (AVID)

  • Allen Institute for AI/UW:

    • WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

    • WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
