6th Workshop on Integrity in Social Networks and Media · August 10, 2026 (exact time TBD)
Co-located with KDD 2026 · August 9–13, 2026 · Jeju, Korea
Social networks and social media have become the default communication channels for billions of people worldwide. While these platforms enable connection and discovery at unprecedented scale, they also expose fundamental integrity challenges — from misinformation and coordinated manipulation to child safety risks and harms from AI-generated synthetic media. The rapid evolution of AI, especially generative models, has transformed this landscape: advancing AI systems simultaneously intensify online risks and unlock powerful new capabilities in automated moderation, behavioral anomaly detection, and human-AI safety operations.
This half-day workshop brings together researchers, practitioners, and policymakers from academia and industry to explore these dual-use dynamics. The event features invited talks from academic experts and industry leaders, peer-reviewed papers selected through an open call for papers, and a panel discussion.
We welcome submissions on topics including, but not limited to, the following:
Adversarial Dynamics in the GenAI Era — Evolving evasion strategies, automated red-teaming, and real-time detection of AI-generated misinformation and behavioral anomalies.
AI-Accelerated Coordinated Operations & Agent Dynamics — Emerging integrity risks driven by synthetic personas, agent-to-agent manipulation, and coordinated influence operations.
Parasocial Harms from Synthetic Entities — Risks from AI-driven personas and virtual influencers engineered to create emotional dependency and manipulate users.
Open-Source Trust & Safety Toolkits — Advancing collaborative, open-source ecosystems for content moderation, threat detection, red-teaming, and safety evaluation.
Foundation Models for Integrity — Generative AI for content moderation, open-source integrity oracles, and reliable non-synthetic ground-truth datasets.
AI-Enabled Evaluation Frameworks — Developing AI-enabled benchmarks and continuous evaluation pipelines that measure robustness and real-world harm reduction.
Data Pollution, Model Collapse & Knowledge Discovery — Long-term ecological risks from synthetic training data loops and the challenge of retrieving authentic human insight.
Human-AI Collaboration in Safety Operations — Improving reviewer well-being, quality, and efficiency through hybrid human-AI pipelines and LLM-assisted labeling.
Regulatory Alignment & Global Compliance — Balancing user rights, cultural nuance, and regional regulations while preventing over-enforcement.
Multimodal Safety at Scale — Challenges in moderating large-scale video, audio, and synthetic media, including efficient architectures and cross-modal reasoning.
Safety for Autonomous Agents — Ensuring safe behavior in agentic systems capable of planning, tool use, and long-horizon actions.
We invite submissions of original research papers, position papers, and work-in-progress reports on topics related to the use of AI for integrity in social networks and media (see potential topic areas above). Submissions will be peer-reviewed, and selected papers will be presented in person during the event and included in the Integrity Workshop proceedings. We welcome participation from academia, industry, and government to foster cross-disciplinary collaboration.
Submission Guidelines:
Short papers: 2–4 pages (excluding references)
Full papers: 5–8 pages (excluding references)
Submissions should be PDFs formatted using the ACM conference template
All submissions undergo peer review by the program committee
Accepted papers will be presented as in-person short talks during the workshop and posted on the workshop website
Submission link: https://openreview.net/group?id=KDD.org/2026/Workshop/Integrity
Important Dates:
Paper Submission Opens: April 9, 2026
Paper Submission Deadline: April 30, 2026
Paper Notification: June 4, 2026
Camera-Ready Deadline: June 22, 2026
Workshop Date: August 10, 2026
All deadlines are 11:59 PM AoE (Anywhere on Earth).
Speaker and Panelist
Title: Overview of AI Models for Trust and Safety at YouTube
Abstract: In this talk, I'll provide an overview of how we use AI to detect policy-violative content on YouTube across different entity types (videos, comments, livestreams, etc.) and keep our community safe. AI is an essential part of YouTube Trust and Safety, and I'll share the lessons we've learned over many years of building, deploying, and maintaining models at scale. I'll also cover the unique challenges of different product surfaces, such as livestreams and Shorts, which make the problem domain particularly intriguing.
Bio: Mehmet Emre Sargin received his Ph.D. in Electrical and Computer Engineering from the University of California, Santa Barbara, in 2010. He is now a Senior Director at Google, leading the Trust and Safety detection teams for YouTube. His research interests include computer vision, recommendation systems, and machine learning.
Speaker
Title: Scaling Abuse Prevention for the Next Wave of AI at LinkedIn
Abstract: AI is reshaping the abuse landscape by enabling attackers to generate more convincing content, automate workflows, and iterate at unprecedented speed. At the same time, platforms are increasingly using AI themselves, creating both new opportunities for defense and new classes of failure modes. As organizations build more sophisticated AI solutions and agentic experiences, abuse prevention must evolve in parallel to address risks that are more dynamic, scalable, and difficult to contain. This talk explores how LinkedIn is scaling abuse prevention for this changing environment: the challenges of detection and enforcement in AI-mediated ecosystems, the need to embed trust and safety into advanced AI systems from the start, and how to build sophisticated AI-powered defenses and agents that can detect, adapt, and respond at scale while continuously improving decision quality, agility, and operational efficiency.
Bio: With over 20 years of experience across diverse domains and industries, Daniel is a seasoned leader and innovator in artificial intelligence, data science, and product infrastructure, having managed large teams across multiple sites worldwide. His mission is to support machine learning efforts at LinkedIn that ensure a safe, trusted, and professional platform, while advancing AI driven by ethical principles that put people first. He brings diverse perspectives and experiences to the team as a multilingual professional with a Ph.D. in computer science and a background in industry, consulting, academia, and research.
Speaker and Panelist
Title: AI Integrity as a Search Problem: Diversity-Driven Behavioral Evaluation
Abstract: It has become too easy to generate and too hard to evaluate. AI systems ship fast, but making sure they actually behave correctly is still slow, manual, and narrow. Current evaluation makes this worse: it optimizes for one number (attack success rate) and misses the breadth of how systems really break. We present Flint, a framework that treats the gap between how AI behaves and how it should behave as a search problem. Flint is a search engine in which the target system is just a parameter: the same infrastructure that probes a chatbot's safety boundaries works for agent frameworks, RL reward models, and world model evaluation. Only the executor adapter changes.
Flint nests evolutionary strategy search with RL-guided prompt optimization. The outer loop evolves multi-turn, multi-modal strategies and maintains a quality-diversity archive that keeps diverse high performers rather than collapsing to one best approach. The inner loop sequences mutations turn by turn, guided by a belief tracker (BeliefNet) that reads target model state and routes through a dual memory system: discrete retrieval for known patterns, a neural policy for novel situations. The strategy bank grows with every run.
We walk through a controlled evaluation of a frontier language model as one case study among broader deployments. The core finding: behavioral gaps are distributed. Cognitive bias chaining, consensus loops, creative framing, credential-based extraction, and role-inversion attacks all succeed independently, yet no single strategy dominates; patching the top vector leaves most of the surface exposed. Successful multi-turn attacks oscillate between refusal and compliance, revealing that per-turn enforcement is stateless. The same search framework applies to any customer-defined policy or behavioral expectation: business requirements, regulatory compliance, tone guidelines, over-refusal, domain-specific constraints. Discovered patterns feed into a production guardrail cascade; production bypasses re-enter the search loop, so evaluation and enforcement co-improve over time.
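Flint's internals are not public, but the quality-diversity archive the abstract describes follows the general pattern of MAP-Elites-style search: keep one elite per behavioral niche instead of a single global best. The toy loop below is a minimal, hypothetical sketch of that pattern only, not Flint's implementation; every name in it (TACTICS, descriptor, the stand-in evaluate scorer) is invented for illustration, and a real system would execute multi-turn strategies against a live target through an executor adapter with RL-guided mutation.

```python
# Minimal MAP-Elites-style quality-diversity loop (illustrative sketch only;
# NOT the Flint implementation). A "strategy" is a sequence of tactic labels
# standing in for a multi-turn probing strategy.
import random

random.seed(0)

TACTICS = ["role_play", "framing", "credential", "consensus", "bias_chain"]

def random_strategy(length=4):
    # Hypothetical stand-in for sampling an initial multi-turn strategy.
    return [random.choice(TACTICS) for _ in range(length)]

def mutate(strategy):
    # Swap one turn's tactic; stand-in for RL-guided prompt mutation.
    s = list(strategy)
    s[random.randrange(len(s))] = random.choice(TACTICS)
    return s

def evaluate(strategy):
    # Toy scorer standing in for running the strategy against a target
    # system; rewards within-strategy tactic diversity, plus noise.
    return len(set(strategy)) / len(strategy) + random.uniform(0, 0.1)

def descriptor(strategy):
    # Behavior descriptor: which archive niche the strategy occupies.
    return (strategy[0], len(set(strategy)))

archive = {}  # niche -> (score, strategy): one elite per behavioral niche

for _ in range(2000):
    # Parent is an existing elite if any, else a fresh random strategy.
    parent = random.choice(list(archive.values()))[1] if archive else random_strategy()
    child = mutate(parent)
    score, cell = evaluate(child), descriptor(child)
    # Insert only if the child beats the current elite of its own niche,
    # so the archive retains diverse high performers.
    if cell not in archive or score > archive[cell][0]:
        archive[cell] = (score, child)

print(f"{len(archive)} behavioral niches kept")
for cell, (score, strat) in sorted(archive.items())[:5]:
    print(cell, round(score, 3), strat)
```

Even in this toy, the design point the abstract emphasizes survives: because insertion competes only within a niche, the archive keeps many qualitatively different high performers rather than collapsing to the single best attack vector.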
Bio: Anish Das Sarma is the Founder and CEO of Reinforce Labs, a startup dedicated to ensuring responsible adoption of enterprise AI systems. A repeat founder, Anish previously built Trooly, an AI-powered identity and trust platform that was acquired by Airbnb. At Airbnb, he went on to lead key initiatives in AI and trust & safety. Most recently, Anish served as a Director of Engineering at Google, where he led large-scale AI/ML teams across Google Ads Safety. Anish holds a Ph.D. in Computer Science from Stanford University and a B.Tech in Computer Science from IIT Bombay. Across more than a decade in AI, he has combined deep technical expertise with hands-on operational leadership, building products and teams across multiple domains of applied ML.
Speaker
Title: Measuring Online Abuse in an AI-Powered World
Abstract: How do you know if your internet platform is getting safer or more dangerous? At Enigma 2020 I argued that measurement (and not detection) is the hardest problem in abuse fighting: How do we find out if we’re failing to block some attackers? How do we know if we’ve inadvertently blocked benign users or content? I shared some approaches we use at Meta to produce labeled samples in domains where we can never definitively obtain “ground truth”; I also introduced an “uncertainty principle” that surfaces a fundamental tension between measurement and detection.
Six years later, we still need to answer the same questions, but now both sides have AI. Adversaries’ use of AI makes the core measurement questions even harder to answer, while the widespread adoption of AI in both abuse detection and abuse measurement risks introducing circular logic that threatens the efficacy of the entire system — if AI is “grading its own homework,” then how can we trust the numbers it gives us?
In this talk I’ll recap the prior art in abuse measurement, discuss how AI has changed the landscape, and describe multicalibration — a mathematical framework that addresses the circularity concerns raised above. Meta researchers have used multicalibration approaches to demonstrate that, contrary to conventional wisdom, the same model can be used for both enforcement and measurement. It turns out that with the right controls, AI can indeed grade its own homework on finding abuse — but I can’t offer the same guarantee for your university courses!
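For readers unfamiliar with the framework, the following is the standard definition of multicalibration (Hébert-Johnson et al., ICML 2018), lightly simplified by ignoring low-probability score levels; the abuse-measurement reading of the symbols is an illustrative gloss, not taken from the talk itself.

```latex
% Let f : X -> [0,1] score content (e.g., the probability a post is
% abusive), let y in {0,1} be the true label, and let C be a collection
% of subpopulations S (subsets of X) on which we want trustworthy numbers.
% f is alpha-multicalibrated with respect to C if, for every S in C and
% every value v that f attains with non-negligible probability on S,
\[
  \bigl|\, \mathbb{E}\left[\, y \mid f(x) = v,\; x \in S \,\right] - v \,\bigr| \;\le\; \alpha .
\]
% Consequence: averaging f's scores over any S in C estimates the abuse
% prevalence on S to within alpha, without fresh ground-truth labels --
% one precise sense in which a suitably controlled model can "grade its
% own homework" for measurement while also powering enforcement.
```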
Bio: David Freeman is a Research Scientist/Engineer on the Central Integrity team at Meta, where his work focuses on identifying and stopping various forms of abuse on Meta's platforms. Dr. Freeman has 14 years of industry experience working at the intersection of data science, anti-abuse engineering, and policy. He has published, presented, and chaired workshops at international conferences such as Enigma, USENIX Security, WWW, and AISec, and wrote, with Clarence Chio, the book Machine Learning and Security (O'Reilly). He holds a Ph.D. in Mathematics from UC Berkeley and did postdoctoral research in the Applied Cryptography group at Stanford.
Panelist
Baraa is a Principal Software Engineer at Meta, where he drives the vision and strategy for AI-powered developer productivity, architecting agentic AI tools, workflows, and platforms that are transforming how engineers build software at scale. Before this role, Baraa spent eight years leading trust and safety initiatives at Meta, where he designed and built integrity systems and user-facing safety products that protect billions of people across Meta's family of apps. His work spans the detection and mitigation of platform abuse, misinformation, and harmful content, contributing to the development of scalable enforcement and content moderation infrastructure. Baraa also founded cross-functional engineering communities at Meta focused on raising the bar for product quality and engineering excellence. He is passionate about the intersection of artificial intelligence, platform integrity, and building systems that operate reliably and responsibly at scale.
Panagiotis Papadimitriou, Senior Director of Engineering, Trust and Safety, Meta
Mehmet Emre Sargin, Senior Director of Engineering, YouTube Trust and Safety, Google
Baraa Hamodi, Principal Software Engineer, Trust and Safety, Meta
Anish Das Sarma, Founder & CEO, Reinforce Labs
Additional speakers and panelists to be announced.
All times below are in Korean Standard Time (KST, GMT+9) on August 10, 2026.
[Agenda is still being finalized and may change.]
Draft program:
The workshop is a half-day event featuring 4 invited talks, 2 paper sessions, and a panel discussion.
Opening Remarks — 10 min
Invited Talk 1 — 20 min talk + 5 min Q&A
Invited Talk 2 — 20 min talk + 5 min Q&A
Invited Talk 3 — 20 min talk + 5 min Q&A
Break — 10 min
Paper Session 1 — 3 papers × 15 min (10 min talk + 5 min Q&A)
Invited Talk 4 — 20 min talk + 5 min Q&A
Break — 10 min
Paper Session 2 — 2 papers × 15 min (10 min talk + 5 min Q&A)
Panel Discussion — 40 min
Closing Remarks — 10 min
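As listed, the draft slots sum to 10 + 25 + 25 + 25 + 10 + 45 + 25 + 10 + 30 + 40 + 10 = 255 minutes, roughly four and a quarter hours including breaks, consistent with a half-day program; individual timings may shift as the agenda is finalized.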
Exact times will be posted after the KDD schedule is finalized.
Organizers:
Panagiotis Papadimitriou, Senior Director of Engineering, Trust and Safety, Meta
Mehmet Emre Sargin, Senior Director of Engineering, YouTube Trust and Safety, Google
Madhu Ramanathan, Senior Engineering Manager, Meta
Sach Sokol, Senior Engineering Manager, Meta
Kiran Garimella, Rutgers University, USA
Timos Sellis, Archimedes / Athena Research Center, Greece
Mohamed Abdelhady, Principal Group Applied Scientist Manager, Trust and Safety, Microsoft
Daniel Olmedilla, Distinguished Engineer, Trust and Safety, LinkedIn
Panayiotis Tsaparas, University of Ioannina, Greece
Prathyusha Senthil Kumar, Senior Engineering Manager, Meta
Vasilis Verroios, Research Scientist, Meta
For questions about the workshop, please contact:
Madhu Ramanathan — madram@meta.com
Sach Sokol — sachsokol@meta.com