Web Agent Revolution: Enhancing Accuracy, Safety, and Trustworthiness
March 3, 2025 | Philadelphia, Pennsylvania, USA
Web Agent Revolution: Enhancing Accuracy, Safety, and Trustworthiness for Enterprise Adoption
The Web Agent Revolution workshop at AAAI 2025 focuses on advancing the development of general web agents through innovative benchmarks, datasets, and agent architectures. Web agents, autonomous AI systems capable of navigating and interacting with the web, have advanced rapidly. However, existing agents lack the components needed to provide the safeguards mandated for enterprise adoption, and evaluation benchmarks lack rigorous methods for testing those safeguards. This workshop addresses key challenges in improving trustworthiness and reliability in real-world settings, making it a relevant forum for both academia and industry.
Topics:
Safety and trustworthiness of web agents
Agentic workflows in enterprise contexts
Agentic learning and teaching
Web agents as a tool for other agents
Agent architectures for policy awareness
Standardized and open-source benchmark development
Novel methods for scaling online benchmarks
New metrics and evaluation functions
Learning from human behavior and standard operating procedures
Human-in-the-loop in web agents
Implementing safeguards in multi-agent architectures
8:30 - 9:00: Gathering
9:00 - 9:10 - Welcome and Opening Remarks - Avi Yaeli
9:10 - 9:50 - Keynote - Avi Yaeli, The Rise of Generalist Agents and their Role in the Agentic AI Revolution
9:50 - 10:15 - Invited Talk 1 - Léo Boisvert, WorkArena: An Enterprise Software Benchmark for Web Agents
10:15 - 10:40 - Invited Talk 2 - Arthur Bucker, Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
10:40 - 11:10 - Coffee Break
11:10 - 11:35 - Invited Talk 3 - Dehan Kong, Democratizing Web Automation: End-to-End Infrastructure for Building Personalized Web Agents at Scale
11:35 - 12:00 - Invited Talk 4 - Xiang Deng, Scale AI
12:00 - 12:25 - Invited Talk 5 - Tamer Abuelsaad, Chronicles of Web Agent Development: How Each Discovery Shaped Our Enterprise-Ready Approach
12:25 - 12:30 - Buffer
12:30 - 14:00 - Lunch Break
14:00 - 14:25 - Invited Talk 9 - Yunpu Ma, WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration
14:25 - 14:50 - Invited Talk 7 - Zora Wang, From Workflows to Tools: Building Adaptive Agents On the Fly
14:50 - 15:15 - Invited Talk 8 - Zora Wang, TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
15:15 - 15:35 - Invited Talk 6 - Pranav Putta, Evaluations & Simulation Environments for Web Agents
15:35 - 16:00 - Coffee Break
16:00 - 16:30 - Panel
16:30 - 17:00 - Open discussion, roadmap, and closing remarks
Attendance
The workshop is open to researchers, practitioners, and industry professionals interested in web agents, AI, and automation.
Submission Requirements
We invite submissions of extended abstracts (2-4 pages) presenting recent research, case studies, or position papers relevant to the workshop topics. Submissions should follow the AAAI formatting guidelines.
Submission Site
Please submit your papers via https://easychair.org/conferences/?conf=waretea1.
Deadline: December 24, 2024.
Workshop Chairs:
Segev Shlomov (IBM Research, segev.shlomov1@ibm.com)
Xiang Deng (Google, xiangdeng@google.com)
Ronen Brafman (Ben-Gurion University, brafman@bgu.ac.il)
Avi Yaeli (IBM Research, aviy@il.ibm.com)