📐 AI Evaluation Beyond Metrics
workshop at IJCAI-ECAI 2022 (Vienna, Austria)
July 24th (Schubert 1 Room)
Invited Speakers & Panels
Facebook AI Research
University of St Andrews
+ panel on "Cognitive Evaluation with the Animal AI Environment", with
Murray Shanahan (Imperial, DeepMind)
Tomer D. Ullman (Harvard)
Amanda Seed (St Andrews)
+ panel on "Evaluating pre-trained, generative and prompted systems", with
Matthias Samwald (Medical Univ. Vienna)
Lama Ahmad (OpenAI)
Jo Plested (University of New South Wales)
+ special session on "OECD’s Artificial Intelligence and the Future of Skills (AIFS)"
with Stuart Elliott (OECD), Virginia Dignum (Umeå), Tony Cohn (Leeds) and Songül Tolan (European Commission)
Call for Papers
The 1st international workshop on AI Evaluation Beyond Metrics (EBeM) will be held in Vienna, Austria (July 23-25, 2022).
Cutting-edge AI and ML systems are able to solve a variety of problems that were not solvable a few years ago, such as machine translation and medical image analysis. With these AI systems starting to be deployed across important and consequential contexts, robust evaluation of their capabilities and limitations is critical. More generally, traditional approaches to evaluation lack the necessary robustness to analyse the capabilities of complex AI systems. Many AI systems solve a task or excel at a particular benchmark, but then fail at other tasks or instances that putatively represent the same capability.
Therefore, the goal of this workshop is to challenge the widespread but limited approach of evaluating the performance of intelligent systems with aggregated metrics over a benchmark or distribution of tasks. We will discuss further alternative approaches that draw on ideas and recent progress in cognitive and developmental psychology, psychometrics, software testing, and other areas.
Topics (not exhaustive)
Evaluation methods founded on cognitive, developmental or comparative psychology
Measurement of skills, capabilities, or cognitive abilities
Evaluation methods based on software testing or other engineering practices
Meta-analysis or comparisons of evaluation instruments
The role of evaluation in AI development, policy making, and modeling of social impact
Measurements of generality or common sense
Capture and use of evaluation data
Analysis of the task space and its relation to corresponding capabilities
The role of causality in evaluation
Topics complementary to evaluation such as documentation or auditing
Alternative evaluation methods with added benefits
Discussion and progress in hard-to-evaluate scenarios
Organisers
Universitat Politècnica de València
Cambridge
Harvard
European Commission
Cambridge
Cambridge
Cambridge
Universitat Politècnica de València
Program Committee
Atia Cortés - Barcelona Supercomputing Center
Alex Taylor - University of Auckland
Alex Wang - New York University
Celeste Kidd - University of California Berkeley
Craig S. Greenberg - NIST
David Fernández-Llorca - European Commission, JRC
Deborah Raji - Mozilla
Ellen Voorhees - NIST
Ernest Davis - New York University
Guillaume Avrin - Lab. Nat. de Métrologie et d'Essais
Isabelle Hupont-Torres - European Commission, JRC
Jan Feyereisl - GoodAI
Joel Leibo - DeepMind
Kevin Smith - MIT
Koustuv Sinha - McGill University
Ljerka Ostojic - University of Rijeka
Melanie Mitchell - Santa Fe Institute
Moira Dillon - New York University
Naman Shukla - Deepair Solutions
Panos Ipeirotis - New York University
Peter Flach - University of Bristol
Raul Santos-Rodriguez - University of Bristol
Ricardo Prudencio - Informatics Center, UFPE
Ricardo Vinuesa - KTH Royal Institute of Technology
Richard Mallah - Future of Life Institute
Rotem Dror - University of Pennsylvania
Sean Holden - University of Cambridge
Sebastian Gehrmann - Google Research
Songül Tolan - European Commission, JRC
Tadahiro Taniguchi - Ritsumeikan University
Vicky Charisi - European Commission, JRC
Venue & Registration
All registration is handled by IJCAI through its registration platform at https://registration.ijcai.org.
The EBeM venue is Messe Wien Exhibition and Congress Center:
Messe Wien
Hall B, entrance Congress Center
Messeplatz 1
A-1020 Vienna
Metro stop U2 “Messe Prater”