Evaluation

Benchmark Platform

The PCBA Standard-to-Real Grand Challenge is open on Codabench. To participate, register by applying for access to the dataset on HuggingFace, then submit your prediction files through the official competition platform.

The three top-performing teams will be invited to submit Grand Challenge papers for inclusion in the ACM MM 2026 main conference proceedings. Grand Challenge papers must follow the official ACM Multimedia 2026 author instructions and are limited to 6 pages plus 2 additional pages for references.

🔗 Competition Platform: PCBA Standard-to-Real Grand Challenge on Codabench

📮 Contact: aimmifm@gmail.com

Evaluation

All submissions are evaluated automatically on the official Codabench server. Participants submit a single prediction file for the released public test set, and the public leaderboard reports one Overall Score.

Metrics

Submissions are ranked by a normalized Overall Score that aggregates three metrics:

Accuracy: For single-choice questions (Cause & Handling, Factuality).
F1-Score: For defect existence detection.
MAE (Mean Absolute Error): Bounded and normalized for counting tasks.
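
To make the three metrics concrete, here is a minimal sketch of how such a score could be computed locally before submitting. It is not the official scoring script. The error bound `max_error`, the mapping of MAE to a higher-is-better score, the equal weights, and all function names are assumptions made for illustration; the official Codabench scorer may differ on each of these points.

```python
from typing import Sequence, Tuple


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of single-choice answers that match the reference exactly."""
    if not y_true:
        return 0.0
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1-score for defect existence, with 1 meaning 'defect present'."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bounded_mae_score(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    max_error: float = 10.0,  # assumed bound, not taken from the challenge rules
) -> float:
    """Clip each absolute error at max_error, then map MAE to [0, 1] (1 is best)."""
    if not y_true:
        return 0.0
    clipped = [min(abs(t - p), max_error) for t, p in zip(y_true, y_pred)]
    mae = sum(clipped) / len(clipped)
    return 1.0 - mae / max_error


def overall_score(
    acc: float,
    f1: float,
    count_score: float,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),  # assumed equal weights
) -> float:
    """Weighted aggregate of the three normalized component scores."""
    return weights[0] * acc + weights[1] * f1 + weights[2] * count_score


if __name__ == "__main__":
    acc = accuracy(["A", "B", "C", "D"], ["A", "B", "C", "A"])
    f1 = f1_binary([1, 1, 0, 0], [1, 0, 1, 0])
    cnt = bounded_mae_score([2, 5], [2, 20], max_error=10.0)
    print(acc, f1, cnt, overall_score(acc, f1, cnt))
```

Clipping each counting error before averaging keeps a single wildly wrong count from dominating the score, which is one plausible reading of "bounded and normalized"; treat the leaderboard value as the only authoritative result.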
