MV2026 - Evaluation

The 2026 Grand Challenge on Multimedia Verification (MV2026)

Evaluation Criteria

Overall Evaluation.

Main score on each task:

Score = max(0, Q - w × T²)

Q: Quality score assessed by professional fact-checkers (see the section Report Evaluation (Quality Q) below).
T: Time - number of hours after the task release.
w: Weight - controls the penalty. (w = 0.001)
Simply put, the penalty grows with the square of the number of hours elapsed since the task release, and the score never drops below zero.
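
As a rough illustration of the formula above, the following Python sketch computes the main score for one task. The function name `main_score` and the sample inputs are assumptions made for illustration; only the formula and the value w = 0.001 come from the criteria above.

```python
def main_score(quality: float, hours: float, weight: float = 0.001) -> float:
    """Return max(0, Q - w * T^2), with T in hours since the task release.

    `main_score` is a hypothetical helper, not part of any official tooling.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    penalty = weight * hours ** 2  # penalty grows with the square of the time
    return max(0.0, quality - penalty)


if __name__ == "__main__":
    # Sample values chosen for illustration only.
    for q, t in [(90, 24), (90, 100), (5, 100)]:
        print(f"Q={q}, T={t}h -> score={main_score(q, t):.3f}")
```

With w = 0.001, a report submitted 24 hours after release loses about 0.58 points, while one submitted 100 hours after release loses 10 points.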

Report Evaluation (Quality Q)

The maximum quality score is 110 points per case.

Summary & Content Classification (10)
- Concise Overview – Clearly summarizes the findings, highlighting uncertainties and unknowns.
- Correctly Categorized – Correctly assigns relevant tags based on platforms, people, brands, or specific topics (e.g., TikTok, Trump, Coca-Cola, Ukraine War, or AI-generated).
Verified Evidence (65)
- Source Details (15) – Identifies where the content originates (e.g., URLs, original posts, and metadata).
- Where? (Location) (15) – Determines the correct geographical context.
- When? (Time) (15) – Establishes the accurate timeframe.
- Who? (People, Organizations, Entities Involved) (10) – Identifies key individuals or groups.
- Why? (Motivation or Intent) (10) – Provides a reasoned explanation of possible intent.
Forensic Analysis (20)
- Authenticity Assessment – Determines if the content is synthetic, modified, or recaptured.
- Verification Tools & Methods – Clearly documents the tools and techniques used.
- Synthetic Type (if applicable) – Identifies how AI-generated content was produced (e.g., GANs or Stable Diffusion).
- Other Artifacts – Notes any detected anomalies or manipulations.
Evidence & Findings (5)
- Supporting Sources – Uses additional fact-checks, reports, or metadata to back claims.
- Cross-Checking Information – Ensures verification through multiple independent sources.
Clarity & Structure (10)
- Well-Organized Report – Logically structured for readability.
- Concise & Understandable Language – Avoids unnecessary complexity or ambiguity.

Note: Not all points may be verifiable in every case. Clearly stating the failure type ("indeterminate", "inconclusive", or "not feasible") is a valid verification outcome and should be included where necessary.
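
To make the point allocation concrete, the sketch below records the category maxima listed above and sums the points awarded for one case. The dictionary layout, the names `RUBRIC_MAX` and `quality_score`, and the sample awarded points are hypothetical; the criteria do not specify how the fact-checkers record their scores.

```python
from typing import Mapping

# Category maxima taken from the rubric above; the key names are assumptions.
RUBRIC_MAX = {
    "summary_classification": 10,
    "verified_evidence": 65,   # 15 + 15 + 15 + 10 + 10
    "forensic_analysis": 20,
    "evidence_findings": 5,
    "clarity_structure": 10,
}

# Consistency check: the categories add up to the stated 110 points per case.
assert sum(RUBRIC_MAX.values()) == 110


def quality_score(awarded: Mapping[str, float]) -> float:
    """Sum awarded points, rejecting unknown categories and out-of-range values.

    Categories missing from `awarded` contribute zero points.
    """
    total = 0.0
    for category, points in awarded.items():
        if category not in RUBRIC_MAX:
            raise KeyError(f"unknown category: {category}")
        if not 0 <= points <= RUBRIC_MAX[category]:
            raise ValueError(f"{category}: {points} is outside 0..{RUBRIC_MAX[category]}")
        total += points
    return total


if __name__ == "__main__":
    # Hypothetical awarded points for a single case.
    print(quality_score({"verified_evidence": 50, "forensic_analysis": 15}))
```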

Verification Summarization

The summarization subtask will be evaluated by a jury committee based on the following criteria:

Clarity: The summary should be well-structured, easy to read, and clearly convey the verification findings.
Conciseness: The summary should be brief while still covering the essential points of the verification.
Readability: The language should be accessible to a general audience, avoiding technical jargon.
Accuracy: The summary should correctly reflect the key findings from the detailed verification report.

The jury committee will assess submissions to ensure they effectively communicate the verification findings in a clear, concise, and accessible manner.
