ICDAR 2026, Vienna (Austria), Aug 31 - Sep 2
The Sci-ImageMiner competition defines four complementary tasks that evaluate multimodal understanding of scientific figures in Atomic Layer Deposition and Etching (ALD/E) research. Task 1 targets chart-type Classification, Task 2 focuses on Data Table Extraction from plots, and Task 3 assesses Summarization of key scientific insights conveyed by figures. Task 4 introduces Visual Question Answering across four scientific reasoning categories, requiring systems to interpret complex visual and textual elements. Together, these tasks provide a comprehensive benchmark for visual scientific reasoning in a highly specialized domain.
For the competition, we define four complementary image-comprehension tasks grounded in ALD/E scientific figures. The tasks are briefly described as follows:
Task 1: Classification
A supervised multi-class image classification task. Systems must assign each figure to one of the 47 predefined figure classes.
Inputs
A scientific figure extracted from an ALD/E research paper.
Optional metadata, such as figure caption text, may be provided as context.
Output
A predicted class label corresponding to one of the 23 predefined chart types.
Band Diagram
Task 2: Data Table Extraction
This task focuses on structured reconstruction of the underlying tabular data encoded in scientific charts. Systems must identify column/field labels, table structure, and the textual or numerical values in each cell, producing a machine-readable Markdown representation of the data shown in the quantitative chart.
Input
A scientific chart or plot image (e.g., bar chart, line chart, scatter plot, spectra chart, phase diagram).
Output
A Markdown-formatted table containing:
Field/column names of the visualized entities
Extracted textual and/or numeric cell values
| Time (s) | Mass Change (ng/cm²) |
|---|---|
| 0 | 0 |
| 2000 | -500 |
| 4000 | -1000 |
| 6000 | -1500 |
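As a rough illustration of how a Markdown table like the one above could be checked for well-formedness before submission, here is a minimal Python sketch. The function names (`parse_markdown_table`, `_split_row`, `_to_number`) are hypothetical and not part of any official competition tooling. The sketch assumes a simple pipe-delimited table with a header row and a dash separator row, and it converts numeric cells to floats purely for convenience.

```python
import re


def _split_row(line):
    """Split one pipe-delimited Markdown row into stripped cell strings."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def _to_number(cell):
    """Return a float when the cell parses as a number, else the original text."""
    try:
        return float(cell)
    except ValueError:
        return cell


def parse_markdown_table(text):
    """Parse a simple Markdown table into (header, rows).

    Assumption: the first line is the header and the second is the separator.
    Each row becomes a dict keyed by column name.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header row and a separator row")
    header = _split_row(lines[0])
    separator = _split_row(lines[1])
    if len(separator) != len(header) or not all(
        re.fullmatch(r":?-+:?", cell) for cell in separator
    ):
        raise ValueError("second line is not a valid separator row")
    rows = []
    for line in lines[2:]:
        cells = _split_row(line)
        if len(cells) != len(header):
            raise ValueError(
                f"row has {len(cells)} cells, expected {len(header)}: {line!r}"
            )
        rows.append({name: _to_number(cell) for name, cell in zip(header, cells)})
    return header, rows


example = """\
| Time (s) | Mass Change (ng/cm²) |
|---|---|
| 0 | 0 |
| 2000 | -500 |
| 4000 | -1000 |
| 6000 | -1500 |
"""

header, rows = parse_markdown_table(example)
print(header)
print(rows[1])
```

Rejecting rows whose cell count differs from the header is a design choice for catching malformed output early. Whether the official evaluation tolerates such rows is not stated here.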
Task 3: Summarization
The goal of this task is to generate concise, factual summaries that capture the key trends, relationships, and scientific insights presented in the figure. Systems must demonstrate accurate semantic interpretation grounded in the visual content, possibly supported by the caption when available.
Input
A scientific chart or plot image
Figure caption text (optional).
Output
A short textual summary (1–3 sentences) describing the main trends and takeaways.
Polar heatmap of Al₂O₃ etch rate distribution (nm/cycle) across a wafer, with radial axis (0–3 cm from center) and angular labels (0°–360°). Rates peak centrally at 0.139 nm/cycle (red contours) and decrease radially to 0.131–0.133 nm/cycle at the edge (blue), forming a ~6% gradient with mild azimuthal modulation (slight lobes <5%). This reveals high overall isotropy, typical of optimized ALE for uniform thin-film removal.
Task 4: Visual Question Answering
This task tests fine-grained reasoning over scientific figures by requiring systems to answer natural-language questions that reference the visual content, including axes, legends, and data patterns. The VQA task is divided into four scientifically meaningful sub-tasks:
Assesses understanding of ALD/E cycles, precursor chemistry, and reaction mechanisms.
Evaluates reasoning about how experimental variables (temperature, cycle count, pulse length, etc.) influence outcomes such as growth rate or film thickness.
Tests the ability to link precursor families or chemical structures to material properties such as thermal stability or growth characteristics.
Sub-task 4.4: Application/Performance
Measures reasoning related to device-relevant outcomes such as luminescence behavior or photovoltaic performance.
Inputs
A scientific chart or plot image
A natural-language question (e.g., "At what temperature does the maximum growth rate occur?")
Outputs
Yes/No: "yes" or "no"
Factoid: a textual term (e.g., "O₂ plasma")
List: comma-separated values (order-insensitive)
Paragraph: ≥ 3 sentences providing an explanatory answer
Question type: Comparative/Trend
Question: For the etching of aluminum based material, does it matter which type of material it is (i.e. nitride/oxide)?
Answer type: Paragraph
Answer: No it does not seem to matter. Both Al2O3 and AlNx have similar etch results.
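The four answer formats listed under Outputs can be sanity-checked with a small amount of code. The following Python sketch is only an illustration under stated assumptions. The function names (`normalize_answer`, `count_sentences`), the answer-type strings, and the case-insensitive handling of list items are all hypothetical choices, not the competition's official scoring procedure. The sentence counter is a naive heuristic that splits on terminal punctuation followed by whitespace.

```python
import re


def count_sentences(text):
    """Naive sentence count: split on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def normalize_answer(answer, answer_type):
    """Normalize a raw answer string according to its declared answer type.

    Assumption: answer_type is one of "yes/no", "factoid", "list", "paragraph".
    """
    text = answer.strip()
    if answer_type == "yes/no":
        value = text.lower().rstrip(".")
        if value not in {"yes", "no"}:
            raise ValueError(f"expected 'yes' or 'no', got {answer!r}")
        return value
    if answer_type == "factoid":
        return text
    if answer_type == "list":
        # Order-insensitive: represent the items as a set.
        # Lowercasing items is an assumption made for this sketch.
        return frozenset(
            item.strip().lower() for item in text.split(",") if item.strip()
        )
    if answer_type == "paragraph":
        return text
    raise ValueError(f"unknown answer type: {answer_type!r}")


# Lists compare equal regardless of item order.
same_list = normalize_answer("O₂ plasma, H2O", "list") == normalize_answer(
    "H2O, O₂ plasma", "list"
)
print(same_list)

# A paragraph answer can be checked against the three-sentence minimum.
paragraph = "Growth rate rises with temperature. It then saturates. It finally drops."
print(count_sentences(paragraph) >= 3)
```

Representing list answers as a `frozenset` makes the order-insensitive comparison a plain equality check. It also collapses duplicate items, which may or may not match the official evaluation.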