XIMAGENET-12: An Explainable Visual Benchmark Dataset for Robustness Evaluation
A Dataset for Image Background Exploration
Mission of the project
Welcome to the Explanatory AI Synthetic Dataset, where we delve into the significant role of backgrounds in enhancing object recognition tasks.
Our research builds on "Noise or Signal: The Role of Image Backgrounds in Object Recognition" (Xiao et al., ICLR 2021) and "Explainable AI: Object Recognition With Help From Background" (Qiang et al., ICLR Workshop 2022), which reinforced the notion that models trained solely on backgrounds can reach substantial accuracy. One noteworthy finding from these studies is that more accurate models tend to rely less on backgrounds.
We categorize the elements present in the background into two main domains:
Class-Independent Factors: These elements are not specific to certain classes and exhibit properties commonly found across the entire image dataset, such as colors and background edges.
Class-Dependent Factors: These elements are tied to the specific class depicted in the image. Examples include shadows, reflections, land or sea backgrounds, non-target classes that often co-occur with the target class, and the size of the class object relative to the background.
Our current focus is on understanding how image backgrounds influence computer vision models, particularly in detection and classification. Inspired by "Explainable AI: Object Recognition With Help From Background" (Qiang et al., ICLR Workshop 2022), we aim to expand our dataset and explore the following topics:
Blur Background
Segmented Background
AI-generated Background
Bias of Tools During Annotation
Color in Background
Random Background with Real Environment
Furthermore, we've devised a mathematical equation for the Robustness Score Scheme based on our dataset.
If you are interested in collaborating with us or learning more about our research project, please don't hesitate to reach out. Your contributions and insights are highly valued as we continue to advance our understanding of the intricate relationship between image backgrounds and AI models.
Dataset Download (Coming Soon)
Reference for existing SOTA Benchmark Code
GenAI Labeling Tool Used:
Blur the Background
Blur the Foreground
Synthesize Background Color
Segment Background
Random Background with Real Environment
GenAI generated Background
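To make the background operations listed above more concrete, here is a minimal pure-Python sketch of two of them: filling the background with a solid color and blurring the background. It is an illustration only, not the project's labeling tool. It assumes a binary foreground mask is already available, and the function names (`fill_background`, `box_blur`, `blur_background`) and the naive box blur are hypothetical choices made for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]   # rows of (R, G, B) pixels
Mask = List[List[int]]      # assumed convention: 1 = object, 0 = background


def fill_background(image: Image, mask: Mask, color: Pixel) -> Image:
    """Replace every background pixel with a solid color; keep the object."""
    return [
        [px if m else color for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def box_blur(image: Image, radius: int = 1) -> Image:
    """Naive box blur: each pixel becomes the mean of its clipped neighborhood."""
    h, w = len(image), len(image[0])
    out: Image = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [image[j][i] for j in ys for i in xs]
            row.append(tuple(sum(p[c] for p in window) // len(window)
                             for c in range(3)))
        out.append(row)
    return out


def blur_background(image: Image, mask: Mask, radius: int = 1) -> Image:
    """Blur the whole image, then paste the sharp object back using the mask.

    Simplification: object colors bleed into nearby background pixels,
    because the blur is computed before the object is pasted back.
    """
    blurred = box_blur(image, radius)
    return [
        [px if m else bpx for px, bpx, m in zip(img_row, blur_row, mask_row)]
        for img_row, blur_row, mask_row in zip(image, blurred, mask)
    ]
```

Swapping the roles of object and background in the mask gives a "blur the foreground" variant. For real images, an array library and a Gaussian blur would be the more usual choice; nested lists are used here only to keep the sketch dependency-free.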
Stay tuned for more updates! Our dataset currently comprises 12 classes and nearly 200K images, totaling more than 23 GB. Manually generating the GenAI backgrounds took more than one and a half years. We extend our heartfelt appreciation to all the contributors who dedicated their time and expertise to the labeling and generation work.
Questions?
Contact us to get more information on the project!