Why did we create this dataset?

The motivation for this dataset stems from our experience deploying anomaly detection models in industry projects. We often rely on open-source computer vision models whose claims of robustness and performance rest on extensive testing and verification against large datasets and benchmarks, as reported in their research papers.

However, in real-world deployment we frequently encounter unexpected scenarios in which these models break down. The issues range from outright failures on backgrounds overexposed by intense lighting to false detections of background objects that are unrelated to the primary target. These practical challenges highlight a disconnect between the expected model performance and the model's actual behavior in real-world conditions.

The core problem lies in the absence of suitable datasets that address this specific challenge. While numerous open datasets are available for various tasks, there is a noticeable gap when it comes to evaluating model robustness against background variations. In practice, what truly matters is not just the ability of the model to detect objects under ideal conditions but also its resilience to variations in the background while maintaining accurate object recognition. This aspect of within-class robustness becomes particularly crucial in real production-level applications.

Our primary objective in creating this dataset is to bridge the gap between the academic and industrial worlds. We aim to provide a valuable benchmark that aligns more closely with real-world scenarios encountered in industrial applications. By offering a dataset where the target object remains consistent while the background introduces noise or undergoes alterations, we enable researchers and practitioners to evaluate and enhance the true robustness of their models in context.

In essence, this dataset serves as a critical tool to help the industry and academia collaborate effectively, ensuring that computer vision models are not only accurate in controlled settings but also capable of withstanding the complexities and uncertainties of real-world environments.

What tools did we use to generate these images?

We employed a combination of tools and methodologies to generate the images in this dataset, ensuring both efficiency and quality in the annotation and synthesis processes.

1. IoG Net: Initially, we utilized the IoG Net, which played a foundational role in our image generation pipeline.

2. Polygon Faster Labeling Tool: To facilitate the annotation process, we developed a custom Polygon Faster Labeling Tool, streamlining the labeling of objects within the images.

3. AnyLabeling Open-source Project: We also experimented with the AnyLabeling open-source project, exploring its potential for our annotation needs.

4. V7 Lab Tool: Eventually, we found that the V7 Lab Tool provided the most efficient labeling speed and delivered high-quality annotations. As a result, we standardized the annotation process using this tool.

5. Data Augmentation: To synthesize additional images, we relied on image-processing and machine-learning libraries, including OpenCV and scikit-learn. These tools allowed us to augment and manipulate images effectively, creating a diverse range of backgrounds and variations; a hedged compositing sketch is shown after this list.

6. GenAI: Our dataset includes images generated using the Stable Diffusion XL model, along with versions 1.5 and 2.0 of the Stable Diffusion model. These generative models played a pivotal role in crafting realistic and varied backgrounds.
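
As an illustration of the augmentation step (item 5), here is a minimal sketch of compositing a labeled object onto a new background with OpenCV. The file names, polygon coordinates, and jitter values are hypothetical placeholders; the actual augmentation pipeline used for this dataset is not reproduced here.

```python
import cv2
import numpy as np

# NOTE: file names and the polygon below are illustrative only; they do not
# reflect the dataset's actual annotation format.
foreground = cv2.imread("object.png")          # image containing the target object
background = cv2.imread("new_background.png")  # replacement background

# Polygon annotation of the object as (x, y) vertices, e.g. exported from a labeling tool.
polygon = np.array([[120, 80], [340, 80], [340, 300], [120, 300]], dtype=np.int32)

# Build a binary mask from the polygon and soften its edge slightly.
mask = np.zeros(foreground.shape[:2], dtype=np.uint8)
cv2.fillPoly(mask, [polygon], 255)
mask = cv2.GaussianBlur(mask, (7, 7), 0)

# Resize the background to match the foreground, then alpha-composite the object onto it.
background = cv2.resize(background, (foreground.shape[1], foreground.shape[0]))
alpha = mask.astype(np.float32)[..., None] / 255.0
composite = (alpha * foreground + (1.0 - alpha) * background).astype(np.uint8)

# Simple photometric jitter (brightness/contrast) applied to the composite image.
augmented = cv2.convertScaleAbs(composite, alpha=1.2, beta=15)

cv2.imwrite("composite.png", augmented)
```

Varying only the background image and the photometric parameters in a loop like this is one straightforward way to produce many background variants of the same object.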

For a detailed breakdown of our prompt engineering and hyperparameters, we invite you to consult our upcoming paper. This publication will provide comprehensive insights into our methodologies, enabling a deeper understanding of the image generation process.
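
To give a sense of how such backgrounds can be produced, the following is a minimal sketch using the Hugging Face diffusers library with the SDXL base checkpoint. The prompt and hyperparameters below are illustrative placeholders, not the settings used for this dataset (those are detailed in the upcoming paper).

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Load the SDXL base pipeline; fp16 on GPU keeps memory usage manageable.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Placeholder prompt and hyperparameters for a single background image.
image = pipe(
    prompt="an empty industrial workbench under harsh overhead lighting, photorealistic",
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]

image.save("generated_background.png")
```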

What value can this dataset bring to the ML and Explainable AI communities?

After participating in the nearly nine-hour Synthetic Data Workshop at NeurIPS 2022, a few questions were still firmly stuck in my mind!

During the Fairness Panel discussion moderated by Kim Branson from GSK UK, Prof. Mihaela van der Schaar raised a question: “Do you see any publicly well-known benchmark that focuses on synthetic data fairness topics? I mean, not limited only to synthetic data… an overall fairness benchmark dataset.”


Questions like that point to the same gap we observed in industry: a shortage of public benchmarks built to probe specific model behaviors. This dataset is a valuable addition to the Explainable AI (XAI) community. It addresses the scarcity of datasets tailored for XAI research by offering a meticulously curated collection of nearly 100,000 images. These images feature both synthetic GenAI backgrounds and real-world backgrounds, enabling researchers to explore model robustness in the face of background variations.
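
As one example of how such an exploration might look, below is a hedged sketch that checks whether a classifier's prediction for a given object stays stable across its background variants. The directory layout ("<object_id>/<variant>.png") and the `classify` stub are assumptions for illustration only, not the dataset's actual on-disk structure.

```python
from collections import Counter
from pathlib import Path

def classify(image_path: Path) -> str:
    """Placeholder for any model under test; returns a predicted class label."""
    raise NotImplementedError

def within_class_robustness(dataset_root: str) -> float:
    """Fraction of objects whose predicted label is identical across all background variants."""
    object_dirs = [d for d in Path(dataset_root).iterdir() if d.is_dir()]
    stable_objects = 0
    for object_dir in object_dirs:
        # Assumption: all images in one folder show the same object on different backgrounds.
        predictions = [classify(p) for p in sorted(object_dir.glob("*.png"))]
        # Count the object as "stable" only if every background variant receives the same label.
        if len(Counter(predictions)) == 1:
            stable_objects += 1
    return stable_objects / max(len(object_dirs), 1)
```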

Key contributions of this dataset include:

- Nearly 100,000 images pairing consistent target objects with both GenAI-synthesized and real-world backgrounds.
- A benchmark for within-class robustness: whether a model keeps recognizing the same object when only the background changes.
- A bridge between academic benchmarks and the background variations actually encountered in industrial, production-level deployments.

What's next?

More findings around Stable Diffusion will be shared soon!