SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection [CVPR'24] [Highlight] [paper]
Who Should Read This Paper
Researchers/engineers exploring class incremental learning (CIL) for object detection
Anyone looking to reduce catastrophic forgetting when adding new classes, without storing real old data
Practitioners in dynamic environments (e.g., surveillance, autonomous systems) who need to update object detectors continuously
What the Paper Covers
Using Stable Diffusion (SD): Leverages a text-to-image diffusion model (GLIGEN-enhanced) to generate synthetic images of previous classes (generation sketch after this list)
Iterative Refinement: Repeatedly validates synthetic images with the previously trained detector and regenerates or discards low-quality ones (filtering sketch below)
Pseudo-Labeling: Uses the old detector to label old-class objects that appear in new-task images, so they are not learned as background (pseudo-labeling sketch below)
L2 Distillation: Distills the old model's predictions on the synthetic images into the new model, boosting old-class retention (distillation sketch below)
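To make the generation step concrete, here is a minimal sketch of box-grounded synthesis with the Hugging Face diffusers StableDiffusionGLIGENPipeline. The checkpoint name, prompt, class names, and boxes are illustrative placeholders, not taken from the paper; SDDGR's actual prompting and box layout may differ.

```python
# Sketch: generate a synthetic image of old classes with box-grounded Stable Diffusion.
# Assumes the Hugging Face `diffusers` GLIGEN pipeline; classes/boxes are placeholders.
import torch
from diffusers import StableDiffusionGLIGENPipeline

pipe = StableDiffusionGLIGENPipeline.from_pretrained(
    "masterful/gligen-1-4-generation-text-box", torch_dtype=torch.float16
).to("cuda")

old_classes = ["dog", "bicycle"]                          # old-task classes (placeholder)
boxes = [[0.10, 0.20, 0.50, 0.80], [0.55, 0.30, 0.95, 0.90]]  # normalized xyxy boxes (placeholder)

images = pipe(
    prompt="a photo of a dog and a bicycle on a street",
    gligen_phrases=old_classes,          # one grounding phrase per box
    gligen_boxes=boxes,
    gligen_scheduled_sampling_beta=1.0,
    num_inference_steps=50,
).images
images[0].save("replay_sample.png")
```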
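A rough sketch of the iterative refinement loop, assuming a hypothetical generate_image function and an old_detector that returns (label, score, box) triples; the confidence threshold, retry cap, and exact acceptance criteria are assumptions, not the paper's values.

```python
# Sketch: keep a synthetic image only if the previously trained detector recognizes
# the intended old-class objects confidently; otherwise regenerate, then give up.
CONF_THRESH = 0.5     # assumed acceptance threshold
MAX_TRIES = 5         # assumed cap on regeneration attempts

def refine(prompt, phrases, boxes, generate_image, old_detector):
    for _ in range(MAX_TRIES):
        image = generate_image(prompt, phrases, boxes)
        preds = old_detector(image)  # list of (label, score, box) from the old model
        detected = {label for label, score, _ in preds if score >= CONF_THRESH}
        if all(p in detected for p in phrases):
            return image, boxes      # accepted: every intended object was found
    return None, None                # discarded after repeated failures
```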
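A sketch of the pseudo-labeling step under the same assumptions (hypothetical old_detector, assumed confidence threshold): old-class detections on new-task images are merged with the ground-truth annotations of the new classes.

```python
# Sketch: annotate old-class objects in new-task images with the old detector,
# so they are trained as foreground rather than background.
def pseudo_label(new_image, gt_new_annotations, old_detector, score_thresh=0.7):
    preds = old_detector(new_image)                       # (label, score, box) triples
    pseudo = [(label, box) for label, score, box in preds if score >= score_thresh]
    # combine new-class ground truth with old-class pseudo-labels
    return gt_new_annotations + pseudo
```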
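The distillation term can be expressed as an L2 (mean-squared) penalty between the frozen old model's outputs and the new model's outputs on the replay images; a minimal PyTorch sketch follows. Which outputs are distilled (logits, box regressions, or intermediate features) and how the term is weighted are assumptions here, not details from the paper.

```python
import torch
import torch.nn.functional as F

def l2_distillation_loss(new_outputs: torch.Tensor, old_outputs: torch.Tensor) -> torch.Tensor:
    # Mean-squared (L2) distance between the frozen old model's predictions
    # and the new model's predictions on the synthetic replay images.
    return F.mse_loss(new_outputs, old_outputs.detach())

# total_loss = detection_loss + lambda_distill * l2_distillation_loss(new_out, old_out)
```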
Real-World Applications
Scalable Model Updates: Ideal for scenarios where new object classes continuously appear, but storing all past data is costly
Limited Storage Environments: Eliminates the need for large replay buffers of real data, which is beneficial in edge or embedded systems
Continuous Deployment: Simplifies frequent re-training of detectors in fields like retail (new products), robotics, and industrial inspection
Key Strengths
No Real-Data Replay Needed: Achieves state-of-the-art incremental detection performance without storing previous real images
High-Fidelity Synthetic Data: Incorporates class-wise bounding boxes and iterative refinement to generate realistic images containing old-class objects
Effective Distillation: Uses an L2 distillation on synthetic data to maintain old knowledge while adding new classes
Robust Against Forgetting: Demonstrates significant gains on MS COCO benchmarks, outperforming existing class incremental object detection (CIOD) methods in both two-phase and multi-phase scenarios