This project addresses the challenge of estimating scene illumination in images affected by complex and mixed lighting conditions. Traditional color constancy methods rely on global color statistics, which often break down when multiple light sources are present or when dominant colors bias the estimation. To overcome this limitation, I explored a semantic approach that leverages object-level information to better infer the true chromatic properties of a scene.
The core idea behind the system is that different objects exhibit characteristic color distributions that can be exploited to estimate illumination more accurately. I implemented a semantic color constancy pipeline that uses object-level predictions from a fine-tuned YOLOv8 model to identify and segment semantically meaningful regions within an image. For each detected object, chromaticity priors were modeled using voxel-based clustering of color distributions, allowing the system to remain robust even in cluttered or visually complex scenes.
These object-specific estimates were then combined using a semantic consensus algorithm to produce a global illumination estimate for the entire image. By aggregating information across multiple semantic regions, the pipeline avoids over-reliance on any single object or color cue, leading to more stable and accurate illumination correction.
The system demonstrated strong performance in both accuracy and efficiency. It achieved a mean angular error of 5.7 degrees, representing a 63% improvement over a traditional baseline approach. Despite incorporating semantic reasoning, the full pipeline runs in approximately 130 milliseconds on a CPU, showing that meaningful semantic enhancements can be achieved without requiring GPU acceleration.
The effectiveness of the approach is visually illustrated through a three-stage evaluation. Starting from an original image, a synthetic illumination tint is applied to simulate challenging lighting conditions. The final corrected output shows that the pipeline successfully removes the imposed color cast and restores the scene’s original appearance, validating the robustness of the semantic correction strategy.