ELLAR: An Action Recognition Dataset
for Extremely Low-Light Conditions
with Dual Gamma Adaptive Modulation
ACCV 2024
Abstract
In this paper, we address the challenging problem of action recognition in extremely low-light environments. Currently available datasets built under low-light settings are not truly representative of extremely dark conditions because they retain a sufficient signal-to-noise ratio, making them visible with simple low-light image enhancement methods. Due to the lack of datasets captured under extremely low-light conditions, we present a new dataset with more than 12K video samples, named Extremely Low-Light condition Action Recognition (ELLAR). This dataset is constructed to reflect the characteristics of extremely low-light conditions, where the visibility of videos is corrupted by overwhelming noise and blur. ELLAR also covers a diverse range of dark settings within the scope of extremely low-light conditions. Furthermore, we propose a simple yet strong baseline method that leverages a Mixture of Experts for gamma intensity correction, enabling models to adapt flexibly to a range of low illuminance levels. Our approach surpasses state-of-the-art results by 3.39% top-1 accuracy on the ELLAR dataset.
The dataset and code will be made publicly available at: https://github.com/knu-vis/ELLAR
Introduction
Pixel distributions and signal-to-noise ratio across various low-light datasets: the line graph depicts pixel intensity distributions, with the x-axis representing pixel values from 0 to 255, plotted on a logarithmic scale to highlight values near 0. The y-axis represents the frequency of each pixel value, also on a logarithmic scale. The bar graph illustrates the signal-to-noise ratio for the same datasets; higher values indicate more signal relative to noise in an image.
Two rows of 10 images show the original images from the datasets and the same images with Gamma Intensity Correction (GIC) applied, respectively.
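Gamma Intensity Correction as used above maps normalized pixel intensities through a power curve, brightening dark regions when gamma is greater than 1. A minimal sketch (the function name and the sample values are illustrative, not from the paper):

```python
import numpy as np

def gamma_intensity_correction(image: np.ndarray, gamma: float) -> np.ndarray:
    """Apply GIC: out = in ** (1 / gamma) on intensities normalized to [0, 1].

    gamma > 1 brightens dark regions; gamma = 1 leaves the image unchanged.
    """
    normalized = image.astype(np.float32) / 255.0
    corrected = np.power(normalized, 1.0 / gamma)
    return (corrected * 255.0).clip(0, 255).astype(np.uint8)

# A nearly-black 4x4 patch becomes visibly brighter after correction.
dark = np.full((4, 4), 10, dtype=np.uint8)
bright = gamma_intensity_correction(dark, gamma=3.0)
```

Note that GIC only rescales intensities: on extremely dark frames it also amplifies sensor noise, which is why a single fixed gamma is insufficient across illuminance levels.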
ELLAR Dataset
Extremely Low-Light condition Action Recognition Dataset
The ELLAR dataset is divided into two parts based on the illumination of the recording locations: LL (Low Light) and ELL (Extremely Low Light). Annotation files for both are available at the link above.
Method
The core idea of DGAM (Dual Gamma Adaptive Modulation) is its dual Mixture of Experts structure. This structure first identifies the characteristics of each sample and then performs the adaptive image enhancement that is optimal for action recognition. To this end, we design a gating network named Adaptive Gamma Correction (AGC) that selects and applies the optimal GIC based on the illuminance information of the sample. A matching classification head, chosen by Adaptive Head Selection (AHS), then recognizes features specific to the enhanced input. This dual Mixture of Experts design allows the action recognition model to respond dynamically to inputs from diverse dark settings.
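The routing logic above can be sketched as follows. This is a toy illustration of the dual-MoE idea, not the authors' implementation: the gamma candidates, the hand-crafted illuminance thresholds standing in for the learned AGC gating network, and the per-expert heads are all assumptions.

```python
import numpy as np

GAMMA_EXPERTS = [1.0, 2.0, 4.0]  # candidate gamma values (illustrative only)

def agc_gate(frame: np.ndarray) -> int:
    """Toy gating rule: pick an expert index from mean illuminance.

    In DGAM, a learned gating network (AGC) replaces these fixed thresholds.
    """
    mean_intensity = frame.astype(np.float32).mean() / 255.0
    if mean_intensity > 0.2:
        return 0   # bright enough: mild correction
    if mean_intensity > 0.05:
        return 1   # moderately dark
    return 2       # extremely dark: strong correction

def dgam_forward(frame: np.ndarray, heads):
    """Route the frame through the selected gamma expert and its paired head (AHS)."""
    idx = agc_gate(frame)
    enhanced = np.power(frame.astype(np.float32) / 255.0, 1.0 / GAMMA_EXPERTS[idx])
    return heads[idx](enhanced)   # each head is specialized for its expert's output
```

Pairing each gamma expert with its own classification head means the recognizer never has to generalize across enhancement regimes; each head only ever sees inputs corrected by its matched gamma.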
Results
Comparison to the state of the art on the ELLAR dataset
Contact
Your feedback, questions, and suggestions are always welcome.
Feel free to reach out to us anytime!
✉️haminse@knu.ac.kr ✉️bwg7408@knu.ac.kr ✉️flora8207@knu.ac.kr