Day-to-Day Drinking: Detecting Drunk Face Using Real Image Datasets
Explored the potential for real-world application and research based on real-life data.
Grew as a project advisor through hands-on experience.
Alcohol-related Accident Issues
Alcohol-related accidents such as drunk driving, violence, and industrial accidents are significant societal problems, and regulations are needed to prevent them.
Traditional detection methods, such as breath alcohol concentration (BrAC) testers, are unhygienic, time-inefficient, and prone to false positives in non-drinkers.
Limitations of Existing Studies
Previous alcohol detection studies have primarily used infrared images or audio-visual datasets.
These studies often rely on images taken in controlled environments, limiting their applicability in real-world scenarios.
Develop a CNN-based classification model capable of detecting alcohol consumption using facial images captured from various angles in everyday life.
Build a dataset that distinguishes intoxicated from non-intoxicated states, categorize the images by facial angle, and compare and analyze the performance of each model across angle configurations.
Collected alcohol-related videos from YouTube using search terms like "drunk" and "drinking alcohol."
Extracted a total of 4,057 facial images from 117 videos to build the dataset.
Labeled images as either "intoxicated" or "non-intoxicated."
Categorized images by facial angle:
Label 1 (front)
Label 2 (slightly tilted front)
Label 3 (side view)
Used OpenCV and dlib libraries for facial recognition and feature point detection.
Calculated the facial angle based on central coordinates of the face, jawline, and eyes, and classified and aligned images accordingly.
Removed images tilted more than 10 degrees, then separated front-facing from side-facing images to construct the preprocessed dataset.
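A rough sketch of this preprocessing step is shown below, assuming dlib's 68-point landmark model; the angle formulas, the Label 1/2/3 thresholds, the frame-sampling rate, and the file names are illustrative assumptions, not the exact values used in the project.

```python
# Sketch of the angle-based preprocessing, assuming dlib's 68-point landmark
# model. Angle formulas, label thresholds, and paths are illustrative only.
import math
import os
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def classify_face(image_bgr):
    """Return (angle_label, face_crop) or None if no usable face is found."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None
    face = faces[0]
    pts = predictor(gray, face)

    def center(idxs):  # mean coordinate of a group of landmarks
        xs = [pts.part(i).x for i in idxs]
        ys = [pts.part(i).y for i in idxs]
        return sum(xs) / len(idxs), sum(ys) / len(idxs)

    # Roll: tilt of the line joining the two eye centers (landmarks 36-47).
    rx, ry = center(range(36, 42))
    lx, ly = center(range(42, 48))
    roll = math.degrees(math.atan2(ly - ry, lx - rx))
    if abs(roll) > 10:            # drop images tilted more than 10 degrees
        return None

    # Yaw proxy: offset of the nose tip (landmark 30) from the midpoint of
    # the jawline endpoints (landmarks 0 and 16), relative to face width.
    nose_x = pts.part(30).x
    jaw_mid_x = (pts.part(0).x + pts.part(16).x) / 2
    face_width = abs(pts.part(16).x - pts.part(0).x)
    yaw_ratio = abs(nose_x - jaw_mid_x) / max(face_width, 1)

    if yaw_ratio < 0.05:
        label = 1                 # front
    elif yaw_ratio < 0.15:
        label = 2                 # slightly tilted front
    else:
        label = 3                 # side view

    top, left = max(face.top(), 0), max(face.left(), 0)
    crop = image_bgr[top:face.bottom(), left:face.right()]
    return label, crop

# Example use: sample frames from a downloaded clip and save classified crops.
cap = cv2.VideoCapture("drunk_clip_001.mp4")   # hypothetical local file
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % 30 == 0:                    # roughly one frame per second
        result = classify_face(frame)
        if result is not None:
            label, crop = result
            os.makedirs(f"dataset/label{label}", exist_ok=True)
            cv2.imwrite(f"dataset/label{label}/frame_{frame_idx}.jpg", crop)
    frame_idx += 1
cap.release()
```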
Total number of extracted images: 4,057
After angle-based preprocessing, 3,215 images remained (Intoxicated: 1,557, Non-intoxicated: 1,658)
Data distribution by angle:
Label 1 (Front): Intoxicated 680, Non-intoxicated 813
Label 2 (Tilted Front): Intoxicated 126, Non-intoxicated 94
Label 3 (Side): Intoxicated 751, Non-intoxicated 751
CNN-based models
InceptionV3, ResNet152, DenseNet201, EfficientNetB7
Used models pre-trained on ImageNet.
Resized input images to 224x224 (299x299 for InceptionV3).
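A minimal transfer-learning setup along these lines is sketched below, assuming a Keras/TensorFlow pipeline; the classification head, optimizer, and hyperparameters are assumptions for illustration, not the study's exact configuration.

```python
# Minimal transfer-learning sketch, assuming a Keras/TensorFlow pipeline.
# The head, optimizer, and hyperparameters are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet201  # swap for InceptionV3, ResNet152, EfficientNetB7

def build_model(input_size=224):
    base = DenseNet201(
        weights="imagenet",                        # ImageNet pre-trained weights
        include_top=False,                         # drop the 1000-class ImageNet head
        input_shape=(input_size, input_size, 3),   # use 299 for InceptionV3
    )
    x = layers.GlobalAveragePooling2D()(base.output)
    x = layers.Dropout(0.3)(x)
    out = layers.Dense(1, activation="sigmoid")(x)  # intoxicated vs. non-intoxicated
    model = models.Model(base.input, out)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_model(224)
```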
Label 1 (Front Data)
Label 3 (Side Data)
Label 1 + Label 3 (Front + Side Data)
Ktrain (Combined Data of Label 1, 2, and 3)
Overview of experiments
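The four configurations could be assembled roughly as follows, assuming the preprocessed crops are stored in per-angle folders with class subfolders (a hypothetical layout); this helper is illustrative, not the project's actual data-loading code.

```python
# Sketch of assembling one experiment configuration (e.g. Label 1 + Label 3),
# assuming a hypothetical layout: dataset/label{1,2,3}/{intoxicated,non_intoxicated}/*.jpg
import tensorflow as tf

def load_split(angle_labels, image_size=224, batch_size=32):
    datasets = [
        tf.keras.utils.image_dataset_from_directory(
            f"dataset/label{a}",
            image_size=(image_size, image_size),
            batch_size=batch_size,
            label_mode="binary",       # intoxicated vs. non-intoxicated
        )
        for a in angle_labels
    ]
    ds = datasets[0]
    for extra in datasets[1:]:
        ds = ds.concatenate(extra)     # merge angle subsets into one stream
    return ds.shuffle(1000)            # batch-level shuffle is enough for a sketch

front_and_side = load_split([1, 3])    # the "Label 1 + Label 3" configuration
```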
Model Performance by Angle:
Using only front data (Label 1): Average accuracy of 69.77%
Using only side data (Label 3): Average accuracy of 73.73%
Using both front and side data (Label 1 + Label 3): Average accuracy of 78.79%
Using all angle data (Ktrain): Average accuracy of 82.75%
Model Performance Comparison:
DenseNet201 achieved the highest accuracy of 86.12% when using all angle data.
InceptionV3 and EfficientNetB7 also performed well, at 81.87% and 79.27% accuracy respectively, while ResNet152 showed relatively lower performance.
Result of drunk face detection
[Research Outcomes and Contributions]
Built a dataset for alcohol detection that includes facial images from various angles collected in real-world scenarios.
Demonstrated through data preprocessing and model training that a dataset containing facial images from multiple angles is more effective for alcohol detection than using only front-facing images.
[Limitations and Future Research Directions]
Lack of Ethnic Diversity in the Dataset:
The current dataset primarily contains images of Western faces. Adding data from other ethnicities is necessary to improve the model's generalization performance.
Need for Real-time Alcohol Detection Research:
This study focused on image-based detection, but future research can expand to real-time alcohol detection using video data from sources like CCTV.
Potential Application in Various Fields:
Beyond alcohol detection, the model could be developed to address other societal issues, such as detecting drug use or fatigue.