Real-Time Congestion Detection Challenge
Traffic management agencies are using drones to get a real-time bird’s-eye view of road networks. Unlike fixed traffic cameras, drones can cover large areas quickly and respond to incidents dynamically, providing up-to-the-minute data on vehicle flow (elistair.com). In this challenge, participants will analyze drone footage or images of roadways to monitor traffic conditions. The goal is to automatically detect traffic density or congestion and identify anomalies (accidents, unusual slowdowns) from aerial video feeds, enabling faster response to traffic jams or incidents.
Vehicle Detection – Build a model to detect vehicles in drone images or video frames. This could involve using an object detection model to draw bounding boxes around cars, trucks, motorcycles, etc. in each frame (a minimal end-to-end sketch of all four steps follows this list).
Traffic Density Estimation – Based on the detections, calculate metrics like the number of vehicles per frame or vehicles per road segment. Participants might classify each frame (or region of the frame) as “heavy”, “moderate”, or “light” traffic based on vehicle count or spacing.
Congestion/Incident Alert – Determine if a traffic jam or incident is present. For example, if vehicles are densely packed and mostly stationary in a segment of the video, flag that as a congestion event. If time allows, participants can also attempt to detect anomalies like a sudden stop (which might indicate an accident) or unusual behavior (vehicles on the wrong side).
Visualization – Output the results in an intuitive way. For instance, annotate the video with colored overlays on roads (green for smooth flow, red for jam). Or output a time series graph of vehicle count over time for a given intersection. Even a simple print-out like “Frame 200: Heavy congestion detected on highway on-ramp” adds value.
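To make the four steps above concrete, here is a minimal sketch assuming the ultralytics package and its pretrained yolov8n.pt model; the COCO class ids, the density thresholds, and the drone_traffic.mp4 filename are illustrative placeholders, not tuned values or challenge requirements.

```python
# Minimal sketch: detect vehicles per frame, classify traffic level, annotate the frame.
# Assumes `pip install ultralytics opencv-python`; thresholds and filenames are placeholders.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                      # small pretrained COCO model
VEHICLE_CLASSES = {2, 3, 5, 7}                  # COCO ids: car, motorcycle, bus, truck

def classify_density(count, heavy=25, moderate=10):
    # Placeholder thresholds; calibrate per scene / road segment.
    if count >= heavy:
        return "heavy"
    if count >= moderate:
        return "moderate"
    return "light"

cap = cv2.VideoCapture("drone_traffic.mp4")     # hypothetical sample clip
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)[0]
    vehicles = [b for b in results.boxes if int(b.cls) in VEHICLE_CLASSES]
    for b in vehicles:
        x1, y1, x2, y2 = map(int, b.xyxy[0])
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    level = classify_density(len(vehicles))
    cv2.putText(frame, f"{len(vehicles)} vehicles ({level})", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
    print(f"Frame {frame_idx}: {len(vehicles)} vehicles ({level} traffic)")
    frame_idx += 1
cap.release()
```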
VisDrone2019 Dataset – A large benchmark dataset of drone-based traffic footage, including 288 video clips and 10,209 static images with annotations for objects like vehicles and pedestrians (paperswithcode.com). This dataset contains diverse scenarios (urban intersections, highways, etc.) with bounding boxes for each vehicle, which can be used to train or test your vehicle detection algorithms; a small annotation-parsing sketch follows below.
(If focusing on a specific aspect, participants might subset the data. E.g., use a highway sequence from VisDrone for counting vehicles, or a public traffic CCTV dataset if the drone data is too large to work with. But VisDrone’s labeled drone images are ideal for this challenge.)
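If you use the VisDrone ground truth directly, a small parser can pull out just the vehicle boxes. The sketch below assumes the usual VisDrone-DET annotation layout (one comma-separated line per object: bbox left, top, width, height, score, category, truncation, occlusion) and an assumed category-id mapping; verify both against the dataset’s devkit/README before relying on them.

```python
# Sketch: load a VisDrone-DET annotation file and keep only vehicle boxes.
# The field order and category ids below are assumptions to double-check
# against the VisDrone devkit documentation.
from pathlib import Path

VEHICLE_CATEGORIES = {4, 5, 6, 9, 10}   # assumed ids: car, van, truck, bus, motor

def load_vehicle_boxes(annotation_file):
    boxes = []
    for line in Path(annotation_file).read_text().strip().splitlines():
        x, y, w, h, score, category, truncation, occlusion = map(int, line.split(",")[:8])
        if category in VEHICLE_CATEGORIES:
            boxes.append((x, y, w, h))
    return boxes

# Example (hypothetical path): count ground-truth vehicles in one annotated image
# print(len(load_vehicle_boxes("VisDrone2019-DET-train/annotations/0000001_00000_d_0000001.txt")))
```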
Detection & Counting Accuracy – Solutions will be evaluated on how well they detect vehicles (if your approach is detection-based). High precision and recall in identifying cars/trucks across varied scenes are important (a small matching sketch follows this list). If the output is a traffic level (e.g., “congested” vs “free flow”), the classification should match ground truth conditions in most cases.
Timeliness (Real-time Feasibility) – A strong solution should process frames quickly (near real-time for a reasonable video resolution). Judges may consider the frames-per-second processing rate of your solution, or at least its computational complexity. Simpler algorithms that run faster are appropriate for a one-day hack. If you use a heavy deep learning model, techniques like frame skipping or region-of-interest focusing to speed up processing could earn bonus points.
Scenario Robustness – The solution’s ability to handle different scenarios: day vs night (if the dataset includes both), urban roads vs highways, and different camera angles. While it’s okay to target a narrow scenario (e.g., daytime highway traffic), solutions that mention or demonstrate adaptability to other scenarios will be rated higher.
Alert Accuracy – If an “alert” or condition detection is part of the output (like flagging congestion or accidents), judges will look at false alarms vs missed events. It is better to over-alert slightly (a few false positives) than to miss a major congestion event. Clearly defining what constitutes “congestion” or an “incident” in your logic will help judges understand your criteria.
Clarity of Output – As with other challenges, presenting the traffic analysis clearly (visually or in a report) is important. A clean visualization (like drawing bounding boxes and maybe a counter on the frame, or highlighting the congested area) can make your solution stand out.
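If you want to quantify detection and counting accuracy yourself, a greedy IoU matcher is usually enough for a one-day hack. The sketch below shows one common way to do it; the 0.5 IoU threshold and the (x1, y1, x2, y2) box format are assumptions, not challenge requirements.

```python
# Sketch: per-frame detection precision/recall via greedy IoU matching,
# plus a simple vehicle-count error.

def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def frame_metrics(predicted, ground_truth, iou_threshold=0.5):
    matched = set()
    true_positives = 0
    for p in predicted:
        best_j, best_iou = None, 0.0
        for j, g in enumerate(ground_truth):
            overlap = iou(p, g)
            if j not in matched and overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / max(len(predicted), 1)
    recall = true_positives / max(len(ground_truth), 1)
    count_error = abs(len(predicted) - len(ground_truth))
    return precision, recall, count_error
```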
Object Detection Models – Pretrained YOLOv5/YOLOv8 models are well-suited for vehicle detection and are relatively lightweight to run. They can be fine-tuned on VisDrone images, which include vehicles (paperswithcode.com). Alternatively, one could use OpenCV’s HOG + SVM detector for cars, but modern YOLO/SSD models will be more accurate.
OpenCV for Video – Use OpenCV to read video frames from a file or stream. You can then feed frames to your detection model. OpenCV can also do background subtraction or optical flow if you opt for a motion-based approach (e.g., detecting regions where motion has stopped to identify a traffic jam); a motion-based sketch follows this list.
Tracking (optional) – For advanced teams, integrating a multi-object tracker (like SORT/DeepSORT) can help maintain vehicle counts consistently between frames (avoiding double counting the same car). However, this might be complex for one day. A simpler approach is counting per frame and averaging over a few frames.
Analytics – Python libraries like NumPy/Pandas can help aggregate counts over time (if computing flow rates). For visualization of counts or alerts, Matplotlib or Plotly could plot traffic flow over time.
Edge Processing – If aiming for real-time, note that downsizing images or focusing on regions (e.g., if the drone is static above an intersection, one can mask everything outside the roads) will reduce computation. Also, using a smaller model (like Tiny-YOLO) could be a strategic choice to ensure quick inference.
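As one concrete way to combine the motion-based idea, rolling-average counting, and the speed tricks above, the sketch below uses OpenCV’s MOG2 background subtractor with frame skipping, downscaling, and an ROI mask; the drone_traffic.mp4 filename, road polygon, skip rate, and thresholds are placeholders to tune per scene, and the per-frame vehicle count is left as a stub for whichever detector you choose.

```python
# Sketch: motion-based congestion check. Background subtraction estimates how much
# of the road area is moving; a rolling average of per-frame vehicle counts smooths noise.
# ROI polygon, skip rate, and thresholds are placeholders, not tuned values.
from collections import deque
import cv2
import numpy as np

cap = cv2.VideoCapture("drone_traffic.mp4")          # hypothetical sample clip
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, detectShadows=False)
recent_counts = deque(maxlen=10)                     # rolling window of counts
SKIP = 3                                             # process every 3rd frame for speed

roi = None
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame_idx += 1
    if frame_idx % SKIP:
        continue                                     # frame skipping for near real-time
    frame = cv2.resize(frame, None, fx=0.5, fy=0.5)  # downscale to cut computation
    if roi is None:                                  # mask everything outside the road
        roi = np.zeros(frame.shape[:2], dtype=np.uint8)
        road_polygon = np.array([[100, 300], [500, 300], [600, 700], [50, 700]])  # placeholder
        cv2.fillPoly(roi, [road_polygon], 255)
    motion_mask = cv2.bitwise_and(subtractor.apply(frame), roi)
    moving_fraction = (motion_mask > 0).sum() / max((roi > 0).sum(), 1)

    vehicle_count = 0                                # stub: plug in your detector's count
    recent_counts.append(vehicle_count)
    avg_count = sum(recent_counts) / len(recent_counts)

    # Heuristic: many vehicles but little motion inside the ROI suggests a jam.
    if avg_count > 20 and moving_fraction < 0.05:
        print(f"Frame {frame_idx}: possible congestion "
              f"(avg {avg_count:.0f} vehicles, {moving_fraction:.1%} of road moving)")
cap.release()
```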
Executable Code – Provide a Python script or notebook that ingests a sample video (or sequence of images) and produces the traffic analysis. It should be easy for judges to run on provided sample data (include instructions if any dependencies beyond the standard libraries are needed).
Demo Video/Images – If possible, include a short snippet of the drone video with your annotations overlaid (e.g., bounding boxes on cars, or colored lanes). A before-and-after comparison (raw vs annotated) frame or a side-by-side video makes it clear what your system is doing. If you cannot include video, a series of example frames with annotations and a brief explanation is fine.
Report – A brief report describing your approach: the algorithm/model used for detection, how you determined congestion (thresholds or logic), and the results. Note any assumptions (e.g., “assuming the camera is static and overlooking a 4-way intersection”). If you evaluated performance (say you counted vehicles and compared to ground truth counts on a test video), mention the accuracy.
Output Data – In addition to visualization, provide the raw outputs your code generates. For example, a CSV file of “timestamp vs vehicle_count” or a text log stating “Frame 50: 20 vehicles (Heavy Traffic)”. This shows the quantitative side of your solution in case judges want to inspect the numbers (see the small logging sketch after this list).
Discussion of Improvement – Since a one-day hack will yield prototypes, include a short discussion in your report or presentation on how you’d improve if given more time. For instance, using higher-end models, handling night-time imagery, integrating with a traffic alert system, etc. This shows you understand the broader context and limitations of your current solution.
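For the raw-output deliverable, something as small as the sketch below is enough: it writes a per-frame CSV log and a time-series plot, assuming you collected (frame, count, level) tuples during processing; the values, column names, and filenames shown are purely illustrative.

```python
# Sketch: dump per-frame results to CSV and plot vehicle count over time.
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative records; in practice, append these while processing the video.
records = [(0, 12, "moderate"), (30, 27, "heavy"), (60, 25, "heavy")]
df = pd.DataFrame(records, columns=["frame", "vehicle_count", "traffic_level"])
df.to_csv("traffic_log.csv", index=False)            # raw numbers for judges to inspect

plt.plot(df["frame"], df["vehicle_count"], marker="o")
plt.xlabel("Frame")
plt.ylabel("Vehicle count")
plt.title("Vehicle count over time")
plt.savefig("traffic_timeseries.png", dpi=150)
```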