Advanced Applied Deep Learning
Practice Course
Sheng Yun Wu
In Week 13, students will explore advanced object tracking techniques integrated with object detection. Object tracking involves continuously locating a detected object across frames in a video sequence. This week’s focus includes understanding single-object and multi-object tracking, implementing various tracking algorithms (such as SORT, DeepSORT, and Centroid tracking), and enhancing the tracking process by combining it with real-time object detection models.
Description:
This example introduces object tracking concepts, discussing the difference between single-object tracking (SOT) and multi-object tracking (MOT), and explaining how tracking can be combined with object detection.
No Code for this example – Theoretical Explanation
Single-object tracking (SOT): Tracks one object across frames, initialized by an object detector or by a manually selected bounding box.
Multi-object tracking (MOT): Tracks multiple objects simultaneously, requiring a more complex tracking mechanism.
Use cases: Surveillance, autonomous vehicles, robotics.
Description:
This example demonstrates how to implement single-object tracking using OpenCV’s built-in KCF (Kernelized Correlation Filter) tracker. The object is selected by hand in the first frame and then tracked across the following video frames.
import cv2

# Initialize video capture (device 0 is the default webcam)
cap = cv2.VideoCapture(0)

# Initialize KCF tracker (in most OpenCV builds this requires opencv-contrib-python)
tracker = cv2.TrackerKCF_create()

# Read first frame and select the object to track
ret, frame = cap.read()
if not ret:
    raise RuntimeError("Could not read a frame from the video source")
bbox = cv2.selectROI(frame, False)

# Initialize the tracker with the bounding box (x, y, width, height)
tracker.init(frame, bbox)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Update the tracker
    success, bbox = tracker.update(frame)
    if success:
        # Draw the bounding box
        x, y, w, h = [int(v) for v in bbox]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # Display the frame
    cv2.imshow('Single Object Tracking', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
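When the initial box comes from a detector instead of manual selection, the box format needs attention: detectors such as YOLOv5 report corner coordinates (x1, y1, x2, y2), while the OpenCV tracker is initialized with (x, y, width, height). A minimal conversion sketch follows. The helper name xyxy_to_xywh and the sample box are hypothetical and not part of OpenCV.

```python
def xyxy_to_xywh(box):
    """Convert a corner-format box (x1, y1, x2, y2) to (x, y, w, h)."""
    x1, y1, x2, y2 = box
    # Guard against swapped corners so width and height are never negative.
    left, right = min(x1, x2), max(x1, x2)
    top, bottom = min(y1, y2), max(y1, y2)
    return (int(left), int(top), int(right - left), int(bottom - top))


detector_box = (100, 150, 200, 250)  # hypothetical detection in corner format
bbox = xyxy_to_xywh(detector_box)
print(bbox)  # (100, 150, 100, 100)
```

The resulting tuple could be passed to tracker.init(frame, bbox) in place of the value returned by cv2.selectROI.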
Description:
This example introduces the Centroid Tracking algorithm, which uses the centroids of bounding boxes detected by an object detector to track multiple objects across frames.
from scipy.spatial import distance as dist
import numpy as np


class CentroidTracker:
    def __init__(self, maxDisappeared=50):
        self.nextObjectID = 0
        self.objects = {}        # objectID -> centroid
        self.disappeared = {}    # objectID -> consecutive frames without a match
        self.maxDisappeared = maxDisappeared

    def register(self, centroid):
        self.objects[self.nextObjectID] = centroid
        self.disappeared[self.nextObjectID] = 0
        self.nextObjectID += 1

    def deregister(self, objectID):
        del self.objects[objectID]
        del self.disappeared[objectID]

    def update(self, rects):
        # No detections: age every tracked object and drop the stale ones
        if len(rects) == 0:
            for objectID in list(self.disappeared.keys()):
                self.disappeared[objectID] += 1
                if self.disappeared[objectID] > self.maxDisappeared:
                    self.deregister(objectID)
            return self.objects

        # Compute the centroid of every detected bounding box
        inputCentroids = np.zeros((len(rects), 2), dtype="int")
        for (i, (startX, startY, endX, endY)) in enumerate(rects):
            cX = int((startX + endX) / 2.0)
            cY = int((startY + endY) / 2.0)
            inputCentroids[i] = (cX, cY)

        if len(self.objects) == 0:
            # Nothing is tracked yet: register every centroid
            for i in range(0, len(inputCentroids)):
                self.register(inputCentroids[i])
        else:
            objectIDs = list(self.objects.keys())
            objectCentroids = list(self.objects.values())

            # Distances between tracked centroids (rows) and new centroids (columns)
            D = dist.cdist(np.array(objectCentroids), inputCentroids)

            # Process tracked objects in order of their closest new centroid
            rows = D.min(axis=1).argsort()
            cols = D.argmin(axis=1)[rows]

            usedRows = set()
            usedCols = set()
            for (row, col) in zip(rows, cols):
                if row in usedRows or col in usedCols:
                    continue
                objectID = objectIDs[row]
                self.objects[objectID] = inputCentroids[col]
                self.disappeared[objectID] = 0
                usedRows.add(row)
                usedCols.add(col)

            unusedRows = set(range(0, D.shape[0])).difference(usedRows)
            unusedCols = set(range(0, D.shape[1])).difference(usedCols)

            # Tracked objects without a match have been missing for one more frame
            for row in unusedRows:
                objectID = objectIDs[row]
                self.disappeared[objectID] += 1
                if self.disappeared[objectID] > self.maxDisappeared:
                    self.deregister(objectID)

            # Detections without a match become new objects
            for col in unusedCols:
                self.register(inputCentroids[col])

        return self.objects


# Initialize tracker
ct = CentroidTracker()

# Example: detected bounding boxes in frame, as (startX, startY, endX, endY)
rects = [(100, 150, 200, 250), (300, 350, 400, 450)]

# Update the tracker with new bounding boxes
objects = ct.update(rects)
print(f"Tracked objects: {objects}")
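The association step inside update is easier to follow on a small worked case. The sketch below reproduces only the distance matrix and the greedy matching, using NumPy broadcasting instead of scipy's cdist. The centroid values are made up for illustration.

```python
import numpy as np

# Centroids of objects already tracked (rows 0 and 1) and of new detections.
tracked = np.array([[150, 200], [350, 400]])
detected = np.array([[353, 405], [155, 198], [600, 100]])

# Pairwise Euclidean distances: rows = tracked objects, columns = detections.
D = np.linalg.norm(tracked[:, None, :] - detected[None, :, :], axis=2)

# Visit tracked objects in order of their closest detection (greedy matching).
rows = D.min(axis=1).argsort()
cols = D.argmin(axis=1)[rows]

matches = {}
used_cols = set()
for row, col in zip(rows, cols):
    if int(col) in used_cols:
        continue
    matches[int(row)] = int(col)
    used_cols.add(int(col))

# Detections that no tracked object claimed would be registered as new objects.
unmatched = sorted(set(range(D.shape[1])) - used_cols)

print(matches)    # {0: 1, 1: 0}
print(unmatched)  # [2]
```

Tracked object 0 keeps its ID by claiming detection 1, tracked object 1 claims detection 0, and detection 2 is left over as a new object. Because the matching is greedy, it can pick a poor pairing when objects cross paths, which is one motivation for the optimal assignment used by SORT.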
Description:
This example demonstrates how to combine YOLO object detection with Centroid Tracking to track multiple objects across frames in a video stream.
import cv2
import torch
from centroidtracker import CentroidTracker  # CentroidTracker class from the previous example, saved as centroidtracker.py

# Load pre-trained YOLOv5 model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Initialize Centroid Tracker
ct = CentroidTracker()

# Open webcam feed
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Perform object detection with YOLO (OpenCV frames are BGR; the model expects RGB)
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

    # Extract bounding boxes from YOLO detections
    rects = []
    for *box, conf, cls in results.xyxy[0]:
        startX, startY, endX, endY = map(int, box)
        rects.append((startX, startY, endX, endY))

    # Update tracker with bounding boxes
    objects = ct.update(rects)

    # Draw the centroid and ID of each tracked object
    for (objectID, centroid) in objects.items():
        text = f"ID {objectID}"
        cv2.putText(frame, text, (centroid[0] - 10, centroid[1] - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
        cv2.circle(frame, (centroid[0], centroid[1]), 4, (0, 255, 0), -1)

    # Display the frame
    cv2.imshow('Object Detection & Tracking', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
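Each row of results.xyxy[0] holds (x1, y1, x2, y2, confidence, class). Passing every row to the tracker can create short-lived IDs for weak detections, so it may help to filter first. The sketch below works on plain Python lists. The helper name filter_detections, the 0.5 threshold, and the sample rows are assumptions for illustration.

```python
def filter_detections(rows, min_conf=0.5, keep_classes=None):
    """Keep confident boxes, optionally restricted to a set of class indices."""
    rects = []
    for x1, y1, x2, y2, conf, cls in rows:
        if conf < min_conf:
            continue  # drop weak detections
        if keep_classes is not None and int(cls) not in keep_classes:
            continue  # drop classes we do not want to track
        rects.append((int(x1), int(y1), int(x2), int(y2)))
    return rects


# Hypothetical detector output: x1, y1, x2, y2, confidence, class index.
rows = [
    [100.2, 150.7, 200.1, 250.9, 0.91, 0],
    [300.0, 350.0, 400.0, 450.0, 0.32, 0],
    [50.0, 60.0, 120.0, 140.0, 0.88, 2],
]
print(filter_detections(rows, min_conf=0.5, keep_classes={0}))
# [(100, 150, 200, 250)]
```

In the loop above, the rows could be obtained with results.xyxy[0].tolist() and the returned list passed to ct.update.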
Description:
This example introduces SORT (Simple Online and Realtime Tracking), a simple yet effective algorithm for tracking multiple objects in real time using Kalman Filters and the Hungarian algorithm for data association.
import numpy as np
from sort import Sort  # Assuming SORT algorithm is implemented in sort.py

# Initialize SORT tracker
tracker = Sort()

# Example: detected bounding boxes with confidence scores (from object detector),
# one row per detection: [x1, y1, x2, y2, score]
detections = np.array([[100, 150, 200, 250, 0.9], [300, 350, 400, 450, 0.8]])

# Update tracker with new detections; each returned row is [x1, y1, x2, y2, track_id]
tracked_objects = tracker.update(detections)
print(f"Tracked objects: {tracked_objects}")
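The data-association step that SORT performs with the Hungarian algorithm can be sketched on its own. The example below builds an IoU-based cost matrix between predicted track boxes and new detections, then solves the assignment with scipy's linear_sum_assignment. The function names, the 0.3 IoU threshold, and the boxes are assumptions for illustration and are not taken from sort.py.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(predicted, detections, iou_threshold=0.3):
    """Return (track_index, detection_index) pairs with sufficient overlap."""
    if len(predicted) == 0 or len(detections) == 0:
        return []
    # Cost is 1 - IoU, so minimizing total cost maximizes total overlap.
    cost = np.array([[1.0 - iou(p, d) for d in detections] for p in predicted])
    rows, cols = linear_sum_assignment(cost)
    # Discard assignments whose overlap is too small to be the same object.
    return [(int(r), int(c)) for r, c in zip(rows, cols)
            if 1.0 - cost[r, c] >= iou_threshold]


predicted = [(100, 150, 200, 250), (300, 350, 400, 450)]   # boxes predicted by the tracks
detections = [(305, 352, 405, 455), (102, 149, 201, 252), (600, 50, 650, 100)]
print(associate(predicted, detections))  # [(0, 1), (1, 0)]
```

Unlike the greedy matching in Centroid Tracking, the assignment here is optimal over the whole cost matrix. The third detection stays unmatched and would start a new track.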
Description:
This example describes DeepSORT, an advanced object tracking algorithm that adds deep-learning appearance features for re-identification to the Kalman filtering and data association used by SORT.
No Code for this example – Practical Guidance
DeepSORT combines motion-based tracking with appearance features (via CNNs) to track objects even when they leave the frame and reappear.
Install and integrate DeepSORT with YOLO or another object detector.
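The appearance side of this idea can be sketched without a neural network. Assume each track stores an embedding vector produced by a CNN, and each new detection gets an embedding from the same network. Matching then reduces to finding the track with the smallest cosine distance. The 4-D vectors and the 0.2 gating threshold below are made up for illustration; real re-identification embeddings are much longer.

```python
import numpy as np


def cosine_distance(gallery, queries):
    """Pairwise cosine distance; rows = stored tracks, columns = new detections."""
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    return 1.0 - g @ q.T


# Hypothetical appearance embeddings.
track_features = np.array([[1.0, 0.0, 0.0, 0.0],      # track 0
                           [0.0, 1.0, 0.0, 0.0]])     # track 1
detection_features = np.array([[0.1, 0.9, 0.0, 0.0],  # resembles track 1
                               [0.9, 0.1, 0.1, 0.0]]) # resembles track 0

dist = cosine_distance(track_features, detection_features)
best_track = dist.argmin(axis=0)          # closest track for each detection
max_distance = 0.2                        # assumed gating threshold
accepted = dist.min(axis=0) <= max_distance

print(best_track)  # [1 0]
print(accepted)    # [ True  True]
```

Because the comparison uses appearance and not position, a detection can be matched to a track that was lost for several frames, which is what makes re-identification after occlusion possible.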
Description:
This example explains how to use Kalman Filters for object tracking, estimating the future position of an object based on its previous state and motion model.
import numpy as np


class KalmanTracker:
    def __init__(self):
        self.dt = 1.0  # Time interval
        self.A = np.array([[1, self.dt], [0, 1]])  # State transition matrix (constant velocity)
        self.H = np.array([[1, 0]])  # Measurement matrix (only position is observed)
        self.Q = np.array([[1, 0], [0, 1]])  # Process noise covariance
        self.R = np.array([[1]])  # Measurement noise covariance
        self.P = np.eye(2)  # Initial estimation covariance
        self.x = np.zeros((2, 1))  # Initial state: [position, velocity]

    def predict(self):
        self.x = np.dot(self.A, self.x)
        self.P = np.dot(np.dot(self.A, self.P), self.A.T) + self.Q
        return self.x

    def update(self, z):
        y = z - np.dot(self.H, self.x)  # Measurement residual
        S = np.dot(np.dot(self.H, self.P), self.H.T) + self.R  # Residual covariance
        K = np.dot(np.dot(self.P, self.H.T), np.linalg.inv(S))  # Kalman gain
        self.x = self.x + np.dot(K, y)
        self.P = self.P - np.dot(np.dot(K, self.H), self.P)


# Initialize tracker and simulate a sequence of measurements
tracker = KalmanTracker()
measurements = [np.array([[10]]), np.array([[12]]), np.array([[14]])]

for z in measurements:
    tracker.predict()
    tracker.update(z)
    # State after the measurement update, not the raw prediction
    print(f"Estimated state: {tracker.x}")
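The same predict and update cycle extends to a 2D centroid with a constant-velocity model, which is closer to how a tracker uses the filter. The sketch below also shows what happens on a frame where the detector misses the object: only the predict step runs, so the estimate keeps moving along the last estimated velocity. The noise values and measurements are assumptions chosen for illustration.

```python
import numpy as np

dt = 1.0
# State: [cx, cy, vx, vy]; measurement: [cx, cy].
A = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
Q = 0.01 * np.eye(4)  # assumed process noise covariance
R = 1.0 * np.eye(2)   # assumed measurement noise covariance

x = np.array([[100.0], [200.0], [0.0], [0.0]])  # initial state
P = 10.0 * np.eye(4)                            # initial uncertainty

# Centroid measurements; None marks a frame where the detector missed the object.
measurements = [(105, 203), (110, 206), (115, 209), None, (125, 215)]

for z in measurements:
    # Predict: advance the state with the motion model
    x = A @ x
    P = A @ P @ A.T + Q
    if z is not None:
        # Update: correct the prediction with the measurement
        z = np.array(z, dtype=float).reshape(2, 1)
        y = z - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ y
        P = (np.eye(4) - K @ H) @ P
    print(np.round(x[:2].ravel(), 1))
```

The object moves by about (5, 3) pixels per frame, and after a few updates the velocity estimate approaches that value, so the position on the missed frame is filled in by prediction alone.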
Description:
This example demonstrates how to combine YOLOv5 for object detection with SORT for multi-object tracking in real-time video feeds.
import cv2
import numpy as np
import torch
from sort import Sort

# Load YOLOv5 model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Initialize SORT tracker
tracker = Sort()

# Open webcam feed
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Perform object detection with YOLO (OpenCV frames are BGR; the model expects RGB)
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

    # Extract bounding boxes and confidence scores
    detections = []
    for *box, conf, cls in results.xyxy[0]:
        detections.append([*map(int, box), conf.item()])

    # SORT expects an (N, 5) array, including when there are no detections
    detections = np.array(detections) if detections else np.empty((0, 5))

    # Update SORT tracker with new detections
    tracked_objects = tracker.update(detections)

    # Draw tracked bounding boxes and object IDs
    for obj in tracked_objects:
        x1, y1, x2, y2, obj_id = map(int, obj)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, f"ID {obj_id}", (x1, y1 - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Display the frame
    cv2.imshow('Object Detection & Tracking (YOLOv5 + SORT)', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Description:
This example shows how to combine DeepSORT with YOLOv5 for tracking people across video frames, handling occlusion and re-identification.
No Code for this example – Practical Guidance
DeepSORT is integrated with YOLO for tracking people across frames in surveillance video.
DeepSORT provides an additional re-identification feature to handle cases where people leave and re-enter the frame.
Description:
This example introduces metrics such as MOTA (Multi-Object Tracking Accuracy) and IDF1 (the identification F1 score) to evaluate the performance of object tracking models.
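from sklearn.metrics import accuracy_score

# Example ground truth and predicted track IDs, one object per frame (simplified for illustration)
true_tracks = [1, 2, 1, 2, 1]
pred_tracks = [1, 2, 1, 1, 2]

# Simplified stand-in for IDF1: the fraction of frames whose predicted ID matches
# the ground truth. The real IDF1 is the F1 score of identity matches,
# 2*IDTP / (2*IDTP + IDFP + IDFN), computed after optimally matching predicted
# tracks to ground-truth tracks.
idf1_score = accuracy_score(true_tracks, pred_tracks)
print(f"ID accuracy (simplified IDF1 proxy): {idf1_score}")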
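For reference, MOTA and IDF1 are usually defined from error counts accumulated over a whole sequence: MOTA = 1 - (FN + FP + IDSW) / GT, and IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN). The sketch below evaluates both formulas directly. The function names and all counts are hypothetical; in practice the counts come from matching tracker output against annotated ground truth.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multi-Object Tracking Accuracy from sequence-level error counts."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identification F1 from identity-level true/false positives and false negatives."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


# Hypothetical counts for one video sequence.
print(round(mota(false_negatives=10, false_positives=5, id_switches=2, num_gt=100), 3))  # 0.83
print(round(idf1(idtp=80, idfp=15, idfn=20), 3))  # 0.821
```

MOTA penalizes missed objects, false alarms, and identity switches equally and can be negative when errors outnumber ground-truth objects. IDF1 focuses on how consistently each object keeps the same identity.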
Objective: Implement and optimize advanced object tracking techniques in combination with object detection.
Skills Developed:
Implement single-object tracking with the KCF tracker and multi-object tracking with Centroid Tracking, SORT, and DeepSORT.
Integrate YOLO with tracking algorithms for real-time object detection and tracking.
Understand and apply Kalman Filters for tracking object motion.
Evaluate object tracking performance using MOTA and IDF1 metrics.
Tools: OpenCV, PyTorch, SORT, DeepSORT, Kalman Filters, YOLOv5.
By the end of Week 13, students will be capable of building real-time object tracking systems using a combination of detection and tracking algorithms, enhancing their ability to handle complex multi-object tracking scenarios in real-world applications.