Object tracking is a crucial task in computer vision that involves locating and following objects over time in a video stream. It has numerous applications in various fields such as surveillance, automotive, robotics, and more. OpenCV, the open-source computer vision library, provides a rich set of tools for object tracking. In this blog post, we will dive deep into the world of object tracking using OpenCV and Python.
What is Object Tracking?
Object tracking is the process of following an object’s location through a video stream over time. Typically, the object is located in an initial frame, either by hand or with a detector, and a tracking algorithm then follows it as it moves from frame to frame. Object tracking can be challenging due to factors such as occlusion, illumination changes, and object deformation. However, it is an essential task in applications such as surveillance, traffic monitoring, and robotics.
Object Tracking OpenCV Python Algorithms
OpenCV provides several object tracking algorithms that can be used for various applications. These algorithms include:
- MeanShift: A non-parametric algorithm that uses the color histogram to track objects.
- CamShift: An extension of MeanShift that adapts to the object’s size and orientation changes.
- KCF (Kernelized Correlation Filter): A correlation-filter tracker that applies the kernel trick and exploits the circulant structure of shifted image patches, so training and detection can be done cheaply in the Fourier domain.
- MOSSE (Minimum Output Sum of Squared Error): A real-time correlation-filter tracker that learns the filter minimizing the squared error between the actual and the desired correlation output.
- CSRT (Channel and Spatial Reliability Tracking): A discriminative correlation-filter tracker that weights feature channels and spatial regions by their reliability. It is usually slower than KCF but more accurate.
Object Tracking Using MeanShift OpenCV Python
MeanShift is a popular object tracking algorithm. Starting from the object’s window in the previous frame, it computes the centroid of the object’s color probability inside the window, moves the window to that centroid, and repeats until the window stops moving. MeanShift is simple and efficient, but the window has a fixed size and orientation, so it cannot follow scale or rotation changes.
To perform object tracking using the MeanShift algorithm, we need to follow these steps:
- Initialize the object’s location in the first frame.
- Calculate the object’s color histogram in the first frame.
- In each subsequent frame, back-project that histogram onto the frame to get, for every pixel, the probability that it belongs to the object.
- Shift the search window toward the centroid of the probabilities inside it, and repeat until the window stops moving or an iteration limit is reached.
- Use the final window as the object’s new location and as the starting window for the next frame.
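The window-shifting step at the heart of this procedure can be sketched without OpenCV. The snippet below is a minimal illustration, not OpenCV’s implementation: `mean_shift` is a hypothetical helper, and `prob` is a plain list of lists standing in for the per-pixel probability map of the object.

```python
def mean_shift(prob, window, max_iter=10, eps=1.0):
    """Shift window (x, y, w, h) toward the centroid of prob inside it.

    prob is a 2-D list of non-negative weights. Hypothetical helper for
    illustration only, not an OpenCV function.
    """
    x, y, w, h = window
    rows, cols = len(prob), len(prob[0])
    for _ in range(max_iter):
        total = sum_x = sum_y = 0.0
        for r in range(y, y + h):
            for c in range(x, x + w):
                total += prob[r][c]
                sum_x += c * prob[r][c]
                sum_y += r * prob[r][c]
        if total == 0:
            break  # nothing to follow inside the window
        # Move the window so that its centre sits on the weighted centroid
        new_x = round(sum_x / total - (w - 1) / 2)
        new_y = round(sum_y / total - (h - 1) / 2)
        # Keep the window inside the image
        new_x = min(max(new_x, 0), cols - w)
        new_y = min(max(new_y, 0), rows - h)
        moved = abs(new_x - x) + abs(new_y - y)
        x, y = new_x, new_y
        if moved < eps:
            break  # converged: the window has stopped moving
    return x, y, w, h


# A 10x10 map with a bright 3x3 blob whose centre is at column 6, row 5
prob = [[0.0] * 10 for _ in range(10)]
for r in range(4, 7):
    for c in range(5, 8):
        prob[r][c] = 1.0

# Start with a 5x5 window that only partly overlaps the blob
window = mean_shift(prob, (2, 2, 5, 5))
print(window)  # (4, 3, 5, 5): the window is now centred on the blob
```

The window never changes size here, which is the limitation mentioned above: if the blob grew or rotated, the window would still be 5x5 and axis-aligned.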
Here’s how to perform object tracking using MeanShift in OpenCV:
import cv2
# Open the video
cap = cv2.VideoCapture('video.mp4')
# Read the first frame
ret, frame = cap.read()
if not ret:
    raise SystemExit('Could not read a frame from video.mp4')
# Select the ROI (Region of Interest) with the mouse, then press Enter or Space
x, y, w, h = cv2.selectROI(frame)
track_window = (x, y, w, h)
# Build a hue histogram of the ROI in HSV space, the same space used for the back projection below
roi = frame[y:y+h, x:x+w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
# Stop after 10 iterations or once the window moves by less than 1 pixel
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to HSV color space
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Back-project the ROI histogram onto the hue channel (hue ranges over 0-179 in OpenCV)
    dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    # Apply the MeanShift algorithm, starting from the previous window
    ret, track_window = cv2.meanShift(dst, track_window, term_crit)
    # Draw the track window on the frame
    x, y, w, h = track_window
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Display the resulting frame
    cv2.imshow('frame', frame)
    # Exit if the user presses 'q'
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
CamShift Object Tracking OpenCV Python
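CamShift (Continuously Adaptive MeanShift) is an extension of MeanShift that can handle scale and rotation changes. In each frame it first runs MeanShift until the window converges, then resizes the window and estimates the object’s orientation from the distribution of the back-projected probabilities. The updated window becomes the starting point for the next frame.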
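To give a feel for how an orientation can be read off a probability map, here is a small sketch based on the textbook image-moments formula. It is an illustration under assumptions, not OpenCV’s internal code: `blob_orientation` is a hypothetical helper, and `prob` is a plain list of lists standing in for the back projection.

```python
import math


def blob_orientation(prob):
    """Return (cx, cy, angle_degrees) for the weights in a 2-D list.

    Hypothetical helper for illustration only. The angle is that of the
    blob's major axis, computed from second-order central moments.
    """
    # Zeroth and first moments give the total weight and the centroid
    m00 = m10 = m01 = 0.0
    for r, row in enumerate(prob):
        for c, p in enumerate(row):
            m00 += p
            m10 += c * p
            m01 += r * p
    cx, cy = m10 / m00, m01 / m00
    # Second-order central moments describe the spread around the centroid
    mu20 = mu02 = mu11 = 0.0
    for r, row in enumerate(prob):
        for c, p in enumerate(row):
            mu20 += (c - cx) ** 2 * p
            mu02 += (r - cy) ** 2 * p
            mu11 += (c - cx) * (r - cy) * p
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, math.degrees(angle)


# A 9x9 map with a thin diagonal streak from (2, 2) to (6, 6)
prob = [[0.0] * 9 for _ in range(9)]
for i in range(2, 7):
    prob[i][i] = 1.0

cx, cy, angle = blob_orientation(prob)
# Image rows grow downward, so a positive angle is a clockwise tilt on screen
print(cx, cy, round(angle, 6))  # 4.0 4.0 45.0
```

A MeanShift window has no way to express this 45 degree tilt, which is why CamShift returns a rotated rectangle rather than an axis-aligned one.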
Here’s how to perform object tracking using CamShift in OpenCV:
import cv2
import numpy as np
# Open the video
cap = cv2.VideoCapture('video.mp4')
# Read the first frame
ret, frame = cap.read()
if not ret:
    raise SystemExit('Could not read a frame from video.mp4')
# Select the ROI (Region of Interest) with the mouse, then press Enter or Space
x, y, w, h = cv2.selectROI(frame)
track_window = (x, y, w, h)
# Build a hue histogram of the ROI in HSV space, the same space used for the back projection below
roi = frame[y:y+h, x:x+w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
# Stop after 10 iterations or once the window moves by less than 1 pixel
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Convert the frame to HSV color space
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Back-project the ROI histogram onto the hue channel (hue ranges over 0-179 in OpenCV)
    dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    # Apply the CamShift algorithm, feeding the updated window back in on each frame
    rot_rect, track_window = cv2.CamShift(dst, track_window, term_crit)
    # Draw the rotated rectangle on the frame (polylines needs integer points)
    pts = cv2.boxPoints(rot_rect).astype(np.int32)
    cv2.polylines(frame, [pts], True, (0, 255, 0), 2)
    # Display the resulting frame
    cv2.imshow('frame', frame)
    # Exit if the user presses 'q'
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
MOSSE for Object Tracking in OpenCV Python
MOSSE (Minimum Output Sum of Squared Error) is a simple and efficient object tracking algorithm that uses a correlation filter to track the object. It is among the fastest trackers available in OpenCV and is generally cheaper to run than KCF or CSRT, at some cost in accuracy.
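The correlation-filter idea can be shown on a one-dimensional toy signal. The sketch below is a simplified illustration, not what OpenCV runs internally: it trains on a single sample, skips the preprocessing and online updates a real tracker uses, and relies on a naive DFT in plain Python so that it has no dependencies. The helper names `dft`, `train_filter`, and `locate` are hypothetical.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform, fine for tiny signals."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * i / n)
                  for i, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def train_filter(patch, target, reg=1e-3):
    """Closed-form filter H* = G.conj(F) / (F.conj(F) + reg), per frequency."""
    F, G = dft(patch), dft(target)
    return [g * f.conjugate() / (f * f.conjugate() + reg) for f, g in zip(F, G)]


def locate(filter_conj, patch):
    """Apply the filter in the frequency domain and return the peak index."""
    F = dft(patch)
    response = dft([f * h for f, h in zip(F, filter_conj)], inverse=True)
    return max(range(len(response)), key=lambda i: response[i].real)


n, pos = 32, 10
# Training signal: a small asymmetric bump centred on index 10
signal = [0.0] * n
signal[pos - 1], signal[pos], signal[pos + 1] = 1.0, 3.0, 2.0
# Desired response: a Gaussian peak at the object's position
target = [math.exp(-0.5 * (min(abs(i - pos), n - abs(i - pos)) / 2.0) ** 2)
          for i in range(n)]

h_conj = train_filter(signal, target)

# Next "frame": the same bump moved 5 samples to the right
shifted = signal[-5:] + signal[:-5]
peak = locate(h_conj, shifted)
print(peak)  # 15: the response peak follows the object
```

Because both training and detection reduce to element-wise operations on Fourier coefficients, the per-frame cost is dominated by the transforms, which is where the speed of this family of trackers comes from.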
Here’s how to perform object tracking using MOSSE in OpenCV:
import cv2
# Open the video
cap = cv2.VideoCapture('video.mp4')
# Read the first frame
ret, frame = cap.read()
if not ret:
    raise SystemExit('Could not read a frame from video.mp4')
# Select the ROI (Region of Interest) with the mouse, then press Enter or Space
x, y, w, h = cv2.selectROI(frame)
# Initialize the tracker (needs opencv-contrib-python; newer OpenCV 4.x releases keep MOSSE under cv2.legacy)
if hasattr(cv2, 'legacy'):
    tracker = cv2.legacy.TrackerMOSSE_create()
else:
    tracker = cv2.TrackerMOSSE_create()
tracker.init(frame, (x, y, w, h))
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Update the tracker; ok is False when the object is lost
    ok, track_window = tracker.update(frame)
    # Draw the track window on the frame
    if ok:
        x, y, w, h = (int(v) for v in track_window)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Display the resulting frame
    cv2.imshow('frame', frame)
    # Exit if the user presses 'q'
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
KCF Object Tracking OpenCV Python
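KCF (Kernelized Correlation Filters) is an object tracking algorithm that tracks the object with a correlation filter trained in a kernel feature space. KCF is fast and usually more accurate than MOSSE, but its standard formulation keeps a fixed-size window, so it copes poorly with large scale changes, rotation, and full occlusion.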
Here’s how to perform object tracking using KCF in OpenCV:
import cv2
# Open the video
cap = cv2.VideoCapture('video.mp4')
# Read the first frame
ret, frame = cap.read()
if not ret:
    raise SystemExit('Could not read a frame from video.mp4')
# Select the ROI (Region of Interest) with the mouse, then press Enter or Space
x, y, w, h = cv2.selectROI(frame)
# Initialize the tracker (typically requires the opencv-contrib-python package)
tracker = cv2.TrackerKCF_create()
tracker.init(frame, (x, y, w, h))
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Update the tracker; ok is False when the object is lost
    ok, track_window = tracker.update(frame)
    # Draw the track window on the frame
    if ok:
        x, y, w, h = (int(v) for v in track_window)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Display the resulting frame
    cv2.imshow('frame', frame)
    # Exit if the user presses 'q'
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
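The MOSSE and KCF scripts differ only in the call that creates the tracker, and where that call lives can depend on the OpenCV build. One way to keep the rest of the script unchanged is a small lookup helper. The sketch below is a suggestion under assumptions: `create_tracker` is a hypothetical helper, and a stand-in object replaces the real `cv2` module so the lookup can be shown without OpenCV installed.

```python
from types import SimpleNamespace


def create_tracker(cv2_module, name):
    """Return a new tracker for a name such as 'KCF' or 'MOSSE'.

    Looks for Tracker<name>_create on the module itself and then on its
    `legacy` submodule, if there is one. Hypothetical helper.
    """
    factory_name = f"Tracker{name}_create"
    for namespace in (cv2_module, getattr(cv2_module, "legacy", None)):
        factory = getattr(namespace, factory_name, None)
        if factory is not None:
            return factory()
    raise ValueError(f"{factory_name} not found; is opencv-contrib-python installed?")


# Stand-in for the cv2 module, with one tracker at the top level and one under legacy
fake_cv2 = SimpleNamespace(
    TrackerKCF_create=lambda: "kcf tracker",
    legacy=SimpleNamespace(TrackerMOSSE_create=lambda: "mosse tracker"),
)

kcf = create_tracker(fake_cv2, "KCF")
mosse = create_tracker(fake_cv2, "MOSSE")
print(kcf, mosse)  # kcf tracker mosse tracker
```

In a real script you would pass the imported module instead, for example `tracker = create_tracker(cv2, 'KCF')`, and leave the `init` and `update` calls as they are.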
References:
- OpenCV documentation: https://docs.opencv.org/master/d9/df8/tutorial_root.html