September 11, 2024

Object Detection Using OpenCV in Python

OpenCV (Open Source Computer Vision Library) is a powerful library for computer vision tasks, including object detection. Object detection involves identifying and locating objects within an image or video stream. This guide will walk you through the basics of object detection using OpenCV in Python.

1. Installing OpenCV

First, you need to install OpenCV. You can do this using pip:

pip install opencv-python

2. Loading an Image

Before detecting objects, you’ll need an image to work with. You can load an image using OpenCV’s imread() function.

2.1. Example: Loading an Image

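import cv2

# Load an image; imread() returns None instead of raising if the file cannot be read
image = cv2.imread('image.jpg')
if image is None:
    raise FileNotFoundError("Could not read 'image.jpg'")

# Display the image until a key is pressed
cv2.imshow('Loaded Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()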

This code loads an image from the specified file path and displays it in a window.
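
It may help to know what imread() hands back: a NumPy array of shape (height, width, 3) with the color channels in blue-green-red (BGR) order. The sketch below uses a small synthetic array in place of a real file (an assumption, so that it runs without image.jpg) to show how the shape and channel order are read.

```python
import numpy as np

# Synthetic stand-in for the array cv2.imread() returns: 4 rows, 6 columns, 3 channels
image = np.zeros((4, 6, 3), dtype=np.uint8)

# OpenCV stores color channels as B, G, R, so index 2 is the red channel
image[:, :, 2] = 255  # make every pixel pure red

height, width = image.shape[:2]  # rows come first, then columns
blue, green, red = image[0, 0]   # channel values of the top-left pixel

print(height, width)
print(int(blue), int(green), int(red))
```

The same image.shape[:2] pattern appears in the YOLO examples later in this guide, where the height and width are used to scale detections back to pixel coordinates.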

3. Using Pre-Trained Models for Object Detection

OpenCV ships with pre-trained Haar Cascade classifiers and, through its dnn module, can load deep learning models trained elsewhere. These models can detect faces, eyes, and other objects in images or videos.

3.1. Example: Face Detection Using Haar Cascades

Haar Cascades are a popular object detection method provided by OpenCV. OpenCV includes pre-trained classifiers for various objects like faces, eyes, smiles, etc.

import cv2

# Load an image
image = cv2.imread('image.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Load the pre-trained Haar Cascade classifier for face detection
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Perform face detection
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))

# Draw rectangles around detected faces
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

# Display the result
cv2.imshow('Detected Faces', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this example, the program detects faces in an image using the Haar Cascade classifier and draws rectangles around the detected faces.
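
detectMultiScale() returns one (x, y, w, h) row per detection. As a small illustration of working with that result, the sketch below picks the largest detection by area. The function name largest_box and the sample rectangles are hypothetical; they are not part of OpenCV.

```python
def largest_box(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if there are none."""
    best = None
    best_area = -1
    for (x, y, w, h) in boxes:
        area = w * h
        if area > best_area:
            best, best_area = (int(x), int(y), int(w), int(h)), area
    return best

# Made-up rectangles in the same (x, y, w, h) layout that detectMultiScale() returns
sample_faces = [(10, 20, 40, 40), (100, 50, 80, 90), (200, 10, 30, 30)]
print(largest_box(sample_faces))  # (100, 50, 80, 90)
```

In the face detection example, the faces variable could be passed to this helper directly, for instance to crop only the most prominent face.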

4. Object Detection with Deep Learning (YOLO)

YOLO (You Only Look Once) is a popular deep learning-based object detection framework. OpenCV’s dnn module can load YOLO models and run them on still images or live video.

4.1. Example: Object Detection Using YOLO

To use YOLO with OpenCV, you’ll need three files: the pre-trained model weights (yolov3.weights), the network configuration file (yolov3.cfg), and the class labels (coco.names).

import cv2
import numpy as np

# Load YOLO model
net = cv2.dnn.readNet('yolov3.weights', 'yolov3.cfg')
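# Names of the output layers. Indexing getUnconnectedOutLayers() with i[0] fails on
# newer OpenCV releases, which return a flat array, so ask for the names directly.
output_layers = net.getUnconnectedOutLayersNames()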

# Load the class labels
with open('coco.names', 'r') as f:
    classes = [line.strip() for line in f.readlines()]

# Load the image
image = cv2.imread('image.jpg')
height, width = image.shape[:2]

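# Prepare the image for the YOLO model: scale pixels to [0, 1] (0.00392 is roughly 1/255),
# resize to 416x416, and swap BGR to RGB
blob = cv2.dnn.blobFromImage(image, 0.00392, (416, 416), (0, 0, 0), swapRB=True, crop=False)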
net.setInput(blob)
outs = net.forward(output_layers)

# Process the output
class_ids = []
confidences = []
boxes = []
for out in outs:
    for detection in out:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.5:
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)

# Apply non-max suppression to remove overlapping boxes
indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

# Draw the bounding boxes
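# NMSBoxes returns a flat or nested index array depending on the OpenCV version,
# so flatten it instead of indexing each entry with i[0]
for i in np.array(indices).flatten():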
    box = boxes[i]
    x, y, w, h = box[0], box[1], box[2], box[3]
    label = str(classes[class_ids[i]])
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

# Display the result
cv2.imshow('YOLO Object Detection', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this example, the program uses the YOLO model to detect objects in an image. The detected objects are highlighted with bounding boxes and labeled with class names.
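
The box arithmetic inside the detection loop is easier to follow in isolation. YOLO reports each box as a center point plus a width and height, all normalized to the range 0 to 1, while cv2.rectangle() needs a top-left corner in pixels. The sketch below repeats the same conversion as the example above; the function name to_pixel_box and the sample values are hypothetical.

```python
def to_pixel_box(detection, width, height):
    """Convert a normalized (cx, cy, w, h) detection into a pixel [x, y, w, h] box."""
    center_x = int(detection[0] * width)
    center_y = int(detection[1] * height)
    w = int(detection[2] * width)
    h = int(detection[3] * height)
    # Shift from the box center to its top-left corner
    x = int(center_x - w / 2)
    y = int(center_y - h / 2)
    return [x, y, w, h]

# A box centered in a 640x480 image, a quarter of its width and half of its height
print(to_pixel_box([0.5, 0.5, 0.25, 0.5], 640, 480))  # [240, 120, 160, 240]
```

Boxes near the image border can produce negative x or y values, so clamping them to zero before cropping may be worthwhile.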

5. Real-Time Object Detection

OpenCV can also perform real-time object detection using a webcam feed. The steps are similar to those used for images but involve continuously capturing frames from the webcam and processing each frame.

5.1. Example: Real-Time Object Detection with Webcam

import cv2
import numpy as np

# Load YOLO model
net = cv2.dnn.readNet('yolov3.weights', 'yolov3.cfg')
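# Names of the output layers. Indexing getUnconnectedOutLayers() with i[0] fails on
# newer OpenCV releases, which return a flat array, so ask for the names directly.
output_layers = net.getUnconnectedOutLayersNames()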

# Load the class labels
with open('coco.names', 'r') as f:
    classes = [line.strip() for line in f.readlines()]

# Initialize the webcam
cap = cv2.VideoCapture(0)

while True:
    # Capture frame-by-frame; stop if the camera does not return a frame
    ret, frame = cap.read()
    if not ret:
        break
    height, width = frame.shape[:2]

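    # Prepare the frame for the YOLO model: scale pixels to [0, 1] (0.00392 is roughly 1/255),
    # resize to 416x416, and swap BGR to RGB
    blob = cv2.dnn.blobFromImage(frame, 0.00392, (416, 416), (0, 0, 0), swapRB=True, crop=False)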
    net.setInput(blob)
    outs = net.forward(output_layers)

    # Process the output
    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    # Apply non-max suppression to remove overlapping boxes
    indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

    # Draw the bounding boxes
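    # NMSBoxes returns a flat or nested index array depending on the OpenCV version,
    # so flatten it instead of indexing each entry with i[0]
    for i in np.array(indices).flatten():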
        box = boxes[i]
        x, y, w, h = box[0], box[1], box[2], box[3]
        label = str(classes[class_ids[i]])
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Display the frame
    cv2.imshow('YOLO Object Detection', frame)

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the capture and close windows
cap.release()
cv2.destroyAllWindows()

This example captures real-time video from the webcam and applies YOLO-based object detection to each frame. Detected objects are highlighted with bounding boxes and labels. The loop continues until the user presses the ‘q’ key.
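
Both YOLO examples rely on non-max suppression to collapse overlapping boxes into one. The sketch below shows the general idea in plain Python: drop low-confidence boxes, then keep the highest-scoring box and discard any remaining box that overlaps a kept one too much. It is a conceptual illustration only; the function names iou and greedy_nms are hypothetical, and the internals of cv2.dnn.NMSBoxes (including how it treats values exactly at a threshold) may differ.

```python
def iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, score_threshold, iou_threshold):
    """Return the indices of the boxes kept after greedy non-max suppression."""
    # Consider only confident boxes, best score first
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes and one separate box (made-up values)
boxes = [[100, 100, 50, 50], [105, 105, 50, 50], [300, 300, 40, 40]]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, 0.5, 0.4))  # [0, 2]
```

The two thresholds correspond to the 0.5 and 0.4 arguments passed to cv2.dnn.NMSBoxes in the examples: the first filters by confidence, the second sets how much overlap is tolerated.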

Conclusion

OpenCV is a versatile library for performing various computer vision tasks, including object detection. By leveraging pre-trained models like Haar Cascades and YOLO, you can quickly build powerful applications that detect objects in images, videos, or real-time streams. Whether you’re working with static images or need real-time performance, OpenCV provides the tools you need to get started with object detection in Python.