
Research on Methods for Counting the Number of People in a Video Stream Using OpenCV

19 Oct 2018 · CPOL · 6 min read
Two ways of performing object recognition using OpenCV, compared to each other. Both approaches have their own pros and cons, and this comparison will help you choose the best one for your task.

Introduction

With the advent of AI, machine learning, and automation, computer vision is becoming all the more relevant. Our team has been building computer vision expertise as part of a new set of projects involving AI and machine learning.

We want to share our experience, specifically with regard to object detection with OpenCV.

Our objective is to count the number of people who have crossed an abstract line on screen, using computer vision and the OpenCV library.

In this article, we will look at two ways to perform object recognition using OpenCV and compare them to each other. Both approaches have their own pros and cons, and we hope that this comparison will help you choose the best one for your task.

Contents

- Object Recognition with Machine Learning Algorithms
- Recognition of Moving Objects through Background Subtraction Algorithms
- Conclusion

Object Recognition with Machine Learning Algorithms

The first method for counting people in a video stream is to distinguish each individual object with the help of machine learning algorithms. For this purpose, the HOGDescriptor class has been implemented in OpenCV.

HOG (Histogram of Oriented Gradients) is a feature descriptor used in computer vision and image processing to detect objects. This technique is based on counting occurrences of gradient orientation in localized portions of an image.

HOGDescriptor implements an object detector based on histograms of oriented gradients. When HOG is used for object recognition, the computed descriptors are classified with a supervised learning model, a support vector machine.

A Support Vector Machine, or SVM, is a supervised learning model with associated learning algorithms that are used here for classification. Classification is based on a set of coefficients calculated from the support vectors; these coefficients are obtained from training data and stored in an XML file that fully describes the model (classifier).
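
A minimal sketch of how such a file could be used, assuming a classifier has been trained offline and saved under the hypothetical name custom_people_detector.yml:

C++
cv::HOGDescriptor hog;
// Try to load a custom-trained descriptor and classifier; fall back to the
// detector shipped with OpenCV if the file is missing.
if (!hog.load("custom_people_detector.yml"))
{
    hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
}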

OpenCV ships with two sets of pre-trained detector coefficients for people detection: the Daimler people detector and the default people detector.

Here’s an example of how HOGDescriptor can be used (function interfaces can be found on the official OpenCV website):

C++
cv::HOGDescriptor hog;
hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
   
// for every frame
std::vector<cv::Rect> detected;
hog.detectMultiScale(frame, detected, 0, cv::Size(8, 8), cv::Size(32, 32), 1.05f, 2);

Figure 1. Top View Recognition

Figure 2. Side View Recognition

More accurate results can be obtained by modifying the parameters of the detectMultiScale() function. Another advantage of this approach is its ability to learn: in some cases, recognition accuracy can be improved to 90% or higher.
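
For illustration, here is one possible way to tune these parameters; the values below are only a starting point and would need to be adjusted experimentally for a particular camera setup:

C++
std::vector<cv::Rect> detected;
hog.detectMultiScale(frame, detected,
                     0,                // hitThreshold: raise to reject weak detections
                     cv::Size(4, 4),   // winStride: a smaller stride is more accurate but slower
                     cv::Size(16, 16), // padding added around the detection window
                     1.03,             // scale: a finer image pyramid step is more accurate but slower
                     2.0);             // finalThreshold: detection grouping strictness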

The next step is to recognize object movements. For more precise recognition, an object can be divided into parts (head, upper and lower torso, arms, and legs, for instance). In this way, the object will not be lost if it's overlapped by another object. Body parts can be identified with the help of a HOGDescriptor that contains a set of coefficients trained for each body part. However, it has to be applied not to the whole image, but to the region where the object's borders have already been identified, as sketched below.
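
A rough sketch of this idea follows; upper_body_detector.yml is a hypothetical file with coefficients trained for a single body part, since OpenCV does not ship HOG body-part detectors out of the box:

C++
cv::HOGDescriptor partHog;
partHog.load("upper_body_detector.yml");   // hypothetical, custom-trained part detector

for (const cv::Rect& person : detected)
{
    // Run the part detector only inside the already identified object borders
    cv::Mat roi = frame(person & cv::Rect(0, 0, frame.cols, frame.rows));
    std::vector<cv::Rect> parts;
    partHog.detectMultiScale(roi, parts);
    // Note: 'parts' coordinates are relative to the ROI, not to the full frame
}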

The disadvantage of this approach is that the algorithm slows down as the number of image vectors and their resolution increase. However, speed can be improved if all calculations are carried out on the GPU.
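
A sketch of GPU-accelerated detection is shown below; it assumes OpenCV has been built with CUDA support and the cudaobjdetect module, which is not the case for the default pre-built packages:

C++
#include <opencv2/cudaobjdetect.hpp>

cv::Ptr<cv::cuda::HOG> gpuHog = cv::cuda::HOG::create();
gpuHog->setSVMDetector(gpuHog->getDefaultPeopleDetector());

// for every frame
cv::Mat gray;
cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);   // the GPU detector expects 8-bit, 1- or 4-channel input
cv::cuda::GpuMat gpuFrame(gray);                 // upload the frame to the video card
std::vector<cv::Rect> detected;
gpuHog->detectMultiScale(gpuFrame, detected);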

Recognition of Moving Objects through Background Subtraction Algorithms

The second method for counting people in a video stream, which is more efficient, is based on recognition of moving objects. Assuming the background is static, two sequential frames are compared to identify differences. The simplest way is to compare two frames and generate a third one called a mask:

C++
cv::absdiff(firstFrame, secondFrame, outputMaskFrame);

However, as this method does not take into account such aspects as shadows, reflections, the position of light sources, or other changes to the environment, the mask might be quite inaccurate. For such cases, the following three algorithms for automatic background detection and subtraction are available in OpenCV: BackgroundSubtractorMOG, BackgroundSubtractorMOG2, and BackgroundSubtractorGMG.
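
As a sketch, the three subtractors can be created as follows; note that in OpenCV 3.x, BackgroundSubtractorMOG and BackgroundSubtractorGMG live in the opencv_contrib bgsegm module, so this assumes a build that includes it:

C++
#include <opencv2/video.hpp>
#include <opencv2/bgsegm.hpp>

auto mog  = cv::bgsegm::createBackgroundSubtractorMOG();
auto mog2 = cv::createBackgroundSubtractorMOG2();
auto gmg  = cv::bgsegm::createBackgroundSubtractorGMG();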

Here, the BackgroundSubtractorMOG2 method is used as an example. This method uses a Gaussian mixture model to detect and subtract the background from an image:

C++
auto subtractor = cv::createBackgroundSubtractorMOG2(500, 2, false); // history, variance threshold, shadow detection disabled

//for every frame
subtractor->apply(currentFrame, foregroundFrame);

For better recognition, any extra image noise present in a video has to be removed, especially in videos recorded in low light. In some cases, Gaussian smoothing can be used to slightly blur an image before subtracting its background:

C++
auto subtractor = cv::createBackgroundSubtractorMOG2(500, 2, false);

// for every frame
cv::GaussianBlur(currentFrame, currentFrame, cv::Size(9, 9), 0);  // the kernel size must be odd
subtractor->apply(currentFrame, foregroundFrame);
cv::threshold(foregroundFrame, threshFrame, 10, 255, cv::THRESH_BINARY);

After image noise and the background have been subtracted, the result is a black-and-white mask in which the background is black and all moving objects are white.

Figure 3. BackgroundSubtractorMOG2 in use

In most cases, moving objects cannot be recognized completely and therefore contain empty space (holes) inside. To fix this, a rectangular structuring element is created, and morphological opening and closing operations are applied to the image so that noise is removed and the empty space inside objects is filled:

C++
// for every frame
cv::Mat structuringElement10x10 = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(10, 10));
cv::morphologyEx(threshFrame, maskFrame, cv::MORPH_OPEN, structuringElement10x10);   // remove small noise
cv::morphologyEx(maskFrame, maskFrame, cv::MORPH_CLOSE, structuringElement10x10);    // fill holes inside objects

After that, the contours of all objects have to be identified and their bounding rectangles saved to an array:

C++
// for every frame
std::vector<std::vector<cv::Point>> contours;
cv::findContours(maskFrame.clone(), contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_NONE);
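
The bounding rectangles mentioned above can then be obtained from the contours, for example with cv::boundingRect():

C++
std::vector<cv::Rect> boundingRects;
boundingRects.reserve(contours.size());
for (const auto& contour : contours)
{
    boundingRects.push_back(cv::boundingRect(contour));
}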

As objects can overlap each other, OpenCV provides several algorithms that track individual points across frames. One of them is based on the Lucas-Kanade method for optical flow estimation.

C++
cv::TermCriteria termCrit(cv::TermCriteria::COUNT | cv::TermCriteria::EPS, 20, 0.03);
cv::Size winSize(21, 21);          // size of the search window at each pyramid level
std::vector<uchar> status;         // per-point tracking success flags
std::vector<float> err;            // per-point tracking error
// for every frame
cv::calcOpticalFlowPyrLK(prevFrame, nextFrame, pointsToTrack,
                         trackPointsNextPosition, status, err, winSize, 3, termCrit, 0, 0.001);
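
The pointsToTrack used above have to be seeded somehow; one common option, shown here only as an illustration, is the Shi-Tomasi corner detector:

C++
std::vector<cv::Point2f> pointsToTrack;
std::vector<cv::Point2f> trackPointsNextPosition;
cv::goodFeaturesToTrack(prevFrame,     // single-channel (grayscale) frame
                        pointsToTrack,
                        500,           // maximum number of corners
                        0.01,          // quality level
                        10.0);         // minimum distance between corners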

The Lucas-Kanade method estimates the optical flow for a sparse set of feature points. OpenCV also provides the Farneback method, which computes dense optical flow, estimating the flow for all pixels in the frame.

C++
cv::calcOpticalFlowFarneback(firstFrame, secondFrame, resultFrame, 0.4f, 1, 12, 2, 8, 1.2f, 0);
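
The resulting flow is a two-channel matrix of (dx, dy) vectors per pixel; as a small illustration, its magnitude can be computed to highlight moving regions:

C++
cv::Mat flowParts[2], magnitude, angle;
cv::split(resultFrame, flowParts);                               // separate the x and y flow components
cv::cartToPolar(flowParts[0], flowParts[1], magnitude, angle);   // per-pixel motion magnitude and direction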

In addition, OpenCV has the following five algorithms for object tracking with automatic detection of the contours of moving objects: MIL, BOOSTING, MEDIANFLOW, TLD, and KCF. All of them are available through the cv::Tracker interface (provided by the opencv_contrib tracking module):

C++
cv::Ptr<cv::Tracker> tracker = cv::Tracker::create("KCF");   // in OpenCV 3.3+, use cv::TrackerKCF::create()
tracker->init(frame, trackObjectRect);

// for every frame
tracker->update(frame, trackObjectRect);

As a result, each object is boxed in a rectangle. The application then checks whether this rectangle crosses an abstract line; in this way, the number of passers-by is counted.
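
A simplified sketch of this counting logic (it is not part of OpenCV) is given below; lineY, prevObjectRect, and peopleCount are hypothetical variables of the surrounding application:

C++
// Count a crossing when the centre of the tracked rectangle moves from one
// side of the horizontal line at lineY to the other between two frames.
int prevCenterY = prevObjectRect.y + prevObjectRect.height / 2;
int currCenterY = trackObjectRect.y + trackObjectRect.height / 2;
if ((prevCenterY < lineY && currCenterY >= lineY) ||
    (prevCenterY >= lineY && currCenterY < lineY))
{
    ++peopleCount;
}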

Figure 4. System for Passers-by Counting in Operation

To filter out small objects and to handle objects moving close to each other, minimum and maximum object widths can be defined, as illustrated below.
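
An illustrative filter over the bounding rectangles collected earlier; minWidth and maxWidth are hypothetical limits that would be chosen experimentally for a particular camera position:

C++
std::vector<cv::Rect> filteredRects;
for (const cv::Rect& rect : boundingRects)
{
    if (rect.width >= minWidth && rect.width <= maxWidth)
    {
        filteredRects.push_back(rect);   // keep only plausibly person-sized objects
    }
}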

The method described above works best in good lighting and when there's considerable distance between the camera and the objects. Recognition can be further improved by defining the expected object width at a given point on the image plane, since a weakly recognized object can otherwise be perceived as multiple objects.

This algorithm can also be used for recognizing other moving objects such as cars.

Conclusion

In this article, we provided an OpenCV object detection example using two different approaches: machine learning and a background subtraction algorithm. Neither of these methods resolves the issue of how to trace objects that move into invisible sectors. A possible solution might be an algorithm that predicts the trajectory of movement based on the speed of an object.


To sum up, there are two main methods for counting the number of people in a video stream, each of which has its advantages and disadvantages. In this article, we’ve only provided basic information on these methods, as there’s no one-size-fits-all solution in available open source computer vision libraries.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

