Here we give brief instructions on how to use a pretrained MobileNet SSD model to detect objects. We then provide and explain Python code for detecting animals in video using the SSD model. Finally, we demonstrate the detection results on a video file for one animal type that the SSD model is able to detect.
Unruly wildlife can be a pain for businesses and homeowners alike. Animals like deer, moose, and even cats can cause damage to gardens, crops, and property.
In this article series, we’ll demonstrate how to detect pests (such as a moose) in real time (or near-real time) on a Raspberry Pi and then take action to get rid of the pest. Since we don’t want to cause any harm, we’ll focus on scaring the pest away by playing a loud noise.
You are welcome to download the source code of the project. We are assuming that you are familiar with Python and have a basic understanding of how neural networks work.
In the previous article in the series, we compared two DNN types we can use to detect pests: detectors and classifiers. The detectors won. In this article, we’ll develop Python code for detecting pests using a pre-trained detection DNN.
Selecting Network Architecture
There are several common network architectures for object detection, such as Faster-RCNN, Single-Shot Detector (SSD), and You Only Look Once (YOLO).
Since our network needs to run on an edge device that has limited memory and CPU, we’re going to use the MobileNet Single Shot Detector (SSD) architecture. MobileNet SSD is a lightweight object detector network that performs well on mobile and edge devices. It was trained on the Pascal VOC 2012 dataset, which contains some classes that may represent pests, such as cat, cow, dog, horse, and sheep.
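As an illustration, the class list can be written as a simple lookup table. The sketch below assumes the label order commonly distributed with the Caffe MobileNet SSD model trained on Pascal VOC, where index 0 is the background class. The names VOC_CLASSES, PEST_NAMES, and pest_class_ids are hypothetical and not part of the project's code, so verify the order against the label map that ships with your model files.

```python
# Label order commonly used with the Pascal VOC MobileNet SSD Caffe model.
# Assumption: your model uses the same order; check its label map to be sure.
VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse",
    "motorbike", "person", "pottedplant", "sheep", "sofa", "train",
    "tvmonitor",
]

# Classes from the list above that may represent pests (taken from the text).
PEST_NAMES = ["cat", "cow", "dog", "horse", "sheep"]


def pest_class_ids(names=PEST_NAMES, classes=VOC_CLASSES):
    """Map each pest name to its numeric class ID in the model output."""
    return {name: classes.index(name) for name in names}


if __name__ == "__main__":
    for name, class_id in pest_class_ids().items():
        print(f"{name}: {class_id}")
```

The numeric ID is what the detector reports for each object, so this table is a convenient place to look up the value to filter on.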
We’ll use the same algorithm for pest detection on video as the algorithm used for human detection in this prior article series.
Code for Pest Detection
First, we need to modify the MobileNet code to make it detect pests.
Let’s start by creating some utility classes to make this task easier:
import cv2
import numpy as np

class CaffeModelLoader:
    @staticmethod
    def load(proto, model):
        # Load a Caffe network from its prototxt (architecture) and caffemodel (weights) files
        net = cv2.dnn.readNetFromCaffe(proto, model)
        return net

class FrameProcessor:
    def __init__(self, size, scale, mean):
        self.size = size
        self.scale = scale
        self.mean = mean

    def get_blob(self, frame):
        img = frame
        (h, w, c) = frame.shape
        # Crop wide frames to a centered square so that resizing does not distort them
        if w > h:
            dx = int((w-h)/2)
            img = frame[0:h, dx:dx+h]
        # The interpolation flag must be passed by keyword; the third positional argument is dst
        resized = cv2.resize(img, (self.size, self.size), interpolation=cv2.INTER_AREA)
        blob = cv2.dnn.blobFromImage(resized, self.scale, (self.size, self.size), self.mean, False, False)
        return blob

class Utils:
    @staticmethod
    def draw_object(obj, label, color, frame):
        (confidence, (x1, y1, w, h)) = obj
        x2 = x1+w
        y2 = y1+h
        cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
        # Place the label just above the rectangle
        y3 = y1-12
        text = label + " " + str(confidence)+"%"
        cv2.putText(frame, text, (x1, y3), cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 1, cv2.LINE_AA)

    @staticmethod
    def draw_objects(objects, label, color, frame):
        for obj in objects:
            Utils.draw_object(obj, label, color, frame)
Most of the methods our utility classes use come from the Python version of the OpenCV library. Let’s look at the classes in detail.
The CaffeModelLoader class loads a Caffe model from disk using the provided paths for the prototype and model files.
The next utility class, FrameProcessor, converts frames to blobs (specially structured data used as CNN input).
Finally, the Utils class draws bounding rectangles around any objects detected in a frame.
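To make the preprocessing concrete, here is a minimal pure-Python sketch of the two steps FrameProcessor performs: cropping a wide frame to a centered square, and normalizing pixel values as scale * (pixel - mean), which is how cv2.dnn.blobFromImage applies its scale factor and mean arguments. The helper names center_crop_bounds and normalize_pixel are hypothetical; the scale and mean defaults are the values used later in this article.

```python
def center_crop_bounds(width, height):
    """Return (x_start, x_end) of the centered square crop for a wide frame.

    Frames that are not wider than they are tall are left uncropped,
    mirroring the behavior of get_blob.
    """
    if width > height:
        dx = int((width - height) / 2)
        return dx, dx + height
    return 0, width


def normalize_pixel(value, scale=1.0 / 127.5, mean=127.5):
    """Subtract the mean, then multiply by the scale factor."""
    return (value - mean) * scale


if __name__ == "__main__":
    # Columns kept from a 1280x720 frame
    print(center_crop_bounds(1280, 720))
    # Darkest and brightest pixel values after normalization
    print(normalize_pixel(0), normalize_pixel(255))
```

With these values, pixel intensities in the 0 to 255 range map to roughly -1 to 1.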
That’s it for our utility classes. Next, we’ll write code that actually detects pests.
We’ll start with the SSD class, which detects objects of a specified class in a frame:
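class SSD:
    def __init__(self, frame_proc, ssd_net):
        self.proc = frame_proc
        self.net = ssd_net

    def detect(self, frame):
        blob = self.proc.get_blob(frame)
        self.net.setInput(blob)
        detections = self.net.forward()
        # Output shape is (1, 1, N, 7); N is the number of detected objects
        k = detections.shape[2]
        obj_data = []
        for i in np.arange(0, k):
            obj = detections[0, 0, i, :]
            obj_data.append(obj)
        return obj_data

    def get_object(self, frame, data):
        # data = [image_id, class_id, confidence, x1, y1, x2, y2], coordinates in 0..1
        confidence = int(data[2]*100.0)
        (h, w, c) = frame.shape
        # Coordinates are relative to the square crop, so scale both axes by h
        r_x = int(data[3]*h)
        r_y = int(data[4]*h)
        r_w = int((data[5]-data[3])*h)
        r_h = int((data[6]-data[4])*h)
        # Shift x back by the crop offset applied in FrameProcessor.get_blob
        if w > h:
            dx = int((w-h)/2)
            r_x = r_x+dx
        obj_rect = (r_x, r_y, r_w, r_h)
        return (confidence, obj_rect)

    def get_objects(self, frame, obj_data, class_num, min_confidence):
        objects = []
        for data in obj_data:
            obj_class = int(data[1])
            obj_confidence = data[2]
            if obj_class == class_num and obj_confidence >= min_confidence:
                obj = self.get_object(frame, data)
                objects.append(obj)
        return objects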
The key methods in the class are detect and get_objects.
The detect method applies the loaded DNN model to a frame to detect objects of all possible classes.
The get_objects method looks at the detected objects and selects only those that both belong to the specified class and have a high probability of being correctly detected (confidence).
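The coordinate conversion and filtering described here can be illustrated without OpenCV. This sketch assumes each detection row has the layout [image_id, class_id, confidence, x1, y1, x2, y2] with box corners normalized to the 0 to 1 range, which is the layout OpenCV's DNN module returns for SSD detection outputs. The function names and the sample rows are made up for illustration.

```python
def row_to_rect(row, frame_width, frame_height):
    """Convert one detection row to (confidence_percent, (x, y, w, h)) in pixels.

    Box coordinates are relative to the centered square crop, so both axes
    are scaled by the frame height and x is shifted by the crop offset.
    """
    confidence = int(row[2] * 100.0)
    side = frame_height
    x = int(row[3] * side)
    y = int(row[4] * side)
    w = int((row[5] - row[3]) * side)
    h = int((row[6] - row[4]) * side)
    if frame_width > frame_height:
        x += int((frame_width - frame_height) / 2)
    return confidence, (x, y, w, h)


def filter_rows(rows, class_num, min_confidence):
    """Keep rows of the requested class whose confidence meets the threshold."""
    return [r for r in rows if int(r[1]) == class_num and r[2] >= min_confidence]


if __name__ == "__main__":
    rows = [
        [0, 12, 0.875, 0.25, 0.125, 0.75, 0.5],  # hypothetical dog detection
        [0, 12, 0.125, 0.0, 0.0, 0.5, 0.5],      # same class, low confidence
        [0, 8, 0.75, 0.5, 0.5, 1.0, 1.0],        # hypothetical cat detection
    ]
    for row in filter_rows(rows, class_num=12, min_confidence=0.2):
        print(row_to_rect(row, frame_width=1280, frame_height=720))
```

Only the first sample row survives the filter: the second is below the confidence threshold and the third belongs to a different class.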
Then we’ll write the VideoSSD class, which runs pest detection on an entire video clip:
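class VideoSSD:
    def __init__(self, ssd):
        self.ssd = ssd

    def detect(self, video, class_num, min_confidence, class_name):
        detection_num = 0
        capture = cv2.VideoCapture(video)
        dname = 'Pest detections'
        # A resizable window must exist before resizeWindow is called
        cv2.namedWindow(dname, cv2.WINDOW_NORMAL)
        cv2.resizeWindow(dname, 1280, 960)
        # Process frames until the video ends or the user presses 'q'
        while True:
            (ret, frame) = capture.read()
            if frame is None:
                break
            obj_data = self.ssd.detect(frame)
            class_objects = self.ssd.get_objects(frame, obj_data, class_num, min_confidence)
            p_count = len(class_objects)
            detection_num += p_count
            Utils.draw_objects(class_objects, class_name, (0, 0, 255), frame)
            cv2.imshow(dname, frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        capture.release()
        cv2.destroyAllWindows()
        # Total number of detections across all processed frames
        return detection_num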
The only method in the class is detect. It processes all the frames extracted from a video file. In each frame, it detects all objects of the class specified by the class_num parameter and then displays the frame with bounding rectangles around the objects it detected.
Does It Work?
Let’s launch our code and see how it handles a video file. The following code loads a video file and tries to detect dogs:
proto_file = r"C:\PI_PEST\net\mobilenet.prototxt"
model_file = r"C:\PI_PEST\net\mobilenet.caffemodel"
ssd_net = CaffeModelLoader.load(proto_file, model_file)
mobile_proc_frame_size = 300
ssd_proc = FrameProcessor(mobile_proc_frame_size, 1.0/127.5, 127.5)
pest_class = 12
pest_name = "DOG"
ssd = SSD(ssd_proc, ssd_net)
video_file = r"C:\PI_PEST\video\dog_1.mp4"
video_ssd = VideoSSD(ssd)
detections = video_ssd.detect(video_file, pest_class, 0.2, pest_name)
We set the value of pest_class to 12 because "dog" is the 12th class in the MobileNet SSD model. Here is the video captured while running the above code.
Will It Work on an Edge Device?
As you can see, our SSD detector successfully detected dogs in the video when run on a PC. What about an edge device? Will the detector process the feed fast enough to detect objects in real-time? We can find out by testing the frame rate, measured in frames per second (FPS).
In the article series we referenced earlier, the model we borrowed ran at about 1.25 FPS on a Raspberry Pi 3 device. Is that enough to detect pests? We can assume that, on average, an animal would be captured on camera for at least 2 to 3 seconds. At 1.25 FPS, that gives us roughly 2 to 3 frames in which to detect a pest and react to it. Sounds like decent odds.
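The frame budget is simple arithmetic: the number of frames available equals the frame rate multiplied by the time the animal stays in view. The sketch below works through it and also shows one way to measure the frame rate of any per-frame processing function. Both helpers, frames_available and measure_fps, are hypothetical names, and the dummy workload stands in for a real call such as ssd.detect(frame).

```python
import time


def frames_available(fps, seconds_visible):
    """Number of whole frames processed while the animal is in view."""
    return int(fps * seconds_visible)


def measure_fps(process_frame, frames):
    """Run process_frame on every frame and return frames per second."""
    start = time.perf_counter()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    for seconds in (2, 3):
        print(seconds, "s visible:", frames_available(1.25, seconds), "frames")
    # Dummy workload standing in for a real detector call
    fps = measure_fps(lambda frame: sum(range(1000)), range(100))
    print(f"measured: {fps:.1f} FPS")
```

On the target device, passing the real detection call and a list of captured frames to measure_fps would give a number to compare against the 1.25 FPS figure.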
So far, the results aren’t very promising for wildlife detection... But let’s not give up!
In the next article, we’ll talk about some ideas for detecting "exotic" pests, such as moose and armadillos.