Training a Custom Model with OpenCV and ImageAI

Glenn Prince

Jun 10, 2020

CPOL

In this article, we begin the process of creating a custom object detection model.

Download source - 6.5 KB

In this series, we’ll learn how to use Python, OpenCV (an open source computer vision library), and ImageAI (a deep learning library for vision) to train AI to detect whether workers are wearing hardhats. In the process, we’ll create an end-to-end solution you can use in real life—this isn’t just an academic exercise!

This is an important use case because many companies must ensure workers have the proper safety equipment. But what we’ll learn is useful beyond just detecting hardhats. By the end of the series, you’ll be able to use AI to detect nearly any kind of object in an image or video stream.

You’re currently on article 5 of 6:

Installing OpenCV and ImageAI for Object Detection
Finding Training Data for OpenCV and ImageAI Object Detection
Using Pre-trained Models to Detect Objects With OpenCV and ImageAI
Preparing Images for Object Detection With OpenCV and ImageAI
Training a Custom Model With OpenCV and ImageAI
Detecting Custom Model Objects with OpenCV and ImageAI

In the previous article, we cleaned our data and separated it into training and validation datasets.

Now we can begin the process of creating a custom object detection model. The general steps for training a custom detection model are:

Train the model
Validate the model; if validation is poor, tweak and retrain
Visually test the model and the results
Deploy the model

Training Our Model

Let's jump right into training our model. Create a new code block and enter the following:

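from imageai.Detection.Custom import DetectionModelTrainer

# Create the trainer and select YOLOv3 as the model type
trainer = DetectionModelTrainer()
trainer.setModelTypeAsYOLOv3()

# Folder that contains the "train" and "validation" datasets
trainer.setDataDirectory(data_directory="hardhat")

# Configure the labels, batch size, number of training cycles, and pretrained weights
trainer.setTrainConfig(object_names_array=["person hardhat"], batch_size=4, num_experiments=20,
                       train_from_pretrained_model="yolo.h5")

# Start training
trainer.trainModel()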

This code block uses a new class from ImageAI's custom detection module, DetectionModelTrainer. The process for training any model is:

Create a new DetectionModelTrainer instance.
Set the model type as YOLOv3. At the time of writing, YOLOv3 is the only model type that can be trained using ImageAI.
Set the directory that contains your data. Note that this folder must contain one folder named "train" and one named "validation." Each of these must contain one folder called "images" and another called "annotations."
Set the trainer’s configuration as follows:
- Specify the names of the annotations used within the images. In our case, we use only "person hardhat."
- Define the batch size, four in our case. This determines how many images the model processes in each training step. Larger batches can make training more stable, but they need more GPU memory.
- Specify how many training cycles (epochs) to run with num_experiments. More cycles generally give a better result, up to a point, but take longer.
- Optionally, specify a pretrained model to transfer learning from, which gives a better result more quickly.
Start the model training process with trainModel().
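Because the trainer expects a specific folder layout, it can be worth checking the data directory before starting a long training run. The following is a minimal sketch using only the Python standard library; the helper name missing_dataset_dirs is hypothetical and is not part of ImageAI.

```python
from pathlib import Path


def missing_dataset_dirs(data_directory):
    """Return the expected sub-directories that do not exist under data_directory.

    The expected layout is the one described above: "train" and "validation",
    each containing "images" and "annotations".
    """
    root = Path(data_directory)
    expected = [
        root / split / kind
        for split in ("train", "validation")
        for kind in ("images", "annotations")
    ]
    return [str(path) for path in expected if not path.is_dir()]


if __name__ == "__main__":
    # "hardhat" is the data directory used in this article
    missing = missing_dataset_dirs("hardhat")
    if missing:
        print("Missing directories:", missing)
    else:
        print("Dataset layout looks complete.")
```

An empty list means all four folders are present. The check only looks at folder names, not at whether the images and annotations inside them match.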

The model will begin training and output a status for each cycle (or epoch). For each cycle, a loss value is reported, and this is used to decide whether the model has improved on the previous cycle. If it has, that model is saved, so make sure you have plenty of free disk space!

Validating the Model

Training a model can take a very long time. This model, with 20 iterations, took a bit over four hours to train. Some recommendations for training models suggest more than 200 hours. Before we leave our computer for several days to train a model, let's see what’s created after our model has been trained over 20 iterations.

In your "hardhat" directory, you’ll see that some extra directories have been created: "cache," "json," "logs," and "models." The two important directories here are "json" and "models." The "json" directory contains the JSON configuration file required to use the model. The "models" directory contains a number of rather large model files with incremental numbers. Each of these files is the result of an iteration of model training that was better than the last.
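Since the cycle number and loss are encoded in each file name, the saved models can be listed and compared without opening them. This is a small sketch that assumes file names follow the pattern seen in this article (for example, detection_model-ex-020--loss-0008.462.h5); the pattern is inferred from those names, and the helper functions are hypothetical rather than part of ImageAI.

```python
import re
from pathlib import Path

# Pattern inferred from file names such as
# "detection_model-ex-020--loss-0008.462.h5" (an assumption, not an ImageAI guarantee)
CHECKPOINT_PATTERN = re.compile(r"detection_model-ex-(\d+)--loss-(\d+\.\d+)\.h5$")


def parse_checkpoint(filename):
    """Return (cycle_number, loss) parsed from a model file name, or None if it does not match."""
    match = CHECKPOINT_PATTERN.search(str(filename))
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def lowest_loss_checkpoint(models_directory):
    """Return the Path of the saved model with the lowest loss, or None if none are found."""
    candidates = []
    for path in Path(models_directory).glob("*.h5"):
        parsed = parse_checkpoint(path.name)
        if parsed is not None:
            cycle, loss = parsed
            candidates.append((loss, cycle, path))
    if not candidates:
        return None
    # Compare on loss first, then on cycle number to break ties
    return min(candidates, key=lambda item: (item[0], item[1]))[2]
```

Calling lowest_loss_checkpoint("hardhat/models") would return the path of the model file with the smallest reported loss. A low training loss does not guarantee the best validation score, so the models still need to be validated.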

So we have a number of models that are theoretically better and better, depending on the cycle number. Let’s test these out by validating them. Start a new code block and enter the following:

trainer.evaluateModel(model_path="hardhat/models/detection_model-ex-020--loss-0008.462.h5",
                      json_path="hardhat/json/detection_config.json", iou_threshold=0.5,
                      object_threshold=0.3, nms_threshold=0.5)

The only change you’ll need to make is the model path, hardhat/models/detection_model-ex-020--loss-0008.462.h5, as the file names produced by each training run will be different. The paths use forward slashes, which Python accepts on Windows as well; backslashes inside an ordinary string literal can be read as escape sequences. This method takes the following parameters:

model_path – specifies the model you wish to run a validation against
json_path – specifies the configuration file generated by the model training
iou_threshold – the minimum intersection over union (IoU) between a predicted and an actual bounding box for the prediction to count as correct
object_threshold – the minimum confidence score a detection needs in order to be kept
nms_threshold – the threshold used for non-maximum suppression, which removes duplicate boxes when multiple overlapping bounding boxes are detected for the same object
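To make the IoU idea concrete, here is a minimal sketch of how intersection over union is commonly computed for two axis-aligned boxes. It illustrates the standard definition only; it is not ImageAI's internal code, and the (xmin, ymin, xmax, ymax) box format is an assumption chosen for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes do not overlap)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    # Union = sum of both areas minus the region counted twice
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0
```

With an IoU threshold of 0.5, a predicted box that overlaps the labelled box with an IoU of 0.6 would count as a match, while one with an IoU of 0.33 would not.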

When we run this validation for a 20-iteration model, we get an average precision of 0.84464, or roughly 84%, which isn’t bad. But how does that compare against some of the others? Let's expand our code block to the following:

model05 = trainer.evaluateModel(model_path="hardhat/models/detection_model-ex-005--loss-0014.238.h5",
                      json_path="hardhat/json/detection_config.json", iou_threshold=0.5,
                      object_threshold=0.3, nms_threshold=0.5)
model10 = trainer.evaluateModel(model_path="hardhat/models/detection_model-ex-010--loss-0011.053.h5",
                      json_path="hardhat/json/detection_config.json", iou_threshold=0.5,
                      object_threshold=0.3, nms_threshold=0.5)
model15 = trainer.evaluateModel(model_path="hardhat/models/detection_model-ex-015--loss-0009.620.h5",
                      json_path="hardhat/json/detection_config.json", iou_threshold=0.5,
                      object_threshold=0.3, nms_threshold=0.5)
model20 = trainer.evaluateModel(model_path="hardhat/models/detection_model-ex-020--loss-0008.462.h5",
                      json_path="hardhat/json/detection_config.json", iou_threshold=0.5,
                      object_threshold=0.3, nms_threshold=0.5)

print('---------------------------------------------------------')
print('Iteration 05:', model05[0]['average_precision']['person hardhat'])
print('Iteration 10:', model10[0]['average_precision']['person hardhat'])
print('Iteration 15:', model15[0]['average_precision']['person hardhat'])
print('Iteration 20:', model20[0]['average_precision']['person hardhat'])
print('---------------------------------------------------------')
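The four calls above differ only in the model path, so the same comparison could be written as a loop. This is one possible sketch: evaluate_checkpoints is a hypothetical helper, and it assumes evaluateModel returns the structure indexed in the print statements above (a list whose first item holds an 'average_precision' dictionary keyed by label).

```python
def evaluate_checkpoints(trainer, model_paths, json_path, label="person hardhat"):
    """Evaluate each model file and return {model_path: average precision for label}.

    `trainer` is expected to be the DetectionModelTrainer created earlier.
    """
    scores = {}
    for model_path in model_paths:
        results = trainer.evaluateModel(
            model_path=model_path,
            json_path=json_path,
            iou_threshold=0.5,
            object_threshold=0.3,
            nms_threshold=0.5,
        )
        # Same lookup as in the print statements above
        scores[model_path] = results[0]["average_precision"][label]
    return scores


# Example usage with the trainer from this article (left commented out because
# it needs the trained model files on disk):
# scores = evaluate_checkpoints(
#     trainer,
#     ["hardhat/models/detection_model-ex-005--loss-0014.238.h5",
#      "hardhat/models/detection_model-ex-020--loss-0008.462.h5"],
#     "hardhat/json/detection_config.json",
# )
# for path, score in scores.items():
#     print(path, score)
```

Passing the trainer in as an argument keeps the helper independent of any global state, which also makes it easy to try with a different label or set of model files.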

This code block will take some time to run, as it needs to load four different models, validate each one, and collect the results, so if you do run it, walk away and come back in a little while. When it eventually finishes, the last few lines of output will give you the results:

5 Iterations – 71%
10 Iterations – 78%
15 Iterations – 83%
20 Iterations – 84%

So, from these results, the more we iterate, the better our model gets. However, at some point, there are diminishing returns, so you need to take that into account when training your model as well.
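The diminishing returns are easy to see by computing the gain between consecutive checkpoints. The short sketch below uses the rounded percentages reported above; the function name is hypothetical.

```python
# Rounded average precision (in percent) reported above, keyed by iteration count
reported = {5: 71, 10: 78, 15: 83, 20: 84}


def gains_between_checkpoints(scores):
    """Return {(previous, current): improvement} for consecutive checkpoints."""
    iterations = sorted(scores)
    return {
        (prev, cur): scores[cur] - scores[prev]
        for prev, cur in zip(iterations, iterations[1:])
    }


gains = gains_between_checkpoints(reported)
for (start, end), gain in gains.items():
    print(f"Iterations {start} -> {end}: +{gain} percentage points")
```

The gains shrink from 7 to 5 to 1 percentage points, which is the pattern to watch for when deciding whether further training is worth the time.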

Up Next

In this article, we learned how to train a custom image detection model using a prepared dataset.

Next, we’ll look at how to use this model to detect if people are wearing hardhats.