
Preparing Images for AI Model Training

26 Jan 2021 | CPOL | 4 min read
In this article, we prepare face mask images for training our AI model.
Here I’ll show you how to collect, preprocess, and augment the data required for our model training.

Introduction

In the previous article of this series, we talked about the different approaches you can take to create a face mask detector. In this article, we’ll prepare a dataset for the mask detector solution.

The procedure of gathering images, preprocessing them, and augmenting the resulting dataset is essentially the same for any image dataset. We’ll take the long way through to cover real-life scenarios where data is scarce. I’ve obtained the images from two different sources, and I will show you how to standardize and augment them for future labeling.

Although there are several automated tools that make this process painless, we’ll do it the hard way to learn more.

We'll be using a Roboflow dataset that contains 149 images of people wearing face masks, all of them with black padding and the "same dimensions," and another set of images, obtained from a completely different source at Kaggle, that contains only human faces (without masks). With these two datasets representing two classes (faces in masks and faces without masks), let’s go through the steps to achieve a standardized and augmented dataset.

Roboflow Dataset Normalization

I’ll be using Kaggle notebooks to run the code in this article because they provide easy access to computing power, and they're pre-configured with all of the tools we'll need, so we won't have to install Python, TensorFlow, or anything else. They are not mandatory, though; you can achieve the same result by running a Jupyter Notebook locally if you prefer.

In this case, I manually downloaded the dataset, zipped and uploaded it to a Kaggle Notebook. To launch a Kaggle Notebook, go to https://kaggle.com, log in, go to Notebooks in the left panel, and click New notebook. Once it’s running, upload the zip file and run the following cells.

Basic libraries import:

Python
import os # to explore directories
import matplotlib.pyplot as plt #to plot images
#import matplotlib.image as mpimg
import cv2 #to make image transformations
from PIL import Image,ImageOps #for images handling

Let’s explore the images’ dimensions. We’ll read each image, get its shape, and get the unique dimensions in our dataset:

Python
#Image size exploration
shapes = []
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            shapes.append(cv2.imread(os.path.join(dirname, filename)).shape)
 
print('Unique shapes at imageset: ',set(shapes))

Here is where I got something I was not expecting to see. This is the output:

Python
Unique shapes at imageset:  {(415, 415, 3), (415, 416, 3), (416, 415, 3), (416, 416, 3)}
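
To see how many images share each shape, and not only which shapes exist, the collected tuples can be tallied. The following is a minimal sketch: count_shapes is a hypothetical helper name, and sample_shapes is made-up illustrative data, not the real per-image counts of the Roboflow dataset.

```python
from collections import Counter

def count_shapes(shapes):
    """Map each (height, width, channels) tuple to the number of images that have it."""
    return Counter(shapes)

# Illustrative data only: the real list would be the `shapes` list built by the loop above.
sample_shapes = [(416, 416, 3), (416, 416, 3), (415, 416, 3), (416, 415, 3), (415, 415, 3)]

shape_counts = count_shapes(sample_shapes)
for shape, n in shape_counts.most_common():
    print(shape, n)
```

If one shape clearly dominates, it is a natural candidate for the target size of the normalization step.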

As you may know, most models expect all input images to have the same dimensions. Let’s normalize them to a single size (415x415):

Python
def make_square(image, minimum_size=256, fill=(0, 0, 0)):
    # Pad the image with the fill color (black by default) so that it becomes square
    x, y = image.size
    size = max(minimum_size, x, y)
    new_image = Image.new('RGB', (size, size), fill)
    # Paste the original image in the center of the square canvas
    new_image.paste(image, ((size - x) // 2, (size - y) // 2))
    return new_image
 
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        # Process at most 150 images
        if filename.endswith('.jpg') and counter < 150:
            counter += 1
            new_image = Image.open(os.path.join(dirname, filename))
            new_image = make_square(new_image)
            new_image = new_image.resize((415, 415))
            new_image.save("/kaggle/working/"+str(counter)+"-roboflow.jpg")

The convenient directory to save files in Kaggle and get them as output is /kaggle/working.
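
The padding step in make_square comes down to a little arithmetic: choose a square canvas at least as large as the longer side, then center the image on it. The sketch below isolates that arithmetic so it can be checked without loading any image; square_padding is a hypothetical helper name, not part of PIL.

```python
def square_padding(width, height, minimum_size=256):
    """Return (canvas_size, left_offset, top_offset) for centering an image on a square canvas."""
    size = max(minimum_size, width, height)
    left = (size - width) // 2   # horizontal offset of the pasted image
    top = (size - height) // 2   # vertical offset of the pasted image
    return size, left, top

print(square_padding(415, 416))  # (416, 0, 0)
print(square_padding(640, 480))  # (640, 0, 80)
```

In the second call, the 640x480 image receives 80 pixels of padding above and below, so its aspect ratio survives the later resize to 415x415.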

Before downloading the normalized dataset, run this cell to zip all the images. That way, the final archive will be easier to find:

Python
!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf  /kaggle/working/*.jpg

Now you can look for the output.zip file in the directory explorer on the right-hand side.
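
If you are working outside a notebook, where the ! shell commands are not available, the same step can be done with the Python standard library. This is a sketch under assumptions: zip_and_remove_jpgs is a hypothetical helper, and unlike the shell command above it stores the files under their bare names instead of the full /kaggle/working/ path.

```python
import os
import zipfile

def zip_and_remove_jpgs(folder, archive_name="output.zip"):
    """Pack every .jpg in `folder` into one zip archive, then delete the originals."""
    archive_path = os.path.join(folder, archive_name)
    jpg_files = sorted(f for f in os.listdir(folder) if f.endswith(".jpg"))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in jpg_files:
            # arcname keeps only the file name inside the archive
            archive.write(os.path.join(folder, name), arcname=name)
    # Delete the images only after the archive has been written and closed
    for name in jpg_files:
        os.remove(os.path.join(folder, name))
    return archive_path

# On Kaggle the call would be: zip_and_remove_jpgs("/kaggle/working")
```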

Normalization of the Human Face Dataset

The approach to this task is slightly different from the one we chose for the Roboflow dataset above. This time, the dataset contains 4,000+ images, all of them of completely different dimensions. Go to the dataset link and launch a Jupyter Notebook from there. We’ll select the first 150 images.

Basic imports:

Python
import os # to explore directories
import matplotlib.pyplot as plt #to plot images
import cv2 #to make image transformations
from PIL import Image #for images handling

If you want to explore the dataset:

Python
#How many images do we have?
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1
print('Images in directory: ',counter)
 
#Let's explore an image
%matplotlib inline
plt.figure()
image = cv2.imread('/kaggle/input/human-faces/Humans/1 (719).jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(image)
plt.show()
 
 
#Image size exploration
shapes = []
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            shapes.append(cv2.imread(os.path.join(dirname, filename)).shape)
 
print('Unique shapes at imageset: ',set(shapes))

This last cell returns a huge variety of dimensions, so the normalization is imperative. Let’s resize all images to (415x415), black-padded:

Python
def make_square(image, minimum_size=256, fill=(0, 0, 0, 0)):
    # Pad the image with the fill color so that it becomes square
    x, y = image.size
    size = max(minimum_size, x, y)
    new_image = Image.new('RGBA', (size, size), fill)
    # Paste the original image in the center of the square canvas
    new_image.paste(image, ((size - x) // 2, (size - y) // 2))
    return new_image
 
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        # Process only the first 150 images
        if filename.endswith('.jpg') and counter < 150:
            counter += 1
            test_image = Image.open(os.path.join(dirname, filename))
            new_image = make_square(test_image)
            # JPEG has no alpha channel, so convert back to RGB (the padding becomes black)
            new_image = new_image.convert("RGB")
            new_image = new_image.resize((415, 415))
            new_image.save("/kaggle/working/"+str(counter)+"-kaggle.jpg")

To download the dataset:

Python
!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf  /kaggle/working/*.jpg

Now you will find it easily in the right-hand panel.

Dataset Augmentation

Once you have both datasets normalized, it’s time to join the data and augment the resulting set. Data augmentation gives us a way to artificially generate more training data from a relatively small dataset. Augmentation is often necessary because deep learning models typically need a large amount of data to achieve good results during training.

Unzip both files on your computer, place all images in the same folder, zip them, launch a new Kaggle Notebook (mine is here), and upload the resulting file.
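
If you would rather script the merge than do it by hand, the sketch below extracts the images from both archives into one flat folder. It is only an illustration: merge_archives is a hypothetical helper, and it assumes you renamed the two downloads (both are called output.zip) to something distinct, such as roboflow.zip and kaggle.zip. File names do not collide because the earlier cells added the -roboflow and -kaggle suffixes.

```python
import os
import zipfile

def merge_archives(archive_paths, target_folder):
    """Extract every .jpg from the given zip files into one flat folder."""
    os.makedirs(target_folder, exist_ok=True)
    extracted = []
    for archive_path in archive_paths:
        with zipfile.ZipFile(archive_path) as archive:
            for member in archive.namelist():
                if not member.endswith(".jpg"):
                    continue
                # Drop any directory prefix stored in the archive (e.g. kaggle/working/)
                flat_name = os.path.basename(member)
                target_path = os.path.join(target_folder, flat_name)
                with archive.open(member) as src, open(target_path, "wb") as dst:
                    dst.write(src.read())
                extracted.append(flat_name)
    return extracted

# Hypothetical usage: merge_archives(["roboflow.zip", "kaggle.zip"], "merged")
```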

Next, let’s see what you have to do to augment the data. We could cut some corners using automated services, but we’ve decided to do everything ourselves in order to learn more.

Basic imports:

Python
import numpy as np
from numpy import expand_dims
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import cv2
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from PIL import Image

Let’s go straight to the augmentation. We’ll use the ImageDataGenerator class from Keras, which is widely used in the computer vision community:

Python
def data_augmentation(filename):
    
    """
    Performs data augmentation on a single image: creates zoomed,
    brightness-adjusted, and rotated versions, 5 for every modification type.
    In total, this creates 15 extra images for every image in the original dataset.
    """
    
    image_data = []
    #reading the image
    image = cv2.imread(filename,3)
    #image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    #expanding the image dimension to one sample
    samples = expand_dims(image, 0)
    # creating the image data augmentation generators
    datagen1 = ImageDataGenerator(zoom_range=[0.5,1.2])
    datagen2 = ImageDataGenerator(brightness_range=[0.2,1.0])
    datagen3 = ImageDataGenerator(rotation_range=20)
      
    # preparing iterators
    it1 = datagen1.flow(samples, batch_size=1)
    it2 = datagen2.flow(samples, batch_size=1)
    it3 = datagen3.flow(samples, batch_size=1)
    image_data.append(image)
    for i in range(5):
        # generating batch of images
        # the built-in next() works with any Python iterator
        batch1 = next(it1)
        batch2 = next(it2)
        batch3 = next(it3)
        # convert to unsigned integers
        image1 = batch1[0].astype('uint8')
        image2 = batch2[0].astype('uint8')
        image3 = batch3[0].astype('uint8')
        #appending to the list of images
        image_data.append(image1)
        image_data.append(image2)
        image_data.append(image3)
        
    return image_data
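
To get a feel for what the three generators do on each call, it helps to look at the random parameters they draw. The sketch below is a simplified approximation written with the standard library, not Keras code: it assumes each setting is drawn uniformly from its range and that rotation_range=20 means an angle between -20 and +20 degrees. Keras draws separate zoom factors for width and height, which this sketch collapses into one. The function name is hypothetical.

```python
import random

def sample_augmentation_parameters(rng):
    """Draw one random setting per transformation from the ranges used above."""
    return {
        "zoom": rng.uniform(0.5, 1.2),        # zoom_range=[0.5, 1.2]
        "brightness": rng.uniform(0.2, 1.0),  # brightness_range=[0.2, 1.0]
        "rotation": rng.uniform(-20, 20),     # rotation_range=20, in degrees
    }

rng = random.Random(0)  # seeded only to make the demonstration repeatable
samples = [sample_augmentation_parameters(rng) for _ in range(5)]
for params in samples:
    print(params)
```

Because the brightness factor never exceeds 1.0, this particular configuration only produces images that are as bright as the original or darker.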

To implement it, let’s iterate over every image in the /kaggle/input directory and save all results in /kaggle/working for future download:

Python
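counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        # skip anything that is not an image
        if not filename.endswith('.jpg'):
            continue
        print(os.path.join(dirname, filename))
        result = data_augmentation(os.path.join(dirname, filename))
        # result holds the original image plus its 15 augmented versions
        for augmented in result:
            counter += 1
            cv2.imwrite('/kaggle/working/'+str(counter)+'.jpg', augmented)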

Again, before the download, just run the next two lines to make the files easier to find in the right panel:

Python
!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf  /kaggle/working/*.jpg

Now you can download the output.zip file.
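
As a quick sanity check before labeling, you can work out how many files the archive should contain. The sketch below assumes that all 149 Roboflow images and the first 150 Kaggle images were normalized successfully and that nothing else was uploaded; expected_dataset_size is a hypothetical helper.

```python
def expected_dataset_size(original_images, augmented_per_image=15):
    """Each original image is kept and joined by its augmented copies."""
    return original_images * (1 + augmented_per_image)

roboflow_images = 149  # images in the Roboflow dataset
kaggle_images = 150    # first 150 images taken from the Kaggle dataset
print(expected_dataset_size(roboflow_images + kaggle_images))  # 4784
```

If the number of images in output.zip differs from this, some images were probably skipped or overwritten along the way.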

Next Step

In the next article, we’ll see how to properly label the resulting images in order to train a YOLO model. Stay tuned!

This article is part of the series 'AI on the Edge: Face Mask Detection'.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States
Sergio Virahonda grew up in Venezuela, where he obtained a bachelor's degree in Telecommunications Engineering. He moved abroad 4 years ago and since then has been focused on building a meaningful data science career. He's currently living in Argentina, writing code as a freelance developer.
