Click here to Skip to main content
15,749,090 members
Articles / Artificial Intelligence / Tensorflow
Posted 27 Apr 2021


2 bookmarked

Running AI Models in Docker Containers

Rate me:
Please Sign up or sign in to vote.
5.00/5 (2 votes)
27 Apr 2021CPOL3 min read
In this article, we’ll create a container to run a CPU inference on the trained model.
Here, we create a container to run predictions on a trained model. We run a ML inference using a Docker container.


Container technologies, such as Docker, significantly simplify dependency management and portability of your software. In this series of articles, we explore Docker usage in Machine Learning (ML) scenarios.

This series assumes that you are familiar with ML, containerization in general, and Docker in particular. You are welcome to download the project code.

In the previous article, we created a basic container for experimentation and training. In this one, we’ll create a container to run predictions on a trained model.

Why a Separate Container for Inference?

In theory, we could use our previous experimentation container for production inference. The recommended approach though is to keep the image as simple and small as possible. This means getting rid of any components not required for the inference, such as redundant packages, Jupyter Notebook, or build tools (if we had any included).


While our experimentation image in the previous article contained multiple ML libraries, we have trained our model using TensorFlow. Therefore, we can trim our requirements.txt as follows:


NumPy is added here to only pin it to a specific version (a different version could be installed as TensorFlow dependency). Unfortunately, too often an update to a Python library introduces breaking changes or bugs. Adding explicit requirements with fixed versions – even for indirect dependencies – helps avoid headaches when recreating an environment.


Most of our Dockerfile is the same as it was in the previous article:

FROM python:3.8.8-slim-buster
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
&& apt-get -y install --no-install-recommends ffmpeg libsm6 libxext6 \
&& apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/*
RUN useradd --system --create-home --shell /bin/bash --uid $USERID $USERNAME

COPY requirements.txt /tmp/requirements.txt
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt \
&& rm /tmp/requirements.txt

Because we still use opencv-python, we need to keep the apt-get statements to install system components.

The last two lines are the only ones that are different from the previous version:

COPY app /home/$USERNAME/app
ENTRYPOINT ["python", ""]

First, we copy our application code to the image. The copied app folder contains both the inference code and the model we have trained in the previous article. Such a COPY statement is usually one of the last statements in the Dockerfile, which ensures that only a few top layers need to be recreated when the application code changes.

Next, we define ENTRYPOINT for our application. It is sort of a prefix for any command that will be executed when the container starts. Any CMD statement added using Dockerfile or the docker run command will be added as arguments to the ENTRYPOINT statement.

Building and Running Container

After downloading our application code and data, you can build the image as mld03_cpu_predict (don’t forget the dot (.) at the end):

$ docker build --build-arg USERID=$(id -u) -t mld03_cpu_predict .

On Windows, you can skip the --build-arg USERID part:

$ docker build -t mld03_cpu_predict .

After the image is built, we run the inference:

docker run -v $(pwd)/data:/home/mluser/data --rm --user $(id -u):$(id -g) mld03_cpu_predict 
--images_path /home/mluser/data/test_mnist_images/*.jpg

On Windows:

docker run -v $(pwd)/data:/home/mluser/data --rm mld03_cpu_predict 
--images_path /home/mluser/data/test_mnist_images/*.jpg

Please note that --images_path above is an argument for our application included in the container. It is appended to the command defined by the ENTRYPOINT statement in the Dockerfile. We map only the folder with the data this time, relying on the app code (and trained model) embedded in the image.

The expected outcome is as follows:

Image 1

Using the expected value included in each file name (the digit following "test_"), we can confirm that all predictions for our limited dataset are correct.


In this article, we run an ML inference using a Docker container. In the next article, we’ll adapt this image for Raspberry Pi with an ARM processor.

This article is part of the series 'Containerized AI and Machine Learning View All


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By
Poland Poland
Jarek has two decades of professional experience in software architecture and development, machine learning, business and system analysis, logistics, and business process optimization.
He is passionate about creating software solutions with complex logic, especially with the application of AI.

Comments and Discussions

QuestionHow did you pass trained model? Pin
Marcin Zybura30-Apr-21 21:26
Marcin Zybura30-Apr-21 21:26 
AnswerRe: How did you pass trained model? Pin
Jarek Szczegielniak1-May-21 1:34
Jarek Szczegielniak1-May-21 1:34 
GeneralRe: How did you pass trained model? Pin
Marcin Zybura1-May-21 2:35
Marcin Zybura1-May-21 2:35 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.