Portable, Accelerated Image Processing with the oneAPI Image Processing Library

Valentin Kubarev

Jul 28, 2022

CPOL

In this article we discuss how to use SYCL and oneIPL to offload Gaussian image filtering to an accelerator.

Sanjiv Shah (Intel vice president and general manager of Developer Software Engineering) announced that the oneAPI Image Processing Library (oneIPL) is a new element in the oneAPI v1.2 provisional specification.

As the name implies, oneIPL contains image-processing functionality (filters, geometric transformations, color and type conversions, and various 3D operations) that allows developers to take advantage of diverse computational devices through SYCL* APIs without changing their code. The oneIPL specification is a top-level API (similar to the oneAPI Math Kernel Library [oneMKL] specification) that describes the image data abstraction and processing pipelines plus the programming, execution, and memory models. (The proceedings of the oneIPL Technical Advisory Board are on GitHub*. An overview of recent discussions is in this presentation.)

Continued High Performance & API Support

The upcoming Intel® oneAPI Image Processing Library product (Intel's implementation of the oneIPL specification) carries over the image-processing capabilities of Intel® Integrated Performance Primitives (Intel® IPP), which has been delivering high performance for decades. It continues to:

Support the C API
Provide the new SYCL* API to offload image computations to accelerator devices in a portable, performant way

The oneIPL specification (provisional version 0.8) includes an initial set of functions aimed at image preprocessing for deep learning:

Basic geometry transformations
Color conversion of RGBA and RGB images to grayscale, NV12, i420, or RGBP
Basic fixed filters

Similar to the APIs of oneMKL and oneAPI Data Analytics Library, oneIPL:

Uses the SYCL queue to construct pipelines of heterogeneous parallel operations
Has APIs designed to work over linear device memory and with hardware-accelerated tiled-image memory for supported formats and datatypes
Includes a new data abstraction (to represent images) that works over several types of memory
Controls memory allocation via an allocator
Supports processing a region of interest (ROI) within an image
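
To make the region-of-interest idea concrete, here is a minimal sketch in plain C++ of an ROI as a non-owning view over linear, row-major memory. The `Roi` and `ImageView` types and the `sub_view` and `fill` helpers are hypothetical names made up for this illustration; they are not part of oneIPL, whose own image abstraction may expose ROIs differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper types for illustration only; not part of oneIPL.
struct Roi {
    std::size_t x, y;          // top-left corner, in pixels
    std::size_t width, height; // extent, in pixels
};

// Non-owning view over a row-major, single-channel 8-bit image.
struct ImageView {
    std::uint8_t* data;  // first pixel of the view
    std::size_t width;   // pixels per row in the view
    std::size_t height;  // rows in the view
    std::size_t pitch;   // bytes between row starts in the parent buffer
};

// Returns a view restricted to roi; no pixels are copied.
inline ImageView sub_view(const ImageView& parent, const Roi& roi) {
    // The ROI must lie entirely inside the parent view.
    assert(roi.x + roi.width <= parent.width);
    assert(roi.y + roi.height <= parent.height);
    return ImageView{parent.data + roi.y * parent.pitch + roi.x,
                     roi.width, roi.height, parent.pitch};
}

// Writes value to every pixel of the view, stepping rows by the parent pitch
// so that pixels outside the view are left untouched.
inline void fill(const ImageView& view, std::uint8_t value) {
    for (std::size_t row = 0; row < view.height; ++row) {
        std::uint8_t* line = view.data + row * view.pitch;
        for (std::size_t col = 0; col < view.width; ++col) {
            line[col] = value;
        }
    }
}
```

For an 8 x 6 buffer wrapped as `ImageView full{buf.data(), 8, 6, 8}`, calling `fill(sub_view(full, Roi{2, 1, 3, 2}), 255)` changes only the 3 x 2 block whose top-left pixel is at column 2, row 1. The ROI keeps the parent's pitch, which is why the operation can run in place without copying the sub-image.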

Example: Offloading to an Accelerator

Let's discuss how to use SYCL and oneIPL to offload Gaussian image filtering to an accelerator.

Gaussian filtering is commonly used to blur images, reduce noise, and suppress fine detail. Each output pixel is computed as a weighted average of the pixels around it, with the weights given by a Gaussian function.

The blur radius determines the standard deviation of the Gaussian function and, with it, how many neighboring pixels are blended to compute each new pixel. A larger radius means more blurring.
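
As a rough illustration of the arithmetic behind such a filter, the following plain C++ sketch builds normalized Gaussian weights and applies them to a single row of samples. It is not oneIPL code: `gaussian_weights` and `blur_row` are hypothetical helpers, the standard deviation `sigma` is passed explicitly because this article does not state how the library derives it from the radius, and clamping at the edges is just one possible border policy.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Builds normalized 1D Gaussian weights for offsets -radius..+radius.
// Assumption: sigma is supplied by the caller.
inline std::vector<double> gaussian_weights(std::size_t radius, double sigma) {
    assert(sigma > 0.0);
    std::vector<double> weights(2 * radius + 1);
    double sum = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        const double d = static_cast<double>(i) - static_cast<double>(radius);
        weights[i] = std::exp(-(d * d) / (2.0 * sigma * sigma));
        sum += weights[i];
    }
    // Normalize so the weights sum to 1 and overall brightness is preserved.
    for (double& w : weights) {
        w /= sum;
    }
    return weights;
}

// Blurs one row of samples. Reads past either end are clamped to the
// nearest edge sample (an assumed border policy).
inline std::vector<double> blur_row(const std::vector<double>& row,
                                    const std::vector<double>& weights) {
    assert(!row.empty());
    assert(weights.size() % 2 == 1);
    const std::ptrdiff_t radius = static_cast<std::ptrdiff_t>(weights.size() / 2);
    const std::ptrdiff_t last = static_cast<std::ptrdiff_t>(row.size()) - 1;
    std::vector<double> out(row.size(), 0.0);
    for (std::ptrdiff_t x = 0; x <= last; ++x) {
        for (std::ptrdiff_t k = -radius; k <= radius; ++k) {
            std::ptrdiff_t src = x + k;
            if (src < 0) src = 0;
            if (src > last) src = last;
            out[static_cast<std::size_t>(x)] +=
                weights[static_cast<std::size_t>(k + radius)] *
                row[static_cast<std::size_t>(src)];
        }
    }
    return out;
}
```

Because a 2D Gaussian is separable, running a pass like this over every row and then over every column gives the same result as a full 2D kernel while doing far less work per pixel. A larger radius widens the window of neighbors that contribute to each output sample, which is why it blurs more.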

Notice how the SYCL queue specifies both where the images are initialized (host or device memory) and where the computation takes place (host or device). The gaussian call is nonblocking (asynchronous), so the host can continue working while the computation runs on the device.

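 #include <cstddef>
 #include <cstdint>
 #include <sycl/sycl.hpp>
 #include <oneapi/ipl.hpp>

 using namespace oneapi::ipl;

 const sycl::range<2> size{1920, 1080};

 // Create device queue (the default selector picks the device)
 sycl::queue queue;

 // Create images on the device associated with queue.
 // src_image_pointer is assumed to point to existing 4-channel, 8-bit pixel data.
 image<layouts::channel4, std::uint8_t> src_image{queue, src_image_pointer, size};
 image<layouts::channel4, std::uint8_t> dst_image{queue, size};

 // Set the radius of the filter
 const std::size_t radius = 20;

 // Apply Gaussian filter on the device associated with queue.
 // The call is asynchronous; synchronize (for example, with queue.wait())
 // before reading the result on the host.
 const gaussian_spec spec{radius};
 gaussian(queue, src_image, dst_image, spec);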