Parallelizing data-processing with the TPL DataFlow Library

25 Nov 2016
How to parallelize data processing with the TPL DataFlow library

I highly recommend the TPL (Task Parallel Library) 'DataFlow' library. It is a very good abstraction on top of the TPL itself, and it is easy to use. I was in a situation where I had to parallelize the execution of a file converter that, run as a single instance, used only 15% CPU. By parallelizing it, I was able to utilize 100% CPU and finish the conversion job much more quickly.

It works with .NET 4.5 and onwards, and I believe I saw a .NET Core version, too. Here is the .NET 4.5 package: https://www.nuget.org/packages/Microsoft.Tpl.Dataflow.

Install it with NuGet and look to the web for usage examples. Note that many of the examples deal with async/await methods, but the library works quite well with synchronous work, too. I had no need for async, so my example below uses synchronous calls only:

C#
// Requires: using System; using System.Threading.Tasks.Dataflow;
public void ConvertFilesInFolder(string sourceFilesFolderPath)
{
    string[] filePathsAndNames = getFilePathsAndNames(sourceFilesFolderPath);

    // Define a new 'ActionBlock' that you can push work items to.
    var block = new ActionBlock<string>(filePathAndName =>
    {
        ConvertAndMoveTheFile(filePathAndName);
    }, new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = 6 // 6 simultaneous conversions
                                   // (limit of my 3rd-party conversion library licence)
    });

    // Go ahead and add the conversion jobs to the action block.
    // Post() is enough here: the block's input queue is unbounded by default,
    // so there is no need for SendAsync(), whose returned Task should be awaited.
    foreach (string filePathAndName in filePathsAndNames)
    {
        block.Post(filePathAndName);
    }

    block.Complete();        // that's enough jobs; no more items will be accepted
    block.Completion.Wait(); // wait here until every queued job is done

    /* Note that as I set the max-degree-of-parallelism to 6,
       at most 6 jobs execute at the same time.
       Processing starts as soon as the first item is posted, and as soon as
       one job completes, the next is retrieved from the action block's queue. */
}

public void ConvertAndMoveTheFile(string filePathAndName)
{
    try
    {
        ConvertFile(filePathAndName);
        moveOriginalFileToArchive(filePathAndName);
    }
    catch (Exception ex)
    {
        // Log 'ex', but otherwise suppress it and move on to the next file.
        // An unhandled exception here would fault the whole block.
    }
}
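
To see the parallelism limit from the comment above in action, here is a minimal, self-contained sketch. It is not part of my converter: the class name ParallelismCapDemo, the method MeasurePeakConcurrency and the 50 ms sleep are made up for illustration, and the sleep merely stands in for a real conversion. It counts how many jobs are running at once and returns the highest count it observed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

public static class ParallelismCapDemo
{
    // Hypothetical helper: pushes 'itemCount' dummy jobs through an ActionBlock
    // and returns the highest number of jobs observed running simultaneously.
    public static int MeasurePeakConcurrency(int itemCount, int maxDegreeOfParallelism)
    {
        int running = 0;              // jobs currently inside the delegate
        int peak = 0;                 // highest value of 'running' seen so far
        var peakLock = new object();  // guards updates to 'peak'

        var block = new ActionBlock<int>(item =>
        {
            int now = Interlocked.Increment(ref running);
            lock (peakLock)
            {
                if (now > peak) peak = now;
            }

            Thread.Sleep(50); // stand-in for a real conversion

            Interlocked.Decrement(ref running);
        }, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        });

        for (int i = 0; i < itemCount; i++)
        {
            block.Post(i);
        }

        block.Complete();        // no more items
        block.Completion.Wait(); // wait for all dummy jobs to finish
        return peak;
    }
}
```

Calling ParallelismCapDemo.MeasurePeakConcurrency(12, 3) should never return more than 3, whatever the machine. The exact value can be lower, depending on how many thread-pool threads happen to be available.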

I found this blog-post very helpful in getting introduced and started with the library.
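
For completeness, since many of the examples out there are async, here is a rough sketch of what an awaitable variant of the same idea might look like. I did not need this myself, so treat it as an assumption-laden illustration: AsyncConversionSketch, ConvertAllAsync and the convertAsync delegate are hypothetical names, and the bounded capacity of twice the degree of parallelism is an arbitrary choice.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class AsyncConversionSketch
{
    // 'convertAsync' is a hypothetical stand-in for an awaitable conversion routine.
    // Assumes maxDegreeOfParallelism >= 1.
    public static async Task ConvertAllAsync(
        IEnumerable<string> filePathsAndNames,
        Func<string, Task> convertAsync,
        int maxDegreeOfParallelism)
    {
        var block = new ActionBlock<string>(async filePathAndName =>
        {
            try
            {
                await convertAsync(filePathAndName);
            }
            catch (Exception ex)
            {
                // Log and move on, so that one bad file does not fault the block.
                Console.Error.WriteLine("Failed: " + filePathAndName + ": " + ex.Message);
            }
        }, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxDegreeOfParallelism,
            // Arbitrary bound: keeps the input queue small, which is what
            // makes awaiting SendAsync() below meaningful.
            BoundedCapacity = maxDegreeOfParallelism * 2
        });

        foreach (string filePathAndName in filePathsAndNames)
        {
            // Waits (without blocking a thread) while the queue is full.
            await block.SendAsync(filePathAndName);
        }

        block.Complete();       // no more items
        await block.Completion; // resumes once every queued item is processed
    }
}
```

With a bounded capacity, SendAsync() makes the producer wait when the queue is full, so a very large folder does not have to be queued all at once.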

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

