Cudafy Me: Part 2 of 4





4.00/5 (1 vote)
These posts are meant to inspire you to enter into the world of graphics processor programming.
These posts are meant to inspire you to enter into the world of graphics processor programming.
Posts in This Series
- Part 1 – Introduction and Single Threaded CPU Solution
- Part 2 – Multi-Threaded CPU Solution
- Part 3 – Massively Parallel GPGPU Solution
- Part 4 – Discussion
Full Source Code
http://cudafytsp.codeplex.com/
Recap
In the last post, we simply put together a single-threaded CPU solution to the traveling salesman problem. Here was the code that accomplished our work:
private Answer CpuTsp()
{
var bestDistance = float.MaxValue;
var bestPermutation = -1;
var stopWatch = Stopwatch.StartNew();
for (var permutation = 0; permutation < _permutations; permutation++)
{
var path = new int[1, Cities];
var distance = FindPathDistance(_permutations, permutation,
Cities, _latitudes, _longitudes, path, 0);
if (distance < bestDistance)
{
bestDistance = distance;
bestPermutation = permutation;
}
}
return new Answer { Distance = bestDistance, Permutation = bestPermutation,
Milliseconds = stopWatch.ElapsedMilliseconds };
}
Multi-Threaded CPU Solution
Now we will modify the previous solution to take advantage of a multi-core CPU.
private Answer MpuTsp()
{
var bestDistance = float.MaxValue;
var bestPermutation = -1;
var locker = new Object();
var stopWatch = Stopwatch.StartNew();
Parallel.For(0, _permutations, permutation =>
{
var path = new int[1, Cities];
var distance = FindPathDistance(_permutations, permutation, Cities, _latitudes, _longitudes, path, 0);
lock (locker)
{
if (distance < bestDistance)
{
bestDistance = distance;
bestPermutation = permutation;
}
}
});
return new Answer { Distance = bestDistance,
Permutation = bestPermutation, Milliseconds = stopWatch.ElapsedMilliseconds };
}
The only difference in this solution is that we use a Parallel.For
loop instead of a tradition for loop. Because the interior of the loop runs in parallel (concurrently), we need to use a bit of thread locking to ensure we find the shortest distance. The function
FindPathDistance
is identical to the single-threaded solution.
We could optimize the code above and use a more efficient locking scheme, but that is not the point of this series of posts.
Next Step
What we have left to do is to write a method that takes advantage of the code we have already written to run it on the massively parallel GPGPU.
>>> Part 3 – Massively Parallel GPGPU Solution