OpenMP bitmap-matrix computation

Question

Hello world! I am currently trying to develop a good way to convert between a bitmap and a matrix (bitmap to matrix and matrix to bitmap). I used LockBits on the bitmap and things were pretty fast. But recently I learned about OpenMP, and it got me thinking whether it is possible to use it on an I/O operation like this. When I tried "omp for" as shown below, I ran into trouble with random colors in the imported/exported images. I understand the theory of parallel computing, but I struggle with real code. Can anybody help with a working OpenMP implementation for this case?

(This is a Windows Forms C++ app, currently for 24-bit bitmaps.)

C++

int i,j,pos;
int width=bitmap->Width;
int height=bitmap->Height;

float **mat;
mat=new float*[width];
for(i=0;i<width;i++)
    mat[i]=new float [height];

BitmapData ^bitmapData;
bitmapData=gcnew BitmapData();
System::Drawing::Rectangle a(0,0,width,height);
bitmap->LockBits(a, System::Drawing::Imaging::ImageLockMode::ReadOnly, bitmap->PixelFormat, bitmapData);
unsigned char *ptr=(unsigned char*)bitmapData->Scan0.ToPointer();

int stride=bitmapData->Stride;
//#pragma omp parallel for private(j)
for (int i=0;i<width;i++)
    for(int j=0;j<height;j++)
    {
        pos=(j*stride)+(i*3);
        mat[i][j]=(float)(ptr[pos]+ptr[pos+1]+ptr[pos+2])/3;
    }
bitmap->UnlockBits(bitmapData);
return mat;

Posted 21-Nov-11 7:05am

Peter Kottas

3 solutions

Solution 3

Thanks for the typecasting hint. About that dereferencing...

C++

float * arrptr=mat[i];
for(int j=0;j<width;j++)
{
    *(arrptr++)=(float)((int) *(LinePtr++) + *(LinePtr++) + *(LinePtr++))/3;
}

I was thinking something like that would be a good idea, but in that case I would presumably have to change the internal dimensions of the float ** matrix from [width][height] to [height][width]. Do you have a better suggestion?

I am doing things like Gaussian blur and Laplacian on this, so keeping it in a one-dimensional array would really mess things up. But in my code I have a lot of double, even triple, for loops with dereferencing like mat[i][j][u][v] etc. I consider that slow, but do I have a better choice?

Posted 22-Nov-11 13:15pm

Peter Kottas

Comments

JackDingler 22-Nov-11 19:41pm

You can keep it as a 2D array.
Your change is what I had in mind.
Your main optimizations should be in the innermost loop. After all, the code executing there is happening (height * width) times. The array lookup and the multiply aren't worth optimizing out of the outer loop.
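
A minimal sketch of that idea, assuming the same `float **` layout as in the question: the row lookup `mat[i]` is done once per outer iteration, and the innermost loop only touches a plain `float *`. The function name `scaleInPlace` and the scaling operation are purely illustrative, not part of the original code.

```cpp
#include <cassert>

// Hypothetical example: multiply every element by 'factor'.
// 'mat' is an array of 'rows' pointers, each pointing at 'cols' floats.
void scaleInPlace(float** mat, int rows, int cols, float factor)
{
    assert(rows >= 0 && cols >= 0);

    for (int i = 0; i < rows; i++)
    {
        // One pointer lookup per row, outside the innermost loop.
        float* row = mat[i];

        for (int j = 0; j < cols; j++)
        {
            // The innermost loop works on a flat array.
            row[j] *= factor;
        }
    }
}
```

The 2D structure stays as it is; only the innermost loop avoids the repeated double dereference.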

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

JackDingler · Accepted Answer · 2011-11-22T09:02:00

You're sharing the single 'pos' variable among all of your threads.

You'll get random values, as each thread is sharing the same location and writing and reading from it. Who knows what value it will have by the time the thread gets to the next line? And even in that line, the value can change in mid execution.

Try this: declare 'pos' inside the inner loop, so each thread gets its own copy.

C++

int pos=(j*stride)+(i*3);
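
A self-contained sketch of the same fix, assuming a plain byte buffer standing in for `Scan0` so it can be compiled without Windows Forms. The name `toGray` and the `std::vector` matrix are illustrative, not from the original code. Variables declared inside the loop body are private to each thread, and the loop variable of an `omp parallel for` is private automatically.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: averages the three channel bytes of each 24-bit pixel.
// 'ptr' stands in for bitmapData->Scan0, 'stride' for bitmapData->Stride.
std::vector<std::vector<float>> toGray(const unsigned char* ptr,
                                       int width, int height, int stride)
{
    assert(stride >= width * 3);
    std::vector<std::vector<float>> mat(width, std::vector<float>(height));

    // Each thread handles a subset of the i values; 'i' is private by default.
    #pragma omp parallel for
    for (int i = 0; i < width; i++)
    {
        for (int j = 0; j < height; j++)
        {
            // Declared inside the loop body, so every thread has its own 'pos'.
            int pos = (j * stride) + (i * 3);
            mat[i][j] = (ptr[pos] + ptr[pos + 1] + ptr[pos + 2]) / 3.0f;
        }
    }
    return mat;
}
```

Without OpenMP enabled the pragma is ignored and the loop runs serially with the same result.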

You might get some speed improvement trying it like this.

C++

#pragma omp parallel for private(i)
for (int i=0;i<width;i++)
{
    unsigned char * LinePtr = ptr + (j * stride);

    for(int j=0;j<height;j++)
    {
        mat[i][j]=(float)((int) *(LinePtr++) + *(LinePtr++) + *(LinePtr++))/3;
    }
}

There is some overhead in spawning the threads. So rather than spawn a thread for each pixel, spawn one for each line. They'll then get more calculations done between the spawn and the destruction.

By precalculating the line, we divide the number of multiplications by height.
Increments are also faster than adds. And by removing the j variable from the inner loop, it gives the compiler more room to optimize. Now it only has two variables to manipulate, 'mat' and 'LinePtr'.

Also notice that I cast the pixels to an int before adding. Adding unsigned chars can lead to value wrapping at the 255 boundary.
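
A small check of the arithmetic, offered as a sketch rather than a definitive statement about any particular compiler: in standard C++, `unsigned char` operands are promoted to `int` before `+` is applied, so the sum itself does not wrap. The wrap shows up when the result is narrowed back into an `unsigned char`. The explicit cast is harmless and documents the intent. The function names below are illustrative.

```cpp
#include <cassert>
#include <type_traits>

// Sum of three channel bytes; the operands are promoted to int before adding.
int sumPromoted(unsigned char r, unsigned char g, unsigned char b)
{
    static_assert(std::is_same<decltype(r + g), int>::value,
                  "unsigned char + unsigned char has type int");
    return r + g + b;
}

// Storing the sum back into an unsigned char is where the wrap happens.
unsigned char sumNarrowed(unsigned char r, unsigned char g, unsigned char b)
{
    return static_cast<unsigned char>(r + g + b);
}
```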

Peter Kottas · Accepted Answer · 2011-11-22T10:53:00

Solution 2

Thanks, you cleared that up very well. I am new to unmanaged code too, but it seems there is a small error in your fix. In case somebody reads this, I believe it should be as shown below. But the idea of increments and line pointers is brilliant, thanks a lot.

C++

#pragma omp parallel for private(i)
for (i=0;i<height;i++)
{
    unsigned char * LinePtr = ptr + (i * stride);

    for(int j=0;j<width;j++)
    {
        mat[j][i]=(float)((int) *(LinePtr++) + (int)*(LinePtr++) + (int)*(LinePtr++))/3;
    }
}
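
One caveat, hedged: several `LinePtr++` in a single expression are unsequenced relative to each other in C++, so the order in which the three bytes are read is not guaranteed by the standard. A sketch of the same row-wise loop that reads the three bytes by index and then advances the pointer once per pixel follows. The name `rowsToGray` and the `std::vector` matrix are stand-ins, not part of the original code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical variant: one parallel iteration per image row.
// 'mat' is indexed [x][y] to match the matrix layout used in the question.
void rowsToGray(const unsigned char* ptr, int width, int height, int stride,
                std::vector<std::vector<float>>& mat)
{
    assert(static_cast<int>(mat.size()) == width);

    #pragma omp parallel for
    for (int i = 0; i < height; i++)
    {
        // Start of row i; computed once per row instead of once per pixel.
        const unsigned char* linePtr = ptr + i * stride;

        for (int j = 0; j < width; j++)
        {
            // Read the three bytes by index, then advance once.
            mat[j][i] = (linePtr[0] + linePtr[1] + linePtr[2]) / 3.0f;
            linePtr += 3;
        }
    }
}
```

Each thread writes to a different `i` column of `mat`, so no two threads touch the same element.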

Posted 22-Nov-11 10:53am

Peter Kottas

Comments

JackDingler 22-Nov-11 16:58pm

Good catch. :)

JackDingler 22-Nov-11 17:09pm

Typecasting each element to an (int) is unnecessary. The first typecast determines the data type that the expression inside the parentheses will evaluate to.

JackDingler 22-Nov-11 17:57pm

This little lookup 'mat[j][i] = ' seems a little heavy too, for an inner loop...
