Is it possible to copy a float array into a float4 array?
I do not know whether the elements of a float4 array are laid out so that they line up with a float array, which would allow copying one into the other.

I tried the code below, but it failed to compile.

What I have tried:

#define WD2 WIDTH/4
__global float A[WIDTH*HEIGHT];
...
__local float4 B[WD2];
barrier(CLK_LOCAL_MEM_FENCE);
async_work_group_copy(B,A+j*WIDTH,WD2,0);
Updated 13-Sep-17 6:15am
Comments
Jochen Arndt 8-Sep-17 8:16am    
If you have a look at the async_work_group_copy() declaration you will see that the src and dst arguments must have the same type but you are passing a float4* and a float*.

You might try passing &(B[0].x) which should make the compiler happy. But don't ask me if that is allowed or recommended.
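A minimal sketch of that suggestion, not a confirmed fix from this thread: the destination is cast to `__local float *` so both pointers have the same element type, and the element count is given in floats (WIDTH) rather than float4s (WD2). The kernel name `copy_row`, the `out` buffer, the argument `j`, and the value of WIDTH are assumptions made for illustration. It also assumes WIDTH is a multiple of 4 and that aliasing a float4 array as floats behaves as expected with your compiler and driver.

```c
// Hypothetical values: WIDTH is assumed to be a multiple of 4.
#define WIDTH 256
#define WD2 (WIDTH / 4)

// Hypothetical kernel: copies row j of A into local float4 storage, then writes it out.
__kernel void copy_row(__global const float *A, __global float4 *out, int j)
{
    __local float4 B[WD2];   // local arrays are declared at kernel function scope

    // Every work-item in the work-group must reach this call with the same arguments.
    // Both pointers are float pointers, so the count is WIDTH floats, not WD2 float4s.
    event_t ev = async_work_group_copy((__local float *)B, A + j * WIDTH, WIDTH, 0);

    // B must not be read until the asynchronous copy has completed.
    wait_group_events(1, &ev);

    // Each work-item forwards a strided share of the float4 elements.
    for (size_t i = get_local_id(0); i < WD2; i += get_local_size(0))
        out[j * WD2 + i] = B[i];
}
```

Note that the original snippet also omits `wait_group_events`; a `barrier` placed before the copy does not guarantee that the copied data is visible afterwards.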
Javier Luis Lopez 13-Sep-17 11:57am    
According to Khronos, a float4 must be aligned consistently with float2 and float (and float3 is aligned like float4), as can be seen here: https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/dataTypes.html

Unfortunately, something is still wrong, because it did not work. I tried this with uchars:
__global uchar* imagen0;
.....
if (pix==0)
{
__local uchar16 vv;
async_work_group_copy((__local uchar *) &vv,&imagen0[0],16,0);
printf("===GPU vv: %4v16i \n",vv);
}
if (pix<16)
printf("imagen0[%2i]=%3i",pix,imagen0[pix]);


But the result was wrong: only the first value was correct. The output can be seen here:
https://photos.app.goo.gl/QrgjyaNRvxnnKN672
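One possible explanation, which nobody in this thread confirmed: the OpenCL specification requires `async_work_group_copy` to be encountered by all work-items in a work-group with the same argument values, and the result is only guaranteed after `wait_group_events`. In the snippet above the call sits inside `if (pix==0)` and is never waited on. A sketch of the same test with the copy moved out of the branch follows. The kernel name `show_first16` is hypothetical, and the `hh` length modifier in the format string follows my reading of the OpenCL 1.2 printf rules for uchar vectors.

```c
// Hypothetical kernel: copies the first 16 bytes of imagen0 into one uchar16.
__kernel void show_first16(__global const uchar *imagen0)
{
    __local uchar16 vv;   // declared at kernel function scope, not inside the if

    // All work-items of the group execute the copy with identical arguments.
    event_t ev = async_work_group_copy((__local uchar *)&vv, imagen0, 16, 0);
    wait_group_events(1, &ev);   // vv is valid only after this returns

    // Only one work-item prints, but every work-item took part in the copy.
    if (get_global_id(0) == 0)
        printf("===GPU vv: %v16hhu\n", vv);
}
```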
Jochen Arndt 13-Sep-17 12:12pm    
Just ensure that the alignment is correct. With your initial code that should be the case because float4 is always properly aligned and the global float should be at least 32-bit aligned by the compiler (64-bit with 64-bit apps).

However, I do not know much about OpenCL, so I cannot say whether this is allowed, recommended, or even working.
Javier Luis Lopez 11-Sep-17 2:35am    
Thank you, Jochen, but as you say, it may or may not work depending on the compiler (and the GPU driver).

Could vload or vload4 be used to load arrays of data?
Karthik_Mahalingam 12-Sep-17 23:51pm    
Use the Reply button to post comments or queries to a user, so that the user gets notified and can respond to your text.

1 solution

Finally, vloadn works, but unfortunately each call copies into only one vector, not into an array of them:
__global float*  imagen0
...
long pix = get_global_id(0);
if (pix==0)
{
	float16 vv=vload16(0,imagen0);
	printf("===GPU vv: %6v16f \n",vv);
}
if (pix<16)
	printf("imagen0[%2i]=%6f",pix,imagen0[pix]);
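Since each `vloadn` call fills one vector, an array of vectors can be filled by calling it in a loop, one call per element. The sketch below does this for the float4 case from the original question. The kernel name `load_row`, the `out` buffer, the argument `j`, and the value of WIDTH are assumptions for illustration, and WIDTH is assumed to be a multiple of 4. `vload4(i, p)` reads the four floats starting at `p + 4*i` and only needs `p` to be aligned to a float, so no pointer cast is involved.

```c
// Hypothetical values: WIDTH is assumed to be a multiple of 4.
#define WIDTH 256
#define WD2 (WIDTH / 4)

// Hypothetical kernel: loads row j of A into a local float4 array with vload4.
__kernel void load_row(__global const float *A, __global float4 *out, int j)
{
    __local float4 B[WD2];
    __global const float *row = A + j * WIDTH;

    // Each work-item loads a strided share of the row, one float4 per call.
    for (size_t i = get_local_id(0); i < WD2; i += get_local_size(0))
        B[i] = vload4(i, row);   // reads row[4*i] .. row[4*i + 3]

    // Make the loaded data visible to the whole work-group before it is used.
    barrier(CLK_LOCAL_MEM_FENCE);

    for (size_t i = get_local_id(0); i < WD2; i += get_local_size(0))
        out[j * WD2 + i] = B[i];
}
```

Unlike `async_work_group_copy`, this variant is an ordinary load, so the barrier after the loop is what synchronizes the work-group.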
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900