OpenCL

Great Reads

by Nick Kopp
This article builds on the earlier High Performance Queries: GPU vs. PLINQ vs. LINQ, ports it to support OpenCL devices, and adds benchmarking so you can easily compare performance.
by tugrulGtx
Multi-device OpenCL load balancer and pipeliner for C# in a few lines of code.
by John Michael Hauck
It has never been easier for C# desktop developers to write code that takes advantage of the amazing computing performance of modern graphics cards. In this post I will share some techniques for solving a simple (but still interesting) image analysis problem. Source Code https://www.assembla.com/co
by Matt Scarpino
Using GPU Acceleration to Compute Ray-Triangle Intersection

Latest Articles

by aroman
In this post I explore Lattice Boltzmann methods and build a related project
by tugrulGtx
Header-only C++ tool that supports a basic array-like usage pattern and uses multiple graphics cards in the system as storage with LRU caching
by tugrulGtx
Accessing VRAM-cached nucleotide sequences in FASTA formatted files (*.fna, *.faa) by index
by Arthur V. Ratz
In this article I will thoroughly discuss several aspects of using the revolutionary new Intel® oneAPI HPC Toolkit to deliver modern code that implements a parallel “stable” sort

All Articles

1 Jun 2017 by Intel
This paper introduces Intel software tools recently made available to accelerate deep learning inference in edge devices (such as smart cameras, robotics, autonomous vehicles, etc.) incorporating Intel® Processor Graphics solutions across the spectrum of Intel SOCs.
14 Oct 2021 by prother123
While compiling the host code, specifically this line of code: cl_program program = clCreateProgramWithBinary(context, 1, &fpga_device, &length, (const unsigned char**)&binary, NULL, NULL); I got this error: acl_util.h:100: safe_memcpy:...
16 Oct 2014 by Tim_Duncan
In this article, we’ll explore the strides Adobe engineers have made over the last few years to enhance Photoshop using OpenGL* and OpenCL™ to increase hardware utilization.
17 Dec 2013 by kdgupta87
A 2D analog clock designed using OpenTK in C# and WinForms.
13 Sep 2017 by Javier Luis Lopez
Is it possible to copy float arrays to float4? I do not know whether the elements of a float4 array can be aligned with a float array so that they can be copied. I tried this, but it failed to compile: What I have tried: #define WD2 WIDTH/4 __global float A[WIDTH*HEIGHT]; ... __local float4 B[WD2];...
13 Sep 2017 by Javier Luis Lopez
vloadn finally works, but unfortunately I can only copy to a single vector, not to an array of them: __global float* imagen0 ... long pix = get_global_id(0); if (pix==0) { float16 vv=vload16(0,imagen0); printf("===GPU vv: %6v16f \n",vv); } if (pix
3 Jun 2010 by taheretaheri
I compare the performance of Encog, Neuroph and JOONE
2 Jan 2014 by Member 10501094
Hello, I am a student at high school (not university) and technically I don't study programming, but I can do my project on programming too. My teachers suggested that I write some useless sorting algorithm programs, but that is not really something I would like to do. I would like to create some...
3 Jan 2014 by CPallini
You might write a useful parallelized sorting algorithm.
3 Jan 2014 by OriginalGriff
I think to be honest that your teachers are right: as you say CUDA / OpenCL are not easy, and despite tutorials being available, I don't think you will be able to do anything that you would find interesting and that would be acceptable to your teachers in the time you have left to do it....
15 Mar 2021 by tugrulGtx
Header-only C++ tool that supports a basic array-like usage pattern and uses multiple graphics cards in the system as storage with LRU caching
11 Nov 2018 by Javier Luis Lopez
I program GPUs using OpenCL, but I would be happy with an easier system to parallelize the program. Which of them requires the fewest code changes to move work onto the GPU? Do C++ AMP and Thrust allow several functions to run sequentially inside the GPU before returning results? What I have tried: I made...
11 Nov 2018 by tugrulGtx
"but a lot of work must be done in addition to parallelizing the algorithm, that is, system preparation, buffer creation, data transfers" If you are rewriting whole OpenCL stages for every new programming task, then you need to shorten the writing part by using OOP, such as creating an object...
23 Jun 2020 by PontiacGTX
I am trying to compile an OpenCL project where I expect an output buffer to be assigned through a cl_mem object, but when clEnqueueReadBuffer executes, the std::vector items in the array aren't assigned. The source code for the host in C++...
22 Jun 2012 by Razvan Aguridan
Beginner optimization tutorial.
30 Jul 2012 by Razvan Aguridan
Beginner optimization tutorial
29 Aug 2017 by Javier Luis Lopez
I have made a VS2013 project to test OpenCL, available on GitHub in the OpenCL dir: GitHub - jlopez2022/cpp_utils: Example of c++ programs. In that example I calculated the differential RMS of a big vector (200 mega size). On the CPU in debug mode it calculated at 100 Megaops/data. On the CPU in release mode the...
29 Aug 2017 by Jochen Arndt
The speed doubling in release mode does not come from parallel processing. It comes from the compiler optimising the code in release mode and omitting the additional checks that are done in debug builds. You have to explicitly write code for parallel processing. Which method is finally faster...
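To illustrate what "explicitly write code for parallel processing" in the reply above can mean on the CPU, here is a minimal sketch that splits a sum of squares across `std::thread` workers. The function name `sum_of_squares_parallel` and the chunking scheme are assumptions made for illustration; they are not taken from the project discussed in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: computes the sum of data[i]^2 using `workers` CPU threads.
double sum_of_squares_parallel(const std::vector<double>& data, unsigned workers) {
    workers = std::max(1u, workers);              // always use at least one thread
    std::vector<double> partial(workers, 0.0);    // one result slot per thread
    std::vector<std::thread> pool;
    // Ceiling division so every element belongs to exactly one chunk.
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            const std::size_t begin = std::min(data.size(), w * chunk);
            const std::size_t end = std::min(data.size(), begin + chunk);
            double acc = 0.0;
            for (std::size_t i = begin; i < end; ++i) acc += data[i] * data[i];
            partial[w] = acc;                     // each thread writes only its own slot
        });
    }
    assert(pool.size() == workers);
    for (auto& t : pool) t.join();                // wait for all workers to finish

    double total = 0.0;
    for (double p : partial) total += p;          // final combine on the calling thread
    return total;
}
```

Calling `sum_of_squares_parallel(v, std::thread::hardware_concurrency())` would use one thread per hardware thread reported by the system. Whether this beats a single optimised loop depends on the vector size and memory bandwidth, so it is worth measuring in a release build.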
1 Mar 2016 by Android on Intel
In this article we are going to do a walkthrough of how to do CPU-bound offline analysis of the workflow.
16 Sep 2013 by Nick Kopp
This article builds on the earlier High Performance Queries: GPU vs. PLINQ vs. LINQ, ports it to support OpenCL devices, and adds benchmarking so you can easily compare performance.
23 Oct 2014 by pi19404
Dense Motion Estimation Based on Polynomial Expansion. Introduction: In this article we will look at dense motion estimation based on a polynomial representation of the image. The polynomial basis representation of the image is obtained by approximating the local neighborhood of image us
30 Aug 2017 by morzel
Detecting a Drone - OpenCV in .NET for Beginners (Emgu CV 3.2, Visual Studio 2017). Part 1
6 Nov 2014 by praveen_kundurthy
This paper presents four guidelines that can help guide software developers as they design applications that encourage touch interaction and deliver a memorable user experience on Intel® processor-based pAIOs.
17 Nov 2014 by Android on Intel
In the conclusion of this two-part series, I detail the best 3D game engine and middleware solutions for Android* tablets, including free, open source, and proprietary options. I also note which have native support for x86 Intel® processors.
3 Apr 2018 by Intel
The Retail Workshop: Hands on Learning with Intel®-based Retail Solutions
26 Aug 2013 by Buddhi Chaturanga
I want to distinguish between these two technologies in terms of their technical aspects. What are the major differences and uses of each of them? Pros and cons? How can we handle processing on the GPU using each of them? How can those technologies be used for 3D game programming?
27 Aug 2013 by Stefan_Lang
Basically, CUDA only works for NVIDIA cards, whereas OpenCL supports all. In theory. To my knowledge the only usable implementation for OpenCL is by AMD; NVIDIA also implemented OpenCL for their cards, but didn't put a lot of effort into it. CUDA only is more powerful than the more generalized...
5 Oct 2017 by tugrulGtx
Multi-device OpenCL load balancer and pipeliner for C# in a few lines of code.
23 Feb 2018 by Member 13680125
Boosting Efficiency and Performance for Automotive, Networking, and Cloud Computing
17 Oct 2021 by prother123
When compiling the code below, the following error showed up: error: passing 'int *' to parameter of type ' int *' changes address space of pointer array_dist[i] = Euclidean_distance(X_train, mydatapoint); note: passing...
17 Oct 2021 by Richard MacCutchan
I am not an OpenCL expert but looking at your code I suspect it is because you are passing a local address to a function that expects a global: // your function definition inline float Euclidean_distance(__global int* restrict array_point_A,...
2 Mar 2021 by tugrulGtx
Accessing VRAM-cached nucleotide sequences in FASTA formatted files (*.fna, *.faa) by index
6 Jan 2014 by Adam Wojnar
Simple .jp2/.j2k viewer using Kakadu executables demonstration pack for decoding
8 Aug 2011 by Adnan Boz
An entry level example of how to use NVIDIA CUDA technology to achieve better performance within C# with minimum possible amount of code
10 Jan 2024 by w4de
Compose a programme using OpenCL and using local memory. Task: b=min(A+C) I have a ready block diagram, but the teacher says that it shows everything sequentially, and we have a parallel programme and I don't know how to fix the block diagram. ...
10 Jan 2024 by OriginalGriff
Um. No. That's not a diagram of how your app should work: it's a mishmash of an app and instructions to produce it - and if your assignment wants you to produce a parallel based app, then you need to examine the assignment carefully and identify...
2 Aug 2014 by Bartlomiej Filipek
How to start optimizing the particle system code.
14 Apr 2014 by Bartlomiej Filipek
Flexible Particle System - Start
19 Dec 2023 by w4de
Help me make a flowchart for OpenCL code; I don't quite understand what it should look like. The code performs the min(A+C) operation. #define _CRT_SECURE_NO_WARNINGS #define CL_USE_DEPRECATED_OPENCL_1_2_APIS #include #include...
19 Dec 2023 by M Imran Ansari
Creating a flowchart for OpenCL code involves representing the logical flow of the program with steps. Note that OpenCL programming typically involves both host (CPU) and device (GPU) code flow. Check the below link as guide and try to build of...
30 May 2013 by Doug Wyrembek
A fun utility to apply blend modes to an image.
17 Apr 2016 by Ryan Scott White
An assembler/compiler for AMD’s GCN (Generation Core Next Architecture) Assembly Language
7 Apr 2015 by Android on Intel
The intention of this guide is to provide quick steps to create, build, debug, and analyze OpenCL™ applications with the OpenCL™ Code Builder, a part of Intel® Integrated Native Development Environment (Intel® INDE)
3 Dec 2012 by Ilya Suzdalnitski
Image processing basics on the GPU using OpenCL.NET.
13 Oct 2012 by Alesiani Marco
A Wave PDE simulation using GPGPU capabilities
22 May 2013 by John Michael Hauck
It has never been easier for C# desktop developers to write code that takes advantage of the amazing computing performance of modern graphics cards. In this post I will share some techniques for solving a simple (but still interesting) image analysis problem. Source Code https://www.assembla.com/co
22 May 2013 by John Michael Hauck
Some ad hoc performance test results for a simple program written in C# as obtained from my current desktop computer: Dell Precision T3600, 16GB RAM, Intel Xeon E5-2665 0 @ 2.40GHz, NVidia GTX Titan.
22 Nov 2015 by John Michael Hauck
It has never been easier for C# desktop developers to write code that takes advantage of the amazing computing performance of modern graphics cards.
20 Jan 2015 by Android on Intel
This tutorial shows how to use two powerful features of OpenCL™ 2.0: enqueue_kernel functions that allow you to enqueue kernels from the device and work_group_scan_exclusive_add and work_group_scan_inclusive_add
2 Oct 2011 by Richard MacCutchan
Here is a CodeProject article to get you started. You may find that a further search of the articles will yield even more.
3 Sep 2021 by prother123
I have written the kernel code that will run on an FPGA device. Currently, I am writing the host code, which will run on the CPU. In the last meeting with my professor, he told me that the data (arrays in my case) in the host should not be located in...
15 Sep 2021 by prother123
Hi all, when trying to compile the host code using a command in the MobaXterm terminal, I got this error. I am not sure what to do to fix it. I need help. In file included from /usr/include/c++/4.8.2/array:35:0, ...
15 Sep 2021 by Richard MacCutchan
Do what the message tells you and add the appropriate option to the Makefile entry for that source module.
2 Feb 2015 by Android on Intel
This tutorial will guide you through Intel® INDE 2015 installation and demonstrate how to develop native Android* applications that target either x86 based or ARM based processors.
20 Jan 2015 by Android on Intel
This tutorial will guide you through writing a native “Hello World” Android* app in Visual Studio* through the IDE Integration feature of Intel® INDE 2015.
15 Sep 2021 by prother123
I am compiling the host code using the make command. The errors below showed up. The code that caused these errors: free(data_point); free(host_output); free(index_arr); free(array_dist); Errors: host/src/host.cpp:380:21:...
15 Sep 2021 by OriginalGriff
You should tidy up after yourself, yes - but you can only free objects that you have allocated with malloc. You don't free items from the stack, which includes local and global variables, unless they contain a pointer to a heap object (i.e. one...
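The rule in the reply above can be sketched as follows. The variable names `host_output` and `array_dist` are borrowed from the question, but how they were declared in the original host code is not shown, so the declarations here are assumptions for illustration only. The helper names `sum_heap_buffer` and `sum_stack_buffer` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Heap case: the pointer comes from malloc, so it must be released with free.
float sum_heap_buffer(std::size_t n) {
    assert(n > 0);
    float* host_output = static_cast<float*>(std::malloc(n * sizeof(float)));
    if (host_output == nullptr) return 0.0f;   // allocation failed, nothing to free
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        host_output[i] = 1.0f;
        total += host_output[i];
    }
    std::free(host_output);                    // correct: matches the malloc above
    return total;
}

// Stack case: the array has automatic storage and is released at scope exit.
float sum_stack_buffer() {
    float array_dist[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float total = 0.0f;
    for (float v : array_dist) total += v;
    // No free(array_dist) here: it was never returned by malloc,
    // so passing it to free would be an error.
    return total;
}
```

If the arrays in the host code are declared like `array_dist` above, removing the corresponding `free` calls is likely the fix. If they are allocated with `malloc`, the `free` calls are required.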
6 Sep 2021 by OriginalGriff
This is a linker error, and it's telling you that the main function that it needs in order to run your code isn't there. main is the start point for any C based application, and it can't run without it. Probably, you aren't compiling the right...
7 Aug 2023 by prother123
When I try to compile the host code using the make command, it gives me this error: `/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crt1.o: In function `_start': (.text+0x20): undefined reference to `main' collect2: error: ld...
24 Oct 2017 by Packt Publishing
In this section, we'll take our first steps in using the low-level TensorFlow API.
18 Aug 2014 by kyleK89
I'm trying to make kernel.cl with opencv for (i = 0; i
7 May 2020 by Arthur V. Ratz
In this article I will thoroughly discuss several aspects of using the revolutionary new Intel® oneAPI HPC Toolkit to deliver modern code that implements a parallel “stable” sort
5 Sep 2021 by Richard MacCutchan
This is the fifth question you have posted on this subject, which suggests that maybe you are trying to get ahead of yourself. Perhaps spending some time studying the OpenCL documentation, and some of the tutorials, would be a good idea.
14 Aug 2014 by kyleK89
The original C code is: void GaussianArray(float* Gauss, int variable){ for(int i = 1; i 0; j-- ) { Gauss[j] += Gauss[j - 1]; } ...
14 Aug 2014 by Sergey Alexandrovich Kryukov
You misunderstand something very basic in computing. First, OpenCL is not a language, it is a framework: http://en.wikipedia.org/wiki/OpenCL. So, first, the C language remains the C language. Second, high-level languages produce the same code, whether for a kernel or not. In...
5 Oct 2020 by Member 12087553
I have been developing a project with C#, OpenCL.NET, AForge, and OpenCV. I have a big problem: I can't release memory in OpenCL.NET. I tried release() and dispose(), but the memory wasn't released. So I need the method that can...
10 Sep 2021 by Member 15329613
No one is going to give you their email address and no one knows how to compile the code you are talking about because you haven't said anything about the code. This feels like SPAM.
10 Sep 2021 by Richard MacCutchan
You have posted a number of questions on this subject and as far as I can see, there are no OpenCL experts active here. I suggest you try the forums at OpenCL | NVIDIA Developer.
10 Sep 2021 by Dave Kreskowiak
You already have an expert. It's your teacher.
15 Sep 2020 by Member 12087553
Hello, I have been reviewing an OpenCL library in C# to use in my project. But I don't know how to get the calculated data back from the OpenCL library. public static void RunGPU() { try { EasyCL cl = new EasyCL() { ...
15 Sep 2020 by Richard MacCutchan
You will most likely get a quicker response at OpenCL Overview - The Khronos Group Inc
24 Dec 2015 by Dave Kreskowiak
OK, so when do you start writing it? If you came here looking for someone to just hand over completed code to you you've come to the wrong site.
14 Aug 2014 by Android on Intel
The standard API for 3D graphics on Android is OpenGL ES, which is the most widely used 3D graphics API on all mobile devices today.
24 Oct 2017 by Intel
The Intel® Computer Vision SDK is a new software development package for development and optimization of computer vision and image processing pipelines for Intel System-on-Chips (SoCs).
1 Jun 2016 by Android on Intel
Intel® System Studio 2017 Beta has been released. This is the Beta program page which guides you further on Intel® System Studio 2017 Beta new features and enhanced usability experience.
18 Aug 2017 by Intel
This paper addresses how the Smart Video (SV) system architecture is increasing in complexity and evolving into new industries and use cases.
8 Jul 2010 by pnolte64
This article helps make OpenCL™ easier to understand and implement.
10 Oct 2017 by Intel
The Face Access Control application is one of a series of IoT reference implementations aimed at instructing users on how to develop a working solution for a particular problem.
31 Aug 2017 by Javier Luis Lopez
To obtain a summary value from results such as those in the following code: int k = get_global_id(0); double result=d[k]*d[k]; reductions must be used, which are very difficult to implement and reduce code clarity, as said in the following link:...
10 Feb 2019 by tugrulGtx
OpenCL 2.0 already has reductions for thread groups. The work_group_reduce_add() function adds the elements of all participating threads into a single value and broadcasts it to all participating threads. Its execution time will be comparable to an optimized custom algorithm so it can get automatically...
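To make the semantics described above concrete, here is a CPU-side C++ emulation of a work-group sum. It is only a sketch of the kind of tree reduction a hand-written OpenCL 1.x kernel would perform in local memory, not the actual implementation behind `work_group_reduce_add()`. The function names `work_group_sum_emulated` and `sum_of_squares` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Emulates a work-group sum. `local` plays the role of __local memory with one
// element per work-item; it is taken by value so the caller's data is untouched.
double work_group_sum_emulated(std::vector<double> local) {
    assert(!local.empty());                    // a work-group has at least one work-item
    const std::size_t n = local.size();
    for (std::size_t stride = 1; stride < n; stride *= 2) {
        // On a device, each iteration of this inner loop would be a separate
        // work-item, and all of them would run concurrently.
        for (std::size_t i = 0; i + stride < n; i += 2 * stride) {
            local[i] += local[i + stride];
        }
        // A kernel would place barrier(CLK_LOCAL_MEM_FENCE) here before the next stride.
    }
    return local[0];                           // the value every work-item would receive
}

// Mirrors the snippet from the question: each work-item computes d[k]*d[k],
// then the group combines the results into one value.
double sum_of_squares(const std::vector<double>& d) {
    std::vector<double> squared(d.size());
    for (std::size_t k = 0; k < d.size(); ++k) squared[k] = d[k] * d[k];
    return work_group_sum_emulated(squared);
}
```

The tree shape needs about log2(n) rounds instead of n sequential additions, which is why a built-in reduction and a well-tuned custom one tend to perform similarly.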
21 Jul 2021 by aroman
In this post I explore Lattice Boltzmann methods and build a related project
1 Sep 2009 by ChaoJui
Image processing with a burst of performance from CUDA
20 Sep 2015 by Bartlomiej Filipek
A little guide about modern OpenGL and why it gives us so much value.
14 Aug 2017 by Intel
Intel is uniquely positioned for AI development: Intel’s AI ecosystem offers solutions for all aspects of AI by providing a unified front end for a variety of backend technologies, from hardware to edge devices.
29 Jan 2013 by Dilan Shaminda
Hi, I want to know what techniques I can follow in order to reduce noise in webcam images in low, high and normal lighting conditions. I am going to extract the features in the image. Any suggestions? Thank you
29 Jan 2013 by Sergey Alexandrovich Kryukov
I can advise a couple of cross-platform C++ libraries which certainly offer noise reduction algorithms: OpenCV. Please see: http://en.wikipedia.org/wiki/OpenCV, http://opencv.org/. CImg. Please see: http://cimg.sourceforge.net/. Both libraries are open source. Good...
25 Jan 2018 by Intel
This tutorial will walk you through the basics of using the Deep Learning Deployment Toolkit's Inference Engine (included in the Intel® Computer Vision SDK).
19 Oct 2011 by headmyshoulder
odeint v2 - Solving ordinary differential equations in C++
12 Feb 2015 by Optimistic76
Hello, I'm trying to integrate the OpenCascade framework into a Qt application but I'm facing real problems. I tried to embed OpenCascade code in a QOpenGLWidget but nothing shows. Does anyone have some examples to show me? Thank you
14 May 2017 by Mahdi Nejadsahebi
Have a good time. I have a problem in OpenCL 1.2. Look, I have a global array in the kernel and the group size is 1000. The problem is that the atomic_add() function doesn't work correctly. My kernel code is: buffer[3] = 100; atomic_add(&buffer[3], 1); If I create 1000...
26 Mar 2017 by Richard MacCutchan
See OpenCL 2.0 Atomics Overview.
14 May 2017 by tugrulGtx
``` buffer[3] = 100; ``` is a race condition. Do it from CPU host side or another kernel.
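The effect of that race can be sketched with a sequential C++ simulation. This is not OpenCL code: work-items run one after another here, which is the most benign ordering possible, and `+= 1` stands in for `atomic_add(&buffer[3], 1)`. On a real device the interleaving of the plain store with the atomic increments is not well defined. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Simulates the kernel from the question: every work-item resets buffer[3]
// and then increments it, so each reset wipes out the earlier increments.
int run_kernel_with_in_kernel_init(std::size_t work_items) {
    assert(work_items > 0);
    int buffer[4] = {0, 0, 0, 0};
    for (std::size_t id = 0; id < work_items; ++id) {
        buffer[3] = 100;   // executed by EVERY work-item, not just once
        buffer[3] += 1;    // stands in for atomic_add(&buffer[3], 1)
    }
    return buffer[3];
}

// Simulates the suggested fix: the initial value is written once from the
// host side, and the kernel body only performs the increment.
int run_kernel_with_host_init(std::size_t work_items) {
    assert(work_items > 0);
    int buffer[4] = {0, 0, 0, 100};   // "host side" initialisation, done once
    for (std::size_t id = 0; id < work_items; ++id) {
        buffer[3] += 1;               // stands in for atomic_add(&buffer[3], 1)
    }
    return buffer[3];
}
```

With 1000 simulated work-items, the first version ends at 101 while the second ends at 1100, which is the count the question was presumably expecting.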
31 Aug 2017 by Javier Luis Lopez
I tried the posted code. My idea was to obtain partial sums of the input data in the array rms, then use barriers (GLOBAL and LOCAL) to wait until all rms[k] are filled, then sum them all to obtain the mean value. I placed some printf calls to report whether there are errors in the calculation. I obtained errors...
4 Sep 2017 by Javier Luis Lopez
I am not very happy with this solution: OpenCL 1.2 does not allow synchronization across all work groups, as I stated, so the code must leave the kernel and enter a new one to use data from all work items. If somebody knows how to do it in the new OpenCL 2.x standard I would appreciate it. ...
15 Aug 2011 by caglarozbek89
Is there anyone who can give me some hints about writing the internet checksum algorithm in OpenCL? Or who has the OpenCL code for this algorithm?
15 Aug 2011 by Richard MacCutchan
Is this what you are looking for?
28 Jul 2011 by caglarozbek89
Is there anybody who can give me some direction on writing code that calculates pi using OpenCL? If you have any pi calculator sample code, please share it with me. Thanks for your great interest.
29 Jul 2011 by YvesDaoust
The first question is "How many decimal places?" 1000000, 10000000, 100000000, 1000000000... ? This will influence the choice of an algorithm. Then there is the issue of parallelizing the algorithm. Possibly just by parallelizing the extended-precision arithmetic primitives (+, -, *, /,...