OpenCL

Great Reads

by Nick Kopp
This article builds upon the earlier High Performance Queries: GPU vs. PLINQ vs. LINQ and ports this to also support OpenCL devices and adds benchmarking so you can easily compare performance.
by tugrulGtx
Multi-device OpenCL load balancer and pipeliner for C# in a few lines of code.
by John Michael Hauck
It has never been easier for C# desktop developers to write code that takes advantage of the amazing computing performance of modern graphics cards. In this post I will share some techniques for solving a simple (but still interesting) image analysis problem. Source Code https://www.assembla.com/co
by Matt Scarpino
Using GPU Acceleration to Compute Ray-Triangle Intersection

Latest Articles

by aroman
In this post I explore Lattice Boltzmann methods and build a related project
by tugrulGtx
Header-only C++ tool that supports basic array-like usage pattern and uses multiple graphics cards in system as storage with LRU caching
by tugrulGtx
Accessing VRAM-cached nucleotide sequences in FASTA formatted files (*.fna, *.faa) by index
by Arthur V. Ratz
In this article, I will thoroughly discuss several aspects of using the revolutionary new Intel® oneAPI HPC Toolkit to deliver modern code that implements a parallel “stable” sort

All Articles

10 Jan 2024 by w4de
Compose a programme using OpenCL and local memory. Task: b=min(A+C). I have a ready block diagram, but the teacher says that it shows everything sequentially, whereas the programme is parallel, and I don't know how to fix the block diagram. ...
10 Jan 2024 by OriginalGriff
Um. No. That's not a diagram of how your app should work: it's a mishmash of an app and instructions to produce it - and if your assignment wants you to produce a parallel based app, then you need to examine the assignment carefully and identify...
19 Dec 2023 by M Imran Ansari
Creating a flowchart for OpenCL code involves representing the logical flow of the program as steps. Note that OpenCL programming typically involves both host (CPU) and device (GPU) code flow. Check the link below as a guide and try to build of...
19 Dec 2023 by w4de
Help me make a flowchart for OpenCL code; I don't quite understand what it should look like. The code performs the min(A+C) operation. #define _CRT_SECURE_NO_WARNINGS #define CL_USE_DEPRECATED_OPENCL_1_2_APIS #include #include...
7 Aug 2023 by prother123
When I try to compile the host code using the make command, I get this error: `/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crt1.o: In function `_start': (.text+0x20): undefined reference to `main' collect2: error: ld...
17 Oct 2021 by prother123
When compiling the code below, the following error showed up: error: passing 'int *' to parameter of type ' int *' changes address space of pointer array_dist[i] = Euclidean_distance(X_train, mydatapoint); note: passing...
17 Oct 2021 by Richard MacCutchan
I am not an OpenCL expert but looking at your code I suspect it is because you are passing a local address to a function that expects a global: // your function definition inline float Euclidean_distance(__global int* restrict array_point_A,...
14 Oct 2021 by prother123
While compiling the host code, specifically this line of code: cl_program program = clCreateProgramWithBinary(context, 1, &fpga_device, &length, (const unsigned char**)&binary, NULL, NULL); I got this error: acl_util.h:100: safe_memcpy:...
14 Oct 2021 by prother123
I am trying to read the binary file, kernel.aocx, from the host code, but I got a segmentation fault. I understand that a segmentation fault is caused by accessing memory that does not belong to me, but I am not sure why this happened...
15 Sep 2021 by prother123
I am compiling the host code using the make command. The errors below showed up. The part of the code that caused these errors: free(data_point); free(host_output); free(index_arr); free(array_dist); Errors: host/src/host.cpp:380:21:...
15 Sep 2021 by OriginalGriff
You should tidy up after yourself, yes - but you can only free objects that you have allocated with malloc. You don't free items from the stack, which includes local and global variables, unless they contain a pointer to a heap object (i.e. one...
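The rule in the answer above can be sketched in plain C++ as follows. This is a minimal illustration, not the poster's host code: `make_host_buffer` and `cleanup_example` are hypothetical names, and the buffer size is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocates n zero-initialised floats on the heap.
// The caller owns the returned pointer and must release it with free().
float* make_host_buffer(std::size_t n) {
    return static_cast<float*>(std::calloc(n, sizeof(float)));
}

// Returns how many heap blocks were released (for illustration only).
int cleanup_example() {
    int released = 0;

    float stack_array[16] = {0};               // automatic storage: never pass to free()
    float* heap_array = make_host_buffer(16);  // heap storage: must be freed exactly once
    assert(heap_array != nullptr);

    heap_array[0] = stack_array[0] + 1.0f;     // both arrays are used the same way

    std::free(heap_array);                     // valid: the pointer came from calloc
    heap_array = nullptr;                      // guards against an accidental second free
    ++released;

    // std::free(stack_array);                 // invalid: stack_array was not heap-allocated
    return released;
}
```

Only the pointer obtained from the allocator is passed to `free`; the array with automatic storage is released by the language when the function returns.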
15 Sep 2021 by prother123
Hi all, When trying to compile the host code using a command in the MobaXterm terminal, I got this error. I am not sure what to do to fix it. I need help. In file included from /usr/include/c++/4.8.2/array:35:0, ...
15 Sep 2021 by Richard MacCutchan
Do what the message tells you and add the appropriate option to the Makefile entry for that source module.
10 Sep 2021 by Dave Kreskowiak
You already have an expert. It's your teacher.
10 Sep 2021 by Richard MacCutchan
You have posted a number of questions on this subject and as far as I can see, there are no OpenCL experts active here. I suggest you try the forums at OpenCL | NVIDIA Developer.
10 Sep 2021 by Member 15329613
No one is going to give you their email address and no one knows how to compile the code you are talking about because you haven't said anything about the code. This feels like SPAM.
6 Sep 2021 by OriginalGriff
This is a linker error, and it's telling you that the main function that it needs in order to run your code isn't there. main is the start point for any C based application, and it can't run without it. Probably, you aren't compiling the right...
5 Sep 2021 by Richard MacCutchan
This is the fifth question you have posted on this subject, which suggests that maybe you are trying to get ahead of yourself. Perhaps spending some time studying the OpenCL documentation, and some of the tutorials, would be a good idea.
3 Sep 2021 by prother123
I have written the kernel code that will run on an FPGA device. Currently, I am writing the host code, which will run on the CPU. In the last meeting with my professor, he told me that the data (arrays in my case) in the host should not be located in...
29 Aug 2021 by prother123
I wrote the following kernel code, but it gave me the following warning. I have tried different solutions but failed. Please, any suggestion would help. 28:48: warning: incompatible integer to pointer conversion passing '__global int' to parameter...
29 Aug 2021 by Richard MacCutchan
Look at your declaration of Euclidean_distance : inline float Euclidean_distance(int * array_point_A, int * array_point_B) { // both parameters are pointers to arrays And now the call: array_dist[i] = Euclidean_distance(X_train[i], ...
29 Aug 2021 by OriginalGriff
X_train is a parameter, and is declared as: __global int * restrict X_train __kernel void KNN_classifier(__global int * restrict X_train, __global int * restrict Y_train, __global int * restrict data_point, int k) The method you are...
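The warning quoted in the question (an integer passed where a pointer is expected) can be reproduced and avoided in plain C++. The sketch below is a host-side analogue under stated assumptions: the training data is taken to be a flattened row-major array, `dims` and `distance_to_row` are hypothetical, and the OpenCL `__global` address-space qualifiers are omitted.

```cpp
#include <cassert>
#include <cmath>

// Distance between two points with `dims` coordinates each.
// Both arguments are pointers to the first coordinate of a point.
float euclidean_distance(const int* a, const int* b, int dims) {
    assert(dims > 0);
    float sum = 0.0f;
    for (int d = 0; d < dims; ++d) {
        const float diff = static_cast<float>(a[d] - b[d]);
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Hypothetical wrapper: distance from training row i to `point`.
// Assumes x_train holds rows back to back (row-major, dims values per row).
float distance_to_row(const int* x_train, int i, const int* point, int dims) {
    // x_train[i] would be a single int, which is what triggers the warning.
    // &x_train[i * dims] is a pointer to the first coordinate of row i.
    return euclidean_distance(&x_train[i * dims], point, dims);
}
```

In an actual kernel the helper's parameters would also need address-space qualifiers that match the arguments, which is the point raised in the 17 Oct 2021 answer.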
28 Aug 2021 by Richard MacCutchan
Please try to do your own research: OpenCL coding - Google Search
21 Jul 2021 by aroman
In this post I explore Lattice Boltzmann methods and build a related project
15 Mar 2021 by tugrulGtx
Header-only C++ tool that supports basic array-like usage pattern and uses multiple graphics cards in system as storage with LRU caching
2 Mar 2021 by tugrulGtx
Accessing VRAM-cached nucleotide sequences in FASTA formatted files (*.fna, *.faa) by index
5 Oct 2020 by Member 12087553
I have been developing a project with C#, OpenCL.NET, AForge, and OpenCV. I have a big problem: I can't release memory in OpenCL.NET. I tried release() and dispose(), but the memory wasn't released. So I need a method that can...
15 Sep 2020 by Member 12087553
Hello, I have been reviewing an OpenCL library in C# for use in my project, but I don't know how to get the calculated data back from the library. public static void RunGPU() { try { EasyCL cl = new EasyCL() { ...
15 Sep 2020 by Richard MacCutchan
You will most likely get a quicker response at OpenCL Overview - The Khronos Group Inc
23 Jun 2020 by PontiacGTX
I am trying to compile an OpenCL project where I expect an output buffer to be assigned through a cl_mem object, but when clEnqueueReadBuffer executes, the std::vector items in the array aren't assigned. The source code for the host in C++...
7 May 2020 by Arthur V. Ratz
In this article, I will thoroughly discuss several aspects of using the revolutionary new Intel® oneAPI HPC Toolkit to deliver modern code that implements a parallel “stable” sort
18 Feb 2019 by Apriorit Inc, ruksovdev
A detailed description of an FPGA-specific framework called ISE Design Suite, and the main steps you need to take in order to create a VGA driver using FPGA
10 Feb 2019 by tugrulGtx
OpenCL 2.0 already has reductions for thread groups. The work_group_reduce_add() function adds all participating threads' elements into a single value and broadcasts it to all participating threads. Its execution time will be comparable to an optimized custom algorithm so it can get automatically...
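The built-in itself only runs inside an OpenCL 2.0 kernel, so the following is a plain C++ model of the behaviour the snippet describes: every work-item contributes one element, and every work-item receives the same total. The pairwise tree is one possible shape for a custom reduction, chosen here as an assumption; `group_reduce_add` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of an add-reduction over one work-group.
// items[i] is the value contributed by work-item i; the returned vector
// holds what each work-item would see after the reduction.
std::vector<int> group_reduce_add(const std::vector<int>& items) {
    assert(!items.empty());
    std::vector<int> scratch(items);  // stands in for local memory

    // Pairwise tree reduction. Each pass of the outer loop corresponds to
    // one barrier-separated step in a hand-written kernel.
    for (std::size_t stride = 1; stride < scratch.size(); stride *= 2) {
        for (std::size_t i = 0; i + stride < scratch.size(); i += 2 * stride) {
            scratch[i] += scratch[i + stride];
        }
    }

    // Broadcast: every work-item gets the same total.
    return std::vector<int>(items.size(), scratch[0]);
}
```

For five work-items holding 1 to 5, each of the five returned slots holds 15.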
11 Nov 2018 by tugrulGtx
but a lot of work must be done in addition to parallelizing the algorithm, that is, system preparation, buffer creation, data transfers. If you are rewriting the whole set of OpenCL stages for every new programming task, then you need to shorten the writing part by using OOP, such as creating an object...
11 Nov 2018 by Javier Luis Lopez
I program GPUs using OpenCL, but I would be happy with an easier system to parallelize the program. Which of them requires the least code to be changed to run on the GPU? Do C++ AMP and Thrust allow running several functions sequentially inside the GPU before returning results? What I have tried: I made...
28 Aug 2018 by DaveAuld
The pursuit of Serenity, it's new build time!
16 May 2018 by Javier Luis Lopez
The only solution is to RUN C++ MULTITHREAD on GPU and completely ABANDON OPENCL or CUDA forever. I am speaking about increasing the performance of a simple PC with multithreading by 41x, and an improvement over very complex OpenCL software by 12x. Of course a lot of modifications on HW and drivers...
16 May 2018 by Javier Luis Lopez
It is very hard to use the GPU because the user has to do memory segmentation and transfer and use local memory, and in most applications only a very low performance increase of 10-20x is reached. On the other hand, using multithreading is easy and fast. It would be better to use 1280 threads in parallel...
16 May 2018 by KarstenK
It depends on what you want to do. Even multithreading isn't optimal when a lot of short threads are running, because multithreading also means overhead in the CPU. Graphical output and low-level computations are best done on the GPU, computations also when the usage of the GPU leads to less...
3 Apr 2018 by Intel
The Retail Workshop: Hands on Learning with Intel®-based Retail Solutions
23 Feb 2018 by Member 13680125
Boosting Efficiency and Performance for Automotive, Networking, and Cloud Computing
13 Feb 2018 by Intel
The SDK includes components to develop applications: IDE integration, offline compiler, debugger, and other tools.
13 Feb 2018 by Intel
Intel just released Intel® System Studio 2018, an all-in-one, cross-platform, comprehensive tool suite for system and IoT device application development.
25 Jan 2018 by Intel
This tutorial will walk you through the basics of using the Deep Learning Deployment Toolkit's Inference Engine (included in the Intel® Computer Vision SDK).
24 Oct 2017 by Intel
The Intel® Computer Vision SDK is a new software development package for development and optimization of computer vision and image processing pipelines for Intel System-on-Chips (SoCs).
24 Oct 2017 by Packt Publishing
In this section, we'll take our first steps in using the low-level TensorFlow API.
10 Oct 2017 by Intel
The Face Access Control application is one of a series of IoT reference implementations aimed at instructing users on how to develop a working solution for a particular problem.
5 Oct 2017 by tugrulGtx
Multi-device OpenCL load balancer and pipeliner for C# in a few lines of code.
18 Sep 2017 by Intel
OpenCL™ Drivers and Runtimes for Intel® Architecture
13 Sep 2017 by Javier Luis Lopez
Is it possible to copy float arrays to float4? I do not know if float4 array elements can be aligned with a float array to copy them. I tried this but it failed to compile: What I have tried: #define WD2 WIDTH/4 __global float A[WIDTH*HEIGHT]; ... __local float4 B[WD2];...
13 Sep 2017 by Javier Luis Lopez
Finally vloadn works, but unfortunately I have to copy to only one vector, not an array of them: __global float* imagen0 ... long pix = get_global_id(0); if (pix==0) { float16 vv=vload16(0,imagen0); printf("===GPU vv: %6v16f \n",vv); } if (pix
4 Sep 2017 by Javier Luis Lopez
I am not very happy with this solution: OpenCL 1.2 does not allow synchronizing across all work groups, as I stated, so the code must leave the kernel and enter a new one to use data from all work items. If somebody knows how to do it in the new OpenCL 2.x standard, I would appreciate it. ...
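The "leave the kernel and enter a new one" approach can be modelled on the host as two passes, where the return from the first function plays the role of the synchronization point between kernels. This is a plain C++ sketch, not the poster's code: the function names, the `group_size` parameter, and the sum-of-squares workload are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pass 1 (models the first kernel): each work-group sums the squares of
// its own slice of d and writes a single partial result.
std::vector<double> partial_sums_of_squares(const std::vector<double>& d,
                                            std::size_t group_size) {
    assert(group_size > 0);
    std::vector<double> partial((d.size() + group_size - 1) / group_size, 0.0);
    for (std::size_t k = 0; k < d.size(); ++k) {
        partial[k / group_size] += d[k] * d[k];
    }
    return partial;
}

// Pass 2 (models the second kernel, started only after pass 1 has finished):
// combines the per-group partials into the final RMS over n elements.
double rms_from_partials(const std::vector<double>& partial, std::size_t n) {
    assert(n > 0);
    double total = 0.0;
    for (double p : partial) {
        total += p;
    }
    return std::sqrt(total / static_cast<double>(n));
}
```

Because pass 2 only starts once pass 1 has returned, it can safely read every partial result, which is the guarantee a single kernel cannot provide across work-groups in OpenCL 1.2.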
31 Aug 2017 by Intel
Intel® GO™ SDK Offers Automotive Solution Developers an Integrated Solutions Environment
31 Aug 2017 by Javier Luis Lopez
I tried the posted code. My idea was to obtain partial sums of the input data in the array rms, then place barriers (GLOBAL and LOCAL) to wait until all rms[k] are filled, then sum them all to obtain the mean value. I placed some printf calls to report whether there are errors in the calculation. I obtained errors...
31 Aug 2017 by Javier Luis Lopez
To obtain a summary value from results like those in the following code: int k = get_global_id(0); double result=d[k]*d[k]; reductions must be used, which are very difficult to implement and reduce code clarity, as said in the following link:...
30 Aug 2017 by morzel
Detecting a Drone - OpenCV in .NET for Beginners (Emgu CV 3.2, Visual Studio 2017). Part 1
29 Aug 2017 by Javier Luis Lopez
I have made a VS2013 project to test OpenCL, on GitHub in the OpenCL dir: GitHub - jlopez2022/cpp_utils: Example of c++ programs. In that example I calculated the differential rms of a big vector (200 mega size); on the CPU in debug mode it calculated at 100 Megaops/data. On the CPU in release mode the...
29 Aug 2017 by Jochen Arndt
The speed doubling in release mode is not caused by parallel processing. It is caused by the compiler optimising the code in release mode and omitting additional checks which are done in debug builds. You have to explicitly write code for parallel processing. Which method is finally faster...
18 Aug 2017 by Intel
The Intel® Computer Vision SDK is an Intel-optimized and accelerated computer vision software development kit based on the OpenVX standard. The SDK integrates pre-built OpenCV with deep learning support using an included Deep Learning (DL) Deployment toolkit.
18 Aug 2017 by Intel
This paper addresses how the Smart Video (SV) system architecture is increasing in complexity and evolving into new industries and use cases.
14 Aug 2017 by Intel
Intel is uniquely positioned for AI development: Intel’s AI Ecosystem offers solutions for all aspects of AI by providing a unified front end for a variety of backend technologies, from hardware to edge devices.
1 Jun 2017 by Intel
This paper introduces Intel software tools recently made available to accelerate deep learning inference in edge devices (such as smart cameras, robotics, autonomous vehicles, etc.) incorporating Intel® Processor Graphics solutions across the spectrum of Intel SOCs.
14 May 2017 by Mahdi Nejadsahebi
Have a good time. I have a problem in OpenCL 1.2. Look, I have a global array in the kernel and the group size is 1000. The problem is that the atomic_add() function doesn't work correctly. My kernel code is: buffer[3] = 100; atomic_add(&buffer[3], 1); If I create 1000...
14 May 2017 by tugrulGtx
`buffer[3] = 100;` is a race condition. Do it from the CPU host side or in another kernel.
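Why the plain store matters can be seen in a small sequential model. The C++ sketch below does not use real concurrency; it replays two orderings by hand, and both function names are hypothetical. The first follows the advice above (initialise once, outside the kernel). The second is just one possible interleaving when every work-item also executes the store.

```cpp
#include <cassert>

// Initialisation done once by the host (or an earlier kernel);
// the work-items then only perform the add.
int init_on_host_then_add(int work_items) {
    assert(work_items >= 0);
    int slot = 100;  // stands in for buffer[3], written before the kernel runs
    for (int id = 0; id < work_items; ++id) {
        slot += 1;   // stands in for atomic_add(&buffer[3], 1)
    }
    return slot;
}

// One possible interleaving when every work-item runs
// "buffer[3] = 100; atomic_add(&buffer[3], 1);": each store lands just
// before that work-item's own add and wipes out the earlier increments.
int store_in_kernel_one_interleaving(int work_items) {
    assert(work_items >= 0);
    int slot = 0;
    for (int id = 0; id < work_items; ++id) {
        slot = 100;  // plain store from work-item id overwrites prior adds
        slot += 1;   // this work-item's atomic add
    }
    return slot;
}
```

With 1000 work-items the first version yields 1100, while the replayed interleaving yields 101. On a real device the outcome of the second pattern depends on scheduling and can differ from run to run.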
11 Apr 2017 by Intel
As IoT demand drives increases in data volume, a more powerful processor is required, as well as additional storage.
11 Apr 2017 by Intel
Digital displays and signs are all around you. You may have seen them cropping up at shopping centers and doctors’ offices. From video walls, to AR fitting mirrors, to ordering menus, digital signs are pervasive and are becoming a part of everyday shopping experience.
26 Mar 2017 by Richard MacCutchan
See OpenCL 2.0 Atomics Overview.
23 Feb 2017 by Intel
From safe roads to enjoyable commutes, automated driving is poised to change lives and society for the better.
30 Nov 2016 by Dino Konstantopoulos
Running Theano with an Nvidia 1070 GPU on Windows 10, with CUDA 8 and Visual Studio 2015
10 Nov 2016 by Farhad Reza
This article will show you how you can use the OpenGL graphics library in Google's Go language.
1 Jun 2016 by Android on Intel
Intel® System Studio 2017 Beta has been released. This is the Beta program page which guides you further on Intel® System Studio 2017 Beta new features and enhanced usability experience.
19 Apr 2016 by Android on Intel
In this guide, we will show a variety of tools to use as well as features in the Unity software that can help you enhance the performance of your Unity project.
17 Apr 2016 by Ryan Scott White
An assembler/compiler for AMD’s GCN (Graphics Core Next architecture) assembly language
12 Apr 2016 by Shao Voon Wong
Finding lexicographical permutations on GPU
12 Apr 2016 by Shao Voon Wong
Using SSE2 to speed up alphablending.
21 Mar 2016 by Javier Luis Lopez
Many people have a lot of trouble installing the Microsoft SDKs. I then discovered that I must uninstall the faulty VS2010 runtimes and install directx_Jun2010_redist.exe instead of DXSDK_Jun10.exe. The questions are: - Is it necessary to reinstall the VS2010 runtime or is it better...
21 Mar 2016 by KarstenK
If you really need that installation, you must reinstall it. Install the newest SDK which fulfills your needs. Tip 1: You shouldn't touch the registry, but first run the official uninstaller once. Tip 2: Deactivate the UAC or set it at a low level for the installation.
1 Mar 2016 by Android on Intel
In this article we are going to do a walkthrough of how to do CPU-bound offline analysis of the workflow.
16 Feb 2016 by Max R McCarty
OWASP's #6 most vulnerable security risk has to do with keeping secrets secret.
24 Dec 2015 by Dave Kreskowiak
OK, so when do you start writing it? If you came here looking for someone to just hand over completed code to you, you've come to the wrong site.
1 Dec 2015 by Android on Intel
Using OpenCL™ 2.0 Read-Write Images
22 Nov 2015 by John Michael Hauck
It has never been easier for C# desktop developers to write code that takes advantage of the amazing computing performance of modern graphics cards.
1 Oct 2015 by Android on Intel
This article walks through an example Android application that offloads image processing using OpenCL™ and RenderScript programming languages.
20 Sep 2015 by Bartlomiej Filipek
A little guide about modern OpenGL and why it gives us so much value.
16 Sep 2015 by Intel
In this article, we will introduce the components of INDE and show how developers can use them to create new applications and optimize existing applications. To start with Intel® INDE provides support for IDE integration.
26 Jun 2015 by Member 11794279
When processor is not enough
26 May 2015 by Intel
In this article we are going to demonstrate how to optimize single-precision floating-point General Matrix Multiply (SGEMM) kernels for the best performance on Intel® Core™ Processors with Intel® Processor Graphics.
7 Apr 2015 by Android on Intel
The intention of this guide is to provide quick steps to create, build, debug, and analyze OpenCL™ applications with the OpenCL™ Code Builder, a part of Intel® Integrated Native Development Environment (Intel® INDE)
18 Mar 2015 by Intel
This tutorial demonstrates how to share surfaces between OpenCL™ and DirectX 11 with Intel® Processor Graphics on Microsoft Windows, using the surface sharing extension in OpenCL.
12 Feb 2015 by Optimistic76
Hello, I'm trying to integrate the OpenCascade framework in a Qt application, but I'm facing real problems. I tried to embed OpenCascade code in QOpenGLWidget, but nothing shows. Does anyone have some examples to show me? Thank you.
2 Feb 2015 by Android on Intel
This tutorial will guide you through Intel® INDE 2015 installation and demonstrate how to develop native Android* applications that target either x86 based or ARM based processors.
20 Jan 2015 by Android on Intel
This tutorial will guide you through writing a native “Hello World” Android* app in Visual Studio* through the IDE Integration feature of Intel® INDE 2015.
20 Jan 2015 by Android on Intel
This tutorial shows how to use two powerful features of OpenCL™ 2.0: enqueue_kernel functions that allow you to enqueue kernels from the device and work_group_scan_exclusive_add and work_group_scan_inclusive_add
5 Dec 2014 by TERENCE S
The Intel SDK for OpenCL Applications provides a rich mix of OpenCL extensions and optional features that are designed for developers who want to utilize all resources available on Intel CPUs. This article focuses on device fission, available as a feature in this SDK.
17 Nov 2014 by Android on Intel
In the conclusion of this two-part series, I detail the best 3D game engine and middleware solutions for Android* tablets, including free, open source, and proprietary options. I also note which have native support for x86 Intel® processors.
6 Nov 2014 by Colleen Culbertson
This article, aimed at developers, will provide a glimpse into this 64-bit, multi-core SOC processor, and gives an overview of the available Intel® technologies, including Intel® HD Graphics 5300.
6 Nov 2014 by praveen_kundurthy
This paper presents four guidelines that can help guide software developers as they design applications that encourage touch interaction and deliver a memorable user experience on Intel® processor-based pAIOs.
6 Nov 2014 by Maxim_Shevtsov
This article is an overview of the OpenCL support provided in System Analyzer and Platform Analyzer on the Windows* OS
23 Oct 2014 by pi19404
Dense Motion Estimation based on Polynomial expansion. Introduction: In this article we will look at dense motion estimation based on polynomial representation of image. The polynomial basis representation of the image is obtained by approximating the local neighborhood of image us