CUDA

Great Reads

by CodeProject
Version 2.6.2. Our fast, free, self-hosted Artificial Intelligence Server for any platform, any language
by Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing
by Maxim Kartavenkov
Describes how to make an H.264 video encoder DirectShow filter using the NVIDIA encoder API in C#
by Nick Kopp
This article builds upon the earlier High Performance Queries: GPU vs. PLINQ vs. LINQ and ports this to also support OpenCL devices and adds benchmarking so you can easily compare performance.

Latest Articles

by CodeProject
Version 2.6.2. Our fast, free, self-hosted Artificial Intelligence Server for any platform, any language
by Robert Mueller-Albrecht
Using the Intel® oneAPI Math Kernel Library SYCL API
by Ryan Scott White
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels and provides an on-the-fly view of your CUDA code.
by Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing

All Articles

27 Jun 2010 by Wayne Wood
Verify the execution efficiency of a short CUDA program when using the Thrust library
1 Jul 2010 by Wayne Wood
Verify the execution efficiency of a series of short .NET 4.0 parallel programming samples
13 Mar 2008 by billconan, kavinguy
This article describes the implementation of a neural network with CUDA.
3 May 2017 by Intel
In this blog post, we highlight one particular class of low precision networks named binarized neural networks (BNNs), the fundamental concepts underlying this class, and introduce a Neon CPU and GPU implementation.
8 Sep 2010 by Dan Buskirk
Understanding the organization of a Visual Studio project for CUDA development
16 Sep 2013 by Nick Kopp
Performing base64 encoding on a graphics processing unit using CUDAfy.NET (CUDA in .NET).
29 Feb 2024 by CodeProject
Version 2.6.2. Our fast, free, self-hosted Artificial Intelligence Server for any platform, any language
28 May 2011 by grilialex
Flow and tools to convert Xilinx bitstreams to C source code for programming FPGA/CPLD
16 Sep 2013 by Nick Kopp
This article builds upon the earlier High Performance Queries: GPU vs. PLINQ vs. LINQ and ports this to also support OpenCL devices and adds benchmarking so you can easily compare performance.
28 Mar 2023 by Ryan Scott White
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels and provides an on-the-fly view of your CUDA code.
14 Sep 2016 by Mike Lanzetta
In this post, I'll walk you through how to get one of the most popular toolkits up and running on Windows, and run through and explain some fun examples.
15 Mar 2011 by Roman Ginzburg
A text overlay filter and a JPEG/JPEG2000 encoder using transform filters.
25 Oct 2010 by hax_
Introduction to the open-source hxGrid library for distributed computing. Main benefits of the library: the cluster uses only the idle time of Windows 2000/XP/Vista workstations (no dedicated workstations required); it is easy to use; it is free.
10 Jan 2011 by phoaivu
GPU Implementation of Extended Gaussian mixture model for Background Subtraction
22 Jul 2016 by Afzaal Ahmad Zeeshan
In this post, I am going to walk you through creating your own central hub that allows your connected devices to authenticate people using a facial recognition system.
10 Sep 2009 by ChaoJui
High performance and good quality of image blurring
6 Jan 2014 by Adam Wojnar
Simple .jp2/.j2k viewer using Kakadu executables demonstration pack for decoding
17 Apr 2016 by Ryan Scott White
An assembler/compiler for AMD’s GCN (Graphics Core Next) architecture assembly language
4 May 2017 by Intel
Theano is a Python library developed at the LISA lab to define, optimize, and evaluate mathematical expressions, including the ones with multi-dimensional arrays (numpy.ndarray)
13 Oct 2012 by Alesiani Marco
A Wave PDE simulation using GPGPU capabilities
22 May 2013 by John Michael Hauck
It has never been easier for C# desktop developers to write code that takes advantage of the amazing computing performance of modern graphics cards. In this post I will share some techniques for solving a simple (but still interesting) image analysis problem. Source Code https://www.assembla.com/co
21 Sep 2013 by Mark H Bishop
Tutorial: GPU computing with JCuda and Nsight (Eclipse)
20 Jan 2015 by Android on Intel
This tutorial shows how to use two powerful features of OpenCL™ 2.0: enqueue_kernel functions that allow you to enqueue kernels from the device and work_group_scan_exclusive_add and work_group_scan_inclusive_add
16 Jul 2012 by Maxim Kartavenkov
Describes how to make an H.264 video encoder DirectShow filter using the NVIDIA encoder API in C#
16 Sep 2013 by Nick Kopp
How to get a 30x performance increase for queries by using your Graphics Processing Unit (GPU) instead of LINQ and PLINQ.
25 Jul 2016 by Igor Gribanov
Performing linear static analysis on a tetrahedral mesh with a little bit of help from a third-party solver.
2 Nov 2018 by Vangos
This post will show you how to build OpenCV for Windows with CUDA.
24 Oct 2017 by Packt Publishing
In this section, we'll take our first steps in using the low-level TensorFlow API.
27 Jun 2023 by Robert Mueller-Albrecht
Using the Intel® oneAPI Math Kernel Library SYCL API
18 Dec 2013 by Joren Heit
A Hybrid Framework Code-Generator for CUDA
10 Jul 2012 by Kerem Kat
Process webcam images on the CPU and GPU with OpenCV, CUDA and C++ AMP
9 Dec 2016 by Arthur V. Ratz
In this article, we'll demonstrate an approach that increases the performance (by up to 600%) of code implementing the conventional distribution counting algorithm (DCA), using the NVIDIA CUDA 8.0 Runtime API
10 Dec 2018 by Apriorit Inc, Vadym Zhernovyi
The experience of improving Mask R-CNN performance six to ten times by applying TensorRT
22 Jun 2020 by Thomas Daniels
In this article, let’s dive into Keras, a high-level library for neural networks.
9 Mar 2012 by Adnan Boz
From spam filters to movie recommendation and face detection, machine learning algorithms are now used everywhere to make the machine think for us. But running these algorithms requires high computation power and, in most cases, supercomputers. This is where the 500-core GPUs step in...
1 Sep 2009 by ChaoJui
Image processing with a burst of performance from CUDA
20 Sep 2015 by Bartlomiej Filipek
A little guide about modern OpenGL and why it gives us so much value.
10 May 2010 by Kevin Drzycimski
Unroll loops at compile time, deduced by a template argument.
3 Apr 2022 by Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing
19 Oct 2011 by headmyshoulder
odeint v2 - Solving ordinary differential equations in C++
16 Feb 2016 by Max R McCarty
OWASP's #6 security risk has to do with keeping secrets secret.
1 Oct 2008 by Andrew Kirillov
This article describes the implementation of parallel computations using plain C#.
9 Feb 2013 by Debdatta Basu
Examine the various approaches to implementing Radix sort on the GPU
2 May 2017 by Arthur V. Ratz
This article is a practical guide on using Intel® Threading Building Blocks (TBB) and OpenMP libraries for C++ based on the example of delivering parallel scalable code that implements Burrows-Wheeler Transformation (BWT) algorithm.
26 May 2014 by CatchExAs
How to make best use of current technology for computationally intensive applications?
2 Apr 2012 by manythreads
The sixth article in a series on portable multithreaded programming using OpenCL™, in which Rob Farber discusses how to calculate data in OpenCL™ and render it with OpenGL within the same application.
12 Apr 2016 by Shao Voon Wong
Finding lexicographical permutations on GPU
10 Nov 2020 by Jeremy C. Ong
A quick 5-minute introduction to porting a CUDA app to Data Parallel C++ (DPC++)
13 Oct 2012 by Maxim Kartavenkov
Describes how to make DirectShow filters in .NET; it consists of BaseClasses and a couple of samples
11 Jul 2013 by Matthew Faithfull
Querysoft Open Runtime: Architecture compatibility aspect.
16 Jan 2021 by Shao Voon Wong
How to convert parallel C++ ray-tracing code to CUDA, then to SYCL 2020 via Intel® DPC++
3 Aug 2014 by Sushil Sh.
How to set up an Android development environment using Eclipse and Android Studio.
24 Jun 2005 by Philippe Kirsanov
A small class representing DateTime in seconds elapsed since "01 Jan, 0001 00:00:00".
26 Jul 2012 by headmyshoulder, Denis Demidov
This article shows how ordinary differential equations can be solved with OpenCL. In detail it shows how odeint - a C++ library for ordinary differential equations - can be adapted to work with VexCL - a library for OpenCL. The resulting performance is studied on two examples.
24 Aug 2003 by Alex Mikunov
Runtime MSIL Code Instrumentation and .NET Metadata Extensions
30 Nov 2016 by Dino Konstantopoulos
Running Theano with an Nvidia 1070 GPU on Windows 10, with CUDA 8 and Visual Studio 2015
18 Sep 2017 by Intel
TotalView includes a set of tools that provide scientific and academic developers with control over processes and thread execution, along with deep visibility into program states and data.
21 May 2012 by Jeff B. Cromwell
Granger Causality in both R and C#.NET with open source libraries.
16 Sep 2013 by Nick Kopp
Ultra high quality frequency domain image rotation on a GPU.
16 Sep 2013 by Nick Kopp
An introduction to using Cudafy.NET to perform processing on a GPU
9 Jan 2013 by Denis Demidov
This article is an introduction to VexCL, a vector expression template library created to ease C++-based OpenCL development.
9 Nov 2011 by grilialex
How to embed Xilinx FPGA configuration data in AVRILOS