Vincent's Blog

An introduction to CUDA in Python (Part 5)

@Vincent Lunot · Dec 10, 2017

In Part 4 of this introduction, we saw that the performance of our convolution kernel is limited by memory bandwidth. In this part, we will see how to improve performance by using shared memory.

An introduction to CUDA in Python (Part 4)

@Vincent Lunot · Dec 4, 2017

In this part, we will learn how to profile a CUDA kernel using both nvprof and nvvp, the Visual Profiler. We will take the convolution kernel from Part 3 and use profiling to find out how to improve it.

An introduction to CUDA in Python (Part 3)

@Vincent Lunot · Dec 1, 2017

This is the third part of an introduction to CUDA in Python. If you missed the beginning, you are welcome to go back to Part 1 or Part 2. In this third part, we are going to write a convolution kernel to filter an image.

An introduction to CUDA in Python (Part 2)

@Vincent Lunot · Nov 26, 2017

In the first part of this introduction, we saw how to launch a CUDA kernel in Python using the open-source just-in-time compiler Numba. In this part, we will learn more about CUDA kernels.

An introduction to CUDA in Python (Part 1)

@Vincent Lunot · Nov 19, 2017

Writing functions directly in Python that execute on the GPU can remove bottlenecks while keeping the code short and simple. In this introduction, we show one way to use CUDA in Python and explain some basic principles of CUDA programming.