Thank you for sending your enquiry! One of our team members will contact you shortly.
Thank you for sending your booking! One of our team members will contact you shortly.
Course Outline
Introduction
- What is GPU programming?
- Why use CUDA with Python?
- Key concepts: Threads, Blocks, Grids
Overview of CUDA Features and Architecture
- GPU vs CPU architecture
- Understanding SIMT (Single Instruction, Multiple Threads)
- CUDA programming model
Setting up the Development Environment
- Installing CUDA Toolkit and drivers
- Installing Python and Numba
- Setting up and verifying the environment
Parallel Programming Fundamentals
- Introduction to parallel execution
- Understanding threads and thread hierarchies
- Working with warps and synchronization
Working with the Numba Compiler
- Introduction to Numba
- Writing CUDA kernels with Numba
- Understanding @cuda.jit decorators
Building a Custom CUDA Kernel
- Writing and launching a basic kernel
- Using threads for element-wise operations
- Managing grid and block dimensions
Memory Management
- Types of GPU memory (global, shared, local, constant)
- Memory transfer between host and device
- Optimizing memory usage and avoiding bottlenecks
Advanced Topics in GPU Acceleration
- Shared memory and synchronization
- Using streams for asynchronous execution
- Multi-GPU programming basics
Converting CPU-based Applications to GPU
- Profiling CPU code
- Identifying parallelizable sections
- Porting logic to CUDA kernels
Troubleshooting
- Debugging CUDA applications
- Common errors and how to resolve them
- Tools and techniques for testing and validation
Summary and Next Steps
- Review of key concepts
- Best practices in GPU programming
- Resources for continued learning
Requirements
- Python programming experience
- Experience with NumPy (ndarrays, ufuncs, etc.)
Audience
- Developers
14 Hours
Testimonials (1)
Very interactive with various examples, with a good progression in complexity between the start and the end of the training.