Sol 1
CUDA basics + resources
Finally got setup with CMake and Nvidia Nsight to get going with CUDA programming. Surprisingly, there’s not a lot of content around it, despite its importance in running the entire GPU-based infrastructure for AI.
A majority of the tutorials on CUDA are quite old. I guess the AI revolution sent all the Gandalfian C++ wielding CUDA devs into their lairs, driven to pull every ounce of performance out of the NVIDIA GPU clusters.
Some materials which are proving quite useful in this effort are:
There is definitely some resistance in moving from Python, which benefits from using a garbage collector, and a dynamic type system, to C++ which violently warns when you’ve goofed up your memory allocations, or mismatched your types. Learning to work with the compiler and not against it is something I came across in trying out Rust. I was definitely contemplating getting on the Rust hype train, but figured I’d have enough time for it later on.
I am also discovering the utility in using Makefiles, which help you run long or complex commands for single or multiple files, with a decent granularity of control. The cycle of build-run-clean is an interesting process, and definitely pushes you to structure your projects in a different way, compared to the typical Python project structure I have come across.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
# NVCC Compiler
NVCC=/usr/local/cuda-12.6/bin/nvcc
# Directories
SRC_DIR=src
INCLUDE_DIR=include
BIN_DIR=bin
# Source file and target exec
SRC=$(SRC_DIR)/vecAdd.cu
TARGET=$(BIN_DIR)/vecAdd
# Compiler flags
NVCC_FLAGS=-I$(INCLUDE_DIR)
# Build the project
build: $(TARGET)
# Compile CUDA src file
$(TARGET): $(SRC) | $(BIN_DIR)
$(NVCC) $(NVCC_FLAGS) $(SRC) -o $(TARGET)
@echo "Build complete: $(TARGET)"
# Create bin directory if it doesn't exist
$(BIN_DIR):
mkdir -p $(BIN_DIR)
# Run compiled CUDA executable
run: build
@echo "Running the program..."
$(TARGET)
# Clean the build
clean:
rm -rf $(BIN_DIR)
@echo "Cleanup complete"
.PHONY: build run clean
Looking forward to some interesting CUDA projects in future sols.