NVIDIA GPU Code Generation Guide - sdake/dotfiles GitHub Wiki
NVIDIA GPU Code Generation Guide
Overview
GPU code comes in two forms:
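- SASS: Machine code for one specific GPU architecture (for example sm_80).
- PTX: Intermediate code that the driver JIT-compiles to SASS at load time.
During compilation these are packaged into containers: a CUBIN holds SASS for a single architecture, and a FATBIN bundles several CUBIN and PTX images.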
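The practical difference between the two forms is which devices can run them. The sketch below models the usual compatibility rules: SASS runs on devices with the same major version and an equal or higher minor version, while PTX can be JIT-compiled for any device at or above its target. The function name `can_run` and the tuple representation are illustrative assumptions, and the sketch ignores architecture-specific targets such as `sm_90a`.

```python
def can_run(device_cc, sass_archs=(), ptx_archs=()):
    """Report how a device could run a binary, or None if it cannot.

    All architectures are (major, minor) tuples, e.g. (8, 6) for sm_86.
    Returns "sass", "ptx-jit", or None.
    """
    major, minor = device_cc

    # SASS is binary compatible only within the same major version,
    # and only on devices with an equal or higher minor version.
    for s_major, s_minor in sass_archs:
        if s_major == major and s_minor <= minor:
            return "sass"

    # PTX can be JIT-compiled by the driver for any device whose
    # compute capability is at least the PTX target.
    for p_arch in ptx_archs:
        if tuple(p_arch) <= (major, minor):
            return "ptx-jit"

    return None
```

Under these rules, a binary that ships only SASS for 8.0 would not load on a 9.0 device, which is why PTX for the newest target is often included as a fallback.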
Build System Syntax
Usage | Contents | Syntax | Example |
---|---|---|---|
PyTorch | SASS only | Space separated, dotted versions | 8.0 8.6 8.7 |
PyTorch | SASS+PTX | Space separated with +PTX suffix | 8.0+PTX 8.6+PTX 8.7+PTX |
PyTorch | PTX only | No dedicated syntax; +PTX emits PTX in addition to SASS | n/a |
CMake | SASS only | Semicolon separated with -real suffix | 80-real;86-real;87-real |
CMake | SASS+PTX | Semicolon separated | 80;86;87 |
CMake | PTX only | Semicolon separated with -virtual suffix | 80-virtual;86-virtual;87-virtual |
NVCC | SASS only | -gencode with code=sm_ | -gencode arch=compute_80,code=sm_80 |
NVCC | SASS+PTX | -gencode with both targets in code= | -gencode arch=compute_80,code=[sm_80,compute_80] |
NVCC | PTX only | -gencode with code=compute_ | -gencode arch=compute_80,code=compute_80 |
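To make the differences between the three notations concrete, here is a minimal sketch that renders one list of architectures in each style. The helper `arch_strings` is hypothetical and not part of any of these tools. It assumes PyTorch's dotted `8.0` / `8.0+PTX` notation, CMake's `-real` / `-virtual` suffixes, and NVCC's `-gencode` form, and it returns `None` where PyTorch has no PTX-only spelling.

```python
def arch_strings(archs, contents):
    """Render architectures (ints such as 80, 86) for each build system.

    contents is one of "sass", "sass+ptx", or "ptx".
    Returns a dict with keys "pytorch", "cmake", and "nvcc".
    """
    if contents not in ("sass", "sass+ptx", "ptx"):
        raise ValueError(f"unknown contents: {contents!r}")

    archs = list(archs)
    dotted = [f"{a // 10}.{a % 10}" for a in archs]  # 86 -> "8.6"

    if contents == "sass":
        pytorch = " ".join(dotted)
        cmake = ";".join(f"{a}-real" for a in archs)
        code = "sm_{a}"
    elif contents == "sass+ptx":
        pytorch = " ".join(f"{d}+PTX" for d in dotted)
        cmake = ";".join(str(a) for a in archs)
        code = "[sm_{a},compute_{a}]"
    else:
        pytorch = None  # assumption: no PTX-only spelling in this variable
        cmake = ";".join(f"{a}-virtual" for a in archs)
        code = "compute_{a}"

    nvcc = " ".join(
        f"-gencode arch=compute_{a},code={code.format(a=a)}" for a in archs
    )
    return {"pytorch": pytorch, "cmake": cmake, "nvcc": nvcc}
```

For example, `arch_strings([80, 86], "sass")["cmake"]` yields `80-real;86-real`.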
Environment Variables
PyTorch style:
- SASS only:
TORCH_CUDA_ARCH_LIST="8.0 8.6 8.7"
- SASS+PTX:
TORCH_CUDA_ARCH_LIST="8.0+PTX 8.6+PTX 8.7+PTX"
- PTX only:
Not expressible with TORCH_CUDA_ARCH_LIST; the +PTX suffix adds PTX alongside SASS rather than replacing it.
CMake style:
- SASS only:
CMAKE_CUDA_ARCHITECTURES="80-real;86-real;87-real"
- SASS+PTX:
CMAKE_CUDA_ARCHITECTURES="80;86;87"
- PTX only:
CMAKE_CUDA_ARCHITECTURES="80-virtual;86-virtual;87-virtual"
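NVCC flags (one -gencode per target, since -arch can be given only once):
- SASS only:
CUDA_NVCC_FLAGS="-gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87"
- SASS+PTX:
CUDA_NVCC_FLAGS="-gencode arch=compute_80,code=[sm_80,compute_80] -gencode arch=compute_86,code=[sm_86,compute_86] -gencode arch=compute_87,code=[sm_87,compute_87]"
- PTX only:
CUDA_NVCC_FLAGS="-gencode arch=compute_80,code=compute_80 -gencode arch=compute_86,code=compute_86 -gencode arch=compute_87,code=compute_87"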
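The CMake and NVCC notations describe the same choices, so one can be derived from the other. The following sketch expands a `CMAKE_CUDA_ARCHITECTURES`-style value into `-gencode` flags. The function `cmake_archs_to_gencode` is a hypothetical helper that approximates the mapping; it is not CMake's own implementation, and it handles only plain numeric entries (special values such as `all` or `native` raise an error).

```python
def cmake_archs_to_gencode(value):
    """Expand e.g. "80-real;86;87-virtual" into a list of -gencode flags."""
    flags = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty items such as a trailing semicolon

        if entry.endswith("-real"):
            num = entry[: -len("-real")]
            codes = [f"sm_{num}"]                    # SASS only
        elif entry.endswith("-virtual"):
            num = entry[: -len("-virtual")]
            codes = [f"compute_{num}"]               # PTX only
        else:
            num = entry
            codes = [f"sm_{num}", f"compute_{num}"]  # SASS and PTX

        if not num.isdigit():
            raise ValueError(f"unsupported architecture entry: {entry!r}")

        flags.append(f"-gencode arch=compute_{num},code=[{','.join(codes)}]")
    return flags
```

Joining the result with spaces gives a string suitable for passing to nvcc; when pasting it into a shell, quoting the bracketed part avoids accidental glob expansion.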