arch and gencode flags for CUDA builds on NVIDIA - lmmx/devnotes GitHub Wiki

When trying to build the software Gpufit on Linux with CUDA 11, I received an error that compute_30 was an "unsupported architecture".

nvcc fatal   : Unsupported gpu architecture 'compute_30'
CMake Error at Gpufit_generated_cuda_gaussjordan.cu.o.RELEASE.cmake:222 (message):
  Error generating
  /home/louis/dev/gpufit_dev/gpufit-build/Gpufit/CMakeFiles/Gpufit.dir//./Gpufit_generated_cuda_gaussjordan.cu.o


make[2]: *** [Gpufit/CMakeFiles/Gpufit.dir/build.make:93:
Gpufit/CMakeFiles/Gpufit.dir/Gpufit_generated_cuda_gaussjordan.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1114: Gpufit/CMakeFiles/Gpufit.dir/all] Error 2
make: *** [Makefile:95: all] Error 2

This had not been reported in the bug tracker. Inspecting the cmake output showed that the build was taking a "try everything" approach to architectures:

-- CUDA_ARCHITECTURES=3.0;3.5;5.0;5.2;3.2;3.7;5.3;6.0;6.1;6.2;7.0+PTX
-- CUDA_NVCC_FLAGS=-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,...
  • That 2nd line went on all the way to compute_70 and sm_70.

Essentially, each entry in CUDA_ARCHITECTURES (e.g. "3.0") is translated into a matching pair in CUDA_NVCC_FLAGS (here compute_30 and sm_30): simple enough.
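To make that translation concrete, here is a minimal Python sketch of the mapping. It is not Gpufit's or CMake's actual code: the function name `gencode_flags` is made up for illustration, and the handling of the `+PTX` suffix (emitting an extra `code=compute_XY` entry) is an assumption about how the CMake helper behaves, since the log above truncates before that point.

```python
def gencode_flags(cuda_architectures):
    """Turn a list such as "3.0;3.5;7.0+PTX" into nvcc -gencode arguments.

    Hypothetical helper: it mirrors the pattern visible in the cmake output,
    not the real CMake implementation.
    """
    flags = []
    for entry in cuda_architectures.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Assumption: "+PTX" means "also embed PTX for this architecture".
        want_ptx = entry.endswith("+PTX")
        version = entry[: -len("+PTX")] if want_ptx else entry
        cc = version.replace(".", "")  # "3.0" -> "30"
        flags += ["-gencode", "arch=compute_{0},code=sm_{0}".format(cc)]
        if want_ptx:
            flags += ["-gencode", "arch=compute_{0},code=compute_{0}".format(cc)]
    return flags


if __name__ == "__main__":
    # Joined with ";" to match the way cmake prints CUDA_NVCC_FLAGS.
    print(";".join(gencode_flags("3.0;3.5")))
    # -gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35
```

So any architecture left in the list ends up on the nvcc command line, whether or not the installed toolkit still knows about it.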

The problem was that CUDA 11 no longer supports compute_30, so nvcc rejects it outright rather than just warning.

  • This is documented here
Fermi†  Kepler†  Maxwell‡  Pascal  Volta  Turing  Ampere  Lovelace*  Hopper**
sm_20   sm_30    sm_50     sm_60   sm_70  sm_75   sm_80   sm_90?     sm_100c?
        sm_35    sm_52     sm_61   sm_72          sm_86
        sm_37    sm_53     sm_62

† Fermi and Kepler are deprecated from CUDA 9 and 11 onwards

‡ Maxwell is deprecated from CUDA 12 onwards

* Lovelace is the microarchitecture replacing Ampere (AD102)

** Hopper is NVIDIA’s rumored “tesla-next” series, with a 5nm process.

  • At the time of writing, the latest series is RTX 30, which has the Ampere architecture and requires CUDA 11 or later.
  • As noted, Fermi and Kepler are deprecated from CUDA 9 and CUDA 11 onwards respectively. For Kepler under CUDA 11 this is not uniform: sm_30 is no longer accepted by nvcc at all (hence the error above), while sm_35 and sm_37 still compile but are marked deprecated.

This explains why compute_30 was throwing an error and suggests a fix: test whether the CUDA version is 11 or later and, if so, skip the Kepler architectures (3.7 and below).
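The real change belongs in the project's CMake files, but the logic of the fix can be sketched in Python. The function name `supported_architectures` and its `cutoff` parameter are hypothetical, chosen for illustration; the version threshold of 11 and the 3.7 cutoff are the ones described above.

```python
def supported_architectures(cuda_architectures, cuda_major, cutoff=(3, 7)):
    """Drop architectures at or below `cutoff` when building with CUDA >= 11.

    Hypothetical helper illustrating the fix; entries keep any "+PTX" suffix.
    """
    kept = []
    for entry in cuda_architectures.split(";"):
        version = entry.split("+")[0]  # "7.0+PTX" -> "7.0"
        major, minor = (int(part) for part in version.split("."))
        if cuda_major >= 11 and (major, minor) <= cutoff:
            continue  # Kepler or older: skip on CUDA 11 and later
        kept.append(entry)
    return ";".join(kept)


if __name__ == "__main__":
    archs = "3.0;3.5;5.0;5.2;3.2;3.7;5.3;6.0;6.1;6.2;7.0+PTX"
    print(supported_architectures(archs, cuda_major=11))
    # 5.0;5.2;5.3;6.0;6.1;6.2;7.0+PTX
```

Comparing (major, minor) tuples rather than the raw strings avoids mis-ordering versions once a two-digit major such as 10.0 appears in the list.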