Install NVIDIA drivers and CUDA

If possible, I recommend installing ROS before following the steps below.

Pre-setup

Uninstall any previously installed Nvidia drivers & Nouveau driver

sudo apt-get remove 'nvidia*' && sudo apt autoremove
sudo apt-get install dkms build-essential linux-headers-generic

Open configuration file and blacklist Nouveau

sudo gedit /etc/modprobe.d/blacklist.conf

Add these lines at the end of the file:

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
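To confirm that the file was saved with all five entries, a small check like the one below can help. This is only a sketch: the function name `missing_blacklist_lines` is made up for illustration, and it assumes each entry sits on its own line exactly as written above.

```shell
# Print every required Nouveau blacklist entry that is absent from the given file.
# Each entry must match a whole line (-x) as a fixed string (-F).
missing_blacklist_lines() {
    conf_file="$1"
    for entry in \
        "blacklist nouveau" \
        "blacklist lbm-nouveau" \
        "options nouveau modeset=0" \
        "alias nouveau off" \
        "alias lbm-nouveau off"
    do
        grep -qxF -- "$entry" "$conf_file" 2>/dev/null || printf '%s\n' "$entry"
    done
}

# Example: list what is still missing from the real configuration file.
# missing_blacklist_lines /etc/modprobe.d/blacklist.conf
```

If the function prints nothing, all entries are present.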

Disable Nouveau kernel mode setting and rebuild the initramfs.

echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u

Reboot system

sudo reboot
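After the reboot, it may be worth confirming that Nouveau is really gone before installing the NVIDIA driver. The sketch below reads `lsmod` output; the helper name `nouveau_loaded` is an assumption introduced here, not part of any tool.

```shell
# Succeed (exit status 0) if a module named exactly "nouveau" appears in
# lsmod-style output read from standard input.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit found ? 0 : 1 }'
}

# Run the check only where lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_loaded; then
        echo "nouveau is still loaded; re-check the blacklist and rebuild the initramfs"
    else
        echo "nouveau is not loaded"
    fi
fi
```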

Install NVIDIA graphics driver

Tested with my laptop (Samsung NT900-X5N, Geforce 940MX)
Stop lightdm (GUI) and log in to a console (Ctrl+Alt+F1)

sudo service lightdm stop

Install the appropriate driver. To check the recommended drivers, run:

sudo ubuntu-drivers devices

For the SRI desktop, nvidia-455 is compatible.
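The recommended package can also be pulled out of the `ubuntu-drivers devices` output automatically. This is a rough sketch that assumes the usual `driver : <package> - ... recommended` line layout; the helper name `recommended_driver` and the sample text are illustrative only.

```shell
# Print the package name from every "driver : <package> - ... recommended" line
# of `ubuntu-drivers devices` output read from standard input.
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3 }'
}

# Illustrative sample only; the real output depends on the machine.
sample='vendor   : NVIDIA Corporation
driver   : nvidia-driver-460 - distro non-free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin'

printf '%s\n' "$sample" | recommended_driver
```

On a real system the same function would be used as `sudo ubuntu-drivers devices | recommended_driver`.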

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-driver-460  # for my laptop

Reboot system and check the installed driver after rebooting.

sudo reboot
nvidia-smi

CUDA install

Select the CUDA version that fits your graphics driver and download it.
For both the SRI desktop and the laptop, CUDA 11.0 is compatible.

https://developer.nvidia.com/cuda-11.0-download-archive
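Whether a driver "fits" a CUDA release comes down to a minimum driver version. The sketch below compares the installed driver against a threshold. The value `450` is an assumption (CUDA 11.0 needs a 450-series or newer driver); the exact minimum should be taken from the CUDA release notes. `version_ge` is a helper name introduced here.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), a GNU coreutils extension.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed threshold for CUDA 11.0; verify it against the CUDA release notes.
required="450"

# Query the driver only where nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if version_ge "$installed" "$required"; then
        echo "driver $installed meets the assumed minimum $required"
    else
        echo "driver $installed is older than the assumed minimum $required"
    fi
fi
```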

Stop lightdm (GUI) and log in to a console (Ctrl+Alt+F1)

sudo service lightdm stop

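Go to the download path and run the installer. The commands below use the CUDA 9.0 runfile name as an example; substitute the file you downloaded.
Follow the prompts of the CUDA installer.
Select 'No' when asked to install the graphics driver.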

chmod +x cuda_9.0.176_384.81_linux.run
sudo ./cuda_9.0.176_384.81_linux.run

Add the following lines to ~/.bashrc (adjust the version in the paths to match your install)

export PATH=$PATH:/usr/local/cuda-9.0
export PATH=$PATH:/usr/local/cuda-9.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.0/lib64
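If the setup gets re-run, the same export lines can pile up in .bashrc. One possible way to avoid that is to append a line only when it is not already there. `append_once` is a hypothetical helper name, not a standard command.

```shell
# Append a line to a file only if that exact line is not already present,
# so re-running the setup does not duplicate entries.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (single quotes keep $PATH unexpanded in the written line):
# append_once 'export PATH=$PATH:/usr/local/cuda-9.0/bin' "$HOME/.bashrc"
# append_once 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.0/lib64' "$HOME/.bashrc"
```

Open a new terminal or run `source ~/.bashrc` afterwards so the variables take effect.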

Verify CUDA

cd ~/NVIDIA_CUDA-9.0_Samples
make
cd 0_Simple/asyncAPI/
./asyncAPI

The installation succeeded if you see output like the following.

GPU Device 0: "GeForce 940MX" with compute capability 5.0

CUDA device [GeForce 940MX]
time spent executing by the GPU: 93.61
time spent by CPU in CUDA calls: 0.03
CPU executed 412872 iterations while waiting for GPU to finish

or just type `nvcc --version`
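To use the toolkit version in a script (for example, to pick a matching PyTorch build), the release number can be extracted from the `nvcc --version` output. This sketch assumes the output contains a line of the form `Cuda compilation tools, release 9.0, V9.0.176`; `nvcc_release` is a name made up here.

```shell
# Extract the toolkit release (e.g. "9.0") from `nvcc --version` output on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Report the release only where nvcc is on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit release: $(nvcc --version | nvcc_release)"
fi
```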
Change configuration of "Software & Updates"

Install PyTorch

Find the PyTorch version that matches your CUDA version.

https://pytorch.org/get-started/previous-versions/

For CUDA 9.0,

pip install torch==1.1.0
pip install torchvision==0.3.0
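After installing, a quick check that PyTorch can actually see the GPU may save debugging later. The sketch assumes the `python` on the PATH is the interpreter that `pip` installed into; `torch_cuda_report` is an illustrative name.

```shell
# Print the PyTorch version and whether CUDA is usable, or a short notice
# if torch cannot be imported by the `python` found on the PATH.
torch_cuda_report() {
    python -c 'import torch; print(torch.__version__, torch.cuda.is_available())' 2>/dev/null \
        || echo "PyTorch is not importable in this Python environment"
}

torch_cuda_report
```

A second value of `True` means PyTorch found a working CUDA device; `False` usually points to a mismatch between the PyTorch build and the installed driver or CUDA version.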

Downgrade the version of pillow.

pip uninstall pillow
pip install pillow==6.1

cuDNN install (find the version that matches the installed CUDA)

cuDNN 8.0.4 is compatible with the SRI desktop.

https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html

Bug report

If you are using an RTX graphics card, you may encounter "CUDNN_STATUS_INTERNAL_ERROR" or a cuDNN initialization error. Adding the following line to ~/.bashrc should resolve the problem (the variable is read by TensorFlow).

export TF_FORCE_GPU_ALLOW_GROWTH=true

Reference

http://www.kwangsiklee.com/2017/07/%EC%9A%B0%EB%B6%84%ED%88%AC-16-04%EC%97%90%EC%84%9C-cuda-%EC%84%B1%EA%B3%B5%EC%A0%81%EC%9C%BC%EB%A1%9C-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0/