Install NVIDIA drivers and CUDA - gkgkgk1215/else GitHub Wiki

  • If possible, I recommend installing ROS before following the steps below.

Pre-setup

  • Uninstall any previously installed NVIDIA drivers, then install the build tools and kernel headers (Nouveau is disabled in the next steps)
sudo apt-get remove 'nvidia*' && sudo apt autoremove
sudo apt-get install dkms build-essential linux-headers-generic
  • Open configuration file and blacklist Nouveau
sudo gedit /etc/modprobe.d/blacklist.conf
  • Add these lines at the end of the file:
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
  • Disable kernel mode setting for Nouveau and rebuild the initramfs.
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
  • Reboot system
sudo reboot
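
After the reboot, it can be worth confirming that Nouveau is really gone before installing the NVIDIA driver. A minimal sketch, assuming a helper named nouveau_loaded (hypothetical, not part of any package) that scans /proc/modules, the file lsmod reads:

```shell
# Hypothetical helper: succeeds (exit 0) if a module named "nouveau" is listed.
# The optional argument is a module list in /proc/modules format.
nouveau_loaded() {
    modules_file="${1:-/proc/modules}"
    grep -q '^nouveau ' "$modules_file" 2>/dev/null
}

if nouveau_loaded; then
    echo "nouveau is still loaded: recheck the blacklist and rerun update-initramfs"
else
    echo "nouveau is not loaded"
fi
```

If the first message appears, the blacklist entries or the initramfs rebuild most likely did not take effect.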

Install NVIDIA graphics driver

  • Tested on my laptop (Samsung NT900-X5N, GeForce 940MX)
  • Stop lightdm (GUI) and log in to a console (Ctrl+Alt+F1)
sudo service lightdm stop
  • Check which drivers are recommended for your GPU:
sudo ubuntu-drivers devices
  • Install the appropriate driver. For the SRI desktop, nvidia-455 is compatible; the command below installs nvidia-driver-460, which I used on my laptop.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-driver-460
  • Reboot, then check the installed driver.
sudo reboot
nvidia-smi
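
The two checks in this section (which driver is recommended, which one ended up installed) can be scripted. This is only a sketch: recommended_driver and driver_major are hypothetical helper names, and the parsing assumes the usual "driver : <package> - ... recommended" lines printed by ubuntu-drivers devices.

```shell
# Hypothetical helper: print the package that `ubuntu-drivers devices`
# marks as recommended. Reads that command's output on stdin.
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3 }'
}

# Hypothetical helper: print the major part of a version such as 460.32.03.
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Only query the tools that are actually present on this machine.
if command -v ubuntu-drivers >/dev/null 2>&1; then
    echo "recommended: $(ubuntu-drivers devices | recommended_driver)"
fi
if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    echo "installed driver branch: $(driver_major "$installed")"
fi
```

Comparing the two lines gives a quick hint whether the installed branch is the one Ubuntu recommends for the GPU.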

CUDA install

  • Select the CUDA version that matches your graphics driver and download it.
  • For both the SRI desktop and my laptop, CUDA 11.0 is compatible.

https://developer.nvidia.com/cuda-11.0-download-archive

  • Stop lightdm (GUI) and log in to a console (Ctrl+Alt+F1)
sudo service lightdm stop
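  • Go to the download directory and run the installer. The commands below use the CUDA 9.0 runfile as an example; substitute the name of the file you downloaded.
chmod +x cuda_9.0.176_384.81_linux.run
sudo ./cuda_9.0.176_384.81_linux.run
  • Follow the prompts of the CUDA installer.
  • Select 'No' when asked to install the graphics driver.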
  • Add the following lines to ~/.bashrc (adjust the version number to match your installation)
export PATH=$PATH:/usr/local/cuda-9.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.0/lib64
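
Because ~/.bashrc is sourced by every new shell, plain appends can pile up duplicate entries in nested shells. A possible variant, assuming a hypothetical helper append_once and the same /usr/local/cuda-9.0 prefix as above:

```shell
# Hypothetical helper: append a directory to a colon-separated list
# only if it is not already present, and print the resulting list.
append_once() {
    list="$1"
    dir="$2"
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *)          printf '%s\n' "${list:+$list:}$dir" ;;
    esac
}

# Assumed install prefix; change it to match the installed CUDA version.
CUDA_HOME=/usr/local/cuda-9.0
PATH="$(append_once "$PATH" "$CUDA_HOME/bin")"
LD_LIBRARY_PATH="$(append_once "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64")"
export PATH LD_LIBRARY_PATH
```

The plain export lines work just as well for a one-off setup; this form only keeps the variables tidy.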

Verify CUDA

cd ~/NVIDIA_CUDA-9.0_Samples
make
cd 0_Simple/asyncAPI/
./asyncAPI
  • The installation succeeded if you see output like the following.
GPU Device 0: "GeForce 940MX" with compute capability 5.0

CUDA device [GeForce 940MX]
time spent executing by the GPU: 93.61
time spent by CPU in CUDA calls: 0.03
CPU executed 412872 iterations while waiting for GPU to finish
  • Alternatively, just run `nvcc --version`.
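
To pull just the release number out of the nvcc output, for example to compare it against the PyTorch or cuDNN compatibility tables, something like the following could be used. cuda_release is a hypothetical helper, and the pattern assumes the "release X.Y" wording that nvcc prints in its version banner.

```shell
# Hypothetical helper: extract "X.Y" from the "release X.Y" text
# in the output of `nvcc --version`, read on stdin.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit release: $(nvcc --version | cuda_release)"
else
    echo "nvcc not found on PATH"
fi
```

If nvcc is reported as missing even though the installer finished, the PATH line in ~/.bashrc is the first thing to check.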

  • Change configuration of "Software & Updates"

Install PyTorch

  • Find the PyTorch version that matches your CUDA version.

https://pytorch.org/get-started/previous-versions/

For CUDA 9.0:

pip install torch==1.1.0
pip install torchvision==0.3.0
  • Downgrade Pillow; torchvision 0.3.0 fails to import with Pillow 7.0 and later.
pip uninstall pillow
pip install pillow==6.1
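
Once the packages are installed, a quick way to see whether PyTorch can reach the GPU is to call torch.cuda.is_available() from the shell. A sketch, assuming a hypothetical wrapper torch_cuda_status and python3 as the default interpreter (pass a different interpreter name if pip installed into another one):

```shell
# Hypothetical wrapper: print the torch version and whether CUDA is usable.
# The optional argument is the Python interpreter to run.
torch_cuda_status() {
    py="${1:-python3}"
    if ! command -v "$py" >/dev/null 2>&1; then
        echo "python-missing"
        return 1
    fi
    "$py" -c 'import torch; print(torch.__version__, torch.cuda.is_available())' 2>/dev/null \
        || { echo "torch-missing"; return 1; }
}

# Prints e.g. the version followed by True or False; never aborts the shell.
torch_cuda_status || true
```

A result of False usually points to a mismatch between the installed PyTorch build and the CUDA or driver version.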

cuDNN install (find the version that matches the installed CUDA)

  • cuDNN 8.0.4 is compatible with the setup on the SRI desktop

https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html

Bug report

  • If you are using an RTX graphics card, you may encounter "CUDNN_STATUS_INTERNAL_ERROR" or a cuDNN initialization error. Adding the following line to ~/.bashrc should resolve the problem (the variable is read by TensorFlow).
export TF_FORCE_GPU_ALLOW_GROWTH=true
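
If the line is added from a script rather than by hand, it helps to avoid writing it twice. A small sketch with a hypothetical helper append_line_once; the call that touches ~/.bashrc is left commented out so that running the snippet changes nothing by itself:

```shell
# Hypothetical helper: append a line to a file unless an identical
# line is already there (whole-line, fixed-string match).
append_line_once() {
    file="$1"
    line="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example call, assuming ~/.bashrc is the file meant above:
# append_line_once "$HOME/.bashrc" 'export TF_FORCE_GPU_ALLOW_GROWTH=true'
```

Open a new terminal (or source ~/.bashrc) afterwards so the variable is actually set.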

Reference

http://www.kwangsiklee.com/2017/07/%EC%9A%B0%EB%B6%84%ED%88%AC-16-04%EC%97%90%EC%84%9C-cuda-%EC%84%B1%EA%B3%B5%EC%A0%81%EC%9C%BC%EB%A1%9C-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0/