Install NVIDIA drivers and CUDA - gkgkgk1215/else GitHub Wiki
- If possible, I recommend installing ROS before following the steps below.
Pre-setup
- Uninstall any previously installed NVIDIA drivers and the Nouveau driver
sudo apt-get remove 'nvidia*' && sudo apt autoremove
sudo apt-get install dkms build-essential linux-headers-generic
- Open configuration file and blacklist Nouveau
sudo gedit /etc/modprobe.d/blacklist.conf
- Add these lines at the end of the file:
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
- Disable kernel mode setting for Nouveau and rebuild the initramfs.
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
- Reboot system
sudo reboot
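After the reboot, it can help to confirm that the blacklist took effect before installing the NVIDIA driver. The following is a minimal sketch, assuming a Linux system that exposes `/proc/modules`; the function name `nouveau_loaded` is introduced here for illustration and is not part of any tool.

```shell
# Report whether the nouveau module appears in a module list.
# The list defaults to /proc/modules; the optional argument exists only
# so the function can be tried on a sample file.
nouveau_loaded() {
    # Each line of /proc/modules starts with the module name followed by a space.
    grep -q '^nouveau ' "${1:-/proc/modules}" 2>/dev/null
}

if nouveau_loaded; then
    echo "nouveau is still loaded: re-check the blacklist and rerun update-initramfs"
else
    echo "nouveau is not loaded"
fi
```

`lsmod | grep nouveau` gives the same information interactively; an empty result means the module is not loaded.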
Install NVIDIA graphics driver
- Tested on my laptop (Samsung NT900-X5N, GeForce 940MX)
- Stop lightdm (GUI) and log in to a console (Ctrl+Alt+F1)
sudo service lightdm stop
- Install the appropriate driver. To check the recommended drivers, run:
sudo ubuntu-drivers devices
- For the SRI desktop, nvidia-455 is compatible.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-driver-460  # for my laptop
- Reboot system and check the installed driver after rebooting.
sudo reboot
nvidia-smi
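For a scripted check of the installed driver, `nvidia-smi` can print selected fields as CSV. The sketch below assumes the `--query-gpu` and `--format` options of `nvidia-smi`; the helper names `gpu_summary` and `driver_major` are made up for this example.

```shell
# Print "name, driver version" for each GPU, or a hint if nvidia-smi is missing.
gpu_summary() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver is probably not installed"
        return 1
    fi
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

# Read CSV lines from the query above on stdin and print the major driver version.
driver_major() {
    # The driver version is the last ", "-separated field; keep the part before the first dot.
    awk -F', ' '{ split($NF, v, "."); print v[1] }'
}

if summary="$(gpu_summary)"; then
    echo "$summary"
    echo "$summary" | driver_major
else
    # Either the hint from gpu_summary or the error text printed by nvidia-smi.
    echo "$summary"
fi
```

The major version is the number to compare against the driver requirement listed for the CUDA release you plan to install.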
CUDA install
- Select the CUDA version that fits your graphics driver and download it.
- For both the SRI desktop and laptop, CUDA 11.0 is compatible.
https://developer.nvidia.com/cuda-11.0-download-archive
- Stop lightdm (GUI) and log in to a console (Ctrl+Alt+F1)
sudo service lightdm stop
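- Go to the download directory and run the installer. The commands below use the CUDA 9.0 runfile as an example; substitute the file you downloaded.
- Follow the guidelines of the CUDA installer.
- Select 'No' when asked to install the graphics driver.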
chmod +x cuda_9.0.176_384.81_linux.run
sudo ./cuda_9.0.176_384.81_linux.run
- Add the following to .bashrc (match the version number to your installation):
export PATH=$PATH:/usr/local/cuda-9.0
export PATH=$PATH:/usr/local/cuda-9.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.0/lib64
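Plain exports like the ones above add another copy of each directory whenever .bashrc is sourced again, and an unset `LD_LIBRARY_PATH` ends up with a leading colon, which the dynamic linker treats as the current directory. A minimal sketch of a duplicate-free variant follows; `append_path` and `CUDA_HOME` are names introduced here for illustration, and the version in the path is an assumption to adjust.

```shell
# Append a directory to a colon-separated list only if it is not already present.
# Usage: append_path "$PATH" /some/dir
append_path() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;      # already present: return the list unchanged
        "::")     printf '%s\n' "$2" ;;      # empty list: the directory alone, no leading colon
        *)        printf '%s\n' "$1:$2" ;;   # otherwise append at the end
    esac
}

CUDA_HOME=/usr/local/cuda-9.0   # hypothetical variable; match your installed version
PATH="$(append_path "$PATH" "$CUDA_HOME/bin")"
LD_LIBRARY_PATH="$(append_path "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64")"
export PATH LD_LIBRARY_PATH
```

Sourcing this fragment repeatedly leaves both variables unchanged after the first time.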
Verify CUDA
cd ~/NVIDIA_CUDA-9.0_Samples
make
cd 0_Simple/asyncAPI/
./asyncAPI
- The installation succeeded if you see messages like the following.
GPU Device 0: "GeForce 940MX" with compute capability 5.0
CUDA device [GeForce 940MX]
time spent executing by the GPU: 93.61
time spent by CPU in CUDA calls: 0.03
CPU executed 412872 iterations while waiting for GPU to finish
- Or just type `nvcc --version`
- Change configuration of "Software & Updates"
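The `nvcc --version` check can also be scripted, for example to pick a matching PyTorch build later. This is a rough sketch that assumes the output contains a line of the form "release X.Y"; `cuda_release` is a helper name made up for this example.

```shell
# Read `nvcc --version` output on stdin and print the release number.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc reports CUDA $(nvcc --version | cuda_release)"
else
    echo "nvcc not found: check that the CUDA bin directory is on PATH"
fi
```

If `nvcc` is not found even though the installer finished, the PATH entries in .bashrc are the first thing to re-check.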
Install PyTorch
- Find the version that matches your CUDA version.
https://pytorch.org/get-started/previous-versions/
For CUDA 9.0,
pip install torch==1.1.0
pip install torchvision==0.3.0
- Downgrade the version of pillow.
pip uninstall pillow
pip install pillow==6.1
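Once the packages are installed, a quick way to see whether PyTorch can reach the GPU is to query `torch.cuda.is_available()`. The sketch below assumes the interpreter is called `python3` and is the same one `pip` installed into; `check_torch` is a name introduced here.

```shell
# Print the installed torch version and whether it can see a CUDA device.
check_torch() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "torch check skipped: python3 not found"
        return 0
    fi
    python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("torch check skipped: torch is not installed for this interpreter")
else:
    # torch.version.cuda is the CUDA version the wheel was built against.
    print("torch", torch.__version__, "built for CUDA", torch.version.cuda,
          "- CUDA available:", torch.cuda.is_available())
EOF
}

check_torch
```

If this reports that CUDA is not available, compare the CUDA version the wheel was built for with the one installed under /usr/local.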
cuDNN install (find the version that matches the installed CUDA)
- cuDNN 8.0.4 is compatible with the SRI desktop
https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
Bug report
- If you are using an RTX graphics card, you may encounter "CUDNN_STATUS_INTERNAL_ERROR" or a cuDNN initialization error. Adding the following line to .bashrc may resolve the problem.
export TF_FORCE_GPU_ALLOW_GROWTH=true
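To add a line like this to a shell startup file without creating duplicates on repeated runs, something along the following lines may be useful. It is a sketch only: `ensure_line` is a hypothetical helper, and the demonstration writes to a scratch file rather than to your real .bashrc.

```shell
# Append a line to a file only if that exact line is not already there.
# Usage: ensure_line FILE LINE
ensure_line() {
    touch "$1"
    # -x matches whole lines, -F treats the text as a fixed string, not a regex.
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Demonstrated on a scratch file; point it at your .bashrc once you are happy with it.
rc_file="$(mktemp)"
ensure_line "$rc_file" 'export TF_FORCE_GPU_ALLOW_GROWTH=true'
ensure_line "$rc_file" 'export TF_FORCE_GPU_ALLOW_GROWTH=true'   # second call is a no-op
cat "$rc_file"
rm -f "$rc_file"
```

The variable is read by TensorFlow, so it only has an effect for programs that use TensorFlow on the GPU.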