Windows 10 Insider WSL2 CUDA 11 - ballgle/saltwiki GitHub Wiki

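Overview

  • Installing Ubuntu with winget install Canonical.Ubuntu works somewhat better than installing it from the Microsoft Store; run winget list first to check what is already installed.

  • Follow the official NVIDIA and Microsoft documentation for installation and testing.

  • When installing the NVIDIA driver, note that every graphics card needs its own matching driver; check NVIDIA's site.

  • The CUDA toolkit does not have to be the latest release; look up an older version in the download archive. For now the highest standard combination appears to be 11.4.2-470_470.57.02.

  • Read the 11.4.2 documentation carefully before starting.

  • Follow NVIDIA's official installation order.

  • Windows Insider builds ship with CUDA preinstalled, currently at version 11.10; use wsl -l -o to list the WSL distributions that can be installed online.

  • As for whether to install Docker inside Ubuntu: Docker for Windows now supports WSL2.

  • On a system where CUDA was installed before, or where other changes were made, first run dpkg -l | grep nvidia to see what is already installed.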

Clean slate

https://gsy00517.github.io/ubuntu20200126083448/

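First check which packages are installed. A status of ii in the first column is normal; anything else, such as rc (removed, configuration files left behind), indicates a problem: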

dpkg -l | grep nvidia

dpkg -l | grep cuda

For example, purge them completely with the following commands:

sudo dpkg -P cuda-repo-wsl-ubuntu-11-5-local

sudo dpkg --purge --force-all nvidia-kernel-common-495
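The status check above can be automated. The following is a minimal sketch, assuming the usual dpkg -l layout where the first field is the status code and the second is the package name; list_broken_packages is a hypothetical helper name, and the sample lines are illustrative only.

```shell
# Print the names of packages whose dpkg status is not "ii".
# Reads `dpkg -l`-style lines on stdin. Header lines are skipped because
# their first field is not a two-letter status code (optionally followed by R).
list_broken_packages() {
    awk '$1 ~ /^[a-z][a-zA-Z]R?$/ && $1 != "ii" { print $2 }'
}

# Demo on sample lines (hypothetical data, not real output):
list_broken_packages <<'EOF'
ii  cuda-11-4                  11.4.2-1            amd64  CUDA 11.4 meta-package
rc  nvidia-kernel-common-495   495.29.05-0ubuntu1  amd64  Shared files used with the kernel module
EOF
```

On a real system the input would come from something like dpkg -l | grep -E 'nvidia|cuda' | list_broken_packages, and each printed name is a candidate for sudo dpkg -P.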

Key commands

Full version

wget https://mirrors.aliyun.com/nvidia-cuda/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://mirrors.aliyun.com/nvidia-cuda/ubuntu2004/x86_64/7fa2af80.pub
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F60F4B3D7FA2AF80

sudo add-apt-repository "deb https://mirrors.aliyun.com/nvidia-cuda/ubuntu2004/x86_64/ /"

sudo apt-get update
sudo apt-get -y install cuda-11-4  # -y: answer yes automatically to interactive prompts
sudo apt-get install -y cuda-toolkit-11-4

After installing cuda-11-4 and before installing the toolkit, the NVIDIA packages look like this:

ii  libnvidia-cfg1-495:amd64             495.29.05-0ubuntu1                    amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-495                 495.29.05-0ubuntu1                    all          Shared files used by the NVIDIA libraries
ii  libnvidia-compute-495:amd64          495.29.05-0ubuntu1                    amd64        NVIDIA libcompute package
ii  libnvidia-decode-495:amd64           495.29.05-0ubuntu1                    amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-495:amd64           495.29.05-0ubuntu1                    amd64        NVENC Video Encoding runtime library
ii  libnvidia-extra-495:amd64            495.29.05-0ubuntu1                    amd64        Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-495:amd64             495.29.05-0ubuntu1                    amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-495:amd64               495.29.05-0ubuntu1                    amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  nvidia-compute-utils-495             495.29.05-0ubuntu1                    amd64        NVIDIA compute utilities
ii  nvidia-dkms-495                      495.29.05-0ubuntu1                    amd64        NVIDIA DKMS package
ii  nvidia-driver-495                    495.29.05-0ubuntu1                    amd64        NVIDIA driver metapackage
ii  nvidia-kernel-common-495             495.29.05-0ubuntu1                    amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-495             495.29.05-0ubuntu1                    amd64        NVIDIA kernel source package
ii  nvidia-modprobe                      495.29.05-0ubuntu1                    amd64        Load the NVIDIA kernel driver and create device files
ii  nvidia-prime                         0.8.16~0.20.04.1                      all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                      495.29.05-0ubuntu1                    amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-495                     495.29.05-0ubuntu1                    amd64        NVIDIA driver support binaries
ii  screen-resolution-extra              0.18build1                            all          Extension for the nvidia-settings control panel
ii  xserver-xorg-video-nvidia-495        495.29.05-0ubuntu1                    amd64        NVIDIA binary Xorg driver

And the CUDA packages look like this:

ii  cuda-11-4                            11.4.2-1                              amd64        CUDA 11.4 meta-package
ii  cuda-cccl-11-4                       11.4.122-1                            amd64        CUDA CCCL
ii  cuda-command-line-tools-11-4         11.4.2-1                              amd64        CUDA command-line tools
ii  cuda-compiler-11-4                   11.4.2-1                              amd64        CUDA compiler
ii  cuda-cudart-11-4                     11.4.108-1                            amd64        CUDA Runtime native Libraries
ii  cuda-cudart-dev-11-4                 11.4.108-1                            amd64        CUDA Runtime native dev links, headers
ii  cuda-cuobjdump-11-4                  11.4.120-1                            amd64        CUDA cuobjdump
ii  cuda-cupti-11-4                      11.4.120-1                            amd64        CUDA profiling tools runtime libs.
ii  cuda-cupti-dev-11-4                  11.4.120-1                            amd64        CUDA profiling tools interface.
ii  cuda-cuxxfilt-11-4                   11.4.120-1                            amd64        CUDA cuxxfilt
ii  cuda-demo-suite-11-4                 11.4.100-1                            amd64        Demo suite for CUDA
ii  cuda-documentation-11-4              11.4.126-1                            amd64        CUDA documentation
ii  cuda-driver-dev-11-4                 11.4.108-1                            amd64        CUDA Driver native dev stub library
ii  cuda-drivers                         495.29.05-1                           amd64        CUDA Driver meta-package, branch-agnostic
ii  cuda-drivers-495                     495.29.05-1                           amd64        CUDA Driver meta-package, branch-specific
ii  cuda-gdb-11-4                        11.4.120-1                            amd64        CUDA-GDB
ii  cuda-libraries-11-4                  11.4.2-1                              amd64        CUDA Libraries 11.4 meta-package
ii  cuda-libraries-dev-11-4              11.4.2-1                              amd64        CUDA Libraries 11.4 development meta-package
ii  cuda-memcheck-11-4                   11.4.120-1                            amd64        CUDA-MEMCHECK
ii  cuda-nsight-11-4                     11.4.120-1                            amd64        CUDA nsight
ii  cuda-nsight-compute-11-4             11.4.2-1                              amd64        NVIDIA Nsight Compute
ii  cuda-nsight-systems-11-4             11.4.2-1                              amd64        NVIDIA Nsight Systems
ii  cuda-nvcc-11-4                       11.4.120-1                            amd64        CUDA nvcc
ii  cuda-nvdisasm-11-4                   11.4.120-1                            amd64        CUDA disassembler
ii  cuda-nvml-dev-11-4                   11.4.120-1                            amd64        NVML native dev links, headers
ii  cuda-nvprof-11-4                     11.4.120-1                            amd64        CUDA Profiler tools
ii  cuda-nvprune-11-4                    11.4.120-1                            amd64        CUDA nvprune
ii  cuda-nvrtc-11-4                      11.4.120-1                            amd64        NVRTC native runtime libraries
ii  cuda-nvrtc-dev-11-4                  11.4.120-1                            amd64        NVRTC native dev links, headers
ii  cuda-nvtx-11-4                       11.4.120-1                            amd64        NVIDIA Tools Extension
ii  cuda-nvvp-11-4                       11.4.120-1                            amd64        CUDA Profiler tools
ii  cuda-runtime-11-4                    11.4.2-1                              amd64        CUDA Runtime 11.4 meta-package
ii  cuda-samples-11-4                    11.4.120-1                            amd64        CUDA example applications
ii  cuda-sanitizer-11-4                  11.4.120-1                            amd64        CUDA Sanitizer
ii  cuda-toolkit-11-4                    11.4.2-1                              amd64        CUDA Toolkit 11.4 meta-package
ii  cuda-toolkit-11-4-config-common      11.4.108-1                            all          Common config package for CUDA Toolkit 11.4.
ii  cuda-toolkit-11-config-common        11.5.50-1                             all          Common config package for CUDA Toolkit 11.
ii  cuda-toolkit-config-common           11.5.50-1                             all          Common config package for CUDA Toolkit.
ii  cuda-tools-11-4                      11.4.2-1                              amd64        CUDA Tools meta-package
ii  cuda-visual-tools-11-4               11.4.2-1                              amd64        CUDA visual tools

docker in windows

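The planned setup: on the Windows 11 Insider Preview build we use Docker and CUDA; on the regular Windows 10 build we use Docker only. Miniconda is installed on every machine.

Of the two computers, one is the development server running the Windows 10 Insider build. The other no longer needs dual boot either, which would be a waste.

Machine configuration

  • Windows 11 build 22471.1000, GeForce GTX 1060 5 GB, 32 GB RAM, 1 TB disk

  • GPU-oriented machine: GeForce GTX 1070 Ti 8 GB, 16 GB RAM, Windows 21H1 (Chinese edition), two SSDs, 222 GB

The order of steps is:

  • Check the Windows version (ver=22471.1000)
  • Install WSL2 (wsl.exe --update), kernel ver=5.10.60.1
  • Install CUDA on Windows
  • Install CUDA on Linux
  • Install Docker; note that it only needs to be installed on the Windows host
  • Optional: install Miniconda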
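Once the steps above are done, a quick spot check from inside the Ubuntu distribution can show which tools are reachable. This is a minimal sketch: check_tool is a hypothetical helper, and the tool list is an assumption about what this setup is expected to expose on PATH.

```shell
# Report whether a command is reachable on PATH.
# Returns 0 if found, 1 if missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Tool names are assumptions; adjust to what was actually installed.
for tool in nvidia-smi nvcc docker conda; do
    check_tool "$tool" || true
done
```

A missing nvcc does not necessarily mean CUDA is absent; it may simply not be on PATH yet.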

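The default CUDA in WSL download seems to reach only 11.4 after installation, so install the newer version inside Ubuntu with the official commands: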

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.5.0/local_installers/cuda-repo-wsl-ubuntu-11-5-local_11.5.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-5-local_11.5.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-wsl-ubuntu-11-5-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
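sudo apt list --upgradable  # be sure to check which previously installed packages need upgrading
sudo apt upgrade

  • Once CUDA works, install the following (for the weaker machine, i.e. one that cannot use DirectML or cannot be upgraded to Windows 11 because its CPU is unsupported; there is no need to install other versions)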

conda install -c rapidsai -c nvidia -c numba -c conda-forge cudf=21.06 python=3.7 cudatoolkit=11.4

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin

sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600

wget https://developer.download.nvidia.com/compute/cuda/11.4.2/local_installers/cuda-repo-wsl-ubuntu-11-4-local_11.4.2-1_amd64.deb

sudo dpkg -i cuda-repo-wsl-ubuntu-11-4-local_11.4.2-1_amd64.deb

sudo apt-key del 7fa2af80

sudo apt-key add /var/cuda-repo-wsl-ubuntu-11-4-local/7fa2af80.pub

sudo apt-get update

sudo apt-get -y install cuda

The old steps, kept for reference:

$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
$ sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600

$ wget https://developer.download.nvidia.com/compute/cuda/11.4.2/local_installers/cuda-repo-ubuntu2004-11-4-local_11.4.2-470_470.57.02-0ubuntu1_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu2004-11-4-local_11.4.2-470_470.57.02-0ubuntu1_amd64.deb
$ sudo apt-key add /var/cuda-repo-ubuntu2004-11-4-local/7fa2af80.pub
$ sudo apt-get update

https://blog.csdn.net/momodosky/article/details/119673472 Overview

https://zhuanlan.zhihu.com/p/356397851 Upgrading WSL1 to WSL2

https://zhuanlan.zhihu.com/p/350399229 Installing CUDA and configuring PyTorch in WSL2 on Windows 10

https://blog.csdn.net/Veritaz/article/details/113826386 Speeding up the CUDA installation with the Aliyun mirror (Ubuntu 20.04)

PS C:\Users\xxz> wsl --install
Windows Subsystem for Linux is already installed.
The following is a list of valid distributions that can be installed.
Install using 'wsl --install -d <Distro>'.

NAME            FRIENDLY NAME
Ubuntu          Ubuntu
Debian          Debian GNU/Linux
kali-linux      Kali Linux Rolling
openSUSE-42     openSUSE Leap 42
SLES-12         SUSE Linux Enterprise Server v12
Ubuntu-16.04    Ubuntu 16.04 LTS
Ubuntu-18.04    Ubuntu 18.04 LTS
Ubuntu-20.04    Ubuntu 20.04 LTS
PS C:\Users\xxz> wsl --update
Checking for updates...
No updates are available.
Kernel version: 5.10.60.1
PS C:\Users\xxz> wsl --install -d Ubuntu

Installing Docker

Configure a registry mirror (accelerator)

{ "registry-mirrors": ["https://***.mirror.aliyuncs.com"] }

# Running Windows 10 17.09?
sudo mkdir /c
sudo mkdir /d
sudo mount --bind /mnt/c /c
sudo mount --bind /mnt/d /d

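5. Set the CUDA environment variables (optional; may be unnecessary with newer versions)

Add the following paths to the ~/.bashrc file in your home directory:

vim ~/.bashrc

Append the following at the end and save: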

export CUDA_HOME=/usr/local/cuda
export PATH=$PATH:$CUDA_HOME/bin
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
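Because ~/.bashrc is sourced by every new shell, including nested ones, plain appends can pile up duplicate entries. The following is a minimal sketch of a guarded variant; append_path is a hypothetical helper, and /usr/local/cuda is assumed to be the CUDA install prefix as in the exports above.

```shell
# Append a directory to a colon-separated list only if it is not already
# present. Prints the resulting list; does not touch the environment itself.
append_path() {
    list=$1
    dir=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;               # already present
        *)          printf '%s\n' "${list:+$list:}$dir" ;; # append, no leading colon
    esac
}

# Assumed prefix; override CUDA_HOME beforehand if CUDA lives elsewhere.
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
PATH=$(append_path "$PATH" "$CUDA_HOME/bin")
LD_LIBRARY_PATH=$(append_path "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64")
export CUDA_HOME PATH LD_LIBRARY_PATH
```

Sourcing this snippet repeatedly leaves PATH and LD_LIBRARY_PATH unchanged after the first time.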

If you are told that dependency libraries are missing, run the following command to install them:

sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev

Verification

/usr/local/cuda-11.4/bin/nvcc -V

Right after WSL2 is first installed, check it as follows. If the output is 5.10.60.1-microsoft-standard-WSL2, WSL2 is set up correctly (as of October 6, 2021):

uname -r
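The same check can be scripted. This is a minimal sketch, assuming the stock Microsoft kernel whose release string contains "-microsoft-standard-WSL2" as shown above; a custom-built kernel may use a different suffix. is_wsl2_kernel is a hypothetical helper name.

```shell
# Succeed if the given kernel release string looks like a stock WSL2 kernel.
is_wsl2_kernel() {
    case "$1" in
        *-microsoft-standard-WSL2*) return 0 ;;
        *) return 1 ;;
    esac
}

release=$(uname -r)
if is_wsl2_kernel "$release"; then
    echo "WSL2 kernel detected: $release"
else
    echo "not a stock WSL2 kernel: $release"
fi
```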

Then use wsl -l -v to check the installed distributions and the WSL version each one runs under.

1. NVIDIA's website describes installing Docker inside WSL, but in practice the native Docker Desktop for Windows is still the recommended choice:

WSL DETECTED: We recommend using Docker Desktop for Windows. Please get Docker Desktop from https://www.docker.com/products/docker-desktop

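2. The nvcc command is not available by default. Do not blindly follow the prompt and install nvidia-cuda-toolkit; the fix is to edit /etc/profile, for example to add the CUDA bin directory to PATH as in step 5: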

sudo vi /etc/profile

3. How to browse the files inside WSL from Windows:

cd \\wsl$\ubuntu

Inside Ubuntu, the CUDA Toolkit does not need to be installed on the host system

See https://github.com/NVIDIA/nvidia-docker. The original text reads: "Make sure you have installed the NVIDIA driver and Docker engine for your Linux distribution. Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver needs to be installed."

How to install Docker on a non-system drive

See https://github.com/DDoSolitary/LxRunOffline

scoop search lxrunoffline
scoop install lxrunoffline

Run LxRunOffline move -n Ubuntu -d D:\wsl\Ubuntu. If it reports an error, resolve it as follows:

LxRunOffline: https://github.com/DDoSolitary/LxRunOffline/releases
If an error is reported during use, use this build instead:
https://ddosolitary-builds.sourceforge.io/LxRunOffline/LxRunOffline-v3.5.0-11-gfdab71a-msvc.zip
For detailed steps, see https://blog.csdn.net/Jioho_chen/article/details/103988647

After the migration, verify the location with 'LxRunOffline.exe get-dir -n Ubuntu'

Benchmark the performance. If the GPU cannot be found, reopen the terminal.

In some cases, when running a Docker container, you may encounter nvidia-container-cli : initialization error:
$ sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Pascal" with compute capability 6.1

> Compute 6.1 CUDA device: [NVIDIA GeForce GTX 1070 Ti]
19456 bodies, total time for 10 iterations: 14.612 ms
= 259.050 billion interactions per second
= 5180.995 single-precision GFLOP/s at 20 flops per interaction

Result on the other GPU:

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Pascal" with compute capability 6.1

> Compute 6.1 CUDA device: [NVIDIA GeForce GTX 1060 5GB]
10240 bodies, total time for 10 iterations: 8.270 ms
= 126.795 billion interactions per second
= 2535.909 single-precision GFLOP/s at 20 flops per interaction
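The reported throughput can be sanity-checked from the other numbers in the output. The following is a rough sketch, assuming the sample counts n × n interactions per iteration (an assumption about the benchmark, though it agrees with the figures above to within the rounding of the printed time) and using the 20 flops per interaction stated in the output. nbody_gflops is a hypothetical helper name.

```shell
# Estimate n-body throughput in GFLOP/s.
# $1 = number of bodies, $2 = iterations, $3 = total time in milliseconds
nbody_gflops() {
    awk -v n="$1" -v it="$2" -v ms="$3" 'BEGIN {
        ips = n * n * it / (ms / 1000)   # interactions per second
        printf "%.0f\n", ips * 20 / 1e9  # 20 flops per interaction, scaled to GFLOP/s
    }'
}

nbody_gflops 19456 10 14.612   # GTX 1070 Ti run
nbody_gflops 10240 10 8.270    # GTX 1060 5GB run
```

The results should land close to the GFLOP/s lines printed by the benchmark; small differences are expected because the time is only shown to three decimals.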

Alternatively, the GPU architecture can be queried in plain text with nvidia-smi -q.

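File systems mounted under /mnt get permission 777 by default; see Microsoft's documentation

Add the following automount options to /etc/wsl.conf, creating the file if it does not exist: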

[automount]
enabled = true
root = /mnt/
options = "metadata,umask=22,fmask=11"
mountFsTab = false

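Then reset umask in .profile, .bashrc, .zshrc, or another shell startup file:

# Fix mkdir creating directories with the wrong permissions.
# Match case-insensitively: WSL2 kernels report "microsoft" in lowercase.
if grep -qi microsoft /proc/version; then
  if [ "$(umask)" = '0000' ]; then
    umask 0022
  fi
fi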

The above may not be fully accurate; defer to the official documentation.

/usr/lib/wsl/lib/libcuda.so.1 is not a symbolic link

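Cause: the entries under /usr/lib/wsl/lib/ are regular files rather than symbolic links.

Because that directory is read-only, the workaround has to be done in another directory. The steps are:

cd /usr/lib/wsl
sudo mkdir lib2
# Use absolute targets; relative targets would be resolved inside lib2 and dangle
sudo ln -s /usr/lib/wsl/lib/* /usr/lib/wsl/lib2/

Then, in /etc/ld.so.conf.d/ld.wsl.conf, change /usr/lib/wsl/lib to /usr/lib/wsl/lib2.

Note that after changing the library path, the links must be recreated every time the driver is updated; otherwise lib2 falls out of sync with lib and WSL can no longer use the Windows driver.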
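Since the links have to be rebuilt after each driver update, the step can be wrapped in a small function. This is a minimal sketch, not an official procedure: relink_libs is a hypothetical helper, and it assumes the target directory holds nothing but the links it manages.

```shell
# Recreate symlinks in directory "$2" for every entry in directory "$1".
# Absolute targets are used so the links resolve from any location.
relink_libs() {
    src=$(cd "$1" && pwd) || return 1
    dst=$2
    mkdir -p "$dst" || return 1
    # Drop stale links left over from a previous driver version.
    for old in "$dst"/*; do
        if [ -L "$old" ]; then
            rm -f "$old"
        fi
    done
    # Link every current entry from the source directory.
    for f in "$src"/*; do
        [ -e "$f" ] || continue
        ln -s "$f" "$dst/$(basename "$f")"
    done
}
```

On WSL this would be run as root, for example relink_libs /usr/lib/wsl/lib /usr/lib/wsl/lib2, followed by ldconfig to refresh the linker cache.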

TensorFlow verification

docker run -it --gpus all -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter
