【参考例子】安装训练环境和记录测试结果 - bettermorn/IAICourse GitHub Wiki
1 服务器环境
功能 | 参数/工具名称 | 版本号 |
---|---|---|
操作系统 | CentOS 7.7 | 内核linux mstation 3.10.0-1062.18.1.el7.x86 64 |
CPU | 12vCPU | - |
内存 | 128GB | - |
GPU 2卡 | NVIDIA GP100GL Quadro | - |
- 查看 硬盘信息确定安装路径
df -h
fdisk -l
- 后台运行生成文件
python x.py &>> xxx.out
参考https://askubuntu.com/questions/420981/how-do-i-save-terminal-output-to-a-file
访问方法
- SSH: putty
- SFTP:fileZilla
2 安装基础应用
2.1 运行环境安装参考
Python 3.8
下载链接:https://www.python.org/downloads/release/python-3811/ 安装方法:https://packaging.python.org/guides/installing-using-linux-tools/
Anaconda
https://docs.anaconda.com/anaconda/install/linux/ 可在conda上建立多个环境,以便用于不同的模型训练
NVIDIA CUDA-enabled
CUDA Installation Guide for Linux 参考 https://docs.nvidia.com/cuda/archive/10.2/cuda-installation-guide-linux/index.html
cuDNN
用于深层神经网络的gpu加速库 下载: https://developer.nvidia.com/rdp/cudnn-download 安装:https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html
- https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux Note: the tar file installation applies to all Linux platforms.
- get following installation from https://developer.nvidia.com/rdp/cudnn-download https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.2.4/10.2_20210831/cudnn-10.2-linux-x64-v8.2.4.15.tgz
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.2.4/10.2_20210831/cudnn-10.2-linux-x64-v8.2.4.15.tgz
tar -xzvf cudnn-10.2-linux-x64-v8.2.4.15.tgz
2.2 安装过程命令
- Conda 2021.05
- Python 3.8.11
- CUDA 10.2
- KB:custom install path only for shell file:
sudo sh cuda_10.2.89_440.33.01_linux.run --silent \
--toolkit --toolkitpath=/home/hpc/cuda/toolkit \
--samples --samplespath=/home/hpc/cuda/samples \
--defaultroot=/home/hpc/cuda --tmpdir=/home/hpc/cuda/tmp --installpath=/home/hpc/install --extract=/home/hpc/extract --no-man-page --kernel-source-path=/home/hpc/kernel --librarypath=/home/hpc/lib
refer to:
- How do I install the Toolkit in a different location? https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#faq1
- how can I install CUDA on linux on a custom path (not in /usr/local) https://superuser.com/questions/1131813/how-can-i-install-cuda-on-linux-on-a-custom-path-not-in-usr-local
- cuDNN 8.1.1.33_CUDA11.2
3 路径设置
3.1 备用文件
- /home/hpc/files
3.2 Conda 环境
- tsai
- cnnlstmdense
3.3 代码
- /home/hpc/cnnlstmdense
- data
- models
- outputs
- /home/hpc/tsai
- data
- models
- output
4 运行报告
编号 | 运行时长(分钟或者小时) | epochs | batch_size | 模型和运行参数 | 评价指标:正确率%等 | GPU | cuDNN |
---|---|---|---|---|---|---|---|