Spock Installation: Computing Node - calab-ntu/gpu-cluster GitHub Wiki
1. Check switch settings on MB
-
VGA switch -> off
-
IPMI switch -> left (default)
-
PSU (PHANTEKS) hybrid -> press down
-
Change cooling fan header from CPU_OPT to CHA_FAN1
-
Unplug the micro USB plug from the CPU pump.
2. Set up BIOS
-
Boot the machine with the BIOS flash disk plugged in.
If the machine is booting for the first time, it will ask whether to initialize the CPU configuration. Press Y to confirm.
-
Check the BIOS version and update
- Enter the BIOS by pressing delete or F2 during boot.
- Check BIOS version : Main -> BIOS Information -> Version
If Version = 1106 x64, skip the BIOS update steps that follow.
- Plug the USB disk labeled "BIOS" into the USB socket labeled "BIOS".
- Keep pressing delete or F2 during boot to enter the BIOS.
- Tool -> ASUS EZ Flash 3 Utility
- Find the folder PRO_WS_WRX80E-SAGE_SE_WIFI-ASUS-1106
- Find the file PRO-WS-WRX80E-SAGE-SE-WIFI-ASUS-1106.CAP -> Yes
- Reboot with save changes and exit, or press F10.
- Check again.
-
DRAM overclock setting
Ai Tweaker -> Ai overclock Tuner -> Choose D.O.C.P
D.O.C.P -> Choose D.O.C.P DDR4-3200 16-18-18-38-1.35V
F10 to reboot.
- Check: Main -> Total Memory : 262144 MB -> Speed : 3200 MHz
-
Enable NUMA
Advanced -> AMD CBS -> DF Common Option -> Memory Addressing -> NUMA nodes per socket -> Choose NPS2
-
F10 to reboot.
3. Install Ubuntu Server 22.04
-
Download Ubuntu 22.04 from https://www.ubuntu-tw.org/modules/tinyd0/ and make a bootable USB disk with Rufus.
-
Set up boot disk in BIOS
- Boot with the bootable USB disk plugged in.
- Enter the BIOS by pressing delete or F2 during boot.
- Boot > Choose USB to boot.
- F10 to reboot.
-
Install Ubuntu 22.04
0. Choose Try or install ubuntu server
-
Select language : English -> Done
-
Keyboard configuration.
- Layout : English (US)
- Variant : English (US) -> Done
-
Choose type of install
- Ubuntu Server
- Search for third party drivers
-> Done
-
Network connections -> Continue without network
-
Configure proxy -> Done
Leave the field empty.
-
Configure Ubuntu archive mirror -> Done
Don't change the URL.
-
Guided storage configuration
Custom storage layout -> Done
- Select disks and reformat all of them.
/boot
- Choose free space -> Add GPT Partition
- Size : 1G
- Format : ext4
- Mount : /boot -> Create
/ : Same steps as /boot with changes
- Size : 448G
- Mount : /
swap
- Size : leave empty to get rest of volume
- Format : swap -> Done -> Continue
-
Profile setup
- Your name: spock**
** is the node number.
- Your server's name: spock**
- Pick a username: tmp_account
- Choose a password: ************
- Confirm your password: ************
-
Upgrade to Ubuntu Pro -> Skip Ubuntu Pro setup for now -> Done
-
SSH Setup -> Done
Don't check the option.
-
Third-party drivers
- Do not install third-party drivers now -> Done
-
Reboot Now -> Unplug the installation medium and press enter to reboot.
-
-
Check.
- Kernel : uname -r -> 5.15.0-60-generic
- CPU : lscpu | grep Model -> AMD Ryzen Threadripper PRO 5975WX 32-Cores
- RAM : sudo dmidecode -t memory | grep Speed
Configured Speed: 3200 MT/s
Speed: 2667 MT/s
- NUMA : lscpu | grep NUMA
NUMA node(s) : 2
NUMA node0 CPU(s) : 0-15, 32-47
NUMA node1 CPU(s) : 16-31, 48-63
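The NUMA check above can be scripted so that it fails loudly when the BIOS NPS2 setting did not take effect. The sketch below is only an illustration: the function names `numa_node_count` and `check_numa` are hypothetical, and it assumes `lscpu` prints a `NUMA node(s)` line with the count after the colon.

```shell
# Print the value of the "NUMA node(s)" field from `lscpu` output on stdin.
# Prints nothing if the field is absent.
numa_node_count() {
    awk -F: '/^NUMA node\(s\)/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Succeed only if the reported NUMA node count equals the expected one in $1.
check_numa() {
    expected="$1"
    actual="$(numa_node_count)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $actual NUMA node(s)"
    else
        echo "FAIL: expected $expected NUMA node(s), found '$actual'"
        return 1
    fi
}
```

A typical call on a freshly installed node would be `lscpu | check_numa 2`.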
4. Set up settings
-
Network settings.
- Edit netplan : sudo vim /etc/netplan/00-installer-config.yaml
# This is the network config written by 'subiquity'
network:
  ethernets:
    enp*****0:
      dhcp4: true
    enp*****1:
      dhcp4: false
      addresses: [192.168.0.2**/22] # ** is replaced by the node number.
      nameservers:
        addresses: [140.112.254.4]
      routes:
        - to: default
          via: 192.168.0.1
  version: 2
- Apply netplan: sudo netplan apply
- Power off the machine and move it to the machine room.
- Plug the ethernet cable into the upper ethernet port.
- Check
- IP settings : ip addr show dev enp*****1
inet 192.168.0.2**/22
- DNS server : resolvectl status
Link 3 (enp*****1)
Current Scopes: DNS
Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 140.112.254.4
DNS Servers: 140.112.254.4
ping 192.168.0.150
- Get system network information.
sudo -i
scp [your_account]@192.168.0.150:/work1/shared/spock/etc/hosts /etc/hosts
-
Update system
0. Operate with sudo privilege
sudo -i
apt update
apt-get install -y linux-image-5.15.0-78-generic
Press enter twice when the kernel update UI appears.
reboot
sudo -i
- Check : uname -r
5.15.0-78-generic # or above
- Change group name of ID 1000 : groupmod --new-name calab tmp_account
- Set root password : passwd
- Delete /home/tmp_account : rm -r /home/tmp_account
- Change sh link from dash to bash : sudo dpkg-reconfigure dash
# The configuration UI will ask whether to set /usr/bin/sh to dash.
# Choose "No" to set /usr/bin/sh to bash.
apt install --only-upgrade -y openssh* wget
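The "5.15.0-78-generic or above" kernel check can be automated with a version-aware comparison. This is a minimal sketch, assuming GNU `sort -V` is available (it ships with Ubuntu's coreutils); the function name `kernel_at_least` is hypothetical.

```shell
# Succeed if kernel release $1 is the same as or newer than release $2.
# Uses GNU version sort: the lower of the two releases sorts first.
kernel_at_least() {
    current="$1"
    minimum="$2"
    lowest="$(printf '%s\n%s\n' "$current" "$minimum" | sort -V | head -n 1)"
    [ "$lowest" = "$minimum" ]
}

# Example (run on the node after the reboot):
# kernel_at_least "$(uname -r)" "5.15.0-78-generic" && echo "kernel OK"
```

A plain string comparison would misorder releases such as 5.15.0-100 and 5.15.0-78, which is why version sort is used here.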
-
Time stamp of command history
su
- Add export HISTTIMEFORMAT='%d/%m/%y %T ' to the end of the file /etc/profile
source /etc/profile
- Check by
history
-
Set timezone
su
timedatectl set-timezone Asia/Taipei
- Check
timedatectl show
-
NFS settings
- Client
sudo -i
- Install NFS client.
apt -y install nfs-common
- Get auto mount settings from work1.
ssh [your_account]@eureka00 cat /work1/shared/spock/etc/fstab >> /etc/fstab
- Create directories.
mkdir /software /work1 /projectV /projectW /projectX /projectY /projectZ
- Check the accessibility of the target NFS servers
showmount -e spock00 # /software 192.168.0.0/24 **[Skip on login node]**
showmount -e tumaz # /home 192.168.0.0/24
showmount -e ironman # /volume1/gpucluster1 192.168.0.0/24
# /volume3/gpucluster3 192.168.0.0/24
showmount -e eater # /volume1/gpucluster3 192.168.0.0/24
# /volume2/gpucluster4 192.168.0.0/24
# /volume3/gpucluster6 192.168.0.0/24
showmount -e pacific # /volume1/gpucluster1 192.168.0.0/24
- Mount all remote directories.
mount /software; # Skip on login node
mount /home; mount /work1; mount /projectW; mount /projectX; mount /projectY; mount /projectZ; mount /projectV
- Check :
df -h
tumaz:/home                    208G   22G  176G  12% /home
ironman:/volume1/gpucluster1    70T   47T   24T  67% /work1
ironman:/volume3/gpucluster3    70T   70T  643G 100% /projectX
eater:/volume1/gpucluster3      70T   67T  3.6T  95% /projectY
eater:/volume2/gpucluster4      88T   77T   12T  88% /projectZ
eater:/volume3/gpucluster6      88T   75T   13T  86% /projectW
pacific:/volume1/gpucluster1   140T   20T  120T  15% /projectV
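Reading the `df -h` listing by eye is easy to get wrong with this many mounts. The following sketch reports any expected mount point that is missing; the function name `missing_mounts` is hypothetical, and it assumes the mount point is the last column of each `df` line.

```shell
# Print every expected mount point (given as arguments) that does not appear
# in the last column of `df` output read from stdin.
# Prints nothing when all of them are mounted.
missing_mounts() {
    df_output="$(cat)"
    for mp in "$@"; do
        if ! printf '%s\n' "$df_output" |
            awk -v mp="$mp" '$NF == mp { found = 1 } END { exit !found }'; then
            echo "$mp"
        fi
    done
}
```

For example, `df -h | missing_mounts /home /work1 /projectV /projectW /projectX /projectY /projectZ` should print nothing on a correctly configured computing node.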
-
NIS settings
- Install NIS client.
sudo apt -y install nis
- Configure as a NIS client.
vim /etc/yp.conf, add the following text at the end.
domain tumaz.gpucluster.calab server tumaz
vim /etc/nsswitch.conf
passwd: files systemd nis
group: files systemd nis
shadow: files nis
hosts: files dns nis
- Set NIS domain name.
vim /etc/defaultdomain
tumaz.gpucluster.calab
- Start and enable nis.
systemctl restart ypbind
systemctl enable ypbind
- Check :
ll /home
yptest : 1 test fail
ypwhich : tumaz
- Log out and log in with your own account
su
Delete tmp_account : userdel --remove tmp_account
It's okay to receive these error messages:
userdel: tmp_account mail spool (/var/mail/tmp_account) not found
userdel: tmp_account home directory (/home/tmp_account) not found
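A quick way to confirm the `/etc/nsswitch.conf` edit is to check that each database lists `nis` as a source. This is an illustrative sketch; the function name `nsswitch_has_nis` is hypothetical and it only inspects the file, not whether `ypbind` is actually serving lookups.

```shell
# Succeed if every database named after the file path ($1) lists "nis"
# as one of its sources in that nsswitch.conf-style file.
nsswitch_has_nis() {
    file="$1"
    shift
    status=0
    for db in "$@"; do
        if ! grep -E "^${db}:.*[[:space:]]nis([[:space:]]|\$)" "$file" >/dev/null; then
            echo "missing nis source for: $db"
            status=1
        fi
    done
    return "$status"
}
```

On the node this would be called as `nsswitch_has_nis /etc/nsswitch.conf passwd group shadow hosts`.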
-
Install GPU driver
- Set the text mode as default (since the NVIDIA driver cannot be installed while X window is running)
systemctl set-default multi-user.target
- Reboot.
su
- Install dkms :
apt -y install dkms
- Disable nouveau : Create the file /etc/modprobe.d/blacklist-nouveau.conf with content:
blacklist nouveau
options nouveau modeset=0
- Apply system changes
update-initramfs -u
- Reboot.
su
- Check nouveau is disabled : lsmod | grep nouveau
This should print nothing.
- Install NVIDIA driver
- Install :
su
sh /work1/shared/spock/package/cuda/cuda_12.1.0_530.30.02_linux.run --silent --driver
- Validate with cat /proc/driver/nvidia/version :
NVRM version: NVIDIA UNIX x86_64 Kernel Module 530.30.02
GCC version: gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04)
- Copy the default profile files.
cp /work1/shared/spock/init_script/*.sh /etc/profile.d/
cp /work1/shared/spock/init_script/*.csh /etc/profile.d/
cp /work1/shared/spock/etc/rc.local /etc/
chmod +x /etc/rc.local
- Reboot
- Check
nvidia-smi
NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1
-
NTP client
0.
su
apt -y install ntp ntpdate
- Edit /etc/ntp.conf
- Add pool time.google.com iburst
- Comment out other pool servers.
systemctl restart ntp
systemctl status ntp
systemctl enable ntp
-
TORQUE
- Install the required packages
apt -y install libnuma-dev
apt -y install tcl-dev tk-dev
apt -y install libntirpc-dev
sh /work1/shared/spock/package/torque/src/torque-3.0.6/spock_library_set.sh
- Compile and install from source code.
cd /work1/shared/spock/package/torque/src/torque-3.0.6
# WARNING: do NOT run "spock_Install.sh" in parallel (i.e., install one node at a time)
# [Login node ] uncomment "--enable-server"
# [Computing nodes] comment "--enable-server"
sh spock_Install.sh >& log.spockXX
cd ../../etc
cp pbs_spock /etc/init.d/pbs
ln -s /etc/init.d/pbs /etc/systemd/system/
cp pbs.conf /etc/
# [Login node only]: edit "pbs.conf" to set "start_server=1" and "start_mom=0"
cp nodes_spock /var/spool/TORQUE/server_priv/nodes
systemctl enable pbs
source /etc/profile.d/torque.sh
cd ../src/torque-3.0.6/
./torque.setup root
killall pbs_server
systemctl start pbs
# This error message is fine: "LOG_ERROR::No such file or directory (2) in read_config, fstat: config"
systemctl status pbs
- Check cat /var/spool/TORQUE/pbs_environment : LANG=en_US.utf-8
- Set up overcommit-ratio and disable overcommit-memory in crontab
cp /work1/shared/spock/helper_script/disable_memory_overcommit.sh /root/
- Edit crontab with crontab -e and add a new line:
@reboot /usr/bin/sh /root/disable_memory_overcommit.sh 1> /tmp/disable_memory_overcommit.log 2>&1
- Set up prologue and epilogue in /var/spool/TORQUE/mom_priv/
cp /work1/shared/spock/package/torque/mom_priv/* /var/spool/TORQUE/mom_priv/
-
InfiniBand
ref. https://docs.nvidia.com/networking/display/MLNXOSv3105002/Getting+Started#heading-RerunningtheWizard
- Check hardware
lspci | grep Mellanox
01:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
- Install necessary package
apt -y install libsasl2-dev libldap2-dev libssl-dev
- Install driver
su
cd /work1/shared/spock/package/ib/adaptor/driver/MLNX_OFED_LINUX-5.9-0.5.6.0-ubuntu22.04-x86_64
./mlnxofedinstall
Device #1:
----------
Device Type: ConnectX6
Part Number: MCX653105A-HDA_Ax
Description: ConnectX-6 VPI adapter card; HDR IB (200Gb/s) and 200GbE; single-port QSFP56; PCIe4.0 x16; tall bracket; ROHS R6
PSID: MT_0000000223
PCI Device Name: 01:00.0
Base GUID: 0c42a10300ef2a1a
Versions: Current Available
FW 20.34.1002 20.36.1010
PXE 3.6.0700 3.6.0901
UEFI 14.27.0014 14.29.0014
Status: Up to date
---------
/etc/init.d/openibd restart
reboot
- Check
0.
su
ibstatus
Infiniband device 'mlx5_0' port 1 status:
default gid: fe80:0000:0000:0000:0c42:a103:00ef:2a1a
base lid: 0xffff
sm lid: 0x0
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 200 Gb/sec (4X HDR)
link_layer: InfiniBand
cat /etc/security/limits.conf
* soft memlock unlimited
* hard memlock unlimited
systemctl status openibd
Active: active (exited)
systemctl is-enabled openibd
enabled
systemctl status opensmd
Active: inactive (dead)
systemctl is-enabled opensmd
disabled
hca_self_test.ofed
---- Performing Adapter Device Self Test ----
Number of CAs Detected ................. 1
PCI Device Check ....................... PASS
Kernel Arch ............................ x86_64
Host Driver Version .................... MLNX_OFED_LINUX-5.9-0.5.6.0 (OFED-5.9-0.5.6): 5.15.0-69-generic
Host Driver RPM Check .................. PASS
Firmware on CA #0 HCA .................. v20.36.1010
Host Driver Initialization ............. PASS
Number of CA Ports Active .............. 1
Port State of Port #1 on CA #0 (HCA)..... UP 4X HDR (InfiniBand)
Error Counter Check on CA #0 (HCA)...... PASS
Kernel Syslog Check .................... PASS
Node GUID on CA #0 (HCA) ............... 0c:42:a1:03:00:ef:2a:1a
------------------ DONE ---------------------
ibdev2netdev -v | grep -i MCX
0000:01:00.0 mlx5_0 (MT4123 - MCX653105A-HDAT) ConnectX-6 VPI adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56 fw 20.36.1010 port 1 (ACTIVE) ==> ibp1s0 (Down)
- IB connection and bandwidth test.
- Computing nodes -> Login node
On spock00
ib_write_bw -aF
On spockXX
ib_write_bw -aF spock00
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF
Device : mlx5_0
Number of qps : 1
Transport type : IB
Connection type : RC
Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : ON
CQ Moderation : 100
Mtu : 4096[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x02 QPN 0x0027 PSN 0xcb8c4 RKey 0x1fffbe VAddr 0x007f9c96aaa000
remote address: LID 0x01 QPN 0x0027 PSN 0x560b74 RKey 0x1fffbe VAddr 0x007f0894517000
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
8388608 5000 23452.55 23452.55 0.002932
---------------------------------------------------------------------------------------
- Computing nodes <- Login node
On spock00
ib_read_bw -aF
On spockXX
ib_read_bw -aF spock00
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
RDMA_Read BW Test
Dual-port : OFF
Device : mlx5_0
Number of qps : 1
Transport type : IB
Connection type : RC
Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : ON
CQ Moderation : 100
Mtu : 4096[B]
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x02 QPN 0x0028 PSN 0x593c01 OUT 0x10 RKey 0x1fffbf VAddr 0x007efc3f67f000
remote address: LID 0x01 QPN 0x0028 PSN 0xbaa0aa OUT 0x10 RKey 0x1fffbf VAddr 0x007f6fd2a85000
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
8388608 1000 23517.75 23517.73 0.002940
---------------------------------------------------------------------------------------
- Start mst to enable monitoring of the IB adapter
systemctl enable mst
systemctl start mst
mst status
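The `ibstatus` check from this section can be reduced to a pass/fail test. This is a minimal sketch with a hypothetical function name, `ib_port_is_up`. It assumes the `state:` and `phys state:` lines look like the sample output in this section, and it does not distinguish between ports on a multi-port adapter.

```shell
# Read `ibstatus` output on stdin and succeed only if it contains both
# a "state: ... ACTIVE" line and a "phys state: ... LinkUp" line.
ib_port_is_up() {
    awk '
        /^[ \t]*state:/      && /ACTIVE/ { active = 1 }
        /^[ \t]*phys state:/ && /LinkUp/ { linkup = 1 }
        END { exit !(active && linkup) }
    '
}
```

Usage on a node: `ibstatus | ib_port_is_up && echo "IB link OK"`.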
-
ssh without password for the root
echo "#Allow only following users access this node by ssh" >> /etc/ssh/sshd_config
echo "AllowUsers root" >> /etc/ssh/sshd_config
cd /work1/shared/spock/ssh_root/
cp authorized_keys id_rsa* /root/.ssh/
# Verification
ssh spock00 # "yes" to "continue connecting"
ssh spockXX # "yes" to "continue connecting"
exit
exit
5. Install compilers [Login node only]
- Intel compiler
su
cd /opt
ln -s /software/intel
6. Install packages
-
python2
source /etc/profile.d/openmpi.sh; source /etc/profile.d/intel.sh; source /etc/profile.d/hdf5.sh
apt -y install python2 python2-dev
apt -y install python-tk
cd /work1/shared/spock/package/python2
python2 get-pip.py
sh install-python-packages.sh
-
python3
apt -y install python3 python3-dev
apt -y install python3-tk
apt -y install python3-pip
cd /work1/shared/spock/package/python3
sh install-python-packages.sh
Add /usr/local/bin to PATH by adding a line at the end of /etc/profile
export PATH=/usr/local/bin:$PATH
-
Module
cd /work1/shared/spock/package/module/modules-5.1.1
make clean
./configure
make
make install
After installation
cp init/profile.sh /etc/profile.d/10-modules.sh
cp init/profile.csh /etc/profile.d/modules.csh
source init/bash
Add /software/intel/oneapi/modulefiles to the default module directories by adding this line to the file /usr/local/Modules/etc/initrc
module use /software/intel/oneapi/modulefiles
Set up preload module
ln -s /software/modulefiles/default_modules.sh /etc/profile.d/default_modules.sh
7. Miscellaneous setup
-
IPMI tool
- Install IPMI driver and tool :
apt -y install openipmi ipmitool
- Check :
ipmitool sensor get "CPU Temp."
-
ffmpeg
apt -y install ffmpeg
-
gnuplot
apt -y install gnuplot-x11
-
screen
apt -y install screen
-
pdsh
apt -y install pdsh
-
locate
apt -y install plocate
-
ClamAV
apt -y install clamav clamav-daemon
systemctl stop clamav-freshclam
freshclam
systemctl start clamav-freshclam
systemctl enable clamav-freshclam
-
X11 server
apt -y install xorg openbox
-
CPU usage monitor
apt -y install sysstat
-
Image display
feh
apt -y install feh
-
cmake
apt -y install cmake
-
GNU Readline Library
apt -y install lib32readline8 lib32readline-dev
-
Disable auto update.
- Edit the apt config file at /etc/apt/apt.conf.d/20auto-upgrades as follows.
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Unattended-Upgrade "0";
- Verify the config
apt-config dump APT::Periodic::Update-Package-Lists
apt-config dump APT::Periodic::Unattended-Upgrade
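The two `apt-config dump` calls only print the effective values, so a small wrapper can turn them into a pass/fail check. This is a sketch under the assumption that `apt-config dump` prints each option as `Name "value";` on its own line; the function name `apt_option_is_zero` is hypothetical.

```shell
# Read `apt-config dump` output on stdin and succeed only if the option
# named in $1 is present and set to "0".
apt_option_is_zero() {
    grep -F -x "$1 \"0\";" >/dev/null
}
```

For example, `apt-config dump | apt_option_is_zero APT::Periodic::Unattended-Upgrade || echo "auto update still enabled"`.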
8. Check
-
CPU burn-in test
- Install CPU test program
apt -y install stress-ng
- Run CPU test
stress-ng --cpu 0 --timeout 30m &
- Detect CPU temperature every minute during test
for i in {1..40}; do ipmitool sensor | grep "CPU Temp."; sleep 1m; done
AMD Threadripper allows temperatures up to 95 degrees, and the non-critical upper limit is 85 degrees. For spock02, the highest temperature is 82 degrees.
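If the readings printed during the burn-in are saved to a file, the peak temperature can be extracted and compared against the 85 degree non-critical limit automatically. This is only a sketch. It assumes `ipmitool sensor` prints pipe-separated columns with the reading in the second column, and the names `max_cpu_temp`, `below_limit`, and `cpu_temp.log` are hypothetical.

```shell
# Read `ipmitool sensor` lines on stdin and print the highest
# "CPU Temp." reading with one decimal place. Prints nothing if no
# matching line is found.
max_cpu_temp() {
    awk -F'|' '
        $1 ~ /CPU Temp\./ {
            value = $2 + 0
            if (!seen || value > max) { max = value; seen = 1 }
        }
        END { if (seen) printf "%.1f\n", max }
    '
}

# Read one number on stdin and succeed if it is strictly below the limit in $1.
below_limit() {
    limit="$1"
    read -r value || return 1
    awk -v v="$value" -v l="$limit" 'BEGIN { exit !(v + 0 < l + 0) }'
}
```

With the loop output redirected to `cpu_temp.log`, the check would be `max_cpu_temp < cpu_temp.log | below_limit 85`.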
-
GPU burn-in test
cd /work1/shared/spock/tests/gpu_burn-in/gpu-burn
./gpu_burn 1800 # run for 30 minutes
During the test, watch the GPU temperature shown on screen. For RTX3080Ti, the highest temperature is 93 degrees Celsius, and the non-critical upper limit is 90 degrees. For spock02, the highest temperature is 81 degrees.
-
MPI suite test [Run as regular user]
- Download @spock00
git clone https://github.com/open-mpi/mpi-test-suite.git
- Compile @spock00
cd mpi-test-suite
./autogen.sh
./configure CC=mpicc
make
- Run tests
cp /work1/shared/tests/mpi_test_suite/run_test.sh ./
qsub -I -lnodes=spockXX:ppn=32
cd {directory of mpi_test_suite}
sh run_test.sh >& spockXX.log
- Check test result
tail spockXX.log
# Number of failed tests: 0
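When several nodes are tested, checking each log with `tail` gets tedious. The sketch below checks a log for the "Number of failed tests: 0" summary line quoted in this section; the function name `mpi_log_passed` is hypothetical.

```shell
# Succeed only if the log file in $1 contains the summary line
# "Number of failed tests: 0" (a non-zero count does not match).
mpi_log_passed() {
    grep -E 'Number of failed tests:[[:space:]]*0[[:space:]]*$' "$1" >/dev/null
}
```

For example, `mpi_log_passed spockXX.log || echo "spockXX: MPI test suite reported failures"`.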