Spock Installation: InfiniBand - calab-ntu/gpu-cluster GitHub Wiki
Switch
Initialization
-
Plug in both power cables and wait until all system status LEDs turn solid green.
-
Connect a host PC (e.g., spock00) to the console (RJ-45) port of the switch using the supplied RJ-45-to-DB9 cable plus a DB9-to-USB cable.
-
Log in to the switch from the Ubuntu PC
- Get the USB device name :
ls /dev/ttyUSB*
If only one USB serial adapter is plugged into the PC, this shows
/dev/ttyUSB0
- Connect to the switch as root (su):
screen /dev/ttyUSB0 115200
and press Enter twice.
- Login:
Username: admin
Password: admin
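The device lookup and the connect command can be combined in a small helper. This is a minimal sketch, not part of the original procedure: the function name `console_command` and its optional directory argument are hypothetical, and it only prints the `screen` command rather than running it.

```shell
#!/bin/sh
# Print the screen command for the switch console if exactly one
# USB serial adapter is present.
# $1 (optional, hypothetical): directory to search, default /dev.
console_command() {
    dev_dir="${1:-/dev}"
    count=0
    found=""
    for port in "$dev_dir"/ttyUSB*; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        [ -e "$port" ] || continue
        count=$((count + 1))
        found="$port"
    done
    if [ "$count" -eq 1 ]; then
        # 115200 is the baud rate used in the steps above.
        printf 'screen %s 115200\n' "$found"
    else
        printf 'found %s USB serial devices; choose one manually\n' "$count" >&2
        return 1
    fi
}
```

Run `console_command` on the host PC, then execute the printed command as root.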
-
Configuration (the following questions are asked on the first connection)
Do you want to use the wizard for initial configuration? yes
Step 1: Hostname? [switch-d79b5a]
Step 2: Use DHCP on mgmt0 interface? [yes] no
Step 3: Use zeroconf on mgmt0 interface [no]
Step 4: Primary IPv4 address and masklen? [0.0.0.0/0] 192.168.0.100/24
Step 5: Default gateway? 192.168.0.1
Step 6: Primary DNS server? 140.112.254.4
Step 7: Domain name?
Step 8: Enable IPv6? [yes]
Step 9: Enable IPv6 autoconfig (SLAAC) on mgmt0 interface? [no]
Step 10: Enable DHCPv6 on mgmt0 interface? [yes] no
Step 11: Admin password (Must be typed)? #set it the same as spock
Step 11: Confirm admin password?
Step 12: Monitor password (Must be typed)? #same as admin password
Step 12: Confirm monitor password?
If the configuration needs to be set up again, rerun the wizard:
enable
config terminal
configuration jump-start
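The address entered in Step 4 and the gateway entered in Step 5 have to lie in the same subnet. The sketch below is one way to check the values on the host PC before typing them into the wizard. The function names `ip_to_int` and `same_subnet` are hypothetical helpers, not switch commands, and the code assumes octets are written without leading zeros.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs="$IFS"
    IFS=.
    # Split the address on dots into the four positional parameters.
    set -- $1
    IFS="$old_ifs"
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if the gateway lies in the subnet of address/masklen.
# $1 = address with mask length, e.g. 192.168.0.100/24
# $2 = gateway, e.g. 192.168.0.1
same_subnet() {
    addr="${1%/*}"
    masklen="${1#*/}"
    mask=$(( (0xFFFFFFFF << (32 - masklen)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$addr") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}
```

With the values used above, `same_subnet 192.168.0.100/24 192.168.0.1` succeeds, while a gateway such as 192.168.1.1 would be rejected.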
-
Check
- System version
show version
Product name: MLNX-OS
Product release: 3.8.2102
Build ID: #1-dev
Build date: 2019-11-26 21:48:40
Target arch: x86_64
Target hw: x86_64
Built by: jenkins@c776fa44be2b
Version summary: X86_64 3.8.2102 2019-11-26 21:48:40 x86_64
Product model: x86onie
Host ID: 043F72D79B5A
System serial num: MT2039J30791
System UUID: f73a8370-1456-11eb-8000-043f72d00e66
Uptime: 18h 12m 33.108s
CPU load averages: 3.11 / 3.05 / 3.01
Number of CPUs: 4
System memory: 468 MB used / 7333 MB free / 7801 MB total
Swap: 0 MB used / 0 MB free / 0 MB total
- mgmt0 interface
enable
show interfaces mgmt0
Interface mgmt0 status:
Comment :
Admin up : yes
Link up : yes
DHCP running : no
IP address : 192.168.0.100
Netmask : 255.255.255.0
IPv6 enabled : yes
Autoconf enabled: no
Autoconf route : yes
Autoconf privacy: no
DHCPv6 running : no
IPv6 addresses : 1
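If the output of `show interfaces mgmt0` is saved to a text file on the host, the check can be scripted. This is a rough sketch that assumes the `Name : value` layout shown above. `mgmt0_field` and `mgmt0_ok` are hypothetical helper names and the saved file path is up to you.

```shell
#!/bin/sh
# Read saved `show interfaces mgmt0` output on stdin and print the
# value of the field named in $1 (text after the first colon, trimmed).
mgmt0_field() {
    awk -v key="$1" '
        {
            sep = index($0, ":")
            if (sep == 0) next
            name = substr($0, 1, sep - 1)
            value = substr($0, sep + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }'
}

# Succeed if the link is up and the IPv4 address matches.
# $1 = file with the saved output, $2 = expected IPv4 address
mgmt0_ok() {
    [ "$(mgmt0_field 'Link up' < "$1")" = "yes" ] &&
    [ "$(mgmt0_field 'IP address' < "$1")" = "$2" ]
}
```

For example, `mgmt0_ok mgmt0.txt 192.168.0.100` succeeds only if the saved output reports the link as up with the address configured in the wizard.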
-
Enable OpenSM
enable
configure terminal
ib smnode switch-d79b5a enable
show ib sm
enable
no configure
-
Log out, then exit screen with [CTRL + A] followed by [CTRL + K]
Rerun initialization
- Log in to the switch (via the console port or SSH)
enable
configure terminal
configuration jump-start
Enable OpenSM
enable
configure terminal
ib smnode switch-d79b5a enable
show ib sm
enable
no configure
SSH
Unable to negotiate with 192.168.0.100 port 22: no matching key exchange method found. Their offer: diffie-hellman-group14-sha1
Unable to negotiate with 192.168.0.100 port 22: no matching host key type found. Their offer: ssh-rsa
The error messages above appear when connecting to the switch over SSH from Ubuntu 22.04.
- Add the following lines at the end of the file
/etc/ssh/ssh_config
KexAlgorithms=+diffie-hellman-group14-sha1
HostKeyAlgorithms=+ssh-rsa
- Restart the ssh service (optional: ssh_config is a client-side file that is read on every new connection)
service ssh restart
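As an alternative to changing the system-wide file, the two legacy algorithms can be enabled for the switch only. This sketch is an assumption, not part of the original procedure: it appends a `Host` block to the per-user client config, and the function name `add_switch_ssh_options` and its default arguments are hypothetical.

```shell
#!/bin/sh
# Append a Host block for the switch to an ssh client config file, once.
# $1 (optional): config file, default ~/.ssh/config
# $2 (optional): switch address, default taken from the wizard above
add_switch_ssh_options() {
    config_file="${1:-$HOME/.ssh/config}"
    switch_host="${2:-192.168.0.100}"
    # Do nothing if a block for this host already exists.
    if [ -f "$config_file" ] && grep -qxF "Host $switch_host" "$config_file"; then
        return 0
    fi
    mkdir -p "$(dirname "$config_file")"
    cat >> "$config_file" <<EOF
Host $switch_host
    KexAlgorithms +diffie-hellman-group14-sha1
    HostKeyAlgorithms +ssh-rsa
EOF
}
```

Scoping the options to one host keeps the older key exchange and host key types disabled for every other connection made from the PC.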