System Overview - calab-ntu/gpu-cluster GitHub Wiki
Eureka
- One login node + 33 computing nodes
- Each computing node has
  - One AMD 16-core CPU (Ryzen Threadripper 2950X) with 128 GB memory
  - One NVIDIA GPU (GeForce RTX 2080 Super) with 8 GB memory
  - See System Specification for details
- Operating system: CentOS Linux 7.7
- Storage: ~350 TB
- Interconnect: InfiniBand 100 Gb/s EDR
Spock
- One login node + 28 computing nodes
- Each computing node has
  - One AMD 32-core CPU (Ryzen Threadripper PRO 5975WX) with 256 GB memory
  - One NVIDIA GPU (GeForce RTX 3080 Ti) with 12 GB memory
  - See System Specification for details
- Operating system: Ubuntu Server 22.04
- Storage: ~350 TB
- Interconnect: InfiniBand 200 Gb/s HDR
Important notes about switching from Eureka to Spock
- Performance: Spock should be about 2-3 times faster in both CPU and GPU
- CPU RAM: 2x larger (128 GB → 256 GB)
- GPU RAM: 1.5x larger (8 GB → 12 GB)
- Interconnect bandwidth: 2x higher (100 Gb/s → 200 Gb/s)
- Disk I/O bandwidth: 5-10x higher when using `/projectV`
- Use environment modules to deploy different software tools
- For running GAMER
  - Job submission script: `submit_spock.job`
  - Configuration file: `spock_intel.config`
  - Edit `generate_make.sh` to adopt `--machine=spock_intel` and `--gpu_arch=AMPERE`
  - Set `OMP_NTHREAD 8` in `Input__Parameter`
  - [Optional] Change `GPU_COMPUTE_CAPABILITY` from `800` to `860` for `GPU_ARCH == AMPERE` if you haven't updated to the latest `main` or `psidm` branch yet
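Taken together, the steps above might look like the following session on the Spock login node. This is only a sketch: the module names, the GAMER checkout path, and the use of `sbatch` (i.e., a Slurm scheduler) are assumptions — check `module avail` and your site's submission command.

```shell
# List available software stacks, then load a toolchain
# (module names below are illustrative, not the cluster's actual names)
module avail
module load intel cuda openmpi

# Configure GAMER for Spock: machine configuration + Ampere GPU architecture
# (assumes a GAMER checkout at ~/gamer)
cd ~/gamer/src
sh generate_make.sh --machine=spock_intel --gpu_arch=AMPERE
make -j 8

# Submit from the run directory using the provided job script
# (sbatch assumes Slurm; substitute your scheduler's command if different)
cd ../bin/run
sbatch submit_spock.job
```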
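`Input__Parameter` is a plain-text table of `NAME  VALUE  # comment` lines, so the two runtime edits above (`OMP_NTHREAD`, and optionally `GPU_COMPUTE_CAPABILITY`) can be scripted. A minimal sketch of such a helper — `set_parameter` is hypothetical, not part of GAMER:

```python
def set_parameter(text, key, value):
    """Replace the value of `key` in GAMER-style Input__Parameter text.

    Each non-comment line has the form 'NAME   VALUE   # optional comment'.
    Lines whose first token is not `key` pass through unchanged.
    """
    out = []
    for line in text.splitlines():
        tokens = line.split("#", 1)[0].split()
        if tokens and tokens[0] == key:
            # Preserve a trailing comment if the line had one
            comment = ""
            if "#" in line:
                comment = "  #" + line.split("#", 1)[1]
            out.append(f"{key:<30}{value}{comment}")
        else:
            out.append(line)
    return "\n".join(out)


sample = ("OMP_NTHREAD            4    # number of OpenMP threads\n"
          "GPU_COMPUTE_CAPABILITY 800")
print(set_parameter(sample, "OMP_NTHREAD", 8))
```

The same call handles the optional capability bump, e.g. `set_parameter(sample, "GPU_COMPUTE_CAPABILITY", 860)`.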
Best Practice
See User Policy.