Cluster Specification - nthu-ioa/cluster GitHub Wiki

Summary

The CICA cluster is a memory-intensive system optimized for mid-scale research computing, data analysis and the development of astronomical HPC codes in preparation for production runs on larger supercomputers. It comprises 2112 logical cores over 23 parallel computing nodes with ~2.7GB RAM/core, a further 232 logical cores in 3 dedicated shared-memory nodes with a total of 5.1TB RAM, and a further 312 logical cores in 4 specialized nodes with access to 19 GPUs and 1.2TB of RAM. All nodes are connected with a 100Gb/s InfiniBand interconnect and have access to roughly 1.8PB of storage, including a 1.1PB Lustre parallel filesystem.

Nodes

  • Head node fomalhaut: 2 Intel Xeon Silver 4110 CPUs, 16 cores, 200 GB RAM.
  • Memory Nodes:
    • m01 (1536 GB RAM), m02 (1536 GB RAM)
      • 2 Xeon Gold 6248 2.5GHz CPUs with hyperthreading: 80 logical cores (2x20 physical cores) per node
      • 10TB of high-performance SSD scratch space per node
    • m03 (2015 GB RAM)
      • 2 Xeon Gold 6354 3.0GHz CPUs with hyperthreading: 72 logical cores (2x18 physical cores) per node
      • 40TB of SSD scratch space
  • Compute nodes:
    • 23 nodes in total, 3 groups.
    • c01 to c04
      • 2 Xeon Gold 6140 2.3GHz CPUs with hyperthreading: 72 logical cores (2x18 physical cores) per node
      • 196 GB of RAM per node (2.7 GB per logical core)
      • Total 144 physical cores, 784GB RAM
      • Each node has 900 GB of SSD scratch space
    • c05 to c17
      • 2 Xeon Gold 6240R 2.4GHz CPUs with hyperthreading: 96 logical cores (2x24 physical cores) per node
      • 256 GB of RAM per node (2.7 GB per logical core)
      • Total 624 physical cores, 3.3TB RAM
      • Nodes c05-c13 each have 440GB SSD scratch space
      • Nodes c14-c17 each have 900GB SSD scratch space
    • c18 to c23
      • 2 Xeon Gold 6342 2.8GHz CPUs with hyperthreading: 96 logical cores (2x24 physical cores) per node
      • 256 GB of RAM per node (2.7 GB per logical core)
      • Total 288 physical cores, 1.5TB RAM
      • Each node has 900GB SSD scratch space
    • Across all compute nodes, 2112 logical cores with ~2.7GB/core (5.6TB total).
  • GPU nodes:
    • g01-g03
      • 2 Xeon Gold 6140 2.3GHz CPUs with hyperthreading: 72 logical cores (2x18 physical cores) per node
      • g01: 3 RTX 2080 Ti GPUs, 128 GB RAM
      • g02: 4 RTX 2080 Ti GPUs, 256 GB RAM
      • g03: 4 RTX A4000 GPUs (Ampere architecture), 512 GB RAM
    • g04
      • 2 Xeon Gold 6248R 3.0GHz CPUs with hyperthreading: 96 logical cores (2x24 physical cores) per node
      • 8 RTX 3080 GPUs, 256 GB RAM
    • Total 156 physical cores, 1.2TB RAM, 19 GPUs
  • Storage:
    • /cluster (including home): 96TB RAID6 (6Gb/s Seagate IronWolf)
    • /data: 160TB RAID6 (6Gb/s Seagate IronWolf)
    • /data1: 240TB RAID6 (6Gb/s Seagate IronWolf Pro)
    • /data2: 160TB RAID6 (6Gb/s Seagate IronWolf Pro)
    • /lfs/data: 1.1PB ZFS Lustre (8 OSTs, each 6 Gb/s Seagate 18TB Exos; 1 MDT Seagate Nytro SSDs)
  • Interconnect: Mellanox EDR InfiniBand (100 Gb/s)
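
The Summary's aggregate core counts follow directly from the per-group figures above; this short shell snippet reproduces them (all numbers are taken from this page):

```shell
# Cross-check of the logical core counts quoted on this page.
compute=$(( 4*72 + 13*96 + 6*96 ))   # c01-c04, c05-c17, c18-c23
memory=$(( 2*80 + 72 ))              # m01, m02, m03
gpu=$(( 3*72 + 96 ))                 # g01-g03, g04
echo "compute: ${compute}"           # 2112 logical cores, as in the Summary
echo "memory:  ${memory}"            # 232 logical cores
echo "gpu:     ${gpu}"               # 312 logical cores
```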

Operating System

CentOS 7.7

Batch System

Slurm 20.11.9
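
A minimal job script for this version of Slurm might look as follows. This is only a sketch: the partition name `cpu` and the executable `./my_program` are placeholders, not taken from this page; run `sinfo` on the cluster to see the partitions actually configured.

```shell
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=cpu        # placeholder: list real partitions with `sinfo`
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=96   # one task per logical core on a c05-c23 node
#SBATCH --mem=0                # request all of the node's memory
#SBATCH --time=01:00:00

srun ./my_program              # placeholder executable
```

Submit with `sbatch job.sh` and monitor with `squeue -u $USER`.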