Compute Resources

We have secured an allocation on the Comet supercomputer located here at SDSC (but of course you'll still need a laptop ;-) ).

Teams will be allocated a user account upon formation, and they will be able to log in and use some simple scripts to obtain SSH access from their laptops to compute nodes (each node has 24 CPUs and 128GB of RAM). There will also be scripts to dynamically allocate a dedicated Hadoop and/or Spark cluster within the supercomputer.
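
One way to reach an allocated compute node directly from a laptop is to jump through the login node. The sketch below is only illustrative: it assumes a reasonably recent OpenSSH client (7.3 or later for the -J flag), the compute-node hostname is a placeholder, and the helper scripts provided on Comet may make this unnecessary.

```
# Log in to the Comet login node (replace USER with your team account)
ssh [email protected]

# Or, once a node has been assigned to you (see the usage steps below),
# reach it from your laptop by jumping through the login node
# (comet-17-17 is a placeholder hostname)
ssh -J [email protected] [email protected]
```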

We have already installed the BGPStream stack on Comet, and expect to have software from the other platforms ready before Saturday. While Comet will be great for data- and compute-intensive processing, it may prove too time-consuming to manually install non-standard software packages (e.g., database servers), since we will not have root access on the compute nodes. As a result, we expect that participants may need to use their own laptops for running such services.

If you're interested in playing around with the various platforms before the hackathon, you may want to take a look at the Vagrant environment put together by Nicolas Vivet: https://github.com/nizox/bgp-hackathon
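
If you want to try that environment locally, a typical Vagrant workflow looks like the sketch below; this assumes Vagrant and a provider such as VirtualBox are already installed, and the repository's README is the authoritative reference.

```
# Fetch the hackathon Vagrant environment and bring up the VM
git clone https://github.com/nizox/bgp-hackathon
cd bgp-hackathon
vagrant up     # provisions the VM (may take a while on the first run)
vagrant ssh    # log in to the VM once provisioning finishes
```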

Comet Usage Notes

Comet has 1,944 compute nodes, each with 24 CPUs and 128GB of RAM.

General Usage Steps

  1. Copy code to /home on Comet
    e.g., rsync -av /path/to/my/code/ [email protected]:my/code/
  2. Edit/build/compile/test code
  3. Copy data needed to scratch on Comet
    e.g., rsync -av /path/to/lots/of/data/ [email protected]:/oasis/scratch/comet/USER/temp_project/
  4. Grab some compute resources
    e.g., [USER@comet-ln2 ~]$ get-nodes 1
  5. SSH to the compute node (use myjobs and nodes to get a list of compute nodes)
    e.g., [USER@comet-ln2 ~]$ ssh comet-17-17.sdsc.edu
  6. Run processing!

Note: compute resources are yours for 12 hours. Once you have nodes allocated, you can SSH in and out as if they were your own servers.
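
Putting the steps above together, a complete stage/allocate/run cycle might look like the following sketch (USER, the paths, the comet-17-17 hostname, and the run-my-analysis.sh entry point are all placeholders):

```
# On your laptop: stage code into /home and bulk data into scratch
rsync -av /path/to/my/code/ [email protected]:my/code/
rsync -av /path/to/lots/of/data/ \
    [email protected]:/oasis/scratch/comet/USER/temp_project/

# On the login node: request one compute node; get-nodes prints its hostname
ssh [email protected]
get-nodes 1

# SSH to the node that was assigned and run your processing
ssh comet-17-17.sdsc.edu
cd ~/my/code && ./run-my-analysis.sh
```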

Login Nodes (used to access compute nodes)

  • We will initially allocate one user account per team
  • Login via SSH to [email protected]
  • Do not perform computation on login nodes
  • Do not overwrite .ssh/authorized_keys (append is fine)
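
For example, to add a teammate's public key without clobbering the keys already in place, append rather than redirect (the key file name is a placeholder):

```
# ">>" appends; a single ">" would overwrite the existing keys
cat teammate_key.pub >> ~/.ssh/authorized_keys

# Alternatively, from the teammate's laptop (ssh-copy-id appends by default)
ssh-copy-id [email protected]
```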

Compute Nodes

There are a couple ways to get access to compute nodes.

From the login nodes:

  • get-nodes <node-count>: Blocks until the scheduler assigns nodes, then outputs the hostnames of the acquired nodes; participants can then SSH directly to those nodes.
  • hadoop-bootstrap <node-count>: Provisions a Hadoop cluster on the fly
  • sbatch: Submit a traditional job for batch processing (advanced)
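
For the sbatch route, a minimal batch script might look like the sketch below; the partition name, time limit, and script path are assumptions, so check the Comet user guide (or ask the organizers) for the correct values before submitting.

```
#!/bin/bash
#SBATCH --job-name=bgp-analysis      # illustrative job name
#SBATCH --partition=compute          # assumed Comet partition; verify before use
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24         # one task per CPU on a Comet node
#SBATCH --time=01:00:00              # one hour of wall-clock time; adjust as needed

# Placeholder for your actual processing
~/my/code/run-my-analysis.sh
```

Submit it with, e.g., sbatch my-job.sbatch, and track it with myjobs (see below).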

Once you have jobs running (i.e., you have been allocated compute nodes):

  • myjobs: Lists the jobs you currently have
  • nodes <job-id>: Lists the hostnames of compute nodes for the given job
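
For example (the job ID and hostname below are illustrative):

```
[USER@comet-ln2 ~]$ myjobs                    # note the ID of your running job
[USER@comet-ln2 ~]$ nodes 1234567             # list that job's compute nodes
[USER@comet-ln2 ~]$ ssh comet-17-17.sdsc.edu  # log in to one of them
```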

Data Storage

  • /home/$USER: for scripts, binaries, etc. Not for data.
  • /oasis/scratch/comet/$USER/temp_project: has 1.1PB available for your data
  • /scratch/$USER/$SLURM_JOBID: 320GB of node-local SSD storage
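
A common pattern is to stage the data you are actively crunching onto the node-local SSD and write final results back to the shared scratch space. The sketch below follows the paths above; the file names are placeholders, and if $SLURM_JOBID is not set in your SSH session, substitute the job ID reported by myjobs.

```
# On an allocated compute node: copy the working set to the fast local SSD
cp /oasis/scratch/comet/$USER/temp_project/input.dat /scratch/$USER/$SLURM_JOBID/

# ... run processing against /scratch/$USER/$SLURM_JOBID/input.dat ...

# Copy results back before the job ends; node-local scratch is tied to the job
cp /scratch/$USER/$SLURM_JOBID/results.out /oasis/scratch/comet/$USER/temp_project/
```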

Environment

We currently have the BGPStream stack installed. More software will be added in the coming days.
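
As a quick smoke test of the BGPStream install, the bgpreader command-line tool can pull a window of BGP data. The invocation below is only illustrative; the exact flags and available collectors depend on the BGPStream version, so consult the BGPStream documentation.

```
# Print BGP updates from the RouteViews route-views2 collector for a
# ten-minute window starting at 2016-01-01 00:00 UTC (UNIX timestamps)
bgpreader -p routeviews -c route-views2 -t updates -w 1451606400,1451607000
```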

Installed tools/libraries:
