Guide to GitHub, RStudio, and Hyak - statnet/computing GitHub Wiki

Audience and scope

This document provides a guide to getting set up with GitHub, using RStudio, and running code through Hyak (a high-performance computing cluster at the University of Washington). This guide is intended for users affiliated with statnet or the Network Modeling Group at the University of Washington, but it could be used more broadly with modifications to the code.

The big picture is as follows: Software packages and projects are housed on GitHub to facilitate version control and collaboration with Git. RStudio is used to build or edit packages and run analyses. You can access RStudio on your local computer or through the Center for Studies in Demography and Ecology (CSDE) servers. Large scale simulations and other computationally-intensive tasks are done through Hyak. To transfer files between your local computer, the CSDE server, and Hyak, use a Unix shell.

NOTE: Some of the instructions for working with Hyak may be out of date. A few updates have been added, but if some scripts or processes are not working, consult the Hyak wiki.

Contents

GitHub
RStudio
Unix
Hyak

GitHub

Overview

GitHub is an online platform for version control and collaboration. Git is the software GitHub is built around. Projects are organized in repositories, which can contain code, datasets, or any other files related to a project. You can create a branch off the master version of a repository to make edits without changing the original files. This allows multiple people to work on a project simultaneously without interfering. To save and document your changes, you make a commit with an associated 'commit message' to summarize why you made the changes. This works best if you commit often so that each commit corresponds to one idea or issue that you fixed. If you think your changes should be reflected in the master branch, submit a pull request. The owner of the master branch will review the differences in the content of your branch and the master, and will merge the branches together if the changes are accepted.

Account set up

The first step to using GitHub is to sign up with a username and password. Do that here: https://github.com/. Then you can search for repositories, code, issues, or users. To join the statnet organization and get access to the group's coding repositories and documentation, search "statnet" in the search bar at the top. In the menu on the left, select "users," then click on the statnet user profile. Some repositories are private, so you will need to request access.

Contributing to GitHub repositories

To download and contribute to GitHub repositories on your local computer, you can use the command-line interface (CLI), the GitHub graphical user interface (GUI), or you can work through RStudio directly.

  • To access GitHub from the command line, you first need to download and install Git. Click "Downloads" and select the download link for your operating system. In the installation process, use the default options.

    • To interact with Git, open Git Bash (for Windows users - it should be in the Start Menu) or Terminal (for Mac users). These are both Unix shells.
    • Once in the shell, you need to configure your username and email so that your changes to the code will be flagged as coming from your Git account. This cheat sheet is a good reference for setting up and using Git at the command line, and this tutorial will walk you through the basic commands. I'll summarize some of the important steps here:
      • To create a new repository, first change the working directory to the folder where you want the repository to be saved. Type pwd to see your current directory, and type cd [directory path] to change the current directory.
      • You may want to make a new directory for your repository, which you can do by typing mkdir [directory_name]. Then, to create a new repository, type git init [project-name].
      • To clone a repository, type git clone [url].
      • To create a branch, type git branch [branchname].
      • For more on how to use the CLI, go to the section on Unix.
    • Open the repository in RStudio by going to File -> New Project -> Existing Directory, then navigate to the folder in which you saved your repository.
  • To access GitHub from the GUI, download the GitHub Desktop app. From the app, you can clone projects from GitHub to be used in your local RStudio by clicking on the plus sign in the upper left-hand corner. Cloning creates a local copy of the repository on your computer, which you can sync with the remote repository on GitHub. First, create a folder on your desktop to store the cloned project. Then select the "clone" option from the top of the window, as shown in the image below. If you've been added to statnet or if you have any repositories of your own, they should appear in the dropdown menu.
    [Screenshot: the "clone" option in GitHub Desktop]

    • Then select the project you want to clone (e.g. Mardham). A box will pop up for you to name your clone and put it in the appropriate folder on your computer. After cloning, create a branch so that you can work on the project in parallel with other users. At the top of the window, click the third icon from the left to create a branch.
      [Screenshot: the create-branch icon in GitHub Desktop]
    • A graph will appear near the top of the window, showing your branch and the master branch. Any commits you make or any made to the master will be displayed here. If you want the changes you made in your branch to be reflected in the master repository, create a pull request. This makes your changes public so others can review them. Click the "pull request" button in the upper right, enter a title and description, and submit.
    • Again, to open the repository in RStudio, go to File -> New Project -> Existing Directory, then navigate to the folder in which you saved your repository.
  • To access repositories directly from RStudio, go to File -> New Project -> Version Control -> Git, and enter the repository URL, which you can find on GitHub by clicking the green "Clone or download" button from within a repository page. Within RStudio, there is a tab in the upper right window called Git. From here you can see your changes, make commits, pull, and push. You can also access the command line from within RStudio by going to the Tools menu and selecting "Shell."
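
The command-line steps above can be tried end to end in a throwaway folder. The following is a minimal sketch rather than a required workflow: the repository name demo-project, the branch name my-edits, and the name and email are placeholders chosen for illustration. It sets the identity for this one repository only; adding --global to git config would apply it to every repository on your machine. Note that Git needs at least one commit before a branch can be created.

```shell
# Work in a throwaway directory so nothing else on your machine is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create a new repository (the name is a placeholder) and move into it.
git init -q demo-project
cd demo-project

# Tell Git who you are. Placeholder values, set for this repository only.
git config user.name "Your Name"
git config user.email "you@example.com"

# A branch can only be created once the repository has a commit.
echo "# demo" > README.md
git add README.md
git commit -q -m "Add README"

# Create a branch and switch to it.
git branch my-edits
git checkout -q my-edits

# Print the name of the branch you are now on.
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "$current_branch"
```

To work on an existing project instead of a new one, replace the git init step with git clone [url], as described above.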


RStudio

RStudio is a user interface for R, which you can download here. You can also use RStudio on the CSDE server. You first need to request a CSDE Unix account. Then you can use this link to sign on to the RStudio nori server. To log in, type your CSDE username and password.

Coding in R and RStudio

If you're new to R, there are a lot of helpful guides out there on the Internet. Here are some examples that I've found useful:

  • R Tutorial by Kelly Black at the University of Georgia. This guide goes over how to input data, data types, basic operations, plotting, indexing, data management, basic statistical analyses, and programming basics (e.g. if statements, for loops, etc.).
  • Try R on codeschool.com, sponsored by O'Reilly Media. This tutorial has 7 chapters with interactive activities to walk you through how to create variables, define functions, use vectors, matrices, data frames, etc.
  • Introduction to R on DataCamp. In addition to this basic introductory tutorial, DataCamp also has free tutorials on data visualization, data manipulation, dynamic reporting, and many more!
  • R Tutorial for the Network Modeling for Epidemics course. This tutorial describes the basics of R, how to create and perform operations on vectors, matrices, and arrays, how to use data frames, programming for conditional logic, and plotting.
  • A Short Introduction to R for Epidemiology by Michael Hills, Martyn Plummer, and Bendix Carstensen. In addition to discussing how to use the Epi package, it provides a general overview of R commands and functions.

There are also manuals available on the R-project website, and on the RStudio website.

In R, you can always type 'help' and the name of the function: help(function_name) or ?function_name. To find the documentation for an installed package, type help(package="package_name"). Vignettes provide more detailed documentation on packages. To find available vignettes, type browseVignettes() and it will open an html page with a list of all vignettes for installed packages. To see vignettes for a specific package, type browseVignettes("package_name").

Working with GitHub repositories in RStudio

After starting a new project with your cloned or branched repository, make sure you have a few important packages loaded.

  • devtools provides functions to facilitate package development. To install it, type install.packages("devtools"). Then load it with library(devtools).

  • Use devtools to install packages from GitHub. For example, to install the EpiModelHIV package, run the following code.

      install.packages("EpiModel", dependencies=TRUE)
      devtools::install_github("statnet/tergmLite", subdir = "tergmLite")
      devtools::install_github("statnet/EpiModelHIV")
    
  • roxygen2 is a documentation system for R. There are three ways to run roxygen: type roxygen2::roxygenise(), type devtools::document(), or press Ctrl + Shift + D if you're in RStudio.

    • Once roxygen2 is loaded, click the Build menu and select "Configure Build Tools." Make sure the root directory is correct, and check the box for "Generate documentation with Roxygen." A window will pop up; make sure it looks like this:

      [Screenshot: Roxygen options window]

If you go to the Files tab in the lower right window of RStudio, you should see several folders.

  • R has code for all the modules. Modules encode a single core process within the system. The module relates to a concept, and the module function is the realization of that concept. The inputs to the modules are the data structure and parameters. To understand the functions and what data you will need, a good place to start is to go through the modules and note what each one does and what the inputs are.
  • man houses automatically-generated manuals associated with the functions. These can be useful reminders, but reading the modules and other files in the R folder is more informative.
  • src contains C and C++ source code that needs to be compiled. You probably won't need to interact with these files unless you're developing new packages. The aging module is stored in this folder for EpiModelHIV, however (as of 12.9.16).
  • inst contains test scripts. You can run these to see what the modules are and the order in which they are run. These are useful to run for debugging.
  • test contains some tests of the code. You may want to delete this folder, as changes to the code may violate these previously written tests.
  • Depending on the repository, you may or may not see a scenarios folder. This will have examples of setup, estim, and sim files, which can be used as a template for the scripts you will write to run your model.
  • If you are running simulations, create new directories for each one, each of which should have setup, estim, and sim files. Don't change the name of these files; if you do, they won't be called appropriately.

Editing or creating new modules and functions

Important notes:

  • If you want to modify some code, don't edit the original file. Instead, copy and paste into new files in the same folder as the originals and rename them. Be sure that you update the name wherever the new code/function is called, including in the setup and estim files.
  • Don't delete any of the header "@" information on any files you're revising. This information is important to process the files into the new build.

When you are ready, go to the Build menu and click "Build and Reload." When the process finishes, you should see your new functions in the NAMESPACE page in the main package folder (not the R code). If not, your functions have not been added to the package. Do not try to add functions manually to this page.


UNIX

UNIX is an operating system that provides a link between the user and the computer through the shell (a command line interpreter). Data in UNIX are organized into files, which are organized into directories. You can access a Unix shell using Terminal if you have a Mac, or Git Bash or PuTTY if you have a Windows computer.

There are a bunch of online tutorials on Unix. Here are the links to a few:

To get you started, some basic commands and shortcuts are outlined below. Note that some command syntax is different on Windows, so if something doesn't work, try Googling the command for Windows or using the tutorials above.

  • To get help: type man then the command name you want to know about, or <command name> --help.
    • Anything in square brackets is an optional argument
    • To exit the help window, type q
  • The $ or > symbols are the shell prompts
    • If you see the prompt, the shell is ready for you to write commands to it
    • If you enter a command you didn't want to run, press control-c to get back to the shell prompt
  • Shortcuts
    • The up arrow takes you back to the last command. You can also type history to see a list of the commands you've typed. To repeat the last command, type !! (or, equivalently, !-1); to repeat the second-to-last command, type !-2; to repeat the third-to-last command, type !-3; and so on.
    • If you have started typing a command, control-a will move the cursor to the beginning of the command, and control-e will take you to the end
    • Tab completion: if you start typing something and hit tab, it will suggest the file name or directory. If there is more than one available file or directory with that same stem, you may need to give it more information.
  • ls lists the files in the directory you are in
    • You can type ls -a to show hidden files
  • Files and directories are organized in a tree with branches. To locate files and folders, you can use absolute paths or relative paths:
    • Absolute paths list the location starting from the root directory, /
    • Relative paths list the location of a file or folder relative to the current working directory
    • ~ is a short form to refer to the user directory.
  • Wildcard expansion:
    • The star wildcard (*) represents any string. It can be combined with characters that you want the command to match on. Some examples are below.
      • rm temp/* removes everything in the temp folder
      • ls *.html *.txt will list all files with extension .txt or .html
      • rm *xxx* will remove all files in the current directory with the string "xxx" in the name
    • The question mark wildcard (?) represents a single character, and two question marks represent two characters in succession.
      • file ??? returns data on objects that have a name 3 characters in length (including extensions)
    • The bracket wildcard ([]) can represent any of the characters enclosed in brackets
      • ls code[1-4] will list any files in the current directory named "code" followed by one of the numbers 1-4.
  • whoami will print your login name
  • pwd will print the name of the current working directory.
  • cd is the command to change the directory
    • Type cd <directory_path> to change to a directory. This could be an absolute path or a relative path (for example if you want to move into a directory folder within the current directory, just type the name of that folder)
    • To move up one level, type cd ..
  • mkdir <directory_name> creates a directory
  • mv will move a file to a different directory or rename it.
    • To move something from some other folder to the current folder, type mv <current_pathfile> . (the . points to the current directory). So mv /Users/username/Documents/testfile . will move the file "testfile" from the Documents folder to the current directory.
    • To rename something, type mv <oldfilename> <newfilename>
    • Warning: if you move something to a new file name that already exists, it will replace it!
  • cp will copy a file: cp notes.txt notescopy.txt will create a copy of notes.txt called notescopy.txt.
    • To copy a directory, add the option -r: cp -r <directoryname> <newdirectoryname>
  • rm removes a file or directory, e.g. rm notes.txt
    • To remove an entire directory, type rmdir <directoryname>. This will only work on empty directories, though.
    • To remove things in a directory first, type rm <directoryname>/*
    • Warning: Once you remove something, you can't get it back. To add a safeguard, you can add the option -i so that it will ask you to confirm that you want to remove something.
  • file will give you information about an object. One way you might want to use this is to find out the extension of a file, because that will determine how you can view the file.
  • To open/view text files:
    • cat <filename> will print the contents of the file
    • nano <filename> opens a text editor. To exit, type control-x
    • vi <filename> opens the vi editor. Type i to go from command mode to insert (edit) mode. To go back to command mode, hit escape. To learn how to navigate in the vi editor, check out this page. From command mode, type :w to save, :q to quit, and :wq to save and quit.
  • To view pdf files on a Mac, use open <filename>.
  • wc -l will tell you how many lines there are.
  • grep is a way to search for content in files. It reports the lines of a file that contain the word or string.
    • grep test *.txt will search all .txt files for the word "test".
    • grep '^t' *.txt will search all .txt files for lines beginning with "t". Put the pattern in quotes so that the shell does not expand it as a wildcard before grep sees it.
  • You can create files to run commands together.
    • Create a file name with extension .sh (shell script) in an editor (nano or vi). E.g. nano script.sh
    • Type the lines of command you want to run
    • Then type bash script.sh
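
Several of the commands above (wildcards, grep, cp, wc, and running a script file) can be tried together in a sandbox. This is a minimal sketch: the file names are made up for the demonstration, and everything happens in a temporary directory created by mktemp so that no real files are affected.

```shell
# Build a small sandbox so the commands below cannot touch real files.
sandbox=$(mktemp -d)
cd "$sandbox"

# Create a few files to match against (names are made up for this demo).
printf 'first test line\nsecond line\n' > notes.txt
printf 'no match here\n' > todo.txt
touch index.html code1 code2 code9

# Star wildcard: count every file ending in .txt
txt_files=$(ls *.txt | wc -l)

# Bracket wildcard: "code" followed by one of the digits 1-4
code_files=$(ls code[1-4] | wc -l)

# grep -l prints only the names of the files that contain the word "test"
matches=$(grep -l test *.txt)

# Put commands in a script file, then run the file with bash
cat > script.sh <<'EOF'
cp notes.txt notescopy.txt
wc -l < notescopy.txt
EOF
line_count=$(bash script.sh)

echo "txt files: $txt_files, code files: $code_files"
echo "files containing 'test': $matches"
echo "lines in notescopy.txt: $line_count"
```

Here code9 is not counted by the bracket wildcard because 9 falls outside the range 1-4, and todo.txt is not reported by grep because it does not contain the word "test".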


Hyak

Hyak is a shared scalable computing cluster operated by UW-IT. It is made up of two clusters, each of which has hundreds of nodes, and each node has 16 processor cores and 128GB of memory. The nodes run on Linux, and there are four types of nodes designed for different tasks:

  • Login nodes: These are the nodes you arrive at when you ssh into Hyak. They are used for:
    • Transferring data between Hyak, CSDE, and your local computer.
    • Submitting simulation jobs with qsub, monitoring these jobs, and other file management.
    • These nodes are not used for simulations, building R packages, or other computationally intense work.
  • Build nodes: These are the nodes you should use for building software like R packages and for small-scale tasks like package installation. They are connected to the internet, so you can install packages using the install.packages() function. Use these in interactive mode, wherein package installation is done manually as opposed to in batch jobs. To connect, use the build alias defined below. This starts a new job with qsub but flags it with -q build, which identifies the job as a build job. The default walltime is 30 minutes and the max is 8 hours.
  • Compute nodes: These are designed for large-scale simulations. CSDE and other research groups have "ownership" of some of these nodes. To connect to compute nodes, submit a job with qsub, rather than by trying to connect directly with ssh. The next section describes in more detail how to submit simulation jobs to CSDE nodes and/or other nodes. To exit computing nodes and return to the login node, type exit.
  • Interactive node: CSDE owns an interactive node specifically for running interactive computing jobs, such as running R interactively to test a simulation on a small scale or to perform some data analysis. To connect to this node, use the shell alias defined below, which is shorthand for qsub -q int -I. This code tells Hyak that you want a node in the interactive queue and that the job is interactive. This node is not connected to the internet, so don't try to use it to build packages or transfer data. It may take a few minutes for the scheduler to make resources available. The default walltime is 60 minutes.

Setting up your account and logging in

  1. Get an account
  • To get an account, you need to have a UW NetID and an account sponsor, typically the lead faculty member of the research group or the IT director of the academic unit.
    • CSDE affiliates can access through CSDE: on this page click "High-Performance CSDE Unix/Linux Systems" under the "Unix Systems" heading to learn how to get access to the CSDE Hyak nodes.
    • If you are a UW student, you can also join the UW HPC Club and access the STF nodes.
  2. Get a security token
  • You will need a security token (also known as an entrust token or PRN) from UW-IT. This takes a couple of days after you fill out this form.
  • If you don't use Hyak very often, your PRN may not work. If this occurs, re-synch your entrust token here. Under "Synchronize Token," enter the 8-digit number that shows up when you hold down the green power key on your token. You should then see a message that says "Token synchronization successful."
  3. Add Lolo and Hyak to your computing services
  • Click here to add the Hyak and Lolo servers to your active computing services. Under "Inactive Services," check the boxes for Hyak and Lolo, and click subscribe. If you have successfully subscribed, Hyak and Lolo will appear under "Active Services," though you may need to refresh the page a few times. Note, you will not be able to subscribe to these services until your Hyak account has been approved.
  4. Connecting to Hyak
  • The most basic (and inefficient) approach to logging in:

    • In your shell program (e.g. bash or terminal), type ssh -X your_netid@hyak.washington.edu (replacing "your_netid" with your UW netid).
    • It will prompt you for a password. In the shell, it will not show anything as you type a password, but it is processing what you type.
    • Then it will prompt you to enter your PRN. Hold down the power button on your entrust token until you get a number. Enter that number into the shell. This should complete the authentication process and connect you to Hyak.
  • A more streamlined approach

    • Set up an ssh Config file so you can use short logins at the terminal. In the shell program on your local computer, type:
          cd .ssh
          ls
      
    • Look for a file called config.
      • If it doesn't exist, type touch config.
    • Edit the config file: vi config.
      • Type i to get into insert mode, and then add this to the file (don't include <> when you enter your user name):
        Host hyak hyak.washington.edu  
          User <your hyak username>  
          HostName hyak.washington.edu  
          ControlPath ~/.ssh/master-%r@%h:%p  
          ControlMaster auto  
          ControlPersist yes  
          Compression yes  
        Host union  
          User <your csde account name>  
          HostName union.csde.washington.edu
        Host libra
          User <your csde account name>
          HostName libra.csde.washington.edu
        Host nori
          User <your csde account name>
          HostName nori.csde.washington.edu
      
      • To save and close vi, press the esc key then type :wq.
    • Now in the CLI window, you can log onto Hyak with just ssh hyak. It will prompt you for your UW netid password and the PRN from your entrust token.
    • Once in Hyak, set up the config file there to easily switch between Hyak and CSDE linux nodes. Follow the steps above to move to the .ssh directory and open or create a config file to edit with vi. In the editor, enter these lines:
            Host union
                User <your csde account name>
                HostName union.csde.washington.edu
            
            Host libra
                User <your csde account name>
                HostName libra.csde.washington.edu
      
  • To bypass having to enter your password every time you access the CSDE cluster, do the following:

    • Type this code on the command line in your local machine (to log out of Hyak type logout on the command line, or open a new shell window): ssh-keygen -t rsa. If you are prompted for a file name or password, hit enter until a public key is created.
    • Add that key into your authorized key file on union:
          ssh union mkdir -p .ssh
          cat .ssh/id_rsa.pub | ssh union 'cat >> .ssh/authorized_keys'
      
    • Then log in to Hyak and type this:
          ssh-keygen -t rsa
          cat .ssh/id_rsa.pub | ssh union 'cat >> .ssh/authorized_keys'
      
    • Test the passwordless login between local-to-CSDE and Hyak-to-CSDE.
    • This will save time when transferring files between Hyak and CSDE to your local computer. (NOTE: If you have a Mac computer, you don't really need to use the CSDE servers for anything - you could just use R or RStudio on your local computer for basic estimation and analysis and then submit your runs to Hyak through terminal. However, there are added benefits to using the CSDE RStudio server: the server backs everything up regularly, and you can have code running even when your laptop is closed and you are off to lunch. For Windows users, the transfer of files from the local computer to Hyak is more cumbersome, so it's easier to use the CSDE servers.)
  • If you have trouble connecting to Hyak, check out the Hyak wiki - the issue may be with your DHCP configuration or with your shell, and the wiki has instructions for how to troubleshoot.
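
The config-file steps above (check for the file, create it if it is missing, add Host entries) can also be scripted. The sketch below is an illustration only: it writes to a scratch directory rather than your real ~/.ssh folder, add_host is a hypothetical helper defined here, and my_csde_name is a placeholder for your CSDE account name.

```shell
# Use a scratch directory in place of ~/.ssh so the demo is harmless.
ssh_dir=$(mktemp -d)
config="$ssh_dir/config"

# Create the config file only if it does not exist yet.
[ -f "$config" ] || touch "$config"

# Hypothetical helper: add_host <alias> <user> <hostname>
# Appends a Host block unless an entry for that alias is already present.
add_host() {
  if ! grep -q "^Host $1\$" "$config"; then
    printf 'Host %s\n  User %s\n  HostName %s\n' "$1" "$2" "$3" >> "$config"
  fi
}

add_host union my_csde_name union.csde.washington.edu
add_host libra my_csde_name libra.csde.washington.edu
# Running it again for union changes nothing, because the entry exists.
add_host union my_csde_name union.csde.washington.edu

# Count the Host entries and show the resulting file.
host_count=$(grep -c '^Host ' "$config")
cat "$config"
```

Pointing config at ~/.ssh/config would apply the same steps to your real configuration; it is worth making a backup copy of that file first.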

Setting up aliases

  • Aliases in Linux are command shortcuts.
    • Log in to Hyak. Aliases are stored in a file called either .bashrc or .bash_profile in your home directory.
      • Look at what's currently in it: cat .bashrc. If it says "no such file", try with .bash_profile.
      • Use vi to edit the file and add aliases such as those below. These are examples, so you can add new ones that are relevant to your project.
        • alias gsr='cd /gscratch/csde/your_username': This changes the directory to your local gscratch folder (where you should save your simulation results). You can name that alias anything you want and you can point it to any folder you want (you should change it so that "your_username" is replaced with your CSDE username).
        • alias shell='srun -N 1 -p csde -A csde --time=24:00:00 --mem=50G --pty /bin/bash' and alias build='srun -p build --time=3:00:00 --mem=20G --pty /bin/bash' These aliases start up jobs on the interactive node and build node (2 of Hyak's 4 types of nodes), respectively.
        • alias transdat='scp /gscratch/csde/your_username/project/*.rda libra:~/project/data' and alias transscr='scp libra:~/project/*.[Rs]* /gscratch/csde/your_username/project/': The scp (secure copy) command transfers files safely between hyak and outside computers. The first of these commands transfers data from Hyak to CSDE (and, specifically, .rda files from that specific folder on Hyak to that specific folder on CSDE's libra node). The second transfers scripts from CSDE to Hyak (specifically any files ending with .R or .s from that specific folder on libra to that specific folder on Hyak).
        • alias myq='squeue -u <userid>': This checks what is running in your queue. Replace <userid> with your user ID.
        • alias lspack='. /suppscr/csde/<userid>/spack/share/spack/setup-env.sh': This loads the spack environment on Ikt. Replace with your userid. For more on spack: guide to building the latest version of R, Hyak wiki guide to Spack
      • To test if these worked correctly, type logout, then log in again and type alias.
      • Similar aliases can be defined for your local computer.
      • If you are transferring a lot of files between Hyak and your local computer, log on to Hyak in one session, then open another session to do the file transfers; otherwise you'll have to enter your password and PRN each time.
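
The alias steps above can be rehearsed without editing your real startup file. In this sketch a temporary file stands in for .bashrc, and your_username is a placeholder as in the examples above. Reading the file with the . (dot) command loads the aliases into the current session, which is a quick alternative to logging out and back in.

```shell
# A temporary file stands in for ~/.bashrc so the real one is left alone.
rcfile=$(mktemp)

# Add two aliases; "your_username" is a placeholder.
cat >> "$rcfile" <<'EOF'
alias gsr='cd /gscratch/csde/your_username'
alias myq='squeue -u your_username'
EOF

# Load the definitions into the current shell session.
. "$rcfile"

# "alias" followed by a name prints that one definition;
# "alias" on its own prints every alias that is defined.
gsr_def=$(alias gsr)
echo "$gsr_def"
```

If the definition is printed, the alias was read correctly. The exact format of the printed line differs slightly between shells.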

Loading and building packages

  • Using the build nodes to load R packages from CRAN

    • To use Hyak for R computing or analysis, you will need to load your R library onto Hyak. The first step is to log on to a build node by typing build (if you've defined a "build" alias, as described above). The software is loaded in "modules". To see the modules available on Hyak, type module available.
      • Load the R module (you can specify the version number to load more up-to-date versions of R, which are usually better if they are listed as available): module load r_3.2.5.
      • NEW: To build the latest version of R, use Spack
      • Type R in the command line to start R. Then start installing packages interactively from CRAN as you would in your R or RStudio command line. Start with EpiModel and its dependencies: install.packages("EpiModel", dependencies=TRUE). If it worked, you should see no error messages during the installation. You will most likely get a message asking if you want a personal library, and the answer is yes.
      • Type q() to exit R.
  • Installing packages from Github

    • Since many packages are hosted only on GitHub, not on CRAN, we need to take a different approach to install them. Some packages are in private repositories, so you won't be able to install them unless you get access from the owner. Because devtools does not always run correctly on Hyak, we use this alternative strategy:
      • Use vi to open .bashrc or .bash_profile and press i to enter insert mode (this process is explained above under "Setting up aliases").

      • Paste the following function into the file:

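          installgit() {
            # Usage: installgit <owner> <repository> [subdirectory]
            module load r_3.2.5
            # Download and unpack the master branch of the repository
            wget -q "https://github.com/$1/$2/archive/master.zip"
            unzip -q master.zip
            # Install from the repository root, or from the subdirectory if one is given
            R CMD INSTALL "${2}-master/$3"
            # Clean up the downloaded files
            rm -r "${2}-master"
            rm master.zip
          }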
        
      • Then save and close the file by typing esc and then :wq. This function will download any R package in a Git repository and install it for you. To do this, type installgit <owner> <repository> <subdirectory> into the command line. For example to get the Github version of EpiModelHIV, we'd type installgit statnet EpiModelHIV. In this case we don't need to specify a subdirectory because the package is in the root directory of the repository. But with something like tergmLite, you'd need to specify the subdirectory: installgit statnet tergmLite tergmLite.
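
To see how the three arguments of installgit map onto the download address and the install folder, it can help to print them without downloading anything. The function below, installgit_plan, is a hypothetical dry-run companion written for this illustration; it is not part of the setup above and does not install anything.

```shell
# Hypothetical dry-run helper: installgit_plan <owner> <repository> [subdirectory]
# Prints what installgit would download and install, without doing either.
installgit_plan() {
  owner=$1
  repo=$2
  subdir=${3:-}   # optional third argument; empty when omitted

  echo "download: https://github.com/$owner/$repo/archive/master.zip"
  echo "install:  ${repo}-master/$subdir"
}

# Package in the root of the repository: no subdirectory needed.
installgit_plan statnet EpiModelHIV

# Package in a subdirectory of the repository.
installgit_plan statnet tergmLite tergmLite

# Keep one plan in a variable so it can be inspected later.
plan=$(installgit_plan statnet tergmLite tergmLite)
```

When the subdirectory is omitted, the install path ends in a trailing slash and so points at the repository root, which is why installgit statnet EpiModelHIV works with only two arguments.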

Running jobs on Hyak

  1. Set up your scripts
  • To run simulations on Hyak, you need an R script that sets up the simulation. An example R simulation script is:
    library("methods")
    suppressMessages(library("EpiModelHIVmsm"))
    library("EpiModelHPC")
    
    # Read the simulation and job numbers passed in by the shell script
    args <- commandArgs(trailingOnly = TRUE)
    simno <- args[1]
    jobno <- args[2]
    fsimno <- paste(simno, jobno, sep = ".")
    print(fsimno)
    
    # Load the network statistics (expected to provide the object st used below)
    load("est/nwstats.rda")
  
    # Set the parameters, initial conditions, and control settings
    param <- param_msm(nwstats = st)
    init <- init_msm(nwstats = st)
    control <- control_msm(simno = fsimno,
                           nsteps = 52 * 50,
                           nsims = 16,
                           ncores = 16,
                           save.int = 500,
                           save.network = TRUE,
                           save.other = c("attr", "temp"))
  
    # Run the simulations from the fitted model and save the output
    netsim_hpc("est/fit.rda", param, init, control,
               save.min = TRUE, save.max = FALSE, compress = "xz")
  • You will also need a shell script, as in the following example, titled runsim.sh:
    • This is a PBS (Portable Batch System) job script, which contains instructions for the scheduler, sets up the working environment, and executes your production program. Instruction lines start with #PBS, and they tell the scheduler what your program is, how many nodes it uses, and how much memory you need. The "Standard specs" lines address a specific memory-utilization issue. Make sure the version of R that you used to build your packages is loaded. The final line runs the R script in batch mode, passing the values of the SIMNO and PBS_ARRAYID variables into the R script as arguments.
        #!/bin/bash
    
        ###========== User specs / instructions for the scheduler ==========###
        ## Name the job
        #PBS -N sim$SIMNO
        ## This line tells the scheduler we want 1 node, 16 processing cores per node, 44 GB of memory, only 16-core nodes, and 5 hours of runtime
        #PBS -l nodes=1:ppn=16,mem=44gb,feature=16core,walltime=05:00:00
        ## This defines where the output should go. Change the filepath to your target gscratch folder
        #PBS -o /gscratch/csde/camp/out
        ## This defines where the standard error should go. Again, change the filepath to your target folder.
        #PBS -e /gscratch/csde/camp/out
        ## This says you want the standard output and standard error to be combined into the out file.
        #PBS -j oe
        ## This defines the working directory for the job. Again, change the filepath to your target folder.
        #PBS -d /gscratch/csde/camp
        ## This defines whether email should be sent with job status. If "n", no mail is sent. Type "abe" if you want email when the job is aborted (a), begins (b), or terminates (e).
        #PBS -m n
    
        ###========== Standard specs ==========###
        HYAK_NPE=$(wc -l < $PBS_NODEFILE)
        HYAK_NNODES=$(uniq $PBS_NODEFILE | wc -l )
        HYAK_TPN=$((HYAK_NPE/HYAK_NNODES))
        NODEMEM=`grep MemTotal /proc/meminfo | awk '{print $2}'`
        NODEFREE=$((NODEMEM-2097152))
        MEMPERTASK=$((NODEFREE/HYAK_TPN))
        ulimit -v $MEMPERTASK
        export MX_RCACHE=0
    
        ###========== Define the working environment (load modules) ==========###
        module load r_3.2.4
    
        ###========== Execute your program ==========###
        ### Print the arguments to the log
        ALLARGS="${SIMNO} ${PBS_ARRAYID}"
        echo runsim variables: $ALLARGS
        echo
    
        ### App
        Rscript sim.R ${ALLARGS}
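
To see what the Standard specs lines compute, the following sketch repeats the same arithmetic away from the cluster. The node file contents and the 64 GB MemTotal value are made-up inputs for illustration; on Hyak they come from $PBS_NODEFILE and /proc/meminfo. The sketch stops short of calling ulimit so that it does not restrict your own shell.

```shell
#!/bin/bash
# Sketch only: fake inputs stand in for $PBS_NODEFILE and /proc/meminfo.
NODEFILE=$(mktemp)

# The scheduler lists one line per allocated core; 16 lines = 1 node x 16 cores.
i=0
while [ "$i" -lt 16 ]; do
  echo "n0001" >> "$NODEFILE"     # "n0001" is an invented node name
  i=$((i + 1))
done

NPE=$(wc -l < "$NODEFILE")            # total cores allocated
NNODES=$(uniq "$NODEFILE" | wc -l)    # number of distinct nodes
TPN=$((NPE / NNODES))                 # tasks per node

NODEMEM=67108864                      # assumed MemTotal in kB (64 GB)
NODEFREE=$((NODEMEM - 2097152))       # hold back 2 GB, as in runsim.sh
MEMPERTASK=$((NODEFREE / TPN))        # kB that runsim.sh would pass to ulimit -v

echo "tasks per node: $TPN"
echo "memory per task (kB): $MEMPERTASK"
rm -f "$NODEFILE"
```

With these inputs each of the 16 tasks is limited to roughly 3.9 GB of virtual memory.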
    
  1. Transfer files
  • To transfer the scripts from CSDE to Hyak, first log in to Hyak and make sure you are on a login node, which is where data transfers should be run.
    • If you haven't done so already, create a folder for your project in the gscratch directory. If you're using CSDE scratch directories, type cd /gscratch/csde from the home directory in Hyak. Then make a new folder: mkdir <your_username>. You can create additional folders within this directory for output or whatever else you need.
    • Then you can use the alias transscr to transfer the scripts. Make sure the code in the alias points to the correct directory to/from which you want the files transferred. If you want to transfer an entire folder, use the option -r after scp. You can also use wildcards (described in the Unix section) to transfer all files that start with "run", for instance.
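
As a rough illustration of the transfer options just described, the sketch below builds and prints two scp commands rather than running them. The account, host, and path values are placeholders, not the contents of the transscr alias, and the folder name scripts is invented; substitute your own values before using the printed commands.

```shell
#!/bin/bash
# Placeholders (assumptions): replace with your own values.
REMOTE="YOUR_USERNAME@CSDE_HOST"        # account on the CSDE server
SRC_DIR="PATH/TO/PROJECT"               # project folder on CSDE
DEST="/gscratch/csde/YOUR_USERNAME"     # your folder on Hyak

# All files starting with "run"; the quotes keep the local shell from
# expanding the wildcard so the remote side can match it.
CMD_FILES="scp '${REMOTE}:${SRC_DIR}/run*' ${DEST}/"

# An entire folder, copied recursively with -r ("scripts" is a made-up name).
CMD_FOLDER="scp -r ${REMOTE}:${SRC_DIR}/scripts ${DEST}/"

echo "$CMD_FILES"
echo "$CMD_FOLDER"
```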
  1. Submit your jobs
  • The next step is to submit the jobs with the job manager command qsub. There are many arguments and options for qsub, but the bare minimum are as follows: qsub -t 1 -v SIMNO=1 runsim.sh.

    • The -t parameter invokes an array job, in which there is just one sub-job in this case. Array jobs will be useful to run the same parameter set multiple times. This is an alternative to running one big job with lots of simulations that works better with the backfill queue.
    • The -v parameter passes environment variables down into the runsim.sh shell script. In this case, SIMNO is the primary simulation ID number, and passing SIMNO=X means that runsim.sh will hand X to sim.R as its first argument. In combination with the array ID, the unique ID of this simulation job would be 1.1, which will be used in the file name to save out the file.
  • For a second simulation, we may want to submit a series of 7 subjobs to the backfill queue using the following syntax: qsub -q bf -t 1-7 -v SIMNO=2 runsim.sh.

    • Backfill jobs execute on idle nodes throughout the cluster. They can be interrupted whenever the owner of the nodes needs them, and in any case after 4 hours, so if you run jobs on backfill: 1) be sure to save intermediate results to disk frequently and don't recalculate them after a restart, and 2) you can output the start time of jobs to track how much processor time the job has had. If you're using EpiModelHPC, this will happen automatically, but if you want to learn more about using the backfill queue, read this page.
    • Note the runsim.sh file doesn't need to be changed. The default -q argument is batch, which submits jobs only to CSDE nodes, whereas -q bf submits them to the backfill.
    • For backfill jobs, it is advisable to set the runtime to 4 hours or 4:01:00, since you will be booted off at 4 hours anyway.
  • The "-l" argument allows you to override the job specifications in the PBS file. For instance, qsub -q bf -t 1-7 -v SIMNO=2 -l walltime=04:00:00 runsim.sh runs the script "runsim.sh" as a backfill job and tells the scheduler that it needs a walltime of 4 hours. To estimate walltime the first time you run a simulation, a rule of thumb is that for a network size of 10k, you will need 5 seconds per network simulated per time step. So if you have a simulation of 3 networks and 50 years of weekly time steps, you would need 5 x 3 x 52 x 50 = 39,000 seconds, or about 11 hours.

  • Any qsub commands can be placed together in a bash shell script for easier execution and as a historical record. You could create a file called master.sh that contains the following. To execute this file and run the two commands, you would type bash master.sh.

        #!/bin/bash
    
        qsub -t 1-7 -v SIMNO=1 runsim.sh
        qsub -q bf -t 1-7 -v SIMNO=2 runsim.sh
    
  • If you are a member of more than one group (e.g. CSDE and HPC club), specify which group's nodes you want to use with the group_list argument: qsub -W group_list=hyak-groupname script.pbs. To see the groups you're a member of, run the command groups.
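
The walltime rule of thumb given earlier in this step can be turned into a small helper. This is a sketch under that rule's assumptions (a network size near 10k and 5 seconds per network per time step); the function name estimate_walltime is made up for illustration, and the qsub line is only printed, not submitted.

```shell
#!/bin/bash
# estimate_walltime SECS_PER_STEP N_NETWORKS N_STEPS
# Prints the estimate as HH:MM:SS, the format used by -l walltime=.
estimate_walltime() {
  total=$(( $1 * $2 * $3 ))
  printf '%02d:%02d:%02d\n' $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

# 3 networks, 50 years of weekly steps, 5 seconds each: 39,000 seconds.
WALLTIME=$(estimate_walltime 5 3 $((52 * 50)))
echo "$WALLTIME"

# Preview the submission command with the estimated walltime.
echo "qsub -t 1-7 -v SIMNO=1 -l walltime=${WALLTIME} runsim.sh"
```

An estimate longer than 4 hours, as here, will not fit in a single backfill window, so such a job either belongs in the batch queue or needs to save intermediate results.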

  1. Transfer data from Hyak to CSDE
  • By default, our R script saves a series of R data files within the data subfolder of our shared drive on Hyak. For data analysis, we will transfer the minimum files to CSDE for interactive analysis using the transdat alias.
  1. Monitor your jobs
  • Once a job is submitted, it will show up in the job queue. It will move from eligible to active when a node is free to run it, and it will stay active until it completes, errors, or is terminated.
    • To check the status of your jobs, type checkjob <job_id> or qstat -f <job_id>. You can also type showq -w user=<userid> to show the status of all your jobs. Add option -n to see the jobs listed by job name instead of job ID. To see additional information on where the jobs are running, use the -r parameter. You can create an alias for displaying your job status with extra information: alias myq='showq -u <userid> -r'.
    • To look at the status of all jobs, type showq. To see jobs in the backfill queue, type showq -w class=bf. To see the jobs in the queue for a specific group, type showq -w qos=<groupname>. There are additional options such as groupname-bf for backfill in the group, or groupname-int for interactive.
    • To look at the status of the nodes in your group (e.g. CSDE), use nodestate <groupname>.
    • To cancel a job, type qdel <job_id> or mjobctl -c <job_id>.
    • To hold a job from running until you release it, type mjobctl -h <job_id>. To release a held job, type mjobctl -u <job_id>.
    • To change a job class, for example if backfill jobs are taking a long time to initiate and you want them to run in batch mode, type: mjobctl -m class=batch -m account=csde -m qos=csde <job_id>. To change from batch to backfill, type: mjobctl -m class=bf -m account=csde-bf -m qos=csde-bf <job_id>. The job_id should correspond to what you see when you run showq.
      • You can also create an alias for this: alias tobatch='mjobctl -m class=batch -m account=csde -m qos=csde' or alias tobf='mjobctl -m class=bf -m account=csde-bf -m qos=csde-bf'. Then, to use the alias, type tobatch <job_id> or tobf <job_id>.
      • If you are using array jobs, the job ID would look like this: job_id[#], with the number indicating the job in the array. To change the mode for all array jobs with a given master job ID, use: tobatch x:<job_id>.*.
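
The two job-class aliases can also be written as shell functions, which makes a missing job ID easier to catch. The sketch below is one possible layout, not the group's actual setup. The DRY_RUN variable and the _modify_job helper are hypothetical additions so the commands can be previewed on a machine without the scheduler installed, and 12345 is an invented job ID. The mjobctl options are the ones described in the text above.

```shell
#!/bin/bash
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command (hypothetical switch)

# usage: _modify_job CLASS ACCOUNT JOB_ID
_modify_job() {
  if [ -z "$3" ]; then
    echo "usage: tobatch <job_id> or tobf <job_id>" >&2
    return 1
  fi
  if [ "$DRY_RUN" = "1" ]; then
    echo "mjobctl -m class=$1 -m account=$2 -m qos=$2 $3"
  else
    mjobctl -m "class=$1" -m "account=$2" -m "qos=$2" "$3"
  fi
}

tobatch() { _modify_job batch csde "$1"; }
tobf()    { _modify_job bf csde-bf "$1"; }

tobatch 12345          # a single job
tobf "x:12345.*"       # all array jobs under one master job ID
```

Quoting the x:12345.* pattern keeps the shell from treating the asterisk as a file wildcard. Set DRY_RUN=0 on Hyak to run the commands for real.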

Hyak file systems

  • Each Hyak group (e.g. CSDE) has a scratch directory under /gscratch/groupname which is shared among all Hyak nodes. The CSDE directory is /gscratch/csde. Gscratch is not backed up. These directories have quotas based on the number of nodes owned by the group. To see the quota for CSDE, type mmlsquota -j hyak-csde gscratch --block-size G.
  • Every Hyak user also has a home directory that is private. You can store ssh keys or login scripts here. Don't store large code files or do computation here, as this system is small and slow. These files are also not backed up. To check your home directory quota, type mmlsquota home --block-size G.
  • There is also a local scratch directory on each node. These are accessible only to the local node, and are cleaned up after each job is complete. To keep data on the local scratch disk, create a directory with your group name, e.g. scr/csde.
  • There is a long-term storage system called lolo archive. This is intended for storage of data that you aren't changing frequently, and it should never be used for any interactive access. It is also not intended for storing many small files; if you have small files, collect them into tar files before transferring to the archive. The archive quota limits you to 10,000 files/TB, which works out to an average file size of 100MB, but it performs best with files >1MB.
    • CSDE has storage on lolo. To move files here, first create a directory with your name in the lolo folder: cd /lolo/archive/hyak/csde, then mkdir <your_username>. Then, from the directory of the file you want to move, type cp filename /lolo/archive/hyak/csde/<your_username>.
  • Lastly, there is a lolo collaboration filesystem for sharing data with colleagues and staging data for later transfer to gscratch.
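
Bundling small files before archiving might look like the following. This is a sketch that uses a temporary directory and dummy files so it can be tried anywhere; the file names and the sim_output.tar.gz archive name are invented, and the final copy to lolo is shown only as a comment.

```shell
#!/bin/bash
WORK=$(mktemp -d)
mkdir "$WORK/data"

# Create a few small dummy result files (stand-ins for real output).
for i in 1 2 3; do
  echo "result $i" > "$WORK/data/sim.1.$i.txt"
done

# Collect them into one compressed tar file.
tar -czf "$WORK/sim_output.tar.gz" -C "$WORK" data

# List the archive contents and count the result files before copying it.
NFILES=$(tar -tzf "$WORK/sim_output.tar.gz" | grep -c 'sim\.')
echo "files in archive: $NFILES"

# On Hyak, the archive would then be copied with something like:
#   cp sim_output.tar.gz /lolo/archive/hyak/csde/YOUR_USERNAME
rm -rf "$WORK"
```

One archive counts as a single file against the files/TB quota, however many result files it contains.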