Guide to Github, RStudio, and Hyak - statnet/computing GitHub Wiki
This document provides a guide to getting set up with GitHub, using RStudio, and running code through Hyak (a high-performance computing cluster at the University of Washington). This guide is intended for users affiliated with statnet or the Network Modeling Group at the University of Washington, but it could be used more broadly with modifications to the code.
The big picture is as follows: Software packages and projects are housed on GitHub to facilitate version control and collaboration with Git. RStudio is used to build or edit packages and run analyses. You can access RStudio on your local computer or through the Center for Studies in Demography and Ecology (CSDE) servers. Large scale simulations and other computationally-intensive tasks are done through Hyak. To transfer files between your local computer, the CSDE server, and Hyak, use a Unix shell.
NOTE: Some of the instructions for working with Hyak may be out of date. A few updates have been added, but if some scripts or processes are not working, consult the Hyak wiki.
GitHub is an online platform for version control and collaboration. Git is the software GitHub is built around. Projects are organized in repositories, which can contain code, datasets, or any other files related to a project. You can create a branch off the master version of a repository to make edits without changing the original files. This allows multiple people to work on a project simultaneously without interfering. To save and document your changes, you make a commit with an associated 'commit message' to summarize why you made the changes. This works best if you commit often so that each commit corresponds to one idea or issue that you fixed. If you think your changes should be reflected in the master branch, submit a pull request. The owner of the master branch will review the differences in the content of your branch and the master, and will merge the branches together if the changes are accepted.
The first step to using GitHub is to sign up with a username and password. Do that here: https://github.com/. Then you can search for repositories, code, issues, or users. To join the statnet organization and get access to the group's coding repositories and documentation, search "statnet" in the search bar at the top. In the menu on the left, select "users," then click on the statnet user profile. Some repositories are private, so you will need to request access.
To download and contribute to GitHub repositories on your local computer, you can use the command-line interface (CLI), the GitHub graphical user interface (GUI), or you can work through RStudio directly.
- To access GitHub from the command line, you first need to download and install Git. Click "Downloads" and select the download link for your operating system. In the installation process, use the default options.
- To interact with Git, open Git Bash (for Windows users; it should be in the Start Menu) or Terminal (for Mac users). These are both Unix shells.
- Once in the shell, you need to configure your username and email so that your changes to the code will be flagged as coming from your Git account. This cheat sheet is a good reference for setting up and using Git at the command line, and this tutorial will walk you through the basic commands. I'll summarize some of the important steps here:
  - To create a new repository, first change the working directory to the folder where you want the repository to be saved. Type `pwd` to see your current directory, and type `cd [directory path]` to change the current directory.
  - You may want to make a new directory for your repository, which you can do by typing `mkdir [directory_name]`. Then, to create a new repository, type `git init [project-name]`.
  - To clone a repository, type `git clone [url]`.
  - To create a branch, type `git branch [branchname]`.
  - For more on how to use the CLI, go to the section on Unix.
- Open the repository in RStudio by going to File -> New Project -> Existing Directory, then navigate to the folder in which you saved your repository.
- To access GitHub from the GUI, download the GitHub Desktop app. From the app, you can clone projects from GitHub to be used in your local RStudio by clicking on the plus sign in the upper left-hand corner. Cloning creates a local copy of the repository on your computer, which you can sync with the remote repository on GitHub. First, create a folder on your desktop to store the cloned project. Then select the "clone" option from the top of the window. If you've been added to statnet or if you have any repositories of your own, they should appear in the dropdown menu.
  - Then select the project you want to clone (e.g. Mardham). A box will pop up for you to name your clone and put it in the appropriate folder on your computer. After cloning, create a branch so that you can work on the project in parallel with other users. At the top of the window, click the third icon from the left to create a branch.
  - A graph will appear near the top of the window, showing your branch and the master branch. Any commits you make or any made to the master will be displayed here. If you want the changes you made in your branch to be reflected in the `master` repository, create a `pull request`. This makes your changes public so others can review them. Click the "pull request" button in the upper right, enter a title and description, and submit.
  - Again, to open the repository in RStudio, go to File -> New Project -> Existing Directory, then navigate to the folder in which you saved your repository.
- To access repositories directly from RStudio, go to File -> New Project -> Version Control -> Git, and enter the repository URL, which you can find on GitHub by clicking the green "Clone or download" button from within a repository page. Within RStudio, there is a tab in the upper right window called Git. From here you can see your changes, make commits, pull, and push. You can also access the command line from within RStudio by going to the Tools menu and selecting "Shell."
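The command-line steps above (configure your identity, create a repository, commit, branch) can be tried safely in a throwaway folder. The following is a minimal sketch, not a prescribed workflow: the repository name, file name, branch name, and identity are all hypothetical placeholders, and the identity is set for this one repository only rather than globally.

```shell
# Sketch of the basic Git workflow in a scratch directory.
# All names below (demo-project, notes.txt, my-edits) are made up.

workdir=$(mktemp -d)              # scratch folder so nothing real is touched
cd "$workdir"

git init -q demo-project          # create a new repository
cd demo-project

# Identify yourself for this repository (add --global to apply everywhere)
git config user.name "Your Name"
git config user.email "you@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes file"     # one commit per idea, with a message

git branch my-edits               # create a branch
git checkout -q my-edits          # switch to it

echo "second draft" >> notes.txt
git commit -q -a -m "Revise notes"

current_branch=$(git rev-parse --abbrev-ref HEAD)
commit_count=$(git rev-list --count HEAD)
echo "On branch $current_branch with $commit_count commits"
```

Working on a branch keeps the original branch untouched until the changes are reviewed and merged.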
RStudio is a user interface for R, which you can download here. You can also use RStudio on the CSDE server. You first need to request a CSDE Unix account. Then you can use this link to sign on to the RStudio nori server. To log in, type your CSDE username and password.
If you're new to R, there are a lot of helpful guides out there on the Internet. Here are some examples that I've found useful:
- R Tutorial by Kelly Black at the University of Georgia. This guide goes over how to input data, data types, basic operations, plotting, indexing, data management, basic statistical analyses, and programming basics (e.g. if statements, for loops, etc.).
- Try R on codeschool.com, sponsored by O'Reilly Media. This tutorial has 7 chapters with interactive activities to walk you through how to create variables, define functions, use vectors, matrices, data frames, etc.
- Introduction to R on DataCamp. In addition to this basic introductory tutorial, DataCamp also has free tutorials on data visualization, data manipulation, dynamic reporting, and many more!
- R Tutorial for the Network Modeling for Epidemics course. This tutorial describes the basics of R, how to create and perform operations on vectors, matrices, and arrays, how to use data frames, programming for conditional logic, and plotting.
- A Short Introduction to R for Epidemiology by Michael Hills, Martyn Plummer, and Bendix Carstensen. In addition to discussing how to use the Epi package, it provides a general overview of R commands and functions.
There are also manuals available on the R-project website, and on the RStudio website.
In R, you can always get help on a function by typing `help(function_name)` or `?function_name`. To find the documentation for an installed package, type `help(package="package_name")`. Vignettes provide more detailed documentation on packages. To find available vignettes, type `browseVignettes()` and it will open an HTML page with a list of all vignettes for installed packages. To see vignettes for a specific package, type `browseVignettes("package_name")`.
After starting a new project with your cloned or branched repository, make sure you have a few important packages loaded.
- `devtools` provides functions to facilitate package development. To install it, type `install.packages("devtools")`. Then load it with `library(devtools)`.
- Use `devtools` to install packages from GitHub. For example, to install the `EpiModelHIV` package, run the following code:

      install.packages("EpiModel", dependencies=TRUE)
      devtools::install_github("statnet/tergmLite", subdir = "tergmLite")
      devtools::install_github("statnet/EpiModelHIV")

- `roxygen2` is a documentation system for R. There are three ways to run roxygen: type `roxygen2::roxygenise()`, type `devtools::document()`, or press `Ctrl + Shift + D` if you're in RStudio.
  - Once `roxygen2` is loaded, click the Build menu and select "Configure Build Tools." Make sure the root directory is correct, and check the box for "Generate documentation with Roxygen." That will open a window with the Roxygen options.
- If you go to the Files tab in the lower right window of RStudio, you should see several folders.
  - `R` has code for all the modules. Each module encodes a single core process within the system. The module relates to a concept, and the module function is the realization of that concept. The inputs to the modules are the data structure and parameters. To understand the functions and what data you will need, a good place to start is to go through the modules and note what each one does and what the inputs are.
  - `man` houses automatically-generated manuals associated with the functions. These can be useful reminders, but reading the modules and other files in the R folder is more informative.
  - `src` contains C and C++ source code that needs to be compiled. You probably won't need to interact with these files unless you're developing new packages. The aging module is stored in this folder for EpiModelHIV, however (as of 12.9.16).
  - `inst` contains test scripts. You can run these to see what the modules are and the order in which they are run. These are useful to run for debugging.
  - `test` has some tests of code. You may want to delete this folder, as changes to the code may violate these previously written tests.
  - Depending on the repository, you may or may not see a `scenarios` folder. This will have examples of setup, estim, and sim files, which can be used as a template for the scripts you will write to run your model.
  - If you are running simulations, create new directories for each one, each of which should have setup, estim, and sim files. Don't change the name of these files; if you do, they won't be called appropriately.
Important notes:
- If you want to modify some code, don't edit the original file. Instead, copy and paste into new files in the same folder as the originals and rename them. Be sure that you update the name wherever the new code/function is called, including in the setup and estim files.
- Don't delete any of the header "@" information on any files you're revising. This information is important to process the files into the new build.
When you are ready, go to the Build menu and click "Build and Reload." When the process finishes, you should see your new functions in the NAMESPACE page in the main package folder (not the R code). If not, your functions have not been added to the package. Do not try to add functions manually to this page.
UNIX is an operating system that provides a link between the user and the computer through the shell (a command line interpreter). Data in UNIX are organized into files, which are organized into directories. You can access Unix using Terminal if you have a Mac, or Git Bash or PuTTY if you have a Windows computer.
There are a bunch of online tutorials on Unix. Here are the links to a few:
- https://www.tutorialspoint.com/unix/index.htm
- http://people.ischool.berkeley.edu/~kevin/unix-tutorial/section1.html
- http://www.washington.edu/computing/unix/startdoc/
To get you started, some basic commands and shortcuts are outlined below. Note that some command syntax is different on Windows, so if something doesn't work, try Googling the command for Windows or using the tutorials above.
- To get help: type `man` then the command name you want to know about, or `<command name> --help`.
  - Anything in square brackets is an optional argument
  - To exit the help window, type `q`
- The $ or > symbols are the shell prompts
  - If you see the prompt, the shell is ready for you to write commands to it
  - If you enter a command you didn't want to run, press control-c to get back to the shell prompt
- Shortcuts
  - The up arrow takes you back to the last command. You can also type `history` to see a list of the commands you've typed. To repeat the last command type `!!`, to repeat the second-to-last command type `!-2`, to repeat the third-to-last command type `!-3`, etc.
  - If you have started typing a command, control-a will move the cursor to the beginning of the command, and control-e will take you to the end
  - Tab completion: if you start typing something and hit tab, it will suggest the file name or directory. If there is more than one available file or directory with that same stem, you may need to give it more information.
- `ls` lists the files in the directory you are in
  - You can type `ls -a` to show hidden files
- Files and directories are organized in a tree with branches. To locate files and folders, you can use absolute paths or relative paths:
  - Absolute paths list the location starting from the root directory, `/`
  - Relative paths list the location of a file or folder relative to the current working directory
  - `~` is a short form to refer to the user directory.
- Wildcard expansion:
  - The star wildcard (`*`) represents any string. It can be combined with characters that you want the command to match on. Some examples are below.
    - `rm temp/*` removes all (non-hidden) files in the temp folder
    - `ls *.html *.txt` will list all files with extension .txt or .html
    - `rm *xxx*` will remove all files in the current directory with the string "xxx" in the name
  - The question mark wildcard (`?`) represents a single character, and two question marks represent two characters in succession.
    - `file ???` returns data on objects that have a name 3 characters in length (including extensions)
  - The bracket wildcard (`[]`) can represent any of the characters enclosed in brackets
    - `ls code[1-4]` will list any files in the current directory named "code" followed by one of the numbers 1-4.
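Because the shell expands wildcards before the command runs, one way to preview what a pattern matches is to pass it to `echo` first, which only prints the expanded names. The sketch below does this in a scratch directory; every file name in it is invented for the demonstration.

```shell
# Wildcard sketch in a scratch directory; all file names are made up.

demo=$(mktemp -d)
cd "$demo"
touch code1 code2 code5 abc abcd page.html notes.txt xxx_old.txt

star_matches=$(echo *.html *.txt)   # * matches any string of characters
three_chars=$(echo ???)             # names exactly three characters long
bracket_matches=$(echo code[1-4])   # "code" followed by one of 1, 2, 3, 4
xxx_matches=$(echo *xxx*)           # names containing "xxx"

echo "star:     $star_matches"
echo "question: $three_chars"
echo "bracket:  $bracket_matches"
echo "xxx:      $xxx_matches"

cd /
rm -r "$demo"                       # clean up the scratch directory
```

Previewing a pattern this way is especially useful before handing it to `rm`, since removed files cannot be recovered.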
- `whoami` will print your login name.
- `pwd` will print the name of the current working directory.
- `cd` is the command to change the directory
  - Type `cd <directory_path>` to change to a directory. This could be an absolute path or a relative path (for example, to move into a folder within the current directory, just type the name of that folder).
  - To move up one level, type `cd ..`
- `mkdir <directory_name>` creates a directory
- `mv` will move a file to a different directory or rename it.
  - To move something from some other folder to the current folder, type `mv <current_pathfile> .` The `.` points to the current directory, so `mv /Users/username/Documents/testfile .` will move the file "testfile" from the Documents folder to the current directory.
  - To rename something, type `mv <oldfilename> <newfilename>`
  - Warning: if you move something to a new file name that already exists, it will replace it!
- `cp` will copy a file: `cp notes.txt notescopy.txt` will create a copy of notes.txt called notescopy.txt.
  - To copy a directory, add the option `-r`: `cp -r <directoryname> <newdirectoryname>`
- `rm` removes a file or directory, e.g. `rm notes.txt`
  - To remove an entire directory, type `rmdir <directoryname>`. This will only work on empty directories, though.
  - To remove things in a directory first, type `rm <directoryname>/*`
  - Warning: Once you remove something, you can't get it back. To add a safeguard, you can add the option `-i` so that it will ask you to confirm that you want to remove something.
- `file` will give you information about an object. One way you might want to use this is to find out the type of a file, because that will determine how you can view the file.
- To open/view text files:
  - `cat <filename>` will print the contents of the file
  - `nano <filename>` opens a text editor. To exit, type control-x
  - `vi <filename>` opens the vi editor. Type `i` to go from command mode to insert (edit) mode. To go back to command mode, hit escape. To learn how to navigate in the vi editor, check out this page. From command mode, type `:w` to save, `:q` to quit, and `:wq` to save and quit.
- To view pdf files, use `open <filename>` (on a Mac).
- `wc -l <filename>` will tell you how many lines there are in a file.
- `grep` is a way to search for content in files. It reports the lines of a file that contain the word or string.
  - `grep test *.txt` will search all .txt files for the word "test".
  - `grep '^t' *.txt` will search all .txt files for lines beginning with "t". Quote patterns that contain special characters so that the shell does not expand them as wildcards.
- You can create files to run commands together.
  - Create a file with extension .sh (shell script) in an editor (`nano` or `vi`), e.g. `nano script.sh`
  - Type the commands you want to run, one per line
  - Then type `bash script.sh`
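To see searching, line counting, and shell scripts working together, here is a small sketch run in a scratch directory. The file names and contents are invented for the demonstration, and the script is written with a here-document rather than an editor only so that the example is self-contained.

```shell
# Sketch: search files, count lines, and run a small script.
# File names and contents are made up for the demonstration.

demo=$(mktemp -d)
cd "$demo"
printf 'test one\nsomething else\ntest two\n' > a.txt
printf 'no match here\n' > b.txt

# Quote the pattern so the shell passes it to grep unchanged;
# -h leaves the file names out of the output
matching_lines=$(grep -h "test" a.txt b.txt | wc -l | tr -d ' ')

# Reading from standard input makes wc print only the number
line_count=$(wc -l < a.txt | tr -d ' ')

# A shell script is a text file of commands, run with bash
cat > script.sh <<'EOF'
echo "files here: $(ls | wc -l | tr -d ' ')"
EOF
script_output=$(bash script.sh)

echo "lines containing test: $matching_lines"
echo "lines in a.txt: $line_count"
echo "$script_output"

cd /
rm -r "$demo"                 # clean up the scratch directory
```

The `tr -d ' '` step strips the padding that some versions of `wc` add around the count.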
Hyak is a shared scalable computing cluster operated by UW-IT. It is made up of two clusters, each of which has hundreds of nodes, and each node has 16 processor cores and 128GB of memory. The nodes run on Linux, and there are four types of nodes designed for different tasks:
- Login nodes: These are the nodes you arrive at when `ssh`ing into Hyak. They are used for:
  - Transferring data between Hyak, CSDE, and your local computer.
  - Submitting simulation jobs with `qsub`, monitoring these jobs, and other file management.
  - These nodes are not used for simulations, building R packages, or other computationally intense work.
- Build nodes: These are the nodes you should use for building software like R packages and for small-scale tasks like package installation. They are connected to the internet, so you can install packages using the `install.packages()` function. Use these in interactive mode, wherein package installation is done manually as opposed to in batch jobs. To connect, use the `build` alias defined below. This starts a new job with `qsub` but flags it with the `q build` tag, which identifies the job as a build job. The default walltime is 30 minutes and the max is 8 hours.
- Compute nodes: These are designed for large-scale simulations. CSDE and other research groups have "ownership" of some of these nodes. To connect to compute nodes, submit a job with `qsub`, rather than by trying to connect directly with `ssh`. The next section describes in more detail how to submit simulation jobs to CSDE nodes and/or other nodes. To exit compute nodes and return to the login node, type `exit`.
- Interactive node: CSDE owns an interactive node specifically for running interactive computing jobs, such as running R interactively to test a simulation on a small scale or perform some data analysis. To connect to this node, use the `shell` alias defined below, which is shorthand for `qsub -q int -I`. This code tells Hyak that you want a node in the interactive queue and that the job is interactive. This node is not connected to the internet, so don't try to use it to build packages or transfer data. It may take a few minutes for the scheduler to get resources available. The default walltime is 60 minutes.
- Get an account
  - To get an account, you need to have a UW NetID and an account sponsor, typically the leading faculty member of the research group or the IT director of the academic unit.
  - CSDE affiliates can access through CSDE: on this page click "High-Performance CSDE Unix/Linux Systems" under the "Unix Systems" heading to learn how to get access to the CSDE Hyak nodes.
  - If you are a UW student, you can also join the UW HPC Club and access the STF nodes.
- Get a security token
  - You will need a security token (also known as an entrust token or PRN) from UW-IT. This takes a couple of days after you fill out this form.
  - If you don't use Hyak very often, your PRN may not work. If this occurs, re-synch your entrust token here. Under "Synchronize Token," enter the 8-digit number that shows up when you hold down the green power key on your token. You should then see a message that says "Token synchronization successful."
- Add Lolo and Hyak to your computing services
  - Click here to add the Hyak and Lolo servers to your active computing services. Under "Inactive Services," check the boxes for Hyak and Lolo, and click subscribe. If you have successfully subscribed, Hyak and Lolo will appear under "Active Services," though you may need to refresh the page a few times. Note that you will not be able to subscribe to these services until your Hyak account has been approved.
- Connecting to Hyak
  - The most basic (and inefficient) approach to logging in:
    - In your shell program (e.g. bash or terminal), type `ssh -X your_netid@hyak.washington.edu` (replacing "your_netid" with your UW netid).
    - It will prompt you for a password. In the shell, it will not show anything as you type a password, but it is processing what you type.
    - Then it will prompt you to enter your PRN. Hold down the power button on your entrust token until you get a number. Enter that number into the shell. This should complete the authentication process and connect you to Hyak.
  - A more streamlined approach
    - Set up an ssh config file so you can use short logins at the terminal. In the shell program on your local computer, type:

          cd ~/.ssh
          ls

    - Look for a file called `config`.
      - If it doesn't exist, type `touch config`.
    - Edit the `config` file: `vi config`.
      - Type `i` to get into insert mode, and then add this to the file (don't include <> when you enter your user name):

            Host hyak hyak.washington.edu
                User <your hyak username>
                HostName hyak.washington.edu
                ControlPath ~/.ssh/master-%r@%h:%p
                ControlMaster auto
                ControlPersist yes
                Compression yes

            Host union
                User <your csde account name>
                HostName union.csde.washington.edu

            Host libra
                User <your csde account name>
                HostName libra.csde.washington.edu

            Host nori
                User <your csde account name>
                HostName nori.csde.washington.edu

      - To save and close `vi`, press the `esc` key then type `:wq`.
    - Now in the CLI window, you can log onto Hyak with just `ssh hyak`. It will prompt you for your UW netid password and the PRN from your entrust token.
    - Once in Hyak, set up the config file there to easily switch between Hyak and CSDE linux nodes. Follow the steps above to move to the `.ssh` directory and open or create a `config` file to edit with `vi`. In the editor, enter these lines:

          Host union
              User <your csde account name>
              HostName union.csde.washington.edu

          Host libra
              User <your csde account name>
              HostName libra.csde.washington.edu
  - To bypass having to enter your password every time you access the CSDE cluster, do the following:
    - Type this code on the command line on your local machine (to log out of Hyak type `logout` on the command line, or open a new shell window): `ssh-keygen -t rsa`. If you are prompted for a file name or password, hit enter until a public key is created.
    - Add that key to your authorized keys file on union:

          ssh union mkdir -p .ssh
          cat .ssh/id_rsa.pub | ssh union 'cat >> .ssh/authorized_keys'

    - Then log in to Hyak and type this:

          ssh-keygen -t rsa
          cat .ssh/id_rsa.pub | ssh union 'cat >> .ssh/authorized_keys'

    - Test the passwordless login between local-to-CSDE and Hyak-to-CSDE.
    - This will save time when transferring files between Hyak and CSDE to your local computer. (NOTE: If you have a Mac computer, you don't really need to use the CSDE servers for anything; you could just use R or RStudio on your local computer for basic estimation and analysis and then submit your runs to Hyak through terminal. However, there are added benefits to using the CSDE RStudio server: the server backs everything up regularly, and you can have code running even when your laptop is closed and you are off to lunch. For Windows users, the transfer of files from the local computer to Hyak is more cumbersome, so it's easier to use the CSDE servers.)
  - If you have trouble connecting to Hyak, check out the Hyak wiki; the issue may be with your DHCP configuration or with your shell, and the wiki has instructions for how to troubleshoot.
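If you would rather script the config-file edits than type them in `vi`, the idea can be sketched as a small shell function that appends a `Host` entry only when it is missing. This is an illustrative sketch under stated assumptions: `add_ssh_host` is a hypothetical helper, not part of ssh, and the example writes to a scratch file so that your real `~/.ssh/config` is left alone.

```shell
# Sketch: append a Host entry to an ssh config file only if it is missing.
# add_ssh_host is a hypothetical helper; the user name is a placeholder.

add_ssh_host() {
  config_path="$1"; host_alias="$2"; user_name="$3"; host_name="$4"
  touch "$config_path"
  chmod 600 "$config_path"     # keep the file readable and writable only by you
  if grep -q "^Host $host_alias\$" "$config_path"; then
    return 0                   # entry already present, nothing to do
  fi
  printf 'Host %s\n    User %s\n    HostName %s\n' \
    "$host_alias" "$user_name" "$host_name" >> "$config_path"
}

config_file=$(mktemp)          # stand-in for ~/.ssh/config
add_ssh_host "$config_file" libra your_csde_name libra.csde.washington.edu
add_ssh_host "$config_file" libra your_csde_name libra.csde.washington.edu  # no-op

host_entries=$(grep -c '^Host ' "$config_file")
cat "$config_file"
```

Calling the function twice leaves a single entry, so it is safe to rerun when setting up a new machine.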
- Aliases in Linux are command shortcuts.
  - Log in to Hyak. Aliases are stored in a file called either `.bashrc` or `.bash_profile` in your home directory.
    - Look at what's currently in it: `cat .bashrc`. If it says "no such file", try with `.bash_profile`.
    - Use `vi` to edit the file and add aliases such as those below. These are examples, so you can add new ones that are relevant to your project.
      - `alias gsr='cd /gscratch/csde/your_username'`: This changes the directory to your local gscratch folder (where you should save your simulation results). You can name that alias anything you want and you can point it to any folder you want (you should change it so that "your_username" is replaced with your CSDE username).
      - `alias shell='srun -N 1 -p csde -A csde --time=24:00:00 --mem=50G --pty /bin/bash'` and `alias build='srun -p build --time=3:00:00 --mem=20G --pty /bin/bash'`: These aliases start up jobs on the interactive node and build node (2 of Hyak's 4 types of nodes), respectively.
      - `alias transdat='scp /gscratch/csde/your_username/project/*.rda libra:~/project/data'` and `alias transscr='scp libra:~/project/*.[Rs]* /gscratch/csde/your_username/project/'`: The scp (secure copy) command transfers files safely between Hyak and outside computers. The first of these commands transfers data from Hyak to CSDE (and, specifically, `.rda` files from that specific folder on Hyak to that specific folder on CSDE's libra node). The second transfers scripts from CSDE to Hyak (specifically any files whose extension starts with R or s, such as `.R` and `.sh`, from that specific folder on libra to that specific folder on Hyak).
      - `alias myq='squeue -u <userid>'`: This checks what is running in your queue. Replace `<userid>` with your userid.
      - `alias lspack='. /suppscr/csde/<userid>/spack/share/spack/setup-env.sh'`: This loads the spack environment on Ikt. Replace `<userid>` with your userid. For more on spack: guide to building the latest version of R, Hyak wiki guide to Spack
    - To test if these worked correctly, type `logout` and then log in again and type `alias`.
    - Similar aliases can be defined for your local computer.
    - If you are transferring a lot of files between Hyak and your local computer, log on to Hyak in one session, then open another session to do the file transfers; otherwise you'll have to enter your password and token number (PRN) each time.
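As an alternative to logging out and back in, a startup file can be re-read in the current session with the `.` (dot) command. The sketch below shows the idea with a scratch file standing in for `.bashrc`, so it does not touch your real startup file; the paths and user names are the same placeholders used above.

```shell
# Sketch: keep aliases in a startup file and load them into the shell.
# A scratch file stands in for ~/.bashrc; paths and names are placeholders.

rc_file=$(mktemp)
cat > "$rc_file" <<'EOF'
alias gsr='cd /gscratch/csde/your_username'
alias myq='squeue -u your_userid'
EOF

. "$rc_file"                      # re-read the file in the current shell

gsr_definition=$(alias gsr)       # print the definition of one alias
myq_definition=$(alias myq)
echo "$gsr_definition"
echo "$myq_definition"
```

With a real startup file, the equivalent command would be `. ~/.bashrc`, and typing `alias` with no arguments lists every alias currently defined.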
- Using the build nodes to load R packages from CRAN
  - To use Hyak for R computing or analysis, you will need to load your R library onto Hyak. The first step is to log on to a build node by typing `build` (if you've defined a "build" alias, as described above). The software is loaded in "modules". To see the modules available on Hyak, type `module available`.
    - Load the R module (you can specify the version number to load more up-to-date versions of R, which are usually better if they are listed as available): `module load r_3.2.5`.
    - NEW: To build the latest version of R, use Spack
    - Type `R` in the command line to start R. Then start installing packages interactively from CRAN as you would in your R or RStudio console. Start with EpiModel and its dependencies: `install.packages("EpiModel", dependencies=TRUE)`. If it worked, you should see no error messages during the installation. You will most likely get a message asking if you want a personal library, and the answer is yes.
    - Type `q()` to exit R.
- Installing packages from GitHub
  - Since many packages are not hosted on CRAN, only on GitHub, we need to take a different approach to install them. Some packages are in private repositories, so you won't be able to install them unless you get access from the owner. There are some issues with getting `devtools` to run correctly on Hyak, so we use this alternative strategy:
    - Use `vi` to open `.bashrc` or `.bash_profile` and press `i` to enter insert mode (this process is explained above in the section on aliases).
    - Paste the following function into the file:

          installgit() {
            module load r_3.2.5
            wget -q https://github.com/$1/$2/archive/master.zip
            unzip -q master.zip
            R CMD INSTALL "${2}-master/$3"
            rm -r "${2}-master"
            rm -r master.zip
          }

    - Then save and close the file by typing `esc` and then `:wq`. This function will download any R package in a Git repository and install it for you. To do this, type `installgit <owner> <repository> <subdirectory>` into the command line. For example, to get the GitHub version of EpiModelHIV, we'd type `installgit statnet EpiModelHIV`. In this case we don't need to specify a subdirectory because the package is in the root directory of the repository. But with something like `tergmLite`, you'd need to specify the subdirectory: `installgit statnet tergmLite tergmLite`.
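To see how the three arguments of `installgit` map onto the download URL and the install path, the following sketch prints what would be fetched and installed without touching the network or R. `installgit_plan` is a hypothetical dry-run helper written for this illustration; it is not part of the function above.

```shell
# Sketch: show what installgit would download and install for given
# arguments. installgit_plan is a hypothetical helper that only prints.

installgit_plan() {
  owner="$1"; repo="$2"; subdir="${3:-}"    # the subdirectory is optional
  echo "download: https://github.com/$owner/$repo/archive/master.zip"
  echo "install:  ${repo}-master/$subdir"
}

plan_root=$(installgit_plan statnet EpiModelHIV)
plan_sub=$(installgit_plan statnet tergmLite tergmLite)
printf '%s\n' "$plan_root" "$plan_sub"
```

Note that the `archive/master.zip` address assumes the repository's default branch is named `master`; a repository that uses a different branch name would need the URL and folder name adjusted accordingly.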
- Set up your scripts
- To run simulations on Hyak, you need an R script that sets up the simulation. An example R simulation script is:
library("methods")
suppressMessages(library("EpiModelHIVmsm"))
library("EpiModelHPC")
args <- commandArgs(trailingOnly = TRUE)
simno <- args[1]
jobno <- args[2]
fsimno <- paste(simno, jobno, sep = ".")
print(fsimno)
load("est/nwstats.rda")
param <- param_msm(nwstats = st)
init <- init_msm(nwstats = st)
control <- control_msm(simno = fsimno,
                       nsteps = 52 * 50,
                       nsims = 16,
                       ncores = 16,
                       save.int = 500,
                       save.network = TRUE,
                       save.other = c("attr", "temp"))
netsim_hpc("est/fit.rda", param, init, control,
           save.min = TRUE, save.max = FALSE, compress = "xz")
- You will also need a shell script, as in the following example, titled
runsim.sh
:- This s a PBS (Portable Batch System) jobscript, which contains instructions for the scheduler, sets up the working environment, and executes your production program. Instruction lines start with #PBS, and they tell the scheduler what your program is, how many nodes it uses, and how much memory you need. The Standard specs lines set a specific issue of memory utilization. Make sure the current version of R used to build packages is loaded. The final line runs the R script in batch mode, pulling from variables for SIMN and PBS_ARRAYID and passing them into the R script as necessary.
#!/bin/bash

###========== User specs / instructions for the scheduler ==========###

## Name the job
#PBS -N sim$SIMNO

## This line tells the scheduler we want 1 node, 16 processing cores per node, we only want 16 core nodes, and we want 5 hours of runtime
#PBS -l nodes=1:ppn=16,mem=44gb,feature=16core,walltime=05:00:00

## This defines where the output should go. Change the filepath to your target gscratch folder
#PBS -o /gscratch/csde/camp/out

## This defines where the standard error should go. Again, change the filepath to your target folder.
#PBS -e /gscratch/csde/camp/out

## This says you want the standard output and standard error to be combined into the out file.
#PBS -j oe

## This defines the working directory for the job. Again, change the filepath to your target folder.
#PBS -d /gscratch/csde/camp

## This defines whether email should be sent with job status. If "n", no mail sent. Type "abe" if want email when the job is aborted (a), begins (b), or terminates (e).
#PBS -m n

###========== Standard specs ==========###
HYAK_NPE=$(wc -l < $PBS_NODEFILE)
HYAK_NNODES=$(uniq $PBS_NODEFILE | wc -l )
HYAK_TPN=$((HYAK_NPE/HYAK_NNODES))
NODEMEM=`grep MemTotal /proc/meminfo | awk '{print $2}'`
NODEFREE=$((NODEMEM-2097152))
MEMPERTASK=$((NODEFREE/HYAK_TPN))
ulimit -v $MEMPERTASK
export MX_RCACHE=0

###========== Define the working environment (load modules) ==========###
module load r_3.2.4

###========== Execute your program ==========###

### App
ALLARGS="${SIMNO} ${PBS_ARRAYID}"
echo runsim variables: $ALLARGS
echo

### App
Rscript sim.R ${ALLARGS}
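To make the Standard specs arithmetic easier to follow, here is a minimal sketch that repeats the same calculation with explicit inputs instead of reading them from the node. The function name mem_per_task_kb and the example node size are illustrative assumptions and are not part of runsim.sh; the 2097152 kB (2 GB) reserve is the value used in the script.

```shell
#!/bin/bash
# Hypothetical helper (not part of runsim.sh): repeat the per-task memory
# arithmetic from the "Standard specs" block with explicit inputs.
# Arguments: total node memory in kB, number of tasks per node.
mem_per_task_kb() {
  local node_mem_kb=$1
  local tasks_per_node=$2
  local reserve_kb=2097152   # 2 GB held back, as in runsim.sh
  local node_free_kb=$(( node_mem_kb - reserve_kb ))
  echo $(( node_free_kb / tasks_per_node ))
}

# Example: an assumed node with 44 GB of RAM (46137344 kB) and 16 tasks.
mem_per_task_kb 46137344 16
```

For this assumed node, each task would be capped at 2752512 kB, or roughly 2.6 GB.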
- Transfer files
- To transfer the scripts from CSDE to Hyak to run them, first log in to Hyak and make sure you are on a login node to transfer data.
- If you haven't done so already, create a folder for your project in the gscratch directory. If you're using CSDE scratch directories, type cd /gscratch/csde from the home directory in Hyak. Then make a new folder: mkdir <your_username>. You can create additional folders within this directory for output or whatever else you need.
- Then you can use the alias transscr to transfer the scripts. Make sure the code in the alias points to the correct directory to/from which you want the files transferred. If you want to transfer an entire folder, use the option -r after scp. You can also use wildcards (described in the Unix section) to transfer all files that start with "run", for instance.
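As a rough illustration of the scp options mentioned above, the sketch below prints scp commands rather than running them. It is not the transscr alias: the function show_transfer and the host and folder names are hypothetical placeholders, so substitute your own.

```shell
#!/bin/bash
# Hypothetical placeholders: replace with your own host and folders.
CSDE_HOST="your_username@csde.host.example"
HYAK_DIR="/gscratch/csde/your_username"

# Print (rather than run) an scp command.
# Usage: show_transfer [-r] <source> <destination>
show_transfer() {
  local opts=""
  if [ "$1" = "-r" ]; then
    opts="-r "
    shift
  fi
  echo "scp ${opts}$1 $2"
}

# A single script, a whole folder, and every file starting with "run".
# The wildcard is quoted so the local shell passes it through unchanged.
show_transfer "${CSDE_HOST}:scripts/sim.R" "${HYAK_DIR}/"
show_transfer -r "${CSDE_HOST}:scripts" "${HYAK_DIR}/"
show_transfer "${CSDE_HOST}:scripts/run*" "${HYAK_DIR}/"
```

Printing the command first is a cheap way to confirm that the source and destination point where you expect before any files are copied.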
- Submit your jobs
- The next step is to submit the jobs with the job manager command qsub. There are many arguments and options for qsub, but the bare minimum are as follows: qsub -t 1 -v SIMNO=1 runsim.sh.
- The -t parameter invokes an array job, in which there is just one sub-job in this case. Array jobs are useful for running the same parameter set multiple times. This is an alternative to running one big job with lots of simulations, and it works better with the backfill queue.
- The -v parameter passes environment variables down into the runsim.sh shell script. In this case, SIMNO will be the primary simulation ID number, and passing SIMNO=X means that runsim.sh will execute a script called simX.R. In combination with the array ID, the unique ID of this simulation job would be 1.1, which will be used in the file name to save out the file.
- For a second simulation, we may want to submit a series of 7 subjobs to the backfill queue using the following syntax: qsub -q bf -t 1-7 -v SIMNO=2 runsim.sh.
- Backfill jobs execute on idle nodes throughout the cluster. They can be interrupted whenever the owner of the nodes needs them, and in any case every 4 hours, so if you run jobs on backfill: 1) be sure to frequently save intermediate results to disk and don't recalculate them after a restart, and 2) you can output the start time of jobs to track how much processor time the job has had. If you're using EpiModelHPC, this will happen automatically, but if you want to learn more about using the backfill queue, read this page.
- Note that the runsim.sh file doesn't need to be changed. The default -q argument is batch, which submits jobs only to CSDE nodes, whereas -q bf submits them to the backfill queue.
- For backfill jobs, it is advisable to set the runtime to 4 hours or 4:01:00, since you will be booted off at 4 hours anyway.
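The advice above about saving intermediate results and not recalculating them after a restart can be sketched generically as follows. This is only an illustration of the idea, not how EpiModelHPC implements it; the function run_with_checkpoints, the file names, and the placeholder "work" are all assumptions.

```shell
#!/bin/bash
# Minimal sketch of checkpoint-and-resume: each step writes its result to
# disk, and a restarted job skips steps whose result file already exists.
# The directory layout and the work done per step are placeholders.
run_with_checkpoints() {
  local outdir=$1
  local nsteps=$2
  local step result
  mkdir -p "$outdir"
  for step in $(seq 1 "$nsteps"); do
    result="$outdir/step_${step}.txt"
    if [ -f "$result" ]; then
      echo "step $step: already done, skipping"
      continue
    fi
    # Placeholder for the real computation. Write to a temporary file and
    # rename it, so an interrupted step never leaves a half-written result.
    echo "result of step $step" > "$result.tmp"
    mv "$result.tmp" "$result"
    echo "step $step: computed"
  done
}

# Print the start time so you can track how long the job has been running.
echo "job started at $(date '+%Y-%m-%d %H:%M:%S')"

demo_dir=$(mktemp -d)
run_with_checkpoints "$demo_dir" 3   # first run computes steps 1 to 3
run_with_checkpoints "$demo_dir" 5   # "restart": only steps 4 and 5 are computed
rm -rf "$demo_dir"
```

The second call stands in for a job that was interrupted and resubmitted: the steps already on disk are skipped, so only the remaining work uses processor time.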
- The -l argument allows you to override the job specifications in the PBS file. For instance, qsub -q bf -t 1-7 -v SIMNO=2 -l walltime=04:00:00 runsim.sh runs the script runsim.sh as a backfill job and tells the scheduler that it needs a walltime of 4 hours. To estimate walltime the first time you run a simulation, a rule of thumb is that for a network size of 10k, you will need 5 seconds per network simulated per time step. So if you have a simulation of 3 networks and 50 years of weekly time steps, you would need 5 x 3 x 52 x 50 = 39,000 seconds, or about 11 hours.
- Any qsub arguments can be placed together in a bash script for easier execution and historical record. You could create a file called master.sh that contains the following. To execute this file and run the two commands, you would type bash master.sh.
#!/bin/bash
qsub -t 1-7 -v SIMNO=1 runsim.sh
qsub -q bf -t 1-7 -v SIMNO=2 runsim.sh
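The walltime rule of thumb above (5 seconds per network per time step for a network of about 10k) can be turned into a small calculator. This is a sketch under that rule only; the function name estimate_walltime is hypothetical, and real runtimes will vary with the model.

```shell
#!/bin/bash
# Hypothetical helper applying the rule of thumb: 5 seconds per network
# per time step for a network size of about 10k.
# Arguments: number of networks, number of time steps.
estimate_walltime() {
  local n_networks=$1
  local n_steps=$2
  local secs_per_step=5
  local total=$(( secs_per_step * n_networks * n_steps ))
  # Format as HH:MM:SS, the form used in the walltime specification.
  printf '%02d:%02d:%02d\n' \
    $(( total / 3600 )) $(( (total % 3600) / 60 )) $(( total % 60 ))
}

# 3 networks and 50 years of weekly time steps (52 * 50 = 2600 steps).
estimate_walltime 3 $(( 52 * 50 ))
```

For the example in the text this prints 10:50:00, which is the "about 11 hours" figure. Note that this is longer than the 4-hour limit on backfill jobs.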
- If you are a member of more than one group (e.g. CSDE and HPC club), specify which group's nodes you want to use with the group_list argument: qsub -W group_list=hyak-groupname script.pbs. To see the groups you're a member of, run the command groups.
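If a master script grows beyond a couple of qsub lines, one option is to build the commands in a small function and print them before submitting. The sketch below is an assumption about how you might organize this, not part of the original workflow: the function submit_sim and the DRY_RUN variable are hypothetical, and only the qsub options already described in this section are used.

```shell
#!/bin/bash
# Sketch of a master script that builds one qsub command per simulation.
# With DRY_RUN=1 (the default here) the commands are only printed, so the
# script can be checked on a machine without the scheduler installed.
DRY_RUN=${DRY_RUN:-1}

# Arguments: simulation number, array range, optional queue name.
submit_sim() {
  local simno=$1
  local range=$2
  local queue=$3
  local cmd="qsub"
  if [ -n "$queue" ]; then
    cmd="$cmd -q $queue"
  fi
  cmd="$cmd -t $range -v SIMNO=$simno runsim.sh"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
}

submit_sim 1 1-7      # default queue
submit_sim 2 1-7 bf   # backfill queue
```

In dry-run mode this prints the same two commands as the master.sh example. Setting DRY_RUN=0 on Hyak would run them instead.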
- Transfer data from Hyak to CSDE
- By default, our R script saves a series of R data files within the data subfolder of our shared drive on Hyak. For data analysis, we will transfer the minimum files to CSDE for interactive analysis using the transdat alias.
- Monitor your jobs
- Once a job is submitted, it will show up in the job queue. It will move from eligible to active if a node is free to run it, and it will stay active until it completes, errors, or is terminated.
- To check the status of your jobs, type checkjob <job_id> or qstat -f <job_id>. You can also type showq -w user=<userid> to show the status of all your jobs. Add the option -n to see the jobs listed by job name instead of job ID. To see additional information on where the jobs are running, use the -r parameter. You can create an alias for displaying your job status with extra information: alias myq='showq -u <userid> -r'.
- To look at the status of all jobs, type showq. To see jobs in the backfill queue, type showq -w class=bf. To see the jobs in the queue for a specific group, type showq -w qos=<groupname>. There are additional options such as groupname-bf for backfill in the group, or groupname-int for interactive.
- To look at the status of the nodes in your group (e.g. CSDE), use nodestate <groupname>.
- To cancel a job, type qdel <job_id> or mjobctl -c <job_id>.
- To hold a job from running until you release it, type mjobctl -h <job_id>. To release a held job, type mjobctl -u <job_id>.
- To change a job class, for example if backfill jobs are taking a long time to initiate and you want them to run in batch mode, type: mjobctl -m class=batch -m account=csde -m qos=csde <job_id>. To change from batch to backfill, type: mjobctl -m class=bf -m account=csde-bf -m qos=csde-bf <job_id>. The job_id should correspond to what you see when you run showq.
- You can also create an alias for this: alias tobatch='mjobctl -m class=batch -m account=csde -m qos=csde' or alias tobf='mjobctl -m class=bf -m account=csde-bf -m qos=csde-bf'. Then, to execute the alias, type tobatch <job_id> or tobf <job_id>.
- If you are using array jobs, the job ID would look like this: job_id[#], with the number indicating the job in the array. To change the mode for all array jobs with a given master job ID, use: tobatch x:<jobid>.*
- Each Hyak group (e.g. CSDE) has a shared scratch directory under /gscratch/groupname, which is shared among all Hyak nodes. The CSDE directory is /gscratch/csde. Gscratch is not backed up. These directories have quotas based on the number of nodes owned by the group. To see the quota for CSDE, type mmlsquota -j hyak-csde gscratch --block-size G.
- Every Hyak user also has a private home directory. You can store ssh keys or login scripts here. Don't store large code files or do computation here, as this system is small and slow. These files are also not backed up. To check your home directory quota, type mmlsquota home --block-size G.
- There is also a local scratch directory on each node. These are accessible only to the local node, and are cleaned up after each job is complete. To keep data on the local scratch disk, create a directory with your group name, e.g. scr/csde.
- There is a long-term storage system called lolo archive. This is intended for storage of data that you aren't changing frequently, and it should never be used for any interactive access. It is also not intended for storing many small files: if you have small files, collect them into tar files before transferring to the archive. The archive quota limits you to 10,000 files/TB, which corresponds to an average file size of 100MB, but it performs best with files >1MB.
- CSDE has storage on lolo. To move files here, first create a directory with your name in the lolo folder: cd /lolo/archive/hyak/csde, then mkdir <your_username>. Then, from the directory of the file you want to move, type cp filename /lolo/archive/hyak/csde/<your_username>.
- Lastly, there is a lolo collaboration filesystem for sharing data with colleagues and staging data for later transfer to gscratch.
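The recommendation above to collect small files into tar files before moving them to the lolo archive might look like the following sketch. The function bundle_for_archive and the folder and file names are hypothetical; the demo works in a temporary directory so it can be tried anywhere.

```shell
#!/bin/bash
# Sketch: bundle a folder of small files into one compressed tar file
# before copying it to the archive. Names are illustrative only.
# Arguments: folder to bundle, path of the tar file to create.
bundle_for_archive() {
  local src_dir=$1
  local tar_file=$2
  # -C switches to the parent folder so the archive stores relative paths.
  tar -czf "$tar_file" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Demo with three small placeholder files in a temporary folder.
demo=$(mktemp -d)
mkdir "$demo/sim_output"
for i in 1 2 3; do
  echo "small file $i" > "$demo/sim_output/part_$i.txt"
done
bundle_for_archive "$demo/sim_output" "$demo/sim_output.tar.gz"
tar -tzf "$demo/sim_output.tar.gz"   # list the contents to verify
rm -rf "$demo"
```

The resulting single tar file is what you would then copy to your folder under /lolo/archive/hyak/csde with cp, so that the archive holds one larger file instead of many small ones.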