Tutorials ‐ R Studio Server - uwsph/hpcusers GitHub Wiki
This quick tutorial aims to provide an overview of the complete process, from connecting to the cluster to having an application running on it. For this tutorial, we'll be using R Studio Server, as it's a popular application among our users.
Requirements
For this tutorial you'll need the following:
- An account on the SPH HPC environment
- An up to date version of Chrome, Edge, or Firefox
- Husky OnNet VPN (Big IP Edge Client)
Connecting
- If it is not already installed, you'll need the Husky OnNet VPN client. On SPH-provided computers, it should already be installed. On non-SPH systems, you'll need to download the installer from UW-IT.
- TODO add VPN setup....
Once connected to the VPN, open https://ondemand.hpc.sph.washington.edu/ in your browser. If you have an existing UW Weblogin session, you should be logged in automatically.
Transferring Data
For this tutorial, we won't go in depth on transferring data to and from the cluster. You have several options for moving your data, and the best one largely depends on where the data currently lives. In most cases, you'll want to use SFTP. If you only have a couple of files, you can use the file manager within OnDemand instead.
Starting an Interactive Application
When working on a computing cluster, it's important to consider the size of your dataset, the operations you'll be performing on it, and the number of computing cores needed. Unlike on many other systems, you have to ask the scheduler (Slurm) for an allocation of resources. The scheduler uses that allocation request to find a computing node with enough available capacity to meet your needs. For this demonstration, we'll assume the dataset is about 4 GB in size, that we'll be generating multiple subsets of the data (which increases the memory required), and that our needs are limited to a single core of computing power.
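The reasoning above can be turned into a quick back-of-the-envelope calculation. The sketch below is only illustrative: the variable names are made up for this example, and `subset_factor` is an assumption (that all subsets together take up roughly as much memory as one extra copy of the data), not a measured value.

```r
# Rough memory estimate for the scheduler request (illustrative numbers only).
dataset_gib <- 4      # approximate in-memory size of the dataset, from the text
subset_factor <- 1    # ASSUMPTION: all subsets together roughly equal one extra copy

# Total working set: the original data plus the subsets derived from it.
estimated_gib <- dataset_gib * (1 + subset_factor)

# Round up to a whole number, since the request is made in whole GiB here.
memory_request_gib <- ceiling(estimated_gib)
print(memory_request_gib)
```

With these assumptions the estimate comes to 8 GiB. If your own analysis keeps more copies of the data in memory at once, raise `subset_factor` accordingly.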
- From the OnDemand dashboard, locate the "Interactive Apps" menu, and select "RStudio Server"
- You'll be presented with a form asking for container and resource information. For this tutorial, use:
Setting | Value | Meaning |
---|---|---|
Apptainer Image | /projects/demo/apptainer/rocker_geospatial_4.4.1.sif | Container image containing R, R Studio, and various packages |
Number of cores | 2 | Total number of CPU cores your job can access |
Memory [GiB] | 8 | Total amount of RAM your job can utilize |
Number of hours | 2 | Maximum runtime. Your job will stop automatically when this is reached. |
Partition to run in | 12c128g | The hardware type for your job (you may need to change this based on what you have access to) |
Account to run job under | sph | This is the account that sponsors your cluster access (you may need to change this) |
- Click the "Launch" button to request a session.
- You should see a list of jobs, with yours at the top. When it turns green, you should have a "Connect to RStudio Server" button. Click it.
- A new browser tab should open, taking you to a familiar R Studio workspace.
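Once the workspace is open, you may want to confirm what the scheduler actually allocated. The sketch below reads a few standard Slurm environment variables from the R console. It assumes the R Studio session inherits the job's environment, which depends on how the application is configured on this cluster. If it does not, the values will simply show as `NA`.

```r
# Slurm sets these variables for a running job. ASSUMPTION: the R Studio
# session inherits them; if it does not, each value is reported as NA.
slurm_vars <- c("SLURM_JOB_ID", "SLURM_CPUS_ON_NODE", "SLURM_MEM_PER_NODE")

# unset = NA makes a missing variable show up as NA instead of an empty string.
allocation <- Sys.getenv(slurm_vars, unset = NA)
print(allocation)

# Note: parallel::detectCores() reports the cores on the whole node,
# not the number of cores allocated to your job.
```

When it is set, `SLURM_MEM_PER_NODE` is reported in megabytes, so a request of 8 GiB would appear as 8192.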
Installing R Packages
The R Studio container has a large number of commonly requested R packages. In this case, the container has a selection of packages from the Tidyverse, as well as common geospatial packages. However, you're not limited to the packages built into the container. You can easily install new ones, if they don't require additional system libraries. Let's try installing the package "here".
- An R console should appear in the lower, or possibly left, portion of the screen.
- At the console, enter the following:
install.packages('here')
- If you receive a prompt to select a mirror, choose "0-Cloud" (it'll be near the top of the list).
- As the install runs, you should see something like the following:
> install.packages('here')
Installing package into ‘/home/users/jtyocum/R/x86_64-pc-linux-gnu-library/4.4’
(as ‘lib’ is unspecified)
trying URL 'https://p3m.dev/cran/__linux__/jammy/latest/src/contrib/here_1.0.1.tar.gz'
Content type 'binary/octet-stream' length 53480 bytes (52 KB)
==================================================
downloaded 52 KB
* installing *binary* package ‘here’ ...
* DONE (here)
The downloaded source packages are in
‘/tmp/RtmpHqPOOW/downloaded_packages’
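To confirm the install worked, you can check from the same console that R can find the package. This is a minimal sketch using base R functions and the `here` package itself. The paths it prints will differ for your account.

```r
# Check that the package can be loaded, without attaching it to the session.
here_installed <- requireNamespace("here", quietly = TRUE)
print(here_installed)

# Show the library search path. The personal library named in the install
# message should appear in this list.
print(.libPaths())

if (here_installed) {
  # here::here() returns the project root directory the package detected.
  print(here::here())
}
```

Packages installed this way go into your personal library under your home directory rather than into the container, as the install message above shows.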
Ending Your Session
Once you're done, it's a good idea to properly exit your session. This frees up the computing resources immediately, allowing others to use them.
- In the upper right corner, click on the red button. When you mouse over it, you'll see a message about ending your session.
- Your browser should have returned to the OnDemand tab, titled "My Interactive Sessions". If it didn't, please go to that tab.
- Click on the red "Delete" button for the R Studio session. This will terminate the session, and free up the resources.