The High Profile Computer at UNBC - Michael-D-Preston/PrestonLab GitHub Wiki
Introduction
Hi! Genome Quebec gave us a bunch of data and my baby little computer isn't cool enough to process it all in a meaningful amount of time, so I (we) need something a bit bigger...
quick reference to how I'll format this document
- This is the step
This is the code you'll run, note the copy button --->
This is the output
- Step 2
- These are bonus facts that I want to say,
- or sub steps
Getting HPC access
You'll need access to UNBC's high performance computing (HPC) cluster. Go here and request access.
Getting into the HPC
First and foremost, be on the UNBC wifi. I know it sucks, but none of this will work unless you are. What if you're not at the university, you say!? Well, load up a UNBC desktop on a VPN and then follow along.
Download PuTTY on your computer. This is a program that lets us talk to the HPC. In PuTTY, under Host Name, enter 'username'@klinaklini.unbc.ca and click Open.
How familiar are you with the nervous system of a cephalopod? Octopi have 9 brains: a central one and then one for each arm. You have just entered the central brain, ['name'@klinaklini]$. It's bad form to import files, download stuff, or run commands in the central brain, so we want to go to one of the arm brains. Type in the command prompt:
ssh compute1
['name'@compute1]$
to go to arm brain 1. (There are 16 compute nodes total.) Sweet! What if you don't want to be in compute1?
exit
And if you want to leave the HPC, again just type
exit
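If you lose track of which brain you're in, the machine name tells you. Here's a minimal sketch of that check. `which_brain` is a name I made up for illustration, and it assumes the hostnames start with "klinaklini" and "compute" the way the prompts above suggest.

```shell
#!/usr/bin/env bash
# which_brain: hypothetical helper that maps a hostname to the octopus analogy.
# Assumes hostnames start with "klinaklini" or "compute", as the prompts suggest.
which_brain() {
  case "$1" in
    klinaklini*) echo "central brain (login node): don't run stuff here" ;;
    compute*)    echo "arm brain (compute node): good to go" ;;
    *)           echo "not on the HPC" ;;
  esac
}

# uname -n prints the name of the machine you are currently logged into.
which_brain "$(uname -n)"

# The same check on names taken from the prompts above:
which_brain klinaklini   # prints: central brain (login node): don't run stuff here
which_brain compute1     # prints: arm brain (compute node): good to go
```

Honestly, just reading the prompt works too. This is only handy if you want a script to refuse to run in the central brain.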
Depending on who's online with you, different compute nodes will be busy. Type
sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
defq* up infinite 1 down* compute3
defq* up infinite 15 idle compute[1-2,4-16]
to see if any of the compute nodes are busy (and then don't go to that one). Right now compute3 is down and the other 15 are idle for me.
So I will go to compute1 for the next couple things.
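If you don't feel like eyeballing the sinfo table, here's a minimal sketch that pulls out just the idle nodes. `idle_nodes` is a made-up helper name (it is not part of Slurm), and it assumes the default six-column layout shown in the output above.

```shell
#!/usr/bin/env bash
# idle_nodes: hypothetical helper (not part of Slurm) that reads sinfo-style
# text on stdin and prints the NODELIST of every row whose STATE is "idle".
# Assumes the default columns: PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
idle_nodes() {
  awk 'NR > 1 && $5 == "idle" { print $6 }'
}

# Demo on the output shown above, so this runs even where sinfo is missing.
sample='PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
defq* up infinite 1 down* compute3
defq* up infinite 15 idle compute[1-2,4-16]'

printf '%s\n' "$sample" | idle_nodes   # prints: compute[1-2,4-16]

# On the HPC itself you would pipe the real thing in instead:
#   sinfo | idle_nodes
```

Anything that shows up with a state other than idle (down, alloc, mix, and so on) is either broken or being used by somebody, so pick from the idle list.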
Penguins and moving around
You are on a Linux cluster. It's a different operating system, like Windows or Mac, and you only get command line functionality. I'm gonna assume you know the common Linux commands to move around and create files, but if not, here you go, and y'know what, just do this tutorial. Honestly, this tutorial is really good too and has a lot of in-depth commands.
So where are you? (this prints the current working directory)
pwd
/home/'name'
You are in the home directory, and just like being in the central brain, you don't want to store or do stuff here. Go to your research folder! (cd 'directory' moves you to a new directory, uhh go do a quick bash intro class to find out how to effectively use this command)
cd /data/researchHome/'name'/
any cool files in here? (this lists files)
ls
:(
Nothing! ugh
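If you want to practice moving around before touching real data, here's a tiny sandbox sketch. The folder and file names are made up, and everything happens in a throwaway temp directory, not your research folder.

```shell
#!/usr/bin/env bash
# Practice sandbox: everything happens in a throwaway temp directory.
# The folder and file names below are made up for illustration.
sandbox=$(mktemp -d)

cd "$sandbox"                      # move into the sandbox
mkdir -p project/raw_data          # make a folder and a subfolder in one go
touch project/raw_data/reads.txt   # create an empty file

cd project/raw_data                # relative path: go down two levels
pwd                                # prints a path ending in /project/raw_data
ls                                 # prints: reads.txt

cd ../..                           # .. means "up one level", so this goes up two
ls                                 # prints: project
```

When you're done playing, `rm -rf "$sandbox"` deletes the whole thing.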
Side point: normal copy and paste won't work in the command prompt. Instead, you copy normally, but right click to paste. Also, highlighting text in PuTTY copies it.
How to import files into your HPC folder so you can do cool stuff
You can download WinSCP to move files into the HPC. I haven't done this so idk how it works. Or you can use Windows' built-in functionality: open Windows folders (File Explorer), go to "This PC", click the three dots, then click "Map network drive", and in the Folder box write
\\research-files.unbc.ca\researchHome\'username'
Click Finish. Now you should be able to put files here and then see them in your research space!
On a Mac? (boo) Follow the directions under "mount 'researchHome' locally" here.
Running R
Finally time to run R, and know that we are getting old school command line R, not any cringe RStudio kinda stuff. To run R just... type "R"
R
output:
R version 4.4.1 (2024-06-14) -- "Race for Your Life"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
And you're in! You can now act like this is R (because it is!) I'll see you next episode when I figure out how to run R and the DADA2 pipeline!
PS. Really do a quick runthrough of bash commands. If you wanna see how long your process has been running or how many CPUs it's using, feel free to do:
top
output:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
169285 aball 20 0 20080 4608 3456 R 0.7 0.0 0:00.05 top
1 root 20 0 171308 13452 9216 S 0.0 0.0 0:37.00 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:06.00 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
5 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 slub_flushwq
PID is the process ID
USER is uh you
PR is the scheduling priority, NI is the nice value, VIRT, RES, and SHR are virtual, resident, and shared memory, and S is the process state (R = running, S = sleeping, I = idle). %CPU is how many CPUs you're using! 100% = 1 CPU, 1500% is 15 CPUs!!
%MEM is memory usage
TIME+ is how much CPU time your program has used so far (not wall-clock time), in minutes:seconds.hundredths
COMMAND is what you're running, you'll probably see R
Note: if you want to see R here, you have to run the top command in bash WHILE you're running R. HOW? Easy, just load up another instance of PuTTY, log in like normal, go to the same compute node, and see!
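Since top counts 100% per CPU, turning a %CPU reading into a CPU count is just a division by 100. Here's a minimal sketch of that arithmetic. `pct_to_cpus` is a made-up helper name, and the commented-out commands at the bottom assume the usual Linux (procps) versions of top and ps.

```shell
#!/usr/bin/env bash
# pct_to_cpus: hypothetical helper that converts a top-style %CPU value
# into the number of CPUs in use (100% = 1 CPU).
pct_to_cpus() {
  LC_ALL=C awk -v pct="$1" 'BEGIN { printf "%.2f\n", pct / 100 }'
}

pct_to_cpus 100    # prints: 1.00
pct_to_cpus 250    # prints: 2.50
pct_to_cpus 1500   # prints: 15.00

# To look at only your own processes on the HPC (assuming Linux top and ps):
#   top -u "$(id -un)"
#   ps -u "$(id -un)" -o pid,pcpu,pmem,etime,time,comm
# In the ps version, etime is wall-clock time since the process started,
# while time is the CPU time it has used.
```

So if R shows up at 800%, `pct_to_cpus 800` says it is keeping 8 CPUs busy.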
How to download packages
Unfinished
On the main compute node: module load r/4.3.0, then R, then download (see service request ticket)