Hive - cogcommscience-lab/lab-docs GitHub Wiki
The Hive cluster replaced the Peloton cluster. It is a modern cluster with many helpful features. This page will continue to be updated as our lab begins migrating from Peloton to Hive.
If you are stuck and need help, check out the HPC Documentation. This extensive resource can help you troubleshoot many issues you might encounter. These links are also useful:
- For common issues and questions, see the HPC FAQ
- To learn about Linux commands and scripts, see the HPC Help Docs
- For Data Science Training tutorials, see this GitHub repo
- Need more help? Open a ticket by emailing [email protected]
Any faculty member, grad student, or other researcher may sign up for an account on the new Hive computing cluster, which is managed by the High Performance Computing Core Facility (HPC CF) under the Office of Research. Hive has a generous free tier of service that is available to anyone. The free tier is managed through the publicgrp group.
Our department has purchased compute resources on Hive. To sign up for an account on Hive and access department compute resources, follow these steps:
- Visit Hippo (https://hippo.ucdavis.edu/) and log in with your UC Davis credentials.
- Select the Hive cluster.
- Select rwhuskeygrp as your sponsor. This is the tier paid for by the Department.
- You will need to make an SSH key. Follow these steps to make one.
You will receive an email once your account is provisioned.
IMPORTANT: If you already have an account on Hippo, but for a different research group (e.g., publicgrp), you will need to request access to another group on a cluster.
IMPORTANT: All files you store in rwhuskeygrp will be available to anyone with access to rwhuskeygrp. You probably want to restrict access. To do that, see below:
Let's say you don't want everyone in the department to have access to your data and/or code. Makes sense. You can create a restricted access group. IMPORTANT: Restricted access groups should be at the faculty member level. Graduate students should not have their own restricted access groups, but can be in the restricted access group of a faculty member.
To make a restricted access group on Hive, please follow these steps:
- Identify that a new user group is required. Contact Richard Huskey and provide him with the following details:
  - Your name
  - Your email
  - A name you want to give your group. NB: your group name must end in grp, e.g., rwhuskeygrp or ccsl-grp
- Richard will open a ticket with campus HPC ([email protected]) to request that your new group be created and appear in Hippo. The request will include the following language: "Please make [this new PI] the only approver for this new group. Please allow this new group to access my Slurm resources."
- The new group is created, with you as the approver who grants access to it. The group gets a new directory on Hive whose contents are restricted to group members. The directory can store as much data as the Department has purchased (currently 20 TB). Please do not use all 20 TB; this storage is shared across the entire department.
- To grant someone access to the group (e.g., a graduate student advisee), that person must submit a request on Hippo for the newly created group:
  - New users: Follow the How to Make an Account on Hive steps above, but at Step 3 select the newly created group.
  - Users who already have an account on Hive: Follow these steps to request access to another group on a cluster.
- Access is approved.
- New user gets access to new group.
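Once access is approved, one way to confirm that the new group is active on your account is to list your groups from a Hive login shell. The sketch below uses the standard `id` command. The `in_group` helper is a hypothetical name introduced here for illustration, and `rwhuskeygrp` stands in for whichever group you requested.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the current user belongs to the named group.
in_group() {
    id -Gn | tr ' ' '\n' | grep -qx -- "$1"
}

# Replace rwhuskeygrp with the group you requested access to.
target_group="${1:-rwhuskeygrp}"

if in_group "$target_group"; then
    echo "You are a member of $target_group"
else
    echo "You are not (yet) a member of $target_group"
fi
```

Group membership is usually read at login, so if the group does not show up right after approval, logging out and back in may help.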
There are two primary ways to access Hive:
- Command Line: Log in via SSH to hive.hpc.ucdavis.edu with your UC Davis (Kerberos) username and SSH key.
  - To log in from a terminal:
    $ ssh <username>@hive.hpc.ucdavis.edu
  - Be sure to replace <username> with your Kerberos name.
  - This requires you to have set up an SSH key.
- Open OnDemand* (GUI; web-based): Visit https://ondemand.hive.hpc.ucdavis.edu/ and log in with your UC Davis credentials.
*Open OnDemand is an exciting new service that makes many computational research tools, backed by a high-performance computing cluster, available directly in your web browser. Currently, you can access JupyterLab (Python environment), RStudio (R environment), and a standard Linux Desktop environment.
Data live in two key areas on Hive:
* All users get 20 GB of free storage in their home directory, located at /home/username/
* Our department has 20 TB of storage, which can be accessed through your group directory, e.g., /quobyte/rwhuskeygrp/
  * Please do not use all 20 TB; this storage is shared across the entire department.
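Because the group storage is shared, it can help to check how much space a directory uses before writing large outputs. This is a minimal sketch using the standard `du` and `df` tools. The `TARGET_DIR` variable is an assumption made for this example; on Hive you might point it at your home directory or at a folder under /quobyte/rwhuskeygrp/.

```shell
#!/bin/bash
# Directory to inspect. Defaults to the current directory so the sketch
# runs anywhere; set TARGET_DIR to a path on Hive to inspect that instead.
target_dir="${TARGET_DIR:-.}"

# Total size of everything under the directory, in human-readable units.
usage="$(du -sh "$target_dir" | cut -f1)"
echo "Usage under $target_dir: $usage"

# Free space on the filesystem that holds the directory.
df -h "$target_dir"
```

On a shared filesystem, the numbers `df` reports may describe the whole volume rather than the department's allocation, so treat the `du` figure as the more relevant one for your own footprint.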
In most cases, the best way to restrict access to your files is to make a restricted access group. In some cases, though, you may want to store files in rwhuskeygrp. Files in rwhuskeygrp are accessible to the entire department, which means any user with access to rwhuskeygrp can read your data and code unless you take the necessary steps. To restrict access to a new directory you create within rwhuskeygrp (and the data within it), run chmod 700 on that directory.
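The chmod 700 step can be sketched as follows. The `GROUP_DIR` variable and the `my_project` directory name are hypothetical and exist only for this example. On Hive you would set `GROUP_DIR` to /quobyte/rwhuskeygrp; here it falls back to a temporary directory so the commands can be tried safely anywhere.

```shell
#!/bin/bash
# Base directory. On Hive: GROUP_DIR=/quobyte/rwhuskeygrp
# Falls back to a throwaway temp directory so the sketch runs anywhere.
group_dir="${GROUP_DIR:-$(mktemp -d)}"

# my_project is a hypothetical directory name.
project_dir="$group_dir/my_project"
mkdir -p "$project_dir"

# 700 = read/write/execute for the owner, no access for group or others.
chmod 700 "$project_dir"

# Show the resulting permissions; the mode string should begin drwx------
ls -ld "$project_dir"
```

With mode 700 only the directory's owner can enter it, so other members of rwhuskeygrp cannot reach the files inside. The same step is needed for each new top-level directory you create.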
IMPORTANT: chmod is a command that is run in a Unix shell such as bash. If you are not familiar with bash or the Unix shell, you will want to take this two-hour Unix Shell training from Software Carpentry. Richard has also compiled a shell basics cheat-sheet that you can reference. Finally, chmod changes file and directory permissions. If you don't know what those are, check out Linux file permissions explained.
Want to schedule jobs using Slurm and the command line? To do that, you will need to add your group, e.g., --account=rwhuskeygrp, to your srun/sbatch commands. Not sure how to use Slurm? Check out this handy guide.
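As an illustration of where the account flag goes, here is a minimal batch script sketch. The job name, resource amounts, time limit, and output filename are placeholder assumptions, not lab recommendations. Hive may also require a partition (`--partition`), which is left out because this page does not name one.

```shell
#!/bin/bash
# Hypothetical job name.
#SBATCH --job-name=hello-hive
# Group whose resources the job is charged to.
#SBATCH --account=rwhuskeygrp
# Placeholder resource requests; adjust to your job.
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=00:05:00
# %j expands to the Slurm job ID.
#SBATCH --output=hello-hive-%j.out

# SLURM_JOB_ID is set by Slurm; the fallback lets the script run outside a job.
msg="Job ${SLURM_JOB_ID:-not-under-slurm} running on $(uname -n)"
echo "$msg"
```

If the script is saved as hello_hive.sh (a hypothetical filename), it could be submitted with `sbatch hello_hive.sh`. For interactive work, the same flag applies to srun, for example `srun --account=rwhuskeygrp --pty bash`.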
The Department of Communication purchased 48 compute cores and 768 GB of RAM. This means that, as a department, we have priority access when requesting these resources.
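For a sense of scale, the purchased totals work out to a fixed amount of memory per core. The arithmetic below is only a rule of thumb for sizing requests so that one job does not crowd out others. It is not a stated department or HPC policy, and the 4-core job is a hypothetical example.

```shell
#!/bin/bash
# Department-wide totals from the text above.
total_cores=48
total_mem_gb=768

# Memory per core if the pool were divided evenly.
mem_per_core_gb=$(( total_mem_gb / total_cores ))
echo "Even split: ${mem_per_core_gb} GB per core"

# Hypothetical request: a 4-core job asking for its proportional share.
job_cores=4
job_mem_gb=$(( job_cores * mem_per_core_gb ))
echo "A ${job_cores}-core job would proportionally use ${job_mem_gb} GB"
```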