Hive - cogcommscience-lab/lab-docs GitHub Wiki

The Hive cluster replaced the Peloton cluster. It is a modern cluster with many helpful features. This page will continue to be updated as our lab begins migrating from Peloton to Hive.

Where to Get Help

If you are stuck and need help, check out the HPC Documentation. This extensive resource can help you troubleshoot many issues you might encounter. These links are also useful:

How to Access Hive HPC

How to Make an Account on Hive

Any faculty member, grad student, or other researcher may sign up for an account on the new Hive computing cluster, which is managed by the High Performance Computing Core Facility (HPC CF) under the Office of Research. Hive has a generous free tier of service that is available to anyone. The free tier is managed through the publicgrp group.

Our department has purchased compute resources on Hive. To sign up for an account on Hive and access department compute resources, follow these steps:

  1. Visit https://hippo.ucdavis.edu/ and log in with your UC Davis credentials.
  2. Select the Hive cluster.
  3. Select rwhuskeygrp as your sponsor. This is the tier paid for by the Department.
    • You will need to make an SSH key. Follow these steps to make one.

You will receive an email once your account is provisioned.
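
The SSH key mentioned in step 3 can be generated with OpenSSH's ssh-keygen. The following is a minimal sketch, not the official procedure: the key type (ed25519), the file name id_ed25519_hive, and the comment string are assumptions chosen for illustration, so check the linked steps for what the sign-up form actually expects.

```shell
# Hypothetical key location; any file name under ~/.ssh would work.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_ed25519_hive}"

# ~/.ssh must exist and be private to your user.
mkdir -p "$(dirname "$KEY_FILE")"
chmod 700 "$(dirname "$KEY_FILE")"

# Create the key pair only if it does not exist yet.
# -N "" sets an empty passphrase so this sketch runs unattended;
# remove -N "" to be prompted for a passphrase instead.
if [ ! -f "$KEY_FILE" ]; then
  ssh-keygen -q -t ed25519 -C "hive-cluster-key" -f "$KEY_FILE" -N ""
fi

# Print the PUBLIC half of the key. Never share the private file.
cat "$KEY_FILE.pub"
```

The file ending in .pub is the public key, which is the part you would normally provide during sign-up. The file without the extension is the private key and should stay on your machine.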

IMPORTANT: If you already have an account on Hippo, but for a different research group (e.g., publicgrp), you will need to request access to another group on a cluster.

IMPORTANT: All files you store in rwhuskeygrp will be available to anyone with access to rwhuskeygrp. You probably want to restrict access. To do that, see below:

How to Make a Restricted Access Group on Hive

Let's say you don't want everyone in the department to have access to your data and/or code. Makes sense. You can create a restricted access group. IMPORTANT: Restricted access groups should be at the faculty member level. Graduate students should not have their own restricted access groups, but can be in the restricted access group of a faculty member.

To make a restricted access group on Hive, please follow these steps:

  1. Identify that a new user group is required. Contact Richard Huskey and provide him with the following details:
    • Your name
    • Your email
    • A name you want to give your group. NB: your group name must end in grp, e.g., rwhuskeygrp or ccsl-grp
  2. Richard will open a ticket with campus HPC [email protected] to request that your new group be created and appear in HIPPO. The request will include the following language: "Please make [this new PI] the only approver for this new group. Please allow this new group to access my Slurm resources."
  3. The new group is created, with you as the approver who grants access to the group. The group gets a new directory on Hive. Data in that directory are restricted to the group, and the directory can store as much data as the Department has purchased (currently 20 TB). Please do not use all 20 TB; this storage is shared across the entire department.
  4. If you want to grant someone access to the group (e.g., a graduate student advisee), that person will need to submit a request on HIPPO for the newly created group.
  5. You, as the group's approver, approve the request.
  6. The new user gets access to the new group.
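
Once step 6 is complete, membership can be checked from a shell on Hive. This sketch assumes that groups created through HIPPO appear as ordinary Unix groups on the cluster, which this page does not state explicitly. in_group is a hypothetical helper name, and ccsl-grp is only the example group name from step 1.

```shell
# in_group NAME: succeed if the current user belongs to the Unix group NAME.
in_group() {
  # id -Gn lists every group of the current user, separated by spaces.
  id -Gn | tr ' ' '\n' | grep -qxF -- "$1"
}

# Example group name from the steps above; replace with your own group.
GROUP_NAME="${GROUP_NAME:-ccsl-grp}"

if in_group "$GROUP_NAME"; then
  echo "You are a member of $GROUP_NAME"
else
  echo "Not a member of $GROUP_NAME yet (a fresh login may be needed after access is granted)"
fi
```

Unix group changes generally take effect at the next login, so log out and back in before concluding that access was not granted.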

How to Sign Into Hive

There are two primary ways to access Hive:

  1. Command Line: Connect via SSH to hive.hpc.ucdavis.edu and log in with your UC Davis credentials.
    • To log in from a terminal: $ ssh <username>@hive.hpc.ucdavis.edu
    • Be sure to replace <username> with your Kerberos name
    • This requires you to have set up an SSH key
  2. Open OnDemand* (GUI; web-based): Visit https://ondemand.hive.hpc.ucdavis.edu/ and log in with your UC Davis credentials.

*Open OnDemand is an exciting new service that makes many computational research tools, backed by a high-performance computing cluster, available directly in your web browser. Currently, you can access JupyterLab (Python environment), RStudio (R environment), and a standard Linux Desktop environment.
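
For the command-line route, an SSH client configuration entry can save retyping the host name and user name. The sketch below is one possible setup, not a required step: the alias hive, the placeholder user name, the key path, and the example config file name are all assumptions. It writes to a separate example file and only asks ssh to print how it would resolve the alias, so nothing connects to the cluster.

```shell
# Placeholder; replace with your Kerberos name.
HIVE_USER="${HIVE_USER:-your_kerberos_name}"
# Hypothetical example file, kept separate from your real ~/.ssh/config.
SSH_CFG="${SSH_CFG:-$HOME/.ssh/config_hive_example}"

mkdir -p "$(dirname "$SSH_CFG")"
cat > "$SSH_CFG" <<EOF
Host hive
    HostName hive.hpc.ucdavis.edu
    User $HIVE_USER
    # Hypothetical key path; point this at your own private key.
    IdentityFile ~/.ssh/id_ed25519_hive
EOF
chmod 600 "$SSH_CFG"

# -G prints the resolved configuration for the alias without connecting.
ssh -F "$SSH_CFG" -G hive | grep -E '^(hostname|user) '
```

If the resolved values look right, the Host block can be copied into ~/.ssh/config, after which ssh hive is equivalent to the full ssh <username>@hive.hpc.ucdavis.edu command.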

Where do your data live on Hive?

Data live in two key areas on Hive:

  • All users get 20 GB of free storage in their home directory. This is located at /home/username/
  • Our department has 20 TB of storage, which can be accessed at your group directory, e.g., /quobyte/rwhuskeygrp/
    • Please do not use all 20 TB; this is shared across the entire department.
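
Because the department storage is shared, it can help to check how much space your own files take up. This is a minimal sketch using the standard du utility. report_usage is a hypothetical helper name, and the per-user subdirectory under the group directory is an assumption about how you organize your files, not a layout that Hive creates for you.

```shell
# report_usage DIR: print the total size of DIR in human-readable units,
# or a short message if DIR does not exist.
report_usage() {
  if [ -d "$1" ]; then
    # Errors from unreadable subdirectories are silenced.
    du -sh "$1" 2>/dev/null
  else
    echo "not found: $1"
  fi
}

GROUP_DIR="${GROUP_DIR:-/quobyte/rwhuskeygrp}"

report_usage "$HOME"              # home directory
report_usage "$GROUP_DIR/$USER"   # assumed personal folder in the group directory
```

Running du over a large directory tree can take a while, so it is better to point it at your own subdirectory than at the whole group directory.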

How to restrict access to your data on Hive

In most cases, the best way to restrict access to your files is by making a restricted access group. But, in some cases, you may want to store files in rwhuskeygrp. Notably, files in rwhuskeygrp are accessible to the entire department. This means any user with access to rwhuskeygrp has access to your data and code unless the necessary steps are taken. To restrict access to new directories you create within rwhuskeygrp (and the data within those directories), you will need to chmod 700 that directory.
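
The chmod 700 step can be sketched as follows. The directory name my_private_project is hypothetical; on Hive you would create it inside /quobyte/rwhuskeygrp/ rather than in the current directory.

```shell
# Hypothetical directory; on Hive use a path under /quobyte/rwhuskeygrp/.
PRIVATE_DIR="${PRIVATE_DIR:-./my_private_project}"

mkdir -p "$PRIVATE_DIR"

# 7 = read/write/execute for the owner, 0 = nothing for the group,
# 0 = nothing for everyone else.
chmod 700 "$PRIVATE_DIR"

# Verify: the permissions column should begin with drwx------
ls -ld "$PRIVATE_DIR"
```

Other users need execute permission on a directory to reach anything inside it, so removing group and other permissions on the directory also blocks access to the files it contains, whatever their individual permissions are.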

IMPORTANT: chmod is a command that is executed in a Unix shell such as bash. If you are not familiar with bash or the Unix shell, then you will want to take this two-hour Unix Shell training from Software Carpentry. Richard has also compiled a shell basics cheat-sheet that you can reference. Finally, chmod changes file and directory permissions. If you don't know what those are, check out Linux file permissions explained.

Slurm and job scheduling

Want to schedule jobs using Slurm and the command line? To do that, you will need to add your group, e.g., --account=rwhuskeygrp to your srun/sbatch commands. Not sure how to use Slurm? Check out this handy guide.
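
A batch script that carries the --account setting might look like the sketch below. Only --account=rwhuskeygrp comes from this page. The job name, resource requests, time limit, and file names are illustrative assumptions, and no partition is specified because this page does not name one. The script is checked with sbatch --test-only, which validates the request without actually submitting a job.

```shell
# Hypothetical file name for the batch script.
JOB_SCRIPT="${JOB_SCRIPT:-./hello_hive.sh}"

# Write a small batch script. The quoted 'EOF' keeps $(hostname)
# from being expanded until the job actually runs.
cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/bash
#SBATCH --account=rwhuskeygrp
#SBATCH --job-name=hello_hive
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=00:05:00
#SBATCH --output=hello_hive_%j.out

echo "Running on $(hostname)"
EOF

# Validate on a machine that has Slurm; otherwise just say so.
if command -v sbatch >/dev/null 2>&1; then
  sbatch --test-only "$JOB_SCRIPT"
else
  echo "sbatch not found; run this on a Hive login node"
fi
```

Dropping --test-only submits the job for real. The same account flag works on the command line, for example srun --account=rwhuskeygrp followed by the command you want to run.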

How many resources are available?

The Department of Communication purchased 48 compute cores and 768 GB of RAM. This means that, as a department, we have priority access when requesting these resources.
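
One way to read these numbers when sizing a job is to work out the memory available per core if the purchased RAM were split evenly. The even split is an assumption for illustration only; this page does not say that Hive enforces such a ratio. mem_per_core is a hypothetical helper name.

```shell
# mem_per_core TOTAL_GB CORES: integer GB of RAM per core under an even split.
mem_per_core() {
  echo $(( $1 / $2 ))
}

TOTAL_RAM_GB=768   # purchased RAM, from the text above
TOTAL_CORES=48     # purchased cores, from the text above

echo "Even split: $(mem_per_core "$TOTAL_RAM_GB" "$TOTAL_CORES") GB per core"
# prints: Even split: 16 GB per core

# Memory that an 8-core job would get at the same ratio.
echo "8 cores: $(( 8 * $(mem_per_core "$TOTAL_RAM_GB" "$TOTAL_CORES") )) GB"
# prints: 8 cores: 128 GB
```

Requesting roughly this ratio of memory to cores leaves room for other department members to use the remaining cores at the same time.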
