Getting Started

If you are unfamiliar with computing in a distributed memory environment, you will want to take a few steps to prepare yourself for cluster computing.

1. Logging into ICL systems

To use any ICL systems, you will first need to obtain an account by submitting an account application. Once you have received the verification email, you will be able to log into any of the available systems via SSH.
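For example, from a terminal on your local machine you would connect with the `ssh` command. The username and hostname below are placeholders; use the system name given in your account confirmation email:

```bash
# Replace <username> and <hostname> with your ICL account name and the
# name of the system you want to log into.
ssh <username>@<hostname>
```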

2. Using the Shell

Once logged into a system, you will be presented with a bash shell running under the Linux operating system. If you are unfamiliar with bash, you should read Introduction to using Bash (external source).
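As a quick warm-up, the commands below cover a few everyday operations that work on any Linux system:

```bash
pwd                 # print the current working directory
ls -l               # list files with details
mkdir projects      # create a new directory
cd projects         # change into it
cat /etc/os-release # print the contents of a file (here, the OS version)
man ls              # read the manual page for a command (press q to quit)
```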

3. Editing files

In order to create and modify files, you will need to familiarize yourself with a terminal-based text editor. Common choices on Linux systems include vim, emacs, and nano.
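Which editors are installed varies from system to system, but the following invocations are typical:

```bash
nano notes.txt       # simple, menu-driven editor
vim notes.txt        # modal editor: press i to insert text, Esc then :wq to save and quit
emacs -nw notes.txt  # run Emacs inside the terminal
```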

4. Loading Modules

Many of the software packages and development libraries you will want to use are accessed via the module system, which provides a convenient interface for managing the shell environment as needed to access each package. Modules are explained in more detail in Software Modules.
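The commands below show the standard Environment Modules / Lmod interface; the package name used here (gcc) is only illustrative, so run `module avail` to see what is actually installed on each system:

```bash
module avail        # list the software available through the module system
module load gcc     # add a package to your shell environment
module list         # show which modules are currently loaded
module unload gcc   # remove a package from your environment
module purge        # unload everything and start from a clean environment
```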

5. Where to Store Files

There are three classes of disk storage on the cluster: home directory, scratch, and project-associated mass storage. The most appropriate storage type varies from one task to another. To learn more about the different storage types, read Cluster Storage.
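For illustration only, the paths below are hypothetical; the actual locations, quotas, and backup policies are documented in Cluster Storage:

```bash
echo $HOME                # your home directory, typically for source code and configuration
du -sh $HOME              # check how much space you are currently using there
cd /scratch/$USER         # hypothetical scratch area for large, temporary working data
ls /project/<myproject>   # hypothetical project mass storage shared by a research group
```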

6. Installing Software

Most software is installed by compiling the code from source. You may request installation of packages; these requests will be fulfilled when the system maintainers deem the software in question to have a sufficiently wide user base. Software for personal use may be installed in your home directory.
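As a sketch of installing software for personal use, the typical autotools workflow installs under your home directory via a `--prefix`; the package name below is a placeholder:

```bash
# Unpack, configure, build, and install into $HOME instead of a system location.
tar xf mypackage-1.0.tar.gz
cd mypackage-1.0
./configure --prefix=$HOME/opt/mypackage
make -j 4
make install

# Make the installed binaries visible to your shell (add to ~/.bashrc to persist).
export PATH=$HOME/opt/mypackage/bin:$PATH
```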

7. Using the Batch System

Compute jobs may be executed interactively or via the Slurm job queue system. The job queue distributes your compute jobs to the worker nodes of the cluster. The clusters support both serial jobs (one instance of a program running on one CPU of the cluster) and parallel jobs (a program that runs on more than one CPU of the cluster). Parallel jobs can use the low-latency InfiniBand interconnect for inter-process message passing.
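As a minimal sketch, a Slurm batch script for a small MPI job might look like the following; the resource requests and the module name are illustrative, and site-specific details such as partitions and accounts are omitted:

```bash
#!/bin/bash
#SBATCH --job-name=hello        # name shown in the queue
#SBATCH --nodes=2               # number of worker nodes
#SBATCH --ntasks-per-node=4     # MPI ranks per node
#SBATCH --time=00:10:00         # wall-clock limit (HH:MM:SS)
#SBATCH --output=hello-%j.out   # output file (%j expands to the job ID)

# Load an MPI implementation (module name is a placeholder for whatever the system provides).
module load openmpi

# srun launches the program across all allocated CPUs.
srun ./hello_mpi
```

Submit the script with `sbatch`, monitor it with `squeue -u $USER`, and cancel it with `scancel <jobid>`. For interactive work, `salloc` requests an allocation and drops you into a shell from which you can run `srun` directly.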