Getting Started
If you are unfamiliar with computing in a distributed memory environment, you will want to take a few steps to prepare yourself for cluster computing.
1. Logging into ICL systems
To use any ICL systems, you will first need to obtain an account by submitting an account application. Once you have received the verification email, you will be able to log into any of the available systems via SSH.
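Once your account is active, a login looks like the following minimal sketch. The hostname below is only a placeholder; use the hostname and username given in your verification email.

```bash
# Connect to a login node over SSH (placeholder hostname -- substitute
# the one provided with your account).
ssh yourusername@login.example.icl.utk.edu

# Optionally copy an SSH public key to the system so that future logins
# do not require a password.
ssh-copy-id yourusername@login.example.icl.utk.edu
```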
2. Using the Shell
Once logged into a system, you will be presented with a bash shell running under the Linux operating system. If you are unfamiliar with bash, you should read Introduction to using Bash (external source).
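If bash is new to you, a handful of standard commands will cover most day-to-day navigation. These are common Linux utilities, not anything specific to the ICL clusters:

```bash
pwd                  # print the current working directory
ls -l                # list files in the current directory with details
mkdir -p projects    # create a directory (no error if it already exists)
cd projects          # change into that directory
echo $HOME           # show the path of your home directory
man ls               # read the manual page for a command (press q to quit)
```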
3. Editing files
To create and modify files, you will need to familiarize yourself with one or more of the following text editors (a brief usage sketch follows the list):
- nano: a simple terminal mode text editor
- vim: a text editor for programmers with advanced regular expression and automated text editing capabilities
- emacs: a popular macro-based text editor with terminal and X11 graphical modes
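Each editor is invoked with the name of the file to open or create; the filename below is just an example. nano shows its key bindings at the bottom of the screen, vim is exited with `:q`, and emacs with `C-x C-c`.

```bash
nano notes.txt         # simple terminal editor with on-screen key hints
vim notes.txt          # modal editor; press i to insert text, Esc then :wq to save and quit
emacs -nw notes.txt    # -nw keeps emacs in the terminal instead of opening an X11 window
```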
4. Loading Modules
Many of the software packages and development libraries you will want to use are accessed via the module system, which provides a convenient interface for managing the shell environment as needed to access each package. Modules are explained in more detail in Software Modules.
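A typical module workflow is sketched below. The module names are placeholders; run `module avail` to see what is actually installed on the system you are using.

```bash
module avail              # list all available modules
module load gcc           # load a compiler module (name is a placeholder)
module load openmpi       # load an MPI implementation (name is a placeholder)
module list               # show the modules currently loaded in your shell
module unload openmpi     # remove one module from the environment
module purge              # remove all loaded modules
```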
5. Where to Store Files
There are three classes of disk storage on the cluster: home directory, scratch, and project-associated mass storage. Which type of storage is most appropriate varies from one task to another. To learn more about the different storage types, read Cluster Storage.
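A few generic commands can help you see how much space you are using; the scratch path shown here is a placeholder, so consult Cluster Storage for the real locations and quotas.

```bash
quota -s                  # report any filesystem quotas applied to your account
du -sh $HOME              # total size of your home directory
df -h /path/to/scratch    # free space on the scratch filesystem (placeholder path)
```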
6. Installing Software
Most software is installed by compiling the code from source. You may request installation of packages; these requests will be fulfilled when the system maintainers deem the software in question to have a sufficiently wide user base. Software for personal use may be installed in your home directory.
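For personal installs into your home directory, the usual pattern for an autotools-based package looks like the sketch below. The package name and the `~/local` prefix are only examples.

```bash
tar xf somepackage-1.0.tar.gz
cd somepackage-1.0
./configure --prefix=$HOME/local   # install under ~/local instead of system directories
make -j4                           # build using 4 parallel compile jobs
make install

# Make the installed binaries visible to the shell; add this line to
# ~/.bashrc to make it permanent.
export PATH=$HOME/local/bin:$PATH
```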
7. Using the Batch System
Compute jobs may be executed interactively or via the Slurm job queue system. The job queue distributes your compute jobs to the worker nodes of the cluster. The clusters support both serial jobs (one instance of a program running on a single CPU of the cluster) and parallel jobs (a program that runs on more than one CPU of the cluster). Parallel jobs can use the low-latency InfiniBand interconnect for inter-process message passing.
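A minimal Slurm batch script for a parallel job is sketched below. The program name is a placeholder, and any partition or account options are site-specific, so check the cluster documentation (or `sinfo`) for the values to add.

```bash
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello_mpi
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

srun ./my_mpi_program      # srun launches one MPI rank per requested task
EOF

sbatch job.sh              # submit the job to the queue
squeue -u $USER            # check the status of your queued and running jobs
```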