Home - umccr/aws_parallel_cluster GitHub Wiki
UMCCR's AWS Parallel Cluster
Welcome to the AWS Parallel Cluster wiki!
- Getting started
- Running parallel cluster
- Staging Data
- Partitions Overview
- Using slurm
- Using toil
- Using cromwell
- Installing new software on the cluster
- Walkthroughs
- Troubleshooting
- Development
- Contributions
- Useful links
Getting started
You will need to go to the releases page to download the latest version.
Head to the installation page for more information on prerequisites and installing parallel cluster.
Running parallel cluster
Activate your conda env
conda activate pcluster
Ensure you're logged into aws
aws sts get-caller-identity
Check the account value is as expected.
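The account check above can also be scripted. The following is a minimal sketch, not part of this repo: `check_account` is a hypothetical helper name, and the account ID in the usage comment is a placeholder. It relies only on the AWS CLI's global `--query` and `--output` options to extract the account ID.

```shell
# check_account EXPECTED ACTUAL
# Hypothetical helper: succeeds only when the two account IDs match.
check_account() {
    if [ "$2" = "$1" ]; then
        echo "OK: logged into account $2"
    else
        echo "MISMATCH: expected $1 but got $2" >&2
        return 1
    fi
}

# Usage (123456789012 is a placeholder, not a real account):
#   check_account 123456789012 \
#     "$(aws sts get-caller-identity --query Account --output text)"
```

A non-zero return code makes it easy to stop a wrapper script before starting a cluster in the wrong account.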
Start your cluster
Use the --no-rollback flag to ensure you can debug any issues with your first cluster.
start_cluster.py \
--cluster-name=my-first-cluster \
--file-system-type=efs \
--no-rollback
This may take around 20 minutes to complete.
Head to our parameter options page for more information.
Staging Data
Unlike in an HPC environment, your input data is likely not available on disk.
In most cases you will need to 'stage' your data so that it is accessible by all nodes.
Head to the shared file system page for more information on data staging.
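As a rough illustration of what staging might look like, the sketch below copies an S3 prefix onto a shared directory with `aws s3 sync`. This is an assumption about a typical workflow, not the method documented on the shared file system page: `stage_from_s3`, the bucket name, and the `/efs/inputs` path are all hypothetical.

```shell
# stage_from_s3 S3_URI DEST_DIR
# Hypothetical helper: sync an S3 prefix into a directory on the shared
# file system. With DRY_RUN=1 the command is printed instead of executed.
stage_from_s3() {
    case "$1" in
        s3://*) ;;
        *) echo "source must be an s3:// URI: $1" >&2; return 1 ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "aws s3 sync $1 $2"
    else
        mkdir -p "$2" && aws s3 sync "$1" "$2"
    fi
}

# Usage (bucket and destination are placeholders):
#   DRY_RUN=1 stage_from_s3 s3://my-bucket/inputs /efs/inputs
```

The dry-run mode is there so the command can be inspected before any data is transferred.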
Partitions Overview
There are four partitions to select from: compute, copy, compute-long, copy-long. Those with -long suffixes are 'on-demand' instances whilst the others are 'spot' instances. We recommend using -long only for long-running jobs that cannot be restarted.
Head to the partitions page for more information.
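The recommendation above can be expressed as a small helper. This is a sketch under stated assumptions: `pick_partition` and `my_job.sh` are hypothetical names, and the only rule encoded is the one given here (jobs that cannot be restarted go to the -long partition). The `--partition` option in the usage comment is a standard sbatch option.

```shell
# pick_partition BASE RESTARTABLE
#   BASE        : "compute" or "copy"
#   RESTARTABLE : "yes" if the job can safely be restarted, "no" otherwise
# Hypothetical helper: prints the partition name to pass to sbatch.
pick_partition() {
    case "$1" in
        compute|copy) ;;
        *) echo "unknown base partition: $1" >&2; return 1 ;;
    esac
    if [ "$2" = "no" ]; then
        # Jobs that cannot be restarted use the on-demand (-long) partition.
        echo "$1-long"
    else
        # Restartable jobs can use the spot-backed partition.
        echo "$1"
    fi
}

# Usage (my_job.sh is a hypothetical job script):
#   sbatch --partition="$(pick_partition compute no)" my_job.sh
```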
Using slurm
Slurm is an HPCphile's bread and butter. You can use slurm on AWS Parallel Cluster too!
Head to the slurm page for more information on using slurm on AWS.
Using toil
Toil is a maintained HPC/AWS compatible execution engine for CWL. It integrates with slurm to submit batch jobs to complete a workflow step.
Head to the toil page for more information on running your CWL workflow through toil.
Using cromwell
Cromwell is also running on AWS through an integrated slurm backend. Cromwell can execute WDL workflows and submit slurm jobs to complete workflow steps.
Head to the cromwell page for more information on running your WDL workflow through cromwell.
Installing new software on the cluster
You have sudo permissions. You can install anything.
Docker and conda have been set up for you, and I would encourage you to use them.
Walkthroughs
Head to our walkthroughs page for detailed guides on running workflow languages through AWS Parallel Cluster.
We have walkthroughs in:
- CWL
- WDL
Troubleshooting
This is still very much a process in development. Please see our troubleshooting page for more information.
Development
To understand this repo in greater detail, head to the development page for some lovely diagrams.
Contributions
Our projects board will guide you on what more needs to be done.