# Large File Transfers
This page describes best practices and suggestions for when you are moving more than 100 GB of data:

- Between filesystems on the cluster (e.g. from `/data` to `/lfs`);
- Into or out of the cluster.
> [!CAUTION]
> If you want to move massive amounts of data (>10 TB), please always let the admins know. If you are not confident in exactly what you're doing, please let us handle such transfers. Please try to run large transfers at times when the network is quiet (nights and weekends).
## Between filesystems on the CICA cluster
> [!IMPORTANT]
> Always use `rsync` for large file transfers. Never, ever use `mv`.
The principle is to first copy the data, using the `rsync` tool, and then delete the original once the copy has finished.
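A minimal sketch of that sequence (the paths are placeholders; the recommended options for each kind of transfer are discussed below):

```
# 1. Copy.
rsync -a --progress /data/someuser/files/ /lfs/data/someuser/files/

# 2. Optional extra check: a checksum-only dry run. Any file that differs
#    between the two trees is listed; no output means they match.
rsync -anic /data/someuser/files/ /lfs/data/someuser/files/

# 3. Only once you are satisfied, delete the original.
rm -rf /data/someuser/files/
```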
If you are not already familiar with `rsync`, please take some time to learn about it and practice before starting a long-running copy.
`rsync` is much, much better than `cp` or `scp`. File transfers with `rsync` can be stopped and resumed without having to start again from the beginning. Each file copied will be automatically checksummed to make sure it is identical to the original. There are many other advantages and options.
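For example, if a long copy dies partway through, re-running the identical command resumes it (paths here are illustrative):

```
rsync -a --progress /data/someuser/bigdir/ /lfs/data/someuser/bigdir/
# ... interrupted (Ctrl-C, dropped connection, reboot) ...
rsync -a --progress /data/someuser/bigdir/ /lfs/data/someuser/bigdir/
# picks up where it left off: completed files are skipped; only the file
# that was in flight is redone (see --partial below for keeping partial files)
```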
First, log on to the node `s01`. The only purpose of this node is to manage filesystem operations. Never run jobs on this node that are not related to moving files around. The reason for using `s01` is that it is connected directly to the disks. Using any other node will result in your data doing an extra round-trip across the network for no good reason.
You probably want to start a `screen` session if the copy is going to take a long time:

```
screen -S bigcopy
```
Then launch `rsync` with the `--no-compress` and `-WS` options, plus whatever other options you want. Usually you want `-a` and `--progress`.

An example command:

```
rsync -a -WS --no-compress --progress -R /data/someuser/files/./source /lfs/data/someuser/newfiles
```
This will create a copy of all files and directories under `/data/someuser/files/./source` in `/lfs/data/someuser/newfiles`.
Do not just copy this command blindly without taking the time to learn what it does, and if necessary, adapt it to what you want. For example, ask yourself -- do you understand what the `-R` option does, and why there is a `/./` in the middle of the source path?
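If not, here is a brief illustration (same example paths as above): with `-R` (`--relative`), the `/./` marker tells `rsync` where the preserved directory structure begins on the destination side.

```
# Marker after "files": only "source" is recreated under the destination
rsync -a -R /data/someuser/files/./source /lfs/data/someuser/newfiles
# result: /lfs/data/someuser/newfiles/source/...

# Marker moved earlier: more of the path is preserved
rsync -a -R /data/./someuser/files/source /lfs/data/someuser/newfiles
# result: /lfs/data/someuser/newfiles/someuser/files/source/...
```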
> [!IMPORTANT]
> If you are still learning `rsync`, practice with small transfers using the same paths first. It is very easy to get the `rsync` path specification wrong. In particular, a trailing slash on `rsync` paths is meaningful -- read the manual!
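A safe way to practice is a dry run: with `-n` (`--dry-run`), `rsync` lists what it would transfer without copying anything. For instance (the destination here is an illustrative scratch directory):

```
# No files are copied; the listing shows where things would land
rsync -anv /data/someuser/files/source /tmp/rsync-test
# Same command with a trailing slash on the source -- compare the listed
# paths to see why the trailing slash matters
rsync -anv /data/someuser/files/source/ /tmp/rsync-test
```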
While the files are copying, you can disconnect/close your `screen` session and reconnect to it later with `screen -dr bigcopy` (remember, it's running on `s01`).
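A typical `screen` round trip looks like this (session name as in the example above):

```
# Detach from the running session: press Ctrl-a, then d
# Later, from a fresh login:
ssh s01              # the session lives on s01
screen -ls           # list sessions; look for "bigcopy"
screen -dr bigcopy   # reattach (detaching it elsewhere first if needed)
```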
> [!NOTE]
> Within the cluster you can expect transfer speeds of around 120 MB/s if the system is quiet. This is limited by the disk read/write speed, not the network.
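> As a rough yardstick, at that rate 1 TB takes about 2.3 hours (10^6 MB / 120 MB/s ≈ 8300 s), so plan multi-terabyte transfers accordingly.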
## Into/out of the CICA cluster
The same principles apply:
- Log on to `s01` and issue all data-moving commands from that node;
- Use `rsync` whenever possible.
In this case a suitable `rsync` command would be:

```
rsync -avz --partial --progress /data/someuser/files/./source otheruser@remote.host:/data/otheruser/dest
```
The differences with internal transfers are `-z` (turn on compression) and `--partial` (keep partially transferred files, so an interrupted transfer can be resumed). You should know what these options do and whether they are appropriate for you.
Note that `rsync` has an option `--bwlimit` to set a maximum bandwidth. Sometimes this will help you as well as other users.
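For instance (the limit value and hostname are illustrative; on `rsync` versions older than 3.1 the value must be given as a plain number of KiB/s, e.g. `--bwlimit=51200`):

```
# Cap the transfer at about 50 MB/s, leaving headroom for other users
rsync -avz --partial --progress --bwlimit=50M \
    /data/someuser/files/./source otheruser@remote.host:/data/otheruser/dest
```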
The maximum bandwidth of our connection to the outside world is 1 GB/s, but it is highly unlikely a single off-site transfer will be able to use all of that.
> [!CAUTION]
> Do not run anything apart from data management jobs on `s01`. If you crash this node the consequences will be serious.