Work on Hyak - Ljia1009/LING573_AutoMeta GitHub Wiki
- To access:
log in over ssh and finish the Duo two-factor prompt
this takes you to your home dir (only 10G storage, so avoid storing files here)
- Working Directory:
cd /gscratch/stf
mkdir your_name (create a directory of your own here)
this will be your working directory, i.e. /gscratch/stf/your_name
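The two steps above can be wrapped in a small helper so they are safe to re-run. This is only a sketch: `make_workdir` is a hypothetical name, and the root and directory name are whatever you pass in.

```shell
# make_workdir ROOT NAME: create ROOT/NAME if it does not exist yet
# and print the resulting path. (Hypothetical helper, not a Hyak tool.)
make_workdir() {
    dir="$1/$2"
    mkdir -p "$dir" || return 1   # -p: no error if it already exists
    printf '%s\n' "$dir"
}
```

On Hyak this would be used as `cd "$(make_workdir /gscratch/stf your_name)"`.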
- Useful commands:
hyakalloc (show the computing resources available to you), sbatch (submit a batch job), salloc (start an interactive session, needed if you want to work on the cluster directly)
squeue -u your_uw_id (check all your submitted jobs)
- Submit job:
interactive job:
hyakalloc (to see your group and computing resource)
salloc -A stf -p ckpt-all -N 1 -c 4 --mem=10G --time=2:30:00 (change the account, partition, memory, and time to what you need)
or: salloc -p <partition_name>-int -A <group_name> --time=<time> --mem=<size>G
batch job:
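A batch job is a script with `#SBATCH` lines at the top, handed to sbatch. The sketch below writes a minimal one; the account, partition, and resource values simply mirror the salloc example above and are assumptions, as are the file name `example.job` and the helper name `write_job_script`.

```shell
# write_job_script FILE: write a minimal Slurm batch script to FILE.
# All #SBATCH values are assumptions copied from the interactive example;
# change them to match your group and job.
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --account=stf
#SBATCH --partition=ckpt-all
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=10G
#SBATCH --time=2:30:00
#SBATCH --output=%x_%j.out

# Replace this line with the command you actually want to run.
echo "Running on $(hostname)"
EOF
}

write_job_script example.job
```

Submit it with `sbatch example.job` and check on it with `squeue -u your_uw_id`. With the `--output` pattern used here, output goes to a file named after the job name and job number.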
- If you work in VS Code: Hyak does not allow Remote-SSH, and other IDEs are probably restricted in the same way. You may need to use code-server (my method) to run VS Code, but only for editing, since it runs inside a container. Everything else should be done in a separate terminal. To set it up:
official guideline: https://hyak.uw.edu/docs/tools/vsc-code-server
cd to your working directory (and) home dir
ln -s /mmfs1/sw/containers/code-server/code-server_4.89.0-39.sif code-server_4.89.0-39.sif
Now you can use the code-server container from your directory
wget https://hyak.uw.edu/files/code-server.job
Edit the job script (code-server.job) with nano or vim; you will find the lines below:
#SBATCH --partition=ckpt # update this line with the partition you want to use; can be left unchanged
#SBATCH --time=02:00:00 # update this line to change the time limit; setting it very high is recommended
# Set home destination for code-server session
CODER_HOME="/gscratch/scrubbed/UWNetID" # update this line to /gscratch/stf/your_name
# Provide container file
CODER_SIF="code-server_4.89.0-39.sif" # update this line if needed (no need if you follow this guide)
then run sbatch code-server.job; you will receive a job number. Then cat code-server.job.jobnumber
open another terminal and run the command shown in code-server.job.jobnumber
to end the VS Code session, cancel the job, e.g. with scancel jobnumber
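If you would rather not edit the job script by hand, the same changes can be made with sed. This is a sketch that assumes the downloaded file contains `#SBATCH --time=` and `CODER_HOME=` lines starting at the beginning of a line, as quoted above; `configure_code_server_job` is a hypothetical helper name. Note that it replaces the whole line, so any trailing comment on those lines is dropped.

```shell
# configure_code_server_job FILE CODER_HOME [TIME]
# Rewrite the CODER_HOME and --time lines of a code-server job script.
# Assumes those lines look like the ones quoted in this guide.
configure_code_server_job() {
    job_file="$1"
    coder_home="$2"
    time_limit="${3:-02:00:00}"   # keep the default if no time is given
    tmp_file="$job_file.tmp"
    sed -e "s|^#SBATCH --time=.*|#SBATCH --time=$time_limit|" \
        -e "s|^CODER_HOME=.*|CODER_HOME=\"$coder_home\"|" \
        "$job_file" > "$tmp_file" && mv "$tmp_file" "$job_file"
}
```

For example, `configure_code_server_job code-server.job /gscratch/stf/your_name 08:00:00`, then check the result with `cat code-server.job` before submitting.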
- Conda Env:
official guideline: https://hyak.uw.edu/docs/tools/python
it is better to switch to an interactive session first (see salloc under Submit job)
download miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
install miniconda3 into your working directory, not your home directory
bash Miniconda3-latest-Linux-x86_64.sh -p path_to_working_dir/miniconda3
configure
conda init bash
Then you can use conda
- Storage issues:
After installing conda and creating envs, you might find your home dir filling up with cache files (this may not happen to you, but just in case). That is a problem given the 10G limit, so we want all caches stored in your working dir
follow this: https://hyak.uw.edu/blog/conda-disk-storage/
pip can cause a similar issue:
the fix is also described on the page above
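The linked post is the reference for this; as a rough sketch of the idea, conda reads the `envs_dirs` and `pkgs_dirs` keys from `~/.condarc`, and pip honors the `PIP_CACHE_DIR` environment variable. The subdirectory names (`conda/envs`, `conda/pkgs`, `pip_cache`), the `WORKDIR` variable, and the helper name `write_condarc` are assumptions made for illustration.

```shell
# write_condarc FILE WORKDIR: write a condarc that keeps envs and
# package caches under WORKDIR. Overwrites FILE, so back up an
# existing ~/.condarc first. Directory names are assumptions.
write_condarc() {
    cat > "$1" <<EOF
envs_dirs:
  - $2/conda/envs
pkgs_dirs:
  - $2/conda/pkgs
EOF
}

# Point pip's cache at the working directory as well.
export PIP_CACHE_DIR="${WORKDIR:-/gscratch/stf/your_name}/pip_cache"
```

Typical use would be `write_condarc "$HOME/.condarc" /gscratch/stf/your_name`, with the export line also added to ~/.bashrc so it applies in every session.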
Libraries like huggingface (and maybe others) download models to your home dir by default, which we should change:
# Add this line to your ~/.bashrc file:
export HUGGINGFACE_HUB_CACHE="/gscratch/stf/your_name/hf_cache"
Then create the directory: cd /gscratch/stf/your_name, then mkdir hf_cache
In your Slurm job script:
add the same line: export HUGGINGFACE_HUB_CACHE="/gscratch/stf/your_name/hf_cache"
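The steps above can be sketched as one helper that creates the cache directory and adds the export line only if it is not already present, so running it twice does not duplicate the line. `add_hf_cache_export` is a hypothetical name; the variable name is the one used in this guide.

```shell
# add_hf_cache_export RC_FILE CACHE_DIR
# Create CACHE_DIR and append the HUGGINGFACE_HUB_CACHE export to
# RC_FILE unless that exact line is already there.
add_hf_cache_export() {
    rc_file="$1"
    cache_dir="$2"
    line="export HUGGINGFACE_HUB_CACHE=\"$cache_dir\""
    mkdir -p "$cache_dir" || return 1
    touch "$rc_file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}
```

For example, `add_hf_cache_export ~/.bashrc /gscratch/stf/your_name/hf_cache`, followed by `source ~/.bashrc` to pick it up in the current shell.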