Work on Hyak - Ljia1009/LING573_AutoMeta GitHub Wiki

  • To access:

ssh your_uw_id@klone.hyak.uw.edu

Enter 1 at the Duo prompt.

Once Duo authentication finishes, you land in your home dir (only 10G storage; avoid storing files here).

  • Working Directory:

cd /gscratch/stf

mkdir your_name (create your own directory here)

This will be your working directory (/gscratch/stf/your_name).
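As a minimal sketch, the two steps above can be wrapped in a small helper. The function name `make_workdir` and its arguments are hypothetical; the directory name is whatever you choose for yourself.

```shell
# Hypothetical helper: create (if needed) and enter a personal working directory.
# $1 = group storage root (e.g. /gscratch/stf), $2 = your directory name.
make_workdir() {
    local base="$1" name="$2"
    mkdir -p "$base/$name" || return 1
    cd "$base/$name" || return 1
    pwd
}

# Usage on Hyak (assumed names): make_workdir /gscratch/stf your_name
```

Because `mkdir -p` does nothing when the directory already exists, the helper is safe to call at the start of every session.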

  • Useful commands:

hyakalloc (see your available computing resources), sbatch (submit a batch job), salloc (start an interactive session, needed if you want to 'work' on the cluster)

squeue -u your_uw_id (check all your submitted jobs)

  • Submit job:

interactive job:

hyakalloc (to see your group and computing resource)

salloc -A stf -p ckpt-all -N 1 -c 4 --mem=10G --time=2:30:00 (replace stf with your account, and change the partition, memory, and time to what you need)

or salloc -p <partition_name>-int -A <group_name> --time=<time> --mem=<size>G
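To keep the adjustable parts of the request in one place, the interactive command can be assembled from variables. This is only a sketch: the variable names are made up for illustration, and the values are the placeholders from the example above, not recommendations.

```shell
# Print (do not run) an salloc command assembled from named settings.
# All values are placeholders; adjust them to your group and needs.
ACCOUNT="stf"          # your group, as reported by hyakalloc
PARTITION="ckpt-all"   # partition to request
CPUS=4                 # CPU cores (-c)
MEM="10G"              # memory (--mem)
TIME="2:30:00"         # wall-clock limit (--time)

SALLOC_CMD="salloc -A $ACCOUNT -p $PARTITION -N 1 -c $CPUS --mem=$MEM --time=$TIME"
echo "$SALLOC_CMD"
```

On a Hyak login node, copy the printed line into your terminal to start the session.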

batch job:

[Screenshot 2025-04-24 1:13:13 PM]
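A minimal sketch of what a batch script could look like, mirroring the options in the salloc example above. The file name `my_job.slurm`, the job name, and every resource value are assumptions to adjust; this is not necessarily the script the author used.

```shell
# Write a minimal Slurm batch script; every value is an assumption to adjust.
cat > my_job.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --account=stf
#SBATCH --partition=ckpt-all
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=10G
#SBATCH --time=2:30:00
#SBATCH --output=%x_%j.out

# Commands to run on the compute node go below.
echo "Running on $(hostname)"
EOF

# Submit it on Hyak with: sbatch my_job.slurm
```

In the output pattern, `%x` expands to the job name and `%j` to the job ID, so each run gets its own log file.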

  • If you work in VS Code: Hyak bans Remote-SSH, and other IDEs are probably treated the same. You can use code-server (my method) to run VS Code instead, but only for editing, since it runs in a container. Do other operations in a separate terminal. To set it up:

official guideline: https://hyak.uw.edu/docs/tools/vsc-code-server

cd to your working directory (and) home dir
ln -s /mmfs1/sw/containers/code-server/code-server_4.89.0-39.sif code-server_4.89.0-39.sif

Now you can use the code-server container from your directory
wget https://hyak.uw.edu/files/code-server.job

Edit the job script (code-server.job) using nano or vim; you will find the lines below:

#SBATCH --partition=ckpt # update this line by the partition you want to use, can leave it without change
#SBATCH --time=02:00:00 # update this line to change time limit, recommend setting it to be very high
# Set home destination for code-server session
CODER_HOME="/gscratch/scrubbed/UWNetID" # update this line to /gscratch/stf/your_name
# Provide container file
CODER_SIF="code-server_4.89.0-39.sif" # update this line if needed (no need if you follow this)

Then run sbatch code-server.job; you will receive a job number. Then run cat code-server.job.jobnumber

Open another terminal and run the command shown in code-server.job.jobnumber
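If you would rather not edit code-server.job by hand, the two edits described above can be scripted. This is a sketch under assumptions: `patch_code_server_job` is a hypothetical helper, it assumes the `#SBATCH --time=` and `CODER_HOME=` lines start at the beginning of a line, and the time value in the usage line is just an example of a longer limit.

```shell
# Hypothetical helper: rewrite the time limit and CODER_HOME in a job script.
# $1 = job script path, $2 = new CODER_HOME, $3 = new --time value.
patch_code_server_job() {
    local job="$1" home="$2" limit="$3"
    sed -e "s|^#SBATCH --time=.*|#SBATCH --time=$limit|" \
        -e "s|^CODER_HOME=.*|CODER_HOME=\"$home\"|" \
        "$job" > "$job.tmp" && mv "$job.tmp" "$job"
}

# Usage (assumed values):
#   patch_code_server_job code-server.job /gscratch/stf/your_name 08:00:00
```

The helper writes to a temporary file and then renames it, so the original is only replaced if `sed` succeeds.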

To end the VS Code connection session:

[Screenshot 2025-04-24 12:56:36 PM]

  • Conda Env:

official guideline: https://hyak.uw.edu/docs/tools/python

It is better to switch to an interactive session first (see salloc above).

download miniconda3

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

install miniconda3 into your working dir, not your home directory

bash Miniconda3-latest-Linux-x86_64.sh -p path_to_working_dir/miniconda3

configure

conda init bash

Then you can use conda
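The install steps above can be collected into one script. The sketch below differs from the commands above in one way: it passes `-b` so the installer runs non-interactively, which means `conda init bash` has to be called through the full path afterwards. `WORKDIR`, `DRY_RUN`, `run`, and `install_miniconda` are names made up for this example, and by default it only prints the commands.

```shell
# Sketch of the install steps with a dry-run switch (DRY_RUN=1 only prints).
# WORKDIR is an assumed placeholder for your working directory.
WORKDIR="${WORKDIR:-/gscratch/stf/your_name}"
DRY_RUN="${DRY_RUN:-1}"
INSTALLER="Miniconda3-latest-Linux-x86_64.sh"

run() {
    # Print the command; execute it only when DRY_RUN is not 1.
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

install_miniconda() {
    run wget "https://repo.anaconda.com/miniconda/$INSTALLER"
    # -b runs the installer non-interactively; -p sets the install prefix.
    run bash "$INSTALLER" -b -p "$WORKDIR/miniconda3"
    run "$WORKDIR/miniconda3/bin/conda" init bash
}

install_miniconda
```

Set `DRY_RUN=0` once the printed commands look right for your setup.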

  • Storage issues:

After installing conda and creating envs, you might find your home dir filling up with cache files (it may not happen to you, but just in case). With only 10G there, we want all caches stored in your working dir instead.

follow this: https://hyak.uw.edu/blog/conda-disk-storage/

A similar issue can happen with pip; the fix is also described on the page above.
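As a rough illustration of the idea (the linked post remains the authority), conda reads `envs_dirs` and `pkgs_dirs` from `.condarc`, and pip honors the `PIP_CACHE_DIR` environment variable. The helper name `write_condarc` and the subdirectory names `conda/envs`, `conda/pkgs`, and `pip_cache` are assumptions chosen for this sketch.

```shell
# Hypothetical helper: write a .condarc that keeps envs and the package cache
# under a given working directory. $1 = working dir, $2 = .condarc path.
# Note: this overwrites the target file.
write_condarc() {
    local workdir="$1" rc="$2"
    cat > "$rc" <<EOF
envs_dirs:
  - $workdir/conda/envs
pkgs_dirs:
  - $workdir/conda/pkgs
EOF
}

# Usage (assumed paths):
#   write_condarc /gscratch/stf/your_name ~/.condarc
#   export PIP_CACHE_DIR=/gscratch/stf/your_name/pip_cache   # add to ~/.bashrc
```

If you already have a `.condarc`, merge these keys into it by hand rather than overwriting it.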

Libraries like huggingface (and maybe others) download models to your home dir by default, which we should change:

# Add this line to your ~/.bashrc file:

export HUGGINGFACE_HUB_CACHE="/gscratch/stf/your_name/hf_cache"

Then, cd /gscratch/stf/your_name and mkdir hf_cache

Within your Slurm Job Script:

add this: export HUGGINGFACE_HUB_CACHE="/gscratch/stf/your_name/hf_cache"
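The ~/.bashrc step above can be made repeatable with a small helper that creates the cache directory and appends the export line only if it is missing. `set_hf_cache` is a hypothetical name, and the paths in the usage line are the placeholders used above.

```shell
# Hypothetical helper: create the cache dir and add the export line once.
# $1 = cache directory, $2 = shell rc file (e.g. ~/.bashrc).
set_hf_cache() {
    local cache="$1" rc="$2"
    local line="export HUGGINGFACE_HUB_CACHE=\"$cache\""
    mkdir -p "$cache" || return 1
    # Append only if the exact line is not already present.
    grep -qxF "$line" "$rc" 2>/dev/null || echo "$line" >> "$rc"
}

# Usage (assumed paths): set_hf_cache /gscratch/stf/your_name/hf_cache ~/.bashrc
```

Running it twice leaves a single export line in the file, so it is safe to keep in a setup script.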
