# Julia Jupyter notebook on CECI HPC
> [!IMPORTANT]
> Please do not forget to free CPU/GPU/... resources to avoid unnecessarily depleting the computation-time budget.
## Connect to the cluster with port forwarding
You first need to configure SSH as described in the CECI documentation and the Lucia documentation. Assuming that you can already connect to Lucia with `ssh lucia`, connect from your laptop/PC with port forwarding:
```bash
ssh -L 9999:localhost:9999 lucia
```
If port 9999 is already in use, you can use a different number and adapt all port numbers below accordingly. The port number should be larger than 1024.
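If you are unsure which ports are free, the following optional Julia sketch (not part of the original instructions) asks the operating system for an available port at or above 9999, using only the `Sockets` standard library:

```julia
using Sockets

# Ask the OS for a free TCP port, starting the search at 9999.
port, server = Sockets.listenany(9999)
close(server)  # release the port again so jupyter can bind to it later
println("Free port: ", Int(port))
```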
## Install Jupyter notebook

Run the following shell commands to install `jupyter`:
```bash
pip install -U --user pip
which pip
# should be ~/.local/bin/pip
pip install --user jupyter
which jupyter
# should be ~/.local/bin/jupyter
export JUPYTER=$(which jupyter)
```
Run the following Julia commands to install IJulia:
```julia
using Pkg
Pkg.add("IJulia")
using IJulia
@show IJulia.JUPYTER
```
The last command gives you the path to the `jupyter` program, for example `"/gpfs/home/acad/ulg-gher/abarth/.local/bin/jupyter"`.
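If `IJulia.JUPYTER` points to a different `jupyter` than the one installed above, you can rebuild IJulia with the `JUPYTER` environment variable set. A minimal sketch, assuming the user-local installation path from the previous step:

```julia
using Pkg

# Point IJulia at the user-installed jupyter; adapt the path to the
# output of `which jupyter` on your system.
ENV["JUPYTER"] = expanduser("~/.local/bin/jupyter")
Pkg.build("IJulia")  # rebuild so IJulia picks up the new path
```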
## Run on the frontal node

Simple data preparation tasks can be done on the frontal node as long as they do not require many resources:
```bash
jupyter notebook --no-browser --port=9999
```
Then open the link (e.g. `http://localhost:9999/?token=LONG_LONG_TOKEN`) in your web browser as instructed.
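Once a Julia kernel is running, a short sanity check in the first notebook cell (an optional illustration) confirms which machine and environment you are working in:

```julia
using InteractiveUtils  # provides versioninfo (loaded automatically in most kernels)

versioninfo()           # Julia version and platform details
println(gethostname())  # machine the kernel runs on
println(pwd())          # current working directory
```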
## Run on compute nodes

Start `jupyter notebook` via the SLURM command `srun`, for example with:
```bash
srun --account=ACCOUNT_NAME --job-name=notebook --partition debug-gpu --gres=gpu:1 --time=1:00:00 --mem-per-cpu=6000 --ntasks=1 --cpus-per-task=1 --pty jupyter notebook --no-browser --port=9999
```
You may need to adapt the options `--account`, `--time`, `--mem-per-cpu`, `--ntasks`, `--cpus-per-task`, ... (see https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html).
Enable port forwarding from the frontal node to the compute node by running this command on the frontal node in an additional, separate SSH session:
```bash
ssh -L 9999:localhost:9999 cnaXYZ
```
where `cnaXYZ` is the allocated compute node as reported by the shell command `squeue --me`.
> [!TIP]
> The Julia command `gethostname()` allows you to double-check that you are running on the correct node. With the command `using CUDA; CUDA.functional()` you can check whether a CUDA GPU is available.
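Both checks from the tip combined into a single cell you could run in the notebook (assuming the CUDA.jl package is installed in your environment):

```julia
using CUDA

# Confirm the kernel runs on the allocated compute node
# (compare with the node reported by `squeue --me`).
@show gethostname()

# Check whether a CUDA GPU is usable from this session.
@show CUDA.functional()
```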
## Free resources

Hit `Control-C` in the terminal where you launched `jupyter notebook` with `srun`.
Verify that the resources are freed by running the following command on the frontal node:

```bash
squeue --me
```

If necessary, explicitly cancel a job with `scancel JOB_NUMBER`.