JupyterHub Server - nthu-ioa/cluster GitHub Wiki

The CICA Jupyterhub Server

We now offer a centralized Jupyterhub server hosted on the CICA cluster:

https://cica.astr.nthu.edu.tw/jupyter

This can be used for everyday plotting and data analysis. It is not intended for very heavy calculations (more than a day or two of compute on more than 16 cores).

The new server is much easier to set up and access than the previously recommended way of running Jupyter notebooks on the cluster (which is still possible, but only useful for advanced users). Everything works entirely through your browser. There is no need to open an SSH tunnel through your terminal, and no need to write and submit your own Slurm script.

[!TIP] For live announcements (e.g. when the server needs to be rebooted) and discussion of issues during testing, please join this Slack channel: https://join.slack.com/t/cicausers/shared_invite/zt-3b41zlv1c-YZfNaqlmxwgD1nWgBBwF2A. The next-best way to report issues is through the CICA GitHub issues page (see below).

Getting started

You need a login account on CICA to use this service. For the time being, this service is only accessible on campus (or through the VPN).

🟢 To access the service, visit https://cica.astr.nthu.edu.tw/jupyter.

🟢 Enter your CICA username and login password.

[!NOTE]
Your login password was set when you created your CICA account. It is not the "SSH passphrase" you may also have created when setting up SSH key access to CICA.

🟢 If you see the "Start my Server" button, click it (the hub may send you directly to the next step).

🟢 Select one of the server resource options and choose "Start".

🟢 You should find yourself in the familiar JupyterLab interface.

[!TIP]
You can close your browser and reconnect later without losing your session. You may need to reload the page in the browser.

You can log out manually.

Hardware

The basic server options run on a new node, a01, which you can also access in the usual way from fomalhaut. This node is a 128-core AMD Epyc server intended for relatively small multi-user jobs like JupyterHub. Please avoid running other jobs on this node.

[!WARNING]
Unless you have been given explicit permission (e.g. certain CASA users), please don't start jobs outside Slurm on a01. If you have been given permission, please try to keep your usage low to avoid disrupting the JupyterLab users.

Limitations

  • The bandwidth between the JupyterHub server and the /lfs storage is limited. In normal use you will probably not notice, but please don't start any large file transfers between the /data* and /lfs disks from this node. Log on to the cluster via ssh as normal to do those transfers. If your notebook jobs need to do intensive IO to /lfs, please use the "mem" or "gpu" Jupyter server options.

Why are the server options limited to those resources and times?

The choices are intended to meet the needs of a large number of everyday users. If you have specific resource requirements, you can always set up and run a server yourself as before.

However, the choices in the menu will be kept under review. If you would like to propose any changes to the server menu, please raise an issue on the CICA GitHub. See also the last question on this page, about asking a question or reporting a problem.

The old way still works

You can still start your own JupyterLab processes under slurm and connect to them via an ssh tunnel as described here. Feel free to do this if:

  • You need resources not offered by the server options (e.g. more memory, multiple GPUs);
  • You need to do anything nontrivial with your python environment.

How do I....?

Change server?

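  • Go to "Hub Control Panel"
  • Choose "Stop Server"
  • Once the server has stopped, choose "Start my Server" again and select a different resource option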

Add my conda environment as a Jupyter kernel?

If you have a conda (or mamba) environment called (for example) myenv, you can add it as an option in the JupyterLab kernel dropdown by logging on to the cluster in a terminal as normal and executing the following commands (this may also work in a JupyterLab terminal).

module load python
source activate myenv
pip install ipykernel
python -m ipykernel install --user --name myenv --display-name "Python (myenv)"

Replace myenv with the actual name of your environment. The string after --display-name can be anything you want.
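To confirm that the kernel was registered, you can check that its definition file exists. The following is a minimal sketch, assuming the kernel was installed with --user under the name myenv, so that the definition lands in the default per-user location ($HOME/.local/share/jupyter/kernels/).

```shell
#!/bin/bash
# Check that a kernel spec named "myenv" exists in the default per-user location.
# "myenv" is the example name used above; replace it with your environment name.
kernel_name="myenv"
kernel_json="$HOME/.local/share/jupyter/kernels/$kernel_name/kernel.json"

if [ -f "$kernel_json" ]; then
    echo "Kernel spec found: $kernel_json"
    # Show the definition so you can see which python executable it points to
    cat "$kernel_json"
else
    echo "No kernel spec found at $kernel_json"
fi
```

If the jupyter command is available in your shell, `jupyter kernelspec list` shows the same information for every kernel it can find. A newly added kernel may only appear in the JupyterLab dropdown after you reload the page.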

Load environment modules?

In the previous ssh-tunnel method of using Jupyter, you could include module load statements in the Slurm batch script that started your JupyterLab (or notebook) job. For example, you could include module load gcc hdf5 to give access to the HDF5 libraries and command line tools.

With the new server, you can still customise the environment of your notebook, but the approach is different. You now need to create a startup script for a particular Jupyter kernel.

For example, if your notebook uses the kernel myenv (see above), you need to find and modify the definition of this kernel (a .json file, by default under $HOME/.local/share/jupyter/kernels/) to set up the environment for this kernel.

For details see the following instructions from NERSC: https://docs.nersc.gov/services/jupyter/how-to-guides/#how-to-customize-a-kernel-with-a-helper-shell-script
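As a rough illustration of the helper-script pattern, the sketch below writes a small wrapper script next to the kernel definition. It assumes the kernel is called myenv, that it lives in the default location, and that the module command is available in a non-interactive bash shell on the cluster (this has not been checked for CICA). The file name kernel-helper.sh and the choice of modules (gcc hdf5, from the example above) are only placeholders.

```shell
#!/bin/bash
# Sketch: create a wrapper that loads modules before the kernel starts.
# Assumptions: kernel name "myenv", default per-user kernel location,
# and a hypothetical helper file name "kernel-helper.sh".
kernel_dir="$HOME/.local/share/jupyter/kernels/myenv"
mkdir -p "$kernel_dir"

# The quoted 'EOF' stops the shell from expanding "$@" while writing the file.
cat > "$kernel_dir/kernel-helper.sh" <<'EOF'
#!/bin/bash
# Load the modules this kernel needs (example modules only)...
module load gcc hdf5
# ...then replace this shell with the original kernel command.
exec "$@"
EOF
chmod +x "$kernel_dir/kernel-helper.sh"

echo "Helper written to $kernel_dir/kernel-helper.sh"
echo "Now add that path as the FIRST entry of \"argv\" in $kernel_dir/kernel.json"
```

The remaining manual step is to edit kernel.json so that the helper path comes first in the "argv" list, ahead of the existing python entry. The helper then runs first, sets up the environment, and starts the kernel with the original arguments. Restart the kernel in JupyterLab for the change to take effect.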

See the slurm log file?

Each Jupyterhub slurm job writes a log to ${HOME}/.jupyterhub_logs. You might want to remove old logs from this directory occasionally.
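One way to tidy up is to list the logs older than some cutoff before deleting anything. This is a minimal sketch; the function name list_old_logs and the 30-day cutoff are arbitrary choices for illustration.

```shell
#!/bin/bash
# List JupyterHub Slurm logs that were last modified more than N days ago.
log_dir="${HOME}/.jupyterhub_logs"

# list_old_logs DIR [DAYS]: print files in DIR older than DAYS (default 30).
list_old_logs() {
    local dir="$1" days="${2:-30}"
    if [ -d "$dir" ]; then
        # -mtime +N matches files modified more than N whole days ago
        find "$dir" -type f -mtime +"$days" -print
    fi
}

list_old_logs "$log_dir" 30

# Once the list looks right, the same find command with -delete removes them:
# find "$log_dir" -type f -mtime +30 -delete
```

Listing first and deleting second makes it harder to remove the log of a job you are still trying to debug.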

Know that my password is secure?

Using your CICA login as your jupyterhub password might be a bit scary. Our system uses HTTPS exclusively, with a self-signed certificate, so your password is as secure as any other password you send over the web. In most respects it is no worse than typing your password to log in at the terminal -- the only real difference is the involvement of your web browser, which is probably slightly more vulnerable to attack on your end.

The server is only accessible by IPs within NTHU. Our server rate-limits login attempts, so you may find yourself locked out for some time if you type your password incorrectly too many times in a short period (5 minutes). If this happens, just wait a bit.

If you are concerned about security (we hope you are!) then a good strategy for extra security is to use an SSH key with a passphrase for your SSH logins, and to change your login password (and SSH key) from time to time.

Ask a question not on this list, or report a problem with Jupyterhub?

For non-urgent questions (almost all questions are non-urgent!), please use the CICA GitHub issues page. This is much more efficient than email. If you want, you can tag @apcooper in your question and add the "Jupyter" tag. Please also help others with their questions if you know the answer!