DIVA service (GHER-ULiège)

(Last changed 20201023)

Basic info

For every user, a Jupyter Notebook container with DIVAnd installed is started by a JupyterHub instance. JupyterHub handles user authentication, starts the container, and redirects the user to it. JupyterHub also bind-mounts the user's NextCloud data into the container.

The service runs on bluewhale.dkrz.de as plain HTTP. The nginx proxy on port 443 handles SSL termination, so DIVA is accessible at https://bluewhale.dkrz.de/diva. The other notebooks can be reached at https://bluewhale.dkrz.de/divavip and https://bluewhale.dkrz.de/python. To log in there, you need a POST request sending vre_username, vre_displayname, service_auth_token and vre_URL.
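As an illustration of the login request described above, here is a minimal sketch that builds (but does not send) such a POST request with Python's standard library. The four field names come from this page; the exact login path, the form encoding, and all values are assumptions made for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(login_url, username, displayname, token, vre_url):
    """Build a form-encoded POST request carrying the four login fields."""
    fields = {
        "vre_username": username,
        "vre_displayname": displayname,
        "service_auth_token": token,
        "vre_URL": vre_url,
    }
    body = urlencode(fields).encode("ascii")
    return Request(login_url, data=body, method="POST")


req = build_login_request(
    "https://bluewhale.dkrz.de/diva/hub/login",  # hypothetical path, check your instance
    "vre_alice",                                 # hypothetical VRE username
    "Alice",                                     # hypothetical display name
    "replace-with-real-token",                   # placeholder token
    "https://vre.example.org",                   # hypothetical dashboard URL
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Passing `req` to `urllib.request.urlopen` would actually send it. In production the dashboard form performs this POST from the user's browser.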

For testing, there is an HTML login form available at https://bluewhale.dkrz.de/diva/locallogin (same for divavip and python).

Useful info and mounts

What will be mounted into the JupyterHub (defined in docker-compose.yml):

Synchronized NextCloud user directories. Mounted from HOST_WHERE_ARE_USERDIRS (DKRZ: /scratch/vre/sync_from_athens/nextcloud_data) to /usr/share/userdirectories/.
JupyterHub config file. Mounted from /root/erddap/jupyterhub_config.py to /srv/jupyterhub/jupyterhub_config.py.
Docker socket, needed to spawn containers. Mounted from /var/run/docker.sock to the same location inside the container.

What will be mounted into the spawned containers (defined in jupyterhub_config.py, setting c.DockerSpawner.volumes, via the dict volume_mounts):

The user's directory. Mounted from HOST_WHERE_ARE_USERDIRS (DKRZ: /scratch/vre/sync_from_athens/nextcloud_data/<username>/) to /home/jovyan/work/nextcloud_sync.
- The location inside the spawned container can be changed (defined by USERDIR_IN_CONTAINER), but we don't recommend changing it, as this might confuse the users.
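The following sketch shows how such a volume_mounts dict could be assembled from the environment values named on this page. It is not the code from the actual jupyterhub_config.py: the helper build_volume_mounts and the way the variables are combined are assumptions for illustration.

```python
def build_volume_mounts(env):
    """Map the per-user host directory template to the path inside the container."""
    host_root = env["HOST_WHERE_ARE_USERDIRS"].rstrip("/")
    # The template is kept unexpanded here; DockerSpawner fills in the user name.
    template = env.get("USERDIR_TEMPLATE_HOST", "/{raw_username}")
    target = env.get("USERDIR_IN_CONTAINER", "/home/jovyan/work/nextcloud_sync")
    return {host_root + template: target}


volume_mounts = build_volume_mounts({
    "HOST_WHERE_ARE_USERDIRS": "/scratch/vre/sync_from_athens/nextcloud_data",
    "USERDIR_TEMPLATE_HOST": "/{raw_username}",
    "USERDIR_IN_CONTAINER": "/home/jovyan/work/nextcloud_sync",
})
print(volume_mounts)

# Inside jupyterhub_config.py, where `c` is provided by JupyterHub, the dict
# would then be assigned like this:
# c.DockerSpawner.volumes = volume_mounts
```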

Before deploying DIVA

The user data (NextCloud data) should already be in place before you start the deployment. Currently, we run DIVA on a server at DKRZ, far away from the other VRE servers at GRNET. Mounting the data via NFS was therefore not an option, so we synchronize it instead: we create copies and try to keep them in sync (users trigger this by clicking sync in the workspace).

If you run DIVA on a VM with NFS-mounted NextCloud data, you mainly have to adapt USERDIR_TEMPLATE_HOST in the .env file (further below).

Operating on synchronized data

Make sure you have the synchronized user data ready. On bluewhale it sits in /scratch/vre/sync_from_athens/nextcloud_data/, but you can use any other location - just make sure you specify it in HOST_WHERE_ARE_USERDIRS in the .env file.

The value of USERDIR_TEMPLATE_HOST should be /{raw_username}, so that the directories named <username> are mounted.

This is how it should look:

[alice@bluewhale ~]$ ls -lpah /scratch/vre/sync_from_athens/nextcloud_data
total 4.0K
drwxr-xr-x. 15 vre  vre  4.0K Oct 20 16:26 ./
drwxr-xr-x.  5 root root   92 Sep 14 10:39 ../
drwxrwxr-x. 13 vre  vre   230 Oct 21 01:48 vre_tomandjerrymarineidorgcgfqwt7j/ # one user
drwxr-xr-x.  2 vre  vre     6 Oct  1 10:04 vre_bugsbunnymarineidorgy4773w8y/   # another user
...

Inside the user directories, their content should be directly visible:

[alice@bluewhale ~]$ ls -lpah /scratch/vre/sync_from_athens/nextcloud_data/vre_tomandjerrymarineidorgcgfqwt7j/
total 8.0K
drwxrwxr-x. 13 vre vre  230 Oct 21 01:48 ./
drwxr-xr-x. 15 vre vre 4.0K Oct 20 16:26 ../
drwxrwxr-x.  3 vre vre   86 Oct 20 03:12 BioQC_test_data/
drwxrwxr-x.  6 vre vre   68 Sep 25 13:29 ERDDAP_test_data/
drwxrwxr-x.  2 vre vre    6 Sep 25 13:29 Imports/
drwxrwxr-x.  2 vre vre    6 Sep 25 13:29 Results/
drwxrwxr-x.  5 vre vre  112 Sep 25 13:29 webODV_test_data/
drwxrwxr-x.  2 vre vre   81 Sep 28 10:13 Work/
...

Operating on NFS-mounted data

Make sure you have the NFS-mounted user data ready. On GRNET's VMs it sits in /mnt/sdc-nfs-data/, but you can use any other location - just make sure you specify it in HOST_WHERE_ARE_USERDIRS in the .env file.

The value of USERDIR_TEMPLATE_HOST should be /{raw_username}/files, so that the subdirectories called files inside directories named <username> are mounted.

When peeking into a user directory, you should see /files, and only in there, the actual contents:

[root@snf-7990 ~]# ls -lpah /mnt/sdc-nfs-data/vre_tomandjerrymarineidorgcgfqwt7j/
total 20K
drwxrwxr-x  5 33 33 4.0K Sep 25 11:23 ./
drwxrwx--- 36 33 33 4.0K Oct 23 13:15 ../
drwxrwxr-x 13 33 33 4.0K Oct 21 17:56 files/
...

[root@snf-7990 ~]# ls -lpah /mnt/sdc-nfs-data/vre_tomandjerrymarineidorgcgfqwt7j/files
total 52K
drwxrwxr-x 13 33 33 4.0K Oct 21 17:56 ./
drwxrwxr-x  5 33 33 4.0K Sep 25 11:23 ../
drwxrwxr-x  2 33 33 4.0K Sep 21 09:59 BioQC_test_data/
drwxrwxr-x  6 33 33 4.0K Sep 21 10:08 ERDDAP_test_data/
drwxrwxr-x  2 33 33 4.0K Sep 21 09:59 Imports/
drwxrwxr-x  2 33 33 4.0K Sep 21 09:59 Results/
drwxrwxr-x  5 33 33 4.0K Sep 21 10:00 webODV_test_data/
drwxrwxr-x  2 33 33 4.0K Sep 28 08:12 Work/
...
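To make the difference between the two USERDIR_TEMPLATE_HOST values concrete, the sketch below expands both templates for one user and checks that the resulting host directory exists. The helper names and the user names vre_alice and vre_bob are hypothetical. The demonstration runs against a temporary directory that mimics the NFS layout, not against real data.

```python
import os
import tempfile


def host_userdir(root, template, raw_username):
    """Expand the per-user template and append it to the host root."""
    return root.rstrip("/") + template.format(raw_username=raw_username)


def check_userdir(root, template, raw_username):
    """Return the expanded path and whether it is an existing directory."""
    path = host_userdir(root, template, raw_username)
    return path, os.path.isdir(path)


# Throwaway tree with the NFS layout: <root>/<username>/files
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "vre_alice", "files"))
    sync_path, sync_ok = check_userdir(root, "/{raw_username}", "vre_alice")
    nfs_path, nfs_ok = check_userdir(root, "/{raw_username}/files", "vre_alice")
    missing_path, missing_ok = check_userdir(root, "/{raw_username}/files", "vre_bob")

print(sync_ok, nfs_ok, missing_ok)
```

Running check_userdir with your real HOST_WHERE_ARE_USERDIRS and a known user name before starting the hub can catch a mismatched template early.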

Deployment step-by-step

Create a home directory for the service (wherever you keep your service directories, e.g. /root/diva_normal).
Download the docker-compose.yml, the jupyterhub_config.py and the environment file.

mkdir /root/diva_normal
cd /root/diva_normal
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/docker-compose.yml
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/jupyterhub_config.py
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/env-diva
mv env-diva .env

Make some changes to the environment file:
- Change the value of ADMIN_PW to some value of your choice (replace foo).
- Change the value of JUPYTERHUB_CRYPT_KEY to the result of running openssl rand -hex 32 (replace foo).
- Change the value of HOST_NAME to the FQDN of the machine where DIVA will be reachable.
- Other values may have to change in case you use different paths than in this guide, e.g. HOST_WHERE_ARE_USERDIRS in case you don't have the NextCloud data in /scratch/vre/sync_from_athens/nextcloud_data, USERDIR_TEMPLATE_HOST in case you want to mount a subdirectory of the user directories in there, ...

openssl rand -hex 32
vi .env
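If you prefer to script these edits instead of using vi, the sketch below shows one possible approach. It assumes the .env file consists of simple KEY=value lines. The sample content, the helper set_env_values, and the values used are hypothetical. secrets.token_hex(32) yields 32 random bytes as 64 hex characters, the same shape as the output of openssl rand -hex 32.

```python
import secrets


def set_env_values(env_text, updates):
    """Return env_text with the given keys set, replacing existing KEY=value lines."""
    out = []
    seen = set()
    for line in env_text.splitlines():
        key, sep, _ = line.partition("=")
        key = key.strip()
        if sep and key in updates:
            out.append(f"{key}={updates[key]}")
            seen.add(key)
        else:
            out.append(line)  # comments, blank lines and other keys stay as they are
    for key, value in updates.items():
        if key not in seen:
            out.append(f"{key}={value}")  # append keys that were missing
    return "\n".join(out) + "\n"


# Hypothetical excerpt, not the real env-diva file
sample = "ADMIN_PW=foo\nJUPYTERHUB_CRYPT_KEY=foo\nHOST_NAME=localhost\n"
crypt_key = secrets.token_hex(32)
updated = set_env_values(sample, {
    "ADMIN_PW": "choose-your-own",          # placeholder password
    "JUPYTERHUB_CRYPT_KEY": crypt_key,
    "HOST_NAME": "bluewhale.dkrz.de",
})
print(updated)
```

To apply this to a real file, read .env into a string, pass it through set_env_values, and write the result back.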

The docker-compose.yml and the JupyterHub config need no changes. To be sure, check that the URL used for authentication is included in the WHITELIST_AUTH env value in docker-compose.yml. In jupyterhub_config.py, make sure that the section with the ERDDAP-specific config is disabled via if False.

Pull the DIVA notebook image. JupyterHub cannot pull it by itself, so you must pull it manually. For a possibly more recent version, check the DOCKER_JUPYTER_IMAGE value inside the .env file.

docker pull abarth/divand-jupyter:2020-08-31T0716-501

Start the service

docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f

Now deploy or restart the reverse proxy, without which the service cannot be reached from outside! See: https://github.com/SeaDataCloud/Documentation/wiki/Reverse-Proxy
After every restart of DIVA, make sure to restart the reverse proxy too!

cd /root/revproxy
vi proxy.conf
# add locations for this service to config
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f

Testing the deployment

Hub must be healthy! That takes about a minute.

docker ps -a | grep diva

Can you log in locally at https://bluewhale.dkrz.de/diva/locallogin, using any name and the ADMIN_PW? (In this case, no data is available.)
Can you log in locally at https://bluewhale.dkrz.de/diva/locallogin, using an existing VRE name and the ADMIN_PW? (In this case, data should be available.)
Next, try logging in from the dashboard. For this, there must be a form on the dashboard that sends the user to this instance via POST.
A container called diva-vre_xyzxyz should be spawned, and be healthy soon too.
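The health checks above can also be scripted. This sketch reads the status column of docker ps, where Docker appends (healthy), (unhealthy) or (health: starting) for containers that define a health check. The container name diva-hub in the sample is hypothetical, since the hub's real name is set in docker-compose.yml.

```python
import subprocess


def parse_health(ps_output):
    """Map container names to a health state from tab-separated `docker ps` lines."""
    result = {}
    for line in ps_output.splitlines():
        if not line.strip():
            continue
        name, _, status = line.partition("\t")
        if "(healthy)" in status:
            state = "healthy"
        elif "(unhealthy)" in status:
            state = "unhealthy"
        elif "(health: starting)" in status:
            state = "starting"
        else:
            state = "unknown"  # exited, or no health check defined
        result[name] = state
    return result


def docker_health():
    """Query the local Docker daemon (needs the docker CLI and permissions)."""
    out = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}\t{{.Status}}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_health(out)


# Parsing demonstrated on sample output; call docker_health() on the real host.
sample = (
    "diva-hub\tUp 2 minutes (healthy)\n"
    "diva-vre_xyzxyz\tUp 10 seconds (health: starting)\n"
)
states = parse_health(sample)
print(states)
```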

Adding a VIP instance

Once you have successfully deployed DIVA, do the same again for VIP DIVA, which provides more memory.

mkdir /root/diva_vip
cd /root/diva_vip
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/docker-compose.yml.vip
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/jupyterhub_config.py
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/env-diva-vip
mv env-diva-vip .env
# same changes to .env!!
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f

cd /root/revproxy
vi proxy.conf
# add locations for this service to config
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f

Adding a Notebook for R and Python... as not everyone is familiar with Julia!

The steps are similar to those above, and the same jupyterhub_config.py is used. You need to pull another image: check the DOCKER_JUPYTER_IMAGE value inside the .env file. JupyterHub cannot pull it by itself, so you must pull it manually.

mkdir /root/notebook_python_and_r
cd /root/notebook_python_and_r
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/jupyternotebooks/docker-compose.yml # almost the same as for diva
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/jupyterhub_config.py           # same as for diva
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/jupyternotebooks/env-r-and-python
mv env-r-and-python .env
# same changes to .env!!
docker pull jupyter/r-notebook:15a66513da301
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f

cd /root/revproxy
vi proxy.conf
# add locations for this service to config
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f

Service: DIVA - SeaDataCloud/Documentation GitHub Wiki
