Service: DIVA - SeaDataCloud/Documentation GitHub Wiki
(Last changed 20201023)
For every user, a Jupyter Notebook container with DIVAnd installed is started by a JupyterHub instance. JupyterHub authenticates the user, starts the container, and redirects the user to it. The user's NextCloud data is also bind-mounted by JupyterHub.
The service runs on bluewhale.dkrz.de as plain HTTP. The nginx proxy on port 443 handles SSL termination, so DIVA is accessible at https://bluewhale.dkrz.de/diva. The other notebooks can be reached at https://bluewhale.dkrz.de/divavip and https://bluewhale.dkrz.de/python. To log in there, you need a POST request that sends vre_username, vre_displayname, service_auth_token and vre_URL.
For testing, an HTML login form is available at https://bluewhale.dkrz.de/diva/locallogin (same for divavip and python).
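
As a rough illustration of the login request described above, the following sketch builds (but does not send) a POST request carrying the four fields. It assumes the fields are sent form-encoded, as an HTML form would send them. The function name build_login_request and the URL in the usage part are hypothetical; the actual login endpoint is not stated on this page.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(login_url, vre_username, vre_displayname,
                        service_auth_token, vre_url):
    """Build (but do not send) a form-encoded POST login request.

    The four field names are the ones listed on this page; the form
    encoding is an assumption.
    """
    fields = {
        "vre_username": vre_username,
        "vre_displayname": vre_displayname,
        "service_auth_token": service_auth_token,
        "vre_URL": vre_url,
    }
    body = urlencode(fields).encode("utf-8")
    return Request(
        login_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    req = build_login_request(
        "https://example.org/diva/login",
        "vre_exampleuser",
        "Example User",
        "not-a-real-token",
        "https://vre.example.org",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here so the sketch can be run without touching a real server.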
What will be mounted into the JupyterHub (defined in docker-compose.yml):
- Synchronized NextCloud user directories. Mounted from HOST_WHERE_ARE_USERDIRS (DKRZ: /scratch/vre/sync_from_athens/nextcloud_data) to /usr/share/userdirectories/.
- JupyterHub config file. Mounted from /root/erddap/jupyterhub_config.py to /srv/jupyterhub/jupyterhub_config.py.
- Docker socket, needed to spawn containers. Mounted from /var/run/docker.sock to the same location inside the container.
What will be mounted into the spawned containers (defined in jupyterhub_config.py, setting c.DockerSpawner.volumes, via the dict volume_mounts):
- The user's directory. Mounted from HOST_WHERE_ARE_USERDIRS (GRNET: /scratch/vre/sync_from_athens/nextcloud_data/<username>/) to /home/jovyan/work/nextcloud_sync.
  - The location inside the spawned container can be changed (defined by USERDIR_IN_CONTAINER), but we don't recommend changing it, as this might confuse the users.
The user data (NextCloud data) should already be in place before starting the deployment. Currently, we run DIVA on a server at DKRZ, far away from the other VRE servers at GRNET. Mounting the data via NFS was therefore not an option, so we synchronize the data instead: we create copies and try to keep them in sync (the users have to trigger this by clicking sync in the workspace).
If you run DIVA on a VM with NFS-mounted NextCloud data, you mainly have to adapt USERDIR_TEMPLATE_HOST in the .env file (further below).
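
To make the relation between these settings more concrete, here is a minimal sketch of how a volume_mounts dict could be assembled from the .env values. It is not the actual content of jupyterhub_config.py: the helper names build_volume_mounts and expand_for_user are hypothetical, and the default values are taken from the paths mentioned on this page. The per-user substitution of {raw_username} is assumed to be done by the spawner; expand_for_user only imitates it for illustration.

```python
def build_volume_mounts(env):
    """Return a host-path -> container-path dict for the spawned containers.

    env is a mapping such as os.environ. The keys are the variable names
    used on this page; how the real config reads them is an assumption.
    """
    host_root = env["HOST_WHERE_ARE_USERDIRS"].rstrip("/")
    template = env.get("USERDIR_TEMPLATE_HOST", "/{raw_username}")
    in_container = env.get("USERDIR_IN_CONTAINER",
                           "/home/jovyan/work/nextcloud_sync")
    return {host_root + template: in_container}


def expand_for_user(volume_mounts, raw_username):
    """Imitate the per-user substitution of the {raw_username} placeholder."""
    return {
        host.replace("{raw_username}", raw_username): target
        for host, target in volume_mounts.items()
    }


if __name__ == "__main__":
    env = {
        "HOST_WHERE_ARE_USERDIRS": "/mnt/sdc-nfs-data/",
        "USERDIR_TEMPLATE_HOST": "/{raw_username}/files",
    }
    volume_mounts = build_volume_mounts(env)
    # In jupyterhub_config.py the dict would then be handed to the spawner:
    #   c.DockerSpawner.volumes = volume_mounts
    print(volume_mounts)
    print(expand_for_user(volume_mounts, "vre_exampleuser"))
```

With NFS-mounted data, only the template changes compared to the synchronized setup; the target inside the container stays the same.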
Make sure you have the synchronized user data ready. On bluewhale it sits in /scratch/vre/sync_from_athens/nextcloud_data/, but you can use any other location; just make sure you specify it in HOST_WHERE_ARE_USERDIRS in the .env file.
The value of USERDIR_TEMPLATE_HOST should be /{raw_username}, so that the directories named <username> are mounted.
This is how it should look:
[alice@bluewhale ~]$ ls -lpah /scratch/vre/sync_from_athens/nextcloud_data
total 4.0K
drwxr-xr-x. 15 vre vre 4.0K Oct 20 16:26 ./
drwxr-xr-x. 5 root root 92 Sep 14 10:39 ../
drwxrwxr-x. 13 vre vre 230 Oct 21 01:48 vre_tomandjerrymarineidorgcgfqwt7j/ # one user
drwxr-xr-x. 2 vre vre 6 Oct 1 10:04 vre_bugsbunnymarineidorgy4773w8y/ # another user
...
Inside the user directories, their content should be directly visible:
[alice@bluewhale ~]$ ls -lpah /scratch/vre/sync_from_athens/nextcloud_data/vre_tomandjerrymarineidorgcgfqwt7j/
total 8.0K
drwxrwxr-x. 13 vre vre 230 Oct 21 01:48 ./
drwxr-xr-x. 15 vre vre 4.0K Oct 20 16:26 ../
drwxrwxr-x. 3 vre vre 86 Oct 20 03:12 BioQC_test_data/
drwxrwxr-x. 6 vre vre 68 Sep 25 13:29 ERDDAP_test_data/
drwxrwxr-x. 2 vre vre 6 Sep 25 13:29 Imports/
drwxrwxr-x. 2 vre vre 6 Sep 25 13:29 Results/
drwxrwxr-x. 5 vre vre 112 Sep 25 13:29 webODV_test_data/
drwxrwxr-x. 2 vre vre 81 Sep 28 10:13 Work/
...
Make sure you have the NFS-mounted user data ready. On GRNET's VMs it sits in /mnt/sdc-nfs-data/, but you can use any other location; just make sure you specify it in HOST_WHERE_ARE_USERDIRS in the .env file.
The value of USERDIR_TEMPLATE_HOST should be /{raw_username}/files, so that the subdirectories called files inside directories named <username> are mounted.
When peeking into a user directory, you should see /files, and only in there, the actual contents:
[root@snf-7990 ~]# ls -lpah /mnt/sdc-nfs-data/vre_tomandjerrymarineidorgcgfqwt7j/
total 20K
drwxrwxr-x 5 33 33 4.0K Sep 25 11:23 ./
drwxrwx--- 36 33 33 4.0K Oct 23 13:15 ../
drwxrwxr-x 13 33 33 4.0K Oct 21 17:56 files/
...
[root@snf-7990 ~]# ls -lpah /mnt/sdc-nfs-data/vre_tomandjerrymarineidorgcgfqwt7j/files
total 52K
drwxrwxr-x 13 33 33 4.0K Oct 21 17:56 ./
drwxrwxr-x 5 33 33 4.0K Sep 25 11:23 ../
drwxrwxr-x 2 33 33 4.0K Sep 21 09:59 BioQC_test_data/
drwxrwxr-x 6 33 33 4.0K Sep 21 10:08 ERDDAP_test_data/
drwxrwxr-x 2 33 33 4.0K Sep 21 09:59 Imports/
drwxrwxr-x 2 33 33 4.0K Sep 21 09:59 Results/
drwxrwxr-x 5 33 33 4.0K Sep 21 10:00 webODV_test_data/
drwxrwxr-x 2 33 33 4.0K Sep 28 08:12 Work/
...
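
The two layouts shown above differ only in whether each user directory contains a files subdirectory. The following sketch inspects a data directory and suggests a matching USERDIR_TEMPLATE_HOST value. The function name suggest_userdir_template is hypothetical, and the check is only a heuristic: a synchronized user directory that happens to contain a folder named files would be misread, so the result should be verified by hand.

```python
import tempfile
from pathlib import Path


def suggest_userdir_template(host_root):
    """Guess USERDIR_TEMPLATE_HOST from the layout under host_root.

    Returns "/{raw_username}/files" if every user directory contains a
    "files" subdirectory, "/{raw_username}" if none does, and None when
    the directory is empty or the layout is mixed.
    """
    user_dirs = [p for p in Path(host_root).iterdir() if p.is_dir()]
    if not user_dirs:
        return None
    with_files = [p for p in user_dirs if (p / "files").is_dir()]
    if len(with_files) == len(user_dirs):
        return "/{raw_username}/files"
    if not with_files:
        return "/{raw_username}"
    return None


if __name__ == "__main__":
    # Build a throwaway NFS-style layout to demonstrate the check.
    with tempfile.TemporaryDirectory() as root:
        (Path(root) / "vre_exampleuser" / "files" / "Work").mkdir(parents=True)
        print(suggest_userdir_template(root))
```

On a real host, the argument would be the value of HOST_WHERE_ARE_USERDIRS.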
- Create a home directory called diva (wherever you have your service directories, e.g. /root/diva).
- Download the docker-compose.yml, the jupyterhub_config.py and the environment file.
mkdir /root/diva_normal
cd /root/diva_normal
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/docker-compose.yml
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/jupyterhub_config.py
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/env-diva
mv env-diva .env
- Make some changes to the environment file:
  - Change the value of ADMIN_PW to some value of your choice (replace foo).
  - Change the value of JUPYTERHUB_CRYPT_KEY to the result of running openssl rand -hex 32 (replace foo).
  - Change the value of HOST_NAME to the FQDN of the machine where DIVA will be reachable.
  - Other values may have to change in case you use different paths than in this guide, e.g. HOST_WHERE_ARE_USERDIRS in case you don't have the NextCloud data in /scratch/vre/sync_from_athens/nextcloud_data, USERDIR_TEMPLATE_HOST in case you want to mount a subdirectory of the user directories in there, ...
openssl rand -hex 32
vi .env
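
If you prefer to script these edits instead of using vi, the sketch below shows one possible approach. It assumes the .env file consists of plain KEY=value lines, and the helper names new_crypt_key and set_env_values are hypothetical. secrets.token_hex(32) yields 64 hexadecimal characters, the same shape as the output of openssl rand -hex 32.

```python
import secrets


def new_crypt_key():
    """Return 64 hex characters, the same shape as `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def set_env_values(env_text, updates):
    """Return env_text with KEY=value lines replaced for the keys in updates.

    Comments, blank lines and unrelated keys are kept as they are; keys
    that do not occur in env_text are appended at the end.
    """
    remaining = dict(updates)
    out = []
    for line in env_text.splitlines():
        key, sep, _ = line.partition("=")
        key = key.strip()
        if sep and not line.lstrip().startswith("#") and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical file content, for illustration only.
    example = "ADMIN_PW=foo\nJUPYTERHUB_CRYPT_KEY=foo\nHOST_NAME=localhost\n"
    print(set_env_values(example, {
        "JUPYTERHUB_CRYPT_KEY": new_crypt_key(),
        "HOST_NAME": "bluewhale.dkrz.de",
    }))
```

The function works on text rather than on the file itself, so the result can be reviewed before it is written back to .env.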
Docker-compose and JupyterHub config need no changes. To be sure, check that the URL used for authentication is included in the WHITELIST_AUTH env value in docker-compose.yml. In jupyterhub_config.py, make sure that the section with ERDDAP-specific config is excluded via if False.
- Pull the DIVA notebook image. JupyterHub is not able to pull it by itself, so you must pull it manually. (A more recent version may be in use; check the DOCKER_JUPYTER_IMAGE value inside the .env file!)
docker pull abarth/divand-jupyter:2020-08-31T0716-501
- Start the service
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f
- Now deploy or restart the reverse proxy, without which the service cannot be reached from outside! See: https://github.com/SeaDataCloud/Documentation/wiki/Reverse-Proxy
- After every restart of DIVA, make sure to restart the revproxy too!
cd /root/revproxy
vi proxy.conf
# add locations for this service to config
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f
- Hub must be healthy! That takes about a minute.
docker ps -a | grep diva
- Can you log in locally at https://bluewhale.dkrz.de/diva/locallogin, using any name, and the ADMIN_PW? (In this case, no data is available.)
- Can you log in locally at https://bluewhale.dkrz.de/diva/locallogin, using an existing VRE name, and the ADMIN_PW? (In this case, data should be available.)
- Next, try logging in from the dashboard. For this, there must be a form on the dashboard that sends the user to this instance via POST.
- A container called diva-vre_xyzxyz should be spawned, and be healthy soon too.
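
The checklist above asks you to wait until the hub, and later the spawned container, report as healthy. A minimal polling sketch is shown below. It assumes the containers define a Docker health check and reads its state with docker inspect. The function names and the container name in the usage part are hypothetical, and the two-minute default timeout is just a generous reading of "about a minute".

```python
import subprocess
import time


def docker_health(container):
    """Ask the Docker CLI for a container's health status.

    Returns e.g. "starting", "healthy" or "unhealthy", and "unknown" if
    the container does not exist or defines no health check.
    """
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip() if result.returncode == 0 else "unknown"


def wait_until_healthy(container, timeout=120, interval=5,
                       get_status=docker_health, sleep=time.sleep):
    """Poll until the container is healthy; return False after timeout seconds."""
    waited = 0
    while True:
        if get_status(container) == "healthy":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Hypothetical container name; look up the real one with docker ps.
    name = "diva-vre_xyzxyz"
    print(name, "healthy" if wait_until_healthy(name) else "not healthy in time")
```

The status lookup and the sleep function are passed in as parameters so the loop can be tried out without a running Docker daemon.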
Once you have successfully deployed DIVA, do the same again for VIP DIVA, which provides more memory.
mkdir /root/diva_vip
cd /root/diva_vip
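wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/docker-compose.yml.vip
mv docker-compose.yml.vip docker-compose.yml # docker-compose looks for docker-compose.yml by default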
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/jupyterhub_config.py
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/env-diva-vip
mv env-diva-vip .env
# same changes to .env!!
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f
cd /root/revproxy
vi proxy.conf
# add locations for this service to config
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f
The Python and R notebooks are deployed with similar steps as above. The same jupyterhub_config.py is used. You need to pull another image; check the DOCKER_JUPYTER_IMAGE value inside the .env file! JupyterHub is not able to pull it by itself, so you must pull it manually.
mkdir /root/notebook_python_and_r
cd /root/notebook_python_and_r
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/jupyternotebooks/docker-compose.yml # almost the same as for diva
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/diva/jupyterhub_config.py # same as for diva
wget https://raw.githubusercontent.com/SeaDataCloud/vre-config/master/services/jupyternotebooks/env-r-and-python
mv env-r-and-python .env
# same changes to .env!!
docker pull jupyter/r-notebook:15a66513da301
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f
cd /root/revproxy
vi proxy.conf
# add locations for this service to config
docker-compose down && docker-compose up -d && docker-compose logs --tail=100 -f