# Daskhub installation

## Installation
This documentation describes our installation of Dask Gateway + JupyterHub on a Kubernetes backend, in particular using the daskhub helm chart. The official daskhub installation instructions are here. Helm and the other prerequisites have to be set up following the official documentation. The file values.yaml contains our desired configuration.
```
helm upgrade --wait --install --render-subchart-notes dhub dask/daskhub --values=values.yaml
```
## values.yaml configuration

Explanations for the relevant sections in our values.yaml:

### Tokens
The tokens are generated using:
```
openssl rand -hex 32
```
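As a sketch of where these tokens end up (assuming the standard daskhub chart layout; the exact keys may differ between chart versions), the same token is given to the JupyterHub service definition and to the Dask Gateway authentication block in values.yaml:

```yaml
# Sketch only: typical placement of the generated token in values.yaml
# for the daskhub chart. Verify the key paths against your chart version.
jupyterhub:
  hub:
    services:
      dask-gateway:
        apiToken: "<token from openssl rand -hex 32>"

dask-gateway:
  gateway:
    auth:
      jupyterhub:
        apiToken: "<same token as above>"
```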
JupyterHub
- JupyterHub is connected to ATLAS IAM. Users are created automatically the first time they connect.
- We can define the default image and a list of alternative images. More details on images can be found in the Daskhub image section.
- CVMFS can be installed on the cluster following these instructions and then mounted into the Jupyter pods.
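As an illustration only (the image names and the CVMFS volume definition below are placeholders and depend on how CVMFS is deployed on your cluster), the image selection and the CVMFS mount live under the singleuser section of the JupyterHub subchart:

```yaml
# Illustrative sketch of the jupyterhub.singleuser section; image names and
# the CVMFS volume are placeholders, not our actual configuration.
jupyterhub:
  singleuser:
    image:
      name: example-registry/analysis-notebook   # hypothetical default image
      tag: latest
    profileList:
      - display_name: "Default environment"
        default: true
      - display_name: "ML environment"
        kubespawner_override:
          image: example-registry/ml-notebook:latest   # hypothetical alternative image
    storage:
      extraVolumes:
        - name: cvmfs
          hostPath:
            path: /cvmfs   # assumes /cvmfs is available on the node after the CVMFS installation
      extraVolumeMounts:
        - name: cvmfs
          mountPath: /cvmfs
          mountPropagation: HostToContainer
```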
### Dask gateway
- Traefik service: note that the service for dask-gateway is marked as LoadBalancer. This makes it possible to connect to Dask Gateway directly from outside the K8s cluster.
- We have added CVMFS to the workers in the Dask cluster as well, although I believe it might be required in the Jupyter pod only.
- optionHandler: by default Dask Gateway does not allow users to overwrite the worker size (CPU, memory) or the image configuration. This has to be enabled explicitly, as sketched below.
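A minimal sketch of such an option handler, following the pattern from the Dask Gateway documentation (the limits, defaults and the image name are placeholders, not our actual values):

```yaml
dask-gateway:
  gateway:
    extraConfig:
      optionHandler: |
        # Expose worker size and image as user-settable cluster options.
        from dask_gateway_server.options import Options, Integer, Float, String

        def option_handler(options):
            return {
                "worker_cores": options.worker_cores,
                "worker_memory": "%fG" % options.worker_memory,
                "image": options.image,
            }

        c.Backend.cluster_options = Options(
            Integer("worker_cores", default=1, min=1, max=8, label="Worker cores"),
            Float("worker_memory", default=2, min=1, max=16, label="Worker memory (GiB)"),
            String("image", default="daskgateway/dask-gateway:latest", label="Worker image"),
            handler=option_handler,
        )
```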
## Optional Node Pool settings

It is advisable to run the Dask setup with multiple node pools offering different qualities of service:
- Main node pool with standard (non-preemptible) nodes. At the moment it is defined to be static. This is the default pool and will host the critical pods that do not tolerate interruptions well:
  - JupyterHub: necessary for running the Jupyter service.
  - Jupyter Notebook: it's not nice to kill the notebook while a user is working on it.
  - Dask Gateway: necessary for running the Dask Gateway interface.
  - Dask Scheduler: main component of each Dask cluster. If we lose it, we lose track of the whole task.
  - Dask Workers: can be scheduled to fill up the main node pool. Excess workers will spill over to the worker node pool.
- Worker node pool with preemptible nodes. This node pool has a taint and only pods that tolerate it will be scheduled on these nodes. Dask has fault tolerance built in: if a worker pod is lost, the scheduler will replace it with a new worker pod. The work done by that worker is lost, but not the whole task.
In order to set it up:
- Create the second node pool, enabling preemptible nodes (if you want) and your desired autoscaling limits. Under metadata add the taint `dask=worker` with effect `NoSchedule`. The key and value are arbitrary, but they have to match the toleration defined for the worker pods in the values.yaml template.
- Add the toleration under the worker configuration in values.yaml. Our sample configuration already includes the toleration; a sketch is shown below.
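As a rough sketch (assuming the dask-gateway subchart exposes extraPodConfig for the workers, as recent chart versions do), the toleration in values.yaml looks like:

```yaml
# Sketch of the worker toleration matching the dask=worker:NoSchedule taint.
dask-gateway:
  gateway:
    backend:
      worker:
        extraPodConfig:
          tolerations:
            - key: dask
              operator: Equal
              value: worker
              effect: NoSchedule
```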
- Edit the CVMFS daemonset and add the following toleration snippet inside the pod template. Otherwise those nodes will not mount CVMFS (you might not care).
```
$ kubectl edit daemonset cvmfs-nodeplugin -n cvmfs
...
tolerations:
- effect: NoSchedule
  key: dask
  operator: Equal
  value: worker
```
- Autoscaling for this node pool works only if you define pod disruption budgets:
```
kubectl create poddisruptionbudget konnectivity-agent --namespace=kube-system --selector k8s-app=konnectivity-agent --max-unavailable 1
kubectl create poddisruptionbudget kube-dns --namespace=kube-system --selector k8s-app=kube-dns --max-unavailable 1
kubectl create poddisruptionbudget event-exporter-gke --namespace=kube-system --selector k8s-app=event-exporter-gke --max-unavailable 1
kubectl create poddisruptionbudget metrics-server --namespace=kube-system --selector k8s-app=metrics-server --max-unavailable 1
kubectl create poddisruptionbudget konnectivity-agent-autoscaler --namespace=kube-system --selector k8s-app=konnectivity-agent-autoscaler --max-unavailable 1
```
- Critical GPU node pool. You can add node pools with different GPU flavours to your K8s cluster. Since these nodes are started exclusively for a user, it takes around 10 minutes for one to be ready (start the VM, add it to the K8s cluster, install CVMFS, download the large-ish ML image). For the GPU profiles you need to include the necessary changes (see the sketch at the end of this section):
  - GKE will only schedule pods to these nodes when the resource requests/limits specify that they need an NVIDIA GPU.
  - Use node selectors to direct the pods to specific GPU families.
- Worker GPU node pool: can be added in order to run Dask clusters with multiple GPUs. At the time of writing we do not have much experience with running TensorFlow across a Dask cluster.
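For the GPU profiles mentioned above, a rough sketch of a profileList entry (the image and the accelerator label value are placeholders; adapt them to your GPU node pools and images):

```yaml
# Hypothetical GPU profile: requests an NVIDIA GPU and pins the pod to a
# specific GPU family via the GKE accelerator node label.
jupyterhub:
  singleuser:
    profileList:
      - display_name: "GPU (ML) environment"
        description: "Starts on the GPU node pool; expect ~10 minutes for the node to come up."
        kubespawner_override:
          image: example-registry/ml-gpu-notebook:latest   # hypothetical GPU-enabled image
          extra_resource_limits:
            nvidia.com/gpu: "1"
          node_selector:
            cloud.google.com/gke-accelerator: nvidia-tesla-t4   # example GPU family label
```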