Deployment - vmware/versatile-data-kit GitHub Wiki

Part 2: Data Job Deployment


Previous Section: Scheduling a Data Job for automatic execution.

Deployment takes both the build/code and the deployment-specific properties, then builds and packages them. Once a Data Job is deployed, it is ready for immediate execution in the execution environment and can be scheduled to run periodically.

To deploy a job using Jupyter Notebooks, follow this guide.


Now that we are done with the modifications to the Data Job, we will deploy it in the local Control Service by using the following command:

vdk deploy -n hello-world -t my-team -p ./hello-world -r "initial commit"

This will submit the code of the Data Job to the Control Service and create a Data Job Deployment. The deployment process is asynchronous: the command returns quickly, but it takes a while before the Data Job is deployed and ready for execution. You can check whether the Data Job Deployment has completed by running the following command:

vdk deploy --show -n hello-world -t my-team

If the deployment is still ongoing, you will get the following output:

No deployments.

When the deployment completes, the command will print the following:

job_name     job_version       last_deployed_by    last_deployed_date           enabled
-----------  ----------------  ------------------  ---------------------------  ---------
hello-world  5000/hello-world                      2021-09-14T12:06:32.999160Z  True
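
Because the deployment is asynchronous, it can be convenient to poll the command above instead of rerunning it by hand. The following is a minimal sketch and not part of VDK: `wait_for_deployment` is a hypothetical helper name, and the retry count and sleep interval are arbitrary defaults. It assumes, based on the two outputs shown above, that the job name appears in the output only once the deployment is listed.

```shell
# Hypothetical helper (not part of VDK): poll `vdk deploy --show` until the
# job name appears in its output.
# Usage: wait_for_deployment JOB TEAM [MAX_ATTEMPTS] [SLEEP_SECONDS]
wait_for_deployment() {
  job="$1"
  team="$2"
  max_attempts="${3:-30}"   # arbitrary default
  sleep_seconds="${4:-10}"  # arbitrary default
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Same command as in the guide; the output is captured so it can be inspected.
    if output="$(vdk deploy --show -n "$job" -t "$team" 2>&1)"; then
      case "$output" in
        *"$job"*)
          # Assumption: the job name only shows up once the deployment table is printed.
          printf '%s\n' "$output"
          return 0
          ;;
      esac
    fi
    sleep "$sleep_seconds"
    attempt=$((attempt + 1))
  done
  echo "No deployment of $job visible after $max_attempts attempts" >&2
  return 1
}

# Example: wait_for_deployment hello-world my-team
```

The function returns a non-zero status if the deployment never shows up, so it can gate later steps in a script.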

For the curious: what is going on behind the scenes?

If you have [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) installed, you can observe the Deployment creation process directly in the Kind cluster. To do this, first get all the pods in the cluster by using:

kubectl get pods

This will list all pods in the cluster. The one of interest starts with `builder-hello-world` and is dedicated to creating the Data Job image from the Data Job's source code. This image is subsequently used for the job executions. The builder pod will look like this:

NAME                                       READY   STATUS             RESTARTS      AGE
builder-hello-world--1-kcvt9               1/1     Running            0             4s
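
To pick out the builder pod without scanning the whole list, the output of `kubectl get pods` can be filtered. This is a minimal sketch, assuming the builder pod name starts with `builder-` followed by the job name, as in the listing above; `builder_pods` is a hypothetical helper name.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# name and status of pods whose name starts with "builder-<job name>".
# Usage: kubectl get pods | builder_pods JOB
builder_pods() {
  # NR > 1 skips the header row; $1 is the NAME column, $3 the STATUS column.
  awk -v prefix="builder-$1" 'NR > 1 && index($1, prefix) == 1 { print $1, $3 }'
}

# Example: kubectl get pods | builder_pods hello-world
```

An empty result means no builder pod is currently listed for that job.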

Once this pod completes, the Control Service will create a [cronjob](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/) corresponding to the Data Job Deployment. This cronjob is responsible for scheduling the job executions. Listing the cronjobs in the cluster with `kubectl get cronjobs` will show:

NAME                 SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
hello-world-latest   */2 * * * *   False     0        66s             8m33s

➡️ Next Section: Execution
