Deployment - vmware/versatile-data-kit GitHub Wiki
Previous Section: Scheduling a Data Job for automatic execution.
Deployment takes both the Data Job's code and its deployment-specific properties, then builds and packages them. Once a Data Job is deployed, it is ready for immediate execution in the execution environment and can be scheduled to run periodically.
To deploy a job using Jupyter Notebooks, follow this guide.
Now that we are done with the modifications to the Data Job, we will deploy it in the local Control Service by using the following command:
vdk deploy -n hello-world -t my-team -p ./hello-world -r "initial commit"
This submits the code of the Data Job to the Control Service and creates a Data Job Deployment. The Deployment process is asynchronous: the command returns quickly, but it takes a while until the Data Job is deployed and ready for execution. You can check whether the Data Job Deployment has completed by running the following command:
vdk deploy --show -n hello-world -t my-team
If the deployment is still ongoing, you will get the following output:
No deployments.
When the deployment completes, the command will print the following:
job_name     job_version       last_deployed_by    last_deployed_date           enabled
-----------  ----------------  ------------------  ---------------------------  ---------
hello-world  5000/hello-world                      2021-09-14T12:06:32.999160Z  True
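Because the deployment is asynchronous, the check can be repeated in a loop until the deployment is listed. The following is a minimal sketch, not part of the `vdk` CLI: `wait_for_deployment` is a hypothetical helper, and it assumes that `vdk deploy --show` prints `No deployments.` until the deployment exists and afterwards prints a table with a row containing the job name. The attempt count and delay are arbitrary defaults.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_deployment is a hypothetical helper, not a vdk command.
# Assumption: `vdk deploy --show` prints "No deployments." while the deployment
# is still being created, and a table mentioning the job name once it is ready.
wait_for_deployment() {
  local job="$1" team="$2" attempts="${3:-30}" delay="${4:-10}"
  local i=1 output
  while [ "$i" -le "$attempts" ]; do
    # Succeed only if vdk exits cleanly, does not report "No deployments",
    # and its output mentions the job name.
    if output="$(vdk deploy --show -n "$job" -t "$team")" \
      && ! printf '%s\n' "$output" | grep -qF "No deployments" \
      && printf '%s\n' "$output" | grep -qF -- "$job"; then
      printf '%s\n' "$output"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Deployment of $job not listed after $attempts checks" >&2
  return 1
}
```

For the job in this tutorial it could be called as `wait_for_deployment hello-world my-team`, which checks up to 30 times with 10 seconds between checks.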
For the curious: what is going on behind the scenes?
If you have [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) you can observe the Deployment creation process directly in the Kind cluster. To do this, first get all the pods in the cluster by using:
kubectl get pods
This will list all pods in the cluster. The one of interest starts with `builder-hello-world` and is dedicated to creating the Data Job image from the Data Job's source code. This image will be subsequently used for the job execution. The builder pod will look like this:
NAME                           READY   STATUS    RESTARTS   AGE
builder-hello-world--1-kcvt9   1/1     Running   0          4s
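To pick out the builder pod without scanning the full listing, the pod names can be filtered by prefix. This is a small sketch under assumptions: `builder_pods` is a hypothetical helper, `kubectl` is assumed to point at the Kind cluster, and builder pods are assumed to be named `builder-<job name>...` as in the listing above.

```shell
#!/usr/bin/env bash
# Sketch only: builder_pods is a hypothetical helper.
# `kubectl get pods -o name` prints one "pod/<name>" entry per line;
# keep only the entries whose pod name starts with "builder-<job>".
builder_pods() {
  local job="$1"
  # `|| true` keeps the exit status zero when no builder pod exists (yet or anymore).
  kubectl get pods -o name | grep "^pod/builder-${job}" || true
}
```

While the pod is running, its build output can then be followed with `kubectl logs -f "$(builder_pods hello-world | head -n 1)"`.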
Once this pod completes, the Control Service will create a [cronjob](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/) corresponding to the Data Job Deployment. This cronjob is responsible for scheduling the job executions. Listing the cronjobs in the cluster with `kubectl get cronjobs` will show:
NAME                 SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
hello-world-latest   */2 * * * *   False     0        66s             8m33s
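The schedule `*/2 * * * *` is standard cron syntax for "every two minutes". To read just the schedule and the suspend flag of a single cronjob, a `jsonpath` output template can be used. The sketch below assumes the cronjob is named `hello-world-latest`, as in the listing above; `cronjob_schedule` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: cronjob_schedule is a hypothetical helper.
# Prints the cron schedule and the suspend flag of one cronjob,
# separated by a tab, using kubectl's jsonpath output format.
cronjob_schedule() {
  local name="$1"
  kubectl get cronjob "$name" \
    -o jsonpath='{.spec.schedule}{"\t"}{.spec.suspend}{"\n"}'
}
```

Calling `cronjob_schedule hello-world-latest` is a quick way to confirm that the schedule set for the Data Job reached the cluster.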