ExampleHome - vmware/versatile-data-kit GitHub Wiki
Versatile Data Kit (VDK) is a data framework that enables Data Engineers to 🧑‍💻 develop and 📊 manage data workloads, aka data jobs.
- Ingest data from different sources.
- Use Python/SQL and VDK templates to transform data.
- Package, version, and deploy data applications while dealing with credentials, retries, and reconnects.
- Provide built-in monitoring and smart notification capabilities.
- Track code and data modifications for quicker troubleshooting and version rollback.
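To make the ingestion item above more concrete, here is a minimal sketch of a single Python step in a data job. It assumes the usual VDK convention that a step exposes a `run(job_input)` function and that the job input offers `send_object_for_ingestion`. The file name, the table name `cities`, the example rows, and the `RecordingJobInput` stand-in are all hypothetical and exist only so the sketch can be run without VDK installed.

```python
# Sketch of one data job step, e.g. a file named 10_ingest.py (hypothetical name).
# In a real job the argument is typically annotated as IJobInput from
# vdk.api.job_input; the annotation is omitted here so the file runs without VDK.


def build_rows():
    """Return a few hard-coded example records (made-up data)."""
    return [
        {"id": 1, "city": "Sofia"},
        {"id": 2, "city": "Palo Alto"},
    ]


def run(job_input):
    """Entry point that VDK is assumed to call for a Python step."""
    for row in build_rows():
        # Queue each record for ingestion into the hypothetical table "cities".
        job_input.send_object_for_ingestion(
            payload=row,
            destination_table="cities",
        )


class RecordingJobInput:
    """Hypothetical stand-in for the real job input, used for a local dry run."""

    def __init__(self):
        self.sent = []

    def send_object_for_ingestion(self, payload, destination_table, **kwargs):
        # Record the call instead of sending anything anywhere.
        self.sent.append((destination_table, payload))


if __name__ == "__main__":
    fake = RecordingJobInput()
    run(fake)
    print(len(fake.sent), "rows queued")
```

Running the file directly only exercises the stand-in. Inside a real data job, VDK supplies the job input and the step file would contain just `build_rows` and `run`.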
See our introduction blog post
All getting started examples work in Google Colab (link) or any installation of VDK. If you want to run the examples locally, install quickstart VDK:
pip install quickstart-vdk
This installs the core vdk packages and the vdk command line interface. You can use them to run jobs in your local shell environment. Then you can run
vdk dev-studio --start
to start a local notebook server and follow the instructions there.
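If the `vdk` command is not found after installation, a quick check like the one below may help. It is only a sketch using the Python standard library: it looks up the installed version of the `quickstart-vdk` distribution and the location of the `vdk` executable on the `PATH`. The helper name `check_vdk_install` is made up for this example.

```python
import shutil
from importlib import metadata


def check_vdk_install(dist_name="quickstart-vdk", cli_name="vdk"):
    """Report the installed package version and the CLI location, if any."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this Python environment.
        version = None
    # shutil.which returns None when the executable is not on the PATH.
    return {"version": version, "cli_path": shutil.which(cli_name)}


if __name__ == "__main__":
    print(check_vdk_install())
```

A `None` for `version` suggests the package was installed into a different Python environment. A `None` for `cli_path` suggests the environment's scripts directory is not on the `PATH`.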
Install VDK Server | Deploy Job | Rollback Job to latest stable version | Schedule Job | Monitor Job with Operations UI |
➡️ See the Installation page for more details.
See also the Getting Started section of the wiki.
➡️ See the Interfaces page for more details.