Ingestion - vmware/versatile-data-kit GitHub Wiki

A typical Ingestion job:

  • Extracts data from various sources (HTTP APIs, databases, CSV files, etc.).
  • Does NOT transform the data, apart from formatting the payload so that the target accepts it (e.g. JSON serialization).
  • Loads the data into your preferred ingestion target (database, cloud storage).
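
The three steps above can be sketched as a single job step. This is a minimal illustration, not a production job: `RecordingJobInput`, `SAMPLE_CSV` and `example_table` are hypothetical names introduced only so the sketch runs without VDK installed. In a real data job, VDK passes the `job_input` object to `run` for you.

```python
import csv
import io

# In a real data job step you would typically write:
#   from vdk.api.job_input import IJobInput
#   def run(job_input: IJobInput): ...
# RecordingJobInput is a hypothetical stand-in so this sketch runs on its own.


class RecordingJobInput:
    """Hypothetical stand-in that records payloads instead of sending them."""

    def __init__(self):
        self.sent = []

    def send_object_for_ingestion(self, payload, destination_table):
        self.sent.append((destination_table, payload))


# Hypothetical source data; a real job might read a file or call an HTTP API.
SAMPLE_CSV = "id,name\n1,alice\n2,bob\n"


def run(job_input):
    # Extract: read rows from the source.
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    for row in reader:
        # No transformation: each row is passed through as a plain dictionary.
        # Load: hand the row over to the configured ingestion target.
        job_input.send_object_for_ingestion(dict(row), "example_table")


job_input = RecordingJobInput()
run(job_input)
print(len(job_input.sent), "rows sent for ingestion")
```

Keeping the step free of business logic means the raw data lands in the target unchanged, and any reshaping can happen later in a separate transformation job.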

Ingesting data

As usual, it is a one-liner:

Example: send any JSON-able Python object for ingestion

    # The payload here is a dictionary; its values must be JSON-serializable.
    job_input.send_object_for_ingestion(
        {'some number': 4098, 'some text': "hi!"},
        "name_of_table_that_receives_the_data"
    )
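
Source records often contain values that are not JSON-able as-is, such as dates or decimals. The sketch below shows one possible way to format a record before sending it. The helpers `to_json_safe` and `prepare_payload` are hypothetical and not part of VDK, and the conversions chosen here (ISO strings for dates, strings for decimals) are assumptions that may not suit every target.

```python
import json
from datetime import date, datetime
from decimal import Decimal


def to_json_safe(value):
    """Convert a few common non-JSON types (illustrative, not exhaustive)."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)
    return value


def prepare_payload(record):
    """Return a copy of the record whose values are JSON-serializable."""
    payload = {key: to_json_safe(value) for key, value in record.items()}
    # json.dumps raises TypeError if a value is still not serializable,
    # so problems surface here rather than at the ingestion target.
    json.dumps(payload)
    return payload


record = {"id": 7, "price": Decimal("9.99"), "created": date(2024, 1, 31)}
payload = prepare_payload(record)
print(payload)
```

The resulting `payload` can then be passed to `job_input.send_object_for_ingestion` in the same way as the dictionary in the example above.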

For real-life production use cases, see the examples listed below.

Ingestion examples:

  • Ingesting data from REST API into Database
  • Ingesting data from DB into Database
  • Ingesting local CSV file into Database
  • Incremental ingestion using Job Properties
  • Ingesting data from an authenticated REST API using Secrets

Jupyter Ingestion Tutorial

VDK Ingestion Tutorial with Jupyter Notebooks

Videos

▶️ Data Ingestion Intro
▶️ Incremental Ingestion

All VDK Examples

All VDK examples can be found here

➡️ Next section: Transformation