Transformation - vmware/versatile-data-kit GitHub Wiki

A typical Processing job:

  • Creates a materialized view
  • Data comes from a source database
  • Data goes to a target database
  • Data in the target database is in a star schema
  • The schema is populated using standard fact and dimension loading strategies. The relevant ones are implemented in the platform, so applying one is a one-liner in Data Job code.
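
As a rough illustration of the one-liner mentioned in the last bullet, a Python step of a Data Job might look like the sketch below. It assumes the `execute_template` method on the job input object that VDK passes to `run`, and a template named `scd1` taking `source_schema`, `source_view`, `target_schema`, and `target_table` arguments. The schema, view, and table names are hypothetical, and the templates actually available depend on the installed database plugin.

```python
# Sketch of a Data Job Python step. All schema, view, and table names
# below are hypothetical and only serve as an illustration.
def run(job_input):
    # VDK calls run() for each Python step and passes in the job input object.
    # Assumed: the 'scd1' template name and its argument names; check the
    # documentation of the database plugin in use before relying on them.
    job_input.execute_template(
        template_name="scd1",
        template_args={
            "source_schema": "staging",
            "source_view": "vw_dim_customer",
            "target_schema": "warehouse",
            "target_table": "dim_customer",
        },
    )
```

The transformation logic itself lives in the source view; the step only names the source and the target and leaves the loading strategy to the template.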

VDK provides:

  • SQL and Python parameterized transformations.
  • Extensible templates for data modeling.
  • The ability to create a dataset or table as a product.
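
To make the idea of a parameterized transformation concrete, here is a minimal sketch that uses only Python's standard library and an in-memory SQLite database. It is not VDK's implementation: the `render` and `transform` helpers, the `{name}` placeholder style, and the table names are all assumptions chosen for illustration.

```python
import sqlite3

# Hypothetical SQL transformation. {source_table} and {target_table} are
# identifier placeholders; the "?" is a regular bound value parameter.
TRANSFORM_SQL = """
INSERT INTO {target_table} (country, total_amount)
SELECT country, SUM(amount)
FROM {source_table}
WHERE amount >= ?
GROUP BY country
"""


def render(sql: str, params: dict) -> str:
    """Substitute identifier placeholders, accepting only simple names."""
    for value in params.values():
        if not value.isidentifier():
            raise ValueError(f"unsafe identifier: {value!r}")
    return sql.format(**params)


def transform(conn: sqlite3.Connection, params: dict, min_amount: float) -> list:
    """Run the parameterized transformation and return the target rows."""
    target = render("{target_table}", params)
    conn.execute(f"CREATE TABLE {target} (country TEXT, total_amount REAL)")
    conn.execute(render(TRANSFORM_SQL, params), (min_amount,))
    return conn.execute(
        f"SELECT country, total_amount FROM {target} ORDER BY country"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("BG", 10.0), ("BG", 5.0), ("US", 7.5), ("US", 0.5)],
)

result = transform(
    conn,
    {"source_table": "orders", "target_table": "sales_by_country"},
    min_amount=1.0,
)
print(result)
```

The same SQL text can be reused against different source and target tables by changing only the parameter dictionary, which is the point of parameterizing a transformation.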

Get started with transforming data:

  • Data Modeling: Treating Data as a Product
  • Processing data using SQL and local database
  • Processing data using Kimball warehousing templates

VDK Templates

VDK provides SQL Data Processing Templates:

  • Slowly Changing Dimension Type 1
  • Slowly Changing Dimension Type 2
  • Append Strategy
  • Insert Strategy
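
The difference between the two Slowly Changing Dimension strategies can be sketched with plain Python and SQLite. This follows the general Kimball definitions (Type 1 overwrites an attribute, Type 2 keeps history as versioned rows) and is not the code of the VDK templates. The `load_scd1` and `load_scd2` helpers, the table layouts, and the `valid_from`/`valid_to` column names are assumptions for illustration; the Append and Insert strategies are not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_scd1 (customer_id INTEGER PRIMARY KEY, city TEXT)")
conn.execute(
    "CREATE TABLE dim_scd2 "
    "(customer_id INTEGER, city TEXT, valid_from TEXT, valid_to TEXT)"
)


def load_scd1(conn, rows):
    """Type 1: overwrite the attribute in place, so no history is kept."""
    conn.executemany(
        "INSERT OR REPLACE INTO dim_scd1 (customer_id, city) VALUES (?, ?)", rows
    )


def load_scd2(conn, rows, load_date):
    """Type 2: close the current version and insert a new one on change."""
    for customer_id, city in rows:
        current = conn.execute(
            "SELECT city FROM dim_scd2 WHERE customer_id = ? AND valid_to IS NULL",
            (customer_id,),
        ).fetchone()
        if current is not None and current[0] == city:
            continue  # attribute unchanged, keep the open version as is
        if current is not None:
            conn.execute(
                "UPDATE dim_scd2 SET valid_to = ? "
                "WHERE customer_id = ? AND valid_to IS NULL",
                (load_date, customer_id),
            )
        conn.execute(
            "INSERT INTO dim_scd2 VALUES (?, ?, ?, NULL)",
            (customer_id, city, load_date),
        )


# First load, then a second load in which customer 1 moves to another city.
load_scd1(conn, [(1, "Sofia"), (2, "Austin")])
load_scd1(conn, [(1, "Plovdiv")])
load_scd2(conn, [(1, "Sofia"), (2, "Austin")], "2024-01-01")
load_scd2(conn, [(1, "Plovdiv"), (2, "Austin")], "2024-02-01")

scd1_rows = conn.execute(
    "SELECT customer_id, city FROM dim_scd1 ORDER BY customer_id"
).fetchall()
scd2_rows = conn.execute(
    "SELECT customer_id, city, valid_from, valid_to FROM dim_scd2 "
    "ORDER BY customer_id, valid_from"
).fetchall()
print(scd1_rows)
print(scd2_rows)
```

After both loads, the Type 1 table holds one row per customer with only the latest city, while the Type 2 table holds two rows for customer 1: a closed version for the old city and an open version for the new one.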

➡️ Next Section: Data Processing Templates