DBT - davidkhala/ETL GitHub Wiki

DBT

dbt lets analysts and engineers transform the data in their warehouse by simply writing select statements.
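The idea that a transformation is "just a select statement" can be sketched outside dbt. The snippet below is a minimal illustration, not dbt's implementation: it uses Python's built-in sqlite3 as a stand-in warehouse, and the table, column, and view names (raw_orders, customer_totals) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table raw_orders (id integer, customer text, amount real)")
conn.executemany(
    "insert into raw_orders values (?, ?, ?)",
    [(1, "alice", 10.0), (2, "alice", 5.0), (3, "bob", 7.5)],
)

# The "model" is only a select statement (hypothetical table and column names).
MODEL_SQL = """
select customer, sum(amount) as total_amount
from raw_orders
group by customer
"""

# The surrounding DDL is generated around the select, so the author never writes it.
conn.execute(f"create view customer_totals as {MODEL_SQL}")

totals = conn.execute(
    "select customer, total_amount from customer_totals order by customer"
).fetchall()
print(totals)
```

The author only maintains the select; wrapping it in a view or table is left to the tool.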

DBT project

At minimum, all a project needs is the dbt_project.yml project configuration file.
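As a rough sketch of how small a project skeleton can be, the Python snippet below writes a dbt_project.yml and one model file into a temporary directory. The keys shown are commonly seen in dbt_project.yml, but the values (my_project, my_profile) are hypothetical, and which keys are strictly required depends on the dbt version, so treat this as an assumption to check against the dbt reference.

```python
from pathlib import Path
import tempfile

# Commonly used dbt_project.yml keys; all values here are hypothetical.
PROJECT_YML = """\
name: my_project
version: "1.0.0"
config-version: 2
profile: my_profile
model-paths: ["models"]
"""

project_dir = Path(tempfile.mkdtemp()) / "my_project"
(project_dir / "models").mkdir(parents=True)  # also creates project_dir
(project_dir / "dbt_project.yml").write_text(PROJECT_YML)
# A model is a single file containing a select statement.
(project_dir / "models" / "example.sql").write_text("select 1 as id\n")

created = sorted(p.relative_to(project_dir).as_posix() for p in project_dir.rglob("*"))
print(created)
```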

DBT Resource

Resource	Description
models	Each model lives in a single file and contains logic that either transforms raw data into a dataset that is ready for analytics or, more often, is an intermediate step in such a transformation.
snapshots	A way to capture the state of your mutable tables so you can refer to it later.
seeds	CSV files with static data that you can load into your data platform with dbt.
tests	SQL queries that you can write to test the models and resources in your project.
macros	Blocks of code that you can reuse multiple times.
docs	Docs for your project that you can build.
sources	A way to name and describe the data loaded into your warehouse by your Extract and Load tools.
exposures	A way to define and describe a downstream use of your project.
metrics	A way for you to define metrics for your project.
analysis	A way to organize analytical SQL queries in your project such as the general ledger from your QuickBooks.

dbt snapshot

A dbt snapshot records changes to a mutable table over time, as compensation when the source does not keep its own history.

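The snapshot table carries extra metadata columns, including dbt_valid_from and dbt_valid_to, which record when each version of a row became valid and when it was superseded.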
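The validity-window bookkeeping can be illustrated with a small pure-Python sketch. This is a simplified stand-in, not dbt's actual algorithm: it compares a single tracked column (status) on each run, and the function name take_snapshot and the sample rows are hypothetical.

```python
from datetime import datetime

def take_snapshot(history, current_rows, run_at):
    """Append type-2 history rows for new or changed records (simplified)."""
    # Index the currently open version (dbt_valid_to is None) of each key.
    open_rows = {r["id"]: r for r in history if r["dbt_valid_to"] is None}
    for row in current_rows:
        previous = open_rows.get(row["id"])
        if previous is not None and previous["status"] == row["status"]:
            continue  # unchanged, keep the open version as-is
        if previous is not None:
            previous["dbt_valid_to"] = run_at  # close the superseded version
        history.append({**row, "dbt_valid_from": run_at, "dbt_valid_to": None})
    return history

history = []
take_snapshot(history, [{"id": 1, "status": "pending"}], datetime(2024, 1, 1))
take_snapshot(history, [{"id": 1, "status": "shipped"}], datetime(2024, 1, 2))
for version in history:
    print(version)
```

After the second run the "pending" version is closed and the "shipped" version is open. A value that appears and is overwritten between two runs is never observed, which is why snapshots need to run frequently to be useful.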

Blogs

Use case

source data systems are not built to store historical data
DBT snapshots are only useful if you run them frequently
You need data enrichment in the output model.
Type-2 Slowly Changing Dimension

dbt seed

Use case

When

Dimension table: A list of mappings of country codes to country names
A list of test emails to exclude from analysis
A list of employee account IDs

When not

Loading raw data that has been exported to CSVs
Any kind of production data containing sensitive information, for example personally identifiable information (PII) and passwords.
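The country-code mapping is a typical seed: a small, static CSV loaded as a table and joined to other data. The sketch below mimics that with Python's csv and sqlite3 modules rather than dbt itself; the file contents and the users table are hypothetical.

```python
import csv
import io
import sqlite3

# Stand-in for a small, static seed file (hypothetical contents).
SEED_CSV = """country_code,country_name
US,United States
DE,Germany
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table country_codes (country_code text, country_name text)")
rows = list(csv.DictReader(io.StringIO(SEED_CSV)))
conn.executemany(
    "insert into country_codes values (:country_code, :country_name)", rows
)

# Data that only stores the code is enriched through the seed table.
conn.execute("create table users (id integer, country_code text)")
conn.executemany("insert into users values (?, ?)", [(1, "US"), (2, "DE")])
enriched = conn.execute(
    "select u.id, c.country_name from users u "
    "join country_codes c using (country_code) order by u.id"
).fetchall()
print(enriched)
```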

dbt test
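As the resource table notes, tests are SQL queries. A common convention is that the query selects the rows that violate an expectation, so an empty result means the test passes. The sketch below shows that convention with sqlite3 only; the orders table and the non-negative amount rule are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (id integer, amount real)")
conn.executemany("insert into orders values (?, ?)", [(1, 10.0), (2, -3.0)])

# The test query selects the rows that violate the expectation.
TEST_SQL = "select id, amount from orders where amount < 0"

failures = conn.execute(TEST_SQL).fetchall()
status = "pass" if not failures else "fail"
print(status, failures)
```

Here the second order has a negative amount, so the query returns one row and the check fails.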

DBT cloud

Demos with Experts (2023)

Ecosystem

Other refs

using-wildcard-source-tables