Airflow Notes - ayaohsu/Personal-Resources GitHub Wiki

Workflows are call DAGs (Directed Acyclic Graph). A DAG is a collection of all the tasks you want to run, organized in a way that reflects their relationships and dependencies.

Operators:

  • Sensors: Sensors will keep running until a certain criteria is met (e.g. a certain time, external file, or upstream data source)
  • Operators: Operators trigger a certain action (e.g. run a bash command, execute a python function, etc.)
  • Transfers: This moves data from one location to another (e.g. move data from MySql to Hive)

DagRun: A DAG with a specific execution time.
TaskInstance: The task belonging to a specific DagRun.

On the tree view: A circle represents a DAG run, and a square represents a TaskInstance.