Data Engineering - tarunchhabra/parakalo GitHub Wiki

Data sources - API, DB, S3, EFS
Data file formats - Parquet, CSV, Binary
Data processing - Spark, Flink, Presto
Data pipeline workflow - Apache Airflow, NiFi, Luigi, Azkaban
Data storage - NoSQL DB, DWH, S3 data lake
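
The categories above line up with the stages of a pipeline: a source, a processing step, and a storage target. A minimal sketch of that flow, using only the Python standard library, might look like the following. It is an illustration, not a recommendation of tooling: the inline CSV sample, the `user_totals` table, and the function names `extract`, `transform`, `load`, and `run_pipeline` are all hypothetical, and an in-memory SQLite database stands in for a real store such as a DWH.

```python
import csv
import io
import sqlite3

# Hypothetical sample input standing in for a CSV data source.
RAW_CSV = """user_id,amount
1,10.50
2,3.25
1,4.00
"""


def extract(text):
    """Source stage: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Processing stage: sum the amounts per user."""
    totals = {}
    for row in rows:
        user = int(row["user_id"])
        totals[user] = totals.get(user, 0.0) + float(row["amount"])
    return totals


def load(totals, conn):
    """Storage stage: write the aggregated totals to a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_totals "
        "(user_id INTEGER PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO user_totals VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()


def run_pipeline(text, conn):
    """Run extract -> transform -> load, then read the stored rows back."""
    load(transform(extract(text)), conn)
    query = "SELECT user_id, total FROM user_totals ORDER BY user_id"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a real storage layer
    print(run_pipeline(RAW_CSV, conn))
```

In a real setup, a workflow tool such as Airflow or Luigi would schedule and order these stages, and an engine such as Spark would replace the in-process `transform` once data no longer fits on one machine.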

Monitoring - Datadog, Grafana
Incident management - PagerDuty, OpsGenie

Data pipelines - https://www.youtube.com/watch?v=Hv1XiSsouU8

Databases vs DWH vs data lakes - https://www.youtube.com/watch?v=-bSkREem8dM