Home - SingingData/DataSystems GitHub Wiki

My Approach to Telling a Data Project Code Story

Business and Customer Goals

The goals describe the questions motivating this project. What problem or opportunity does the business need to address? Who will consume the information? Within what operational constraints must the data be delivered?

Outputs

Outputs feature the end reporting or consumption experience. Screenshots, vignettes, and descriptions of the data project output in operation provide vivid detail of the deliverable.

Data Pipeline Overview

The data pipeline overview describes the high-level architecture and data flows from points of origin through ingestion, transformation, and delivery. A high-level flow diagram is accompanied by a brief description of the technology involved.
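As a rough illustration of the ingestion, transformation, and delivery stages named above, the following minimal sketch chains three small functions. The CSV content, field names, and function names (`ingest`, `transform`, `deliver`, `run_pipeline`) are all hypothetical and are not taken from any project in this wiki.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw input; a real project would read from its actual source.
RAW_CSV = """order_id,region,amount
1,east,10.50
2,west,4.25
3,east,7.00
"""


def ingest(text):
    """Ingestion: parse raw CSV text into a list of dict rows (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: cast amounts to float and total them per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += float(row["amount"])
    return dict(totals)


def deliver(totals):
    """Delivery: render the totals as report lines, sorted by region."""
    return [f"{region}: {total:.2f}" for region, total in sorted(totals.items())]


def run_pipeline(text):
    """Run the three stages in order, from point of origin to output."""
    return deliver(transform(ingest(text)))


report = run_pipeline(RAW_CSV)
```

Keeping each stage as its own function mirrors the boxes in a flow diagram, so the overview and the code can be read side by side.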

Code

The code section details all the custom code written at each stage to move the data through the transformations and into the end outputs.

Data Structure Progression

The data structure progression provides vignettes of the data structure at each phase. This concrete depiction allows the reader to understand the journey through transformation to output.
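One possible way to produce such vignettes is to record the field names and value types of a representative record at each phase. The sketch below assumes three phases (raw, cleaned, output) and a made-up record; the helper `describe` and every field name are illustrative only.

```python
from datetime import datetime


def describe(record):
    """Return a {field: type name} summary of one record."""
    return {field: type(value).__name__ for field, value in record.items()}


# Hypothetical record as it might arrive at the point of origin: everything is text.
raw = {"ts": "2024-01-05T10:00:00", "user": " Ada ", "amount": "12.50"}

# After cleaning: the timestamp is parsed, whitespace trimmed, the amount cast to a number.
cleaned = {
    "ts": datetime.fromisoformat(raw["ts"]),
    "user": raw["user"].strip(),
    "amount": float(raw["amount"]),
}

# Output shape: reduced to the fields a report would consume.
output = {
    "day": cleaned["ts"].date().isoformat(),
    "user": cleaned["user"],
    "total": cleaned["amount"],
}

progression = {
    "raw": describe(raw),
    "cleaned": describe(cleaned),
    "output": describe(output),
}

for phase, shape in progression.items():
    print(phase, shape)
```

Printing the shapes in phase order gives the reader a compact view of how fields are renamed, retyped, or dropped along the way.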

Data Samples

Data samples allow readers to reconstruct the data project for themselves. In some cases the samples may be synthesized to protect the privacy of project information.
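Where real data cannot be shared, a seeded random generator is one simple way to synthesize samples that others can regenerate exactly. The sketch below is an assumption about how that might look; the function `synthesize_orders`, the schema, and the value ranges are invented for illustration.

```python
import random

# Hypothetical category values for the synthetic records.
REGIONS = ["east", "west", "north", "south"]


def synthesize_orders(n, seed=0):
    """Generate n fake order records; the same seed always yields the same records."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return [
        {
            "order_id": i + 1,
            "region": rng.choice(REGIONS),
            "amount": round(rng.uniform(1, 100), 2),
        }
        for i in range(n)
    ]


samples = synthesize_orders(5, seed=42)
```

Publishing the seed alongside the generator lets a reader reproduce the exact sample set without any real project data being exposed.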