Home - SingingData/DataSystems GitHub Wiki
My Approach to Telling a Data Project Code Story
Business and Customer Goals
The goals describe the motivating questions behind this project. What problem or opportunity does the business need to address? Who will consume the information? Within what operational constraints must the data be delivered?
Outputs
Outputs feature the final reporting or consumption experience. Screenshots, vignettes, and descriptions of the data project output in operation provide vivid detail of the deliverable.
Data Pipeline Overview
The data pipeline overview describes the high-level architecture and the data flows from points of origin through ingestion, transformation, and delivery. A high-level flow diagram is accompanied by a brief description of the technology involved.
Code
The code section details all the custom code written at each stage to move the data through the transformations and into the end outputs.
Data Structure Progression
The data structure progression provides vignettes of the data structure at each phase. This concrete depiction allows the reader to understand the journey through transformation to output.
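As an illustration of what such a progression might look like, here is a minimal sketch that prints the shape of the data at three phases. The store/date/amount records, the phase names, and the functions `ingest`, `transform`, and `deliver` are all hypothetical and are not taken from any project on this wiki.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw input, as it might look at the point of origin.
RAW = (
    "store,date,amount\n"
    "A,2024-01-01,10.50\n"
    "A,2024-01-02,4.50\n"
    "B,2024-01-01,7.00\n"
)


def ingest(raw_text):
    """Ingestion: parse CSV text into a list of dicts with string values."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transformation: convert the amount string into integer cents."""
    return [
        {
            "store": row["store"],
            "date": row["date"],
            "cents": round(float(row["amount"]) * 100),
        }
        for row in rows
    ]


def deliver(rows):
    """Delivery: total cents per store, ready for a report."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["store"]] += row["cents"]
    return dict(totals)


if __name__ == "__main__":
    ingested = ingest(RAW)
    transformed = transform(ingested)
    delivered = deliver(transformed)
    # Show one vignette per phase so the reader can follow the shape change.
    print("ingested:   ", ingested[0])
    print("transformed:", transformed[0])
    print("delivered:  ", delivered)
```

Printing one record per phase keeps each vignette small while still showing how field names and types change along the way.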
Data Samples
Data samples allow the reader to reconstruct the data project for themselves. In some cases the samples may be synthesized to protect the privacy of project information.
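One possible way to produce synthesized samples is a small seeded generator, so that every reader gets the same records. The sketch below is only an assumption about how this could be done; the field names, value ranges, and the function `synthesize_samples` are hypothetical.

```python
import random


def synthesize_samples(n, seed=0):
    """Return n made-up records; the same seed always yields the same records."""
    rng = random.Random(seed)  # local generator, leaves global random state alone
    stores = ["A", "B", "C"]   # hypothetical store identifiers
    return [
        {
            "store": rng.choice(stores),
            "day": rng.randint(1, 28),        # day of month, valid in any month
            "cents": rng.randint(100, 5000),  # amount in integer cents
        }
        for _ in range(n)
    ]


if __name__ == "__main__":
    for record in synthesize_samples(3):
        print(record)
```

Fixing the seed keeps the samples reproducible across readers, while the values themselves carry no real project information.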