APP : App Overview - waidyanatha/dongcha GitHub Wiki
Dongcha APPs
If you have already followed the README.md, you will have noticed three new folders: wrangler, mining, and visuals, which we call apps. Apps are the three key components of the data-as-a-product architecture. The data elements, the models, decision support tools, and so on are all considered data products.
- Wrangler - is your ETL facilitator for curating and archiving the domain datasets
- Mining - manages the AI/ML models that work on the curated data for sensemaking
- Visuals - offers warehouses of read-optimized and standardized data units for decision support
The apps can, independently, serve as monolithic dockerized microservices extended through APIs.
Additionally, there are:
- INSTALLER - that houses the setup scripts
- README - with a quick intro to dongcha and installation instructions
- requirements.txt - with the necessary and sufficient libraries
- dongcha.py - the main set of classes for configuring and initializing an instance
Each app follows a common structure with:
- Modules - that are domain and sector-specific packages with classes and methods
- Data - an associated folder structure that harmonizes with the modules for storing input, output, and temporary data
- DB (database) - stores all module-wise SQL scripts for creating database schemas, tables, views, and functions
- Dags - are module and workflow specific scheduler files for running various pipelines
- Logs - a folder structure that follows the modules hierarchy for storing package-wise logs
- Notebooks - are used for experimenting with the workflows of the module and with requirement-specific pipelines.
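The common structure above can be sketched as a small Python helper that lists the folders expected under each app. This is only an illustrative sketch: `app_layout` is a hypothetical name, not part of dongcha, and the lowercase folder names other than `modules` and `data` are assumptions, since the wiki names the components rather than the exact directories.

```python
from pathlib import PurePosixPath

# The three apps named in this wiki.
APPS = ("wrangler", "mining", "visuals")

# Assumed lowercase folder names for the common components; only
# "modules" and "data" are confirmed by the text of this page.
COMPONENTS = ("modules", "data", "db", "dags", "logs", "notebooks")


def app_layout(app: str) -> list[PurePosixPath]:
    """Return the component folders expected under one app (hypothetical helper)."""
    if app not in APPS:
        raise ValueError(f"unknown app: {app}")
    return [PurePosixPath(app, component) for component in COMPONENTS]


if __name__ == "__main__":
    for app in APPS:
        print(app, [str(path) for path in app_layout(app)])
```

Keeping the component list in one place means every app is checked against the same layout, which is the point of the common structure.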
Folder Structure
The data folder structure mirrors that of the modules: simply replace the modules folder with data. This makes it easy to manage and reference data that is specific to a module-entity and function-package. This level of abstraction allows a common framework for reads and writes to be implemented across all apps, module-entities, and function-packages. For example:
## folder path to entity = ota and function = scraper
wrangler/data/ota/scraper
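The "replace the modules folder" rule can be sketched in Python. This is a minimal sketch under stated assumptions: `data_path_for` is a hypothetical helper, not a dongcha function, and it assumes the module for this example lives at `wrangler/modules/ota/scraper`.

```python
from pathlib import PurePosixPath


def data_path_for(module_path: str, target: str = "data") -> PurePosixPath:
    """Swap the 'modules' folder in a module path for a sibling folder such as 'data'.

    Hypothetical helper for illustration; not part of the dongcha code base.
    """
    parts = PurePosixPath(module_path).parts
    if "modules" not in parts:
        raise ValueError(f"no 'modules' folder in path: {module_path}")
    idx = parts.index("modules")
    # Replace only the first 'modules' component; the entity and
    # function-package parts of the path are left unchanged.
    return PurePosixPath(*parts[:idx], target, *parts[idx + 1:])


if __name__ == "__main__":
    print(data_path_for("wrangler/modules/ota/scraper"))
```

Because the logs folder is described as following the modules hierarchy as well, the same helper could be called with `target="logs"`, assuming that is the folder's actual name.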