Project plan for data pipeline - mettadatalabs1/oncoscape-datapipeline GitHub Wiki
Project Overview
Develop a systematic, extensible data pipeline to ingest, transform, and load molecular and clinical data. The pipeline should be an Apache Airflow workflow that connects modular tasks for validation, transformation, and loading. The target database is MongoDB.
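As a rough illustration of the modular-task idea, the sketch below chains validation, transformation, and load steps as plain Python functions. Everything in it is an assumption made for illustration, not part of the plan: the record fields (`patient_id`, `gene`), the validation rule, the function names, and the in-memory list that stands in for a MongoDB collection.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Task = Callable[[List[Record]], List[Record]]


def validate(records: List[Record]) -> List[Record]:
    """Keep only records with a non-empty patient_id (hypothetical rule)."""
    return [r for r in records if r.get("patient_id")]


def transform(records: List[Record]) -> List[Record]:
    """Normalize the hypothetical gene field to upper case."""
    return [{**r, "gene": str(r.get("gene", "")).upper()} for r in records]


def make_loader(sink: List[Record]) -> Task:
    """Build a load task that appends to `sink`.

    `sink` is an in-memory stand-in for a MongoDB collection.
    """
    def load(records: List[Record]) -> List[Record]:
        sink.extend(records)
        return records
    return load


def run_pipeline(records: List[Record], tasks: List[Task]) -> List[Record]:
    """Run each task in order, feeding its output to the next task."""
    for task in tasks:
        records = task(records)
    return records


# Hypothetical sample input: the second record fails validation.
collection: List[Record] = []
raw = [
    {"patient_id": "P1", "gene": "tp53"},
    {"patient_id": "", "gene": "brca1"},
]
loaded = run_pipeline(raw, [validate, transform, make_loader(collection)])
```

In an actual Airflow deployment, each function would presumably become its own task (for example through `PythonOperator`) with dependencies ordered validate, transform, load, and the load step would write to MongoDB (for example with pymongo's `insert_many`) instead of a list. Keeping the steps as independent callables is what would let new validations or transformations be added without touching the others.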