Home - sharmalab/MSCR-BD2K GitHub Wiki
MSCR 598-1 Big Data to Knowledge in Clinical and Translational Research
Background
In 2012, the NIH began the trans-institute Big Data to Knowledge (BD2K) initiative, recognizing the exponential growth of data, its potentially tremendous value to human health, and the importance of promoting parallel growth in data acquisition, storage, analytic infrastructure, and processing capacity within the human health research domain (1, 3). The BD2K initiative is guided by the following mission statement:
“BD2K is a trans-NIH initiative established to enable biomedical research as a digital research enterprise, to facilitate discovery and support new knowledge, and to maximize community engagement.”
BD2K data can derive from several patient care sub-domains, including electronic medical records, continuous health monitor signal processing, the self-quantified patient (mHealth/social networking), and public health surveillance activities. This course seeks to convey the fundamental principles of this multi-faceted BD2K Pipeline derived from patient care. It then surveys how the resulting data are innovatively acquired, curated, combined, analyzed, and presented back to the care providers and patients from whom they originated, across several current and prominent fields of biomedical investigation.
Objectives
This course will teach the fundamental Big Data principles underlying all fields that incorporate aspects of BD2K and provide more detailed case studies within select clinical and patient care fields. Students will learn to design the conceptual framework and workflow of a BD2K project within their field of interest and develop the working knowledge needed to engage in partnerships with BD2K collaborators. More specific goals are to:
- Define Big Data, Data Science, and their components.
- Recognize the importance of BD2K in numerous sectors of human endeavor and biomedical science.
- Know the workflow of the Big Data pipeline.
- Learn how the Big Data Pipeline is applied in the following biomedical investigation fields:
  - Health Services Research – storage, mining, phenotyping, and real-time analysis and querying of EMR and other clinical data
  - Population and Public Health – clinical data collection and aggregation technologies, survey sampling techniques, data processing, exploratory analyses accounting for complex survey design, geospatial analysis
  - mHealth – collection, curation, standardization, and API-based presentation of mobile app/sensor/device data collected by patients
  - Social Network Analysis – API-based web scraping and text mining of continuous social networking data streams, crowdsourcing, A/B processing, network analysis