00.Datasets03 (M R) - sporedata/researchdesigneR GitHub Wiki
Select datasets
This section focuses on a few select healthcare datasets that hold special value in patient-centered research:
- MAA (USA) #public - Medical Abbreviation and Acronym database.
- MAFD (USA) #public - Myers Abortion Facility Database.
- mapMECFS (USA) - mapMECFS is an interactive data portal that provides access to research results across multiple biological disciplines from studies focused on advancing our understanding of Myalgic Encephalomyelitis / Chronic Fatigue Syndrome (ME/CFS).
- MBSAQIP (USA) #public - Metabolic and Bariatric Surgery Accreditation and Quality Improvement Program.
- MCBS (USA) #public|#private - Medicare Current Beneficiary Survey.
- MDC (Sweden) #public - The Malmö Diet Cancer study.
- MeDAL (USA) #public - Curated for abbreviation disambiguation and designed for understanding natural language pre-training in the medical domain.
- Medical_Transcriptions (USA) #public - Contains samples of medical transcriptions for different medical specialties.
- MedicalNewsToday (USA) #public - Contains 2,000 approved NLP-related medical articles from @MedicalNewsToday.
- Medicine-Graph (USA) #private - Offers a special collection of co-occurrence matrices that quantify the pairwise mentions of 3 million terms mapped onto 1 million clinical concepts.
- MEDLINE (USA) #public - MEDLINE is the principal bibliographic database of the NLM.
- MedNLI (USA) #private - A Natural Language Inference Dataset For the Clinical Domain.
- MeSH (USA) #public - Medical Subject Headings.
- MEPS (USA) #public - Medical Expenditure Panel Survey.
- MetaMap (USA) - MetaMap is a versatile and adaptable tool crafted to link biomedical texts to the UMLS Metathesaurus, enabling the identification of Metathesaurus concepts mentioned within the text.
- MHOS (USA) #private - Medicare Health Outcomes Survey.
- MHS (USA) #private - Military Health System.
- MI (USA) #public - Monarch Initiative.
- MIDRC (USA) #public - Medical Imaging and Data Resource Center.
- MMG (USA) #public - Map the Meal Gap.
- MIMIC-III (USA) #private - Medical Information Mart for Intensive Care III.
- MIMIC-IV (USA) #private - Medical Information Mart for Intensive Care IV.
- MIMIC-CXR (USA) #private - Medical Information Mart for Intensive Care Chest X-ray (MIMIC-CXR).
- MoTrPAC (USA) #public - The Molecular Transducers of Physical Activity Consortium (MoTrPAC) Data Hub is a national research portal to access data generated by the MoTrPAC.
- MPOG (USA) #private - Multicenter Perioperative Outcomes Group.
- MS (Latvia) - Monitoring System.
- MVP (USA) #private - Million Veteran Program.
- MW (USA) #public - The Metabolomics Workbench.
- n2c2 NLP (USA) - Unstructured notes from the Research Patient Data Registry at Partners Healthcare.
- N3C (USA) #private - National COVID Cohort Collaborative.
- NACC (USA) #public - National Alzheimer’s Coordinating Center.
- NACDA (USA) #public - The National Archive of Computerized Data on Aging.
- NAHDAP (USA) #public|#private - The National Addiction & HIV Data Archive Program.
- NASA (USA) #public - National Aeronautics and Space Administration.
- NBHWCR (Sweden) #public - The National Board of Health and Welfare's Cancer Register.
- NBIPolyp-UCDB (Portugal) #public - A specialized medical image dataset focused on the detection and classification of polyps in the gastrointestinal (GI) tract, particularly for aiding in the diagnosis of colorectal cancer.
- NCATS-ODP (USA) #public - The National Center for Advancing Translational Sciences OpenData Portal.
- NCBI (USA) #public - National Center for Biotechnology Information.
- NCCOR (USA) #public - National Collaborative on Childhood Obesity Research.
- NCDB (USA) #public - National Cancer Database.
- NCHHSTP_AtlasPlus (USA) #public - National Center for HIV, Viral Hepatitis, STD, and TB Prevention (NCHHSTP) AtlasPlus.
- NCHS Data Linkage (USA) #public|#private - NCHS Data Linkage Activities.
- NCI-CPS (Lithuania) - National Cancer Institute-Cancer Prevention Statistics.
- NCI-IND (USA) #public - The CIP (Cancer Imaging Program) IND Directory is a centralized resource designed to facilitate the sharing of IND information.
- NCR (The Netherlands) #public - Netherlands Cancer Registry (Comprehensive Cancer Centre Netherlands) / Integraal Kankercentrum Nederland.
- NCRI (Ireland) #public - National Cancer Registry Ireland.
- NCS (USA) #public|#private - National Congregations Study.
- NDA (USA) #public|#private - The National Institute of Mental Health Data Archive.
- NDB (Japan) #public - The National Database of Health Insurance Claims and Specific Health Checkups.
- NDC (USA) #public - The National Drug Code database.
- NDEx (USA) #public|#private - The Network Data Exchange.
- NDI (USA) #private - National Death Index.
- NEI (USA) #public - National Eye Institute Data Commons.
- NER_CRF_Medical (USA) #public - Connects medical communities with patients across the country.
- Nerthus (Norway) - Developed to aid in tasks related to automated colonoscopy analysis, like polyp detection and classification.
- Neuro-QoL (USA) #public - The Quality of Life in Neurological Disorders.
- News_Title_Dataset_CSV (USA) #public - News Title dataset with four categories.
- NHANES (USA) #public - National Health and Nutrition Examination Survey.
- NHATS (USA) #public|#private - National Health and Aging Trends Study.
- NHCDR (USA) #public - NINDS Human Cell and Data Repository.
- NHGRI-EBI (USA) #public - The NHGRI-EBI GWAS Catalog is a freely available and FAIR knowledgebase providing detailed, interoperable, standardized, and structured human genome-wide association study (GWAS) data.
- NHIS-CCS (USA) - National Health Interview Survey-Cancer Control Supplement.
- NHIS-NSC (South Korea) #private - National Health Insurance Service-National Sample Cohort.
- NHIS (USA) #public - National Health Interview Survey.
- NHS (England) #public - National Health Service.
- NIAAA (USA) #public - National Institute on Alcohol Abuse and Alcoholism.
- NIAGADS (USA) #public|#private - National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site.
- NIDA (USA) - National Institute on Drug Abuse.
- NIDDK-CR (USA) #private - Central Repository of the National Institute of Diabetes and Digestive and Kidney Diseases.
- NIF (USA) #public - Neuroscience Information Framework.
- NIH-Toolbox (USA) #public - The NIH Toolbox includes over 80 stand-alone measures available in 30-minute batteries to assess Cognition, Emotion, Motor skills, and Sensation.
- NIJZ (Slovenia) #public - Nacionalni Inštitut za Javno Zdravje (National Institute of Public Health).
- NITRC (USA) #public - NeuroImaging Tools and Resources Collaboratory.
- NLTK (USA) #public - Natural Language Toolkit.
- NORDCAN (Denmark, Finland, Iceland, Norway, Sweden) #public - Association of the Nordic Cancer Registries.
- NPI (USA) #public - National Provider Identifier.
- NSQIP (USA) #private - National Surgical Quality Improvement Program.
- NSRR (USA) #public - National Sleep Research Resource.
- NTDB (USA) #private - National Trauma Data Bank.
- NUMI (USA) - National Utilization Management Integration.
- NVI (Lithuania) #public - Nacionalinio Vėžio Instituto (Lithuanian Cancer Registry).
- OASIS (USA) #public #image - Open Access Series of Imaging Studies.
- ODC-SCI (USA) #public|#private - Open Data Commons for Spinal Cord Injury.
- OECD (France) #public - Organisation for Economic Co-operation and Development.
- OHSUMED (USA) #public #text - A set of 348,566 references from MEDLINE consisting of titles and/or abstracts from approximately 270 medical journals over a five-year period.
- ONR (USA) #public - Office of Nutrition Research.
- openICPSR (USA) #public - openICPSR is a self-publishing platform for behavioral, health science, and social research data.
- OpenNeuro (USA) #public - OpenNeuro is a facilitator for data sharing and analysis, specifically focusing on raw data from EEG, iEEG, MEG, MRI, and PET modalities.
- OpenNotes (USA) #public - OpenNotes aims to enhance the patient-provider relationship by encouraging clinicians to provide patients with electronic access to their clinical notes.
- OpenPain (USA) #public - OpenPain is an open-access data-sharing platform focused on brain imaging studies of human pain.
- PbEHDE (Taiwan) #public|#private - Population-Based Electronic Health Data Environment.
- PBM (USA) #public - Pharmacy Benefits Management.
- PBS (Australia) #public|#private - Pharmaceutical Benefits Scheme.
- PCORnet (USA) #private - Patient-Centered Clinical Research Network.
- PDB (USA) #public - Protein Data Bank.
- PDBP (USA) #private - Parkinson's Disease Biomarker Program.
- PDMs (USA) #public - Population Density Maps.
- PDTS (USA) #private - Pharmacy Data Transaction Service.
- Pedianet (Italy) #private - Collects data from outpatient family paediatricians in Italy for clinical and epidemiological research.
- PeptideAtlas (USA) #public - PeptideAtlas is a multi-organism collection of peptides identified through extensive tandem mass spectrometry proteomics experiments.
- PGS (USA) #public - Polygenic Score Catalog.
- PHARMO (The Netherlands) #private - Provides a unique opportunity to gain insight into the complete patient journey and healthcare.
- PHC (Ghana, Uganda) - Primary Health Care survey.
- PhonBank (USA) #public - An open database for studying early phonological development using the Phon program.
- PhysioNet (USA) #public - A repository of freely available medical research data designed to conduct and catalyze biomedical research and education.
- PICCOLO (Spain) #public - PICCOLO Widefield Database.
- PLACES (USA) #public - Offers model-based, population-level analysis and community estimates of health metrics for all counties.
- PLCO (USA) #public - Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial.
- POC (USA) - Patterns of Care initiative.
- POLAR (The Netherlands) #public - POLyp Artificial Recognition database.
- PolypGen (Egypt, France, Italy, Norway, Sweden, UK) #public - A comprehensive resource focused on genetic and molecular data related to colorectal polyps, specifically adenomatous polyps.
- PPCR (USA) #private - Pediatric Proton/Photon Consortium Registry.
- PRECIS-2 (USA) #private - PRagmatic Explanatory Continuum Indicator Summary-2.
- Premier (USA) - Premier Healthcare Database.
- PRO-CTCAE (USA) #public - Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events.
- PROMIS (USA) #public - Patient-Reported Outcomes Measurement Information System.
- PROSPR (USA) #public - Population-based Research to Optimize the Screening PRocess (PROSPR) DataShare.
- PS (USA) - Physician Surveys.
- PubChem (USA) #public - An open chemistry database at the NIH and the world's most extensive collection of freely accessible chemical information.
- PubMed (USA) #public - Comprises over 29 million citations for biomedical literature from life science journals, MEDLINE, and online books.
- PubMed_200k_RCT (USA) #public - Consists of about 200,000 abstracts of randomized controlled trials (RCTs), totaling 2.3 million sentences.
- R4R (USA) #public - Resources for Researchers.
- Radiologist_Notes (USA) #private - Offers a semantic understanding of developing automated pipelines and terminologies for captioning medical conditions related to the lumbar spine.
- RAI-MDS (USA) #private - Resident Assessment Instrument-Minimum DataSet.
- RAMQ (Canada) #private - Régie de l'Assurance Maladie du Québec (Quebec Health Insurance Board).
- RedditData (USA) #public - The Domain-specific RedditData: Medical and Finance dataset constitutes Reddit posts for Natural Language Summarization.
- RHCs (USA) #public - Rural Hospital Closures.
- RHIhub (USA) - The Am I Rural? service can be used to help determine whether a specific location is considered rural based on various definitions of rural, including definitions that are used as eligibility criteria for federal programs.
- Ru_DrugAddiction (Russia) #private - Early Russian News Articles on Drug Addiction dataset.
- Russian_Voice (Russia) - Contains over 2,000 recordings compiled by the Central Research Institute of General Speech Pathology of the Russian Academy of Medical Sciences.
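
Entries in this list follow a loose `NAME (Country) #tag - Description` pattern, so the catalog can be filtered programmatically, for example to keep only `#public` datasets. The following is a minimal sketch that assumes the pattern holds for every line. `Entry`, `ENTRY_RE`, and `parse_entry` are hypothetical names introduced for illustration and are not part of researchdesigneR.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set

# Assumed entry shape: "- NAME (Country[, Country]) [#tag[|#tag]] - Description"
ENTRY_RE = re.compile(
    r"^-\s*(?P<name>.+?)\s+\((?P<countries>[^)]*)\)\s*"
    r"(?P<tags>(?:#\w+[|\s]*)*)-\s*(?P<description>.+)$"
)


@dataclass
class Entry:
    name: str
    countries: List[str]
    tags: Set[str]  # lowercased, without the leading '#'
    description: str


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one catalog line; return None if it does not fit the assumed shape."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    countries = [c.strip() for c in match.group("countries").split(",") if c.strip()]
    tags = {t.lower() for t in re.findall(r"#(\w+)", match.group("tags"))}
    return Entry(
        name=match.group("name"),
        countries=countries,
        tags=tags,
        description=match.group("description").strip(),
    )


if __name__ == "__main__":
    sample = [
        "- MCBS (USA) #public|#private - Medicare Current Beneficiary Survey.",
        "- MIMIC-IV (USA) #private - Medical Information Mart for Intensive Care IV.",
        "- MS (Latvia) - Monitoring System.",
    ]
    entries = [e for e in map(parse_entry, sample) if e is not None]
    # Keep only datasets that carry the #public tag.
    for entry in entries:
        if "public" in entry.tags:
            print(entry.name, entry.countries, sorted(entry.tags))
```

Entries without any access tag (such as `MS (Latvia)`) parse with an empty tag set, so they are excluded from a `#public` filter rather than being treated as public by default.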