00.Datasets04 (S Z) - sporedata/researchdesigneR GitHub Wiki

Select datasets

This section focuses on a few select healthcare datasets that hold special value in patient-centered research:

  • S2D (USA) #private - Symptom2Disease dataset.
  • SADR (USA) - Standard Ambulatory Data Record.
  • SAGE (USA) - Study on global AGEing and adult health.
  • SAMD (USA) #public - Sentiment Analysis for Medical Drugs.
  • SANAD (Morocco, Saudi Arabia, UAE) #public #text - Single-label Arabic News Articles Dataset.
  • SAVEE (USA) #public - Surrey Audio-Visual Expressed Emotion.
  • scikit-learn (USA) #public - An open Python library that provides efficient tools for data mining and data analysis, including methods of classification, clustering, and regression.
  • SCORCH (USA) #public - Single Cell Opioid Responses in the Context of HIV.
  • SCRCR (Sweden) #public - Swedish Colorectal Cancer Registry.
  • SDC (USA) #public - Sanford Data Collaborative.
  • SDoH (USA) #public - Social Determinants of Health.
  • SEDD (USA) #private - State Emergency Department Databases.
  • SEER-CAHPS (USA) #private - Provides data on the experiences of Medicare beneficiaries with their care at different stages of the cancer care continuum.
  • SEER-Medicaid (USA) #private - Surveillance, Epidemiology and End Results-Medicaid.
  • SEER-Medicare (USA) #private - Surveillance, Epidemiology and End Results-Medicare.
  • SEER-MHOS (USA) #private - Surveillance, Epidemiology and End Results - Medicare Health Outcomes Survey Database.
  • SEER (USA) #private - Surveillance, Epidemiology and End Results Database.
  • SemEval/THYME (USA) - The 2015 SemEval/THYME dataset.
  • SenNet (USA) #public - Cellular Senescence Network Database.
  • SHARE (Europe) #private - Survey of Health, Ageing, and Retirement in Europe.
  • ShARe/CLEF eHealth (USA) - The 2013 ShARe/CLEF eHealth dataset.
  • ShARe_corpus (USA) - Shared Annotated Resources (ShARe) disorders corpus.
  • SIDIAP (Spain) #public - Information System for Research in Primary Care.
  • SIDR (USA) - Standard Inpatient Data Record.
  • SLORA (Slovenia) #public - Cancer Registry of the Republic of Slovenia.
  • SMC (Sweden) - Swedish Mammography Cohort 2019 Lifestyle dataset.
  • SND (Sweden) - Swedish National Database of Research and Development.
  • SNHS (Spain) #public - Spanish National Health Survey.
  • SNIRAM-SNDS-HDH (France) - Système National d’Informations Inter-Régimes de l’Assurance Maladie.
  • SpanishTweets_Depression (Spain) #public - A curated selection of Spanish Tweets indicating symptoms of depression.
  • SPARC (USA) #public - Stimulating Peripheral Activity to Relieve Conditions program.
  • SPARCS (USA) - Statewide Planning and Research Cooperative System.
  • SQuAD_Translated_To_Persian (Iran) #public - Contains the English SQuAD (v1) columns with their corresponding translation.
  • SRA (USA) #public - Sequence Read Archive.
  • SRTR/UNOS (USA) #private - United Network for Organ Sharing provides accurate, clear, and timely information on the status of solid organ allocation and transplantation and the transplantation system in the United States.
  • SSA (USA) #private - Social Security Administration database.
  • Stanford-AIMI (USA) #public - A collection of de-identified annotated medical imaging data to foster transparent and reproducible collaborative research.
  • STAR (USA) - Includes patient-level data on transplant recipients, deceased and live donors, and waiting list candidates going as far back as October 1, 1987.
  • StatFin (Finland) #public - National Statistical Institute of Finland Database.
  • Stroke_MRIs (USA) #public - Dataset of annotated clinical MRIs and metadata of patients with acute and subacute stroke.
  • STS (USA) #public - Society of Thoracic Surgeons National database.
  • SUNColonoscopyVid (Japan) #public - Showa University and Nagoya University (SUN) Colonoscopy Video Database.
  • SUS (Brazil) #public - Sistema Único de Saúde (Unified Health System).
  • SVI (USA) #public - Social Vulnerability Index.
  • SVS-VQI (USA) - Vascular Quality Initiative from the Society for Vascular Surgery.
  • T4SA (Italy) #public - Twitter for Sentiment Analysis.
  • TBPP (USA) #public - TB Portals Program.
  • TCIA (USA) #public - The Cancer Imaging Archive.
  • TEDI (USA) #public - The Institutional Purchased Care Data.
  • TEDNI (USA) #public - The Non-Institutional Purchased Care Data.
  • THIN (Italy) #private - The Health Improvement Network.
  • THYME_corpus (USA) - Temporal Histories of Your Medical Events.
  • TLMS (USA) #public - Tobacco Longitudinal Mortality Study.
  • TMDS (USA) #public - Theater Medical Data Store.
  • Tracking Network (USA) #public - The National Environmental Public Health Tracking Network.
  • Trans_Law (Spain) #public - Trans Law or Spanish Trans Law Twitter dataset.
  • TrialShare (USA) - A transformative approach to data sharing that promotes clinical trial transparency.
  • TRICARE (USA) #private - Military Health System Tricare Encounter Data.
  • TriNetX (USA) - Longitudinal real-world data from Epic.
  • TSSH (USA) - The State of Senior Hunger.
  • TUS-CPS (USA) #public - Tobacco Use Supplement to the Current Population Survey.
  • U-M (USA) #public - University of Michigan (U-M) Health datasets.
  • UFAL (USA) #private - UFAL Medical Corpus v. 1.0 is a collection of parallel corpora that aims at a more reliable machine translation of medical texts.
  • UK_Biobank (UK) #public - UK Biobank.
  • UMLS (USA) #public - Unified Medical Language System.
  • UniProt (USA) #public - Universal Protein Resource.
  • UPMC (USA) #public - University of Pittsburgh Medical Center.
  • USRDS (USA) #private - United States Renal Data System.
  • VA-databases (USA) #private - List of databases from the Veterans Affairs Health Service.
  • VAHS-CDW (USA) #private - Veterans Affairs Health Systems - Corporate Data Warehouse.
  • VASQIP (USA) #private - Veterans Affairs Surgical Quality Improvement Program.
  • VEuPathDB (USA) #public - Eukaryotic Pathogen, Vector and Host Informatics Resource Database.
  • VHA-SP (USA) #public - Veterans Health Administration Surgery Program.
  • Video_Transcript_Summarization (USA) #private - Comprises NLP video transcripts from 26 different categories.
  • VIIRS (USA) #public - Visible Infrared Imaging Radiometer Suite.
  • Vivli (USA) - Vivli is a global data-sharing platform that focuses on health research data. It aims to promote transparency and reproducibility in medical research by providing a centralized platform for sharing clinical trial data.
  • VPF (USA) #public - Vulnerable Population Footprint.
  • WCS (Worldwide) #public - Worldwide cancer statistics.
  • WHO-MD (Worldwide) #public - WHO Mortality Database.
  • WIdO (Germany) - Wissenschaftliches Institut der AOK (AOK Scientific Institute) Database.
  • WikiMed_Q+A (Iran) #public #text - A Persian Q&A dataset from Wikipedia about medicine.
  • WLH (Sweden) - Women's Lifestyle and Health study.
  • WLPolyp-UCDB (Portugal) #public - Wide-Light Polyp dataset.
  • WLS (USA) #public - Wisconsin Longitudinal Study.
  • WONDER (USA) #public - Wide-ranging Online Data for Epidemiologic Research.
  • WTSs (Austria) #public - Wage Tax Statistics / Statistics Austria.
  • WVS-Database (USA) #public - World Values Survey Database.
  • YODA (USA) #public - Yale Open Data Access Project.
  • ZfKD (Germany) #public - Zentrum für Krebsregisterdaten (German Centre for Cancer Registry Data).