Databases Examples - spinningideas/resources GitHub Wiki

Free Dataset Sources & Example Databases

A collection of sources that host free, public-use datasets and sample databases. Most sources allow direct download; some require registration.

Dataset Repositories & Search Engines

  • Google Dataset Search – Google's search engine for datasets, indexing thousands of repositories from government, academic, and publisher sources.
  • Data.gov – The U.S. government’s open data portal with over 200,000 datasets covering climate, health, finance, and more.
  • Kaggle Datasets – Community-curated datasets for data science, machine learning, and analytics projects.
  • Data.world – Collaborative platform for discovering and sharing open datasets.
  • FiveThirtyEight Data – Datasets behind FiveThirtyEight’s articles and analyses (sports, politics, economics).
  • GitHub: Awesome Public Datasets – A topic-centric list of high-quality public datasets across many domains.
  • UCI Machine Learning Repository – Classic benchmark datasets (e.g., Iris, Wine, Car Evaluation) widely used in machine learning.
  • Data Packaged Core Datasets – Curated, packaged reference datasets maintained as part of the Frictionless Data project.

Global Development & Economics

Government & Civic Data

Health & Social Impact

  • WHO Data – Global health statistics from the World Health Organization.
  • CDC Data – U.S. Centers for Disease Control and Prevention datasets.
  • County Health Rankings – Rankings of U.S. counties on health outcomes and health factors.
  • IHME Global Burden of Disease – Global disease burden estimates from the Institute for Health Metrics and Evaluation.

Climate, Environment & Energy

Machine Learning & Sample Databases

  • Iris Dataset – Classic classification dataset from the UCI repository.
  • Wine Quality Dataset – Wine quality ratings used for regression/classification benchmarks.
  • Chinook Sample Database – A sample database modeling a digital media store, distributed for SQLite and several other database engines; useful for practicing SQL.
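
As a rough idea of the SQL practice a sample database like Chinook supports, here is a minimal sketch using Python's built-in sqlite3 module. It does not download Chinook. Instead it builds an in-memory stand-in with two tables named after Chinook's Artist and Album tables and a handful of illustrative rows, so the table contents and the helper name albums_per_artist are assumptions made for this example. With the real database file you would pass its path to sqlite3.connect instead of ":memory:".

```python
import sqlite3

# In-memory stand-in for a tiny slice of Chinook.
# Assumption: only two tables and five albums; the real database is much larger.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Album (
    AlbumId  INTEGER PRIMARY KEY,
    Title    TEXT,
    ArtistId INTEGER REFERENCES Artist(ArtistId)
);
INSERT INTO Artist VALUES (1, 'AC/DC'), (2, 'Accept'), (3, 'Aerosmith');
INSERT INTO Album VALUES
    (1, 'For Those About To Rock We Salute You', 1),
    (2, 'Balls to the Wall', 2),
    (3, 'Restless and Wild', 2),
    (4, 'Let There Be Rock', 1),
    (5, 'Big Ones', 3);
""")


def albums_per_artist(connection):
    """Return (artist name, album count) rows, most albums first.

    Hypothetical helper written for this example.
    """
    query = """
        SELECT ar.Name, COUNT(al.AlbumId) AS AlbumCount
        FROM Artist AS ar
        LEFT JOIN Album AS al ON al.ArtistId = ar.ArtistId
        GROUP BY ar.ArtistId
        ORDER BY AlbumCount DESC, ar.Name
    """
    return connection.execute(query).fetchall()


rows = albums_per_artist(conn)
for name, count in rows:
    print(f"{name}: {count}")
```

The LEFT JOIN keeps artists that have no albums (they would appear with a count of 0), which is a common first exercise when practicing joins and aggregation on a sample schema.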

References