Reconcilable Data Sources - tfmorris/OpenRefine GitHub Wiki

Listing of Reconcilable Data Sources

With OpenRefine you can perform reconciliation against any web service that supports the Reconciliation Service API. Reconciliation against Freebase is built in, and several other reconciliation services are available, as described on this page. Alternatively, you can extend your data by calling web services.
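To make the idea of a reconciliation service more concrete, here is a minimal sketch of the request and response handling a client might do. It assumes the usual shape of the Reconciliation Service API: a batch of queries keyed `q0`, `q1`, ... sent as a JSON string in a `queries` form parameter, and a response whose per-query `result` list holds candidates with `id`, `name` and `score` fields. The function names and the `Company` type identifier are hypothetical and chosen only for illustration; check a given service's documentation for the types it accepts.

```python
import json
from urllib.parse import urlencode


def build_queries(names, type_id=None, limit=3):
    """Build a batch of reconciliation queries keyed q0, q1, ..."""
    queries = {}
    for i, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id is not None:
            # Optional type restriction; the identifier is service-specific.
            query["type"] = type_id
        queries[f"q{i}"] = query
    return queries


def encode_request(queries):
    """Form-encode the batch as a single 'queries' parameter (assumed)."""
    return urlencode({"queries": json.dumps(queries)})


def best_candidates(response_text, names):
    """Map each input name to its top-scoring candidate, or None."""
    response = json.loads(response_text)
    best = {}
    for i, name in enumerate(names):
        results = response.get(f"q{i}", {}).get("result", [])
        best[name] = max(results, key=lambda r: r.get("score", 0), default=None)
    return best


if __name__ == "__main__":
    names = ["Acme Ltd", "No Such Company"]
    print(encode_request(build_queries(names, type_id="Company")))

    # Hand-written sample response, not output from a real service.
    sample = json.dumps({
        "q0": {"result": [
            {"id": "a", "name": "Acme Ltd", "score": 90, "match": True},
            {"id": "b", "name": "Acme Limited", "score": 60, "match": False},
        ]},
        "q1": {"result": []},
    })
    print(best_candidates(sample, names))
```

The encoded body would be sent to the service's endpoint URL by whatever HTTP client you prefer; that step is left out so the sketch runs without network access.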

OpenCorporates

OpenCorporates makes 31 million corporate entities (as of Nov. 2011) available for reconciliation through its service.

SPARQL endpoints

The RDF Extension by DERI at NUI Galway includes reconciliation against any SPARQL endpoint or RDF dump file and publishing of the results in RDF. http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/
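As a rough illustration of what reconciling against a SPARQL endpoint involves, the sketch below builds a query that looks up entities whose `rdfs:label` matches a cell value, ignoring case. This is not the query the RDF Extension actually issues; it is only an assumption about the general approach, and the function names are hypothetical.

```python
def escape_literal(text):
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def label_match_query(name, limit=5):
    """Build a SPARQL query for entities whose label equals `name` (case-insensitive)."""
    literal = escape_literal(name)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?entity ?label WHERE {\n"
        "  ?entity rdfs:label ?label .\n"
        f'  FILTER (LCASE(STR(?label)) = LCASE("{literal}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(label_match_query("Galway"))
```

An exact, case-insensitive label match is the simplest possible strategy. A real reconciliation service would typically also rank partial matches and restrict candidates by type.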

VIVO Scientific Collaboration Platform

VIVO is a U.S. national, interdisciplinary, open source scientific collaboration platform funded by the NIH, with development led by Cornell. Its reconciliation service allows reconciling against VIVO entities (faculty members, journals, etc.) in any VIVO installation. See "Extending Google Refine for VIVO".

Talis Kasabi

The Kasabi reconciliation services provide reconciliation against any database published on the Kasabi platform. Documentation

Talis Platform reconciliation services

This project has been shut down. They suggest some alternatives.

Taxonomic Databases

Reconciliation services are available for several taxonomic databases (EOL, NCBI, uBio, WoRMS), as documented here.

Wish List

The following data sources could be useful reconciliation targets within OpenRefine. If you would like to help with coding a reconciliation extension for any of them, please contact our mailing list. We would love to see some of these happen!

  • Historical Newspapers - Library of Congress' Chronicling America provides JSON, RDF, XML & Linked Data with an easy-to-use API.
  • Chemical Identifier Resolver - Over 96 million chemical structures hosted by NCI/NIH; provides names, conversions, and various formats, including XML output.