URIs from PIDs - mconlon17/vivo-etl GitHub Wiki

PIDs (persistent identifiers), and in particular GPIDs (globally persistent identifiers), provide a mechanism for identifying entities within a domain. URIs provide a means of identifying entities in RDF (Resource Description Framework) and knowledge graphs, including distributed knowledge graphs and the Semantic Web.

A central issue for VIVO has been the distributed production of data. This is a strength of VIVO: institutions have a vested interest in creating and curating data about the work of their people. VIVO currently encourages the local, distributed creation of entities as needed to represent scholarship. But many entities span institutions. For example, papers are co-authored across institutions, and people graduate from one institution and work at another. An institution therefore creates URIs in its own VIVO to represent entities outside the institution, which leads to multiple URIs for the same entity. Harvard may create a URI for the University of Tennessee, and the Harvard VIVO might create something that looks like: http://vivo.harvard.edu/individual/n77321772. TIB may also create a URI for the University of Tennessee. The TIB URI might look like: http://tib.eu/individual/n99111212. If these data are combined into a knowledge graph or triple store, the two URIs would appear to represent different entities.

PIDs could be helpful. The ROR PID for the University of Tennessee is 020f3ap87, and there is a landing page at https://ror.org/020f3ap87. Note that the PID represents the institution, while the URL for the PID represents the landing page for the PID value. We need a URI, constructed from the PID, that represents the institution itself.

Constructing a URI from a PID

The URI (not resolvable as a URL) could be written as:

https://vivoweb.org/ror/ror-pid-value

So, for the University of Tennessee,

https://vivoweb.org/ror/020f3ap87
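The construction can be sketched in a few lines of Python. This is a minimal illustration, not the code in this repository: the function name `ror_uri`, the constant `ROR_URI_BASE`, and the handling of landing-page URLs are assumptions made for the example. The base uses the `https` scheme from the pattern above. Because URIs are compared as exact strings, everyone following the pattern has to agree on the scheme as well as the path. The format check assumes the published ROR ID shape: a leading 0, six Crockford base32 characters, and two check digits.

```python
import re

# Assumed base, following the pattern https://vivoweb.org/ror/ror-pid-value.
# The resulting URI identifies the organization; it is not resolvable as a URL.
ROR_URI_BASE = "https://vivoweb.org/ror/"

# Assumed ROR ID shape: leading 0, six Crockford base32 characters
# (no i, l, o, u), then two check digits.
_ROR_ID = re.compile(r"0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}")


def ror_uri(pid: str, base: str = ROR_URI_BASE) -> str:
    """Return the organization URI for a ROR PID (hypothetical helper)."""
    value = pid.strip().lower()
    # Accept the landing-page form and reduce it to the bare PID value.
    for prefix in ("https://ror.org/", "http://ror.org/", "ror.org/"):
        if value.startswith(prefix):
            value = value[len(prefix):]
            break
    if not _ROR_ID.fullmatch(value):
        raise ValueError(f"not a ROR PID: {pid!r}")
    return base + value


print(ror_uri("020f3ap87"))
print(ror_uri("https://ror.org/020f3ap87"))
```

Both calls print the same URI for the University of Tennessee, whether the caller starts from the bare PID or from the landing-page URL.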

Use

The scripts being developed in this repository can construct URIs from PIDs. If Harvard and TIB use the scripts (or simply agree to follow the pattern for constructing organization URIs from ROR PIDs), then each will represent the University of Tennessee with the same URI.
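The effect on a combined graph can be illustrated with plain Python sets standing in for triple stores. This is a sketch under stated assumptions: the triples, the `ror_uri` helper, and the `distinct_subjects` function are hypothetical and exist only for this example, and the local URIs are the illustrative ones from the introduction.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_ORGANIZATION = "http://xmlns.com/foaf/0.1/Organization"


def ror_uri(pid):
    # Hypothetical helper following the pattern for organization URIs.
    return "https://vivoweb.org/ror/" + pid


def distinct_subjects(triples):
    # Count how many different entities the graph appears to describe.
    return len({subject for subject, _, _ in triples})


# Locally minted URIs: each site invents its own identifier.
harvard_local = {
    ("http://vivo.harvard.edu/individual/n77321772", RDFS_LABEL,
     "University of Tennessee"),
}
tib_local = {
    ("http://tib.eu/individual/n99111212", RDFS_LABEL,
     "University of Tennessee"),
}

# PID-based URIs: both sites derive the identifier from the ROR PID.
ut = ror_uri("020f3ap87")
harvard_shared = {(ut, RDFS_LABEL, "University of Tennessee")}
tib_shared = {
    (ut, RDFS_LABEL, "University of Tennessee"),
    (ut, RDF_TYPE, FOAF_ORGANIZATION),
}

merged_local = harvard_local | tib_local
merged_shared = harvard_shared | tib_shared

print(distinct_subjects(merged_local))   # 2: looks like two organizations
print(distinct_subjects(merged_shared))  # 1: statements attach to one entity
```

With local URIs the merged data appears to describe two organizations. With PID-based URIs the duplicate label collapses and the statements from both sites attach to a single entity, with no reconciliation step needed.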