Getting Started - NatLibFi/Skosify GitHub Wiki

Introduction

Skosify was created in order to transform a collection of thesaurus-like OWL ontologies into a standard SKOS format. It works by adjusting the input ontology step by step so that it becomes more SKOS-like.

You can give it any RDF/RDFS, OWL or SKOS vocabulary as input. However, to get meaningful results, the input should be a thesaurus-like vocabulary or ontology, i.e. something that can be usefully represented using SKOS.

Operation

Skosify will adjust the structure of the vocabulary using the following processing steps:

  1. Read the input file. Supported formats include RDF/XML, Turtle, N3 and N-Triples (i.e. anything that rdflib supports).
  2. Make sure the vocabulary has a defined skos:ConceptScheme.
     1. If not, see if it has an owl:Ontology instance and convert that to a skos:ConceptScheme.
     2. Otherwise create a skos:ConceptScheme. This will require that the --namespace parameter is given.
  3. If enabled, perform RDFS subclass and subproperty inference. See RDFS Inference for details.
  4. Transform classes/concepts, literals and relations according to the [types], [literals] and [relations] mappings defined in a configuration file. See OWL conversion to SKOS for details.
  5. Make sure that skos:Collections have the right structure, i.e. they are defined outside the concept hierarchy. See Collections for details.
  6. Transform aggregate concepts into a more SKOS-like representation. This is a peculiarity of some FinnONTO ontologies which you can safely ignore. See Aggregate Concepts for details.
  7. Enrich the vocabulary by performing inferences specified in SKOS. Optionally add skos:narrower, skos:broaderTransitive and skos:narrowerTransitive relationships. See SKOS Inference for details.
  8. Clean up unused and/or unnecessary class and property definitions and unreachable triples. See Cleanups for details.
  9. Make sure all concepts have a skos:inScheme relation to a skos:ConceptScheme.
  10. Make sure the topmost concepts have been identified using skos:hasTopConcept and skos:topConceptOf relationships.
  11. Perform some validations. See Validation for details.
     1. Check for loops in the skos:broader hierarchy and break them.
     2. Check for overlap in disjoint semantic relations (skos:related and skos:broaderTransitive) and correct any inconsistencies.
     3. Remove extra whitespace from labels.
     4. Check that concepts have only one skos:prefLabel per language and correct any inconsistencies.
     5. Check for overlap in disjoint label properties and correct any inconsistencies.
  12. Write out the resulting SKOS vocabulary (as RDF/XML, N3/Turtle...)
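
As an illustration of the loop check in step 11, here is a minimal sketch of finding and breaking cycles in a skos:broader hierarchy. It is not Skosify's actual implementation: the function name break_broader_cycles is hypothetical, the hierarchy is modelled as a plain dict instead of an RDF graph, and the choice of which relation to drop (the one that closes the cycle during a depth-first walk) is an assumption.

```python
def break_broader_cycles(broader):
    """Return (acyclic_copy, removed_edges).

    `broader` maps each concept to the set of its skos:broader targets.
    The input is not modified. Hypothetical helper, for illustration only.
    """
    graph = {concept: set(parents) for concept, parents in broader.items()}
    removed = []
    on_path, done = 1, 2   # visit states; unvisited concepts are absent
    state = {}

    def visit(concept):
        state[concept] = on_path
        # sorted() makes a copy, so the set can be changed while looping,
        # and it keeps the result deterministic.
        for parent in sorted(graph.get(concept, ())):
            if state.get(parent) == on_path:
                # The parent is already on the current path, so this
                # relation closes a loop: drop it.
                graph[concept].discard(parent)
                removed.append((concept, parent))
            elif parent not in state:
                visit(parent)
        state[concept] = done

    for concept in sorted(graph):
        if concept not in state:
            visit(concept)
    return graph, removed


# "c" claims "a" as broader, although "a" is already below "c".
fixed, removed = break_broader_cycles(
    {"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"}}
)
print(removed)
```

The recursive walk keeps the sketch short; a very deep hierarchy would need an explicit stack to stay within Python's recursion limit.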

Installation

Skosify requires Python 2.6 or newer.

pip install --upgrade skosify

Running

Simple usage of Skosify:

skosify myvoc.rdf -o myvoc-skos.rdf

This will read the file myvoc.rdf (assumed to be in RDF/XML format due to the extension) and write output into the file myvoc-skos.rdf as RDF/XML.
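
Choosing a format from the file extension, as described above, can be sketched roughly as follows. This is an illustration under stated assumptions, not Skosify's own code: the EXTENSION_FORMATS table, the guess_rdf_format name and the fallback to RDF/XML for unknown extensions are all hypothetical. The format strings are the names rdflib commonly uses for its parsers and serializers.

```python
import os

# Assumed mapping from file extension to an rdflib format name.
# Illustrative only; the table Skosify actually uses may differ.
EXTENSION_FORMATS = {
    ".rdf": "xml",
    ".owl": "xml",
    ".xml": "xml",
    ".ttl": "turtle",
    ".n3": "n3",
    ".nt": "nt",
}


def guess_rdf_format(filename, default="xml"):
    """Guess an RDF serialization from the extension of `filename`.

    Falls back to `default` (an assumption) for unknown extensions.
    """
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_FORMATS.get(extension, default)


print(guess_rdf_format("myvoc.rdf"))
print(guess_rdf_format("myvoc-skos.ttl"))
```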

For detailed help, run:

skosify --help