Visualizing the Graph Relationships - DDMAL/linkedmusic-datalake GitHub Wiki

A visual of the relationships between entities within a graph (i.e., the graph ontology) is useful both for debugging the graph structure and for writing and debugging SPARQL queries. Once a graph structure is finalized, the visual does not need to be recreated, so automating this process would be overkill.

1. Find the general relationships between entities (the ontology)

Visualizing a graph first requires the general relationships between entities. The graph ontology can be generated by running a SPARQL query on the dataset.

In the Virtuoso SPARQL endpoint, run one of the following SPARQL commands.

Download the results as Turtle (this option only appears once you have written a query containing "CONSTRUCT").

Default SPARQL command

This command has been used to generate the graphs found in the full visualization and the subgraph visualizations.

Unfortunately, due to limitations in Virtuoso's SPARQL engine, we cannot use a GRAPH block with a VALUES block inside a CONSTRUCT query. As a result, to obtain the ontology for more than one database, you will need to run the query once per database and combine the results manually.

When running the query, replace both instances of <<GRAPH_IRI>> with the database's graph IRI (see example).

CONSTRUCT {
  ?stype ?p ?otype .
}
WHERE {
  {
    SELECT DISTINCT ?stype ?p ?otype
    FROM <<GRAPH_IRI>>
    WHERE {
      {
        ?s ?p ?o .
        ?s rdf:type ?stype .
        ?o rdf:type ?otype .
      }
      UNION
      {
        ?s rdfs:label ?o .
        ?s rdf:type ?stype .
        BIND(rdfs:label AS ?p)
        BIND("label" AS ?otype)
      }
      UNION
      {
        ?s skos:altLabel ?o .
        ?s rdf:type ?stype .
        BIND(skos:altLabel AS ?p)
        BIND("alt label" AS ?otype)
      }
    }
  }
  UNION
  {
    SELECT DISTINCT ?stype ?p ?otype
    WHERE {
      {
        SELECT DISTINCT ?stype ?p
        FROM <<GRAPH_IRI>>
        WHERE {
          ?s ?p ?o .
          ?s rdf:type ?stype .
          FILTER(isLiteral(?o) || STRSTARTS(STR(?o), STR(wd:)))
        }
      }
      SERVICE <https://query.wikidata.org/sparql> {
        ?prop wikibase:directClaim ?p .
        ?prop rdfs:label ?otype .
        FILTER(LANG(?otype) = "en")
      }
    }
  }
}

e.g., for DIAMM:

CONSTRUCT {
  ?stype ?p ?otype .
}
WHERE {
  {
    SELECT DISTINCT ?stype ?p ?otype
    FROM diamm:
    WHERE {
      {
        ?s ?p ?o .
        ?s rdf:type ?stype .
        ?o rdf:type ?otype .
      }
      UNION
      {
        ?s rdfs:label ?o .
        ?s rdf:type ?stype .
        BIND(rdfs:label AS ?p)
        BIND("label" AS ?otype)
      }
      UNION
      {
        ?s skos:altLabel ?o .
        ?s rdf:type ?stype .
        BIND(skos:altLabel AS ?p)
        BIND("alt label" AS ?otype)
      }
    }
  }
  UNION
  {
    SELECT DISTINCT ?stype ?p ?otype
    WHERE {
      {
        SELECT DISTINCT ?stype ?p
        FROM diamm:
        WHERE {
          ?s ?p ?o .
          ?s rdf:type ?stype .
          FILTER(isLiteral(?o) || STRSTARTS(STR(?o), STR(wd:)))
        }
      }
      SERVICE <https://query.wikidata.org/sparql> {
        ?prop wikibase:directClaim ?p .
        ?prop rdfs:label ?otype .
        FILTER(LANG(?otype) = "en")
      }
    }
  }
}
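If you prefer not to edit the placeholders by hand, the substitution can be sketched in a few lines of Python. This is only an illustration: the function name `fill_graph_iri` and the shortened stand-in template are assumptions, not part of the wiki's workflow. The check on the placeholder count guards against filling in only one of the two FROM clauses.

```python
PLACEHOLDER = "<<GRAPH_IRI>>"


def fill_graph_iri(template: str, graph: str, expected: int = 2) -> str:
    """Return the query with every <<GRAPH_IRI>> placeholder replaced by `graph`.

    `graph` is inserted verbatim, so pass either a prefixed name such as
    "diamm:" or a full IRI already wrapped in angle brackets.
    (Hypothetical helper, named here for illustration only.)
    """
    count = template.count(PLACEHOLDER)
    if count != expected:
        raise ValueError(f"expected {expected} placeholders, found {count}")
    return template.replace(PLACEHOLDER, graph)


# Shortened stand-in for the full query; it keeps only the two FROM clauses.
template = """SELECT DISTINCT ?stype ?p ?otype
FROM <<GRAPH_IRI>>
WHERE { { SELECT DISTINCT ?stype ?p FROM <<GRAPH_IRI>> WHERE { ?s ?p ?o } } }"""

query = fill_graph_iri(template, "diamm:")
print(query)
```

In practice you would paste the full default query into `template` and call the function once per database.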

Alternative Query - Predicate Centric Visualization

An alternative way of visualizing and understanding a graph is through the domain and range of each predicate. For example, you may find it helpful to know that place of birth (wdt:P19) only has mb:Artist and diamm:Person as subjects (domain), and a Wikidata entity or a literal as objects (range).
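To make the idea of domain and range concrete, here is a minimal Python sketch that summarizes a handful of instance-level triples per predicate. The sample data and the `summarize` function are hypothetical, and the range rule is simplified: every literal is reported as rdfs:Literal rather than by its datatype.

```python
from collections import defaultdict

# Hypothetical sample data. Objects are tagged as ("iri", value) or
# ("literal", value) so the sketch can tell the two apart.
TYPES = {"ex:artist1": "mb:Artist", "ex:person1": "diamm:Person"}
TRIPLES = [
    ("ex:artist1", "wdt:P19", ("iri", "wd:Q90")),
    ("ex:person1", "wdt:P19", ("literal", "Paris")),
]


def summarize(triples, types):
    """Map each predicate to its subject types (domain) and object kinds (range)."""
    domains, ranges = defaultdict(set), defaultdict(set)
    for subject, predicate, (kind, value) in triples:
        if subject not in types:
            continue  # untyped subjects do not contribute a domain
        domains[predicate].add(types[subject])
        if kind == "literal":
            ranges[predicate].add("rdfs:Literal")
        elif value.startswith("wd:"):
            ranges[predicate].add("wd:Entity")  # any Wikidata entity
        else:
            ranges[predicate].add(types.get(value, "owl:Thing"))
    return dict(domains), dict(ranges)


domains, ranges = summarize(TRIPLES, TYPES)
print(sorted(domains["wdt:P19"]))
print(sorted(ranges["wdt:P19"]))
```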

The SPARQL query below constructs an alternative visualization graph (i.e., ontology). Replace <GRAPH_URI> with the database's graph IRI. Again, due to limitations of Virtuoso, to obtain the ontology for more than one database, you will need to run the query once per database and combine the results manually.

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX wd:   <http://www.wikidata.org/entity/>
CONSTRUCT {
  ?property a rdf:Property ;
            rdfs:domain ?domain ;
            rdfs:range ?range .
  ?class a rdfs:Class .
  wd:Entity a rdfs:Class .
}
WHERE {
  GRAPH <GRAPH_URI> {
    {
      SELECT DISTINCT ?class WHERE {
        ?instance rdf:type ?class .
      }
    }
    UNION
    {
      SELECT DISTINCT ?domain ?property ?range WHERE {
        ?subject ?property ?object .
        ?subject rdf:type ?domain .
        OPTIONAL {
          ?object rdf:type ?oType .
        }
        # Determine range
        BIND(
          IF(
            isIRI(?object),
            IF(
              STRSTARTS(STR(?object), "http://www.wikidata.org/entity/"),
              wd:Entity,
              COALESCE(?oType, owl:Thing)
            ),
            IF(
              isLiteral(?object),
              COALESCE(datatype(?object), rdfs:Literal),
              UNDEF
            )
          ) AS ?range
        )
      }
    }
  }
}
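Both queries above have to be run once per database, which leaves one Turtle file per database to combine. One possible way to do this, sketched below, is plain concatenation. It relies on the Turtle rule that a later @prefix line may re-map an earlier prefix, and it assumes the files share no blank node labels. The function name `merge_turtle` and the example.org namespaces are hypothetical.

```python
def merge_turtle(documents):
    """Concatenate Turtle documents into one document.

    Each document's @prefix lines precede its own triples, and Turtle lets a
    later @prefix re-map an earlier one, so every prefixed name stays bound
    to the namespace of the document it came from.
    Assumption: the documents share no blank node labels.
    """
    parts = [doc.strip() for doc in documents if doc.strip()]
    return "\n\n".join(parts) + "\n"


# Hypothetical query results for two databases that reuse the prefix "ns1".
diamm = (
    "@prefix ns1: <https://example.org/diamm/> .\n"
    "ns1:Person ns1:knows ns1:Organization .\n"
)
musicbrainz = (
    "@prefix ns1: <https://example.org/musicbrainz/> .\n"
    "ns1:Artist ns1:memberOf ns1:Group .\n"
)

combined = merge_turtle([diamm, musicbrainz])
print(combined)
```

With real downloads, the strings would come from reading each saved TTL file; whether a given visualizer accepts repeated prefix declarations is worth checking on a small sample first.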

2. Remove language labels from the TTL file

Open the TTL file from Step 1 in the text editor of your choice.

Replace all instances of @en with nothing.
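If the file is too large to edit comfortably, the same replacement can be sketched in Python. This version is slightly narrower than a blind find-and-replace: it assumes labels are written with double quotes, removes @en only when it directly follows a closing quote, and leaves longer tags such as @en-GB alone. The function name `strip_en_tags` is hypothetical.

```python
import re

# Matches @en only when it directly follows a closing double quote and is not
# the start of a longer tag such as @en-GB.
EN_TAG = re.compile(r'"@en(?![A-Za-z0-9-])')


def strip_en_tags(turtle: str) -> str:
    """Remove English language tags from double-quoted Turtle literals."""
    return EN_TAG.sub('"', turtle)


sample = 'ns1:Person ns2:p "place of birth"@en .\n'
print(strip_en_tags(sample), end="")
```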

3. Generate the graph visual

Copy the output from Step 2 and paste it into RDF Grapher.

Make sure the "From format:" is set to "Turtle" and the "To format:" is set to "SVG". If the dataset is large, it may be prudent to check the "Send form as HTTP POST (needed for large RDF data):" option.

Click "Visualize" to generate the graph visual.