Quest For The Query - MartijnKeesmaat/frontend-data GitHub Wiki

Proof of concept

The first concept was to show the categories with their respective sub-categories. Retrieving the relations between the categories proved to be difficult: it is not possible with just one query. What could work is to write a new query for each main category, then one for its sub-categories, and then one for every sub-sub-category. We would then display the data in a sunburst diagram like this.

This diagram seems difficult to re-create: I would have to copy a big part of the code to get it to work. I am also not excited about showing only the categories. It would be more fun to show multiple types of data.
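For reference, a sunburst needs its data as a nested tree rather than as flat query rows. The sketch below shows one way that nesting could be done, assuming the per-level queries are merged into flat rows of parent and child labels. The function name `nestCategories`, the row fields `parent` and `child`, and the example rows are all hypothetical.

```javascript
// Hypothetical helper: turn flat { parent, child } rows into a nested
// { name, children } tree, the kind of input hierarchical layouts take.
const nestCategories = (rows, rootName) => {
  const nodes = new Map();
  // Return the node for a name, creating it on first use
  const getNode = name => {
    if (!nodes.has(name)) nodes.set(name, { name, children: [] });
    return nodes.get(name);
  };
  rows.forEach(({ parent, child }) => {
    getNode(parent).children.push(getNode(child));
  });
  return getNode(rootName);
};

// Made-up example rows, not real collection data
const tree = nestCategories(
  [
    { parent: 'root', child: 'A' },
    { parent: 'root', child: 'B' },
    { parent: 'A', child: 'A1' }
  ],
  'root'
);
```

Every extra level in the thesaurus means another query and more rows to merge, which is part of why this concept was dropped.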

Material per category

The next goal was to combine two visualizations Rik asked for:

  1. Top 10 categories
  2. Top 10 Materials

The technical concept is simple: create a graph based on the categories, and use that data to populate a second graph with the most used materials in the chosen category.

Process

Together with Mohamad, I spent two days trying to achieve the desired result.

First, we tried chaining a lot of skos:broader steps to get to the top level.

#+ summary: Get titles of the most common materials
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?matrialLabel (COUNT(?matrialLabel) AS ?matrialCount) WHERE {
  ?cho dct:medium ?matrial .
  ?matrial skos:prefLabel ?matrialLabel .
 # ?matrial skos:broader ?matrialGlobal .
 # ?matrialGlobal skos:broader ?test .
 # ?test skos:prefLabel ?testNaam .

}
GROUP BY ?matrialLabel
ORDER BY DESC(?matrialCount)
LIMIT 11

Then we realized that one category sits above all the others: functionele voorwerpen (functional objects). Its termmaster can be selected in the query and then used with skos:narrower to get all the top-level categories.

#+ summary: Get titles of the most common categories
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?hoofdCategorie  ?matrialLabel (COUNT(?matrialLabel) AS ?matrialCount) WHERE {
  ?cho edm:isRelatedTo <https://hdl.handle.net/20.500.11840/termmaster2802> .
  <https://hdl.handle.net/20.500.11840/termmaster2802> skos:narrower ?hoofdCategorie .
  ?hoofdCategorie skos:prefLabel ?matrialLabel .
}
GROUP BY ?matrialLabel ?hoofdCategorie
ORDER BY DESC(?matrialCount)

Once that worked and we had the termmaster of each main category, we looked at how to use a termmaster to retrieve the materials and countries. This worked pretty well: we got the materials and countries with the most objects. One remaining problem was that the object count was always the same integer.

#+ summary: Get titles of the top ten materials of the first top category
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?matriaalName (COUNT(?matriaalName) AS ?matrialCount) WHERE {
  ?cho edm:isRelatedTo <https://hdl.handle.net/20.500.11840/termmaster2641> .
  <https://hdl.handle.net/20.500.11840/termmaster2641> skos:prefLabel ?categorieLabel .
  ?cho dct:medium ?matriaal .
  ?matriaal skos:prefLabel ?matriaalName .
}
GROUP BY ?matriaalName
ORDER BY DESC(?matrialCount)
LIMIT 10

#+ summary: Get titles of the most common countries of the first category
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?placeName (COUNT(?placeName) AS ?matrialCount) WHERE {
  ?cho edm:isRelatedTo <https://hdl.handle.net/20.500.11840/termmaster2641> .
  <https://hdl.handle.net/20.500.11840/termmaster2641> skos:prefLabel ?categorieLabel .
  ?cho dct:spatial ?place .
  ?place skos:prefLabel ?placeName .
 # ?cho dct:medium ?matriaal .
 # ?matriaal skos:prefLabel ?matriaalName .
 # ?matriaal skos:prefLabel ?matriaalName .

}
GROUP BY ?placeName
ORDER BY DESC(?matrialCount)
LIMIT 10

Final solution

In the final solution, I first fetch every main category with its termmaster link. Each termmaster is combined with the category name and object count into an array of objects. Now that we have the termmasters, we can retrieve the materials of each category: a forEach loop iterates over the categories and fetches new data for each one.

With the help of Ivo I realized that we had been counting the main category itself, which is why the count never changed. An extra step in the query that narrows down what objCount counts got the counting working.

  # summary: Get titles of the most common categories
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  PREFIX dct: <http://purl.org/dc/terms/>
  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  PREFIX edm: <http://www.europeana.eu/schemas/edm/>
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  # count the totals per category
  SELECT ?categoryLabel ?category (COUNT(?allChos) AS ?objCount) WHERE {
    ?cho edm:isRelatedTo <https://hdl.handle.net/20.500.11840/termmaster2802> .
    <https://hdl.handle.net/20.500.11840/termmaster2802> skos:narrower ?category .
    ?category skos:prefLabel ?categoryLabel .
    ?category skos:narrower* ?allChos .
  }

  GROUP BY ?categoryLabel ?category
  ORDER BY DESC(?objCount)
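
The rows returned by the query above can be combined into the array of objects roughly as follows. This is a minimal sketch, assuming the endpoint returns the standard SPARQL JSON results format, where rows live in `results.bindings` and each variable has a `value` field. The function name `toCategories` and the property names `name`, `termmaster` and `value` are illustrative, and wrapping the URI in angle brackets is an assumption so that it can be dropped straight into a later query.

```javascript
// Hypothetical helper: map SPARQL JSON bindings to plain category objects
const toCategories = data =>
  data.results.bindings.map(row => ({
    name: row.categoryLabel.value,
    // Assumption: store the URI as <...> so it is valid inside a SPARQL query
    termmaster: `<${row.category.value}>`,
    // SPARQL JSON delivers numbers as strings, so convert explicitly
    value: Number(row.objCount.value)
  }));

// Made-up response with one row; the label and count are not real data
const exampleResponse = {
  results: {
    bindings: [
      {
        categoryLabel: { type: 'literal', value: 'example category' },
        category: { type: 'uri', value: 'https://hdl.handle.net/20.500.11840/termmaster2641' },
        objCount: { type: 'literal', value: '42' }
      }
    ]
  }
};

const categoriesTermaster = toCategories(exampleResponse);
```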

The next query retrieves the most common materials for each category. Because we loop over every category in the array, we can set the termmaster dynamically.

categoriesTermaster.forEach(category => {
  // Build the materials query for this category's termmaster
  const query = `
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  PREFIX dct: <http://purl.org/dc/terms/>
  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  PREFIX edm: <http://www.europeana.eu/schemas/edm/>
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>

  # count the totals per material
  SELECT ?subcategorie ?materiaalLabel (COUNT(?cho) AS ?choCount) WHERE {
    # get the subcategories of a term in the thesaurus
    ${category.termmaster} skos:narrower* ?subcategorie .
    # get the objects of these subcategories and their material
    ?cho edm:isRelatedTo ?subcategorie .
    ?cho dct:medium ?materiaal .
    # get the label of the material
    ?materiaal skos:prefLabel ?materiaalLabel .
  }

  GROUP BY ?subcategorie ?materiaalLabel
  ORDER BY DESC(?choCount)
  LIMIT 5
  `;
  // ...pass the query to fetchDataFromQuery here
});

The fetchDataFromQuery function is written so that it can be re-used for every fetch. Once the JSON data arrives, it calls a function that was provided as an argument.

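const fetchDataFromQuery = (querySrc, query, outsideScope, responseFn) => {
  fetch(`${querySrc}?query=${encodeURIComponent(query)}&format=json`)
    .then(res => res.json())
    // hand the parsed JSON and the extra value (e.g. the current category) to the callback
    .then(data => responseFn(data, outsideScope));
};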

The function call looks like this. Because the handler is passed as an argument, we can use a different function for each fetch. In this case, it's called handleFetchMaterialPerCategory.

fetchDataFromQuery('https://api.data.netwerkdigitaalerfgoed.nl/datasets/ivo/NMVW/services/NMVW-20/sparql', queryCategories, category, handleFetchMaterialPerCategory);

Now I have an array with 19 items. Each item represents a category that has a name, a value and a material array. Each material also has a name and a value.
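
A handler that produces this structure might look like the sketch below. The real handleFetchMaterialPerCategory is not shown on this page, so everything here is an assumption: the function name `addMaterialsToCategory` and the property name `materials` are hypothetical, and the input is assumed to be SPARQL JSON results plus the category object. Because the materials query groups by subcategory as well as material, the same label can appear more than once, so the sketch sums the counts per label.

```javascript
// Hypothetical handler: attach a materials array to one category object
const addMaterialsToCategory = (data, category) => {
  const totals = new Map();
  data.results.bindings.forEach(row => {
    const label = row.materiaalLabel.value;
    const count = Number(row.choCount.value);
    // Sum counts for labels that occur in several subcategories
    totals.set(label, (totals.get(label) || 0) + count);
  });
  category.materials = [...totals].map(([name, value]) => ({ name, value }));
  return category;
};

// Made-up rows and counts, only to show the resulting shape
const exampleCategory = addMaterialsToCategory(
  {
    results: {
      bindings: [
        { materiaalLabel: { value: 'hout' }, choCount: { value: '3' } },
        { materiaalLabel: { value: 'katoen' }, choCount: { value: '2' } },
        { materiaalLabel: { value: 'hout' }, choCount: { value: '1' } }
      ]
    }
  },
  { name: 'example category', value: 10 }
);
```

A function with this signature could be passed as the last argument of fetchDataFromQuery, since the callback receives the parsed JSON first and the category second.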