gnd territories - miku/graphapi GitHub Wiki

Analysis of the GND Territories

GOAL: Enrich the GND resources for cities with information from DBpedia.

PROBLEM: There are no explicit links between the GND city resources and DBpedia.

The class gndo:TerritorialCorporateBodyOrAdministrativeUnit (TCBoAU for short) is used for all 'geographical entities'; there is no designated subclass for cities.

How many TCBoAUs are there?

PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
SELECT (COUNT(?tcboau) AS ?total) WHERE {
    ?tcboau a gndo:TerritorialCorporateBodyOrAdministrativeUnit .
}

Right now there are 142,330 TCBoAUs, some of which are linked to external resources via foaf:page or owl:sameAs.

PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT (COUNT(?tcboau) AS ?total) (COUNT(?page) AS ?pages) (COUNT(?uri) AS ?uris) (COUNT(?also_page) AS ?both) WHERE {
    ?tcboau a gndo:TerritorialCorporateBodyOrAdministrativeUnit .
    OPTIONAL {
        ?tcboau foaf:page ?page .
        # FILTER(regex(str(?page), "http://de\\.wikipedia\\.org"))
    }
    OPTIONAL {
        ?tcboau owl:sameAs ?uri .
        # FILTER(regex(str(?uri), "http://(sws|www)\\.geonames\\.org"))
        OPTIONAL {
            ?tcboau foaf:page ?also_page .
        }
    }
}

We get:

?total    ?pages   ?uris    ?both
142,330   24,413   36,953   14,275

Enabling the commented-out filters does not change these numbers, so all linked pages are from the German Wikipedia and all owl:sameAs URIs point to geonames.org.
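The claim about the link targets can also be checked directly, without toggling the filters. The following is a minimal sketch, assuming the same endpoint and data as the queries above; the variable names ?other_page and ?other_uri are introduced here for illustration. It counts links that fall outside the expected hosts, so both counts should be 0 if the claim holds:

```sparql
PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT (COUNT(?other_page) AS ?other_pages) (COUNT(?other_uri) AS ?other_uris) WHERE {
    ?tcboau a gndo:TerritorialCorporateBodyOrAdministrativeUnit .
    {
        # foaf:page links that do not point to the German Wikipedia
        ?tcboau foaf:page ?other_page .
        FILTER(!regex(str(?other_page), "^http://de\\.wikipedia\\.org/"))
    }
    UNION
    {
        # owl:sameAs links that do not point to geonames.org
        ?tcboau owl:sameAs ?other_uri .
        FILTER(!regex(str(?other_uri), "^http://(sws|www)\\.geonames\\.org/"))
    }
}
```

The UNION keeps the two checks independent, so each solution binds exactly one of the two variables and the counts are not inflated by combinations of pages and URIs.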

We are primarily interested in the places that occur as a birth and/or death place. We run the previous query again (the filters can safely be dropped) and restrict it to those places:

PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT (COUNT(?tcboau) AS ?total) (COUNT(?page) AS ?pages) (COUNT(?uri) AS ?uris) (COUNT(?also_page) AS ?both) WHERE {
    {
        SELECT DISTINCT ?tcboau WHERE {
            {
                ?tcboau a gndo:TerritorialCorporateBodyOrAdministrativeUnit .
                ?s1 gndo:placeOfBirth ?tcboau .
            }
            UNION
            {
                ?tcboau a gndo:TerritorialCorporateBodyOrAdministrativeUnit .
                ?s2 gndo:placeOfDeath ?tcboau .
            }
        }
    }
    OPTIONAL {
        ?tcboau foaf:page ?page .
    }
    OPTIONAL {
        ?tcboau owl:sameAs ?uri .
        OPTIONAL {
            ?tcboau foaf:page ?also_page .
        }
    }
}

These numbers look more promising:

?total   ?pages   ?uris    ?both
29,051   12,201   13,538   8,254

Writing links to DBpedia from the links to Wikipedia

Generate new triples that link each GND resource to the corresponding DBpedia resource via owl:sameAs:

PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

INSERT INTO <http://d-nb.info/gnd/> {
   ?tcboau owl:sameAs ?dblink .
}
WHERE {
    {
        SELECT ?tcboau ?page (iri(replace(str(?page), "^http://de\\.wikipedia\\.org/wiki/", "http://de.dbpedia.org/resource/")) AS ?dblink) WHERE {
            ?tcboau a gndo:TerritorialCorporateBodyOrAdministrativeUnit .
            ?tcboau foaf:page ?page .
        }
    }
}
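Before modifying the graph, it can help to preview the triples the update would write. The following is a minimal sketch that reuses the rewrite expression from the INSERT in a read-only SELECT; the LIMIT of 20 is an arbitrary sample size chosen for illustration:

```sparql
PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Read-only preview: shows each Wikipedia page next to the DBpedia IRI derived from it.
SELECT ?tcboau ?page (iri(replace(str(?page), "^http://de\\.wikipedia\\.org/wiki/", "http://de.dbpedia.org/resource/")) AS ?dblink) WHERE {
    ?tcboau a gndo:TerritorialCorporateBodyOrAdministrativeUnit .
    ?tcboau foaf:page ?page .
} LIMIT 20
```

For a page such as http://de.wikipedia.org/wiki/Leipzig (a hypothetical example), the rewrite yields http://de.dbpedia.org/resource/Leipzig. A page URL that does not start with the expected prefix is left unchanged by replace, so such rows are easy to spot in the preview.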

Optionally, delete the foaf:page links:

PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

DELETE FROM <http://d-nb.info/gnd/> {
   ?tcboau foaf:page ?page .
}
WHERE {
   ?tcboau a gndo:TerritorialCorporateBodyOrAdministrativeUnit .
   ?tcboau foaf:page ?page .
}

Writing links to DBpedia from links to geonames

Resources from geonames.org link to DBpedia via rdfs:seeAlso.

ToDo: From each pair of triples of the form

    ?gnd_resource owl:sameAs ?geonames_resource .
    ?geonames_resource rdfs:seeAlso ?dbpedia_resource .

generate a triple

    ?gnd_resource owl:sameAs ?dbpedia_resource .
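One possible way to express this ToDo as an update is sketched below. It is not a tested implementation and rests on several assumptions: the rdfs:seeAlso triples from geonames.org have been loaded into the same store, the geonames URIs used in owl:sameAs match the subjects of those triples exactly, and the store accepts the same INSERT INTO form as the update above. The FILTER that restricts the targets to DBpedia resource IRIs is an added safeguard, in case rdfs:seeAlso also points elsewhere:

```sparql
PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

INSERT INTO <http://d-nb.info/gnd/> {
    ?gnd_resource owl:sameAs ?dbpedia_resource .
}
WHERE {
    ?gnd_resource a gndo:TerritorialCorporateBodyOrAdministrativeUnit .
    ?gnd_resource owl:sameAs ?geonames_resource .
    # Assumption: these triples come from a geonames.org dump loaded alongside the GND data.
    ?geonames_resource rdfs:seeAlso ?dbpedia_resource .
    # Keep only targets that look like DBpedia resources (any language edition).
    FILTER(regex(str(?dbpedia_resource), "^http://([a-z]+\\.)?dbpedia\\.org/resource/"))
}
```

As with the Wikipedia-based links, running the WHERE part as a SELECT first gives a preview of what would be written.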