SPARQL Patterns - jneubert/doc GitHub Wiki

Command line execution

With curl, it is safer to use --data-binary than --data for transmitting the query, because only the former preserves line breaks (--data strips carriage returns and newlines when it reads the query from a file).
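Why the line breaks matter can be seen without any endpoint. The following is a minimal sketch that only simulates the stripping with tr; the query text is a made-up example. Once the newlines are gone, the leading comment swallows the rest of the query.

```shell
# Write a small query whose first line is a comment (hypothetical example query).
query_file=$(mktemp)
printf '%s\n' '# count all triples' 'select (count(*) as ?n)' 'where { ?s ?p ?o }' > "$query_file"

# Simulate what --data does when reading from a file: drop CR and LF characters.
flattened=$(tr -d '\r\n' < "$query_file")

# The query is now a single line starting with '#',
# so a SPARQL parser would treat all of it as a comment.
printf '%s\n' "$flattened"
rm -f "$query_file"
```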

SELECT and CONSTRUCT query:

curl --silent -X POST \
  -H "Content-type: application/sparql-query" \
  -H "Accept: text/turtle" \
  --data-binary @/path/to/query/file \
  https://query.wikidata.org/sparql \
  > result.ttl

Accept header values

  • application/sparql-results+json (default)
  • text/turtle
  • application/ld+json
  • application/n-triples
  • text/csv; charset=utf-8
  • text/tab-separated-values; charset=utf-8

UPDATE query

curl --silent -X POST \
  -H "Content-type: application/sparql-update" \
  --data-binary @/path/to/query/file \
  http://localhost:3030/mydata/update \
  > /dev/null

Documentation

Header

# Purpose of the query,
# perhaps spanning multiple lines
#
# Additional comments (optional)
#
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
#
select ...

Result variables

select
  ?var1   # definition of ?var1
  ?var2   # definition of ?var2
where

Statistics

Count all items and, separately, those within a threshold (here: rank up to 100)

select (count(?x) as ?countAll) (sum(?topX) as ?countTop)
where {
  # some pattern, e.g. from a ranking
  ?x :rank ?rank .
  bind(if(?rank <= 100, 1, 0) as ?topX)
}

Percentage

with one decimal place

select (concat(str(round((?part/?total)*1000)/10), ' %') as ?percentage)
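The arithmetic behind this expression (multiply by 1000, round, divide by 10) can be checked outside SPARQL. The sketch below uses awk as a stand-in; the function name percentage is hypothetical, and int(x + 0.5) is used as the rounding step, which is only adequate for non-negative values.

```shell
# Hypothetical helper: ratio as a percentage with one decimal place.
# usage: percentage PART TOTAL
percentage() {
  awk -v part="$1" -v total="$2" \
    'BEGIN { printf "%.1f %%\n", int(part / total * 1000 + 0.5) / 10 }'
}

percentage 1 3   # prints "33.3 %"
percentage 2 3   # prints "66.7 %"
```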

Labels with language fallbacks

# get labels for the wd item, if they exist
optional {
  ?wd rdfs:label ?wdLabelDe .
  filter(lang(?wdLabelDe) = 'de')
}
optional {
  ?wd rdfs:label ?wdLabelEn .
  filter(lang(?wdLabelEn) = 'en')
}
# alternative 1: one label, falling back to the item ID
bind(str(coalesce(?wdLabelEn, ?wdLabelDe, strafter(str(?wd), '/entity/'))) as ?wdLabel)
# alternative 2: combined de | en label
bind(concat(if(bound(?wdLabelDe), str(?wdLabelDe), ''), ' | ', if(bound(?wdLabelEn), str(?wdLabelEn), '')) as ?wdLabel)

Casting

Enables pretty-printing of stringified values while sorting by the computed integer value.

select ?xyz (str(count(?xyz)) as ?prettyCount)
where {
  # ...
}
group by ?xyz
order by desc(xsd:integer(?prettyCount))

Federated queries (SERVICE clause)

Inside the SERVICE clause, always use a sub-select, possibly with grouping, filtering or limits, to restrict the rows fetched from the remote endpoint to the minimum required.
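As an illustration, the sketch below writes such a query to a file, ready to be sent with curl as described under "Command line execution". The file name, the choice of property (wdt:P227, GND ID) and the limit are assumptions made up for the example.

```shell
# Write a federated query to a file (hypothetical file name and pattern).
query_file="${TMPDIR:-/tmp}/federated_example.rq"
cat > "$query_file" <<'EOF'
# Fetch a restricted set of GND IDs from Wikidata
#
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
#
select ?item ?gndId
where {
  service <https://query.wikidata.org/sparql> {
    # only the projected variables and at most 100 rows cross the network
    select ?item ?gndId
    where {
      ?item wdt:P227 ?gndId .
    }
    limit 100
  }
}
EOF
```

Without the inner select, the remote endpoint may have to return every solution of the pattern before the local endpoint can filter or join.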

Naming convention for cached result files

Cached result files are stored in a directory named results below the directory containing the query file. Version information relevant for the result can be referenced in the file name:

{query_name}.{main_dataset}_{version}[{additional_dataset}_{version}].json
  • main_dataset should be identical to the endpoint used
  • additional_dataset may refer to datasets referenced in service clauses or named graphs
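
A minimal sketch of how such a file name could be derived from the path of the query file. The function name result_path and the example arguments are hypothetical, and the optional additional_dataset part is left out.

```shell
# Hypothetical helper: build the cached result path for a query file.
# usage: result_path QUERY_FILE MAIN_DATASET VERSION
result_path() {
  query_dir=$(dirname "$1")
  query_name=$(basename "$1")
  query_name=${query_name%.*}    # strip the file extension, if any
  printf '%s/results/%s.%s_%s.json\n' "$query_dir" "$query_name" "$2" "$3"
}

result_path /path/to/count_items.rq wikidata 2024-01-01
# prints "/path/to/results/count_items.wikidata_2024-01-01.json"
```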