SQL Sail integration - villaitatti/metapolis GitHub Wiki
SQL Sail integration
Metapolis displays geospatial data by integrating with a PostgreSQL database configured with PostGIS. The SQL Sail integration provides an interface to perform SQL queries on this database and convert the results to RDF format, enabling SPARQL queries over traditional SQL data.
The setup for SQL Sail integration requires the database to be created and a JDBC driver to be installed. For PostgreSQL, this driver can be downloaded from the PostgreSQL JDBC driver page. Once downloaded, the driver should be placed in the application’s lib
folder. Metapolis includes this driver at lib/postgresql-42.2.14.jar
, so no further installation is required.
A sample application demonstrating this integration can be found on GitHub: test repository.
Example
The following example uses a sample table called Artworks, representing artwork data in the SQL database:
id title technique catalogue width
1 Nude Baptist lightly seated, leaning on left elbow, with right hand held up black chalk 228 157
2 Mapping out of the lunette at Poggio a Cajano. black chalk 2151 79
3 Various "putti" for same work, or for the Lucca God the Father Red chalk 361 126
The table Artworks
has the following columns:
- id: the identifier in string format,
- title: the artwork's title in string format,
- technique: the technique used to draw the painting in string format,
- catalogue: is the number used within the catalogue,
- width: is the artwork width and it is an integer.
Configuration
To implement SQL Sail, create a dedicated repository similar to the RESTSail configuration. SQL Sail allows direct querying and retrieval of SQL database data within the RDF model, making SQL data accessible via SPARQL.
Service Descriptor
This service descriptor maps SQL table columns to RDF properties, defines constraints for query inputs, and specifies predefined SQL queries that can be executed by the SPARQL endpoint.
Within the prefixes there are a few to note.
- rssqlsail: – Namespace for properties associated with SQL Sail configurations.
- ephedra: – Namespace specific to this service configuration.
PREFIX sp: <http://spinrdf.org/sp#>
PREFIX spin: <http://spinrdf.org/spin#>
PREFIX spl: <http://spinrdf.org/spl#>
PREFIX : <http://www.researchspace.org/resource/system/repository#>
PREFIX rssqlsail: <http://www.researchspace.org/resource/system/sql#>
PREFIX ephedra: <http://www.researchspace.org/resource/system/ephedra#>
:artworks a ephedra:Service ;
<http://www.openrdf.org/config/sail#sailType> "researchspace:SQLSail" ;
rdfs:label "A wrapper for the SQL Artworks service." ;
ephedra:hasSPARQLPattern (
[ sp:subject :_results; sp:predicate rssqlsail:hasQueryId; sp:object :_queryId ]
[ sp:subject :_results; sp:predicate rssqlsail:hasTitle; sp:object :_title ]
[ sp:subject :_results; sp:predicate rssqlsail:hasTechnique; sp:object :_technique ]
[ sp:subject :_results; sp:predicate rssqlsail:hasInputTechnique; sp:object :_inputTechnique ]
[ sp:subject :_results; sp:predicate rssqlsail:hasCatalogue; sp:object :_catalogue ]
[ sp:subject :_results; sp:predicate rssqlsail:hasWidth; sp:object :_width ]
) ;
spin:constraint [ a spl:Argument; spl:predicate :_queryId; spl:valueType xsd:string; rdfs:comment "query ID" ];
spin:constraint [ a spl:Argument; spl:predicate :_inputTechnique; spl:valueType xsd:string; rdfs:comment "The technique parameter used as input" ];
spin:column [ a spin:Column; spl:predicate :_results; spl:valueType rdfs:Resource; rdfs:comment "Row index: will be assigned a random blank node" ];
spin:column [ a spin:Column; spl:predicate :_title; spl:valueType xsd:string; rdfs:comment "Output title" ];
spin:column [ a spin:Column; spl:predicate :_technique; spl:valueType xsd:string; rdfs:comment "Output technique"];
spin:column [ a spin:Column; spl:predicate :_catalogue; spl:valueType xsd:integer; rdfs:comment "Output catalogue"] ;
spin:column [ a spin:Column; rdfs:comment "Output width"; spl:predicate :_width; spl:valueType xsd:integer];
rssqlsail:includesSQLQuery [ rssqlsail:hasQueryId "default"; rssqlsail:text "SELECT * FROM Artworks;" ];
rssqlsail:includesSQLQuery [ rssqlsail:hasQueryId "getBySize"; rssqlsail:text "SELECT * FROM Artworks where technique = ?inputTechnique;" ] .
The service is defined with the URI :artworks, marked as a type of ephedra:Service, which designates it as an SQL Sail service.
The property <http://www.openrdf.org/config/sail#sailType>
specifies the service type, and it is mandatory. It must be researchspace:SQLSail indicating that it’s an SQL-based data source.
:artworks a ephedra:Service ;
<http://www.openrdf.org/config/sail#sailType> "researchspace:SQLSail" ;
rdfs:label "A wrapper for the SQL Artworks service." ;
ephedra:hasSPARQLPattern
defines mappings between the inputs, outputs, and SPARQL properties, making these parameters accessible thorugh SPARQL queries. Each pattern here represents a single column in the Artworks table, or an input parameter, and its RDF representation.
- sp:subject: The general subject for result entries.
- sp:predicate: Defines the predicate (property) representing the SQL column.
- sp:object: The object mapped to a variable or constant within SPARQL.
spin:constraint
defines the constraints for query parameters, specifying the input values that users must provide when querying the database.
- spl:predicate: The predicate name for the input.
- spl:valueType: The expected data type (to be chosen between
xsd:string
,xsd:integer
orxsd:int
,xsd:float
). - rdfs:comment: A brief description.
The two constraint in the example are :_queryID
(which allows users to select the SQL queries by ID), and :_inputTechnique
(that is a parameter on which the query filters artworks).
spin:column
maps each SQL column to RDF output variables for SPARQL queries. These definitions set the output types for each column, allowing the SPARQL system to understand how each SQL result column should be interpreted.
- spl:predicate: The RDF property corresponding to each SQL column.
- spl:valueType: The expected data type (to be chosen between
xsd:string
,xsd:integer
orxsd:int
,xsd:float
). - rdfs:comment: A description of the column.
rssqlsail:includesSQLQuery
specifies the predefined SQL queries available within this service. Each query is defined by a unique ID and the SQL text itself.
NOTE: the query body may contain parameters (e.g., inputTechnique) that must prepend a question mark.
The previous code shows how the RDF triple must be formed. The predicate must be rssqlsail:hasTitle
, where the prefix is defined in the same file, and hasTitle is an arbitrary predicate name. It means the SPARQL subject is related to an element with _title
as spl:predicate
.
Repository Configuration File
The repository configuration file defines the repository ID, database connection details, and service endpoints. Below is an example:
Repository configuration file
@prefix repo: <http://www.openrdf.org/config/repository#> .
@prefix reposail: <http://www.openrdf.org/config/repository/sail#> .
@prefix sail: <http://www.openrdf.org/config/sail#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ephedra: <http://www.researchspace.org/resource/system/ephedra#> .
@prefix rsrepo: <http://www.researchspace.org/resource/system/repository#> .
@prefix rssqlsail: <http://www.researchspace.org/resource/system/sql#> .
@prefix : <http://www.researchspace.org/resource/system/repository#> .
[] a repo:Repository ;
repo:repositoryID "artworks" ;
rdfs:label "A wrapper for the SQL test repository." ;
repo:repositoryImpl [
repo:repositoryType "openrdf:SailRepository" ;
reposail:sailImpl [
sail:sailType "researchspace:SQLSail" ;
ephedra:serviceURL "jdbc:postgresql://{database url}/{database name}" ;
rsrepo:username "{username here}" ;
rsrepo:password "{password here}" ;
ephedra:implementsService :artworks ;
] ;
] .
The ephedra:serviceURL
is generally composed as follows:
jdbc:{dbms service}://{url}:{port}/{database_name}
. The DBMS service can be postgresql, mysql, or other services name. For this use case, the URL service is the following: jdbc:postgresql://localhost:3306/artworks_data.
The rsrepo:username
and rsrepo:password
must be passed in plain text in the configuration file; otherwise, they can be passed through the secret resolver (see Repository Manager).
Querying the system
With the repository and descriptor loaded, you can query SQL data using SPARQL. For instance, to retrieve all rows in the Artworks
table, use the default query ID:
PREFIX rssqlsail: <http://www.researchspace.org/resource/system/sql#>
SELECT ?title ?technique ?catalogue ?width WHERE {
?subject rssqlsail:hasQueryId "default".
?subject rssqlsail:hasTitle ?title.
?subject rssqlsail:hasTechnique ?technique.
?subject rssqlsail:hasCatalogue ?catalogue.
?subject rssqlsail:hasWidth ?width.
}
For more complex queries requiring parameters, like filtering by technique, use the getBySize
query:
PREFIX rssqlsail: <http://www.researchspace.org/resource/system/sql#>
SELECT ?title ?technique ?catalogue ?width WHERE {
?subject rssqlsail:hasQueryId "getBySize".
?subject rssqlsail:hasTitle ?title.
?subject rssqlsail:hasTechnique ?technique.
?subject rssqlsail:hasCatalogue ?catalogue.
?subject rssqlsail:hasWidth ?width.
?subject rssqlsail:hasInputTechnique "black chalk"
}
These examples demonstrate how SPARQL queries can access SQL data seamlessly using predefined descriptors, enabling integration between relational databases and semantic queries.