Triplestores - ge-semtk/semtk GitHub Wiki

This page provides details about working with specific triplestores.

Performance test data for Fuseki, BlazeGraph, and Virtuoso is in [performance-to-2M-fuseki-blaze-virt.xlsx](https://github.com/ge-semtk/semtk/blob/master/documentationFiles/performance-to-2M-fuseki-blaze-virt.xlsx). Because these tests were run on a single 9th-gen i7 machine, Amazon Neptune cloud performance cannot be fairly compared.

Apache Fuseki

Security is controlled by run/shiro.ini. The file contains instructions on how to:

  • allow admin access beyond localhost
  • set up an admin password

BlazeGraph

"Port in use" error on Windows

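If BlazeGraph won't start on Windows because the port is in use, open Windows PowerShell as administrator and find the process that holds the port:

netstat -ano | findstr :9999

The last column of the output is the process ID (for example, 1234). Kill that process:

taskkill /pid 1234

If the process does not exit, add /F to force termination. Then restart BlazeGraph.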
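The netstat lookup above can also be scripted. The following is a minimal Python sketch, not part of SemTK: the function name `pids_on_port` and the sample output are illustrative assumptions. It matches only the local-address column, so a port such as 99990 or a remote connection to port 9999 is not reported by mistake.

```python
def pids_on_port(netstat_output, port):
    """Return the set of PIDs in `netstat -ano` output whose local address uses `port`."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: Proto  Local Address  Foreign Address  [State]  PID
        if len(fields) < 4 or fields[0] not in ("TCP", "UDP"):
            continue
        local_address, pid = fields[1], fields[-1]
        if local_address.endswith(f":{port}") and pid.isdigit():
            pids.add(int(pid))
    return pids


# Hypothetical sample of Windows `netstat -ano` output, for illustration only.
SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:9999           0.0.0.0:0              LISTENING       1234
  TCP    127.0.0.1:50123        127.0.0.1:9999         ESTABLISHED     4321
"""

print(pids_on_port(SAMPLE, 9999))  # {1234}
```

To use it against a live machine, pass in the text captured from `netstat -ano` (for example with `subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout`) instead of the sample string.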

OpenLink Virtuoso

  • download and start instructions: Virtuoso
  • sample SemTK connection string is http://localhost:8890

Note: Virtuoso is SPARQL 1.0 compliant, but not SPARQL 1.1 compliant. Some literals in VALUES clauses, FILTER expressions, and similar constructs may match differently. SemTK attempts to reconcile these differences, with fairly good but not perfect success. In addition, our team has experienced Virtuoso failures that result in incomplete query results with no other obvious symptoms.
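To check that a Virtuoso instance is reachable at the sample connection string before pointing SemTK at it, a small request can be built by hand. This is a hedged sketch, not SemTK code: the function name `build_query_request` is hypothetical, and it assumes Virtuoso's default `/sparql` endpoint path. The test query uses only SPARQL 1.0 features.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_request(server_url, query):
    """Build a SPARQL-protocol GET request against <server_url>/sparql (assumed path)."""
    params = urlencode({"query": query})
    return Request(
        f"{server_url.rstrip('/')}/sparql?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


# A SPARQL 1.0 query that returns at most one triple.
request = build_query_request(
    "http://localhost:8890",
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 1",
)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` sends the query; a JSON response indicates the endpoint is up and answering.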

Using SemTK with AWS Neptune

SemTK supports the AWS Neptune triplestore.

Requirements

  • Neptune cluster
  • S3 bucket for uploading data to Neptune

Configuring SemTK to use AWS Neptune

Add these exports to semtk-opensource/ENV_OVERRIDE, customizing them for your environment:

  • export NEPTUNE_UPLOAD_S3_CLIENT_REGION=region
  • export NEPTUNE_UPLOAD_S3_BUCKET_NAME=bucket-name (bucket must be accessible from the instance on which SemTK is running, and from Neptune)
  • export NEPTUNE_UPLOAD_S3_AWS_IAM_ROLE_ARN=arn:aws:iam::555555555555:role/app/role-id (allows Neptune to upload from S3 bucket - only needed if IAM authentication is enabled)
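Before starting SemTK, the exports above can be sanity-checked with a short script. The sketch below is an illustration only and is not part of SemTK: `check_neptune_env` is a hypothetical helper, it assumes the region and bucket variables are always required, and the ARN pattern is a loose shape check rather than full validation.

```python
import os
import re

# Assumed required in every Neptune setup.
REQUIRED = ("NEPTUNE_UPLOAD_S3_CLIENT_REGION", "NEPTUNE_UPLOAD_S3_BUCKET_NAME")
# Only needed if IAM authentication is enabled.
IAM_ROLE = "NEPTUNE_UPLOAD_S3_AWS_IAM_ROLE_ARN"
# Loose shape of an IAM role ARN: arn:aws:iam::<12-digit account>:role/<path/name>
ROLE_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


def check_neptune_env(env):
    """Return a list of problems found in a mapping of environment variables."""
    problems = []
    for name in REQUIRED:
        if not env.get(name, "").strip():
            problems.append(f"{name} is not set")
    arn = env.get(IAM_ROLE, "").strip()
    if arn and not ROLE_ARN.match(arn):
        problems.append(f"{IAM_ROLE} does not look like an IAM role ARN")
    return problems


if __name__ == "__main__":
    for problem in check_neptune_env(os.environ):
        print(problem)
```

The script prints nothing when the variables look plausible. It does not verify that the bucket exists or that the role grants Neptune access to it.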

Connecting to Neptune

To create a Neptune connection in SPARQLgraph, use these values: