Queries in n3 - nsip/n3 GitHub Wiki

n3 is a tool that assembles data from disparate sources and allows it to be queried in a single unifying GraphQL interface. This documents the queries that n3 makes available across that data.

Endpoint

In the default installation of n3, the endpoint for GraphQL queries is http://localhost:1323/n3/graphql, and authorisation is done using the HTTP header

{ "Authorization": "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdWQiOiJkZW1vIiwiY25hbWUiOiJteVNjaG9vbCIsInVuYW1lIjoibjNEZW1vIn0.VTD8C6pwbkQ32u-vvuHnxq3xijdwNTd54JAyt1iLF3I" }

To set up a data context and acquire a bearer token, see the importing data section of this wiki. The token above is generated for you if you run the load(.exe) tool bundled with the distribution, which also populates n3 with a demonstration set of sample data. If you want to query the datastore using a GraphQL client such as GraphiQL, remember to add your authorisation token in the relevant section of your client. The examples below assume that you have run load(.exe) and that you have set the authorisation header to the demo value above.
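As a minimal sketch of calling the endpoint from a script, the following Python builds (but does not send) an HTTP request carrying the bearer token. It assumes that the endpoint accepts the conventional GraphQL-over-HTTP POST with a JSON body containing a "query" field, which this page does not state explicitly; the helper name build_n3_request is illustrative and not part of n3.

```python
import json
import urllib.request

# Default endpoint quoted above for a local n3 installation.
N3_ENDPOINT = "http://localhost:1323/n3/graphql"


def build_n3_request(query, token, endpoint=N3_ENDPOINT):
    """Build a POST request for a GraphQL query (hypothetical helper).

    Assumption: n3 accepts a JSON body of the form {"query": "..."}.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen would send it; that step needs a running n3 instance and the token generated for your context.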

Schemas

Dynamic Schemas

n3 ingests heterogeneous objects, and assigns each of them an identifier.

n3 dynamically formulates schemas of the data it receives in a context as it ingests each object, and it updates those schemas as it receives more data. If you access the GraphQL endpoint of n3 in a given context, and view the Schemas exposed by n3 for that context, you will see a different type for each JSON object and each XML object ingested by n3, as well as schemas for each nested container in those objects.

For example, an ingest of the following SIF-AU StudentPersonal object

{
        "StudentPersonal":
        {
            "RefId": "82656FA0-17B6-42BF-9915-487360FDF361",
            "LocalId": "nwkoe858",
            "PersonInfo":
            {
                "Name":
                {
                    "Type": "LGL",
                    "FamilyName": "Bailey",
                    "GivenName": "Jacquelin"
                },
                "Demographics":
                {
                    "IndigenousStatus": "3",
                    "Sex": "2",
                    "BirthDate": "2004-06-02"
                },
                "EmailList":
                {
                    "Email": [
                    {
                        "Type": "01",
                        "value": "[email protected]"
                    }]
                }
            }
        }
    }

will generate the following types and add them to the GraphQL schema: StudentPersonal, PersonInfo, Demographics, EmailList, Email, Name. If a StudentPersonal object with additional fields is loaded later, the GraphQL schema is updated to reflect the new content as the object is ingested. That means that n3 does not need to know the schemas of ingested objects in advance to enable queries on them.
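The rule that every nested container becomes a type can be illustrated with a short Python sketch. This is not n3's implementation; it only assumes that a container is a JSON object, or an array of JSON objects, and that the container's label becomes the type name. The function name container_type_names is hypothetical.

```python
def container_type_names(obj, names=None):
    """Collect the labels of all nested containers in a parsed JSON object.

    Assumption: a container is a JSON object, or an array whose items
    are JSON objects; simple values do not produce a type.
    """
    if names is None:
        names = set()
    for key, value in obj.items():
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                names.add(key)
                container_type_names(item, names)
    return names


# Abbreviated version of the StudentPersonal object shown above.
student = {
    "StudentPersonal": {
        "RefId": "82656FA0-17B6-42BF-9915-487360FDF361",
        "PersonInfo": {
            "Name": {"Type": "LGL", "FamilyName": "Bailey"},
            "Demographics": {"Sex": "2"},
            "EmailList": {"Email": [{"Type": "01"}]},
        },
    }
}

print(sorted(container_type_names(student)))
# ['Demographics', 'Email', 'EmailList', 'Name', 'PersonInfo', 'StudentPersonal']
```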

Currently the schemas are not namespaced. If the same tag or attribute value is used in two different schemas, with different content, n3 will treat them as a union, and will allow the attribute values of either tag in its schema. So for example, if a completely different JSON object contains:

        "NewObject":
        {
            "Name": {
                "nomen": "Fred" 
            }
        }

the definition of the Name type in the schema is updated as follows, conflating the SIF and non-SIF types:

type Name {
  GivenName: String
  FamilyName: String
  Type: String
  nomen: String
}

If on the other hand "Name" is a simple type (not an array or object in JSON; single element with no attributes in XML), it will be treated as a type attribute in the GraphQL Schema, and will not be conflated with the "Name" type definition. So in the following, "Name" is left as an attribute of courses, and is not merged into the Name type:

    "courses": [
    {
        "Name": "Features Of Places",
        "outcomes": [
        {
            "description": "describes features of places and the connections people have with places",
            "id": "GE1-1"
        }]
    }]
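The two behaviours described here (fields of same-named containers are merged, while a simple value stays a field of its parent) can be sketched in Python as follows. This is an illustration under the assumption that types are keyed only by container label; merge_fields is a hypothetical name, and field types are omitted for brevity.

```python
def merge_fields(obj, types):
    """Accumulate field names per container label across ingested objects."""
    for key, value in obj.items():
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Same label means same type, so the field lists are unioned.
                fields = types.setdefault(key, [])
                for field in item:
                    if field not in fields:
                        fields.append(field)
                merge_fields(item, types)
    return types


types = {}
merge_fields(
    {"StudentPersonal": {"PersonInfo": {"Name": {
        "GivenName": "Jacquelin", "FamilyName": "Bailey", "Type": "LGL"}}}},
    types,
)
merge_fields({"NewObject": {"Name": {"nomen": "Fred"}}}, types)
# Here "Name" is a simple value, so it only becomes a field of "courses".
merge_fields({"courses": [{"Name": "Features Of Places"}]}, types)

print(types["Name"])     # ['GivenName', 'FamilyName', 'Type', 'nomen']
print(types["courses"])  # ['Name']
```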

Merging types from different data models, of course, is contrary to well-managed data design, but the purpose of n3 is to be robust enough to cope with disparate data, which does not align to a single data model, and which may not have been explicitly modelled at all. If namespacing data is critical to your operations, you may need to do one of the following:

  • Wrap the data in a distinct container for each data model; so all SIF objects would be mapped in {"sif": ... }, all xAPI objects in {"xAPI": ...}, etc. Because GraphQL will navigate through ingested objects from their root nodes, this guarantees that there will be no actual clashes in any predicates stored in n3: the root wrapper acts as a namespace for all its child nodes.
  • Intervene in the data to prefix tags as needed: while the wrapper will prevent collisions in the stored predicates, the GraphQL schema types will still be inferred as unions if there is a name clash, as described above, unless the actual tag names are altered.

n3 Schema

In contrast to the ingested objects, the GraphQL schema of n3 itself is deliberately quite thin:

schema {
  query: n3query
}

type n3query {
  q(qspec: QueryInput!): n3data
}

input QueryInput {
  queryType: QueryType!
  queryValue: String!
  traversal: [String!]
  filters: [FilterSpec!]
}

enum QueryType {
  findById
  findByType
  findByValue
  findByPredicate
  traversalWithId
  traversalWithValue
}

input FilterSpec {
  eq: [String!]
}

The n3data type, returned by any n3 GraphQL query, is the accessor for the objects ingested by n3: it contains one field for each distinct object type ingested (as identified by its root node label). For the sample data included in the n3 distribution, it is mapped as follows:

type n3data {
  Syllabus: [Syllabus]
  StudentPersonal: [StudentPersonal]
  TeachingGroup: [TeachingGroup]
  SchoolCourseInfo: [SchoolCourseInfo]
  TimeTableSubject: [TimeTableSubject]
  StaffPersonal: [StaffPersonal]
  StudentAttendanceTimeList: [StudentAttendanceTimeList]
  XAPI: [XAPI]
  Subject: [Subject]
  Lesson: [Lesson]
  SchoolInfo: [SchoolInfo]
  GradingAssignment: [GradingAssignment]
}

So any n3 GraphQL query will return zero or more instances of each of the object types seen by n3.

The query functionality is discussed immediately below.

Queries

Simple queries

As the schema shows, the only content the n3 GraphQL endpoint recognises is a query, q, with a single parameter, qspec, containing the details of the query. For example, the following is a query for all StaffPersonal objects on the endpoint:

{  
  q(qspec: {
     queryType: findByType,
     queryValue: "StaffPersonal"
    }) {
    StaffPersonal{
        RefId
        PersonInfo {
          Name {
            FamilyName
            GivenName
          }
        }
      }
  }
} 

As usual in GraphQL, the selection can be narrowed to return only a subset of the available information. The following query limits the StaffPersonal objects returned to include only StaffPersonal.RefId and StaffPersonal.PersonInfo.Name.FamilyName:

{  
  q(qspec: {
     queryType: findByType,
     queryValue: "StaffPersonal"
    }) {
      StaffPersonal {
        RefId
        PersonInfo {
          Name {
            FamilyName
          }
        }
      }
    }
}

Based on the sample data, this returns:

{
  "data": {
    "q": {
      "StaffPersonal": [
        {
          "PersonInfo": {
            "Name": {
              "FamilyName": "Kinney"
            }
          },
          "RefId": "15A1FB68-1379-4827-8AD7-4D8749010D90"
        },
       ...
      ]
    }
  }
}
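As a small sketch of consuming such a response in Python, the following pulls the family name of each staff member out of the data.q.StaffPersonal array. The response text is the first record shown above; the helper name family_names_by_refid is illustrative.

```python
import json

# First record of the response shown above.
response_text = """
{"data": {"q": {"StaffPersonal": [
  {"PersonInfo": {"Name": {"FamilyName": "Kinney"}},
   "RefId": "15A1FB68-1379-4827-8AD7-4D8749010D90"}
]}}}
"""


def family_names_by_refid(response):
    """Map each StaffPersonal RefId to its family name."""
    # An object type with no matches may be absent or null in the response.
    staff = response["data"]["q"].get("StaffPersonal") or []
    return {s["RefId"]: s["PersonInfo"]["Name"]["FamilyName"] for s in staff}


print(family_names_by_refid(json.loads(response_text)))
# {'15A1FB68-1379-4827-8AD7-4D8749010D90': 'Kinney'}
```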

Parameterised queries

Queries that are made repeatedly can be parameterised, as usual in GraphQL. The foregoing query on StaffPersonal can be presented as a query separate from its parameters, as follows:

{
  query: 
    query teachersQuery($qspec: QueryInput!) {
      q(qspec: $qspec) {
        StaffPersonal {
          RefId
          PersonInfo {
            Name {
              FamilyName
              GivenName
            }
          }
        }
      }
    },
    variables: {
      qspec: {
        queryType: "findByType",
        queryValue: "StaffPersonal"
      }
   }
}
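When the request body is sent as JSON, the query document is a string and the variables are a JSON object. The following Python sketch builds such a body for the query above; it assumes n3 follows the usual "query" plus "variables" encoding, and the names TEACHERS_QUERY and build_payload are illustrative.

```python
import json

TEACHERS_QUERY = """
query teachersQuery($qspec: QueryInput!) {
  q(qspec: $qspec) {
    StaffPersonal {
      RefId
      PersonInfo { Name { FamilyName GivenName } }
    }
  }
}
"""


def build_payload(query, qspec):
    """Serialise a parameterised query and its qspec variable as JSON."""
    return json.dumps({"query": query, "variables": {"qspec": qspec}})


payload = build_payload(
    TEACHERS_QUERY,
    {"queryType": "findByType", "queryValue": "StaffPersonal"},
)
```

The same query string can then be reused with a different qspec, for example one using findById and a RefId, without editing the query text.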

Find queries

There are four find queries defined for n3 GraphQL, as specified in qspec.queryType:

  • findByType looks up all objects of the type given in qspec.queryValue:
{  
  q(qspec: {
     queryType: findByType,
     queryValue: "StaffPersonal"
    }) {
      StaffPersonal {
        RefId
        PersonInfo {
          Name {
            FamilyName
          }
        }
      }
    }
}

returns:

{
  "data": {
    "q": {
      "StaffPersonal": [
        {
          "PersonInfo": {
            "Name": {
              "FamilyName": "Kinney"
            }
          },
          "RefId": "15A1FB68-1379-4827-8AD7-4D8749010D90"
        },
...
      ]
    }
  }
}
  • findById looks up the object with the ID given in qspec.queryValue; the ID is the unique identifier assigned to the object as discussed above, whether as an object attribute specified for the given data model in configuration, a combination of attributes as a primary key, or a randomly generated identifier.
{  
  q(qspec: {
     queryType: findById,
     queryValue: "15A1FB68-1379-4827-8AD7-4D8749010D90"
    }) {
      StaffPersonal {
        RefId
        PersonInfo {
          Name {
            FamilyName
          }
        }
      }
    }
}

returns:

{
  "data": {
    "q": {
      "StaffPersonal": [
        {
          "PersonInfo": {
            "Name": {
              "FamilyName": "Kinney"
            }
          },
          "RefId": "15A1FB68-1379-4827-8AD7-4D8749010D90"
        }
      ]
    }
  }
}
  • findByValue finds any objects that contain as an attribute value either qspec.queryValue itself, or a value starting with qspec.queryValue.
{  
  q(qspec: {
     queryType: findByValue,
     queryValue: "Kin"
    }) {
      StaffPersonal {
        RefId
        PersonInfo {
          Name {
            FamilyName
          }
        }
      }
    }
}

returns:

{
  "data": {
    "q": {
      "StaffPersonal": [
        {
          "PersonInfo": {
            "Name": {
              "FamilyName": "Kinney"
            }
          },
          "RefId": "15A1FB68-1379-4827-8AD7-4D8749010D90"
        },
        {
          "PersonInfo": {
            "Name": {
              "FamilyName": "Knoll"
            }
          },
          "RefId": "D4A3C1E3-3F6E-4B31-ABA6-26809DF5FD63"
        }
      ]
    }
  }
}

The search can find internal n3 values as well; for example, queryValue: "Property.Link" will return all objects linked to other objects through a non-ID field (because that link is represented internally as a "Property.Link" object of an is-a predicate).

  • findByPredicate finds any objects that contain the predicate named in qspec.queryValue, or a substring of it. Predicates in n3 are the JSON Path from the root of the object to an attribute value; so findByPredicate will return all instances of a structure in an ingested object, comparable to an XPath query. A query on StaffPersonal will return all objects with StaffPersonal at the root (so it is doing the same thing as findByType); a query on StaffPersonal.PersonInfo.Name.FamilyName will return all objects that contain a FamilyName node at the nominated path.
{  
  q(qspec: {
     queryType: findByPredicate,
     queryValue: "courses.0.name"
    }) {
      Syllabus {
        courses {
          name
        }
      }
    }
}

returns:

{
  "data": {
    "q": {
      "Syllabus": [
        {
          "courses": [
            {
              "name": "Community and rememberance"
            },
            {
              "name": "First contacts"
            }
          ]
        },
....
      ]
    }
  }
}
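To make the notion of a predicate concrete, the following Python sketch flattens a parsed JSON object into dotted paths, with array positions as numeric segments. The dotted form is inferred from the example values on this page (such as courses.0.name); n3's internal representation may differ, and the function name predicates is illustrative.

```python
def predicates(obj, prefix=""):
    """Yield (path, value) pairs for every simple value in a parsed JSON object."""
    if isinstance(obj, dict):
        children = obj.items()
    elif isinstance(obj, list):
        # Array positions become numeric path segments.
        children = enumerate(obj)
    else:
        yield prefix, obj
        return
    for key, child in children:
        path = f"{prefix}.{key}" if prefix else str(key)
        yield from predicates(child, path)


syllabus = {
    "courses": [
        {"Name": "Features Of Places", "outcomes": [{"id": "GE1-1"}]}
    ]
}

for path, value in predicates(syllabus):
    print(path, "=", value)
# courses.0.Name = Features Of Places
# courses.0.outcomes.0.id = GE1-1
```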

Traversal queries

Traversal queries walk the graph formed by the link elements extracted according to the data model, as discussed above, and return a listing of the objects related to an initial node that can be reached by following those links. That means that a traversal can return all available information related to the initial node. If no links are available (because the objects involved are not recognised as belonging to a data model, with nominated link attributes), then there will be no graph to traverse.

There are two mandatory attributes for traversal queries. The initial node to start the graph traversal from is identified through the query parameter queryValue, as either an object ID (traversalWithId), or a value contained within an object (traversalWithValue).

The sequence of object links to explore is a string array given in the query parameter traversal; this array must contain at least two entries. The first entry in the array is the type of the object that contains the starting point. So

{  
  q(qspec: {
     queryType: traversalWithId,
     queryValue: "15A1FB68-1379-4827-8AD7-4D8749010D90"
     traversal: ["StaffPersonal", "TeachingGroup"]
    }) {
      StaffPersonal {
        PersonInfo {
          Name {
            FamilyName
          }
        }
      }
      TeachingGroup {
        ShortName
      }      
    }
}

returns:

{
  "data": {
    "q": {
      "StaffPersonal": [
        {
          "PersonInfo": {
            "Name": {
              "FamilyName": "Kinney"
            }
          }
        }
      ],
      "TeachingGroup": [
        {
          "ShortName": "7A English 2"
        },
        {
          "ShortName": "8A Geography 2"
        },
        {
          "ShortName": "7B Geography 1"
        },
        {
          "ShortName": "8B English 2"
        },
        {
          "ShortName": "8A English 2"
        }
      ]
    }
  }
}

That means that the graph traversal should include the starting object StaffPersonal (with ID 15A1FB68-1379-4827-8AD7-4D8749010D90), and all TeachingGroups that are linked to that StaffPersonal (by having the same RefId as an attribute value).
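The requirements on a traversal query (a traversal query type, a starting value, and at least two entries in the traversal array) can be checked before a request is sent. The following Python sketch builds a qspec dictionary suitable for use as a GraphQL variable; the helper name traversal_qspec is hypothetical, and the checks reflect only what is stated on this page.

```python
TRAVERSAL_TYPES = {"traversalWithId", "traversalWithValue"}


def traversal_qspec(query_type, query_value, traversal, filters=None):
    """Build a qspec for a traversal query, validating the mandatory parts."""
    if query_type not in TRAVERSAL_TYPES:
        raise ValueError("queryType must be one of " + ", ".join(sorted(TRAVERSAL_TYPES)))
    if not query_value:
        raise ValueError("queryValue (the starting ID or value) is mandatory")
    if len(traversal) < 2:
        raise ValueError("traversal must contain at least two entries")
    qspec = {
        "queryType": query_type,
        "queryValue": query_value,
        "traversal": list(traversal),
    }
    if filters:
        qspec["filters"] = filters
    return qspec


spec = traversal_qspec(
    "traversalWithId",
    "15A1FB68-1379-4827-8AD7-4D8749010D90",
    ["StaffPersonal", "TeachingGroup"],
)
```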

Links are established between objects in order of priority.

  • If there is an available link from object ID to object ID -- that is, if the ID of the target object is recovered as a link attribute in the source object -- then that is the only link traversed between the two object classes.
  • If no such link is available, a link is instead established if the two objects share a Property Link: if both objects share an attribute, and that attribute is defined as a link in the data model for one of the objects.
  • If no such link is available, a link is instead established if the two objects share a Unique Link: a compound primary key that both objects have in common.
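The priority order above can be sketched as a single function in Python. This is a simplification and not n3's code: it assumes each object is summarised as a dictionary with hypothetical keys "id", "id_links", "property_links" and "unique_key", compares property links by equality only, treats ID links as traversable in either direction, and decides per pair of objects rather than per pair of object classes.

```python
def link_kind(a, b):
    """Return the highest-priority kind of link connecting two objects, if any."""
    # 1. ID link: one object carries the other's ID as a link attribute.
    if b["id"] in a["id_links"] or a["id"] in b["id_links"]:
        return "id"
    # 2. Property Link: the objects share a value of a link attribute.
    if a["property_links"] & b["property_links"]:
        return "property"
    # 3. Unique Link: the objects share a compound primary key.
    if a["unique_key"] is not None and a["unique_key"] == b["unique_key"]:
        return "unique"
    return None


# Illustrative summaries; the group ID and key values are made up.
staff = {"id": "15A1FB68-1379-4827-8AD7-4D8749010D90", "id_links": set(),
         "property_links": set(), "unique_key": None}
group = {"id": "TG-1", "id_links": {"15A1FB68-1379-4827-8AD7-4D8749010D90"},
         "property_links": set(), "unique_key": None}
subject = {"id": "S-1", "id_links": set(), "property_links": set(),
           "unique_key": ("hsie", "History", "4")}
syllabus = {"id": "Y-1", "id_links": set(), "property_links": set(),
            "unique_key": ("hsie", "History", "4")}

print(link_kind(staff, group))       # id
print(link_kind(subject, syllabus))  # unique
print(link_kind(staff, subject))     # None
```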

So the following query traverses from a staff member all the way to the lessons they teach, using the staff surname as the starting point:

{  
  q(qspec: {
     queryType: traversalWithValue,
     queryValue: "Kin"
     traversal: [
		"StaffPersonal",
		"TeachingGroup",
		"GradingAssignment",
		"XAPI",
		"Subject",
		"Syllabus",
		"Lesson"
	]
    }) {
      StaffPersonal {
        PersonInfo {
          Name {
            FamilyName
          }
        }
      }
      TeachingGroup {
        ShortName
        TeachingGroupPeriodList {
          TeachingGroupPeriod {
            DayId
          }
        }
      }  
      Lesson {
        learning_area
        title
      }
    }
}

returns:

{
  "data": {
    "q": {
      "Lesson": [
        {
          "learning_area": "hsie",
          "title": "Lesson 2 Sequence 1: The Ancient World"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 2 Sequence 1: The Ancient World"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 2 Sequence 1: The Ancient World"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1: The Ancient World"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1: The Ancient World"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1: The Ancient World"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 2 Sequence 1: The Ancient World"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1: The Ancient World"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 2 Sequence 1: The Ancient World"
        },
        {
          "learning_area": "hsie",
          "title": "Lesson 1 Sequence 1: The Ancient World"
        }
      ],
      "StaffPersonal": [
        {
          "PersonInfo": {
            "Name": {
              "FamilyName": "Kinney"
            }
          }
        }
      ],
      "TeachingGroup": [
        {
          "ShortName": "8A Geography 2",
          "TeachingGroupPeriodList": {
            "TeachingGroupPeriod": [
              {
                "DayId": "We"
              },
              {
                "DayId": "We"
              }
            ]
          }
        },
        {
          "ShortName": "8A English 2",
          "TeachingGroupPeriodList": {
            "TeachingGroupPeriod": [
              {
                "DayId": "We"
              },
              {
                "DayId": "Th"
              }
            ]
          }
        },
        {
          "ShortName": "7B Geography 1",
          "TeachingGroupPeriodList": {
            "TeachingGroupPeriod": [
              {
                "DayId": "Fr"
              },
              {
                "DayId": "Fr"
              }
            ]
          }
        },
        {
          "ShortName": "7A English 2",
          "TeachingGroupPeriodList": {
            "TeachingGroupPeriod": [
              {
                "DayId": "We"
              },
              {
                "DayId": "We"
              }
            ]
          }
        },
        {
          "ShortName": "8B English 2",
          "TeachingGroupPeriodList": {
            "TeachingGroupPeriod": [
              {
                "DayId": "We"
              },
              {
                "DayId": "Fr"
              }
            ]
          }
        }
      ]
    }
  }
}

Note that the traversal from GradingAssignment to XAPI, and from XAPI to Subject, is via Property Link: there are no direct links to object IDs connecting these objects, but the object ID in XAPI, defined to be a link attribute, is the URL of the GradingAssignment; and the object definition name in XAPI, defined to be a link attribute, contains as a prefix the name of the Subject. Similarly, the link from Subject to Syllabus to Lesson is because all three share the Unique Link of learning area, subject, and stage.

Filtered queries

Traversal queries will explore all links specified in the traversal parameter; this can easily lead to a combinatorial explosion of options. Filters can be imposed on traversal queries. Each filter consists of a test (currently only "eq") applied to a list of three strings: an object type, a path within that object, and a value. That is, each filter expresses the requirement that any object of that type has a value at the nominated path equal to the supplied value. So

[
  { "eq": ["StudentPersonal", "PersonInfo.Name.FirstName", "Fred"] }
]

expresses the requirement that, for any StudentPersonal object encountered, the value at PersonInfo.Name.FirstName should be equal to "Fred".

If any of the nodes retrieved during traversal matches one of the filters, then further traversal of the graph is limited to the children of the nodes matching the filter.
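How a single "eq" filter is tested against an object can be sketched in Python as follows. The sketch assumes the path is relative to the object's root and that numeric segments index into arrays, as the example paths on this page suggest; value_at and matches_eq are hypothetical names, and this is not n3's implementation.

```python
def value_at(obj, path):
    """Follow a dotted path; numeric segments index into arrays."""
    node = obj
    for segment in path.split("."):
        if isinstance(node, list):
            if not segment.isdigit() or int(segment) >= len(node):
                return None
            node = node[int(segment)]
        elif isinstance(node, dict):
            if segment not in node:
                return None
            node = node[segment]
        else:
            return None
    return node


def matches_eq(filter_spec, object_type, obj):
    """Test one {"eq": [type, path, value]} filter against an object."""
    target_type, path, expected = filter_spec["eq"]
    return object_type == target_type and value_at(obj, path) == expected


# The "8A English 2" teaching group from the traversal result above.
group = {
    "ShortName": "8A English 2",
    "TeachingGroupPeriodList": {
        "TeachingGroupPeriod": [{"DayId": "We"}, {"DayId": "Th"}]
    },
}
thursday = {"eq": ["TeachingGroup",
                   "TeachingGroupPeriodList.TeachingGroupPeriod.1.DayId", "Th"]}

print(matches_eq(thursday, "TeachingGroup", group))  # True
```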

The query above, for example, retrieves all the lessons that a teacher is teaching; they are ultimately accessed via the teaching groups (classes) the teacher is responsible for. If we introduce a filter clause on the days the class is taught, we can constrain the lessons retrieved to those that the teacher teaches on a Thursday:

{  
  q(qspec: {
     queryType: traversalWithValue,
     queryValue: "Kin"
     traversal: [
		"StaffPersonal",
		"TeachingGroup",
		"GradingAssignment",
		"XAPI",
		"Subject",
		"Syllabus",
		"Lesson"
	],
      filters: [
           { eq : ["TeachingGroup", "TeachingGroupPeriodList.TeachingGroupPeriod.1.DayId", "Th"] }
        ]
    })  {
      StaffPersonal {
        PersonInfo {
          Name {
            FamilyName
          }
        }
      }
      GradingAssignment {
        DetailedDescriptionURL
      }
      TeachingGroup {
        RefId
        ShortName
        TeachingGroupPeriodList {
          TeachingGroupPeriod {
            DayId
          }
        }
      }
      XAPI {
        object {
          id
        }
      }
      Subject {
        subject
        stage
      }
      Lesson {
        learning_area
        title
      }
    }
}