Triple‐Edge subgroup proposals - w3c/rdf-star-wg GitHub Wiki

Some inputs:

This page will become a self-contained proposal.

For now some inputs are from

Named Occurrence sketch: https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0000.html

Previous triple term sketch: https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0033.html

and the email threads arising.

There is also: semantics description for 2024Jan/0000.html

Intro

Key concepts / new RDF terms

"What's new?"

Motivating factors / technical use case

  • Unasserted triples, quotation - talking about something that is not asserted (not a triple in the graph)

annotation (talking about something asserted) and quoting (talking about something not asserted) — see near the end of https://www.w3.org/2023/11/16-rdf-star-minutes.html

  • Talking about talking about something that is asserted.

Quotation implies??

Motivating example

Based on annotation syntax? Alternative? Assertion and "occurrence" written explicitly

"Unasserted"

Terminology

RDF Triple

Triples are defined in RDF 1.1. In addition, as defined here, an RDF triple denotes its abstract structure. This is a mathematical or "platonic" type (cf. types and universals). This is why adding a triple to a set has an effect only once: any further addition is the same triple.

RDF Triple Occurrence

An occurrence of a triple is a use of it (presumably through but not as tokens), on the level of the abstract syntax (as in an interpretation of an RDF representation).

This is closely related to the (non-normative) definition of a reified RDF statement:

The subject of a reification is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax, rather than a triple considered as an abstract object. This supports use cases where properties such as dates of composition or provenance information are applied to the reified triple, which are meaningful only when thought of as referring to a particular instance or token of a triple.

There can be only one such asserted occurrence per RDF graph, which is defined as a set of triples. A triple in that set is asserted.

LPG Edge

In Labelled Property Graphs, there can be multiple edges (specifically arcs, i.e. directed edges) between nodes, each labelled with the same "relationship type" (or even with multiple labels on the edge itself). The edge itself can also have other properties, but not relationships. This is different (in many ways) from triples and occurrences thereof. Also, there seems to be no concept of asserted in LPGs.

RDF Named Triple

A named triple is a name associated with a triple. While a triple itself denotes an abstract structure, naming it does not denote this "type", but names a distinct occurrence thereof. The name is an IRI or a bnode. Assigning a name to a triple occurrence does not assert it - it remains unasserted.

There can be multiple such occurrences described in a graph (i.e. as subjects or objects of triples asserted in that graph).

The name of a triple occurrence denotes an RDF claim (now further defined).

RDF Claim

A claim is the conceptual meaning of a triple, i.e. the conceptual relationship it expresses if asserted in a graph, as defined in RDF 1.1 concepts:

Asserting an RDF triple says that some relationship, indicated by the predicate, holds between the resources denoted by the subject and object. This statement corresponding to an RDF triple is known as an RDF statement.

The difference between a claim and this statement is that the statement is defined as asserted, whereas a claim is defined as the meaning it would denote if asserted. A triple may mean different things in different graphs. Unasserted, the claim is "what it would mean" (like a hypothesis).

There is a restriction: a name cannot name ("be paired with") multiple triple occurrences in the same RDF graph. This is the same kind of restriction that RDF named graphs have, in that:

Graph names are unique within an RDF dataset.

The claim that is denoted by such a name however, may be the same as another claim, under certain entailments. Therefore, two names for two different triples, i.e. two different abstract structures, may denote the same intended meaning, i.e. one claim. They still name two different occurrences, since the triples themselves are different, just as <clarkkent> and <superman> remain distinct IRIs even if they denote the same resource.
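The uniqueness restriction on names can be checked mechanically. The following is a minimal sketch, not part of the proposal: it assumes IRIs and blank nodes are plain strings, a triple is a `(s, p, o)` tuple, and a triple occurrence is a tagged tuple `("occ", name, triple)`. The function names are hypothetical.

```python
def occurrences(term):
    """Yield every triple occurrence found in a term, including nested ones."""
    if isinstance(term, tuple) and term and term[0] == "occ":
        yield term
        for part in term[2]:
            yield from occurrences(part)


def name_conflicts(graph):
    """Return the names that are paired with more than one triple in the graph."""
    seen = {}          # name -> first triple seen with that name
    conflicts = set()
    for s, _p, o in graph:
        # Only subjects and objects can be triple occurrences.
        for part in (s, o):
            for _tag, name, triple in occurrences(part):
                if seen.setdefault(name, triple) != triple:
                    conflicts.add(name)
    return conflicts


spo = (":s", ":p", ":o")
xyz = (":x", ":y", ":z")

# Two names for the same triple: allowed, these are two distinct occurrences.
ok_graph = {
    (("occ", "_:a", spo), ":starts", 1999),
    (("occ", "_:b", spo), ":starts", 2005),
}

# One name paired with two different triples: violates the restriction.
bad_graph = {
    (("occ", "_:a", spo), ":starts", 1999),
    (("occ", "_:a", xyz), ":starts", 2005),
}
```

Under these assumptions, `name_conflicts(ok_graph)` is empty, while `name_conflicts(bad_graph)` reports `_:a`.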

RDF Fact

Depending on whether you believe in, or accept, an RDF graph, i.e. a set of triples, it can be interpreted either as meaning a set of statements as facts (some of which can be facts about claims), or just as a set of statements as claims (some of which can make claims about claims).

RDF Data Model

For a named occurrence approach: https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0000.html

graph            ::= (triple)* 
triple           ::= subject predicate object 
subject          ::= iri | BlankNode | tripleOccurrence 
predicate        ::= iri 
object           ::= term 
term             ::= iri | BlankNode | literal | tripleOccurrence 
tripleOccurrence ::= identifier triple
identifier       ::= iri | BlankNode  
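
As an illustration, the grammar above can be mirrored by a few immutable Python types. This is only a sketch: the class names, the `ex` helper, and the `http://example/` IRIs are assumptions made for the example, not part of the proposal.

```python
from dataclasses import dataclass
from typing import Union

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str


@dataclass(frozen=True)
class Triple:
    s: "Subject"   # iri | BlankNode | tripleOccurrence
    p: IRI
    o: "Term"      # iri | BlankNode | literal | tripleOccurrence


@dataclass(frozen=True)
class TripleOccurrence:
    id: Union[IRI, BlankNode]   # identifier ::= iri | BlankNode
    triple: Triple


Subject = Union[IRI, BlankNode, TripleOccurrence]
Term = Union[IRI, BlankNode, Literal, TripleOccurrence]


def ex(local: str) -> IRI:
    """Hypothetical helper: build an IRI in an example namespace."""
    return IRI("http://example/" + local)


spo = Triple(ex("s"), ex("p"), ex("o"))
occ_a = TripleOccurrence(BlankNode("a"), spo)
occ_b = TripleOccurrence(BlankNode("b"), spo)

# A graph is a set of triples: adding an equal triple twice has no effect.
graph = set()
graph.add(Triple(occ_a, ex("starts"), Literal("1999", XSD_INTEGER)))
graph.add(Triple(occ_a, ex("starts"), Literal("1999", XSD_INTEGER)))
graph.add(Triple(occ_b, ex("starts"), Literal("2005", XSD_INTEGER)))
```

Here `graph` holds two triples. The two occurrences share the same underlying triple but are distinct terms because their identifiers differ, and `spo` itself is not a member of the graph, so it is not asserted.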

RDF 1.2 Concepts changes

Semantics

Note:
A term is denoted by r, a triple by t, and a graph by g.
Given a tripleOccurrence r, we denote the identifier of r as r.id, and the subject, predicate, object of r as r.s, r.p, r.o, respectively.
Given a triple t, we denote the subject, predicate, object of t as t.s, t.p, t.o, respectively.
RDF 1.1 syntax is the above without the tripleOccurrence category.

An RDF-star simple interpretation I is a structure <IR, IP, IS, IL, IEXT, IT, IO> consisting of:

  1. A non-empty set IR of resources, called the domain or universe of I.
  2. A set IP, called the set of properties of I.
  3. A mapping IS from IRIs into IR ⋃ IP, called the interpretation of IRIs.
  4. A partial mapping IL from literals into IR, called the interpretation of literals.
  5. A mapping IEXT from IP into 2^(IR x IR), called the extension of properties.
  6. A binary relation IO over IR x (IR x IP x IR), called the occurrence of a triple term.

A is a mapping from BlankNode to IR.

Given I and A, the function [I+A](.) is defined over terms, triples, and graphs as follows.

  • [I+A](r) = IL(r) if r is a literal
  • [I+A](r) = IS(r) if r is an iri
  • [I+A](r) = A(r) if r is a BlankNode
  • [I+A](r) = [I+A](r.id) if r is a tripleOccurrence

  • [I+A](t) = TRUE if and only if <[I+A](t.s),[I+A](t.o)> ∈ IEXT([I+A](t.p)) and
    • <[I+A](ts.id), <[I+A](ts.s),[I+A](ts.p),[I+A](ts.o)>> ∈ IO
      if t.s is a tripleOccurrence ts
    • <[I+A](to.id), <[I+A](to.s),[I+A](to.p),[I+A](to.o)>> ∈ IO
      if t.o is a tripleOccurrence to

  • [I+A](g) = TRUE if and only if ∀ t ∈ g . [I+A](t) = TRUE

An interpretation I is called a model of a graph g if there exists A such that [I+A](g) = TRUE.
The set of all models of a graph g is called models(g).

Simple entailment: g ⊨ g' if and only if models(g) ⊆ models(g').
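
The truth conditions above can be exercised on a small finite interpretation. The following is a sketch under stated assumptions: terms are encoded as tagged tuples, the interpretation is a dictionary keyed by the component names IR, IP, IS, IL, IEXT and IO, and all function names are hypothetical. Literals outside the domain of IL are not handled, and entailment is not attempted, since that would require quantifying over all models.

```python
from itertools import product

# Term encoding assumed for this sketch:
#   ("iri", name), ("bnode", label), ("lit", lexical), ("occ", id_term, (s, p, o))


def denote(r, I, A):
    """[I+A](r) for a term r."""
    kind = r[0]
    if kind == "lit":
        return I["IL"][r]
    if kind == "iri":
        return I["IS"][r]
    if kind == "bnode":
        return A[r]
    if kind == "occ":
        return denote(r[1], I, A)   # an occurrence denotes what its id denotes
    raise ValueError("unknown term kind: %r" % (kind,))


def occurrence_holds(r, I, A):
    """The IO condition for a subject or object; trivially true for other terms."""
    if r[0] != "occ":
        return True
    _, ident, (s, p, o) = r
    pair = (denote(ident, I, A), (denote(s, I, A), denote(p, I, A), denote(o, I, A)))
    return pair in I["IO"]


def triple_true(t, I, A):
    s, p, o = t
    ext = I["IEXT"].get(denote(p, I, A), set())
    return ((denote(s, I, A), denote(o, I, A)) in ext
            and occurrence_holds(s, I, A)
            and occurrence_holds(o, I, A))


def graph_true(g, I, A):
    return all(triple_true(t, I, A) for t in g)


def bnodes(r):
    if r[0] == "bnode":
        return {r}
    if r[0] == "occ":
        return bnodes(r[1]).union(*(bnodes(x) for x in r[2]))
    return set()


def is_model(I, g):
    """I is a model of g if some mapping A from blank nodes to IR makes g true."""
    labels = sorted(set().union(*(bnodes(x) for t in g for x in t)))
    return any(graph_true(g, I, dict(zip(labels, values)))
               for values in product(I["IR"], repeat=len(labels)))


S, P, O, STARTS = ("iri", ":s"), ("iri", ":p"), ("iri", ":o"), ("iri", ":starts")
Y1999 = ("lit", "1999")
occ = ("occ", ("bnode", "a"), (S, P, O))

I = {
    "IR": {"s", "p", "o", "starts", "occ1", 1999},
    "IP": {"p", "starts"},
    "IS": {S: "s", P: "p", O: "o", STARTS: "starts"},
    "IL": {Y1999: 1999},
    "IEXT": {"p": set(), "starts": {("occ1", 1999)}},   # :s :p :o does not hold
    "IO": {("occ1", ("s", "p", "o"))},
}

g_occurrence = {(occ, STARTS, Y1999)}   # << _:a | :s :p :o >> :starts 1999 .
g_asserted = {(S, P, O)}                # :s :p :o .
```

With this interpretation, `is_model(I, g_occurrence)` holds (taking A to map `_:a` to `occ1`), while `is_model(I, g_asserted)` does not, because IEXT of `p` is empty. This shows that describing an occurrence does not force the underlying triple to be true.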

Concrete Syntax

Turtle

A named occurrence is written in Turtle as an RDF term:

<< occurrenceName | :s :p :o >>

This names an occurrence of the triple s p o.

The triple is not asserted, keeping "assertion" and "occurrence" as orthogonal concepts even if they might commonly be used together.

occurrenceName is an IRI or blank node, including [] (the ANON terminal rule 47 in Turtle - no triples inside the []).

The occurrence name can be repeated; each use refers to the same named occurrence term.

A named occurrence can be used as the subject of a predicateObjectList (rule [7] in RDF 1.1 Turtle):

<< _:a | :s :p :o >>
   :starts 1999 ;
   :finishes 2000 .

or using the occurrence name for one of the triples:

<< _:a | :s :p :o >> 
    :starts 1999 .
_:a :finishes 2000 .

The name can be omitted - a blank node is generated by the parser:

<< :s :p :o >> :q 123 .

N-Triples

In N-Triples, reflecting the RDF abstract data model, there is a new syntax form for a named occurrence term:

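<< _:a | <http://example/s> <http://example/p> <http://example/o> >> <http://example/q> "123"^^<http://www.w3.org/2001/XMLSchema#integer> .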

In N-Triples, the name is required. There are no shorthand forms.

RDF Graph Merge

Graph merge happens as before - blank nodes need to be kept apart.
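
One way to keep blank nodes apart is to rename them per source graph before taking the union. The sketch below is an assumption about how this might be done, not a prescribed algorithm: IRIs and blank nodes are strings (blank nodes start with `_:`), a triple occurrence is a tagged tuple `("occ", name, (s, p, o))`, and the `_g<i>` suffix scheme and function names are hypothetical. Blank node occurrence names are renamed along with any other blank node.

```python
def rename(term, suffix):
    """Rename blank nodes in a term, including inside triple occurrences."""
    if isinstance(term, tuple):                      # ("occ", name, (s, p, o))
        _tag, name, (s, p, o) = term
        return ("occ", rename(name, suffix),
                (rename(s, suffix), p, rename(o, suffix)))
    if isinstance(term, str) and term.startswith("_:"):
        return term + suffix
    return term                                      # IRIs and literals unchanged


def merge(*graphs):
    """Union of the graphs after giving each graph its own blank node labels."""
    merged = set()
    for i, g in enumerate(graphs):
        suffix = "_g%d" % i
        for s, p, o in g:
            merged.add((rename(s, suffix), p, rename(o, suffix)))
    return merged


spo = (":s", ":p", ":o")
g1 = {(("occ", "_:a", spo), ":q", 123)}
g2 = {(("occ", "_:a", spo), ":q", 456)}

merged = merge(g1, g2)
names = {s[1] for s, _p, _o in merged}
```

In this example both inputs use `_:a` as an occurrence name; after the merge there are two triples whose occurrence names differ, so the two occurrences are not accidentally identified.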

Annotation Syntax

Input material from email:

Annotation syntax is Turtle/TriG syntax that both asserts a triple, and uses an occurrence of that triple.

:liz :spouse :dick {| id:1 | :start 1964; :end 1974 |} .
:liz :spouse :dick {| id:2 | :start 1975; :end 1976 |} .

This generates 6 triples, 5 of which are unique: the asserted triple :liz :spouse :dick is generated twice, but the RDF graph does not contain a duplicate asserted triple.

:liz :spouse :dick .
 << id:1 | :liz :spouse :dick >>
      :start 1964;
      :end 1974 .
 << id:2 | :liz :spouse :dick >>
      :start 1975;
      :end 1976 .
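
The triple count for the annotation example can be reproduced with a short sketch. It assumes terms are plain strings or integers, a triple occurrence is a tagged tuple `("occ", name, triple)`, and that each annotated statement expands to the asserted triple plus one triple per annotation. The `expand` function and its input shape are hypothetical.

```python
def expand(annotated):
    """Expand (triple, name, annotations) entries into a list of generated triples."""
    generated = []
    for triple, name, annotations in annotated:
        generated.append(triple)              # the annotated triple is asserted
        occurrence = ("occ", name, triple)    # << name | s p o >>
        for predicate, obj in annotations:
            generated.append((occurrence, predicate, obj))
    return generated


statements = [
    ((":liz", ":spouse", ":dick"), "id:1", [(":start", 1964), (":end", 1974)]),
    ((":liz", ":spouse", ":dick"), "id:2", [(":start", 1975), (":end", 1976)]),
]

generated = expand(statements)   # list: keeps the repeated asserted triple
graph = set(generated)           # an RDF graph is a set: duplicates collapse
```

Here `generated` has 6 entries and `graph` has 5: the asserted triple appears once, alongside two annotation triples for each of the two named occurrences.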

SPARQL

Assumption: The RDF 1.2 features map into SPARQL.

Test: SPARQL syntax "Turtle+variables"

Named occurrences

Syntax

New type of patterns? New functions? Accessors and creators

Consequences on property paths

Evaluation

?? Simple entailment consequences??

Items to discuss

Unordered

  • Graphs, graph terms, named graphs, graph occurrences.
  • Terminology "Edges"? "occurrences"? "usage"?
  • Do occurrences "infer"?
  • Name sharing (one name, two occurrences) Exactly one? Relationship to merge?
  • type, instance, occurrence