On ordering - mlibrary/hydra-prototype GitHub Wiki
As a pure graph oriented formalism, RDF offers no predefined possibility (besides the disputed Bag mechanism) to express order [1]. The article that opens with this statement goes on to walk through eight approaches to representing an ordered list of relationships. Part of what makes this interesting is that you can write an ordered sequence in RDF/XML, but the triples it parses into look nothing like what you wrote. For example:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Seq rdf:about="http://example.org/favourite-fruit">
    <rdf:li rdf:resource="http://example.org/banana"/>
    <rdf:li rdf:resource="http://example.org/apple"/>
    <rdf:li rdf:resource="http://example.org/pear"/>
  </rdf:Seq>
</rdf:RDF>
becomes
<http://example.org/favourite-fruit> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq> .
<http://example.org/favourite-fruit> <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> <http://example.org/banana> .
<http://example.org/favourite-fruit> <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2> <http://example.org/apple> .
<http://example.org/favourite-fruit> <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3> <http://example.org/pear> .
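To make the expansion above concrete, here is a minimal plain-Ruby sketch of what a parser effectively does with rdf:li: each member is numbered in document order and emitted as an rdf:_N triple. The method name seq_to_ntriples is hypothetical and this is not how any particular RDF library implements it; it only mirrors the output shown above.

```ruby
RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'

# Expand an ordered list of member IRIs into N-Triples lines, numbering
# the members rdf:_1, rdf:_2, ... in the order they were given.
# (seq_to_ntriples is an illustrative name, not a library call.)
def seq_to_ntriples(subject, members)
  lines = ["<#{subject}> <#{RDF_NS}type> <#{RDF_NS}Seq> ."]
  members.each_with_index do |member, index|
    lines << "<#{subject}> <#{RDF_NS}_#{index + 1}> <#{member}> ."
  end
  lines
end

fruit = %w[banana apple pear].map { |name| "http://example.org/#{name}" }
puts seq_to_ntriples('http://example.org/favourite-fruit', fruit)
```

The order survives only in the predicate names, which is why the triples themselves can be stored or returned in any order.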
Historically, an additional challenge is that it's complicated to construct SPARQL queries that cope with data that may or may not be a sequence of items, unlike, say, XPath, where queries can handily work across multiple axes. Does this issue matter for our use of Fedora?
Anyway, in Hydra, the practice is to flatten the relationship between objects (e.g. an object can have multiple member objects), and then use link relations to set up an alternative path: what is the first child member? Then, from that, what is the next, and so on. Hydra employs proxy objects here, to avoid tightly coupling an object and its members. This works great for book/pages, or item/works, etc., but might be overly complicated for an item that has multiple authors.
Let's say you load the following set of triples into Fedora as the resource /objects/mbook1, specifying multiple authors for dc:creator using RDF Collections:
<> a <http://pcdm.org/models#Object>,
    <http://projecthydra.org/works/models#GenericWork>;
  <http://purl.org/dc/terms/title> "The Murders in the Rue Morgue";
  <http://purl.org/dc/terms/creator> ( "Poe, Edgar Allan" "Bradbury, Ray" );
  <info:fedora/fedora-system:def/model#hasModel> "BibliographicWork" .
Fedora parses that into the object graph:
- </objects/mbook1>
  - dc:title: "The Murders in the Rue Morgue"
  - dc:creator: <.well-known/genid/6b8f40d1-5838-4015-9313-2a5baed4f939>
- <.well-known/genid/6b8f40d1-5838-4015-9313-2a5baed4f939>
  - rdf:first: "Poe, Edgar Allan"
  - rdf:rest: </.well-known/genid/4ff4b71d-d175-44a6-8132-c88ff69e88bd>
- </.well-known/genid/4ff4b71d-d175-44a6-8132-c88ff69e88bd>
  - rdf:first: "Bradbury, Ray"
  - rdf:rest: rdf:nil
When </objects/mbook1> is fetched (via REST), the response includes the anonymous properties that make up the "collection" of authors.
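Recovering the author order from such a response means walking the rdf:first / rdf:rest chain until rdf:nil. The following is a minimal sketch in plain Ruby, assuming the response has already been parsed into a hash of node => { predicate => value }. The hash layout, the shortened genid:a / genid:b node names and the method name collection_values are all hypothetical and stand in for whatever RDF library you actually use.

```ruby
# Hypothetical in-memory stand-in for the parsed graph:
# node identifier => { predicate => value }.
GRAPH = {
  '/objects/mbook1' => {
    'dc:title'   => 'The Murders in the Rue Morgue',
    'dc:creator' => 'genid:a'
  },
  'genid:a' => { 'rdf:first' => 'Poe, Edgar Allan', 'rdf:rest' => 'genid:b' },
  'genid:b' => { 'rdf:first' => 'Bradbury, Ray',    'rdf:rest' => 'rdf:nil' }
}.freeze

# Follow rdf:rest links from the head node, collecting each rdf:first value.
# Raises if the chain loops or points at a node that is not in the graph.
def collection_values(graph, head)
  values = []
  seen = {}
  node = head
  until node == 'rdf:nil'
    raise ArgumentError, "broken list at #{node.inspect}" if seen[node] || !graph.key?(node)

    seen[node] = true
    values << graph[node]['rdf:first']
    node = graph[node]['rdf:rest']
  end
  values
end

authors = collection_values(GRAPH, GRAPH['/objects/mbook1']['dc:creator'])
p authors
```

Each element costs one hop through the graph, which is part of why collections are awkward to query.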
If you use rdf:Seq, you load
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
<> a <http://pcdm.org/models#Object>,
    <http://projecthydra.org/works/models#GenericWork>;
  <http://purl.org/dc/terms/title> "The Pit and the Pendulum";
  <http://purl.org/dc/terms/creator> [
    a rdf:Seq;
    rdf:_1 "Poe, Edgar Allan";
    rdf:_2 "Bradbury, Ray"
  ];
  <info:fedora/fedora-system:def/model#hasModel> "BibliographicWork" .
which creates:
- </objects/mbook2>
  - dc:title: "The Pit and the Pendulum"
  - dc:creator: <.well-known/genid/0e8da192-9c00-4b02-9886-db02a0a96a29>
- <.well-known/genid/0e8da192-9c00-4b02-9886-db02a0a96a29>
  - rdf:_1: "Poe, Edgar Allan"
  - rdf:_2: "Bradbury, Ray"
Note that you have to use the rdf:_N syntax; loading data marked up as rdf:li results in an unordered list.
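On the reading side, the rdf:_N properties come back in no particular order, so the client has to sort them by the number in the predicate. Here is a minimal plain-Ruby sketch, assuming the container node's properties are available as a predicate => value hash; the hash layout and the method name ordered_seq_values are hypothetical.

```ruby
# Return the values of the rdf:_N membership properties in numeric order.
# The sort must be numeric: a plain string sort would put rdf:_10 before rdf:_2.
# (ordered_seq_values is an illustrative name, not a library call.)
def ordered_seq_values(properties)
  properties
    .select { |predicate, _| predicate.match?(/\Ardf:_[1-9]\d*\z/) }
    .sort_by { |predicate, _| predicate.delete_prefix('rdf:_').to_i }
    .map { |_, value| value }
end

# Hypothetical properties of the rdf:Seq node, deliberately out of order.
creator_node = {
  'rdf:_2'   => 'Bradbury, Ray',
  'rdf:type' => 'rdf:Seq',
  'rdf:_1'   => 'Poe, Edgar Allan'
}

p ordered_seq_values(creator_node)
```

Other properties on the node, such as rdf:type, are ignored by the filter.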
If you decide to do this in your model:
class BibliographicWork < ActiveFedora::Base
  include Hydra::Works::GenericWorkBehavior
  property :author, predicate: ::RDF::DC.creator, multiple: true
end
and then in your code:
work.author = [ 'Poe, Edgar Allan', 'Bradbury, Ray' ]
What's actually persisted is a set of multiple values for dc:creator, with no guaranteed order (in my test, the authors ended up sorted as Bradbury, Poe).
Indirect container properties (see: aggregates :members) also provide an ordered property (e.g. ordered_members), which creates order proxies in the http://www.iana.org/assignments/link-relations/ namespace.
The ordered list of members is not automatically populated by using obj.members << other_obj. The ordered property has to be explicitly used: obj.ordered_members << other_obj.
The basic container (e.g. Collection) then gets an iana:first and iana:last pointing to the proxy objects in the indirect container (e.g. members). The proxy object gets iana:next and iana:prev as appropriate.
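The iana:first, iana:last, iana:next and iana:prev links described above amount to a doubly linked list of proxies. The following plain-Ruby sketch models only that structure. Proxy, OrderedList and their attribute names are hypothetical stand-ins, not the Hydra or ActiveFedora classes, and nothing here talks to Fedora.

```ruby
# Hypothetical stand-in for an order proxy: it points at the real member
# (target) and at its neighbouring proxies (iana:prev / iana:next).
Proxy = Struct.new(:target, :prev_proxy, :next_proxy)

# Hypothetical stand-in for the container's ordered view of its members.
class OrderedList
  attr_reader :first, :last # play the role of iana:first / iana:last

  def initialize
    @first = nil
    @last = nil
  end

  # Append a member by creating a proxy and linking it after the current last.
  def <<(target)
    proxy = Proxy.new(target, @last, nil)
    if @last
      @last.next_proxy = proxy
    else
      @first = proxy
    end
    @last = proxy
    self
  end

  # Walk from the first proxy along the next links, collecting the targets.
  def to_a
    targets = []
    proxy = @first
    while proxy
      targets << proxy.target
      proxy = proxy.next_proxy
    end
    targets
  end
end

pages = OrderedList.new
pages << 'page1' << 'page2' << 'page3'
p pages.to_a
```

Because the members themselves are never modified, the same member can sit in several ordered lists, which is the decoupling the proxies are there to provide.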
Of note is that this doesn't seem to discriminate between which container is used: in my Bag model, I have both the members indirect container supplied by Hydra::Works::CollectionBehavior, and a works container that uses an app-defined relationship (http://lib.umich.edu/models#hasWork). In one bag (bag), I set ordered_works. Now, while bag.members returns a different list from bag.works, bag.ordered_members and bag.ordered_works return the same list (the items in bag.works).
This implies that a pcdm:Object can only support one ordered relationship. (RDF has always been lousy at ordered sequences, so I don't know how much this matters.)