PKP PLN SWORD API implementation - mjordan/pkppln GitHub Wiki
Starting with version 2.4.5, scheduled for release in Summer 2014, Open Journal Systems (OJS) will enable the automated preservation of journal content in the PKP Private LOCKSS Network (PLN).
As illustrated in the overview of the PKP PLN, OJS creates a SWORD deposit against a SWORD server on the PKP PLN staging server each time an issue is published. The staging server applies several processes to the journal issue, such as checking its files for viruses and validating the OJS Export XML, before issuing a separate SWORD deposit for the issue against LOCKSS-O-Matic's SWORD server. LOCKSS-O-Matic then instructs the LOCKSS boxes in the PLN to harvest the issue content from the staging server. This document describes the SWORD implementation that allows the OJS PKP PLN Plugin to talk to the PLN's staging server. The more general (i.e., not specific to the PKP PLN) SWORD implementation that LOCKSS-O-Matic uses is available elsewhere.
The examples below illustrate the transactions between the SWORD client and server using curl to document HTTP headers and responses. The OJS instance is at jfs.example.org and the staging server is at stagingserver.example.com.
The PKP PLN SWORD workflow is much like that of any application that uses the protocol: the client asks the server for a Service Document, the client creates a deposit, the server returns a deposit receipt and does something with the deposited content, and the client periodically requests a SWORD Statement from the server.
The value of the 'On-Behalf-Of' request header is the OJS journal's UUID, which is generated by the PKP PLN plugin, and the value of the 'Journal-URL' header is the journal's base URL:
curl -v -H 'On-Behalf-Of: a120bcd6-3204-4c65-b454-6effd76a2bed' -H 'Journal-URL: http://jfs.example.org/index.php/jfs' http://stagingserver.example.com/api/sword/2.0/sd-iri
Response: HTTP/1.0 200 OK
<service xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:sword="http://purl.org/net/sword/terms/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:pkp="http://pkp.sfu.ca/SWORD"
xmlns="http://www.w3.org/2007/app">
<sword:version>2.0</sword:version>
<sword:maxUploadSize>1000</sword:maxUploadSize>
<pkp:uploadChecksumType>SHA-1</pkp:uploadChecksumType>
<pkp:pln_accepting>Yes</pkp:pln_accepting>
<pkp:terms_of_use>
<pkp:jm_has_authority updated="2014-08-27 10:34:00">I have the legal and contractual authority to include this journal's content in a secure preservation network and, if and when necessary, to make the content available in the PKP PLN</pkp:jm_has_authority>
<pkp:article_licenses updated="2014-08-27 10:34:00">I confirm that licensing information pertaining to articles in this journal is accurate at the time of publication.</pkp:article_licenses>
<pkp:can_use_info updated="2014-08-27 10:34:00">I agree to allow the PKP-PLN to include this journal's title and ISSN, and the email address of the Primary Contact, with the preserved journal content.</pkp:can_use_info>
<pkp:no_violations updated="2014-08-27 10:34:00">I agree not to violate any laws and regulations that may be applicable to this network and the content.</pkp:no_violations>
<pkp:not_to_preserve updated="2014-08-27 10:34:00">I agree that the PKP-PLN reserves the right, for whatever reason, not to preserve or make content available.</pkp:not_to_preserve>
<pkp:sole_risk updated="2014-07-22 14:52:30">I agree that the use of the PKP PLN is at my sole risk and that I will not hold the PKP PLN responsible.</pkp:sole_risk>
</pkp:terms_of_use>
<workspace>
<atom:title>PKP PLN deposit for a120bcd6-3204-4c65-b454-6effd76a2bed</atom:title>
<collection href="http://stagingserver.example.com/api/sword/2.0/col-iri/a120bcd6-3204-4c65-b454-6effd76a2bed">
<accept>application/atom+xml;type=entry</accept>
<sword:mediation>true</sword:mediation>
</collection>
</workspace>
</service>
The Service Document contains a number of custom XML elements (in this example, using the 'pkp' namespace), including some terms of use. These terms are displayed to the Journal Manager in the OJS PKP PLN Plugin user interface. To have her journal included in the PLN, the Journal Manager checks a box next to each term. The terms of use are retrieved in the Service Document so that they can be modified centrally by the PKP PLN administrators if necessary. The terms are exported with the issue content and preserved in the same Bag as the content.
The 'pln_accepting' element allows the OJS client to react appropriately if the staging server is not accepting deposits temporarily (for example, if one or more of its components are undergoing maintenance), or if a specific journal is disallowed from creating deposits temporarily (for example, if there is a problem with the journal's content).
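As a rough illustration of how a client might read these values, the following Python sketch parses an abbreviated copy of the Service Document shown above. The function name `parse_service_document` and the returned dictionary keys are hypothetical, and treating only the value 'Yes' as "accepting" is an assumption based on the single example in this document; the actual OJS plugin may handle this differently.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the example Service Document in this document.
NS = {
    "app": "http://www.w3.org/2007/app",
    "sword": "http://purl.org/net/sword/terms/",
    "pkp": "http://pkp.sfu.ca/SWORD",
}

# Abbreviated copy of the example response (only one term of use is kept).
SAMPLE_SERVICE_DOCUMENT = """\
<service xmlns:sword="http://purl.org/net/sword/terms/"
         xmlns:atom="http://www.w3.org/2005/Atom"
         xmlns:pkp="http://pkp.sfu.ca/SWORD"
         xmlns="http://www.w3.org/2007/app">
  <sword:version>2.0</sword:version>
  <sword:maxUploadSize>1000</sword:maxUploadSize>
  <pkp:uploadChecksumType>SHA-1</pkp:uploadChecksumType>
  <pkp:pln_accepting>Yes</pkp:pln_accepting>
  <pkp:terms_of_use>
    <pkp:sole_risk updated="2014-07-22 14:52:30">I agree that the use of the PKP PLN is at my sole risk and that I will not hold the PKP PLN responsible.</pkp:sole_risk>
  </pkp:terms_of_use>
  <workspace>
    <atom:title>PKP PLN deposit for a120bcd6-3204-4c65-b454-6effd76a2bed</atom:title>
    <collection href="http://stagingserver.example.com/api/sword/2.0/col-iri/a120bcd6-3204-4c65-b454-6effd76a2bed">
      <accept>application/atom+xml;type=entry</accept>
      <sword:mediation>true</sword:mediation>
    </collection>
  </workspace>
</service>
"""


def parse_service_document(xml_text):
    """Return the fields a deposit client is likely to need (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    # Collect each term of use with its 'updated' timestamp and text.
    terms = {}
    for term in root.findall("pkp:terms_of_use/*", NS):
        name = term.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
        terms[name] = {
            "updated": term.get("updated"),
            "text": (term.text or "").strip(),
        }

    accepting = root.findtext("pkp:pln_accepting", default="", namespaces=NS)
    collection = root.find("app:workspace/app:collection", NS)
    return {
        # Assumption: any value other than "Yes" means deposits are refused.
        "accepting": accepting.strip().lower() == "yes",
        "checksum_type": root.findtext(
            "pkp:uploadChecksumType", default="", namespaces=NS
        ).strip(),
        "collection_iri": collection.get("href") if collection is not None else None,
        "terms_of_use": terms,
    }


info = parse_service_document(SAMPLE_SERVICE_DOCUMENT)
print(info["accepting"], info["collection_iri"])
```

A client could check `info["accepting"]` before attempting a deposit and compare the `updated` timestamps against the last terms the Journal Manager agreed to.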
A deposit is created whenever a journal publishes an issue. The SWORD implementation used here is very simple: a 'deposit' is just a list of the .zip files (usually one) that contain the exported OJS issue. Each file is listed in the Atom document in a <pkp:content> element, with the file's size (in bytes) and its SHA-1 checksum as attributes. The issue's volume and issue number, and its publication date, are also included as attributes. The OJS plugin provides a UUID for the deposit in the Atom <id> element.
curl -v --data-binary @atom_create.xml --request POST http://stagingserver.example.com/api/sword/2.0/col-iri/a120bcd6-3204-4c65-b454-6effd76a2bed
Atom document used to create the deposit:
<entry xmlns="http://www.w3.org/2005/Atom"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:pkp="http://pkp.sfu.ca/SWORD">
<email>[email protected]</email>
<title>Journal of Foo Studies</title>
<pkp:issn>1234-123x</pkp:issn>
<pkp:journal_url>http://jfs.example.org/index.php/jfs</pkp:journal_url>
<!-- The <id> element contains the deposit's UUID, which is generated by the PKP PLN OJS plugin. -->
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
<updated>2013-10-07T17:17:08Z</updated>
<pkp:content size="102400" checksumType="sha1" volume="4" issue="3" pubdate="2011-04-25" checksumValue="bd4a9b642562547754086de2dab26b7d">http://jfs.example.org/download/1225c695-cfb8-4ebb-aaaa-80da344efa6a.zip</pkp:content>
</entry>
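The following Python sketch shows one way a client could assemble such an Atom entry, computing the size and SHA-1 checksum from the bytes of the exported .zip file. The function `build_deposit_entry` and its parameters are hypothetical, the `<email>` element is left out for brevity, and the output uses an `atom:` prefix rather than a default namespace, which is equivalent XML. It is not the code the OJS plugin uses.

```python
import hashlib
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
PKP_NS = "http://pkp.sfu.ca/SWORD"

# Ask ElementTree to serialize with readable prefixes instead of ns0/ns1.
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("pkp", PKP_NS)


def build_deposit_entry(deposit_uuid, journal_title, issn, journal_url,
                        zip_url, zip_bytes, volume, issue, pubdate, updated):
    """Build the Atom entry for one deposit (hypothetical helper)."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = journal_title
    ET.SubElement(entry, f"{{{PKP_NS}}}issn").text = issn
    ET.SubElement(entry, f"{{{PKP_NS}}}journal_url").text = journal_url
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = f"urn:uuid:{deposit_uuid}"
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated

    content = ET.SubElement(entry, f"{{{PKP_NS}}}content", {
        "size": str(len(zip_bytes)),          # size in bytes
        "checksumType": "sha1",
        "checksumValue": hashlib.sha1(zip_bytes).hexdigest(),
        "volume": str(volume),
        "issue": str(issue),
        "pubdate": pubdate,
    })
    content.text = zip_url                    # where the server can fetch the file
    return ET.tostring(entry, encoding="unicode")


# Placeholder bytes stand in for a real exported issue archive.
entry_xml = build_deposit_entry(
    deposit_uuid="1225c695-cfb8-4ebb-aaaa-80da344efa6a",
    journal_title="Journal of Foo Studies",
    issn="1234-123x",
    journal_url="http://jfs.example.org/index.php/jfs",
    zip_url="http://jfs.example.org/download/1225c695-cfb8-4ebb-aaaa-80da344efa6a.zip",
    zip_bytes=b"example",
    volume=4,
    issue=3,
    pubdate="2011-04-25",
    updated="2013-10-07T17:17:08Z",
)
print(entry_xml)
```

Note that a SHA-1 digest rendered in hexadecimal is always 40 characters long, which is a quick sanity check on the `checksumValue` attribute.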
Response: HTTP/1.0 201 Created
The SWORD server returns the following Deposit Receipt:
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:sword="http://purl.org/net/sword/">
<sword:treatment>Issues for preservation in the PKP PLN from journal Journal of Foo Studies (a120bcd6-3204-4c65-b454-6effd76a2bed).</sword:treatment>
<content src="http://stagingserver.example.com/api/sword/2.0/cont-iri/a120bcd6-3204-4c65-b454-6effd76a2bed/1225c695-cfb8-4ebb-aaaa-80da344efa6a" />
<link rel="edit-media" href="http://stagingserver.example.com/api/sword/2.0/col-iri/a120bcd6-3204-4c65-b454-6effd76a2bed" />
<link rel="edit-media" href="http://stagingserver.example.com/api/sword/2.0/cont-iri/a120bcd6-3204-4c65-b454-6effd76a2bed/1225c695-cfb8-4ebb-aaaa-80da344efa6a" />
<link rel="http://purl.org/net/sword/terms/add" href="http://stagingserver.example.com/api/sword/2.0/cont-iri/a120bcd6-3204-4c65-b454-6effd76a2bed/1225c695-cfb8-4ebb-aaaa-80da344efa6a/edit" />
<link rel="edit" href="http://stagingserver.example.com/api/sword/2.0/cont-iri/a120bcd6-3204-4c65-b454-6effd76a2bed/1225c695-cfb8-4ebb-aaaa-80da344efa6a/edit" />
<link rel="http://purl.org/net/sword/terms/statement" type="application/atom+xml;type=feed" href="http://stagingserver.example.com/api/sword/2.0/cont-iri/a120bcd6-3204-4c65-b454-6effd76a2bed/1225c695-cfb8-4ebb-aaaa-80da344efa6a/state" />
</entry>
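A client will typically want to remember the IRIs in the Deposit Receipt for later requests. The sketch below, in Python, groups the `<link>` elements of an abbreviated copy of the receipt by their `rel` value; the helper name `receipt_links` is hypothetical and the approach is only one possible way to read the receipt.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
STATEMENT_REL = "http://purl.org/net/sword/terms/statement"

# Abbreviated copy of the example Deposit Receipt (treatment and content omitted).
BASE = ("http://stagingserver.example.com/api/sword/2.0/cont-iri/"
        "a120bcd6-3204-4c65-b454-6effd76a2bed/1225c695-cfb8-4ebb-aaaa-80da344efa6a")
SAMPLE_RECEIPT = f"""\
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:sword="http://purl.org/net/sword/">
  <link rel="edit-media" href="http://stagingserver.example.com/api/sword/2.0/col-iri/a120bcd6-3204-4c65-b454-6effd76a2bed" />
  <link rel="edit-media" href="{BASE}" />
  <link rel="http://purl.org/net/sword/terms/add" href="{BASE}/edit" />
  <link rel="edit" href="{BASE}/edit" />
  <link rel="{STATEMENT_REL}" type="application/atom+xml;type=feed" href="{BASE}/state" />
</entry>
"""


def receipt_links(receipt_xml):
    """Map each link 'rel' value to the list of hrefs that carry it."""
    root = ET.fromstring(receipt_xml)
    links = {}
    for link in root.findall(f"{ATOM}link"):
        # A rel value can occur more than once (see 'edit-media' above).
        links.setdefault(link.get("rel"), []).append(link.get("href"))
    return links


links = receipt_links(SAMPLE_RECEIPT)
statement_iri = links[STATEMENT_REL][0]   # used later to poll the deposit's state
edit_iri = links["edit"][0]               # used later to update the deposit
print(statement_iri)
print(edit_iri)
```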
Soon after the deposit is created, the staging server harvests the Zip file identified in the deposit, and applies several processes to the issue content to prepare it for harvesting into the Private LOCKSS Network. On successful completion of these processes, the staging server issues a SWORD deposit request against LOCKSS-O-Matic.
The OJS PKP PLN plugin can report the status of an issue to the journal manager by retrieving the SWORD Statement from the staging server.
The request for the SWORD Statement includes the journal UUID and the deposit UUID as path segments of the URL.
The possible state term values are:
failed: The deposit to the PKP PLN staging server (or LOCKSS-O-Matic) has failed.
in_progress: The deposit to the staging server has succeeded but the deposit has not yet been registered with the PLN.
disagreement: The PKP LOCKSS network is not in agreement on content checksums.
agreement: The PKP LOCKSS network agrees internally on content checksums.
These state values are converted into messages displayed to the OJS journal manager. To retrieve the state values from the staging server, the PKP PLN plugin issues the following request to the staging server:
curl -v http://stagingserver.example.com/api/sword/2.0/cont-iri/a120bcd6-3204-4c65-b454-6effd76a2bed/1225c695-cfb8-4ebb-aaaa-80da344efa6a/state
Response: HTTP/1.0 200 OK
<atom:feed xmlns:sword="http://purl.org/net/sword/terms/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:pkp="http://pkp.sfu.ca/SWORD">
<atom:category scheme="http://purl.org/net/sword/terms/state" term="agreement" label="State">The PKP LOCKSS network agrees internally on content checksums</atom:category>
<atom:entry>
<atom:category scheme="http://purl.org/net/sword/terms/" term="http://purl.org/net/sword/terms/originalDeposit" label="Original Deposit"/>
<atom:content type="application/zip" src="http://jfs.example.org/download/1225c695-cfb8-4ebb-aaaa-80da344efa6a.zip"/>
</atom:entry>
</atom:feed>
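To show how the state term could be pulled out of a Statement like the one above and turned into a message, here is a minimal Python sketch. The function name `deposit_state` and the `MESSAGES` table are hypothetical; the message strings simply reuse the descriptions of the four state terms listed earlier, and the wording actually shown to journal managers by the plugin may differ.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
STATE_SCHEME = "http://purl.org/net/sword/terms/state"

# Descriptions taken from the list of state terms in this document.
MESSAGES = {
    "failed": "The deposit to the PKP PLN staging server (or LOCKSS-O-Matic) has failed.",
    "in_progress": "The deposit to the staging server has succeeded but the deposit has not yet been registered with the PLN.",
    "disagreement": "The PKP LOCKSS network is not in agreement on content checksums.",
    "agreement": "The PKP LOCKSS network agrees internally on content checksums.",
}

# Abbreviated copy of the example Statement.
SAMPLE_STATEMENT = """\
<atom:feed xmlns:sword="http://purl.org/net/sword/terms/"
           xmlns:atom="http://www.w3.org/2005/Atom"
           xmlns:pkp="http://pkp.sfu.ca/SWORD">
  <atom:category scheme="http://purl.org/net/sword/terms/state" term="agreement" label="State">The PKP LOCKSS network agrees internally on content checksums</atom:category>
  <atom:entry>
    <atom:category scheme="http://purl.org/net/sword/terms/" term="http://purl.org/net/sword/terms/originalDeposit"/>
    <atom:content type="application/zip" src="http://jfs.example.org/download/1225c695-cfb8-4ebb-aaaa-80da344efa6a.zip"/>
  </atom:entry>
</atom:feed>
"""


def deposit_state(statement_xml):
    """Return the state term from the feed-level category, or None if absent."""
    root = ET.fromstring(statement_xml)
    # Only direct children of the feed are checked, so the per-entry
    # 'originalDeposit' category is not mistaken for the state.
    for category in root.findall(f"{ATOM}category"):
        if category.get("scheme") == STATE_SCHEME:
            return category.get("term")
    return None


state = deposit_state(SAMPLE_STATEMENT)
message = MESSAGES.get(state, "Unknown deposit state: %s" % state)
print(state, "->", message)
```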
The OJS plugin can issue an 'update' to a deposit if the journal issue is modified after it is originally deposited. In this case, the Atom document is the same as that used to create the deposit. Like the request for the SWORD Statement, updates to the deposit include the journal UUID and the deposit UUID as path segments of the request URL.
curl -v -H "Content-Type: application/xml" -X PUT --data-binary @atom_modify.xml http://stagingserver.example.com/api/sword/2.0/cont-iri/a120bcd6-3204-4c65-b454-6effd76a2bed/1225c695-cfb8-4ebb-aaaa-80da344efa6a/edit
Atom document used to update the metadata:
<entry xmlns="http://www.w3.org/2005/Atom"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:pkp="http://pkp.sfu.ca/SWORD">
<email>[email protected]</email>
<title>Journal of Foo Studies</title>
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
<updated>2013-12-02T15:25:46Z</updated>
<pkp:content size="102400" checksumType="sha1" checksumValue="d47a9b642562547754086de2dab26c6e">http://jfs.example.org/download/1225c695-cfb8-4ebb-aaaa-80da344efa6a.zip</pkp:content>
</entry>
Response: HTTP/1.0 200 OK
If the staging server receives an update request from a journal, it applies the same processes to the harvested payload file as it would for a new deposit.
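For readers who prefer code to curl, the update request can be sketched in Python with the standard library as follows. The function `build_update_request` is hypothetical, and the request object is only constructed here, not sent; sending it with `urllib.request.urlopen` would require a reachable staging server.

```python
import urllib.request

# Base IRI of the staging server's SWORD API, as used in the examples above.
SWORD_BASE = "http://stagingserver.example.com/api/sword/2.0"


def build_update_request(journal_uuid, deposit_uuid, atom_xml):
    """Build (but do not send) the PUT request that updates a deposit."""
    url = f"{SWORD_BASE}/cont-iri/{journal_uuid}/{deposit_uuid}/edit"
    return urllib.request.Request(
        url,
        data=atom_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="PUT",
    )


request = build_update_request(
    "a120bcd6-3204-4c65-b454-6effd76a2bed",
    "1225c695-cfb8-4ebb-aaaa-80da344efa6a",
    "<entry xmlns='http://www.w3.org/2005/Atom'/>",  # placeholder body
)
print(request.get_method(), request.full_url)
```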