Table Schemas - UCLALibrary/resourcesync-oai-pmh GitHub Wiki

Content provider (source)

To use the multi sub-command, add one row per collection to the template source_collections.csv. For a description of what goes in each field, see the "positional arguments" and "optional arguments" sections in the output of the following command:

python3 source.py single --help

Also, note the following:

  • resource-dir, metadata-dir, and no-set-param are optional (you may leave these fields blank in order to use the defaults)
  • the no-set-param field accepts any value (a blank field is interpreted as false, and a non-blank field as true)
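
The blank/non-blank rule for no-set-param can be sketched as follows. This is only an illustration, not the project's own parsing code: the column collection-name, the sample values, and the presence of a header row are assumptions made for the example. Only resource-dir, metadata-dir, and no-set-param are taken from the notes above.

```python
import csv
import io

# Hypothetical CSV contents. "collection-name" is an illustrative column and
# may not match the real template; the other three names come from the notes.
SAMPLE = """collection-name,resource-dir,metadata-dir,no-set-param
coll_a,,,
coll_b,/tmp/resources,,yes
"""


def no_set_param_enabled(field):
    """Interpret the field as described: blank is false, anything else is true."""
    return field != ""


rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Map each (hypothetical) collection name to its interpreted flag.
flags = {row["collection-name"]: no_set_param_enabled(row["no-set-param"]) for row in rows}
print(flags)
```

Note that under this rule a value such as "no" or "false" still counts as true, because only an empty field is treated as false.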

Content aggregator (destination)

To use the destination-side wrapper script, create a local TinyDB instance whose rows follow this schema:

| field name | type | description |
|---|---|---|
| collection_key | string | unique identifier for the set of resources (collection) |
| collection_name | string | human-readable name for the set of resources (collection) |
| institution_key | string | unique identifier for the institution |
| institution_name | string | human-readable name for the institution |
| resourcelist_uri | string | URL of the collection's ResourceList (or ResourceListIndex) |
| changelist_uri | string | URL of the collection's ChangeList (or ChangeListIndex) |
| url_map_from | string | the part of a resource's URL to cut off in order to obtain the resource's name, which will become its local filename |
| file_path_map_to | string | base directory on the local filesystem for synchronized files, relative to the user's home directory; it will contain one directory per institution, each containing one directory per collection |
| new | boolean | whether or not a baseline synchronization has yet been performed on the collection |
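
As a rough illustration of the schema above, the sketch below builds one example row, checks it against the field names and types in the table, and shows one way url_map_from and file_path_map_to might combine into a local file path. All values are made up. The helper names check_row and local_path are hypothetical. Treating url_map_from as a URL prefix, and naming the institution and collection directories after their keys, are assumptions rather than documented behavior.

```python
from pathlib import Path

# Field names and types are taken from the schema table.
SCHEMA = {
    "collection_key": str,
    "collection_name": str,
    "institution_key": str,
    "institution_name": str,
    "resourcelist_uri": str,
    "changelist_uri": str,
    "url_map_from": str,
    "file_path_map_to": str,
    "new": bool,
}

# Example row; every value here is invented for illustration.
EXAMPLE_ROW = {
    "collection_key": "example_coll",
    "collection_name": "Example Collection",
    "institution_key": "example_inst",
    "institution_name": "Example Institution",
    "resourcelist_uri": "https://example.org/resourcesync/example_coll/resourcelist.xml",
    "changelist_uri": "https://example.org/resourcesync/example_coll/changelist.xml",
    "url_map_from": "https://example.org/oai/",
    "file_path_map_to": "resourcesync",
    "new": True,
}


def check_row(row):
    """Return a list of problems: missing fields or values of the wrong type."""
    problems = []
    for field, expected in SCHEMA.items():
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not isinstance(row[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems


def local_path(row, resource_url, home=None):
    """Map a resource URL to a local path (assumed layout, see lead-in)."""
    home = Path.home() if home is None else Path(home)
    prefix = row["url_map_from"]
    if not resource_url.startswith(prefix):
        raise ValueError(f"{resource_url} does not start with {prefix}")
    # What remains after cutting off the prefix becomes the local filename.
    name = resource_url[len(prefix):].lstrip("/")
    return home / row["file_path_map_to"] / row["institution_key"] / row["collection_key"] / name
```

A row that passes check_row is a plain dictionary, so it could then be stored with TinyDB's insert method, for example db.insert(EXAMPLE_ROW) on a TinyDB instance opened on a database file of your choosing.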