Migration : Big Picture - OregonDigital/hyrax-migrator GitHub Wiki

Hyrax Integration

hyrax-migrator is designed as a Rails engine that lives alongside Hyrax, with dependencies that make use of some of the Hyrax core code. The Hyrax integration is generally used to identify User, AdminSet, and other models that relate to a work as it is being migrated. The Hyrax core integration is found at https://github.com/OregonDigital/hyrax-migrator/tree/master/lib/hyrax/migrator/hyrax_core.

Migrator Work

In the context of hyrax-migrator, a work represents a unique original item to be migrated to a Hyrax repository. The work data structure has the following fields:

  • env : A hash containing the state of the work as metadata is mapped from the original item into the shape to be persisted in the Hyrax repository. This hash allows the migrator actors to restart or resume a migration without reprocessing previous steps.
  • pid : A unique ID previously generated for the original item. This lets the migrator crosswalk and preserve previous IDs during migration, transform the original ID, or ignore it entirely and let Hyrax mint a new ID for the work being migrated.
  • file_path : A path (file path, or S3 URL) to the original item archive, currently expected to be a BAG file.
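
As a rough illustration of the three fields above, the following minimal sketch models a work as a plain Ruby Struct. The class name MigratorWork, the methods complete_step and completed?, and the :completed_steps and :attributes keys are hypothetical names chosen for this example. They are not taken from the hyrax-migrator code, which may store and update this state quite differently.

```ruby
# Hypothetical stand-in for a migrator work; not the gem's actual class.
MigratorWork = Struct.new(:pid, :file_path, :env, keyword_init: true) do
  # Record that a step finished and merge its output into env,
  # so that a later run could skip the step instead of repeating it.
  def complete_step(name, result = {})
    env[:completed_steps] ||= []
    env[:completed_steps] << name unless env[:completed_steps].include?(name)
    env.merge!(result)
    self
  end

  # True when the named step has already been recorded in env.
  def completed?(name)
    Array(env[:completed_steps]).include?(name)
  end
end

# Example values are made up for illustration.
work = MigratorWork.new(pid: 'abcde1234', file_path: '/tmp/bags/abcde1234.zip', env: {})
work.complete_step(:crosswalk_metadata, attributes: { title: ['Example title'] })
```

Keeping all intermediate results in env is what would make resuming possible: an actor can check the hash before doing any work and pass the work along untouched if its step is already recorded.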

Migration Specifics

The general workflow for the migrator is to pass a work into the DefaultMiddleware#actor_stack which processes the original item through several Actors to perform a configurable set of operations including original file validation, metadata crosswalk, file upload, repository persistence, and migration validation.
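
The chained-actor idea described above can be sketched in plain Ruby. This is only an approximation of the pattern under simplifying assumptions: the classes SketchActor, SketchBagValidator, SketchCrosswalk, and SketchPersist and the build_stack helper are hypothetical, the work is reduced to a bare hash, and none of this reflects the actual signature or behavior of DefaultMiddleware#actor_stack.

```ruby
# Hypothetical base actor: each actor holds a reference to the next one.
class SketchActor
  def initialize(next_actor = nil)
    @next_actor = next_actor
  end

  # Run this actor's step, then hand the work to the next actor, if any.
  def call(work)
    process(work)
    @next_actor ? @next_actor.call(work) : work
  end
end

class SketchBagValidator < SketchActor
  def process(work)
    (work[:log] ||= []) << :bag_validated
  end
end

class SketchCrosswalk < SketchActor
  def process(work)
    (work[:log] ||= []) << :metadata_crosswalked
  end
end

class SketchPersist < SketchActor
  def process(work)
    (work[:log] ||= []) << :persisted
  end
end

# Build the chain from the last actor backwards so the first listed actor runs first.
def build_stack(actor_classes)
  actor_classes.reverse.inject(nil) { |next_actor, klass| klass.new(next_actor) }
end

stack = build_stack([SketchBagValidator, SketchCrosswalk, SketchPersist])
result = stack.call({ pid: 'abcde1234' })
```

Because the list of actor classes is ordinary data, the set of operations stays configurable: reordering or removing an entry changes the workflow without touching the actors themselves.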

What is an original item?

The migrator is implemented, initially, to handle zipped BAG files that contain metadata and original file(s) that are processed/crosswalked through Actors and their related Services. An example zipped BAG might contain a structure like:

my_bag.zip
  bag-info.txt
  bagit.txt
  data
    DC.xml
    RELS-EXT.xml
    content.jpeg
    descMetadata.nt
    rightsMetadata.xml
    workflowMetadata.yml
  manifest-md5.txt
  manifest-sha1.txt
  tagmanifest-md5.txt
  tagmanifest-sha1.txt
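
In a bag laid out like this, manifest-md5.txt lists one checksum and one payload path per line, which makes a basic integrity check straightforward. The sketch below assumes the bag has already been unzipped to a directory, and manifest_mismatches is a hypothetical helper written for this example. It is not the migrator's BagValidator, which may rely on a BagIt library and perform additional checks.

```ruby
require 'digest'
require 'tmpdir'
require 'fileutils'

# Returns the payload paths listed in manifest-md5.txt that are missing
# or whose MD5 checksum does not match the manifest entry.
def manifest_mismatches(bag_dir)
  lines = File.readlines(File.join(bag_dir, 'manifest-md5.txt'), chomp: true)
  lines.reject(&:empty?).map do |line|
    expected, relative_path = line.split(/\s+/, 2)
    full_path = File.join(bag_dir, relative_path)
    ok = File.file?(full_path) && Digest::MD5.file(full_path).hexdigest == expected
    ok ? nil : relative_path
  end.compact
end

# Build a throwaway bag with one valid entry and one entry whose file is absent.
mismatches = Dir.mktmpdir do |bag_dir|
  FileUtils.mkdir_p(File.join(bag_dir, 'data'))
  File.write(File.join(bag_dir, 'data', 'DC.xml'), '<dc/>')
  checksum = Digest::MD5.hexdigest('<dc/>')
  File.write(
    File.join(bag_dir, 'manifest-md5.txt'),
    "#{checksum}  data/DC.xml\n0000  data/content.jpeg\n"
  )
  manifest_mismatches(bag_dir)
end
```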

Migration Workflow

Actors

BagValidator

CrosswalkMetadata

ModelLookup

FileUpload

ListChildren

AdminSetMembership

AddRelationships

PersistWork

Terminal