Migration - Enstore-org/enstore GitHub Wiki

Definition

Migration is the act of moving tape data from one type of media to another; for example, from M8 to M9 cartridges. This process is commonly performed at Fermilab, and is likely a necessary part of any long-term archival storage system.

We use two types of migration to move data in Enstore: the classical migration, which uses a standalone Python script (now migrate_chimera.py) to interface with Enstore directly via encp, and an experimental migration process that uses dCache. This page focuses on the former.

Design Goals

The classical migration process was designed with the following goals in mind:

  1. File based migration - volume migration is simply looping over the files
  2. Tracks states of the file during and after migration - migration table in enstoredb
  3. Can be interrupted at any time; when rerun with the same parameters, it resumes where it left off
  4. Low priority - taking advantage of the idle cycles of the system
  5. Paranoid checking - making sure every step is verifiably correct
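
The first three goals can be illustrated with a minimal sketch of a resumable, file-based loop. It is not the code in migrate_chimera.py: the state names, the `migrate_files` function, and the in-memory `state` dictionary (standing in for the migration table in enstoredb) are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable

# Hypothetical per-file states, in the order the steps are described on this page.
# The real migration table in enstoredb may record progress differently.
STEPS = ("copied", "swapped", "scanned", "finalized")


def migrate_files(
    files: Iterable[str],
    state: Dict[str, str],
    actions: Dict[str, Callable[[str], None]],
) -> None:
    """Run the remaining steps for each file, recording progress after each one."""
    for name in files:
        done = state.get(name)
        # Resume after the last recorded step, or start from the beginning.
        start = STEPS.index(done) + 1 if done in STEPS else 0
        for step in STEPS[start:]:
            actions[step](name)  # raises on failure; state keeps the last good step
            state[name] = step   # record progress only after the step succeeded
```

Because progress is recorded per file and per step, calling `migrate_files` again with the same arguments skips work that has already been recorded as done.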

In addition, a major feature of this process is the reuse of the original PNFSID for the new file. This has been prioritized over simplicity in implementation.

Migration

Migration can be performed for a single file, a set of files, a volume, or a set of volumes. The procedure described here applies to each file; migrating whole volumes differs only in the small ways mentioned below.

Copy

To create a new copy of the source data on the new media:

  • The source file is encp-ed to the local file system on a migration node, and then encp-ed to a new (PNFS) destination.
  • The destination has a similar path to the source, except it includes a “Migration” directory - different directory, different tags (different media).
      • This is just an arbitrary design decision that makes debugging easier.
  • The destination file has its own bfid and file record in enstoredb, and pnfsid and layers in PNFS.
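
The two-hop copy can be sketched as follows. This is a simplified illustration, not the actual script: `copy_via_staging` is a hypothetical name, and the `transfer` callable is an assumption standing in for an encp invocation, so that no encp command-line options are implied here.

```python
import os
import shutil
import tempfile
from typing import Callable


def copy_via_staging(src: str, dst: str, transfer: Callable[[str, str], None]) -> None:
    """Copy src to dst in two hops through a local staging file.

    `transfer(source, destination)` is a stand-in for encp (an assumption of
    this sketch); any callable that copies one file works.
    """
    staging_dir = tempfile.mkdtemp(prefix="migration-")
    staged = os.path.join(staging_dir, os.path.basename(src))
    try:
        transfer(src, staged)  # hop 1: source media -> local disk on the migration node
        transfer(staged, dst)  # hop 2: local disk -> destination on the new media
    finally:
        # The staging copy is only an intermediate; remove it either way.
        shutil.rmtree(staging_dir, ignore_errors=True)
```

For a local dry run, `shutil.copyfile` can be passed as `transfer`.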

Figure: Enstore migration: copying

Metadata Swap

To update the source file's PNFS metadata to match the destination file, without changing the PNFSID:

  • In PNFS, copy the destination's layer 1 to the source's layer 1 (having the destination file's bfid in it).
  • In PNFS, copy the destination's layer 4 to the source's layer 4, EXCEPT, (in layer 4) pnfsid_file keeps the original (source) and original_name keeps that of the source.
      • Now, the original (source) PNFS entry is pointing to the new (destination) file in Enstore.
  • In enstoredb, the pnfs_id on the destination file record (which up until now pointed to the file under /Migration/) is set to the source file's pnfs_id.
  • In enstoredb, the pnfs_path in the destination file record is set to the source file's pnfs_path (the one without /Migration/ in it).
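
The swap can be modeled with plain dictionaries, as in the sketch below. The layout is an assumption for illustration: layer 1 is represented as the bfid string, layer 4 as a dictionary, and `swap_metadata` is a hypothetical helper, not a function from Enstore. Only the names pnfsid_file, original_name, pnfs_id, and pnfs_path come from the text above.

```python
from typing import Any, Dict, Tuple

# Layer 4 fields that keep the source file's values during the swap.
PRESERVED_LAYER4_FIELDS = ("pnfsid_file", "original_name")


def swap_metadata(
    src_pnfs: Dict[str, Any],
    dst_pnfs: Dict[str, Any],
    src_record: Dict[str, Any],
    dst_record: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return the updated source PNFS entry and destination enstoredb record."""
    # Start from the destination's layer 4, then restore the preserved fields.
    layer4 = dict(dst_pnfs["layer4"])
    for field in PRESERVED_LAYER4_FIELDS:
        layer4[field] = src_pnfs["layer4"][field]

    # The source entry now carries the destination's bfid (layer 1).
    new_src_pnfs = {"layer1": dst_pnfs["layer1"], "layer4": layer4}

    # The destination file record is re-pointed at the source's PNFS identity.
    new_dst_record = dict(dst_record)
    new_dst_record["pnfs_id"] = src_record["pnfs_id"]
    new_dst_record["pnfs_path"] = src_record["pnfs_path"]
    return new_src_pnfs, new_dst_record
```

The function returns new dictionaries rather than mutating its inputs, which makes it easy to compare the state before and after the swap.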

Figure: Enstore migration: metadata_swap

Scanning

To read the new file back from tape and ensure integrity:

  • The file is encp-ed out using the same PNFS entry, which now reads the new file (on the destination tape).
      • This action utilizes “tape assertion” to speed up the read back.

Finalize

To clean up source file metadata and migration artifacts:

  • The source file in Enstore is marked deleted.
  • The layers of the destination file in “/pnfs/fs/usr/Migration/……” are “zapped” (cleaned out, so that deletion of the PNFS entry will not trigger the deletion in Enstore).
  • The destination file's entry is removed from PNFS.
      • This entry is no longer needed, as the source file's entry in PNFS has been updated to reference the destination file in Enstore.
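
The order of these steps matters, which a toy model can show. Everything in this sketch is an assumption for illustration: PNFS and Enstore are represented as dictionaries, and `remove_pnfs_entry` and `finalize` are hypothetical helpers that imitate the behavior described above, where removing a PNFS entry whose layers are still populated triggers deletion of the file in Enstore.

```python
from typing import Any, Dict


def remove_pnfs_entry(pnfs: Dict[str, Dict[str, Any]],
                      enstore: Dict[str, Dict[str, Any]],
                      path: str) -> None:
    """Toy model: removing an entry whose layer 1 still names a bfid
    marks that file deleted in Enstore."""
    entry = pnfs.pop(path)
    bfid = entry.get("layer1")
    if bfid:
        enstore[bfid]["deleted"] = True


def finalize(pnfs: Dict[str, Dict[str, Any]],
             enstore: Dict[str, Dict[str, Any]],
             src_bfid: str,
             migration_path: str) -> None:
    """Clean up after a successful copy, metadata swap, and scan."""
    # 1. The source file in Enstore is marked deleted.
    enstore[src_bfid]["deleted"] = True
    # 2. Zap the layers of the Migration entry so that its removal
    #    cannot cascade to the destination file in Enstore.
    pnfs[migration_path]["layer1"] = ""
    pnfs[migration_path]["layer4"] = {}
    # 3. Remove the now-empty Migration entry from PNFS.
    remove_pnfs_entry(pnfs, enstore, migration_path)
```

In this model, skipping step 2 would mark the destination file deleted as a side effect of step 3, which is the outcome that zapping the layers is meant to prevent.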

Figure: Enstore migration: finalize

Volume Migration

In volume migration, scanning can be postponed until all files have been copied and all metadata has been swapped. In this case:

  • The destination tape will initially have a file_family value of "<source_file_family>-Migration" in order to prevent non-migration files from being written to this tape before migration is complete.
  • After the "final scan", the file_family will be updated to reflect the source volume.
  • The final scan covers the destination volume(s), which may hold files from multiple source volumes.
  • Once all destination volumes have been scanned successfully, every source volume is left with only deleted files, and the source media can be recycled.
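
The file_family naming and the recycling condition can be expressed as small helpers. This is a sketch under stated assumptions: the function names are hypothetical, and restoring the file_family by stripping the "-Migration" suffix is one reading of "updated to reflect the source volume", not confirmed behavior.

```python
from typing import Iterable

MIGRATION_SUFFIX = "-Migration"


def migration_file_family(source_file_family: str) -> str:
    """file_family used on the destination tape while migration is in progress."""
    return source_file_family + MIGRATION_SUFFIX


def restored_file_family(file_family: str) -> str:
    """file_family after the final scan (assumed: the suffix is simply removed)."""
    if file_family.endswith(MIGRATION_SUFFIX):
        return file_family[: -len(MIGRATION_SUFFIX)]
    return file_family


def sources_recyclable(destination_scans_ok: Iterable[bool],
                       source_files_deleted: Iterable[bool]) -> bool:
    """Source media can be recycled only when every destination volume scanned
    successfully and every file on the source volumes is marked deleted."""
    return all(destination_scans_ok) and all(source_files_deleted)
```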

Migration Issues

The migration process is fairly robust, and does not typically lead to data issues. However, because its checks are so paranoid, a migration attempt very commonly uncovers underlying issues, such as orphaned files and erroneous duplicates. In addition, migration occasionally runs into transient issues such as network failures or server busy timeouts; these can typically be resolved by re-running migration with the same arguments at a later time.