Erroneous Duplicates - Enstore-org/enstore GitHub Wiki

Definition

Erroneous duplicates occur when Enstore tapes hold several copies of the same file that Enstore believes are unique. In this case, each entry in the Enstore DB references the same file in the PNFS namespace, but PNFS refers to only one of the copies in Enstore. As a result, the remaining copies in Enstore become Orphaned Files.
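
The definition above can be illustrated with a minimal Python sketch. It assumes the Enstore file records and the PNFS references have already been exported into plain dictionaries. The function name, the record layout, and the short IDs are hypothetical and are not part of Enstore. The sketch also treats every extra active record as erroneous, which may not hold if a site keeps intentional copies.

```python
from collections import defaultdict


def find_orphaned_bfids(enstore_records, pnfs_bfid_by_pnfsid):
    """Return {pnfsid: [bfids]} of active copies that PNFS does not reference.

    enstore_records: list of dicts with 'bfid', 'pnfsid' and 'deleted' keys
                     (hypothetical export of Enstore DB rows).
    pnfs_bfid_by_pnfsid: dict mapping each PNFSID to the single BFID that
                         the PNFS namespace points at (also hypothetical).
    """
    # Group the non-deleted BFIDs by the PNFS file they claim to represent.
    active = defaultdict(list)
    for rec in enstore_records:
        if rec["deleted"] == "no":
            active[rec["pnfsid"]].append(rec["bfid"])

    orphans = {}
    for pnfsid, bfids in active.items():
        if len(bfids) > 1:  # more than one "unique" copy of the same file
            referenced = pnfs_bfid_by_pnfsid.get(pnfsid)
            orphans[pnfsid] = sorted(b for b in bfids if b != referenced)
    return orphans


# Illustrative data only: P1 has two active copies, P2 has one.
records = [
    {"bfid": "B1", "pnfsid": "P1", "deleted": "no"},
    {"bfid": "B2", "pnfsid": "P1", "deleted": "no"},
    {"bfid": "B3", "pnfsid": "P2", "deleted": "no"},
]
pnfs_refs = {"P1": "B1", "P2": "B3"}
print(find_orphaned_bfids(records, pnfs_refs))
```

Here PNFS points at B1 for file P1, so B2 is reported as the orphaned copy, while P2 is not reported at all.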

Causes

TODO

Symptoms and Resolutions

Metadata Inconsistent on BFID

During migration, this sort of issue can lead to errors like the following:

(IDs have been modified)

Thu May  5 06:49:08 2022 WRITE_1 SWAP_METADATA CDMS144212345000000 /pnfs/fs/usr/file/myfile CDMS165175134800002 /pnfs/fs/usr/Migration/file/myfile failed due to [1] metadata CDMS144212345000000 /pnfs/fs/usr/file/myfile are inconsistent on bfid ... ERROR

In these cases, migration has copied one of the duplicate files to the new volume and updated the PNFS namespace metadata to reference the BFID of the new copy. Then, while trying to migrate the remaining duplicates, the migration script sees that there is already a copy on the new media, but the BFID does not match what it expects, so migration fails.

Resolution

To resolve, mark the erroneous copies as deleted. Before doing so, responders should confirm that the copies on the source media share the same CRC and file size, and that a non-deleted matching copy exists on the destination media. This can be done by querying Enstore for all files matching the PNFSID of the failing files and checking the relevant attributes. If these conditions hold, it is safe to mark as deleted all BFIDs that are failing with the error above, and then rerun migration:

(IDs have been modified)

$ enstore info --file CDMS144212345000000
...
'pnfsid': '00001234567890ABCDEF1234567890ABCDEF',
$ enstore info --file 00001234567890ABCDEF1234567890ABCDEF | grep -E "(bfid|tape_label|deleted|complete_crc)"
 'bfid': 'CDMS999912345000000',   # Dest BFID
 'complete_crc': 1234567890L,
 'deleted': 'no',
 'tape_label': 'DST001L8',
 'bfid': 'CDMS144212345000000',   # ERR in log
 'complete_crc': 1234567890L,
 'deleted': 'no',
 'tape_label': 'SRC001',
 'bfid': 'CDMS144212345000001',   # ERR in log
 'complete_crc': 1234567890L,
 'deleted': 'no',
 'tape_label': 'SRC001',
 'bfid': 'CDMS144212345000002',   # ERR in log
 'complete_crc': 1234567890L,
 'deleted': 'no',
 'tape_label': 'SRC001',
 'bfid': 'CDMS144212345000003',   # no ERR in log
 'complete_crc': 1234567890L,
 'deleted': 'no',
 'tape_label': 'SRC001',
$ for bfid in CDMS144212345000000 CDMS144212345000001 CDMS144212345000002; do
  enstore file --modify $bfid "deleted=y"
done
bfid = CDMS144212345000000
bfid = CDMS144212345000001
bfid = CDMS144212345000002
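
The pre-deletion checks in this resolution can also be written down as a small Python sketch. It assumes the `enstore info` output for one PNFSID has already been parsed into dictionaries. The function name, the `size` key, and the size value 1024 are illustrative assumptions. The sketch only evaluates the conditions and does not call Enstore.

```python
def safe_to_mark_deleted(records, failing_bfids, dest_label):
    """Return True if the failing BFIDs meet the conditions described above.

    records: list of dicts for ONE PNFSID with 'bfid', 'tape_label',
             'deleted', 'complete_crc' and 'size' keys (assumed layout).
    failing_bfids: BFIDs reported in the migration error log.
    dest_label: label of the destination volume.
    """
    source = [r for r in records if r["tape_label"] != dest_label]
    if not source or not failing_bfids:
        return False

    # Every failing BFID must be one of the known source copies.
    if not set(failing_bfids) <= {r["bfid"] for r in source}:
        return False

    # All source copies must share the same CRC and file size.
    signatures = {(r["complete_crc"], r["size"]) for r in source}
    if len(signatures) != 1:
        return False
    signature = next(iter(signatures))

    # A non-deleted copy with the same CRC and size must be on the destination.
    return any(
        r["tape_label"] == dest_label
        and r["deleted"] == "no"
        and (r["complete_crc"], r["size"]) == signature
        for r in records
    )


# Illustrative records modelled on the (modified) IDs in this page.
example_records = [
    {"bfid": "CDMS999912345000000", "tape_label": "DST001L8",
     "deleted": "no", "complete_crc": 1234567890, "size": 1024},
    {"bfid": "CDMS144212345000000", "tape_label": "SRC001",
     "deleted": "no", "complete_crc": 1234567890, "size": 1024},
    {"bfid": "CDMS144212345000001", "tape_label": "SRC001",
     "deleted": "no", "complete_crc": 1234567890, "size": 1024},
]
failing = ["CDMS144212345000000", "CDMS144212345000001"]
print(safe_to_mark_deleted(example_records, failing, "DST001L8"))
```

If the destination copy had a different CRC or size, or was itself marked deleted, the function would return False and the BFIDs should not be marked deleted without further investigation.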