File Analyzer at Georgetown Solutions for Special Collections
The Georgetown University Library consolidated item records for the university's art collections into EmbARK. Before this project was completed, 13,621 objects were described outside of EmbARK in 13 input spreadsheets. Across these spreadsheets, 340 unique column mappings pointed to roughly 100 unique EmbARK fields (spanning 11 distinct EmbARK object types). The File Analyzer user interface was used for iterative quality control during the data migration and normalization effort.
The File Analyzer presented the user with about two dozen filters for identifying missing data, mismatched data, and data format errors.
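The three kinds of checks named above can be illustrated with a minimal sketch. This is not the File Analyzer's own code or API. The field names, the allowed object types, the date pattern, and the sample rows are all hypothetical, chosen only to show what a missing-data, mismatched-data, and format check might look like on one spreadsheet row.

```python
import re

# Hypothetical rules: these field names, types, and the date pattern are
# illustrative assumptions, not the actual EmbARK schema.
REQUIRED_FIELDS = ["object_id", "title"]
ALLOWED_TYPES = {"painting", "print", "sculpture"}
DATE_PATTERN = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")  # YYYY, YYYY-MM, or YYYY-MM-DD


def check_row(row):
    """Return a list of (field, problem) tuples for one spreadsheet row."""
    problems = []

    # Missing data: a required field is absent or blank.
    for field in REQUIRED_FIELDS:
        if not (row.get(field) or "").strip():
            problems.append((field, "missing"))

    # Mismatched data: a value falls outside the controlled vocabulary.
    obj_type = (row.get("object_type") or "").strip().lower()
    if obj_type and obj_type not in ALLOWED_TYPES:
        problems.append(("object_type", "mismatched"))

    # Format error: a value is present but does not match the expected pattern.
    date = (row.get("date") or "").strip()
    if date and not DATE_PATTERN.match(date):
        problems.append(("date", "bad format"))

    return problems


# Made-up sample rows, for illustration only.
rows = [
    {"object_id": "1", "title": "Harbor View", "object_type": "Painting", "date": "1887"},
    {"object_id": "", "title": "Untitled", "object_type": "fresco", "date": "c. 1900"},
]
report = {index: check_row(row) for index, row in enumerate(rows)}
```

Running checks like these after each round of spreadsheet cleanup, and rerunning them until the report is empty, is one way to picture the iterative quality control described above.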