Metadata prep and management - smith-special-collections/sc-documentation GitHub Wiki
Metadata display and configuration
You may find that metadata fields supplied through the Islandora Multi-Importer (IMI) spreadsheet or a validated XML file are not displayed on Compass. This is because the Compass Metadata Team has configured only certain fields to be displayed in the digital object record. If a field should be displayed, or if the display label for a field should be changed, the change must be discussed and configured by the Compass Metadata Team. To raise it with the team, forward the request and the reason for (re)configuration through one of Smith's Compass liaisons.
Preparing the IMI metadata spreadsheet
The 3-College TWIG_Template-Multi-Importer CSV to MODS Federated Mapping and Twig Template Reference reflects the most up-to-date changes to the TWIG template used in Compass, so it is recommended that this template be used when starting a new IMI spreadsheet.
There is also an IMI template designed specifically for Smith College which includes examples for both FTP and zip ingesting, but this spreadsheet may not reflect the latest TWIG updates.
To use the IMI metadata spreadsheet:
- Make a copy of the IMI spreadsheet.
- Determine which fields are needed for the project.
  - The field names must be entered exactly as they are in the template. However, not all fields need to be used, and they can be in any sequence.
- If only objects are being added in your ingest (and metadata will be added separately), then you only need to use the following fields:
  - PARENT
  - CMODEL
  - SEQUENCE
  - OBJ_FILE
  - IDENTIFIER
  - TITLE
- (optional) Incorporate the metadata exported from ArchivesSpace or the catalog into the IMI spreadsheet.
- Review the IMI data carefully to ensure it is formatted correctly. Ingest errors are often caused by data that is not well-formed.
- Download the IMI sheet as a CSV file to your desktop:
  - Download the IMI sheet(s) as “Comma-separated values (.csv, current sheet)”.
  - The IMI sheet must be UTF-8 encoded to preserve special characters. Google Sheets files downloaded as CSV are automatically saved with UTF-8 encoding, which preserves non-ASCII characters. If using Excel to create your IMI sheet, note that not all Excel versions export spreadsheets with UTF-8 encoding. It's safest to copy and paste the contents of your Excel document into a blank Google spreadsheet, then download the file as a CSV document.
  - Warning: Once downloaded, do not open the CSV file in Excel. It will override the UTF-8 encoding and you will have to repeat the download process.
- See Ingest instructions.
See also these instructions for combining a file list with object-level metadata
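The review step above can be partly automated before ingest. The following is a minimal sketch, not part of the Compass or IMI tooling: it checks that a downloaded CSV decodes as UTF-8 and contains the six columns listed above for an objects-only ingest. The function name `check_imi_csv` and the sample values are hypothetical, and the column spellings should be adjusted to match the template exactly.

```python
import csv
import io

# Columns listed above for an objects-only ingest.
# Assumption: spelled as in the list; adjust to match the template exactly.
REQUIRED_FIELDS = ["PARENT", "CMODEL", "SEQUENCE", "OBJ_FILE", "IDENTIFIER", "TITLE"]


def check_imi_csv(raw_bytes):
    """Return a list of problems found in a downloaded IMI CSV (empty if none)."""
    try:
        # utf-8-sig also accepts a file that starts with a byte order mark.
        text = raw_bytes.decode("utf-8-sig")
    except UnicodeDecodeError as err:
        return [f"not valid UTF-8: {err}"]

    reader = csv.DictReader(io.StringIO(text, newline=""))
    headers = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in REQUIRED_FIELDS if name not in headers]

    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        if None in row:  # DictReader stores surplus cells under the key None
            problems.append(f"row {row_number}: more cells than header columns")
    return problems


# Placeholder values for illustration only; they are not real PIDs or content models.
sample = (
    "PARENT,CMODEL,SEQUENCE,OBJ_FILE,IDENTIFIER,TITLE\n"
    "example:parent,example-cmodel,1,page1.tif,id-001,Example title\n"
).encode("utf-8")
print(check_imi_csv(sample))
```

To check a real download, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to the function. This only catches encoding and column problems; it does not replace a careful review of the data itself.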
IMI fields and formatting
The Compass Metadata Team has determined which fields are used by each of the three colleges (HC, MHC, and SC), and its decisions are reflected in the fields implemented for the IMI spreadsheet. The fields listed below are all of the fields available for use in the IMI spreadsheet. However, some fields may be in use only at specific colleges; where known, that is indicated in the table. As always, staff should follow DACS and the archival metadata practices outlined in the ArchivesSpace manual for archival, component, or digital object records, or local bibliographic practices, to determine which metadata fields should be available in the Compass digital object record.
IMI uses pipes | to delimit element and attribute values. If an attribute is optional and not entered, the pipes need to still be included, even if there is no value between them. IMI also uses semicolons ; to delimit multiple values. Refer to the IMI metadata spreadsheet template (above) to see how elements, attributes, and multiple values should be formatted, as well as for examples.
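To make the two delimiters concrete, here is a minimal sketch of how a single cell breaks down: semicolons separate multiple values, and pipes separate each value's element and attribute parts, with empty parts kept so that an omitted optional attribute still holds its position. The function name `parse_imi_cell` and the sample cell are hypothetical; the actual order of attributes for each field is defined by the template, not by this example.

```python
def parse_imi_cell(cell):
    """Split one IMI cell into values, and each value into its pipe-separated parts."""
    values = []
    for value in cell.split(";"):
        if not value.strip():
            continue  # skip empty values, e.g. from a trailing semicolon
        # Keep empty strings so an omitted optional attribute keeps its position.
        values.append([part.strip() for part in value.split("|")])
    return values


# Hypothetical cell: two values, each with three attribute slots, some left empty.
print(parse_imi_cell("First value|attr1||attr3;Second value|||"))
```

Note that the pipes are still present for the second value even though all of its attributes are empty, which is what the rule above requires.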
IMI field | Description | Delimiters | Pipe formatting? | Attribute values |
---|---|---|---|---|
PARENT | Enter the PID of the parent collection or object to which the objects belong. To find the PID, log into Compass, scroll to the bottom of the page and click Islandora Repository, then navigate to the collection you plan to ingest into and look at the end of the URL. Note that for repositories the PID in the URL is wrong: if the item is going directly into a repository, replace the single dash between the repository name and item information with a double dash (e.g. smith:scsc--sca rather than smith:scsc-sca). | |||
CMODEL | In this field, enter the content model to assign to the object. The content model is a package that pairs the object with a specific viewer, etc. Some collections may only allow certain types of content models. | Refer to the Islandora content models reference page. | ||
SEQUENCE | For multi-page objects, enter the sequence of the pages for each file. For example, page 1 of a multipage object would have the value "1" in this field, page 2 -- the value "2", etc. | |||
OBJ_FILE | Enter the filename of the object, including the file extension. For multipage objects, this field is blank in the object's first row. If using the FTP method to upload files, be sure to include the full path for the FTP location: /mnt/ingest/smith | |||
Identifier | Enter the local identifier or filename for the object. | |||
Physical_Location | Enter the name of the holding institution that stewards the object. | A semicolon delimits multiple values. | ||
Collection | Enter the parent resource to which the object belongs. | A semicolon delimits multiple values. A pipe delimits element and attribute values. | Yes. | |
Shelf_Location | Enter identifying information about the physical container within which the original items are housed. This field is not being used by Smith. | A semicolon delimits multiple values. | ||
Title | Enter the title of the object. | |||
SubTitle | Enter the subtitle of the object. | |||
Name | Enter the name of the creator, source, and/or Special Collection unit. | A semicolon delimits multiple values. A pipe delimits element and attribute values. | Yes. | Agent type is optional; the default is "personal". Agent type options are "personal" or "corporate". Authority source is optional; the default is "local". Authority source options include "local", "naf", etc. Authority valueURI is optional; there is no default. Role1 and role2 are optional. Role1 and role2 authority source is optional; the default is "marcrelator". Role1/role2 authority source options include "marcrelator" and "archivesspace". |
Publisher | Enter the object's publisher. | |||
Place_Of_Publication | Enter the object's place of publication. | |||
Date_Created | Enter the date expression for when the object was created. | |||
Date_Issued | Enter the date expression for when the object was issued/published. | |||
Date_Submitted | Enter the date expression for when the object was submitted to Compass. | |||
Date_Created_Start | Enter the normalized start date for when the object was created. | |||
Date_Created_End | Enter the normalized end date for when the object was created. | |||
Date_Issued_Start | Enter the normalized start date for when the object was issued. | |||
Date_Issued_End | Enter the normalized end date for when the object was issued. | |||
Access_Restriction | Enter information pertaining to the access of the object. This is a conditions governing access note. | |||
Use_Restriction | Enter information pertaining to the use and reproduction of the object. This is a conditions governing use note. | |||
Copyright | Enter the DPLA rights statement and the rights statement URI associated with this object. | A pipe delimits element and attribute values. | Yes. | |
Language | Enter the language of the materials. | A semicolon delimits multiple values. | ||
Description | Enter a brief description about the object. | |||
Digital_Origin | Enter the method by which the object achieved digital form. The values that may be used are defined by MODS: http://www.loc.gov/standards/mods/userguide/physicaldescription.html#digitalorigin. | |||
Extent | Enter the extent statement for the object. | |||
Digital_Format | Enter the internet media type. Use media type values from the Internet Assigned Numbers Authority (iana) standard: http://www.iana.org/assignments/media-types/media-types.xhtml. | |||
Physical_Description_Note | Enter information about the physical description of an item. This can include a condition description or characteristics of the physical nature of an item that affected its digital form. | |||
Material | Enter the types of substances or materials used to create the object's physical instantiation. This is being solely used by MHC for their art teaching collection. | |||
Technique | Enter the technique used to create the object's physical instantiation. This is being solely used by MHC for their art teaching collection. | |||
Type | Enter a term that specifies a general type of content of the object. The values that may be used are defined by MODS: http://www.loc.gov/standards/mods/userguide/typeofresource.html. | A semicolon delimits multiple values. | ||
Genre | Enter the genre of the materials. | A semicolon delimits multiple values. A pipe delimits element and attribute values. | Yes. | Authority source is optional; the default is "aat". Authority valueURI is optional. |
Culture | Enter the culture from which the object's physical instantiation originated. This is being solely used by MHC for their art teaching collection. | A semicolon delimits multiple values. | ||
Preferred_Citation | Enter the institution's preferred citation for the object. | A semicolon delimits multiple values. | ||
General_Note | Enter a general note about the object. | A semicolon delimits multiple values. | ||
Note_Type | Enter a general note about the object, with a specific note type and display label. | A semicolon delimits multiple values. A pipe delimits element and attribute values. | Yes. | |
Sponsorship | Enter information about an individual or organization that may have sponsored the digitization of the object or contributed the digital object to the institution. | A semicolon delimits multiple values. | ||
Subject_Topic | Enter topics that the object may be about or is captured in the object. | A semicolon delimits multiple values. A pipe delimits element and attribute values. | Yes. | Authority source is optional; the default is "local". Authority valueURI is optional. |
Subject_Geographic | Enter the geographic places that the object may be about or that are captured in the object. | A semicolon delimits multiple values. A pipe delimits element and attribute values. | Yes. | Authority source is optional; the default is "local". Authority valueURI is optional. |
Subject_Name | Enter the names of individuals, organizations, or families that the object may be about or are captured in the object. | A semicolon delimits multiple values. A pipe delimits element and attribute values. | Yes. | Authority source is optional; the default is "local". Authority valueURI is optional. |
Classification | Enter the call number for the bibliographic object. | A semicolon delimits multiple values. | ||
Alternate_Copy | Enter information about the existence and location of copies. This is being used solely by MHC. | A semicolon delimits multiple values. | ||
Original_Location | Enter information about the existence and location of originals. This is being used solely by MHC. | |||
Series | Indicate the series of which the object is a child. This is being used by HC and MHC. | |||
Preparing XML metadata
Work with Special Collections Technical Services to get metadata out of ArchivesSpace or Aleph in an XML format that Compass accepts (MODSXML or MARCXML). See the Ingest workflow page (https://github.com/smith-special-collections/sc-documentation/wiki/Ingest-workflow) for more information.
If your metadata is in an IMI spreadsheet, see the workaround in "Adding or updating metadata without a MODSXML file" below to convert it to XML.
Updating metadata datastreams for individual objects
- Go to the Compass digital object record for which the metadata will be updated.
- Select the Manage tab above the navigation breadcrumbs. This will take you to the backend management dashboard.
  - Select Datastreams underneath the Manage tab.
  - Download the MODS datastream by clicking download in the Operations column.
- Update the metadata in an editor, such as oXygen XML Editor.
- Return to the object's datastream management page. In the Operations column, select Replace next to the MODS datastream.
- Upload the revised XML file.
- Click Add Contents. Once the datastream is replaced, the number in the Versions column should be updated.
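Before uploading a revised file, it can help to confirm that the edit did not break the XML. The sketch below uses Python's standard `xml.etree.ElementTree` to check that the text is well-formed and, as an assumption about the downloaded datastream, that its root element is `mods` in the MODS v3 namespace. The function name `check_mods_xml` is hypothetical, and this is a well-formedness check only, not validation against the MODS schema.

```python
import xml.etree.ElementTree as ET

MODS_NAMESPACE = "http://www.loc.gov/mods/v3"


def check_mods_xml(xml_data):
    """Return None if the data parses as XML with a MODS root, else an error message."""
    try:
        root = ET.fromstring(xml_data)  # accepts str or bytes
    except ET.ParseError as err:
        return f"not well-formed: {err}"
    # Assumption: the datastream's root element is <mods> in the MODS v3 namespace.
    if root.tag != f"{{{MODS_NAMESPACE}}}mods":
        return f"unexpected root element: {root.tag}"
    return None


# Illustrative record, not taken from Compass.
revised = (
    '<mods xmlns="http://www.loc.gov/mods/v3">'
    "<titleInfo><title>Example title</title></titleInfo>"
    "</mods>"
)
print(check_mods_xml(revised))
```

A result of `None` means the file parsed cleanly; any other result is a message describing what to fix before replacing the datastream.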
Adding or updating metadata without a MODSXML file
- Create an IMI sheet for the object.
- Even if the object has multiple pages, the sheet should have only one row of metadata, for the parent object.
- Upload the IMI spreadsheet as if you were ingesting a new IMI batch.
- In the Template tab, select TWIG template: Twig_Temp_2018_01_19
- Click Preview.
- Click on the “Template Parsed Output” tab above.
- Copy and paste the contents into any text editor.
- Save as an XML file.
- Replace the MODS datastream following the steps above.
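If the object was originally ingested from a multi-row IMI sheet, the one-row sheet needed for this workaround can be cut down from it. This is a minimal sketch under one loud assumption: the parent object's metadata is the first data row of the existing sheet. The function name `parent_row_only` and the sample data are hypothetical.

```python
import csv
import io


def parent_row_only(csv_text):
    """Return CSV text holding the header and the first data row only."""
    # Assumption: the parent object's metadata is the first data row.
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    if len(rows) < 2:
        raise ValueError("expected a header row and at least one data row")
    output = io.StringIO(newline="")
    csv.writer(output, lineterminator="\n").writerows(rows[:2])
    return output.getvalue()


# Illustrative sheet: a parent row followed by two page rows.
existing = "TITLE,SEQUENCE\nExample book,\nPage 1,1\nPage 2,2\n"
print(parent_row_only(existing))
```

When saving the result to disk, write it with UTF-8 encoding so that special characters survive, as described in the spreadsheet preparation steps.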
Reverting to a previous datastream version
- Go to the Compass digital object record for which the metadata will be updated.
- Select the Manage tab above the navigation breadcrumbs. This will take you to the backend management dashboard.
- Select Datastreams underneath the Manage tab.
- In the Versions column, select the version number for the metadata datastream that you'd like to revert.
  - Determine which version you'd like to revert to. In the Operations column, select Revert.
  - Double-check the digital object record to ensure you've reverted to the correct version.
Editing descriptive metadata for one item
- In the Manage screen, click Datastreams.
- Next to MODS Record, click Edit.
- Edit the field(s).
- Click Update at the bottom of the screen.
Editing a Title/Label for a Book or page
(Note: this does not affect the Title in the metadata, only the large display title at the top of the image.)
- In the Manage screen, click Properties.
- Edit the Item label.
- Click Update.