Batch Import Format - ccnmtl/footprints GitHub Wiki
Preparing data for import
Footprints can ingest data in .csv format, encoded as UTF-8. The fields are detailed below. (Stay tuned: we are working on a Google Sheets template to help with constructing data sets.)
There are a few options to ensure the data is properly encoded. Exporting to .csv with Microsoft Excel's default settings will not work, because Excel writes the file in the ANSI character set rather than UTF-8.
- Prepare your data in Google Sheets and export to .csv. Google Sheets handles special characters properly. Detailed Directions
- Prepare your data in Excel, save as .xls, import the .xls into Google Sheets, then export to .csv. Detailed Directions
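If a .csv has already been exported from Excel, re-encoding it may also be possible. The following is a minimal sketch, not part of Footprints: the function name `reencode_csv` is hypothetical, and it assumes the export used Windows-1252 (the usual meaning of "ANSI" on Western-locale Windows; other locales use other code pages).

```python
from pathlib import Path


def reencode_csv(src, dst, src_encoding="cp1252"):
    """Rewrite the text file at `src` as UTF-8 at `dst`.

    `src_encoding` is an assumption about how the file was saved. If the
    guess is wrong, special characters will come out garbled, so check the
    result before importing it.
    """
    # Work on raw bytes so that line endings are left exactly as they were.
    text = Path(src).read_bytes().decode(src_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, `reencode_csv("excel-export.csv", "footprints-import.csv")` would write a UTF-8 copy next to the original; both file names are placeholders.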
Fields
bold fields are required
- Catalog Link
  - Validation: url
- BHB number
  - Unique identifier for the Imprint.
  - Validation: numeric
- Imprint Title
  - Validation: text
- Literary Work Title
  - Standardized English title, ideally from Library of Congress
  - Validation: text
- Literary Work Author
  - Validation: name
- Literary Work Author VIAF ID
  - Validation: numeric
- Literary Work Author Birth Date
  - Validation: date
- Literary Work Author Death Date
  - Validation: date
- Publisher
  - Validation: name
- Publisher VIAF ID
  - Validation: numeric
- Publication Location
  - Validation: numeric, geonameId
- Publication Date
  - Validation: date
- Book Copy Call Number
  - Unique book copy identifier
  - Validation: text
- Evidence Type
  - Validation: evidence type
- Evidence Description
  - Validation: text
- Evidence Location Description
  - Validation: text
- Evidence Citation
  - Validation: text
- Footprint Actor (Former Owner/Seller/Other)
  - Validation: name
- Footprint Actor VIAF ID
  - Validation: numeric
- Footprint Actor Role
  - Validation: role
- Footprint Actor Begin Date
  - Validation: date
- Footprint Actor End Date
  - Validation: date
- Footprint Notes
  - Validation: text
- Footprint Location
  - Validation: numeric, geonameId
- Footprint Date
  - Validation: date
- Footprint Narrative
  - Validation: text
Validators
Date
The python-edtf library supports a set of natural language date formats. Here are examples of valid dates and date ranges that can be specified in the import data. Additional date formats may be supported.
Basic Examples
- century: 16th century
- century: 1800s
- decade: 1860s
- year: 1860
- month & year: January 2008
- month, day, year: January 12, 1940
Uncertain/approximate
- adding a ? indicates uncertainty
  - 1860?
- prepending ~, c. or approximately indicates an approximate date
  - c. 1860
  - ~1860
  - approximately 1860
- both modifiers can be used together
  - ~1860?
Date Ranges
Two valid dates separated by a hyphen, with no spaces around the hyphen. Approximate, uncertain and unknown dates can be used to denote ambiguity. "before" indicates an unknown start date; "after" indicates an unknown end date.
- 1841?-~1879? - interval whose beginning is uncertain but thought to be 1841, and whose end is uncertain and approximate but thought to be 1879
- January 12, 1940-December 1941
- before 1992
- after 1992
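The range rules above can be illustrated with a short stdlib-only sketch. This is not how python-edtf or Footprints parses dates; `split_date_range` is a hypothetical helper that only separates the two halves of a range and rejects spaces around the hyphen. It assumes, as in the examples above, that the individual dates do not themselves contain hyphens.

```python
def split_date_range(value):
    """Split a date-range string into (start, end) text parts.

    A part that is unknown is returned as None. This only separates the
    two parts; it does not check that each part is itself a valid date.
    """
    value = value.strip()
    if value.startswith("before "):
        # Unknown start date, known end date.
        return None, value[len("before "):]
    if value.startswith("after "):
        # Known start date, unknown end date.
        return value[len("after "):], None

    start, hyphen, end = value.partition("-")
    if not hyphen:
        raise ValueError(f"not a date range: {value!r}")
    if not start or not end:
        raise ValueError(f"missing start or end date: {value!r}")
    if start != start.rstrip() or end != end.lstrip():
        raise ValueError(f"no spaces allowed around the hyphen: {value!r}")
    return start, end
```

With this helper, `split_date_range("1841?-~1879?")` yields the parts `"1841?"` and `"~1879?"`, while `"1860 - 1870"` is rejected because of the spaces around the hyphen.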
Evidence Type (Medium)
Text must match exactly.
- Approbation in imprint
- Booklist/estate inventory
- Bookseller/auction catalog (pre-1850)
- Bookseller/auction catalog (1850-present)
- Bookseller marking in extant copy
- Censor signature in extant copy
- Dedication in imprint
- Library catalog/union catalog
- Owner signature/bookplate in extant copy
- Reference in another text
- Subscription list in imprint
Name
- Valid text string
- Match on VIAF id, if exists
- Match on name, birth date, death date
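One way to read the matching rules above is sketched below. The `Person` record and `find_match` function are hypothetical and do not reflect the Footprints data model. The sketch also assumes that a name with a VIAF id that matches nothing falls back to the name and date comparison, which the rules above do not spell out.

```python
from typing import NamedTuple, Optional


class Person(NamedTuple):
    # Hypothetical record shape, for illustration only.
    name: str
    viaf_id: Optional[str] = None
    birth_date: Optional[str] = None
    death_date: Optional[str] = None


def find_match(candidate, existing):
    """Return the first record in `existing` that matches `candidate`, or None."""
    # Rule 1: if the candidate carries a VIAF id, try to match on it.
    if candidate.viaf_id:
        for person in existing:
            if person.viaf_id == candidate.viaf_id:
                return person
    # Rule 2 (assumed fallback): match on name, birth date and death date.
    key = (candidate.name, candidate.birth_date, candidate.death_date)
    for person in existing:
        if (person.name, person.birth_date, person.death_date) == key:
            return person
    return None
```

Under this reading, two rows with different spellings of a name but the same VIAF id resolve to the same person, whereas rows without a VIAF id must agree on all three of name, birth date and death date.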
Numeric
- Value must be a string of digits, 0-9.
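A check of this kind can be sketched as follows. `is_numeric` is a hypothetical name, not the validator Footprints uses. An explicit `[0-9]` pattern is used because Python's `str.isdigit()` also accepts non-ASCII digit characters such as superscripts.

```python
import re

# One or more ASCII digits and nothing else.
DIGITS_ONLY = re.compile(r"[0-9]+")


def is_numeric(value):
    """Return True if `value` consists solely of the digits 0-9."""
    return DIGITS_ONLY.fullmatch(value) is not None
```

Empty strings, signs, decimal points and surrounding whitespace all fail this check.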
Numeric, Imprint & Footprint Location
- We use the GeoNames geographical database as the canonical source for our locations. The batch format expects a valid geoname id.
Roles
Text must match exactly.
- Anthologizer
- Bookdealer
- Estate Agent
- Expurgator
- Giver
- Librarian
- Owner
- Seller
- Subscriber
- Viewer
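Exact matching against the two fixed vocabularies (roles and evidence types) can be sketched as a set lookup. The function names are hypothetical, and the sketch assumes "exactly" means case-sensitive with no leading or trailing whitespace.

```python
ROLES = frozenset([
    "Anthologizer", "Bookdealer", "Estate Agent", "Expurgator", "Giver",
    "Librarian", "Owner", "Seller", "Subscriber", "Viewer",
])

EVIDENCE_TYPES = frozenset([
    "Approbation in imprint",
    "Booklist/estate inventory",
    "Bookseller/auction catalog (pre-1850)",
    "Bookseller/auction catalog (1850-present)",
    "Bookseller marking in extant copy",
    "Censor signature in extant copy",
    "Dedication in imprint",
    "Library catalog/union catalog",
    "Owner signature/bookplate in extant copy",
    "Reference in another text",
    "Subscription list in imprint",
])


def is_valid_role(value):
    """Return True only for an exact, case-sensitive role name."""
    return value in ROLES


def is_valid_evidence_type(value):
    """Return True only for an exact, case-sensitive evidence type."""
    return value in EVIDENCE_TYPES
```

Under this assumption, "Owner" passes while "owner" and "Owner " (with a trailing space) do not, so it is worth trimming cells before export.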
Url
- Value must be a valid web url, e.g. http://www.google.com or https://footprints.ccnmtl.columbia.edu
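A rough pre-check for this rule can be written with the standard library. `is_web_url` is a hypothetical helper, and the sketch assumes that "web url" means an http or https address with a host name; the actual Footprints validator may be stricter or looser.

```python
from urllib.parse import urlparse


def is_web_url(value):
    """Return True if `value` has an http(s) scheme and a non-empty host."""
    try:
        parts = urlparse(value)
    except ValueError:
        # urlparse raises ValueError for some malformed inputs.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Both examples above pass this check, while a bare "www.google.com" fails because it has no scheme.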