Posting new items - jhu-library-applications/levy-api GitHub Wiki

Step 1: Post taxonomies

1. Post new taxonomy terms and record identifiers.

This script posts new taxonomy terms to Drupal and records their identifiers.

input: termsToCreate/taxonomyTermsToCreate.csv
script: postTaxonomyTerms.py
output: logs/logOfTaxonomyTermsAdded.csv
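
The following is a minimal sketch of what this step involves, not the contents of postTaxonomyTerms.py. It assumes the Drupal site exposes the core JSON:API module, that the input CSV has columns named `taxonomy` and `term` (hypothetical names), and that the log keeps the term type, name, and UUID. Sending each payload would be a POST to an endpoint such as `/jsonapi/taxonomy_term/<vocabulary>` with the `application/vnd.api+json` content type; the network call is left out so the sketch runs offline.

```python
import csv
import io


def build_term_payload(vocabulary: str, name: str) -> dict:
    """Build a JSON:API request body for one new taxonomy term."""
    return {
        "data": {
            "type": f"taxonomy_term--{vocabulary}",
            "attributes": {"name": name},
        }
    }


def payloads_from_csv(csv_text: str) -> list:
    """Turn rows with the assumed 'taxonomy' and 'term' columns into payloads."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [build_term_payload(row["taxonomy"], row["term"].strip()) for row in reader]


def log_row(payload: dict, response_json: dict) -> dict:
    """Pair a posted term with the UUID found in Drupal's JSON:API response."""
    return {
        "type": payload["data"]["type"],
        "name": payload["data"]["attributes"]["name"],
        "id": response_json["data"]["id"],
    }
```

Each `log_row` result could then be written to the log with `csv.DictWriter`, so that the next script can look identifiers up by term name.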

2. Add taxonomy identifiers to spreadsheet of new items.

In Drupal, taxonomy terms are linked to levy_collection_items by their Drupal identifiers (UUIDs), so the relevant taxonomy identifiers must be added to our metadata spreadsheet. This script gathers the taxonomy identifiers from the spreadsheets in the items-matched folder and from our log, then adds them to the metadata spreadsheet. To keep track of changes, the new version of the spreadsheet is named "01_" plus the name of the original spreadsheet.

input:

logs/logOfTaxonomyTermsAdded.csv
levy-api/items-matched/
metadata spreadsheet (named finalmetadata.csv in this example)

script: addTaxonomyIdentifiersToSpreadsheet.py
output:

completed_field_instrumentation_metadata_id.csv
completed_field_publisher_id.csv
01_finalmetadata.csv
allTaxonomyIdentifiersByFileIdentifiers.csv
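
As an illustration of the lookup described above (not the actual addTaxonomyIdentifiersToSpreadsheet.py), the sketch below maps term names to UUIDs and writes them into a new column. The column names `name`, `id`, `subjects`, and `subjects_id`, and the `|` separator for multi-valued cells, are assumptions made for the example.

```python
import csv
import io
from pathlib import Path


def load_identifier_lookup(log_text: str) -> dict:
    """Map each term name to its UUID, using assumed 'name' and 'id' columns."""
    reader = csv.DictReader(io.StringIO(log_text))
    return {row["name"]: row["id"] for row in reader}


def add_identifiers(rows, lookup, name_column, id_column, separator="|"):
    """Return copies of the rows with a new column of matching UUIDs."""
    updated = []
    for row in rows:
        names = [n.strip() for n in row.get(name_column, "").split(separator) if n.strip()]
        # Names without a known identifier are skipped rather than guessed.
        ids = [lookup[n] for n in names if n in lookup]
        updated.append({**row, id_column: separator.join(ids)})
    return updated


def versioned_name(original_name: str, step: int) -> str:
    """Prefix the original spreadsheet name with a two-digit step number."""
    path = Path(original_name)
    return str(path.with_name(f"{step:02d}_{path.name}"))
```

Calling `versioned_name("finalmetadata.csv", 1)` gives `01_finalmetadata.csv`, matching the naming convention used throughout this page.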

Step 2: Post levy_collection_name nodes and paragraphs

1. Post new levy_collection_name nodes and collection_name paragraphs, and record identifiers.

Using information from levy_collection_namesToCreate.csv, this script first posts new levy_collection_name nodes to Drupal and records their identifiers in logOfLevyCollectionNamesAdded.csv. It then gathers all levy_collection_name identifiers and builds a spreadsheet of the paragraphs, each combining a name identifier and a relator identifier, that need to be created for the metadata items. Finally, it posts these paragraphs and records their identifiers in logofLevyParagraphCollectionName.csv.

input: termsToCreate/levy_collection_namesToCreate.csv and matched_CollectionNames.csv
script: postNodeAndParagraph_collection_names.py
output:

logs/logOfLevyCollectionNamesAdded.csv
paragraph_levy_collection_namesToAdd.csv
logs/logofLevyParagraphCollectionName.csv

2. Add levy_collection_name identifiers to spreadsheet of new items.

In Drupal, levy_collection_name paragraphs are linked to levy_collection_items by their Drupal identifiers (UUIDs), so the relevant levy_collection_name paragraph identifiers must be added to our metadata spreadsheet. This script gathers the identifiers from our log and adds them to the metadata spreadsheet. To keep track of changes, the new version of the spreadsheet is named "02_" plus the name of the original spreadsheet.

input: logs/logofLevyParagraphCollectionName.csv and 01_finalmetadata.csv
script: addParagraphIdentifiersToSpreadsheet.py
output: 02_finalmetadata.csv

Step 3: Post files

1. Post new image and PDF files to the Drupal site, create the associated collection_item_images paragraphs, and record identifiers.

This script posts PDF and image files to Drupal. For images, it also creates a collection_item_image paragraph. Collection_item_image paragraph identifiers and PDF file identifiers are recorded in logOfImagesAndPDFs.csv.

input: File spreadsheet (named allFiles.csv in this example)
script: postFilesAndParagraph_collection_item_images.py
output: logs/logOfImagesAndPDFs.csv

⚠️ Note: Files (e.g., PDFs) that are not referenced by an entity are deleted after 6 hours. This setting can be accessed at admin/config/media/file-system. In other words, don't wait too long between this step and posting new levy_collection_items.
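
To make the 6-hour window concrete, here is a small hypothetical helper (not part of the repository) that reports how much time is left before an unreferenced file becomes eligible for deletion. It assumes you know when the file was posted, for example from a timestamp you record yourself, and that the site still uses the 6-hour setting mentioned above.

```python
from datetime import datetime, timedelta, timezone

# Assumed to match the setting at admin/config/media/file-system.
ORPHAN_WINDOW = timedelta(hours=6)


def time_remaining(posted_at: datetime, now: datetime = None) -> timedelta:
    """Time left before an unreferenced file may be deleted."""
    now = now or datetime.now(timezone.utc)
    return ORPHAN_WINDOW - (now - posted_at)


def is_at_risk(posted_at: datetime, now: datetime = None,
               safety_margin: timedelta = timedelta(minutes=30)) -> bool:
    """True when the remaining time is at or below the safety margin."""
    return time_remaining(posted_at, now) <= safety_margin
```

A check like this could run before Step 4 to flag files that should be re-posted first.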

2. Add PDF file identifiers and collection_item_images paragraph identifiers to spreadsheet of new items.

In Drupal, files and collection_item_images are linked to levy_collection_items by their Drupal identifiers (UUIDs), so the relevant identifiers must be added to our metadata spreadsheet. This script gathers the identifiers from our log and adds them to the metadata spreadsheet. To keep track of changes, the new version of the spreadsheet is named "03_" plus the name of the original spreadsheet.

input: logs/logOfImagesAndPDFs.csv and 02_finalmetadata.csv
script: addFileInfoToSpreadsheet.py
output: 03_finalmetadata.csv

Step 4: Post new levy_collection_items

1. Post levy_collection_item.

This script first posts the levy_collection_item (line 145), recording its drupal_internal__nid. Next, it links the levy_collection_item to its related paragraphs. This requires updating fields in the paragraphs with information from the item, then recording the updated paragraph information in the item.

To do this, the script first updates the parent_id field in the related collection_name paragraphs with the drupal_internal__nid of the levy_collection_item, linking each paragraph to the item (line 178). It records the new drupal_internal__revision_id from each paragraph.

Next, the script updates the parent_id field in the related collection_item_image paragraphs with the drupal_internal__nid of the levy_collection_item, linking each paragraph to the item (line 213).

Finally, the script returns to the levy_collection_item and updates the following fields within the item (line 242): 1) field_people, with the type, id, and target_revision_id from the collection_name paragraphs, and 2) field_images, with the type, id, and target_revision_id from the collection_item_image paragraphs.

Results from the 4 post/patch requests for each item are captured in logofLevyCollectionItems.csv.

input: 03_finalmetadata.csv
script: postNode_levy_collection_item.py
output: logofLevyCollectionItems.csv
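
The request bodies behind the linking steps described above can be sketched as follows. This is an illustration under assumptions, not the code in postNode_levy_collection_item.py: it assumes Drupal's JSON:API resource naming (`paragraph--<bundle>`, `node--levy_collection_item`) and the JSON:API convention of carrying `target_revision_id` in a `meta` object for paragraph references. The function names and bundle arguments are hypothetical.

```python
def paragraph_parent_patch(bundle: str, paragraph_uuid: str, nid: int) -> dict:
    """PATCH body that sets a paragraph's parent_id to the item's nid."""
    return {
        "data": {
            "type": f"paragraph--{bundle}",
            "id": paragraph_uuid,
            # parent_id is stored as a string, so the nid is converted.
            "attributes": {"parent_id": str(nid)},
        }
    }


def paragraph_reference(bundle: str, paragraph_uuid: str, revision_id: int) -> dict:
    """Relationship entry carrying type, id, and target_revision_id."""
    return {
        "type": f"paragraph--{bundle}",
        "id": paragraph_uuid,
        "meta": {"target_revision_id": revision_id},
    }


def item_relationship_patch(item_uuid: str, people: list, images: list) -> dict:
    """PATCH body that writes field_people and field_images on the item."""
    return {
        "data": {
            "type": "node--levy_collection_item",
            "id": item_uuid,
            "relationships": {
                "field_people": {"data": people},
                "field_images": {"data": images},
            },
        }
    }
```

The revision identifier passed to `paragraph_reference` should be the one returned after the paragraph's parent_id was patched, since that update creates the revision the item needs to point to.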