2025 Meeting Minutes - uwlib-cams/MARC2RDA GitHub Wiki

January 15, 2025

See time zone conversion
Meeting norms
Present:
Absent: Cypress
Time:
Notes:

Water Cooler/Agenda Review/Roles for Meeting (5)

Updates (10)

Linking fields ()

Wrap-up (5)

Action items

Backburner

  • WEM Access points, RIMMF Demo

January 8, 2025

See time zone conversion
Meeting norms
Present:
Absent:
Notes: Tynan

Water Cooler/Agenda Review/Roles for Meeting (5)

Updates (10)

  • Ying-Hsiang cycling off project, arranging handoffs soon. Thank you for your incredible contributions, Ying-Hsiang!
  • Doreen now works primarily on Fridays and in the mornings during the rest of the week, for 15 hours per week rather than 19.5 this quarter
  • We now have Cypress full time (not all on M2R, but more than before)
  • Crystal will miss the last meeting in January
  • We had a long follow-up discussion regarding 773 (I'm sorry, I lost the notes in a conflicting edit, will go back to the recording to augment), but we decided to continue the discussion next week, when we can dive in further
  • Crystal heard back from Theo at LOC: the catalog is not free unless you use an outdated version, and there is not a lot of RDA in it. You can get 10,000 records at a time through the catalog, and by repeating this we can get as much as we want, although they block bots from doing this and the system will slow you down if you try to automate it, so it has to be done manually or by a slow program. They also have a way to purchase the catalog, but it's very expensive (e.g. $25,000!). Theo recommends downloading 10,000 at a time to compile a dataset; 100K should be enough
    • Deborah: download the bulk file from 2019 and use the 10k-at-a-time approach for the rest; we would need to de-duplicate the records
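
The de-duplication step Deborah mentions could look roughly like the sketch below. It is only an illustration under stated assumptions: records are represented as plain dictionaries, and the `control_number` field, the `normalize_key` helper, and the sample values are all hypothetical. Which MARC field would actually serve as the match key was not decided in the meeting.

```python
from typing import Dict, Iterable, List


def normalize_key(raw: str) -> str:
    """Remove whitespace and lowercase so trivially different identifiers match."""
    return "".join(raw.split()).lower()


def dedupe_records(
    sources: Iterable[Iterable[Dict[str, str]]],
    key_field: str = "control_number",  # hypothetical match key
) -> List[Dict[str, str]]:
    """Merge several record sources, keeping the first record seen for each key."""
    seen = set()
    unique = []
    for source in sources:
        for record in source:
            raw = record.get(key_field, "")
            if not raw.strip():
                # No identifier to compare on: keep the record rather than drop it.
                unique.append(record)
                continue
            key = normalize_key(raw)
            if key in seen:
                continue  # duplicate of a record from an earlier source
            seen.add(key)
            unique.append(record)
    return unique


# Made-up sample data standing in for the 2019 bulk file and one 10k batch.
bulk_2019 = [
    {"control_number": "  2018012345", "title": "Example A"},
    {"control_number": "2017000001", "title": "Example B"},
]
batch_1 = [
    {"control_number": "2018012345 ", "title": "Example A (later copy)"},
    {"control_number": "2023000002", "title": "Example C"},
]

merged = dedupe_records([bulk_2019, batch_1])
print(len(merged))
```

Because the first occurrence wins, the order of the sources decides which copy survives; listing the newer batches before the 2019 bulk file would prefer the more recent version of a record.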

Project Plan Review and Update

Project Overview

  • Problem statement: needs to mention differing and non-interoperable ontologies
  • Goals:
    • Deborah: one of the things in the impact should be a description of the entities and their relationships -- this is the main new thing in RDA
    • Sofia: move from record-based cataloging to entity-based cataloging
  • Impact discussion
    • How much is a large pool? The available bulk download is from 2019; we can download records 10k at a time
    • Laura: we can talk to Jeff at OCLC; where would we host the records -- National Library of Greece Wiki?
      • This would give us a better picture to show people than using LC's records alone; we could also discover things about the transformation
      • Decision: add this to a discussion for next week
      • Sofia: the Wikibase database has size limits; she is asking how to increase the storage
    • Are we reducing dependency on vendor systems?
      • Laura: in order to demonstrate this reduced dependency, we have to use it in a system that is not a vendor system and provide library services off of it
      • Rephrase to reinforce commitment to open scholarship
      • Laura: main impact is to demonstrate that RDA can be implemented using RDF directly; there is a path for adopting it for libraries that have a large legacy store of MARC data
      • Ebe: if someone doesn't want to use RDF, but wants to use something else -- should we be specific about the type of encoding?
      • Decision: we don't want to promise that we can help people encode another way
  • Phase I
    • Java extension is not in phase I anymore
      • Instead, for phase I we have moved to using pre-approved IRI sources
    • Ying-Hsiang, send documentation to Cypress and Tynan for scripts to feed Bibliographic into Wikibase Cloud
  • Post-Phase I close-out
    • We may not need to justify phase II; UW Libraries approved
    • We can think about grant applications to support phase II
    • We may also consider submitting to additional conferences
    • A composition that describes in a granular way what we did for Phase I, why we did it, what the results were; goal to get this published somewhere
      • Deborah's project plan is a good outline for this
      • We may want to have an open-source version of this to make information more accessible
  • Phase II
    • Collection records
      • What will we do with collections? We are pulling them out of phase I; what does RDA need for collections?
    • Item-level mappings -- not part of phase I, will be part of phase II
    • CSR
      • You can have diachronic works that fall into a BSR (multipart monos/series)
      • Removing machine-readable mapping -- we don't have the capacity for that right now
    • BSR
    • Guidelines for pre- and post-processing -- part of our documentation in phase II; we have Python scripts to serialize
  • Timeline
    • Close-out is June-August of 2025
    • Start Phase II in August
    • How much time do we need for review and re-coding? We need to extend the deadline for ending phase I to April 30th
      • Mapping done by Feb 28, 2025
      • Mapping review by Mar 31, 2025
      • Transform code by April 30, 2025
      • Output review by May 30, 2025
    • This means starting phase II in September
  • Deliverables

Phase I

Deborah Project Plan (Simplified Incomplete)

Transformation Review (if time)

Wrap-up (5)

Action items

Backburner