2025 Meeting Minutes - uwlib-cams/MARC2RDA GitHub Wiki
January 15, 2025
See time zone conversion
Meeting norms
Present:
Absent: Cypress
Time:
Notes:
Water Cooler/Agenda Review/Roles for Meeting (5)
Updates (10)
Linking fields ()
Wrap-up (5)
Action items
Backburner
- WEM Access points, RIMMF Demo
January 8, 2025
See time zone conversion
Meeting norms
Present:
Absent:
Notes: Tynan
Water Cooler/Agenda Review/Roles for Meeting (5)
Updates (10)
- Ying-Hsiang is cycling off the project and will arrange handoffs soon. Thank you for your incredible contributions, Ying-Hsiang!
- Doreen is now working primarily on Fridays and in the mornings the rest of the week, at 15 hours per week rather than 19.5 this quarter
- We now have Cypress full time (not all on M2R, but more than before)
- Crystal will miss the last meeting in January
- We had a long follow-up discussion regarding 773 (I'm sorry, I lost the notes in a conflicting edit and will go back to the recording to augment them), but we decided to discuss it next week, so we can dive in further then
- Crystal heard back from Theo at LOC: the catalog is not free unless you use an outdated version, and there is not a lot of RDA in it. You can get 10,000 records at a time through the catalog, and by repeating this we can get as much as we want, although they block bots from doing this and the system will slow you down if you try to automate it, so it has to be done manually or by a slow program. They also have a way to purchase the catalog, but it is very expensive (e.g. $25,000!). Theo recommends downloading 10,000 records at a time and compiling a dataset; 100K should be enough
- Deborah: download bulk from 2019 and use the 10k at a time approach for the rest; we would need to de-duplicate the records
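The de-duplication step Deborah mentions could be sketched roughly as below. This is only an illustration under stated assumptions: records are represented as plain Python dicts, the control number in field 001 is treated as the identifier to compare on, and the function name `dedupe_records` and the sample data are hypothetical. Whether 001 alone is a sufficient match key for the real data would still need to be decided.

```python
from typing import Iterable, Iterator


def dedupe_records(batches: Iterable[Iterable[dict]], key: str = "001") -> Iterator[dict]:
    """Yield each record once, keeping the first copy seen for an identifier.

    Assumption: each record is a dict and `key` (here the MARC 001 control
    number) identifies the same record across the bulk file and later batches.
    """
    seen: set[str] = set()
    for batch in batches:
        for record in batch:
            identifier = record.get(key)
            if identifier is None:
                # Nothing to compare on; keep the record rather than guess.
                yield record
                continue
            normalized = identifier.strip()
            if normalized in seen:
                continue  # already kept an earlier copy
            seen.add(normalized)
            yield record


# Hypothetical sample data: one bulk file plus one later 10k-style batch.
bulk_2019 = [
    {"001": "12345", "245": "Title A"},
    {"001": "67890", "245": "Title B"},
]
batch_1 = [
    {"001": "67890 ", "245": "Title B (later copy)"},
    {"001": "24680", "245": "Title C"},
]

merged = list(dedupe_records([bulk_2019, batch_1]))
```

Because the bulk file is passed first, its copy of a record wins over a later batch copy; reversing the order of the batches would prefer the newer copies instead.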
Project Plan Review and Update
Project Overview
- Problem statement: adding a need to mention differing and non-interoperable ontologies
- Goals:
- Deborah: one of the things in the impact should be a description of the entities and their relationships -- this is the main new thing in RDA
- Sofia: move from record-based cataloging to entity-based cataloging
- Impact discussion
- How much is a large pool? The available bulk download is from 2019; we can download records 10k at a time
- Laura: we can talk to Jeff at OCLC; where would we host the records -- National Library of Greece Wiki?
- This would give us a better picture to show people than just using LC's records; we could also discover things about the transformation
- Decision: add this to a discussion for next week
- Sofia: the Wikibase database has size limits; she is asking how to make the storage bigger
- Are we reducing dependency on vendor systems?
- Laura: in order to demonstrate this reduced dependency, we have to use it in a system that is not a vendor system and provide library services off of it
- Rephrase to reinforce commitment to open scholarship
- Laura: main impact is to demonstrate that RDA can be implemented using RDF directly; there is a path for adopting it for libraries that have a large legacy store of MARC data
- Ebe: if someone doesn't want to use RDF, but wants to use something else -- should we be specific about the type of encoding?
- Decision: we don't want to promise that we can help people encode another way
- Phase I
- Java extension is not in phase I anymore
- Instead, for phase I we have moved on to having pre-approved IRI sources
- Ying-Hsiang, send documentation to Cypress and Tynan for scripts to feed Bibliographic into Wikibase Cloud
- Post-Phase I close-out
- We may not need to justify phase II; UW Libraries has approved it
- We can think about grant applications to support phase II
- We may also consider submitting to additional conferences
- A composition that describes in a granular way what we did for Phase I, why we did it, what the results were; goal to get this published somewhere
- Deborah's project plan is a good outline for this
- We may want to have an open-source version of this to make information more accessible
- Phase II
- Collection records
- What will we do with collections? We are pulling them out of phase I; what does RDA need for collections?
- Item-level mappings -- not part of phase I, will be part of phase II
- CSR
- You can have diachronic works that fall into a BSR (multipart monographs/series)
- Removing machine-readable mapping -- we don't have the capacity for that right now
- BSR
- Guidelines for pre- and post-processing -- part of our documentation in phase II; we have Python scripts to serialize
- Timeline
- Close-out is June-August of 2025
- Start Phase II in August
- How much time do we need for review and re-coding? We need to extend the deadline for ending phase I to April 30th
- Mapping done by Feb 28, 2025
- Mapping review by Mar 31, 2025
- Transform code by April 30, 2025
- Output review by May 30, 2025
- This means starting phase II in September
- Deliverables