2019.09.11 Community Meeting - OCFL/spec GitHub Wiki

Call-in Details

  • 4pm BST / 11am EDT / 8am PDT
  • https://lyrasis.zoom.us/my/vivo1
  • or Telephone:
    • US: +1 669 900 6833 or +1 646 876 9923
    • Canada: +1 647 558 0588
    • Australia: +61 (0) 2 8015 2088
    • United Kingdom: +44 (0) 20 3695 0088
    • Meeting ID: 812 835 3771
    • International numbers available: https://zoom.us/u/MO73B

Attendees

  1. Rosalyn Metz
  2. Ben Pennell
  3. Ben Cail
  4. Bethany Seegar
  5. Jared Whiklo
  6. Neil Jefferies
  7. Peter Winckles
  8. DANS
  9. Simeon Warner
  10. Jonathan Rochkind
  11. Aaron Birkland

Agenda

  1. Community updates (introductions, updates, implementations, plans, etc)
  2. Feedback on Beta Release: Issue #367 -- note that below are a number of issues that have been brought up in the course of the conversation in #367. The goal of this topic is to better understand those issues and identify actionable work for the editors to clarify for the community. This may result in breaking out Issue #367 into multiple tickets.
    • Understanding the base use case for OCFL. The OCFL is a specification for digital preservation. See Completeness. Why are we storing what we store? Is this clear to the OCFL Community?
    • Optimizing the inventory.json file. Right now the inventory.json output is large. See @pwinckles's performance tests. Can the editors do something to optimize it? Discussion includes switching to unformatted (not pretty-printed) JSON and to SHA256.
      • SHA512 v. SHA256. Choosing one over the other seems to be a trade-off between performance and storage: SHA256 digests are shorter, but SHA256 is often slower than SHA512 on 64-bit hardware. This trade-off was considered as part of the specification. Are there concerns about this decision? For reference: Stop Using SHA-256
    • Optimizing the overall object size. How could we do that?
    • Clarifying the inventory.json file. Some developers struggled with understanding why there is the need for the inventory.json file in each object version and also at the object root. How can the editors clarify the purpose of each inventory.json file in either the spec or the implementation notes? See @birkland 's comment on #367 for a quick synopsis of the confusion.
    • Understanding workflows. There seem to be concerns around workflow systems. Can these be clarified further? Is there something actionable the Editors can take forward and clarify?
  3. Other feedback on Beta Release?
  4. Next meeting: Wednesday, Oct 9th @11am ET
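
To make the inventory.json size question in agenda item 2 concrete, here is a minimal Python sketch that compares serialized sizes for pretty-printed versus compact JSON and for SHA256 versus SHA512 digests. The `build_inventory` helper and its two-key structure are hypothetical and only inventory-like; this is not a valid OCFL inventory and does not reproduce @pwinckles's performance tests.

```python
import hashlib
import json


def build_inventory(algorithm: str, n_files: int) -> dict:
    """Build a toy, inventory-like dict (hypothetical shape, not a valid OCFL inventory)."""
    manifest = {}
    for i in range(n_files):
        content = f"file-{i}".encode("utf-8")
        digest = hashlib.new(algorithm, content).hexdigest()
        # Map each digest to the (made-up) content path it identifies.
        manifest[digest] = [f"v1/content/file-{i}.txt"]
    return {"digestAlgorithm": algorithm, "manifest": manifest}


def serialized_size(inventory: dict, pretty: bool) -> int:
    """Return the size in bytes of the JSON serialization."""
    if pretty:
        text = json.dumps(inventory, indent=2)
    else:
        # Compact separators drop the spaces json.dumps adds by default.
        text = json.dumps(inventory, separators=(",", ":"))
    return len(text.encode("utf-8"))


sizes = {
    (alg, pretty): serialized_size(build_inventory(alg, 1000), pretty)
    for alg in ("sha256", "sha512")
    for pretty in (True, False)
}

for (alg, pretty), size in sorted(sizes.items()):
    label = "pretty" if pretty else "compact"
    print(f"{alg:7s} {label:8s} {size} bytes")
```

The sketch only measures serialized size; it says nothing about hashing speed, which depends on hardware and would need separate benchmarking.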

Notes

Audio recording

  • How does the community get to an official response in tickets? Is there something that Editors can do to make it more clear? Perhaps a review of tickets in Milestone 1.0 and 2.0 in the Community calls so that folks better understand what it is that we are working on.
  • The base use case is how to facilitate digital preservation. Questions about how Fedora works now versus how it worked before.
    • Discussion around potentially creating scenarios that allow you to NOT version. Calling out Ben C. and Neil J.
  • S3 conversation to happen at the OCFL editors meeting in October. Editors should start an S3 ticket so that we can capture a bigger conversation and the needs for S3.
  • Optimizing the inventory - is there something that Editors can do to make it better? Should we revisit the inventory.json as Editors? Perhaps revisiting the hash algorithms would be a good start.
  • Clarify why Editors have included the inventory.json file in each version of an object.
  • Workflows. When does it make sense to squash a version? This may be implementation specific, and Editors think we need more information about implementations so that we can really address this. Perhaps that meeting with Fedora Committers will help draw that out.
  • Optimizing the overall object size. If you have a really big file, can you really create a version? That seems difficult to do. This may also be tied to the understanding of remote versions which will hopefully be in scope for version 2 of the spec.
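
As one way to illustrate the note above about inventory.json appearing in each version, the following Python sketch checks whether the copy at the object root matches the copy in the highest-numbered version directory. It assumes a layout with `inventory.json` at the object root and in version directories named `v1`, `v2`, and so on, and it assumes the root copy is meant to mirror the latest version's copy. The function names are hypothetical and this is not a validator.

```python
import hashlib
import re
from pathlib import Path
from typing import Optional


def file_digest(path: Path, algorithm: str = "sha512") -> str:
    """Hash a file in chunks so large files are not read into memory at once."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def latest_version_dir(object_root: Path) -> Optional[Path]:
    """Return the highest-numbered directory named like v1, v2, ... (assumed layout)."""
    versions = [
        p for p in object_root.iterdir()
        if p.is_dir() and re.fullmatch(r"v\d+", p.name)
    ]
    return max(versions, key=lambda p: int(p.name[1:]), default=None)


def root_matches_latest(object_root: Path) -> bool:
    """True if the root inventory.json is byte-identical to the latest version's copy."""
    latest = latest_version_dir(object_root)
    if latest is None:
        return False
    root_inventory = object_root / "inventory.json"
    version_inventory = latest / "inventory.json"
    if not (root_inventory.is_file() and version_inventory.is_file()):
        return False
    return file_digest(root_inventory) == file_digest(version_inventory)
```

Under these assumptions, the root copy gives a single place to read the current state of the object, while the per-version copies record the state as of each version.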

Action Items

Previous Action Items

  • Neil to contact NDSA group regarding openness of preservation storage in the "Levels of Preservation"
  • Andrew Woods to create Bare RFC Repo after checking about process w/ editors
  • Aaron Birkland will create an issue in the spec repo for the RFC process
  • Peter W. will submit a pull request with the proposed change for blake2b-512.
  • Andrew Woods needs to update the invite to include the correct Zoom information.
  • Rosalyn Metz and Andrew Woods to facilitate a call with Fedora Committers.