Design Principles - SAA-SDT/TS-EAS-subteam-notes GitHub Wiki

Design Principles

Introduction

TS-EAS developed the following design principles in the first half of 2020 in order to reflect recent decisions and to guide future discussions for the development of EAD, EAC-CPF, and any other schemas that might be added to the EAS family in the future.

During the major revision of EAC-CPF (originally released in 2010, and revised slightly in 2018), it became apparent that the differences between EAD3 and EAC-CPF were difficult to keep track of, not only for the maintainers of the standards but also for implementers. Early on, a major goal of the EAC-CPF revision was to align EAC-CPF more closely with EAD3. However, given how different the two schemas are in purpose, scope, and maintenance setup, there was no guidance for deciding how to harmonize the two. The following seventeen principles (ten for the schemas and seven for the corresponding documentation) will help provide that guidance for TS-EAS going forward.

This document will be reviewed and updated, as necessary, based on feedback from the Committee (TS-EAS) and the Community, which includes archivists, software implementers, and anyone who interacts with the EAS schemas directly or the files associated with those schemas.

Schema Principles

1. Simplicity comes first.

  • Simplicity matters because the Committee recognizes the high costs involved in implementing encoding standards. Our primary objective is therefore to minimize the differences between the schemas that are managed by TS-EAS.
  • Define as few elements and attributes as possible to lower costs related to implementation, training, and maintenance.
  • When choosing data types for elements and attributes, the schemas should favor the least restrictive but nevertheless most sensible option.
  • To ensure consistency, the schemas should be developed together rather than separately.
  • Use non-EAS namespaces sparingly.
  • However, use an external namespace when a domain that is significantly outside the scope of archival description needs to be included.

Examples

  • Each element will be defined once, rather than multiple times in multiple places.
  • The element <recordId> currently has the NMTOKEN datatype in EAC-CPF 1.0, while EAD3 allows any text. In the context of reconciling both EAS schemas, EAC-CPF will move to a more open datatype.
  • We opted not to add “xlink” to EAC-CPF and to define a subset of xlink-like attributes instead. The “xlink” attribute namespace was previously removed from EAD3.
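
The <recordId> example can be illustrated with a minimal Python sketch. It is not taken from the schemas: the regular expression is a simplified, ASCII-only approximation of the XML NMTOKEN rule, and the sample identifiers are invented for illustration.

```python
import re

# Simplified, ASCII-only approximation of the XML NMTOKEN lexical rule:
# one or more name characters (letters, digits, '.', '-', '_', ':') and
# no whitespace. The real rule also admits many non-ASCII characters.
NMTOKEN_ASCII = re.compile(r"[A-Za-z0-9._:\-]+")


def is_nmtoken(value: str) -> bool:
    """Return True if value fits the simplified NMTOKEN pattern."""
    return NMTOKEN_ASCII.fullmatch(value) is not None


# Hypothetical record identifiers, invented for this sketch.
samples = ["ABC-123.xml", "US NNC 4078547", "coll/2020/17"]

for sample in samples:
    verdict = "valid NMTOKEN" if is_nmtoken(sample) else "rejected by NMTOKEN"
    print(f"{sample!r}: {verdict}")
```

Identifiers that contain spaces or slashes are rejected by NMTOKEN but would be accepted by a datatype that allows any text, which is the kind of difference the reconciliation removes.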

2. Community needs are tied with (and tied to) the first principle.

  • Although these principles are numbered, both the “first” principle and the “second” principle are of equal importance. In other words, Community needs (also) come first.
  • The Committee will receive and review feature requests from the Community on a continual basis, while also adhering to a predictable release schedule.
  • Each release will use “semantic versioning” numbers to clearly indicate to the Community what types of updates are included.
  • Legacy practices of the Community and past decisions of the Committee should always be considered but should rarely supersede the first principle.

Examples

  • Feature requests can be submitted via GitHub directly or via the webform on the EAS microsite. Furthermore, TS-EAS has agreed on an annual rolling revision cycle to ensure a predictable release schedule.
  • The release of EAD3 version 1.1.1 was a “patch” release that included backwards-compatible bug fixes. For more information about the semantic versioning scheme, see the Semantic Versioning website.
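
As a rough illustration of how semantic version numbers signal the type of update, the following Python sketch compares two MAJOR.MINOR.PATCH strings. The function names are hypothetical, and the predecessor version used in the example (1.1.0) is an assumption made for illustration.

```python
def parse_version(text: str) -> tuple:
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def release_type(previous: str, current: str) -> str:
    """Classify a release as 'major', 'minor', or 'patch'."""
    old, new = parse_version(previous), parse_version(current)
    if new <= old:
        raise ValueError(f"{current} is not newer than {previous}")
    if new[0] != old[0]:
        return "major"  # may include backwards-incompatible changes
    if new[1] != old[1]:
        return "minor"  # backwards-compatible additions
    return "patch"      # backwards-compatible bug fixes


# Assumed predecessor version, used only for this example.
print(release_type("1.1.0", "1.1.1"))  # prints "patch"
```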

3. The schemas exist, first and foremost, to allow the Community to validate and share archival description, which should in turn adhere to archival descriptive standards.

  • The EAS schemas should be able to encode any description written according to archival standards.
  • The schemas are intended to encode archival description and will not necessarily be able to encode other descriptive standards.
  • The schemas should not be made more complex to be able to encode non-archival description.
  • When necessary, the description should point to non-archival descriptions in other formats rather than embed them in EAS.
  • The schemas should support machine indexing as well as presentation of archival description.
  • The schemas should also permit the encoding of administrative metadata about the history of the archival description.

4. Readability matters, especially since our Community is a community of people.

  • To help implementers as well as developers, the names of elements and attributes should be easy to read and to type.
  • Element and attribute names are defined using a consistent naming scheme (e.g. camelCase).
  • Elements and attributes are defined with English-language names.
  • The schemas will not reuse the same name for both an attribute and an element.
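
The two mechanical naming rules above (a consistent scheme and no name shared between an element and an attribute) could be checked with a short script. The sketch below assumes lower camelCase as the naming scheme, which the principle gives only as an example, and uses invented name sets rather than the actual schema vocabulary.

```python
import re

# Lower camelCase: a lowercase start, then optional capitalized segments.
CAMEL_CASE = re.compile(r"[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)*")


def naming_problems(elements: set, attributes: set) -> list:
    """Report names that break the naming scheme or are used twice."""
    problems = []
    for name in sorted(elements | attributes):
        if not CAMEL_CASE.fullmatch(name):
            problems.append(f"not camelCase: {name}")
    for name in sorted(elements & attributes):
        problems.append(f"used as both element and attribute: {name}")
    return problems


# Hypothetical names, invented for illustration.
elements = {"recordId", "language", "Unit_Title"}
attributes = {"language", "localType"}

for problem in naming_problems(elements, attributes):
    print(problem)
```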

5. Since our Community is an international community, the schemas support internationalization.

  • Whenever repetition or internationalization is required, the Committee will define those concepts as elements rather than attributes.
  • The schemas must provide clear mechanisms to define the languages of what is being described (e.g. language of materials, language spoken, etc.) as well as the languages used in that description (e.g. the ability to encode multiple descriptive titles in different languages).
  • Recognizing language and cultural differences, the schemas will be flexible in how they permit the encoding of names, addresses, dates, time zones, typographical representations, language-based annotations, and more.
  • The Committee will review additional recommendations focused on supporting internationalization, for example the W3C’s Best Practices for XML Internationalization.

6. The schemas permit customizations, acknowledging that local requirements exist, without sacrificing interoperability.

  • Allow minor expansions without invalidating the core schemas. This will reduce instances of “tag abuse,” where implementers use tags and attributes in ways that contradict their definitions.
  • Additional XML validation options that cannot be easily defined or maintained in the core schemas will be supplied with Schematron.
  • Encourage implementers to create their own local validation rules that restrict, rather than expand, the core schemas.
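
To show what a local rule that restricts, rather than expands, the core schemas might look like, here is a minimal Python sketch. It is a stand-in for a Schematron rule, not Schematron itself. The identifier pattern, the function name, and the simplified sample document without a namespace are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical local rule: record identifiers must look like "XX-000000".
# The core schema would accept more values; this rule only narrows them.
LOCAL_RECORD_ID = re.compile(r"[A-Z]{2}-\d{6}")


def check_local_rules(xml_text: str) -> list:
    """Return a list of local-rule violations found in the document."""
    root = ET.fromstring(xml_text)
    errors = []
    for node in root.iter("recordId"):
        value = (node.text or "").strip()
        if not LOCAL_RECORD_ID.fullmatch(value):
            errors.append(f"recordId {value!r} does not match the local pattern")
    return errors


# Simplified toy document, not a complete or valid EAS instance.
SAMPLE = """<record>
  <control><recordId>NL-000123</recordId></control>
</record>"""

print(check_local_rules(SAMPLE))  # prints []
```

Because the rule only rejects documents, every document that passes it would still pass the core schema, so interoperability is preserved.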

7. Value lists populated from external sources will be maintained for the current version only.

  • The TS-EAS schemas may include value lists with constraints from external sources to enable validation against those standards.
  • As the external sources evolve, their value lists change. Therefore, the value lists that are populated from external sources will be maintained by TS-EAS only up until the point that a new major version of an EAS schema is released.
  • Updates of value lists populated from external sources in the current schemas will be carried out on a yearly basis as part of the rolling revision cycle for minor revisions.

Examples

  • Value lists populated from external sources currently include codes for country (ISO 3166-1), language (ISO 639-1, 639-2, and 639-3), and script (ISO 15924).
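
Validation against such value lists amounts to a membership test. The Python sketch below uses tiny excerpts of the three code lists. The dictionary keys and the function name are hypothetical, and the real lists are far longer and change as the external standards evolve.

```python
# Small illustrative excerpts only; not the lists shipped with the schemas.
VALUE_LISTS = {
    "country": {"DE", "FR", "US"},        # ISO 3166-1 alpha-2 codes
    "language": {"de", "en", "fr"},       # ISO 639-1 codes
    "script": {"Arab", "Cyrl", "Latn"},   # ISO 15924 codes
}


def is_known_code(kind: str, value: str) -> bool:
    """Return True if value appears in the value list for the given kind."""
    return value in VALUE_LISTS[kind]


print(is_known_code("language", "fr"))   # prints True
print(is_known_code("script", "latin"))  # prints False
```

A code that is added to or withdrawn from the external standard changes the outcome of this check, which is why the lists are refreshed on a yearly basis for the current version only.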

8. The schemas are free to use without restrictions on their use.

  • The schemas are published online.
  • The EAS schemas are currently serialized in RNG and XSD formats, both of which can be utilized with cost-free software.
  • The schemas are released under a CC0 license.

9. The schemas will be revised, not in isolation, but in the context of related standards as well as changing technologies.

  • Additional EAS schema serializations may be explored by the Committee and provided if requested by the Community.
  • Updates to existing as well as emerging related standards may be explored by the Committee with regard to their relationship to and potential impact on EAS.
  • When defining new elements and attributes, the Committee will look at related standards for inspiration and may decide to adopt existing solutions rather than creating their own.

10. Only the current version is supported.

  • The current version is found in the main branch of the schema repositories on GitHub and on the standard’s publication page.
  • When a major revision of a schema is approved, any previous versions of that schema will be put into read-only mode.
  • Schemas that are in read-only mode will no longer be updated, but they will remain available online.
  • In conjunction with the release of a schema after a major revision, the predecessor schema's constraints originating from external sources will be removed.

Documentation Principles

1. Documentation should be published online and made available under the terms of an open license.

  • This includes the tag libraries, best practice guides, and any other usage documentation produced by TS-EAS.

2. Documentation should include encoding examples.

  • Examples should always be practical, providing useful and non-contradictory implementation guidance.
  • Examples should come from real-world implementations when possible.

3. Documentation should be accessible, adhering to established standards and best practices.

  • PDF documentation should adhere to PDF/UA guidelines, for example by generating it with the fop -a option.
  • HTML documentation should adhere to WCAG AA guidelines.

4. Documentation will be published in English as its primary language.

  • Suggested procedures for translating and generating translated documentation will be maintained and provided alongside source documents.

5. Documentation should comprehensively reflect the most current version of the schemas.

  • Release documentation updates alongside schema updates.
  • All elements and attributes should be documented.

6. Documentation should be transparent.

  • Explicitly list content changes made to documentation in appendices.
  • Use version control systems, such as Git, for maintaining documentation and publish changes in a public repository.

7. Documentation should be useful for both technical implementers and metadata creators.

  • Provide clear and specific narrative descriptions of how each element and attribute should be used in archival description.
  • Provide detailed technical expectations, requirements (repeatability, obligation, data types, etc.), and uses for each element and attribute.
  • Provide references to other related SAA and ICA standards when possible.
