Product Licensing and Release - Imageomics/Image-Datapalooza-2023 GitHub Wiki

To meet the goals and motivations of this event, and to comply with the requirements and the Data Management Plan (DMP) associated with the NSF grants supporting this event, all data, model, and code products generated or augmented at and for Image Datapalooza must adhere to FAIR principles. Datasets that include Indigenous data should also adhere to CARE principles.

This means the following policy holds for all digital products developed or created at the event:

All digital products (code, data, models, documentation, tutorials, etc) developed at the event are to be released such that it is accessible to the public, at the latest at the conclusion of the event, under an Open Content license or terms of use.
Code is to be released under an OSI-approved open source license, or to the public domain (for example, by applying a CC-Zero waiver).
- Scripts can simply be added to this GitHub repository. For more complex codebases, we recommend using a version control repository on GitHub, Gitlab, etc.
Data, documents, tutorials, etc are to be released either to the public domain (for example, by applying a CC-Zero waiver), or under terms no more restrictive than requiring attribution (such as CC-BY).
- For image and video datasets, this only applies to items that are not already licensed by (and thus used under license from) a third party.
- For datasets that include Indigenous data, see Carroll et al (2020) and Carroll et al (2021) for reconciling FAIR and CARE principles for scientific data.
- Datasets collected in whole or in part from regions that harbor Indigenous researchers are to at least adhere to the Collective Benefit principle (the C in CARE), even if they have been expressly released from or are otherwise entirely unencumbered by Indigenous rights. Specifically, at a minimum they are to be made available to respective Indigenous researchers with the least obstacles possible.
- For ML-ready datasets, for storage, version control, and sharing we recommend using Hugging Face Dataset Hub, which provides for rich metadata description in the form of a Dataset Card.
ML models are to be released under an OSI-approved open source license, or a Responsible AI License (RAIL) (in particular Open RAIL-M), or to the public domain (for example, by applying a CC-Zero waiver).
- For further guidance, consider the chapter on Machine Learning Model Licenses from The Turing Way.
- For storage, version control, sharing, and publishing (including DOI provision), we recommend using Hugging Face Model Hub, which provides for rich metadata description in the form or a Model Card. (See Imageomics models published there as examples.)