Minutes_Standards_2020 04 - airr-community/airr-standards GitHub Wiki
Standards Call 2020-04
Minutes
Meta
Date: Mon, 2020-08-24 18:00 UTC
Topics
You will need to renew your AIRR-C Antibody Society membership. Make
sure to select "AIRR-C member" during renewal so the money goes to
the right place.
The v1.3.1 release is ready to go once the remaining minor issues are
resolved (readme edits, typo, doc config). Will aim to finish this
release before the next call.
The clone API entry point is already merged in. Should be okay for
v1.3.1 as it doesn't change the schema, only the API, and should be
backwards compatible.
Robust discussion surrounding Repertoire, RepertoireSet, and
DataProcessing. Further follow-up on GitHub and in subsequent calls.
No conclusions, but some major points were:
There seemed to be general agreement that DataProcessing should be
associated with an output object, should include fields to define
the input object (files or object identifiers), and that it should
define the steps from input to output.
Assigning a DataProcessing object to each output unit will cause a
data explosion if we also require one DataProcessing per input. Do
we need to address this and, if so, how?
How do we define identifiers to link input objects to files,
particularly "raw" FASTQ files? Are the SequencingRun and
RawSequenceData objects sufficient?
Should we require or suggest the use of a workflow language to
define the steps in DataProcessing, e.g., CWL, WDL, or Snakemake?
This may be too much of a compliance burden.
If we strictly define Repertoire as a discrete biological unit,
then RepertoireSet could be aggregates and/or subsets of those
basic units.
Discussed whether Repertoire should be 1:1 with Rearrangement
tables and DataProcessing.
Discussed whether Sample should be a single biological unit (e.g.,
subject, tissue and time point).
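To make the DataProcessing points above concrete, here is a minimal Python sketch of a record that is tied to one output object, lists its inputs, and names the steps from input to output. It also counts records under the two options raised in the "data explosion" question. All class, field, file, and step names here are hypothetical and chosen for illustration; they are not AIRR schema fields, and no conclusion was reached on the call.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProcessingSketch:
    """Hypothetical record; field names are illustrative, not AIRR schema fields."""
    output_id: str        # the output object this record is associated with
    input_ids: List[str]  # input files or object identifiers
    steps: List[str] = field(default_factory=list)  # ordered steps, input to output


def records_per_output(outputs: Dict[str, List[str]]) -> List[DataProcessingSketch]:
    """One record per output; all of that output's inputs are listed inside it."""
    return [
        DataProcessingSketch(out, list(inputs), ["align", "annotate"])
        for out, inputs in outputs.items()
    ]


def records_per_input(outputs: Dict[str, List[str]]) -> List[DataProcessingSketch]:
    """One record per (input, output) pair, the case behind the explosion concern."""
    return [
        DataProcessingSketch(out, [inp], ["align", "annotate"])
        for out, inputs in outputs.items()
        for inp in inputs
    ]


# Made-up example: two outputs, each derived from three input files.
example_outputs = {
    "rearrangements_A": ["run1.fastq", "run2.fastq", "run3.fastq"],
    "rearrangements_B": ["run4.fastq", "run5.fastq", "run6.fastq"],
}

per_output = records_per_output(example_outputs)
per_input = records_per_input(example_outputs)
print(len(per_output), len(per_input))
```

In this toy case the per-output layout needs one record per output, while the per-input layout needs one record per input file of every output, so the record count grows with the number of inputs.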