Present: Ahmad, Brian, Christian, Florian, Francisco, John, Sri
Decisions
Inclusion of species information for cell and locus annotations
(#137) was approved.
The fields cell_species and locus_species will be added to the
schema via #260.
The principle of "layering" (i.e., specialized keys deeper down in
the schema hierarchy can override more general definitions of the
same feature made further up) will be added to the docs.
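As an illustration of the layering principle, here is a minimal Python sketch. The helper effective_value is hypothetical, and the fallback order (organism, then cell_species, then locus_species) is an assumption made for the example, not something decided on the call.

```python
from typing import Any, Mapping, Optional, Sequence


def effective_value(
    record: Mapping[str, Any],
    keys_general_to_specific: Sequence[str],
) -> Optional[Any]:
    """Return the value of the most specialized key that is set.

    Keys are listed from most general to most specialized; a later
    (deeper) key overrides an earlier (more general) one.
    """
    result = None
    for key in keys_general_to_specific:
        value = record.get(key)
        if value is not None:
            result = value  # deeper definition overrides the general one
    return result


# Hypothetical record: the general annotation plus one override.
record = {
    "organism": "Homo sapiens",
    "cell_species": "Mus musculus",
    "locus_species": None,  # not set at this level
}

# Assumed fallback chains, from general to specialized.
cell = effective_value(record, ["organism", "cell_species"])
locus = effective_value(record, ["organism", "cell_species", "locus_species"])
```

In this sketch both cell and locus resolve to the cell-level value, because locus_species is unset and the lookup falls back to the next more general key.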
Follow-up
Rename "Organism" to "Species":
The designation for this field is formally incorrect as an
organism is an individual of a species, not the species itself.
However, it is the latter one that we are aiming to annotate in
this field. This could lead to confusion when using the term as a
suffix (e.g., #137). The current term is derived from the
INSDC Feature Table, which uses rather creative semantics to
make it fit.
There is a consensus that the MiAIRR name should be changed. As
this breaks compatibility, it will be slated for inclusion in
AIRRv2. Whether the key organism will be changed at the same
time is up for discussion with DataRep (#266).
Human Population Genetics XT: We will get feedback from the GLDB WG
in their November call. The proposal will be put to a vote in the
MiniStd Call 2019-11; please comment on GitHub (#264, #265).
New topics
Documentation of effective changes to the standard/schema: While
nearly all changes made to the standard over the last two years are
documented in some GitHub ticket, there is no comprehensive log that
summarizes all changes. DataRep will discuss how they think this
can be done without creating too much overhead.
Deprecation vs. renaming of fields: While we have a procedure to
document deprecation of fields (#248), it is unclear how to
document renaming, in particular how to record what the new
field name is.
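One conceivable way to record renamings is a simple mapping from old to new field names, sketched below in Python. This is only an illustration under assumptions: the table RENAMED_FIELDS, the helper current_name, the placeholder field names, and the version labels are all hypothetical and not part of any AIRR schema or agreed procedure.

```python
from typing import Dict, Tuple

# Hypothetical rename log: old name -> (new name, version of the change).
# The field names and version labels are placeholders.
RENAMED_FIELDS: Dict[str, Tuple[str, str]] = {
    "field_a": ("field_b", "v1.1"),
    "field_b": ("field_c", "v2.0"),
}


def current_name(
    field: str,
    renames: Dict[str, Tuple[str, str]] = RENAMED_FIELDS,
) -> str:
    """Follow the rename log until a name with no further rename is found."""
    seen = set()
    while field in renames and field not in seen:
        seen.add(field)  # guard against accidental cycles in the log
        field = renames[field][0]
    return field


resolved = current_name("field_a")
unchanged = current_name("field_x")
```

Here field_a resolves through two renames to field_c, while a name that was never renamed is returned as-is.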
Relation between MiAIRR Set 6 and DataRep Rearrangement:
This has been an area of overlapping responsibility for some time.
Although it has not been an issue until now that two WGs basically
define similar items, it is probably time to get this sorted out.
As DataRep is the larger stakeholder, the proposal is that they
set the standard definition for rearrangement data (e.g., as
published in [Vander Heiden, 2018]). The MiAIRR "Set 6" would
then be described as the subset of rearrangement fields from the
DataRep standard that are recommended as the minimal information
that one should store in INSDC repositories. For these fields,
MiniStd will provide a mechanism for mapping the data to the
INSDC Feature Table.
Essentially, DataRep becomes the owner/definer of rearrangement
data fields. MiniStd would no longer define these fields, but it
would identify a subset of the rearrangement fields defined in the
DataRep standard that it considers minimal via the inclusion of
these fields into MiAIRR Set 6. In addition, implementations of
MiAIRR would provide a mechanism/procedure for mapping those
minimal rearrangement fields to the INSDC repositories.
DataRep will discuss this on Monday.
John: Are there current statistics on how many AIRR data sets are
available via SRA/GenBank/TLS? No; Christian will collect these
numbers for the next call.