Minutes_MiniStd_2019 11 - airr-community/airr-standards GitHub Wiki

MiniStd Call 2019-11

Agenda

Follow-up

  • Current NCBI submission stats
  • Rename MiAIRR field "Organism" to "Species" (#266)
  • Inclusion of species information for cell and locus annotations (#260)
  • Human Population Genetics XT (#264, #265)
  • DataRep Discussion (#248):
    • How to document changes to the standard in a transparent fashion?
    • How to document renaming (instead of deprecation) of fields?
  • Relationship between MiAIRR Set 6 and DataRep rearrangement object
  • Adding gene and gene family to DataRep spec but not MiAIRR (#258)
  • Talking about a Spec definition for cell (#211)

Minutes

Meta

  • Date: Fri, 2019-12-13 14:30 UTC
  • Present: Brian, Christian, Corey, Francisco, Florian, Marcos, Sri

Decisions

  • Renaming the "Organism" field to "Species" was approved. After renewed discussion, the renaming was put to a vote and accepted. It will be performed in the upcoming v2 release of MiAIRR and the AIRR schema (see #266).
  • Relation between MiAIRR Set 6 and DataRep rearrangement. See minutes of MiniStd Call 2019-11 for a summary. DataRep is fine with the suggested procedure (DataRep governs the fields, MiniStd simply declares whether a field is "minimal" in terms of reporting). We approved this as the new mode of operation. It will be included in the documentation by the v2 release, although it is formally independent of that release.

Follow-up

  • Human Population Genetics XT: Due to time restrictions, this has not yet been brought up in a GLDB call. Therefore, comments on the respective tickets (#264, #265) were requested via the GLDB mailing list, to be submitted by our next call.
  • Making renaming of fields trackable: Renaming (not only deprecation) is now included in #248 and defined as a To-Do for v2.0 (#305).
  • Addition of further gene call fields to rearrangement (#258): This is a bigger discussion involving ComRepo, DataRep and GLDB. However, as it does not affect the existence of the [vdj]_call fields, which we require for Set 6, it is not a MiniStd topic.
  • Inclusion of species information for cell and locus annotation: As discussed during the MiniStd Call 2019-09 and decided in MiniStd Call 2019-10, we want to introduce fields that provide species information for the cell_* and locus fields to address issue #137. The respective changes were introduced in PR #260. However, it turned out to be problematic to add ontology-controlled fields to the rearrangement object (#278), which is where locus resides. Therefore, only cell_species was added to the schema, while locus_species has been reverted (via #281). Will follow up with DataRep and ComRepo on potential solutions.
  • Current NCBI submission stats: Pulled from NCBI based on the "AIRR" keyword (note that not all submitted studies include this). Results in table191201 are queried via https://www.ncbi.nlm.nih.gov/nuccore/?term=AIRR%5BKeyword%5D and show TLS record counts aggregated by BioProject ID:
    BioProject     Records
    PRJNA545339    12
    PRJNA336331    1
    PRJNA488042    20
    PRJNA520929    62
    PRJNA338795    93
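The keyword query and per-BioProject aggregation described above could be scripted roughly as follows. This is only a sketch, not the procedure actually used for the table: it assumes the NCBI E-utilities `esearch` endpoint as a programmatic equivalent of the web query, and the function names (`build_airr_keyword_query`, `count_by_bioproject`) and the shape of the input records (accession, BioProject ID pairs) are hypothetical. Fetching and parsing the records is deliberately left out.

```python
from collections import Counter
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed here as the programmatic
# counterpart of the nuccore web query quoted in the minutes).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_airr_keyword_query(retmax=100):
    """Build an esearch URL for nuccore records carrying the "AIRR" keyword."""
    params = {
        "db": "nuccore",
        "term": "AIRR[Keyword]",  # same search term as in the web URL
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def count_by_bioproject(records):
    """Count records per BioProject ID.

    `records` is an iterable of (accession, bioproject_id) pairs; how these
    pairs are extracted from the NCBI records is not covered by this sketch.
    """
    return Counter(bioproject for _accession, bioproject in records)


if __name__ == "__main__":
    print(build_airr_keyword_query())
    # Placeholder pairs for illustration only, not real submissions.
    example = [("ACC_1", "PROJECT_A"), ("ACC_2", "PROJECT_A"), ("ACC_3", "PROJECT_B")]
    for project, n in count_by_bioproject(example).most_common():
        print(project, n)
```

As noted in the minutes, counts obtained this way only cover submissions that actually carry the "AIRR" keyword.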

New topics

  • Define cell and receptor objects: The ongoing work to create API endpoints to access single-cell data (#211) has sparked some discussion about the cell and receptor entities and their respective (potential) IDs cell_id and pair_id (see lengthy discussion in #273). We agree that it would be important to include a representation of these objects in the schema and adapt the API endpoints accordingly. Will follow up in MiniStd Call 2020-01.