Vocab. COSMIC - OHDSI/Vocabulary-v5.0 GitHub Wiki
COSMIC vocabulary
Overview
This vocabulary is based on data from the Catalogue Of Somatic Mutations In Cancer (COSMIC), one of the most comprehensive resources for exploring the impact of somatic mutations in human cancer.
Sources
The COSMIC developers provide the source data in TSV format. Only the Cancer Mutation Census data source was ingested into the OMOP terminologies.
Transformation
The procedures for transforming Concepts from the source to the OMOP Standard Vocabularies can be found on the OHDSI GitHub.
Concept Names
Each concept name concatenates the gene acronym with the amino acid residue alteration and the corresponding change in the coding sequence.
Concept Synonyms
To support composing OMOP Genomic concepts from KOIOS tool results, all relevant HGVS expressions are included as synonyms. The language of the concepts is recorded as Genetic nomenclature.
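Because the HGVS expressions are stored as synonyms, a variant can be looked up by its HGVS string through the CONCEPT_SYNONYM table. The following is a minimal sketch, assuming an in-memory SQLite stand-in for the OMOP tables. The helper `find_cosmic_by_hgvs`, the concept id, the concept code, and both HGVS strings are hypothetical and are not taken from COSMIC.

```python
import sqlite3

# Toy in-memory stand-in for the OMOP CONCEPT and CONCEPT_SYNONYM tables.
# Every id, code, name, and HGVS string below is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id    INTEGER PRIMARY KEY,
    concept_name  TEXT,
    vocabulary_id TEXT,
    concept_code  TEXT
);
CREATE TABLE concept_synonym (
    concept_id           INTEGER,
    concept_synonym_name TEXT
);
INSERT INTO concept VALUES
    (1, 'hypothetical COSMIC variant', 'COSMIC', 'HYPOTHETICAL_ID_1');
INSERT INTO concept_synonym VALUES (1, 'NM_000000.0:c.100A>G');
INSERT INTO concept_synonym VALUES (1, 'NP_000000.0:p.Lys34Glu');
""")


def find_cosmic_by_hgvs(conn, hgvs):
    """Return (concept_id, concept_code) for COSMIC concepts whose synonym equals hgvs."""
    return conn.execute(
        """
        SELECT c.concept_id, c.concept_code
        FROM concept_synonym s
        JOIN concept c ON c.concept_id = s.concept_id
        WHERE c.vocabulary_id = 'COSMIC'
          AND s.concept_synonym_name = ?
        ORDER BY c.concept_id
        """,
        (hgvs,),
    ).fetchall()


print(find_cosmic_by_hgvs(conn, "NM_000000.0:c.100A>G"))
```

The match here is an exact string comparison. Real HGVS input may need normalizing first (for example, transcript versions or one-letter versus three-letter amino acid codes), which this sketch does not attempt.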
Concept Code
Concept codes are taken from the COSMIC-derived genomic_mutation_id.
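As a rough illustration of how concept codes could be pulled from a TSV export, the sketch below reads tab-separated rows and collects the genomic_mutation_id values. The sample rows, the `gene_name` column, and the helper `concept_codes_from_tsv` are assumptions made for the example. The exact header spelling in the real COSMIC file may differ.

```python
import csv
import io

# Two made-up rows standing in for a Cancer Mutation Census TSV export.
# Only genomic_mutation_id is named in the text above; the other column
# and all values are hypothetical.
SAMPLE_TSV = (
    "gene_name\tgenomic_mutation_id\n"
    "GENE_A\tHYPOTHETICAL_ID_1\n"
    "GENE_B\tHYPOTHETICAL_ID_2\n"
    "GENE_A\tHYPOTHETICAL_ID_1\n"
)


def concept_codes_from_tsv(handle, id_column="genomic_mutation_id"):
    """Collect distinct, non-empty id values in first-seen order."""
    reader = csv.DictReader(handle, delimiter="\t")
    seen = set()
    codes = []
    for row in reader:
        code = (row.get(id_column) or "").strip()
        if code and code not in seen:
            seen.add(code)
            codes.append(code)
    return codes


codes = concept_codes_from_tsv(io.StringIO(SAMPLE_TSV))
print(codes)
```

Duplicates are dropped in this sketch so that each code appears once. Whether the actual transformation deduplicates this way is not stated in the source.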
Standard Concepts
All COSMIC concepts are non-standard.
Concept Classes and Domains
All COSMIC concepts belong to the Variant concept class (concept_class_id) within the Measurement domain.
Concept Relationships
No relationships exist within the vocabulary. No relationships between COSMIC and other OMOP vocabularies exist at this time.
Instructions for ETL
All COSMIC concepts are non-Standard, so they should be mapped to the corresponding Standard Concepts using the CONCEPT_RELATIONSHIP table ("Maps to" and occasionally "Maps to value" records). Most of them will map to a single OMOP Genomic concept once OMOP Genomic is released.
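The lookup described above can be sketched as a query over CONCEPT and CONCEPT_RELATIONSHIP. This is a minimal example, assuming an in-memory SQLite stand-in for the OMOP tables. The helper `map_cosmic_code`, all concept ids and codes, and the sample "Maps to" row are hypothetical, and the `'OMOP Genomic'` vocabulary_id is a guess at how the target vocabulary is labeled.

```python
import sqlite3

# Toy in-memory stand-in for the OMOP CONCEPT and CONCEPT_RELATIONSHIP tables.
# All ids, codes, and the single mapping row are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id       INTEGER PRIMARY KEY,
    vocabulary_id    TEXT,
    concept_code     TEXT,
    standard_concept TEXT
);
CREATE TABLE concept_relationship (
    concept_id_1    INTEGER,
    concept_id_2    INTEGER,
    relationship_id TEXT,
    invalid_reason  TEXT
);
INSERT INTO concept VALUES (1,   'COSMIC',       'HYPOTHETICAL_ID_1', NULL);
INSERT INTO concept VALUES (2,   'COSMIC',       'HYPOTHETICAL_ID_2', NULL);
INSERT INTO concept VALUES (100, 'OMOP Genomic', 'HYPOTHETICAL_STD',  'S');
INSERT INTO concept_relationship VALUES (1, 100, 'Maps to', NULL);
""")


def map_cosmic_code(conn, concept_code):
    """Return (relationship_id, target concept_id) rows for a COSMIC concept code."""
    return conn.execute(
        """
        SELECT cr.relationship_id, cr.concept_id_2
        FROM concept src
        JOIN concept_relationship cr ON cr.concept_id_1 = src.concept_id
        JOIN concept tgt ON tgt.concept_id = cr.concept_id_2
        WHERE src.vocabulary_id = 'COSMIC'
          AND src.concept_code = ?
          AND cr.relationship_id IN ('Maps to', 'Maps to value')
          AND cr.invalid_reason IS NULL      -- keep only active relationships
          AND tgt.standard_concept = 'S'     -- keep only Standard targets
        ORDER BY cr.relationship_id, cr.concept_id_2
        """,
        (concept_code,),
    ).fetchall()


print(map_cosmic_code(conn, "HYPOTHETICAL_ID_1"))
print(map_cosmic_code(conn, "HYPOTHETICAL_ID_2"))
```

An empty result means no mapping exists for that code. An ETL would typically keep the source concept in the source concept field and set the standard concept id to 0 in that case.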