5. Annotation Sources: HW 1 - bcb420-2022/Inika_Prasad GitHub Wiki

COSMIC: Cancer Gene Census

A cancer gene census from the Catalogue of Somatic Mutations in Cancer (COSMIC)

Type of data and the information it offers

What sort of data is it? What sort of information does it offer us?

There are various datasets within the COSMIC catalogue, such as:

  • COSMIC: The core of COSMIC, an expert-curated database of somatic mutations
  • Cell Lines Project: Mutation profiles of over 1,000 cell lines used in cancer research
  • COSMIC-3D: An interactive view of cancer mutations in the context of 3D structures
  • Cancer Mutation Census: Classification of genetic variants driving cancer
  • Actionability: Mutations actionable in precision oncology
  • Cancer Gene Census: A catalogue of genes with mutations that are causally implicated in cancer

These datasets are curated and handled differently, so from here on I will discuss only the Cancer Gene Census. The census is a list of genes causally implicated in cancer. The information associated with each gene includes:

  • various identifiers
  • genomic coordinates
  • identified mutations
  • type of mutations
  • tissue distributions
  • drug sensitivity and resistance
  • variants of the gene
  • publications identified by COSMIC that discuss the gene
  • cancer hallmark with which the gene is associated
  • role in cancer (oncogene/tumor suppressor gene)

Publishing

When and where was it published? Was it published?

The paper "A census of human cancer genes" published in 2004 in Nature Reviews Cancer was used as the starting point for the Cancer Gene Census by COSMIC. Version 1 (v1) of the Census was released on February 4th, 2004.

Source: Futreal, P., Coin, L., Marshall, M. et al. A census of human cancer genes. Nat Rev Cancer 4, 177–183 (2004). https://doi.org/10.1038/nrc1299

Updating & status

Is this annotation set updated regularly or is it a static source?

The census is not static: COSMIC's team of postdoctoral scientist curators updates it periodically with new genes. Newly added genes are announced in the release notice alert. As of 2022-02-28, the most recent version is v95, released on 2021-11-24.

Finding the data

Where can I find this data? (link to the download web address or ftp site or publication where it can be found)

You can find and download the data on the COSMIC Cancer Gene Census website. Downloads are available for non-commercial use, but you must register first. You can also download the data from the command line by following COSMIC's tutorial. The NIH has a Clinical Table Search Service for the COSMIC Mutation Census; however, it does not cover the Cancer Gene Census in particular.
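Because downloads require a registered account, a scripted download has to send credentials with the request. The sketch below only builds such a request and does not send it. It assumes the credentials go in an HTTP Basic authentication header (email and password), which is how I understand COSMIC's tutorial; the function name and the example URL are hypothetical placeholders, and the real file address should be taken from the tutorial itself.

```python
import base64
import urllib.request


def build_download_request(email, password, file_url):
    """Build (but do not send) a request carrying HTTP Basic credentials.

    Assumption: the server accepts Basic authentication with the account
    email and password. `file_url` must come from COSMIC's own tutorial.
    """
    token = base64.b64encode(f"{email}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(file_url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder values for illustration only; this is not a real COSMIC address.
req = build_download_request(
    "user@example.com", "secret", "https://example.org/cancer_gene_census.csv"
)
print(req.get_header("Authorization").split()[0])
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, since the exact response format depends on the service.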

Format & release

How is the data formatted and released? Does it exist in some sort of standard file format?

The data can be searched on the website or downloaded as a .csv or .tsv file.
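Since the download is a plain delimited text file, it can be read with Python's standard `csv` module. The following is a minimal sketch that reads the table and filters genes by their role in cancer. The column names "Gene Symbol" and "Role in Cancer" are assumptions about the file header and should be checked against the actual download; the sample rows are made up for illustration.

```python
import csv
import io


def read_census(handle, delimiter=","):
    """Return the table rows as a list of dicts keyed by column header."""
    return list(csv.DictReader(handle, delimiter=delimiter))


def genes_with_role(rows, role, symbol_col="Gene Symbol", role_col="Role in Cancer"):
    """Return gene symbols whose role column lists `role` (case-insensitive).

    The column names are assumptions; pass different ones if the header differs.
    """
    wanted = role.lower()
    selected = []
    for row in rows:
        # A gene may list several comma-separated roles in one cell.
        roles = {part.strip().lower() for part in (row.get(role_col) or "").split(",")}
        if wanted in roles:
            selected.append(row[symbol_col])
    return selected


# Made-up sample in the assumed layout; real files have many more columns.
SAMPLE = (
    "Gene Symbol,Role in Cancer\n"
    "GENE_A,oncogene\n"
    "GENE_B,TSG\n"
    'GENE_C,"oncogene, fusion"\n'
)
rows = read_census(io.StringIO(SAMPLE))
print(genes_with_role(rows, "oncogene"))
```

For a .tsv download, the same function can be called with `delimiter="\t"`, and a real file can be passed in with `open(path, newline="")` in place of the in-memory sample.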

Identifiers associated with the annotations

What identifiers are associated with these annotations?

The genes are identified by HGNC gene symbols. On the page for a gene of interest, you will find other identifiers and links to external resources, such as UniProt, RefSeq, Entrez Gene ID, Ensembl Gene ID, and other aliases.