Annotation source - bcb420-2022/Tianyan_Zhou GitHub Wiki

Objectives

Find a unique annotation dataset for human genes and answer the questions outlined in Quercus.

Time estimate: 2h
Time taken: 2h
Date: 2022-03-01

Procedures

  1. Googled "gene annotation source".
  2. Found BioGPS.
  3. Accessed the website and documented the answers to the questions.

Results

1. What sort of data is it? What sort of information does it offer us?

  • BioGPS (http://biogps.org) is a centralized gene-annotation portal that enables researchers to access distributed gene annotation resources.
  • It offers a mix of information, including gene expression data, gene identifiers, and gene wiki entries.
  • The 'Gene expression/activity chart' includes ∼ 6000 datasets.

2. When and where was it published? Was it published?

Yes, it was published. BioGPS was first described in Genome Biology in 2009 (Wu et al., 2009), with follow-up articles in the Nucleic Acids Research database issues of 2013 and 2016 (see References).

3. Is this annotation set updated regularly or is it a static source?

This annotation set appears to be a static source, since I did not find any update notices on the website.

4. Where can I find this data? (link to the download web address or ftp site or publication where it can be found)

The data can be found at http://biogps.org/dataset/ and downloaded directly from its website.

5. How is the data formatted and released? Does it exist in some sort of standard file format?

  • Most annotation pages have three panels: gene expression/activity chart, gene identifiers, and **gene wiki**.

  • Other format options are also available: JSON and XML.
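
As a minimal sketch of how a JSON-formatted annotation could be read, the Python snippet below parses a small hand-written record. The field names (`symbol`, `name`, `entrezgene`, `ensembl`), the helper `summarize_annotation`, and the sample values are assumptions for illustration only; they are not taken from BioGPS and may not match the schema it actually returns.

```python
import json

# Hypothetical record, hand-written for illustration; not actual BioGPS output.
SAMPLE_JSON = """
{
  "symbol": "CDK2",
  "name": "cyclin dependent kinase 2",
  "entrezgene": 1017,
  "ensembl": {"gene": "ENSG00000123374"}
}
"""


def summarize_annotation(raw: str) -> dict:
    """Parse a JSON annotation string and pull out a few identifier fields."""
    record = json.loads(raw)
    return {
        "symbol": record.get("symbol"),
        "ncbi_gene": record.get("entrezgene"),
        # Nested fields may be missing, so fall back to an empty dict.
        "ensembl_gene": record.get("ensembl", {}).get("gene"),
    }


if __name__ == "__main__":
    print(summarize_annotation(SAMPLE_JSON))
```

Using `.get()` rather than direct indexing keeps the sketch tolerant of records that lack some fields, which is likely since not every gene carries every annotation.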

6. What identifiers are associated with these annotations?

  • The gene identifiers include: symbol, description, accessions (NCBI Gene, Ensembl, OMIM), aliases, genome location, GO function (molecular function, biological process, cellular component), InterPro, transcripts (NCBI, Ensembl), and reporters.
  • Some gene annotations may also include protein identifiers (NCBI, Ensembl).
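
The identifier types listed above can be pictured as one record per gene, which makes it possible to go from any accession or alias back to the gene symbol. The sketch below is one way to model that in Python; the `GeneIdentifiers` class, the `build_lookup` helper, and the sample values are hypothetical and only cover a subset of the identifier types.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class GeneIdentifiers:
    """Hypothetical container for a subset of the identifiers listed above."""
    symbol: str
    description: str
    ncbi_gene: str
    ensembl_gene: str
    omim: str = ""  # optional; empty when no OMIM accession is known
    aliases: List[str] = field(default_factory=list)


def build_lookup(genes: Iterable[GeneIdentifiers]) -> Dict[str, str]:
    """Map every known accession or alias to its gene symbol."""
    lookup: Dict[str, str] = {}
    for gene in genes:
        keys = [gene.symbol, gene.ncbi_gene, gene.ensembl_gene, gene.omim, *gene.aliases]
        for key in keys:
            if key:  # skip empty optional fields
                lookup[key] = gene.symbol
    return lookup


if __name__ == "__main__":
    # Illustrative values only; verify against the annotation source before use.
    cdk2 = GeneIdentifiers(
        symbol="CDK2",
        description="cyclin dependent kinase 2",
        ncbi_gene="1017",
        ensembl_gene="ENSG00000123374",
        aliases=["CDKN2"],
    )
    print(build_lookup([cdk2])["ENSG00000123374"])
```

Accessions are stored as strings so that NCBI, Ensembl, and OMIM identifiers can share one lookup table without type mismatches.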

Conclusion and Outlook:

  • BioGPS is a powerful tool for querying different gene annotation data.
  • I will start working on Assignment 2 soon.

References:

  • Wu C, Jin X, Tsueng G, Afrasiabi C, Su AI (2016) BioGPS: building your own mash-up of gene annotations and expression profiles. Nucl. Acids Res. 44(D1): D313-D316. (Database Issue)

  • Wu C, MacLeod I, Su AI (2013) BioGPS and MyGene.info: organizing online, gene-centric information. Nucl. Acids Res. 41(D1): D561-D565. (Database Issue)

  • Wu C, Orozco C, Boyer J, Leglise M, Goodale J, Batalov S, Hodge CL, Haase J, Janes J, Huss JW 3rd, Su AI (2009) BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources. Genome Biol. 10(11):R130.