Annotation source - bcb420-2022/Tianyan_Zhou GitHub Wiki
Objectives
Find a unique annotation dataset for human genes and answer the questions outlined in Quercus.
Time estimate: 2h
Time taken: 2h
Date: 2022-03-01
Procedures
- Googled "gene annotation source".
- Found BioGPS.
- Accessed the website and documented the answers to the questions.
Results
1. What sort of data is it? What sort of information does it offer us?
- BioGPS (http://biogps.org) is a centralized gene-annotation portal that enables researchers to access distributed gene annotation resources.
- It offers a mix of information, including gene expression data, gene identifiers, and Gene Wiki entries.
- The 'Gene expression/activity chart' draws on approximately 6,000 datasets.
2. When and where was it published? Was it published?
- It was originally published in 2009 in Genome Biol. 10(11):R130, as "BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources".
- Its latest update was published in 2016 in Nucleic Acids Research, as "BioGPS: building your own mash-up of gene annotations and expression profiles".
3. Is this annotation set updated regularly or is it a static source?
This annotation set appears to be a static source, since I did not find any update notices on the website.
4. Where can I find this data? (link to the download web address or ftp site or publication where it can be found)
The data can be found at http://biogps.org/dataset/ and downloaded directly from its website.
5. How is the data formatted and released? Does it exist in some sort of standard file format?
- Most annotations are presented in three panels: gene expression/activity chart, gene identifiers, and **gene wiki**.
- Other format options are available: JSON and XML.
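Since the annotations are offered as both JSON and XML, the same record can be read with either parser. The following is a minimal sketch using only the Python standard library; the field names (`symbol`, `entrezgene`, `ensembl`) and the flat record layout are assumptions made for illustration, not the actual BioGPS schema, and the sample values are illustrative.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record: field names and layout are assumptions,
# not the schema that BioGPS actually returns.
JSON_TEXT = '{"symbol": "CDK2", "entrezgene": "1017", "ensembl": "ENSG00000123374"}'
XML_TEXT = (
    "<gene>"
    "<symbol>CDK2</symbol>"
    "<entrezgene>1017</entrezgene>"
    "<ensembl>ENSG00000123374</ensembl>"
    "</gene>"
)


def parse_json_record(text):
    """Parse a flat JSON gene record into a dict of strings."""
    return {key: str(value) for key, value in json.loads(text).items()}


def parse_xml_record(text):
    """Parse a flat XML gene record into a dict of strings."""
    root = ET.fromstring(text)
    return {child.tag: (child.text or "") for child in root}


json_record = parse_json_record(JSON_TEXT)
xml_record = parse_xml_record(XML_TEXT)

# Both formats should carry the same information once parsed.
print(json_record == xml_record)
```

A real record is likely nested rather than flat, so the XML parser in particular would need to walk child elements recursively.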
6. What identifiers are associated with these annotations?
- The gene identifiers include: symbol, description, accessions (NCBI gene, Ensembl, OMIM), aliases, genome location, GO function (molecular function, biological process, cellular component), InterPro, transcripts (NCBI, Ensembl), and reporters.
- Some gene annotations also include protein identifiers (NCBI, Ensembl).
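Because each gene carries several identifiers, it can help to index a record under all of them so that a symbol, alias, or accession resolves to the same entry. The sketch below shows one way to do that; the `GeneIdentifiers` class and `build_lookup` function are hypothetical names introduced here, only a subset of the identifier types listed above is modeled, and the sample values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GeneIdentifiers:
    """Hypothetical container for a subset of the identifiers listed above."""
    symbol: str
    description: str
    ncbi_gene: str
    ensembl_gene: str
    omim: str = ""
    aliases: List[str] = field(default_factory=list)


def build_lookup(records: List[GeneIdentifiers]) -> Dict[str, GeneIdentifiers]:
    """Map every symbol, alias, and accession (upper-cased) to its record."""
    lookup: Dict[str, GeneIdentifiers] = {}
    for rec in records:
        keys = [rec.symbol, rec.ncbi_gene, rec.ensembl_gene, rec.omim, *rec.aliases]
        for key in keys:
            if key:  # skip identifiers that are missing for this gene
                lookup[key.upper()] = rec
    return lookup


# Illustrative sample values.
cdk2 = GeneIdentifiers(
    symbol="CDK2",
    description="cyclin dependent kinase 2",
    ncbi_gene="1017",
    ensembl_gene="ENSG00000123374",
    omim="116953",
    aliases=["p33(CDK2)"],
)

lookup = build_lookup([cdk2])
print(lookup["ENSG00000123374"].symbol)
print(lookup["cdk2".upper()].ncbi_gene)
```

One caveat of a single shared dictionary is that purely numeric accessions from different sources (for example NCBI gene and OMIM) can collide; prefixing keys with their source would avoid this.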
Conclusion and Outlook:
- BioGPS is a powerful tool for querying different gene annotation data.
- I will start working on Assignment 2 soon.
References:
- Wu C, Jin X, Tsueng G, Afrasiabi C, Su AI (2016) BioGPS: building your own mash-up of gene annotations and expression profiles. Nucl. Acids Res. 44(D1): D313-D316. (Database Issue)
- Wu C, MacLeod I, Su AI (2013) BioGPS and MyGene.info: organizing online, gene-centric information. Nucl. Acids Res. 41(D1): D561-D565. (Database Issue)
- Wu C, Orozco C, Boyer J, Leglise M, Goodale J, Batalov S, Hodge CL, Haase J, Janes J, Huss JW 3rd, Su AI (2009) BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources. Genome Biol. 10(11):R130.