Core Project Schema Extensions - giffordlabcvr/Parvovirus-GLUE GitHub Wiki
At the heart of each GLUE project is a relational database with an extensible schema.
Parvovirus-GLUE extends GLUE's core schema with additional tables and data fields. These schema extensions are defined in the project build file.
Sequence table extensions
The sequence table of GLUE's core schema was extended to include the following additional fields:
Parameter | Type | Definition |
---|---|---|
full_name | VARCHAR | Full name of the virus this sequence is derived from |
name | VARCHAR | Abbreviated name of the virus this sequence is derived from |
gb_create_date | VARCHAR | GenBank creation date of the sequence |
gb_update_date | VARCHAR | Date of most recent GenBank update |
subfamily | VARCHAR | Taxonomy - virus subfamily |
genus | VARCHAR | Taxonomy - virus genus |
clade | VARCHAR | Taxonomy - virus clade |
length | INTEGER | Length of the sequence |
pubmed_id | INTEGER | PubMed ID of manuscript associated with GenBank entry |
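
To make the field list above concrete, here is a minimal sketch that mirrors the extension columns in a standalone in-memory SQLite table. It is an illustration only: GLUE manages its own database through its command layer, and this does not reproduce that mechanism. The `sequence_id` key column, the helper names, and the example row are hypothetical, and storing `gb_create_date` as text is an assumption of the sketch.

```python
import sqlite3

# Extension fields as listed in the table above (name, SQL type).
# gb_create_date is stored as text here; that choice is an assumption of this sketch.
SEQUENCE_EXTENSION_FIELDS = [
    ("full_name", "VARCHAR"),
    ("name", "VARCHAR"),
    ("gb_create_date", "VARCHAR"),
    ("gb_update_date", "VARCHAR"),
    ("subfamily", "VARCHAR"),
    ("genus", "VARCHAR"),
    ("clade", "VARCHAR"),
    ("length", "INTEGER"),
    ("pubmed_id", "INTEGER"),
]


def build_sequence_table(conn):
    """Create a stand-in sequence table carrying the extension columns."""
    columns = ", ".join(f'"{name}" {sql_type}' for name, sql_type in SEQUENCE_EXTENSION_FIELDS)
    # sequence_id is a hypothetical key column, not taken from GLUE's core schema.
    conn.execute(f"CREATE TABLE sequence (sequence_id VARCHAR PRIMARY KEY, {columns})")


def sequences_in_genus(conn, genus):
    """Return (sequence_id, name, length) for every sequence assigned to a genus."""
    cursor = conn.execute(
        "SELECT sequence_id, name, length FROM sequence "
        "WHERE genus = ? ORDER BY sequence_id",
        (genus,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
build_sequence_table(conn)

# Hypothetical row, for illustration only.
conn.execute(
    "INSERT INTO sequence (sequence_id, name, genus, length) VALUES (?, ?, ?, ?)",
    ("EXAMPLE_1", "ExV", "Examplevirus", 5000),
)
print(sequences_in_genus(conn, "Examplevirus"))
```

Fields left out of the insert (for example `pubmed_id`) stay `NULL`, which matches the common case where a GenBank entry has no associated publication.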
Isolate table
A custom table was defined to capture isolate-associated information, as follows:
Parameter | Type | Definition |
---|---|---|
isolate | VARCHAR | Name of the isolate |
host_sci_name | VARCHAR | Scientific name of the host species the virus was isolated from |
country | VARCHAR | Country where virus was isolated |
country_iso | VARCHAR | ISO code of country where virus was isolated |
place_sampled | VARCHAR | Location of sampling (state, region, or city) |
collection_year | INTEGER | Year virus was isolated |
collection_month | VARCHAR | Month virus was isolated |
collection_month_day | INTEGER | Day of month virus was isolated |
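
The isolate fields above can be pictured as a simple record type. The following sketch uses a Python dataclass whose attribute names follow the table. The class itself, the `collection_date_parts` helper, and the example values are hypothetical and are not part of GLUE or Parvovirus-GLUE. Because the collection date is split across three fields and the month is stored as text, the helper returns the known components without assuming any particular month format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Isolate:
    """Stand-in record mirroring the isolate table fields listed above."""
    isolate: str
    host_sci_name: Optional[str] = None
    country: Optional[str] = None
    country_iso: Optional[str] = None
    place_sampled: Optional[str] = None
    collection_year: Optional[int] = None
    collection_month: Optional[str] = None
    collection_month_day: Optional[int] = None

    def collection_date_parts(self) -> List[str]:
        """Return known date components, coarsest first, stopping at the first gap."""
        parts = []
        for value in (self.collection_year, self.collection_month, self.collection_month_day):
            if value is None:
                break  # a day without a month (or a month without a year) is not usable
            parts.append(str(value))
        return parts


# Hypothetical isolate with year and month recorded but no day.
example = Isolate(
    isolate="hypothetical-isolate-1",
    host_sci_name="Canis lupus familiaris",
    collection_year=2015,
    collection_month="MAR",
)
print(example.collection_date_parts())  # ['2015', 'MAR']
```

Keeping year, month, and day as separate fields lets partially dated isolates be recorded without inventing the missing components.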