Core Project Schema Extensions

At the heart of each GLUE project is a relational database with an extensible schema.

Parvovirus-GLUE extends GLUE's core schema with additional tables and data fields. These schema extensions are defined in the project's build file.

Sequence table extensions

The sequence table of GLUE's core schema was extended to include the following additional fields:

| Field | Type | Definition |
|---|---|---|
| full_name | VARCHAR | Full name of the virus this sequence is derived from |
| name | VARCHAR | Abbreviated name of the virus this sequence is derived from |
| gb_create_date | GenBank | GenBank creation date of the sequence |
| gb_update_date | VARCHAR | Date of most recent GenBank update |
| subfamily | VARCHAR | Taxonomy - virus subfamily |
| genus | VARCHAR | Taxonomy - virus genus |
| clade | VARCHAR | Taxonomy - virus clade |
| length | INTEGER | Length of the sequence |
| pubmed_id | INTEGER | PubMed ID of manuscript associated with GenBank entry |
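
To make the idea of extending a core table concrete, here is a minimal sketch in Python using the standard `sqlite3` module. It is an illustration only, not how GLUE applies schema extensions: the two-column `sequence` table is a simplified, hypothetical stand-in for the core table, the `extend_sequence_table` helper is invented for this example, and `gb_create_date` is assumed to be stored as VARCHAR.

```python
import sqlite3

# Extension fields for the sequence table, as (name, type) pairs.
# Assumption: gb_create_date is treated as VARCHAR in this sketch.
SEQUENCE_EXTENSIONS = [
    ("full_name", "VARCHAR"),
    ("name", "VARCHAR"),
    ("gb_create_date", "VARCHAR"),
    ("gb_update_date", "VARCHAR"),
    ("subfamily", "VARCHAR"),
    ("genus", "VARCHAR"),
    ("clade", "VARCHAR"),
    ("length", "INTEGER"),
    ("pubmed_id", "INTEGER"),
]


def extend_sequence_table(conn):
    """Add each extension field as a new column on the sequence table."""
    for field_name, sql_type in SEQUENCE_EXTENSIONS:
        conn.execute(f"ALTER TABLE sequence ADD COLUMN {field_name} {sql_type}")


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified stand-in for the core sequence table.
conn.execute(
    "CREATE TABLE sequence ("
    "source_name VARCHAR, sequence_id VARCHAR, "
    "PRIMARY KEY (source_name, sequence_id))"
)

extend_sequence_table(conn)

# List the column names now present on the table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(sequence)")]
print(columns)
```

The point of the sketch is that the extension fields sit alongside the core columns in the same table, so they can be queried together with the core sequence data.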

Isolate table

A custom table was defined to capture isolate-associated information, as follows:

| Field | Type | Definition |
|---|---|---|
| isolate | VARCHAR | Name of the isolate |
| host_sci_name | VARCHAR | Scientific name of the host species from which the virus was isolated |
| country | VARCHAR | Country where virus was isolated |
| country_iso | VARCHAR | ISO code of country where virus was isolated |
| place_sampled | VARCHAR | Location of sampling (state, region, or city) |
| collection_year | INTEGER | Year virus was isolated |
| collection_month | VARCHAR | Month virus was isolated |
| collection_month_day | INTEGER | Day of month virus was isolated |
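
As a rough illustration of how such a custom table might be populated and queried, the following Python sketch builds an `isolate` table with the fields listed above in an in-memory SQLite database. This is an assumption-laden example rather than GLUE's own mechanism: the `id` key column is hypothetical, and the single inserted row is an invented placeholder, not real project data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Custom isolate table with the fields listed above.
# The "id" primary key is a hypothetical addition for this sketch.
conn.execute(
    """
    CREATE TABLE isolate (
        id VARCHAR PRIMARY KEY,
        isolate VARCHAR,
        host_sci_name VARCHAR,
        country VARCHAR,
        country_iso VARCHAR,
        place_sampled VARCHAR,
        collection_year INTEGER,
        collection_month VARCHAR,
        collection_month_day INTEGER
    )
    """
)

# Invented placeholder row, used only to demonstrate the table layout.
conn.execute(
    "INSERT INTO isolate VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (
        "example-1",
        "example-isolate-1",
        "Canis lupus familiaris",
        "United Kingdom",
        "GBR",
        None,  # place_sampled unknown
        2001,
        "Mar",
        None,  # day of month unknown
    ),
)

# Select isolates from one country, ordered by collection year.
rows = conn.execute(
    "SELECT isolate, collection_year FROM isolate "
    "WHERE country_iso = ? ORDER BY collection_year",
    ("GBR",),
).fetchall()
print(rows)
```

Keeping the date split across separate year, month, and day fields lets a record hold a partial collection date, for example a year with no known day.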