Taxonomy Database - mariehoffmann/isPCR GitHub Wiki
Schema
Node
Rows are extracted from `nodes.dmp`.
Attributes | Constraint | Comment |
---|---|---|
tax_id | PRIMARY KEY | node id in GenBank taxonomy database |
parent_tax_id | | parent node id in GenBank taxonomy database |
rank | | rank of this node (superkingdom, kingdom, ...) |
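
As an illustration of how the three Node attributes could be pulled out of a `nodes.dmp` row, here is a minimal sketch. It assumes the usual NCBI dump layout, where fields are separated by `\t|\t` and each row ends with `\t|`. The function name `parse_nodes_line` is hypothetical, and the sample row is shortened to its first few fields.

```python
def parse_nodes_line(line):
    """Split one nodes.dmp row into (tax_id, parent_tax_id, rank)."""
    row = line.rstrip("\n")
    # Assumption: every row ends with a trailing "\t|" terminator.
    if row.endswith("\t|"):
        row = row[:-2]
    # Assumption: fields are separated by "\t|\t".
    fields = row.split("\t|\t")
    # Only the first three fields are needed for the Node table.
    tax_id, parent_tax_id, rank = fields[0], fields[1], fields[2]
    return int(tax_id), int(parent_tax_id), rank.strip()


# Shortened sample row (the real file carries further columns).
sample = "9606\t|\t9605\t|\tspecies\t|\tHS\t|\t5\t|\n"
node_row = parse_nodes_line(sample)
print(node_row)
```

The remaining columns of `nodes.dmp` are ignored here because the Node table only stores `tax_id`, `parent_tax_id`, and `rank`.
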
Names
All known names assigned to a `tax_id` are listed in this table. If multiple names exist for a `tax_id`, there are as many rows with the same `tax_id`. Rows are extracted from `names.dmp`.
Attributes | Constraint | Comment |
---|---|---|
tax_id | PRIMARY KEY, FOREIGN KEY | the id of the node associated with this name |
name_txt | PRIMARY KEY | the name itself |
unique_name | | the unique variant of this name if the name is not unique |
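
One way to read the constraints above is as a composite primary key over (`tax_id`, `name_txt`), which allows several names per node but rejects an exact duplicate. The sketch below tries this out with Python's built-in `sqlite3` module. The table names, column types, and the choice of SQLite are assumptions made for illustration only, not the project's actual DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Assumed column types; the wiki lists only attribute names and constraints.
conn.execute(
    "CREATE TABLE node (tax_id INTEGER PRIMARY KEY, parent_tax_id INTEGER, rank TEXT)"
)
conn.execute("""
    CREATE TABLE names (
        tax_id      INTEGER NOT NULL REFERENCES node(tax_id),
        name_txt    TEXT    NOT NULL,
        unique_name TEXT,
        PRIMARY KEY (tax_id, name_txt)
    )
""")

conn.execute("INSERT INTO node VALUES (9606, 9605, 'species')")
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?)",
    [(9606, "Homo sapiens", None), (9606, "human", None)],
)

# Several rows share the same tax_id, one per name.
names_for_9606 = [
    row[0]
    for row in conn.execute(
        "SELECT name_txt FROM names WHERE tax_id = ? ORDER BY name_txt", (9606,)
    )
]

# The same (tax_id, name_txt) pair cannot be inserted twice.
try:
    conn.execute("INSERT INTO names VALUES (9606, 'human', NULL)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```
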
Lineage
For each taxonomic node, identified by its unique `tax_id`, a list of ancestors is stored, ordered from the most distant ancestor to the closest one. By default, the data is extracted from NCBI's `taxidlineage.dmp`.
Attributes | Constraint | Comment |
---|---|---|
tax_id | PRIMARY KEY, FOREIGN KEY | |
lineage | | list of `tax_id`s |
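
To make the ancestor ordering concrete, the following sketch derives a lineage from parent pointers like those in the Node table. It is not how the project populates this table (the data comes from `taxidlineage.dmp`). The helper name `lineage`, the `parent_of` mapping, and the node ids are hypothetical. The sketch assumes the root node is its own parent and includes the root in the result, which may differ from the dump file.

```python
def lineage(tax_id, parent_of):
    """Return the ancestors of tax_id, most distant first, closest last."""
    ancestors = []
    current = tax_id
    # Assumption: the root node points to itself, which ends the walk.
    while parent_of[current] != current:
        current = parent_of[current]
        ancestors.append(current)
    # The walk collects closest-first, so reverse to get most-distant-first.
    ancestors.reverse()
    return ancestors


# Hypothetical mini-taxonomy: 1 (root) <- 10 <- 20 <- 30
parent_of = {1: 1, 10: 1, 20: 10, 30: 20}
print(lineage(30, parent_of))  # [1, 10, 20]
```
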
Accessions
This table contains a pre-processed mapping from `tax_id` to accession numbers, so that all existing accessions can be retrieved for a given `tax_id`. By default, the data is extracted from NCBI's `nt` (.fasta) file and the Node and Names tables.
Attributes | Constraint | Comment |
---|---|---|
tax_id | PRIMARY KEY, FOREIGN KEY | only unique in combination with accession |
accession | PRIMARY KEY | |
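
The lookup this table is meant to support, retrieving all accessions for a given `tax_id`, could look roughly like the sketch below. It uses Python's built-in `sqlite3` module purely for illustration. The table name, column types, the helper `accessions_for`, and the accession strings are all hypothetical, and the foreign key to the Node table is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed DDL: the pair (tax_id, accession) forms the primary key.
conn.execute("""
    CREATE TABLE accessions (
        tax_id    INTEGER NOT NULL,
        accession TEXT    NOT NULL,
        PRIMARY KEY (tax_id, accession)
    )
""")
# Made-up placeholder accessions, not real GenBank identifiers.
conn.executemany(
    "INSERT INTO accessions VALUES (?, ?)",
    [(30, "ACC0001.1"), (30, "ACC0002.1"), (20, "ACC0003.1")],
)


def accessions_for(conn, tax_id):
    """Return all accessions stored for one tax_id, sorted alphabetically."""
    rows = conn.execute(
        "SELECT accession FROM accessions WHERE tax_id = ? ORDER BY accession",
        (tax_id,),
    )
    return [row[0] for row in rows]


print(accessions_for(conn, 30))
```
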