Managing the SQL Database - AndersenLab/CAENDR GitHub Wiki

Overview

This page describes how to manage the site's SQL database, which indexes large quantities of data for efficient querying.

List of SQL Tables

A description of each table, along with what site features, tools, etc. depend on each.

Table Description Table Dependencies Usage
strains Data on each of the individual strains available through the site. Used basically everywhere. - Strain / isotype pages & lists across the site (tables, tool dropdowns, catalog, etc); release data (incl. strain data TSV download); strain map; stats page data; Heritability Calculator valid strains list
wormbase_gene Full listing of species genomic data. - Gene lists
wormbase_gene_summary Summary of species genomic data, for more efficient lookups. wormbase_gene ??
strain_annotated_variant Combines several features linked to variants: genetic location, base pairs affected, consequences of reading, gene information, strains affected, and severity of impact - Strain Variant Annotation tool data
phenotype_db Raw phenotype data uploaded by site users. phenotype_metadata Phenotype Database tool
phenotype_metadata Phenotype submission information - trait, strain, gene, submitter, etc. - Phenotype Database tool

Building SQL Tables

This section describes how to (re)build the different tables in the site's SQL database.

Database Operations

The following table lists the SQL table rebuild operations, with links to the data files required by each:

Database Operation Name Tables Required Files
"Rebuild strain table from google sheet" strains link
"Rebuild wormbase gene table from external sources" wormbase_gene & wormbase_gene_summary link
"Rebuild Strain Annotated Variant table from .csv.gz file" strain_annotated_variant link
"Rebuild all Phenotype trait tables from database" phenotype_db & phenotype_metadata link
"Rebuild All Tables" all tables listed above

Steps to Build

To build (or rebuild) a site SQL table, perform the following steps:

  1. Ensure that all require data files / sources exist for the table & species you want to rebuild. Consult the table above for details.
  2. Log in to the site as an admin user. If you don't have an admin account, perform the steps above.
  3. Navigate to the Admin portal. This can be accessed through the profile button in the top right-hand corner, or by navigating directly to the URL /admin.
  4. Under the "Tool & Database Administration" section, navigate to "Perform Database Operations".
  5. Click the "New Operation" button (right-hand side above the data table).
  6. Select the database operation for the SQL table(s) you want to rebuild. Consult the table above for details.
  7. Select all species you want to rebuild the table for.
  8. Enter a quick note to identify this operation.
  9. Click "Start". This will begin the operation, and redirect you to the table of operations.
  10. Monitor the new operation for a moment to ensure it is set to RUNNING. If it enters an error state, you will need to debug before trying again.

You can monitor the operations page for the operation to complete; you should(?) also receive an email notification when the operation completes.