Import Genomics DB data using Ansible - GenomicsDB/GenomicsSampleAPIs GitHub Wiki
The `genomicsdb-webserver` playbook repo is a quick and easy way to import Genomics DB data (Tile DB and Meta DB) and set up a working Genomics DB webserver on a target node. The data must first be exported from another instance and made available to the playbook.
The `genomicsdb-webserver` playbook uses the `genomicsdb` playbook, so all the infrastructure setup and variables from that playbook apply. The details of the `genomicsdb` playbook can be found here.
The `genomicsdb-webserver` Ansible playbook performs the following steps:
- Copy the data set from target to the host node
- Load the data into Tile DB
- Import the data into Meta DB
- Update Meta DB with the workspace and array name that the playbook uses
- Set up the nginx configuration
- Set up the GA4GH configuration
- Set up the GA4GH service
- Start both the nginx and GA4GH services
In addition to the variables from the `genomicsdb` playbook, the following variables can be overridden if necessary. The defaults can be found in defaults/main.yml under the `genomicsdb-webserver` role.
Name | Description |
---|---|
array_name | Name of the Tile DB array |
system_services_path | Path where system services are stored |
system_services_extension | File extension for the services file. |
webserver_port | Externally available port where the webserver can be accessed |
import_path | Path where the tiledb_csv and metadb_csv files are available |
metadb_file | Name of the db.gz file that has the Meta DB data (Export using `pg_dump --no-owner --data-only -d db_name |
callset_mapping_file | Name of the callset mapping file for GenomicsDB import; contains the callset mapping information for GenomicsDB. NOTE: callset_mapping_file also contains the paths to the files that will be used during the import process. These paths must be accessible from the node to the owner_user. |
vid_mapping_file | Name of the vid mapping file for GenomicsDB import; contains the fields and reference set information for GenomicsDB. |
size_per_column_partition | Buffer size that GenomicsDB will allocate while reading sample/CallSet. See GenomicsDB wiki for more info. |
delete_and_create_tiledb_array | If set to true, GenomicsDB loading process will delete existing data in the array. |
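
As a rough illustration of how the variables in the table above might be overridden, the following is a minimal sketch of an Ansible variables file. The variable names come from the table; every value shown (array name, port, paths, and file names) is a hypothetical placeholder rather than the role's actual default, and the file name `overrides.yml` is likewise an assumption.

```yaml
# overrides.yml (hypothetical file name)
# All values below are illustrative placeholders, not the role's defaults.

# Name of the Tile DB array to load data into
array_name: example_array

# Externally available port where the webserver can be accessed
webserver_port: 8080

# Directory holding the exported Tile DB and Meta DB files
import_path: /data/genomicsdb/import

# Name of the gzipped Meta DB dump located under import_path
metadb_file: metadb.gz

# Mapping files used by the GenomicsDB import
callset_mapping_file: callset_mapping.json
vid_mapping_file: vid_mapping.json

# Keep existing data in the array instead of deleting it before loading
delete_and_create_tiledb_array: false
```

A file like this can typically be passed to a playbook run with the standard `ansible-playbook` option `-e @overrides.yml`. Variables not listed in the file keep the defaults from defaults/main.yml.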