Use Cases - CAMI-challenge/CAMISIM GitHub Wiki

This page lists common use cases for CAMISIM. Every parameter passed on the command line in the examples can also be set in the configuration files. Further information on the configuration options can be found here.

de novo:

If the community is designed de novo, a user-defined number of complete genomes is selected to create a community that maximizes both genome novelty and phylogenetic spread.

>> nextflow run main.nf

from profile:

If a taxonomic profile is used as input, the output data set is created from the NCBI complete genomes and reflects the input profile as closely as possible. Unless specified otherwise, it contains the same number of samples as the input profile.

>> nextflow run main.nf --params.biom_profile "${projectDir}/defaults/mini.biom" 

read simulator:

The read simulator can also be set on the command line. Choose one of "art", "nanosim3", or "wgsim".

>> nextflow run main.nf --type "nanosim3"
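A wrapper script could check the simulator name before launching the pipeline. The following is a minimal sketch: the function `check_simulator` is hypothetical and not part of CAMISIM, and only the three accepted values are taken from the text above. The command is printed rather than executed, so the sketch runs without Nextflow installed.

```shell
# Hypothetical helper (not part of CAMISIM): succeeds only for the
# read simulator names listed above.
check_simulator() {
    case "$1" in
        art|nanosim3|wgsim) return 0 ;;
        *) echo "unknown read simulator: $1" >&2; return 1 ;;
    esac
}

simulator="nanosim3"
if check_simulator "$simulator"; then
    # Print the command instead of running it.
    echo "nextflow run main.nf --type \"$simulator\""
fi
```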

pooled gold standard assembly:

To do a pooled gold standard assembly for specific samples, set them in the nextflow.config file:

>> pooled_gsa = [0,2]

To perform the pooled gold standard assembly for all samples, specify this in the nextflow.config file (this is also the default):

>> pooled_gsa = true    

To skip the pooled gold standard assembly, specify this in the nextflow.config file:

>> pooled_gsa = []          

anonymization:

To perform anonymization, specify this in the nextflow.config file (this is also the default):

>> anonymization = true    

To skip the anonymization, specify this in the nextflow.config file:

>> anonymization = false    

distribution:

To use custom distributions for the read simulation, specify the path to the distribution files in the distribution.config file like this:

>> distribution_files = "${projectDir}/nextflow_defaults/distribution_*.txt"

or

>> distribution_files = ["${projectDir}/nextflow_defaults/distribution_0.txt", "${projectDir}/nextflow_defaults/distribution_1.txt"]
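To see which files a wildcard such as `distribution_*.txt` would select, the pattern can be tried in the shell first. This is only an illustration under the assumption that the simple `*` wildcard behaves the same way in both places: Nextflow resolves the pattern itself, and the temporary directory and the extra file `other.txt` are made up for the example.

```shell
# Create example files in a temporary directory; the distribution_*
# names mirror the ones used above, other.txt is a made-up non-match.
tmpdir="$(mktemp -d)"
touch "$tmpdir/distribution_0.txt" "$tmpdir/distribution_1.txt" "$tmpdir/other.txt"

# Expand the wildcard and count the matching files.
count=0
for f in "$tmpdir"/distribution_*.txt; do
    basename "$f"
    count=$((count + 1))
done
echo "matched $count files"

# Remove the temporary directory again.
rm -rf "$tmpdir"
```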

To calculate distributions with CAMISIM, specify this in the distribution.config file:

>> distribution_files = ""

NCBI taxonomy dump:

To use a custom NCBI taxonomy dump, specify the path in the nextflow.config file like this:

>> ncbi_taxdump_file = "${projectDir}/tools/ncbi-taxonomy_20170222.tar.gz"

To download and use a new NCBI taxonomy dump:

>> ncbi_taxdump_file = "" 

NanoSim read length calculation:

When simulating with NanoSim, a custom read length can be set in nanosim.config like this (4508 is the calculated default value):

>> read_length = 4508

To calculate a new read length value, comment out or delete the parameter in nanosim.config:

>> // read_length = 4508
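Commenting out the parameter can also be scripted. The sketch below works on a throwaway file containing a single `read_length` line, not on the real nanosim.config, whose exact layout is not shown here. It relies on the fact that Nextflow configuration files accept `//` line comments.

```shell
# Write an example line to a temporary file (stand-in for nanosim.config).
cfg="$(mktemp)"
printf 'read_length = 4508\n' > "$cfg"

# Prefix the read_length line with // while keeping any leading indentation.
commented="$(sed 's|^\([[:space:]]*\)read_length|\1// read_length|' "$cfg")"
echo "$commented"

# Remove the temporary file again.
rm -f "$cfg"
```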

Wgsim CIGAR creation:

When simulating with Wgsim, CAMISIM creates a default CIGAR (<Length of read>M) for performance reasons. To create real CIGAR values, specify this in wgsim.config:

>> create_cigar = true
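For illustration, the default CIGAR described above is the read length followed by `M`. The sketch below builds that string for a made-up read sequence. It does not reproduce CAMISIM's own code, and the variable names are assumptions.

```shell
# Example read sequence (made up for illustration).
read_seq="ACGTACGTAC"

# Default CIGAR as described above: <length of read>M,
# i.e. the whole read is reported as one alignment-match block.
default_cigar="${#read_seq}M"
echo "$default_cigar"   # prints 10M
```

A CIGAR of this form carries no information about insertions, deletions, or clipping. That is the trade-off the `create_cigar` option addresses.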