Configuration file - CSB5/OPERA-MS GitHub Wiki

OPERA-MS can be run using a configuration file that specifies the paths to the input files and the options used for the assembly. To process the test dataset, run:

cd test_files
perl ../OPERA-MS.pl test.config 2> log.err

The configuration file is formatted as follows:

#One space between OPTION and VALUE
<OPTION1> <VALUE1>
<OPTION2> <VALUE2>
...
<OPTION3> <VALUE3>
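
Because the template above asks for exactly one space between option and value, a quick format check before launching a run can catch typos. The following is a minimal sketch, not part of OPERA-MS: `check_config` is a hypothetical helper, and it assumes that lines starting with `#` are comments (as the first line of the template suggests) and that values contain no spaces.

```shell
# Hypothetical helper: list every line that is not "<OPTION> <VALUE>"
# with exactly one space in between. Returns non-zero if any line fails.
check_config() {
    awk '
        /^#/ || /^[ \t]*$/ { next }     # assumption: skip comments and blank lines
        !/^[^ ]+ [^ ]+$/ {              # anything else must be two tokens, one space
            printf "line %d: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

For example, `check_config test.config` prints nothing and returns 0 when every line matches. Under these assumptions a trailing space after a value, or a path that contains spaces, is reported as a problem.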

Essential parameters

  • ILLUMINA_READ_1 : path to the first read for Illumina paired-end read data

  • ILLUMINA_READ_2 : path to the second read for Illumina paired-end read data

  • LONG_READ : path to the long-read fastq file obtained from either Oxford Nanopore, PacBio or Illumina Synthetic Long Read sequencing

  • OUTPUT_DIR : directory where OPERA-MS results will be written
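
As an illustration, a configuration file containing only the four essential parameters could be created as sketched below. The file name `sample.config` and all paths are hypothetical placeholders, not files shipped with OPERA-MS.

```shell
# Write a minimal configuration file: one "<OPTION> <VALUE>" pair per line.
# All paths below are hypothetical; replace them with your own files.
cat > sample.config <<'EOF'
ILLUMINA_READ_1 reads/sample_R1.fastq
ILLUMINA_READ_2 reads/sample_R2.fastq
LONG_READ reads/sample_long.fastq
OUTPUT_DIR results
EOF
```

The resulting file can be passed to OPERA-MS.pl in place of test.config.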

Optional parameters

  • REF_CLUSTERING : default: YES - whether reference-based clustering should be performed (YES) or skipped (NO)

  • STRAIN_CLUSTERING : default: YES - whether strain-level clustering should be performed (YES) or skipped (NO)

  • POLISHING : default: NO - whether short-read polishing (currently using Pilon) should be performed (YES) or skipped (NO). The polished contigs can be found in contigs.polished.fasta

  • LONG_READ_MAPPER : default: blasr - software used for long-read mapping (blasr or minimap2)

  • KMER_SIZE : default: 60 - k-mer size used to assemble contigs

  • CONTIG_LEN_THR : default: 500 - contig length threshold for clustering; contigs smaller than CONTIG_LEN_THR will be filtered out

  • CONTIG_EDGE_LEN : default: 80 - during contig coverage calculation, number of bases filtered out from each contig end, to avoid biases due to lower mapping efficiency

  • CONTIG_WINDOW_LEN : default: 340 - window length in which the coverage estimation is performed. We recommend using CONTIG_LEN_THR - 2 * CONTIG_EDGE_LEN as the value

  • CONTIGS_FILE : path to the contigs file, if the short-reads have been assembled previously

  • NUM_PROCESSOR : default: 2 - number of processors to use (note that 2 is the minimum)
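
The recommended value for CONTIG_WINDOW_LEN follows directly from the other two contig parameters. The sketch below applies the formula CONTIG_LEN_THR - 2 * CONTIG_EDGE_LEN to the defaults listed above and prints the three lines in configuration-file format. The shell variables are only for illustration and are not read by OPERA-MS itself.

```shell
# Defaults taken from the parameter list above.
CONTIG_LEN_THR=500
CONTIG_EDGE_LEN=80

# Recommended window length: threshold minus the bases trimmed at both ends.
CONTIG_WINDOW_LEN=$(( CONTIG_LEN_THR - 2 * CONTIG_EDGE_LEN ))

# Print the three options as "<OPTION> <VALUE>" lines.
printf 'CONTIG_LEN_THR %s\n' "$CONTIG_LEN_THR"
printf 'CONTIG_EDGE_LEN %s\n' "$CONTIG_EDGE_LEN"
printf 'CONTIG_WINDOW_LEN %s\n' "$CONTIG_WINDOW_LEN"   # 340 with the defaults
```

If CONTIG_LEN_THR or CONTIG_EDGE_LEN is changed, recomputing CONTIG_WINDOW_LEN this way keeps the three values consistent with the recommendation.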
