Advanced options - bbuchfink/diamond GitHub Wiki

Advanced options

  • --dbsize #

    Effective size of the database in letters (affects computation of e-values).

  • --seed-cut #

    Cutoff for masking low-complexity seeds, in bits/letter. The defaults are 0.9 for the fast mode, 0.8 for the default mode and 1.0 otherwise. (Supported since v2.0.12)

  • --motif-masking (0,1)

    Enable soft-masking of a fixed set of highly abundant sequence motifs. Enabled by default in the fast, default, mid-sensitive and sensitive modes. (Supported since v2.0.12)

  • --ext (banded-fast,banded-slow,full)

    This option determines how band sizes are set up for banded Smith-Waterman extension. The banded-slow setting is slightly slower but more accurate. The defaults are banded-fast for the default and sensitive modes, and banded-slow for the more-sensitive, very-sensitive and ultra-sensitive modes.

    Setting this option to full computes full-matrix instead of banded Smith-Waterman extensions, vectorized using the SWIPE algorithm. Full-matrix extensions are more accurate but will reduce performance. Supported since v2.0.7.

  • --band #

    Set a fixed band size for banded Smith-Waterman extension. Setting this option overrides the preconfigured defaults set by the --ext option. Note that the band is first determined by chaining; this parameter then adds a band margin in both directions.

  • --no-ranking

    Disable the ranking heuristic. Ranking refers to a heuristic that eliminates hits prior to full gapped extension. It works by establishing a tentative order on the target sequences that were hit with respect to a single query. This order is determined by ungapped extension scores at seed hits. The aligner computes gapped Smith-Waterman extensions for chunks of targets (see --ext-chunk-size) in ranking order, and stops once a chunk yields no alignment that meets the user-specified reporting criteria.

    Setting this option disables the ranking heuristic, causing gapped extensions to be computed for all seed hits.

    Option supported since v2.0.3.

  • --gapped-filter-evalue #

    Set the e-value threshold for the gapped filter heuristic that estimates gapped scores prior to full extension (0.0 to disable this filter step). This is a normalized e-value with respect to a database size of 1 billion letters. The default is 1.0 for the sensitive to ultra-sensitive modes and 0.0 otherwise. Supported since v2.0.7.

  • --ext-chunk-size #

    The chunk size of target sequences used by the ranking heuristic (see --no-ranking above). Higher numbers improve the accuracy of the algorithm. The default is the number specified by the --max-target-seqs/-k option, rounded up to the next multiple of 32, with a minimum of 128 and a maximum of 400. When using --top, the default value is 128.

    Option supported since v2.0.3.

  • --xml-blord-format

    Use gnl|BL_ORD_ID style format for hit IDs in XML output.

  • --range-cover #

    The minimum percentage of a hit's query range that needs to be spanned by higher scoring hits for a hit to be deleted in range culling mode (default=50.0).

  • --culling-overlap #

    The minimum percentage of a hit's query or target range that needs to be spanned by a higher scoring hit against the same target for a hit to be deleted. (default=50.0)

  • --bin #

    The number of bins for storing seed hits. Higher numbers create more temporary files and reduce the memory usage of the extension stage, but also slightly reduce efficiency. The default value is 16, except for the ultra-sensitive mode, which uses --bin 64 by default.

  • --no-unlink

    Disable unlinking of temporary files.
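
The default for --ext-chunk-size and the chunk-by-chunk stopping rule described under --no-ranking can be illustrated with a short Python sketch. This is not DIAMOND's implementation: the function names, the `extend` callback (assumed to return an alignment, or None when the alignment does not meet the reporting criteria) and the treatment of exact multiples of 32 (assumed to stay unchanged) are assumptions made for illustration only.

```python
def default_ext_chunk_size(max_target_seqs, top=False):
    """Default --ext-chunk-size as described in the text (hypothetical helper).

    Assumes max_target_seqs is a positive integer and that a value that is
    already a multiple of 32 is left unchanged by the rounding step.
    """
    if top:
        # With --top, the text states a fixed default of 128.
        return 128
    # Round up to the next multiple of 32 using integer arithmetic.
    rounded = ((max_target_seqs + 31) // 32) * 32
    # Clamp to the documented minimum (128) and maximum (400).
    return max(128, min(400, rounded))


def ranked_extension(ranked_targets, extend, chunk_size):
    """Sketch of the ranking loop: extend targets chunk by chunk.

    ranked_targets is assumed to be already sorted by ungapped score
    (best first). Processing stops at the first chunk in which no target
    produces a reportable alignment.
    """
    reported = []
    for start in range(0, len(ranked_targets), chunk_size):
        chunk = ranked_targets[start:start + chunk_size]
        hits = [hit for hit in (extend(t) for t in chunk) if hit is not None]
        if not hits:
            break  # no reportable alignment in this chunk: stop early
        reported.extend(hits)
    return reported
```

Under these assumptions, a larger chunk size makes it less likely that a good alignment further down the ranking is missed, which matches the statement that higher numbers improve accuracy.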

Legacy options

  • --freq-masking

    Enable masking of seeds based on frequency instead of complexity. This was the default behaviour prior to v2.0.12. (Supported since v2.0.12)

  • --freq-sd #

    Set the number of standard deviations above the average frequency for ignoring a seed. The default values are 50 in default mode, 20 in sensitive mode, 200 in more-sensitive mode, 15 in very-sensitive mode and 20 in ultra-sensitive mode.
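
The idea behind --freq-sd can be sketched in Python as a cutoff of "mean plus N standard deviations" over seed counts. This is only an illustration under stated assumptions: the helper names, the use of the population standard deviation, and the computation over a single flat table of seed counts are all hypothetical and are not taken from DIAMOND's source.

```python
from statistics import mean, pstdev


def frequency_cutoff(seed_counts, freq_sd):
    """Cutoff above which a seed is ignored (hypothetical helper).

    seed_counts maps each seed to the number of times it occurs.
    Assumption: the population standard deviation is used.
    """
    counts = list(seed_counts.values())
    return mean(counts) + freq_sd * pstdev(counts)


def ignored_seeds(seed_counts, freq_sd):
    """Return the seeds whose count exceeds the frequency cutoff."""
    cutoff = frequency_cutoff(seed_counts, freq_sd)
    return {seed for seed, count in seed_counts.items() if count > cutoff}


# Toy example: one seed is far more frequent than the others.
counts = {"AAAA": 1, "ACDE": 1, "KLMN": 1, "PQRS": 1, "LLLL": 100}
print(ignored_seeds(counts, freq_sd=1))
print(ignored_seeds(counts, freq_sd=3))
```

In this sketch a lower --freq-sd value ignores more seeds, while a higher value keeps more of the frequent seeds.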