blobtools create - genomehubs/blobtoolkit GitHub Wiki

The minimum requirement to create a new dataset with blobtools is an assembly FASTA file. This is enough to create a new BlobDir, a directory containing a collection of JSON files. Further data can be added as part of the blobtools create command or afterwards using blobtools add. A BlobDir can be processed using blobtools filter or visualised using blobtools view and the interactive BlobToolKit Viewer.
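As a minimal sketch of the FASTA-only case described above, the snippet below builds and prints a blobtools create command. The file name `assembly.fasta` and the directory name `BlobDir` are placeholders, not names required by blobtools. The command is only executed if blobtools is on the PATH and the FASTA file exists.

```shell
# Hypothetical paths; replace them with your own files.
ASSEMBLY="assembly.fasta"
BLOBDIR="BlobDir"

# Minimal form: an assembly FASTA file is the only input supplied.
CREATE_CMD="blobtools create --fasta $ASSEMBLY $BLOBDIR"
echo "$CREATE_CMD"

# Run the command only when blobtools is installed and the FASTA file exists.
# The unquoted expansion relies on the placeholder names containing no spaces.
if command -v blobtools >/dev/null 2>&1 && [ -f "$ASSEMBLY" ]; then
    $CREATE_CMD
fi
```

Optional inputs listed in the usage text, such as `--meta`, `--taxid` and `--taxdump`, can be appended to the same command or supplied later with blobtools add.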

Command line

  • blobtools create is a synonym for blobtools add, intended for use when creating a new dataset.
Add data to a BlobDir.

Usage:
    blobtools add [--bed BED...] [--beddir DIRECTORY] [--bedtsv TSV...] [--bedtsvdir DIRECTORY]
                  [--busco TSV...] [--cov BAM...] [--hits TSV...] [--fasta FASTA] [--hits-cols LIST]
                  [--key path=value...] [--link path=url...] [--taxid INT] [--skip-link-test]
                  [--blobdb JSON] [--meta YAML] [--synonyms TSV...] [--trnascan TSV...]
                  [--text TXT...] [--text-delimiter STRING] [--text-cols LIST] [--text-header]
                  [--text-no-array] [--taxdump DIRECTORY] [--taxrule bestsum|bestsumorder[=prefix]]
                  [--threads INT] [--evalue NUMBER] [--bitscore NUMBER] [--hit-count INT]
                  [--update-plot] [--pileup-args key=value...] [--create] [--replace] DIRECTORY

Arguments:
    DIRECTORY             Existing Blob directory.

Options:
    --bed BED             BED format file.
    --beddir DIRECTORY    Directory containing one or more BED format files.
    --bedtsv TSV          TSV file with header row and bed-format columns 1-3.
    --bedtsvdir DIRECTORY Directory containing one or more BED-like tsv files.
    --busco TSV           BUSCO full_table.tsv output file.
    --cov BAM             BAM/SAM/CRAM read alignment file.
    --fasta FASTA         FASTA sequence file.
    --hits TSV            Tabular BLAST/Diamond output file.
    --hits-cols LIST      Comma separated list of <column number>=<field name>.
                          [Default: 1=qseqid,2=staxids,3=bitscore,5=sseqid,10=qstart,11=qend,14=evalue]
    --taxid INT           Add ranks to metadata for a taxid.
    --key path=value      Set a metadata key to value.
    --link path=URL       Link to an external resource.
    --skip-link-test      Skip test to see if link URL can be resolved.
    --meta YAML           Dataset metadata.
    --blobdb JSON         Blobtools v1 blobDB.
    --synonyms TSV        TSV file containing current identifiers and synonyms.
    --taxdump DIRECTORY   Location of NCBI new_taxdump directory.
    --taxrule rulename[=prefix]
                          Rule to use when assigning BLAST hits to taxa (bestsum, bestsumorder,
                          bestdistsum, bestdistsumorder, blastp).
                          An alternate prefix may be specified. [Default: bestsumorder]
    --threads INT         Number of threads to use for multithreaded tasks. [Default: 1]
    --evalue FLOAT        Set evalue cutoff when parsing hits file. [Default: 1]
    --bitscore FLOAT      Set bitscore cutoff when parsing hits file. [Default: 1]
    --hit-count INT       Number of hits to parse when inferring taxonomy. [Default: 10]
    --update-plot         Flag to use new taxrule as default category.
    --text TXT            Generic text file.
    --text-delimiter STRING
                          Text file delimiter. [Default: whitespace]
    --text-cols LIST      Comma separated list of <column number>[=<field name>].
    --text-header         Flag to indicate first row of text file contains field names.
    --text-no-array       Flag to prevent fields in files with duplicate identifiers being
                          loaded as array fields.
    --trnascan TSV        tRNAscan2-SE output
    --pileup-args key=val Key/value pairs to pass to samtools pileup.
    --create              Create a new BlobDir.
    --replace             Replace existing fields with matching ids.

Examples:
    # 1. Add BUSCO scores to BlobDir
    blobtools add --busco busco.full_table.tsv BlobDir
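The default `--hits-cols` value can be checked against the layout of a hits file. The sketch below assumes the hits were produced with the BLAST tabular format `-outfmt "6 qseqid staxids bitscore std"`, which is an assumption and is not stated in the help text above. It numbers each output column so the result can be compared with the default mapping. The names `blast.out` and `BlobDir` are placeholders.

```shell
# "std" in BLAST tabular output expands to these 12 standard columns.
STD="qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore"

# Assumed output format: three extra fields followed by the standard columns.
FIELDS="qseqid staxids bitscore $STD"

# Print "<column number>=<field name>" for every column, comma separated.
COLS=$(printf '%s\n' $FIELDS | awk '{ printf "%s%d=%s", (NR > 1 ? "," : ""), NR, $1 }')
echo "$COLS"

# The default --hits-cols picks columns 1, 2, 3, 5, 10, 11 and 14 from this layout.
HITS_COLS="1=qseqid,2=staxids,3=bitscore,5=sseqid,10=qstart,11=qend,14=evalue"

# Hypothetical file names; the command is printed rather than run.
HITS_CMD="blobtools add --hits blast.out --hits-cols $HITS_COLS --taxdump taxdump BlobDir"
echo "$HITS_CMD"
```

If a hits file uses a different column order, the same numbering approach gives the values to pass to `--hits-cols`.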