Overview - ampinzonv/BB3 GitHub Wiki
BioBASH v0.3 "Jazzy"
BioBASH is a lightweight and portable collection of command-line utilities written in pure Bash, designed to support common bioinformatics tasks directly from the terminal. It prioritizes simplicity, modularity, and compatibility with both Linux and macOS systems.
Module Overview
file.sh
Functions for working with FASTA and FASTQ files:
- Extract headers, IDs, sequences
- Subset entries by ID or coordinate range
- Convert FASTQ to FASTA
- Compute basic statistics (e.g. N50)
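As an illustration of the kind of statistic listed above, here is a minimal sketch of an N50 calculation using only Bash, `awk`, and `sort`. The function name `n50_sketch` is hypothetical and is not part of BioBASH; the actual implementation in `file.sh` may differ. N50 is taken here as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a BioBASH function): prints the N50 of a FASTA
# file given as $1, or of standard input when no argument is given.
n50_sketch() {
    # Step 1: emit one length per record (multi-line records are summed).
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END { if (len > 0) print len }
    ' "${1:--}" |
    # Step 2: order lengths from longest to shortest.
    sort -rn |
    # Step 3: walk down the list until half of the total length is covered.
    awk '
        { lens[NR] = $1; total += $1 }
        END {
            for (i = 1; i <= NR; i++) {
                running += lens[i]
                if (running * 2 >= total) { print lens[i]; exit }
            }
        }
    '
}
```

For four sequences of lengths 8, 6, 4, and 2 (total 20), the two longest cover 14 bases, so the sketch prints 6.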
utility.sh
General-purpose helper functions:
- Count or list unique items
- Parse arguments
- Validate inputs and directories
- Used internally across modules
blast.sh
Interfaces to NCBI BLAST+ tools:
- Run BLAST searches (`bb_run_blast`)
- Create BLAST databases
- Parse and filter BLAST outputs
- Detect reciprocal best hits (RBH)
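To make the reciprocal best hit idea concrete, the sketch below pairs sequences that are each other's top hit. It assumes two BLAST+ tabular result files (`-outfmt 6`, with the bitscore in column 12), one for A against B and one for B against A. The function name `rbh_sketch` is hypothetical, and the way BioBASH itself ranks hits or breaks ties is not documented here; this sketch simply keeps the first hit with the highest bitscore.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a BioBASH function).
# Usage: rbh_sketch A_vs_B.tsv B_vs_A.tsv
# Assumes non-empty, tab-separated BLAST+ "-outfmt 6" files.
rbh_sketch() {
    awk -F'\t' '
        # First file: remember the best-scoring subject for each query in A.
        NR == FNR {
            if (!($1 in best_ab) || $12 + 0 > score_ab[$1]) {
                best_ab[$1] = $2; score_ab[$1] = $12 + 0
            }
            next
        }
        # Second file: remember the best-scoring subject for each query in B.
        {
            if (!($1 in best_ba) || $12 + 0 > score_ba[$1]) {
                best_ba[$1] = $2; score_ba[$1] = $12 + 0
            }
        }
        # A pair is reciprocal when each member is the best hit of the other.
        END {
            for (a in best_ab) {
                b = best_ab[a]
                if ((b in best_ba) && best_ba[b] == a) print a "\t" b
            }
        }
    ' "$1" "$2" | sort
}
```

The output is one tab-separated pair per line, sorted by the identifier from the first file.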
plot_ascii.sh
Minimalist ASCII plotting tools:
- Visualize BLAST hit coverage
- Plot histograms of FASTQ quality scores
- All plots are text-only, optimized for terminals, SSH sessions, or log file inclusion
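A text-only histogram of the kind described above can be produced with standard tools alone. The following is a minimal sketch, not the code in `plot_ascii.sh`: the function name `ascii_hist_sketch` is hypothetical, it expects one integer value per line (for example, one quality score per line), and bars are drawn unscaled with one `#` per occurrence.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a BioBASH function): reads one value per line
# from standard input and prints "value | bar count" for each distinct value.
ascii_hist_sketch() {
    sort -n | uniq -c | awk '
        {
            # $1 is the count reported by uniq -c, $2 is the value itself.
            bar = ""
            for (i = 0; i < $1; i++) bar = bar "#"
            printf "%4s | %s %d\n", $2, bar, $1
        }
    '
}

# Example: six quality scores with three distinct values.
printf '30\n30\n40\n20\n40\n40\n' | ascii_hist_sketch
```

Because the output is plain text, it can be appended directly to a log file or viewed over SSH. For large inputs the bar length would need to be scaled to the terminal width, which this sketch does not attempt.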
Input and Output Philosophy
BioBASH functions are designed to be pipe-friendly and follow UNIX conventions:
Input Type | Mechanism |
---|---|
File | `--input file.txt` |
STDIN | `--input -` |
Paired inputs | Specific flags (e.g. `--a`, `--b`) |

Output Type | Behavior |
---|---|
STDOUT | Default unless `--outfile` is used |
File output | Use `--outfile file.txt` or `--outdir` for multiple files |
Most functions allow redirection and integration into pipelines, for example:
cat sequences.fasta | bb_get_fasta_id --input - | bb_get_list --input -
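The convention is straightforward to follow when writing a compatible function of your own. The sketch below shows one way to accept `--input` (a file path or `-` for STDIN) and an optional `--outfile`, as described in the tables above. The function `demo_fasta_ids` is hypothetical and only illustrates the pattern; it is not how BioBASH parses its arguments internally.

```shell
#!/usr/bin/env bash
# Hypothetical function illustrating the --input / --outfile convention.
demo_fasta_ids() {
    local input="" outfile=""
    # Every option in this sketch takes exactly one value.
    while [ "$#" -gt 0 ]; do
        if [ "$#" -lt 2 ]; then
            echo "[ERROR] $1 needs a value" >&2
            return 1
        fi
        case "$1" in
            --input)   input="$2" ;;
            --outfile) outfile="$2" ;;
            *) echo "[ERROR] Unknown option: $1" >&2; return 1 ;;
        esac
        shift 2
    done
    [ -n "$input" ] || { echo "[ERROR] --input is required" >&2; return 1; }

    # awk treats a file operand of "-" as standard input, so a path and "-"
    # can both be passed straight through. Each header line yields its first
    # word without the leading ">".
    if [ -n "$outfile" ]; then
        awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$input" > "$outfile"
    else
        awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$input"
    fi
}

# Reads from STDIN and writes to STDOUT, so it can sit in a pipeline.
printf '>s1 first\nACGT\n>s2\nGG\n' | demo_fasta_ids --input -
```

Writing to STDOUT by default keeps the function usable in the middle of a pipeline, while `--outfile` covers the case where the result is an end product.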
Default Behaviors
Parameter | Default | Notes |
---|---|---|
`--quiet` | Off | Output is verbose by default: `[INFO]` messages are printed unless this flag is set |
`--force` | Off | Existing files are not overwritten unless this flag is set |
`--processors` | 1 | Parallelization for BLAST (where supported) |
`--sample_size` | 10 | For functions using random subsampling |
`--phred_offset` | 33 | Used for FASTQ quality interpretation |
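To show what the Phred offset means in practice: each character in a FASTQ quality string encodes a score equal to its ASCII code minus the offset. The sketch below decodes a quality string in pure Bash. The function name `phred_scores_sketch` is hypothetical and is not part of BioBASH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a BioBASH function).
# Usage: phred_scores_sketch QUALITY_STRING [OFFSET]   (OFFSET defaults to 33)
phred_scores_sketch() {
    local qual="$1" offset="${2:-33}" i ascii out=""
    for (( i = 0; i < ${#qual}; i++ )); do
        # A leading single quote makes printf return the character's code.
        printf -v ascii '%d' "'${qual:i:1}"
        out+="$(( ascii - offset )) "
    done
    # Drop the trailing space before printing.
    echo "${out% }"
}

phred_scores_sketch 'II5!'      # prints: 40 40 20 0
phred_scores_sketch 'h@' 64     # prints: 40 0
```

With the default offset of 33, the character `I` (ASCII 73) decodes to a score of 40 and `!` (ASCII 33) to 0. Older Phred+64 data needs the offset set to 64 to decode correctly.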
Philosophy
BioBASH embraces the principle that everything should be transparent and reproducible. All tools output plain text, which is ideal for:
- Running on remote servers via SSH
- Logging in pipelines
- Teaching environments where simplicity is key
Plots are ASCII by design: no dependencies on Python, R, or external libraries, just Bash.
Compatibility
BioBASH is tested on:
- Ubuntu 20.04+
- macOS 12+ (Monterey or later)
Some modules may require BLAST+ (`makeblastdb`, `blastn`, etc.) or `gzip` utilities.
Extending BioBASH
Each function is self-contained and can be:
- Loaded in a shell session
- Sourced in other scripts
- Integrated into Makefiles, Snakemake, or Nextflow
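Sourcing a module from another script could look like the sketch below. Both the `BIOBASH_DIR` variable and the `load_biobash_module` function are hypothetical, and the assumption that module files such as `file.sh` sit directly inside that directory should be checked against your own checkout.

```shell
#!/usr/bin/env bash
# Hypothetical loader (not part of BioBASH). Assumes BIOBASH_DIR points at
# the directory that contains the module files, e.g. file.sh or blast.sh.
load_biobash_module() {
    local module="$1"
    if [ -z "${BIOBASH_DIR:-}" ]; then
        echo "[ERROR] BIOBASH_DIR is not set" >&2
        return 1
    fi
    if [ ! -r "$BIOBASH_DIR/$module" ]; then
        echo "[ERROR] Cannot read $BIOBASH_DIR/$module" >&2
        return 1
    fi
    # Functions defined by the module become available in the current shell.
    source "$BIOBASH_DIR/$module"
}
```

After a successful `load_biobash_module file.sh`, the functions defined in that module can be called like any other shell function, including from a rule in a Makefile or a workflow step that runs Bash.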
License & Contributions
BioBASH is open-source and community-driven. Contributions, bug reports, and feature requests are welcome via GitHub.