bb_fasta_stats - ampinzonv/BB3 GitHub Wiki

Function: bb_fasta_stats

Generate summary statistics from a FASTA file, including N50.


🔍 Description

This function analyzes a FASTA file and reports basic statistics like sequence count, total length, average length, minimum and maximum lengths, and N50.

📥 Input

  • FASTA file of sequences.
  • STDIN is supported via --input -.

📤 Output

  • A textual report showing summary metrics.

🧪 Examples

bb_fasta_stats --input contigs.fasta
cat contigs.fasta | bb_fasta_stats --input -
bb_fasta_stats --input contigs.fasta --outfile stats.txt

⚙️ Usage

bb_fasta_stats --input FILE [--outfile FILE] [--quiet] [--force]

🧵 Options

Option            Description
--input FILE      Input FASTA file, or - for STDIN (required)
--outfile FILE    Output file for results (default: STDOUT)
--quiet           Suppress informational messages
--force           Overwrite output file if it exists

📌 Notes

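  • N50 is found by sorting the sequences from longest to shortest and accumulating their lengths; it is the length of the sequence at which the running total first reaches 50% of the total sequence length. Equivalently, sequences of length N50 or longer together contain at least half of all bases.
  • These statistics are useful for summarizing genome assemblies, transcript sets, and other sequence collections.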
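
The N50 definition above can be illustrated with a minimal shell sketch. This is not the BB3 implementation, which may differ. The function name n50_sketch is hypothetical, and the sketch assumes a standard awk and sort are available and that every record starts with a ">" header line.

```shell
# Hypothetical helper, NOT part of BB3: prints the N50 of a FASTA file.
# Usage: n50_sketch FILE   (use "-" as FILE to read from STDIN)
n50_sketch() {
  # Step 1: emit one length per record; sequence lines may be wrapped,
  # so lengths are summed until the next header line.
  awk '/^>/ { if (seen) print len; seen = 1; len = 0; next }
       { gsub(/[ \t\r]/, ""); len += length($0) }
       END { if (seen) print len }' "$1" |
  # Step 2: sort the lengths from longest to shortest.
  sort -rn |
  # Step 3: accumulate lengths and report the one at which the running
  # total first reaches half of the overall total.
  awk '{ lens[NR] = $1; total += $1 }
       END {
         for (i = 1; i <= NR; i++) {
           run += lens[i]
           if (2 * run >= total) { print lens[i]; exit }
         }
       }'
}
```

For example, for sequences of lengths 2, 3, 4, 5, and 6 (20 bases in total), the running total over the sorted lengths is 6, then 11, so the sketch reports 5. An input with no records produces no output.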