bb_reciprocal_blast - ampinzonv/BB3 GitHub Wiki

Function: bb_reciprocal_blast

Perform a reciprocal BLAST between two FASTA files and extract the reciprocal best hits (RBH): pairs of sequences that are each other's best hit.


🔍 Description

This function runs two BLAST searches (A vs B and B vs A, where A is the --query file and B is the --subject file), then identifies reciprocal best hits: sequence pairs in which each sequence is the other's best hit. This approach is commonly used to infer orthologous gene pairs between species.
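The pairing step described above can be sketched with awk. This is only an illustration under stated assumptions, not the BB3 implementation: `rbh_pairs` is a hypothetical helper, both inputs are assumed to be BLAST tabular files (`-outfmt 6`, bit score in column 12), and "best hit" is assumed to mean the highest bit score, which this page does not specify.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BB3): print reciprocal best hit pairs
# from two BLAST tabular files. Assumes -outfmt 6 with bitscore in column 12.
rbh_pairs() {
  local a_vs_b=$1 b_vs_a=$2
  awk -F'\t' '
    # First file (A vs B): keep the top-scoring subject for each query.
    FILENAME == ARGV[1] {
      if (!($1 in best_ab) || $12 + 0 > score_ab[$1]) {
        best_ab[$1] = $2
        score_ab[$1] = $12 + 0
      }
      next
    }
    # Second file (B vs A): same bookkeeping with the roles swapped.
    {
      if (!($1 in best_ba) || $12 + 0 > score_ba[$1]) {
        best_ba[$1] = $2
        score_ba[$1] = $12 + 0
      }
    }
    END {
      # Keep a pair only when a picks b and b picks a back.
      for (a in best_ab) {
        b = best_ab[a]
        if ((b in best_ba) && best_ba[b] == a) print a "\t" b
      }
    }
  ' "$a_vs_b" "$b_vs_a" | sort
}
```

Called as `rbh_pairs results.A_vs_B.blast results.B_vs_A.blast`, the sketch prints one tab-separated pair per line. Ties in bit score are resolved by keeping the first hit seen, which may differ from how bb_blast_best_hit handles them.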


⚙️ Usage

bb_reciprocal_blast --query FILE --subject FILE --blast_type TYPE [--outfile PREFIX] [--min_identity PCT] [--min_coverage PCT] [--processors N] [--quiet] [--force]

🧵 Parameters

Option               Description
--query FILE         FASTA file containing the query sequences (required)
--subject FILE       FASTA file containing the subject sequences (required)
--blast_type TYPE    BLAST algorithm to use (blastn, blastp, tblastn, etc.) (required)
--outfile PREFIX     Output prefix for result files (optional)
--min_identity PCT   Minimum percent identity to accept a hit (default: 0)
--min_coverage PCT   Minimum percent coverage of query length (default: 0)
--processors N       Number of CPU threads to use (default: 1)
--quiet              Suppress informational messages
--force              Overwrite output files if they exist

📤 Output Files

With --outfile results (that is, using "results" as the prefix), the following files are created:

  • results.A_vs_B.blast: BLAST output from A vs B
  • results.B_vs_A.blast: BLAST output from B vs A
  • results.reciprocal.tsv: List of reciprocal best hit pairs (tab-separated)

🧪 Example

bb_reciprocal_blast \
  --query genes_A.faa \
  --subject genes_B.faa \
  --blast_type blastp \
  --outfile rbh_output \
  --min_identity 30 \
  --min_coverage 50 \
  --processors 4

📌 Notes

  • Uses bb_blast_on_the_fly for alignment and bb_blast_best_hit for filtering.
  • A reciprocal hit is retained only if both directional hits (A vs B and B vs A) meet the identity and coverage thresholds.
  • Coverage is computed using the length of the query sequence (qlen).
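
The threshold check in the notes above can be sketched as a small awk filter. Several things here are assumptions rather than documented BB3 behavior: `filter_hits` is a hypothetical helper, the input is assumed to be BLAST tabular output with qlen appended as column 13 (as produced by BLAST+ with `-outfmt "6 std qlen"`), and coverage is assumed to be the aligned query span (qend - qstart + 1) divided by qlen.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BB3): keep hits that pass both thresholds.
# Assumed columns: 3 = pident, 7 = qstart, 8 = qend, 13 = qlen.
# Reads hits on stdin, writes the passing lines unchanged to stdout.
filter_hits() {
  local min_identity=$1 min_coverage=$2
  awk -F'\t' -v min_id="$min_identity" -v min_cov="$min_coverage" '
    {
      qs = $7 + 0; qe = $8 + 0; qlen = $13 + 0
      if (qlen <= 0) next                        # skip rows without a usable qlen
      span = (qe >= qs ? qe - qs : qs - qe) + 1  # aligned query positions
      cov = 100 * span / qlen                    # percent of the query covered
      if ($3 + 0 >= min_id + 0 && cov >= min_cov + 0) print
    }
  '
}
```

For example, `filter_hits 30 50 < hits.tsv` would keep only hits with at least 30% identity covering at least 50% of the query, mirroring the --min_identity and --min_coverage values in the example above. An RBH pair would then be kept only if the hit survives this filter in both directions.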