Overview - ajmoore143/KEGGBLAST GitHub Wiki
Overview
Purpose:
KEGGBLAST automates the common steps required to go from a KEGG Orthology (KO) entry to fully-formatted FASTA files and BLAST results, including:
- Fetching gene IDs from a given KO (e.g. “K09252”)
- Downloading amino acid (AASEQ) and nucleotide (NTSEQ) sequences for each gene
- Automatically matching user-provided species names (even if slightly misspelled) to KEGG IDs
- Saving results (tables, folder structures, FASTA files)
- Running BLAST (via either gget or the NCBI API), with optional taxonomic filters
- Caching the KEGG species dictionary locally for faster subsequent runs
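Two of the steps above, turning a KO into gene IDs and matching a misspelled species name, can be sketched as follows. This is a minimal illustration and not KEGGBLAST's actual implementation. It assumes the gene list arrives in the tab-separated form returned by the KEGG REST `link` operation (e.g. `https://rest.kegg.jp/link/genes/K09252`), and it uses the standard-library `difflib` for fuzzy matching, which may differ from what the package does. The function names, the sample gene IDs, and the small species table are hypothetical.

```python
import difflib

def parse_kegg_link(text):
    """Group 'ko:Kxxxxx<TAB>org:gene' lines into {organism_code: [gene_id, ...]}."""
    genes = {}
    for line in text.strip().splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        gene_id = parts[1]
        org_code = gene_id.partition(":")[0]  # text before the colon is the organism code
        genes.setdefault(org_code, []).append(gene_id)
    return genes

def match_species(name, species_to_code, cutoff=0.8):
    """Return the organism code of the closest species name, or None if nothing is close."""
    lowered = {k.lower(): v for k, v in species_to_code.items()}
    hits = difflib.get_close_matches(name.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None

# Placeholder data for illustration only (gene IDs are made up).
SAMPLE_LINK = "ko:K09252\thsa:1234\nko:K09252\tmmu:5678\n"
SPECIES = {"Homo sapiens": "hsa", "Mus musculus": "mmu"}

genes_by_org = parse_kegg_link(SAMPLE_LINK)
code = match_species("Homo sapeins", SPECIES)  # note the typo in the query
print(code, genes_by_org.get(code))
```

A cutoff near 0.8 tolerates a swapped or missing letter while rejecting unrelated names; a looser cutoff would risk matching the wrong organism.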
Who Should Read This:
- Bioinformaticians who need to pull sequences in bulk from KEGG
- Anyone who wants to automate BLAST searches against a list of KO-derived genes
- Developers looking to integrate KEGG + BLAST steps into a larger pipeline
Dependencies / Prerequisites:
- Python 3.7+
- The `keggblast` package (install via `pip install .`)