SB BLAST - mendessoares/BuddySuite GitHub Wiki
--blast, -bl
Description
BLAST is a local alignment algorithm commonly used to search large collections of sequences for likely homologs of a query sequence. The SeqBuddy blast tool searches a pre-existing BLAST database with all input sequences and returns the matches as a new sequence file. The BLAST database must be made with the makeblastdb program from the NCBI C++ toolkit, using the -parse_seqids option.
To make a BLAST database from the command line:
$: makeblastdb -in path/to/fasta_file -out db_name -dbtype {nucl, prot} -parse_seqids
At the moment, the SeqBuddy blast tool uses hard-coded parameters when it calls the BLAST executable; support for custom parameters is on the to-do list. The following command is an example of how blastn would be called by SeqBuddy:
$: blastn -db database -query in_file.fa -out temp.txt -num_threads 4 -evalue 0.01 -outfmt 6
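For illustration, the call above can be assembled in Python as an argument list. This is a minimal sketch under stated assumptions, not SeqBuddy's actual code: the function names `build_blast_command` and `subject_ids` are hypothetical, and the defaults only mirror the example command. How SeqBuddy post-processes temp.txt is not described here; the second function merely shows one way to read hit identifiers from `-outfmt 6` output, where the second tab-separated column is the subject sequence ID by default.

```python
def build_blast_command(program, database, query, out_file, threads=4, evalue=0.01):
    """Return an argument list mirroring the example blastn call (hypothetical helper)."""
    if program not in ("blastn", "blastp"):
        raise ValueError("program must be 'blastn' or 'blastp'")
    return [
        program,
        "-db", database,
        "-query", query,
        "-out", out_file,
        "-num_threads", str(threads),
        "-evalue", str(evalue),
        "-outfmt", "6",
    ]


def subject_ids(tabular_text):
    """Collect unique subject IDs (column 2) from -outfmt 6 text, in first-seen order."""
    seen = []
    for line in tabular_text.splitlines():
        fields = line.split("\t")
        # Skip blank or malformed lines that lack a subject column.
        if len(fields) >= 2 and fields[1] not in seen:
            seen.append(fields[1])
    return seen


# The list form can be passed to subprocess.run() without shell quoting issues.
cmd = build_blast_command("blastn", "database", "in_file.fa", "temp.txt")
print(" ".join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems when paths contain spaces.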
Dependencies
The blastn, blastp, and blastdbcmd binaries from the [NCBI C++ toolkit](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/) must be present in your system path.
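A quick way to confirm that the three binaries are reachable is to look them up on the system path before running the tool. The sketch below uses Python's standard `shutil.which`; the helper name `missing_binaries` is hypothetical and is not part of SeqBuddy.

```python
import shutil

# Binaries named in the Dependencies section.
REQUIRED_BINARIES = ("blastn", "blastp", "blastdbcmd")


def missing_binaries(names=REQUIRED_BINARIES, which=shutil.which):
    """Return the binaries that cannot be found on the system path."""
    return [name for name in names if which(name) is None]


missing = missing_binaries()
if missing:
    print("Not found in system path: " + ", ".join(missing))
else:
    print("All BLAST dependencies were found.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default is sufficient.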
Argument
Path to BLAST database ( str )
BLAST databases consist of six separate files; provide a relative or absolute path to any one of these files, or to the base name they all share.
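Because either a single database file or the shared base name is accepted, the path has to be reduced to the base name before it can be handed to BLAST's `-db` option. The following is a minimal sketch of that normalization, not SeqBuddy's actual implementation; the function name `blast_db_basename` is hypothetical, and the extension set assumes the nucleotide files (.nhr, .nin, .nog, .nsd, .nsi, .nsq) and their protein counterparts beginning with "p".

```python
import os

# Assumed extensions: nucleotide database files and their protein equivalents.
DB_EXTENSIONS = {
    ".nhr", ".nin", ".nog", ".nsd", ".nsi", ".nsq",
    ".phr", ".pin", ".pog", ".psd", ".psi", ".psq",
}


def blast_db_basename(path):
    """Strip a known BLAST database extension; return other paths unchanged."""
    root, ext = os.path.splitext(path)
    return root if ext.lower() in DB_EXTENSIONS else path


# Both forms resolve to the same base name.
print(blast_db_basename("/path/to/blastdb/Abacion_magnum.nhr"))
print(blast_db_basename("/path/to/blastdb/Abacion_magnum"))
```

Only recognized extensions are stripped, so a base name that itself contains a dot is left intact.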
Example
Input file: Drosophila.nex
#NEXUS
begin data;
dimensions ntax=8 nchar=316;
format datatype=protein missing=? gap=-;
matrix
'Dme-Panxδ3' -----GFI---K----IDNMVFRCHYRITAILFTC-CIIVTANNLIGDPISCI--IPMHVINTFCWITYTYTV---A--GPGLE-K--HSYYQWVPFVLFFQGLMFYVPHWVWKM-D-GKIRMITG--VDDRDRIL-KYFVNNT--HNGYSFYFFCELLNFINVIVNIFMVDKFLGGAFMSYGTDVLKFSNMDQ-DRFDPMIEIFPRLTKCTFHKFGPSGSVQKHDTLCVLALNILNEKIYIFLWFWFIILATISGVAVLYSVVI---TR-TIR----------K--EGDFLILHFLSQNLSTRSYSDML-Q----
'Dme-Panxδ7' --L--SV----R-Q-RIDNIVFKLHYRWTVILLVA-TLLITSRQYIGEHIQCL--VVSPVINTFCFFTPTF-VD--P---PGI--D-RHAYYQWVPFVLFFQALCFYIPHALWKW-EGGRIKALVK--LG-MERVKD---IRDM--RLNWG-HVFAEVLNLINLLLQITWTNRFLGGQFLTLG------HALKN-RSDEVV---FPKITKCKFHKFGDSGSIQMHDALCVMALNIMNEKIYIILWFWYAFLLIVTVLGLLWRLCF---VR-WSL----------P-LASNWMFLFFLRSNLS-----E-L----DN
'Dme-Panxδ2' MDVFGSVKGLLKIDQV-DNNVFRMHYKATVIILIAFSLLVTSRQYIGDPIDCIVEIPLGVMDTYCWIYSTFTVPEGRDVQP--GSEKYHKYYQWVCFVLFFQAILFYVPRYLWKSWEGGRLKMLVDLSVNDKDRKIVDYFG-NLNRHNFYAFFFVCEALNFVNVIGQIYFVDFFLDGEFSTYGSDVLKFTELEPDERIDPMARVFPKVTKCTFHKYGPSGSVQTHDGLCVLPLNIVNEKIYVFLWFWFIILSIMSI-SLIYRIAVAPKLRHLLLRARSRAESEVEVAIGDWFLLYQLGKNIDPLIYKEVISDLEMG
'Dme-Panxδ5' MSAVKPLSKYLQFKIRIYDSVFTIHSRCTVVILLTCSLLLSARQYFGDPIQCI-S-EEKNIESYCWTMGTYYNEASIAE--GVEIRQYLRYYQWVIILLLFQSFVFYFPSCLWKVWEGRRLKQLCEVDNTRRM--LVKYFDMHFC----YMAYVFCEVLNFLISVVNIIVLEVFLNGFWSKYLRALW-------DRWV-SV---FPKIAKCELKF-GGSGTANVMDNLCILPLNILNEKIFVFLWAWFL-LALMSGLNLLCRLAICSRLREQMIRTKRHVKRALDLTIGDWFLMMKVSVNVNPMLFRDLMQEL---
'Dme-Panxδ6' MAAVKPLSNYLRLKVRIYDPIFTLHSKCTIVILLTCTFLLSAKQYFGEPILCL-S-SERQADSYCWTMGTYWNEQSIAE--GVETRMYLRYYQWVFMILLFQSLLFYFPSFLWKVWEGQRMEQLCEVDRTRQM--LTRYFPIHWC----YSIYAFCELLNVFISILNFWLMDVVFNGFWYKYIHALW-------NLWM-RV---FPKVAKCEFVY-GPSGTPNIMDILCVLPLNILNEKIFAVLYVWFL-FALLAIMNILYRLLICCPLRLQLLNPKSHVREVLSAGYGDWFVLMCVSINVNPTLFRELLEQL--D
'Dme-Panxδ4' MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPIQCF-G-D-KDMDAFCWIYGAYL-QCAVSK--VVEN--YITYYQWVVLVLLLESFVFYMPAFLWKIWEGGRLKHLCDFKRTHRV--LVNYFETHFR----YFVYVFCEILNLSISILNFLLLDVFFGGFWGRYRNALY-------NQWI-AV---FPKCAKCEYKG-GPSGSSNIYDYLCLLPLNILNEKIFAFLWIWFI-LAMLISLKFLYRLAVLYPMRLQLLRPKKHLQVALNCSFGDWFVLMRVGNNISPELFRKLLEEL---
'Dme-Panxδ1' YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPISCIVGVP-HVVNTFCWIHSTFTMPDRREVHPGVDF-KYYTYYQWVCFVLFFQAMACYTPKFLWNKFEGGLMRMIVGLNITRKRDALLDYLIKHVKRHKLY-AYWACEFLCCINIIVQMYLMNRFFDGEFLSYGTNIMKLSDVPQEQRVDPMVYVFPRVTKCTFHKYGPSGSLQKHDSLCILPLNIVNEKTYVFIWFWFWILLVLLGL--VFRCIIFPKFRPRLLNASNRIPMECRLDIGDWWLIYMLGRNLDPVIYKDVMSEFQVP
'Dme-Panxδ8' LDIFRGLKNLVKVSVKTDSIVFRLHYSITVMILMSFSLIITTRQYVGNPIDCVTDIP-DVLNTYCWIQSTYTLKSLVSVYPGIGNKKHYKYYQWVCFCLFFQAILFYTPRWLWKSWEGGKIHALIDLDISEKKKLLLDYLWENLRYHNWW-AYYVCELLALINVIGQMFLMNRFFDGEFITFGLKVIDYMETDQEDRMDPMIYIFPRMTKCTFFKYGSSGEVEKHDAICILPLNVVNEKIYIFLWFWFILLTFLTLLTLIYRVIIFPRMRVYLFRMRFRVRRDIEIKMGDWFLLYLLGENIDTVIFRDVVQDLRL-
;
end;
Database directory
$: ls path/to/blastdb
>>> Abacion_magnum.nex Abacion_magnum.nhr Abacion_magnum.nin Abacion_magnum.nog
>>> Abacion_magnum.nsd Abacion_magnum.nsi Abacion_magnum.nsq
Usage example
$: sb Drosophila.nex -bl /path/to/blastdb/Abacion_magnum
Output
>4086 comp4411_c0_seq1|m.4086
MFDVLGSLKSVFLRLKTISVDNSIFKLHYRLTTIILAVFSILVTSKQYLGDPIDCTTSST
TIRAELLDQYCWVSSTYSLPKAFDQKVGRFGHVSHPGIATYHEGDQVIYHQYYQWVCFVL
FLQSMMFYLPHYLWKIWECGRLKALADDIQGPLTSDETKKGKLAAISAYFSTSLFHHNFY
ATRYSICEVLNFANVVGQMFLTNRFLGGTFLTYGTEVIEFSESNQLNRTDPMIKVFPRVT
KCSFFTYGSSGDMQNHDALCVLPVNIINEKIYIVLWFWFIILAVLSGLAIIYRLIVTFSV
RARYLALRSRANSVSRSEIEKIAYNTEFGDWFVLYLLSKNVNSYVFKEVVDVVVKQLDNS
DYVPKEKHGLFKKLPL*
>5440 comp6054_c0_seq1|m.5440
FVLFFQAMLFYIPRFLWKMWEGKRLETIVLGMHVGILTEEEKNNRKKVLLEYLTRHFRRH
TFYAIKYYICELLCLVNVIGQMYLMNKFLGGEFMDYGSRVLEFSEQNQDSRTDPMIYVFP
RMTKCTFHKFGTSGDIQRHDALCVLPLNIVNEKIYIFLWFWFIILATLTALVLCYRILII
AFPKFRPQILHARCRLTPMKTINSVLRNADLGDWFLFYLLGKNMDPCIFREVCIELSKKL
ETAESNNP*