Homologous Virus Sequence Search and Multiple Sequence Alignment - BGIGPD/BestPractices4Pathogenomics GitHub Wiki
Homologous Virus Sequence Search and Multiple Sequence Alignment
Purpose of Homologous Virus Sequence Search
- Study the evolutionary relationships of viruses.
Homologous Virus Sequence Search Control
- Use NCBI BLAST to search for homologous virus sequences: NCBI BLAST
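If the search is run with the BLAST+ command-line tools instead of the web page (an assumption; this page only names NCBI BLAST), the hits can be saved as a tab-separated table with -outfmt 6 and then filtered. The sketch below keeps subject IDs whose percent identity (column 3 of that format) meets a cutoff. The function name filter_hits, the file hits.tsv, the cutoff of 90, and the toy rows are all hypothetical.

```shell
# A remote search might look like this (assumed invocation; check blastn -help):
#   blastn -remote -db nt -query query.fna -outfmt 6 -out hits.tsv

# Print the unique subject IDs (column 2) of hits whose percent identity
# (column 3 in -outfmt 6) is at or above the given cutoff.
filter_hits() {
    # $1 = tabular hits file, $2 = minimum percent identity
    awk -F '\t' -v min="$2" '$3 + 0 >= min + 0 { print $2 }' "$1" | sort -u
}

# Toy hits table, invented for illustration (not real BLAST output).
blast_dir=$(mktemp -d)
printf 'q1\tsubjA\t99.1\t300\t3\t0\t1\t300\t1\t300\t1e-150\t540\n'     > "$blast_dir/hits.tsv"
printf 'q1\tsubjB\t71.4\t280\t80\t2\t5\t284\t9\t288\t2e-40\t160\n'    >> "$blast_dir/hits.tsv"
printf 'q1\tsubjA\t98.0\t120\t2\t0\t301\t420\t301\t420\t1e-55\t210\n' >> "$blast_dir/hits.tsv"

filter_hits "$blast_dir/hits.tsv" 90
```

The cutoff is only a placeholder. A sensible threshold depends on how divergent the viruses of interest are.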
Downloading Viral Sequences
- Download sequences to your local computer, or use wget or ncbi-datasets-cli to download them directly to the server.
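A minimal sketch of a server-side download helper follows. It skips files that are already present and writes to a temporary name first, so an interrupted transfer is not mistaken for a finished one. The function name fetch_if_missing is hypothetical, and the URL and datasets command in the comments are assumptions that should be checked against the Virus-Host DB download page and the datasets help output.

```shell
# Download URL ($1) to DEST ($2) unless DEST already exists and is non-empty.
# Uses wget when available and falls back to curl.
fetch_if_missing() {
    if [ -s "$2" ]; then
        echo "skip: $2 already present"
        return 0
    fi
    if command -v wget >/dev/null 2>&1; then
        wget -q -O "$2.part" "$1" && mv "$2.part" "$2"
    elif command -v curl >/dev/null 2>&1; then
        curl -fsSL -o "$2.part" "$1" && mv "$2.part" "$2"
    else
        echo "error: neither wget nor curl found" >&2
        return 1
    fi
}

# Example calls (assumed URL and taxon name, not taken from this page):
#   fetch_if_missing \
#     "https://www.genome.jp/ftp/db/virushostdb/virushostdb.formatted.cds.faa.gz" \
#     virushostdb.formatted.cds.faa.gz
#
# With ncbi-datasets-cli, a comparable download might be:
#   datasets download virus genome taxon sars-cov-2 --filename sars2_genomes.zip
```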
Searching for Viral Sequences
- Search for other viral sequences of the same species using: Viral Sequence Similarity and Search
Processing Viral Sequences
- Unzip and filter viral sequences:
gunzip -c virushostdb.formatted.cds.faa.gz > virushostdb.formatted.cds.faa
seqkit grep -n -r -p "Severe acute respiratory syndrome-related coronavirus" virushostdb.formatted.cds.faa > SARS-CoV-2.faa
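Before aligning, it can help to confirm that the filter kept a plausible number of records. The sketch below counts FASTA headers with plain grep, so it does not depend on seqkit. The function name count_fasta_records and the toy file are hypothetical, and the grep pattern only approximates the header match done by seqkit grep -n -r -p above.

```shell
# Count FASTA records (header lines start with ">").
count_fasta_records() {
    # grep -c exits non-zero when the count is 0, so tolerate that case.
    grep -c '^>' "$1" || true
}

# Toy input, invented for illustration only.
demo_dir=$(mktemp -d)
cat > "$demo_dir/toy.faa" <<'EOF'
>protA|Severe acute respiratory syndrome-related coronavirus
MFVFLVLLPLVSS
>protB|Some other virus
MKT
>protC|Severe acute respiratory syndrome-related coronavirus
MESLVPGFNE
EOF

n_total=$(count_fasta_records "$demo_dir/toy.faa")
n_match=$(grep -c '^>.*Severe acute respiratory syndrome-related coronavirus' "$demo_dir/toy.faa" || true)
echo "total=$n_total matched=$n_match"
```

On the real data the same check would be count_fasta_records SARS-CoV-2.faa. A count of 0 usually means the pattern did not match the header text.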
Multiple Sequence Alignment
Install MAFFT using conda:
conda install -c bioconda mafft
Run MAFFT for sequence alignment:
mafft --auto SARS-CoV-2.faa > SARS-CoV-2.aln.faa
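In a valid multiple sequence alignment every row has the same length once gaps are counted. The sketch below prints the distinct row lengths found in an aligned FASTA file, so a single output line suggests the alignment is rectangular. The function name aligned_lengths and the toy alignment are hypothetical.

```shell
# Print the distinct aligned sequence lengths in a FASTA alignment, one per line.
# Sequences wrapped over several lines are summed before comparison.
aligned_lengths() {
    awk '
        /^>/ { if (seen) print len; len = 0; seen = 1; next }
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END { if (seen) print len }
    ' "$1" | sort -n | uniq
}

# Toy alignment, invented for illustration only.
aln_dir=$(mktemp -d)
cat > "$aln_dir/toy.aln.faa" <<'EOF'
>s1
MK-TAY
>s2
MKLTA-
>s3
MK-T
AY
EOF

aligned_lengths "$aln_dir/toy.aln.faa"
```

On the real output the call would be aligned_lengths SARS-CoV-2.aln.faa. More than one line of output points to a truncated or corrupted alignment file.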