Calling Variants - igheyas/Bioinformatics GitHub Wiki

  1. Exit your InSilicoSeq virtual environment (if you’re still in it):
deactivate
  1. Update your package lists and install bcftools (and tabix, which you’ll need to index your VCF):
sudo apt update
sudo apt install -y bcftools tabix
  1. Re-enter your project directory (and re-activate your ISS env, if needed):
cd /mnt/c/Users/IAGhe/OneDrive/Documents/Bioinformatics
source iss_env/bin/activate
  1. Now rerun your pileup → call pipeline:
bcftools mpileup \
  -f ref_genome.fa \
  -Ou \
  aln.filtered.bam \
| bcftools call \
  -mv \
  -Oz \
  -o raw_snps_indels.vcf.gz

# index your VCF so you can quickly query it later
tabix -p vcf raw_snps_indels.vcf.gz

Output:

image

Interpretation: That output is just bcftools telling you:

It’s defaulting to diploid calls because you didn’t pass a --ploidy (for a bacterial genome you can add --ploidy 1).

It saw 1 BAM in your input.

It has capped per-site depth at 250 (the default).

None of those are errors – they’re purely informational. As long as you now have a file called raw_snps_indels.vcf.gz in your directory, you’ve successfully called variants.

  1. Index the VCF so you can easily query it:
tabix -p vcf raw_snps_indels.vcf.gz
  1. Take a peek at your variants:
bcftools view raw_snps_indels.vcf.gz | head -n 30

Output:

image es – that looks like a perfectly valid VCF. A quick checklist:

VCF version & header You’re on VCFv4.2, and bcftools has printed out all the standard header lines (##fileformat, INFO and FORMAT field definitions, etc.).

FILTER=PASS You see ##FILTER=<ID=PASS,…> and only PASSed records below.

Contig declaration ##contig=<ID=NC_000913.3,length=4641652> matches your E. coli reference.

INFO & FORMAT fields All of the usual annotations (DP, AD, MQ, PL, GT, etc.) are defined in the header.

Data lines The first lines after the “#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT ” show actual variant calls from your aln.filtered.bam.

Next steps Index the VCF for fast querying:

tabix -p vcf raw_snps_indels.vcf.gz

Inspect a few variant records:

bcftools view raw_snps_indels.vcf.gz | column -t | less -S

###Output: image

That’s exactly what you want to see – your VCF nicely column-aligned in the less pager. To get back to the shell prompt, just press

q

That quits less and drops you back at the command line.

⚠️ **GitHub.com Fallback** ⚠️