Coverage and Depth Analysis - igheyas/Bioinformatics GitHub Wiki

Coverage & Depth Analysis
After mapping and preparing a sorted, indexed BAM, you can quantify how deeply and how uniformly your reads cover the reference.

  1. samtools depth
    Outputs per‐position coverage in a simple three‐column format:
# Compute depth at every position (-a also reports zero-coverage positions, which samtools depth skips by default)
samtools depth -a aln.bwa.sorted.bam > depth.tsv
  • Format (tab-separated, no header row; the columns are CHROM, 1-based POS, DEPTH):
NC_000913.3   1    30
NC_000913.3   2    29
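  • Average depth:
awk '{sum+=$3; count++} END {print "Avg depth =", sum/count}' depth.tsv
  • Breadth of coverage (percentage of positions with depth > 0):
awk '$3>0{cov++} END {print "Breadth =", cov/NR*100 "%"}' depth.tsv
    Both figures are only meaningful when depth.tsv contains the zero-depth positions (samtools depth -a). Without them, uncovered bases are missing from the file, so the average is inflated and the breadth is always close to 100%.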
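The two one-liners above can be folded into a single pass that also reports the share of positions at or above a minimum depth. This is a minimal sketch: the file names `sample_depth.tsv` and `sample_summary.txt`, the four made-up depth values, and the `min_depth=10` threshold are illustrative assumptions, not values from a real run.

```bash
# Hypothetical sample in samtools depth format (CHROM, POS, DEPTH); the values are made up.
printf 'NC_000913.3\t1\t30\nNC_000913.3\t2\t29\nNC_000913.3\t3\t0\nNC_000913.3\t4\t5\n' > sample_depth.tsv

# One pass over the file: mean depth, breadth (depth > 0), and share of positions >= min_depth.
# min_depth is an illustrative threshold, not a recommendation.
awk -v min_depth=10 '
  { sum += $3 }
  $3 > 0          { covered++ }
  $3 >= min_depth { deep++ }
  END {
    if (NR == 0) { print "no positions found"; exit 1 }
    printf "mean=%.2f breadth=%.1f%% ge%d=%.1f%%\n", sum / NR, 100 * covered / NR, min_depth, 100 * deep / NR
  }
' sample_depth.tsv > sample_summary.txt

cat sample_summary.txt
```

To run it on real data, replace `sample_depth.tsv` with `depth.tsv`. The percentages assume the file lists every reference position, including those with zero depth.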

Install bedtools via apt (Debian/Ubuntu) if it is not already available:

# refresh package lists
sudo apt update

# install bedtools
sudo apt install -y bedtools
  2. bedtools genomecov
    Provides more flexible summary and output options:
# Per-base coverage for every position, including zero-depth ones (comparable to samtools depth -a)
bedtools genomecov -ibam aln.bwa.sorted.bam -d > cov_per_base.txt
# Generate a bedGraph of contiguous regions with the same coverage
bedtools genomecov -ibam aln.bwa.sorted.bam -bga > coverage.bedgraph

These commands print nothing to the terminal; each one writes a plain-text file:

cov_per_base.txt (one line per base: chr, 1-based pos, depth)

coverage.bedgraph (blocks of constant coverage: chr, 0-based start, end, depth; with -bga, zero-coverage blocks are included)

To peek at the first few lines of each, try:

# show first 10 bases of per‐base coverage
head -n 10 cov_per_base.txt

# show first 10 blocks of bedGraph coverage
head -n 10 coverage.bedgraph

### Output:
<img width="1242" height="80" alt="image" src="https://github.com/user-attachments/assets/528b330b-c690-4441-9a1f-96b6a931def7" />
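Because `-bga` also writes the zero-coverage blocks, the bedGraph can be filtered directly to list uncovered regions. The sketch below assumes a tab-separated bedGraph with four columns (chr, start, end, depth). The file names `sample.bedgraph` and `uncovered.bed` and the block values are made up for illustration.

```bash
# Hypothetical bedGraph blocks (chr, 0-based start, end, depth); the values are made up.
printf 'chr1\t0\t100\t0\nchr1\t100\t250\t12\nchr1\t250\t300\t0\nchr1\t300\t1000\t35\n' > sample.bedgraph

# Keep only blocks with depth 0 and write them as BED intervals (chr, start, end).
awk 'BEGIN { FS = OFS = "\t" } $4 == 0 { print $1, $2, $3 }' sample.bedgraph > uncovered.bed

# Total number of uncovered bases: sum of (end - start) over the zero-depth blocks.
awk 'BEGIN { FS = "\t" } { total += $3 - $2 } END { print total + 0 }' uncovered.bed
```

On real data, point the first `awk` at `coverage.bedgraph`. The interval lengths are simply end minus start because bedGraph coordinates are 0-based and end-exclusive.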

Peek at their contents again to confirm both files were written:
```bash
head -n 10 cov_per_base.txt
head -n 10 coverage.bedgraph
```

Compute summary statistics (spell the input file name exactly, cov_per_base.txt, including the final “t”):

awk '{
  sum += $3;
  if ($3 > max) max = $3
}
END { printf "mean: %.1f\tmax: %d\n", sum/NR, max }' cov_per_base.txt
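The same mean and maximum can also be derived from the bedGraph, which is usually much smaller than the per-base file, by weighting each block's depth by its length. This is a sketch under the assumption that the blocks cover the whole reference, as they do with `-bga`. The file names and block values are hypothetical.

```bash
# Hypothetical bedGraph blocks (chr, 0-based start, end, depth); the values are made up.
printf 'chr1\t0\t100\t0\nchr1\t100\t250\t12\nchr1\t250\t300\t0\nchr1\t300\t1000\t35\n' > sample_blocks.bedgraph

# Length-weighted mean: each block contributes depth * (end - start) to the sum.
awk 'BEGIN { FS = "\t" }
{
  len = $3 - $2            # number of bases in this block
  bases += len
  sum += len * $4
  if ($4 > max) max = $4
}
END {
  if (bases == 0) { print "no blocks found"; exit 1 }
  printf "mean: %.1f\tmax: %d\n", sum / bases, max
}' sample_blocks.bedgraph > sample_blocks_summary.txt

cat sample_blocks_summary.txt
```

When run on `coverage.bedgraph` from the same BAM, the result should agree with the per-base calculation, which makes it a cheap cross-check.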
