Coverage and Depth Analysis - igheyas/Bioinformatics GitHub Wiki
Coverage & Depth Analysis
After mapping and preparing a sorted, indexed BAM, you can quantify how deeply and how uniformly your reads cover the reference.
- samtools depth
Outputs per‐position coverage in a simple three‐column format:
# Compute depth at every position (-a also reports positions with zero coverage,
# which samtools depth skips by default)
samtools depth -a aln.bwa.sorted.bam > depth.tsv
- Format (column meanings; no header line is written unless you pass -H):
CHROM POS DEPTH
NC_000913.3 1 30
NC_000913.3 2 29
- Average depth:
awk '{sum+=$3; count++} END {print "Avg depth =", sum/count}' depth.tsv
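- Breadth of coverage (percentage of positions with depth > 0; only meaningful if depth.tsv was produced with samtools depth -a, since the default output omits zero-depth positions):
awk '$3>0{cov++} END {print "Breadth =", cov/NR*100 "%"}' depth.tsv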
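The same single pass can be split by contig, which helps when the reference holds several sequences (for example a chromosome plus plasmids). What follows is a minimal sketch rather than part of the original workflow: the toy input, the variable names `depth_file` and `per_contig`, and the output column names are assumptions made for illustration. The input is assumed to look like `samtools depth -a` output.

```bash
# Hypothetical toy data standing in for `samtools depth -a` output
# (columns: contig, 1-based position, depth). These are not real results.
depth_file=$(mktemp)
printf '%s\t%s\t%s\n' \
  chrA 1 10 \
  chrA 2 0 \
  chrA 3 20 \
  chrA 4 30 \
  chrB 1 0 \
  chrB 2 4 > "$depth_file"

# One pass: accumulate per-contig totals, then report mean depth and breadth.
# awk does not guarantee the order of "for (c in n)", so the result is sorted.
per_contig=$(awk -F'\t' '
  { n[$1]++; sum[$1] += $3; if ($3 > 0) cov[$1]++ }
  END {
    for (c in n)
      printf "%s\t%.2f\t%.1f\n", c, sum[c] / n[c], 100 * cov[c] / n[c]
  }' "$depth_file" | sort -k1,1)

printf 'contig\tmean_depth\tbreadth_pct\n%s\n' "$per_contig"
rm -f "$depth_file"
```

To run it on real data, drop the toy-data lines and point `depth_file` at your own depth.tsv.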
Install bedtools via apt (Debian/Ubuntu):
# refresh package lists
sudo apt update
# install bedtools
sudo apt install -y bedtools
- bedtools genomecov
Provides more flexible summary and output options:
# Per‐base coverage for every position, zeros included (comparable to samtools depth -a)
bedtools genomecov -ibam aln.bwa.sorted.bam -d > cov_per_base.txt
# Generate a bedGraph of contiguous regions with the same coverage
bedtools genomecov -ibam aln.bwa.sorted.bam -bga > coverage.bedgraph
These commands print nothing to the terminal; they write two plain-text files:
- cov_per_base.txt (one line per base: chr pos depth, with 1-based positions)
- coverage.bedgraph (blocks of constant coverage: chr start end depth, with 0-based starts and exclusive ends)
To peek at the first few lines of each, try:
# show first 10 bases of per‐base coverage
head -n 10 cov_per_base.txt
# show first 10 blocks of bedGraph coverage
head -n 10 coverage.bedgraph
### Output:
<img width="1242" height="80" alt="image" src="https://github.com/user-attachments/assets/528b330b-c690-4441-9a1f-96b6a931def7" />
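Because each bedGraph line covers a whole block of bases, two common follow-ups are listing the uncovered blocks and computing a mean depth weighted by block length. The sketch below illustrates both on made-up data. The toy intervals and the variable names `bedgraph`, `gaps`, and `mean_depth` are assumptions for illustration, and the input is assumed to have the four-column layout written by `bedtools genomecov -bga`.

```bash
# Hypothetical toy bedGraph standing in for `bedtools genomecov -bga` output
# (columns: contig, 0-based start, exclusive end, depth). These are not real results.
bedgraph=$(mktemp)
printf '%s\t%s\t%s\t%s\n' \
  chrA 0 100 12 \
  chrA 100 150 0 \
  chrA 150 400 30 \
  chrB 0 50 0 > "$bedgraph"

# Zero-coverage blocks: contig, start, end, and block length in bases.
gaps=$(awk -F'\t' -v OFS='\t' '$4 == 0 { print $1, $2, $3, $3 - $2 }' "$bedgraph")

# Mean depth weighted by block length, since one line can span many bases.
mean_depth=$(awk -F'\t' '
  { len = $3 - $2; bases += len; total += len * $4 }
  END { if (bases) printf "%.2f", total / bases }' "$bedgraph")

printf 'Uncovered blocks:\n%s\n' "$gaps"
printf 'Length-weighted mean depth: %s\n' "$mean_depth"
rm -f "$bedgraph"
```

A plain average of column 4 would give every block equal weight regardless of its length, which is why the block length enters the sum here.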
Compute summary statistics from the per-base file (type the full filename, including the final “t” in .txt):
```bash
awk '{
  sum += $3                   # running total of depth
  if ($3 > max) max = $3      # highest depth seen so far
}
END { if (NR) printf "mean: %.1f\tmax: %d\n", sum/NR, max }' cov_per_base.txt
```
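Mean and maximum depth say little about uniformity, so it can help to report the share of bases at or above a few depth cutoffs. This is a minimal sketch under stated assumptions: the cutoffs 1, 10, and 20 are arbitrary examples, the toy data are invented, and the names `per_base`, `cutoffs`, and `thresholds` are hypothetical. The input is assumed to be a three-column per-base file such as cov_per_base.txt.

```bash
# Hypothetical toy data standing in for a per-base coverage file
# (columns: contig, position, depth). These are not real results.
per_base=$(mktemp)
printf '%s\t%s\t%s\n' \
  chrA 1 0 \
  chrA 2 5 \
  chrA 3 10 \
  chrA 4 25 \
  chrA 5 40 > "$per_base"

# For each cutoff, count the positions whose depth reaches it,
# then print that count as a percentage of all positions.
thresholds=$(awk -F'\t' -v cutoffs="1,10,20" '
  BEGIN { k = split(cutoffs, t, ",") }
  { for (i = 1; i <= k; i++) if ($3 + 0 >= t[i] + 0) hit[i]++ }
  END {
    if (NR == 0) exit
    for (i = 1; i <= k; i++)
      printf ">=%dx\t%.1f%%\n", t[i], 100 * hit[i] / NR
  }' "$per_base")

printf '%s\n' "$thresholds"
rm -f "$per_base"
```

Changing the `cutoffs` string adjusts the reported thresholds without touching the awk program.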