BAM Processing with samtools - iffatAGheyas/bioinformatics-tutorial-wiki GitHub Wiki
BAM Processing with samtools
Once you have your SAM file, you can use samtools to convert, sort, index and gather basic statistics on your alignments.
1. Convert SAM → BAM
# -b: output BAM; -S: input is SAM (ignored by samtools 1.0 and later, which auto-detect the input format)
samtools view -bS aln.bwa.sam > aln.bwa.bam
- Why?
BAM is the compressed binary form of SAM (~4–5× smaller). Once it is coordinate-sorted it can also be indexed, which makes downstream operations such as region queries much faster.
2. Sort the BAM
# -@ 8: use 8 threads; -o: output file
samtools sort -@ 8 -o aln.bwa.sorted.bam aln.bwa.bam
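As a shortcut, samtools 1.x can read SAM directly in `samtools sort`, so the conversion and sorting steps can be combined when the intermediate unsorted BAM is not needed. The sketch below assumes samtools 1.x is on your PATH; the helper name `sam_to_sorted_bam` and its default of 4 threads are illustrative choices, not part of samtools.

```shell
# Hypothetical helper: SAM -> coordinate-sorted BAM in a single step.
# Arguments: input SAM, output BAM, optional thread count (defaults to 4).
sam_to_sorted_bam() {
    in_sam=$1
    out_bam=$2
    threads=${3:-4}
    # samtools sort auto-detects SAM input and writes BAM to the -o file
    samtools sort -@ "$threads" -o "$out_bam" "$in_sam"
}

# Usage (same file names as above):
#   sam_to_sorted_bam aln.bwa.sam aln.bwa.sorted.bam 8
```

Keeping the two-step version is still useful if you want to keep the unsorted BAM around.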
3. Index BAM
# This generates aln.bwa.sorted.bam.bai
samtools index aln.bwa.sorted.bam
- Why?
The index (.bai) lets you quickly jump to any region in the BAM without reading the entire file.
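To illustrate what the index makes possible, the sketch below counts the alignments that overlap a region using `samtools view -c`. The helper name `count_region_reads` and the coordinates in the usage line are made up for illustration; the reference name is the one used in the examples on this page.

```shell
# Hypothetical helper: count alignments overlapping a region.
# Needs a coordinate-sorted BAM with its .bai index next to it.
count_region_reads() {
    bam=$1
    region=$2   # format: reference:start-end
    # -c prints the number of matching alignments instead of the records
    samtools view -c "$bam" "$region"
}

# Usage (illustrative coordinates):
#   count_region_reads aln.bwa.sorted.bam NC_000913.3:100000-200000
```

Without the index, samtools cannot answer a region query on a BAM file and reports an error instead.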
4. Quick Statistics
- Flag and index statistics
# Overall mapping summary (total reads, mapped reads, duplicates, etc.)
samtools flagstat aln.bwa.sorted.bam
# Per-reference sequence length, mapped read count and unmapped read count
samtools idxstats aln.bwa.sorted.bam
- flagstat output example (abbreviated):
500000 + 0 in total (QC-passed reads + QC-failed reads)
495000 + 0 mapped (99.0% : N/A)
490000 + 0 paired in sequencing
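If you want the mapping rate as a single number, for example in a QC script, the flagstat text can be parsed with awk. This is a minimal sketch that relies on the line layout shown above and uses only the QC-passed counts (the first column); `flagstat_mapped_pct` is a hypothetical helper name.

```shell
# Hypothetical helper: percentage of QC-passed reads that are mapped,
# read from `samtools flagstat` text on standard input.
flagstat_mapped_pct() {
    awk '
        $4 == "in" && $5 == "total" { total = $1 }    # "N + M in total ..."
        $4 == "mapped"              { mapped = $1 }   # "N + M mapped (...)"
        END { if (total > 0) printf "%.1f\n", 100 * mapped / total }
    '
}

# Demo on the example output above; prints 99.0
flagstat_mapped_pct <<'EOF'
500000 + 0 in total (QC-passed reads + QC-failed reads)
495000 + 0 mapped (99.0% : N/A)
490000 + 0 paired in sequencing
EOF

# Real use:
#   samtools flagstat aln.bwa.sorted.bam | flagstat_mapped_pct
```

Matching on the fourth field skips the "primary mapped" line that newer samtools versions also print.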
- idxstats output example (columns: reference name, sequence length, mapped reads, unmapped reads):
NC_000913.3 4641652 490000 5000
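The idxstats columns are tab-separated, so they are easy to summarise with awk. The sketch below prints, for each reference, the percentage of reads placed on it that are mapped. The helper name `idxstats_mapped_pct` is hypothetical, and the `*` line that samtools prints for reads without coordinates is skipped.

```shell
# Hypothetical helper: per-reference mapped percentage from
# `samtools idxstats` output on standard input.
# Columns: reference name, sequence length, mapped reads, unmapped reads.
idxstats_mapped_pct() {
    awk -F '\t' '
        $1 != "*" && ($3 + $4) > 0 {
            printf "%s\t%.2f\n", $1, 100 * $3 / ($3 + $4)
        }
    '
}

# Demo on the example line above; prints NC_000913.3 and 98.99, tab-separated
printf 'NC_000913.3\t4641652\t490000\t5000\n' | idxstats_mapped_pct

# Real use:
#   samtools idxstats aln.bwa.sorted.bam | idxstats_mapped_pct
```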
Together, these steps turn your raw SAM into a compact, indexed BAM ready for variant calling, coverage analysis, or visualization.