BAM Processing with samtools - iffatAGheyas/bioinformatics-tutorial-wiki GitHub Wiki

BAM Processing with samtools

Once you have your SAM file, you can use samtools to convert, sort, index and gather basic statistics on your alignments.

1. Convert SAM → BAM

# -b: write BAM output; -o: output file
# The input format is detected automatically, so the legacy -S flag is no longer needed
samtools view -b -o aln.bwa.bam aln.bwa.sam

  • Why?
    BAM is a compressed binary format (typically around 4–5× smaller than SAM). Once it is coordinate-sorted it can also be indexed, which makes downstream operations such as region queries much faster.
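
To check the size reduction on your own data, the two files can be compared directly. The sketch below is only illustrative: `size_ratio` is a hypothetical helper name (not part of samtools), and it simply divides the byte counts reported by `wc -c`.

```shell
# size_ratio is a hypothetical helper, not a samtools command.
# Usage: size_ratio LARGER_FILE SMALLER_FILE
# Prints how many times larger the first file is than the second.
size_ratio() {
    big=$(wc -c < "$1")      # size of the first file in bytes
    small=$(wc -c < "$2")    # size of the second file in bytes
    awk -v a="$big" -v b="$small" \
        'BEGIN { if (b + 0 == 0) print "NA"; else printf "%.1f\n", a / b }'
}
```

For example, `size_ratio aln.bwa.sam aln.bwa.bam` would print the SAM-to-BAM size ratio for the files from this step; the actual value depends on your data.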

2. Sort the BAM

# -@ 8: use 8 threads; -o: output file
# Sorting is by coordinate by default, which the indexing step requires
samtools sort -@ 8 -o aln.bwa.sorted.bam aln.bwa.bam
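
If the unsorted BAM is not needed, conversion and sorting can be combined, because `samtools sort` reads SAM input directly. The following is a minimal sketch under that assumption; the function name `sam_to_sorted_bam` and the default of 4 threads are hypothetical choices for illustration.

```shell
# sam_to_sorted_bam is a hypothetical helper name.
# Usage: sam_to_sorted_bam INPUT.sam OUTPUT.bam [THREADS]
sam_to_sorted_bam() {
    in_sam=$1
    out_bam=$2
    threads=${3:-4}          # assumed default; adjust to your machine
    # samtools sort detects the input format, so it can read SAM and
    # write coordinate-sorted BAM without an intermediate unsorted BAM.
    samtools sort -@ "$threads" -O bam -o "$out_bam" "$in_sam"
}
```

With the file names used here, the call would be `sam_to_sorted_bam aln.bwa.sam aln.bwa.sorted.bam 8`.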

3. Index BAM

# This generates aln.bwa.sorted.bam.bai
samtools index aln.bwa.sorted.bam

  • Why?
    The index (.bai) lets you quickly jump to any region in the BAM without reading the entire file.
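
As a sketch of what the index enables, the helper below counts the alignments overlapping a region. The name `count_region_reads` is hypothetical, and the coordinates in the usage note are made up for illustration.

```shell
# count_region_reads is a hypothetical helper name.
# Usage: count_region_reads SORTED.bam REGION
count_region_reads() {
    # -c prints the number of matching alignments instead of the records.
    # Passing a region requires the .bai index to sit next to the BAM.
    samtools view -c "$1" "$2"
}
```

For example, `count_region_reads aln.bwa.sorted.bam NC_000913.3:1000-2000` would count alignments overlapping positions 1000 to 2000 of that reference.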

4. Quick Statistics

-Flag and index statistics

# Overall mapping summary (total reads, mapped reads, duplicates, etc.)
samtools flagstat aln.bwa.sorted.bam

# Per-reference sequence length and mapped/unmapped read counts
samtools idxstats aln.bwa.sorted.bam

-flagstat output example (abridged):

500000 + 0 in total (QC-passed reads + QC-failed reads)
495000 + 0 mapped (99.0% : N/A)
490000 + 0 paired in sequencing
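
For scripting, the mapping rate can be pulled out of the flagstat text. The following is a rough sketch, assuming the line layout shown in the example above; `mapped_percent` is a hypothetical helper that reads flagstat output on standard input.

```shell
# mapped_percent is a hypothetical helper; it reads flagstat text on stdin.
mapped_percent() {
    # Select the line whose 4th field is exactly "mapped" (this skips lines
    # such as "primary mapped"), then strip "(" and "%" from the 5th field.
    awk '$4 == "mapped" { gsub(/[(%]/, "", $5); print $5 }'
}

# Run it on the example output shown above.
mapped_percent <<'EOF'
500000 + 0 in total (QC-passed reads + QC-failed reads)
495000 + 0 mapped (99.0% : N/A)
490000 + 0 paired in sequencing
EOF
# prints 99.0
```

On a real file, the equivalent call would be `samtools flagstat aln.bwa.sorted.bam | mapped_percent`.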

-idxstats output example (columns: reference name, sequence length, mapped reads, unmapped reads):

NC_000913.3    4641652    490000    5000
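
When a genome has many reference sequences, the idxstats columns can be summed to get overall totals. This is a minimal sketch, assuming the four-column layout above; `idxstats_totals` is a hypothetical helper that reads idxstats output on standard input.

```shell
# idxstats_totals is a hypothetical helper; it reads idxstats text on stdin.
idxstats_totals() {
    # Column 3 holds mapped reads and column 4 unmapped reads per reference.
    awk '{ mapped += $3; unmapped += $4 }
         END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }'
}

# Run it on the example line shown above.
printf '%s\n' 'NC_000913.3 4641652 490000 5000' | idxstats_totals
# prints mapped=490000 unmapped=5000
```

On a real file, the equivalent call would be `samtools idxstats aln.bwa.sorted.bam | idxstats_totals`.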

Together, these steps turn your raw SAM into a compact, indexed BAM ready for variant calling, coverage analysis, or visualization.