Tools - NBISweden/workshop-genome_assembly GitHub Wiki

Scripts and Code snippets for tools used in Genome Assembly.

UPPMAX-specific information

General Scripting:

  • Pure Bash Bible: A collection of pure bash alternatives to external processes.

Basecalling and Format conversion:

  • Flappie: ONT Fast5 base-calling and Fastq conversion ( > R9.X ).
  • Guppy: ONT Fast5 base-calling and Fastq conversion ( > R9.X ).
  • SMRT Tools: PacBio HDF5 format conversion to BAM and Fastq.
  • Seqret: General format conversion tool, including Sanger trace signals to Fasta and Fastq.
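
To illustrate what a Fastq to Fasta conversion involves, here is a minimal Python sketch. It is not how Seqret or any of the tools above is implemented. It assumes plain four-line Fastq records (no wrapped sequences, no blank lines), and the function name `fastq_to_fasta` is hypothetical.

```python
def fastq_to_fasta(fastq_lines):
    """Convert four-line Fastq records to Fasta lines.

    Assumes each record is exactly four lines:
    @header, sequence, + separator, quality string.
    """
    lines = [line.rstrip("\n") for line in fastq_lines]
    if len(lines) % 4 != 0:
        raise ValueError("expected a multiple of four lines")
    fasta = []
    for i in range(0, len(lines), 4):
        header, sequence, separator = lines[i], lines[i + 1], lines[i + 2]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        # Fasta keeps the read name and sequence; quality values are dropped.
        fasta.append(">" + header[1:])
        fasta.append(sequence)
    return fasta


if __name__ == "__main__":
    example = ["@read1", "ACGT", "+", "IIII"]
    print("\n".join(fastq_to_fasta(example)))
```

Quality values are lost in this direction, so the conversion cannot be reversed. For real data, one of the dedicated tools listed above is the safer choice.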

Read QC:

Filtering:

Assemblers:

Scaffolding, Gap-Filling, and Assembly reconciliation:

Consensus and Polishing:

Assembly QC:

  • Preseq: Data quantification
  • Quast: Assembly sequence metrics
  • Kmer Analysis Toolkit: Assembly completeness metrics, Illumina
  • FRCBam: Assembly accuracy metrics, Illumina
  • NucBreak: Assembly accuracy metrics, Illumina
  • Tigmint: Assembly accuracy metrics, 10X, (ONT?, PacBio?)
  • Busco: Assembly gene space metrics
  • Bandage: Assembly graph visualisation and manipulation
  • HBAR-DTK: PacBio HGAP assembler graph visualisation, (any Celera based assembler).
  • Blast: Assembly contamination metrics
  • Blobtools: Assembly contamination metrics
  • Kraken: Assembly contamination metrics
  • MashMap: Assembly build comparison
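
As a rough illustration of the "assembly sequence metrics" category, the sketch below computes contig lengths and N50 from Fasta text in Python. It is only a sketch and does not reproduce Quast's implementation or its full set of metrics. The function names `contig_lengths` and `n50` are hypothetical.

```python
def contig_lengths(fasta_lines):
    """Return the length of each sequence in Fasta-formatted lines."""
    lengths = []
    current = None  # None until the first header line is seen
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return N50: the length L such that contigs of length >= L
    contain at least half of all assembled bases (0 if empty)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


if __name__ == "__main__":
    example = [">contig1", "ACGTAC", ">contig2", "GG"]
    sizes = contig_lengths(example)
    print(sizes, n50(sizes))
```

N50 describes contiguity only. It says nothing about correctness or completeness, which is why the list above also includes accuracy, gene space, and contamination tools.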