Minimap2 - Dreycey/Bioinformatics-Tools GitHub Wiki

minimap2

Installation

Use one of the following options:

1. Installing from source

2. Installing from precompiled binaries:

Basic commands

Note: Many of these commands are taken directly from the excellent documentation on the minimap2 GitHub page.

Minimap2 works with both FASTA and FASTQ files, and these files can be gzip-compressed. Please use these file formats for the input genomes.

  1. Without any options, minimap2 takes a reference database and a query sequence file as input and produces an approximate mapping, without base-level alignment (i.e. no CIGAR), in the PAF format:
  • minimap2 <reference_genome>.fa <query_genome>.fq > approx-mapping.paf
  2. To generate the CIGAR, use the -c flag:
  • minimap2 -c <reference_genome>.fa <query_genome>.fq > alignment_file.paf
  3. To output the alignments in SAM format, use the -a flag:
  • minimap2 -a <reference_genome>.fa <query_genome>.fq > alignment_file.sam
  4. Optionally, you can build an index first with the -d flag to reduce the running time of any of the above commands:
  • minimap2 -d <reference_minimizer_index>.mmi <reference_genome>.fa

The above command generates a minimizer index (a .mmi file), which can be used in place of the reference genome in the other commands for faster execution:

  • minimap2 -a <reference_minimizer_index>.mmi <query_genome>.fq > alignment_file.sam
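To make the PAF output of the commands above easier to read, here is a minimal sketch that pulls a few of the mandatory PAF columns out of a record with awk. The record is synthetic (every value is invented so the sketch runs without minimap2), and the matches-over-block-length ratio is only a rough identity estimate, especially when the mapping was produced without -c.

```shell
# Write one synthetic, tab-separated PAF record to a temporary file.
# All values are made up for illustration only.
paf=$(mktemp)
printf 'read1\t1000\t10\t990\t+\tchr1\t5000\t100\t1080\t900\t980\t60\n' > "$paf"

# Mandatory PAF columns used here:
#   $1  query name            $6  target name
#   $10 matching residues     $11 alignment block length
#   $12 mapping quality
summary=$(LC_ALL=C awk -F'\t' '{ printf "%s\t%s\t%.3f\t%d\n", $1, $6, $10 / $11, $12 }' "$paf")
echo "$summary"
rm -f "$paf"
```

For a real run, point awk at approx-mapping.paf (or alignment_file.paf) instead of the temporary file.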

NOTE: Once you build the index, indexing parameters such as -k, -w, -H and -I can't be changed during mapping. If you are running minimap2 on different data types, you will probably need to keep multiple indexes generated with different parameters. This makes minimap2 different from BWA, which always uses the same index regardless of the query data type.
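One possible way to keep an index per data type is to build one .mmi file per preset. The sketch below assumes the map-ont and map-pb presets described in the minimap2 documentation; the reference file name and the index naming scheme are hypothetical. It only prints the indexing commands, so it can be tried without minimap2 installed.

```shell
# Hypothetical reference file name; replace with your own.
ref="reference_genome.fa"

# Build one indexing command per preset so each data type gets its own .mmi file.
# The commands are only printed here; run them without "echo" to build the indexes.
index_cmds=$(
  for preset in map-ont map-pb; do
    echo minimap2 -x "$preset" -d "${ref%.fa}.${preset}.mmi" "$ref"
  done
)
echo "$index_cmds"
```

When mapping, pass the index that matches the read type, together with the same -x preset that was used to build it.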

Example

In this example, we will produce an approximate mapping between the E. coli genome and its nanopore reads.

  1. Download the E. coli genome FASTA file from here: https://www.ncbi.nlm.nih.gov/nuccore/U00096

--update: Dreycey (05/12/2019)

wget ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/005/845/GCF_000005845.2_ASM584v2/GCF_000005845.2_ASM584v2_genomic.fna.gz
gzip -d GCF_000005845.2_ASM584v2_genomic.fna.gz
  2. Download the E. coli Nanopore reads from this website: https://www.ebi.ac.uk/ena/data/view/ERX708228

    Alternatively, they could be downloaded using the following command:

  3. Run the following command to produce the required approximate mapping:

    • minimap2 GCF_000005845.2_ASM584v2_genomic.fna Nanopore_ecoli.fa > approximate_mapping_ecoli.paf

    The resulting mapping will be written to the approximate_mapping_ecoli.paf file.
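As a quick sanity check on a mapping file like this one, the number of PAF records and the number of distinct reads with at least one hit can be counted with standard tools. This is a minimal sketch on three synthetic records (invented values, not real output from this example).

```shell
# Three synthetic PAF records: read1 has two hits, read2 has one.
# All values are invented for illustration.
paf=$(mktemp)
printf 'read1\t1000\t0\t500\t+\tref\t5000\t0\t500\t450\t500\t60\n'        >  "$paf"
printf 'read1\t1000\t500\t1000\t-\tref\t5000\t900\t1400\t440\t500\t30\n'  >> "$paf"
printf 'read2\t800\t0\t800\t+\tref\t5000\t2000\t2800\t700\t800\t60\n'     >> "$paf"

# One line per mapping record; column 1 is the query (read) name.
n_records=$(wc -l < "$paf" | tr -d ' ')
n_reads=$(cut -f1 "$paf" | sort -u | wc -l | tr -d ' ')
echo "records: $n_records, distinct reads: $n_reads"
rm -f "$paf"
```

To run it on the real output, replace "$paf" with approximate_mapping_ecoli.paf and drop the printf and rm lines.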