Example yara mapper - seqan/slimm GitHub Wiki

Install slimm using your method of choice. We recommend conda:

 conda install -c bioconda slimm

Other choices are:
- Download pre-compiled binaries, or
- Build from source.

Create a fresh directory called slimm_tutorial and change into it:

 mkdir ~/slimm_tutorial
 cd ~/slimm_tutorial

Download and extract the viruses-only version of the yara index (V_genomes_indices_20180924_yara.zip), which contains a reference genome database of different viral species:
```
 wget https://ftp.mi.fu-berlin.de/pub/dadi/slimm/V_genomes_indices_20180924_yara.zip
 unzip V_genomes_indices_20180924_yara.zip 
```
Download the corresponding SLIMM database (slimm_db_20180924.sldb). The same SLIMM database can be used for other groups such as Archaea, Bacteria, Fungi and Viruses, as well as their combinations:
```
 wget https://ftp.mi.fu-berlin.de/pub/dadi/slimm/slimm_db_20180924.sldb
```

Create two new directories named alignment_files and slimm_reports, and place your metagenomic sequencing reads under a directory named mg_reads. If you don't have any reads yet, you may download SRR1057982 and use it. The remaining steps use the file names SRR1748536_1.fastq and SRR1748536_2.fastq; substitute the names of your own read files. You should now have the following directory structure in your working directory:

   Working Directory
   │
   ├── V_genomes_indices_20180924_yara (indexed reference genomes)
   │    ├── V_genomes.lf.drs
   │    ├── V_genomes.lf.drv
   │    ├── V_genomes.rid.concat
   │    ├── ...
   │
   ├── slimm_db_20180924.sldb (SLIMM taxonomic database)
   │
   ├── alignment_files (alignment files will be stored here)
   │
   ├── slimm_reports (slimm taxonomic reports will be stored here)
   │
   └── mg_reads (metagenomic sequencing reads)
        ├── SRR1748536_1.fastq
        └── SRR1748536_2.fastq
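
The layout above can be set up with a few commands. This is a minimal sketch, assuming you are already inside your working directory; the source path in the commented-out line is hypothetical and should point to wherever your reads currently live.

```shell
# Create the three directories used by the rest of this tutorial.
# -p makes the command safe to re-run if a directory already exists.
mkdir -p alignment_files slimm_reports mg_reads

# Hypothetical: move existing paired-end reads into mg_reads.
# mv /path/to/your/reads/*.fastq mg_reads/

# List the working directory to compare it against the layout above.
ls -1
```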

Use yara_mapper to map the metagenomic reads against the reference genomes and produce alignment files:

 yara_mapper -v -t 30 -s 2 -sa record \
         -o ./alignment_files/SRR1748536.bam \
         ./V_genomes_indices_20180924_yara/V_genomes \
         ./mg_reads/SRR1748536_1.fastq \
         ./mg_reads/SRR1748536_2.fastq
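
Before moving on, it can help to confirm that the mapping step actually wrote an alignment file. The following is a minimal sketch using only standard shell tests; the helper name check_alignment is hypothetical and not part of yara or SLIMM, and it only checks that the file exists and is non-empty, not that it is a valid BAM file.

```shell
# check_alignment FILE: succeed only if FILE exists and is non-empty.
# (Hypothetical helper; it does not validate the BAM format itself.)
check_alignment() {
    if [ -s "$1" ]; then
        echo "OK: $1"
    else
        echo "Missing or empty: $1" >&2
        return 1
    fi
}

# Check the alignment file named in the mapping command above.
check_alignment ./alignment_files/SRR1748536.bam \
    || echo "Re-run the mapping step before continuing."
```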

Run SLIMM on the output of the read mapper (SAM/BAM files):

 slimm -w 1000 \
       -o slimm_reports/ \
        slimm_db_20180924.sldb \
        alignment_files/SRR1748536.bam

You will find a taxonomic profile of your sample under the directory slimm_reports/ with the name SRR1748536_profile.tsv. The file contains a multi-level taxonomic profile of the sample that SLIMM reported. The first column indicates the taxonomic rank for easy filtering.
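
Because the rank sits in the first column, rows for a single rank can be pulled out with standard text tools. This is a minimal sketch, assuming the profile is tab-separated, starts with one header line, and stores plain rank names such as species in the first column; these are assumptions about the file layout, so check the first lines of your own report. The helper name filter_rank is hypothetical.

```shell
# filter_rank RANK FILE: print the header line plus every row
# whose first tab-separated column equals RANK.
filter_rank() {
    awk -F'\t' -v rank="$1" 'NR == 1 || $1 == rank' "$2"
}

# Only run if the report from the previous step exists.
profile=slimm_reports/SRR1748536_profile.tsv
if [ -f "$profile" ]; then
    filter_rank species "$profile"
fi
```

Swapping species for another rank name that appears in the first column selects that level instead.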

You can also tell SLIMM to report only a single rank using the -r parameter. For example,

    slimm -w 1000 -r species \
          -o slimm_reports/ \
           slimm_db_20180924.sldb \
           alignment_files/SRR1748536.bam

would generate a species-level report only.

(see slimm --help for more details)