Indexing - Bioinformatics-Institute/transcriptomics_WBC GitHub Wiki

RNA-seq Flowchart - Module 2

1-ii. Indexing

Indexes are auxiliary files that allow faster access to large files, such as genomes and mapping (alignment) files.

Create a bowtie2 index

Create a bowtie2 index for chr22 and the ERCC spike-in sequences and write it to an 'index/bwt2' sub-directory:

    cd $RNAWORKING
    # -p creates the parent 'index' directory too and does not fail if it already exists
    mkdir -p index/bwt2
    bowtie2-build fasta/chr22_ERCC92.fa index/bwt2/chr22_ERCC92
    ls index/bwt2
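
As a quick sanity check on the step above, the sketch below tests that all six files of a small bowtie2 index are present and non-empty. The function name `check_bt2_index` is hypothetical (it is not part of bowtie2), and the check assumes the default small-index `.bt2` suffix rather than the `.bt2l` suffix used for very large genomes.

```shell
# Hypothetical helper (not part of bowtie2): verify that a small bowtie2
# index with the given prefix is complete.
check_bt2_index() {
    prefix=$1
    status=0
    # bowtie2-build writes four forward and two reverse index files.
    for suffix in 1 2 3 4 rev.1 rev.2; do
        file="$prefix.$suffix.bt2"
        if [ ! -s "$file" ]; then
            echo "missing or empty: $file" >&2
            status=1
        fi
    done
    return "$status"
}
```

For the index built above, this would be called as `check_bt2_index index/bwt2/chr22_ERCC92`. Running `bowtie2-inspect -n index/bwt2/chr22_ERCC92` additionally lists the reference sequence names stored in the index.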

To create an index for the full hg19 genome instead of just chr22, you would run the following (shown commented out):


    #bowtie2-build fasta/hg19.fa index/hg19
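
Because a whole-genome build runs far longer than the chr22 build, it can help to check the inputs before starting. The following is a minimal sketch, not part of the original workflow: the wrapper name `build_bt2_index` and the default of 4 threads are hypothetical, and it assumes a bowtie2 release whose `bowtie2-build` accepts `--threads`.

```shell
# Hypothetical wrapper: check inputs before starting a long index build.
# Arguments: FASTA path, index prefix, optional thread count (default 4).
build_bt2_index() {
    fasta=$1
    prefix=$2
    threads=${3:-4}

    # Fail early if the reference FASTA is missing or empty.
    if [ ! -s "$fasta" ]; then
        echo "missing or empty FASTA: $fasta" >&2
        return 1
    fi

    # Fail early if bowtie2-build is not on the PATH.
    if ! command -v bowtie2-build >/dev/null 2>&1; then
        echo "bowtie2-build not found on PATH" >&2
        return 127
    fi

    # Create the output directory, then build (assumes --threads is supported).
    mkdir -p "$(dirname "$prefix")"
    bowtie2-build --threads "$threads" "$fasta" "$prefix"
}
```

With the paths used above, the call would be `build_bt2_index fasta/hg19.fa index/hg19`.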


OPTIONAL ALTERNATIVE - HISAT2 indexes

Create a reference index and a file of known splice sites (extracted from the GTF annotation) for use with HISAT2:

    cd $RNAWORKING
    mkdir -p index/hisat2
    hisat2-build fasta/chr22_ERCC92.fa index/hisat2/chr22_ERCC92
    # Newer HISAT2 releases ship this script as hisat2_extract_splice_sites.py
    extract_splice_sites.py annotation/genes_chr22_ERCC92.gtf > index/hisat2/splicesites.txt
    ls index/hisat2
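
The outputs of the two HISAT2 steps above can be checked in a similar way. This is a sketch under stated assumptions: the helper names `check_hisat2_index` and `check_splice_sites` are hypothetical, the index is assumed to be a small index made of eight `.ht2` files, and the splice site file is assumed to use the four-column, tab-separated layout (chromosome, start, end, strand).

```shell
# Hypothetical helper: verify that the eight files of a small HISAT2 index
# with the given prefix are present and non-empty.
check_hisat2_index() {
    prefix=$1
    status=0
    n=1
    while [ "$n" -le 8 ]; do
        file="$prefix.$n.ht2"
        if [ ! -s "$file" ]; then
            echo "missing or empty: $file" >&2
            status=1
        fi
        n=$((n + 1))
    done
    return "$status"
}

# Hypothetical helper: succeed only if the file has at least one line and
# every line has four tab-separated fields with integer start and end.
check_splice_sites() {
    awk -F '\t' '
        NF != 4 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { bad = 1 }
        END { exit (NR == 0 || bad) }
    ' "$1"
}
```

For the files created above, the calls would be `check_hisat2_index index/hisat2/chr22_ERCC92` and `check_splice_sites index/hisat2/splicesites.txt`.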

END OF OPTIONAL ALTERNATIVE - HISAT2 indexes

