Transcript Expression Profiling - undiagnosed/metagenomics GitHub Wiki

RNA metagenomic sequencing can be used not only to identify pathogens but also to profile mRNA transcript expression, which can serve as a biomarker of disease.

Here are a couple of examples of joint analyses combining pathogen detection and mRNA expression profiling:

Taxonomer: an interactive metagenomics analysis portal for universal pathogen detection and host mRNA expression profiling

Integrating microbial and host transcriptomics to characterize asthma-associated microbial communities

Salmon

Download Salmon

Download the human protein-coding transcriptome

Create the Salmon index with default parameters:

salmon index -t Homo_sapiens.GRCh37.67.cdna.all.fa.gz -i GRCh37_transcriptome_index

Run Salmon on the trimmed paired-end reads (using 4 threads in this example; -l A lets Salmon infer the library type automatically):

salmon quant -i GRCh37_transcriptome_index -l A -1 R1_trimmed_paired.fastq.gz -2 R2_trimmed_paired.fastq.gz -p 4 -o salmon_out
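
Salmon writes its per-transcript estimates to a tab-separated file named quant.sf inside the output directory (salmon_out above), with the columns Name, Length, EffectiveLength, TPM and NumReads. As a quick sanity check before any downstream analysis, the file can be inspected with a few lines of Python. This is only a minimal sketch: the helper names read_quant and top_transcripts are made up for illustration and are not part of Salmon.

```python
import csv
from pathlib import Path


def read_quant(path):
    """Return {transcript_name: TPM} from a Salmon quant.sf file."""
    tpm = {}
    with open(path, newline="") as handle:
        # quant.sf is tab-separated with a header row:
        # Name, Length, EffectiveLength, TPM, NumReads
        for row in csv.DictReader(handle, delimiter="\t"):
            tpm[row["Name"]] = float(row["TPM"])
    return tpm


def top_transcripts(tpm, n=10):
    """Return the n most highly expressed transcripts as (name, TPM) pairs."""
    return sorted(tpm.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    # Path assumes the -o salmon_out option used in the command above.
    quant_file = Path("salmon_out") / "quant.sf"
    if quant_file.exists():
        for name, value in top_transcripts(read_quant(quant_file)):
            print(f"{name}\t{value:.2f}")
```

If the list is dominated by a handful of transcripts or most TPM values are zero, it may be worth rechecking the trimming and indexing steps before continuing.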

Importing into R

Comparing gene expression to healthy individuals

Studies such as the Genotype-Tissue Expression (GTEx) pilot analysis have determined ranges of gene expression values in various body tissues and fluids. We can use these data as controls to check for differences in gene expression that may serve as biomarkers. Recount2 will be used to keep the processing pipeline consistent, so the RNA-Seq data must be processed with Rail-RNA to match recount2 processing. Follow the procedure at recount-contributions to generate the files necessary for differential expression analysis with recount2 in R.
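
To illustrate the idea of comparing one sample against a control range, here is a rough Python sketch that flags genes whose expression lies far outside the spread seen in controls. It is not the recount2 workflow and not a substitute for a proper differential expression analysis in R. The function name flag_outliers, the z-score cutoff, the gene labels and all numbers are hypothetical, and it assumes the sample and the controls were quantified the same way.

```python
import math
import statistics


def log2p1(value):
    """log2(x + 1), so that zero expression stays defined."""
    return math.log2(value + 1.0)


def flag_outliers(sample, controls, z_cutoff=2.0):
    """Return {gene: z} for genes whose |z| exceeds z_cutoff.

    sample:   {gene: expression value in the patient sample}
    controls: {gene: [expression value per control sample]}
    """
    flagged = {}
    for gene, value in sample.items():
        reference = [log2p1(v) for v in controls.get(gene, [])]
        if len(reference) < 2:
            continue  # not enough controls to estimate a spread
        mean = statistics.mean(reference)
        spread = statistics.stdev(reference)
        if spread == 0:
            continue  # z-score is undefined when controls are identical
        z = (log2p1(value) - mean) / spread
        if abs(z) > z_cutoff:
            flagged[gene] = z
    return flagged


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    controls = {"GENE_A": [10, 12, 11, 9], "GENE_B": [5, 6, 5, 7]}
    sample = {"GENE_A": 100, "GENE_B": 6}
    for gene, z in flag_outliers(sample, controls).items():
        print(f"{gene}\tz={z:.1f}")
```

A simple z-score like this ignores batch effects, library size and multiple testing, which is why the recount2 procedure above is the recommended route for real comparisons.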