Transcript Expression Profiling - undiagnosed/metagenomics GitHub Wiki
RNA metagenomic sequencing can be used not only to identify pathogens but also to profile mRNA transcript expression, which can serve as a biomarker of disease.
Here are a couple of examples of joint analyses of pathogens and mRNA expression profiles:
Salmon
Download Salmon
Download the human protein-coding transcriptome (this example uses the Ensembl GRCh37 release 67 cDNA FASTA)
Create the Salmon index with default parameters:
salmon index -t Homo_sapiens.GRCh37.67.cdna.all.fa.gz -i GRCh37_transcriptome_index
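If the index step fails or the resulting index looks suspiciously small, one quick check is whether the transcriptome download is intact. The following is a minimal sketch, assuming a gzip-compressed FASTA like the one above; the helper name count_fasta_records is hypothetical and not part of Salmon.

```shell
# count_fasta_records FILE: print the number of sequences in a gzipped FASTA.
# Hypothetical helper for illustration only.
count_fasta_records() {
    # gzip -t exits non-zero on a truncated or corrupt archive.
    gzip -t "$1" || return 1
    # Each FASTA record starts with a '>' header line.
    gzip -cd "$1" | grep -c '^>'
}

# Example: count_fasta_records Homo_sapiens.GRCh37.67.cdna.all.fa.gz
```

A non-zero exit status, or a count far lower than expected for a whole transcriptome, suggests the file should be downloaded again.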
Run Salmon on the trimmed paired-end reads (using 4 threads in this example; -l A lets Salmon infer the library type automatically):
salmon quant -i GRCh37_transcriptome_index -l A -1 R1_trimmed_paired.fastq.gz -2 R2_trimmed_paired.fastq.gz -p 4 -o salmon_out
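When quantification finishes, Salmon writes per-transcript estimates to quant.sf in the output directory (salmon_out above). This is a tab-separated file with the columns Name, Length, EffectiveLength, TPM, and NumReads. As a quick sanity check, the sketch below lists the most highly expressed transcripts. The function name top_tpm and the default of 10 rows are illustrative choices, not part of Salmon.

```shell
# top_tpm FILE [N]: print the N transcripts with the highest TPM (default 10).
# Hypothetical helper for illustration; assumes the standard quant.sf layout
# (tab-separated, one header line, TPM in column 4).
top_tpm() {
    # Skip the header, keep Name and TPM, then sort by TPM in descending order.
    # LC_ALL=C keeps the decimal point handling independent of the user's locale.
    tail -n +2 "$1" \
        | cut -f 1,4 \
        | LC_ALL=C sort -t "$(printf '\t')" -k 2,2gr \
        | head -n "${2:-10}"
}

# Example: top_tpm salmon_out/quant.sf 20
```

The output has two columns, the transcript identifier and its TPM, which is usually enough to spot an empty or failed run before moving on to any comparison.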
Comparing gene expression to healthy individuals
Studies such as the Genotype-Tissue Expression (GTEx) pilot analysis have determined ranges of gene expression values across various body tissues and fluids. These data can serve as controls when checking for differences in gene expression that may act as biomarkers. To keep the processing pipeline consistent, the comparison uses recount2, so the RNA-Seq data must be processed with Rail-RNA to match how the recount2 data were processed. Follow the procedure at recount-contributions to generate the files necessary for differential expression analysis with recount2 in R.