Questions2 - ParkinsonLab/microbiome_helper GitHub Wiki

Name: ________________________ Student Number: ________________________________

#Metatranscriptomics Tutorial Questions

####Remove adapter sequences and trim low quality sequences
Question 1: How many low quality sequences have been removed after step 1 (Trimmomatic)?

 

 

####Read quality filtering
Question 2: How has the per read sequence quality curve changed after read quality filtering (vsearch)?

 

 

####Remove duplicate reads
Question 3: How many unique reads are in the dataset?
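
For orientation, counting unique reads amounts to tallying distinct sequences. The following is a minimal sketch of that idea, not the tutorial's actual deduplication step: the function name and the example reads are made up for illustration, and the real tool may define duplicates differently (for example, by also comparing read lengths or identifiers).

```python
from collections import Counter
from typing import Iterable


def tally_sequences(sequences: Iterable[str]) -> Counter:
    """Map each distinct sequence to the number of times it occurs.

    Sequences are stripped of whitespace and upper-cased first, so
    "acgt" and "ACGT" count as the same read (an assumption of this sketch).
    """
    return Counter(seq.strip().upper() for seq in sequences)


# Made-up reads, for illustration only.
reads = ["ACGT", "acgt", "GGCC", "ACGT", "TTAA"]
counts = tally_sequences(reads)

print(len(counts))           # number of unique sequences
print(sum(counts.values()))  # total number of reads
```

The number of keys is the unique read count, while the summed values recover the original read count.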

 

 

####Remove vector contamination
Question 4: How many reads did BWA map to the vector database?

 

 

####Remove host reads
Question 5: How many reads did BWA and BLAT align to the mouse host sequence database?

 

 

####Remove abundant rRNA sequences
Question 6: How many rRNA sequences were identified? How many reads are now remaining?

 

 

####Rereplication
Question 7: How many putative mRNA sequences were identified? How many unique mRNA sequences?  

 

Question 8: How many total contaminant, host, and rRNA reads were filtered out?

 

 

####Taxonomic Classification

Question 9: How many reads did kaiju classify?

 

 

Question 10: What is the most abundant family in our dataset? What is the most abundant phylum? Hint: Try decreasing the Max depth value on the top left of the screen and/or double clicking on specific taxa.

 

 

####Assembling reads
Question 11: How many assemblies did SPAdes produce? Hint: try using the command `tail mouse1_contigs.fasta`
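
As an alternative to reading the last header shown by `tail`, the records in a FASTA file can be counted directly, since every sequence begins with a header line starting with `>`. This is a minimal sketch; the function name and the in-memory example contigs are hypothetical and only illustrate the idea.

```python
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count sequences by counting header lines, which start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


# Tiny in-memory example; the contig names are made up for illustration.
example = [
    ">contig_1\n",
    "ACGTACGT\n",
    "ACGT\n",
    ">contig_2\n",
    "GGCCTTAA\n",
]
print(count_fasta_records(example))  # prints 2
```

An open file handle is also an iterable of lines, so the same function should work on the real file by passing `open("mouse1_contigs.fasta")` inside a `with` block.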

 

 

Question 12: How many reads were not used in contig assembly? How many reads were used in contig assembly? How many contigs did we generate?

 

 

####Annotate reads to known genes/proteins
Question 13: How many reads were mapped in each step? How many genes were the reads mapped to? How many proteins were the genes mapped to?

 

 

####Enzyme Function Annotation
Question 14: How many unique enzyme functions were identified in our dataset?

 

 

####Generate normalized expression values associated with each gene
Question 15: Have a look at the `mouse1_RPKM.txt` file. What are the most highly expressed genes? Which phylum appears most active?
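
When interpreting the values, it may help to recall the standard definition of RPKM (reads per kilobase of gene per million mapped reads). The sketch below implements that textbook formula. How the tutorial's own script computes the values in the file is not shown here, so treat this as an assumption; the function and parameter names are hypothetical.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: reads * 1e9 / (gene length in bp * total mapped reads)."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Made-up numbers: 500 reads on a 2,000 bp gene, 1,000,000 mapped reads in total.
print(rpkm(500, 2000, 1_000_000))  # prints 250.0
```

Because the count is divided by gene length, a long gene with many reads can have a lower RPKM than a short gene with fewer reads.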