Bioinformatics reading list - Michael-D-Preston/PrestonLab GitHub Wiki
By Angus Ball
Introduction
Welcome to the reading list! These papers are gonna be very dense and wide ranging. I'm not anticipating you'll want to read them cover to cover, so here are some tips. First, with the review papers, just read the sections on metabarcoding or amplicon sequencing; metagenomics is not something you should be worrying about for this analysis. Second, with some of the heavier stats package papers, I'll be real: I didn't read exactly how the code of these programs works. I'd pay attention to the general statistical techniques used (e.g. matrix completion, linear models). BUT you should definitely understand how these different packages are judged (e.g. sensitivity versus specificity, or type 1/2 errors). That's enough, as long as you feel comfortable with your understanding of what you're using :-).
PS. I'll be citing these papers throughout the specific analysis sections, so don't feel like you need to read everything here before you start. You can do it piecemeal.
Either way...
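If the evaluation terms mentioned above are new to you, here is a minimal sketch of how sensitivity, specificity, and type 1/2 error rates fall out of a benchmark. Everything in it is made up for illustration: the `truth` and `called` lists are hypothetical toy data (eight taxa, three of them truly differentially abundant), not results from any of the papers below, and the function names are my own.

```python
def confusion_counts(truth, called):
    """Count true/false positives and negatives for a set of taxa.

    truth[i]  -- True if taxon i really is differentially abundant
    called[i] -- True if the method flagged taxon i as significant
    """
    tp = fp = tn = fn = 0
    for is_true, is_called in zip(truth, called):
        if is_true and is_called:
            tp += 1          # real signal, detected
        elif not is_true and is_called:
            fp += 1          # type 1 error (false positive)
        elif not is_true and not is_called:
            tn += 1          # no signal, correctly ignored
        else:
            fn += 1          # type 2 error (false negative)
    return tp, fp, tn, fn


# Hypothetical toy benchmark: 8 taxa, 3 truly differentially abundant.
truth = [True, True, True, False, False, False, False, False]
called = [True, True, False, True, False, False, False, False]

tp, fp, tn, fn = confusion_counts(truth, called)

sens = tp / (tp + fn)        # sensitivity: share of real signals detected
spec = tn / (tn + fp)        # specificity: share of null taxa left alone
type1_rate = 1 - spec        # false positive rate
type2_rate = 1 - sens        # false negative rate

print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
print(f"type 1 rate={type1_rate:.2f} type 2 rate={type2_rate:.2f}")
```

A method that flags everything gets perfect sensitivity and terrible specificity, which is why the comparison papers report both.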
Some introductory review papers
A primer and discussion on DNA-based microbiome data and related bioinformatics analyses
Identifying biases and their potential solutions in human microbiome studies
DNA Metabarcoding for the Characterization of Terrestrial Microbiota—Pitfalls and Solutions
Transformations and data structure
Microbiome Datasets Are Compositional: And This Is Not Optional
Compositional analysis: a valid approach to analyze microbiome high-throughput sequencing data
A review of normalization and differential abundance methods for microbiome counts data
Naught all zeros in sequence count data are the same
Differential abundance
Read me for a comparison of methods (note: these papers are out of date; for example, they don't include ANCOM-BC2):
Microbiome differential abundance methods produce different results across 38 datasets
The accuracy of absolute differential abundance analysis from relative count data
Read me for ALDEx2:
This is useful when trying to interpret ALDEx2 (it explains Bland-Altman, volcano, and effect plots): Displaying Variation in Large Datasets: Plotting a Visual Summary of Effect Sizes
Read me for ANCOM-BC:
ANCOM-BC: Analysis of compositions of microbiomes with bias correction
Beta diversity
A Novel Sparse Compositional Technique Reveals Microbial Perturbations
Alpha diversity
Rarefaction, Alpha Diversity, and Statistics
Estimating diversity in networked ecological communities
Improved detection of changes in species richness in high diversity microbial communities
Functional annotation
Inferring microbiota functions from taxonomic genes: a review
Network analysis
Bare minimum:
Also give this bad boy a proper read for a comparison of network analysis methods: Network analysis methods for studying microbial communities: A mini review
I've decided the package to use is NetCoMi, mostly for its ability to do comparative network analysis. If you plan on using this program, go read the paper: NetCoMi: network construction and comparison for microbiome data in R
And as much of the supplementary data as you emotionally can.
Shocker: network analysis is hard. Here are some other resources:
This is a good paper for an overview of what network analysis actually is (read this paper first): Connect the dots: sketching out microbiome interactions through networking approaches
Go watch this YouTube video for a verbal explanation of network analysis: From hairballs to hypotheses: network analysis... - Karoline Faust - MICROBIOME - ISMB/ECCB 2023
Open challenges for microbial network construction and analysis
I'm so sorry... This is 19 pages of heavy, hard text. I take no joy in making it required reading, and yet it is. Mercy: From diversity to complexity: Microbial networks in soils