Bioinformatics reading list - Michael-D-Preston/PrestonLab GitHub Wiki

By Angus Ball

Introduction

Welcome to the reading list! These papers are going to be very dense and wide-ranging. I'm not anticipating you'll want to read them cover to cover, so here are some tips. First, with the review papers, just read the sections on metabarcoding or amplicon sequencing; metagenomics is not something you need to worry about for this analysis. Second, with some of the heavier stats package papers, I'll be real: I didn't read through exactly how the code behind these programs works. I'd pay attention to the general statistical techniques used (e.g. matrix completion, linear models). BUT you should definitely understand how these different packages are judged (e.g. sensitivity versus specificity, or type I/II errors). As long as you feel comfortable with your understanding of what you're using, you're good :-).
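If the judging terms are fuzzy, here is a minimal sketch of how sensitivity and specificity fall out of a benchmark where the truly different taxa are known (as in a simulation). The function name, the taxon labels, and the toy numbers are all made up for illustration and don't come from any of the papers; it also assumes at least one truly different taxon and at least one truly unchanged taxon, otherwise the divisions fail.

```python
def confusion_rates(truly_different, called_different, all_taxa):
    """Score one differential abundance run against a known ground truth.

    All three arguments are collections of taxon names (hypothetical inputs).
    """
    truth = set(truly_different)
    called = set(called_different)
    taxa = set(all_taxa)

    tp = len(truth & called)          # truly different and flagged
    fp = len(called - truth)          # flagged but not truly different (type I error)
    fn = len(truth - called)          # truly different but missed (type II error)
    tn = len(taxa - truth - called)   # correctly left alone

    return {
        "sensitivity": tp / (tp + fn),          # share of real differences found
        "specificity": tn / (tn + fp),          # share of unchanged taxa left alone
        "false_positive_rate": fp / (fp + tn),  # 1 - specificity
        "false_negative_rate": fn / (fn + tp),  # 1 - sensitivity
    }


# Toy example: 10 taxa, 4 truly different, and the method flags 4 (3 right, 1 wrong).
all_taxa = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
rates = confusion_rates(
    truly_different=["A", "B", "C", "D"],
    called_different=["A", "B", "C", "E"],
    all_taxa=all_taxa,
)
print(rates)
```

A method that flags everything gets perfect sensitivity and terrible specificity, which is why the comparison papers report both.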

PS. I'll be citing these papers throughout the specific analysis sections, so don't feel like you need to read everything here before you start. You can do it piecemeal.

Either way...

Some introductory review papers

A primer and discussion on DNA-based microbiome data and related bioinformatics analyses

Identifying biases and their potential solutions in human microbiome studies

DNA Metabarcoding for the Characterization of Terrestrial Microbiota—Pitfalls and Solutions

Transformations and data structure

Microbiome Datasets Are Compositional: And This Is Not Optional

Compositional analysis: a valid approach to analyze microbiome high-throughput sequencing data

A review of normalization and differential abundance methods for microbiome counts data

Naught all zeros in sequence count data are the same

Differential abundance

Read me for a comparison of methods (note: these papers are out of date; for example, they don't include ANCOM-BC2):

Microbiome differential abundance methods produce different results across 38 datasets

The accuracy of absolute differential abundance analysis from relative count data

Read me for ALDEx2:

ALDEx2: Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis

This is useful when trying to interpret ALDEx2 output (it explains Bland-Altman, volcano, and effect plots): Displaying Variation in Large Datasets: Plotting a Visual Summary of Effect Sizes

Read me for ANCOM-BC:

ANCOM-BC: Analysis of compositions of microbiomes with bias correction

ANCOM-BC part 2: Analysis of microbial compositions: a review of normalization and differential abundance analysis

ANCOM-BC2, read me

Beta diversity

Compositionally Aware Phylogenetic Beta-Diversity Measures Better Resolve Microbiomes Associated with Phenotype

A Novel Sparse Compositional Technique Reveals Microbial Perturbations

Alpha diversity

Rarefaction, Alpha Diversity, and Statistics

Estimating diversity in networked ecological communities

Improved detection of changes in species richness in high diversity microbial communities

Functional annotation

Inferring microbiota functions from taxonomic genes: a review

Network analysis

Bare minimum:

Also give this bad boy a proper read for a comparison of network analysis methods: Network analysis methods for studying microbial communities: A mini review

I've decided the package to use is NetCoMi, mostly for its ability to do comparative network analysis. If you plan on using this program, go read the paper: NetCoMi: network construction and comparison for microbiome data in R

And as much of the supplementary data as you emotionally can.

Shocker: network analysis is hard. Here are some other resources:

This is a good paper for an overview of what network analysis actually is (read this paper first): Connect the dots: sketching out microbiome interactions through networking approaches

Go watch this YouTube video for a verbal explanation of network analysis: From hairballs to hypotheses: network analysis... - Karoline Faust - MICROBIOME - ISMB/ECCB 2023

Open challenges for microbial network construction and analysis

I'm so sorry... This is 19 pages of heavy, hard text. I take no joy in making it required reading, and yet it is. Mercy: From diversity to complexity: Microbial networks in soils