Sequencing Depth - uic-ric/uic-ric.github.io GitHub Wiki

These are general recommendations for sequencing depth for different types of sequencing experiments. Unless otherwise noted, recommendations below are for typical mammalian systems, with ~3 Gb genomes and ~20,000 protein coding genes. For other types of organisms you will need to scale estimates based on the genome size and/or number of genes. Please contact the Research Informatics Core (RIC) at [email protected] with any questions.
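
The genome-size scaling mentioned above can be sketched as a short calculation. This is a minimal illustration, not an official RIC tool: the function names `mean_coverage` and `clusters_needed` are hypothetical, and the arithmetic assumes every sequenced base is usable (no PCR duplicates, unmapped reads, or off-target reads), so real experiments usually need some extra margin. It counts one read per cluster for SE and two for PE.

```python
import math


def mean_coverage(clusters, read_length_bp, target_size_bp, paired_end=True):
    """Idealized mean depth: total sequenced bases divided by target size."""
    reads_per_cluster = 2 if paired_end else 1
    return clusters * reads_per_cluster * read_length_bp / target_size_bp


def clusters_needed(target_coverage, read_length_bp, target_size_bp, paired_end=True):
    """Clusters required to reach a target mean depth (rounded up)."""
    reads_per_cluster = 2 if paired_end else 1
    bases_needed = target_coverage * target_size_bp
    return math.ceil(bases_needed / (reads_per_cluster * read_length_bp))


# Whole-genome example: 300M clusters, PE 150 bp, ~3 Gb genome
print(mean_coverage(300_000_000, 150, 3_000_000_000))  # 30.0

# Small-genome example: ~100x on a 5 Mb genome with PE 150 bp
print(clusters_needed(100, 150, 5_000_000))  # 1666667, i.e. roughly 2M
```

To adapt a coverage-based recommendation to another organism, replace `target_size_bp` with that organism's genome (or capture target) size and keep the target coverage fixed.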

Omics type	Experiment	Recommended depth (number of clusters)	Paired End (PE) or Single End (SE)	Minimum recommended sequencing length (bp)	Notes
Transcriptomics	RNA-seq (gene expression)	20-30M	SE	50	PE data not necessary if only gene-level expression is needed.
	RNA-seq (isoform expression)	40-60M	PE	75-100	PE data is essential. Longer reads can be helpful for finding splice junctions.
	miRNA-seq	5-10M	SE	50	No benefit to longer or PE reads for short RNAs.
Epigenomics	ATAC-seq	40-60M	PE	50	PE data provides benefit in identifying peaks and resolving PCR duplicates.
	ChIP-seq (narrow marks)	40-60M	PE	50	PE data provides benefit in identifying peaks and resolving PCR duplicates. Should be paired with an input sample sequenced the same way, to at least the same depth. You may need to aim slightly higher or lower in depth depending on the prevalence of the mark across the genome (e.g., transcription factors = less prevalent, histone marks = more prevalent), as well as the efficiency of the antibody.
	ChIP-seq (broad marks)	70-100M	PE	50	Higher depth is important for broader marks, which show lower percentage enrichment over input. This recommendation also applies to methylation pulldown experiments (MeDIP-seq or MBD-seq).
	Bisulfite (RRBS)	20-40M	PE or SE with UMI	150	Some library options include UMIs to identify PCR duplicates, in which case SE data is sufficient.
	Bisulfite (whole genome)	300-500M	PE or SE with UMI	150	Longer and paired-end reads are needed for accurate alignments in lower-complexity bisulfite-converted genomes. Aiming for ~30-50x coverage. Anticipate that ~25-40% of reads will not be mappable to the reference.
Genomics	Variant calling (whole genome)	300M	PE	150	Aiming for ~30x coverage. This recommendation is for germline variant calling; for somatic variant calling, aim for ~50-100x coverage.
	Variant calling (germline, exome)	10-15M	PE	150	Typical target size is ~40Mb for human/mouse, aiming for ~50-100x coverage. For somatic variant calling, aim for 100-150x coverage.
	Variant calling (prokaryotic/small genome)	~2M	PE	150	Recommended depth is based on ~100x coverage for a 5 Mb genome. Scale up or down as needed for bigger or smaller genomes.
	De novo genome assembly (prokaryotic/small genome)	Illumina + Long read	PE	150	We recommend ~100x coverage from both Illumina and long-read sequencing (PacBio or Nanopore).
Metagenomics	Shotgun metagenomics	10-20M	PE	150, 250 preferred	For short-read annotation approaches only, overlapping paired-end reads are recommended. For combined short-read annotation and de novo assembly, larger inserts are recommended. Even longer reads are becoming possible with PacBio, Oxford Nanopore, and synthetic long-read technologies.
	Metatranscriptomics	10-20M	PE	150, 250 preferred	Actual sequencing depth may vary based on the amount of host sequence data and/or rate of ribosomal depletion.
	Amplicon metagenomics	10-50k	PE	150-300, depending on amplicon length	There should be a minimum of 20bp overlap between PE reads for merging of forward and reverse reads. Some amplicons are too long to be fully sequenced by Illumina sequencers. Long read sequencing can be achieved with PacBio, Oxford Nanopore and synthetic long-read sequencing (Loop Genomics).
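
The 20 bp overlap rule for amplicon sequencing can be checked with simple arithmetic before choosing a read length. The sketch below is illustrative only: `pe_overlap` and `can_merge` are hypothetical helper names, the 460 bp amplicon is an invented example, and the calculation assumes the amplicon length is the full sequenced insert and that no bases are lost to quality trimming (trimming reduces the effective overlap).

```python
def pe_overlap(read_length_bp, amplicon_length_bp):
    """Bases covered by both the forward and reverse read of a pair.

    A negative value means the two reads leave a gap in the middle of
    the amplicon. The overlap cannot exceed the amplicon length itself.
    """
    return min(2 * read_length_bp - amplicon_length_bp, amplicon_length_bp)


def can_merge(read_length_bp, amplicon_length_bp, min_overlap_bp=20):
    """True if the pair overlaps by at least the minimum needed for merging."""
    return pe_overlap(read_length_bp, amplicon_length_bp) >= min_overlap_bp


# Hypothetical 460 bp amplicon
print(pe_overlap(250, 460), can_merge(250, 460))  # 40 True
print(pe_overlap(150, 460), can_merge(150, 460))  # -160 False
```

When `can_merge` returns False even at the longest available paired-end read length, the amplicon falls into the "too long to be fully sequenced" case described in the table, and a long-read platform is the alternative.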
