Paper 3: RNA‐Sequencing - bcb420-2025/Keren_Zhang GitHub Wiki

RNA Sequencing: The Teenage Years

RNA sequencing (RNA-seq) has evolved significantly since its development more than a decade ago. Originally a method for analyzing differential gene expression, RNA-seq now shapes nearly every aspect of our understanding of genomic function.

Historical Context and Development

  • Origins: Introduced over a decade ago, RNA-seq was first used for differential gene expression (DGE) analysis in a range of organisms, including Zea mays, Arabidopsis thaliana, Saccharomyces cerevisiae, Mus musculus, and Homo sapiens.
  • Workflow: The standard workflow has not changed significantly and includes RNA extraction, mRNA enrichment or ribosomal RNA depletion, cDNA synthesis, adaptor-ligated library preparation, and sequencing, typically producing 10-30 million reads per sample.

Technological Advancements

  • Evolution of Methodologies: Improvements in long-read RNA-seq and direct RNA sequencing (dRNA-seq) have enhanced the ability to analyze RNA biology in a richer and less biased manner than older microarray-based methods.
  • Short-Read vs Long-Read: RNA-seq has traditionally been dominated by short-read technologies from Illumina. Newer long-read technologies, such as those from Pacific Biosciences and Oxford Nanopore, enable full-length mRNA sequencing and so give a better view of transcript complexity.

Current Applications

  • Beyond DGE: RNA-seq is now used for a variety of applications beyond traditional DGE. These include studying mRNA splicing, the role of non-coding RNAs in gene expression regulation, and other complex aspects of RNA biology.
  • Spatial Transcriptomics: Emerging approaches such as spatial transcriptomics (spatialomics) record the physical location of RNA transcripts within tissue samples, adding spatial context to transcriptomic data.

Future Prospects

  • Routine Applications: With ongoing advancements, techniques like single-cell RNA-seq and spatial RNA-seq are expected to become as routine as DGE analysis.
  • Replacement of Short-Read Technologies: There is potential for long-read methods to replace short-read techniques in specific niches where their advantages can be fully leveraged.

Challenges and Considerations

  • Data Complexity: The complexity of data generated by newer RNA-seq technologies demands advanced computational tools and methodologies for effective data analysis and interpretation.
  • Methodological Variance: The field continues to grapple with the challenge of methodological variance, particularly in how different RNA-seq approaches handle multi-mapped reads or isoform quantification.
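
To make the multi-mapped read issue concrete, here is a minimal sketch comparing two simple counting policies: dropping multi-mapped reads, or splitting each one equally across its candidate genes. The function name `count_reads`, the `strategy` values, and the toy alignments are all hypothetical and chosen for illustration only. Real quantification tools use more sophisticated models than either policy.

```python
from collections import defaultdict


def count_reads(alignments, strategy="unique"):
    """Count reads per gene from a mapping of read ID -> list of candidate genes.

    strategy="unique":     reads mapping to more than one gene are discarded.
    strategy="fractional": each multi-mapped read contributes 1/n to each of
                           its n candidate genes.
    Both policies are illustrative assumptions, not a specific tool's method.
    """
    if strategy not in ("unique", "fractional"):
        raise ValueError(f"unknown strategy: {strategy}")

    counts = defaultdict(float)
    for genes in alignments.values():
        if len(genes) == 1:
            # Uniquely mapped read: full count to its single gene.
            counts[genes[0]] += 1.0
        elif strategy == "fractional":
            # Multi-mapped read: split its weight equally.
            share = 1.0 / len(genes)
            for gene in genes:
                counts[gene] += share
        # Under "unique", multi-mapped reads are simply skipped.
    return dict(counts)


# Toy data: r2 and r4 align equally well to geneA and geneB.
toy_alignments = {
    "r1": ["geneA"],
    "r2": ["geneA", "geneB"],
    "r3": ["geneB"],
    "r4": ["geneA", "geneB"],
}

unique_counts = count_reads(toy_alignments, "unique")
fractional_counts = count_reads(toy_alignments, "fractional")
print(unique_counts)
print(fractional_counts)
```

In this toy case, half of the reads are lost under the "unique" policy, which shows how the choice of policy alone can change the counts that downstream analysis sees.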

Key Terms

Differential Gene Expression (DGE)
Methods used to identify quantitative changes in expression levels between experimental groups.
Read Depth
Total number of sequencing reads obtained for a sample, crucial for ensuring sufficient data for reliable analysis.
Short-Read Sequencing
Technologies generating reads of up to 500 bp, each representing a fragment of an mRNA (or a degraded mRNA) rather than the full-length transcript.
Long-Read Sequencing
Technologies producing reads over 1,000 bp, capturing full-length or nearly full-length mRNAs, and offering a more complete view of transcript diversity.
Direct RNA Sequencing (dRNA-seq)
A method of sequencing RNA directly without the need for reverse transcription, offering insights into RNA modifications and dynamics.
Multi-mapped Reads
Reads that could map to multiple locations in the genome or transcriptome, often a challenge in data analysis.
Synthetic Long Reads
A technique for creating long reads by assembling shorter reads, used to overcome limitations of short-read sequencing technologies.
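
As a rough illustration of how read depth and DGE relate, the sketch below normalizes raw counts to counts per million (CPM) so that samples sequenced to different depths are comparable, then computes a log2 fold change per gene. The helper names, the pseudocount of 1, and the toy counts are assumptions made for this example. It is not a statistical test: a real DGE analysis relies on replicates and a dedicated statistical model.

```python
import math


def cpm(counts):
    """Scale raw counts to counts per million of the sample's total reads."""
    total = sum(counts.values())
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) on CPM values.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no reads in one condition.
    """
    control_cpm = cpm(control)
    treated_cpm = cpm(treated)
    return {
        gene: math.log2(
            (treated_cpm[gene] + pseudocount) / (control_cpm[gene] + pseudocount)
        )
        for gene in control
    }


# Toy counts: the treated sample has twice the read depth of the control.
control_counts = {"geneA": 100, "geneB": 300, "geneC": 600}
treated_counts = {"geneA": 400, "geneB": 600, "geneC": 1000}

lfc = log2_fold_change(control_counts, treated_counts)
for gene, value in lfc.items():
    print(f"{gene}: {value:+.3f}")
```

Here geneB's raw count doubles, but so does the total read depth, so its normalized expression is unchanged. geneA rises about two-fold and geneC falls slightly, which is why raw counts should not be compared directly across samples.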
