II. What We Measure, Why and How - abishpius/R-for-Computational-Biology GitHub Wiki
Overview of what we measure and why
The "What we measure and why" subsection reviews basic biology terms used throughout genomics and our introduction to Bioconductor. It also explores these concepts with basic R code.
Genomics and bioinformatics are widely used in scientific research, both for basic biology and to make clinical discoveries that improve patient care.
For example, the 70-gene MammaPrint expression signature has been used to guide treatment decisions for thousands of people with breast cancer.
The Molecular Basis for Phenotypic Variation
Phenotypic variation between organisms and cells is partly explained by their DNA.
DNA encodes all of the information for making proteins, the building blocks of life. Messages within the DNA are transcribed into RNA, which is then translated into protein.
Phenotypic variation depends on differences in the DNA and also on differences in which parts of the DNA are expressed as proteins.
We use genomic technologies to measure differences in DNA and gene expression, such as changes in the sequence or amounts of expression, and relate those changes to phenotypic variation.
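The flow from DNA to RNA to protein described above can be sketched in a few lines of base R. This is only an illustrative sketch: the DNA sequence is made up, the codon table is deliberately limited to the four codons that appear in it, and the variable names (`dna`, `rna`, `codon_table`, `protein`) are our own choices rather than anything from a package.

```r
# Coding-strand DNA for a short, made-up open reading frame (assumption)
dna <- "ATGGCCTGGTAA"

# Transcription: the mRNA matches the coding strand with T replaced by U
rna <- chartr("T", "U", dna)

# A deliberately tiny codon table covering only the codons used here
codon_table <- c(AUG = "M", GCC = "A", UGG = "W", UAA = "*")

# Translation: split the mRNA into consecutive triplets and look each one up
starts  <- seq(1, nchar(rna) - 2, by = 3)
codons  <- substring(rna, starts, starts + 2)
protein <- paste(codon_table[codons], collapse = "")

rna      # "AUGGCCUGGUAA"
protein  # "MAW*" (the * marks the stop codon)
```

For real sequences, a full genetic code table (for example the one shipped with the Bioconductor Biostrings package) would replace the toy `codon_table`.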
DNA: chromosomes, replication, SNPs and other variants
Single Nucleotide Polymorphism (SNP): a single base position in the genome that commonly differs between individuals and may contribute to phenotypic differences
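To make the definition concrete, here is a minimal base R sketch that locates a single-base difference between two aligned sequences. Both sequences are invented for illustration, and the names `individual_a`, `individual_b`, `snp_pos` and `variant` are hypothetical. The sketch assumes the sequences are already aligned and of equal length.

```r
# Two made-up versions of the same short region, one per individual
individual_a <- "ACGTTGCA"
individual_b <- "ACGTCGCA"

# Split each sequence into single bases
bases_a <- strsplit(individual_a, "")[[1]]
bases_b <- strsplit(individual_b, "")[[1]]

# Positions where the two sequences disagree
snp_pos <- which(bases_a != bases_b)

# Describe the difference as "base in A > base in B"
variant <- paste0(bases_a[snp_pos], ">", bases_b[snp_pos])

snp_pos  # 5
variant  # "T>C"
```

Whether such a difference counts as a polymorphism depends on how often it occurs in a population, which a comparison of two sequences cannot show.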
Gene Expression
In a typical human somatic cell, before DNA replication, there are two copies of the DNA for most genes. The number of RNA copies per gene varies.
Humans are a diploid species: before S phase, in which the chromosomes are duplicated, somatic cells typically contain two copies of each autosomal (not X or Y) chromosome.
The number of copies of RNA depends on how much the gene is being transcribed. “Housekeeping genes” such as those used to make ribosomal or transfer RNA are transcribed at a high rate, while other genes, such as those encoding some transcription factors, are transcribed less frequently. Some genes are not transcribed at all in certain cell types.
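The contrast between a fixed DNA copy number and a variable RNA copy number can be pictured with a toy table in base R. The gene labels and every count below are invented purely for illustration. They are not measurements, and the column names are our own.

```r
# Hypothetical genes with made-up transcript counts, for illustration only
genes <- data.frame(
  gene       = c("housekeeping_gene", "transcription_factor", "silent_gene"),
  dna_copies = c(2, 2, 2),      # diploid autosomal genes before S phase
  rna_copies = c(5000, 20, 0)   # invented values, not real data
)

# DNA copy number is the same for every gene in this toy table
unique(genes$dna_copies)

# RNA abundance differs, and some genes are not expressed at all
genes$expressed <- genes$rna_copies > 0
genes[order(-genes$rna_copies), ]
```

In practice, the `rna_copies` column is what expression technologies such as microarrays or RNA sequencing try to estimate.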
Genetic variants, such as single nucleotide polymorphisms, can be transmitted across generations. mRNA transcripts and proteins such as transcription factors degrade over time and, most importantly, do not replicate themselves. In other words, DNA (not protein or RNA) is the main molecule of genetic inheritance. (Side note: there are cases of mRNA and proteins being temporarily inherited, for example the mRNAs present in the egg cell at the moment it is fertilized by the sperm cell.)