1.8 Open Reading Frame - alunga20/Concepts_of_Molecular_Biology GitHub Wiki
-
An open reading frame is a portion of a DNA molecule that, when translated into amino acids, contains no stop codons.
-
The genetic code is read in groups of three bases (codons), which means that a double-stranded DNA molecule can be read in any of six possible reading frames: three in the forward direction and three in the reverse. A long open reading frame is likely part of a gene.
-
These sequences – called open reading frames (ORFs) – begin with a start codon and run uninterrupted until the first in-frame stop codon.
-
As a rule of thumb, an open reading frame is treated as a candidate coding region only if it spans at least 100 codons (300 nucleotides).
-
While open reading frames may predict potential coding regions, they do not automatically guarantee the presence of a gene.
-
Some long, uninterrupted sequences of DNA may not actually be translated, while some short sequences may code for protein.
-
Any particular stretch of DNA will have six reading frames that could potentially code for a functional protein.
-
mRNA is translated in codons (triplets of bases), meaning there are three potential reading frames for a given DNA strand, one for each possible starting position within a triplet.
-
DNA is double-stranded and either strand could include a gene, meaning there are six reading frames in total (2 × 3).
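
The six reading frames can be listed explicitly with a few lines of Python. This is a minimal sketch, not part of the original page: the function names (`reverse_complement`, `codons`, `six_frames`), the `+1`..`-3` frame labels, and the example sequence are all chosen here for illustration, and the input is assumed to contain only uppercase A, C, G and T.

```python
# Complement table for the four DNA bases (uppercase only; an assumption of this sketch).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def codons(seq, offset):
    """Split seq into complete triplets, starting at offset 0, 1 or 2."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

def six_frames(seq):
    """Map a frame label ('+1'..'+3', '-1'..'-3') to its list of codons."""
    frames = {}
    for sign, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = codons(strand, offset)
    return frames

example = "ATGAAATAGC"  # made-up sequence for illustration
for label, triplets in six_frames(example).items():
    print(label, " ".join(triplets))
```

Each strand contributes three frames because shifting the starting position by three bases reproduces the same triplets, so only offsets 0, 1 and 2 are distinct.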
To identify an open reading frame:
-
Locate a sequence corresponding to a start codon in order to determine the reading frame – this will be ATG (sense strand).
-
Read this sequence in base triplets until a stop codon is reached (TGA, TAG or TAA).
-
The longer the open reading frame, the greater the likelihood that it corresponds to a protein-coding sequence rather than a chance run of codons without a stop.
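
The link between length and significance can be made concrete with a rough calculation. The sketch below assumes, as a simplification, that all 64 codons are equally likely in random sequence; real genomes have biased base composition, so the numbers are only indicative. The function name `p_stop_free` is introduced here for illustration.

```python
# Of the 64 possible codons, 3 are stop codons (TAA, TAG, TGA).
STOP_CODONS = 3
ALL_CODONS = 64

def p_stop_free(n_codons):
    """Probability that n_codons consecutive random codons contain no stop,
    assuming every codon is equally likely (a simplifying assumption)."""
    return ((ALL_CODONS - STOP_CODONS) / ALL_CODONS) ** n_codons

for n in (10, 50, 100, 300):
    print(f"{n:>3} codons: {p_stop_free(n):.2e}")
```

Under this assumption the chance of a stop-free run drops below 1% by about 100 codons, which is one way to see why long ORFs are unlikely to arise by chance.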
NOTE:
Certain bioinformatic programs can automatically identify potential ORFs when provided with a candidate sequence.
Gene sequences tend to be conserved across species – so if an ORF sequence is present in multiple genomes, it likely represents a gene.
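
A toy version of such an automated ORF scan, following the steps listed earlier (find ATG, read in triplets, stop at TGA, TAG or TAA), might look like the following. It is a minimal sketch and not how any particular program works: the names `find_orfs_one_strand` and `find_orfs`, the `min_codons` parameter, and the example sequence are assumptions made here, and real tools handle alternative start codons, ambiguous bases and other details that are ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs_one_strand(seq, min_codons):
    """Yield (start, end) for each ATG...stop ORF in the three frames of seq.
    end is exclusive and includes the stop codon; min_codons counts the
    codons from ATG up to, but not including, the stop codon."""
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield start, i + 3
                start = None

def find_orfs(seq, min_codons=100):
    """Scan both strands; return (strand, start, end, orf_sequence) tuples.
    Coordinates on the '-' strand refer to the reverse-complemented sequence."""
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for start, end in find_orfs_one_strand(s, min_codons):
            results.append((strand, start, end, s[start:end]))
    return results

# Made-up sequence; min_codons is lowered so the short example ORF is reported.
print(find_orfs("CCATGAAACCCTAGGG", min_codons=2))
```

The default `min_codons=100` mirrors the length threshold mentioned above; lowering it reports more candidates at the cost of more chance hits.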