1.8 Open Reading Frame - alunga20/Concepts_of_Molecular_Biology GitHub Wiki

  • An open reading frame is a portion of a DNA molecule that, when translated into amino acids, contains no stop codons.

  • The genetic code is read in groups of three bases (codons), so a double-stranded DNA molecule can be read in any of six possible reading frames: three in the forward direction and three in the reverse. A long open reading frame is likely to be part of a gene.

  • Open reading frames (ORFs) begin with a start codon and continue, uninterrupted by stop codons, until the first in-frame stop codon.

  • Open reading frames that are treated as candidate genes typically consist of at least 100 codons (300 nucleotides).

  • While open reading frames may predict potential coding regions, they do not automatically guarantee the presence of a gene.

  • Some long, uninterrupted sequences of DNA may not actually be translated, while some short sequences may code for protein.

  • Any particular stretch of DNA will have six reading frames that could potentially code for a functional protein.

  • mRNA is translated in codons (triplets of bases), meaning there are three potential reading frames for a given DNA strand.

  • DNA is double-stranded and either strand could include a gene, meaning there are six reading frames in total (2 × 3).
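
The six reading frames can be enumerated directly. The following is a minimal Python sketch, not taken from the original notes: the function names `reverse_complement` and `six_frames` and the frame labels (+1 to +3, -1 to -3) are illustrative choices, and the input is assumed to contain only the unambiguous bases A, C, G and T.

```python
# Map each DNA base to its complement (unambiguous uppercase bases only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def six_frames(seq):
    """Split a sequence into codons for all six reading frames.

    Returns a dict keyed by hypothetical frame labels: "+1".."+3" for the
    given strand and "-1".."-3" for its reverse complement.
    Trailing bases that do not fill a whole codon are dropped.
    """
    frames = {}
    for sign, strand in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            codons = [strand[i:i + 3]
                      for i in range(offset, len(strand) - 2, 3)]
            frames[f"{sign}{offset + 1}"] = codons
    return frames


frames = six_frames("ATGAAATAG")
for label, codons in frames.items():
    print(label, codons)
```

Each strand contributes three frames (offsets 0, 1 and 2), which gives the 2 × 3 = 6 frames described above.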

To identify an open reading frame:

  1. Locate a sequence corresponding to a start codon in order to determine the reading frame – this will be ATG (sense strand).

  2. Read this sequence in base triplets until a stop codon is reached (TGA, TAG or TAA).

  3. The longer the uninterrupted sequence, the greater the likelihood that it corresponds to a genuine protein-coding region.
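
The steps above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the method used by any particular ORF-finding program: the function name `find_orfs`, the `min_codons` parameter and the tuple layout of the result are hypothetical, and only the three frames of the strand passed in are scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=100):
    """Scan the three frames of one strand for ATG...stop ORFs.

    Returns a list of (start, end, frame) tuples, where start is the
    0-based index of the ATG, end is one past the last base of the stop
    codon, and frame is 1, 2 or 3. min_codons counts the codons from the
    ATG up to, but not including, the stop codon (assumed default: 100).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # step 1: a start codon sets the reading frame
            elif start is not None and codon in STOP_CODONS:
                # step 2: the first in-frame stop codon closes the ORF
                if (i - start) // 3 >= min_codons:
                    # step 3: keep only sufficiently long ORFs
                    orfs.append((start, i + 3, frame + 1))
                start = None
    return orfs


# A toy sequence with one short ORF; the threshold is lowered to show it.
example = "CCATGAAATTTTAGGG"
print(find_orfs(example, min_codons=3))
```

To cover all six reading frames, the same function could be run a second time on the reverse complement of the sequence. An ATG with no downstream in-frame stop codon is not reported by this sketch.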


NOTE:

Certain bioinformatics programs can automatically identify potential ORFs when provided with a candidate sequence.

Gene sequences are largely conserved – so if an ORF sequence is present in multiple genomes, it likely represents a gene.
