8: OPEN READING FRAMES - Natasha-Adongo/assignment GitHub Wiki
INTRODUCTION
Open reading frames (ORFs) are parts of a reading frame that contain no stop codons. A reading frame is a sequence of nucleotide triplets that are read as codons specifying amino acids; a single strand of DNA has three possible reading frames, so a double-stranded sequence has six. Long ORFs may indicate candidate protein-coding regions in a DNA sequence. The start and stop ends of an ORF are not equivalent to the ends of the mRNA, but they are usually contained within it. In a gene, the ORF lies between the start codon (initiation codon) and the stop codon (termination codon). ORFs are usually encountered when sifting through pieces of DNA while trying to locate a gene.
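To make the idea of three reading frames per strand concrete, here is a minimal Python sketch that splits one strand into codons at each of the three possible offsets. The function name `codons_in_frame` and the example sequence are made up for illustration only.

```python
def codons_in_frame(seq: str, frame: int) -> list[str]:
    """Split seq into complete codons, starting at offset frame (0, 1 or 2).

    Trailing bases that do not fill a whole codon are dropped.
    """
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


# Hypothetical example sequence, chosen only for illustration.
seq = "ATGAAATGA"

# One list of codons per reading frame of this strand.
frames = {frame: codons_in_frame(seq, frame) for frame in range(3)}

for frame, codons in frames.items():
    print(frame, codons)
```

The same bases give a different set of codons in each frame, which is why the frame has to be fixed before a sequence can be read as amino acids.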
The existence of an ORF, especially a long one, is usually a good indication of the presence of a gene in the surrounding sequence. In this case, the ORF is the part of the sequence that will be translated by the ribosomes, and it will be long; if the DNA is eukaryotic, the ORF may continue over gaps called introns. However, short ORFs can also occur by chance outside of genes. ORFs outside genes are usually not very long and terminate after a few codons. While open reading frames may predict potential coding regions, they do not guarantee the presence of a gene.
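A rough calculation shows why chance ORFs tend to be short. The sketch below assumes a simplified model that is not part of the text above: the four bases are equally frequent and independent, so each codon is a stop codon with probability 3/64. Real genomes deviate from this, so the numbers are only indicative. The name `prob_stop_free` is hypothetical.

```python
N_STOP_CODONS = 3     # TGA, TAG, TAA
N_TOTAL_CODONS = 64   # 4 bases ** 3 positions


def prob_stop_free(n_codons: int) -> float:
    """Probability that n_codons consecutive random codons contain no stop.

    Assumes equal, independent base frequencies (a simplification).
    """
    p_not_stop = (N_TOTAL_CODONS - N_STOP_CODONS) / N_TOTAL_CODONS
    return p_not_stop ** n_codons


for n in (10, 50, 100, 300):
    print(n, prob_stop_free(n))
```

Under this model a stop codon turns up about once every 21 codons (64/3), and a stop-free run of 100 codons has a probability below one percent.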
HOW TO IDENTIFY AN OPEN READING FRAME
- Locate a sequence corresponding to a start codon in order to determine the reading frame – this will be ATG (sense strand)
- Read this sequence in base triplets until a stop codon is reached (TGA, TAG or TAA)
- The longer the ORF, the greater the likelihood that it corresponds to a protein-coding region
Certain bioinformatics programs can automatically identify potential ORFs when provided with a candidate sequence.
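The manual steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular ORF-finding program works. It scans only the strand it is given, in all three frames, and reports each stretch from the first ATG to the next in-frame stop codon. The function name `find_orfs` and the `min_codons` parameter are assumptions introduced for this example.

```python
START_CODON = "ATG"
STOP_CODONS = {"TGA", "TAG", "TAA"}


def find_orfs(seq: str, min_codons: int = 0) -> list[tuple[int, int, str]]:
    """Return (start, end, orf_sequence) for each ATG...stop stretch.

    start is a 0-based index; end is exclusive and includes the stop codon.
    min_codons is the minimum number of codons before the stop codon.
    Only the given strand is scanned (hypothetical helper, for illustration).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)


# Hypothetical example sequence, chosen only for illustration.
for start, end, orf in find_orfs("AAATGCCCTAAGG"):
    print(start, end, orf)
```

To cover the opposite strand as well, the same function could be run on the reverse complement of the sequence. Raising `min_codons` filters out the short ORFs that are more likely to have arisen by chance.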