1.7 Open Reading Frame (ORF) - swatiri/Molecular-Biology GitHub Wiki
An open reading frame (ORF) is a portion of a DNA molecule that, when translated into amino acids, contains no stop codons. ORFs are the sequences found between a start codon and a stop codon. An ORF will typically consist of at least 100 codons (300 nucleotides). The name describes a frame of reference: what is being "read" is the RNA code, which the ribosome reads in order to make a protein. "Open" means that the road is open to keep reading, so the ribosome can continue along the RNA code and add one amino acid after another.
DNA is transcribed into RNA, which is then translated into a protein. During translation, the mRNA is not read one letter at a time but three letters at a time. Each group of three letters is called a codon, and each codon is interpreted by the ribosome. So an open reading frame is the length of DNA (or of the RNA transcribed from it) through which the ribosome can travel, adding one amino acid after another, before it runs into a codon that doesn't code for any amino acid. When that happens, no further amino acid can be added, and the ribosome stops.
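The codon-by-codon reading described above can be sketched in a few lines of Python. This is only an illustration: the function name `translate` is made up for this example, the sequence is written in DNA letters (T rather than U) as on the coding strand, and the table is the standard genetic code with `*` marking codons that code for no amino acid.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
# "*" marks the three stop codons (TAA, TAG, TGA).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read the sequence three letters at a time and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break  # no amino acid for this codon, so reading ends here
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGAAATTTTAG"))
```

Here the codons read are ATG, AAA, TTT and then TAG, so the loop adds three amino acids before it reaches the stop codon.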
A codon that makes the ribosome stop in this way is called a stop codon, and a stop codon ends an open reading frame. By analyzing an ORF we can predict the amino acids that might be produced during translation. Because the genetic code is read in groups of three bases, a double-stranded DNA molecule can be read in any of six possible reading frames: three in the forward direction and three in the reverse. A long open reading frame is likely part of a gene.
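To make the six reading frames concrete, here is a small Python sketch that splits a sequence into codons for each frame. The helper names (`reverse_complement`, `codons`, `six_frames`) and the frame labels `+1` to `+3` and `-1` to `-3` are choices made for this example, not a fixed convention from the text above.

```python
# Map each base to its complement on the opposite strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def codons(seq, offset):
    """Split seq into complete triplets, starting at the given offset (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def six_frames(dna):
    """Return the codons of all six reading frames, keyed by frame label."""
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = codons(dna, offset)      # forward strand
        frames[f"-{offset + 1}"] = codons(reverse, offset)  # reverse strand
    return frames


for label, triplets in six_frames("ATGAAATAG").items():
    print(label, triplets)
```

Shifting the starting position by one or two bases changes every codon that follows, which is why the same stretch of DNA can give six different readings.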
How to identify an open reading frame:
Locate a sequence corresponding to a start codon in order to determine the reading frame; on the sense strand this will be ATG
Read this sequence in base triplets until a stop codon is reached (TGA, TAG or TAA)
The longer the sequence, the more likely it is to be a genuine protein-coding open reading frame
Certain bioinformatic programs can automatically identify potential ORFs when provided with a candidate sequence
Gene sequences are largely conserved – so if an ORF sequence is present in multiple genomes, it likely represents a gene
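The identification steps above can be sketched as a short Python function. This is a minimal illustration under stated assumptions, not how any particular bioinformatics program works: the name `find_orfs` and its `min_codons` parameter are invented for this example, only the strand that is passed in is scanned (the reverse strand would have to be supplied separately as its reverse complement), and a reading frame that runs off the end of the sequence without a stop codon is not reported.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=100):
    """Return (start, end, frame) for each ATG...stop stretch on the given strand.

    start and end are 0-based positions, end is exclusive and includes the stop
    codon. min_codons counts the codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon sets the beginning of the ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next start codon in this frame
    return orfs


sequence = "CCATGAAATTTGGGTAACC"
for start, end, frame in find_orfs(sequence, min_codons=3):
    print(frame, start, end, sequence[start:end])
```

The default of 100 codons follows the typical minimum length mentioned earlier; the toy sequence is far shorter, so the example lowers the threshold to show a match.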