Debunking myths on genetics and DNA

Monday, February 13, 2012

The "not-so-universal" genetic code, its origin and its evolution

From [1]:
"Until relatively recently, the [genetic] code was thought to be invariable, frozen, in all organisms, because of the way in which any change would produce widespread alteration in the amino acid sequences of proteins. The universality of the genetic code was first challenged in 1979, when mammalian mitochondria were found to use a code that deviated somewhat from the universal."
A brief refresher: proteins are chains of amino acids. They are built by reading a messenger RNA three nucleotides at a time, with each triplet of RNA nucleotides (a codon) specifying one amino acid. For example, in the sequence AUGCCCAAGCUG each triplet codes for an amino acid: AUG becomes M, CCC becomes P, AAG becomes K, and CUG becomes L. All together: AUG|CCC|AAG|CUG -> MPKL.
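
For readers who like to see it spelled out, the lookup described above can be sketched in a few lines of Python. This is only an illustration, assuming the standard codon table and a reading frame that starts at the first nucleotide; the names `CODON_TABLE` and `translate` are mine, not from any library.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G order.
# '*' marks a stop codon. (CODON_TABLE is an illustrative name.)
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, table=CODON_TABLE):
    """Read the mRNA three nucleotides at a time and look up each codon.

    Translation stops at the first stop codon; trailing nucleotides that
    do not fill a complete codon are ignored.
    """
    protein = []
    usable_length = len(mrna) - len(mrna) % 3
    for i in range(0, usable_length, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGCCCAAGCUG"))  # MPKL
```

A real cell also has to find the right start codon first; that step is left out here to keep the sketch short.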

So, what does "universal" mean in the above quote? It means that the above sequence gets translated into the same amino acids in every organism, from bacteria to humans. Is this true? Not always.

Take a stop codon, for example. A stop codon is a triplet of RNA nucleotides that ends translation. Think of it as a flag that says, "The protein code ends here." If the genetic code were universal, a stop codon would always be a stop codon, in all organisms. The first exception outside mitochondria was discovered in 1985, when the stop codon UGA was found to code for an amino acid (tryptophan) in the bacterium Mycoplasma capricolum. More exceptions to the "universal" conception (other triplets that code for a different amino acid than the standard one) were later found in other organisms and in mitochondrial DNA as well. A more realistic view is that, because genomes are dynamic, a codon can "disappear" from use and later be reassigned, taking on a new meaning.
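
To make the UGA example concrete, here is a small Python sketch that translates the same message twice: once with the standard table and once with a variant in which UGA codes for tryptophan (W), as in Mycoplasma capricolum. The mRNA string is made up for the illustration, and `STANDARD`, `MYCOPLASMA`, and `translate` are hypothetical names, not part of any library.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Variant code: identical to the standard one except that UGA is read
# as tryptophan (W) instead of stop.
MYCOPLASMA = dict(STANDARD, UGA="W")


def translate(mrna, table):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up message: AUG | UGG | UGA | CCC | UAA
message = "AUGUGGUGACCCUAA"

print(translate(message, STANDARD))    # MW   (UGA read as stop)
print(translate(message, MYCOPLASMA))  # MWWP (UGA read as tryptophan)
```

The same message gives a shorter protein under the standard code, because translation halts at the codon that the variant code reads as an amino acid.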

The "universal" view prevailed for many years on the grounds that present-day proteins are so highly evolved that any change to the code would most likely be lethal. The first deviations from universality were found in the late 1970s in mitochondrial DNA. It was argued that mtDNA is considerably smaller than nuclear DNA and could therefore tolerate changes better.

In [1], Ohama et al. list various code changes reported in nuclear DNA over the past three decades, and then discuss the origin of the genetic code:
"The theories to explain the early evolution of the genetic code are numerous, all of which include speculations that the coding system arose with one or a limited number of amino acids, and that others were added until a total of 20 was reached. Most of these theories are aesthetically pleasing but cannot be verified."
They assume that the most ancient genetic code had a minimal set of codons covering all 20 amino acids and a minimal set of corresponding tRNAs, the transfer RNA molecules that act as mediators between the mRNA and the amino acids. This first genetic code had very little tolerance for change. Over time, however, the development of synonymous codons (different triplets that code for the same amino acid) added flexibility and therefore proved advantageous.
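
Synonymous codons are easy to see in the present-day standard code. The Python sketch below groups the 64 codons by the amino acid they specify. It only illustrates the redundancy of today's standard table and says nothing about the ancestral code the authors hypothesize; `CODON_TABLE` and `synonyms` are illustrative names.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop signal) they encode.
synonyms = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    synonyms[amino_acid].append(codon)

amino_acids = sorted(aa for aa in synonyms if aa != "*")
print(len(amino_acids))  # 20

# Print each amino acid with its number of synonymous codons.
for aa in amino_acids:
    print(aa, len(synonyms[aa]), " ".join(synonyms[aa]))

print("stop", " ".join(synonyms["*"]))  # stop UAA UAG UGA
```

In the standard code leucine (L) has six codons, while methionine (M) and tryptophan (W) have one each, so 61 codons are shared among 20 amino acids.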

Finally, they conclude:
"It should be stressed however that there are no organisms which use the genetic code system for more than, or less than, 20 amino acids. What were frozen are 20 amino acids (magic 20!) and not the genetic code that assigns them. Thus the genetic code is still in the state of evolution."
I'm including below a second reference [2] that goes a bit more in depth on how these codon reassignments happen, for those of you who might be interested. In this case, the authors looked at the evolution of the genetic code in yeast.

[1] Ohama T, Inagaki Y, Bessho Y, & Osawa S (2008). Evolving genetic code. Proceedings of the Japan Academy, Series B, Physical and Biological Sciences, 84 (2), 58-74. PMID: 18941287

[2] Miranda I, Silva R, & Santos M (2006). Evolution of the genetic code in yeasts. Yeast, 23 (3), 203-213. DOI: 10.1002/yea.1350


  1. About half of my thesis was on a class of Mycoplasmal proteins. There was a new postdoc in a collaborator's lab tasked with the cloning into an expression vector. He spent 7 months telling us that it wasn't working but never explaining what was going wrong. He'd gone through six different vectors before he finally told us that he was getting expression but it was too short. Apparently no one had bothered to tell him to mutate the Trp codon to something that wasn't a stop in E. coli. We'd all been doing it so long we considered it a given. I felt very bad.

  2. Oh, that's such a cute story! Well, not for the poor guy, of course... Interesting, I should've asked you first, then. It actually came as a surprise to me. I'm no experimentalist, though. :)

  3. antisocialbutterflie, February 14, 2012 at 3:37 PM

    Yeah, it's definitely a giggle-after-the-fact story and an object lesson in talking through your bench problems with your lab mates.

    This is actually a fairly common issue among people who do protein expression. There are codon preferences depending on what organism you are expressing in. Specifically, if you are expressing mammalian proteins in E. coli, there are occasionally significant differences in expression levels without optimization, either by silent mutation of the gene itself or by using specific cell lines that are engineered to contain higher tRNA levels for "rare codons." The latter tends to be slightly less effective than the former. Mycoplasma is sort of an odd case where the codon codes for something completely different.

  4. You know, you keep leaving these awesome comments, I might end up asking you for another guest post... ;-)

    Srsly, though, thanks, that's quite intriguing.

