Debunking myths on genetics and DNA

Monday, October 24, 2011

The missing heritability

It's been dubbed the "dark matter of the genome" because… we know it's there and yet we can't find it.

Ever since the completion of the Human Genome Project, the hunt for disease variants has taken up much, if not most, of genetic research. The idea is simple: we take a sample of healthy people (the controls), a matched sample of diseased people (the cases), we genotype their DNA, stratify by other possible factors (this one depends on the study, but think of things like smoking, age, family history, socio-economic status, etc.), and then look at which variants in the DNA are statistically more prevalent in the cases. If the experimental design is solid, and the statistical analyses are well done, the result should be one or more loci in the genome that increase the risk of developing the disease.
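To make the case/control comparison concrete, here is a minimal sketch of the simplest version of such a test: counting one variant's alleles in cases and controls and asking whether the difference is larger than chance would explain. The function name and the allele counts are hypothetical, and real studies also correct for covariates, population structure, and the huge number of variants tested, none of which is shown here.

```python
import math

def allele_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds ratio and Pearson chi-square test (1 d.f.) for a 2x2 allele-count table.

    Assumes every count is positive, so no division by zero occurs.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    # Odds of carrying the variant allele in cases, relative to controls
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Pearson chi-square statistic for a 2x2 table (no continuity correction)
    num = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    den = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
           * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    chi2 = num / den
    # Upper tail of a chi-square with 1 d.f.: P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Hypothetical counts: 60 variant alleles out of 200 in cases, 40 out of 200 in controls
odds_ratio, chi2, p_value = allele_association(60, 140, 40, 160)
print(f"odds ratio = {odds_ratio:.2f}, chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

An odds ratio above 1 means the variant is more common in the cases; the p-value says how surprising that difference would be if the variant had no effect at all.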

This has been done for numerous cancers (a well-known example is the pair of genes BRCA1 and BRCA2, in which certain mutations have been found to increase the risk of breast cancer), and also for heart disease, type 2 diabetes, schizophrenia, and other diseases with a genetic component.

Is this it? To find out whether or not you'll develop something nasty in your lifetime, do you just need to look at your DNA and breathe easily if none of the "red flags" are raised?


Not quite. When you go back and combine the genetic variants found for the trait and the environmental factors, you see that all together they explain only a small fraction of the disease's heritability. In other words, for any of these investigated maladies, the vast majority of the inherited cases remain unexplained. Think, for example, of twin pairs where only one sibling develops the genetic disease.

First of all, a philosophical note: the above thinking falls within the so-called "gene-centered" view, which assumes a causal relationship between gene copies and phenotype. This may not be the case at all: what I've learned so far is that genomes tend to be far more complex than we can predict.

Having said that, here are some hypotheses on where the "dark matter" of the genome could hide.

(1) RARE VARIANTS: The causal relationship we're after could be hidden in what we call "rare variants," in other words, gene copies that can only be found in very few individuals. These alleles are so sparse in the population that even if you find a few, you have very little statistical power to detect their effects on the disease risk. This problem is currently being tackled with improved sequencing technology and new statistical methods to allow for these rare variants to be taken into account.
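To see why rarity hurts so much, here is a rough sketch of the statistical power of a simple allele-frequency comparison between cases and controls. It uses a normal approximation for a two-sided test; the function name, the sample size, and the odds ratio are hypothetical, and the default threshold of 5e-8 is the conventional genome-wide significance level, which I am assuming here rather than taking from any of the studies above.

```python
import math
from statistics import NormalDist

def approx_power(control_freq, odds_ratio, alleles_per_group, alpha=5e-8):
    """Approximate power of a two-sided z-test comparing allele frequencies.

    control_freq: frequency of the variant allele in controls
    odds_ratio: assumed effect of the variant on disease odds
    alleles_per_group: alleles sampled in each group (2 per person)
    """
    # Allele frequency in cases implied by the odds ratio
    case_odds = odds_ratio * control_freq / (1 - control_freq)
    case_freq = case_odds / (1 + case_odds)
    # Standard error of the difference between the two sample frequencies
    se = math.sqrt(control_freq * (1 - control_freq) / alleles_per_group
                   + case_freq * (1 - case_freq) / alleles_per_group)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Probability that the observed difference clears the significance threshold
    return NormalDist().cdf(abs(case_freq - control_freq) / se - z_crit)

# Same hypothetical effect size and sample size, very different allele frequencies
for freq in (0.30, 0.005):
    print(f"allele frequency {freq}: power = {approx_power(freq, 1.5, 4000):.5f}")
```

With these made-up numbers the common variant is detected almost every time, while the rare one is missed almost every time, even though both carry exactly the same effect on risk.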

(2) EPIGENETICS: Recent studies have shown that epigenetic changes induced by environmental factors (such as diet, maternal physiology during pregnancy, parental behaviors, etc.) can be inherited across generations [1]. These "transgenerational genetic effects" are not encoded in the DNA itself, but in the way genes are expressed. They have been found in numerous mouse models, and they indicate that when we don't find anything and the disease is there, we may have missed the causal factor simply because we failed to look at the genetics and exposures of the parents and/or grandparents. Interestingly, as Nadeau points out in [1], "in the cases that have been studied, the phenotypic consequences of transgenerational effects persist beyond the first generation but with progressively weaker effects." And, "all genetically predisposed progeny are affected regardless of inheritance of the parental gene." Let me stress the significance of this last statement: a transgenerational genetic effect takes place when an individual presents a specific phenotypic trait, even though the genetic change is not present in the individual, but only in the parent. A study recently published in Nature [2], for example, showed that epigenetic changes induced in a first generation of worms in order to lengthen their life span were transmitted to the offspring, too. Another one published in Science showed a similar result in plants [3].

(3) POST-TRANSCRIPTIONAL REGULATION: A recent paper published in Cell [4] looked at an aggressive form of brain tumor called glioblastoma, and found an association between the disease and the way genes in the cancer cells were expressed. In other words, rather than looking at the actual gene copies, they looked at which genes were translated into their subsequent products, and through what processes. Quoting from the abstract, they found:
"~7,000 genes whose transcripts act as miR 'sponges' and 148 genes that act through alternative, non-sponge interactions. Biochemical analyses in cell lines confirmed that this network regulates established drivers of tumor initiation and subtype implementation."
Let's try and understand this. Genes are transcribed into portions of RNA, which are then used to make proteins. However, in any given cell, some genes are expressed and some are not. In other words, genes can be "turned on" or "turned off," and this happens through very complicated processes. One way is to use tiny molecules of RNA (called miRNA or "micro" RNA) that are complementary to part of the gene's RNA. After the gene has been transcribed, the miRNA binds to the complementary stretch of RNA, making it double-stranded there. Once that happens, the RNA can no longer "produce" a protein, and therefore the gene it came from is effectively "silenced," or turned off. The "miRNA sponges" found in the Cell paper are transcripts that soak up miRNAs, so they change how strongly the other genes in the network are silenced, and this network has an important role in cancer pathogenesis. This process is not encoded in the genes themselves (and hence it wouldn't be found by simply looking at the different alleles in the population). Rather, it affects what happens to the RNA after the genes are transcribed.
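The idea of "complementary" binding can be sketched in a few lines. This is a toy illustration only: the sequences and function names are made up, and it demands a perfect match, whereas real miRNA binding is usually only partially complementary and is predicted with far more elaborate tools.

```python
# RNA base pairing: A pairs with U, G pairs with C
COMPLEMENT = str.maketrans("AUGC", "UACG")

def reverse_complement(rna):
    """Return the sequence that pairs perfectly with `rna`, read 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]

def find_binding_sites(mrna, mirna):
    """Start positions in `mrna` where `mirna` could pair perfectly."""
    site = reverse_complement(mirna)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions

# Hypothetical sequences, chosen only to show one matching site
mirna = "UGAGGUAG"
mrna = "AAACUACCUCAGGG"
print(reverse_complement(mirna))        # CUACCUCA
print(find_binding_sites(mrna, mirna))  # [3]
```

A transcript carrying many such sites without being an important target itself is what would make it behave like a "sponge": each site it offers is one more place for the miRNA to be tied up.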

(4) PROTECTIVE ALLELES: So far the great focus has been on finding risk alleles. But what about protective alleles, or in other words, variants that counteract the effect of the deleterious ones? I don't mean just alleles that carry a negative risk, but alleles that are proven to interact with the ones that induce a positive risk, and offset them. The existence of such alleles has been hypothesized and studies are under way to test this possibility too. I haven't found anything in the literature yet, but if you are aware of published studies on this, please let me know and I will include them here.

[1] Nadeau JH (2009). Transgenerational genetic effects on phenotypic variation and disease risk. Human molecular genetics, 18 (R2) PMID: 19808797

[2] Greer, E., Maures, T., Ucar, D., Hauswirth, A., Mancini, E., Lim, J., Benayoun, B., Shi, Y., & Brunet, A. (2011). Transgenerational epigenetic inheritance of longevity in Caenorhabditis elegans Nature DOI: 10.1038/nature10572

[3] Schmitz, R., Schultz, M., Lewsey, M., O'Malley, R., Urich, M., Libiger, O., Schork, N., & Ecker, J. (2011). Transgenerational Epigenetic Instability Is a Source of Novel Methylation Variants Science, 334 (6054), 369-373 DOI: 10.1126/science.1212959

[4] Sumazin P, Yang X, Chiu HS, Chung WJ, Iyer A, Llobet-Navas D, Rajbhandari P, Bansal M, Guarnieri P, Silva J, & Califano A (2011). An Extensive MicroRNA-Mediated Network of RNA-RNA Interactions Regulates Established Oncogenic Pathways in Glioblastoma. Cell, 147 (2), 370-81 PMID: 22000015

Photo: what happens when you put the camera on a tripod, leave the shutter open for thirty seconds, and three cars finally drive by. The original had a lamppost, but I edited out the post and left the lamp.


  1. I may have missed something in the description. Wouldn't a tumor suppressor (like p53) count as a protective allele though it's not explicitly referred to as one? Or even a heterozygously expressed dominant negative or dominant positive mutation? I know there are minimized disease phenotypes that can occur in that situation. Or I could be entirely off base, in which case I apologize.

  2. Hi, excellent point! My understanding is that the protein p53 is considered a "tumor suppressor," but as far as I know mutations in the p53 coding gene have been found to be associated with cancer, as for example in this paper: PMID:21989411, or in this one: PMID:21986947, where they say "Mutations of p53 in cancer can result in a gain of function associated with tumour progression and metastasis." This paper, actually, is quite interesting because, if I understand it correctly, they found an interaction between p53 and the gene ANKRD11, and apparently this can restore the tumor suppressor function of p53: "ANKRD11 restores a native conformation to the mutant p53 protein and causes dissociation of the mutant p53–p63 complex. This represents the first evidence of an endogenous protein with the capacity to suppress the oncogenic properties of mutant p53."

  3. So, come to think of it, that last paper I mentioned is along the lines of what I was looking for: a gene that counteracts the negative allele of another gene. I can't access the full article from home. I'll take a better look tomorrow and add it to the references above.

    Thanks so much!

  4. Glad to be useful.

    FYI: Off the top of my head, p53 intercepts signals from the DNA damage proteins like ATM in the event of double-strand breaks to arrest cell cycle progression. So it would be a broad spectrum situation rather than a one-on-one gene balance. However, that ANKRD11 thing looks mighty interesting. I might have to look up that paper myself tomorrow in lab.

  5. Shoot, my lab doesn't have access to Oncogene. Does yours? I found this 2008 paper: PMID:18840648 instead. I think I'm going to discuss it in a separate post. Again, many thanks -- feel free to pitch in your input again!

