Tomorrow, February 3, is the birthday of Eric Lander, the director of the Broad Institute (the well-known MIT/Harvard genomic research center) and the first author of the historic 2001 Nature paper that marked the completion of the Human Genome Project [1]. I once heard him speak at USC, and without ever getting technical he managed to engage the whole audience and share his passion for genetics. As you know, I've been honoring famous geneticists by discussing one of their papers on their birthday, and today I'm facing a conundrum. You see, the natural choice would be to pick his latest PNAS paper, titled "The mystery of missing heritability" [2]. I want to talk about this paper and at the same time I don't want to talk about this paper.
I'm not a geneticist. I'm a computational biologist, which means my background is mostly analytical, not biological. I used to work on SNP associations and cancer epidemiology and now I work on HIV. I am NOT one of the players in this game. So how much does my opinion count when it comes to a paper as highly debated as this one?
The thing is, this paper resonates with me. It makes a great point about a mathematical model that's been "assumed" for years now in the world of genetics. Often people don't get mathematical models. They don't get that mathematical models are tools, not the truth. Hence when one says "I present this model," you get two possible reactions: those who have seen data concordant with your model will smile and happily welcome it. Those who instead have seen the opposite will boo you and challenge you. Problem is, models are neither right nor wrong. Models are tools. Do they help describe what we see? Fine, we keep the model. When they don't, we go back to the data and try to understand which of our assumptions failed.
We use the model to discern the situations that meet the assumptions stated in the model from those that don't. Models help us shape our thinking, not the data! For example, evolution is a model, too. Go tell that to creationists and followers of intelligent design. They can challenge evolution as much as they want, but until they hand me a model that explains the genetic diversity we observe today better than evolution does, I will stick with evolution.
Back to the PNAS paper. It's a hot topic right now, and I'm kind of late discussing this particular paper in the blogosphere. Razib Khan discussed it here, Luke Jostins here and here, and I'm sure many others whom I don't know have talked about it too.
So, what is the missing heritability? Since I've already defined it in an earlier post of mine, for the time being, let me just quote Razib Khan:
"The issue is basically that there are traits where patterns of inheritance within the population strongly imply that most of the variation is due to genes, but attempts to ascertain which specific genetic variants are responsible for this variation have failed to yield much. For example, with height you have a trait which is ~80-90 percent heritable in Western populations, which means that the substantial majority of the population wide variation is attributable to genes. But geneticists feel very lucky if they detect a variant which can account for 1 percent of the variance."
The implications of this are clear: we want to find risk alleles to predict common diseases, but as long as most of the heritability is missing, the variants we've found can't predict them.
Is this surprising?
Given the reactions I saw on the internet, apparently it is. People claim we still haven't found all variants and that's where the missing heritability's hiding. Maybe. However, after reading so much about epigenetics, RNA editing, and epistasis, allow me to be skeptical. Traits (proteins, diseases, etc.) are not genes. The path from genes to traits is long and convoluted.
So, what's Lander's point in this PNAS paper? Something I've also previously discussed: epistasis, or the way genes interact with each other. We're missing heritability because we think of risks as additive, but additivity doesn't account for interactions. If you take interactions between genes into account, the total heritability is much smaller than anticipated, and hence the fraction that the known variants explain (all together) is much larger.
"Quantitative geneticists have long known that genetic interactions can affect heritability calculations. However, human genetic studies of missing heritability have paid little attention to the potential impact of genetic interactions."
Now here's the beauty of this paper. They do not deny the additive risk model. They extend it:
"We thus introduce the limiting pathway (LP) model, in which a trait depends on the rate-limiting value of k inputs, each of which is a strictly additive trait that depends on a set of variants (that may be common or rare). When k = 1, the LP model is simply a standard additive trait. For k > 1, we show that LP(k) traits can have substantial phantom heritability."
Again, mathematician thinking here, but that's exactly what models are for: some traits may very well be additive. However, the additive model does not fit all the data we observe. Hence we need a better model, one that encompasses the old one and at the same time goes beyond it. Gene-gene interactions need not explain all missing heritability. But since they've been observed, we need to account for them in those situations where they may be real.
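To make the LP idea concrete, here is a minimal simulation sketch in Python. Everything in it is my own assumption for illustration, not the paper's code: the function name `lp_phantom_heritability`, Gaussian pathway inputs that all share the same heritability `h2_pathway`, no shared environment between twins, apparent heritability estimated from twin correlations as 2(r_MZ - r_DZ), and additive heritability approximated by the variance that a linear fit on the pathway genetic values explains.

```python
import numpy as np


def lp_phantom_heritability(k, h2_pathway=0.5, n_pairs=200_000, seed=0):
    """Simulate an LP(k) trait (minimum of k additive inputs) in twin pairs.

    Returns (h2_all, h2_pop, phantom), where phantom = 1 - h2_all / h2_pop.
    All modelling choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    sd_g = np.sqrt(h2_pathway)        # genetic SD of each pathway input
    sd_e = np.sqrt(1.0 - h2_pathway)  # environmental SD (unshared)

    def trait(g):
        # Each input is genetic value + fresh environmental noise;
        # the trait is the rate-limiting (smallest) input.
        e = rng.normal(0.0, sd_e, size=g.shape)
        return (g + e).min(axis=1)

    # Genetic values of the index twin, one column per pathway.
    g = rng.normal(0.0, sd_g, size=(n_pairs, k))
    # DZ co-twin: genetic correlation 0.5 with the index twin, same variance.
    g_dz = 0.5 * g + np.sqrt(0.75) * rng.normal(0.0, sd_g, size=(n_pairs, k))

    p = trait(g)        # index twin
    p_mz = trait(g)     # MZ co-twin: identical genetic values
    p_dz = trait(g_dz)  # DZ co-twin

    r_mz = np.corrcoef(p, p_mz)[0, 1]
    r_dz = np.corrcoef(p, p_dz)[0, 1]
    h2_pop = 2.0 * (r_mz - r_dz)  # apparent heritability from twins

    # Additive heritability: variance explained by the best linear
    # predictor of the trait from the pathway genetic values.
    X = np.column_stack([np.ones(n_pairs), g])
    beta, *_ = np.linalg.lstsq(X, p, rcond=None)
    h2_all = np.var(X @ beta) / np.var(p)

    return h2_all, h2_pop, 1.0 - h2_all / h2_pop


if __name__ == "__main__":
    for k in (1, 2, 3):
        h2_all, h2_pop, phantom = lp_phantom_heritability(k)
        print(f"k={k}: h2_all={h2_all:.3f} h2_pop={h2_pop:.3f} phantom={phantom:.3f}")
```

With k = 1 the two estimates should agree up to sampling noise, which is just the standard additive case. With k > 1 the twin-based estimate should come out larger than the additive one, and that gap is the phantom heritability, at least under the assumptions listed above.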
"The potential magnitude of phantom heritability can be illustrated by considering Crohn's disease, for which GWAS have so far identified 71 risk associated loci (13). Under the usual assumption that the disease arises from a strictly additive genetic architecture, these loci explain only 21.5% of the estimated heritability. However, if Crohn's disease instead follows an LP(3) model, the phantom heritability is 62.8%, thus genetic interactions could account for 80% of the currently missing heritability."
"In short, genetic interactions may greatly inflate the apparent heritability without being readily detectable by standard methods. Thus, current estimates of missing heritability are not meaningful, because they ignore genetic interactions."
"The results show that mistakenly assuming that a trait is additive can seriously distort inferences about missing heritability. From a biological standpoint, there is no a priori reason to expect that traits should be additive. Biology is filled with nonlinearity: The saturation of enzymes with substrate concentration and receptors with ligand concentration yields sigmoid response curves; cooperative binding of proteins gives rise to sharp transitions; the outputs of pathways are constrained by rate-limiting inputs; and genetic networks exhibit bistable states."
Mother Nature did not create mathematics. We created mathematics to describe Mother Nature. We start with a simple model and build on it. The data is always the reality check, and we should never forget that.
[1] Lander, E., Linton, L., Birren, B., Nusbaum, C., Zody, M., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., Harris, K., et al. (2001). Initial sequencing and analysis of the human genome. Nature, 409 (6822), 860-921. DOI: 10.1038/35057062
[2] Zuk, O., Hechter, E., Sunyaev, S., & Lander, E. (2012). The mystery of missing heritability: Genetic interactions create phantom heritability. Proceedings of the National Academy of Sciences, 109 (4), 1193-1198. DOI: 10.1073/pnas.1119675109