Debunking myths on genetics and DNA

Sunday, February 2, 2014

Computer generated viruses

By "computer generated viruses" I don't mean bits of code that can harm your desktop. I mean actual viruses, objects that have the ability to infect and replicate, but that were created in silico, by a computer algorithm. I know this is a concept that has the anti-vaxxers enraged, but in HIV research it has become quite common to generate vaccine candidates through computer algorithms. Today I want to address two questions: why and how.

Candidate vaccines are typically made from virus isolates: you take a real virus, make it weaker, and inject it into the body so that it will elicit an immune response. Why hasn't this worked for HIV? One of the issues with HIV is that it is a highly variable virus. Think about the influenza virus: every year there's a new flu vaccine because the virus mutates into a new strain every year. HIV can reach that kind of diversity in one individual alone. So, you can't just take one strain of HIV and make a vaccine, because it would protect against only that one strain out of millions of others.

These strains have evolved from one single common ancestor, one "patriarch" that jumped from monkeys to humans last century (see this post and the second part for a discussion of the papers that estimated when the HIV pandemic started). Since then, HIV has changed drastically and diversified into 4 major groups. Most HIV-infected people are infected with strains from group M, and within that group alone there are 9 distinct subtypes, plus "recombinants," strains that resulted from a "cross-over" of two or more subtypes.

The way we study the "history" of HIV is through phylogenetics. Imagine a room full of people, and imagine sorting them into groups based on similarity. Related people (brothers, sisters, parents) are going to form the closest subgroups. Zoom out one step and you are going to form larger groups based on physical characteristics: dark-skinned brunettes, fair-skinned brunettes, fair-skinned blondes, dark-skinned blondes. Next, you'll probably have ethnic groups. At the end of the process, you end up with a graphical depiction of the group of people. Each person is a leaf, and the leaves closest together are on a branch (a family), which comes from a larger branch, which in turn comes from a still larger branch, until you get to the main big branches, the ethnic groups. The trunk of the tree is the common mother we know lived in Africa many, many years ago.

We do the same with HIV. Each virus is a leaf. When we group the leaves into branches we see that the big tree that retraces the history of the main HIV group, group M, has 9 main branches (the subtypes, also called "clades"). Even if you pick two viruses from the same clade, their envelopes (the proteins that form the outer shell of the virus) can differ by up to 20% in amino acids, making it again impossible to use a single strain for a vaccine.
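To make the grouping idea a bit more concrete, here is a minimal Python sketch. It measures how different two aligned sequences are (the same kind of percentage as the "up to 20%" figure above) and then repeatedly merges the two closest groups until a single tree remains. The sequences and the names `percent_difference`, `build_tree`, and `toy` are invented for illustration, and this simple average-linkage clustering is only a stand-in for the far more sophisticated statistical methods used in real HIV phylogenetics.

```python
from itertools import combinations


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


def build_tree(seqs: dict):
    """Repeatedly merge the two closest clusters (average linkage).

    Returns the tree as nested tuples of leaf names.
    """
    # Each cluster is (subtree, list of leaf names under that subtree).
    clusters = [(name, [name]) for name in seqs]

    def distance(c1, c2):
        # Average percent difference over all cross-cluster pairs of leaves.
        pairs = [(x, y) for x in c1[1] for y in c2[1]]
        total = sum(percent_difference(seqs[x], seqs[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical aligned toy sequences, invented for illustration only.
toy = {
    "v1": "AAAAAAAAAA",
    "v2": "AAAAAAAAAC",
    "v3": "GGGGGAAAAA",
    "v4": "GGGGGAAACC",
}
print(percent_difference(toy["v1"], toy["v2"]))  # 1 of 10 positions differ
print(build_tree(toy))  # v1 groups with v2, v3 groups with v4
```

In this toy example the two closest "viruses" end up as neighboring leaves, and the two pairs then join at the trunk, just like families and larger groups in the room full of people.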

And yet all these strains are related. They all evolved from the same ancestor. So, wouldn't it be a good idea to try and use that ancestor as a vaccine candidate? The problem is that the ancestor is no longer found in present infections. In fact, we have no documentation of it because by the time we had the technology to genotype the virus, the population had already diversified. However, we can estimate the genome of the ancestor using the phylogenetic methods I described above. Every node in the tree marks a point where lineages diverged, and the branches record the changes in the genome that accumulated since. By walking "backwards in time" along the nodes of the tree, we can retrace the mutations that separate today's viruses from the ancestor. Distinct HIV subtypes can differ at as many as 35% of sites. However, because of the way consensus viruses are constructed, they are on average closer to any given subtype than the subtypes are to one another, and therefore they have the potential to elicit immune responses to more diverse viruses than just a one-clade vaccine.
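The claim that a central sequence is on average closer to every strain than the strains are to each other can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the four strains are invented, and the "central" sequence is a plain per-position majority consensus, which is a simpler technique than estimating the root of a tree. The names `majority_consensus`, `strains`, and `mean` are hypothetical.

```python
from collections import Counter


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences differ."""
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


def majority_consensus(aligned: list) -> str:
    """Most common letter in each column of an alignment."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def mean(values) -> float:
    values = list(values)
    return sum(values) / len(values)


# Hypothetical aligned strains; each carries its own private mutations.
strains = ["TCGTACGTAC", "ACGTACGTTT", "ACGAACCTAC", "ACGTAGGTAC"]

consensus = majority_consensus(strains)

# Average distance from the consensus to the strains.
consensus_mean = mean(percent_difference(consensus, s) for s in strains)

# For each real strain, its average distance to the other strains.
strain_means = [
    mean(percent_difference(s, t) for t in strains if t is not s)
    for s in strains
]

print(consensus)
print(consensus_mean)
print(min(strain_means))
```

With these made-up strains, the consensus sits closer to the group on average than even the best-placed real strain does, which is the property that makes a centralized sequence attractive as a vaccine candidate.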

A consensus virus is constructed using a computer algorithm that first creates the phylogenetic tree I described above, then estimates the genome of the root of the tree. Once the genome has been estimated, viral proteins encoded by that exact genome can be built in the lab. There are some issues associated with using an in silico virus in a vaccine. First of all, you need to prove that the viral proteins constructed in this manner are viable, meaning they retain their original functions. As it turns out, these "artificial" constructs replicate and infect like regular viruses.
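As a rough illustration of "estimating the genome of the root," here is a small Python sketch that uses Fitch parsimony on a fixed toy tree. This is an assumption on my part, chosen because it fits in a few lines; I am not claiming it is the method used in [1] or [2], and real analyses generally rely on more sophisticated statistical models. The tree shape, the sequences, and the function name `fitch_root` are all hypothetical.

```python
def fitch_root(tree, seqs: dict) -> str:
    """Estimate a root sequence with the bottom-up pass of Fitch parsimony.

    `tree` is a leaf name (str) or a pair (left, right) of subtrees.
    `seqs` maps leaf names to aligned sequences of equal length.
    Where several letters are equally good, the alphabetically first is used.
    """

    def state_sets(node):
        if isinstance(node, str):
            # A leaf: each position can only be the observed letter.
            return [{letter} for letter in seqs[node]]
        left, right = (state_sets(child) for child in node)
        # Fitch rule: keep shared letters if there are any,
        # otherwise keep every letter seen in either child.
        return [(a & b) or (a | b) for a, b in zip(left, right)]

    return "".join(min(options) for options in state_sets(tree))


# Hypothetical toy data: four sampled viruses and an assumed tree shape.
seqs = {
    "v1": "ACGAAC",
    "v2": "ACGTTC",
    "v3": "TCGTAC",
    "v4": "ACCTAG",
}
tree = (("v1", "v2"), ("v3", "v4"))

root = fitch_root(tree, seqs)
print(root)
print(root in seqs.values())
```

Note that in this toy example the estimated root does not match any of the sampled sequences: like the real HIV ancestor, it is a genome nobody observed directly, inferred by undoing the mutations along each branch.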

One such consensus virus is called CON-S, and monkey studies have already shown very promising results when using it as an HIV candidate vaccine. In [2], some rhesus monkeys were vaccinated with CON-S and some with a single-strain, B-clade vaccine. To assess how many and what kind of HIV strains the vaccinated monkeys were able to recognize, the researchers measured cellular responses against bits of HIV proteins taken from four major clades: A, B, C, and G. They found that the CON-S vaccine elicited statistically significantly broader (and more numerous) responses to clades A, C, and G than the B-clade vaccine did:
"We show that vaccine immunogens expressing the single centralized gene CON-S generated cellular immune responses with significantly increased breadth compared with immunogens expressing a wild-type virus gene. In fact, CON-S immunogens elicited cellular immune responses to 3- to 4-fold more discrete epitopes of the envelope proteins from clades A, C, and G than did clade B immunogens. These findings suggest that immunization with centralized genes is a promising vaccine strategy for developing a global vaccine for HIV-1 as well as vaccines for other genetically diverse viruses" [2].
This indicates that CON-S, being genetically closer to all clades, is potentially able to protect better against viruses across clades, whereas a single-clade strain would fail to protect against strains from other clades.

The other type of in silico virus tested in HIV vaccine design is the mosaic vaccine, which I will discuss next week.

[1] Gaschen B, Taylor J, Yusim K, Foley B, Gao F, Lang D, Novitsky V, Haynes B, Hahn BH, Bhattacharya T, & Korber B (2002). Diversity considerations in HIV-1 vaccine selection. Science (New York, N.Y.), 296 (5577), 2354-60 PMID: 12089434

[2] Santra S, Korber BT, Muldoon M, Barouch DH, Nabel GJ, Gao F, Hahn BH, Haynes BF, & Letvin NL (2008). A centralized gene-based HIV-1 vaccine elicits broad cross-clade cellular immune responses in rhesus monkeys. Proceedings of the National Academy of Sciences of the United States of America, 105 (30), 10489-94 PMID: 18650391


  1. Two questions, if I may, Elena.

    Would it be a worthwhile approach to develop a basic vaccine based on the ancestor's characteristics and a more "tailored" vaccine to address as many as possible of the characteristics in one person's infection?

    When you speak of in silico viruses, are you speaking of a model - for want of a better word - of an actual virus expressed in text or code, as opposed to computer aided manufacturing that results in a test tube's worth of critters?

    1. Excellent questions, Mike.

      There are ideas on "tailored" vaccines, but they are in the making and I can't talk about them. :-)

      No, I do mean a computer-manufactured actual virus.

      Normally we take samples and type their genomes--those are the real viruses. But in the case of consensus viruses, these guys were never typed from samples. Their genomes were reconstructed through a computer algorithm and then assembled in the lab.

      Does that answer your question? Sorry if I wasn't clear in the main text.

  2. "...and then assembled in the lab." Got it now and, once again, I've learned something. Thanks, Elena!

    1. I added that after I saw your comment, so thank you, Mike! Sometimes stuff that's obvious to me fails to come out. :-)

