Chemical Biology

A Wide Look at Coronavirus Mutants

Here’s a new preprint that goes a long way to telling us what we need to know about coronavirus antibodies and Spike protein mutations. It’s from Jesse Bloom’s group at the Fred Hutchinson center in Seattle, and it’s another one of those experiments that you could only do with modern molecular biology (and modern bioinformatics).

This is what I mean by that: as we have all had to learn, the Spike protein is what recognizes and binds to the human ACE2 protein to start off the process of viral infection with this disease. And a particular region of the Spike, the receptor-binding domain (RBD), is the business end of it all. That’s a stretch of about 200 amino acids (depending on where you draw the line), and it’s the exact piece that binds to ACE2. We have a good idea of how it does that (lots of structural biology by this point), but looking at those structures and figuring out what any changes to the amino acid sequence might do, well. . .that’s not so easy. You would be a lot better off with empirical data, which is what this new paper provides.

The authors look at a 201-amino acid range in the RBD. We already know what the canonical form is like, so that leaves us with 19 other standard amino acids that might be substituted in those positions (obviously, some of them are a lot more likely to form than others, because of the triplet code from DNA/RNA to amino acids, but in theory you have 19 variations). That gives you 3,819 possible single-point mutant proteins, so they made them all (see appendix below). Well, nearly all – they were able to express 3,804 of them on the surface of yeast cells and determine their binding to ACE2. That’s done via incubation with fluorescently-labeled ACE2 protein, cell sorting, and deep sequencing.
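For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The constant names are mine, purely for illustration; the numbers are the ones quoted above.

```python
# Back-of-the-envelope count of single-point RBD mutants (numbers from the post).
RBD_LENGTH = 201       # positions examined in the RBD
N_AMINO_ACIDS = 20     # canonical residues
MEASURED = 3_804       # mutants reported as expressed and assayed

# Each position can be swapped for any of the 19 residues it does not already hold.
possible = RBD_LENGTH * (N_AMINO_ACIDS - 1)
coverage = MEASURED / possible

print(possible)            # 3819
print(f"{coverage:.1%}")   # 99.6%
```

So the library is missing only 15 of the possible single substitutions.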

The results pass the sanity check (which is good!). For example, mutations that introduce a stop codon cannot really produce well-folded protein, and so it proved – none of those bound to ACE2. Meanwhile, the mutations with very closely related amino acids (leucine/isoleucine/valine, that sort of thing) clustered together. Most changes are about the same or slightly for the worse, which fits with one’s protein experiences as well, but overall the mutational tolerance is rather high: 46% of the mutants bind to ACE2 at least as well as the wild-type protein. And there are indeed some that bind even more tightly to ACE2.
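To make the triplet-code point concrete, here is a rough sketch (not anything from the paper) of which substitutions a single nucleotide change can reach from a given codon, using the standard genetic code. The function name and the example codons are illustrative assumptions; in particular, GTT is just one of the valine codons, not a claim about what the virus actually uses at any position.

```python
from itertools import product

BASES = "TCAG"  # DNA alphabet; read T as U for the RNA genome
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def single_nt_neighbors(codon):
    """Amino acids (or '*' for stop) reachable from `codon` by changing exactly one base."""
    original = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            reachable.add(CODON_TABLE[mutant])
    reachable.discard(original)  # synonymous changes are not substitutions
    return reachable

print(sorted(single_nt_neighbors("GTT")))   # ['A', 'D', 'F', 'G', 'I', 'L']
print("*" in single_nt_neighbors("TGG"))    # True: tryptophan is one step from a stop
```

From that valine codon, only 6 of the 19 possible substitutions are one nucleotide away, which is why some single-point mutants are far more likely to turn up in nature than others.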

The data have been aggregated into some useful heat maps and visualizations, and what you can see is that there are some areas that seem very tolerant indeed to mutation, while others are severely constrained (just what you’d expect). On top of that, the group looked at how well these various proteins expressed, and that’s a whole different set of constraints, because some of these mutants have trouble folding well or producing stable proteins once they’re made (see at right, figure from the paper).

Here’s a key point, though: none of those tighter-binding mutants have so far appeared in the wild. Everything that has shown up in sequencing from patients falls into the “same or a bit worse” category, so we can say, thus far, that there does not appear to be selection showing up for tighter-binding (and presumably more infectious) mutant forms of the coronavirus. There’s one (V367F) that appears to be better for protein expression, but it doesn’t seem to be spreading around more because of that. That doesn’t mean it can’t do such a thing, of course – the constantly replicating virus could stumble into a variation that it hasn’t explored yet that provides some kind of real fitness advantage, and that could happen any time. But it doesn’t seem to have happened yet.

And keep in mind, “fitness” is a complicated word. There are a lot of factors at work – the paper specifically mentions the way that these proteins are glycosylated as an example. That’s surely an important process, and one that we don’t understand very well across the mutational landscape. Also remember, there’s more to viral entry than just tight binding to ACE2 – the next step, where the viral envelope and the cell membrane start to fuse, has to happen, too. You could imagine some kinds of tight binding that actually keep that from happening; the mechanism gets stuck at a too-tight stage in the wrong position to move on. So this is a valuable but rough look at the list of mutations, and there will be more as we gain more understanding. On the positive side, the paper took a selection of these RBD proteins and expressed them in a lentivirus model, and the ability of those mutants to infect cells via ACE2 correlated well with the data the primary assay produced. So we’ve at least gotten a look at one important part of the story.

The RBD is of course the target of many neutralizing antibodies against the virus – both the ones that patients are raising themselves from their own immune responses, and the ones that are being selected out as potential monoclonal therapies. The paper notes that the part of the RBD that directly contacts ACE2 is in fact more constrained than most of the rest of the sequence, so that’s good (fewer possibilities for mutational escape), and that none of the antibodies that have been characterized so far have epitopes (binding regions) that are as constrained as the key RBD area, either. That suggests that there’s room for “epitope focusing” to make them even better and less prone to be evaded.

And the RBD region is what almost all of the vaccines are targeting as an antigen to raise an immune response, so this paper has direct bearing on them as well. Perhaps you would want such a vaccine to produce one of the variations that maintain ACE2 binding but have better expression (which likely means better protein stability), for example. Another good idea might be to focus on the RBD regions that are tightly constrained, since immunity raised to those would seem to have the best chance to deal with whatever comes down the chute in the future. This is really valuable work, and part of a huge worldwide effort that has produced an extraordinary amount of information in a very short time. This is how we’re going to beat this thing: sheer knowledge.

Appendix: As mentioned, these are the single-point mutations, one by one. Note that if you want to talk about every possible variant at every position of such a protein, all at the same time, then you’re asking “How many different 201-amino-acid proteins are there using the 20 canonical residues?” That, unfortunately, is 20 to the 201st power, and even with that number – which is Rather Large – we have totally ignored the possibility of multiple three-dimensional folded structures and the related question of varying internal cysteine bridges. At any rate, a hypothetical complete library of merely all the 9-amino-acid proteins (20 to the ninth power) comes to a mere 512 billion compounds (update: I hosed up this number earlier, fixed now). So the “all the 201-AA proteins” library is right out. Where would you put it?
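A quick sketch of those library sizes, for the skeptical (variable names are mine, and this only counts sequences, ignoring folding and disulfide patterns as above):

```python
N_AMINO_ACIDS = 20

nonamers = N_AMINO_ACIDS ** 9      # every possible 9-residue sequence
full_rbd = N_AMINO_ACIDS ** 201    # every possible 201-residue sequence

print(f"{nonamers:,}")       # 512,000,000,000
print(len(str(full_rbd)))    # 262 digits, i.e. roughly 3 x 10^261
```

Python integers are arbitrary-precision, so the second number is computed exactly rather than approximated.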

43 comments on “A Wide Look at Coronavirus Mutants”

  1. A Nonny Mouse says:

    Daft question, maybe…

    Would creating a vaccine candidate out of the tighter-binding versions create a better antibody response against the actual virus?

  2. Rhenium says:

    “That gives you 3,819 possible single-point mutant proteins, so they made them all…”

    And that was when I almost had to clean coffee off my monitor…

    1. Some idiot says:

      (Yes, there can be certain advantages in reading Derek’s blog when walking the dog…! He is now used to me making odd strangling sounds every now and again…)

    2. Old Country Doctor says:

      Reminds me of the sort of slugo tasks I used to get in the lab

    3. Neon says:

      Hey undergrad, get back in there

    4. TallDave says:

      See, 2020 isn’t all bad.

  3. rhodium says:

    As someone who learned molecular cloning in the early eighties (and had to do Maxam-Gilbert sequencing), these sorts of papers remind me I am living in the future.

  4. Droopy Dog says:

    …to borrow from S. Wright… Where would you put it? “Everywhere”

  5. PhooBahr says:

    One wonders about the combined effect of coronavirus RBD variants and genetic variants of ACE2 on binding (and downstream events).

  6. Hap says:

    Maybe a better question is “Where would you get the starting materials?”

  7. Jeff says:

    One thing I wonder about when looking at this data is: From the virus’s perspective, is tighter binding always “better”?

    In the regime of “the virus binds so weakly that most particles simply degrade without ever binding to a cell”, then tighter binding is “better” for the virus.

    But it’s also possible that a very strongly binding spike protein would actually be _counterproductive_ to the virus, since many of the new virus particles would bind to the remnants of the surface receptors as soon as an infected cell lyses. So what is the ideal rate of spread for a virus particle? Is it most advantageous to infect the previous cell’s first neighbor? Fifth neighbor? Fiftieth? Or does a virus work best by holding off for a few minutes and hitching a ride on a droplet to a different part of the lungs/airway entirely? To what extent does infecting neighbors become counterproductive due to drawing in a local immune response?

    A quick Google search doesn’t find answers to these questions. But I’d be interested to know whether virus spread within an individual is a predictable/modelable dynamic. Because it seems like many reasonable mental models would make the correlation between spike binding and virulence nonlinear.

  8. JasonP says:

    “We have created a unique anti-viral antibody cocktail with the potential both to prevent and treat infection, and also to preempt viral ‘escape,’

  9. Anonymous says:

    I’m a math person, not a biology person, so I am a bit confused by this:

    “At any rate, a hypothetical complete library of merely all the 9-amino-acid proteins comes to a larger number than a reasonable estimate of all the sand grains on the planet (ten to the 18th, ten to the 19th, those kinds of numbers).”

    Without taking the other stuff you mention into account (cysteine bridges, etc), 20^9 = 2^9 * 10^9 = 512 * 10^9 = 5.12 * 10^11. I don’t understand how you’d get an extra multiplier of ~10^7 on each possible sequence, that would roughly be saying every single amino acid in the chain could additionally be linked to any other, regardless of the sequence.

    1. sgcox says:

      9^20 >> 20^9 …

      1. Anonymous says:

        9^20 would be 9 options in each of 20 positions (1 position with 20 options is 20^1, not 1^20). I certainly wondered for a bit if I had it backwards, though, as that does match Derek’s number…

        1. Klagenfurt says:

          Of course it’s 20^9 for a 9-aa peptide. The 20^201 for a 201-aa protein is what gets us in trouble.

          1. Klagenfurt says:

            And since we are wondering where to put it, I volunteer to take (Trp-His-Tyr)67. Don’t ask me why.

  10. Steve Scott says:

    No threatening mutations so far, but what happens when the virus later confronts a vaccine? Would that promote mutations to defeat the vaccine? (like antibiotics that become ineffective over time)

    1. Uke Fudanshi says:

      Does a virus get directly fought by a vaccine? No, right? Just the antibodies it leaves behind, right? Shouldn’t those mimic a natural infection? In that case, how could that happen? Sorry if I’m talking nonsense, I’m just an interested comp sci and japanology student layman, not a virologist…

  11. DTX says:

    To follow Steve’s comment, is it correct that we’ve not seen data to suggest that the virus mutates at a rate that will require regular (yearly?) re-vaccination? (or do we even know?)

    The reason I ask is that Fierce Pharma recently quoted AstraZeneca’s CEO as saying that the new vaccine AZ is supporting (from the University of Oxford) is expected to provide protection for one year. – Just 1 year

    While a yearly vaccination would make it like a flu vaccine, it’s different in that only a fraction of the world’s population gets the flu vaccine, whereas most (~60%) would need a Covid-19 vaccine to stop disease transmission. Yearly dosing would mean production rates of the vaccine would need to be huge.

    Fierce Pharma provided no basis as to why Soriot said the vaccine’s protection should last just 1 year.

    1. Steve Scott says:

      Another point- according to some critics- you could not get an effective booster shot of an adenovirus vaccine such as Oxford-AstraZeneca’s, if the protection wears off over time. That is because the body would have generated antibodies against the (harmless) adenovirus that it encountered for the first time when the vaccine was delivered. The body would remember, and attack and destroy the adenovirus the second time around, before it could deliver the vaccine into the cells. Maybe somebody knowledgeable can speak to that potential issue.

      1. A Nonny Mouse says:

        This is why (if it works) the Imperial approach may be a better delivery method

    2. Charles H. says:

      IIUC, immunity to some corona viruses fades within 6 months after you catch the disease. Also there are reports from China that those who caught COVID-19 earlier are being found without antibodies. Not proof, but reason to be cautious.

      1. a s says:

        Antibodies are expected to go away after some time – that doesn’t mean you lose immunity. It’s the memory cells that store that.

        (I apologize for posting this on a medical blog who presumably knows better than me how this works…)

  12. antibody guy says:

    What about D614G?

  13. eub says:

    Borges’ 201-AA Protein Library of Babel.

    (#TIL that the library concept came from a German SF writer Kurd Lasswitz in 1901. Possibly influenced by Ramon Llull’s ars combinatoria.)

  14. Some idiot says:

    Slightly off topic (but maybe not) there is a report from a large (by Danish standards…) cluster of cases in northern Jutland in Denmark. It has now been established that this strain has not been seen previously in Denmark (or in Europe, for that matter). Precisely the same strain has been found in mink in a farm up in that area (the mink have now been destroyed). It has not been established whether or not the humans infected the mink or the other way around. This could be interesting to follow…

    1. Some idiot says:

      Just a quick update… I just read that mink on another farm have also been tested positive. In this case, the owner’s dog was also tested positive… The authorities are now going to test all mink farms in Denmark.

    2. fajensen says:

      Somebody could make a ton of money by selling a Covid-19 vaccine / cure *now* that works in mink.

  15. Ryan says:

    It’s not only about ACE2: NRP1 is emerging as an important co-receptor:

  16. Esmeralda says:

    It is quite scary if SARS-CoV-2 mutates. Although, it seems that no harmful (or even harmless) mutation has been detected so far.
    But, even if it mutates, and there is a vaccine against it, maybe we will have to take shots every year as it is in common flu

    1. Just another chemist says:

      All viruses mutate, all of the time. It’s just that most mutations are not helping the virus, they are just random

      1. Barry says:

        more than that. Not all viruses mutate at the same rate. Coronaviruses are big (for viruses) because they carry proofreading machinery for their RNA polymerase that most viruses lack. They’re peculiarly slow to mutate. And yes, most mutations to the actual contact surface of the RBD would cost them their virulence.

  17. Barry says:

    I would expect darwinian pressure for more infectivity, but not for more virulence. Tighter binding to ACE2 might produce either (if it were to drop the infectious dose). It is welcome news that we haven’t seen these in the wild yet

  18. anon says:

    It is remarkable that only 6 months after Corona emerged, effective treatments (such as dexamethasone) and potential vaccines have been discovered or are in development. Given that the pandemic took almost 3 months to become firmly established in America, it would not seem to be an entirely unrealistic goal for future pandemics to largely be avoidable by compressing the R&D stage for infectious illness to within this 3 month pre-pandemic arrival interval. If this had been possible with Corona, then (even now) nearly 100% of the mortality from the virus would have been prevented.

    Regarding the combinatorial library, might this approach be a rapid way to achieve the goal of near instant vaccines? If one considered all possible binding patterns, then perhaps choosing the weakest of these patterns could be included as part of vaccination strategy. With CRISPR technology, this might possibly allow extremely rapid vaccine development.

    1. Just another chemist says:

      There are organizations doing just that: CEPI for vaccines and READDI for small molecules

  19. TallDave says:

    old piece, had not realized so many peptide-based COVID/ARDS treatment candidates were this deep into trial, but have not heard any efficacy numbers so not sure if they are close to anything… potential cost/scaling advantages are certainly intriguing

  20. TallDave says:

    great piece btw

    “Where would you put it?” seems it only fits in virtual space

    possibilities in protein design seem unfathomably deep

  21. TallDave says:

    sorry one more, a bit more on topic: Regeneron’s cocktail is allegedly designed to thwart mutations

    (like this one?)

    “The reason, in short, was to thwart the ability of SARS-CoV-2, the virus at the heart of the pandemic, to mutate and cause patients to become resistant to treatment. The two antibodies Regeneron selected, REGN10933 and REGN10987, were less likely to generate “escape mutants” than individual antibodies or other cocktails the company’s scientists tested, they reported (PDF) in the journal Science.”

  22. anon says:

    “a ‘3.25 log’ reduction (99.9% reduction) for a neat concentration in 5 minutes against COVID-19 surrogate Feline Coronavirus.” wiki Hinokitiol describing the product Dr ZinX. Are they serious? 3.25 logs in 5 minutes? Is “neat concentration” actually recognized as standard chemical vocabulary? Sounds more like a description for whiskey.

    Perhaps the zinc enthusiasts on this thread were right after all. If they could demonstrate similar benefits in humans, then perhaps they should set their alarm clock for an early phone call from Stockholm.

  23. Batman says:

    Can someone more educated than I please speak to the recent discussion around D614G mutation (there’s a notion that it has more functional spikes) and whether (1) it is a valid mutation, (2) whether it makes COVID more infectious / lethal, (3) the impact this could have on vaccines, and (4) whether this speaks to the possibility of being “reinfected”?

    Thank you in advance

  24. Rob says:

    Has there been any more work done on the differences between strains since the Scripps study? Surely there are enough cases out there to look for evidence in the real world. The earlier hotspots in NYC and, I think, New England were the same strain as Italy. Which strain are the newer hotspots in California, Texas and Florida? It’s frustrating that we don’t have a better explanation for the wildly different outcomes in various places. Lots of deaths in Europe and the Eastern U. S., very few in Asia and the Western U. S. What is it? Masks? Different strains? Better contact tracing? Old vaccines? Vitamin D?
