Biological News

Watching For Mutations in the Coronavirus

The coronavirus outbreak has been accompanied by a huge amount of sequencing data, as well it should be. Nextstrain is a great place to see this in action: region by region, the spread of the infection can be tracked, often with enough detail to say where the virus must have come in from and how many different starting points it’s had. That all depends on how many different strains are detected in the first place, of course, and in regions (like the US!) where we’re not even doing enough basic RT-PCR swab tests to know the prevalence of the virus as a whole, we’re surely missing a lot of information about deeper things like viral sequence, the number of different mutations, and how they’re distributed. GISAID is another large repository of such data, and it’s growing day by day.

As you can see from these sites, there are a lot of random single-nucleotide changes that have popped up. It’s important to realize that overall, the mutation rate of this virus is not particularly high (in line with other coronaviruses, actually). But they do accumulate; there’s a fearful amount of viral replication going on out there, and not all of it goes perfectly. A single patient may have several mutational strains going at the same time as the virus replicates, and there’s been a report of a person who turned out to be infected simultaneously by two strains with different geographic origins, which is bad luck. Take a look at the Nextstrain diversity panel here to get an idea of what’s been found across the sequence (reproduced below). These are total events, the number of times mutations have been seen, and remember, because of the triplet genetic code, some of these single-nucleotide variations are still going to lead to the same amino acid in the resulting proteins:
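To make the triplet-code point concrete, here is a minimal Python sketch that classifies a single-nucleotide change as synonymous or not using the standard genetic code. The function names are made up for illustration, the codons are written in the DNA alphabet (T rather than U) as sequence databases usually do, and the two example codons are just illustrations rather than calls taken from any of the data sets discussed here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(codon, position, new_base):
    """Apply one base change (position 0-2) and report its effect on the amino acid."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "nonsynonymous"
    return mutated, before, after, kind


def synonymous_fraction():
    """Fraction of all single-base changes to sense codons that keep the amino acid.

    Assumes every codon and every base change is equally likely, which real
    genomes do not obey; treat the result as a rough illustration only.
    """
    same = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # skip stop codons
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                total += 1
                if classify_change(codon, position, base)[3] == "synonymous":
                    same += 1
    return same / total


if __name__ == "__main__":
    print(classify_change("CTT", 2, "C"))  # leucine stays leucine
    print(classify_change("GAT", 1, "G"))  # aspartate becomes glycine
    print(round(synonymous_fraction(), 3))
```

Under that equal-weighting assumption, roughly a quarter of the possible single-base changes to sense codons are silent, and nearly all of those sit at the third codon position.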

Here’s a general look at how that genome is organized. ORF (open reading frame) 1a/b (the big orange bar and green bar in the graphic above) encodes nonstructural enzymes that are involved in replication (and in protein processing to enable that replication – this is the polyprotein mentioned in this post). Then you get into some structural pieces: you have the S region which codes for the notorious Spike protein (the one that gives the virus its distinct appearance and that interacts with human ACE2 for cellular entry), ORF3a, a structural protein that is near the spike and likely modulates the human inflammation response (as it seems to in the earlier SARS coronavirus), the gene for the envelope protein (E), the membrane protein (M), the nucleocapsid protein (N), an RNA polymerase enzyme, and a few others. You can see above that there are known mutations scattered through the whole sequence, with some spots showing more than others. It’s worth comparing these in terms of entropy: how divergent are these mutations? Here’s that plot:

Note that the baseline goes down a lot; most of the mutations in the first count are trivial ones that don’t even change what amino acid gets coded for, or produce something quite similar. That second-biggest peak, for example, right at the end of ORF1a at the ORF1b border, doesn’t look so impressive when you see the entropy number and take into account what these mutations are coding for. But you can see that there are certainly some entropic spikes, like that big one right in the middle of the S protein. There are several ways to think about this: that’s a position in the RNA sequence that’s more likely to lead to something new, and it may also be that a relatively larger number of different amino acid residues are accommodated in the resulting protein. And since we’re looking at the sequences of viruses that have successfully infected humans, that tells you that most all of the ones we’re seeing must still be functional. So that big diversity line in the middle of the Spike protein may not be coding for a crucial residue for human infection – but if you’re developing a vaccine that sets off antibodies to the Spike protein, you would want to be sure that this diversity doesn’t throw things off, and that those antibodies will still recognize the variations. Overall, the relative lack of diverse mutations in the regions shown (like pretty much all the rest of the Spike) could be a sign that weirdo changes there just aren’t tolerated, and generally don’t produce a viable and/or infectious virus.
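For readers who want to see what a per-site entropy number is measuring, here is a small Python sketch that computes Shannon entropy for each column of a toy alignment. The sequences are invented, the function names are hypothetical, and the choice of log base 2 (bits) is an assumption; the scale on any particular site's plot may use a different base or be computed over amino acids instead of nucleotides.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols observed at one alignment position."""
    counts = Counter(column)
    total = sum(counts.values())
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def site_entropies(alignment):
    """Entropy for every column of equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy(column) for column in zip(*alignment)]


if __name__ == "__main__":
    # Invented four-sequence alignment, purely for illustration.
    toy_alignment = [
        "ACGTA",
        "ACGTC",
        "ACCTG",
        "ACCTT",
    ]
    for position, value in enumerate(site_entropies(toy_alignment), start=1):
        print(position, round(value, 3))
```

A column where every sequence agrees scores zero, an even two-way split scores one bit, and a position that tolerates all four bases equally scores two. That is why a site can rack up many mutation events and still look flat on the entropy plot if nearly every sequence ends up carrying the same letter.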

The historical example that this inevitably calls to mind is from World War II: mathematician Abraham Wald was given the job of analyzing the patterns of damage seen in returning combat planes, with an eye to where armoring could be improved. The initial idea was that the areas with the most holes were perhaps getting hit often and should be shored up – but Wald pointed out the survivorship bias problem: these places actually indicated where a plane could take damage and still be able to return. He believed that the distribution of projectiles was probably fairly even, meaning that regions on the aircraft where no shell holes were ever detected were probably the crucial ones to armor! (Note: the accounts of this have been embellished over time, but the fundamental story is accurate – see this post at the American Mathematical Society about the math behind Wald’s work, and note especially the postscript). In those plots above, we are seeing the places that you can shoot through the coronavirus genome and still return a working pathogen.

We were just talking about mutations that still produce a functional virus – what about the ones that produce something worse? Worse for us, I mean? That’s where the WWII airplane analogy breaks down a bit; no gain-of-function was probably produced by shooting off parts of the wing. But then combat aircraft are under selection pressure only through human mediation, while viruses are on their own.

We already know that the receptor-binding domain of the Spike protein is one of the more variable regions – that’s the peak that you see in those plots above in the S gene. And that’s because it’s so important for viral entry into cells – without which there is no replication at all. There’s some evidence that this might be undergoing some positive selection; there are several subtle signs in the way these mutations are coming on that make this a possibility (see that last preprint link for more details). And positive selection means what you think it means: a change that is an actual advantage for the virus and gives the new form an edge on the existing strains. In general, we could be uncomfortable with its products. They could tend to the “easier to catch, faster to multiply” end of things, although you could also imagine a “causes less trouble and fewer symptoms” variety also getting selected for, which is what tends to happen in the long run with pathogens (see the discussion of attenuated viruses here). Unfortunately, that latter phenotype can develop through selection both on the pathogen and on the hosts, that is to say us, and we’d rather keep the viral thumb off that scale. But it happens: a good part of the European population descends from people who didn’t die during the Black Death, and you can still see it in their genes.

Tracking viral mutations as an epidemic spreads has its odd features. For one thing, “founder effects” and population bottlenecks are well recognized in all species as having very noticeable evolutionary consequences. If a population grows out from a small cohort or if a once-larger population goes through a contraction to a small number of individuals, the loss of genetic diversity leaves a lasting mark. And that’s pretty much what happens every time a virus jumps from one person to another! As mentioned above, the study of viral genomes is all about survivorship bias. It doesn’t take many viral particles to infect someone under favorable transmission routes (such as, in this case, inhaling a small floating droplet coughed out by someone else). So a virus spreading through a population may be going through a long series of consecutive founder events/bottlenecks, punctuated by bursts of replication once a new host is infected, each time with new possibilities for random-error mutations and selection pressure from the host’s defenses. Factor in the extreme speed with which the viruses can replicate when they get the chance, and you have a literal example of the phrase “evolution in action”, right in front of your eyes. All that bottlenecking can be more of a factor than selection in the host, because the number of variants that show up in a single infected person will surely not all make the leap to the next host.
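The serial-bottleneck idea can be sketched as a toy simulation. The Python below is a deliberately crude model under stated assumptions: no new mutations, no selection, purely random sampling, and made-up values for the bottleneck size and within-host population. All names are hypothetical, and nothing here is fitted to real transmission data.

```python
import random


def transmit(population, bottleneck_size, rng):
    """Pick the handful of particles that found the next infection."""
    return [rng.choice(population) for _ in range(bottleneck_size)]


def grow(founders, burst_size, rng):
    """Expand the founders into a large within-host population (no new mutations)."""
    return [rng.choice(founders) for _ in range(burst_size)]


def serial_bottlenecks(variants, hosts, bottleneck_size, burst_size=1000, seed=0):
    """Return the number of distinct variants present in each successive host."""
    rng = random.Random(seed)
    population = grow(variants, burst_size, rng)
    history = [len(set(population))]
    for _ in range(hosts):
        founders = transmit(population, bottleneck_size, rng)
        population = grow(founders, burst_size, rng)
        history.append(len(set(population)))
    return history


if __name__ == "__main__":
    starting_variants = [f"variant-{i}" for i in range(10)]
    for size in (1, 3, 30):
        print(size, serial_bottlenecks(starting_variants, hosts=8, bottleneck_size=size))
```

Because this sketch never adds mutations, the variant count can only stay flat or fall from host to host, and a bottleneck of one particle wipes out all diversity in a single jump. The real situation adds fresh mutations during each burst of replication, which is what keeps the sequence databases growing.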

Now let’s discuss a new preprint in light of all this. This paper looks in detail at 11 mutated forms of the Cov-2 virus and goes further to functionally characterize them a bit, such as how easily they infect cells in culture. That’s crucial information, and we’re starting to accumulate enough of it to draw some conclusions. The 11 mutations are shown below – these are laid out just like the Nextstrain graphs above, and now you know your way around the coronavirus genome a bit and can see some things that are going on. (Those who really want to dive into the architecture should go here!) These mutations were determined by deep sequencing with a huge number of reads, which is possible partly because the total viral sequence just isn’t that big (although the coronaviruses have the biggest genomes of all the RNA viruses in general):

These are from patients in Hangzhou fairly early on in the epidemic, which is good. To be honest, we still don’t have as good a profile as we would want of the early mutational diversity of the virus as it got going in Wuhan; people were generally too busy to do a lot of deep sequencing. The 11 mutants shown above were the ones that got further study; the paper identified 33 mutations in all, and it’s notable that 19 of these were still novel as of a March 24th check by the authors in the GISAID database. Among the interesting things in these 11 sequences is that two of the mutants described appear to be foundational to some of the strains now spread across the rest of the world. And take a look at the ZJU-11 genome – it has four mutations just in the ORF7b gene, three of which are consecutive (!) That codes for an accessory (non-structural) protein that ends up stuck in the Golgi apparatus of the cell, which for the non-biologists is not pronounced as it is in the Phish song, and I’m not sure if anyone knows quite what the viral protein is doing down there.

This team infected Vero-E6 cells (a standard cell model) with all 11 of the strains above and watched for differences in viral load by RT-PCR, along with electron microscopy of the cells themselves. There were few if any differences at the 1, 2, and 4-hour marks after introduction of the viruses. But at 8 hours, the ZJU-6, ZJU-7, ZJU-9, ZJU-10, and ZJU-11 strains all had a higher viral load as compared to the others. At the 24 hour mark, all of them had a noticeably higher viral load still, except for ZJU-2 and ZJU-7, which had not kept up. The ZJU-10 and ZJU-11 had taken off more drastically than the others by this point. (Update: fixed this paragraph and the one following, since I had mixed the paper’s RT-PCR cycle threshold values liberally with the viral load numbers last night).

ZJU-1, whose sequence fits more with a cluster of mutations found mostly in Europe, had 19 times the viral load of ZJU-2 and ZJU-8, which are more in the Seattle/Washington state clade – these differences were already becoming apparent at 24 hours and were statistically significant (reproducibly so) at 48. And when you compare the top and bottom-performing strains, ZJU-10 had 270 times the viral load of ZJU-2 at the 24-hour mark! So there are noticeable differences in the cell assay, but the question is how these might translate to human infections. The authors note that ZJU-11, the other top performer with one of the highest viral load numbers at 24 and at 48 hours, turned out to be very bad news for the patient that it was isolated from, who tested positive for 45 days (!) and took the longest of all of the patients in this study to be discharged from the hospital. Recall that this one had a heavy mutational signature in the ORF7b gene; trying to see if this sort of thing correlates with slow recovery in humans (and if so, how) would be very worthwhile.
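Since RT-PCR cycle threshold (Ct) values and viral load numbers are easy to mix up, here is a short Python sketch of how the two relate. It assumes idealized amplification, with the template doubling every cycle (an efficiency of 1.0); real assays fall short of that and need a standard curve. The function names and the example Ct values of 20 and 25 are invented, while the 19-fold and 270-fold figures are the ones quoted above.

```python
import math


def fold_difference(ct_a, ct_b, efficiency=1.0):
    """How many times more template sample A holds than sample B.

    A lower Ct means more starting template, so a smaller ct_a gives a
    fold value above 1. efficiency=1.0 assumes perfect doubling per cycle.
    """
    return (1.0 + efficiency) ** (ct_b - ct_a)


def ct_gap_for_fold(fold, efficiency=1.0):
    """Number of cycles separating two samples that differ by `fold`."""
    return math.log(fold) / math.log(1.0 + efficiency)


if __name__ == "__main__":
    # Hypothetical Ct values: a 5-cycle gap under perfect doubling.
    print(fold_difference(20.0, 25.0))
    # Fold differences quoted in the post, converted back to cycle gaps.
    print(round(ct_gap_for_fold(19), 2))
    print(round(ct_gap_for_fold(270), 2))
```

On this idealized scale, a 19-fold difference is a bit over four cycles and a 270-fold difference is about eight, which shows how compressed the Ct axis is compared with the viral load it stands for.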

So there appear to be real differences in cell infection assay data even when you just look at 11 variants of the coronavirus. One of the references above, posted on Biorxiv back on March 17, had already been looking at this at the zoomed-in level of binding of the Spike protein’s receptor-binding domain to human ACE2 protein. That could be one measure of infectivity, but I would argue that cell infection data are closer to reality (albeit a mixture of different effects). There’s always the question of which cell lines you choose to infect, though, and I suspect we’ll see more investigations along those lines to make sure that we’re not getting fooled that way. You can bet that more attempts to correlate viral sequences with such cell assays (and with patient outcomes) are underway as we speak. We need to know if there are nastier varieties out there and how such things might be spreading, of course, and these data are going to have to inform the research groups working on vaccines and monoclonal antibody treatments. Molecular biology, structural biology, cell biology – these disciplines and more are going to reveal the coronavirus’ secrets and tell us how best to fight back.

79 comments on “Watching For Mutations in the Coronavirus”

  1. Jeff Brender says:

    Not so sure about the author’s interpretation. How do we know that the differences in the infectivity data in Figure 3 don’t simply reflect the initial differences in viral load, which is impacted by the sampling technique, the patient characteristics and other factors?

    1. Gan Kad says:

      The infectivity was not determined based on the patient sampling data. Instead, experiment infectivity was determined in vitro using Vero-E6 cells in cell culture where equal titers of the different strains were infected.

      1. Tom Elliott says:

        I did not see anything in the paper about equal titers. You can see the variation at early times in Ct, indicating at least 2^5-fold variation in applied virus. Also, these are pools of virus grown out directly from the samples.

  2. B says:

    Great post, Derek. Missing a hyperlink here? -> [see the discussion of attenuated viruses here]

    I’m very curious to know how some of these mutations, alongside variance in host factors, results in altered glycosylation states of the envelope proteins and how that may change transmissibility.

    Really wish I could find the pre-print (can’t for the life of me now), but there was one report of a patient that had close contact with over 300+ others without a single person developing symptoms. Bizarre!

  3. Marc says:

    Recently unearthed: an ancient Roman poem about viral mutation.
    It’s called “The Metamorphoses of COVID”.

    1. Trebitsch says:

      Video meliore, proboque, deteriore secor!

  4. Jeffrey Coe says:

    Could the bright side of Coronavirus and the resulting research be complete viral immunity for all humankind?
    Only with all nations and all people working together.

    1. LdaQuirm says:

      That’s not how immunity works, unfortunately. Even some gene-engineered human 1000 years in the future whose ribosomes and transcriptases only accept “signed” RNA/DNA would still be susceptible to a sufficiently clever virus. Although in this hypothetical distant future I guess they would be immune to natural viruses.

      1. fef says:

        I mean if we are going into the “signed RNA” sci-fi territory, it is not entirely inconceivable to create an organism that is chirality-reversed compared to anything natural. Or swap out all the nucleobases to synthetic ones, and have the relevant enzymes modified to work with them. Add in some nonselective DNA/RNA depolymerases, just in case, and presto! Immunity to all viri.

        1. LdaQuirm says:

          *All _natural_ viruses.

        2. Jari says:

          It is unclear to me whether my response is addressing the correct comment, so I want to make it clear I am responding to “fef said at 22 April, 2020 at 4:03 am:” message

          ” Or swap out all the nucleobases to synthetic ones, and have the relevant enzymes modified to work with them.”

          What would the ” synthetic ones” be? How are they different from the nucleobases? What distinction is being made here?

      2. It’s actually straightforward (and economically rewarding) to make an organism immune to all natural viruses.

        1) Pick an amino acid that several codons code for.
        2) Pick one codon, and gene-edit it to a different equivalent codon in the entire genome.
        3) Modify the organism to make that codon mean STOP.
        4) Any natural virus will have that codon many times, and will be 100% incapable of replicating even once in this organism, so it can’t evolve around this barrier.

        Sounds like science fiction – but George Church’s group is doing it in E.coli, or maybe has done it by now. (Related link on my name.)

    2. loupgarous says:

      Nature doesn’t count noses or passports, unfortunately. The prime determinant of who develops immunity is so much more complex than that, and involves the factors Derek mentions in today’s article. Derek’s articles take a little more time than most blog articles because he (quite correctly) links to other papers which are relevant to the article’s topic.

      I won’t say we won’t miss really significant information if, say, a clade of SARS_CoV2 from the Seychelles Islands (which, say, was brought over by tourists and mutated in ways that altered its function on the islands) isn’t discovered and analyzed, but the most important information will probably turn up in clades infecting more people, and after we understand the selection pressures at work in human cases of SARS_CoV2 much better than we now do.

  5. JasonP says:

    >>>where we’re not even doing enough basic RT-PCR swab tests to know the prevalence of the virus as a whole, we’re surely missing a lot of information about deeper **thinks** like viral sequence, the number of different <<<

  6. luysii says:

    “RT-PCR swab tests to know the prevalence of the virus as a whole”. That’s for people acutely infected with the virus, but another form of prevalence is the number of people carrying antibodies to the virus. These are people who have been infected long enough ago to have developed antibodies. Two studies from SF (Stanford) and LA coming in the past few days showed that the prevalence of antibodies to the virus was one to two orders of magnitude higher than the number of cases where the viral genome was present; both studies came in at about 4% of those tested.

    This possibly is good news, and if replicated, and if the antibody tests are specific, because it implies that, virulent as it is in the elderly and nursing home population (accounting for almost half the fatalities), the virus just isn’t as bad as we thought.

    Given the numbers in the catchment areas of the tests, hundreds of thousands of people in LA and SF areas have the antibodies. Even better it may imply that many in our population have a natural immunity to infection.

    Here’s a link to the LA work — (

    Further tests are planned in the populations, and it will be fascinating if the 4% figure increases. If it doesn’t we may be near the maximum of the epidemic.

    One more thought — synonymous codons don’t always act the same. There is something called codon bias, particularly for leucine which has 6 codons which code for it. But that’s for another time.

    For a bit more on this —

    1. tt says:

      Not sure about the LA study, but the Stanford one will likely have to be retracted as it’s been widely reported that there were severe methodological errors and even some basic math errors in the analysis. Also, the current antibody tests are essentially worthless given the high false positive rate (I believe the CEO of Roche made that exact claim today). For now, the prevalence of people with antibodies to Covid is a complete unknown in the U.S.

      1. heteromeles says:

        Not an expert and trying to stay out of the whole Santa Clara debate, but Josh Marshall made the point this morning that if the Santa Clara numbers are correct, then, for New York City to be getting the death toll it’s seen so far, effectively the entire city has to be infected (assuming he did his math right, and he admits he’s a political wonk, not a statistician). False positives are a hindrance, and if the real positive rate is below the false positive detection rate, it’s not clear how useful a study using that detection method is going to be.

        1. luysii says:

          As always, new data is always welcome, even if somewhat conflicting. The false positive rate in the NY study would have to be far greater than 5% to discount this study. Since NY has more cases than anyone else the data are consistent with the California studies.

          An estimated 13.9% of the New Yorkers have likely had Covid-19, according to preliminary results of coronavirus antibody testing released by Gov. Andrew Cuomo on Thursday.
          The state randomly tested 3,000 people at grocery stores and shopping locations across 19 counties in 40 localities to see if they had the antibodies to fight the coronavirus, indicating they have had the virus and recovered from it.
          With more than 19.4 million people residents, the preliminary results indicate that at least 2.7 million New Yorkers have been infected with Covid-19.

      2. Anon says:

        Can anyone name a country that has accurately measured the case fatality rate?

        1. loupgarous says:

          Even in South Korea, which is being touted by the press as the success story in Covid-19 monitoring, the appearance of over 100 patients who’ve tested clear for Covid-19 two days, but now show up positive again suggests a number of explanations
          – latent undetectable infection that reactivates after repeated “negative” tests
          – false negatives in those who tested negative for two days running
          – false positives when those patients were first tested
          – seroconversion not being evidence of immunity (maybe the innate immune system knocks SARS_Cov-19 down before the adaptive immune system has a chance to make antibodies?)
          – and there are more possibilities being discussed.
          I hope the Koreans testing active after being discharged as uninfected aren’t really infected. But the sad truth is that even the nation which did the testing and tracing of contacts we have not isn’t sure what’s going on in those cases.

          1. Fenty says:

            I believe the KCDC conducted some blood cultures on some patients that had again tested positive. The cell cultures didn’t grow or no live viral cells in their blood. It’s had them begin leaning heavily that the PCR tests are picking up viral fragments as opposed to live virus in the recovered patients.

        2. Some idiot says:

          The best one I have seen so far is from the government office of statistics in the UK. Not perfect, but probably useful. At the moment, the only COVID-19 deaths registered in the UK are hospital deaths. The office of statistics looked at other measures as well (if I remember correctly, such as excess deaths in, e.g., nursing homes, compared to what would be expected at this time of the year). Their number came out at about 50% higher than the “official” number, if I remember correctly.

    2. More than 1800/2500 in a single prison tested positive. That seems to put an unhappy upper bound on any chance of preexisting immunity. (Link in my name.)

  7. Highwater says:

    This is all way over my head (especially the Phish reference), but I have to wonder: is there any possibility (let-alone *evidence*) that differences in strains are responsible for any significant portion of asymptomatic cases?

    1. Gan Kad says:

      The symptomaticity of an individual will be dependent on far too many factors – age, innate immune response (both speed, as in how early the body catches it, and amount, as in how well it fights it off), any other infectious or other agents the body is devoting resources to etc., Strain difference will certainly play a role in all this, but it is unlikely to be the sole determinant.

      1. Highwater says:

        That is, unfortunately, pretty-much what I expected. Not that we’re likely to have enough data on asymptomatic cases for it to matter even if mutations *were* of significant relevance. Just makes the WWII reference more appropriate, I suppose.

        As for Phish, I just never liked them; too-much of a Grateful Dead fan, I suppose… but I have to give them props for playing acoustic in the street after the power went out during a show they played here.

    2. Philgro says:

      The Phish reference was the one part I understood!

      1. Brian R says:

        i only know enough about Phish to know I don’t “understand” most of any Phish song, not that that is the point of Phish songs, but the post triggered me to google “Phish golgi” and I got a nice smile listening to the song on youtube. and as is generally the case, Derek is right, Trey Anastasio mispronounces it – The second “g” is soft, not hard as it is pronounced in the song.

        I briefly wondered how that would play in Northern Italy, where Golgi performed his important staining innovations, but then realized Phish won’t be performing there, or probably anywhere, for quite some time…

        1. Philgro says:

          They’ve been performing that song for over 30 years, surely they must have played it in Italy in all that time. If you want to speculate on how Italians respond to their lyrics, please look up “You Enjoy Myself”- that one might not go over so well.

    3. sdep says:

      The patient details in the paper show no particular association of symptoms to the viral loads determined by RT-PCR from the in vitro cell culture experiment.
      Instead, the three oldest patients (who all had high blood pressure) had severe or critical symptoms, with the eldest being the only patient treated in intensive care. These three patients yielded virus isolates with low, medium and high RT-PCR viral loads in the cell culture assays.
      The three patients with the least severe symptoms were aged 4 months, 34, and 36.
      So it seems that the familiar pattern of age and co-morbidity is more predictive of disease severity than virus genome variant in these eleven patients.

      Another point in the paper has led to media speculation about lethal mutant European strains. The ZJU-1 sequence, which carries the non-synonymous spike protein substitution S-D614G gave a 19-fold higher viral load at 24 hours than the slowest-replicating variants (ZJU-2 and ZJU-8). The paper notes that S-D614G is common in European isolates.
      Without having in vitro or patient phenotypes for the various European isolates, we can’t exclude the possibility that some variants have higher pathogenicity. However, looking at we see that the S-D614G mutation is present at very different frequencies in different European countries – highest in Italy and France, lower in Germany and the Netherlands, and lowest in Spain and the UK. There is no obvious relationship between allele frequency and morbidity or mortality across these countries.

  8. luysii says:

    “RT-PCR swab tests to know the prevalence of the virus as a whole”. That’s for people acutely infected with the virus, but another form of prevalence is the number of people carrying antibodies to the virus. These are people who have been infected long enough ago to have developed antibodies. Two studies from SF (Stanford) and LA appearing in the past few days showed that the prevalence of the antibodies was one to two orders of magnitude higher than the number of cases where the viral genome was present; both came in at about 4% of those tested.

    This possibly is good news, and if replicated, and if the antibody tests are specific, it implies that, virulent as it is in the elderly and nursing home population (accounting for almost half the fatalities), the virus just isn’t as bad as we thought.

    Given the numbers in the catchment areas of the tests, hundreds of thousands of people in LA and SF have the antibodies. Even better it may imply that many in our population have a natural immunity to infection.

    Further tests are planned in the populations, and it will be fascinating if the 4% figure increases. If it doesn’t we may be near the maximum of the epidemic.

    One more thought — synonymous codons don’t always act the same. There is something called codon bias, particularly for leucine which has 6 codons which code for it. But that’s for another time.

    For a bit more on this —

    1. PV=nRT says:

      That Stanford study was only a preprint, not a fully reviewed publication, and it’s being pilloried for sampling bias; I’d be surprised if it were accepted for publication in current form. The main issue is that they got their subjects by requesting people to come down for testing — on Facebook. So you’d be much more likely to want to come down for a test if you’d felt super sick in late Feb, i.e., major self-selection bias.

      1. luysii says:

        That’s true, but the LA Times article (link above) contains the following “The Santa Clara study recruited around 3,300 participants from social media, which has raised some concerns that the results may not be representative of the county as a whole. The researchers made adjustments to their data to account for that problem.

        The study was composed differently in Los Angeles; 863 adults were selected through a market research firm to represent the makeup of the county. ”

        Santa Clara is the Stanford study. The fact that the numbers agree is hopeful.

      2. SlingMeSomeHash says:

        It’s true that there was selection bias. But the authors also tried to account for that.

        More to the point: There are now a huge number of people with a bias towards dismissing any research that might prick a hole in their assertions regarding how decimating Coronavirus will be.

        I mean, if you shut down the economy for this, and someone later presents data showing that you didn’t need to, that’s pretty damning.

        You also have a lot of crap predictions made using sound models but with worse-than-crap input data. See Neil Ferguson’s first predictions for one particularly spectacular example.

        So I would take the hammering of the Stanford paper (while appreciably worse doomsday prognostications are accepted more willingly) to reflect a bias on many of the people doing the hammering.

        1. luysii says:

          Nonetheless, in the vulnerable it can spread like wildfire. Consider the Soldier’s Home in Holyoke, Massachusetts. The statistics are truly ghastly. There were 210 vets living at the home. As of 21 April, 52 have died due to COVID19, 94 more are alive but positive for the virus, so 69.5% of the vets living there are infected or dead.

          1. march21 says:

            But in Pine Street Inn homeless shelter 36% were positive for the virus but were symptomatic.


    2. Derek Lowe says:

      Note that a number of statisticians have expressed vocal concerns about this work. . .

      1. Anonymous says:

        This is very easy to explain. If covid19 is no more deadly than the flu, then the economy can be opened up. If the economy improves, then Trump will be re-elected. Therefore, the Stanford and LA studies must be suppressed.

        1. Derek Lowe says:

          And the people pointing out the methodological and statistical flaws in those studies are. . . ?

          And the excess corpses in New York and elsewhere are. . . .?

          1. Anonymous says:

            What does the incidence rate in California have to do with the incidence rate in NY?

          2. Anonymous says:

            What does the incidence rate in California have to do with the incidence rate in NY?

          3. luysii says:

            The New York data (unavailable at the time most of these comments were written) essentially confirms the two studies from California. For details —

          4. Derek Lowe says:

            The thing is, the NYC data should be more accurate because of the higher prevalence, right?

          5. euro-peon says:

            There is also the little matter of the rest of the world who don’t have a Trump to re-elect/vote out in November. Unless the existence of us godless socialists over here is also an elaborate hoax and we’re just media hirelings with funny accents.

          6. Derek Lowe says:

            Yeah, that’s always been an interesting part of the “all a hoax” worldview, that so many people around the world were willing to play along and act as if they had a virus spreading through their population. . .

    3. KazooChemist says:

      I also read somewhere that the false positive rate of the test they used was almost as high as the signal they detected. I have not read the actual preprint to see how (or if) they corrected for that. I will have to go look it up.

      1. AVS-600 says:

        This is something I’ve been trying to find out more about, to little avail. It seems like most studies of asymptomatic populations, whether they are measuring viral load or antibody development, turn up results of “about 3-5% infected”. This has been true in Vo Euganeo, in the LA study, and in South Africa (which has started a widespread testing campaign of people without symptoms). What are the chances that rather than measuring the rate of asymptomatic disease in the population, we are commissioning studies to measure the true false-positive rates of various tests for the virus?

        1. Omar Stradella says:

          That’s wrong. The testing in Vò showed that “Notably, 43.2% (95% CI 32.2-54.7%) of the confirmed SARS-CoV-2 infections detected across the two surveys were asymptomatic.”

      2. KazooChemist says:

        Here is a link to an analysis of the data. Much of it goes over my head, but the conclusions seem to recommend taking the results with a grain of salt.
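The false-positive worry raised in this thread can be made concrete with a little arithmetic. The sketch below is only an illustration: the sensitivity, specificity, and prevalence values are hypothetical placeholders, not figures from the Stanford or LA studies, and the correction shown is the standard Rogan-Gladen estimator rather than whatever adjustment the preprint authors actually used.

```python
def expected_positive_fraction(prevalence, sensitivity, specificity):
    """Fraction of all tests expected to read positive (true + false positives)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos + false_pos


def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a positive result reflects a real infection."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def corrected_prevalence(observed, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from the raw positive fraction."""
    estimate = (observed + specificity - 1.0) / (sensitivity + specificity - 1.0)
    return min(1.0, max(0.0, estimate))  # clamp to a valid proportion


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the effect.
    prev, sens, spec = 0.01, 0.80, 0.98
    raw = expected_positive_fraction(prev, sens, spec)
    print(f"raw positive fraction: {raw:.4f}")
    print(f"share of positives that are real: {positive_predictive_value(prev, sens, spec):.2f}")
    print(f"corrected prevalence: {corrected_prevalence(raw, sens, spec):.4f}")
```

With inputs like these, most of the raw positives would be false ones, which is why a small error in the assumed specificity can swing a low-prevalence estimate so much.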

  9. Toni says:

    “The authors note that ZJU-11, which had pretty much plateaued at 24 hours and had one of the lowest viral load numbers there and at 48 hours”
    If I understand it correctly, sample 11 shows the strongest decrease in Ct which means that the virus load is highest even if there is no further decrease (Ct) after 48 hours post infection.

    1. Algirdas Velyvis says:

      Signal boosting this comment. Derek, it seems you misread the qPCR results in their Figure 3. Lower Ct = higher viral load. To quote the manuscript itself:

      threshold values, Ct, were used to quantify the viral load, with lower values indicating
      higher viral load. Because the results based on the three genes are highly consistent (R >

      1. Derek Lowe says:

        Right you are! I had grasped this while reading the paper during the afternoon, but my brain had decided to grasp something else by the time I finished the post last night. Fixing that now.
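For anyone else tripped up by the direction of Ct: the threshold cycle is the number of amplification rounds needed to reach a detection threshold, so more starting template means fewer cycles. A minimal sketch of the conversion follows, assuming ideal amplification (a doubling per cycle) unless an efficiency is supplied. The Ct values used are made up for illustration and are not taken from the paper.

```python
def fold_change_from_ct(ct_reference, ct_sample, efficiency=1.0):
    """Relative starting template of a sample versus a reference.

    efficiency=1.0 assumes a perfect doubling every cycle; real assays
    are usually somewhat below that, so treat the result as approximate.
    """
    return (1.0 + efficiency) ** (ct_reference - ct_sample)


if __name__ == "__main__":
    # Hypothetical values: the sample crosses threshold 3 cycles earlier.
    print(fold_change_from_ct(ct_reference=30, ct_sample=27))
    # And the reverse case: 3 cycles later means less template.
    print(fold_change_from_ct(ct_reference=27, ct_sample=30))
```

So a sample whose Ct drops the most is the one with the largest rise in viral load, which is the point made above.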

  10. Barry says:

    Although the Nature paper cites mutations in the Spike protein, it stresses that the RBD (the docking surface for ACE2) must be conserved for the virus to remain infective. Although there’s been some optimization/positive selection for more affinity to ACE2 over the years, it’s still recognized by Abs to the old SARS virus.

  11. Grammar police says:

    “This are” is a non-functional mutation.

    1. Hap says:

      OT: I like the fact that in Wolfenstein: The Old Blood there is an actual grammar Nazi.

    2. Derek Lowe says:

      Mis-sense indeed!

  12. Tom Elliott says:

    With respect to the NextStrain plots, besides selection at least two other factors will influence representation of a mutation in these samples: where it occurred in the lineage, and inherent mutability of sites. Not an animal virologist so not sure what’s known about the latter, but the first is obviously important. In fact, playing with their web page, it seems that some mutations that were observed at early times also “reoccur” in later lineages, but an alternative explanation might be that these late samples are misplaced on the tree.

  13. JJM says:

    Scientific and/or knowledgeable thoughts on the information linked suggesting that the virus may have a certain course to run then dissipate?

    1. Barry says:

      The general course for a pathogen is to mutate towards less virulence. The optimal pathogen uses its host for propagation/dispersal w/o killing the host*. This famously frustrated repeated efforts to wipe out invasive rabbits in Australia by introducing disease. As these less virulent pathogens spread through the population, they elicit immunity (partial or complete) to both the less virulent mutant and to the more virulent parent pathogen. At this point, the population is no longer “naive”, and even the more virulent strain can’t spread as fast or as far. This was the difference between influenza running around Europe in the 16th century and influenza devastating the Americas (naive populations) in the 16th century. The less virulent strain functions as a cheap, natural (risky) “attenuated” vaccine.
      There are famous exceptions to this trend. Cholera manages to spread very effectively even as it kills its host; we have seen no trend away from virulence in Vibrio cholerae.

      * a really successful STD would make the host sexier

      1. Andrew P says:

        Your last comment is similar to what I am thinking about this virus. What if the Wuhan Virus evolves a mechanism to counteract social distancing, such as eliciting erotic stimulation in its host, so the host overcomes fear and seeks out partners? Other diseases (like rabies) are known to manipulate hosts in order to spread.

        1. peter tetteroo says:

          I love chicks on ventilators!

        2. loupgarous says:

          Coughcough wolbachia coughcough

      2. eub says:

        Or smallpox as a viral example, which has stayed quite lethal while circulating in the human population for some thousands of years.

        Do epidemiologists have an understanding of which pathogens “attenuate themselves” and which don’t? Knowing none of that I’d expect each pathogen would have an ‘equilibrium’ tradeoff curve of lethality versus infectivity — maybe some can find a lot more infectivity by reducing lethality while others can’t, or their infectivity is hard to beat already (smallpox R0 is 5+). And they may also have different kinetics in how they can mutate towards their achievable equilibrium optimum.

        Just shooting my mouth off, if SARS-CoV-2 gets good infectivity through its 90% moderate cases, it may have no ‘interest’ in reducing lethality in severe cases. Especially if severe clinical cases turn out to be largely an immune-response phenomenon and might not be shedding a lot of virions.

  14. DTX says:

    It is sometimes presented as if the idea that the virus is 10-20X more prevalent than we realized is good news (i.e., if the SF & LA data are correct).

    Early estimates of SARS Cov-2 R0 value were mostly ~2-3, suggesting it is far more infectious than influenza.

    If SARS Cov-2 R0 actually should be 10-20X higher than estimated, how is this reassuring? While the mortality rate will be lower, total #s of deaths remains concerning.

    We’ve seen what happens with SARS Cov-2 in retirement homes, i.e., death rates that far exceed influenza. If the SF & LA studies are correct, it won’t reassure me to know that this virus is far more infectious than any other virus we’ve ever seen (even measles has an R0 of “only” 12-16).

    1. Ian Malone says:

      I think you’re mixing up a high transmission rate (R0) with high prevalence. If the number of people who’ve had it is much higher than we know, then the proportion of people who’ve had it who go on to die is lower. The maximum number of people who can have it is everybody, and a lower proportion of everybody dying of covid-19 is good. (We can only know R0, the transmission rate, by looking at things like the increase in the rate of known cases; if our way of measuring that is only finding some of them, then it may not affect R0 that much. If R0 really was much higher, then the number of deaths would be rising much quicker, since the (unknown) number of total infections would also be rising much quicker.)
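The distinction drawn here between prevalence and transmission rate can be sketched in a few lines. All numbers below are hypothetical and chosen only to show the arithmetic: an undercount of infections scales the fatality rate down, while the growth ratio of detected cases stays the same as long as a roughly constant fraction of infections is being found.

```python
def infection_fatality_rate(deaths, confirmed_cases, undercount_factor):
    """Deaths divided by estimated true infections (confirmed x undercount)."""
    return deaths / (confirmed_cases * undercount_factor)


def growth_ratios(series):
    """Period-over-period growth ratio of a case series."""
    return [later / earlier for earlier, later in zip(series, series[1:])]


if __name__ == "__main__":
    # Hypothetical counts, not real data.
    deaths, confirmed = 100, 2000
    for factor in (1, 10, 20):
        rate = infection_fatality_rate(deaths, confirmed, factor)
        print(f"undercount x{factor}: fatality rate {rate:.2%}")

    # If only 5% of infections are ever detected, the detected series
    # still grows at the same ratio as the true one.
    true_infections = [1000, 2000, 4000, 8000]
    detected = [n * 0.05 for n in true_infections]
    print(growth_ratios(true_infections))
    print(growth_ratios(detected))
```

This is only the constant-detection case; if the detected fraction changes over time (say, as testing ramps up), the apparent growth rate would shift too.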

  15. Toni says:
    (China’s early patients unable to shed coronavirus)
    Unfortunately I can’t find more detailed information, e.g. whether the RNA was sequenced or whether there are any shedding deficient mutants (somehow attenuated).
    Is it even known that viruses can lose the ability to be shed?

    1. Barry says:

      The headline is misleading; the writer doesn’t understand what “shed” means to an epidemiologist. These patients continue to test positive, continue to shed virus, continue to be contagious weeks after they’ve recovered from (symptomatic) disease.

      1. Toni says:

        then it means these patients are still contagious?

        1. Toni says:

          sorry, you already wrote it.

  16. cynical1 says:

    I wonder why researchers don’t use Calu-3 cells (lung carcinoma) in all their studies (which COVID-19/SARS-CoV-2 has been shown to infect) rather than Vero cells derived from the kidneys of African green monkeys. I mean, I wouldn’t necessarily expect antiviral effects to necessarily translate all that well to in vivo activity but I’d give it a better shot than Vero cells. Yes, I know that VeroE6 seem to give high replication rates but the reference below doesn’t make them look special. There is a reason that all that HIV work was done in lymphocytic cell lines like MT4, HeLa and PBLs and not in CHO cells over expressed with CCR5 or something to get it to infect them. Personally, I would only be interested in what the infectivity rates of these mutations show in Vero cell lines if I were an African green monkey……………with bad kidneys………….maybe.

    (SARS-CoV-2 infects a lot of different cell lines……….Cell, Volume 181, Issue 2, 16 April 2020, Pages 271-280.)

  17. Greg says:

    Perhaps this is why the consequences of mutations in viral components such as the Spike protein also need to be put in the context of the host. For example, based on the known/reported susceptibility to SARS-CoV-2, the mutations of its putative receptor, ACE2, and the possible consequences these have on its interaction with the Spike protein have already been mapped in silico. Apart from this, the analysis indicates that the list of species that can be potential carriers is perhaps larger than currently appreciated.

    1. cynical1 says:

      Yeah, like the tigers in the Brooklyn zoo which tested positive? (Or maybe a house cat?) BTW – I would have thought that social distancing from a tiger would have been pretty much of a given. I didn’t think that the tigers were in the petting zoo.

      1. Robert says:

        Bronx Zoo

  18. west coast says:

    Thank you so much, Derek. I’m a random untrained person who is interested in how long this virus has actually been around and how virulent different strains might be. I’ve been looking at the genetic data and research while not understanding most of it. This is well and simply written and very understandable; at least, I can get what I want from it and it shows how to get more from other sources.

    I’m kind of wondering if it’s been around in humans a lot longer than anyone thinks right now and maybe a “starter” strain became virulent enough that people started noticing it. Just a thought from a random web surfer with no information or knowledge about any of this.

    1. Derek Lowe says:

      A lot of people have wondered the same thing! Genetic evidence argues against it, though – if it had been in humans for quite a while, it would have done its share of mutating (as we’re already seeing), but the earliest samples are very close to the animal virus isolates. So it looks like one that made the leap across species very recently indeed.

    2. Understanding these highly technical studies is pretty challenging and it doesn’t help that a lot of headlines create hype by exaggerating the significance of research studies.

  19. west coast says:

    Thanks. I thought they were not quite that similar, from what I’d read.

    Well, probably not, but if other people are still wondering that I would say maybe a way to possibly find out (without enough data from China) would be to test the flu samples from the north west coast much further back. I know people in far northern California (almost Oregon) that have been diagnosed with the flu without actually having tested positive for it. We had a very nasty version of it this year and last year, but a lot of people never seemed to catch it.

    Of course, if a mutation of a human strain did start it, would the tests even work on the original strain?

  20. Lauren Holland says:

    Also a non-science newb here. I was wondering if the mutations would be having an effect on the validity of current testing modalities… having read everything above, I’m now wondering if the mutations are responsible for some people developing post viral fatigue. I’ve been a write-off since early March and the western medical community is quite dismissive of the reality of my experience.
