
Mutants Among Us

Looking back on the Human Genome Project, it’s becoming increasingly odd to read the articles from the time about how we were about to sequence “the human genome”. As if there were only one! Even at the time, of course, it was clear that we weren’t capturing human genetic diversity, but that was seen as a much longer-term goal. The big thing was getting the first sequence, for the first time, which is fair enough.

But the popular press coverage of the event didn’t always emphasize this, which has put some semi-contradictory ideas into the heads of the lay public. Pretty much everyone out there has taken on board the idea that everyone’s DNA is unique, thanks to the ubiquity of forensic genotyping in the news and on TV shows. At the same time, pretty much everyone knows that we’ve Sequenced The Human Genome. I’d hope that nonscientists ended up with a mental picture of mostly-the-same-but-subtly-different, but I’m not sure. But if they haven’t, I think that they’re going to get plenty of chances to hear about it – more each year – because the steadily increasing power and scope of sequencing technology is allowing us to really get to work in the zone between those two concepts. That, as it turns out, is where a lot of the promise that attached to the original human sequencing really resides.

Here’s a paper in Nature Biotechnology (from a large multicenter team) that illustrates the point (commentary in that issue here). We can learn a great deal from “human knockout” phenotypes, which can illustrate the effect of key genes in several ways. You can see a sudden loss of function (usually a bad thing, as in Mendelian diseases, but sometimes a reasonably good one, like loss of PCSK9). And by looking even more carefully, you can find a few people who have such loss-of-function mutations but for some reason are still fine. Both of these strongly suggest avenues of potential treatment, and perhaps not only for the (often rare) Mendelian diseases themselves.

This new paper is the sort of thing that would have been completely impossible not so long ago (which is the sort of line that you can insert into most current stories about gene sequencing). It starts with a look at over 589,000 individuals, with varying degrees of genetic detail available, and narrows down, in a series of more rigorous stages, to try to find those participants who have known disease-bearing mutations but who still show no signs of the corresponding disease. It isn’t easy, because there aren’t many.

There’s good news and bad news. The good news is that out of that large starting set, the team managed to find 13 individuals who have apparently escaped their genetic destinies for 8 severe diseases. The bad news is that they can’t study any of these people further! Why might that be, you ask? Paperwork!

We were unable to recontact any of the 13 candidate resilient individuals identified in this study, often due to the absence of a recontact clause in the original informed consent forms used for the studies from which these individuals were identified. Although recontact was possible for some cohorts in this study (e.g., Mount Sinai School of Medicine Biobank), no candidates were identified from those cohorts. Given this, we were unable to perform additional critical preprocessing steps to further confirm the resilient status of these individuals. Such steps would include confirming that the analyzed DNA matched the correct medical records for each individual, that they had not been diagnosed with the indicated Mendelian disorder, and that they were not mosaics. We consider these preprocessing steps as critical in order to formally characterize candidates as truly resilient.

Thus does science march on, waist-deep in consent forms. Even as it stands, though, this study is a valuable look into both the riddle of genetic resistance and into just how to deal with finding such people in large data sets. It’s a series of tradeoffs: time versus thoroughness, cost versus speed. Even with today’s sequencing techniques, you’re not going to start off by completely sequencing 500,000 people. The techniques used here probably missed some interesting cases, but that was in the service of getting to the ones that they did find. If you don’t really lean on the data along the way, you’re going to go off into the swamp:

The utility of a high-impact screening panel depends directly on rigorous informatics processes and clinical review. Less than 1% of the candidates we initially identified from the screening panel survived our filtering criteria. More than 75% of the initial candidates identified were filtered out due to errors in variant calls resulting from low coverage that made it difficult to reliably call homozygous genotypes, high GC or AT content known to lead to higher sequencing-error rates, or from repetitive sequences known to lead to alignment errors that in turn lead to false small insertion or deletion calls. The remaining false positives represented candidates that failed to pass our established clinical presentation criteria, harbored mutations that were inaccurately represented in the mutation databases, or for which there was insufficient scientific evidence to support the predicted phenotypic impact of the mutation.

But what you’re left with is potentially incredibly worthwhile – if you can contact the people involved, of course. We’re left with what sounds like the pitch for a (probably not very good) movie: walking among us are 13 mutant humans, able to fight off what should be crippling genetic defects. And we don’t know how. But we don’t know who they are, or if we can ever find them again. . .

31 comments on “Mutants Among Us”

  1. Mark Thorson says:

    They’re obviously spies from an alien civilization that inadvertently gave them genomes that wouldn’t normally work, and these guys have discovered the means to root them out — if it wasn’t for those darn consent forms. Now, there’s a movie.

  2. milkshaken says:

    their ability to see in the microwave region of spectra allows the individuals to observe you through the wall, and flee before you can approach them with a consent form

    1. John Wayne says:


  3. Old Pump Kicker says:

    Science advances one funeral^H^H^H^H^H^H^Hconsent form at a time.

    1. Some idiot says:

      Good to see another pre-Windows user…! 🙂

  4. Me says:

    David Icke could probably write a book on them…

  5. HTSguy says:

    The general question here is one of gene penetrance. Here’s an interesting look at the penetrance of variants of a single gene in humans:

  6. luysii says:

    It’s much worse than that. Each of us probably has 205 nonSynonymous variants in our protein coding genes (based on exon sequencing of 2,439 people). Amazingly, over 500,000 single nucleotide polymorphisms were found in the cohort (of which 60% were nonSynonymous; chance would have predicted about 30%). For details please see —

  7. Anon says:

    Doesn’t this just mean that there are too many genetic variables (degrees of freedom) with too few people on the planet (observations) to draw any statistically significant inference?

  8. a. nonymaus says:

    I’m waiting for the discovery of a gain-of-function mutation in the gene for L-gulonolactone oxidase.

    1. Paul D. says:

      Given all the changes and deletions that have accumulated in that pseudogene in the last 63 million years that would be quite a mutation.

    2. Curious Wavefunction says:

      Why mutate a gene when you can simply follow Linus’s advice.

  9. rcyran says:

    The silver lining is the plummeting price and rising adoption of sequencing means these (and plenty more) interesting mutations will be uncovered in a few years.

  10. You appear to be leaving out the possibility that some of these represent errors in mapping DNA samples to patient records; without being able to resample the patients, that can’t be ruled out. Some may also represent mosaicism, which would be interesting.

    500K complete human exomes determined (from either exome sequencing or derived from complete genomes) is probably just around the corner — the installed base of Illumina HiSeq X10s/X5s is probably in excess of 100K genomes/year now (~17K genomes/X10 at full crank; half that for X5 — and both Broad and Venter have more than 2 X10 complements), and there are a number of groups cranking out exomes at ferocious rates. 500K exomes available for one researcher to peer at may take a bit longer, given that private efforts won’t share with everyone, but five years from now it is likely that several million exomes will be available for analysis, given broader adoption of genome sequencing and further cost reductions from technology developments.

  11. Derek Freyberg says:

    The consent form isn’t just a matter of legal nit-picking either.
    I may be quite happy to have my genome, or part of it, sequenced for a particular well-controlled purpose, but I would be very unhappy to have my genome out there in public to be used for whatever nefarious purposes it could possibly be used for – like denying me health insurance or life insurance as a “bad risk” genetically, or perhaps denying my family the same based on an assumption that we share whatever characteristic that causes me to be denied, or denying me employment because a prospective employer thinks I might use excessive sick leave or cost too much in health insurance premiums.
    Considering how other personal information (mostly financial) has been data-mined, I don’t think being very protective of your genomic information is unjustified.

    1. Mark Thorson says:

      He’s one! He’s an alien! Get him!

      1. Derek Freyberg says:

        I assure you I’m not – I was naturalized in 1984 (make of that what you may).

    2. future retro-ist says:

      If you haven’t seen it, the movie GATTACCA (not sure on the spelling) explores this concept quite well.

      1. Pennpenn says:

        Even as a child I hated the main character’s parents in that movie. Yep, “trust in God” for your first child, and when that doesn’t work just get a properly fixed-up kid and leave your first child a barely employable menial with a miserably short lifetime of abuse and/or crime to suffer through. Sure, drama demands that they be incredibly stupid that one time for the story to work, but even so…

      2. Mark says:

        Gattaca ref

        Fascinating concept and all too plausible.

    3. Joe Strummer says:

      According to the AACR, there’s a move in Congress to overturn GINA, so concerns about the privacy of genetic data are reasonable. The AACR lawyer who had said this publicly later confirmed privately that this is indeed an effort driven by Republicans on behalf of insurance companies.

    4. Sarah Grady says:

      Goodness, I call my insurance company once a week to track down what they lost internally. If somebody over there would get their stuff together enough to discriminate against me based on my genes, they would have to violate the ACA to do so, and I would be rather impressed, given their current state of mismanagement.

      1. Sean Fearsalach says:

        That looks as if it would be an interesting article if it was written in English

      2. tangent says:

        I will guess that your insurance company’s mistakes manage to result in them paying out more stingily? When they have the opportunity to raise premiums, I’m confident in their ability to get the details right!

      3. eyesoars says:

        Given that mismanagement always seems to work for their financial benefit*, I’m entirely unsurprised by the amount of incompetence+ I see weekly from my insurance company. Their ability to lose my records every six months, and require multiple hours of on-the-phone time with myself, my spouse, multiple fax machines, and my healthcare providers in pursuit of treatment and payment are of nightmarish proportions, perhaps even comparable to the bonuses their CEO receives.

        Should that change, and benefits accrue from competence in retaining and analyzing data, I am far less than entirely convinced they will maintain their current levels of deliberate, blundering incompetence.

        * At least until they are sued (again) by multiple states’ attorneys general for criminal negligence.
        + Although one might accept the common axiom, “Never ascribe to malice that which can be explained by incompetence,” one does eventually wonder why there does seem to be so much motive and consistency behind the apparent incompetence. Eventually one happens upon the theory that their behavior is driven by undiluted greed, and finds its explanatory power nearly perfect.

        1. Pennpenn says:

          Greed is nothing if not incredibly stupid.

  12. There’s an interesting meta-study to be done here, that should not actually be that difficult to rough out.

    What’s the probability that these 13 resilient individuals are actually all clerical errors? Are they clustered in a small number of cohorts (that may have had shoddier paperwork) or is it in fact perfectly reasonable that none of them happened to pop up in a cohort with the relevant consent forms?

  13. Alteredego says:

    As someone who does genetic modifiers screens for a living I was very underwhelmed by these results. If you had asked me the expected benefit of large-scale sequence mining BEFORE reading this paper I would have been enthusiastic, not so anymore. Most of the data (~400,000/~600,000) and a majority of the ‘hits’ (8/13) come from 23andMe which has a self-selection bias of ‘people interested in personal genetics’ including, wait for it…. patients with genetic disease. Of the other hits only ONE was re-sequenced for validation. There is also significant talk of ‘N of 1 decoding’, yes you read that correctly, which is the only way to carry forward from this type of low yield fishing expedition. More likely this is a flailing money grab by Anne Wojcicki (CEO of 23andMe) to get grant agencies/foundations to pay for a million sequencing kits. IMHO….

  14. Li Zhi says:

    Reminds me of one “problem” with lab notebooks (a painful subject): trying to mine old lab experiments (as recorded in the books) to answer new questions. As far as I can remember, over 35 years never once was a new question answered with old data. (Although suggestive directions to answers might be inferred). On the meta-level, it’s a fool’s errand to attempt to have all (future) eventualities “locked-down”. The perfect being an enemy of the good. I suppose few here need this pointed out but information is filtered data, necessary and sufficient is not complete. In this context, who really thinks adding an additional 3 or 5 forms (in order to capture the next layer of (possible future) concerns) wouldn’t be a waste of resources in most (but not all) cases (as well as dissuade some volunteers)? Hindsight is 20-20, but how likely is it that data patched together in this type of meta-study is of sufficiently high quality to justify the extra creation, collection and storage (and retrieval) requirements of “one more form”?

  15. chiz says:

    Maybe the 13 individuals are chimeras? If the mutation is present in the source of the DNA sample (cheek, say) but absent in the relevant tissues then there is no mystery. We still have no idea how often chimerism occurs – it could be very rare or even very common but 13 out of 500k is possible.
