In Silico

Standards of Proof

Here are some slides from Anthony Nicholls of OpenEye, from his recent presentation here in Cambridge on his problems with molecular dynamics calculations. Here’s his cri du coeur (note: fixed a French typo from the original post there):

. . .as a technique MD has many attractive attributes that have nothing to do with its actual predictive capabilities (it makes great movies, it’s “Physics”, calculations take a long time, it takes skill to do right, “important” people develop it, etc). As I repeatedly mentioned in the talk, I would love MD to be a reliable tool – many of the things modelers try to do would become much easier. I just see little objective, scientific evidence for this as yet. In particular, it bothers me that MD is not held to the same standards of proof that many simpler, empirical approaches are – and this can’t be good for the field or MD.

I suspect he’d agree with the general principle that while most things that are worthwhile are hard, not everything that’s hard is worthwhile. His slides are definitely fun to read, and worthwhile even if you don’t give a hoot about molecular dynamics. The errors he’s warning about apply to all fields of science. For example, he starts off with the definition of cognitive dissonance from Wikipedia, and proposes that a lot of the behavior you see in the molecular dynamics field fits the definitions of how people deal with this. He also maintains that the field seems to spend too much of its time justifying data retrospectively, and that this isn’t a good sign.
I especially enjoyed his section on the “Tanimoto of Truth”. That’s comparing reality to experimental results. You have the cases where there should have been a result and the experiment showed it, and there shouldn’t have been one, and the experiment reproduced that, too: great! But there are many more cases where only that first part applies, or gets published (heads I win, tails just didn’t happen). And you have the inverse of that, where there was nothing, in reality, but your experiment told you that there was something. These false positives get stuck in the drawer, and no one hears about them at all. The next case, the false negatives, often end up in the “parameterize until publishable” category (as Nicholls puts it), or they get buried as well. The results in the last category (should have been negative, experiment says they’re negative) are considered so routine and boring that no one talks about them at all, although logically they’re quite important.
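To make those four categories concrete, here is a minimal Python sketch. It is my own illustration and not something taken from Nicholls’s slides: the function names and the toy data are invented, and I’m assuming the score he has in mind is the ordinary Tanimoto (Jaccard) coefficient, true positives divided by the sum of true positives, false positives, and false negatives. Note that the true negatives never enter that formula.

```python
from collections import Counter


def tally_outcomes(truth, observed):
    """Count the four outcome categories for paired boolean sequences.

    truth[i] says whether an effect is really there; observed[i] says
    whether the experiment (or simulation) reported one.
    """
    if len(truth) != len(observed):
        raise ValueError("truth and observed must have the same length")
    counts = Counter()
    for real, seen in zip(truth, observed):
        if real and seen:
            counts["true_positive"] += 1
        elif real:
            counts["false_negative"] += 1
        elif seen:
            counts["false_positive"] += 1
        else:
            counts["true_negative"] += 1
    return counts


def tanimoto(counts):
    """Standard Tanimoto (Jaccard) coefficient: TP / (TP + FP + FN)."""
    tp = counts["true_positive"]
    union = tp + counts["false_positive"] + counts["false_negative"]
    # With no positives of any kind the ratio is undefined.
    return tp / union if union else float("nan")


# Invented toy data: eight hypothetical test cases.
truth = [True, True, True, False, False, False, False, True]
observed = [True, True, False, True, False, False, False, False]

everything = tally_outcomes(truth, observed)
print(dict(everything))
print(f"all cases:      {tanimoto(everything):.2f}")

# Publication bias: keep only the cases that "worked" (true positives).
published = [(t, o) for t, o in zip(truth, observed) if t and o]
reported = tally_outcomes(*zip(*published))
print(f"published only: {tanimoto(reported):.2f}")
```

On this made-up data the full record scores 0.40, while the published-only subset scores a perfect 1.00, which is the heads-I-win effect in miniature.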
All this can impart a heavy, heavy publication bias: you only hear about the stuff that worked, even if some of the examples you hear about really didn’t. And unless you do a lot of runs yourself, you don’t usually have a chance to see how robust the system really is, because the data you’d need aren’t available. The organic synthesis equivalent is when you read one of those papers that do, in fact, work on the compounds in Table 1, but hardly any others. And you have to pay close attention to Table 1 to realize that, you know, there aren’t any basic amines on that list (or esters, or amides, or what have you), are there?
The rest of the slides get into the details of molecular dynamics simulations, but he has some interesting comments on the paper I blogged about here, on modeling of allosteric muscarinic ligands. Nicholls says that “There are things to admire about this paper- chiefly that a prospective test seems to have been done, although not by the Shaw group.” That caught my eye as well; it’s quite unusual to see that, although it shouldn’t be. But he goes on to say that “. . .if you are a little more skeptical it is easy to ask what has really been done here. In their (vast) supplementary material they admit that GLIDE docking results agree with mutagenesis as well (only, ‘not quite as well’, whatever that means- no quantification, of course). There’s no sense, with this data, of whether there are mutagenesis results NOT concordant with the simulations.” And that gets back to his Tanimoto of Truth argument, which is a valid one.
He also points out that the predictions ended up being used to make one compound, which is not a very robust standard of proof. The reason, says Nicholls, is that molecular dynamics papers are held to a lower standard, and that’s doing the field no good.

9 comments on “Standards of Proof”

  1. johnnyboy says:


  2. OldLabRat says:

    Very droll slides, dripping with irony. Having used OpenEye applications for a long time, and having tested their claims, all the lamentations about cognitive dissonance, data selection and needing to do better science apply to OpenEye as well.
    The bottom line hasn’t changed: when doing experiments, know the limitations of the tools, equipment, etc.; test the null hypothesis and evaluate results. Design and run the next experiment. Repeat.
    I do agree that MD, and any other computational methods, should be held to the same standard. While journals could do a better job, it’s really up to the scientists. If nothing else, continued exhortations to do science are always welcome.

  3. Anonymous says:

    MD = Made-up Data

  4. MikeC says:

    I was really excited about protein molecular dynamics back when I was an undergrad – and then I saw a lecture by van Gunsteren (the developer of GROMOS) on the limitations of MD and the forcefields then in use. He definitely wasn’t interested in sugarcoating the situation, and showed that for many of the problems he was interested in, back-of-the-envelope calculations were outcompeting the bleeding-edge jobs getting submitted to our National Center for Super-Duper Computers.
    For me it was a good wake up call.

  5. luysii says:

    The current state of molecular dynamics may be like the early days of functional magnetic resonance imaging (fMRI), where the authors invariably confirmed their initial hypotheses. It became known as pseudocolor phrenology, but the visual images were compelling (as they invariably are, what with 1,000,000 axons leaving each eye and nearly half our brain involved in analyzing what they send).

  6. Brock says:

    @MikeC — exactly. I am not sure if MD with classical FF will ever “work”. But without Wilfred (and Herman and Andy and other honest Scientists) we will never know.
    @OldLabRat “apply to OpenEye as well.”. Very observant. I think they are trying to lead — but they have some ‘splainin to do — and a web-site to do it on 🙂

  7. FEP convert says:

    Anthony makes some good points, and as recently as 12-18 months ago I would have agreed that MD was very long on promise and very short on verifiable delivery of results that drug hunters could actually use. Then along came some significant advancements in FEP methodology and analysis tools, QM-based custom forcefield parameterization, plus better hardware that brought the computational cost to a fraction of the cost of synthesis and assay. All of a sudden, time-after-time prospective results were out-performing not only traditional modeling techniques but my best medchem and structure-based design experience. I’ve never seen anything like it. There’s still some way to go before this method is black-box, but it’s good enough for my group to be monetizing it now in the form of more efficient HTL and lead optimization. Give it a couple of years, and I predict that the rigorous evaluation Anthony demands will be forthcoming in the literature, and this tool will be coming to an industrial computer-assisted drug design group near you.

  8. Thanks for the thoughtful comments, Derek. As for Brock and the OldLabRat’s comments- I would not claim OpenEye has always been without sin; we were definitely more about hope over experience at the beginning. However, for the last five or so years we have been learning and trying, which is why I hold the opinions I now do about MD. We’ve continued to sponsor prospective competitions- which no one else does, organized ACS sessions on validation, a GRC on statistics and published what we consider are careful comparisons. Those who carp typically have done none of this, nor contributed to any of this. As for the FEP convert- I am looking forward to seeing if the considerable investment in FEP by Schrodinger, to which I am sure they are referring, has actually paid off. As I quote in my presentation- it’s not science until someone else does it. And I do hope when these papers finally do appear they are reproducible. Extraordinary claims require extraordinary proof.

  9. brock says:

    @Ant (#8)
    The SAMPL challenges (and *some* of the related) are invaluable. Industry should do more to support (not control)
    We are a very young “Science” — I still remember 78 – 6th grade for me. But that was the CECAM conference — right? ($100 bucks to anybody who can prove Andy stole time on the CECAM computers to run the BPTI C-alpha simulation )
    But you really need to look past Woody — We know QM, we know Boltzmann — we are just waiting for computers to catch up (?)
    (I say QM and not Schrodinger b/c I find the matrix formulation vs. Integral formulation much more amenable to computers).
    Physics will win. But compromise is necessary in the stone age (empirical models).
    P.S. I left Mike out of my list of honest Scientists and find it quite despicable how he is being ripped off by OE (Again !!!)
    Yes — he is an “Academic” — but he is kicking your ass.
    Not unlike when he crushed you in Barry’s lab
    I have already booked my hotel for CUP 🙂
