All Those Drug-Likeness Papers: A Bit Too Neat to be True?

There’s a fascinating paper out on the concept of “drug-likeness” that I think every medicinal chemist should have a look at. It would be hard to count the number of publications on this topic over the last ten years or so, but what if we’ve been kidding ourselves about some of the main points?
The big concept in this area is, of course, the Lipinski criteria, or the Rule of Five. Here’s what the authors, Peter Kenny and Carlos Montanari of the University of São Paulo, have to say:

No discussion of drug-likeness would be complete without reference to the influential Rule of 5 (Ro5) which is essentially a statement of property distributions for compounds taken into Phase II clinical trials. The focus of Ro5 is oral absorption and the rule neither quantifies the risks of failure associated with non-compliance nor provides guidance as to how sub-optimal characteristics of compliant compounds might be improved. It also raises a number of questions. What is the physicochemical basis of Ro5’s asymmetry with respect to hydrogen bond donors and acceptors? Why is calculated octanol/water partition coefficient (ClogP) used to specify Ro5’s low polarity limit when the high polarity cut off is defined in terms of numbers of hydrogen bond donors and acceptors? It is possible that these characteristics reflect the relative inability of the octanol/water partitioning system to ‘see’ donors (Fig. 1) and the likelihood that acceptors (especially as defined for Ro5) are more common than donors in pharmaceutically-relevant compounds. The importance of Ro5 is that it raised awareness across the pharmaceutical industry about the relevance of physicochemical properties. The wide acceptance of Ro5 provided other researchers with an incentive to publish analyses of their own data and those who have followed the drug discovery literature over the last decade or so will have become aware of a publication genre that can be described as ‘retrospective data analysis of large proprietary data sets’ or, more succinctly, as ‘Ro5 envy’.

There, fellow med-chemists, doesn’t this already sound like something you want to read? Thought so. Here, have some more:

Despite widespread belief that control of fundamental physicochemical properties is important in pharmaceutical design, the correlations between these and ADMET properties may not actually be as strong as is often assumed. The mere existence of a trend is of no interest in drug discovery and strengths of trends must be known if decisions are to be accurately described as data-driven. Although data analysts frequently tout the statistical significance of the trends that their analysis has revealed, weak trends can be statistically significant without being remotely interesting. We might be confident that the coin that lands heads up for 51 % of a billion throws is biased but this knowledge provides little comfort for the person charged with predicting the result of the next throw. Weak trends can be beaten and when powered by enough data, even the feeblest of trends acquires statistical significance.
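The coin example is easy to put numbers on. Here is a minimal Python sketch, assuming the usual normal-approximation z-test against a fair coin (the function name and the comparison at 100 throws are illustrative additions, not anything from the paper):

```python
import math

def z_score_for_heads(n_throws, heads_fraction):
    """z-statistic for an observed heads fraction, tested against a fair coin (p = 0.5)."""
    standard_error = math.sqrt(0.5 * 0.5 / n_throws)
    return (heads_fraction - 0.5) / standard_error

observed = 0.51  # 51% heads, as in the quoted passage

z_billion = z_score_for_heads(1_000_000_000, observed)
z_hundred = z_score_for_heads(100, observed)  # same bias, far less data

# Always guessing "heads" is the best available strategy, and it is right 51% of the time
# no matter how many throws were used to establish the bias.
best_guess_accuracy = max(observed, 1.0 - observed)

print(f"z after a billion throws: {z_billion:.0f}")
print(f"z after a hundred throws: {z_hundred:.2f}")
print(f"accuracy when predicting the next throw: {best_guess_accuracy:.0%}")
```

The significance grows with the square root of the number of throws, while the ability to call the next throw stays stuck at 51%. That is the gap between "statistically significant" and "useful" that the authors are pointing at.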

So, where are the authors going with all this entertaining invective? (Not that there’s anything wrong with that; I’m the last person to complain). They’re worried that the transformations that primary drug property data have undergone in the literature have tended to exaggerate the correlations between these properties and the endpoints that we care about. The end result is pernicious:

Correlation inflation becomes an issue when the results of data analysis are used to make real decisions. To restrict values of properties such as lipophilicity more stringently than is justified by trends in the data is to deny one’s own drug-hunting teams room to maneuver while yielding the initiative to hungrier, more agile competitors.
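Correlation inflation is easy to reproduce with made-up numbers. What follows is a minimal Python sketch, not the authors' code: the data are synthetic, and the slope, noise level, property range, and bin counts are all arbitrary assumptions chosen only to give a weak underlying trend. It compares the Pearson correlation across individual "compounds" with the correlation across bin averages.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def binned_means(xs, ys, n_bins, x_min, x_max):
    """Average x and y within equal-width bins on x; empty bins are skipped."""
    width = (x_max - x_min) / n_bins
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        i = min(int((x - x_min) / width), n_bins - 1)
        sums[i][0] += x
        sums[i][1] += y
        sums[i][2] += 1
    mean_x = [sx / n for sx, sy, n in sums if n]
    mean_y = [sy / n for sx, sy, n in sums if n]
    return mean_x, mean_y

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: a lipophilicity-like x between 0 and 6, with a weak
# linear trend in y buried under much larger compound-to-compound noise.
xs = [random.uniform(0.0, 6.0) for _ in range(5000)]
ys = [0.3 * x + random.gauss(0.0, 1.0) for x in xs]

r_raw = pearson(xs, ys)
r_by_bins = {n: pearson(*binned_means(xs, ys, n, 0.0, 6.0)) for n in (5, 50, 500)}

print(f"individual compounds: r = {r_raw:.2f}")
for n_bins, r in r_by_bins.items():
    print(f"{n_bins:>3} bin means:        r = {r:.2f}")
```

With these settings the correlation across individual points is about 0.46 in expectation, yet the handful of bin averages line up almost perfectly, because averaging throws away the scatter within each bin. The apparent strength also shifts with the number of bins, even though the underlying data never change.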

They illustrate this by reference to synthetic data sets, showing how one can get rather different impressions depending on how the numbers are handled along the way. Representing sets of empirical points by using their average values, for example, can cause the final correlations to appear more robust than they really are. That, the authors say, is just what happened in this study from 2006 (“Can we rationally design promiscuous drugs?”) and in this one from 2007 (“The influence of drug-like concepts on decision-making in medicinal chemistry”). The complaint is that showing a correlation between cLogP and median compound promiscuity does not imply that there is one between cLogP and compound promiscuity per se. And the authors note that the two papers manage to come to opposite conclusions about the effect of molecular weight, which does make one wonder. The “Escape from flatland” paper from 2009 and the “ADMET rules of thumb” paper from 2008 (mentioned here) also come in for criticism on this point: binning data from a large continuous set, averaging within the bins, and then treating those averages as real objects for statistical analysis. One’s conclusions depend strongly on how many bins one uses. Here’s a specific take on that last paper:

The end point of the G2008 analysis is “a set of simple interpretable ADMET rules of thumb” and it is instructive to examine these more closely. Two classifications (ClogP<4 and MW<400 Da; ClogP>4 or MW>400 Da) were created and these were combined with the four ionization state classifications to define eight classes of compound. Each combination of ADMET property and compound class was labeled according to whether the mean value of the ADMET property was lower than, higher than or not significantly different from the average for all compounds. Although the rules of thumb are indeed simple, it is not clear how useful they are in drug discovery. Firstly, the rules only say whether or not differences are significant and not how large they are. Secondly, the rules are irrelevant if the compounds of interest are all in the same class. Thirdly, the rules predict abrupt changes in ADMET properties going from one class to another. For example, the rules predict significantly different aqueous solubility for two neutral compounds with MW of 399 and 401 Da, provided that their ClogP values do not exceed 4. It is instructive to consider how the rules might have differed had values of logP and MW of 5 and 500 Da (or 3 and 300 Da) had been used to define them instead of 4 and 400 Da.
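The 399 versus 401 Da point can be shown in a few lines. This is a hedged Python sketch of the two-way classification as described in the quote: the function name, class labels, and the example ClogP of 3.0 are invented for illustration, and sending values that sit exactly on a cutoff to the second class is an assumption, since the quoted definition does not cover them.

```python
def rule_class(clogp, mw, clogp_cut=4.0, mw_cut=400.0):
    """Assign one of the two size/lipophilicity classes from the quoted rules.

    Labels are hypothetical. Values exactly on a cutoff fall into the second
    class here, which is an assumption rather than part of the quoted definition.
    """
    if clogp < clogp_cut and mw < mw_cut:
        return "below both cutoffs"
    return "above at least one cutoff"

# Two hypothetical neutral compounds with the same ClogP and nearly the same MW
pair = [(3.0, 399.0), (3.0, 401.0)]

results = {}
for clogp_cut, mw_cut in [(4.0, 400.0), (5.0, 500.0), (3.0, 300.0)]:
    classes = [rule_class(clogp, mw, clogp_cut, mw_cut) for clogp, mw in pair]
    results[(clogp_cut, mw_cut)] = classes
    verdict = "split apart" if classes[0] != classes[1] else "same class"
    print(f"cutoffs ClogP {clogp_cut:g} / MW {mw_cut:g} Da: {verdict}")
```

Only the 4 / 400 Da cutoffs separate the pair. Under 5 / 500 Da or 3 / 300 Da the same two compounds share a class, so whatever difference a rule "predicts" between them comes from where the line was drawn and not from the 2 Da between them.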

These problems also occur in graphical representations of all these data, as you’d imagine, and the authors show several of these that they object to. A particular example is this paper from 2010 (“Getting physical in drug discovery”). Three data sets, whose correlations in their primary data do not vary significantly, generate very different looking bar charts. And that leads to this comment:

Both the MR2009 and HY2010 studies note the simplicity of the relationships that the analysis has revealed. Given that drug discovery would appear to be anything but simple, the simplicity of a drug-likeness model could actually be taken as evidence for its irrelevance to drug discovery. The number of aromatic rings in a molecule can be reduced by eliminating rings or by eliminating aromaticity and the two cases appear to be treated as equivalent in both the MR2009 and HY2010 studies. Using the mnemonic suggested in MR2009 one might expect to make a compound more developable by replacing a benzene ring with cyclohexadiene or benzoquinone.

The authors wind up by emphasizing that they’re not saying that things like lipophilicity, aromaticity, molecular weight and so on are unimportant – far from it. What they’re saying, though, is that we need to be aware of how strong these correlations really are so that we don’t fool ourselves into thinking that we’re addressing our problems, when we really aren’t. We might want to stop looking for huge, universally applicable sets of rules and take what we can get in smaller, local data sets within a given series of compounds. The paper ends with a set of recommendations for authors and editors – among them, always making primary data sets part of the supplementary material, not relying on purely graphical representations to make statistical points, and a number of more stringent criteria for evaluating data that have been partitioned into bins. They say that they hope that their paper “stimulates debate”, and I think it should do just that. It’s certainly given me a lot of things to think about!

13 comments on “All Those Drug-Likeness Papers: A Bit Too Neat to be True?”

  1. Yes, it’s a great paper and I am glad you highlighted it. For some reason it reminded me of Stephen Jay Gould’s famous essay “The Median Isn’t the Message”.
    Gould’s take-home message was that the true representation of a distribution is the distribution itself, not a metric like mean or median. Both Gould and the authors are making a similar point; only the raw, untransformed data can give us an accurate picture of reality.

  2. weirdo says:

    Yeah, definitely a paper whose time has come. Reminds me very much of a blog I read regularly about 4-5 years ago lamenting the very concepts related here — it went dormant years ago. Makes me think the blogmaster was Peter Kenny or a close associate.
    If you have read Nick Silver’s book you will certainly recognize the primary issue here. Too much data, too little understanding.

  3. Ed says:

    Great molecular crapshoot?

  4. weirdo says:

    Thanks Ed!
    I was thinking of the precursor to what is apparently the current permutation. Definitely something for me to catch up on.

  5. jd says:

    do you mean nate silver?

  6. weirdo says:

    jd– well, if I understand the statistics well enough, Nick and Nate are pretty much the same.
    But then maybe I’m a little rusty on my math.

  7. Anonymous says:

    The problem is not Lipinski’s or any of the subsequent points-to-consider rubrics, but their rapid evolution into the rule of 3/4/5 commandments that placed limits on medchemists’ imaginations and their ability to pursue interesting biological activity not fitting the rules. Now after a decade of stupidity, we are recognizing the obvious: a rule of thumb can be useful only when the dogmatism is left on the sidelines.

  8. Pete says:

    Data is a good servant but a poor master.

  9. on-ice-in-new-england says:

    Reader Exercise: Compare the quality and testability of data used to generate these papers with the quality and testability of data used for climate modeling. Recommend appropriate social policies. Be sure to explain your error bars.

  10. Post-Newtonian says:

    Great Bulletin, Peter (and Carlos).
    Responders to the Universe – Unite!
    Controllers of the Universe – your game is up…

  11. Pete says:

    Will the Controllers of the Universe be easily prised from their bunker?

  12. Rock says:

    I am still one to believe that following the suggestions in all of those papers will lead to higher quality compounds. Some people argue it is stochastic in nature; I say so be it. Why would you choose not to work in higher probability space if the target allows? If you have been around long enough, you have witnessed the perils of working in less desirable space for yourself. At the same time, you would be a fool not to determine the property boundaries of your series using experimental data. That includes LogP.

  13. InSilicoConsulting says:

    Agree with Rock: why work in less desirable chemical space when a lot of chemical space is unexplored, even within the boundaries of such thumb rules?
    The idea is to minimize downstream risks as early as possible. No one claims that these simple rules help with lead optimization. If they did, what would a medchemist do?
    For some more trends
