Drug Assays

A New Way to Estimate a Compound’s Chances?

Just a few days ago we were talking about whether anything could be predicted about a molecule’s toxicity by looking over its biophysical properties. Some have said yes, this is possible (that less polar compounds tend to be more toxic), but a recent paper has said no, that no such correlation exists. This is part of the larger “Rule of 5” discussion, about whether clinical success in general can be (partially) predicted by such measurements (lack of unexpected toxicity is a big factor in that success). And that discussion shows no sign of resolving any time soon, either.
Now comes a new paper that lands right in the middle of this argument. Douglas Kell’s group at Manchester has analyzed a large data set of known human metabolites (the Recon2 database, more here) and looked at how similar marketed drugs are to the structures in it. Using MACCS structural fingerprints, they find that 90% of marketed drugs have a Tanimoto similarity of more than 0.5 to at least one compound in the database, and suggest that this could be a useful forecasting tool for new structures.
Now, that’s an interesting idea, and not an implausible one, either. But the next things to ask are “Is it valid?” and “What could be wrong with it?” That’s the way we learn how to approach pretty much anything new that gets reported in science, of course, although people do tend to take it the wrong way around the dinner table. Applying that in this case, here’s what I can think of that could be off:
1. Maybe the reason that everything looks like one of the metabolites in the database is that the database contains a bunch of drug metabolites to start with, perhaps even the exact ones from the drugs under discussion? This isn’t the case, though: Recon2 contains endogenous metabolites only, and the Manchester group went through the list removing compounds that are listed as drugs but are also known metabolites (nutritional supplements, for the most part).
2. Maybe Tanimoto similarities aren’t the best measurement to use, and overestimate things? Molecular similarity can be a slippery concept, and different people often mean different things by it. The Tanimoto coefficient is the ratio of the features shared by two molecules to the total number of distinct features found in either one, so a Tanimoto of 1 means that the two have identical feature sets. What does a coefficient of 0.5 tell us? That depends on how those “features” are counted, as one could well imagine, and the various schemes for counting them are usually referred to as compound “fingerprints”. The Manchester group tried several of these, and settled on the 166 descriptors of the MACCS set. And that brings up the next potential problem. . .
3. Maybe MACCS descriptors aren’t the best ones to use? I’m not enough of an informatics person to say, although this point did occur to the authors. They don’t seem to know the answer, either, however:

However, the cumulative plots of the (nearest metabolite Tanimoto similarity) for each drug using different fingerprints do differ quite significantly depending on which fingerprint is used, and clearly the well-established MACCS fingerprints lead to a substantially greater degree of ‘metabolite-likeness’ than do almost all the other encodings (we do not pursue this here).

So this one is an open question – it’s not clear whether there’s something uniquely useful about the MACCS fingerprint set here, or whether there’s something about it that just makes it appear to be uniquely useful. The authors do note in the paper that they tried to establish that the patterns they saw were “. . .not a strange artefact of the MACCS encoding itself.” And there’s another possibility. . .
4. Maybe the universe of things that make this cutoff is too large to be informative? That’s another way of asking “What does a Tanimoto coefficient of 0.5 or greater tell you?” The authors reference a paper (Baldi and Nasr) on that very topic, which says:

Examples of fundamental questions one would like to address include: What threshold should one use to assess significance in a typical search? For instance, is a Tanimoto score of 0.5 significant or not? And how many molecules with a similarity score above 0.5 should one expect to find? How do the answers to these questions depend on the size of the database being queried, or the type of queries used? Clear answers to these questions are important for developing better standards in chemoinformatics and unifying existing search methods for assessing the significance of a similarity score, and ultimately for better understanding the nature of chemical space.

The Manchester authors say that applying the methods of that paper to their values shows that they’re highly significant. I’ll take their word for that, since I’m not in a position to run the numbers, but I do note that the earlier paper emphasizes that a particular Tanimoto score’s significance is highly dependent on the size of the database, the variety of molecules in it, and the representations used. The current paper doesn’t (as far as I can see) go into the details of applying the Baldi and Nasr calculations to their own data set, though.
The authors have done a number of other checks, to make sure that they’re not being biased by molecular weights, etc. They looked for trends that could be ascribed to molecular properties like cLogP, but found none. And they tested their hypothesis by running 2000 random compounds from Maybridge through, which did indeed generate very different-looking numbers from those of the marketed drugs.
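
For readers who want to see the arithmetic behind the numbers being argued about, here is a minimal sketch. It is not the authors’ code: fingerprints are modeled simply as sets of “on” key positions, and the toy “metabolites” and “drugs” below are invented for illustration. It computes the Tanimoto coefficient, each query’s nearest-reference similarity, and the fraction of queries scoring above a cutoff.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Shared 'on' keys divided by keys that are 'on' in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_similarity(query, references):
    """Highest Tanimoto similarity between one query and any reference."""
    return max(tanimoto(query, ref) for ref in references)


def fraction_passing(queries, references, threshold=0.5):
    """Fraction of queries whose nearest reference scores above the threshold."""
    hits = sum(nearest_similarity(q, references) > threshold for q in queries)
    return hits / len(queries)


# Hypothetical fingerprints: each set holds the indices of the keys that are set.
metabolites = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
drugs = [
    frozenset({1, 2, 3, 5}),      # shares 3 of 5 distinct keys with the first metabolite
    frozenset({10, 11, 12, 13}),  # shares 3 of 4 distinct keys with the second
    frozenset({20, 21, 22}),      # shares nothing with either
]

print([round(nearest_similarity(d, metabolites), 2) for d in drugs])
print(fraction_passing(drugs, metabolites))
```

Running the same `fraction_passing` call against a different reference list (a random vendor set, say) is the kind of comparison that tells you whether the reference set itself is doing any work.
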
As for whether their overall method is useful, here’s the Manchester paper’s case:

. . .we have shown that when encoded using the public MDL MACCS keys, more than 90 % of individual marketed drugs obey a ‘rule of 0.5’ mnemonic, elaborated here, to the effect that a successful drug is likely to lie within a Tanimoto distance of 0.5 of a known human metabolite. While this does not mean, of course, that a molecule obeying the rule is likely to become a marketed drug for humans, it does mean that a molecule that fails to obey the rule is statistically most unlikely to do so.

That would indeed be a useful thing to know. I would guess that people inside various large drug organizations are going to run this method over their own internal database of compounds to see how it performs on their own failures and successes – and that is going to be the real test. How well it performs, though, we may not hear for a while. But I’ll keep my ears open, and report on anything useful.

35 comments on “A New Way to Estimate a Compound’s Chances?”

  1. Anon says:

    Wow, this looks like a great way forward for me-too drug developers. I haven’t read the paper, but I’m guessing it is offered as support for their argument that there is no such thing as passive diffusion – drugs are active because they ride exclusively on uptake transporters?

  2. Morten G says:

    But many of the drugs from the “Golden Age of Drug Discovery” were basically developed by making analogues of known metabolites, right?

  3. Anon says:

    @2 exactly. Antimetabolites, etc. Did they perform an analysis of post-2000 registered agents, for example?

  4. Haven’t read the paper yet but have similar questions as yours. One of the well-known limitations of the Tanimoto coefficient is that it doesn’t take asymmetry into account. For instance, if A is a smaller molecule and B is a larger molecule and if A looks like a substructure of B, human perception can easily say that A is more similar to B than B is similar to A. The Tanimoto does not take this lopsidedness into account. The Tversky coefficient does. In other words, size dependence is built into Tversky but not Tanimoto.
    As for MACCS keys, I usually like to use ECFP4 fingerprints instead since, unlike MACCS, they take local bond topologies into account and therefore consider the atom environment better, in my opinion. This makes them one of the most popular fingerprints for similarity searches. In some sense they are more accurate than MACCS, and this often makes Tc values calculated with them smaller than those with MACCS. In general the distribution of Tc values for MACCS and ECFP4 would be very different for a given similarity search.
    To summarize, I would be quite interested to know how different the results of this study would be if the authors used Tversky and ECFP4.
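
The asymmetry point in the comment above can be illustrated with a small sketch. It uses the standard Tversky index, common / (common + α·|A only| + β·|B only|), on toy fingerprints represented as sets of key indices; the sets and the α/β choices are assumptions picked for illustration, not values from the paper.

```python
def tversky(a: frozenset, b: frozenset, alpha: float, beta: float) -> float:
    """Tversky index: alpha weights features only in `a`, beta those only in `b`."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


# Hypothetical case: every feature of `small` also appears in `large`.
small = frozenset({1, 2, 3})
large = frozenset({1, 2, 3, 4, 5, 6})

symmetric = tversky(small, large, 1.0, 1.0)       # alpha = beta = 1 is Tanimoto
small_in_large = tversky(small, large, 1.0, 0.0)  # how much of `small` is in `large`
large_in_small = tversky(large, small, 1.0, 0.0)  # how much of `large` is in `small`

print(symmetric, small_in_large, large_in_small)
```

With α = β = 1 the two molecules get a single score no matter which one is the query; with α = 1, β = 0 the score depends on the direction of the comparison, which is the lopsidedness the comment describes.
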

  5. Cellbio says:

    Question for the chemists: What is the proportion of a common screening deck that would be >0.5 (today’s smaller decks, not the screen the universe decks of years ago)? Similarly, once deep into SAR where ADME properties are a driver of chemistry, does one spend much time below 0.5? Thanks.

  6. sim_2d says:

    While they seem to have done some work on the 0.5 cut-off, releasing a big statement plus a press release based on a single value may need more justification of that value.
    From the text (open access) it seems that they picked 0.5 with MACCS because it covered 90% of drugs and would thus be similar to Ro5:
    “This ‘rule’, by which the very great majority (90 % of) drugs are within a Tanimoto distance of 0.5 in MACCS fingerprint space, may be viewed in the context of the well-known ‘rule of 5’ (Lipinski et al. 1997) (Ro5) mnemonic for predicting drug lead quality”
    So the choice of descriptor seems to be there more to deliver what they were looking for (0.5 and 90%) than for a “scientific” reason. It is a rule that is not going to remove many compounds. But has anyone been able to download the Recon database in a format that is readable?
    As an example, if one takes ChEMBL, 95% of GPCR alpha1a compounds have a similarity > 0.5 to a kinase p38a compound.

  7. sim_2d says:

    The difference between ECFP4 and MACCS (still with Tanimoto) is in the paper; they used the RDKit implementation of ECFP4 (and FCFP4). With those, the distribution sits on the left (low values) and doesn’t have many compounds with a score > 0.5, which seems to be what they were looking for. From the graphs, the rule could just as well have been 90% of drugs with ECFP4 > 0.18.

  8. LeeH says:

    I agree with Curious. ECFP4 tends to be a much better fingerprint. Specifically, a Tanimoto of 0.5 using MACCS will tend to cause many things to be similar that calculation or the MedChem Eye would not. This calibration is supported by published work at Abbott, where the authors looked at the probability that pairs of active compounds would be within a given chemical distance. Using MACCS, you had to go to a very low threshold, and the probability that a compound was active, given its proximity (above that threshold) to an active compound, was low. ECFP4 or 6 was much better (I’ve seen this effect often in my own experience). Also, the utility of this method would be low, given the small number of known metabolites.
    Personally, I think a compound pair analysis would be much more useful here.

  9. chris says:

    @6 “but has anyone been able to download the Recon database in a format that is readable?”
    Seems to be a Matlab file?
    I don’t have Matlab on my laptop but if I have time I’ll have a look tonight.

  10. OldLabRat says:

    As pointed out by the previous comments, the use of the public MACCS keys is perhaps not a great choice. As an example, depending on the size, two ligands can have a Tanimoto similarity of 1 using the public MACCS keys, but can differ by one atom. F vs. Cl on a phenyl ring, or pyridine isomers, are classic examples.
    All one can really say about the Maybridge sets is that they don’t look like drugs, using MACCS keys.

  11. DJN says:

    @2,3 From the methods it seems they only used drugs that were approved as of November 2013 (1491 molecules).

  12. Pete says:

    My understanding of MACCS keys is that they are based on the presence in the molecule of specific substructural elements. This means that they can pick up similarity even when the substructural elements are shuffled. This tends to be less of an issue for path-based fingerprints.
    With something like this, I’d want to see an analysis of the sensitivity of the results to the similarity threshold. Also, different thresholds tend to apply to different structural series when using fingerprint-based molecular similarity (I can elaborate later if people are interested). On the subject of series, I’d expect all compounds in a series to either satisfy or fail to satisfy the criteria on which this model is based. How would we use such a model in these situations? This is a bit like asking how to use Ro5 when all compounds in a series are compliant.

  13. anonymous says:

    Unclear to me… did they test the null hypothesis? I.e., is their criterion significantly enriched in marketed drugs over failed ones (those that made it into the clinic and whose structures we know,…)

  14. sim_2d says:

    @Pete. I think you may find some answers in the paper and its graphs (open access). Here they want to say that 90% of the drugs are covered, like Ro5, which seems to be what they are aiming for: something general enough that it covers most of the drugs but should still remove the less desirable compounds. And, as with their model Ro5, the sensitivity is not that good; it is easy to find compounds which are compliant (it all depends on what one is looking for, and on actually knowing what the result means).

  15. Chris Swain says:

    @13 I suspect this gives you the answer..
    “While this does not mean, of course, that a molecule obeying the rule is likely to become a marketed drug for humans”

  16. anonymous says:

    In the context of the current discussion, the following information regarding definition of 166 MACCS keys might be useful:
    Key Description
    2 103 1
    119 N=A
    120 HETEROCYCLIC ATOM > 1 (&…)
    122 AN(A)A
    123 OCO
    124 QQ
    125 AROMATIC RING > 1
    126 A!O!A
    127 A$A!O > 1 (&…)
    128 ACH2AAACH2A
    129 ACH2AACH2A
    130 QQ > 1 (&…)
    131 QH > 1
    132 OACH2A
    133 A$A!N
    134 X (HALOGEN)
    135 Nnot%A%A
    136 O=A > 1
    138 QCH2A > 1 (&…)
    139 OH
    140 O > 3 (&…)
    141 CH3 > 2 (&…)
    142 N > 1
    143 A$A!O
    144 Anot%A%Anot%A
    145 6M RING > 1
    146 O > 2
    147 ACH2CH2A
    148 AQ(A)A
    149 CH3 > 1
    150 A!A$A!A
    151 NH
    152 OC(C)C
    153 QCH2A
    154 C=O
    155 A!CH2!A
    156 NA(A)A
    157 C-O
    158 C-N
    159 O > 1
    160 CH3
    161 N
    162 AROMATIC
    163 6M RING
    164 O
    165 RING
    The definition of 166 MACCS keys shown above uses the following atom and bond symbols to define atom and bond environments:
    Atom symbols:
    A : Any valid periodic table element symbol
    Q : Hetero atoms; any non-C or non-H atom
    X : Halogens; F, Cl, Br, I
    Z : Others; other than H, C, N, O, Si, P, S, F, Cl, Br, I
    Bond types:
    – : Single
    = : Double
    T : Triple
    # : Triple
    ~ : Single or double query bond
    % : An aromatic query bond
    None : Any bond type; no explicit bond specified
    $ : Ring bond; $ before a bond type specifies ring bond
    ! : Chain or non-ring bond; ! before a bond type specifies chain bond
    @ : A ring linkage, and the number following it specifies the atom’s position in the line; thus @1 means linked back to the first atom in the list.
    Aromatic: Kekule or Arom5
    Kekule: Bonds in 6-membered rings with alternate single/double bonds or perimeter bonds
    Arom5: Bonds in 5-membered rings with two double bonds and a hetero atom at the apex of the ring.

  17. LeeH says:

    While it is true that two compounds can have a MACCS similarity of 1 and be slightly different, this is not the problem. Extremely similar compounds are likely to have similar biological properties (we’re talking similarity here). The problem is what happens at the lower limit of the allowed range. As I explained above, 0.5 for MACCS is probably too inclusive (i.e. you’ll get too many false positives if you predict metabolite-likeness).
    By the way, has anyone actually used metabolite-likeness as a design criterion? The authors reference a single paper in Drug Discovery Technology, with one of the authors appearing on both papers, but there doesn’t seem to be any further validation of the idea.

  18. TX raven says:

    @ 15, 13
    The full quote is:
    “While this does not mean, of course, that a molecule obeying the rule is likely to become a marketed drug for humans, it does mean that a molecule that fails to obey the rule is statistically most unlikely to do so.”
    Actually, it would only mean so if they followed this model. Which I doubt, since molecules don’t seem to read the chemical literature 🙂
    Oh, these poor little illiterate things…

  19. john delaney says:

    A tanimoto of 0.5 with public maccs keys is plain nuts. I wouldn’t go lower than 0.8 – there’s plenty of literature going back to the mid nineties on validating structural fingerprints/keys and I’ve never seen anyone use a value this low.

  20. Molecule Hacker says:

    Like a lot of computational work, this paper lacks proper control experiments.
    I just constructed a completely random set of 1000 molecules taken from commercial sources. I’ll call this set “Magic Beans”. 88% of marketed drugs in the Drug Bank set have a Tanimoto coefficient of at least 0.5 to at least one molecule in Magic Beans. Is Magic Beans a predictor of drug-likeness?
    One wonders how this paper would have been reviewed in a computational journal.

  21. Sim_2d says:

    As mentioned earlier, I do think that they went down to 0.5 for two reasons: 0.5 looks like a nice round number (easier to remember than 0.82 or 0.36), and they wanted to have 90% (or around that number) passing the filter (at least that is what the graphs suggest). The significance of 0.5, backed by one reference paper, is just a bit of justification for scientific purposes, but not that important.
    Now is it useful? Ro5 was not selective at all; millions of compounds would pass the filters, but it is widely used (I think for good reasons). So maybe this study will prove useful, or maybe it will be forgotten after a big web presence.

  22. Carl Lumma says:

    #20 Indeed, the result is meaningless without controls. The string “control” doesn’t appear in the paper.

  23. TX raven says:

    @ 21
    I wonder… what criteria do you use to state that Ro5 has been useful?

  24. sim_2d says:

    @TX raven
    It may have stopped people going bigger and bigger with combi-chem and large HTS collections, buying stuff because it can be made, and then struggling to find a decent hit from an expensive campaign. And now people are moving to smaller sizes, leads or fragments, but I think that follows the spirit of Ro5 a bit: small molecules have more potential than something big (if you don’t work on peptides or natural products). Did it bring more drugs, may be not, but does it cover the drugs made before, yes. It is a guideline and a description of what was done, but something that you apparently don’t need.

  25. Rajarshi says:

    Interestingly, they hashed the 166 bit MACCS fingerprint to a 1024 bit fingerprint. Clearly the identity of the keys is not important in this application. All the more reason then, to use a circular fingerprint

  26. TX raven says:

    @ sim_2d
    “Did it bring more drugs, may be not..”
    I rest my case.
    Being popular is one thing, being useful is another.
    When we sacrifice reality for a catchy name and an easy to understand concept (e.g., cLogP as a perpetrator of every drug design crime), we are not helping anyone.
    I do agree that folks back in the late 90’s were getting carried away making libraries that should never have been made. But the pendulum seems to have gone to the other extreme, and now we can’t wait to see what the next med chem rule is.

  27. Molecule Hacker says:

    @21 – The point of the paper seems to be that similarity to metabolites is somehow special. However, the distribution of similarities to metabolites is not different from the distribution of similarities to a meaningless random set of molecules. How can this be useful?
    Why are we worrying about the nature of the fingerprints when the authors have failed to demonstrate that there is anything special about the dataset being used as the basis for their metric?
    Even if there is something special about the dataset, as pointed out by @19, a Tanimoto coefficient of 0.5 for MDL keys is so far down in the noise that it has little, if any meaning.
    In any case, hard similarity cutoffs should be avoided. There are better ways to do this sort of thing

  28. Just for the heck of it, if Tanimoto distance *from* this catalogue of metabolites is interesting, mightn’t it be useful to estimate the intrinsic diversity (see, e.g. this note) *of* the same catalogue? That is: there are many compounds in it, but are they similarly far apart? How much space do they cover and is there a smaller representative sample?
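
One simple way to put a number on the “intrinsic diversity” question raised above is the mean pairwise Tanimoto distance (1 minus similarity) within a set. That choice is an assumption made here for illustration; the note linked in the comment may define diversity differently. The fingerprints are again toy sets of key indices, not real metabolites.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Shared 'on' keys divided by keys that are 'on' in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def mean_pairwise_distance(fingerprints) -> float:
    """Average of (1 - Tanimoto) over every unordered pair in the set."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Hypothetical sets: one tightly clustered, one spread out.
tight = [frozenset({1, 2, 3}), frozenset({1, 2, 3, 4}), frozenset({1, 2, 4})]
spread = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]

print(round(mean_pairwise_distance(tight), 3))
print(mean_pairwise_distance(spread))
```

A catalogue whose members sit close together covers less of chemical space than its raw size suggests, so a number like this helps in judging how surprising it is for a drug to land near one of them.
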

  29. Annoyed says:

    This paper is so cargo cult … correlation and causation … but we have the great mighty god of p-value. Further (as mentioned), the Tanimoto cut-off depends on the subjective classification/feeling of being similar, and this cut-off is different for every fingerprint. So I am sure the next publication will discuss the MACCS 0.5, ECFP4 0.45 and FCFP6 0.35 rule and highlight the special powers of the number 5 in contrast to the 7.
    If people had put as much work in the 17 or so years since the Ro5 paper appeared into understanding that paper and its implications (e.g. for the outliers it discusses) as they have into proposing one stupid rule after another, we would have actually learned something.

  30. sim_2d says:

    @TX Raven
    Agree that there may be too many rules. After all, it is also up to the medicinal chemists to know what they are doing. People went crazy big, a paper came out to show that, and it took time, but now compounds are not massive.
    I don’t think this paper will help a lot; my main issue is that I don’t see what problem it is trying to solve.
    Ro5, fragments, LE, QED, CNS rules, and PAINS each more or less have a purpose: to highlight that medicinal chemists should think a bit more about size, lipophilicity, and reactive groups when they make a compound, and not try everything they can buy from Sigma just because they can. And also, if some rules are broken, they should understand that issues could arise (they may not, but it is better to know early than when it is too late). It also helps some academic groups filter the hits they get from screening and remove compounds with no future (even if we still see some PAINS showing up, as Derek often reports), and to some extent it helps with filtering commercial libraries and lets vendors propose more compounds with less risk.
    But well, it may not be relevant for long with RNAi and biologics coming up…

  31. Kent Kemmish says:

    Pehr Harbury’s group at Stanford has developed methods to screen literally on the order of 10^17 smallish molecules encoded by DNA. I’d like to hear from ya’ll how helpful you think this will be in exploring medicinal chemical space…

  32. Mike C says:

    @everyone: so what’s the Tanimoto score for a big set of NCEs that failed tox at phase II or later? The differential would be the most direct measure of how useful this is(n’t).

  33. Douglas Kell says:

    Lots of comments (thanks, folks, and to Derek for a very fair commentary). Here are some answers to some of them; others are in the OA paper itself; others will be worked on for a follow-up:
    @3. Haven’t tested post-2000-registered agents (not that many…). Would be interesting to see the trend over time. See end of paper for refs to two citations of general trend in the opposite direction.
    @4. We did look at ECFP4 (version in RDKIT). Didn’t look at other similarities e.g. Tversky.
    @8 yes, will be interesting to do a matched pair analysis; NB though this was a purely unsupervised study (not a QSAR).
    @13. We would LOVE to compare the analysis with failed drugs; as academics I don’t think I have online access to a good dataset. I suspect folk may be trying to see if failed drugs more commonly fail the ‘rule’ as it is easy enough to do. If anyone has available lists of failed drugs please contact us!
    @17. I have no inside track on whether folk are using metabolite-likeness in their designs, but suspect generally not. Regular metabolism is a field and space possibly not usually looked at by Med Chem folk? Would love to know, of course. BTW, the differences with the 2009 paper include (i) a much greater stringency about what was deemed an endogenous metabolite and ESPECIALLY what was deemed a drug, (ii) much better graphics (!), (iii) comparisons with non-drugs, (iv) etc.
    @19. Depends on the question. ‘Plain nuts’ may be true for some purposes for sure but a little O.T.T. as a catch-all; we argue why the rule should be useful (and NB it is based on the REAL BIOLOGY – it is not just an in silico exercise of itself). It is also based on transporter considerations – see for the latest on that.
    @20. According to your analysis, Magic beans is a predictor of drug-likeness. However, Maybridge and other molecules were not (or much less so); those represented the controls – we could not change the metabolites or the FDA list of drugs…. Obviously we encourage people to play with what is in their decks (as of course folk do do a lot from other perspectives, like diversity – also a comment of @28).
    @22. No, that string (‘control’) does not appear; however, the data do (see the answer to @20). Happy to have other experiments suggested, of course.
    @23 Quite.
    @27 – agree wrt hard similarity cutoffs (obviously)
    @29 I’m sorry you did not read it; other than for relationships between the encodings in Fig 2 we mentioned neither correlation nor causation nor any p-values. I do not mind if you simply like to dis other people’s stuff, but you do need to be aware of the biological basis underpinning this analysis – see comment to @19. If analysis of the actual biochemical network is a cargo cult so be it.
    @30 Sorry if it wasn’t clear enough. The problem it is trying to solve is to find some filters that tend to hold true for successful drugs. In its aim it succeeded. It also indicated (as it was based on) a rough mechanistic basis.
    @31 and @32 – my group doesn’t have enough computer power for analysing 4 billion molecules…, but I’d comment (in general) that what matters more is the assay than the numbers of things you throw at it. My pitch for reverting more to phenotypic screening at

  34. Molecule Hacker says:

    @33 – Thanks for your responses. I think the point that @29 was making was that your work didn’t contain any p-values. You simply put up cumulative value distributions and claimed that the results were different. This is not correct. Take a look at Rajarshi’s blog entry for today.
    Note that in the case of your Figure 5c, the p-values should be corrected for multiple comparisons.
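
To make the multiple-comparisons point concrete, here is a minimal sketch of two standard corrections, Bonferroni and Holm. The raw p-values are invented for illustration and have nothing to do with the paper’s Figure 5c.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment, returned in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four pairwise comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
print(bonferroni(raw))
print(holm(raw))
```

Holm is never more conservative than Bonferroni, so it is a reasonable default when several distributions are compared against each other in one figure.
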
