Drug Assays

Predicting New Targets – Another Approach

So you make a new chemical structure as part of a drug research program. What’s it going to hit when it goes into an animal?
That question is a good indicator of the divide between the general public and actual chemists and pharmacologists. People without any med-chem background tend to think that we can predict these things, and people with it know that we can’t predict much at all. Even just predicting activity at the actual desired target is no joke, and guessing what other targets a given compound might hit is, well, usually just guessing. We get surprised all the time.
That hasn’t been for lack of trying, of course. Here’s an effort from a few years ago on this exact question, and a team from Novartis has just published another approach. It builds on some earlier work of theirs (HTS fingerprints, HTSFP) that tries to classify compounds according to similar fingerprints of biological activity in suites of assays, rather than by their structures, and this latest one is called HTSFP-TID (target ID, and I think the acronym is getting a bit overloaded at that point).

We apply HTSFP-TID to make predictions for 1,357 natural products (NPs) and 1,416 experimental small molecules and marketed drugs (hereafter generally referred to as drugs). Our large-scale target prediction enables us to detect differences in the protein classes predicted for the two data sets, reveal target classes that so far have been underrepresented in target elucidation efforts, and devise strategies for a more effective targeting of the druggable genome. Our results show that even for highly investigated compounds such as marketed drugs, HTSFP-TID provides fresh hypotheses that were previously not pursued because they were not obvious based on the chemical structure of a molecule or against human intuition.

They have up to 230 or so assays to pick from, although it’s for sure that none of the compounds have been through all of them. They required that any given compound have at least 50 different assays to its name, though (and these were dealt with as standard deviations off the mean, to keep things comparable). And what they found shows some interesting (and believable) discrepancies between the two sets of compounds. The natural product set gave mostly predictions for enzyme targets (70%), half of them being kinases. Proteases were about 15% of the target predictions, and only 4% were predicted GPCR targets. The drug-like set also predicted a lot of kinase interactions (44%), and this from a set where only 20% of the compounds were known to hit any kinases before. But it had only 5% protease target predictions, as opposed to 23% GPCR target predictions.
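To make the "standard deviations off the mean" idea concrete, here is a minimal Python sketch. It is not the Novartis implementation. The function names, the toy data, and the choice of Pearson correlation over shared assays as the similarity measure are all assumptions made for illustration. Only the per-assay normalization and the 50-assay minimum come from the description above.

```python
import math
from statistics import mean, pstdev

MIN_ASSAYS = 50  # minimum assay count per compound, per the description above


def zscore_assays(raw):
    """Convert raw readouts {assay: {compound: value}} into per-assay z-scores."""
    scored = {}
    for assay, readouts in raw.items():
        values = list(readouts.values())
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # an assay with no spread cannot distinguish compounds
        scored[assay] = {c: (v - mu) / sigma for c, v in readouts.items()}
    return scored


def fingerprint(scored, compound, min_assays=MIN_ASSAYS):
    """Collect a compound's z-scores across assays; None if too sparsely tested."""
    fp = {a: z[compound] for a, z in scored.items() if compound in z}
    return fp if len(fp) >= min_assays else None


def similarity(fp_a, fp_b):
    """Pearson correlation over shared assays (an assumed metric, not the paper's)."""
    shared = sorted(set(fp_a) & set(fp_b))
    if len(shared) < 2:
        return 0.0
    xs = [fp_a[a] for a in shared]
    ys = [fp_b[a] for a in shared]
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0


# Toy data, invented for illustration: three assays, three compounds.
raw = {
    "assay_1": {"A": 10.0, "B": 12.0, "C": 50.0},
    "assay_2": {"A": 5.0, "B": 6.0, "C": 1.0},
    "assay_3": {"A": 0.2, "B": 0.3, "C": 0.9},
}
scored = zscore_assays(raw)
# The threshold is lowered here only because the toy set has three assays.
fp_a = fingerprint(scored, "A", min_assays=3)
fp_b = fingerprint(scored, "B", min_assays=3)
fp_c = fingerprint(scored, "C", min_assays=3)
print(similarity(fp_a, fp_b) > similarity(fp_a, fp_c))
```

The point of working in z-scores is that assays with very different readout scales end up on a common footing, so two compounds can be compared by how they behave across assays rather than by what they look like.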
The group took a subset of compounds and ran them through new assays to see how the predictions came out, and the results weren’t bad – overall, about 73% of the predictions were borne out by experiment. The kinase predictions, especially, seemed fairly accurate, although the GPCR calls were less so. They identified several new modes of action for existing compounds (a few of which they later discovered buried in the literature). They also tried a set of predictions based on chemical descriptors (the other standard approach), but found a lower hit rate. Interestingly, though, the two methods tended to give orthogonal predictions, which suggests that you might want to run things both ways if you care enough. Such efforts would seem particularly useful as you push into weirdo chemical or biological space, where we’ll take whatever guidance we can get.
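If you did run things both ways, one simple way to see how orthogonal the two prediction sets are is to measure their overlap. The sketch below is an assumption-laden illustration: the compound and target names are invented, and the Jaccard index is just one reasonable choice of overlap measure, not something taken from the paper.

```python
def jaccard(a, b):
    """Jaccard index: shared predictions divided by all distinct predictions."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combine(activity_based, structure_based):
    """Split (compound, target) predictions by which method produced them."""
    return {
        "both": activity_based & structure_based,
        "activity_only": activity_based - structure_based,
        "structure_only": structure_based - activity_based,
    }


# Hypothetical predictions, invented purely for illustration.
activity_based = {
    ("cmpd_1", "kinase_X"),
    ("cmpd_2", "GPCR_Y"),
    ("cmpd_3", "protease_Z"),
}
structure_based = {
    ("cmpd_2", "GPCR_Y"),
    ("cmpd_4", "kinase_X"),
}

overlap = jaccard(activity_based, structure_based)
groups = combine(activity_based, structure_based)
print(f"Jaccard overlap: {overlap:.2f}")
print(f"Predicted by both methods: {sorted(groups['both'])}")
```

A low overlap means each method is contributing hypotheses the other would have missed, which is the argument for running both.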
Novartis has 1.8 million compounds to work with, and plenty of assay data. It would be worth knowing what some other large collections would yield with the same algorithms: if you used (say) Merck’s in-house data as a training set, and then applied it to all the compounds in the ChEMBL database, how similar would the set of predictions for them be? I’d very much like for someone to do something like this (and publish the results), but we’ll see if that happens or not.

11 comments on “Predicting New Targets – Another Approach”

  1. Anonymous says:

    Will each additional assay provide as much value as the last? Seems like a great way to increase R&D costs and decrease R&D productivity even further, if you ask me.

  2. anon the II says:

    If one guy can do this part time, then fine. Otherwise, hire a few medicinal chemists. They can do the job better, make more molecules at the same time and they need the work.

  3. Neo says:

    Derek, your comment “People without any med-chem background tend to think that we can predict these things, and people with it know that we can’t predict much at all.” puzzles me.
    The authors of this study achieved a 73.8% hit rate in prospective validations. Are you implying that they are lying? Or perhaps that these Novartis scientists don’t have any med-chem background?
    Perhaps you should look for other informatics collaborators if you think this way…

  4. Cellbio says:

    You ask a very relevant question. In my opinion, drawn from experience, one does have to assess the value of each assay. At some point in the evolution of a data set, it becomes possible to compare how each assay is segregating compound behavior. The work I was associated with demonstrated that a small number of assays, ~4 bioassays chosen from a larger set, were sufficient to account for the bulk of behavioral diversity.
    The value of these distinctions is not guaranteed; however, it could be demonstrated that screening for previously unappreciated biological impacts did segregate compounds that had very different profiles in tox studies (benign to nasty).
    To be fair, most of my experience could suffer from over-fitting data sets. Only one example was prospective, but it did work to find compounds that escaped tox problems that were not associated with biochemical profiles and therefore only known after scale up and expensive tox studies. Few examples like this that save one from preclinical or clinical failure are necessary to offset the costs of additional screening data.
    However, it is true that measuring more is not an assurance of success. In my opinion, this is one of the ways that pharma responds, non-productively, to risk inherent in human biology and pharmacological intervention.

  5. Derek Lowe says:

    #3 Neo – I’m not putting down the Novartis work at all; I think it’s quite worthwhile. It does indeed look like a step towards predicting such things. But there’s a long way to go: the fact that they confirmed 73% of their predictions is good news, but what we don’t know are the false negatives: how many of the things that were predicted to be inactive were actually active? That would require a good deal of (tedious) work to get data on.
    And even more importantly, this technique generates information on binding sites that it knows about, based on assay data. There are far more binding sites in vivo, though, and most of them are very poorly characterized, if at all. There’s not much way an effort like this could tell you about many of those; they’re “unknown unknowns”.
    What I run into when talking with the general public, though, is some sort of idea that we can look at a compound and get some sort of instant profile on it – “Oh, that’ll do this and that and the other thing”. Compared to that, we really can predict very little.

  6. simpl says:

    “To predict truly novel and unexpected small molecule–target interactions, compounds must be compared by means other than their chemical structure alone”.
    Chemistry is essentially about predicting reactivity, m.p., solubility etc. on the basis of structure. I finally grasped the paradox when the HTS people explained that they were searching to maximise diversity of structures with a similar effect, so that they can choose between alternative structural strategies when tox. problems appear.

  7. JoJo says:

    we can look at a compound and get some sort of instant profile on it – “Oh, that’ll do this and that and the other thing”.
    Isn’t that what med. chemists do all the time? I see such comments on this blog all the time, including phys. chem. properties…

  8. Just reporting predictive success rates can be misleading. For example, I can build a spam detector with roughly 90% accuracy: just say every message is spam. It’s not clear to me from this description (I never took any biochem) how much of the 73% accuracy can be gotten just by blind and naive guessing instead of actually intelligent prediction.

  9. Harrison says:

    Although he might not have meant it this way, I interpreted Derek’s comment about in vivo testing as a statement towards the PETA-types who insist that science is beyond animal testing and that it is completely unnecessary. I think most pharmacologists would love a completely in vitro system that accurately predicted activity 80% of the time.

  10. Anonymous says:

    People should read Shoichet’s paper on polypharmacology.

  11. pgwu says:

    Wonder how this approach differs from what Terrapin (now Telik) started about 20 years ago using HTS data to get some kind of molecular fingerprints.