Model This?

Via Ash Jogalekar on Twitter, I came across this new paper from researchers at AstraZeneca (and collaborators in Sweden, the UK, and Denmark) on the synthesis and activity of some plasmin inhibitors. Plasmin is an anticoagulation target, and has a lysine-binding site in its Kringle-1 domain (yeah, that’s the real name) that is the site of action for tranexamic acid. The AZ team had discovered some small piperidinyl heterocycles that hit the same binding site, and you can easily see how they might (TXA is at the left, and the prototype AZ compound is at the right).

They produced a set of 16 simple analogs of that first isoxazolone/hydroxyisoxazole, adding methyls, changing the ring size of the piperidine or putting a double bond into it, moving the heteroatoms around, adding a nitrogen or swapping the oxygen for a sulfur, and so on – classic medicinal chemistry on a small scaffold, and perfectly reasonable stuff. The efficacy in the clotting assay matched very well with the Kringle-1 binding affinity, so everything holds together just fine (three or four of them had similar or better potency to the lead, and the others were slightly to noticeably worse). The interesting part is when they went back with this set of compounds and used all the computational tools at their disposal to try to see if anything could predict their affinities or rank-order them well.

They went after it with classic QSAR descriptors, three-dimensional similarity scores, ligand geometries from protein-bound crystallographic data, docking programs, FEP (free energy perturbation), and MM/GBSA. Here’s how all these techniques performed: the QSAR models using simple descriptors, with or without principal components analysis, were of no use at all. Ligand-centric QSAR using molecular shape scoring was equally worthless. The structure-based QSAR calculations also failed, because everything came out as too similar in shape and electrostatic potential. Moving on to the docking programs, the poses for the various compounds all came out very similar, but the scoring functions (that try to determine the effect of hydrogen bonds, electrostatic effects, and so on) gave very poor results across the board. The FEP calculations likewise gave an extremely poor fit, as did the MM/GBSA calculations, which gave a very small trend in the wrong direction entirely.
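For readers who want to see what "poor fit" and "rank-ordering" mean in numbers, here is a minimal sketch of the usual summary statistics for predicted versus measured affinities. The pKd values are hypothetical and are not taken from the paper, and the function names are just illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def mae(predicted, observed):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def rmse(predicted, observed):
    """Root-mean-square error."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


# Hypothetical pKd values, for illustration only.
observed = [4.0, 5.0, 6.0, 7.0]
shifted = [6.0, 7.0, 8.0, 9.0]  # perfect trend, but 2 log units too high
noisy = [4.5, 4.5, 6.5, 6.5]    # closer in absolute terms, weaker trend

for name, predicted in (("shifted", shifted), ("noisy", noisy)):
    print(
        name,
        "R2:", pearson_r(observed, predicted) ** 2,
        "rho:", spearman_rho(observed, predicted),
        "MAE:", mae(predicted, observed),
        "RMSE:", rmse(predicted, observed),
    )
```

In this toy case the "shifted" predictions have a perfect squared correlation while being off by two log units everywhere, and the "noisy" ones have a lower correlation (about 0.8) with a much smaller error. That is why it can help to look at correlation and error measures side by side.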

This is not an impressive set of outcomes, clearly, and these are not large or complex molecules. That in itself might be part of the problem, since relatively high ligand efficiency means that most parts of the structure are important. But still. The authors believe that part of the problem is the heteroaromatic group, and the charged/zwitterionic nature of the compounds. But we certainly make an awful lot of heterocycles in this business, and charged interactions are some of the bread-and-butter of medicinal chemistry (and biology!). It may be that some of the techniques in this paper have been applied suboptimally, and I’m sure that if this is the case (or maybe even if it isn’t) we’ll hear about it. But for now: one would hope for better.


23 comments on “Model This?”

  1. Mike says:

    Interesting paper! You don’t often see someone get empirical binding data, then go back and see if it can be predicted with current methods. I can’t say I’m surprised at the results. My time as a medicinal chemist left me with little faith in computational chemistry.

    That said, has anyone used computational modeling and had the results work well? It must have happened due to chance alone, but I don’t hear of many success stories.

    1. anoano says:

      May be same as pure traditional med-chem. How many drugs come out every year from FDA compared to number of projects people work on?

    2. MR says:

      I have had computational modeling work (FEP) and it wasn’t due to pure luck. Quite frankly I wouldn’t have even tried computational modeling with these molecules at any level. Protonation state shifts in the ligand or protein upon ligand binding are not uncommon and are not captured by these methods, and these molecules seem prone to them.

    3. CheMystery says:

      That really depends on what you mean by “work well”. Pretty project specific IMHO.

      There are examples of modeling “working well”, such as

      When modeling “works”, pure medchemists say “of course we aaaaall knewwwww thaaaaaat”. Yeah. Ok.

  2. KN says:

    Typo; “piperdinyl “

    1. Derek Lowe says:

      Got it – thanks!

  3. Barry says:

    I’m no computational chemist (purely a user) but these look like difficult cases because of the charge density. The binding we’re looking for is the small difference between large numbers. In this case the large enthalpic cost of ripping a lot of waters off the ligand, and a lot of waters off the charged binding site, and then getting back a similar amount of enthalpy on binding.
    Hydrophobic interactions (van der Waals, pi-stacks…) are more tractable because they’re not obscured by as much desolvation cost.

  4. anon says:

    The equilibria of solvation/desolvation coupled with protonation/deprotonation is still a complete bitch to address accurately in silico. (in the context of large molecules/ligand binding)

  5. Peter Kenny says:

    There are a number of reasons that FEP calculations might be challenging for this system. The compounds are zwitterionic which means that a gas phase QM calculation is less likely to describe ligand electrostatics accurately (it wasn’t clear how ligand electrostatics were modelled). The molecules are relatively small and surface polarity is all due to charged groups. The pi-systems of the heterocyclic anions are relatively extended (in comparison with carboxylate) and, consequently, ligand polarization may be more of an issue than for carboxylate. It may be worth checking pKa values for structures 9-12 and 14-16. I’m guessing that an in depth look at the simulations for the isosteric pair 1/6 may yield learning points given that the Kd values differ by two orders of magnitude.
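To put the "two orders of magnitude" in energy terms, here is a small sketch using the standard relation ΔΔG = RT ln(Kd ratio). The temperature of 298.15 K is an assumption, and the function name is only illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_kd_ratio(kd_ratio, temperature_k=298.15):
    """Binding free energy difference (kcal/mol) implied by a ratio of Kd values."""
    return R_KCAL * temperature_k * math.log(kd_ratio)


# A 100-fold difference in Kd, as mentioned for the isosteric pair.
print(round(ddg_from_kd_ratio(100.0), 2))
```

A 100-fold gap in Kd works out to roughly 2.7 kcal/mol at room temperature, which gives a sense of the accuracy a method needs before it can separate compounds that differ by less than that.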

  6. Andrew says:

    A very different computational approach admittedly, but a profound demonstration of the seemingly hit-or-miss nature of computational work in pharma.

    At least the rule still holds: There are NO easy answers when it comes to drug discovery

  7. Jonas Boström says:

    @PeterKenny. Agree with your points. We have exp pKa values for just a few. Unfortunately most of the compounds were too small and hydrophilic to be determined by the experimental setups we use.

    1. Peter Kenny says:

      Hi Jonas, the warheads area on the intranet had literature pKa measurements for a number of these under ‘acid mimics’ although I assume those web pages are no more. Your CompChem colleagues in Cambridge may still have the ‘acid books’. Beilstein (I think that it is now called CrossFire) was very good for pKa searches because one could do a substructural literature search while requiring that a measured pKa was available for compounds matching the substructure.

  8. Derek and @Mike. The intention with the paper was not to diss CAMD. Getting geometries right is more or less trivial, but scoring/ranking is still a big problem. By sharing our ‘negative’ results we hope other scientists can help out. That said, the POSIT+Tanimoto rankings are okay?

    There’s computational success hidden – that is, how the lead molecule was discovered: Virtual screening using shape and electrostatics similarities (see JMedChem paper in handle). I doubt any med/comp-chemist would have come up with the 4-PIOL structure with pen and paper?

    1. Agent M says:

      Hi Jonas. Ironically, I was just looking at γ‐aminobutyric acid (GABA) receptor agonists today – the series GABA, muscimol, THIP (gaboxadol), gabapentin, … is reminiscent of the tranexamic acid analogs that your team studied, with a 4-carbon spacer instead of 6-carbon, of course. There might be some old world (physical chemistry) data from past GABA med-chem campaigns that’s transferable to your tranexamic acid analogs. For example, the tautomer forms of gaboxadol would seem to be quite relevant, especially sensitive to docking/modeling as you and Peter alluded to. Lastly, is there any data to suggest hydrolysis (ring opening) of some of the heterocycles (that would yield funky SAR) – apologies, I only skimmed your paper – but I like it. Best regards.

      1. Well spotted Agent M! There’s a third paper where we design away from GABAa – discovery of the fibrinolysis inhibitor AZD6564 (link in handle)

        1. Istvan Ujvary says:

          Have you considered nipecotic acid homologue types where a CH2 is inserted between piperidine ring and the acid mimic?

  9. Confused says:

    Uh…that’s an interesting question.

  10. needs_educating says:

    Why is R2 still the most valued metric? Why not MAE/RMSE? Even more importantly, why not put them in a larger dataset and see if any of their modeling efforts would have extracted them? These kinds of statistics are supposed to be used in a prospective manner, if anything. Looking at trend lines of examples like this proves very little to me…

  11. milkshaken says:

    If even binding of relatively small cyclic molecules with relatively few degrees of freedom cannot be reliably modeled in silico using commercial software, I really worry about all these startup companies doing virtual library docking or fragment based computer predictions “to streamline” drug discovery by supercomputing. (I had a fruitful collaboration with a computer modeler but his suggestions were just new ideas to try, not the ranking/elimination mechanism to tell me what not to do – and we worked with kinase/ligand co-crystal data for which several X-ray structures got solved)

    This reminds me of a fairly successful combichem company where I worked, which originally developed “one bead one molecule” on-bead screening of million-compound libraries. We belatedly realized that the frequency of real hits was very low (at least an order of magnitude below what it should have been) whereas the frequency of false positives was alarmingly high. Eventually the libraries were spiked with known inhibitors, and sure enough the assay did not identify them as hits. In the end the core technology – on-bead screening of resin-bound ligands followed by structural deconvolution by sequencing – was quietly dropped in favor of small libraries assayed in solution (after their release from solid support), because assaying the interaction of bead-attached compounds with protein targets does not tell you much of anything – despite many papers and patents claiming it does. All those papers and patents were based on nonfunctioning sloppy biology advanced to sell the company to the investors.

    1. Peter Kenny says:

      Scoring functions used in virtual screening are typically trained on affinity but validated by enrichment
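As a rough illustration of what "validated by enrichment" means, here is a minimal sketch of an enrichment factor calculation. The scores and activity labels are hypothetical, and the sketch assumes that a higher score means a better predicted binder (many docking programs use the opposite sign convention).

```python
def enrichment_factor(scores, is_active, top_fraction):
    """Hit rate in the top-scored fraction divided by the hit rate overall.

    Assumes higher score = better predicted binder.
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * top_fraction)))
    hits_in_top = sum(1 for _, active in ranked[:n_top] if active)
    total_actives = sum(1 for active in is_active if active)
    if total_actives == 0:
        raise ValueError("enrichment is undefined without any actives")
    return (hits_in_top / n_top) / (total_actives / len(ranked))


# Hypothetical screen: 10 compounds, 2 of them truly active.
scores = [9.1, 8.7, 7.5, 6.0, 5.5, 5.1, 4.0, 3.2, 2.8, 1.0]
is_active = [True, False, False, True, False, False, False, False, False, False]

print(enrichment_factor(scores, is_active, top_fraction=0.2))
```

Here one of the two actives lands in the top 20%, so the top slice is 2.5 times richer in actives than the full set. Note that this says nothing about whether the scores track affinity among the actives, which is the distinction being drawn between training on affinity and validating by enrichment.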

  12. Anon says:

    On the subject of FEP calculations I am no expert, but I have run them.

    It should be noted that this docking set is doing exactly what they tell you not to do in FEP binding. You can’t have a charged compound and make modifications that move the charge around.

    And with the amine they are doing it on both sides. It is no surprise the results aren’t very good.

  13. Morten G says:

    I remember a talk from… probably John Irwin… talking about ZINC and saying that if they threw the whole database at the target and half of the top 100 had some kind of binding then that was a success. Has that changed?

    1. tcor says:

      Most definitely. The goal is to find starting points for optimisation without screening full decks.
