
Free Compounds, Chosen By Software

Here’s how the press release starts, and I’ll say this for it, it does get the reader’s attention: “Atomwise Inc. seeks proposals from innovative university scientists to receive 72 potential medicines, generated specifically for their research by artificial intelligence.” As you’d imagine, this is the sort of thing that immediately engages my skeptical targeting systems, and probably many of yours, too, so let’s dive in and have a look. It goes on:

“It’s this easy: researchers tell us the disease and protein to target, we screen millions of molecules for them, and then they receive 72 custom-chosen compounds ready for testing,” said Dr. Han Lim, MD, PhD, Atomwise’s Academic Partnerships Executive. “As a former UC Berkeley principal investigator, I helped design the kind of program I wish existed for my own work.”

This puts me in mind of the scene from Henry IV, when Glendower says “I can call spirits from the vasty deep”, and Hotspur answers him with “Why, so can I, or so can any man. But will they come when you do call for them?” I have no doubt that researchers who apply for this program will indeed receive 72 custom-chosen compounds. But will they work? Put another way, how much better will they work compared to 72 random compounds out of a screening deck, or (more stringently) 72 compounds imagined by a medicinal chemist after an hour sketching possibilities on a whiteboard?

The software is called AtomNet; here’s the paper on it. “AtomNet is the first deep convolutional neural network for molecular binding affinity prediction. It is also the first deep learning system that incorporates structural information about the target to make its predictions.” I will stipulate that I know nothing about deep convolutional neural networks. But here’s a nonmathematical description of what seems to be going on:

Chemical groups are defined by the spatial arrangement and bonding of multiple atoms in space, but these atoms are proximate to each other. When chemical groups interact, e.g. through hydrogen bonding or π-bond stacking, the strength of their repulsion or attraction may vary with their type, distance, and angle, but these are predominantly local effects. More complex bioactivity features may be described by considering neighboring groups that strengthen or attenuate a given interaction but, because even in these cases distant atoms rarely affect each other, the enforced locality of a DCNN is appropriate. Additionally, as with edge detectors in DCNNs for images, the applicability of a detector for, e.g., hydrogen bonding or π-bond stacking, is invariant across the receptive field. These local biochemical interaction detectors may then be hierarchically composed into more intricate features describing the complex and nonlinear phenomenon of molecular binding.

Now, I have no problem with the local bonding calculations that they’re talking about doing, although they’re subject to the usual disclaimers about the accuracy of the calculations. But the assumption that “distant atoms rarely affect each other” does not seem to me to be valid. Medicinal chemists are quite used to seeing changes in a structure-activity relationship when a reasonably distant atom is changed – “You can get away with a methyl there as long as you don’t have one over there”. There are SARs that do work on the “greatest hits” principle, where you can independently mix-and-match various regions of the molecule, but the great majority of the projects I’ve worked on haven’t gone that way, or not quite. And if I’m interpreting that paragraph correctly, it’s explicitly aimed at the mix-and-match. I’d say that the most common situation is the one where you can get away with independent changes within a given range, which can be a rather narrow one, and then all bets are off. And the only way to discover that you’ve gone outside those ranges is to go outside them.

As mentioned, AtomNet, to its credit, also brings in data about the binding target. But that’s a tricky business, too. As is well known, binding sites accommodate ligands by adjusting their own shapes – sometimes subtly, sometimes dramatically – and this is one of the hardest things to account for in virtual screening techniques. Likewise, the ligands themselves can adopt a range of conformations in response to a binding event, which also adds to the computational burden. I’m not at all sure how this software deals with these problems, particularly the protein mobility one, but if I come across more details, I’ll update this post.

From what I can see, the AIM program is screening databases of commercial compounds and furnishing the applicants with the 72 best purchasable hits. The compounds will be given an LC/MS quality check, diluted to an appropriate concentration, and plated out, which is a good service. “Custom-chosen”, though, does not mean “custom-synthesized”, as you might have hoped (I don’t think anyone will be taking that on for free). They’re asking that people come to them, ideally, with targets that have an X-ray protein structure and an identified small-molecule binding site, which is fair enough.

I would very much like to know what the hit rates will be for these, and I suspect that Atomwise very much wants to know that, too, which is why they’re offering to do this for people. The awardees get some potentially interesting molecules to test, and the company gets a presumably diverse set of real-world examples to test their technology against. (I should note that they already have agreements with several academic groups, and one with Merck, for an unnamed project). Personally, I’ll be surprised if there’s much of an enhancement for many of these, but I wish the company luck, and I think that their commitment to putting their software to the test is admirable.

Is it “artificial intelligence”, though? That’s a topic I touched on in my talk last year in Manchester. I think that if you time-machined people from the 1950s into our present-day world and hit them with Google Maps (for example), they’d probably call that artificial intelligence. “Sure, that’s intelligent, although for some reason you only seem to have taught it about roads”. From that standpoint, Atomwise would also be called AI, but from a modern perspective, if that’s AI then so are the rest of the modeling and docking programs. I’ll put that one down to press release language, and hope that it doesn’t become a big part of their pitch.

The part that annoys me more is the “72 potential medicines” line. Screening hits are potential medicines in the same way that Atomwise is a potential Amazon.com – sure, they all start out this way, but not many make it through to the end. People are confused enough about where drugs come from and what it takes to get them there; I’m never happy to see more confusion being dumped in on top of what we’ve got.

34 comments on “Free Compounds, Chosen By Software”

  1. another cs chemist says:

    To address your latter point about whether it’s AI, the short answer is yes. The AI community defines deep convolutional neural networks as part of deep learning, a subset of machine learning, which is a subset of artificial intelligence. The other methods you describe could be labeled artificial intelligence as well. There’s an interesting article by Jerry Kaplan at Stanford that comments on the naming, in which he says “Had artificial intelligence been named something less spooky, it might seem as prosaic as operations research or predictive analytics.” A good example is linear regression. By all means it’s accurately labeled as a machine learning or statistical learning technique (and thus AI!), but hasn’t it been around for decades?

    The key here for Atomwise is that they are applying state-of-the-art technology, deep convolutional neural networks, to structure-based virtual screening. These have blown image recognition and other fields wide open. For example, this is the technology that now allows computers to diagnose heart disease and various cancers with human-level or better-than-human accuracy. Their paper shows improvement over the other current state-of-the-art techniques for SBVS. I think what they want here are more targets (unpublished) so that they can further refine their techniques (and start more collaborations, and sell their software, I’m guessing).
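    To make that concrete, here is a toy sketch of the kind of network involved: a 3D convolutional net over a voxelized grid of the binding site, where each convolution filter plays the role of one of the paper’s “local interaction detectors”. To be clear, this is my own illustration of the general architecture in PyTorch, not AtomNet’s actual design – the grid size, atom channels, and layer widths below are all invented for the example:

        import torch
        import torch.nn as nn

        # Hypothetical input: a binding site voxelized into a 20x20x20 grid,
        # one channel per atom type (8 channels here, purely for illustration).
        # Each Conv3d filter sees only a small neighborhood of voxels -- that
        # is the locality assumption the quoted paragraph defends.
        class ToyBindingNet(nn.Module):
            def __init__(self, atom_channels=8, grid=20):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv3d(atom_channels, 32, kernel_size=3, padding=1),  # local detectors
                    nn.ReLU(),
                    nn.MaxPool3d(2),
                    nn.Conv3d(32, 64, kernel_size=3, padding=1),  # compositions of detectors
                    nn.ReLU(),
                    nn.MaxPool3d(2),
                )
                flat = 64 * (grid // 4) ** 3
                self.score = nn.Sequential(
                    nn.Flatten(),
                    nn.Linear(flat, 128),
                    nn.ReLU(),
                    nn.Linear(128, 1),  # predicted binding score for the complex
                )

            def forward(self, voxels):
                return self.score(self.features(voxels))

        net = ToyBindingNet()
        example = torch.randn(1, 8, 20, 20, 20)  # one fake protein-ligand complex
        print(net(example).shape)  # torch.Size([1, 1])

    The same trick that finds edges and composes them into faces in image recognition is here hoped to find hydrogen bonds and compose them into binding modes.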

    Although the effectiveness of LBVS and SBVS in drug discovery is probably a hotly debated topic among the medicinal chemists on this board, I err on the side of embracing technology and giving it all a shot. After all, how many of this blog’s visitors are reading this page on their smartphones (all of the voice and image recognition features use state-of-the-art machine learning methods) as we speak?

  2. Curious Wavefunction says:

    One of the main factors impacting protein–ligand binding is the set of solution conformations of the ligand. There is a kind of “conformational focusing” (a term coined by Bill Jorgensen) in which the protein picks out the relevant bioactive conformation from solution. The smaller the population of the bioactive conformation, the bigger the penalty paid by the protein. How does Atomwise account for this?
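    (To put rough numbers on that penalty: the free-energy cost of selecting a conformer populated at fraction p in solution is about -RT ln p. A quick back-of-the-envelope sketch, with the populations purely illustrative:)

        import math

        R = 1.987e-3  # gas constant, kcal/(mol K)
        T = 298.0     # temperature, K

        def conformational_penalty(p_bioactive):
            """Free-energy cost (kcal/mol) of picking out a conformation
            populated at fraction p_bioactive in solution: -RT ln p."""
            return -R * T * math.log(p_bioactive)

        # Illustrative populations only. Each ~1.4 kcal/mol of penalty
        # costs roughly a factor of ten in binding affinity at 298 K.
        for p in (0.5, 0.1, 0.01):
            print(f"p = {p:>4}: penalty = {conformational_penalty(p):.2f} kcal/mol")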

    Ultimately, whatever the merits of Atomwise’s algorithm, they are going to be limited by the kind and quality of data that goes in: the classic GIGO problem, and one that every kind of machine learning without exception has to contend with. Do they have enough examples of structures with discontinuous SAR and activity cliffs? With flipped binding modes? With induced fits? With chemotype diversity? With shape diversity? Ultimately, if your data is either inadequate or of poor quality, even Skynet-quality AI is going to end up in a serious straitjacket.

    All that being said, I am glad they are sticking their neck out. I wish AtomWise all the best and will look forward to their results.

    1. R says:

      “The smaller the population of the bioactive conformation, the bigger the penalty paid by the protein”

      A very important point.

      Based on Derek’s commentary, I think they are focusing too much on what would be considered ‘textbook’ SAR, and not real-world examples, especially in ignoring long-range interactions. Drugs are not just collections of functional groups; they’re an equilibrium of physiologically accessible conformational states of those groups, and more ‘holistic’ than people, even chemists, tend to think. If you treat them as anything less than that to simplify the calculations, you’re gonna have a bad time.

    2. JFlaviusT says:

      Maybe, rather than asking “is this system a perfect drug designer?” we should be asking “is this system approximately as good a drug designer as most medicinal chemists?” It seems to me that all of the “garbage-in-garbage-out” issues encountered by this AI would also be encountered by a human intelligence, and would lead to similar false assumptions. Now, the AI may be less likely to recognize the flaws where a real experienced chemist would say “I don’t quite trust this data.” But how effective are chemists, really, at doing this? Typically you’re given a target and you try to find a drug, even if you don’t trust the data, so the outcome is no different. And most PhD medicinal chemists make a lot of money compared to what it would cost to buy and run this software. You can’t replace a department of 100 PhDs with this technology, but what if you could take a department of 100 PhDs and turn it into this program, plus 15 PhDs to watch what the program is doing, point out flaws, and do the things it can’t do, and 50 MS/BS chemists (or Chinese CRO chemists) to make the compounds? This would probably be much less expensive and might replicate the productivity of the 100-PhD department. This doesn’t seem like a situation we’re too far away from right now…

      1. anon says:

        Put simply, no.
        This software predicts potency or likelihood of activity at a target. That’s only a very small part of a medicinal chemist’s job. As Derek points out elsewhere in this thread, there’s a heck of a lot more design, interpretation and science, unrelated to the target of interest, that a medicinal chemist puts into getting a candidate molecule.

        That said, perhaps I could use a version of this software to analyse the past 20 years of NFL plays, then take the top 5 players from next year’s draft, add in a bunch of high-school benchwarmers and a few weekend-warrior players, and come back in a couple of years’ time with fistfuls of big fat Super Bowl rings. Surely our chances will be no worse than those of the pro teams…

        1. JFlaviusT says:

          Just to clarify, I am a medicinal chemist, so I am quite aware of what it is that medicinal chemists do. It is for this reason that I’m so concerned about what AI could do to the field. Molecular design is probably one of the hardest things a computer could do in medicinal chemistry, because the parameters influencing it are not well understood. Subtle van der Waals interactions, long-range electrostatics, aligning dipoles: even experienced medicinal chemists aren’t capable of taking all of these factors into account and designing good molecules de novo. This is why something like 80-90% of ambitious new structure-based designs don’t work. We just can’t understand all of the factors at play (entropy, enthalpy, conformational effects, solvation effects, etc.) because small errors in estimating their magnitude propagate to large errors in the output. This means most of design is educated guessing that turns out to be wrong most of the time (again, I’m speaking from experience here; I’m not a Silicon Valley “let’s hack the genome” idiot, this is just what I see going on around me).
          As such, a computer doesn’t have to be an extraordinarily good drug designer to be useful. It can actually be a quite bad drug designer, like most chemists are, and still earn its keep if it works 24 hours a day coming up with all possible designs and scoring them with a deep learning algorithm. I know that certain companies already use deep machine learning to predict things like permeability, solubility, and metabolic clearance. These are those “other things” that medicinal chemists do, and AI programs are already helping to inform chemists about which are the good and bad molecules to make. These programs are rather good at what they do. They’re not perfect, but when the competition is a chemist who says “I think that might not be so soluble…”, the bar is pretty low with regard to what is useful and will improve outcomes.
          I think many medicinal chemists have their heads in the sand here, because they want to believe that what they contribute is useful and necessary. It very well may be, but I think it is foolish to start from that assumption rather than analyzing the situation and coming to a judgment based on the facts. The only saving grace for us is that medicinal chemistry isn’t usually rate- or cost-limiting for most drugs, so the incentive to gut it might not be there. Maybe the slightly reduced quality of all-AI drug design vs. human drug design makes all the difference in clinical success. But maybe it doesn’t, and pretty good and cheap turns out to be as good as very good and expensive. I wouldn’t rule out the possibility that cutting medicinal chemistry costs low enough could produce an inflection point that changes the cost-benefit equation drastically.

          1. Anon2 says:

            “The only saving grace for us is that medicinal chemistry isn’t usually rate or cost limiting for most drugs, so the incentive to gut it might not be there. ”
            I’ve seen analyses suggesting that the risk-adjusted cost of Lead Optimization for small-molecule programs is similar to that of clinical Phase III. Not all LO costs are for medicinal chemists, but I wouldn’t be so comforted by the relative cost.

          2. Anon2 says:

            I meant to say “cost for Phase III”

          3. JFlaviusT says:

            @Anon2 I believe you; the “might not be there” was optimistic thinking, but it’s the only convincing argument I’ve heard as to why AI won’t displace medicinal chemists. Do you have a link to that study, or is it proprietary? I’m really interested in looking at those numbers.

          4. My guess is that Anon2 is thinking of something like Figure 2 of Paul et al. 2010’s “How to improve R&D productivity” paper (http://www.nature.com/nrd/journal/v9/n3/full/nrd3078.html), which gives a capitalized cost per drug launch of $414M for Lead Optimization versus $314M for Phase 3.

    3. Peter Kenny says:

      I think ‘conformational selection’ is a more appropriate term than ‘conformational focusing’, and you can think of the (free) energy costs of the ligand (and protein) achieving their bound conformations as analogous to a tax. One point that these guys need to take on board is that the contribution of an intermolecular contact to affinity is not, in general, an experimental observable.

  3. Dominic Ryan says:

    What is “Artificial Intelligence”?
    Unfortunately it means many things, which makes it easy to pull out whatever you want. In my view, the heart of it is making automatic decisions based on complex data, with those decisions built on linking observations: “must turn left; is that blob approaching a truck, or exhaust from a truck that passed?”. In our world, AI methods work hard at identifying and refining weak links between data points. When there are many examples the link can be made stronger – if there is a link. Many examples of dubious data will make dubious links more persuasive but less real. That’s called overfitting.
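    (If anyone wants to watch overfitting happen, here is a self-contained toy demonstration – synthetic data, nothing to do with binding affinities. Fit ten noisy points from a straight line with a high-degree polynomial: the training error collapses while the error on fresh data from the same process gets worse.)

        import numpy as np

        rng = np.random.default_rng(0)

        # Ten noisy observations of a simple linear trend.
        x = np.linspace(0, 1, 10)
        y = 2 * x + rng.normal(0, 0.3, x.size)

        for degree in (1, 6):
            coeffs = np.polyfit(x, y, degree)
            train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
            # Fresh data from the same process: the real test.
            x_new = rng.uniform(0, 1, 1000)
            y_new = 2 * x_new + rng.normal(0, 0.3, x_new.size)
            test_mse = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
            print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")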
    Even in very well studied systems with good crystal structures and ligands, some careful calculations can be very predictive but others still lack predictive power.
    It is tempting to think that the limitation is all in the AI / machine learning / physical model being used. But maybe the limitation is fundamentally in the data? Consider the experimental uncertainties in a list of protein–ligand ‘relationships’. Most of the data is relative (IC50 etc.) rather than absolute binding energy. Even then, perhaps what you really want is kon? A bigger uncertainty still is what hitting a target hard means to the biology, when so much is still unknown. I include in that unknown the significance of a given protein structure, as deposited, to the function of that protein in the cell. Structures usually have bits cut out, missing things like glycosylation and phosphorylation, and sometimes missing whole partners. It is a bit remarkable that we do as well as we have. All of which goes to show that you cannot wait for the perfect model: take what you have and bang on ahead. I’m sure some AI methods will have an edge at recognizing data relationships that other methods don’t easily catch, and I am glad more are coming online. I’m with Derek on a dose of skepticism here, but if they can identify a niche where they can extract interesting and testable linkages, then good.
    Methods that cannot make a reasonable prediction about where they will succeed and where they will fail do not get a strong following in project teams. I think shiny new methods need to provide a solid understanding of how they handle biological complexity better than others do, not just method cross-validation.

  4. Peter Kenny says:

    Scoring functions used in virtual screening are typically trained on affinity but ‘validated’ by enrichment (hit rate).
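    For anyone unfamiliar with the distinction, the enrichment arithmetic is just a ratio of hit rates. A sketch, with the numbers entirely invented:

        def enrichment_factor(hits_found, n_selected, total_actives, library_size):
            """EF = hit rate in the top-ranked selection / hit rate of the whole library."""
            return (hits_found / n_selected) / (total_actives / library_size)

        # Invented numbers: 10 actives among 72 picked compounds, from a
        # million-compound library containing 2,000 actives in total.
        print(enrichment_factor(10, 72, 2_000, 1_000_000))  # ~69x better than random

    Training on one quantity and validating on another is worth keeping in mind when the headline numbers arrive.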

  5. Mark says:

    The big question is whether they are going to provide a control set for their experiment. Expect a press release from them in a year’s time about how they got a 15% hit rate for target X and a 47% hit rate for target Y (with no mention of how many unsuccessful screens there were). However, these numbers are meaningless without a base hit rate.

    If they had a control arm which took the set of known actives, computed the average H,C,O,N count for them as well as the number of basic/acidic centres, and picked random commercially-available compounds matching this profile, then any results would be significantly more useful (and impressive, if it works). Without this, the results (as with most published VS results) are pretty meaningless, as different proteins have very different random hit rates. A hit rate of 1 compound out of 72 could be fabulous if the target was a tricky PPI that just doesn’t bind small chemical matter, or worse than random if it’s one of the more promiscuous targets.
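    To see how much the baseline matters, here is a quick sketch (every number invented) of the same screening result judged against different assumed random hit rates. It needs scipy:

        from scipy.stats import binomtest

        # Hypothetical outcome: 5 hits out of the 72 delivered compounds.
        hits, n = 5, 72

        for baseline in (0.001, 0.01, 0.05):
            result = binomtest(hits, n, baseline, alternative="greater")
            print(f"baseline {baseline:.1%}: P(>=5 hits by chance) = {result.pvalue:.2g}")

        # The identical 5/72 result is overwhelming evidence of enrichment
        # against a 0.1% baseline, and unremarkable against a promiscuous
        # target with a 5% random hit rate.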

    1. anon says:

      One would hope that some (most?) of the academics who take Atomwise up on their offer would have already established an assay and screened a number of small (or even large) diverse chemical libraries against their targets, given that the intention is to test these 72 VS hits against their target of interest. Couldn’t those diverse libraries act as a control arm against which to benchmark the Atomwise compounds?

    2. anon says:

      So basically you want a company to act completely open and honest AND make money at the same time?

  6. thufir says:

    This is a docking scoring function that is target-agnostic, which, according to the literature [https://arxiv.org/pdf/1406.1231.pdf], does essentially no better than most machine learning methods (e.g. SVM/RF/etc.).

    This is quite literally the lowest hanging fruit which could be considered “AI”. It is not a generative process, nor does it learn from new data, and it will only perform as well as eMolecules/ZINC can against your target du jour with whatever docking program you prefer (if at all).

    Especially with the advent of LSTMs/autoencoders for SMILES generation and FEP methods, this practice of one-off methods that only screen known libraries has got to stop. To be fair, a lot of comments here seem to confuse ‘AI’ with a process that can design a drug from nothing and drop it off at the FDA’s doorstep with a neat bow, which is also a bit silly.

  7. John Campbell says:

    I’m not a medicinal chemist by training, which may qualify my comments, but isn’t what you would really like molecules that interact well with your receptor, but do so without messing up a thousand other receptors as well? Also, whether the molecules will survive first pass through the liver – likely breakdown and so on. That sort of screen could save a lot of heartache later on.

    1. Derek Lowe says:

      No, your comments are quite valid. Screening against one target, computationally, is hard enough, but no one is able to turn around and do that against all the others that we have structures for. Then there are all the ones we *don’t* have structures for! Metabolism is another issue, and a big one. There are some computational tools out there that predict likely sites of metabolism, but for the most part, the obvious ones are obvious to a medicinal chemist’s eye, and the others are found empirically. Even the obvious ones don’t always work out like you’d think, especially if you’re trying to rank-order the likely spots.

    2. metoo says:

      I’m guessing they were able to strike a deal with a supplier/CRO to provide sufficient molecules, which also gives them good odds at finding hits.

    3. anon the II says:

      I’m guessing it’s a microtiter plate with the top and bottom rows open for standards. This is a major innovation since, historically, the standards have been applied on the left and right columns. This new approach uses 10% fewer compounds.

  8. Magrinho says:

    “molecular binding affinity predictions”. There is this thing called water (free energy of solvation) that keeps screwing up everybody’s calculations! Can’t we all agree to just keep running calculations in vacuo?

    Sorry but I foresee this heading to the vasty deep to join many, many others. Nicely packaged though.

  9. Anon says:

    From the company’s perspective this makes complete sense. Basically: “you tell us the target, we’ll run our program to generate ‘leads’ (which costs us nothing), and you do all the validation grunt work”.

    So the company has nothing to lose (no cost to validate their leads whether they are any good or total crap), and everything to gain if they do work. Meanwhile the student does all the work, but doesn’t own the IP.

    Nice!

    1. John Wayne says:

      Huh, you are right; this is brilliant.

    2. anoano says:

      Compounds are commercial, so IP is going to be limited at first.
      I think the idea is to get enough people to give it a go, so that some of those for whom it works will want to continue with Atomwise on more lucrative contracts.
      I’m surprised that some are saying that until it solves the 90% failure rate, it isn’t worth it. People have been hiring med-chemists for years and years at those odds, so even with a 90% failure rate the computer would do no worse than a med-chem team. (Medicinal chemists know so much that, reading the comments here and on Twitter, one would expect their success rate to be much, much higher than it actually is – the “I’m essential” mentality.)

      1. John Wayne says:

        I agree that medicinal chemists need to stay humble, but pushing forward leads identified in silico will have a failure rate closer to 99% without a decent medicinal chemist looking over the results. There are a lot of simple pitfalls that cannot currently be screened out by a computer. Careful, well-thought-out experiments work wonders.

        1. Curious Wavefunction says:

          Correct. I have never been part of a virtual screening campaign where one did not have to cherry-pick results from the final list using chemical intuition and sound medicinal chemistry principles. This process can also give an inflated sense of the success of virtual screening, leading people to think that the top n compounds from the VS were simply tested without human intervention. Ideally, any exercise of the kind Atomwise is attempting needs to be run both with and without human intervention at the end, for a candid assessment.

  10. Chris Phoenix says:

    I don’t know chemistry, but after reading Google’s research blog, I can say that deep convolutional neural networks are a pretty powerful AI tool.

    For example, this kind of neural net is currently the world leader in natural language translation. It’s a pretty new tool; a few years of work with it have eclipsed decades of previous AI work. It was also part of the AlphaGo program that was the first to beat the best humans at Go. And it’s been used to play Atari video games – sometimes better than humans – from direct video input. https://arxiv.org/pdf/1312.5602.pdf

    I don’t know whether that tool can address this problem. And even if it can, there’s no guarantee that they’re using it well. But it certainly seems fair to call their effort AI, and it’s worth paying attention to what they’re doing.

  11. Barry says:

    I’d be impressed if such a program could find e.g. staurosporine as a ligand for e.g. PDGFR (without having staurosporine in its training set) but of course you don’t have a “drug” or a “potential drug” at that point. Hundreds of chemist-years have been spent trying to get a drug with sufficient selectivity out of such a pan-kinase inhibitor.
    A sophisticated (by current standards) AI might recognize that a ligand that makes contacts only to the peptide backbone (rather than to AA side-chain residues) won’t discriminate between proteins that share a folding motif. But such a hit is valuable (even as we laugh at calling it a “potential drug”)

  12. myma says:

    I think it’s a clever marketing shtick to get potential customers and more data. Modelling, docking, and virtual screening are not exactly new, even if they think they are the Best Ever with their Super Whoopie Do Gee Whiz Advanced Math.

  13. David R says:

    Semantics aside, this seems like a nice way to test the technology en masse and could be an interesting and efficient way to identify new molecules modulating the activity of your target of interest.
