In Silico

Objections to (Some) Drug Discovery AI

Here’s a piece to start some arguing: “AI in Drug Discovery is Overhyped”, by Mostapha Benhenda. I realize that a lot of people will read that title and go “Well, yeah, sure”, but it’s definitely worth seeing some specific examples (which the post has).

Update: some of the authors involved have left detailed comments – definitely make sure to see these if you have an interest in this area.

One of these is this paper, from AstraZeneca, on using neural networks to generate molecular structures for screening. Benhenda’s complaint is that the paper spends a lot of time and effort showing how different the AI-generated structures are from a “natural” set, but little or no time showing how different they are from each other. If you’re using this to produce libraries for virtual screening (and what else would you be using it for?), then you’d want more diversity, because huge physically-hard-to-realize diversity is one of the whole points of virtual screening. (Benhenda himself has more detailed objections here.)
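To make the "how different are they from each other" question concrete, here is a minimal sketch of one common way to score it: the mean pairwise Tanimoto distance within a generated set. This is an illustration only, not the metric used in the AstraZeneca paper or in Benhenda's post. The function names and the toy integer feature sets (standing in for real molecular fingerprints) are assumptions made for the example.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def internal_diversity(fingerprints) -> float:
    """Mean pairwise Tanimoto distance (1 - similarity) within one set.

    Values near 0 mean the set is full of near-duplicates;
    values near 1 mean the members share almost no features.
    """
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Toy feature sets standing in for real fingerprints (hypothetical data).
near_duplicates = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({1, 2, 3, 6}),
]
varied = [
    frozenset({1, 2, 3, 4}),
    frozenset({5, 6, 7, 8}),
    frozenset({9, 10, 11, 12}),
]

print(internal_diversity(near_duplicates))
print(internal_diversity(varied))
```

A generated library can look very different from a reference set and still score low on a measure like this, which is the gap the complaint is pointing at.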

The second paper he’s looking at is from a group at Harvard, using “Generative Adversarial Networks”. As I understand this, it’s a technique where the output of one network gets critiqued by another one – in this case, downranking structures that get too strange-looking – to try to improve the whole process. But it appears (as with the AZ work) that the molecules don’t get compared to each other very much, and (as Benhenda dug more into the work) that the second network seems to spend most of its time penalizing whatever comes out of the first one, which indicates that something is not quite right.

He then goes to a third recent example, from Vijay Pande’s group at Stanford. Some of that has come up here on this blog before, with mixed reviews. This paper is related to the MoleculeNet project, which is being funded by Andreessen Horowitz, whose moves into biopharma I’ve written about as well. I won’t get into the details of this one, but the criticism is basically that the work described seems (to Benhenda) to be both lacking in depth and part of a move to make a particular format/data standard (DeepChem) the default for the field as opposed to others (that could be as useful or more). I have no idea whether that’s true, but I would be interested in hearing from practitioners on both sides of the issue.

Now, to be sure, Benhenda himself is selling something – the services of an outfit called Startcrowd, which he touts as an independent way to evaluate AI claims in chemistry and drug discovery. And I’m not qualified to evaluate them, either, but his claims about these recent papers can be addressed independently of whether you want to hire someone else. So here are the questions: first, are these recent papers representative of the field? Second, how well-founded are the objections to them? If these are indeed problematic, what work should people be looking at instead to get a higher-quality read on what’s going on?

I am far from being an expert in this area, but I’m also very much interested in learning about it and keeping an eye on it. The whole AI/machine learning field is something that I think we should all be watching, because it has the potential to be both wildly helpful and wildly disruptive, and it would behoove us to be ready for what might happen. I doubt very strongly that I’m going to turn into a neural-network programmer, but I don’t want to just ignore all that stuff, either, because it could change very drastically by the next time I get around to paying attention!

50 comments on “Objections to (Some) Drug Discovery AI”

  1. Uncle Al says:

    Efficient deep inward focus: “This is not the answer we are seeking.” Is it any answer? Freak a pharmacophore. Triquinacene plus diborane (HC Brown, and God help us) should give the 2-bora species,

    Apical azatriquinacene (DOI: 10.1021/jo005571o), then above to the 2-aza-9-bora derivative (and dative polymerization). Add HCN or HF to quaternize both ends. HyperChem modestly calculates 23 debyes. The interior is not a common circumstance, and it is reasonably isosteric with benzene. 8 chiral centers. Does it do things?

    1. Judge says:

      Objection sustained!

  2. Maude says:

    All this article seems to suggest is that AI is being positioned incorrectly. For some reason, the common perception seems to be that AI needs to solve many little ‘human’ problems (approximating DFT calculations, creating libraries), instead of considering drug discovery from a more holistic standpoint. We need to focus less on how similar the compounds are (is that really the biggest concern?), or how to replicate human intuition (surprise: it’s not very good), and look more to processes that can create and test hypotheses faster.

    1. Benonymous says:

      There’s a reason for this. Holistic AI to predict if something will be a drug or not can’t possibly work. Partly, there is insufficient data. There are about 1500 drugs I reckon, and about a billion failures. Some of those failures, an unknown number, could have been drugs if only they were tested in the right way for the right disease. And some of the drugs do nothing. Secondly, the number of variables is astronomic. That molecule causes heart failure in every thousandth person, sometimes, so it’s not a drug. Predicting the small things (is this molecule going to bind to hERG) would be very useful. Is it going to revolutionise drug discovery? Doubt it. Useful yes, but massively overhyped, because that’s what you need to do to raise money in order to try it…

  3. Curt F. says:

    The criticism is overwrought but does make a few good points. The claim that text-CNN is better than graph-CNN and that the DeepChem people are intentionally avoiding it is weird. I remember reading exactly the opposite in a Harvard paper that was heavily criticized here on In the Pipeline a year ago. That 2016 paper said

    In this work, we used a text-based molecular encoding, but using a graph-based autoencoder
    would have several advantages. Forcing the decoder to produce valid SMILES strings makes
    the learning problem unnecessarily hard since the decoder must also implicitly learn which
    strings are valid SMILES. An autoencoder that directly outputs molecular graphs is appealing
    since it could explicitly address issues of graph isomorphism and the problem of strings that
    do not correspond to valid molecular graphs. Building an encoder which takes in molecular
    graphs is straightforward through the use of off-the-shelf molecular fingerprinting methods,
    such as ECFP22 or a continuously-parameterized variant of ECFP such as neural molecular
    fingerprints.23 However, building a neural network which can output arbitrary graphs is
    an open problem. Further extensions of this work to use a explicitly defined grammar for
    SMILES instead of forcing the model to learn one 41 or to actively learn valid sequences 42 are
    underway, as also is the application of adversarial networks for this task. 43,44

    So more than a year ago someone was saying that graph-based encoding of molecules would be better than text-based (i.e. SMILES-based) encoding, and now Benhenda is condemning this effort. I don’t think there’s a definitive answer yet. The best way to encode molecules for deep learning may depend on the network architecture: maybe graph CNNs are best for autoencoders but text-based is best for GANs… it’s early days, and probably pharma companies that are serious about getting into the space would be better off hiring a few experts to work in-house than by enlisting a secret strike force of anti-hype consultants, or whatever Startcrowd is exactly.

    1. I did not condemn the graph CNN effort, I condemned the lack of effort to compare graph-CNN against SMILES-CNN. I think that in the future, chemistry-specific methods should outperform text methods, but it is unclear whether we have reached this point at the moment (and that’s why it’s unclear whether we need the deepchem library now).

      You hire who you want, but in-house experts often lack the freedom to deliver independent conclusions. They get trapped into office politics. The whole point of my post was to show how internal politics affect science. The reactions that I received, publicly and privately, just confirmed my point.

      Moreover, the task of AI for drug discovery is huge, and I doubt that a few experts would suffice if your company is serious about AI (Google is serious about AI; have a look at their plethoric AI staff. By the way, Google is publishing papers in drug discovery, like the graph-CNN paper).

      Out there, there is a huge workforce online, largely untapped. That’s my idea behind Startcrowd. It is not a secret force, I try to make everything transparent, it just lacks fame. I agree Startcrowd is not a reputable brand yet, but if that’s what you are looking for, hire IBM Watson instead. But then don’t blame AI.

  4. Tomas says:

    The GAN approach is quite promising as the next step in AI.

    I agree it’s implemented the usual way: it’s “fake” if it wasn’t in the original set of structures. It would be an improvement to train it to recognize “fake” as non-membership in a much larger class of plausible structures. That would give it more generalization potential.

  5. Anon says:

    Not just over-hyped, but AI in drug discovery is complete bollocks and won’t deliver squat.

    That’s because AI can easily spot billions of potential correlations and connections, but the quality (signal-to-noise) will be very low given billions of independent variables (degrees of freedom) and billions of (largely erroneous and irreproducible) observations, so a greater percentage of these correlations than ever before will turn out to be spurious and misleading outcomes of pure chance. AI can’t give new information, it can only process whatever garbage you give it to give more garbage.

    Thus I would say, AI will not only fail to improve drug discovery, it will be the worst, single most destructive thing that ever happened in the field. Success rates will plummet faster than we’ve ever seen before, because AI is the epitome of putting blind quantity over quality.

    1. Anon says:

      And unlike chess or Go, where all the rules, moves and outcomes are known with perfect information, biology is a cesspit of fuzziness, uncertainty and plain wrong information.

      There is no substitute for an experiment driven by a well-formulated hypothesis.

    2. Maude says:

      lol, ok

    3. Anon anon anon says:

      Humans also have to contend with “billions of potential correlations”, “billions of independent variables”, and “billions of (largely erroneous and irreproducible) observations”. On this very blog, Derek has discussed how bad people are at grappling with these:

      Why do you think people can succeed where machines cannot? Is there a specific technique that can’t be taught to the machine? Otherwise, to an outsider like me, it sounds merely like a new vitalism awaiting a new Wöhler.

      1. Anon says:

        I’m not saying that humans are better at spotting real patterns from random patterns. AI just makes the same mistakes faster. Garbage in, garbage out. The difference is that humans pause to design and run actual experiments.

      2. MrRogers says:

        The issue is sparse data. Humans are very (too?) good at inferring rules from sparse data.

        It takes less than an hour to teach a human the complete rules to play games that AI is good at (chess, etc.) In contrast, in my experience it takes around a decade to teach a human to reliably distinguish plausible results and conclusions in a paper from those that are implausible. In the case of chess, even if you don’t explicitly program in rules, there are enough recorded games to infer the rules. In contrast, there are very few areas of biology with data that is sufficiently dense to infer rules, and most rules can’t be explicitly given because we don’t know them. Even the genetic code is susceptible to RNA editing, ribosome skipping, selenocysteine incorporation, and almost certainly a host of other oddities that we haven’t yet discovered. That doesn’t mean that AI will be useless, but it does mean that initially it will be better than humans only in highly restricted domains where data has been thoroughly scrubbed.

      3. Benonymous says:

        As people, we’re really bad at drug discovery, and we need to never forget this. Look at our failure rates. I’m quite prepared to accept that the ability of AI to integrate orders of magnitude more data than I’m capable of can be useful in the process somewhere. But revolutionise? I don’t see it. We can’t predict the interaction of one molecule with one protein correctly. In fact, worse. We can’t even predict how soluble a molecule is in water. If AI can even result in that I’d be very happy… But I don’t think the quality data needed to do this exists at the moment. AI is not magic. It’s just model fitting, and people have been doing that for decades.

    4. Imaging guy says:

      You are completely right. Enrico Fermi would have agreed with you.
      “To reach your calculated results, you had to introduce arbitrary cut-off procedures that are not based either on solid physics or on solid mathematics.” In desperation I (Freeman Dyson) asked Fermi whether he was not impressed by the agreement between our calculated numbers and his measured numbers. He replied, “How many arbitrary parameters did you use for your calculations?” I thought for a moment about our cut-off procedures and said, “Four.” He said, “I remember my friend Johnny von Neumann used to say, with four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” With that, the conversation was over.”
      “A meeting with Enrico Fermi”, Nature, 2004, DOI: 10.1038/427297a

    5. yf says:

      A.I. can not predict chaotic systems (e.g. weather, stock market, biological systems). Is drug discovery a static system that can be formulated in a probabilistic model? If so, A.I will be very helpful.

  6. Ellen Berg says:

    Just FYI, additional considerations in applying AI/ML in drug discovery are in my post here:

    1. Mol Biologist says:

      Agreed 🙂 External domain of knowledge to provide the relevant context is very sufficient. I know Derek would like put me in Flightosome again but fortunate I think biological data is very harmonic and valuable. However, like in theoretical physics you have to prioritize what is most important and what is a noise? I doubt very strongly that AI can crack a biology but I don’t want to just ignore that correctly validated vital data can be substantial in drug discovery.

    2. DH says:

      Nice article. I’d add that more than anything else, a machine learning (ML) algorithm requires high-quality input data. For identifying common objects in images, large quantities of such data are easy to come by. For biological systems, not so much. While ML and AI algorithms continue to progress, IMO what the field needs is more smart biologists designing and executing clever experiments to produce the quantities of data needed for ML and AI.

  7. me says:

    Absolutely shouldn’t shoot the messenger on this one, since the AI stuff is more interesting, but did anyone read this bit at the beginning:

    “I am optimistic that things can be different in 2018. It’s not really because of breakthroughs in artificial intelligence, but rather because R&D organization can be improved: stronger checks and balances are possible now, with the rise of online education and social medias. They present new opportunities for open peer-review, which can deflate the bubble. The mission of Startcrowd is to accelerate this trend.”

    Yea let’s use social medias and online education (???) to accelerate R&D!

    1. By online education, I was referring to MOOC: Coursera, EdX…

  8. DH says:

    Benhenda writes: “In this post, I argue that they must be careful, because pretty often, AI researchers overhype their achievements…”

    This is not unique to AI. In my experience, pretty much *all* researchers overhype their achievements — as do service providers.

  9. Insilicoconsulting says:

    I apply machine learning routinely and have had some success with classification and regression in QSAR and even drug target/tox /adme prediction. Expert systems for organic structure elucidation could also be included in that list.

    AI, particularly NNs and deep learning (an extension of NNs and the current focus of discussion), is meant to mimic human intuition, logic and learning. Their main strength is supposed to be automatically learning relevant features rather than having them hand-engineered. To that extent GANs and CNNs do indeed bring in novelty and seem to learn essential features from SMILES.
    Graphs are notoriously hard to represent as viable input to deep learning algorithms since they need vectors of a fixed length. Only a few categories of these algos can take variable length input. So the easier thing to do is use SMILES that intrinsically encode all relevant information other than 3d.
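A minimal sketch of the fixed-length constraint described above: a variable number of substructure features can be folded into a vector of constant size by hashing each feature to a bit position. This is an illustration under stated assumptions, not any particular library's fingerprint. The function name `fold_features`, the string tokens, and the 16-bit length are hypothetical, and `zlib.crc32` is used only because it is a deterministic stdlib hash.

```python
import zlib


def fold_features(features, n_bits=16):
    """Fold any number of string features into a fixed-length bit vector.

    Each feature is hashed to one of n_bits positions, so molecules with
    different numbers of features all map to vectors of the same length.
    Distinct features can collide on the same bit, which loses information.
    """
    vec = [0] * n_bits
    for feature in features:
        idx = zlib.crc32(feature.encode("utf-8")) % n_bits
        vec[idx] = 1
    return vec


# Hypothetical substructure tokens for a small and a larger molecule.
small = ["C", "CC", "CCO"]
large = ["C", "CC", "CCO", "c1ccccc1", "C=O", "CN", "CCl", "C#N"]

print(len(fold_features(small)), len(fold_features(large)))
```

The trade-off is the usual one for folded fingerprints: the input size is fixed, but the mapping from bits back to structure is lossy.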

    Another idea might be to generate Molecular Electrostatic potential , FF related 3d visualizations/ images and use these for training.

    My criticism of these GANs is that GAs are meant to do the same thing. Use fitness functions to weed out bad structures and evolve better ones. Is that out of fashion?

    Moreover, most new structures can be easily generated by using isosteric and bioisosteric transforms and then evaluating their activity in a separate model. So thus far interesting but they are catching up with what other algorithms have been doing for a while.

  10. Jake says:

    As a ML guy I thought this : is the most interesting of all the generative approaches, in particular I like the idea of working with the parsed smiles tree and not the actual smiles string.

    1. Insilicoconsulting says:

      One can just use atom-centered circular fingerprints that consider each atom and its neighbourhood out to n neighbours. Or even HOSE codes? Or a functional group hierarchy!

      1. Jake says:

        Sure, is there a grammar out there for those codes? (I couldn’t immediately find anything for HOSE)

  11. LF says:

    I agree with Derek that one should follow the AI/Machine Learning development in the field of drug discovery. I’m still skeptical and I want to see more “real world” experimental results of applied AI methods. In my opinion it would be interesting to compare these AI methods with established methods, for example in compound library design. I think it’s worth trying, at least in academia.
    Here is an example where AI was applied and led to promising experimental results.

    The Schneider lab at ETH Zurich developed a generative model, similar to the one from AstraZeneca, to generate de novo designs (
    In a recent publication (10th of January), they applied the model and identified novel bioactive small molecules (RXR/PPAR agonists). (

    1. Alfred M Ajami says:

      The example from Schneider’s lab is a good one. Shows that even citizen scientists, even those of us who are not true believers, can get mileage out of machine learning. For an encapsulated narrative of the Schneider work in 6 steps, and my addendum to the commentary, see on LI:

  12. Vijay Pande says:

    Hi Derek, Vijay Pande here. I wanted to respond and clear up a few errors in your post. I’m always here if you’d like to chat details before you publish.

    1. DeepChem and MoleculeNet are open source, academic software projects out of my lab at Stanford. They are not startups or projects funded by Andreessen Horowitz as you stated.

    2. As you astutely point out, Mostapha has his own service that he is selling for a fee. In the spirit of transparency, he is making claims against our free, open source project while making an argument for his paid service, which he’s of course free to do. Specifically, he claims that a) the DeepChem project has too much “lock in” and b) we did not include his preferred CNN model in MoleculeNet’s comparison.

    It’s important to note that DeepChem is an open source academic project, not a commercial piece of software. Nobody is “locked in” any more than they are to NumPy or SciPy – anyone is free to fork the code and do whatever they want with it. We hope that by opening it up, we’re useful to the community. We will of course strive to improve the code within the limits of what one can do with software engineering in a non-profit, academic enterprise. After reading his email on, I invited Mostapha to contribute to MoleculeNet to address his concerns and he declined.

    I encourage everyone who is curious to check out these free, open source projects and see for themselves:

    1. Anon says:

      “chat details before you publish” – is that an attempt to manipulate opinions?

    2. Hi Vijay, Mostapha Benhenda here.

      A clarification and a question.

      2. My service Startcrowd is not competing against the DeepChem product. Simply because Startcrowd is not a product, it’s a consulting service. For example, a Pharma company can decide to pay me, to tell them that they don’t really need DeepChem, and that plain TensorFlow is enough for the moment.

      If it can save their precious time, they won’t care about paying. Orders here:

      Why did you cite NumPy or SciPy for analogy? Why not citing the AI library TensorFlow?

      TensorFlow is also an open-source project maintained by Google (which is a commercial entity, but does it matter? Is A16Z a charity?)

      My answer is: because Google is leveraging TensorFlow to promote their Cloud Platform, which is pretty expensive to use. It’s the Gillette razor-blade model. Free razor, expensive blades. Details in this Forbes article:

      How Google Turned Open Source Into A Key Differentiator For Its Cloud Platform:

      I predict that Andreessen Horowitz could quietly invest in a Cloud drug discovery platform powered by DeepChem.

      I am very interested to hear Vijay’s answer to this question.

      1. Jake says:

        I’m not immediately seeing why you’re bringing up that argument about google leveraging tensorflow, it’s not as if one is prohibited from using it on other commercial clouds, on private clouds, or elsewhere. If you’re referring to the TPUs, it’s not as if their existence implies TF does worse on GPUs.

  13. Dear Derek,

    As our work is mentioned in this post, we would like to respond as well.

    Let me refer to two different preprints/papers of our group. I think part of the confusion is the early results that can be posted as a preprint vs. the finalized peer-reviewed publication that is always improved by comments from the community as a preprint and by the reviewers and editors of the journals involved.

    1. First of all, let’s begin discussing the first version of our autoencoder preprint a year ago, which you criticized in your blog ( We did not want to respond to your criticisms until the paper was accepted for publication and properly peer reviewed, as we were improving the models for over a year. The paper is now set to appear in ACS Central Science very soon. The latest version of the preprint is available here. I refer you and others to the figures (esp. 2, 4, 7) with regard to the structure of the latent space of our model. It should be visually apparent that when the model is modified and trained further, it can substantially increase its yield of chemically-relevant structures. We learned a lot during the revisions of the paper, and as Vijay posted above regarding his code, we are also soon releasing our code in open source fashion for others to expand and modify. We still have not done so, as our goal is only to publicize code when it is mature enough for public consumption. The paper, in its arXiv form, has been cited 44 times according to Google Scholar, and it was discussed a lot in a recent Neural Information Processing Systems (NIPS) workshop dedicated to this topic. One of my favorite models that extends the original idea is the work of Le Song and co-workers (

    2. With regards to the second paper criticized in Mostapha’s company advertisement Medium post, which attempts to create a controversy where there is none (!), let me say that we are open about the limitations of the first generations of our GAN model. As many commenters above say, GAN is an attractive avenue to explore. Our ORGANIC reinforcement-learning approach allows us to enrich molecular distributions with desired molecular properties. If used with models based on proper pharmacological information rather than the RDKit descriptors that we employed, it may be useful for drug design. We discuss in our paper what the avenues of improvement are (which we are working on actively): “A major drawback of the ORGAN paradigm is the amount of non-valid molecules. In our experiments with organic, we have found the ratio of non-valid molecules to vary greatly in the range 0.2-99.9%. What is more, from the portion of valid molecules, it is possible to find a lot of repetitive patterns. Both cases depend greatly on the training set and optimized metrics.” So I don’t see any major controversy, as we are open about the limitations and open opportunities for our approach. What matters to us is to introduce techniques early to the chemical community and further improve upon them. After all, research is a cycle of preprint-improvement-publication-preprint-improvement-publication that should be done as much in the open as possible. Civility is also important in this process and should not be lost. That is why preprint servers are so useful to the community. The criticism about molecular diversity that Mostapha is harping on will be addressed by us and others in a timely fashion, with proper metrics in further revisions of our preprint/paper or in new publications. It has not been our full priority, e.g. to drop whatever we are doing to satisfy the whims of a commercial researcher. When Mostapha contacted us several months ago, we read his preprint and decided not to engage with him directly but rather through the proper peer-review channels.

    3. It is quite unfortunate that there is so much hype about AI and that blog posts such as Mostapha’s just add noise instead of actually relevant information.

    Alan Aspuru-Guzik and Benjamin Sanchez-Lengeling
    Harvard University

    1. Curt F. says:

      I’d like to thank Professors Pande and Aspuru-Guzik for responding. Their posts have certainly clarified my views on the topic. Specifically, the fact is that DeepChem and MoleculeNet are open source, and that anyone could either contribute to these ongoing projects, or even “maliciously fork” them to spin off their own version. That Startcrowd hasn’t done this says, to me at least, a lot about their motivations. (Disclosure: I work at the same institution as Vijay Pande but am not involved with his lab’s work in any way, other than silently lurking GitHub’s issue tracker for DeepChem to watch what kind of fixes all the developers are currently working on. Haven’t tried to install it yet myself.)

    2. Dear Professor Aspuru-Guzik,

      0. Thank you again for repeating your intention of answering me one day.

      As an impatient person, I totally agree that it’s better to post a half-baked paper quickly, instead of waiting for a polished piece. I hope my comments help you improve your piece. I am also convinced about the value of shouting loud that your piece is half-baked, so that other people can finish your valuable job.

      2. Please forgive my impatience, but it was not only about my personal whims.

      I also had some compassion for all the people who suffer from cancer and global warming, and who are even more impatient that the world finally finds those magic AI-molecules, which will save their lives.

      They can’t rely on Trump whims to increase American research credits, which could allow Harvard to hire more PhD students and postdocs, and deal with this question, which currently ranks low on the priorities of your small lab.

      Given my situation, I preferred to send a reliable signal to various private investors, who have tons of cash to burn on those important problems (as I quickly understood that A16Z was not gonna be a good fit). Maybe they will choose to invest in Startcrowd, instead of donating to Harvard or Stanford.

      While always remaining civil and respecting people, I still love the tone of entertainment. It’s not only a matter of personal whims, it’s also a commercial policy: academic papers are boring. Investors can comfortably read my post in their Uber. In any case, I think that my post is more civil than the average anonymous review, coming out from the peer-review channels, which you seem happy with.

      To keep this civility, cherished by all, I suggest you to be careful with name-calling, like ‘commercial’ researcher. Personally, I don’t see any difference between ‘commercial’ and ‘academic’ researchers. Did you imply that one category is better than the other, in some way? I don’t think so, but it’s still a good joke to submit to the PhD comics.

      I am sure Googlers in machine learning will laugh at being name-called ‘commercial researchers’. There are fewer Startcrowders out there, but they still deserve the same respect.

      Anyway, I will never be offended by the ‘commercial’ label. If anything, it’s a source of pride: I am trying hard to find a sustainable business model, for doing useful science, even in counter-intuitive ways (like this ‘anti-hype’ consulting service from Startcrowd, which is unique in the world, and I believe it has a huge boulevard ahead). Money is not falling from my sky.

      By the way, I think that building a useful business is more respectable than unnecessarily asking people for charity, as you are doing on the Benefunder donation platform:

      and video clip:

      3. Finally, I am sorry if your ivory tower is not sound-proof yet, but you are not in the right position to complain about noise. The ORGAN paper on Arxiv is very noisy too: why showing all those numbers, metrics and tables, if you don’t use them in your conclusion??? It just adds noise to the paper 😉

      Mostapha Benhenda
      Startcrowd lab.

  14. tlp says:

    I wonder if AI/ML methods in chemistry will end up repeating the fate of ab initio calculations, docking/MD simulations, etc. – creating a zoo of sophisticated methods that nobody understands except their authors, serving highly specific problems and eventually going out of fashion because nobody cares even if they are correct. So far I see just ranting about how to represent molecules, which sounds like an ages-old cheminformatics problem.

    If I remember correctly a year or two ago there was a nice analysis which showed that the main culprit in drug discovery is poor predictive quality of animal models. Is there any effort from AI community in this area? Or is there even a way to do that computationally? I don’t mean organ-on-chip stuff but some kind of human – animal1 – animal2 predictive model.

    1. What about this as a very small but quite straightforward example why AI is “not just another ab initio calculations, docking/MD simulations cool stuff”.
      In short, the AI model was trained, then it suggested new molecules, then they have been synthesized and showed nM activity.

      1. tlp says:

        Docking/MD/Pharmacophore modeling also provided tons of nM binders over the years. So I don’t see how the mentioned AI paper proves your point. It rather shows that AI can be a replacement for docking/pharmacophore modeling with a more reliable scoring function (but I’d rather wait and see an overview with N>1 examples). I know ‘de novo design’ sounds better than pulling stuff from ZINC but, come on, that was the whole point of virtual screening – that one could buy a bunch of molecules and test quickly rather than synthesizing them yourself.

        I don’t mean that AI, ab initio, and other computational methods in chemistry are worthless or bad in any other way; I just mean that a shortage of nM binders is not the biggest problem in drug discovery.

  15. BLynch says:

    Turning to the broader issue of applying AI to biology, the experiences of IBM’s attempt might be a useful reminder that it is not easy.

    1. Ellen Berg says:

      One thing IBM does not appear to have solved is the issue of the published literature being terrible. It has gotten incredibly bad, with predatory journals coupled with the common practice of assuming that if many citations support a particular finding, then that finding is of high confidence. It’s easier to publish what agrees with the current literature, and also much easier to get funding for it!

      1. Chris Swain says:

        I agree this is a critical question and I don’t think there is an easy answer.

        In organic chemistry Org Syn provided an annual catalogue of independently validated experiments. But I’m not aware of similar resources for biology.

        We are now seeing listings of validated (or not) chemical probes, which might help put some perspective on published findings.

  16. drsnowboard says:

    No offence, but if you can’t all post a rebuttal without a typo, why should I believe the detail handling of your models?

  17. Hypy peepy says:

    Imho the answers produced by the papers presented here are so easy and so premature that they overshadow years of effort invested by the experimentalists. (As for why the R1 authors did not get rejected by the reviewers? It is hard to imagine.)

    Certainly, what gives “hype” its definition could be the hindsight we gain years later, when we see that the AIs proposed here did not actually help with real-world problems.

    Qui vivra verra.

  18. Marwin Segler says:

    Dear Derek,

    There are a few issues with the linked article, which I want to highlight here, particularly because it mentions also the molecule generation work I was involved in at AZ.

    1) The purpose of the Pande group’s MoleculeNet benchmark (which has been peer reviewed and published in Chemical Science) is to provide a publicly available set of tasks that anyone with ideas for new models can use, expand, and compare against. As it is free and open source, there is no lock-in at all! Providing good benchmarks is a great service to the community, as they help to drive the field forward. As a comparison, consider method development in organic chemistry: you could see an open source benchmark as a particularly easy to reproduce experimental procedure. You can run it as a control in parallel to your new method, e.g. your new C-H activation catalyst, to show that it produces a higher yield than the established (benchmark) method.

    2) The statement that molecules generated by neural networks are not diverse enough is incorrect. Looking at figures 5 and 8 of our paper immediately shows that the generated molecules have a similar distribution in chemical space as “real” (ChEMBL) molecules, and are not just clustered in a small area. Staying close, but not too close, to established organic chemistry is exactly what is needed to achieve both novelty and synthesizability. We provided 400,000 generated example molecules in the SI, ready for inspection. Our paper was posted on arXiv a year ago already, and has in the meantime also been published in ACS Central Science (DOI: 10.1021/acscentsci.7b00512). Molecule generation with neural networks has now also been investigated by several other academic and industrial drug discovery groups (e.g. Olivecrona et al. and Blaschke et al. at AZ). Most prominently, Gisbert Schneider and coworkers reported just days ago that neural network generated molecules can be prospectively generated and synthesized (DOI: 10.1002/minf.201700153).

    3) What is particularly unfortunate is that the linked article suggests that our research paper tries to stir up some kind of hype. Quite the contrary is true: we provide a critical and balanced discussion of deep learning methods in our paper. De novo design and QSAR are tools that have been around for decades, and everyone in the field knows that they have limitations. While there is now increasing evidence that modern neural networks can lead to improved models, it would be unreasonable to expect them to do magic.

    4) The article implies that online classes and social media can provide “checks and balances” to R&D. I am not convinced. Call me old-fashioned: I firmly believe the main place for students to learn chemical research remains the lab.

    1. Mostapha Benhenda says:

      2) You wrote:

      Looking at figures 5 and 8 of our paper immediately shows that the generated molecules have a similar distribution in chemical space as “real” (ChEMBL) molecules, and are not just clustered in a small area.

      Same as for ETH Zurich:

      Basically, 2-dimensional visualizations of 100-D spaces are not enough. To really show something, you must define a quantitative metric.
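
      To make "quantitative metric" concrete, here is a minimal sketch of one common choice, internal diversity, taken as the mean pairwise Tanimoto distance within a library. This is an illustration only and not necessarily the metric used in the preprint. The function names are hypothetical, and fingerprints are assumed to be plain Python sets of on-bit indices rather than the output of any particular cheminformatics toolkit.

```python
from itertools import combinations


def tanimoto_similarity(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 1.0  # assumption: two empty fingerprints count as identical
    return len(a & b) / union


def internal_diversity(fingerprints):
    """Mean pairwise Tanimoto distance (1 - similarity) over a library.

    Values near 0 mean the molecules are near-duplicates of each other;
    values near 1 mean they share almost no fingerprint bits.
    """
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0  # fewer than two molecules: no diversity to measure
    total = sum(1.0 - tanimoto_similarity(a, b) for a, b in pairs)
    return total / len(pairs)


# Toy fingerprints (made up for illustration, not real molecules).
clustered = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 6}]
spread = [{1, 2}, {3, 4}, {5, 6}]

print(internal_diversity(clustered))  # low: members overlap heavily
print(internal_diversity(spread))     # high: members share no bits
```

      A single number like this compares the generated molecules to each other, which a 2-D projection alone cannot do.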

      The claim made in the blog is not incorrect, but it’s a simplification of the claim made in my preprint (to which I already gave the link).

      About open-source lock-in: same as for Vijay:

      1. Jake says:

        “2-dimensional visualizations of 100-D spaces are not enough….”

        While I agree that it would have been nice to have seen numbers such as ‘percent of variance explained by first n singular values’, for what they’re trying to do the t-SNE plots are perfectly fine. (Also, how are you getting 100 dimensions out of “7 physicochemical descriptors”?)
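
        For reference, the "percent of variance explained by first n singular values" figure can be sketched as below. It applies to a linear projection such as PCA, not to t-SNE itself, which is nonlinear. This is a minimal illustration with hypothetical function names. It uses pure-Python power iteration with deflation to stay self-contained, where a real analysis would more likely call a library SVD routine.

```python
import math


def centered(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def gram(rows):
    """X^T X; its eigenvalues are the squared singular values of X."""
    d = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]


def top_eigenvalues(matrix, k, iters=500):
    """Top-k eigenvalues of a symmetric PSD matrix (power iteration + deflation)."""
    d = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy; deflation mutates it
    values = []
    for _component in range(k):
        v = [1.0 + 0.1 * i for i in range(d)]  # fixed, arbitrary start vector
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _step in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # nothing left to extract
            v = [x / norm for x in w]
        lam = sum(v[i] * m[i][j] * v[j] for i in range(d) for j in range(d))
        values.append(lam)
        # Deflate: remove the component just found.
        for i in range(d):
            for j in range(d):
                m[i][j] -= lam * v[i] * v[j]
    return values


def explained_variance(rows, n):
    """Fraction of total variance captured by the first n singular values."""
    g = gram(centered(rows))
    total = sum(g[i][i] for i in range(len(g)))  # trace = sum of all eigenvalues
    if total == 0:
        return 0.0
    return sum(top_eigenvalues(g, n)) / total


# Toy descriptor table (made up): most of the spread is along the first axis.
toy = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
print(explained_variance(toy, 1))
```

        If the first two components explain most of the variance, a 2-D plot is a fair summary; if they do not, the plot hides most of the structure.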

  19. yf says:

    An A.I.-enabled monkey would have run faster, found greener pasture, eaten healthier, even had a bigger group than its rivals. However, it would never have picked up a bone from a dead animal and used it as a weapon. In short, A.I. will make wheels run more efficiently, but never invent one.
    (2001: A Space Odyssey).

  20. MoMo says:

    Great. Now the AI freaks are arguing amongst themselves while chewing up our valuable heartbeats with Fad Science. Again. AI was tried 30 years ago with the work of Klopman, yet it’s been reinvented here for the sake of the younger synapse set.

    When AI gets to where Fragment Based Drug Design is send us all a sign….Any Sign….

    We will all be waiting…. and waiting…. and waiting……waiting……..

  21. Jing Zhou says:

    After reading all of the posts, I don’t remember seeing anyone linking AI to biophysical parameters such as affinity, deltaH, on/off-rate constants and co-crystal structural data. I imagine that if we trained the AI on all the available co-crystal structural data and ligand binding affinity data, it could output novel ligands given any apo-protein structure as input, without knowing all the chemistry and physics.
