Chemical News

The Rise of the Rise of the Machines

There’s yet another paper on computer-devised retrosynthesis out today – it and the previous one make an interesting pair. I have a Nature “News and Views” comment on this one (free access link) for a broader audience, but I’ll expand on my thoughts here. (Update: I’m also going on about this on a Nature podcast here).

Overall, the same general thoughts apply to this work as to the last one. What we have, via a team from Münster/BenevolentAI/Shanghai, is another piece of software that has picked up large numbers of possible synthetic transformations and has ways of (1) stringing them together into possible routes and (2) evaluating these against each other to elevate the ones that are deemed more desirable. Naturally, the response from many organic chemists to these things has been “But that’s what we do”, followed by “Surely no program could do it as well”. The strong form of that latter objection is “Surely no program can do this to any useful extent at all”, and the weak form is “Surely no program can do it for all the molecules that we can”.

I’m going to dispose of the strong-form objection immediately. Whatever you think of the idea, the programs in the last paper and this one (I’m getting to it!) are generating plausible synthetic routes. You may find them derivative of known ones; you may object that they’re not really any different than something that any competent chemist could have done. But those are already victories for the software. You might also object, with varying degrees of justification, that the molecules and syntheses chosen are there to show the software in the best light, and that real-world use won’t be as fruitful. But that’s a holding action, even should it have merit: the fact that it can work at all turns the problem, if there is one, into optimizing something that already exists. As the history of chess- and Go-playing software shows, such piecing-together-strategies-and-evaluating-them tasks improve relentlessly once they’ve been shown to work in the first place.

That takes us well on the way to disposing of the medium objection (“Surely no program could do it as well”), because if the programs aren’t doing this as well as a person can now, well, they will. And that also is an introduction to this new paper. You will have heard over the last two or three years about how Google’s (DeepMind’s) AlphaGo program was able to compete with and then beat the best human players of the game. Go is significantly harder to deal with computationally than chess, so this was a real achievement, and it was done partly by building in every human maneuver and strategy known. But last fall, they announced a new program, AlphaGo Zero, that comes at the problem more generally. Instead of having strategies wired into it, the new program is capable of inferring strategies on its own. The software ran hours and hours of Go games and figured out good moves by watching what seemed to work out and what didn’t in various situations, and at the end of the process it beat the latest version of AlphaGo, the one that beats every human on the planet, one hundred games in a row. It makes moves that no one has yet seen in human play, for reasons that Go experts are now trying to work out. (Here’s the latest iteration, as far as I know).

The Chematica software I wrote about earlier this month is an example of the AlphaGo style: its makers have spent a great deal of time entering the details of literature reactions into it: this goes to that, but only if there’s not a group like X, and only if the pH doesn’t get as low as Y, etc. Synthetic organic chemists will be familiar with the reactivity tables in the back of the Greene protecting group book – that’s just the sort of thing that was filled out, over and over, and with even more detail. Without this curation, software of this kind tends to generate routes that have obvious “That’s not gonna work” steps in them.

This new paper, though, appears to be in the AlphaGo Zero mode: the program digs through the Reaxys database (all of it) and infers synthetic transformation rules for itself. If this works, it could be a significant advance, because that data curation and entry is a major pain. There are at least two levels to such curation: the first (as mentioned) is capturing all the finer details of what is likely to work (or fail) in the presence of something else. The second goes to the reliability of the synthetic literature in general – you don’t want to feed reactions into the system that haven’t been (or can’t be!) reproduced by others. The way this new program deals with these is pretty straightforward: the first type of curation is handled by brute force processing of Reaxys examples, and the second by a requirement that only transformations that appear independently a certain number of times in the database are allowed into the calculations.
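To make that second kind of curation concrete, here is a minimal Python sketch of a frequency filter. It is an illustration only, not the authors’ code: the rule names, the source labels, the threshold of three, and the reading of “independently” as “in distinct sources” are all assumptions made for the example.

```python
from collections import defaultdict


def filter_rules(observations, min_sources=3):
    """Keep only transformation rules seen in at least `min_sources` distinct sources.

    `observations` is an iterable of (rule_name, source_id) pairs. Both the
    pair format and the default threshold are hypothetical, chosen for this sketch.
    """
    sources_by_rule = defaultdict(set)
    for rule, source in observations:
        # A set means the same source reporting a rule twice counts once.
        sources_by_rule[rule].add(source)
    return {
        rule
        for rule, sources in sources_by_rule.items()
        if len(sources) >= min_sources
    }


# Made-up extraction results: (rule, where it was seen).
observations = [
    ("amide_coupling", "paper_A"),
    ("amide_coupling", "paper_B"),
    ("amide_coupling", "paper_C"),
    ("odd_rearrangement", "paper_D"),
    ("odd_rearrangement", "paper_D"),  # same source again, not independent
    ("suzuki_coupling", "paper_A"),
    ("suzuki_coupling", "paper_E"),
]

print(sorted(filter_rules(observations, min_sources=3)))  # ['amide_coupling']
```

The one-off rearrangement never makes it into the rule set, which is the point: a transformation that only one group has ever reported is treated as unproven rather than as a usable move.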

Organic synthesis is a lot harder to reduce to game-type evaluation than chess is, as the authors rightly point out. To get around this, the program combines neural-network processing with a Monte Carlo tree search technique:

In this work, we combine three different neural networks together with MCTS to perform chemical synthesis planning (3N-MCTS). The first neural network (the expansion policy) guides the search in promising directions by proposing a restricted number of automatically extracted transformations. A second neural network then predicts whether the proposed reactions are actually feasible (in scope). Finally, to estimate the position value, transformations are sampled from a third neural network during the rollout phase. The neural networks were trained on essentially all reactions published in the history of organic chemistry.
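For readers who haven’t met Monte Carlo tree search before, the division of labor in that passage can be sketched in a few dozen lines of Python. This is a toy under loud assumptions, not the paper’s implementation: the three “networks” are replaced by trivial hand-written stand-ins, molecules are just strings, and the `RULES` table, the building blocks, and all function names are invented for illustration.

```python
import math
import random

BUILDING_BLOCKS = {"A", "B", "C"}  # hypothetical purchasable starting materials
# Hypothetical toy transformations: product -> candidate precursor sets.
RULES = {
    "ABC": [("X", "Y"), ("AB", "C"), ("A", "BC")],
    "AB": [("A", "B")],
    "BC": [("B", "C")],
}


def expansion_policy(molecule, top_k=3):
    """Stand-in for network 1: propose a restricted number of disconnections."""
    return RULES.get(molecule, [])[:top_k]


def in_scope(precursors):
    """Stand-in for network 2: reject reactions judged infeasible (here, any using X)."""
    return "X" not in precursors


def rollout_policy(molecule, rng):
    """Stand-in for network 3: sample one disconnection during the rollout."""
    options = RULES.get(molecule, [])
    return rng.choice(options) if options else None


def open_molecules(molecules):
    """Molecules that still have to be made (everything that is not a building block)."""
    return tuple(sorted(m for m in molecules if m not in BUILDING_BLOCKS))


class Node:
    def __init__(self, state, parent=None, route=()):
        self.state = state      # molecules still to be made
        self.parent = parent
        self.route = route      # (product, precursors) steps chosen so far
        self.children = None    # None means "not expanded yet"
        self.visits = 0
        self.value = 0.0


def expand(node):
    node.children = []
    if not node.state:
        return  # solved position, nothing left to disconnect
    target, rest = node.state[0], node.state[1:]
    for precursors in expansion_policy(target):
        if in_scope(precursors):
            step = (target, precursors)
            node.children.append(
                Node(open_molecules(rest + precursors), node, node.route + (step,))
            )


def rollout(state, rng, max_depth=10):
    """Play the position out with the rollout policy; 1.0 if solved, else 0.0."""
    for _ in range(max_depth):
        if not state:
            return 1.0
        choice = rollout_policy(state[0], rng)
        if choice is None:
            return 0.0  # dead end: nothing known makes this molecule
        state = open_molecules(state[1:] + choice)
    return 1.0 if not state else 0.0


def uct(parent, child, c=1.4):
    """Upper-confidence score balancing average reward against exploration."""
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(math.log(parent.visits) / child.visits)


def plan(target, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(open_molecules((target,)))
    for _ in range(iterations):
        node = root
        while node.children:                      # selection
            node = max(node.children, key=lambda ch: uct(node, ch))
        if node.children is None:                 # expansion
            expand(node)
            if node.children:
                node = node.children[0]
        reward = rollout(node.state, rng)         # rollout
        while node is not None:                   # backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    node = root
    while node.children:                          # read off the most-visited line
        node = max(node.children, key=lambda ch: ch.visits)
    return node.route if not node.state else None


print(plan("ABC"))
```

Calling `plan("ABC")` returns a two-step route down to the building blocks, and the disconnection through “X” never appears in the tree because the in-scope filter throws it out at expansion time. In the real system each stand-in is a trained neural network and the search space is vastly larger, but the loop of select, expand, roll out, and back up has the same shape.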

A strength of the Chematica paper is that the routes were put to a real-world test at the bench. This new work didn’t go that far, but what the authors did do was have the program generate retrosyntheses for already-synthesized molecules, and then have these routes and the known ones evaluated blind by experienced chemists. The results were a toss-up: the machine routes were considered just as plausible or desirable as the human ones, and that (as above) is a victory for the machine. AI wins ties.

At right is an example of a route generated by the 3N-MCTS technique. That’s not a particularly hard molecule to make, but it’s not an artificially easy one, either. That is, in fact, an intermediate in a published synthesis of potential 5-HT6 ligands, and the route the program found is identical to the one in the paper, so you can be reasonably sure that it’s valid. You or I would probably come up with something similar – I personally didn’t know that first spirocyclobutane step, but I would have done what the program basically did: look in Reaxys or CAS to see if something like that had been prepared. (Note that the program doesn’t memorize routes, just steps – it pieced this together on its own and declared it good). Note that the program delivered this one in 5.4 seconds, and none of us are going to beat that. Add up the total time we all spend on stuff like this and it starts to look like it’s cutting into other work, you know? If you’d like to see several hundred more schemes along those lines, they’re in the paper’s SI files.

So now we have two types of retrosynthesis software that (at least in some realistic examples) are if not better than humans, apparently no worse. Where does that put us? And by “us”, I mean “us synthetic chemists”. My conclusions from the earlier paper stand: we are going to have to get used to this, because if the software is not coming to take retrosynthetic planning away from us now, it will do so shortly. You may not care for that – at times, I may not care for it, either – but it doesn’t matter what we think. If it can do even a decent job of stitching together and evaluating routes, the software will beat us on just general grasp of the literature alone, which some time ago passed beyond the ability of human brains to organize and remember.

And the next step is more-than-decent ability to see and rate synthetic plans. Despite my comparison to AlphaGo Zero (which is valid on mechanistic grounds), it’s not that this new software is coming up with routes that no human would be able to. But if we’re approaching “good as a human”, the next step is always “even better than a human”. Eventually – and not that long from now – such programs are going to go on to generate “Hey, why didn’t I think of that” routes, but you know what? Those of us in the field now are going to be the only ones saying that. The next generation of chemists won’t bother.

They will have outsourced synthetic planning to the machines. Retrosynthesis will remain a valuable teaching tool, and it will still be the way we think about organic chemistry. It will persist in various forms in the curriculum just as qualitative functional group analyses did, long after they died out in actual practice. Actual practice, meanwhile, will consist of more thinking about what molecules to make and why to make them, and a lot less thinking about “how”. That will only kick in for structures too complex for the software to handle, and that will be a gradually shrinking patch of ground.

This will not go down easy for a lot of us. Thinking about how to make molecules has long been seen as one of the vital parts of organic chemistry, but knowing how to handle horses was long seen as a vital part of raising crops for food, too. It’ll be an adjustment:

No memory of having starred
Atones for later disregard
Or keeps the end from being hard

That’s Robert Frost, and his advice in that poem was “Provide, Provide!” We’ll have to.

54 comments on “The Rise of the Rise of the Machines”

  1. steve says:

    Don’t worry, there will still be jobs for chemists – serving our AI overlords.

  2. Me says:

    This brings into context the problem of being ‘only a synthetic chemist’ in med. chem. terms. I’ve been at places where that was institutional and places where it was simply a value judgment made by management, but it goes to show that if you can’t make decisions on what molecule to make next, you are adding less value than those chemists that can. How long will it be before ‘ethyl propyl futile’ can be handled in an entirely automated fashion from design to synthesis?

  3. ba says:

    Reading about computers engaging in what can fairly be described as genuine invention – new Go moves, synthetic routes, and who knows what else in the near future – reminds me of a line from Watchmen:

    “If that statement starts to chill you after a couple of moments’ consideration, then don’t be alarmed. A feeling of intense and crushing religious terror at the concept indicates only that you are still sane.”

    1. Kyle Wilson says:

      On a smaller scale this is what compiler optimizers do for software engineers today. The code that gets generated by a good compiler long ago reached a point where a human can’t get better. The compiler can look at all sorts of internal details of the execution hardware and make decisions that would have taken a human much research to arrive at. This has just pushed the human part of the process up to a more abstract level where human minds are still needed.

      1. Eldritch says:

        Actually, the computers aren’t better than expert humans at writing really, really fast programs whenever it’s actually possible for both to compete – the most speed-sensitive software, generally painstakingly optimized numerical libraries that a lot of other compute-intensive resources use, often will contain a fair bit of hand-optimized assembly.

        Hand-optimizing assembly, however, takes an enormous amount of effort, and is a rather esoteric skill these days, such that as long as the compilers can do a half-decent job it basically never makes sense to pay a human to do it for you – especially since the compiler is virtually never the bottleneck for performance problems with your code. And hand-optimizing assembly makes the program much, much, much harder to maintain and modify, and most applications are simply too large for humans to work with them on that level, and…

        1. Some guy says:

          It’s more complicated than that these days. You get the best results by knowing the details of the processor and then doing things like annotating with compiler/processor specific directives and seeing how the output changes.

          Especially when you’re using profile-guided optimizations, the compiler is generally better than any human at deciding what tradeoffs need to be made.

      2. Paul D. says:

        Testing compilers reveals another interesting parallel with med chem.

        You’d think finding bugs in software would be easier than finding drugs that treat some illness. After all, the software is all there, fully described, hopefully deterministic in its behavior.

        But it turns out it’s very effective when testing compilers to just bombard them with random inputs (drawn from various distributions biased to test various parts of the system). Any compiler that is confronted with such a testing tool for the first time will reveal bugs, often in large numbers. It’s something like high-throughput screening of drug candidates.

        1. eyesoars says:

          Yes… and it seems like one of the steps here for the chemistry program is to rank the synthetic steps it is fed in terms of ‘unlikeliness’, and see if those can be reproduced at all. I.e., asking ‘how good is the program at recognizing nonsense and the improbable’?

          Meanwhile, as someone who has worked in the compiler and optimization arena, a combination of routes is often best. The compiler is very good at keeping track of lots of apparently independent (and not-so-independent) facts, and generating good, if not always outstanding, code. If optimization is needful (not usually, these days), then it’s usually a kernel of some sort, to be executed billions or trillions of times, and there humans can step in. Either by replacing the kernel with a bit of hand code, or hand-editing the machine-generated code to improve it, or by annotating the original source to tell the compiler that ‘this and this are safe’ and letting it do more optimization. Having a human optimize a large program by hand is simply not done anymore.

  4. anon says:

    Similar software soon will be the standard and there won’t be any need to memorize Name Reactions or all those protecting groups. Just enter your target, then go get your coffee while the software is doing the hard work. This will also dramatically change the undergrad/grad curricula. No need to memorize pKa tables, functional groups that react with this or that, etc.

  5. another guy named Dan says:

    I think that these systems are beginning to blur the gap between expert systems and AI, but may not be there yet, especially as they probably cannot use insight gained from one field of study to enlighten another.

    It does look like they’re reaching a weak Turing Test milestone in their limited fields: given an input, the output they produce is becoming less distinguishable from that of a human being. It may be that they produce results that are mundane, inane, or even just plain wrong, but as much as us meatsacks want to deny it, we do too.

    1. Design Monkey says:

      Transferring insight from one field to another might very well be a specific quirk/deficiency of the wetware way of working, and not at all especially needed for proper AIs of different architectures. After all, why try to wrench in something from a different field, when 1. there is no guarantee that the foreign insight would be useful or applicable in any way, and 2. you (the AI) can properly derive all the relevant insights within the field itself?

      Would Stockfish or Rybka or any other chess engine improve from insights about ancient wars, battle strategies, and a library of their outcomes? Not at all. And chess was even designed as a crude simulator of war.
      An even sharper example is that same AlphaGo Zero, which really derived its crazy strategies itself and ab initio, without insights from others.

      1. another guy named Dan says:

        You’re arguing the reverse of the point I’m trying to make. From your analogy regarding chess as a gamified simulation of battle tactics, I would run the knowledge transfer in the other direction. I don’t argue that an intelligent system should be able to use insights from war to improve its performance at chess, but that insights learned at the chess board should be applicable to war tactics (maintain centers of strength, keep your units in mutually supporting positions, play to the unique capabilities of each of several different types of units, etc.) It’s that level of pattern fitting and abstraction that would be necessary to pass any but the weakest forms of the Turing Test.

  6. Big Lez says:

    From the conclusion:

    “While our approach is able to treat stereoinformation,
    the most important part, predicting enantiomeric or
    diastereomeric ratios quantitatively, remains an open challenge.
    Convincing global approaches for the quantitative prediction of
    enantiomeric or diastereomeric ratios over a wide range of different
    reactions without recourse to expensive quantum-mechanical
    calculations have not been reported”

    Surely that is a huge problem: there is no current way to make these predictions and no foreseeable way. So how can they progress beyond flat molecules and on to natural products etc. without some huge breakthrough in computing molecular structure?

  7. Kyle MacDonald says:

    Question from a non-chemist: to what extent do you think that it’s possible to think about “what molecules to make and why to make them” without a detailed understanding of “how”? As an analogy, how good can someone who has never been in a kitchen be at creating a restaurant menu?

    1. Derek Lowe says:

      Oh, those two are pretty much divorced from each other. You can come up with all sorts of structures, ideas that might be good variations on what you’ve already got or ideas that were suggested by some sort of computational model, and not quite know how to make them. “Wouldn’t it be interesting if we could turn this part into a seven-membered ring? Or if we had a basic nitrogen over here instead?” “You know, there seems to be an amino acid in the binding pocket that we could interact with if we just had a hydrogen-bond acceptor coming off this end of the lead compound. . .maybe if we had something like this. . .”

      To stick with your restaurant analogy, it would be a little like deciding that a seafood place might do well, or that your existing restaurant might want to add more low-carb appetizers to its menu since those seem to be popular, even if you’re not the person cooking any of it.

      1. another guy says:

        Pray they don’t package the software with pharmaceutical marketing AI or we’re all toast.

        The restaurant analogy that Derek expanded on seems to predict just that.

        1. NJBiologist says:

          You guys need to check out the endlessly entertaining weirdness of Janelle Shane’s blog, AIweirdness. I’ve linked her post on AI-generated pies. Based on some of the results (“Impossible Maple Spinach Apple Pie” and many, many others), human restaurateurs have some breathing room.

          1. gwern says:

            That’s just because it’s char-RNN on nothing but the titles. It has nothing to do with generating recipes. An actual example of generating recipes is much more compelling: for example, “Sukiyaki in French style: A novel system for transformation of dietary patterns” , Kazama et al 2017. Watson apparently also comes up with intriguing novel recipes which often work.

  8. steve says:

    Kyle MacDonald – This may answer your question.
    All of these protestations sound to me like we’re back in the days of Gutenberg.
    “But a machine could NEVER replace a scribe. What machine could possibly know what it’s like to put pen to parchment, to carefully draw each and every letter, to painstakingly transcribe from the original to make a new copy?”

  9. Kyle Wilson says:

    It seems to me that this is the same sort of path that compilers sent software engineers down. You stop sweating the low level details and move up the stack to higher level concerns. Along the way you’ll likely find that the problems people ask you to solve get more difficult, so in the end the time from start to closure doesn’t change much with the machine helping out with the routine bits… just that the end result is something more complicated than would have been feasible previously.

  10. 123 says:

    Unlike AlphaGo or AlphaGo Zero, this software would not be able to teach itself, as it has no way of knowing what works and what does not work in the lab. On top of that, Reaxys and CAS do not list reactions that do not work (failures are seldom reported and documented in the literature by chemists!). Also, as is pointed out, the molecules are three-dimensional and the stereochemical implications are too complex for the computers to manage. I am not a Go player, so I suspect that it’s just a two-dimensional game like chess and that mastery is easy for the machines (I would like to be corrected if I am wrong).

    Also, the curse of synthetic chemistry is that a retrosynthesis or synthesis plan is not enough to be published unless grad students are put to work generating the supporting info to show that the “synthesis plan” indeed works as planned (hence the perils of the previous post). It’s high time that there was a journal for this, or that all the synthesis journals started accepting simple synthetic plans, as they carry a lot of educational value and would save planet Earth a lot of chemical waste. It would also revive the creativity of synthetic chemists, which is slowly fading due to the funding crisis.

    Any suggestions as to how one could go about starting a free online journal to publish just synthesis planning, where neither authors nor readers would have to pay?

  11. Design Monkey says:

    And chemistry, that’s only side notes. The real fun will start, when Alpha Zero will figure out, you know, that stock market stuff. That might fix the little red wagon really good. Wholesale at that.

  12. Magrinhopalido says:

    I will be a skeptic until AI advances to the point of knowing which catalyst, solvent and base to use in an aryl amination.

      1. caveat says:

        I wouldn’t call this “done” for aryl amination, as it only applies to reactions run under soluble (DMSO) conditions with soluble, strong bases. Would have really liked to see this run under scaling-relevant conditions to explore effects of KF, base particle size, stir rates, activation of catalyst, as well as comparing kinetics (i.e. some reactions go fast and then decompose, hence you get false negative results). Nonetheless, I love this paper and want to see more reaction performance and catalyst predictions based on ML… perhaps turn an algorithm loose on well-characterized C-N couplings in the lit?

  13. Thiagarajan Bala says:

    In addition to the retrosynthetic plan, if the machine had something like a 3D printer, it would make things easy for chemists like me. Basically, this is going to generate an atmosphere where management distributes the printed sheets and asks the real chemists to deliver the targets. It is good for difficult molecules and can shorten design time.

    1. Derek Lowe says:

      You lost me at “3D printer”.

      1. metaphysician says:

        Indeed. No one sane is going to hand an AI a 3D printer. You might as well just declare for the robot overlords outright. 🙂

        1. x says:

          Nonsense – we’ll be fine as long as (1) no one is training the AIs to be cross-disciplinary, and (2) no one is providing the AIs with a corpus of viable “AIs take over the world” plans.

  14. DNAencodedFeces says:

    Yes, this will work perfectly for automated synthesis, with all those ultra-pure reagents my boss thinks he is saving money on from overseas chemical suppliers.

    Yep, can’t wait until it all works perfectly!

  15. steve says:

    In terms of Go, as I’ve mentioned in other threads, typical games between experts last about 150 moves, with an average of about 250 choices per move, suggesting a game-tree complexity of about 10^360. I think that probably would encompass the complexity of most synthetic routes. In terms of AI not knowing if a synthesis works, that’s what I meant by there still being jobs for chemists working for their AI overlords. Though one could imagine combining AI synthetic robots with AI analytical robots and eliminating the middlemen (or women).

  16. Hap says:

    This could help people, just as other forms of technology have, but since we (as a society) haven’t been good at figuring out anything to do with people once they are no longer needed, I tend more to the fear part than the “Cool” part of this.

    Maybe it would help to let AI loose on some pharma/fine chemical portfolios.

    1. x says:

      Figuring out things to do with surplus labor isn’t actually that hard. The problem is that we’ve chosen to spend the necessary funding on producing billionaires instead, even though we can’t really justify their cost.

  17. 3rd year PhD says:

    Not sure if this was mentioned, but the software only learns from published results. Chemists (and specifically organic chemists) are NOTORIOUS for omitting negative results. If the software had access to negative results, surely it would be immensely more powerful. Moreover, a lot of results go unpublished, so software relying only on positive results that are published has a huge limitation…

    nevertheless it is clear that AI will eventually beat us… I should have gone to med school…

    What will be even more earth-shattering is when AI gets reasonably good at methodological reaction discovery (i.e. figuring out that X ligand gives higher ee). For example, in most pharma companies huge libraries of chiral ligands are still used in screening for enantioselective hydrogenation reactions. If software can immediately tell you what ligand will give optimal ee for what substrate, then we are truly screwed. But as the paper points out, at the moment stereochemical predictions are limited. Humans cannot really predict stereochemical outcomes – it can be very subtle – but humans are good at back-rationalizing stereochemical outcomes.

    1. RJMystery says:

      Will humans’ reluctance to report their own lack of perfected technique or their failures in planning keep the AI overlords from achieving complete dominance? Even if a reaction or scheme failure isn’t necessarily anyone’s fault I suppose it still hurts one’s pride to have to disclose or explain it. Otherwise it would in fact get reported because it is a result of its own. Maybe our own primate emotions will save us from providing the machines all the information they need to make us obsolete. I’m no expert on the technical capabilities of AI, I just think it’s interesting that incomplete result reporting may hinder the seemingly inevitable robot takeover. Maybe hubris is good here.

      The stereochemical limitations are also very important here, imho. Is every kind of prediction possible?

  18. tlp says:

    So organic synthesis is officially not an art anymore.
    Looking forward to competition Baran lab vs. B.Sc.-level-CRO-employee-empowered-by-AI-and-robots in a couple of years.

    Also, I imagine CAS and Reaxys are already fighting for the software.

  19. Anonymous says:

    I’m unable to read everything related to this post, but I’ll pitch in anyway, hoping not to be too redundant. I am familiar with Syngen (JB Hendrickson), having tested and used it many times thru numerous versions. Most chemists and AI programs can easily do one-bond retrosynthetic planning, snipping off a side chain or dangling aryl group. After trimming off the spaghetti, one of Syngen’s (JBH’s) principles was to dissect the Target into two almost equal-sized pieces by cleaving two central bonds, if feasible. But it would process everything anyway and the user could adjust the filters. (You probably do not want to make a C20 polycyclic via C19 + C1 ; C18 + C2… a C10 + C10 could be the most efficient. Quassin can be seen as a “dimer” of two identical C11 “hemi-quassin” carboxymethyl-dimethyl-methoxy cyclohexenones (one tautomeric to the other). Not picked up by Syngen was that hemi-quassin is itself a dimer of 1-methoxy-3-pentene-2-one = quarto-quassin.)

    Syngen did not know rearrangement reactions very well. And I’m not sure that any other AI synthesis programs can “see” some of the spectacular rearrangements that can be used to assemble and reconfigure complex skeleta.

    Sometimes, Syngen would generate utterly stupid and impossible retrosynthetic suggestions and THOSE were some of the most interesting. They would be the basis for coming up with new ideas and proposals to make such impossibilities realizable. (E.g., umpolung, homo-aldol, etc.)

    I don’t think that Management (industry, government, or private funding) is interested in investigating new chemistry ideas or synthetic reactions so most of that will never be tested.

  20. Once a chemist says:

    ‘Actual practice, meanwhile, will consist of more thinking about what molecules to make and why to make them, and a lot less thinking about “how”. ‘

    I think what to make may well also end up in the realms of the computer. Historically the main complaint in de-novo design is synthetic accessibility. If these methods work, then suddenly that roadblock is reduced.

  21. Wavefunction says:

    Agree that this kind of algorithm cannot anticipate unexpected things like rearrangements, cyclizations etc., but firstly, humans cannot do that until after the fact either, and second and more important, the algorithms only have to be good enough to start making enough of a dent and helping out their human counterparts. For the really thorny issues we’ll always need people to sort out the mess, but presumably far fewer than now.

  22. Rubarf says:

    Yeah right! As if a machine could ever sequentially try each photocatalyst with each solvent with each radical trap. That’ll be the day!

  23. Yepsir says:

    I don’t know. Say you have the path. A robot could do X reactions with different solvents, reactant A, temperatures, etc., and in a day or two determine if it’s viable. If it works, a human runs it on large scale, then rinse and repeat. The machines will win.

  24. David says:

    This paper and the earlier Chematica one are both cool, but I’m not sure they’re actually going to speed things up all that much. In a typical synthesis project, the amount of time needed to develop a plausible-on-paper synthetic route is vastly smaller than the amount of time required to reduce it to practice and produce useful quantities of material. It seems like the computers are accelerating a step that is not rate-limiting….

    1. Derek Lowe says:

      I agree with that – in most med-chem situations, figuring out a route really isn’t rate-limiting. This isn’t going to speed up drug discovery. But no matter how much time we spend thinking about routes, we’re going to be spending less than we do now, it looks like.

  25. Some idiot says:

    To me, this is evolution, not revolution (albeit a larger step). But I actually don’t see it changing our ways of working much, apart from giving us a larger set of options and information for what we are about to do. For a med chem, the main part is finding the right compounds to make (and I am in no way belittling the excellent synthetic work done in med chem). For a process chem (like me) the main part is making a practical synthesis (with all the myriad ins and outs that entails). Yes, route-finding is important, but the hands-on work is paramount.

    In both cases (as has been mentioned numerous times) this will be an excellent tool. But the hard work (as always) is getting the damn stuff to work, and the chemical insight to realise what the problems are when things don’t work. I will be watching with great interest to see whether or not these systems can deliver insight. And I think this is the critical point. To take the example of AlphaGo and AlphaGo Zero, despite the fact that they come up with excellent moves (and are quite superior to humans), they cannot explain why. To the best of my knowledge this is the same with all (or the vast majority) of machine learning systems (which is also very problematic for, for example, people who are refused insurance for apparently no reason at all).

    I think that their statement that the system does not handle enantioselectivity (incidentally, the autocorrect on my phone suggested “DNA tips electricity” for enantioselectivity… 🙂 ) due to the requirements of heavy QM calculations shows that it is precisely this insight that will elude these systems until the next leap in computing technology arrives (massively parallel quantum computing, perhaps?). And even then, predicting that the product is actually in the dark oil which is clogging around the thermometer will be a tough ask.

    The only fear I have is that some top bosses who know far less about chemistry than they think they do will decide that they can use this software to cut the number of chemists significantly. This might look good in a spreadsheet, but far less so in reality…

  26. Marcin Stasiak says:

    My worry nowadays is only about robots and AI replacing nurses

  27. GO player and Immunologist says:

    As a decent GO player, I can say the rules of engagement in GO are exceedingly simple. I will say with confidence that they are much simpler than those of Chess. These simple rules allowed GO as a game to quickly scale up to a much bigger board, achieving a seeming complexity beyond an ordinary human being’s computational power. As a result, a good GO player has to rely on his/her raw calculation power as well as intuition. These intuitions can be replaced by the high-speed computational power of a supercomputer. In retrospect, the 0 and 1 decision making by computers is more similar to GO than to Chess. To beat AI, in my opinion, is to ask: how many basic rules are we using? How many new rules for basic science can we discover in future?

  28. Eugene says:

    Next step, AI retrosynthesis output interfaced directly to an automated flow chemistry system.

  29. Dear Derek,

    This paper was newly accepted in Nature, but it’s an old one that dates back to August 2017

    And the algorithm was published in January 2017

    Next time, please comment on new papers, and don’t wait until Nature editors slowly pick them up.

    There’s nothing new in this ‘news’, and many other papers followed around the paper you noticed.

    Show intellectual leadership

    1. regularguy says:

      Oh my! A non-peer-reviewed version was up on a preprint server for a few months before Derek covered it! How embarrassing! Just scathing, Mr. Benhenda, a truly devastating quip.

  30. Jie Shen says:

    It is not proper to compare AlphaZero with this “AI Chemist”. AlphaZero has an unlimited number of games to learn from, while the “AI Chemist” has a limited set of reactions to learn from. So I don’t think it will come up with a “why didn’t I think of that” route, unless it connects to a synthesis robot system and generates data by itself.

    1. Derek Lowe says:

      That’s a good point, as are the ones about the relative lack of negative results. What I might expect would be large organizations putting in their own e-notebook data (which will have more negative results and cover a wide range of conditions, via the process labs). The long-term hope, which I know that some groups are already working on, is to see what places there should be reactions, but aren’t yet for some reason. Those might be a good place to turn the microscale reaction-hunting systems loose in hopes of coming up with some new and useful transformations.

      I appear to have outlined an entirely new blog post. Look for a fleshed-out version of this in the next week or two.

      1. Matthew Todd says:

        “The long-term hope, which I know that some groups are already working on, is to see what places there should be reactions, but aren’t yet for some reason”
        I look forward to reading that, Derek. An area of great importance. For an early (and poorly cited) example of this from Rainer Herges, see

  31. Mostapha Benhenda says:

    Reading the reviews of the previous version of this paper, submitted last year to ICLR 2017, is instructive:

  32. Scott says:

    Bit of a necro-post, but.

    Not knowing why the computer is giving you that specific output is rarely a good thing. Losing the institutional knowledge of why the computer is giving that output is even worse.

    The US Navy is going through something similar: as all the old maintenance guys finally retire from their shipyard jobs (after having retired from their Navy jobs), the guys that the Navy has been training are just flat not up to the task.
