So let’s talk robotic chemistry experimentation – that always calms everyone right down, doesn’t it? This new paper in Nature from a group in Liverpool is (at heart) a pretty straightforward implementation of modern reaction optimization, with the added feature that it’s being done by a mobile robot, rolling around the lab in the dark for about 20 hours a day. The motorized chemist itself is shown at right – it’s battery-operated (the four hours per day of downtime are for charging) and can perform a wide variety of tasks and manipulations on both solid and liquid samples.
The reaction being investigated is the semiconductor-mediated photochemical production of hydrogen from water, which has of course been the subject of a massive amount of research. It’s an important topic – water is nothing more than burnt hydrogen, and “unburning” it in a catalytic manner like this would be very useful. We need hydrogen, among other things, to make ammonia to keep us all from starving. Right now we get most of it from steam reforming of hydrocarbons, especially methane, which is an energy-intensive process (has to be, thermodynamically) that runs at about 70% efficiency. About 4% of worldwide hydrogen is produced by electrolysis of water, and this photochemical route is basically a search for a better way to do that latter reaction than sticking the electrodes in and throwing the power switch.
Semiconductor surfaces can react with light to produce electron/electron-hole pairs, and those electrons can be used for the water-splitting. But you have to do something about those positively charged holes left in the semiconductor, so “hole scavengers” have been an active area of research. Tertiary amines can do the job, donating electrons back to the material, but that decomposes them in turn. Could there be a catalytic hole scavenger cycle running alongside the catalytic water-splitting one?
Finding such a thing requires a lot of experimentation; it’s very hard to model these processes from first principles, what with all the mixing, surface effects, multiple kinetic steps, and so on. So in this paper the authors picked a photocatalyst (the polymer P10) and first had the robotic system screen 30 candidate hole scavengers. This involves several steps – dispensing the solid polymer into vials, adding solutions of the candidate scavengers, capping the vials, sonicating them to disperse the contents, photolyzing each vial (under a mixture of visible and UV light), and analyzing the results afterwards by GC. The capping and photolysis stations were built with the robot in mind, but the others were all regular equipment.
Promising scavenger candidates were re-run five times each for comparison, and the only two that looked of any interest were ascorbic acid and cysteine, the latter of which was converted cleanly to its disulfide dimer, cystine. The next round of experiments tried to optimize that system, with three photosensitizer dyes, varying ionic strength (addition of NaCl), changes in pH by NaOH addition, addition of surfactants, and addition of sodium disilicate. All of these variations could affect the others, and the group calculated that a full search of all the possibilities (over about 20 different concentrations for the components and the P10) could come out to 98 million experiments.
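To get a feel for how a number like 98 million arises, here’s a back-of-the-envelope count. The per-variable level counts below are my own illustrative guesses, not the paper’s actual grid, but they show how a handful of variables at up to ~20 concentrations each multiplies out to a nine-digit search space:

```python
from math import prod

# Hypothetical variables and level counts -- illustrative guesses,
# not the actual experimental grid from the Nature paper.
levels = {
    "P10 loading": 20,
    "cysteine concentration": 20,
    "NaCl concentration": 20,
    "NaOH concentration": 20,
    "dye choice/concentration": 10,
    "surfactant choice/concentration": 10,
    "sodium disilicate concentration": 6,
}

# A full-factorial search would visit every combination once.
total = prod(levels.values())
print(f"{total:,} experiments")  # 96,000,000 experiments
```

That lands in the same ballpark as the paper’s 98-million figure, and at the robot’s realized throughput (688 experiments in about 8 days, so roughly 85 per day), an exhaustive search is obviously hopeless – hence the need for a smarter search strategy.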
To deal with that, the authors set up a Bayesian optimization framework with a “capitalist acquisition” strategy: the system starts from predictions of what combinations might be interesting (with bounds on how confident each prediction is), and those confidence bounds are recalculated as new data come in. A whole portfolio of experimental conditions is generated at various levels of risk aversion/risk tolerance – greed, in the language of the capitalist algorithm. These portfolios are, in effect, searching for the highest return (maximum rate of hydrogen production), and the system ensures that some of the experiments will be very conservative while others will be comparatively wild shots into the unknown.
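For the curious, the core of that idea can be sketched in a few lines using an upper-confidence-bound acquisition with a spread of “greed” settings. This is a generic illustration, not the paper’s actual algorithm: the surrogate model that supplies the predicted means and uncertainties (typically a Gaussian process) is assumed rather than shown, and the names and numbers are mine.

```python
import numpy as np

rng = np.random.default_rng(0)

def acquire_portfolio(mu, sigma, kappas):
    """Pick one candidate experiment per 'greed' level kappa.

    mu, sigma : posterior mean and std of predicted H2 rate for each
                candidate condition, from a surrogate model (e.g. a
                Gaussian process -- not implemented here).
    kappas    : exploration weights; low = conservative (trust the
                mean), high = risk-tolerant (chase the uncertainty).
    """
    picks = []
    for kappa in kappas:
        ucb = mu + kappa * sigma  # upper confidence bound per candidate
        picks.append(int(np.argmax(ucb)))
    return picks

# Toy posterior over 1000 hypothetical reaction conditions
mu = rng.normal(size=1000)
sigma = rng.uniform(0.1, 2.0, size=1000)

# One batch spans conservative picks (kappa=0, pure exploitation)
# through comparatively wild shots into the unknown (kappa=4).
batch = acquire_portfolio(mu, sigma, kappas=[0.0, 0.5, 1.0, 2.0, 4.0])
print(batch)
```

The kappa=0 pick is just the current best guess, while the high-kappa picks deliberately chase regions where the model is uncertain – which is how a single batch of robot runs can hedge between exploitation and exploration.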
After about 150 experiments (roughly two days of running time) it seemed clear that neither the surfactants nor the extra dyes were bringing anything useful – they killed the hydrogen yields every time they were added. So those were deselected. And after 8 days of total working time (688 experiments) the system had found a mixture of P10, NaCl, NaOH, sodium disilicate, and cysteine that increased the hydrogen yield by a factor of six. Interestingly, the beneficial effect of adding NaOH had earlier been masked by the addition of the dyes, and it made a comeback in the later rounds of experimentation. The sodium disilicate variable absorbed a lot of experimental effort early on, but proved less important by the end. This might be a good place to note, since the paper makes no mention of it, that the addition of NaCl might not just be an ionic strength effect – chloride ion is also thought to be a useful hole scavenger itself in some systems.
Judging from the time needed to run these experiments by hand, the authors estimate that the mobile robot system is up to one thousand times faster in exploring such a large experimental space, and note that “It is unlikely that a human researcher would have persevered with this multivariate experiment using manual approaches given that it might have taken 50 experiments or 25 days to locate even a modest enhancement . . .” That’s certainly true, but the humans who set up the robots and watched their progress (or lack thereof) can be similarly impatient. I could easily imagine myself (or some other human) looking at the results after a couple of days and saying “150 experiments and nothing to show for it! We’re doing something wrong, aren’t we?” Actually, by getting rid of the dyes and the surfactants, that’s more or less what these authors did. Keep in mind that the strategies for exploring experimental space are separate from the capabilities of the mobile robot itself – it’ll roll around and set things up according to whatever genius (or completely cockamamie) schemes you provide.
And that brings up some larger points. I like the idea of using automated and semi-autonomous systems to plow through big problems like this one. But it’s important to realize what the robot and the software don’t do. The search algorithm, for starters, doesn’t seem to have decided to ditch the dyes and the surfactants: the human experimenters did, and it turned out to be key to getting results in the end. Above that, it was of course the human experimenters who decided on the parameter space to search in the first place – the idea of using surfactants was based on another polymer catalyst where that seemed to help, and likewise the pH and ionic strength. These were human-generated hypotheses. The robots and software can run off and do experiments, but humans have to tell them what to do, or at least where to get started and what variables to consider. An even larger issue is the decision of what to turn the robots loose on in the first place – it is, after all, a human decision to try to look for photocatalytic hydrogen generation systems as opposed to spending your time and money somewhere else.
The authors address these points directly, to their credit:
This approach also has some limitations. For example, the Bayesian optimization is blind, in that all components have equal initial importance. This robotic search does not capture existing chemical knowledge, nor include theory or physical models: there is no computational brain. Also, this autonomous system does not at present generate and test scientific hypotheses by itself. In the future, we propose to fuse theory and physical models with autonomous searches: for example, computed structures and properties could be used to bias searches towards components that have a higher likelihood of yielding the desired property. This will be important for search spaces with even larger numbers of components where purely combinatorial approaches may become inefficient. . .
So in the end, it comes back to what’s been said before about such automation: it does not get rid of the human element, so much as push the humans to work on the parts that only humans are good at. Those are the higher-level things: what experiments to run, what hypotheses to formulate and how to test them, what variables to introduce. And above all that, what entire types of experiments and projects should be running in the first place. No software would have told you that there was such a thing as the Diels-Alder reaction out there to be discovered and optimized, nor that photoredox synthetic chemistry was an underdeveloped field where effort would pay big dividends. No software would have said “Hey, go search in bacterial defense mechanisms for a tool to edit all the other genomes up to humans” or “Whoa, did you realize that there must be a phase-separation component in the transcriptional machinery?” Nope, that stuff is up to us to find, to know what we’re looking at when we see it, and to realize what it could be used for after that.