Here’s a retrosynthesis challenge from Merck KGaA in Darmstadt. They’re celebrating the company’s 350th anniversary, and this is apparently part of the festivities. Anyone can enter for free, and the company will choose up to 12 entrants to take part in the competition itself. As I understand it, each selected person/group will then be furnished with the structure and will have 96 hours to submit a proposal. These routes will then be tested at the bench in a contract synthesis lab. The winner (based on number of steps, overall yield, and isolated purity of the final product) will get 10,000 euros, which is not a bad reward.
There are, of course, crowdsourcing platforms for scientific problems, with Innocentive being the best-known. This challenge, though, doesn’t seem to be so much a direct “How do we make this molecule?” question (for example, they aren’t going to commercialize the compound or anything arising out of the challenge). The Merck folks tell me that one motivation comes from their recent purchase of Chematica. They would like to know what the differences are between human ingenuity in retrosynthesis and the software challengers to it, so they’re throwing this one open to the chemical community at large.
To me, this sort of competition is two competitions mixed together. The first is paper synthesis – how many reasonable routes are there to a given structure? How many rough classes do these approaches fall into, and how many weirdo singleton routes show up? What sorts of reactions and conditions get proposed more than others, and is this some reflection of the “natural order of things” or just an artifact of how we all learn organic chemistry?
The second competition is reduction to practice, and that’s where things get randomized a bit. As one or two readers may have had occasion to notice, not all good paper routes work out exactly as drawn. Where, then, are the skidmarks left when the rubber hits the road – what reasonable steps turned out to fail, and why? Could these failures have been anticipated, or are they more cases of “Well, whaddya know”? Can reactions and whole synthetic strategies be evaluated for robustness before you even start, or not (yet)?
All of these questions, at both stages of the challenge, are of great interest for AI development of retrosynthesis, so I can see why the Merck/Chematica team would want to see what happens. (I would guess that the Innocentive folks have a good amount of data on these topics as well, although I don’t know what use is being made of it.) I very much hope that we get full reports along the way from Darmstadt, and I encourage the company to keep everyone informed. An open database of proposals would be great, along with updates about how the actual bench chemistry performs as it happens. Good luck to all entrants!