Here’s another look at retrosynthesis software, building on the earlier Chematica paper on generating new routes to known compounds. This one goes into more detail on the same idea, using the software both to analyze the existing routes to marketed drugs (and the patent landscape around them) and to come up with new ones that would slip through those existing patents.
As anyone who’s had to do this sort of analysis can tell you, a small-molecule drug can have a forest of patent filings around it. Many of these will be around the chemical matter itself (and associated things like formulations and dosage methods for it), and others will be process patents for useful ways to prepare it. Some of these may have “product by process” claims, where you’re indirectly claiming the final drug substance itself (when made by the specified route). But others will be direct claims for the process itself, not the chemical matter it produces. In many jurisdictions, the burden of proof is on the potential violator of such patents, and they have to show that they’re not infringing a patented route. Under US law, importing a drug substance that’s made outside the country via a US-patented route can constitute infringement itself (as clarified in Momenta v. Teva a few years ago). So there’s a pretty strong incentive to find new routes.
I have to say, it looks like the software does a pretty nice job of it (both the analysis and the new-route-finding). The latter is done by analyzing the existing route for the key bonds that are being formed, and then forcing the software to consider other options. Without such constraints, it tends to come up with routes that are similar or identical to the already-patented ones (which speaks well of it, actually). But if you tell it “No, I don’t want anything where that ring gets formed as part of the sequence” or the like, then it goes off-roading and comes up with something new.
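To make that constraint idea a bit more concrete, here is a toy Python sketch of the general logic: tally which bonds every known route forms, mark those as off-limits, and keep only the candidate routes that never form them. This is emphatically not how Chematica works internally (the paper's software searches a huge network of reaction rules). The routes, step names, bond labels, and function names below are all invented for illustration and are not taken from the paper or from any real patent.

```python
from collections import Counter
from typing import NamedTuple


class Step(NamedTuple):
    name: str
    bonds_formed: frozenset  # labels of the bonds this step creates


def bonds(*labels):
    """Convenience wrapper: build the set of bonds formed in one step."""
    return frozenset(labels)


# Hypothetical "patented" routes; the bond labels are illustrative only.
PATENTED = {
    "route_A": [Step("arylation", bonds("N3-Ar")),
                Step("ring closure", bonds("O1-C2", "C2-N3"))],
    "route_B": [Step("epoxide opening", bonds("N3-C4")),
                Step("carbonylation", bonds("O1-C2", "C2-N3"))],
}

# Hypothetical candidate routes that an unconstrained search might propose.
CANDIDATES = {
    "cand_1": [Step("ring closure", bonds("O1-C2", "C2-N3")),
               Step("acylation", bonds("N-Ac"))],
    "cand_2": [Step("arylation of preformed ring", bonds("N3-Ar")),
               Step("acylation", bonds("N-Ac"))],
    "cand_3": [Step("amine displacement", bonds("Ar-N")),
               Step("acylation", bonds("N-Ac"))],
}


def key_bonds(routes, min_fraction=1.0):
    """Bonds formed in at least `min_fraction` of the given routes."""
    counts = Counter()
    for steps in routes.values():
        # Count each bond at most once per route.
        formed = set().union(*(s.bonds_formed for s in steps))
        counts.update(formed)
    n_routes = len(routes)
    return {b for b, c in counts.items() if c / n_routes >= min_fraction}


def routes_avoiding(candidates, protected):
    """Names of candidate routes in which no step forms a protected bond."""
    return [
        name
        for name, steps in candidates.items()
        if all(s.bonds_formed.isdisjoint(protected) for s in steps)
    ]


protected = key_bonds(PATENTED)
surviving = routes_avoiding(CANDIDATES, protected)
print(sorted(protected))  # bonds common to every patented route
print(surviving)          # candidates that never form those bonds
```

In this toy setup the candidate that closes the ring gets thrown out and the other two survive. A real program has to go back and search for new routes under the constraint rather than just filter a fixed list, but the bookkeeping is the same in spirit.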
For example, shown above is the program’s analysis of the patented routes to linezolid. A common theme is the synthesis of the oxazolidinone ring (and to be sure, most organic chemists would think the same way). But making those bonds “protected” makes the program come up with a different set of potential routes (shown at right). Now, this is not the most difficult synthetic problem in the world, but that looks like a perfectly creditable job of it, and the program is also spitting out literature precedents for all of these steps along with the analysis. You can certainly do this by hand, but why would you? That’s the question that such programs force us to ask, and it’s the same question that other labor-saving devices have made people ask over the years. And that’s what this and the other programs like it are: labor-saving devices. The labor in this case is not hauling laundry or digging holes, but rather the mental exertion in thinking up these pathways and the time spent in coming up with plausible procedures and precedents. The work is taking place between your ears (well, and with your fingers at the keyboard) but it’s still work.
The paper shows similar analyses for sitagliptin and panobinostat, and all three drugs are (I would say) solid representatives of the kinds of structures and syntheses that we work with in med-chem. For sitagliptin, the program ends up with a chiral-pool synthesis rather than one that depends on a chiral reduction, and for panobinostat it comes up with two different approaches, depending on whether or not you allow a Pd-catalyzed coupling reaction. It would not surprise me to find that the process chemists at the various companies (both the originators and generic competition) have considered these general schemes already, of course. But they probably took a lot longer to do it!