Here’s another Big Retrospective Review of drug pipeline attrition. This sort of effort goes back to the now-famous Rule-of-Five work, and readers will recall the Pfizer roundup of a few years back, followed by an AstraZeneca one (which didn’t always recapitulate the Pfizer pfindings, either). This latest is a joint effort to look at the 2000-2010 pipeline performance of Pfizer, AstraZeneca, Lilly, and GSK all at the same time (using common physical descriptors provided to a third party, Thomson Reuters, to deal with the proprietary nature of the compounds involved). The authors explicitly state they’ve taken on board the criticisms of these papers that have been advanced in the past, so this one is meant to be the current state of the art in the area.
What does the state of the art have to teach us? 812 compounds are in the data set, with their properties, current status, and reasons for failure (if they have indeed failed, and believe me, those four companies did not put eight hundred compounds on the market in that ten-year period). The authors note that there still aren’t enough Phase III compounds to draw as many conclusions as they’d like: 808 had a highest phase described, 422 of those were still preclinical, 231 were in Phase I, 145 in Phase II, 8 were in Phase III and 2 in Phase IV/postmarketing studies. These are, as the authors note, not quite representative figures, compared to industry-wide statistics, and reflect some compounds (including several that went to market) that the participants clearly have left out of their data sets. Considering the importance of the (relatively few) compounds in the late stages, this is enough to make a person wonder about how well conclusions from the remaining data set hold up, but at least something can be said about earlier attrition rates (where that effect is diluted).
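To put those highest-phase counts in proportion, here is a quick back-of-the-envelope sketch in Python. The counts are the ones quoted above; the variable and function names are my own, purely for illustration, and nothing here comes from the paper's own analysis.

```python
# Highest-phase counts as quoted in the text above.
# Names (phase_counts, phase_shares) are illustrative, not from the paper.
phase_counts = {
    "Preclinical": 422,
    "Phase I": 231,
    "Phase II": 145,
    "Phase III": 8,
    "Phase IV/postmarketing": 2,
}


def phase_shares(counts):
    """Return each phase's share of the total, as a percentage."""
    total = sum(counts.values())
    return {phase: 100.0 * n / total for phase, n in counts.items()}


total = sum(phase_counts.values())
shares = phase_shares(phase_counts)

# Compounds whose highest phase was Phase III or later.
late_stage = phase_counts["Phase III"] + phase_counts["Phase IV/postmarketing"]

for phase, pct in shares.items():
    print(f"{phase:<24}{phase_counts[phase]:>5}{pct:>7.1f}%")
print(f"Total with a highest phase: {total}")
print(f"Phase III or later: {late_stage} ({100.0 * late_stage / total:.1f}%)")
```

The five categories do add up to the 808 compounds with a reported highest phase, and only ten of them (a bit over one percent) got as far as Phase III, which is why the late-stage conclusions rest on so little.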
605 of the compounds in the set were listed as terminated projects, and 40% of those were chalked up to preclinical tox problems. Second highest, at 20%, was (and I quote) “rationalization of company portfolios”. I divide that category, myself, into two subcategories: “We had to save money, and threw this overboard” and “We realized that we never should have been doing this at all”. The two are not mutually exclusive. As the paper puts it:
. . .these results imply that substantial resources are invested in research and development across the industry into compounds that are ultimately simply not desired or cannot be progressed for other reasons (for example, agreed divestiture as part of a merger or acquisition). In addition, these results suggest that frequent strategy changes are a significant contributor to lack of research and development success.
You think? Maybe putting some numbers on this will hammer the point home to some of the remaining people who need to understand it. One can always hope. At any rate, when you analyze the compounds by their physicochemical properties, you find that pretty much all of them are within the accepted ranges. In other words, the lessons of all those earlier papers have been taken on board (and in many cases, were part of med-chem practice even before all the publications). It’s very hard to draw any conclusions about progression versus physical properties from this data set, because the physical properties just don’t vary all that much. The authors make a try at it, but admit that the error bars overlap, which means that I’m not even going to bother.
What if you take the set of compounds that were explicitly marked down as failing due to tox, and compare those to the others? No differences in molecular weight, no differences in cLogP, no differences in cLogD, and no differences in polar surface area. I mean no differences, really – it’s just solid overlap across the board. The authors are clearly uncomfortable with that conclusion, saying that “. . .these results appear inconsistent with previous publications linking these parameters with promiscuity and with in vivo toxicological outcomes. . .”, but I wonder if that’s because those previous publications were wrong. (And I note that one such previous publication has already come to conclusions like these). Looking at compounds that failed in Phase I due to explicit PK reasons showed no differences at all in these parameters. Comparing compounds that made it only to Phase I (and failed for any reason) versus the ones that made it to Phase II or beyond showed, just barely, a significant effect for cLogP, but no significant effect for cLogD, molecular weight, or PSA. And even that needs to be interpreted with caution:
. . .it is not sufficiently discriminatory to suggest that further control of lipophilicity would have a significant impact on success. Examination of how the probabilities of observing clinical safety failures change with calculated logP and calculated logD7.4 by logistic regression showed that there is no useful difference over the relevant ranges. . .
So, folks, if your compounds mostly fit within the envelope to start with (as these 812 did), you’re not doing yourself any good by tweaking physicochemical parameters any more. To me, it looks like the gains from that approach were realized early on, by trimming the fringe compounds in each category, and there’s not much left to be done. Those PowerPoint slides you have for the ongoing project, showing that you’ve moved a bit closer to the accepted middle ground of parameter space, and are therefore making progress? Waste of time. I mean that literally – a waste of time and effort, because the evidence is now in that things just don’t work that way. I’ll let the authors sum that up in their own words:
It was hoped that this substantially larger and more diverse data set (compared with previous studies of this type) could be used to identify meaningful correlations between physicochemical properties and compound attrition, particularly toxicity-based attrition. . .However, beyond reinforcing the already established general trends concerning factors such as lipophilicity (and that none too strongly – DBL), this did not prove generally to be the case.
Nope, as the data set gets larger and better curated, these conclusions start to disappear. That, to be sure, is (as mentioned above) partly because the more recent data sets tend to be made up of compounds that are already mostly within accepted ranges for these things, but we didn’t need umpteen years of upheaval to tell us that compounds that weigh 910 with logP values of 8 are less likely to be successful. Did we? Too many organizations made the understandable human mistake of thinking that changing drug candidate properties was some sort of sliding scale, that the more you moved toward the good parts, the better things got. Not so.
What comes out of this paper, then, is a realization that watching cLogP and PSA values can only take you so far, and that we’ve already squeezed everything out of such simple approaches that can be squeezed. Toxicology and pharmacokinetics are complex fields, and aren’t going to roll over so easily. It’s time for something new.