Clinical trial failure rates are killing us in this industry. I don’t think there’s much disagreement on that – between the drugs that just didn’t work (wrong target, wrong idea) and the ones that turn out to have unexpected safety problems, we incinerate a lot of money. An earlier, cheaper read on either of those would transform drug research, and people are willing to try all sorts of things to those ends.
One theory on drug safety is that there are particular molecular properties that are more likely to lead to trouble. There have been several correlations proposed between high logP (greasiness) and tox liabilities, multiple aromatic rings and tox, and so on. One rule proposed in 2008 by a group at Pfizer is that clogP >3 and total polar surface area less than 75 square angstroms is a good cutoff – compounds on the other side of it are about 2.5 times more likely to run into trouble. But here’s a paper in MedChemComm that asks if any of this has any validity:
What is the likelihood of real success in avoiding attrition due to toxicity/safety from using such simple metrics? As mentioned in the beginning, toxicity can arise from a wide variety of reasons and through a plethora of complex mechanisms similar to some of the DMPK endpoints that we are still struggling to avoid. In addition to the issue of understanding and predicting actual toxicity, there are other hurdles to overcome when doing this type of historical analysis that are seldom discussed.
The first of these is making sure that you’re looking at the right set of failed projects – that is, ones that really did fail because of unexpected compound-associated tox, and not some other reason (such as unexpected mechanism-based toxicity, which is another issue). Or perhaps a compound could have been good enough to make it on its own under other circumstances, but the competitive situation made it untenable (something else came up with a cleaner profile at about the same time). Then there’s the problem of different safety cutoffs for different therapeutic areas – acceptable tox for a pancreatic cancer drug will not cut it for type II diabetes, for example.
The authors did a thorough study of 130 AstraZeneca development compounds, with enough data to work out all these complications. (This is the sort of thing that can only be done from inside a company’s research effort – you’re never going to have enough information, working from outside). What they found, right off, was that for this set of compounds the Pfizer rule was completely inverted. The compounds on the too-greasy side actually had shown fewer problems (!) The authors looked at the data sets from several different angles, and conclude that the most likely explanation is that the rule is just not universally valid, and depends on the dataset you start with.
The same thing happens when you look at the fraction of sp3 carbons, which is a characteristic (the “Escape From Flatland” paper) that’s also been proposed to correlate with tox liabilities. The AZ set shows no such correlation at all. Their best hypothesis is that this is a likely correlation with pharmacokinetics that has gotten mixed in with a spurious correlation with toxicity (and indeed, the first paper on this trend was only talking about PK). And finally, they go back to an earlier properties-based model published by other workers at AstraZeneca, and find that it, too, doesn’t seem to hold up on the larger, more curated data set. Their-take home message: “. . .it is unlikely that a model of simple physico-chemical descriptors would be predictive in a practical setting.”
Even more worrisome is what happens when you take a look at the last few years of approved drugs and apply such filters to them (emphasis added):
To investigate the potential impact of following simple metric guidelines, a set of recently approved drugs was classified using the 3/75 rule (Table 3). The set included all small molecule drugs approved during 2009–2012 as listed on the ChEMBL website. No significant biases in the distribution of these compounds can be seen from the data presented in Table 3. This pattern was unaffected if we considered only oral drugs (45) or all of the drugs (63). The highest number of drugs ends up in the high ClogP/high TPSA class and the class with the lowest number of drugs is the low ClogP/low TPSA. One could draw the conclusion that using these simplistic approaches as rules will discard the development of many interesting and relevant drugs.
One could indeed. I hadn’t seen this paper myself until the other day – a colleague down the hall brought it to my attention – and I think it deserves wider attention. A lot of drug discovery organizations, particularly the larger ones, use (or are tempted to use) such criteria to rank compounds and candidates, and many of us are personally carrying such things around in our heads. But if these rules aren’t valid – and this work certainly makes it look as if they aren’t – then we should stop pretending as if they are. That throws us back into a world where we have trouble distinguishing troublesome compounds from the good ones, but that, it seems, is the world we’ve been living in all along. We’d be better off if we just admitted it.