This is an excellent article, and the title is self-recommending: “Common Pitfalls in Preclinical Cancer Target Validation”. The abstract speaketh the truth:
An alarming number of papers from laboratories nominating new cancer drug targets contain findings that cannot be reproduced by others or are simply not robust enough to justify drug discovery efforts. This problem probably has many causes, including an underappreciation of the danger of being misled by off-target effects when using pharmacological or genetic perturbants in complex biological assays. This danger is particularly acute when, as is often the case in cancer pharmacology, the biological phenotype being measured is a ‘down’ readout (such as decreased proliferation, decreased viability or decreased tumour growth) that could simply reflect a nonspecific loss of cellular fitness. These problems are compounded by multiple hypothesis testing, such as when candidate targets emerge from high-throughput screens that interrogate multiple targets in parallel, and by a publication and promotion system that preferentially rewards positive findings.
Yes, yes, yes, and yes indeed. Anyone working in the area should be ready to shout “Preach, brother!”, and if any of this comes as news, or if it seems overblown, then this is a great opportunity to get more familiar with the problems the article is talking about. To start with, an important distinction is made between reproducibility and robustness. These often get mixed together in discussions of problems with the scientific literature, but we’re (mostly) dealing with the latter. To be technical about this technical subject, a flat-out reproducibility problem means that when every bit of the experiment is done exactly the same way, the result still doesn’t come out as reported. This can be due to fraud, unfortunately, or it could be that the original result was some random-chance thing that just doesn’t repeat. Robustness, on the other hand, means that the experiment will indeed work reproducibly, but only if everything is done right – and by “everything”, one includes variables that even the original authors may not have been aware of, as well as the ones that they kind of knew about but didn’t bother to actually put into the experimental section.
A robust result can probably be reproduced even if you switch to a different buffer, or if your cell lines have been passaged a different number of times, or if the concentration of the test molecule is a bit off, etc. The more persnickety and local the conditions have to be, the less robust your result is, and in general (sad to say) the lower the odds of it having a real-world impact in drug discovery. There are certainly important things that can only be demonstrated under very precise conditions, don’t get me wrong – but when you’re expecting umpteen thousand patients to take your drug candidate and show real effects, your underlying hypothesis needs to be able to take a good kicking and still come through.
The paper also makes some solid philosophical points about how we should be thinking about correlation and causation. For example:
Clear thinking about causation versus correlation is aided by using words that have precise meanings rather than vague terms, such as ‘linked to’ and ‘associated with’, which often create ambiguity (intentionally or not) about whether two things are causally related to one another. Two words that are particularly useful in describing causal relationships are necessity and sufficiency. . .The statement ‘A is necessary for B’ means that if A is not true, then B cannot be true. The statement ‘A is sufficient for B’ implies that if A is true, then B will be true.
Failure to distinguish between necessity and sufficiency can lead to illogical conclusions. For example, when BRAF mutations were first detected in malignant melanoma, I heard it argued by some participants at scientific advisory board meetings that mutant BRAF would not be a good drug target because BRAF mutations are also present in benign naevi. However, the latter observation indicated only that BRAF mutations are not sufficient to cause malignant melanoma. The more important question from a therapeutic perspective is whether BRAF activity is necessary for the maintenance of BRAF-mutant melanomas, which has now been answered affirmatively.
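The logic in that quoted passage is worth making explicit. As a minimal sketch (the `necessary`/`sufficient` helpers and the toy BRAF-style cases are my own illustration, not from the paper), “A is necessary for B” is the implication B → A, while “A is sufficient for B” is A → B – and the BRAF/naevi observation only refutes the second:

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q, i.e. (not p) or q."""
    return (not p) or q

def necessary(a, b):
    """'A is necessary for B': if A is false, B cannot be true (B -> A)."""
    return implies(b, a)

def sufficient(a, b):
    """'A is sufficient for B': if A is true, B is true (A -> B)."""
    return implies(a, b)

# Hypothetical BRAF-style observations, as (mutation_present, malignant) pairs:
# the mutation appears in every melanoma, but also in benign naevi.
cases = [
    (True, True),    # BRAF-mutant malignant melanoma
    (True, False),   # BRAF-mutant benign naevus
    (False, False),  # wild-type, no melanoma
]

nec = all(necessary(a, b) for a, b in cases)   # holds in every observed case
suf = all(sufficient(a, b) for a, b in cases)  # fails on the benign naevus
print(nec, suf)  # -> True False
```

The benign-naevus case falsifies sufficiency while leaving necessity untouched, which is exactly why “the mutation also occurs in benign lesions” was never an argument against the drug target.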
“Necessary but not sufficient” is a very common state of affairs in biology, and anyone working in this field needs to have a well-developed sense of it. That goes not only for complex disease states, but for assay conditions and even for compound SAR trends. We have a lot of multifactor effects in our business, and no shortage of chicken-and-egg questions, and thinking about them as clearly as possible is essential. Here’s another example:
There are many examples in cancer biology of molecular changes that correlate with increased aggressiveness of cancer without necessarily causing the aggressive behaviour. For example, intratumoural hypoxia and the resulting upregulation of the transcription factor hypoxia-inducible factor (HIF) in tumours almost invariably correlates with poor outcomes in patients. . .This could signify that hypoxia and HIF cause some tumours to become more aggressive. Alternatively, it could simply reflect the fact that aggressive tumours outgrow their blood supplies, become hypoxic and therefore upregulate HIF.
The paper goes on to address the issues mentioned in the abstract (such as problems with “down” phenotype readouts and the particular importance of negative controls), and also has an excellent section on rescue experiments. A powerful piece of evidence that you’re onto a real target is when you can show that variants of your target protein that are resistant to your drug candidate actually confer that resistance on cell lines. Similarly, you can take resistant cell lines and sequence them to track down which proteins are responsible for the resistance (which can not only validate your target, but tell you a lot about related biology). If you don’t have such experiments, your case is weaker. If you can’t seem to get them to work, your case may be much weaker. There are still valid reasons why such things might not work out, as the paper details, but you need to consider those explicitly.
I found this section particularly relevant, as will, I think, anyone who reads the literature in this area:
There has been a trend, especially in papers in high-profile journals, towards making far-reaching claims in an attempt to paint a seemingly complete picture that incorporates both new mechanistic insights and the physiological or clinical relevance of a given set of findings. The field would be better served if papers claimed less, but provided more lines of corroborating evidence in support of their conclusions. Describing a properly controlled and complementary set of target validation experiments can easily constitute an entire manuscript. It should not be an afterthought relegated to the last figure of a manuscript, in a gratuitous attempt to achieve in vivo or clinical relevance.
Unfortunately, the reward system we have in place encourages just that sort of behavior, and it’s not going to be easy to change it. We get what we subsidize; we pretty much always do, although it might not be what we thought we were asking for.
Finally, it should be noted that this paper isn’t just a list of reasons why your new cancer target isn’t so great. It also provides some reasons to keep going in the face of a common objection:
For example, the proteasome inhibitor bortezomib has anti-myeloma activity at a concentration that causes a 50–80% decrease in proteasome activity, but is toxic at higher concentrations that more completely block proteasome activity. This last observation underscores the fact that the issue in cancer therapeutics is not whether a target is important or essential in normal cells, but whether the target is more important in cancer cells than it is in normal cells. The degree to which there is a differential requirement for the target in cancer cells relative to normal cells is the biological determinant of the therapeutic window for inhibition of the target. Even in hindsight, it is not clear why most approved cancer drugs, including the above-mentioned imatinib mesylate and bortezomib, have therapeutic windows. This question is even more perplexing for many classic cytotoxic agents, including DNA-alkylating agents and microtubule poisons.
That’s a very good point, and well worth remembering. We have to try to keep from working on things that aren’t real, and there’s no shortage of those, but we also have to give the real things every chance to work. This is particularly important as you get more experienced in any scientific field. I’ve said many times that I don’t want to be that guy in the back of the conference room who’s always saying “That’s not gonna work”. I mean, sure, 95% of the time, it really isn’t gonna work; that’s how this business goes. But what good does that do anyone, then? I’d much rather be the person who perks up when something comes along that looks like one of the other 5%, and tries to get it to happen. That’s harder – but it’s a lot more worthwhile.
I highly recommend that drug discovery people in any field, not just cancer, give this paper a read. It’s full of extremely sound advice, and reminds us all of what we should be trying to do, and how we should be trying to do it.