A new paper in PLoS Biology looks at animal model studies reported for the treatment of stroke. The authors use statistical techniques to try to estimate how many have gone unreported. From a database with 525 sources, covering 16 different attempted therapies (which together come to 1,359 experiments and 19,956 animals), they find that only a very small fraction of the publications (about 2%) report no significant effects, which strongly suggests that there is a publication bias at work here. The authors estimate that there may well be around 200 experiments that showed no significant effect and were never reported, whose absence would account for around one-third of the efficacy reported across the field. In case you’re wondering, the therapy least affected by publication bias was melatonin, and the one most affected seems to be administering estrogens.
I hadn’t seen this sort of study before, and the methods they used to arrive at these results are interesting. If you plot the precision of the studies (Y axis) versus the effect size (X axis), you should (in theory) get a triangular cloud of data, an inverted funnel. As the precision goes down, the spread of measurements across the X-axis increases, and as the precision goes up, the studies should start to converge on the real effect of the treatment, whatever that might be. (In this study, the authors looked only at reported changes in infarct size as a measure of efficacy against stroke.) But in many of the reported cases, the inverted-funnel shape isn’t symmetrical – and every single time that happens, it turns out that the gaps are on the left-hand side of the triangle, the not-as-precise and negative-effect regions of the plots. This doesn’t appear to be just due to less-precise studies tending to show positive effects for some reason – it strongly suggests that there are negative studies that just haven’t been reported.
The authors point out that applying their statistical techniques to reported human clinical studies is more problematic, since smaller (and thus less precise) trials may well involve unrepresentative groups of patients. But animal studies are much less prone to this problem.
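To see why a lopsided funnel points to missing studies, here’s a minimal simulation sketch. It is not the authors’ method, and everything in it is an assumption made for illustration: the true effect size, the range of standard errors, the 1.96 significance cutoff, and the 20% chance that a non-significant study gets published anyway. It generates a full set of hypothetical studies, filters them the way a biased literature might, and compares the inverse-variance weighted average before and after.

```python
import random

def simulate_studies(n, true_effect, rng):
    """Return (effect, standard_error) pairs for n hypothetical studies."""
    studies = []
    for _ in range(n):
        se = rng.uniform(0.05, 0.5)          # smaller se = more precise study
        effect = rng.gauss(true_effect, se)  # observed effect scatters around the truth
        studies.append((effect, se))
    return studies

def pooled_effect(studies):
    """Inverse-variance weighted mean: precise studies count for more."""
    weights = [1.0 / se ** 2 for _, se in studies]
    total = sum(w * effect for w, (effect, _) in zip(weights, studies))
    return total / sum(weights)

def publish(studies, rng, p_publish_null=0.2):
    """Keep every significantly positive study; keep the rest only sometimes.

    p_publish_null is an assumed, made-up publication rate for studies
    that did not reach significance.
    """
    kept = []
    for effect, se in studies:
        significant = effect / se > 1.96
        if significant or rng.random() < p_publish_null:
            kept.append((effect, se))
    return kept

rng = random.Random(0)  # fixed seed so the run is repeatable
all_studies = simulate_studies(1000, true_effect=0.2, rng=rng)
published = publish(all_studies, rng)

print("studies run:", len(all_studies), "published:", len(published))
print("pooled effect, all studies:", round(pooled_effect(all_studies), 3))
print("pooled effect, published only:", round(pooled_effect(published), 3))
```

The dropped studies are mostly the imprecise ones with small or negative effects, which is exactly the left-hand corner of the funnel, so the published-only average overstates the effect.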
The loss of experiments that showed no effect shouldn’t surprise anyone – after all, it’s long been known that publishing such papers is just plain harder than publishing ones that show something happening. There’s an obvious industry bias toward only showing positive data, but there’s an academic one, too, which affects basic research results. As the authors put it:
These quantitative data raise substantial concerns that publication bias may have a wider impact in attempts to synthesise and summarise data from animal studies and more broadly. It seems highly unlikely that the animal stroke literature is uniquely susceptible to the factors that drive publication bias. First, there is likely to be more enthusiasm amongst scientists, journal editors, and the funders of research for positive than for neutral studies. Second, the vast majority of animal studies do not report sample size calculations and are substantially underpowered. Neutral studies therefore seldom have the statistical power confidently to exclude an effect that would be considered of biological significance, so they are less likely to be published than are similarly underpowered “positive” studies. However, in this context, the positive predictive value of apparently significant results is likely to be substantially lower than the 95% suggested by conventional statistical testing. A further consideration relating to the internal validity of studies is that of study quality. It is now clear that certain aspects of experimental design (particularly randomisation, allocation concealment, and the blinded assessment of outcome) can have a substantial impact on the reported outcome of experiments. While the importance of these issues has been recognised for some years, they are rarely reported in contemporary reports of animal experiments.
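The point in that passage about positive predictive value can be made concrete with a standard back-of-the-envelope calculation. This is only a sketch: the function name and the input numbers (statistical power, and the fraction of tested therapies that really work) are my own illustrative assumptions, not figures from the paper.

```python
def positive_predictive_value(power, alpha, prior):
    """Fraction of 'significant' results that reflect a real effect.

    power: chance that a study detects an effect that really exists
    alpha: false-positive rate of the significance test
    prior: assumed fraction of tested hypotheses that are actually true
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)

# Hypothetical inputs: a well-powered field versus an underpowered one.
print(positive_predictive_value(power=0.8, alpha=0.05, prior=0.5))
print(positive_predictive_value(power=0.2, alpha=0.05, prior=0.1))
```

With these made-up inputs, the first case comes out near 94% and the second near 31%, which shows how low power and long-shot hypotheses can drag the value of a “significant” result down.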
And there’s an animal-testing component to these results, too, of course. But lest activists seize on the part of this paper that suggests that some animal testing results are being wasted, they should consider the consequences (emphasis below mine):
The ethical principles that guide animal studies hold that the number of animals used should be the minimum required to demonstrate the outcome of interest with sufficient precision. For some experiments, this number may be larger than those currently employed. For all experiments involving animals, nonpublication of data means those animals cannot contribute to accumulating knowledge and that research syntheses are likely to overstate biological effects, which may in turn lead to further unnecessary animal experiments testing poorly founded hypotheses.
This paper is absolutely right about the obligation to have animal studies mean something to the rest of the scientific community, and it’s clear that this can’t happen if the results are just sitting on someone’s hard drive. But it’s also quite possible that for even some of the reported studies to have meant anything, they would have had to use more animals in the first place. Nothing’s for free.