

A Call For Better Mouse Studies

Here’s an article by Steve Perrin, at the ALS Therapy Development Institute, and you can tell that he’s a pretty frustrated guy. With good reason.
[Chart: ALS TDI replication attempts of published mouse efficacy results for putative ALS drugs]
That chart shows why. Those are attempted replications of studies of putative ALS drugs, and you can see that there’s a bit of a discrepancy here and there. One problem is poorly run mouse studies, and the ALS TDI has been trying to do something about that:

After nearly a decade of validation work, the ALS TDI introduced guidelines that should reduce the number of false positives in preclinical studies and so prevent unwarranted clinical trials. The recommendations, which pertain to other diseases too, include: rigorously assessing animals’ physical and biochemical traits in terms of human disease; characterizing when disease symptoms and death occur and being alert to unexpected variation; and creating a mathematical model to aid experimental design, including how many mice must be included in a study. It is astonishing how often such straightforward steps are overlooked. It is hard to find a publication, for example, in which a preclinical animal study is backed by statistical models to minimize experimental noise.

All true, and we’d be a lot better off if such recommendations were followed more often. Crappy animal data is far worse than no animal data at all. But the other part of the problem is that the mouse models of ALS aren’t very good:

. . .Mouse models expressing a mutant form of the RNA binding protein TDP43 show hallmark features of ALS: loss of motor neurons, protein aggregation and progressive muscle atrophy.
But further study of these mice revealed key differences. In patients (and in established mouse models), paralysis progresses over time. However, we did not observe this progression in TDP43-mutant mice. Measurements of gait and grip strength showed that their muscle deficits were in fact mild, and post-mortem examination found that the animals died not of progressive muscle atrophy, but of acute bowel obstruction caused by deterioration of smooth muscles in the gut. Although the existing TDP43-mutant mice may be useful for studying drugs’ effects on certain disease mechanisms, a drug’s ability to extend survival would most probably be irrelevant to people.

A big problem is that the recent emphasis on translational research in academia is going to land many labs right into these problems. As the rest of that Nature article shows, the ways for a mouse study to go wrong are many, various, and subtle. If you don’t pay very close attention, and have people who know what to pay attention to, you could be wasting time, money, and animals to generate data that will go on to waste still more of all three. I’d strongly urge anyone doing rodent studies, and especially labs that haven’t done or commissioned very many of them before, to read up on these issues in detail. It slows things down, true, and it costs money. But there are worse things.
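The sample-size point in those guidelines can be made concrete with the standard power calculation. Here's a minimal sketch, using the normal approximation to the two-sample t-test; the effect sizes, alpha, and power below are assumed for illustration, not taken from the article:

```python
# Sample size per arm for a two-group mouse study, via the normal
# approximation to the two-sample t-test. All inputs are assumed
# values for illustration.
import math
from statistics import NormalDist

def mice_per_group(effect_size, alpha=0.05, power=0.80):
    """Animals per arm needed to detect a standardized difference
    (Cohen's d) at the given significance level and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

print(mice_per_group(0.8))  # "large" effect: 25 mice per group
print(mice_per_group(0.5))  # moderate effect: 63 mice per group
```

Even a drug with a large effect needs around 25 animals per arm at 80% power, which is why the ten-mice-per-group default that shows up in so many papers is badly underpowered for anything subtle.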

19 comments on “A Call For Better Mouse Studies”

  1. annon too says:

    Hey, it generates manuscripts.

  2. Kelvin Stott says:

    Very concerning indeed, but on a positive note, this is a great example of how an institution can lay down the foundation of how to test drugs in animal models the right way, which others can adopt. First get the animal models right, then hopefully the drugs will follow, while avoiding so much wasted hope and resources on false positives.

  3. Anonymous says:

    I’ve worked in pharmacology and in vivo modeling for years and I find this paper less than compelling. Some animal models are excellent predictors of clinical outcomes (e.g., the OVX rat for osteoporosis, the db/db mouse for diabetes). Apparently TDP43-mutant mice are a lousy predictor for ALS. Does this mean the studies were poorly conducted? I don’t know – but the author doesn’t provide particularly convincing evidence. Even the included graph (without error bars!) doesn’t really address this issue. It just suggests the author couldn’t replicate a few selected studies. How were the studies selected? Are they representative of all published studies or did he cherry-pick the publications with the biggest response? Ultimately I’m sure that ALS models are not good, but reducing the noise or studying tissue distribution of compounds isn’t going to turn a bad model into a good one.

  4. Robb says:

    A graph with no error bars in an article about scientific rigour. Sigh.
    I was at a talk last spring where the presenter asked the audience (mostly basic immunologists) how they chose the number of animals to use for a study. Do a power calculation? (No hands) Ten animals? (most of the audience).
    That might be okay for exploratory basic science studies, but any kind of translation should require replication with at least some of the rigor of a human study.

  5. MoMo says:

    Many of the compounds, if not all, have already been through ALS human trials so what is the point of this study?
    If you think neurodegeneration is due to one singular cause or target (SOD1 anyone?) you are either naive or a biologist.

  6. luysii says:

    The situation in stroke is even worse, and probably results, in part, from similar problems with animal models.
    Almost 25 years ago, some 25 different compounds of ‘proven efficacy’ in treating focal and global brain ischemia in animal models had failed in human trials [ Stroke vol. 21 pp. 1 – 3 ’90 ]. I stopped reading the experimental stroke literature in ’95 for this reason. The last time I looked (at least 10 years ago) the number of failed human trials of treatments that had worked in animal stroke was up to 65.
    I’m also far from convinced that TPA offers anything for stroke.

  7. exGlaxoid says:

    I learned years ago that if you use a small sample size, no or poor controls, and a noisy data source, the odds of getting a “positive” finding were at least 50%, much like the odds of picking red or black in roulette. So if you are trying to publish or sell your biotechnology company, those are the keys to success.
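That roulette figure is about right once multiple endpoints enter the picture: with k independent readouts each tested at p < 0.05 and no real drug effect, the chance of at least one “positive” is 1 − 0.95^k. A back-of-the-envelope sketch (the endpoint counts are illustrative):

```python
def false_positive_chance(k, alpha=0.05):
    """Probability of at least one p < alpha among k independent
    endpoints when the drug does nothing at all."""
    return 1 - (1 - alpha) ** k

for k in (1, 5, 10, 14):
    print(f"{k:>2} endpoints: {false_positive_chance(k):.0%}")
# 1 endpoint: 5%; 5: 23%; 10: 40%; 14: 51%
```

Fourteen uncorrected readouts — body weights, grip strength, a few histology scores, survival — and a “hit” really is a coin flip.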

  8. Anton says:

    I have a lot to say about this topic, since it is my area of expertise. But why bother, since Derek seems to delete all my comments. Censorship is bad, DL.

  9. Derek Lowe says:

    #8 Anton – I delete the ones with gratuitous ad hominem comments (or should I say ad feminem). If your comments deal with your scientific expertise, they’re always welcome.

  10. Kip Guy says:

    Wowsa, I thought we had it bad in oncology modeling. Underlying issues of power calculations, adherence to blinding strategies, and careful attention to PK/PD correlations are all very real. Another real problem: the primary driver in academia is the first paper on the model, not a careful study of its predictive power. And then there’s the whole curing-mice thing. Still, it would be useful to focus this debate on how academia could realistically do a better job in developing and vetting models.

  11. Max says:

    I agree with the anon comment #3 in that some mouse models are just plain bad, while others are decent. I’ve never worked with ALS, but I’ve worked with two neurodegenerative models along with other human disease mouse models.
    As for the linked paper, however, I do agree with their suggestions, but having done mouse studies in four different med school/hospital settings, I would add a few confounding factors that would still make things not so straightforward even if those suggestions were all implemented:
    1) Not all animal facilities are the same: some have different cage sizes, and as such the number of animals in a cage can vary. The bedding type and food access (distance from the bottom of the cage) can also be different. Differences in bedding, the number of cage mates, and so on may also lead to differences in mouse body temp. Some facilities have cages connected to air supply vents while others do not. Noise levels (construction, cleaning going on, people going in and out of rooms) can be different. Despite attempts at control of room temp and light cycle, these can vary (doors opening to hallways, humidity in the building, and so on). Some facilities like to have stimulation for the mice in the cages, like the plastic igloos, while others do not use them.
    2) Survival studies are extremely difficult to run in every facility I’ve worked in (I don’t know if this is different in industry animal facilities). Try getting IACUC approval with death as an end point; it is almost never going to happen. Deciding when to sacrifice a mouse can be difficult especially if there are multiple animal facility techs that also check on the mice, as all of them may have their own opinion on when a mouse is too sick and needs to be euthanized whether you agree or not. For models with relatively short lifespan this can be a big issue in maintaining consistency in survival assessment.
    3) Drug administration can really change results. Even if you give the same concentration of drug, how the drug is suspended (vehicle type) and the means and timing by which it is given can all generate differences in results.

  12. SAR screener says:

    @9 In which case, you might want to tidy up the comments in the recent GSK blog. There’s a nice ad feminem post in the comments.

  13. Red Fiona says:

    @11 – Some foolish questions since the nearest I get to animal work is E. coli, but are there not standard mice per cage or per square meter of space rules or something like that?

  14. Anonymous says:

    @13 – There are some general guidelines that most IACUC follow, but there is a great deal of variability as @11 pointed out. And this is just a short list, many more could be added.
    The general tone for many of the comments on this and other websites has been – ‘wow, what shoddy research!’ I don’t see the evidence to support this belief. Looking at the graph couldn’t one also conclude that ALS TDI did poor quality work? Perhaps the prior studies were thorough but the mice just don’t respond the same as humans.
    In the absence of a thorough statistical comparison of the ALS TDI results with previous data, it’s impossible to conclude. Which is somewhat ironic, given the topic of the article.

  15. Anonymous says:

    I think the last paragraph of the article hits the mark.
    “Public and private agencies should fund characterization studies as a specific project… This is unglamorous work that will never directly lead to a breakthrough or therapy, and is hard to mesh with the aims of a typical grant proposal or graduate student training programme.”
    To be able to perform better studies with well characterized animal models, someone will need to support the effort to develop and characterize such models. Congress seems to be doing an excellent job dropping the ball on this, so what recourse is left? Clearly I’m not an industry person, so I’m curious to know how much development and validation of animal models happens in the private sector.

  16. Immunoldoc says:

    @3, I think you missed the point. The animal model, when done rigorously, perfectly predicted the clinical failures. This is not an article about poor mouse models of human disease per se, but rather the way they are used. To the other readers who are commenting on the lack of stats: I note that this is an opinion piece, not a peer-reviewed scientific report…

  17. Anonymous says:

    @16, I think you missed @3’s point. The author has essentially disparaged published studies claiming they didn’t perform the experiment as rigorously. Very possibly true, but that wasn’t really demonstrated.

  18. Max says:

    @13 There are guidelines that institutional IACUCs follow, which basically say you can have so many mice in a certain size cage. One institution I was at allowed a maximum of 3 mice per cage and another 5, as they used different cage types. These maximum numbers also assume the mice are all the same sex; if there are mixed sexes you have to have fewer mice in the cage, which gets further complicated if there are pups being born. Additionally, if male mice aren’t littermates they often can’t be housed together, as they will fight other male cagemates, whereas this isn’t often a problem with housing non-littermate females. If cagemates constantly fight, or groom others, they may get moved to a separate cage, which could occur in the middle of an experiment. As researchers typically get charged per cage per day, housing cost can play a role in experimental design.

  19. Andrew says:

    To the posters wondering about the lack of error bars, I noticed this too but I wonder, isn’t “percent survival” pretty absolute? What would error bars show in this case?
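Percent survival from a finite cohort is a binomial estimate, so error bars there would show sampling uncertainty, and with typical group sizes it is substantial. A minimal sketch with assumed numbers (a normal-approximation interval; the 6-of-10 figure is illustrative, not from the article):

```python
import math

def survival_ci(survivors, total, z=1.96):
    """Approximate 95% confidence interval for a survival fraction
    (normal approximation to the binomial), clipped to [0, 1]."""
    p = survivors / total
    se = math.sqrt(p * (1 - p) / total)
    return max(0.0, p - z * se), min(1.0, p + z * se)

low, high = survival_ci(6, 10)
print(f"6/10 survivors: 60%, 95% CI {low:.0%} to {high:.0%}")  # ~30% to ~90%
```

With ten mice per group, a reported 60% survival is consistent with anything from roughly 30% to 90% — which is exactly why the missing error bars matter.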

Comments are closed.