What If Those Wonderful Results Are Wrong?

Venture-capital guy Bruce Booth has a provocative post, based on experience, about how reproducible those papers are that make you say “Someone should try to start a company around that stuff”.

The unspoken rule is that at least 50% of the studies published even in top tier academic journals – Science, Nature, Cell, PNAS, etc… – can’t be repeated with the same conclusions by an industrial lab. In particular, key animal models often don’t reproduce. This 50% failure rate isn’t a data free assertion: it’s backed up by dozens of experienced R&D professionals who’ve participated in the (re)testing of academic findings. This is a huge problem for translational research and one that won’t go away until we address it head on.

Why such a high failure rate? Booth’s own explanation is clearly the first one to take into account – that academic labs live by results. They live by publishable, high-impact-factor-journal results, grant-renewing tenure-application-supporting results. And it’s not that there’s a lot of deliberate faking going on (although there’s always a bit of that to be found), as much as there is wishful thinking and running everything so that it seems to hang together just well enough to get the paper out. It’s a temptation for everyone doing research, especially tricky cutting-edge stuff that fails a lot of the time anyway. Hey, it did work that time, so we know that it’s real – those other times it didn’t go so smoothly, well, we’ll figure out what the problems were with those, but for now, let’s just write this stuff up before we get scooped. . .
Even things that turn out to be (mostly) correct often aren’t that reproducible, at least, not enough to start raising money for them. Booth’s advice for people in that situation is to check things out very carefully. If the new technology is flaky enough that only a few people can get it to work, it’s not ready for the bright lights yet.
He also has some interesting points on “academic bias” versus “pharma bias”. You hear a lot about the latter, to the point that some people consider any work funded by the drug industry to be de facto tainted. But everyone has biases. Drug companies want to get compounds approved, and to sell lots of them once that happens. Academic labs want to get big, impressive publications and big, impressive grants. The consequences of industrial biases and conflicts of interest can be larger, but if you’re working back at the startup stage, you’d better keep an eye on the academic ones. We both have to watch ourselves.

43 comments on “What If Those Wonderful Results Are Wrong?”

  1. Hap says:

    With academics patenting more as well (universities want the licensing money), “Follow the money” might be the appropriate aphorism. People will do whatever gets them paid (or power, or status), in business or academics.
    It would be helpful to have data (rather than examples) on irreproducibility – if it is happening, it could be “optimistic results” from the authors (bias), insufficiently specified protocols in the papers, or something else, and knowing which would help to figure out what to do about it.

  2. You're Pfizered says:

    This becomes more of an interesting factoid given how the pharmaceutical industry is doing more and more collaborating with academic groups on early stage stuff.
    I’m sure every company has folks out there scouring the academic labs for the next hot target and/or technology. Won’t they be surprised that once the checks are written the science doesn’t quite pan out the way it was described in a PowerPoint presentation…
    Maybe our CRO collaborators will be able to duplicate this dubious work with greater ease…

  3. Iridium says:

    At my small start up we currently use the mouse R6/2 model for some of our neuro-D programs. To date, every chemical entity that has been published as being efficacious in this model has failed to work in our hands as a control compound. Unfortunately, all of the chemical entities that are published as working in this model have typically only been studied in the model once. We have also had similar experiences with assays and/or chemical entities as controls in assays. The percentage quoted by Booth seems staggeringly high, but feels quite real based on our experiences to date. Quite scary to think about at times.

  4. gyges says:

    If you’re looking for an example, something that everyone thought should’ve worked but didn’t, check out this taxol prep.

  5. strayxray says:

    If you’re in academics and your results don’t hold up to further scrutiny, you’ve already gotten the grant or tenure to keep you going.
    If you’re in industry and your results don’t hold up to further scrutiny, you’ve lost millions of dollars and likely many of your team’s jobs.
    Every published result could be investigated further and tested more extensively, but resources are always limited. I imagine there is greater financial motivation in industry for pouring in more money and manpower to make doubly sure that result is solid, whereas in academics you’ve usually only got enough time and money to do the initial experiments that get put in the publication.

  6. Steve says:

    Timely article in light of the decision of NIH to dive into translational medicine…

  7. Kevin says:

    It isn’t limited to pharma either – I see it in other areas too. I suspect part of it comes from taking work rejected by one journal and going down a tier. You can almost always find someone to publish something.

  8. gyges says:

    Another point to bear in mind is that thinking that something will work adds value to the company. The value stays until the technology is found not to work.
    During the period from announcing ‘we have this brand new technology which allows us to tap into a billion USD market’; and, finding out that the technology is rubbish, big bucks can be made since the stock price roller-coasters.
    Up, up, up and down, down, down.
    Both money making opportunities if you know how to play the market.
    The upshot of all of this is that we have a process of rumour mongering, where the rumour is based upon a published paper, which makes it all the more valuable.

  9. johnnyboy says:

    @2 (You’re Pfizered): I really, really resent your suggestion that CROs would duplicate bad results with greater ease, i.e. make stuff up.
    If you work in big pharma, you probably wish that lay people would respect the kind of research you do. I wish you would extend the same respect to people like me who work in CROs.

  10. BioBrit says:

    I heard this suggestion a few days ago (I can’t take credit myself). We, as in the journals, insist on standardized chemistry elucidation experiments – NMR, elemental analysis etc, in order to publish. This helps us compare across research labs – like vs like. A (worthy but lofty) goal would be to insist on the same in the biological sciences – standardized assays and animal models. That would help us compare like vs like.
    Of course there would have to be a place for method development papers, just as there is in chemistry. But, want to publish your brand new ligand for a disease? New pathway hypothesis? You’d better demonstrate it using standard models within defined experimental criteria.
    I’m sure many would claim this is unattainable and would inhibit their research. That may be true, but it’s worth considering. Getting all those elemental analyses was a pain in the butt for me too.

  11. Nick K says:

    Wouldn’t it be better for everyone, academics included, if the 50% of specious, irreproducible work were not published in the first place?

  12. HelicalZz says:

    I think this is partly affected by a lot of ‘model optimization’ that goes into a lot of research. Such efforts (being intentionally very general here) do tend to make models more robust, and provide for clearer, more differentiated data signals. But those results can tend to be less impressive in more general models.
    “If the new technology is flaky enough that only a few people can get it to work, it’s not ready for the bright lights yet. ” I think flaky is too strong, but not robust across platforms is probably pretty common, with the result being not often ‘failure’ as much as a less distinct / differentiated signal or effect.
    So yeah, more confirming in outside labs should occur.

  13. You're Pfizered says:

    Maybe I should give CROs more credit because in nearly all of the cases that I’ve been involved with, we are teaching our foreign CRO counterparts how to do drug discovery.
    On the process side, that’s a different story. If that’s where you work, I apologize for the unintended slight. It was meant more tongue-in-cheek, but with more and more of our jobs going to CROs that we’re showing the ropes to, you can understand a certain amount of implied bitterness…

  14. NJBiologist says:

    @13 You’re Pfizered: Perhaps you shouldn’t make a habit of sending your money to the lowest bidder. The CRO I work for preferentially hires former Pharma scientists; we’ve got a pair of 35+ year industry veterans working for us, one with a PhD and one without. Other CROs I’ve talked to do likewise.

  15. johnnyboy says:

    Perhaps the distinction should be made between CROs in developed vs. developing countries. In my field (safety assessment), from what I’ve seen from working on both the pharma and the CRO sides, the expertise in CROs (North American and European) is equal to or superior to that of their pharma counterparts. I can’t speak for what goes on in developing countries, however.

  16. Thomas McEntee says:

    @10 BioBrit: There is the “Organic Synthesis” model in which submitted preps of chemicals are actually checked for reproducibility and other important factors resulting in highly-detailed procedures. The impetus for the Org Syn process was, of course, the unavailability in the US of German-made chemicals after WWI commenced in Europe nearly, egads!, 100 years ago.

  17. Hap says:

    #14: That assumes that YP’s management actually cares about such things – it doesn’t look like a lot of companies’ managements look any further than when their next options vest.
    To quote Dilbert, “Freedom’s just another word for not caring about the quality of your work.”

  18. Ricardo Ros says:

    The work from CROs in India and China tends to be close to atrocious; the same cannot be said for CROs in Russia, Northern Europe and the USA, which provide very good service.
    I have had to train many scientists, mostly in India, and have some horror stories – not only about falsifying analytical data, but about lying about purity, concentration… and this is chemistry; I don’t even want to go into biological screening.
    But at the end of the day, one gets what one pays for.

  19. chris says:

    I’m sure many have examples of things that have not been reproducible, a few that come to mind:
    Filtering samples prior to iv injection, apparently the academic lab was happy to inject suspensions.
    The drug has to be given before the insult (stroke model).
    Only one person in the lab has the surgical skills to get reproducible occlusions.
    Compounds given iv with a “quick push” cause haemolysis which influences the readout.

  20. Commissar says:

    “…at least 50%”
    and headed to 99.999999%…Glorious!
    “We pretend to work and they pretend to pay us”

  21. Chrispy says:

    One of my biggest shocks, upon coming to a large biotech/pharma from an academic background, was how little of what was published is reproducible. Time and time again, results (especially animal results) in top-tier journals cannot be reproduced. At first, I suspected that people at my new company were incompetent. Now I realize that they are simply more careful than the original researchers.

  22. HTSguy says:

    I’ve also been on the receiving end of this more than I care to remember (in my case they were molecular interactions, not animal models). As with Chrispy, each time we would first try to determine why we were incompetent. Only after attending scientific meetings and finding out that everyone else had similar experiences did the truth dawn.

  23. dearieme says:

    “And it’s not that there’s a lot of deliberate faking going on”: ho hum. Untruthfulness may not always qualify as “faking” but it usually qualifies as suggestio falsi or suppressio veri.

  24. Anonymous BMS Researcher says:

    We also have seen plenty of published results that are challenging to replicate. Even when we can replicate them, it sometimes turns out we need to optimize various parameters not mentioned in the paper to make it work reliably.
    But even for standard teaching exercises there’s a lot of lore nobody ever writes down; when I was a TA back in the 1980s, before having the undergraduates do each exercise we’d all get together and try it ourselves. Usually those who had taught the class before would warn us about many little tricks not mentioned in the manual.

  25. imatter says:

    The worst part is that “it” got patented before it got published.

  26. Malcolm says:

    I would think there’s no need to assume overenthusiasm on the part of academic labs; simple selection bias between competing researchers does the statistical work of pushing fluke findings to the front of the queue.
    This was covered nicely by John Ioannidis’ Why Most Published Research Findings Are False.
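
    Malcolm’s selection-bias point lends itself to a quick simulation. The sketch below is illustrative only – the effect size, sample sizes, and 5% publication cutoff are invented numbers, not data about any real field. It has a thousand labs run the same small study of a real but modest effect, lets the “journals” publish only the most striking results, and then replicates each published finding once. The published effects come out far larger than the true one, and the replications land back near the truth, with no fraud anywhere in the model.

    ```python
    import random
    import statistics

    random.seed(0)

    def run_study(true_effect, n=10):
        """One small two-arm study: return the observed treatment effect."""
        control = [random.gauss(0.0, 1.0) for _ in range(n)]
        treated = [random.gauss(true_effect, 1.0) for _ in range(n)]
        return statistics.mean(treated) - statistics.mean(control)

    TRUE_EFFECT = 0.3          # a real but modest effect (invented number)
    N_LABS = 1000              # many labs chasing the same question

    observed = [run_study(TRUE_EFFECT) for _ in range(N_LABS)]

    # The "journals" publish only the most striking 5% of results.
    published = sorted(observed)[-N_LABS // 20:]

    # An industrial lab replicates each published finding once, same protocol.
    replications = [run_study(TRUE_EFFECT) for _ in published]

    print(f"true effect:             {TRUE_EFFECT:.2f}")
    print(f"mean published effect:   {statistics.mean(published):.2f}")
    print(f"mean replication effect: {statistics.mean(replications):.2f}")
    ```

    Run it a few times without the fixed seed and the gap persists: it is pure regression to the mean, driven by the publication filter rather than by anyone’s misconduct.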

  27. Anonymous says:

    The same lax training in academics slowly transfers into industry. CROs can be dicey: they get paid per compound, and they know it’s too much work to double-check everything. Process work is different, because there they are on the hook for everything.
    I actually blame the erosion of scientific integrity in academics. Look in the Supporting Information: 10 mg reactions, calibrated GC yields… do I see the GC traces? No.
    The fault lies with the current system, which forces people to play this dirty game. Hence you see the most competitive countries suffer from dishonesty the most.
    As a scientist I find that we are actually wasting the majority of taxpayer money on useless, highly irreproducible crap, and training a lot of mediocre people for jobs that are not there.
    And before you criticize me for being all high and mighty: my papers WERE accurate and reproducible, but I never considered anything I put out “ground-breaking”.
    Would you fund your own research with your own money??

  28. Anonymous says:

    Here is the solution:
    Publish irreproducible results in a paper, and all the authors are banned from publishing in that journal again.

  29. cliffintokyo says:

    Perhaps there is a MESSAGE here for what NIH needs to focus on in helping out with bridging the *valley of death*, i.e. the in vivo/ pre-clinical POC studies.
    Is there also an element of spreading ‘confusion to your enemies’ (i.e. competitors, in the broadest sense) in academic research publishing?

  30. Anthony says:

    The comment “Would you fund your own research with your own money??” is a potent way to self-examine not only the importance of research (industrial or academic) but the quality to which it is done.
    (It dramatically misses the point that most research is beyond the income of the researcher but reality is a failing of most thought experiments)

  32. A Computationalist says:

    And you people have the gall to laugh at us? The shame…
    At least we don’t burn through stacks of hundred dollar bills on a daily basis.

  33. Biologist says:

    Reading these comments, it strikes me that there is very little reflection about the deeper nature of biological systems. One of the main biological attributes is phenotypic variation (not genetic variation). In the hunt for significant p-values and a nice hypothesis-driven story to publish, biomedical and basic research (industrial or academic) is ever perfecting its models to show exactly what the researchers want them to show. The main theme in most biomedical research is to ‘eradicate’ variation within the lab so as to increase sensitivity, reproducibility and, of course, the niceness of the p-values (it helps to increase the publishing rate too). This is done with the purpose of reporting results that are supposed to be generally representative: that is, we remove all variation in order to explain all variation. Industry’s lack of results in its hunt for all-encompassing blockbuster drugs is an equally good example of this fallacy. So yes, of course you can’t repeat most of the stuff coming out of other labs.

  34. Rick says:

    Out of scientific integrity, it’s important to be circumspect when dealing with the touchy subject of experimental reproducibility. In our right-minded pursuit of the fraud and incompetence that lead to some irreproducibility, let’s not throw out the baby with the bathwater. Sometimes – I don’t know how often, because I haven’t seen a thoughtful review of the subject – irreproducibility results from a previously overlooked subtlety of the system, and figuring out that subtlety leads to a more valuable (intellectually, and sometimes financially) understanding of the system.
    One example I know quite well is the discovery of Sequenase, the enzyme used for DNA sequencing at the beginning of the genome-sequencing era. Other labs could not reproduce the results of the originators, whereas the originators reproduced them flawlessly and repeatedly, as did scientists who visited the lab or received the enzyme from the originators. Figuring out the source of the irreproducibility became the dissertation project of a very talented graduate student, who discovered that incredibly tiny concentrations of iron in the lab’s distilled water (I think it was in the pM range) combined with small amounts of EDTA – a metal chelator commonly used in enzyme storage buffers because it inhibits the action of metalloproteases – to form a potent free-radical generator that modified a single specific amino acid on the T7 DNA polymerase and increased its processivity, which is the salient feature of Sequenase. It took many years, FAR longer than the time horizon of any investor I know of, to elucidate this mechanism and make use of it. But once that was done, we had a way to make a DNA polymerase ideally suited for DNA sequencing, without which sequencing the human genome would have taken far, far longer than it actually did. I hope most of us would agree that it was worth the extra effort.
    In our desire to root out and punish irreproducibility, and in our hubris in thinking we can coerce nature to reveal her most valuable secrets on a financial analyst’s time scale, this story serves as a potent example of the value of UNDERSTANDING the sources of experimental variation. Sometimes it pays to dig a little deeper into sources of “error”, because doing so could reveal deeper, far more valuable knowledge. Indeed, if you really understand the history of science, you will understand that some of our greatest discoveries resulted from deviations from neat reproducibility. Of course, this extra effort could also yield really strong evidence of fraud or ineptitude, but isn’t that better than the rumor and innuendo that result from overly hasty attempts to repeat a competitor’s (or “collaborator’s”) results?

  35. HelicalZz says:

    There have been a few comments here about publication, and about bans or retractions based on work that is hard to reproduce. I’d say again that I don’t think the problem is an insidious one, but rather that it is more often the result of internally optimized models. Results reproduce, but not ‘as well as reported’.
    As I noted in the linked blog, publication may be a solution in select cases: a journal that requires (or itself seeks) CRO confirmation of research results prior to publication. That is, it replaces part of peer review with contracted confirmation. This would be expensive, and not a model for all, or even most, research. It could, however, be worthwhile to publish under this type of model when you are presenting research that is looking for outside funding. [Whether or when one would want to publish such research is an entirely different topic.]

  36. Science_is_messy says:

    I have an alternate suggestion: how about not treating the literature as The Truth? The literature is a conversation, not a GLP regulatory submission – nor should it be one.

  37. alf says:

    @36: yes, science is messy – but Bruce’s point is that the results often can’t be reproduced, or the interpretations were incorrect, due to sloppy science. Great: this gives VCs one more excuse not to invest in new ideas. Basically, he was saying buyer beware. Perhaps his perspective might also explain, in part, why some academic scientists feel that there is an abundance of good targets and ideas that industry and investors are ignoring.

  38. Cellbio says:

    In my experience, the failure rate to confirm academic findings over a few decades is much more than 50%. However, I fall back on an early lesson from grad school. On one test, I was asked to name the authors of the work cited, but had no clue, though I could break down the methods and results and interpret them. I was furious, thinking this an ego thing of the Professor types. The prof calmly explained that knowing the lab and the caliber of its work was an important factor in assessing the merit of a single piece of work, and the bias inherent in its thinking. Applied to high-profile academics – Schreiber, for example, or Serhan with the resolvins, or Spiegelman, among others – this lets one see the value of the work while setting aside the profound implications claimed for human suffering. My opinion: best to first filter the results through common human frailties, then dissect the technical merit.
    One point worthy of further discussion is that this low rate of confirmation is widely known, and has been for a long time, but has not driven VCs to value people with accurate technical-assessment ability over hypesters. How many “serial entrepreneurs” have created real value, as opposed to selling a story well enough that the first-round investors do well? I’d be surprised if the rate of confirmation in the VC world is any better than that for high-profile academic papers.

  39. Gambler says:

    I’m with Rick (#34). When dealing with complex biological systems the amazing thing is that so much data does reproduce. Little things do make a difference, figuring out what they are can be tortuous and is mostly not worth the effort. But science is an iterative process, profound findings will be reproduced and the rest is largely forgotten. I think there is value in having the conversation.

  40. Hap says:

    The literature is supposed to be your best shot at the truth – no one has it all, but everyone has pieces, and eventually people are supposed to be able to put them together. The fact that no one paper has all the truth, or is supposed to, doesn’t absolve people from trying their best, and “whatever will get me grant money” does not necessarily count as one’s best. It’s easier to cut authors slack when they make honest mistakes (or mistakes perceived as honest), such as the iron-catalyzed coupling, than when they haven’t done sufficient controls to know whether the result is real (Gaunt? NaH oxidation?).
    In addition, research is often overhyped, and while at least some of that is the misunderstanding of the people writing about it, some authors don’t exactly deny the hype. If you don’t want people to believe that the literature is the Truth, then you shouldn’t hype your results as if they were.

  41. dnarich says:

    I must say that I have become keenly aware of “investigator bias” and lack of data reproducibility through direct experience at a major research center. I’ve worked in pharma, biotech, CRO, and drug-delivery businesses (in scientific, management, and Sr. Director capacities), and a few years ago I transitioned to a BD role in an academic center. In trying to commercialize some of the science, I requested raw data and was “interested” to find that some results conflicting with the principal hypothesis had not been published, while supportive results from the same experiments made it into print. Because the data were critical to the commercialization effort, I asked for additional data in support of the hypothesis, since I wanted to be confident that the non-supporting data were just a spurious result. The PI was most defensive: “What do you want – for me to retract my paper?” (which wasn’t even on my mind). The repercussions implied by the institution – mind you, without my ever alleging any improper data; I was simply questioning whether the data supported the commercial drivers, i.e. the implied therapeutic relevance and effect – have been severe. Fear of a potential loss of reputation, and the desire to protect it, have dominated over the simple objective of questioning the fidelity of the data and suggesting relevant experimental approaches to resolve the questions (this from the perspective of an individual who has managed development of hundreds of molecular tests that required high reproducibility).
    A challenge in the academic setting is that each laboratory often operates as an independent entity – and an outstanding and intelligent PI may be great with the theory, but not have decades of experience with variability of assay methods when employed on an industrial scale. Data reproducibility is critical for industry.
    To the same extent that industry can learn from academics, academics could benefit from the experience of industry. The tragedy is that we seldom find ways to build the bridges and to illuminate each other’s path.

  42. doc says:

    I was awaiting comments to Nick @ 11, who has recommended a sterling solution, but don’t see any. So here goes:
    a) It’s like med school: 50% of what you’re learning is wrong. Unfortunately, we don’t know which half. However, half of what’s correct will be outdated in 5 years, so by the time you’re mid-career, nothing you learned will have been worth the bother. Don’t worry about it; the problem is self-correcting.
    b) Send everything to the Journal of Irreproducible Results. Only publish elsewhere what they reject.
    c) Issue each Ph.D. with tear-off publication coupons; say 50. No tickee, no printee. That would allow one or two really good papers per year of professional life, which is realistically about what can be done (individually, at least) with any quality. If the coupons were anonymous, an interesting – and lucrative – secondary market might develop as well.

  43. John says:

    The “unspoken rule” about lack of evidence has itself no evidence to back it up. Please use some examples to show why 50% might be an accurate number.

Comments are closed.