
Reproducibility: Crisis or Not?

Here are the results of a Nature survey on reproducibility in the scientific literature. They themselves admit that it’s a “confusing snapshot”, but it shows that we’re still arguing about what “reproducibility” means. 52% of the respondents (out of more than 1,500 scientists surveyed) said that there was “a significant crisis”, though, so this issue is on people’s minds.

Interestingly, chemists were among the most confident in the literature of their own field (physics and engineering as well). The medical literature was considered by its own practitioners to be the least reliable, and I think that order is probably about right. (I wrote in a recent column for Chemistry World about how chemistry might not have it as bad in this regard, though that one wasn’t online yet when this post went up.) At the same time, chemists had the highest proportion of respondents who said that they’d been unable to reproduce someone else’s experiment.

I don’t think that’s necessarily a contradiction, though. Chemistry is a field with lower barriers to replication than many others, and we also probably do more replications in general. Biology goes out of date more quickly, but chemists think nothing of trying a thirty or forty-year-old reference if it looks like a useful reaction. And one of the other points the article makes is that “failure to reproduce the paper” can range from something that’s a tissue of lies from beginning to end, all the way to “that reaction doesn’t go in as high a yield as they said it does”. Chemistry has plenty of that, for reasons both forgivable and not. I suspect that many of the non-replications that chemists reported were in the venial category, which still leaves room to believe that the literature of the subject itself is largely reproducible.

But sorting discoveries from false leads can be discomfiting. Although the vast majority of researchers in our survey had failed to reproduce an experiment, less than 20% of respondents said that they had ever been contacted by another researcher unable to reproduce their work. Our results are strikingly similar to another online survey of nearly 900 members of the American Society for Cell Biology. That may be because such conversations are difficult. If experimenters reach out to the original researchers for help, they risk appearing incompetent or accusatory, or revealing too much about their own projects.

I’ve had that exact experience – we had problems with some published work, but we couldn’t say anything about it to the original authors without giving too much away. In the times I have contacted authors, though, my results have been about 50/50. And the responses are bimodal indeed – the only things I’ve ever gotten are helpful responses with suggestions about what might be going on, and complete lack of any response whatsoever. No one’s ever bothered to get defensive – they just don’t reply. Here’s another problem:

A minority of respondents reported ever having tried to publish a replication study. When work does not reproduce, researchers often assume there is a perfectly valid (and probably boring) reason. What’s more, incentives to publish positive replications are low and journals can be reluctant to publish negative findings. In fact, several respondents who had published a failed replication said that editors and reviewers demanded that they play down comparisons with the original study.

That last part is interesting, and a bit unexpected. Any readers have any similar experiences? I guess the famous “Comment On a Paper by Samir Chatterjee” would have a harder time getting published today (if you don’t have access, just the abstract will give you the flavor of the thing). I suspect that John Cornforth bypassed the standard review process for that one!

When asked why we have such problems, the Nature respondents listed a number of very likely causes: selective reporting of results, pressure to publish, insufficient replication inside the lab before publication, lack of statistical power in the first place, and so on. Outright fraud was about halfway down the list. Unfortunately, when people outside of science hear about a reproducibility crisis, that’s what they tend to think we’re talking about.

A reader sent me a link to this piece at First Things, a magazine of religion and philosophy that I have to say that I don’t link to very often. It’s a well-written article about scientific reproducibility, and gets a lot of things right (while getting some subtle things wrong, too, in my opinion). By the end, it’s largely an attack, from a religious perspective, on “scientism”, the tendency to treat science as a religion itself. And there the author has a good point. This irritates sincere religious types a great deal, as it should – I’m not religious at all, and it can sure irritate me. Seeing people who know nothing about a particular field (and especially nothing about statistics) citing some journal article to make some political point, and then acting as if no possible counterargument could then ever be made (“You can’t argue with the settled science! It’s peer-reviewed!”) is scientism all the way – received wisdom, taken as true simply because of its (unexamined) source.

I think that the First Things article overstates the problems in science today, but given all the talk about the state of the scientific literature, it’s easy to see how someone could end up doing that. The scientific enterprise, though, is not yet in danger of crashing down under the weight of its own contradictions. Presumptuous human reason is still having its innings, like it or not. The self-correcting nature of science is not some sort of magic sword, that’s for sure, but it’s still a real weapon, and it looks like the best we have. It may well be that out of the crooked timber of humanity, no straight thing was ever made, but science, at its best, is one of the straighter things going.

51 comments on “Reproducibility: Crisis or Not?”

  1. Hap says:

    I imagine people outside of science rely on peer review or other recognition of scientific merit because they aren’t going to be able to judge the work themselves (they don’t have the information, or the background in the field to fully understand it). If we live in a society that thrives on specialization, then people are going to know lots about some things and little to nothing about others, and so they trust other people’s judgements unless something is important enough to them that they are willing to pay the cost in learning and opportunity to learn about it themselves. (Clancy’s aphorism was that focusing on your own patch was good teamwork when it worked and tunnel vision when it didn’t.)

    Another issue for people is what level of certainty they should take as diagnostic of a reasonably decided matter (in the absence of new evidence, which always gets the last word). It will vary with the cost of consequences and actions, but even with that, it is likely for people to disagree on what level of certainty they need to be confident that something is true.

    People can’t know everything, and so they take shortcuts so they can learn either more in one area or in more areas. I don’t think not taking those shortcuts is possible, so perhaps it’s better to focus on ways to figure out when a shortcut is bad.

    1. Hap says:

      Perhaps the best rule is (proximately) from Neil DeGrasse Tyson – that people should keep in mind that they could be wrong. In most cases, you may have to do something in the absence of (approximate) certainty, but the belief that you cannot be wrong is the root of a whole lot of evil.

      1. Some idiot says:

        Agreed… Whenever I’m asked for my opinion on something by a junior colleague or student, and I think it is wrong (or won’t work, or similar), I set out my reasoning, clearly stating my starting point, and state my opinion. And I almost always finish by saying “but please do me a favour: prove me wrong!” That’s my attitude to things… I like being right, but I _love_ being proved wrong! It just means that there is another dimension to something that I wasn’t previously aware of…

  2. Peter Kenny says:

    I get the impression that experimental psychology gets a disproportionate amount of attention in reproducibility studies. One issue for MedChem reproducibility is that the compounds are not generally available, and any attempt to reproduce published IC50 values would entail huge amounts of work resynthesizing material. That said, a number of influential data analysis studies (e.g. on drug-likeness, compound quality, PAINS, ligand efficiency metrics, attrition in clinical development…) cannot be reproduced because the data is proprietary. Some of the journals have policies on the use of proprietary data, although these appear to be applied selectively. In some data modelling studies, the models themselves are not disclosed; I include a link to a post on that point as the URL for this comment.

    1. KRL says:

      To Peter Kenny – preach it, brother, preach it. It seems that I am forever asking authors of manuscripts submitted to JMC (and elsewhere) to include a reference compound (if available) when generating their stack of IC50s, so as to enable comparison to the wider literature.

      1. Peter Kenny says:

        Obviously it’d be great if authors of MedChem articles did provide samples of compounds, although I was not suggesting that they should be under any obligation to do so. One advantage of sharing samples is that your compounds get tested in a more diverse range of assays and your article will get cited more. However, the ‘finite quantity of sample’ issue doesn’t apply to data, and a lot could be achieved if journals enforced existing rules for making data available. I favor making data available as supplementary information, as this ties the data to the publication and prevents dirty tricks. It’s worth noting that data and models are sometimes not shared even in open access journals, and I occasionally have to remind open access advocates that open access is not equivalent to open science. I have linked a blog post on open access as the URL for this comment.

  3. says:

    The worst field I came across is psychology. After showing that a paper was complete garbage, we could not even initiate talk about getting it retracted. That happened despite the authors surreptitiously changing raw data in a public database after our critical review got published.

    I published in top journals of theoretical physics, chemistry (nanotech) and biology, but never came across a culture so unscientific.

  4. Harrisoin says:

    I think part of the issue with the reproducibility crisis is that people don’t even agree on what it means. I have been part of a couple of efforts around this for biological research. Some people take the attitude of “If you aren’t replicating every little detail of the original study, you haven’t done a replication study.” Others say “One shouldn’t need to duplicate every aspect of the original study, nor would you want to, if a phenomenon is reproducible.” I find it useful to consider this as two separate things. It may be possible to replicate a study by duplicating an experiment precisely. However, the general phenomenon may not be more widely reproducible. If you have to flick the lights 3 times and spin in a circle to get your result, just how biologically relevant is that?

    1. biotechie says:

      This is a great point. One sensible approach is to reproduce the data that support the main claims of a paper. That means you don’t have to reproduce every single experiment in the original, but you do have to provide sufficient evidence to back up the original claims, even if you cannot replicate everything (in the life sciences in particular, technical and biological variability make that very difficult indeed).

  5. G2 says:

    The Cornforth paper mentioned above can be obtained (#48) free at:

  6. CMCguy says:

    From personal experience I know that lack of reproducibility in chemical reactions can often be traced directly to subtle details or techniques that are not adequately presented in the literature, or that are not even well captured in notebooks. Many factors are frequently uncontrolled, or subject to varied standard practices in different labs (things like how the glassware was washed, or a stated room temperature that doesn’t account for the hot, humid day the experiment was performed on), and these can have meaningful influences that are not obvious until the work is repeated. This is where having to work under GMP and doing scale-up can help build awareness of, and definitions around, many considerations that would be ignored in typical bench work. I would suggest that the majority of the general chemical literature is fairly reliable, but it has to be taken as a starting point and is rarely comprehensively adaptable to your application. At the same time, I have encountered a few patents I attempted to duplicate where the procedures appeared to be either highly inaccurate or possibly intentionally misleading, which felt like it went against the spirit, if not the legality, of the patent rules.

  7. Kelvin says:

    I believe there is an overall tendency towards increasing noise to signal as we explore more complex systems with more hidden sources of variability. But to what extent that is contributing to diminishing reproducibility I don’t know. Any thoughts?

    1. tally ho says:

      Hi Kelvin – I second your intuition regarding an increase in noise-to-signal in observations of complex systems.

      Biological assays, given all the (hidden) variables of biological systems, seem particularly prone to problems with reproducibility. They often have to be engineered to enable reproducibility, sometimes at the expense of over-engineering, which raises questions about their relevance to natural conditions. This seems to tie back to (Shannon) information theory and requirements for obtaining reliable signal over noisy channels. Also, Waddington (a forefather of systems biology) had interesting concepts regarding biological physiology and complexity. Depending on the biological state, physiological response could be diffuse or highly focused, given the nexus of coupled (signalling) pathways driving the response – “creodes”. His concepts are still relevant today. Some biological states are more sensitive to perturbation than others – hence reproducibility can be ephemeral and very context dependent.

      Given the professional rewards for investigators to do novel experiments on complex systems, rather than reproducing someone else’s experiments, it’s no wonder that reproducibility is an issue. Reproducibility needs to be incentivized.

  8. Anon says:

    Should the actual results be reproducible? Or just the overall distribution of results? Or just the ultimate conclusion?

  9. Chrispy says:

    Part of the issue stems from a fundamental misunderstanding of statistical significance. When I have asked scientists: “If you redid a study in exactly the same way that originally achieved a P value of .05, what would be the odds that you would achieve that result or better?” (Normal distribution, two-tailed test.) Almost no one gets it right.

    The answer is 50%.

    1. colintd says:

      Absolutely agree that misunderstanding (or choosing to ignore) the real meaning of p values underlies much of this issue.

      This has been covered in many places, but to my mind this article does a great job of illustrating the problem.

      I believe it (or something similar) should be mandatory reading before anyone publishes a paper which claims a discovery…

      1. tally ho says:

        Hi colintd – Thanks for the link to Colquhoun – it’s refreshing to get an old school perspective on a long-standing issue. I also like the title of the opening section (1.1) of his “Lectures on Biostatistics” 1971 – “How to avoid making a fool of yourself. The role of statistics”.

        1. Colintd says:

          It is in many ways sad how little we seem to have progressed in the last 50 years, in terms of equipping researchers with a real understanding of what the figures they quote actually mean.

          I cringe every time I hear someone quote a p value as if that is the chance they are wrong in their assertion. If they do so working for me, they are sent off to read Colquhoun et al. and will often come back saying “why the hell didn’t they tell us this in college?”
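The false-discovery arithmetic that colintd and Colintd are pointing at (Colquhoun’s central argument) can be checked in a few lines. The inputs below – a 10% prior that a tested effect is real, 80% power – are illustrative assumptions, not figures taken from the comments:

```python
# False discovery rate among p < 0.05 results, Colquhoun-style.
# All three inputs are assumed values for illustration only.
prior_true = 0.10  # fraction of tested hypotheses that are actually real
power = 0.80       # chance a real effect yields p < 0.05
alpha = 0.05       # chance a null effect yields p < 0.05

false_positives = alpha * (1 - prior_true)  # nulls that cross the threshold
true_positives = power * prior_true         # real effects that cross it
fdr = false_positives / (false_positives + true_positives)
print(f"{fdr:.0%} of 'significant' results are false discoveries")  # 36%
```

So under these (quite ordinary) assumptions, over a third of “significant” results are wrong, which is a long way from the 5% that a p-value seems to promise – exactly the misreading being lamented above.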

    2. Daniel Barkalow says:

      You should then ask: “If you performed a second independent trial of some process which generates a result with some randomness, what is the chance that you would get at least as large a value as the first result?”

      Actually, I think 50% isn’t quite right. If the p values of all studies attempted mostly come out higher than .05 (probably true), it’s more likely than not that the original run was a better-than-average run for that study. 50% is right for trying to replicate a study where the median p value over all replications is .05.

    3. Anon says:

      I disagree that the answer is 50%. The real answer depends on whether the null hypothesis is true or not. If the null hypothesis is true, then the answer is 5%. And if it’s false, then the answer will be *at least* 5%, depending on the sample size and effect size. But the bottom line is: nobody knows.

      And this demonstrates an even more critical point: Even those who think they understand stats better than others, often don’t.

      Myself included!

      1. anon4 says:

        Yes, I don’t get the 50% either. If the null hypothesis is true, then there is a 5% chance of getting a p value of < .05. That is what the p value means here. If the null hypothesis is false, then it’s not obvious from just the p value what the result would be if the experiment were run again.

      2. Stat Guy says:

        Chrispy is in fact correct. In the way the hypothetical situation is described – i.e. if you repeated a study “in exactly the same way” – the p-value and null hypothesis are irrelevant. What’s important is that both results are generated from the same distribution. Without knowing where the first result falls along this distribution, the odds of the second result being more extreme than the first are 50/50, or the flip of a coin.

        1. Anon says:

          But one could say 50/50 that the null is correct (it is equally likely since we have no idea), in which case the overall probability of getting a better result would be:

          50% * 5% + 50% * 50% = 27.5%

          1. Stat Guy says:

            There are a couple flaws in that logic, the most crucial one being that the probability of the null hypothesis being correct is not 50%. Actually, the probability is zero if you consider the famous George Box quote, “all models are wrong, but some are useful.”

        2. MTS says:

          No that’s not right. 50/50 is only true if you don’t condition on p = 0.05 (i.e. if you don’t fix the p-value of the first study to be 0.05).

          Suppose we have two i.i.d. continuous random variables P1 and P2, representing the p-values of two studies conducted independently in the exact same way; then in fact p(P2 > P1) = 0.5. However, we don’t know p(P2 > P1 | P1 = 0.05) because we don’t know the distribution – if your experimental design tends to produce high p-values (e.g. if it has low power) then it could be a lot higher than 50%, and vice versa.
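The disagreement in this sub-thread comes down to what is held fixed, and a quick simulation makes that concrete. This is a sketch under Stat Guy’s reading – the replication’s test statistic is drawn from a normal distribution centered on the originally observed z of 1.96 – with MTS’s low-power caveat (a weaker true effect, here true z = 1.0, chosen only for illustration) as a second case:

```python
import random

random.seed(0)
N = 200_000
z_crit = 1.96  # z-score corresponding to two-sided p = 0.05

def replication_rate(true_z):
    """Fraction of replications whose z-statistic comes out at least as
    extreme as z_crit, when each replication's z is drawn from N(true_z, 1)."""
    hits = sum(random.gauss(true_z, 1.0) >= z_crit for _ in range(N))
    return hits / N

# If the true effect equals the originally observed one (z = 1.96),
# half of all replications land at or beyond it: Chrispy's 50%.
rate_matched = replication_rate(1.96)

# If the true effect is weaker (true z = 1.0, a low-powered design),
# far fewer replications clear the bar, as MTS points out.
rate_weak = replication_rate(1.0)

print(rate_matched, rate_weak)  # roughly 0.50 and 0.17
```

The 50% answer is thus an artifact of assuming the observed effect is the true effect; move that assumption and the replication probability moves with it.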

  10. Sweden calling says:

    Talking about stats: most of us have a tendency to rate ourselves above average (like the famous example where something like 90% of drivers think they are better than the average). Probably the same applies here regarding reproducibility. Easy to blame others?

    1. Anon says:

      That’s because the really bad drivers all died in car accidents and are no longer around to rate themselves below average. 😀

  11. Glen says:

    C P Snow wrote extensively on this in “The Search”, 1934. Though it is a work of fiction, he states the issue very clearly.

    C P Snow, writing in The Search (1934)
    Now if false statements are to be allowed, if they are not to be discouraged by every means we have, science will lose its one virtue, truth. The only ethical principle which has made science possible is that the truth shall be told all the time. If we do not penalize false statements made in error, we open up the way, don’t you see, for false statements by intention. And of course, a false statement of fact, made deliberately, is the most serious crime a scientist can commit. There are such, we both know, but they’re few. As competition gets keener, possibly they will become more common. Unless that is stopped, science will lose a great deal. And so it seems to me that false statements, whatever the circumstances, must be punished as severely as possible.
    -End Quote

    One can agree or disagree, but it is a very clear statement of the issue.

  12. Anon says:

    I did a double take when I saw footnote 1 in the Cornforth comment – JHU was my alma mater. While I did not know Prof. Nickon personally (he has long since retired), my supervisor did, and from what I can tell the guy was a genius.

  13. Curt F. says:

    One of the most exciting developments in reproducibility is the preprint server bioRxiv. Scientists can self-publish there, which means that if they want to put out a rebuttal of a recent (usually high-profile) paper, what editors or reviewers think is no longer an issue. Before, the need for editor and reviewer approval of “comments” on a paper led to less forthright discussion, slower discussion, and a much lower rate of papers being publicly discussed in the written literature.

    A very recent example: a recent Nature paper, “Late acquisition of mitochondria by a host with chimaeric prokaryotic ancestry”, was just completely (and to my eyes, quite effectively) refuted by a bioRxiv response, “Late mitochondrial origin is pure artifact”. The time between the initial publication and the rebuttal was only 3.5 months, and the rebuttal goes into greater detail than “technical comment” letters to Nature. The phrasing of the rebuttal is very clear, presumably in part because the rebutters didn’t have to worry about wounding the pride and reputation of the Nature *editors* in their rebuttal.

  14. Steve says:

    I once wrote a criticism of a paper in Macromolecules – they did publish it but only after sending it out to 13 referees who all agreed with me!

    I think that every published paper should have a comments section where non-anonymous comments can be added – e.g. “tried this but only got 15% yield”, “this source of antibodies has a poor reputation”, “the reaction works better with DMSO rather than DMF”. That way, with enough negative comments, bad papers can get ignored, good papers promoted, and improvements noted.

    1. Dana says:

      Commenting on papers is possible in Pubmed Commons (link under my name) – an NCBI login is required, hence no anonymous comments are allowed. It’s a good initiative.

  15. Joe says:

    “What defines irreproducibility” doesn’t seem to be as important a problem as what to do afterwards. It’s not black-and-white.

    A spectrum of possibilities ranges from 1) no one even knows or talks about it, let alone tries to reproduce it (zero impact); to 2) someone tried reproducing it and it didn’t work, but it’s “not a big deal” and people still trust the authors/journals; to 3) publishing a not-entirely-correct result in a high-IF journal and publishing another high-IF paper to fix the problem (high impact); to 4) publishing a not-entirely-correct result in a high-IF journal and embarrassingly publishing a “patch” in lesser journals (lower impact)… all the way to 5) a congressional misconduct investigation and bad blood along the way.

    But it has become a problem now that the pharma industry is finding reproducibility rates as low as 11%, and academic reputations are on the line. That’s beyond the issue of who’s pointing the finger at whom.

    “This lack of reproducibility in science is an open source of consternation in the pharmaceutical industry, where many people feel they can no longer rely on basic findings from academia. Researchers at Bayer and other drug companies have reported dismal success rates trying to reproduce studies in cancer, women’s health, and cardiovascular disease — in some cases as low as 11 percent.”

  16. Shion Arita says:

    I think the reason that chemistry tends to reproduce better is that it rarely relies on statistics.

    If you claim you made compound X, you either are correct in your interpretation of X’s spectral and analytical data, or you aren’t. There isn’t much room for erroneous correlation there.

    And it’s no surprise that the place where the most trouble comes (yields for example) is exactly where statistics go into the mix. I personally think it’s stupid to report yields to the precision that we do, since the imprecision of repeated trials is often pretty high, and to get a good read on how that actually goes you would have to do many trials. But I don’t think it’s important enough to do that.

    1. a. nonymaus says:

      Chemistry also has very large numbers working in its favor. When I do a reaction, I am using about 10^20 molecules. With this many replicates, I can be damn sure that another reaction under similar conditions will behave similarly. Of course, my ability to precisely define those conditions can be subject to debate.

  17. Derek’s Chemistry World piece on reproducibility is now online (linked from my handle) – Good to see this robust discussion over here!

  18. typo says:

    confidant -> confident

  19. typo2 says:

    practicioners -> practitioners

    and there are visible tags in the post

  20. matt says:

    so we found we were having a reproducibility crisis…but on further reflection we aren’t sure whether that was a significant result?

  21. Uudon Rock says:

    I worked in R&D for a major multinational manufacturing concern for years. Reproducibility, and even scaling up research-level procedures for our manufacturing line, was a serious issue. We could never hammer out every variable, and had to scrap quite a few. Some that we did implement could not be reproduced overseas, typically because of differences in the reagents available to those sites. I always tried to assist, as I was paid to do so, but there were times when we were just plain stumped as to the root cause of the problem. Presently I work in biochem research. Reproducibility is a legendary problem here. There are instances going back decades that are still talked about for their erroneous claims and misleading statements. Without mentioning names or specifics, I am aware that the instance I’m referring to resulted in the end of a tenure and criminal charges. With that over our heads, we only attempt to publish material we ourselves can verify. It can be frustrating when you think you have a great result, only to find out the next time that your procedure has no effect at all. C’est la vie.

  22. Rich Rostrom says:

    Science has been compared to a vast structure composed of individual bricks, where each brick is a paper presenting a new bit of discovered knowledge. If no one is testing the bricks, the quality of the structure will degrade. The damage done by an erroneous or fraudulent paper could be much larger than the value of an average valid paper.

    ISTM that perhaps scientists should devote as much effort to reproducing papers as to producing them; that such reproductions should be considered useful scholarly activity; and that scientists who do nothing else should be valued comparably with those who generate original papers.

    1. a. nonymaus says:

      I don’t know if human nature would allow reproduction to be as valued as the new. However, I do want to see a journal dedicated entirely to reports about the reproducibility or otherwise of articles. Much like procedures in Organic Syntheses are checked and the names of the checkers are listed along with the original authors, this journal could serve as a time-delayed form of the same. I know for sure that if I have two procedures to consider using and one is in “J. Reprod. Chem.” and the other is not, I’ll go with the one that someone else has gotten to work.
      Even if it’s not burning up the prestige-journal charts, it should still get plenty of submissions from work that is part of other projects and junior authors who want a first-author spin-off publication. Essentially, what I want is Blog-Syn with an impact factor.

  23. Organocatalyst says:

    Well, if you need some quick papers to get yourself tenure/grants there are two options: 1) make up the results and hope that no one bothers to check the paper, or 2) take someone else’s results that work and claim them for your own. I’ve seen both strategies be successful, but only the first has a direct impact on reproducibility, I suppose…

  24. anonymous says:

    The problem with ‘catalysis’ isn’t statistics and lack of knowledge, as mentioned by someone above. It’s that it’s severely abused in the classical ‘funny business’ sense. This has a lot to do with JACS and Angewandte deciding to let your paper through if the yield is in the 90s instead of the 80s or 70s, and the same for the ee. Often you can get your yield up for certain substrates, because it’s just engineering of a reaction, but that really should be left up to an industrial company that wants to use your process, and I question the educational value of having a postdoc work on this problem.

    A lot of the time you can’t get your yield up, so you simply lie and weigh with a bit of solvent, and then wash your compound several times with CDCl3 to get a clean spectrum. Or you run a prep HPLC and isolate a pure compound, and then say the reaction was 99% selective when it was really 70%.

    I don’t think it’s such a big deal in the grand scheme of ‘lying about stuff’ in science, with other fields being worse and this being something that is best optimized outside an academic lab anyway. But when publications in the best journals, and the grant money and academic positions that go along with them, start to depend on whether you’ve got 90% yield instead of 70%, then yes, it does matter. There is also no way to catch this type of lying, except for some company that really wants to do the reaction failing to replicate the promised yield and selectivity, and then getting in touch with the boss. If the boss claims that you need ‘good hands’, then there is nothing you can do, and you can’t really prove that the postdoc ran an HPLC and then took an NMR of a clean compound when the reaction was not really clean.

    That’s why it’s a real shame that when stuff like this is found out, as in the Buchwald/Tsvelikhovsky case, the authors get away with a correction and the postdoc still gets a faculty position, when a correction of this type should have precluded the paper being published in the best chemistry journal, and maybe the faculty position that goes along with it. You can’t even argue that it was ‘sloppiness’ on the part of the postdoc; it really screws over those who do not lie. There needs to be a strong signal sent whenever funny business in exaggerating yields/selectivity in catalysis is discovered, because the positives of getting away with it heavily outweigh the negatives of being caught. The paper should have been retracted to send a strong signal, simply because in most cases when people lie on a catalysis paper, you will never catch them.

    This is making me want to leave the field of catalysis, actually, despite being able to publish more often in it… Too much funny business, and I’d rather be somewhere where you have to take CVs and get crystal structures that keep everyone honest. Maybe after I get tenure.

  25. Nick K says:

    Here is a remarkable YouTube video by Stefan Molyneux on reproducibility in Medchem:

    Unfortunately, the identity of the corrupt academic is not revealed.

    1. hn says:

      One thing we should do is out those who commit fraud, especially those in positions of training students.

      1. Nick K says:

        It is an unfortunate fact of life that the brave people who blow the whistle on academic fraud and malfeasance almost always come off worse than the perpetrators. Just look at what happened to the grad students in the Dalibor Sames/Bengu Sezen scandal.

  26. WhoAmI says:

    Chemistry reproduces better than other sciences, but that doesn’t mean it reproduces excellently either. Anyone who remembers their lab experiments in school knows that in each experiment there was at least one student who got something completely different from the wanted product.
    Some reactions are more prone to this than others; one-pot chemistry, for example, I believe. At my uni they made us do a very basic one-pot reaction based on a paper.
    We were distributed into 8 groups, and each group got something different. The result was supposed to be a white crystalline powder with UV fluorescence, and while one group did get that, the other products ranged from a bright yellow powder (not the wanted product with impurities in it), to a deep orange, impossible-to-crystallize oil, to a white, lustrous, asbestos-like solid, to an inseparable solid mix of the product and one of the reagents (which were totally not supposed to do that), to a viscous mix (totally anhydrous, by the way) with no UV activity whatsoever.
    They didn’t try it again this year because it left most of the students extremely confused and nervous, and the teachers couldn’t explain all of the weird results we got. I suppose that was just a taste of how chemists can be scared of not being able to reproduce an experiment.

    1. hn says:

      I think that’s a great experience for students. Teaches critical thinking and that real life science differs from textbook science.

  27. Anonymous says:

    It is rare that a failure to reproduce is publishable at all and, even when it is, rare that it actually gets published. There are a few famous examples. Although Cornforth called out Chatterjee based on fundamental reasoning, Hudlicky actually synthesized the Chatterjee intermediates and could not get them to react as claimed. Synth. Commun. 1986, 16(4), 393-399. DOI: 10.1080/00397918608057714. Lawrence D. Kwart, Mark Tiedje, James O. Frazier & Tomas Hudlicky, “Total Synthesis of (±)-Epiisocomenes VIA Hydrogenation of “Chatterjee’s Ketone”.”

    Another famous example that has direct bearing on med chem, the history of chem and synthesis in general is Lysergic Acid. The original claim of a successful synthesis is: “A New Synthesis of Lysergic Acid”, James B. Hendrickson and Jian Wang, Org. Lett. 6, 3-5 (2004), DOI: 10.1021/ol0354369.

    The claims of that paper were convincingly debunked by Nichols in Org Lett. 2012 Jan 6; 14(1): 296–298. doi: 10.1021/ol203048q “A Reported “New Synthesis of Lysergic Acid” Yields Only The Derailment Product: Methyl 5- methoxy-4,5-dihydroindolo[4,3- f,g]quinoline-9-carboxylate.” Markondaiah Bekkam, Huaping Mo, and David E. Nichols.

    On the opposite end, there is, of course, the famous example of the ‘long awaited demonstration of reproducibility’ of the quinine synthesis: “Rabe Rest in Peace: Confirmation of the Rabe-Kindler Conversion of d-Quinotoxine to Quinine: Experimental Affirmation of the Woodward-Doering Formal Total Synthesis of Quinine.” Aaron C. Smith, Robert M. Williams. Angewandte Chemie International Edition, 2008, 47, 1736–1740 doi:10.1002/anie.200705421

    I have some other examples but not the time to dig them out right now.
