The Scientific Literature

Not 100%. Not Really.

I’ve mentioned Tomas Hudlicky’s views on the state of the current synthetic organic chemistry literature here before: he’s not very complimentary, and he’s got some good reasons not to be. I had an email from him the other day with another example of some of the problems that he’s talking about.
Take a look at this paper, which just came out in JOC. It’s a total synthesis of a natural product called brazilin, from a group in South Korea. Now, I have no doubt that they have made brazilin. And I have no doubt that they have made it by the route that they detail in the paper. But (like Hudlicky) I do have doubts that six reactions in their synthesis all went with flat 100% yields.
He’s shown that if you do a standard workup and chromatography, the odds of you getting 95% and above are very small indeed. You can’t even recover weighed amounts of known compounds to 100% with that treatment, much less clean up reaction mixtures. But that’s just what happens in this paper. Now, this may seem like a minor point – OK, the yields were high, the reactions worked well, so what’s the big deal with saying 97%? Or even 100%?
The big deal is that this is a symptom of a larger problem – the hyping of results, dressing things up to look better than they are. 100% yields are wishful thinking at best, and deception at worst (self-deception, most likely) and none of these are good things to let into the scientific literature, even at very dilute levels. The same impulse and the same tendencies can lead to much worse things. If we’re all going to start thinking honestly and clearly about our work, maybe we could start small, and admit that there are no 100% yields after extraction and chromatography. Yes, yes, your hands are great and your technique is awe-inspiringly flawless. But you didn’t get a 100% yield.

61 comments on “Not 100%. Not Really.”

  1. anon says:

    If you always extract with carbon tetrachloride you’ll get 100% yield and the product is completely pure by NMR!

  2. Anonymous says:

    I smell 100% BS.

  3. fluorogrol says:

    @1. anon
    Using that technique, I’d be really disappointed with a yield as low as 100%.

  4. Hap says:

    Six steps out of nine got 100% yield? Really? Can I (or someone better) throw something in a round-bottom and get it back out in 100% yield? I doubt it.
    I wonder if synthetic papers aren’t becoming like political speeches – “yes, I know I’m promising things that I can’t actually do, but you have suspended rationality and I don’t believe myself anymore anyway, so no harm done.”

  5. anon the II says:

    I think it’s a little lame to beat on these poor Koreans.
    This technique was pioneered by Nobel prize winners.

  6. Just Sayin' says:

    Perhaps their lab balance (likely balances) haven’t been calibrated?
    As an experienced organic chemist once told me about discovery-phase synthesis, “It’s okay to start with a kg and end up with a mg. Since most compounds are biological duds, it’s better to reduce storage demands and waste disposal fees for the final products.”

  7. Chris says:

    When I was a grad student at Michigan, Ed Vedejs would dress down students who put a 100% yield in front of him for exactly this reason. Even “quantitative yield” got a narrow look and a suggestion that your material must not have been very clean.

  8. A Nonny Mouse says:

    Looking at the NMRs, which are perfect with essentially no solvent, it does look like these are fully isolated materials rather than just a crude “essentially quantitative” yield.

  9. Back to the drawingboard says:

    So, the real question then becomes: How do the reviewers let this sort of thing get past them? Does anybody actually challenge the details of the experimental sections, much less read them thoroughly when they review the manuscripts? How about the journal editors? It seems that if reviewers demanded some accountability on this score, the problem would (perhaps) disappear after a while.

  10. Nick K says:

    Personally, I’ve never believed the yields in papers published by the distinguished chemists whose initials are KCN and EJC.

  11. hypocrite police says:

    I think it’s hilarious that you post something about Hype, Derek when you spent a large portion of your blog discussing one of the best cases of hype/bad science I have ever seen — Burke’s suzuki machine. 10 years from now we’ll remember that as a giant embarrassment. Not a couple of koreans with 96% yields reported as 100%…..

  12. Hap says:

    If no one’s ever going to reproduce a synthesis (say for the often-stated purpose of making more or making analogs), then you can write anything you want for the yields and no one’s going to be able to tell that you did so. Someone might be able to see that the NMR spectra don’t fit what you say you have, but yields aren’t verifiable in any case (if you can’t get them, or can’t get the reaction to go, then it’s hard to tell whether it’s a lack of complete reporting or a lack of technique on your part – hence why negative results aren’t often reported). If people (advisors and reviewers and readers) are going to ask for high yields, and no one can check that they’re real, well, guess what they’re going to get.
    It’s not a good argument for most total synthesis – if no one’s ever going to repeat the work, then we have little way of trusting that it isn’t a technical achievement of fiction rather than a technical achievement of another sort.

  13. Hap says:

    Hype is different than lying or self-deception though – predicting the impact of something is difficult and uncertain (roughly, one quote is “The only way to predict the future is to create it.”) One can verify that Burke’s machine does what it says, and people probably will. I don’t agree with the hype, but it’s a lower sin than being less-than-honest about yields. On the other hand, hype is part of the trinity of sins Hudlicky decried earlier, and our ability to respond to it (and thus to accrue currency for those who create it) drives people towards other forms of dishonesty.

  14. Anonymous says:

    The most surprising part of the blog post: someone still reads JOC?

  15. Anony-brain says:

    I agree with “The big deal is that this is a symptom of a larger problem”.
    But this is by a mile not the biggest problem in medicinal chemistry or the pharma industry, IMHO.
    I’d say we should focus on the key things we should change – or we will end up like the dinosaurs…

  16. Vader says:

    This strikes me as an excellent illustration of broken window theory.
    A broken window may seem petty compared with robbery, burglary, or gang shootings. But neighborhoods with lots of broken windows signal a breakdown of law and order that may, in fact, mean more robbery, burglary, and gang shootings.

  17. DCRogers says:

    Ha, I once ran a workup that showed a 105% yield!
    But at least, I interpreted it correctly: I had really screwed something up, and needed to redo my work more carefully.
    I would have treated a 100% yield the same way.
    (I see a related issue in computational analysis when people report q^2 [cross-validated r^2] values approaching 1.0 – while I do not doubt there is a lot of real information caught in the model, it’s almost certainly contaminated by lots of noise, throwing its predictivity into great doubt.)

  18. Pete says:

    I agree with Hap that there is a distinction between hype and lying. Hype involves ‘talking up’ the importance of results, for example, by implying that a model has a broader applicability domain than it really does. Some of the problems encountered by computational chemists in drug discovery can be traced back to extravagant claims made earlier by other computational chemists. Drug discovery breeds hype and those who criticize will be denounced as ‘negative’ and sent on courses run by highly paid consultants in order to improve their behavior.
    It is quite common to hype trends in data in order to give weight to the guidelines and metrics that the data analyst would have the drug discovery community adopt. A common tactic is to present the trends in ways that disguise the weakness of the trends, for example, by hiding variation or by using standard error as a measure of the spread of distributions. A different flavor of hype arises when analysis is extrapolated out of its applicability domain. For example, PAINS (frequent hitter in panel of 6 AlphaScreen assays) behavior may be invoked in support of assertions that a compound is promiscuous or protein-reactive.

  19. antibac says:

    #5 – agreed! How convenient it is to pick this paper and not the hundreds of other ones that are right here in the U.S., in big time institutions, and headed by big name PIs.

  20. Derek Lowe says:

    #5, #19 – find me some recent ones that show a series of 100% yields, and I will be more than happy to call them out. I don’t doubt that they’re out there; it’s just that I don’t read the experimentals of total synthesis papers, and this one is both particularly egregious and was specifically called to my attention.

  21. antibac says:

    plenty with 95+% …

  22. oldnuke says:

    I remember seeing yields over 100% in some Organic Chem I lab notebooks. I realize that a lot of them were gunning for med school, but “giving 101%” is contraindicated. 🙂

  23. Anonymous says:

    “The big deal is that this is a symptom of a larger problem – the hyping of results, dressing things up to look better than they are.”
    sort of like your commentary on the Burke chemistry a few weeks ago

  24. InfMP says:

    Derek always calls out people no matter if it’s from USA, China, India etc.
    Pointless to accuse him.
    That being said, every time i submit a paper, it’s my worst nightmare to make a mistake and end up on this blog!

  25. Nick says:

    The accuracy of yields is also nowhere near 1%…i.e. so if you repeat a reaction three times, you won’t get the same value each time.
    So it would probably make sense to report yields to the nearest 5% or something anyway…maybe that’s what this group have done (although I doubt it!).

  26. Hap says:

    I’ve seen lots of “quantitatives”, but not too many 100% yields, and I don’t think I’ve seen six in a total synthesis ever.
    >95% yields are kind of questionable, but possible, but 100% doesn’t seem possible. (I don’t know how you would report error bars for a 100% yield, because the yield certainly can’t be higher than 100%, so +/- is out.)

  27. Chrispy says:

    Years ago an advisor told me that I needed to do the “Corey 1-2 inversion” on my reaction.
    That makes a 49% yield into 94%

  28. antibac says:

    why aren’t organic chemists required to do reactions in triplicate and report error bars? everyone else is on the Bio side. I know it’s tedious but it is so for everyone. Granted it would be silly for a paper with lots of different reactions. But for methodology papers when the ‘selling point’ of the paper is an increase of yield over a previous paper – I’d say it becomes difficult to make a call on the novelty without error bars.

  29. biotechtoreador says:

    “100% yields are wishful thinking at best, and deception at worst”.
    Why not just come out and call the authors liars? It seems they are clearly not telling the truth, and yet they (and many many others) are allowed to continue obtaining funding based on what amounts to fraud.

  30. Anonymous BMS Researcher says:

    Much of the silver owned by the US government was made into wire for electromagnets used by the Manhattan Project to separate isotopes of Uranium (not gold from Fort Knox, as some versions of the story say; it was silver.) The people making the wire very carefully collected all the bits of silver dust from their machines. They were so careful that they actually returned a little MORE silver to the government than they had been issued. Seems they picked up residual silver from previous work.

  31. Anon says:

    In graduate school I obtained 97% during optimization of an unprecedented reaction. I repeated three times on the same scale. It felt strange, but I was terrified to publish a 97% yield. I would have much rather that it had been 84%. It’s published now (still 97%), and I still feel uneasy about it. I suppose it’s good science to be increasingly skeptical about our highest yielding reactions.

  32. Rock says:

    For me, I have had it “up to” here with all the “up to” yield and ee designations on graphical abstracts of methodology papers. Upon closer inspection of the paper, the “up to” 99% yield (most common phrase on graphical abstracts) often have median yields in the 70’s or 80’s % range. To all the other reviewers out there, let us take that one small step and purge that phrase from graphical abstracts.

  33. what the???? says:

    I am having a hard time getting one to go in 30% yield that is reported in an ACS journal as 70% from a big time institution from a big name PI.

  34. antibac says:

    my point exactly – and for those we keep anonymous. For the ‘smaller’ groups – their paper with names and all shows up in these discussions. Funny business.

  35. Hap says:

    @23: But I don’t think Dr. Lowe misrepresented what Burke wants to do – he may be more/less optimistic about its usefulness and its consequent effects on chemical practice, but he didn’t say that the method and object were something that they were not.
    That’s what makes hype a lesser sin than fudging yields – hype oversells something or amplifies its reputation beyond what something may deserve, but the object of the hype is present and able to be pondered on its own. With yield fudging and greater forms of dishonesty, the research is described inaccurately; since for everyone but those performing the research, the paper is the only evidence for the representation of the research. There is no other place to go to determine if it works as described. Thus, dishonesty in reporting creates a dishonest representation of the research that is not accurate and is much harder to remove. (It also depends upon what is not reported correctly – in this case, it probably makes people look bad but doesn’t do much else; in other cases, it might invalidate part or all of the research as written.)
    Hype can be either naivete, incorrect or unclear thought, or dishonesty; misreporting or faking your research can only be dishonesty. As a person, “your word is all you have, really.”

  36. Doug Steinman says:

    I think it is perfectly okay to do a synthetic transformation and report the product as being carried on without purification. This, of course, assumes that the product really is reasonably pure by TLC, NMR, etc. and that at some point down the road you will have to purify and then report the overall yield of the combined steps. Sometimes it does not make sense to take the time to purify something that is 90% pure or better depending on what the subsequent reactions are. That being said however, it is totally not believable to do a chromatographic purification and come out with 100% yield as is reported in this paper. Furthermore I agree that the reviewers and the editors should have flagged this as being unacceptable for publication.

  37. anon says:

    The best reaction in yield I ever did was a THP protection of Roche ester. Following bulb-to-bulb distillation on a 10g scale I got 98% yield.
    19 papers or patents report a yield for this reaction (using either R or S Roche ester) by a sciquest search. 7 of them report 100% yield. 5 more report 98 or 99%, and 2 more say 95%.
    It really works that well, at least on a big scale. All you see is a little scum from dihydropyran decomposition and/or self-polymerization. If you pump off the DHP and leave the catalytic p-TsOH in the bulb you’re distilling from, you don’t need an extraction.
    I felt funny putting that yield in my thesis, but it was true, as distilled material that passed analytical analysis.

  38. GLP_353 says:

    It doesn’t take long for synthetic practitioners to recognize which PI’s have a reputation for exaggerating yields, and adjust expectations accordingly. Science, as a self-correcting enterprise, eventually moves forward, no matter the hype of those who feel compelled to engage in hucksterism.

  39. Algirdas says:

    #34 antibac
    – is anyone preventing you from calling out any suspect yields right here, right now, in this comment thread?

  40. Claudia Bakker says:

    From what I’ve learnt thus far in basic chemistry, 100% yields are generally not attained in practice. So many scientists and researchers doctor results to suit personal agendas and they are often not called out on these alterations. It’s sad really, what science can be trusted? What is the point of peer review if falsified data is published in any event? u15090648

  41. MoMo says:

    Ye Gods! You all are taking this paper way too seriously. Who cares about yield anyway? Real medicinal chemists don’t and yield varies chemist to chemist- we all know this. If Korea creates matter spontaneously who really cares? Surely not the ACS and their reviewers.
    Just get it, test it, and move on.

  42. Anonymous says:

    you want reproducibility problems… try running some nanoparticle syntheses from the literature sometime. Fun stuff.

  43. Synthon61 says:

    Then there’s George Olah yields: purity by GC, yield based on amount of unreacted oxidant, or whatever. I work with carbon-14 labelling, so we count it in and count it out. Yields are occasionally close to 100%. More usually best yields are 95%, with often no clear idea what happened to the other 5%.

  44. what the???? says:

    OK so I ran the reaction for a third time and figured out that a fresh bottle of dicyclohexyl amine was needed. Now the yield is bomber dude…something like 75% after distillation. It is one of those tricky palladium catalyzed enolate additions to bromoarene with an air sensitive Pd(I) catalyst. And yes I care about yields, because when it comes time to make 5 g for a dog tox study, the chemistry has got to work.

  45. kriggy says:

    haha, one of my friends had a presentation about a month ago about this synthesis. The discussion got pretty wild 😀

  46. Anon... says:

    Like my post doc advisor used to say to me, you can’t even pour 100% of the water out of a glass of water.

  47. Great post, Derek!
    The only thing a scientist has is his/her integrity. The Hudlicky paper has had an indelible impact on me, as has learning the craft from world-class synthetic chemists. In fact, I make the Hudlicky Synlett paper required reading in every undergraduate organic chemistry lab I teach, and will continue to do so until the day I die. In this paper, he suggests reporting a range of yields, which is what I teach my students to do. Further, the foundational tenets set forth in this seminal work echo in my mind’s eye when I review synthesis papers.
    As a reviewer of journal articles, it is my (your) responsibility to raise these issues when submitting your review. Though we get little/no credit for it, I would argue that diligent and thorough reviewing is as important a task for faculty and professional scientists as publishing original research. Those of us who are honest and experienced must be the gatekeepers. It is incumbent upon us to prevent irreproducible garbage from worming its way into the synthetic lexicon. I feel like Jerry Maguire for saying this, but I’d rather publish one high quality paper with reproducible and useful chemistry than ten least-publishable units (LPUs) that no chemist with good hands could execute. Be a good example to your peers and to those you mentor/train/teach. Just be honest.

  48. anonymous says:

    First of all, yes, I agree that a string of yields of exactly 100% sounds like bullshit.
    But everybody is talking as if 100% or 100+% yields are never encountered. Just this week I had two yields of 102%. Yes, obviously that means that there’s some shit in there. But I still put it down in my notebook, because that’s the number I got. It would be easy to leave a bit in the flask when I scrape it out and that’ll bring my yield down to say 95%, but all that does is hide the fact that there’s still some shit in my material, and hiding it is the road to hell.
    All I care about is getting my compound made and reporting my data as accurately as I can. If the yield of an intermediate is greater than 100%, it’s usually just some excess solvent. So long as it’s clean enough to move forward that’s fine with me. I’m not going to waste time getting it perfectly clean if I don’t have to, or nothing will ever get finished.
    You could get an 80% yield on one run and a 105% yield on another, and still have the same purity.
    Whenever I see “quantitative yield” reported in a paper I assume the person is hiding a yield greater than 100%.

  49. Cody says:

    T Hud loved this paper! lol it is a good overall synthesis but if someone were to look close at their experimental in the Barton-McCombie deoxygenation they started with 300 something mgs and ended up with 400 something.

  50. Anonymous says:

    I agree that 100% yields are possible.
    I remember doing the acylation of simple steroids with excess Ac2O/Py, overnite at room temp. Usually the yields were 100% or very close to that number. Products were isolated by extraction with CH2Cl2 and washing with diluted HCl. products were usually crystalline solids.

  51. t says:

    This is why process chemists never trust yields reported by an academic group. We just take it as a given that there’s a whole bunch of impurities that the authors don’t understand or care to look for.
    RE comment 48- have you no curiosity? Take a quantitative 1HNMR and provide a real corrected yield. Your version of yield calculation is completely useless. Some of the most interesting methodology and mechanism work has come out of a diligent student figuring out what their impurities are and getting an actual mass balance.

  52. Curt F. says:

    (I don’t know how you would report error bars for a 100% yield, because the yield certainly can’t be higher than 100%, so +/- is out.)
    Yields are ratios of recovered product to starting material. Both of those quantities have uncertainties. Call starting material s and recovered product p. Then yield is p/s. If starting material uncertainty is ds (in units of mols), and recovered product uncertainty is dp, and if ds is uncorrelated with dp, then the relative uncertainty in the ratio p/s can be estimated from sqrt((ds/s)^2 + (dp/p)^2).
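
A minimal Python sketch of that error propagation, for anyone who wants to try it on their own numbers. The function name and the example quantities (1.000 mmol in, 0.970 mmol out, each uncertain by 0.005 mmol) are illustrative assumptions, not figures from the paper, and the sketch assumes 1:1 stoichiometry and uncorrelated errors.

```python
import math

def yield_with_uncertainty(s, ds, p, dp):
    """Return (yield, absolute uncertainty) for yield = p / s.

    s, p   : moles of starting material and recovered product
             (1:1 stoichiometry assumed)
    ds, dp : their standard uncertainties, assumed uncorrelated
    """
    y = p / s
    # Relative uncertainty of a ratio of two uncorrelated quantities.
    rel = math.sqrt((ds / s) ** 2 + (dp / p) ** 2)
    return y, y * rel

# Hypothetical example values, in moles.
y, dy = yield_with_uncertainty(s=1.000e-3, ds=0.005e-3, p=0.970e-3, dp=0.005e-3)
print(f"yield = {100 * y:.1f}% +/- {100 * dy:.1f}%")
```

The interval describes measurement uncertainty around the number you measured, so a measured 100% can still carry a +/-; it just means the true value plausibly sits a bit below it.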

  53. Bagnar says:

    I remember two things from my first years of chemistry:
    – There is no 100% yield. Above 95% is considered as “quantitative”.
    – A mechanism is just a possible explanation for a reaction. (Solvent, other molecules… could interfere in this reaction)
    Whatever. Quantitative is a nice word isn’t it ?

  54. Anonymous says:

    @48 again, to t @51
    On rereading my comment, I may have given the impression that I take pride in doing sloppy work and having a lack of curiosity. That isn’t the case.
    You process chemists are the ones who do the real chemistry, I acknowledge that you people are much better chemists than me. In discovery the work, like it or not, is much more slap-dash. We only really have two yields- enough and not enough. At any given time I typically have 4 to 8 targets on my to-do list. Nobody wants me spending my time quantifying residual solvent or doing “interesting methodology and mechanism work” to adjust the yield of the occasional 102% reaction going to one of the vast majority of compounds we make that are inactive, insoluble, toxic, or otherwise not going to be the drug, when I should be spending my time identifying the side product of the 20% yield reaction that’s holding me back from another target that may be THE ONE. I do plenty of troubleshooting in the course of my day, and enjoy it, but I have to pick my fights.
    The only compounds the process team ever sees are compounds that everyone is very keenly interested in- you never see the 99.5% of compounds we make that don’t get on the short list. The less time spent on those, the better- we get the data on them, decide what we’ll make next, and move on. By the time a compound gets to the process team it will have been made several times, and as that happens and we become aware that this is going to be shortlist compound I generally do a better job. You probably won’t believe that, because I’m sure you’ve seen a lot of shitty, shitty notebook pages from discovery chemists in the course of your career.
    In my original post I may have given the impression that I get 100+% yields all the time, but they’re actually not very frequent. My point is, when I get them I don’t hide them. At our team meetings I put my yields on my slides, and if a yield is more than 100% then so be it. My bosses have never had an issue with it.

  55. Anonymous says:

    @28. Hear hear! Why don’t chemists have to do stats? Just give it as average +/- SEM. If you haven’t done it three times already you have no business publishing it anyway.

  56. t says:

    Re 54: Thanks for the clarification…thought you were speaking more as an academic on some total syn project. I’m fairly used to Med Chem yields being as you described and have a great appreciation for what their goal is. I always felt it would be nice if they would just record in their notebook a HPLC conversion and a simple area percent, rather than a yield (also amount isolated).

  57. Mark says:

    I once worked with a T. Hud PhD who was a postdoc in my lab when I was a grad student. I can still hear him say “I can’t even piss in 100% yield that is why I wear underwear”.

  58. KevinG says:

    The missing element here goes back to first year undergraduate chemistry… significant figures! 100% is different than 100.0% which is different than 100.% Maybe they mean the yield is between 140%-50%.
    If you’re really reporting 100% to 3 significant figures, you had better be doing pretty big scale so that: #1 The error from operations like weighing, volume measurements, etc. do not impact the last digit. #2 Losses due to samples taken for analysis are too small to impact the last digit
    I only recall seeing this level of analysis in process papers (where this level of precision matters) and from a couple academic groups (Denmark comes to mind).
    The 100% yield is a red flag to me that suggests (a) the authors are over-hyping their results and/or (b) the authors are sloppy in their reporting of data.
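
A rough Python sketch of point #1, showing how weighing error alone limits the digits a yield can carry at different scales. The ±0.1 mg balance error, the helper name, and the simplification that starting material and product are weighed once each at comparable masses are all assumptions for illustration. Real losses from sampling, transfers, and residual solvent are ignored here and are usually larger.

```python
import math

def yield_error_from_weighing(mass_mg, balance_error_mg=0.1):
    """Approximate relative yield error caused by two weighings.

    Assumes starting material and product have comparable mass, so each
    weighing contributes balance_error_mg / mass_mg, combined in quadrature.
    """
    return math.sqrt(2) * balance_error_mg / mass_mg

# Hypothetical reaction scales, in milligrams.
for mass_mg in (5, 50, 500, 5000):
    err = yield_error_from_weighing(mass_mg)
    print(f"{mass_mg:>5} mg scale: yield uncertain by about {100 * err:.2f}%")
```

Under these assumptions the last digit of a three-figure yield is already shaky at a few milligrams, before any chemistry-related losses are considered.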

  59. Anon says:

    They have come up with another route to fix that low yielding deoxygenation…. ……looks like a pretty serious group about yields….
    To me it suffices to see the neat approaches they have come up with for tetracyclic compound…I would ignore their 100% and take it as quantitative….”must be easy to reproduce” the product, if not the yield.
    It is difficult enough to complete a total synthesis….so journals should try not to focus much on yields to accept or reject….it is they who set the standards. I remember one of the publications was rejected just based on low yields. You reap what you sow…

  60. JFT says:

    Quantitative yield reactions:
    Zemplen deprotection of sugars with DOWEX work-up and filtration (close to 100 % often)
    Boc deprotection of amines with evaporation of solvent as only work-up.
    I can’t really think of any others-anything more than a filtration (and that only if done with near wasteful amounts of washes) can’t be 100 %. I have-once-in thousands of columns gotten a 97 % yield after chromatography. But I distrust all yields in the literature-after all as most of us do not report ranges, the yields are, at best, from a single run, or the best result from multiple runs (and might not be the same run that gives the analytical data provided for the compound), or just random guesses. We should all have to report ranges for our yields in most cases, or explicitly state that the yield is obtained from the only attempt at the reaction or that the yield is the best obtained from the reaction. Be especially wary of anything over 85 after chromatography.

  61. Matt B. says:

    JFT: If your compound is stable and stains reasonably on TLC you shouldn’t lose more than a few percent in chromatography. 85% is an awfully low barrier to skepticism. My first paper had a few 92 and 93% yields on 1 mmol, all after chromatography *and* either recrystallization or sublimation. Those yields were of the pure material which passed CHN analysis and provided all of the analytical data. It was a fancy catalytic enantioselective bond-forming step too, run by a first year grad student. I realize very few labs are this careful, but my point is 85% is only hard if you are doing chemistry that is difficult or uncooperative.
