The Scientific Literature

Which Will Sprout and Which Will Bear Fruit?

Back in 2013, I mentioned the “JACS Challenge”, an interesting attempt to see if papers that eventually got cited a lot were obvious prima facie. Given a selection of older papers from the journal that readers were unfamiliar with, could they pick out the ones that ended up getting cited more?

Now this work, revised and expanded, is the subject of a paper in PLOS ONE (open access, by definition), and the author line features some blog pseudonyms, interestingly. A 2003 issue of JACS was selected, and respondents were asked the following questions:

  • Which three papers in the issue do you think are the most ‘significant’ (your own definition of ‘significant’ is what is important here)?
  • Without looking up numbers, which three papers do you think will have been cited the most to-date?
  • Which three papers would you most want to point out to other chemists?
  • Which three papers would you want to shout about from the rooftops (i.e., tell anybody about, not just chemists)?

I am glad to say that I seem to have skewed the set of responders a bit, since it appears that many of the people who answered the survey were readers of this blog following a link. Looking over the papers that were suggested, it seems that the correlation between the first set and the third (significant, and should be shared with other chemists) was pretty strong, as you might think, but that correlation between “significant” and “will have been cited the most” was somewhat weaker.

In fact, the correlation between what respondents thought would be the most cited articles and the actual ten-years-later citation counts was quite poor (see the paper’s Figure 1). Looking at Figure 2, you can see that none of the other questions, in fact, correlate well with the real citation counts (I would be rather unhappy if these graphs represented project assay correlations!). Of course, it’s also true that the respondents disagreed pretty significantly about which papers were significant in the first place. That’s strong evidence that the survey set was indeed composed of practicing chemists, because we rarely agree on much of anything.
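Since the post turns on how weakly the survey picks tracked the real counts, here is a minimal sketch of the kind of comparison involved: a Spearman rank correlation between “votes for most cited” and actual ten-year citation counts. The vote and citation numbers (and the pure-Python helpers) are invented for illustration; they are not data from the PLOS ONE paper.

```python
# Hypothetical illustration: how weakly survey "most cited" picks can track
# actual citation counts, measured by Spearman rank correlation. All numbers
# below are invented, not taken from the paper under discussion.

def ranks(values):
    """Rank values from smallest (rank 1) to largest, averaging ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for a run of tied values
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented data: survey votes per paper vs. its ten-year citation count.
votes     = [12, 9, 7, 5, 3, 2, 1, 0]
citations = [40, 310, 55, 90, 325, 20, 150, 60]

rho = spearman(votes, citations)  # comes out near zero for these numbers
```

With made-up numbers like these, the correlation lands near zero, which is the flavor of result the paper’s figures report for the survey predictions.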

Are there indeed differences between “interesting, thought-provoking” papers and ones that you feel like telling other chemists about? Or between those and the ones that pick up citations? These data, though far from comprehensive, suggest that both of these may be true (as does one’s own intuition, for what that’s worth). The paper tries to correlate responses to the reported areas of specialization of the responders, and there may well be something to that. For example, this paper was cited 325 times over the next ten years, but only five responders to the survey picked it as one that would get cited. All five of them, though, indicated that they specialized in this general area.

On the other hand, this paper was selected by many survey responders as one that would pick up citations, but its actual ten-year citation count was average-to-modest. My own guess here (I may have picked this one myself!) was that the title sounded like something “hot” by current standards, and that it would surely be picked up on. But that shows you the peril of such gut feelings.

What this does is drive yet another nail into the idea that current publication-based measures to evaluate research importance, quality, and impact are much good at all. They aren’t. The paper finishes up on just this point, referencing several other initiatives that are trying to overturn citation counts, h-indices, journal impact factors and so on. These things are measures, sure enough, but they’re not necessarily measuring very much, and not necessarily what some of their users think that they’re measuring!

14 comments on “Which Will Sprout and Which Will Bear Fruit?”

  1. Curious Wavefunction says:

    I am sure nobody thought that the Folin phenol paper would become the most cited paper of all time, or that Axel Becke’s paper on what became the B3LYP functional in theoretical chemistry would become the most cited in chemistry. For that matter, there were almost no citations to Watson and Crick’s seminal 1953 paper for a long time after it was published. The same goes for Steven Weinberg’s groundbreaking paper (“A Model of Leptons”); it was only when he received the Nobel Prize for it that the citations started to rack up (it was the most cited paper in physics for years before it was superseded by Juan Maldacena’s paper on string theory – a field which has no experimental validation, so go figure). And heck, Fermi’s famous paper on beta decay, which was the theory of the weak force, was actually rejected by Nature on the grounds that it “contained speculations too remote from reality”. The importance of most scientific papers is clear only in retrospect, and personal rankings and citation factors are both highly flawed measures of value.

    1. artkqtarks says:

      I think you are wrong about “A Model of Leptons.” It was published in 1967 and Weinberg won the Nobel Prize in 1979, but it started to get a lot of citations in ~1972-73.

      It probably started to get attention because ’t Hooft showed in 1971 that Weinberg’s model is renormalizable. One of the experimental supports of the model was discovered in 1973. A Nobel Prize is usually awarded after the importance of the work is already clear.

      1. Curious Wavefunction says:

        You are right. It did not get citations right away but started to get them a few years later.

      2. Eugene says:

        The 10-year time frame seems too short.

  2. Thoryke says:

    Many of the article metrics seem to be “what’s easy to count” as opposed to “what is useful to measure”. Of course, identifying which articles really spur new insights/areas of research might require an actual person doing an actual literature review…. and it might be years before the basis for making those decisions becomes apparent.

    1. Derek Lowe says:

      A lot of metrics in general, in our field and out of it, are what’s easy to count as opposed to what should really be measured, when you get down to it. . .

  3. anon says:

    This is all in retrospect though, so I don’t see much value in this survey… I’d like to see them open up the latest issue of Journal X and answer those questions.

    1. Derek Lowe says:

      It’s in retrospect so the actual citation counts are known, though. We’ll have to wait ten years for the answer to your proposed survey – not that it’s a bad idea. . .

      1. a. nonymaus says:

        Unfortunately, a prospective study could run into problems of citation numbers being skewed by the study participants citing the works that they think should be cited a lot.

  4. Eric Nuxoll says:

    Try taking this one step further: Can one even tell whether the work is new or old? In grad school my PI was writing an editorial for one of the standard chemical engineering journals. He had me randomly grab an issue of that journal from 1974 and another from 1999 (the previous year), take the first 25 article titles from each and list them in random order. I then walked through our building asking faculty and grad students (and anyone else who would participate) which articles were new and which were a quarter-century old, based on the title. The average success rate was 59%, slightly better than random. The highest score (71%) was achieved by a construction worker helping put an addition onto the building. Not a good result for that journal, but they still published the editorial.

  5. Anonymous says:

    I conceptualized some studies of my own related to this topic back when I was a grad student. But I’ll start with more about something already mentioned above.

    You are all familiar with the need to sensationalize and publicize research results. University news offices publish articles about recent campus publications and try to get the mainstream press and other outlets to pick them up and spread the word. Being highlighted in News and Views, C&EN, The NY Times, etc., is good for business, tenure review, funding, annual raises, public perception, and so on. I agree with what was alluded to above: a lot of “scientists” get their information from such headlines and not from the source documents themselves, and that can easily, I proposed, skew citation rates.

    I am trying to think of some doozies. The first one that comes to mind was the NY Times highlight of “Self-replicating Molecules” (JACS, 1990). That paper got a lot of citations. It is even mentioned, without correction, on Wikipedia. However, no one seems to take note of the fact that it was later acknowledged that it was NOT a self-replicating system (buried in a review in Acta Chem Scand). The Acta Chem Scand review, also cited fairly often, was probably not actually READ by people doing the citing, a well-known phenomenon in citation propagation.

    There are several other publications over-hyped by manufactured press releases and news office promotions. Another GREAT example of citation propagation via news conference is “Cold Fusion” (1989). Do not think that such tactics or their consequences are anything new. The Woodward – Doering Quinine Synthesis was written up in Life Magazine(!) with pictures of the crystalline intermediates and such. In the 1950s, cortisone was headline news in the popular press. For many years prior, newspapers reported other science results (Eddington’s Proof of Einstein’s Gen Rel via the solar eclipse was front page news in 1919; polar explorations were public newspaper news in the 1800s; etc.). Moving on …

    Starting in the 1980s, there was a set of publications called ChemTracts (Organic, Physical, Inorganic, Bio, …). (I think it started out as a Wiley publication; it got sold to another publisher; I think that it is now defunct.) The distinguished editors and invited reviewers kind of picked out recent literature to provide some perspective and embellishment. I recall wondering why they picked particular articles and ignored others that I thought were worthy of additional discussion. If I had to guess, I’d say that some of it was professional courtesy (cronyism): “I’ll highlight your work, you say nice things about my work.” I suspect that being in ChemTracts probably boosted citation counts quite a bit, and boosted junior faculty authors into better positions as well.

    But there is another existing database of valuable information on the prediction of the value of newly published chemistry. I started using Theilheimer (Theilheimer’s Synthetic Methods of Organic Chemistry) as a first year grad student. The toughest part was figuring out the indexing method (with curly arrows and what not to indicate replacement reactions, rearrangements, red-ox, etc.). Personally, I loved it! At the beginning of each volume was the unsigned “Trends and Developments in Synthetic Organic Chemistry” for each year. Maybe 6-12 pages of brief discussion with references. I used to photocopy the Trends section from each new volume and check to see what was right and what was wrong in subsequent years. (I actually wrote to Karger and suggested that they reprint the Trends sections as a slim volume either for sale (for money) or as a promotional tool to keep Theilheimer within “eyeball” view on every chemist’s bookshelf. They declined to follow up on the suggestion.) I had several boxes of Trends and annotated references … thinking that someday I’d do further analysis and reveal how often Theilheimer (and Tozer-Hotchkiss) got it right and wrong. That is STILL a project that could easily be done with access to SCI / WoS and other literature resources. The recognition and prediction of Trends was very gutsy and usually done within only 1-3 years of actual publication, not 10 years later. Theilheimer ceased publication in 2015.

    (I once predicted that Castro’s Reagent, an air stable, solid Mitsunobu reagent (J. Org. Chem. 1994, 59, 2289-2291, Mitsunobu-like Processes with a Novel Triphenylphosphine-Cyclic Sulfamide Betaine), would be Reagent of the Year. It wasn’t. I think it lost to tetrakis-TMS-silane.)

    Anybody want to co-author a retrospective analysis of Theilheimer’s Trends with me?

    1. anon says:

      ” I recall wondering why they picked particular articles and ignored others that I thought were worthy of additional discussion. ”

      and I still wonder how and why those articles are picked by C&EN, Nature News & Comment, etc. Is it totally up to the reporter to pick those articles? Are these reporters scientists themselves? If not, how can they evaluate which paper is worth highlighting? By judging the labs that publish them? These choices can affect an early-career scientist’s future.

  6. What says:

    Just like many of the articles chosen here: editor bias. Sometimes good, sometimes random, sometimes way off.

  7. Massive PI says:

    The significance of the structure of DNA was immediately obvious. We can split hairs over the significance/predictability of lesser works, but the truth is that if you are working on something useful, you would know it right away. Otherwise, it’s better to get out of the lab, tell your PI to shove it, and begin formulating something of actual novelty.
