How many scientific papers drop into the void, never to be cited by anyone, ever again? There are all sorts of estimates floating around, many of them rather worryingly high, but this look at the situation by Nature suggests that things aren’t so bad.
The idea that the literature is awash with uncited research goes back to a pair of articles in Science, one in 1990 and another in 1991. The 1990 report noted that 55% of articles published between 1981 and 1985 hadn’t been cited in the 5 years after their publication. But those analyses are misleading, mainly because the publications they counted included documents such as letters, corrections, meeting abstracts and other editorial material, which wouldn’t usually get cited. If these are removed, leaving only research papers and review articles, rates of uncitedness plummet. Extending the cut-off past five years reduces the rates even more.
With a ten-year cutoff there are still uncited papers, of course, but the rate varies interestingly by field. Biomedical papers have a 4% uncited residue, chemistry has 8% refractory material, and physics has 11%. Note, though, that if you remove self-citation by the same authors, these rates go up, sometimes quite noticeably. The ten-year uncited rate across all disciplines, minus self-citation, is about 18%. But another thing that such studies have uncovered is that this rate has been dropping for many years. That’s presumably a function of better access and searching across journals and a related tendency towards longer reference lists in general. (In the sciences, that rise starts around 1980 and has gotten steeper in recent years.)
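To make the self-citation effect concrete, here is a minimal Python sketch of how an uncited rate could be computed with and without self-citations. Everything in it is an assumption for illustration: the `Paper` class, the `uncited_rate` function, and the toy data are hypothetical, and "self-citation" is taken to mean that the citing paper shares at least one author with the cited one. It is not the method used in the Nature analysis.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Paper:
    """Hypothetical record: a paper's authors and who has cited it."""
    authors: frozenset[str]
    # Each citation is represented by the author set of the citing paper.
    cited_by: list[frozenset[str]] = field(default_factory=list)


def uncited_rate(papers: list[Paper], exclude_self: bool = False) -> float:
    """Return the fraction of papers that have no qualifying citations."""
    uncited = 0
    for paper in papers:
        citations = paper.cited_by
        if exclude_self:
            # Assumed definition: a self-citation shares at least one author
            # with the paper being cited.
            citations = [c for c in citations if not (c & paper.authors)]
        if not citations:
            uncited += 1
    return uncited / len(papers)


# Toy data, invented for this example and not drawn from any real study.
papers = [
    Paper(frozenset({"A", "B"}), [frozenset({"C"})]),               # cited by others
    Paper(frozenset({"D"}), [frozenset({"D", "E"})]),               # self-citation only
    Paper(frozenset({"F"}), []),                                    # never cited
    Paper(frozenset({"G"}), [frozenset({"G"}), frozenset({"H"})]),  # self plus outside
]

raw_rate = uncited_rate(papers)
adjusted_rate = uncited_rate(papers, exclude_self=True)
print(f"uncited, all citations counted: {raw_rate:.0%}")
print(f"uncited, self-citations removed: {adjusted_rate:.0%}")
```

In this toy set, the paper whose only citation comes from its own author moves into the uncited column once self-citations are dropped, which is the same direction of shift described above.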
Getting these citation numbers is not easy, though – the harder you look, the more you find:
It’s hard enough to check a handful of papers. In 2012, for instance, Petr Heneberg, a biologist at Charles University in Prague, decided to examine the Web of Science records of 13 Nobel prizewinners, to scrutinize a preposterous-sounding paper that claimed that around 10% of Nobel laureates’ research was uncited. His first glance at the Web of Science suggested a number closer to 1.6%. Then, checking on Google Scholar, Heneberg saw that many of the remaining papers actually had been referenced by other works indexed in the Web of Science, but had been missed because of data-entry errors or typos in the papers. And there were additional citations in journals and books that the Web of Science never indexed. By the time Heneberg gave up searching, after about 20 hours of work, he had reduced the proportion another fivefold, to a mere 0.3%.
So the figures above are most certainly upper bounds. Note also that newer technology has shown that papers get viewed or downloaded at a much greater rate than they’re ever cited, so the papers that no one ever reads or sees must be a much smaller fraction in turn. As the Nature article notes, things also get put into reference databases without being formally cited. Figuring out how useful all these papers are is another question entirely – and one that’s impossible to answer, for a lot of reasons – but it’s at least possible to disprove the idea that a substantial number of papers are never seen, read, or cited.
But still... the papers in the uncited zone do tend (as you’d expect) to be in much less prominent journals (apparently almost all papers published in a journal that you’ve heard of do get cited by someone). And that brings up the “dark matter” problem – these figures are all from Web of Science-indexed journals, a large cohort, but one that (justifiably) ignores the hordes of shady journals that will publish anything at all. You’d have to imagine that citation rates are abysmal among the paper-mill-publishing “journals”, and the great majority of that is surely self-citation. If we count these things as “papers”, then the number of never-seen no-impact publications rises again.
But personally, I’m not willing to count the shady publishers’ product at all, because I just don’t know if it can be trusted. I have no problem with obscure but honest journals, and of course I have no problem with low-impact reference data. There should be piles of it; let’s keep it all now that storage isn’t really a problem. There really are cases where odd little papers from years ago suddenly become relevant – science should never throw things away. But going with some outfit that wants cashier’s checks sent to an island in the Caribbean, accepts your papers in ten minutes every time, and publishes every single word exactly as written (while at the same time claiming to review things) is throwing your work away right from the beginning.