I had a book review recently in Nature, on a new volume (Thrifty Science) that looks over the history of early scientific experimentation from the viewpoint of its frugal nature – the idea of reusing and repurposing equipment, objects, and even rooms in one’s house. There was indeed a lot of this sort of thing, as the book makes clear, but I wondered about one of its conclusions – that such thrift could well be making a comeback. I’m not so sure about that. While there are things that can still be discovered by basement experimenters and make-do apparatus, my own belief is that the bar to discovery has been creeping inexorably higher just because of the nature of science. We build on each other’s work, and we’ve built on an awful lot of stuff that was discovered with simple equipment and under simple conditions. So over time, working in any of those areas tends to become progressively less simple.
A countervailing trend is the availability of such equipment, but the stuff that’s widely available tends to be so because lots of people have already done lots of research with it. If you’re going to make new scientific discoveries on your own out in the garage, you can get plenty of apparatus, but you’re going to have to be more ingenious than ever to come across something that no one has already explored. Robert Boyle and the rest were indeed great scientists (Newton deserves his high reputation even after any historical adjustments you care to make), but they did have plenty of open room to run in as well. And the open spaces of today, it seems to me, are less and less available to thrifty home experimenters. Mind you, I wish that weren’t so. But I don’t think that the scanning tunneling microscope (to pick one example) could have been prototyped in a garage lab. One response to this is “Well, PCR could have been”, but think about the state of molecular biology when PCR was first developed – I’m not so sure about that at all.
This recent paper in Nature perhaps has some bearing on the topic. The authors are looking at a measure of the impact of published papers:
Here we analyse more than 65 million papers, patents and software products that span the period 1954–2014, and demonstrate that across this period smaller teams have tended to disrupt science and technology with new ideas and opportunities, whereas larger teams have tended to develop existing ones. Work from larger teams builds on more-recent and popular developments, and attention to their work comes immediately. By contrast, contributions by smaller teams search more deeply into the past, are viewed as disruptive to science and technology and succeed further into the future—if at all.
They quantify this by an ingenious bibliographic technique. When a paper gets cited, do those citing it also cite many of the references in the original paper? If so, that original paper is more likely to represent work that’s consolidating or extending a field that was already somewhat worked out. By contrast, if a paper tends to be cited on its own, without its own references tagging along, it is more likely to represent a new direction in itself, leaving those coming after it to cite it alone, because there’s nothing else similar to cite next to it. To go to extremes, at one end you have review articles – very useful, but deliberately not breaking any new ground whatsoever. And at the other, you have one-off reports (at least at first!) of something that no one’s ever thought of or tried.
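To make that bookkeeping concrete, here’s a toy sketch of how such a score can be computed. It follows the Funk/Owen-Smith disruption index that (as I understand it) the paper builds on: count the citing papers that cite the focal paper alone (“solo”), those that cite it along with its own references (“echo”), and those that cite the references while bypassing the focal paper entirely, then take (solo minus echo) over the total. The function and the toy data below are my own illustration, not the authors’ actual code:

```python
# Toy sketch of a disruption-style citation measure, assuming it works like
# the Funk/Owen-Smith "D" index. All names and data here are illustrative.

def disruption_index(focal_refs, citers_of_focal, citers_of_refs):
    """Return a score in [-1, 1]: positive = mostly solo citations
    (disruptive), negative = mostly echo citations (developmental).

    focal_refs      -- set of papers the focal paper cites
    citers_of_focal -- dict: paper citing the focal paper -> its own reference set
    citers_of_refs  -- set of papers citing at least one of focal_refs
    """
    solo = sum(1 for refs in citers_of_focal.values() if not refs & focal_refs)
    echo = len(citers_of_focal) - solo  # cite the focal paper AND its references
    # Papers that cite the focal paper's references while skipping the paper itself
    bypass = len(citers_of_refs - set(citers_of_focal))
    total = solo + echo + bypass
    return (solo - echo) / total if total else 0.0

focal_refs = {"A", "B"}
citers_of_focal = {
    "P1": {"X", "Y"},   # cites the focal paper alone -> solo
    "P2": {"A", "C"},   # cites the focal paper plus its reference A -> echo
    "P3": {"Z"},        # solo
}
citers_of_refs = {"P2", "P4"}  # P4 cites A or B but not the focal paper
print(disruption_index(focal_refs, citers_of_focal, citers_of_refs))  # (2 - 1) / 4 = 0.25
```

A review article’s citers keep citing the works it reviewed, pushing the score toward the negative end; a genuinely new report collects solo citations and lands near the positive one.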
Applying this “disruptiveness” metric works surprisingly well – for example, papers that are known, in retrospect, to have contributed directly to major discoveries (as measured by eventual prizes and recognized impact) do indeed rank higher on this lone-citation scale. And when you look at the number of authors on all these papers, you find a very noticeable trend: the number of such “echo citations” (my phrase) grows as the size of the team grows. Which means that average disruptiveness goes exactly the other way. When you look at the set of most-disruptive papers, they are much more likely to have come from smaller-than-median teams, for any given scientific discipline. This distribution also holds for the patent databases, and even for code and routines published on GitHub. Differences in the subjects being researched and the way the experiments are (or have to be) set up are real, but the team-size factor is larger than any of them (although the relative sizes of the teams vary in each data set). If you look at the top 5% in each category (developmental versus disruptive), the graphs in the paper diverge even more sharply.
Think, for example, of the famously huge author lists found in high-energy particle physics. If there are seven hundred and forty-three authors, the resulting paper is more likely to be the long-sought confirmation of some extremely difficult-to-observe phenomenon (like the Higgs boson), whereas the paper that proposed that something like the Higgs boson had to exist will have had nowhere near such an army behind it. Such ideas take a while to catch on to the point that you can get seven hundred coauthors to work on them – and such large teams are more vulnerable to failure if the whole idea is wrong, too.
But this team-size effect holds up from almost every angle. The relative disruptiveness of review articles is not large, for example, but the most disruptive ones have the smallest number of authors. If you remove self-citations or try cutting out all the references but the high-impact ones, the effect is still there. Controlling for publication year doesn’t get rid of it, nor does breaking things down by scientific discipline. Controlling for the authors themselves actually makes the correlation more robust: any given scientist’s most disruptive papers are generally those published with the smallest number of co-authors. Looking at the patterns of the citations themselves, it appears that smaller teams (and solo authors) tend to reach back to older and/or less popular ideas, rather than chime in on something that’s already rolling along. It’s for sure that more of these small-team papers disappear without much of a trace, and they tend to have a long delay before citations pick up, but when they have an impact, it’s a larger one.
The lesson of this paper is not “Small teams good, large teams bad”, though. Both kinds of work are needed, and they’re part of the normal development of scientific ideas. I was gratified to find that the paper addresses the washing-your-car-to-make-it-rain problem: you don’t necessarily generate disruptive work just by artificially forming people into small teams. That’s not how it works – disruptive work instead tends to cause smaller teams to form around it. Funding small teams by traditional means (renewable grant proposals, for example) may well be a recipe for filtering out what could have made them interesting in the first place:
We analysed articles published from 2004 to 2014 that acknowledged financial support from several top government agencies around the world, and found that small teams with this funding are indistinguishable from large teams in their tendency to develop rather than disrupt their fields… This could result from a conservative review process, proposals designed to anticipate such a process or a planning effect whereby small teams lock themselves into large-team inertia by remaining accountable to a funded proposal.
To circle back to the topic I started off with, it would also be interesting to know what sort of facilities and equipment correlate with disruptive work. There probably is a tendency towards increasing cost and complication as you go to larger teams (it’s hard to see how there could not be). But it’s important to get that causality right: giving people fewer resources is probably not the recipe to make them more inventive, since the great majority of people will not respond in the manner you’re trying to induce. And my guess is that the relative technical sophistication (and cost) of even the smaller teams, for a given discipline, has increased over time as well.