Introduction to modeling success rates from appropriations data
A substantial portion of fundamental scientific research is government-supported (see my editorial in Science). An important factor that affects the experiences of individual investigators and the efficiency of government funding programs is the likelihood that a given scientific application receives funding. Although this can seem to be a simple parameter, it depends on system-level properties of the scientific enterprise, such as the fraction of appropriated funds available for new grants and the number of investigators competing for funding, both of which evolve over time.
Here I introduce a modeling approach that links historical data about funds appropriated to agencies for grants to the probabilities that grant applications are funded in a given year. I will initially focus on the US National Institutes of Health (NIH) because the relevant data are readily available and changes in the funds appropriated over time have varied substantially, introducing some important behavior that can, at first, seem hard to understand.
NIH appropriations history
The levels of funds appropriated for NIH from 1990 to 2015 are shown below:
These data are in nominal dollars—that is, they are not corrected for the effects of inflation. The NIH budget was doubled from 1998 to 2003 in a coordinated effort by Congress, and these points are shown in red. Note that these data and the subsequent analysis exclude funds and applications associated with the American Recovery and Reinvestment Act, which affected budgets in 2009–2010.
NIH success rates
One of the factors that has a great influence on the research and science policy communities, both directly and culturally, is the likelihood that a given grant proposal will be funded, usually measured in terms of the “success rate” or “funding rate.” The success rate is the number of new and competing grant applications that are awarded in one fiscal year divided by the total number of grant applications that were reviewed in that year.
Success rate = number of grants awarded/number of grant applications reviewed
Here the term “new and competing” grants refers to grants that have not been funded previously (new grants) and grants that have been funded for one or more multiyear terms but now are competing through full peer review prior to receiving additional funding (competing renewal grants).
Two major factors determine the success rate. The first is the amount of funding available for new and competing grants, as opposed to the overall annual appropriation. This, combined with the average grant size, determines the number of new and competing grants that can be awarded, that is, the numerator in the success rate calculation. The second is the number of grant applications that are submitted and reviewed in a given year, which is the denominator of the success rate calculation. It is determined by the number of investigators who are submitting grant applications and the average number of applications submitted per investigator in a given year.
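The two factors above can be combined in a short calculation. The following is a minimal sketch rather than the model used in this post; the function name, its parameters, and the example figures are illustrative assumptions, not NIH data.

```python
def success_rate(new_pool_dollars, avg_grant_size,
                 n_investigators, apps_per_investigator):
    """Success rate = grants awarded / grant applications reviewed.

    All four inputs are hypothetical parameters chosen for illustration.
    """
    # Numerator: how many awards the new-and-competing pool can pay for.
    grants_awarded = new_pool_dollars / avg_grant_size
    # Denominator: how many applications are submitted and reviewed.
    applications_reviewed = n_investigators * apps_per_investigator
    return grants_awarded / applications_reviewed


# Made-up example: a $4 billion pool and $400,000 grants pay for 10,000
# awards; 25,000 investigators submitting 1.6 applications each produce
# 40,000 applications.
rate = success_rate(4e9, 4e5, 25_000, 1.6)
print(f"success rate = {rate:.3f}")
```

Because the pool appears in the numerator and the applicant count in the denominator, a shrinking pool and a growing applicant population both push the success rate down.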
Note that the success rate fell dramatically immediately after the doubling and continued to fall for several additional years. This led to outcries from the research community and consternation in Congress, which had made funding biomedical research a high priority for a number of years.
The effects of multiyear funding
Why did this dramatic drop in success rate occur? A major factor involves the manner in which NIH research project grants are funded. NIH grants average 4 years in duration and are almost always paid out in consecutive fiscal years. Thus, if a 4-year grant is funded in a given fiscal year, the NIH is committed to paying the out-years for this grant over the next three fiscal years. Because of this, ~75% of the NIH appropriation for a given year (in practice closer to 80% or more because of other commitments) is already committed to ongoing projects, and only ~20% of the appropriation is available for new and competing projects. This makes the size of the pool for new and competing projects very sensitive to the year-to-year change in the appropriation level.
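The sensitivity described above can be made concrete with a small calculation. This is a simplified sketch, not the model used in this post: it assumes that commitments to ongoing grants equal a fixed fraction (here the ~80% mentioned above) of the previous year's appropriation, and the function name and parameters are hypothetical.

```python
def new_pool_change(appropriation_growth, committed_fraction=0.8):
    """Fractional change in the new-and-competing pool, relative to a
    flat budget, for a given fractional change in the appropriation.

    Assumption: commitments are fixed at `committed_fraction` of last
    year's appropriation, so every added or lost dollar lands in the pool.
    """
    flat_pool = 1.0 - committed_fraction
    pool = (1.0 + appropriation_growth) - committed_fraction
    return pool / flat_pool - 1.0


for growth in (0.05, 0.0, -0.02):
    print(f"appropriation {growth:+.0%} -> new pool {new_pool_change(growth):+.0%}")
```

Under these assumptions, a 5% increase in the appropriation enlarges the pool for new and competing grants by about 25%, and a 2% cut shrinks it by about 10%. The change in the pool is roughly five times the change in the appropriation.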
Funds from grants that have ended are recycled to fund new and competing grants. This recycling is shown schematically below:
The recycling of funds from year to year with funds for grants that end moving into the pool for new and competing grants.
A model for estimating success rates based on appropriations history
To put these effects in quantitative terms, I developed a model for the number of new and competing grants. This model will be described in detail in a subsequent post. Briefly, the model makes two assumptions. First, NIH funds grants with an average length of 4.0 years: 1/4 of the grants have a duration of 3 years, 1/2 have a duration of 4 years, and 1/4 have a duration of 5 years. Second, the average grant size increases annually according to the rate of biomedical research price inflation.
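The recycling of funds under this mix of grant durations can be sketched in a few lines. The code below is an illustration under stated assumptions, not the model described in this post: it ignores growth in grant size, treats the entire appropriation as grant funding, and assumes a flat steady state before the first year. All names are hypothetical and the amounts are in arbitrary units.

```python
# Duration in years -> share of grants, as stated in the text.
DURATION_MIX = {3: 0.25, 4: 0.50, 5: 0.25}


def still_active(age):
    """Share of a cohort still being paid `age` years after its first year."""
    return sum(share for years, share in DURATION_MIX.items() if years > age)


def new_grant_pools(appropriations):
    """Funds left for new and competing grants in each year.

    Assumptions (not from the source): constant grant size, no other
    commitments, and a flat steady state before the first year.
    """
    mean_duration = sum(years * share for years, share in DURATION_MIX.items())
    max_age = max(DURATION_MIX) - 1
    # First-year dollars awarded to earlier cohorts, most recent last.
    history = [appropriations[0] / mean_duration] * max_age
    pools = []
    for total in appropriations:
        # Out-year payments owed to cohorts funded 1..max_age years ago.
        committed = sum(still_active(age) * history[-age]
                        for age in range(1, max_age + 1))
        pool = total - committed
        pools.append(pool)
        history.append(pool)
    return pools


# A single 10% step up in the appropriation, then flat funding.
print(new_grant_pools([100, 110, 110, 110]))  # approximately [25, 35, 25, 25]
```

In this toy example, a one-time 10% increase raises the pool for new grants by 40% for a single year. The pool then falls back because the enlarged cohort must be paid in its out-years, which is qualitatively the behavior seen after the doubling.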
This model for the number of new and competing grants is combined with a model for the number of research project grant applications that are reviewed annually. The basis for this latter model is that the number of applications submitted rises in response to increases in the NIH appropriation with a lag of about 2 years. This model will also be described in a subsequent post.
The success rates predicted from the model are compared with the observed success rates below:
The agreement is reasonable, although certainly not perfect. The overall Pearson correlation coefficient is 0.866. Notably, the model accurately predicts the sharp drop in the success rate immediately following the doubling period. In addition, because the model assumes constant policies at NIH, the periods where the model results agree less well with the observed values suggest times when NIH did change policies in response to ongoing events. This will be explored in a subsequent blog post.
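For readers who want to reproduce this kind of comparison, the Pearson correlation coefficient between observed and predicted success rates can be computed as sketched below. The function is a generic implementation, and the two short series are made-up placeholders, not the NIH data behind the 0.866 value.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations (covariance without the 1/n factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Root sums of squared deviations (the 1/n factors cancel in the ratio).
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


# Hypothetical observed and predicted success rates, for illustration only.
observed = [0.31, 0.30, 0.25, 0.21, 0.20, 0.22]
predicted = [0.32, 0.29, 0.22, 0.20, 0.21, 0.24]
print(f"r = {pearson(observed, predicted):.3f}")
```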
Several parameters can also be examined to characterize this and other funding scenarios. The first is the total amount of funds invested, both in nominal and constant dollars.
The investment in nominal dollars was $557 billion.
The investment in constant 1990 dollars was $334 billion.
The observed mean success rate was 0.248.
The mean success rate predicted from the model was 0.251.
The observed standard deviation in the success rate was 0.052.
The standard deviation in the success rate predicted from the model was 0.063.
Modeling an alternate scenario with more consistent funding increases
Suppose that, instead of the doubling, Congress had committed to steady increases in the NIH appropriation beginning in 1998. To match the investment in constant 1990 dollars from the doubling and postdoubling era, this corresponds to annual increases of 7.55%.
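One way to find such a matching growth rate is a simple bisection search over the annual increase. The sketch below illustrates the idea under stated assumptions and is not the code used for this post: the function names are hypothetical, the deflators are price indexes relative to the base year, and the example inputs are invented rather than taken from NIH appropriations.

```python
def total_constant(nominal, deflators):
    """Total investment in constant dollars (nominal / price index)."""
    return sum(amount / index for amount, index in zip(nominal, deflators))


def steady_path(base, rate, n_years):
    """Appropriations growing by `rate` per year from a base-year level."""
    return [base * (1 + rate) ** (k + 1) for k in range(n_years)]


def matching_rate(base, actual, deflators, lo=0.0, hi=1.0, tol=1e-10):
    """Annual growth rate whose constant-dollar total equals the actual one.

    Assumes the answer lies between `lo` and `hi`; the total rises
    monotonically with the rate, so bisection converges.
    """
    target = total_constant(actual, deflators)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        trial = total_constant(steady_path(base, mid, len(actual)), deflators)
        if trial < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Invented example: a boom-then-flat path and 3% annual price inflation.
actual = [115, 132, 150, 150, 150, 150]
deflators = [1.03 ** (k + 1) for k in range(len(actual))]
print(f"matching steady increase = {matching_rate(100, actual, deflators):.2%}")
```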
We can now use the modeling tools that we have developed to estimate the consequences of such an appropriations strategy in terms of success rates and other parameters.
As might be anticipated, under the new scenario, the success rates vary much less dramatically. The standard deviation in the success rate predicted from the model for the new scenario was 0.022. This is smaller than the observed standard deviation by a factor of 2.4. Thus, the scenario with steady appropriation increases would substantially decrease the variability, and hence the apparent capriciousness, of success rates.
The mean success rate predicted from the appropriation scenario with steady increases from 1998 on was 0.257. This is higher than the mean success rate based on the actual appropriations data by 2.6%.
Although this is a relatively modest change in mean success rate, it corresponds to a decrease in the number of unsuccessful applications from 702,000 under the actual scenario to 667,000 under the new scenario. Thus, the steady approach to funding would have reduced the number of unsuccessful applications by 35,000. With the conservative assumption that preparation of a grant application requires 1 month of work, this difference corresponds to the efforts of 111 investigators working full-time over the entire 26-year period.
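The effort estimate above is straightforward arithmetic and can be checked as follows. The variable names are illustrative. With the rounded application counts quoted in the text, the result comes out near 112 rather than 111; the small difference presumably reflects unrounded counts in the underlying model.

```python
unsuccessful_actual = 702_000   # unsuccessful applications, actual scenario
unsuccessful_steady = 667_000   # unsuccessful applications, steady scenario
months_per_application = 1      # the conservative assumption from the text
years = 26                      # 1990 through 2015, inclusive

applications_avoided = unsuccessful_actual - unsuccessful_steady
person_years = applications_avoided * months_per_application / 12
full_time_investigators = person_years / years

print(applications_avoided)            # 35000
print(round(full_time_investigators))  # 112 with these rounded inputs
```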
A modeling approach has been developed that allows estimation of NIH grant success rates given the history of appropriations. The model is used to demonstrate that alternatives to the “boom” in appropriations corresponding to the NIH budget doubling, followed by the “bust” of more than a decade of flat appropriations (or falling appropriations when the effects of inflation are included), would have resulted in a 2.6% more efficient distribution of funds (measured by the number of applications that would be needed to distribute the same amount of funds in constant dollars) and in success rates that were less variable by a factor of 2.4. The model can be applied to other potential past or future appropriation scenarios, and the modeling approach can be applied to other agencies.
The next post will explore the component of the model focusing on the number of new and competing grants in more detail.
Available code and documents
An R Markdown file that generates this post, including the code for the model, is available.