A historian explores the dark side of metric-based performance evaluation

The Tyranny of Metrics

Jerry Z. Muller
Princeton University Press
240 pp.

The notion of a “metric” as a performance measure became familiar in the 1970s and 1980s as a tool of business management. Almost immediately, its use was extended to the assessment of a range of activities and institutions, from medical outcomes and educational programs to military projects. The credibility of metrics rests in part on an affiliation with ideals of business efficiency and in part on the supposition that measurement is tantamount to science.

Although the numbers whose “tyranny” forms the subject of Jerry Muller’s timely book share some of the attributes of scientific measurement, their purposes are primarily administrative and political. They are designed to be incorporated into systems of what might be called “data-ocracy,” often for the sake of public accountability: Schools, hospitals, and corporate divisions whose numbers meet or exceed their goals are to be rewarded, whereas poor numbers, taken to imply underperformance, may bring penalties or even annihilation. In The Tyranny of Metrics, Muller shows how teachers, doctors, researchers, and managers are driven to sacrifice the professional goals they value in order to improve their numbers.

Scientists these days sometimes contrast “data-driven” science with institutional routines—and especially with the fake news that seems now to run rampant in public debate. Yet metrics, when used to ease administration, have become graveyards of such naïve idealism. Data are almost never, as the etymology implies, “given,” but have to be actively made or taken. The appropriate uses of data are subject to intense discussion and debate and depend in most cases on the framing and interpretation of models.

Metrics are typically designed for routine administrative use, often in a context of political disagreement. As Muller shows, the expectation that they should be transparent leaves little space for expert interpreters to adjust or explain misleading measures of therapeutic effectiveness, school quality, research impacts, or even corporate profitability. Because none of these are concrete in a way that would permit rigorous measurement, such metrics can only function as indicators and can lead to terrible consequences when they are imposed reflexively.


Overvaluing standardized test scores creates perverse incentives for teachers, argues Muller.

The American obsession with metrics of school performance, which took off in the early 1980s and is now increasingly globalized, has probably done more harm than good, suggests Muller. Standardized tests are unable to capture what is really valuable in an education, and schools have no control over many of the variables that enter into measures of student performance. In one specific sense, the metrics are counterproductive because they encourage a focus on test performance rather than on the knowledge that a proper curriculum aims to advance. Metrics of this sort privilege symbols over reality.

There are, likewise, many well-documented instances of needy patients refused treatment for the sake of statistics. (“V.A. doctors say rating push hurts patient care,” proclaimed The New York Times on the front page on 1 January 2018.) When doctors or hospitals are judged by the success of operations on patients with a particular diagnosis, they face a strong incentive to avoid treating the neediest patients, because these are ordinarily the least likely to recover.

Not infrequently, metrics produce results almost opposite to those for which they were intended. Muller gives the example of hospital emergency rooms that improved their metric for timely admission of patients by allowing a line of ambulances to form outside.

Muller’s basic arguments, as he acknowledges, are not new. Thoroughgoing critiques of performance metrics began appearing in the 1970s in response to a wave of extravagant ambitions for numbers-based management. A particularly noteworthy target of such criticism was Robert McNamara, who became infamous for his fetishization of body counts during the Vietnam War.

There were at least two fundamental problems with this metric: first, that the numbers could be fudged—for example, by including noncombatants in the death counts—and second, that the number of deaths did not seem to contribute much to winning the war. We must add the moral point that this metric encouraged the slaughter of innocents.

In 1975, the American social psychologist Donald Campbell and the British economist C. A. E. Goodhart independently articulated the principle that reliance on measurement to incentivize behaviors leads almost inevitably to a corruption of the measures. Muller explains the logic of this corruption and defends, in place of indiscriminate numbers, an ideal of professional knowledge and experience.

Measurement, he concludes, can contribute to better performance, but only if the measures are designed to function in alliance with professional values rather than as an alternative to them. Good metrics cannot be detached from customs and practices but must depend on a willingness to immerse oneself in the work of these institutions.

About the author

The reviewer is at the Department of History, University of California at Los Angeles, Los Angeles, CA 90095, USA.