With alarming regularity, promising pilots in health care improvement and implementation have little overall impact when applied more broadly. For example, following early reports that care coordination programs benefit patients and reduce costs, a 2012 Congressional Budget Office (CBO) analysis found that, on average, 34 Medicare care coordination and disease management programs had no net effect on hospital admissions or regular Medicare spending. In 2014, Friedberg and colleagues found that patient-centered medical homes (PCMH) in Pennsylvania had no impact on utilization or costs of care, and only negligible improvement in quality, despite early reports promising decreased costs and improved quality of care.
Similarly, surgical checklists were adopted globally after early findings of improved surgical outcomes; in Ontario, policy called for their universal adoption. Yet when Urbach and colleagues recently compared operative mortality and surgical complications across 101 acute care hospitals in Ontario, they found that checklist implementation was associated with no significant overall improvement.
This phenomenon was described in the 1980s by the American program evaluator Peter Rossi as the “Iron Law” of evaluation in studies of the impact of social programs: as a new model is implemented widely across a broad range of settings, its average measured effect tends toward zero. Faced with this pattern, policymakers are often unsure whether to encourage model expansion.
In this post we describe how new models can fall foul of Rossi’s Iron Law in several interdependent ways, and we recommend approaches to reduce the likelihood of this happening. Specifically, we argue that just because a pilot does not work everywhere does not mean it should be wholly abandoned. Instead, data should be reviewed at the individual site level: evaluators and policymakers should understand where and why a model has worked, and use that information to guide decisions about wider expansion.