In the small and overheated world of impact evaluation, we have a serious “baby and bathwater” problem. It’s not that we may be tossing out the baby with the bathwater; it’s that we risk throwing out the baby and drowning in the bathwater.

The baby? It’s the value of measuring, in a valid and reliable way, whether something that we intentionally do to change people’s lives—improve their health, advance their educational opportunities, increase their income—in fact does that thing. Across the ideological and methodological divides, I’m guessing there’s quite a bit of agreement about the value of knowing whether X leads to Y—whether, for example, introducing single-sex classrooms leads to girls being more likely to complete primary school. We may resist reducing complex social and economic systems to linear causation, but “if/then” thinking is a core feature of most public policy design. Assessing whether those causal relationships bear out in the real world has the potential to make policy and program decisions better than they otherwise would be.

The bathwater? It’s the circular and often misinformed debate about methodological superiority. Polarizing positions have been staked out between those who favor one way of measuring the difference between outcomes with and without a particular intervention—namely, comparing randomly assigned “treatment” and “control” groups—and those who are deeply skeptical of applying the scientific method to context-specific, nuanced, and dynamic interactions between people and their environment. Observers of these debates, and the participants themselves, must by now be quite weary of the conversation. We have been listening for a long while to characterizations of the “randomistas” and arguments about how random-assignment evaluations stack up in cost, difficulty, and rigor against the many alternatives, from quasi-experimental modeling methods to before-and-after observational studies. Some of the arguments are ancient, while others are newer—or at least being joined with new passion. If you want a refresher on the state of play, you can find some useful resources in two posts on the Center for Global Development’s blog (including the comments) and in this recent paper by Howard White.

In the persistent back-and-forth, in the taking of sides, I fear we are at risk of losing the focus on impact—which is, after all, the main value proposition of impact evaluation. The most important contribution impact evaluation can make is to challenge the practice of measuring only what we spend and what we do, and then confidently assuming good things will result in equal measure. If all impact evaluation does is direct our attention to real-world changes in place of self-promotional storytelling, it will have made a contribution. Undertaking an impact evaluation—regardless of methodology—makes us state, for the record, that we think a causal pathway exists between a particular X and Y. And impact evaluation makes us ask the toughest question: “Will our actions truly do more good than harm?” Far from being an expression of dogmatism, impact evaluations start by admitting that we aren’t so sure about the effects of our actions, and that we’re open to surprise and to learning. Whatever methods we may like or loathe, we have to protect that baby.