Last month, my colleague Dana Schmidt wrote a blog post about what the Hewlett Foundation and its grantees have learned about improving children’s early learning through the Quality Education in Developing Countries Initiative. Under this Initiative, our grantees implemented a variety of instructional models, both during the school day and after school hours, with children who were enrolled in school and with those who were not. Many of these models were evaluated using randomized control studies or quasi-experimental designs to determine their impact on children’s learning.

At the Hewlett Foundation, we encourage our grantees, and try ourselves, to talk about and learn from failure, or at least from those things that don’t always go according to plan! As the (relatively) new kid on the block with the Global Development & Population team, I thought I would offer a few observations about what we learned about the evaluation process itself, recognizing that hindsight is 20-20 (or some approximation thereof). Most of this wisdom comes directly from conversations with our grantees and colleagues since I joined the Foundation.

We underestimated the time needed for some of the instructional models to be more fully developed, and, in hindsight, we should have allowed our grantees more time to work out the kinks before carrying out some of the randomized control evaluations. In some cases, randomized control study designs were simply not possible, so we had to be flexible about evaluation methods. One of the smartest things we did was to encourage grantees and evaluators to work closely together, so that evaluators could better understand the instructional model and its context, and grantees had input in framing the questions and determining how best to measure learning outcomes.

While the randomized control studies that we commissioned were able to measure the impacts of these instructional models on learning, they could not sufficiently unpack which elements of each model drove those effects. Practically speaking, this has made it difficult to tell which elements contribute most to learning improvements, and are therefore the highest priority for scaling up, and which could be dropped or de-emphasized, depending on resource availability. Despite these challenges, the meta-analysis by Patrick McEwan of Wellesley College goes a long way toward unpacking what is known about improving learning, based on a review of dozens of randomized control studies conducted over the past 20 years or so. We think it is essential reading. We will continue to work with our grantees to help them better unpack the elements of their instructional models wherever possible, and we encourage others to take this essential step before initiating randomized control studies in the future.

We also underestimated the challenges associated with completing cost analyses of these instructional models. But where we did succeed, the analyses produced valuable information that our grantees and policy makers can use to identify areas for streamlining or improving efficiency to enable scale-up (e.g., rethinking or restructuring teacher training, mentoring, and support, and finding the most affordable ways to get instructional and reading materials into the hands of teachers and children).

Finally, we learned that it is especially important to spend more time and energy, from the outset, figuring out possible delivery channels and constraints on scaling up. We assumed that we and our grantees would be able to build ownership and political will for scaling up based on evaluation results, coupled with brokering new partnerships and financing relationships. We did not fully appreciate all of the systemic barriers to change in environments where incentives and accountability are not currently structured around learning outcomes. Nor did we sufficiently plan for how to manage the discontinuity in political will when reform-minded leaders and other allies left office.

We also could have done a better job of structuring our support to some grantees so that they could work with policy makers and other key stakeholders to answer vital questions about scaling up. For example: When should a program scale, and does that mean deepening services and impact in existing locations, spreading innovations virally, or scaling vertically through government or other key providers to reach more children with a basic package of improvements? How can this be done without losing the basic integrity of these instructional models? Who would be responsible for quality assurance? Who pays for what, and how can the necessary commitments be secured and the roles of the various key actors clarified?

So what does this mean for the remaining nine months or so of this time-bound initiative that we started several years ago? The Hewlett Foundation is not funding the full scale-up costs of these programs; we don’t see that as our role in getting sustainable solutions in place, and we simply don’t have the resources to do so. Rather, we are supporting our grantees to consolidate and expand the most promising instructional models, and to use the results of evaluations and cost analyses to their best advantage. We are providing more intensive technical support and capacity building to a few of our existing grantees to assist them in pursuing opportunities that have emerged for scaling up. We also hope to document and share some of our experiences through our participation in and support of the Brookings Institution Center for Universal Education’s “Millions Learning” work. Finally, we will be intensifying our support for household-based assessments of student learning, like ASER in India, Uwezo in East Africa, and Beekunko and Jangandoo in Mali and Senegal, in order to better capture data on all children’s learning and to inspire both communities and policy makers to take notice and take action to improve it.