Evidence for Teach for America

From this study by Mathematica:

TFA teachers were more effective than the teachers with whom they were compared. On average, students assigned to TFA teachers scored 0.07 standard deviations higher on end-of-year math assessments than students assigned to comparison teachers, a statistically significant difference. This impact is equivalent to an additional 2.6 months of school for the average student nationwide.

They captured data on TFA teachers and non-TFA teachers, and nothing in the data (e.g., their educational backgrounds) predicted this result. Even though this study was carefully conducted, my prediction is that it will not hold up over time. I put my faith in the null hypothesis in education.
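For a sense of scale, the quoted conversion (0.07 standard deviations is about 2.6 months of school) can be turned around to see how much a full year of schooling would be worth in the same units. This is only a back-of-the-envelope sketch: the 10-month school year is my assumption, not something stated in the quoted passage, and the function name is made up for illustration.

```python
# Back-of-the-envelope check on the quoted conversion (0.07 SD ~ 2.6 months).
# ASSUMPTION: a 10-month school year; the quoted passage does not give the year length.
EFFECT_SD = 0.07          # from the quoted study summary
EFFECT_MONTHS = 2.6       # from the quoted study summary
SCHOOL_YEAR_MONTHS = 10   # assumption, not from the source


def implied_annual_gain_sd(effect_sd, effect_months, year_months):
    """Return the SD of learning per school year implied by the conversion."""
    return effect_sd / effect_months * year_months


annual = implied_annual_gain_sd(EFFECT_SD, EFFECT_MONTHS, SCHOOL_YEAR_MONTHS)
print(f"Implied gain per school year: {annual:.2f} SD")
```

Under that assumption, a typical year of schooling works out to roughly a quarter of a standard deviation, so the TFA effect is about a quarter of a year's gain.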

3 thoughts on “Evidence for Teach for America”

  1. Education Realist posted a good takedown of the takeaway from the results.

    Not only are we justified in being skeptical about replication, but I think the real moral of the story is ‘diminishing returns’. Consider:

    TFA teachers who take the Mathematics Content Knowledge Test outperformed comparison teachers by 22 points (or 0.93 standard deviations); those who took the Middle School Mathematics Test also outperformed comparison teachers by 22 points (or 1.19 standard deviations).

    Well, obviously, you don’t get a difference of one standard deviation or more in math scores through random selection of individuals, do you? And yet that kind of difference yields only a 0.06 standard deviation increase (significant at the 0.05 level) over ‘teachers from traditional routes’.

    So, significantly smarter teachers, with substantially less ethnic diversity, and who would likely demand higher salaries for retention over a career in problem schools vs. their competitive employment opportunities, could only squeeze out a tiny amount of extra measured ‘effectiveness’.

    That seems like a very strong result for, not against, the null hypothesis to me.

    • I suppose it depends on exactly how one defines the null hypothesis here.

      I prefer to think of it in Deirdre McCloskey terms. It is an example of something that has statistical significance (it is >95% likely that the TFA group gets better results) but does not have policy significance (in this case, the difference is too small and comes at too high a cost).
