Against the Null Hypothesis

Andrew Flowers reports on the Chetty et al. study showing differences in teacher value added:

Their numbers are being replicated in many different settings. Even in Rothstein’s paper critiquing their method, he replicated their results using data from North Carolina public schools. “I’m not aware of another area of social science where there has been so much replication, in such a short time, and they’ve all found the same result,” Kane said. On the consistency of replicability, Staiger said “it’s just astounding, actually.” Even Rothstein grants this: “Replication is an extremely important part of the research process … I think this is a great success, that these very complex analyses are producing similar results.”

We’ll see. My money is still on the null hypothesis.

31 thoughts on “Against the Null Hypothesis”

  1. In any case, refusing to sever ties with poor teachers is an odd stance. It says that teaching doesn’t matter. But if it doesn’t matter then why so determined to retain the teachers?

    • “But if it doesn’t matter then why so determined to retain the teachers?”

      Well, teaching per se doesn’t much matter, at least not compared to the students’ underlying ability. However, hiring teachers is much more of a pain point than firing teachers, so a principal is unlikely to go through the work of firing teachers unless they are really, horribly awful.

      In other words, few teachers are really, horribly, awful.

      • I’m referring specifically to people who defend inertia and do-nothing as a policy strategy.

        • I know. That was an answer. People are determined to keep the strategy because job security is a perk that teachers like and trade considerable money for. So administrators and others realize that pushing to end the strategy won’t make teachers happy and, *since there aren’t many really awful teachers*, isn’t worth pushing for.

          • I’m not asking you to care. As I’ve said below, I’m uninterested in whether or not evaluation reform comes to pass in my particular district. It’s not as if evaluation reform has done much good. Most states get the same results with test based evals as they do otherwise, and the principals will simply refuse to take next steps if they don’t agree with it.

            I pointed out the flaw in your reasoning–namely, you seem to think that evaluation reform will uncover a whole bunch of terrible teachers. But that appears not to be true. Meanwhile, teachers will need a lot more money to be compensated for the loss of job security. Assuming that ever happens, which is again unlikely. Even the states without tenure don’t have high teacher dismissal rates.
            Again, the point, which flies over your head: job protection is a perk. Getting rid of it costs money. Principal behavior demonstrates that they aren’t longing to rid themselves of high percentages of bad teachers.

  2. I wouldn’t think that would necessarily refute your null hypothesis. The benefits of the good teachers may not stick for more than 3 years and may not build up over more than 3 years.

  3. I’m confused by your rhetoric.

    In social science, all variables are correlated. The null hypothesis is always false.

    I think what you mean to say is that true causal effect sizes tend to be smaller than people think/hope. I agree, but that does not make the null hypothesis true.

    Why would you find it hard to believe that some teachers are somewhat better than others at raising test scores? And if you have a lot of data, why wouldn’t you be able to see that?

  4. It’s implausible that teachers make no difference and are substitutable for each other.

    Some teachers have classrooms where the students waste time, talk over the teacher, and even verbally assault each other. Other teachers of those same students sometimes manage to keep them in line and cover some material. If the difference between these teachers is null, then school is an incredible waste of time.

    I think you should be more precise about what your null hypothesis is. Do you mean only for public school interventions, or for grade-school education more broadly?

    • It isn’t just that teachers aren’t interchangeable or even that an intervention works and sticks. It is also that we could actually accomplish and scale the solution.

      My response to this for this scenario is that we already do a poor job of retaining teachers. Since we are already attriting teachers (and not entering a fart ton of people with education degrees), are we sure we are retaining the best teachers? No, of course we aren’t. We should try it.

      • But in this case we don’t have to know how to create effective teachers, only how to measure them. The intervention is getting rid of the least effective. That seems inherently scalable.

        • Seems like it. But you have to replace them with someone. My thought is you could move some students from the poorer performing teachers to the better performing teachers.

          • If you just replace them with a new hire selected in the usual way *who hasn’t already been fired elsewhere for being ineffective*, that should be all you need.

          • One point of the null hypothesis is that *nothing matters*. As Arnold has said, he does not literally believe this to be true, but it is hard to disprove.

            So the question here maybe is not “can this scale?”, but rather “does this really improve student outcomes, particularly over the long-term?”

            I don’t know enough about the research – have the authors and the replicators proved beyond doubt that this really does improve outcomes?

        • You would have to produce more candidates than are produced currently. I think it could be done, but what are we talking about? Half again? Doubling them? That is a gargantuan task. And not to mention doubling or more the screening work.

    • “It’s implausible that teachers make no difference and are substitutable for each other.”

      In one sense, you’re correct. It’s implausible that a teacher who is skilled at teaching a subject to motivated kids with high IQs will have any skills whatsoever at engaging and teaching unengaged kids with mid-level IQs.

      In another sense, you’re wrong. In the sense that teachers are being compared, it’s quite possible that they are interchangeable. That is, in the scenario you paint, it’s entirely possible that the teacher with well-behaved kids will not have noticeably higher test scores than the teacher with out of control kids. At the high school level, the tests are so out of whack with ability levels that engaged kids aren’t necessarily learning enough to register on test scores.

      Of course, kids are better off in a classroom where they are engaged and learning something, even if the learning doesn’t register on tests. This is what we call “non-cognitive issues” and why teachers often say that school is more than just test scores.

  5. Don’t forget that teachers unions are inherently against sorting more- from less-effective teachers. So even if you have an instrument to do that, you may not be able to use it at any kind of meaningful scale.

  6. “don’t forget that teachers unions are inherently against sorting more- from less-effective teachers.”

    Don’t forget that teachers unions are teachers. That is, teachers are not in favor of sorting more from less effective teachers, because most of us tend to know that opinions about effectiveness vary widely.

      • Not sure what that has to do with anything. My only point was that many people say “teachers unions” because they think unions have some fell design contrary to teachers’ interests. As if teachers are longing to be evaluated but those mean rotten unions stop it.

          • That, too, wasn’t my point. Read much?

            I’m not an activist. It’s largely irrelevant to me whether you get your way or not. If you haven’t noticed, states that have implemented evaluation reform haven’t had the results you long for. And the results you long for are impossible to achieve in high school anyway, since it doesn’t have the “growth” measurability that VAM needs.

            So I’m not worried about teacher eval. Just pointing out errors. In this case, the error was simply that the speaker was trying to pretend that teachers unions aren’t teachers.

  7. I’m still waiting for a clear statement of this “null hypothesis”.

    I learned stuff in school, stuff I demonstrably would not know if I had not attended school, because I would never have looked it up.

    The capital of Honduras is Tegucigalpa. Null hypothesis disproven.

  8. The null hypothesis is that education interventions don’t work. Arnold adds caveats because researchers improperly cherry-pick their factors.
