Null Hypothesis Deniers

The IGM forum of economists asks its participants whether they agree with:

Comparing their students’ average gains on standardized tests over the school year makes it easier to predict which teachers — all else equal — are more likely to improve their student’s long-term life outcomes.

Nobody disagrees. Although I would have disagreed out of spite, the wording of the question makes it impossible to disagree. The question stipulates “all else equal” and uses the cautious “are more likely to improve.” So if I thought that, other things equal, there is a .0001 chance of a good teacher causing an improvement of .0001 standard deviations in their student’s [sic] long-term life outcomes, I should answer “agree.”

A more interesting way to phrase the question might have been, “Using students’ average gains on standardized tests over the school year to determine teacher retention and firing decisions can reliably lead to measurably important gains in long-term life outcomes.” The economists probably still would have committed the Type I error, but at least this wording gives the null hypothesis a fighting chance.

6 thoughts on “Null Hypothesis Deniers”

  1. In finance, we evaluate traders and funds not in terms of annual return, but in terms of ‘Alpha’ – that is – the difference in the rate of return they achieved vs what was expected, which can be estimated with some suitable market index of comparable riskiness.

    So, in education, we might try evaluating teachers with ‘Educational Alpha’, that is, not their average test-score yield, but the extra yield they get vs. what we would expect for the kids, given their statistical profile at the beginning of the year.

    That’s a much fairer way to evaluate performance. Of course, lots of people aren’t very interested in fair performance measures becoming public knowledge.

    For the estimation model, I’d guess the most predictive elements in any factor analysis would be IQ and past grades and test performance. That model would not be PC, which is too bad, because it would probably show that the constant disparagement thrown at ‘bad schools’ as well as the adulation of ‘good schools’ is mostly a bunch of undeserved blame and credit given to people who probably have similar levels of performance measured in this ‘value-added above expectations’ manner.

    If the model’s expected student test-score yields have low dispersion and average, say, “+10 points”, and a teacher shows a persistent ability to get a long-run trend average of +15 points, then he or she has ‘Educational Alpha’ of +5 points, that is, 50% above expectation.

    The average Educational Alpha is defined as 0, and the null hypothesis says that most teachers score very close to zero, with a low standard deviation.

    Still, I’d expect some special teachers to consistently score in the small positive numbers, and so I might be tempted to answer the question ‘yes’ too, though with the above caveats.

    Still, how many super-teachers with an EA above, say, 10, are we talking about? Raymond Wolters wrote this about Teach For America:

    At Harvard the number of applicants increased from 100 in 2007 to 293 in 2010…. Even when choosing at Harvard and Yale, however, TFA selected only a minority of the candidates. This led one wry journalist to say that the solution to America’s educational problem was at hand: “Evidently, all we have to do is to fire all the schoolteachers and replace them with the best Harvard graduates—but not the run-of-the-mill Harvard grads. Just the best Harvard graduates.”

  2. I quibble with the wording in another way.

    I think school should focus less on “life outcomes” and more on “learning useful things”. As “useful things”, I will accept standardized tests. By this measure, I expect that teachers and schools both make a difference. That is, a teacher whose students have big score improvements, will also tend to be able to improve the scores for a new batch of students next year. I expect this to persist, to a lower degree, if you make that teacher work in a different school and thus with a different student body.

    Maybe I’m wrong, but that’s the null hypothesis I care about. Arnold, would you claim that if a teacher raises test scores for 10 years in a row, you don’t consider it likely they will do it for the next 10 years as well?

    • Assuming that, what intervention would you make to improve education on the whole based on those factors?

      • My proximate solution is more standardized testing, so that you can find out which teachers are actually teaching anything concrete.

        However, that’s merely a first step, because it seems quite hard to keep public school employees from just gaming the system. So I would also make it an even bigger priority to allow parents a choice in what school their kids go to. As Arnold frequently writes: exit > voice.

        I don’t believe we would have good supermarkets if there was one supermarket in a given region and everyone had to go to it. Competition is important or there will never be real improvement.

        That’s my platform. Anyone want to vote for me? Ah well….

  3. Naturally a teacher who’s merely better than the other teachers in a local school will look amazing. It would be a hard choice for some folks (say they’re childless and thus living or at least working in a poor neighborhood doesn’t trouble them) – enjoy the comforts of more talented and better behaved children, parents and administrators, yet look mediocre by ‘value added’, or be a big fish in a stinky pond. Actually, if enough kids moved around from place to place you could probably loosely identify the good teachers at good schools who are nonetheless merely par for that environment (it seems cruel to me to punish them for that). The only reason I can think of that such an automated “performance added” policy won’t work *if enacted* (good luck!) is that it’s too hard to prevent teacher/admin cheating – and there’s plenty of that in the poorer schools; enough to worry.
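
The ‘Educational Alpha’ arithmetic from the first comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `educational_alpha`, the per-student lists, and the sample numbers are hypothetical, and the expected gains would in practice come from whatever prediction model one trusts, which is not specified in the thread.

```python
from statistics import mean


def educational_alpha(actual_gains, expected_gains):
    """Average of (actual gain - expected gain) over one teacher's students.

    Hypothetical helper: both arguments are per-student test-score gains,
    in points, listed in the same student order.
    """
    if not actual_gains:
        raise ValueError("need at least one student")
    if len(actual_gains) != len(expected_gains):
        raise ValueError("actual and expected gains must have the same length")
    differences = [a - e for a, e in zip(actual_gains, expected_gains)]
    return mean(differences)


# Made-up numbers mirroring the comment: the model expects +10 points per
# student, and the teacher's students average +15.
expected = [10.0, 10.0, 10.0, 10.0]
actual = [15.0, 14.0, 16.0, 15.0]

alpha_points = educational_alpha(actual, expected)    # excess gain in points
alpha_percent = 100 * alpha_points / mean(expected)   # excess relative to expectation

print(alpha_points, alpha_percent)
```

With these inputs the excess is 5 points, which is 50% of the expected gain. Under the null hypothesis described in that comment, most teachers would land close to 0 on either scale.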
