Null Hypothesis Watch

From a report on a site called Straight Talk, on a study by Dale Farran and Mark Lipsey, who write:

Our initial results supported the immediate effectiveness of pre-k; children in the program performed better at the end of pre-k than control children, most of whom had stayed home. The press, the public, and our colleagues relished these findings. But ours was a longitudinal study and the third grade results told a different story. Not only was there fade out, but the pre-k children scored below the controls on the state achievement tests. Moreover, they had more disciplinary offenses and none of the positive effects on retention and special education that were anticipated.

Those findings were not welcome. So much so that it has been difficult to get the results published. Our first attempt was reviewed by pre-k advocates who had disparaged our findings when they first came out in a working paper – we know that because their reviews repeated word-for-word criticisms made in their prior blogs and commentary. We are grateful for an open-minded editor who allowed our recent paper summarizing the results of this study to be published (after, we should note, a very thorough peer review and 17 single-spaced pages of responses to questions raised by reviewers).

Social desirability bias is a major factor in what gets published as research into poverty. That is why even when I see studies that seem to refute the null hypothesis, I am doubtful that they will replicate.

30 thoughts on “Null Hypothesis Watch”

  1. Will those of us that like using statistics to understand the world even be able to do so in a PC world? Or will good data be impossible to find?

    It doesn’t even have to be intentional. For instance between selective high IQ immigration from Africa and increasingly mixed race children identifying as black we will probably see the black white IQ gap close over time. Not because genes were ultimately disproven in favor of environment. But because changes in genetics will not be accurately measured for fear of un-PC results.

    How should someone view statistics that contradict their worldview in a world where statistics are often misrepresented or suppressed? Does this not mean that there is going to be one less reliable tool with which to change one’s mind?

    • Data reliability is a huge issue. In general, getting accurate information depends on serious personal accountability for getting it wrong. But with many scientific or research papers, there are no audits, the full data set often isn’t available for independent scrutiny, and even when these sets are available, people tend to simply accept them on trust. Yes, occasionally instances of outright fraud are discovered, or retractions made, but those are pretty rare.

      What exactly is the penalty for not replicating? Most people are willing to give researchers the presumption of innocence and the benefit of the doubt. How can we distinguish a good faith disagreement or statistical flukes from intentional misrepresentations? But if no one is held accountable, nothing will change.

      Here’s my proposal: body and lab cameras.

      Body cameras are one of those new institutions I wrote about in a comment to “A question about monetary systems” that must have gotten stuck in some spam-filter or something. They are what you get when social trust breaks down, when we go from a high-trust to a low-trust equilibrium.

      In the case of police, we have a similar issue. In a conflict of testimony, without video, what the cop says on the stand tends to be trusted, which creates a moral hazard and incentive for perjury and other bad behavior. With body cameras, we don’t have to rely on testimony, and with the prospect of accountability a new behavioral equilibrium arises. It’s not an unalloyed improvement – the public can be nudged to interpret video out of context or in a more negative light than warranted by the nature of the job, and of course cops tend to dislike being recorded on the job as much as anybody – but it’s still probably better overall.

      So, why not scientists and researchers too?

      After all, what does “replication crisis” really mean? It means our existing institutional arrangements were not adequate to produce reliable, trustworthy results, and that a cloud hangs over the existing body of knowledge. How should we respond to that? Personally, or structurally? “Trust us, we’ll do the same things, but better this time, we promise!” is the personal approach. That’s not very robust against the risk of it all happening again.

  2. My observation is these programs exist because they are a substitute for daycare. Evidently helping families afford daycare is communist or something so we pretend like these pre-k programs are “school.”

    Just provide subsidized daycare to working families. It will be cheaper.

  3. Family relationships dominate eventually. Your parents either have clues or they don’t.

  4. Agnostic about this. One study in one place adds to the body of data but on its own means nothing. Logic says pre-k should provide more long-term enrichment than being plopped in front of a TV every day. The details of the study will provide the useful information.

    • To the extent that pre-k is schoolish, it’s going to involve a fair amount of 4-year-olds being told to sit still and pay attention. That’s something a lot of them, especially males, don’t do well. I would not be surprised if this leads to a fair amount of “this is hard … this is stupid … I hate this” and a general negative attitude toward school. By third grade, maybe that overwhelms the “long term enrichment.”

      I would love to see a breakdown of the results by gender.

      • I suspect it’s more just a selection effect: the types of parents who put their kids in pre-K programs are more likely to have children with below average academic abilities and greater than average discipline problems. Might be a bit of both, though, really.

    • “Logic says pre-k should provide more long-term enrichment than being plopped in front of a TV every day.”

      You’d think. Then again I know a kid in a Nordic country who has taught himself the basics of English by watching cartoons on tv (so his English has an American accent and tends toward a lot of short, imperative exclamations – “let’s go, guys!”). Pretty sure it wouldn’t have happened in pre-k.

    • “Logic says pre-k should provide more long-term enrichment than being plopped in front of a TV every day.”

      Considering how young these children are, I do not see why that would be true. What they are learning at that age is language and about the physical world.

  5. The negative effects are very likely selection bias, because 1) the kids who didn’t get the treatment mostly stayed home, which meant they mostly had someone to care for them at home, which meant that the kids that didn’t get the treatment probably had higher socioeconomic status than the treatment group. Also 2) educational effects are almost always selection bias.

    The fade-out is hardly surprising. I don’t remember much of what I did in preschool; do you? Nearly all experiences fade into the past and become less important with time.

  6. Facts are political. Most underlying political differences are really disagreements about facts. And when your political views become set, it is very hard to accept facts that are not consistent with those political views. And when you control the facts, such as through “studies”, you control the politics.

  7. The study illustrates the limitations of macro level analysis. Aggregates simply do not provide useful, actionable information.

    For example, what are we to make of the wide internal variation in what individuals in the treatment and control groups experienced? The state-funded pre-school program is described as: “not atypical of state pre-k programs generally, operating with some mandated structure based on accepted standards, but neither tightly controlled nor shaped and guided by an overarching vision widely understood and embraced throughout the state.” So the intervention basically consisted of VPK funding rather than any particular educational model. Of kids in the control group: “63% received home-based care by a parent, relative, or other person; 13% attended Head Start or what parents described as a public pre-k program; 16% were in private center-based childcare; 5% had some combination of Head Start and private childcare; and childcare for 3% was not reported.” There are at least 6 widely recognized pre-school education program models, but we have no way to know which were being compared with which. But maybe it doesn’t matter at all what we are comparing, the report quotes a finding that “drivers of successful pre-school programs have yet to be identified.”

    One wishes the subgroup analyses had been more extensive. Home-based care might be best for rural families with a stay-at-home parent. Pre-school centers in wealthy suburbs may outperform stay-at-home parents. Who knows? Did black children in majority-black centers outperform black children in majority-white centers?

    The subgroup analysis provided is far more useful but, as the authors admit, at times ambiguous. 9% of the treatment group was identified as special needs, and throughout the period covered, treatment group individuals were more likely to have an IEP. So is that good or bad? One reason I’d probably opt to keep my kid out of a state program.

    Better news for English-as-a-second-language treatment group kids. Here there are positive results, and perhaps future research will further explore what works best for the various subgroups within subgroups here.

    In short, parents will still have to make decisions about how to raise their children. Government funding might make it easier for some families to choose a particular option. Parents will then have to evaluate whether or not to change options based upon their experience in a particular program. Should states fund pre-school programs? Leave it to the voters to decide. Experts have little to contribute.

    • Voters aren’t noticeably smarter than experts, and often much less so. If all parents were intelligently engaged with their kids’ education, we’d hardly need schools at all.

      Also, if people weren’t selfish, communism would work.

      Don’t get your hopes up, is all I’m saying.

  8. Shouldn’t the Null Hypothesis work in both directions? An educational initiative that shows positive results won’t replicate, scale, or have any effect after several years, and the same should happen with an educational initiative which shows negative results.

    If you are skeptical of all the studies showing positive results, it seems to me that you should be just as skeptical of a study which shows negative results.

    • It really depends on how close to “as good as they can reasonably get” American schools are. If you think they could get a lot better–the default of most politicians and commentators–then just about any change that sounds good should increase student achievement.

      If, on the other hand, you think most schools, no matter how much we dislike the results, are doing about as well as is possible, then you don’t expect much in the way of systematic improvement from changes. On the other hand, for the same reason that most mutations are harmful, it is not unreasonable to expect a change to make things worse.

      If you are bumping up against a ceiling, a change may leave you further from the ceiling.

  9. Check out the study authors’ Reference Note V (emphasis added):

    [v] Among children randomly assigned to the VPK group, 87 percent actually enrolled in VPK (13 percent did not). Among children randomly assigned to the control group, 34 percent ended up enrolling in VPK—primarily because they had been waitlisted and were admitted to the program when a space opened up. This rate of “no-shows” in the VPK group and “cross-overs” in the control group means that the program’s effect on those children who actually attended VPK is substantially larger (i.e., worse) than the effects that we describe in the main text for the full randomized sample—perhaps nearly twice as large.
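
    The “nearly twice as large” figure in the note can be checked with a rough back-of-the-envelope calculation. The sketch below is not the authors’ own computation; it assumes the standard adjustment of dividing the effect for the full randomized sample by the difference in enrollment rates between the two groups, and it assumes assignment changed outcomes only through actual enrollment. The function name and the placeholder effect of 1.0 are made up for illustration.

```python
def scale_itt_to_tot(itt_effect, enrolled_if_assigned, enrolled_if_control):
    """Rescale a full-sample (intent-to-treat) effect to the children whose
    enrollment was actually changed by the random assignment.

    Assumption: assignment affects outcomes only through enrollment, so the
    full-sample effect is diluted by no-shows and cross-overs.
    """
    gap = enrolled_if_assigned - enrolled_if_control
    if gap <= 0:
        raise ValueError("assignment must raise the enrollment rate")
    return itt_effect / gap


# Enrollment rates quoted in the note: 87% of the VPK group enrolled,
# 34% of the control group crossed over into VPK.
# An effect of 1.0 is a placeholder, so the result is a pure multiplier.
multiplier = scale_itt_to_tot(1.0, 0.87, 0.34)
print(round(multiplier, 2))  # 1.89
```

    With an enrollment gap of 0.87 − 0.34 = 0.53, the multiplier is about 1.89, which is consistent with the note’s “perhaps nearly twice as large.”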

    Perhaps our ancestors were entirely capable of choosing the optimal school-starting age after all.

    The real problem is that preschool cannot fix genetic low IQ, but just you wait: the fanatic gap-closers will react to Farran and Lipsey’s findings by proposing pre-pre-school. If institutionalizing children at age 3 doesn’t work, they’ll try age 2. Then age 1.

    I guarantee that the results will get worse as the interventions get more aggressive. The betting is that interventionists will ignore the trend and increase the treatment, no matter how counterproductive.

  10. Some of you missed it, so let me repeat the headline: in this large study, the families of all study participants signed up for the program, and the children were then randomly assigned to treatment and control groups.

    The parents of both groups are from the same population. They had the same higher socioeconomic status, discipline problems, immigrant status etc.

    You “never” get large randomized controlled experiments in the social sciences? This study is not perfect but comes close.

    This is the kind of study that should make you update your priors.

    • Fair enough; I hadn’t clicked the link. A randomized, controlled trial is about as good as it gets.

      Further down, they suggested that the negative effects may be from kids being classified as having special needs in pre-K. A class for 3 year olds will have kids who, at a certain date, were barely three and kids who were almost four, and that is a large difference in age and maturity (comparatively). It may be that many of the youngest kids in each class were identified as special needs simply because they were less developed.

  11. As a left-center voter I have turned against universal pre-K education for this reason. I still tend to think there is a self-selection bias here, as the optimal environment for young kids is to be home with a parent during the day rather than in daycare. And I do think modest pre-K education is probably good (say 4 – 8 hours a week), but most pre-K programs are a full week. So I fall back on the idea that pre-K education has steeply diminishing returns.

  12. I thought this was a good study. It’s just simply not credible that the results would be worse. Roughly the same, sure. I’d expect no real effect, or an initial one that fades out.

    But it isn’t credible that in low income families, that kids who go to state approved preschools would do *worse* than kids who stayed home.

    The authors also clearly thought it unlikely. I thought their speculation that perhaps preschool kids were shunted early into special ed, which slowed their education, was interesting. The reason, whatever it is, will be something like that.

  13. Ed (Realist), old friend, have you considered it may be a socialization problem? Maybe throwing the kiddies into the VPK tank instead of leaving them home with Mom and sibs and neighbors at such a tender age tends to make them more cynical or scared or whatever so they have trouble later.

    • It could be that coaching small children in a developmentally appropriate way to get better at (1) impulse control, (2) frustration tolerance, (3) playing well with others is hard enough in a family setting, done by someone with an irrational (“Love and probably Genetics too!”) interest in the child. It may be harder when done by relative strangers in a day-care like setting, no matter the zeal and training of the care-givers. Could be. Don’t know.

      Also, there are dynamic effects from having so many kids in an age cohort together in an institutional setting. To restate:

      “Take a number of children and put them into a group setting and you may get amplification of tendencies (at least among some children) that lead to trouble.” We can accept this as a plausible hypothesis. To get data for testing it we might need to do studies that are labor intensive, with “thin slicing” of how discipline is handled by families and at home, similar to the way John and Julie Gottman’s research thin-sliced how married couples fight.

      Judith Rich Harris’s work on socialization and peer groups might be relevant, too.

      So many hypotheses so little time…and money…and meanwhile there are so many people committed to making universal pre-K a reality. Methinks it’s free (subject to eligibility requirements) and universally available (in theory) in the city of Rochester, NY.

      https://www.rcsdk12.org/prek3

      • It’s likely that some development goes better away from the family, in impersonal institutions where nobody thinks you’re special, or among distant relatives who don’t think you’re Little Lord Fauntleroy. But when does that begin? Beats me.

        After writing the above I started thinking of the custom common in parts of Africa of “fostering” in which kids are farmed out to distant relatives from an early age in part to toughen them up and put them to work. This is not seen as abuse but simply introduction to the hard knocks of life.

        It’s not clear to me what the logic is, except that it’s common and not necessarily abuse.

        https://www.sciencedirect.com/science/article/pii/S0304387808000060

        • Some societies have used similar methods to prevent war; the kids are hostages. The Tokugawa Shogunate used to keep its vassals’ wives and kids in Edo; the husbands were there part time. It’s a real disincentive to starting trouble.

    • Goofy? Maybe. Sorry if my earlier posts were goofy. Were they unclear?

      Here is one theoretically plausible mechanism. If bad habits get amplified in day care because there are more peers and the “bad kids” group together and egg each other on, that could explain things getting worse. Judith Rich Harris discusses this in her two books _The Nurture Assumption_ and _No Two Alike_, though she’s talking about much older kids.

      It’s plausible that this dynamic might take root outside the home and outside the supervision of blood relatives, or outside the watch of kin and near-kin.

      Educational Realist, can you articulate a theoretically sound and empirically grounded basis for the argument that “Pre-K can’t make things worse”? If you believe that to be the case it’s one thing–can you explain why everyone else should believe it too?

      Based on experimental design concepts, the “treatment” (pre-K) can have an effect. Why assume that either the treatment is (1) better than the control or (2) has no effect, but that (3) negative effects are a priori impossible?

      Methinks in medicine some negative impacts get classified as “iatrogenic.”

    • My earlier comment suggested one possible way pre-k might have a negative effect down the road for some young people. When I taught high school, I was struck by 1) so many students just aren’t interested, and 2) for a small but noticeable minority, school is an enemy.

      I wonder how the readers of this blog would react if from age 5 – 17, they were required to attend sports camp 180 days each year.

      • Versus the alternative of staying home, and whether they like it or not, I still wouldn’t expect the fitness level of mandatory sport campers to drop for almost any of them.

        One could always imagine weird cases like that, sure, but in practice it would be negligible.

        • I agree. The average fitness level of today’s young people is so low that mandatory sports camp would undoubtedly improve it. What I was trying to do was to get readers here to feel just how unpleasant school is for some people. (I also suspect mandatory sports participation would create some people who hate sports, the way mandatory chapel can create anti-clericalists.)

          Pushing the hypothetical along, I’ll bet that many readers of this blog

          1) would hate it,
          2) would try to get through with as little effort as possible,
          3) would by doing that lower the standards of the camp. As long as failing more than a small percentage of campers is unacceptable and makes you a “bad counselor” who won’t get tenure, standards inevitably descend to where only a small percentage fail–or you pretend to have standards but don’t actually enforce them.
