In 1984, Benjamin S. Bloom wrote,
Using the standard deviation (sigma) of the control (conventional) class, it was typically found that the average student under tutoring was about two standard deviations above the average of the control class (the average tutored student was above 98% of the students in the control class). The average student under mastery learning was about one standard deviation above the average of the control class (the average mastery learning student was above 84% of the students in the control class)
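The percentages in Bloom's passage follow from the normal curve. Here is a minimal sketch of that arithmetic, assuming control-class scores are roughly normally distributed (an assumption the quoted passage does not state). The function name `percentile_of_shifted_mean` is mine, made up for illustration.

```python
from statistics import NormalDist


def percentile_of_shifted_mean(effect_size_sd: float) -> float:
    """Percent of the control class scoring below a student who sits
    `effect_size_sd` control-class standard deviations above the control
    mean. Assumes normally distributed control scores (my assumption)."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return 100 * standard_normal.cdf(effect_size_sd)


# Effect sizes taken from the quoted passage: 2 sigma for tutoring,
# 1 sigma for mastery learning.
tutoring_pct = percentile_of_shifted_mean(2.0)
mastery_pct = percentile_of_shifted_mean(1.0)

print(f"tutoring: above {tutoring_pct:.0f}% of the control class")
print(f"mastery learning: above {mastery_pct:.0f}% of the control class")
```

Under that assumption the figures round to the 98% and 84% that Bloom reports.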
The reader who sent me the link to this paper asked whether it invalidates the Null Hypothesis. The researchers ran experiments rather than observational studies, so I will give them that. But
1. Usually, educational interventions have small effects. The effect of “mastery learning” of one standard deviation seems rather implausibly high, considering its definition.
Formative tests (the same tests used with the conventional group) are given for feedback followed by corrective procedures and parallel formative tests to determine the extent to which the students have mastered the subject matter.
That does not sound like something that would cause a one-standard-deviation difference. My guess is that these findings would not replicate if the experiments were undertaken by different researchers.
2. Even when interventions show large effects for a single subject in a single year, the effects tend to fade out. That is, if you examine the experimental group and the control group three years later, any difference has vanished. Even if these results replicate in the short term, they do not invalidate the Null Hypothesis if they suffer from fade-out.
3. We do not know what would be necessary to enable tutoring to scale. Bloom seems to believe that tutoring works by adapting to the needs of the student. If so, then my guess is that the process of matching tutoring style to student characteristics would be quite a challenge.