Jose Luis Ricon (Ricon 2019)


Wordcount: 13459 | Reading time: 72 min

I suppose I should have expected that with a title that includes “systematic review”. Nonetheless I’d have preferred a more explicit conclusion. The gist, as I see it based on reading (read: skimming) the piece:

  • The benefits of Direct Instruction are most relevant for low-socioeconomic-status learners and the educationally disadvantaged
  • Mastery learning’s benefits are likely derived from its implicit use of the testing effect
  • Spaced repetition is indeed very effective
  • A key question of education is “What should we learn?” and not only “How can we learn more effectively/faster/better/etc.?”. As an example: “What’s the effect size of removing history from the curriculum and leaving more time for the kids to play?” I wouldn’t want to remove history, but that idea, “What are the opportunity costs of our curriculum?”, is valuable.


Why would mastery learning work? Some proposed experiments

Suppose that something by the name “The Incredible Learning Method”, or TILM, has been shown to increase performance by d=0.5. One then wonders what TILM is. TILM consists of doing spaced repetition and doing pushups every day.

Should we then recommend that kids do pushups if they want to learn? Or should we find what, ultimately, causes TILM to work?

In this approach, we break down a method into its components and experiment with them separately or in combination, seeing what effects we find. Then we could try to study how each of the subcomponents behaves in other samples, or try to find neuroscience-level explanations for them.

For spaced repetition, the answer here would be the underlying idea that there is a forgetting curve (a), which in turn might be explained by some feature of how the brain works at a low level.
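The forgetting-curve idea can be put as a toy scheduler. This is a minimal sketch: the exponential decay form, the 90% recall target, and the doubling of memory stability after each successful review are all illustrative assumptions, not parameters from any study cited here.

```python
import math

def recall_probability(t_days, stability):
    """Assumed exponential forgetting curve: recall decays with time since review."""
    return math.exp(-t_days / stability)

def next_interval(stability, target_recall=0.9):
    """Days until predicted recall falls to the target level."""
    return stability * math.log(1 / target_recall)

# Illustrative schedule: each successful review multiplies stability (assumed x2),
# so review intervals grow and reviews become increasingly spaced out.
stability = 1.0
day = 0.0
schedule = []
for _ in range(5):
    day += next_interval(stability)
    schedule.append(round(day, 1))
    stability *= 2

print(schedule)  # cumulative review days, growing roughly geometrically
```

The point of the sketch is just that a decaying recall curve naturally yields expanding review intervals, which is what spaced-repetition software exploits.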

In fact, could it be that mastery learning is spaced repetition in disguise?

Or, it could also be that the thing doing the trick is just more time: if we assume that the more one is exposed to a concept, the more one learns it (with perhaps some plateau at the end), then the fact that a student keeps studying the same material over and over would increase performance. If so, mastery learning wouldn’t have an effect if one controls for time invested in instruction. As Slavin puts it:

In an extreme form, the central contentions of mastery learning theory are almost tautologically true. If we establish a reasonable set of learning objectives and demand that every student achieve them at a high level regardless of how long that takes, then it is virtually certain that all students will ultimately achieve that criterion.
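This time-on-task account can be sketched as a toy model, assuming a saturating learning curve; the rate constant and hour counts below are made up for illustration.

```python
import math

def performance(hours, rate=0.3):
    """Assumed saturating learning curve: more exposure, more learning, with a plateau."""
    return 1 - math.exp(-rate * hours)

# Mastery learning loops until a criterion is met, so its students accumulate
# more instructional hours (numbers are illustrative).
mastery_hours, conventional_hours = 12, 8
gap = performance(mastery_hours) - performance(conventional_hours)
print(round(gap, 3))  # an apparent "mastery" advantage...

# ...which this model predicts would vanish entirely once time is equalized,
# since performance here depends on hours alone.
```

Under this model the entire mastery-learning effect is a time effect, which is exactly what controlling for time invested in instruction would reveal.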

Or, it could also be that it is just the testing that is doing the work. Recall that mastery learning is a study-test-feedback-corrective loop. Slavin found that the study-test-feedback loop works almost as well. But study-test alone also works! This is known as the testing effect (a).

Or, it could also be that mastering lesson N does have an impact on how well the knowledge in lessons N+i is absorbed. If so, a way to study this would be to teach lesson 1 to one group with mastery learning and to another with regular methods, then teach lesson 2 to both groups with regular methods. If mastery learning works, the group that was originally taught with mastery learning should get better results on lesson 2. This is, it seems to me, the only genuine “mastery” effect.
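The two-group design above can be simulated as a sanity check on what a genuine carry-over effect would look like in the data. Everything here is hypothetical: the baseline score, the noise level, and the +6-point carry-over from mastering lesson 1 are assumed parameters, not estimates from any study.

```python
import random

random.seed(0)

def lesson2_score(carryover, noise_sd=5.0):
    """Assumed model: lesson-2 score = baseline + carry-over from lesson 1 + noise."""
    return 70 + carryover + random.gauss(0, noise_sd)

# Hypothetical genuine mastery effect: the mastery group absorbed lesson 1
# better, which helps on lesson 2 even though both groups are now taught
# with regular methods (carryover=6 is an assumption for illustration).
mastery_group = [lesson2_score(carryover=6) for _ in range(100)]
control_group = [lesson2_score(carryover=0) for _ in range(100)]

def mean(xs):
    return sum(xs) / len(xs)

print(round(mean(mastery_group) - mean(control_group), 1))
```

If the carry-over were zero, the two groups would differ only by noise on lesson 2, which is the null result that would rule out a genuine “mastery” effect.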


Bloom noted that mastery learning had an effect size of around 1 (one sigma), while tutoring led to d=2. The latter is mostly an outlier case.
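For reference, the d figures throughout are Cohen’s d: the difference in group means in units of the pooled standard deviation. A minimal sketch with made-up scores (any resemblance of these numbers to real test data is coincidental):

```python
import statistics

def cohens_d(treatment, control):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = statistics.variance(treatment), statistics.variance(control)
    pooled_sd = (((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd

# Toy scores: a d near 2 means the average treated student scores about two
# standard deviations above the average control student.
control = [60, 65, 70, 75, 80]
tutored = [c + 15 for c in control]  # a 15-point shift, ~1.9 pooled SDs here
print(round(cohens_d(tutored, control), 2))  # → 1.9
```

So Bloom’s d=2 claim for tutoring amounts to saying the average tutored student outperforms roughly 98% of conventionally taught students.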

Nonetheless, Bloom was on to something: tutoring and mastery learning do have a degree of experimental support, and fortunately it seems that carefully designed software systems can completely replace the instructional side of traditional teaching, achieving better results, on par with one-to-one tutoring. However, designing them is a hard endeavour, and there is a motivational component that teachers provide which may not be as easily replicated purely by software.

Overall, it’s good news that the effects are present for younger and older students, and across subjects, but the effect sizes of tutoring, mastery learning, or DI are not as good as they would seem from Bloom’s paper. That said, it is true that tutoring does have large effect sizes, and that properly designed software does as well. The DARPA case study shows what is possible with software tutoring; in that case the effect sizes went beyond even those in Bloom’s paper.

[…] Mastery learning, it seems, works by overfitting to a test, and the chances that those skills do not generalise are nontrivial.

Coda: What should you learn?

Above I have been discussing learning methods. But what about learning content? What should one learn? Bryan Caplan, in his book The Case Against Education, argues that skills are not very transferable: you get good at what you do, and you quickly forget everything else. This informed my priors when looking at the mastery learning literature, and so I was not that surprised to find the issue with the kinds of tests I outlined in my section on Slavin.

Caplan is right, and this opens an avenue to improve teaching: suppose you focus on the very basics that will most likely be useful to the students in the future (reading, writing, mathematics). With the time freed up from not teaching other subjects, one can both help disadvantaged students get good at those core topics, and let non-disadvantaged students use the extra time to learn about whatever they want. I would even suggest letting them go home. It may be argued that one of the implicit roles of school is to be a place to keep the kids while the parents are working, a kindergarten for older children; but if parents are okay with letting their kids roam around, and cities are safe, this is not an issue. Definitely worth exploring.

Granting that one can use DI to hammer down very basic points, should one also hammer down History and English Literature? Just because you can make that learning happen doesn’t mean that you should: the effectiveness of an intervention also depends on the values used to judge it. What’s the effect size of removing History from the curriculum and leaving more time for the kids to play?