Little Apps for teaching stats

Interactive apps are a powerful way to engage students with statistical concepts. Several textbooks have apps associated with them: for instance, the Agresti/Franklin/Klingenberg book, Statistics: The Art and Science of Learning from Data; the Tintle et al. book, Introduction to Statistical Investigations; and the Lock^5 book, Statistics: Unlocking the Power of Data. Even though these apps are associated with textbooks, many are free to use without the book. What more could an instructor ask?

As part of the StatPREP project, we have also been developing apps. The StatPREP philosophy is to put data front-and-center in statistics. Consequently, our apps are based almost entirely on displays of data, and each app uses a standard graphical modality for displaying data. We do add annotations (in particular, confidence intervals), and many of the apps display resampling in the same space as the original data.

Sometimes textbook apps are used to demonstrate a particular statistical concept, with the app being abandoned when moving on to other concepts. In our view, this creates a fragmented experience for both students and instructors. Most of our apps are designed to be used for several lessons, first for generating awareness of data, then for simple statistical summaries, and only then for more formal statistics.

For example, consider the app listed below about the two-sample t test. It's designed to serve a range of purposes as you progress through your course. For instance, it might be used for five different lessons:

Lesson 1: Showing patterns in data. There are many interesting stories in the data used in the app. The instructor notes give some examples, but the best way to find them is just by browsing different variables. This lesson could introduce using groupwise means to summarize the data. It could take the form of an in-class demonstration, and it could also be used for a take-home exercise along these lines: “Find a pair of response and explanatory variables that interest you. Write a paragraph explaining, in everyday terms, what the relationship between the variables is (if any) and how that relationship shows up in the graphic.”
Lesson 2: Sampling variation. Once students are used to the idea of using groupwise means to summarize a pattern, it’s time to show them that there is sampling variation. Do this by displaying sample after sample. You can find variables where the pattern is stable across samples, and variables where the means jump around so much that there is no reliable detail in the pattern. You can also show how things change if the sample size is made larger.
Lesson 3: R-squared. The app provides statistical annotations that can be used to explain the (quite simple) idea behind R-squared.
Lesson 4: p-values. Show how p-values depend on the strength of the relationship between variables (as indicated by R-squared) and the sample size. You may learn a surprising lesson yourself: that p-values jump around a lot from sample to sample. This might make you more skeptical about using 0.05 as a threshold.
Lesson 5: Where the p-value comes from. Just as you can change variables and sample size in the graphics, you can do the same while displaying the F statistic, and see that it is just another way of summarizing R-squared.
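
For instructors who want to see the arithmetic behind Lessons 2 through 5 outside the app, here is a minimal Python sketch. It is not the app's own code. The simulated two-group population and the names r_squared_and_f and draw_sample are assumptions made purely for illustration. It computes R-squared from groupwise means, converts it to F using the degrees of freedom, and repeats this over several samples so the sampling variation is visible.

```python
import random
from statistics import mean


def r_squared_and_f(groups):
    """Return (R-squared, F) for a response split into groups."""
    values = [y for g in groups for y in g]
    n, k = len(values), len(groups)
    grand = mean(values)
    # Total variation of the response around the grand mean.
    ss_total = sum((y - grand) ** 2 for y in values)
    # Variation left over once each case is modeled by its group mean.
    ss_resid = sum((y - mean(g)) ** 2 for g in groups for y in g)
    r2 = 1 - ss_resid / ss_total
    # F is R-squared rescaled by the degrees of freedom.
    f = (r2 / (k - 1)) / ((1 - r2) / (n - k))
    return r2, f


def draw_sample(rng, n_per_group, shift):
    """Hypothetical population: two groups whose means differ by `shift`."""
    a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    b = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
    return [a, b]


rng = random.Random(1)
for trial in range(5):
    r2, f = r_squared_and_f(draw_sample(rng, n_per_group=20, shift=0.5))
    print(f"sample {trial + 1}: R-squared = {r2:.3f}, F = {f:.2f}")
```

With two groups, this F equals the square of the pooled two-sample t statistic, so the same numbers underlie the t test. Changing n_per_group or shift and rerunning shows how both R-squared and F move from sample to sample.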

You can try them out by following these links. (We recommend that you open the links one at a time in a new browser tab. There’s no point in opening them all at once.)

(Note for those using the original alpha release: we are gradually rewriting the apps to include better documentation and to display better on small screens such as mobile phones.)

Principles

The design of these apps follows principles consistent with the goals of StatPREP.

There’s always real data behind the apps.
The displays are always genuine modes of displaying data.
Most of the apps introduce the possibility of exploring the data and finding out something about the world.
We try hard to avoid purely theoretical constructions. For instance, we use simulation to generate samples and resamples.
We de-emphasize p-values in accordance with the recommendations of the American Statistical Association.
Following the GAISE report, we emphasize multivariable thinking. There’s almost always a response variable and an explanatory variable. (As we introduce new apps about modeling, there will be covariates as well.)
Statistics are always displayed in the context of the whole range of data. For instance, if there’s a mean and its confidence interval, that’s plotted on top of the case-by-case data so that students see clearly the variation in the data.
There’s usually a way to see how results can depend on sample size.
The apps are unadorned. An instructor who wants to embed an app in a lesson worksheet or an interactive tutorial can do so.
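
To make the resampling and confidence-interval ideas in these principles concrete, the following is a minimal Python sketch of a percentile bootstrap interval for a mean. It is an assumption-laden illustration, not a description of how the apps compute their intervals. The function name bootstrap_ci, its parameters, and the small data set are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(sample, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample draws n cases, with replacement, from the original sample.
    means = sorted(mean(rng.choices(sample, k=n)) for _ in range(n_resamples))
    tail = (1 - level) / 2
    lo = means[round(tail * n_resamples)]
    hi = means[round((1 - tail) * n_resamples) - 1]
    return lo, hi


# Hypothetical data standing in for one variable from an app's data set.
data = [12.1, 9.8, 11.4, 10.2, 13.5, 8.9, 10.7, 11.9, 9.5, 12.8]
lo, hi = bootstrap_ci(data)
print(f"sample mean = {mean(data):.2f}, interval = ({lo:.2f}, {hi:.2f})")
```

Plotting lo, hi, and the sample mean on top of the individual data points would follow the principle of showing statistics in the context of the whole range of data.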