## Thursday, July 19, 2012

### Concrete Example of Confidence Intervals

Estimating the Mean Rank of a Deck of Cards by Sampling a Hand of 4.

In my statistics class, near the end of my unit on confidence intervals (C.I.'s), my brief lecture notes say this:

Concrete Example of C.I.'s

- Consider: Deck of cards (ranks A=1, 2-10, J=11, Q=12, K=13).
- Estimate mean of all card values; obtain 80% C.I. for μ & interpret.
- Sample size n=4; stdev σ=3.74; population uniform.
- Table: Group (A-E), sample, x', E, C.I. (later: contain μ Y/N)
- Interpret: There is an 80% chance that the mean of all card values is in any given interval.
- We expect roughly 0.8×5 = 4 intervals to contain μ; check & lessons.
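
The figures in these notes can be checked with a few lines of Python. This is only a sketch: it assumes the interval is the usual z-interval with known σ (which the notes' use of σ suggests), and the variable names are mine rather than anything from the lecture.

```python
from statistics import NormalDist, mean, pstdev

# Ranks in a standard 52-card deck: A=1, 2-10, J=11, Q=12, K=13, four suits each.
deck = [rank for rank in range(1, 14) for _ in range(4)]

mu = mean(deck)        # population mean of the ranks
sigma = pstdev(deck)   # population standard deviation, sqrt(14), about 3.74

n = 4                  # cards per sample
confidence = 0.80
# Two-sided critical value: 10% in each tail, so the 90th percentile of Z.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error for a z-interval with known sigma (assumed procedure).
E = z * sigma / n ** 0.5

print(f"mu = {mu}, sigma = {sigma:.2f}, z = {z:.3f}, E = {E:.2f}")
```

Since σ and n are fixed in advance, E comes out the same for every group; only the sample mean moves the interval around.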

Hopefully you can see what I'm talking about -- time permitting (it takes about 50 minutes if I'm not rushing), I find this to be an excellent and highly memorable way to reinforce the conceptual lessons of confidence intervals one final time. Frequently I get some students literally gasping in surprise that the population mean, μ, of the deck of cards is actually contained in most of the C.I.'s (as incredible as it may sound -- given that's the whole point of the procedure).

Just for a bit more detail: I'm taking a standard deck of 52 cards and declaring that we'll estimate the average rank value by a random sample of just 4 cards drawn from the deck (as above, we declare A=1, J=11, Q=12, K=13 for this purpose). I split the class up into 5 groups, somewhat as though they're five separate research facilities, each of which will get its own sample and be asked to perform the C.I. estimation procedure (quicker students may opt to calculate all 5 C.I.'s). I may also talk about the fact that I'm actually breaking the rules for the procedure -- the population is non-normal and the sample size is small; but the procedure is robust enough that with a uniform population it tends to work out anyway. Here's an example from when I did it this past week (each of the 5 rows is a separate group's sample of 4):

As you can see, I've opted to do this at the 80% confidence level, so that we can explicitly compare our results to the number of intervals which we would expect to contain μ in this case (that being revealed right at the end as a separate check). For my students, understanding exactly what probability statements are telling you is one of the most challenging parts of the course, so I am compelled to emphasize it as often as I can. (When I ask a question like the one at the bottom of the picture above, the first responses are almost always "none of them" or "all of them". No joke!)
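
To put numbers on the "none of them" and "all of them" guesses, here is a small sketch of how many of the 5 intervals we should expect to contain μ. It assumes the five samples are independent (say, each group draws from a freshly shuffled full deck) and that each interval has exactly an 80% chance of containing μ; the function name `count_distribution` is just for illustration.

```python
from math import comb

def count_distribution(groups=5, coverage=0.80):
    """P(exactly k of `groups` intervals contain mu), for k = 0..groups.

    Assumes independent samples and a true coverage equal to `coverage`.
    """
    return [comb(groups, k) * coverage**k * (1 - coverage)**(groups - k)
            for k in range(groups + 1)]

probs = count_distribution()
for k, p in enumerate(probs):
    print(f"{k} of 5 intervals contain mu: {p:.4f}")
```

Under those assumptions "none of them" is essentially impossible (about 0.03%), "all of them" happens roughly a third of the time, and exactly 4 is the single most likely result at about 41%.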

So as much as I'm always fighting time-management issues in my classes, this is an added demonstration that I've found to be invaluable at the end of this unit, basically the "crown jewel" and the hardest part of my community-college statistics class. Here are some of the many lessons that we can draw out from this one case study (esp. granted that we're doing it multiple times):
1. Most intervals contain μ, but not all.
2. The margin of error E is fixed by the chosen sampling process.
3. The population mean μ is a fixed number (but generally unknown).
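
These lessons can also be watched in a simulation. The sketch below is not the class data; it is a made-up run under my own assumptions: each group draws 4 cards without replacement from a freshly shuffled full deck, and the interval is the z-interval with known σ. The names (`sample_interval`, `coverage`) and the seed are arbitrary.

```python
import random
from statistics import NormalDist, mean, pstdev

DECK = [rank for rank in range(1, 14) for _ in range(4)]  # A=1 ... K=13, four suits
MU = mean(DECK)                      # fixed population mean (lesson 3)
SIGMA = pstdev(DECK)                 # fixed population standard deviation
N = 4                                # cards per sample
Z = NormalDist().inv_cdf(0.90)       # critical value for 80% confidence
E = Z * SIGMA / N ** 0.5             # same margin of error for every sample (lesson 2)

def sample_interval(rng):
    """Draw one hand of N cards without replacement; return (hand, low, high)."""
    hand = rng.sample(DECK, N)
    x_bar = mean(hand)
    return hand, x_bar - E, x_bar + E

def coverage(trials, rng):
    """Fraction of simulated intervals that contain MU."""
    hits = 0
    for _ in range(trials):
        _, low, high = sample_interval(rng)
        hits += low <= MU <= high
    return hits / trials

rng = random.Random(2012)  # arbitrary seed, only for reproducibility
for group in "ABCDE":
    hand, low, high = sample_interval(rng)
    verdict = "Y" if low <= MU <= high else "N"
    print(f"{group}: {hand}  C.I. = ({low:.2f}, {high:.2f})  contains mu? {verdict}")

rate = coverage(20_000, rng)
print(f"long-run coverage: {rate:.3f}")
```

Every interval printed has the same width, 2E, and only its center changes from hand to hand. Over many repetitions the fraction containing μ should land close to 0.80 (lesson 1), though not necessarily exactly on it, since the population is not normal and the sample is tiny.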