2010-07-29

Wow, That's Freaking Cheap

Seen on the ride into school.

Similarly, don't forget about ye olde "Verizon Math Fail" recording from a couple years back. Glad that wasn't me.

2010-06-08

This Is The Dumbest Goddamn Thing You Can Say About Statistics

"A large population size must require a larger sample size."
This -- or any iteration thereof -- is the dumbest goddamn thing you can say about statistics. While it's a clear demonstration that someone's missed the whole point of inferential statistics, it's also one of the most common things you'll hear about them. (Often in the form of "That sample is only a small proportion of the population.") Here are some of the varieties of this statement that I've encountered over time:
How do they project statistics like that? I'm trying to imagine what kind of sample size you'd need to represent, well, everything in the universe. [In regard to matter/anti-matter ratio in the universe as researched at Fermilab; comment posted at Slashdot]
Adobe claims that its Flash platform reaches '99% of internet viewers,' but a closer look at those statistics suggests it's not exactly all-encompassing... the number of Flash users is based on a questionable internet survey of just 4,600 people — around 0.0005% of the suggested 956,000,000 total. [News summary at Slashdot]
That poll doesn't convince me of 4e's success or lack thereof. Also, there's only 904 total votes while ENWorld has over 74,000 members, so that's only a small fraction of forum members (addmittedly many of those 74,000 are probably inactive). [In regard to the popularity of the D&D game's 4th Edition; comment posted at ENWorld]
You get the idea. To save some writing time here, I'll use n to indicate the sample size and N to indicate the population size. For any statistical inference, if n=50 is an acceptable sample for N=1,000, then it's also acceptable for N=10,000, N=1 billion, or N=infinity. In particular, one thing that never really matters is the ratio of sample to population.

Brief illustration: Let's say that you're using a sample mean to estimate a population mean (much like in a scientific opinion survey, etc.). As long as you have a sample size of at least 30 or so, you automatically know what the shape of all possible sample mean results is: a normal curve, as per the (mathematically proven) Central Limit Theorem. And then you can use that curve (via some integral calculus, or a resulting table or spreadsheet formula) to calculate the probability that your observed sample mean is any given distance from the population mean. Does the size of the population have any bearing on this sampling distribution shape? No. Does the CLT make any reference to the size of the population? No, not whatsoever. You have a moderate-sized sample (30+), you know the shape of all possible sample means, you calculate your probability from that (or some equivalent process), done.
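If you'd rather see that than take it on faith, here's a rough simulation sketch. Everything specific in it is my own assumption for illustration: the exponential (skewed) population, the three population sizes, n = 50, 2,000 repeated samples, and the helper names. It draws samples without replacement and compares the observed spread of the sample means against the σ/√n prediction.

```python
import math
import random

def population_std(values):
    """Standard deviation of a full set of values (divide by N, not N-1)."""
    mu = math.fsum(values) / len(values)
    return math.sqrt(math.fsum((v - mu) ** 2 for v in values) / len(values))

def spread_of_sample_means(population, n, trials, rng):
    """Draw `trials` samples of size n (without replacement) and return
    the standard deviation of the resulting sample means."""
    means = [math.fsum(rng.sample(population, n)) / n for _ in range(trials)]
    return population_std(means)

rng = random.Random(2010)   # fixed seed so reruns give the same numbers
n, trials = 50, 2000
ratios = {}
for N in (1_000, 10_000, 1_000_000):
    # A deliberately skewed population (hypothetical choice for the demo).
    population = [rng.expovariate(1.0) for _ in range(N)]
    predicted = population_std(population) / math.sqrt(n)   # sigma / sqrt(n)
    observed = spread_of_sample_means(population, n, trials, rng)
    ratios[N] = observed / predicted
    print(f"N={N:>9,}  predicted={predicted:.4f}  "
          f"observed={observed:.4f}  ratio={ratios[N]:.3f}")
```

The ratio column should sit close to 1 on every row, whether the population is a thousand or a million; the only thing driving the spread is n.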

Exception: In calculating sampling distribution probabilities, you'll use something like the fact that its standard deviation is σ/√n. (Here the σ indicates the standard deviation of the whole population.) Now, if the population size happens to be exceptionally small (like, N≤20n), and you're sampling without replacement, then you can improve the estimate a bit by instead using the correction formula √((N-n)/(N-1)) * σ/√n. But why bother? (a) You're almost never in that situation, (b) it rarely makes that much difference, and (c) you're just making extra number-crunching work for yourself. So you're actually better off assuming that the population is really huge or even infinite (as is actually done), thereby saving yourself calculation effort by way of the simpler formula. For any N>20n, the difference is negligible anyway (which is to say: lim N→∞ √((N-n)/(N-1)) * σ/√n = σ/√n). Run some numerical examples (pick any σ you like) and you'll see how little difference it makes.
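Here's one way to run those numbers, as a minimal sketch. The function name and the particular values (σ = 10, n = 50, and the list of N's) are arbitrary picks of mine, not anything canonical.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard deviation of the sample mean.
    N=None means 'treat the population as infinite' (the simple formula);
    otherwise apply the finite-population correction factor."""
    simple = sigma / math.sqrt(n)
    if N is None:
        return simple
    return math.sqrt((N - n) / (N - 1)) * simple

sigma, n = 10.0, 50
simple = standard_error(sigma, n)
for N in (100, 1_000, 10_000, 1_000_000):
    corrected = standard_error(sigma, n, N)
    shrink = 100 * (1 - corrected / simple)
    print(f"N={N:>9,}  simple={simple:.4f}  corrected={corrected:.4f}  "
          f"shrink={shrink:.2f}%")
```

At N = 20n the correction shaves only about 2.5% off the simple formula, and it fades toward nothing as N grows. Since σ multiplies both versions, changing it doesn't alter the percentage at all.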

Even more absurd exception: One requirement that the Central Limit Theorem does have is that the population standard deviation must be nonzero, i.e., σ>0, which does rule out having a population size of just one. But, c'mon, if that were the case then what you're doing isn't really sampling or inferential statistics in the first place, now is it?

In summary: If anything, a larger population size makes the statistics easier, and the math is simplest when you assume an infinite population size in the first place. Other than that, population size has no bearing on the math behind your estimation or surveying procedure.

One final, really simple observation: If an opinion poll is performed at the standard 95% confidence level, then its margin of error can be approximated by: E = 1/√n. (Compare to the formula for standard deviation above; the σ disappears because, for a proportion, σ is at most 1/2, and the 95% multiplier of about 2 cancels it.) Does the population size N appear anywhere in this formula? Nope -- it's fundamentally irrelevant to the process.
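To put numbers on it, here's a small sketch comparing the shortcut against the usual worst-case proportion formula (p = 0.5, z = 1.96), using the two sample sizes from the quotes above. The function names are mine, and it assumes both polls were simple random samples, which is a generous assumption for a web-forum poll.

```python
import math

Z_95 = 1.96   # standard normal critical value for 95% confidence

def margin_of_error(n, p=0.5, z=Z_95):
    """Margin of error for a sample proportion; p=0.5 is the worst case."""
    return z * math.sqrt(p * (1 - p) / n)

def rule_of_thumb(n):
    """The shortcut E = 1/sqrt(n). Note: no N anywhere."""
    return 1 / math.sqrt(n)

for n in (904, 4_600):
    print(f"n={n:>5,}  worst-case={margin_of_error(n):.4f}  "
          f"shortcut={rule_of_thumb(n):.4f}")
```

The shortcut lands a hair above the worst-case figure (1 versus 0.98 in the numerator), which is why it's a safe back-of-the-envelope bound.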

(I've written about this before, but I wanted a version that was a bit more -- ahem -- direct, for posterity's sake.)

2010-06-03

Stuff that Shouldn't Work

Making practice or test exercises is harder than you might first think before becoming a teacher. If you ever make a problem up on the fly while lecturing, it's highly probable that you'll create something with hideous fractions, irrational or imaginary numbers, extraneous solutions, etc., that you didn't want, which winds up sidetracking you from the point you were trying to make.

Another pitfall is creating problems that are singularities, i.e., the correct answer can be produced by some completely incorrect process, one that won't work for any other problem of the same nature. For remedial math students, this is almost a nightmare scenario, since their capacity to correctly generalize from the specific to the abstract is already shaky and confused as it is.

As just one example, here's one of my favorites from the algebra workbook we use at my school (custom edition produced by other teachers in the same department):
If 1.05x = 22.05, then x = ?
Now, the correct process is to divide both sides by 1.05, and see that x = 22.05/1.05 = 21. But horribly, if a student mistakenly subtracts 1.05, then they also get the same answer! Say x = 22.05 - 1.05 = 21. Thus, this exercise allows a student to "submarine" a totally broken process (answers are multiple-choice in the book), giving them apparent confirmation that they're doing the right thing when they're absolutely not. (Note that this particular exercise was changed in a newer edition after I pointed it out.)
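For the curious, the coincidence can be pinned down: for ax = b, the wrong step (b - a) matches the right step (b/a) exactly when b = a²/(a - 1). Here's a quick sketch of that check with exact fractions; the helper name and the "nudged" right-hand side of 23.10 are my own, purely for illustration.

```python
from fractions import Fraction

def coincidence_rhs(a):
    """For the equation a*x = b, return the b at which the wrong step
    (x = b - a) happens to equal the right step (x = b / a).
    Solving b - a = b / a gives b = a**2 / (a - 1), valid for a != 1."""
    a = Fraction(a)
    return a * a / (a - 1)

a = Fraction("1.05")
b = coincidence_rhs(a)
print("right-hand side that causes trouble:", float(b))
print("divide:", b / a, " subtract:", b - a)

# Nudge the right-hand side and the two processes part ways.
b2 = Fraction("23.10")
print("divide:", float(b2 / a), " subtract:", float(b2 - a))
```

So for a coefficient of 1.05, the one right-hand side to avoid is precisely the one the workbook used.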

Enough prelude. The thing I'm trying to get around to is that last night I saw the "crown jewel" for this kind of problem, as part of a set of practice problems for the ACT Compass Test in Algebra. (You can actually see it here: "Sample Math Test Questions: Numerical Skills/Pre-Algebra and Algebra", Algebra item #14). Something like this:
For x ≠ 3, reduce (x² - 9)/(x-3).
Now, the point of this exercise is to practice factoring (in this case, the top is a "difference of squares") and then cancel like factors on the top & bottom. Write: (x² - 9)/(x-3) = (x+3)(x-3)/(x-3) = x+3.

But last night my students got all weirded out when I was writing that much (there's an additional wrinkle in the Compass problem, but it's not germane to my point) and said they got the right answer with a lot less work. They explained: "Divide x² on top by x on bottom and get x. Divide -9 on top by -3 on bottom and get +3. There's the answer, x+3."

Now obviously this is a horribly mutilated process (and not uncommon!), thinking that you can divide individual terms in a rational expression piecemeal. (My best explanation, not that it gets fantastic traction, is always "Division distributes across addition, so if you divide by x, you have to divide every term by x.") But the really crazy, unique thing about this problem is that the broken process actually works for every possible problem of this format!

Consider all possible ways of constructing a "difference of squares" on top, and one of its canceling factors on the bottom. Case 1: Say you're reducing (a²-b²)/(a+b). Correct process: (a²-b²)/(a+b) = (a+b)(a-b)/(a+b) = a-b. Incorrect process: (a²-b²)/(a+b) = a²/a - b²/b = a-b (same answer). Case 2: Say you're reducing (a²-b²)/(a-b). Correct process: (a²-b²)/(a-b) = (a+b)(a-b)/(a-b) = a+b. Incorrect process: (a²-b²)/(a-b) = a²/a - b²/(-b) = a+b (also the same answer).
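A brute-force sanity check of both cases, as a sketch: it plugs in a grid of whole-number values for a and b (the range 1 to 29 is an arbitrary choice of mine) and compares the piecemeal "answer" against the honest quotient, using exact fractions. The last few lines try the same trick on a binomial that isn't a difference of squares, (x² - 5x)/(x - 5), my own made-up counterexample.

```python
from fractions import Fraction

def piecemeal(a, b, sign):
    """The broken process on (a^2 - b^2)/(a + sign*b): divide leading
    term by leading term, trailing term by trailing term."""
    return Fraction(a * a, a) + Fraction(-b * b, sign * b)

def honest(a, b, sign):
    """The actual value of (a^2 - b^2)/(a + sign*b)."""
    return Fraction(a * a - b * b, a + sign * b)

agree = all(
    piecemeal(a, b, sign) == honest(a, b, sign)
    for a in range(1, 30)
    for b in range(1, 30)
    for sign in (+1, -1)
    if a + sign * b != 0          # skip the excluded value (zero denominator)
)
print("difference of squares, always agree?", agree)

# Same trick on (x^2 - 5x)/(x - 5), which honestly reduces to x:
x = 7
broken = Fraction(x * x, x) + Fraction(-5 * x, -5)   # gives x + x
actual = Fraction(x * x - 5 * x, x - 5)              # gives x
print("other binomial:", broken, "vs", actual)
```

The difference-of-squares format is special; step outside it and the piecemeal method falls apart immediately.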

So not only does the "broken" process work for all permutations of this kind of problem, it even manages to get all the signs correct regardless of how those have been set up. Arrrghh!!!

(Silver lining: At least two of my students had the courage to tell me that's what they'd done, and I had the presence of mind to listen to it last night. I've used this practice test for about 5 years without anyone pointing out how they were doing it like that.)

2010-05-12

Number 91

What's wrong with the number 91?

Consider this: With just three exceptions, all of the composite numbers up to 100 have a factor of either 2, 3, or 5. So, for all of those numbers, you can pretty much immediately see that they're composite, via your simple, standard divisibility-identifying tricks.

The three exceptions, things that have a "7" built into them as their smallest factor, are: 49 (7*7), 77 (7*11), and 91 (7*13). Of course, 49 and 77 are pretty obviously composite (knowing one's squares and what divisibility-by-11 looks like).
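That claim is easy to double-check by machine. A quick sketch (the helper name is mine) that lists every composite up to 100 whose smallest factor is bigger than 5:

```python
def smallest_factor(n):
    """Smallest factor of n greater than 1 (n itself when n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

hard_composites = [
    n for n in range(4, 101)
    if 5 < smallest_factor(n) < n   # composite, but no factor of 2, 3, or 5
]
print(hard_composites)   # [49, 77, 91]
```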

So 91 is the only composite number up to 100 that I can't immediately identify. I tend to incorrectly conclude that it's prime if I'm just working at it mentally.

2010-04-29

Counterexample to the CLT!?

The introductory statistics text that I teach out of presents the Central Limit Theorem this way:
The Central Limit Theorem (CLT): For a relatively large sample size, the variable x' is approximately normally distributed, regardless of the distribution of the variable under consideration. The approximation becomes better with increasing sample size. [Weiss, "Introductory Statistics" 7E, p. 341]
In other words, any distribution turns into a normal curve when you're sampling (with large sample sizes). I also know off the top of my head that the formal CLT is talking about a distribution that's been standardized (converted via z = (x'-μ)/(σ/√(n))), and that its limit as a function is the standard-normal curve (centered at 0, standard deviation 1).

So one day I'm walking around sort of half-dozy and I'm thinking, "Wait a minute! What about a constant function? If you had a distribution that was fixed with one element (say, 100% chance that x=5), any conceivable sample mean would just be the constant x'=5, and there's no way a graph of that looks like a normal curve, right?"

Meditating...

Well, the thing that I didn't immediately have in my head, and that is also entirely left out of the Weiss text: there is one single fine-print requirement to the CLT, and it's that the standard deviation of your variable must be nonzero (and also non-infinite), i.e., 0<σ<∞. Which is sort of obvious from the fact that you need σ to standardize with z = (x'-μ)/(σ/√(n)); it sits in the denominator, so it can't be zero. And of course a constant function has zero deviation, so it's indeed outside the scope of the theorem.
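Here's the daydream in code form, as a sketch only. The population of 1,000 fives, the sample size of 40, and the function name are all stand-ins I made up; the point is just that every sample mean collapses to one value and the standardizing step has nothing to divide by.

```python
import math
import random

def standardize(sample_mean, mu, sigma, n):
    """z = (x' - mu) / (sigma / sqrt(n)); undefined when sigma == 0."""
    if sigma == 0:
        raise ValueError("sigma must be nonzero to standardize")
    return (sample_mean - mu) / (sigma / math.sqrt(n))

rng = random.Random(5)
population = [5] * 1000          # 100% chance that x = 5
n = 40
sample_means = {sum(rng.choices(population, k=n)) / n for _ in range(500)}
print(sample_means)              # one value only: no curve to speak of

try:
    standardize(5.0, mu=5.0, sigma=0.0, n=n)
except ValueError as err:
    print("cannot standardize:", err)
```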

Guess I can't get too mad at the Weiss text for this... chances of it being useful for an introductory student to spend time parsing that are about nil. (Obviously, it hasn't come up in 5+ years of teaching the class.) Still, it might be nice to put it in a little footnote at the bottom of the page so I don't go daydreaming about possible counterexamples on my commute.

2010-04-27

Sets as Plastic Baggies

[Image caption: Is this the picture of an empty baggie?]

Many times I've seen the use of set-braces (roster notation) compared to an "envelope". Here's one example (speaking of empty sets and the null-set symbol):
To help clarify this concept, think of a set as an envelope. If the set is empty, then the envelope is empty. On the other hand, if the set is not empty -- that is, it contains at least one element -- then there are items in the envelope. One such item can be another envelope. Using this analogy, the symbol {Ø} specifies an empty envelope contained within another envelope. [Setek and Gallo, "Fundamentals of Mathematics" 10th Ed., p. 74]
Now, what I think is the jarring discordance in this analogy: You can immediately see the contents inside a { } symbol, but not so with envelopes (being opaque and all). That's probably why the whole metaphor always feels foul in my mouth, and might be part of the reason I get a poor reaction from students when I try to use it in class.

Let's try a better metaphor: A clear plastic baggie (with a zip, perhaps?). These you can, like the set { } symbol, instantly see into. If you put one plastic baggie inside another, then you can immediately see it sitting inside there... just like the frequently-misused {Ø} notation.
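For the programmers in the room, the same distinction shows up with Python's sets. This is just an illustrative sketch with variable names I picked; `frozenset` is used for the inner baggie because Python only allows hashable items inside a set.

```python
empty = frozenset()              # Ø : an empty baggie
baggie_in_baggie = {empty}       # {Ø} : a baggie holding one empty baggie

print(len(empty))                # how many items in the empty baggie
print(len(baggie_in_baggie))     # how many items in the outer baggie
print(empty == baggie_in_baggie) # are they the same set?
print(empty in baggie_in_baggie) # is the empty baggie sitting inside?
```

The outer baggie has one thing in it, so it is not empty, which is exactly the point students tend to trip over with {Ø}.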

So let's use a metaphor that shares in the transparency and clarity of our precise math notation.

2010-04-08

... In New Jersey

So I got a few inquiries about the prior post on how a Psychology Today article could make me yell out loud. Yes, I was sitting in my apartment alone and hollered out, "Oh NOO!!!" at this part:
In an article published in 2005, Patricia Clark Kenschaft, a professor of mathematics at Montclair State University, described her experiences of going into elementary schools and talking with teachers about math. In one visit to a K-6 elementary school in New Jersey she discovered that not a single teacher, out of the fifty that she met with, knew how to find the area of a rectangle.[2] They taught multiplication, but none of them knew that multiplication is used to find the area of a rectangle. Their most common guess was that you add the length and the width to get the area. Their excuse for not knowing was that they did not need to teach about areas of rectangles; that came later in the curriculum. But the fact that they couldn't figure out that multiplication is used to find the area was evidence to Kenschaft that they didn't really know what multiplication is or what it is for. She also found that although the teachers knew and taught the algorithm for multiplying one two-digit number by another, none of them could explain why that algorithm works.
Holy god, that is insane. I have a hard time imagining anyone being unable to find the area of a rectangle, never mind school staff actually teaching arithmetic, to say nothing of going 0-for-50 in a survey on the subject. I mean... unbelievable! Maybe I should take a poll of students in one of my own classes. Is it possible that people had just forgotten what the word "area" meant?

Just to complete the progression in the article:
The school that Kenschaft visited happened to be in a very poor district, with mostly African American kids, so at first she figured that the worst teachers must have been assigned to that school, and she theorized that this was why African Americans do even more poorly than white Americans on math tests. But then she went into some schools in wealthy districts, with mostly white kids, and found that the mathematics knowledge of teachers there was equally pathetic. She concluded that nobody could be learning much math in school and, "It appears that the higher scores of the affluent districts are not due to superior teaching but to the supplementary informal 'home schooling' of children."
On the larger thesis of the article, that current math instruction in K-6 is doing more damage than good, and could be, and has been, dropped successfully at least once... you know what? I can potentially believe that. It's possible. If the quality of math instruction is truly that atrocious, I wouldn't want children subjected to it -- of course the only consistent result would be crippling lifelong math anxiety (per Dijkstra's "as potential programmers they are mentally mutilated beyond hope of regeneration," and all that).

But far more important than my own best guesses: This would take a lot more research before going forward with a plan of that nature. And no way in hell would we get either (1) the research approved, or (2) the program implemented here in the USA (what with the READING AND MATH UBER ALLES!! approach to education that comes down from the political wing these days).