Monday, July 8, 2013

Why Z-Scores Have Mean 0, Standard Deviation 1

This article is aimed at introductory statistics students.

Statistics, as I often say, is a "space age" branch of math -- many of the key procedures like Student's t-distribution weren't developed until the 20th century (and thus helped launch the revolution in science, technology, and medicine). While statistics are really critical to understanding modern society, it's somewhat unfortunate that they're built on a very high edifice of prior math work -- in the introductory stats class we're constantly "stealing" some ideas from calculus, trigonometry, measure theory, etc., without being totally explicit about it (the students having neither the time nor the background to understand them).

One of the first areas where this pops up in my classes is the notion of z-scores: taking a data set and standardizing by means of z = (x-μ)/σ. The whole point of this, of course, is to convert the data set to a new one with mean zero and standard deviation (stdev) one -- but again, unfortunately, the majority of my students have neither the knowledge of linear transformations nor algebraic proofs to see why this is the case. Our textbook has a numerical example, but in the interest of time, my students just wind up taking this on faith (bolstered, I hope, by a single graphical check-in).
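For readers who want to check the claim numerically, here is a minimal sketch in Python. The data set and the helper name `z_scores` are made up for illustration (they are not from our textbook), and I'm assuming the population formulas for μ and σ. It standardizes a small data set and then computes the mean and stdev of the resulting z-scores:

```python
from statistics import mean, pstdev

def z_scores(data):
    """Standardize each value with z = (x - mu) / sigma (population formulas)."""
    mu = mean(data)       # population mean of the original data
    sigma = pstdev(data)  # population standard deviation of the original data
    return [(x - mu) / sigma for x in data]

# Hypothetical data set, chosen so that the mean is 5 and the stdev is 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(data)

print(z)          # the standardized values
print(mean(z))    # should come out as 0 (up to floating-point rounding)
print(pstdev(z))  # should come out as 1 (up to floating-point rounding)
```

Subtracting μ shifts the data so it is centered at zero, and dividing by σ rescales the spread to one. The same check should work with the sample standard deviation, as long as it is used both for computing the z-scores and for measuring their spread afterward.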

Well, for the first time in almost a decade of teaching this class at my current college, I had a student come into my office this week and express discomfort that he didn't fully understand why that was the case, and ask whether we'd really properly established that fact. Of course, I'd say this is the very best question that a student could ask at this juncture, and it really gets at the heart of the confirmation that should be central to any math class. (Interesting tidbit -- the student in question is a History major, not part of any STEM or medical/biology program required to take the class.)

So I hunted around online for a couple of minutes for an explanation, but I couldn't find anything really pitched at the expected level of my students (requirements: a fully worked-out numerical example, a graphical illustration that doesn't assume having heard of shifts/stretches before, an algebraic proof that doesn't assume knowing that summations distribute across terms, etc.). Instead, I took some time the next day and wrote up an article myself to send to the student, which you can see linked below. Hopefully this very careful and detailed treatment helps in some other cases when the question pops up again:


(Edited: Jan-9, 2015).

14 comments:

  1. Good explanation! Clear and succinct! :-)
    p.s. I love it when students like that take responsibility for *learning*, not just trying to get a good grade!

    Replies
    1. Thanks! I was pretty happy, and the student really enthusiastically thanked me for it the next time I saw him. As my favorite math professor said to me years ago -- You have to commit to tap into the joy of the 1-3 students or so per class who are really engaging with you and the subject material.

  2. Thank you so much. I am in the middle of revising my stats for a new uni course. I just couldn't fully understand why the SD was always 1 in the standard normal distribution. I've searched and searched for an answer all day. Wish I had found your article earlier. I now understand it. Thank you.

  3. I am so thankful for this explanation, it helped me a lot :) You seem to be a wonderful teacher!

    Replies
    1. You made my day, thanks for the comment!

  4. It's been quite long since I've tried to find the explanation for this query. Never did I find one before. Luckily I found your blog and it helped me a lot. Now I fully understand it. You're such an amazing teacher. I wish you were my stats teacher. I never understood 60% of my stats lessons when I was in college. Anyway, thanks a lot.

    Replies
    1. Thanks so much for saying that, glad it helped. You made my day!

  5. Superb! I have also long been trying to understand how the mean can be zero and the std deviation 1. Also, that graphical illustration made the concept of shifting curve areas very simple to understand.

  6. U r indeed a great teacher which is evident d way u explained d concept in very simple way. I m lucky to visit ur website n get ur explanation. Thanks so much & good luck! - Vijay Maurya (India)

  7. Great work. Thank you for the simplest explanation.

  8. Hi, could you reupload the PDF?
    Thank you!

    Replies
    1. Pretty sure the link is working currently; I just checked it.
