2013-07-08

Why Z-Scores Have Mean 0, Standard Deviation 1

This article is aimed at introductory statistics students.

Statistics, as I often say, is a "space age" branch of math -- many of the key procedures, like Student's t-distribution, weren't developed until the 20th century (and thus helped launch the revolution in science, technology, and medicine). While statistics is really critical to understanding modern society, it's somewhat unfortunate that it's built on a very high edifice of prior math work -- in the introductory stats class we're constantly "stealing" some ideas from calculus, trigonometry, measure theory, etc., without being explicit about it (the students having neither the time nor background to understand them).

One of the first areas where this pops up in my classes is the notion of z-scores: taking a data set and standardizing by means of z = (x − μ)/σ. The whole point of this, of course, is to convert the data set to a new one with mean zero and standard deviation (stdev) one -- but again, unfortunately, the majority of our students have neither the knowledge of linear transformations nor algebraic proofs to see why this is the case. Our textbook has a numerical example, but in the interest of time, my students just wind up taking this on faith (bolstered, I hope, by a single graphical check-in).
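For readers who like to check things numerically, here is a minimal sketch of that standardization in Python. The data values are made up purely for illustration (they are not the textbook's example), and I'm assuming the population standard deviation, matching the σ in the formula above.

```python
import statistics

# Hypothetical data set, chosen only so the arithmetic comes out cleanly.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu = statistics.mean(data)       # mean of the original data
sigma = statistics.pstdev(data)  # population standard deviation of the original data

# Standardize each value: subtract the mean (a shift), then divide by the stdev (a rescale).
z_scores = [(x - mu) / sigma for x in data]

# Recompute the mean and stdev on the new, standardized data set.
z_mean = statistics.mean(z_scores)
z_stdev = statistics.pstdev(z_scores)

print("original mean, stdev:", mu, sigma)
print("z-scores:", z_scores)
print("z-score mean, stdev:", z_mean, z_stdev)
```

Subtracting μ slides the whole data set so that it is centered at zero, and dividing by σ rescales the spread so that the stdev comes out as one. Swapping in any other data set (with at least two distinct values) should give the same final line, up to floating-point rounding.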

Well, for the first time in almost a decade of teaching this class at my current college, I had a student come into my office this week and express discomfort that he didn't fully understand why that was the case, and whether we'd really properly established that fact. Of course, I'd say this is the very best question that a student could ask at this juncture, and it really gets at the heart of confirmation and proof that should be central to any math class. (Interesting tidbit -- the student in question is a History major, not part of any STEM or medical/biology program required to take the class.)

So I hunted around online for a couple of minutes for an explanation, but I couldn't find anything really pitched at the expected level of my students (requirements: a fully worked out numerical example, graphical illustration without having heard of shifts/stretches before, algebraic proof without first knowing that summations distribute across terms, etc.). Instead, I took some time the next day and wrote up an article myself to send to the student, which you can see linked below. Hopefully this careful and detailed treatment helps in some other cases when the question pops up again:


(Edited: Jan-9, 2015).

24 comments:

  1. Good explanation! Clear and succinct! :-)
    p.s. I love it when students like that take responsibility for *learning*, not just trying to get a good grade!

    Replies
    1. Thanks! I was pretty happy, and the student really enthusiastically thanked me for it the next time I saw him. As my favorite math professor said to me years ago -- You have to commit to tap into the joy of the 1-3 students or so per class who are really engaging with you and the subject material.

  2. Thank you so much. I am in the middle of revising my stats for a new uni course. I just couldn't fully understand why the SD was always 1 in the standard normal distribution. I've searched and searched for an answer all day. Wish I had found your article earlier. I now understand it. Thank you.

  3. I am so thankful for this explanation, it helped me a lot :) You seem to be a wonderful teacher!

  4. It's been quite long since I've tried to find the explanation for this query. Never did I find one before. Luckily I found your blog and it helped me a lot. Now I fully understand it. You're such an amazing teacher. I wish you were my stats teacher. I never understood 60% of my stats lessons when I was in college. Anyway, thanks a lot.

    Replies
    1. Thanks so much for saying that, glad it helped. You made my day!

  5. Superb!...I have also been long wondering to understand how mean can be zero and std deviation 1. Also that graphical illustration was very simple to understand the concept of shifting curve areas.

  6. U r indeed a great teacher which is evident d way u explained d concept in very simple way. I m lucky to visit ur website n get ur explanation. Thanks so much & good luck! - Vijay Maurya (India)

  7. Great work. Thank you for the simplest explanation.

  8. Hi, could you reupload the PDF?
    Thank you!

    Replies
    1. Pretty sure the link is working currently; I just checked it.

  9. Good article on z-transformation.

  10. Thank you so much. This is my question that I can not find in my mother language.

  11. Thank you so much for taking the time to do this and post it. I'm taking an intro stats class now and was going crazy trying to get the answer to this. So satisfying to finally have it click.

  12. Thanks a lot!
    It is so well explained that all my doubts regarding 0 mean and 1 sd in the standard normal distribution have been cleared now.

  13. Hi. Thank you so much for putting in the effort to explain why mean is 0 and sd is 1. I'm learning intro to stats so this is super helpful.

  14. God bless you, holy moly did I have trouble understanding WHY the mean = 0 & sd = 1. My statistics textbook glided over that as a statement with no explanation & it was driving me absolutely nuts.

  15. I am so thankful to you because of your explanation of z score. That really cleared my confusion about it.

  16. I was searching for the question, "why the standard deviation of Z is 1", and luckily got your blog, very nicely explained, although explanation of the generic one for sd is not there but it has already been answered, thank you sir

  17. This should be 1st answer in google to the question. Thank you sir.
