Algebra in NY Times

This weekend, the New York Times published an opinion piece by Andrew Hacker, emeritus professor of political science at Queens College in CUNY (sister college to where I teach). The piece is titled "Is Algebra Necessary?" and seems to make the argument that an algebra requirement should be waived for most students in American high schools and college.

Obviously, on a nearly daily basis, my job deals directly with the pain and death-march scenario of a majority of students coming into our community colleges and being unable to pass the equivalent of a 7th-grade algebra course. I do think that for all of us -- even myself -- proficiency at math is almost always the limiting factor in our careers. At the same time, it's almost uncanny how many old friends I have getting in contact with me to say that they've suddenly found a higher level of math really key to their professional advancement -- including psychologists, photographers, and even artists (mentioned in Hacker's article as a group that should clearly be freed from a math requirement).

Let me riff on that last point for a bit. My girlfriend is a fine artist, with an MFA in sculpting from a school here where we live in New York City. Earlier this year, she was the recipient of a month-long artist residency in Taiwan where she put together an outdoor installation in knitted recycled plastic as part of an exhibit on environmental themes. She has a fairly high proficiency at math (in fact, for about a month she was a math major in college before switching), and this gets used routinely in her career. She has to estimate volumes of complicated shapes she's planning to put together, so as to procure materials (plastic, wire, plaster, etc.). She has to do calculations with money so as to set budgets and write grant proposals. She has to estimate time for projects that might last many months. At some point she generated a calculation for people, time, and material to cover the Eiffel Tower in tiny crocheted plastic leaves (a long-term goal).

And as I know from several friends working as graphic artists (such as from my time in the video game industry) -- almost all of the work today is done on computers anyway, so they need proficiency with numbers, computer science, (x,y) coordinate systems, algorithms, etc. in order to interface with the most basic professional workflows nowadays. On the side, my girlfriend also runs a home business coding HTML and hosting websites for other artists, and last week was learning for the first time to hack in a UNIX command prompt in order to apply software patches. Such is the life of an artist in the 21st century.

This post is sort of a brain-dump in the first 5 minutes after reading Prof. Hacker's opinion piece. One thing I would like to ask is: Having criticized a bunch of other academic disciplines for, in his view, their overly-high math requirements (math departments themselves, medicine, history, etc.), what would his prescription be for his own discipline, political science? What would the bare-minimum level of math proficiency be there -- decimal arithmetic, college algebra, perhaps statistics? (He writes, "I say this as a writer and social scientist whose work relies heavily on the use of numbers.") Is it possible that the requirements set by a discipline may be there for reasons not immediately obvious to an outsider? (As one example: I wouldn't have known that statistical confidence intervals and P-value statements are a core piece of any medical literature until my own mother, a school nurse, asked for help in reading a required article for continuing education credit.)

At first blush, Prof. Hacker's criticisms don't sound entirely coherent, but I could be biased. Personally, I think I'm most worried about changing the essence of what counts as a college education in exchange for the possibly spurious idea that everyone in our society is required to have a B.A. (and yet that may be an unwinnable fight at this point). But one last thought; near the end of his article he writes,

I WANT to end on a positive note. Mathematics, both pure and applied, is integral to our civilization, whether the realm is aesthetic or electronic... Instead of investing so much of our academic energy in a subject that blocks further attainment for much of our population, I propose that we start thinking about alternatives. Thus mathematics teachers at every level could create exciting courses in what I call “citizen statistics.” This would not be a backdoor version of algebra, as in the Advanced Placement syllabus. Nor would it focus on equations used by scholars when they write for one another.

Instead, it would familiarize students with the kinds of numbers that describe and delineate our personal and public lives. It could, for example, teach students how the Consumer Price Index is computed, what is included and how each item in the index is weighted — and include discussion about which items should be included and what weights they should be given. 

Okay -- just for argument's sake, let's say I go to the Wikipedia article on the Consumer Price Index and start reading up on that subject. What do I see there? Math, equations, written in the language of algebra:

So, what do those symbols mean? What is the proper order of applying those operations? (Noting that approximately half of my remedial algebra students end a semester unable to answer a final exam problem to apply the proper order-of-operations in several steps.) Any of these disciplines, and even a simple CPI calculation is a great example, presume that educated readers have the grammar of algebra (I think that's what it's most like, really) available to converse with. The only other option is to mount a crusade to expunge this writing from professional resources -- and in so doing, bring those disciplines to a grindingly slow level of inefficiency and lack of progress.


Concrete Example of Confidence Intervals

Estimating the Mean Rank of a Deck of Cards by Sampling a Hand of 4.

In my statistics class, near the end of my unit on confidence intervals (C.I.'s), my brief lecture notes say this:

Concrete Example of C.I.'s 
    Consider: Deck of cards (ranks A=1, 2-10, J=11, Q=12, K=13).
    Estimate mean of all card values; obtain 80% C.I. for μ & interpret.
        Sample size n=4; stdev σ=3.74; population uniform.
        Table: Group (A-E), sample, x', E, C.I. (later: contain μ Y/N)
    Interpret: There is an 80% chance that any given interval contains the mean of all card values.
        We expect roughly 0.8×5 = 4 intervals to contain μ; check & lessons.

Hopefully you can see what I'm talking about -- time-permitting (takes about 50 min if I'm not rushing), I find this to be an excellent and highly memorable way to reinforce the conceptual lessons of confidence intervals one final time. Frequently I get some students literally gasping in surprise that the population mean, μ, of the deck of cards is actually contained in most of the C.I.'s (as incredible as it may sound -- given that's the whole point of the procedure).

Just for a bit more detail: I'm taking a standard deck of 52 cards and declaring that we'll estimate the average rank value by a random sample of just 4 cards drawn from the deck (as above, we declare A=1, J=11, Q=12, K=13 for this purpose). I split the class up into 5 groups, somewhat as though they're five separate research facilities, each of whom will get their own sample and be asked to perform the C.I. estimation procedure (quicker students may opt to calculate all 5 C.I.'s). I may also talk about the fact that I'm actually breaking the rules for the procedure -- the population is non-normal and the sample size is small; but the procedure is robust enough that with a uniform population it tends to work out anyway. Here's an example from when I did it this last week (each of the 5 rows is a separate group's sample of 4):

As you can see, I've opted to do this at the 80% confidence level, so that we can explicitly compare our results to the number of intervals which we would expect to contain μ in this case (that being revealed right at the end as a separate check). For my classes, this is one of the most insanely challenging parts of the class, so I am compelled to emphasize as often as I can -- understanding exactly what probability statements are telling you. (When I ask something like at the bottom of the picture above, the first responses are, inevitably, almost always "none of them" or "all of them". No joke!)

So as much as I'm always fighting time-management issues in my classes, this is an added demonstration that I've found to be invaluably useful at the end of this unit, basically the "crown jewel" and the hardest part of my community-college statistics class. Here are some of the many lessons that we can draw out from this one case study (esp. granted that we're doing it multiple times):
  1. Most intervals contain μ, but not all.
  2. The margin of error E is fixed by the chosen sampling process.
  3. The population mean μ is a fixed number (but generally unknown).
  4. The probability statement is about the C.I., not about μ.
  5. The procedure is robust even when breaking the assumptions somewhat.
This is probably my favorite, and most incredibly useful, extra demonstration that I do in any of my classes. Highly recommended if you have a chance to try it out yourself.
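
For readers who want to try the demonstration without a physical deck, here is a minimal simulation sketch in Python. It assumes the z-interval with known σ described in the lecture notes above; the function names and the seed are my own illustrative choices, not part of the classroom procedure.

```python
import random
import statistics

# A standard deck, ranks only: A=1, 2-10, J=11, Q=12, K=13, four suits each.
RANKS = [rank for rank in range(1, 14) for _suit in range(4)]

MU = statistics.mean(RANKS)        # population mean of the ranks
SIGMA = statistics.pstdev(RANKS)   # population standard deviation, about 3.74
N = 4                              # cards per sample
Z_80 = statistics.NormalDist().inv_cdf(0.90)  # critical value for 80% confidence
E = Z_80 * SIGMA / N ** 0.5        # margin of error, fixed once n and sigma are fixed


def sample_interval(rng):
    """Draw a hand of N cards without replacement and return its 80% C.I."""
    hand = rng.sample(RANKS, N)
    x_bar = sum(hand) / N
    return (x_bar - E, x_bar + E)


def coverage(trials, seed=0):
    """Fraction of simulated intervals that contain the true mean MU."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        low, high = sample_interval(rng)
        if low <= MU <= high:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(1)
    for group in "ABCDE":  # five "research facilities", as in class
        low, high = sample_interval(rng)
        print(group, round(low, 2), round(high, 2), low <= MU <= high)
    print("long-run coverage:", coverage(20000))
```

Running five intervals mimics the five groups; the long-run coverage figure shows what "80% confidence" refers to, namely the behavior of the procedure over many samples rather than any one interval.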


Power Rules

A Method of Reducing Algebraic Power Rules to Just Two Major Principles

In a basic algebra class, we teach the "rules of exponents" and the "rules of radicals". Altogether, this usually appears in the form of 14 or so separate symbolic rules (and maybe more if you're careful to point out common simplifying errors to avoid).

I've taught this in a rather dramatically different way for several years, in a manner which collapses all the different rules to just two major principles. This is based on some observations I made which seem pretty trivial in retrospect, but I find them to be useful -- not a panacea, but they get some traction from students, and better convey the deep global applicability of math (not a big list of disjointed rules to memorize).

At this point, I tend not to even see what I'm doing as something unusual, but I showed it to a friend of mine the other week who has a PhD in Molecular Biology, and she exclaimed, "I was never taught it that way!", and seemed quite delighted. (To my knowledge, really, no one's ever taught it this way, since it's a method I developed -- for what that's worth.) I've mentioned it in passing before but I figured I would highlight it clearly here today.

Order of Operations (OOP)

First, I start the course by teaching a proper order-of-operations, and tell students that it's the single most important thing in the class -- the "engine that drives everything we do" -- the only thing that is arbitrarily made up, and from which everything else flows. It's not PEMDAS. They must memorize the following chart:
  1. Parentheses
  2. Exponents & Radicals
  3. Multiplication & Division
  4. Addition & Subtraction
Of course, the details are important: Parentheses means "operations inside parentheses" (and includes other grouping symbols like braces, brackets, fraction bars, radical vinculums, and absolute values). In any phase of calculation, we work left-to-right across the expression (just like we read), calculating any of the given operations as we encounter them. Note that after parentheses, each operation comes paired with its inverse (tied in order of operations). As we do exercises, I'm careful to verbally model the mental process: "Do we have any parentheses? No, so we don't need a written line for that. Do we have any exponents or radicals? Yes, so we'll need a written line for that..." Etc.

Principle #1: Operations on same-base powers shift one place down in OOP.

I now call this the "Fundamental Rule of Exponents".
When performing any algebraic operation on powers (like x^2), we have a simple mental shortcut, and that shortcut is found in the order of operations picture by shifting one place down. Some examples:

Ex. #1: Simplify (x^3)^2. Think: I have a power (x^3) and I'm exponentiating it (raising it to the power 2). What is my shortcut? Find "exponent" in the order-of-operations and shift one place down and you see our shortcut: multiply the powers. So (x^3)^2 = x^6.

Proof: (x^3)^2 = (x∙x∙x)(x∙x∙x) = x^6 [finding total x's multiplied]

Ex. #2: Simplify x^5/x^3. Think: I have same-base powers (both base x) and they are being divided. What is my shortcut? Find "divide" in the order-of-operations and shift one place down: we will subtract our powers. So x^5/x^3 = x^2.

Proof: x^5/x^3 = (x∙x∙x∙x∙x)/(x∙x∙x) = x^2 [cancelling x's top & bottom]

Stuff like that. You can do more initial examples based on student inquiry or interest; of course, it also works for multiplying and radicalizing same-base powers (shortcuts to add and divide, respectively). Maybe weaker students wind up having to memorize all four implied relationships anyway, but I think that's okay. It gives everyone a framework for truly understanding the relationships between operations when they need it.
(And there's even something else that I often add later in the course: What happens if we add or subtract powers? Look below "add" and what you see is -- nothing. I'll literally write "No Operation" on a 5th line at this point, and even mention the analogous CPU machine language command. So it's consistent that you'll never be changing powers in an add or subtract operation; simply combine like terms and transcribe the powers.)
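
As a rough illustration of Principle #1, the four implied relationships can be written as a lookup table and spot-checked numerically. This is only a sketch: the names SHORTCUT, apply_shortcut, direct, and check are hypothetical, and the check assumes a positive base so that radicals are well defined.

```python
import math

# Operation applied to same-base powers -> shortcut applied to the exponents,
# i.e. the matching operation one place down in the order-of-operations chart.
SHORTCUT = {
    "exponent": "multiply",   # (x^m)^n      -> x^(m*n)
    "radical": "divide",      # n-th root of x^m -> x^(m/n)
    "multiply": "add",        # x^m * x^n    -> x^(m+n)
    "divide": "subtract",     # x^m / x^n    -> x^(m-n)
}


def apply_shortcut(operation, m, n):
    """Return the resulting exponent using the one-place-down shortcut."""
    shortcut = SHORTCUT[operation]
    if shortcut == "multiply":
        return m * n
    if shortcut == "divide":
        return m / n
    if shortcut == "add":
        return m + n
    return m - n


def direct(operation, x, m, n):
    """Compute the same expression the long way, without the shortcut."""
    if operation == "exponent":
        return (x ** m) ** n
    if operation == "radical":
        return (x ** m) ** (1 / n)
    if operation == "multiply":
        return x ** m * x ** n
    return x ** m / x ** n


def check(operation, x, m, n):
    """True when the shortcut agrees with the direct calculation."""
    return math.isclose(direct(operation, x, m, n),
                        x ** apply_shortcut(operation, m, n))


if __name__ == "__main__":
    for operation in SHORTCUT:
        print(operation, "->", SHORTCUT[operation], check(operation, 2.0, 6, 3))
```

A numeric check like this is not a proof, but it gives students a quick way to convince themselves that a remembered shortcut is the right one.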

Principle #2: Operations distribute over any operation one line down in OOP.

I now call this the "General Distribution Rule".

So what I mean here is that (as I explain in the lecture) you've got parentheses, with one operation outside, and another operation inside. The observation is that we have a very nice, one-line shortcut to get rid of the parentheses by applying the outside operation to each piece inside -- so long as the inner operation is one of the items one line down in the order-of-operations.

Ex. #1: Simplify 7(x+5). Think: We have parentheses. The outside operation is multiplication. The inside operation is addition. Since the latter is one line down in OOP, we can distribute this: namely, "distribution of multiplication over addition". (Note that the "over" in the official name echoes and recalls the relationship in the OOP picture.) So 7(x+5) = 7x+35.

Check: Ask students for a specific value for x, substitute into both sides, and check to see if they are the same value. (Optional: Most students are already comfortable with this distribution, and don't need time spent on the check.)

Ex. #2: Simplify (a^2 b^3)^2. Think: Once more, we have parentheses. The outside operation is exponentiation. The inside operation is multiplication. Again, since the latter is one line further down in OOP, we can distribute this in a one-line shortcut -- "distribution of exponents over multiplication". So, recalling the first principle for applying exponents to powers: (a^2 b^3)^2 = a^4 b^6.

Proof: (a^2 b^3)^2 = (a∙a∙b∙b∙b)(a∙a∙b∙b∙b) = a^4 b^6 [total a's and b's multiplied]

Ex. #3: True or False: (x+4)^2 = x^2 + 16. Think: Look at the parentheses on the left. The outside operation is an exponent, while the inside operation is addition. This will not distribute in a one-line shortcut, since addition is two lines below exponents. Therefore the statement is False. (Note that this is one of the most common errors in basic algebra, and so it deserves special attention -- I'll write "exponents do not distribute over add/subtract" on the board.)

Check: Ask students for a specific value for x (not zero), substitute into both sides, and check to see if they are the same value.

Of course, this principle also works to recall any of: distributing multiplication over add/subtract, distributing division over add/subtract, distributing exponents over multiply/divide, and distributing radicals over multiply/divide. There's actually a total of a full dozen (12) relationships explained by this one single principle (including the fact that exponents/radicals do not distribute over add/subtract).
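
The "one line down" test is mechanical enough to sketch in a few lines of Python. The names OOP_LINE, distributes, and relationships below are hypothetical, chosen only for this illustration; the line numbers come from the order-of-operations chart above, and the test values for x, a, and b are arbitrary.

```python
# Line numbers in the order-of-operations chart from the post.
OOP_LINE = {
    "exponent": 2, "radical": 2,
    "multiply": 3, "divide": 3,
    "add": 4, "subtract": 4,
}


def distributes(outer, inner):
    """True when `inner` sits exactly one line below `outer` in the chart."""
    return OOP_LINE[inner] - OOP_LINE[outer] == 1


def relationships():
    """Every (outer, inner, distributes?) pair where inner is below outer."""
    return [
        (outer, inner, distributes(outer, inner))
        for outer in OOP_LINE
        for inner in OOP_LINE
        if OOP_LINE[inner] > OOP_LINE[outer]
    ]


# Numeric spot checks of the three worked examples.
x, a, b = 3, 2, 5
example_1 = 7 * (x + 5) == 7 * x + 35                # multiplication over addition
example_2 = (a ** 2 * b ** 3) ** 2 == a ** 4 * b ** 6  # exponents over multiplication
example_3 = (x + 4) ** 2 == x ** 2 + 16              # exponents over addition: fails

if __name__ == "__main__":
    for outer, inner, ok in relationships():
        verdict = "distributes over" if ok else "does NOT distribute over"
        print(outer, verdict, inner)
    print(example_1, example_2, example_3)
```

Enumerating the pairs reproduces the count of twelve: eight cases where the shortcut is valid and four (exponents and radicals over add/subtract) where it is not.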


It pays off to do as many simplifying examples as time permits afterward, but I do think that getting these two major principles on the board as soon as possible "primes the pump" for everything that happens later. It provides an overarching organizational structure, and it also serves as a model for abstract, generalized principles being easier to remember and apply (making us more happy) than a large body of otherwise disjointed rules. Which, from my perspective, may be the only thing that justifies everyone taking a basic algebra course in the first place.



On the Frequency and Probability of Amendments to the U.S. Constitution

Consider the following chart of amendments to the U.S. Constitution:

Amendments to the U.S. Constitution have been very "bursty" -- there tend to be many amendments in eras of tremendous social upheaval, and then longer spans without any amendments at all. We're currently in one of the long spans basically devoid of amendments being passed. To be specific:
  1. Ten amendments were passed together in 1791 as the Bill of Rights soon after the country was founded, and two more in 1795 and 1804 -- but then no more for a period of 60 years.
  2. Three amendments were passed in the era of the Civil War (in 1865, 1868, 1870) -- but then no more for the next 40 years.
  3. Four amendments were passed near the end of the Progressive Era (1913, 1913, 1919, 1920).
  4. Two were passed in the depths of the Great Depression (both in 1933).
  5. Five were passed around the decade of the Sixties (1951, 1961, 1964, 1967, 1971).
  6. Only one was passed in the last 40 years; and it's a clear outlier, in that this 27th Amendment was first proposed in 1789 and was pending ratification for over 200 years (finally enacted in 1992).

As a possibly related matter, consider how the chance of achieving ratification changes as the country grows (i.e., adds more states). Note that after a two-thirds vote by Congress, amendments require ratification by three-quarters of the States. As a simple model, we'll use the binomial distribution and assume a fixed probability of being ratified by any given state:

Caveat: States don't have just one chance to ratify an amendment; each one can try year after year until it succeeds (so it becomes more like an "or" operator, increasing the chance over time; see the 27th Amendment above, for example).

But what we see here is that, generally speaking, more states means that a greater level of consensus is needed to pass amendments. If the chance per state is 75% (three-fourths), then the chance to ratify is basically fixed at 50%, since the expectation itself is for three-fourths of the states to approve (with some jumpiness in the percents due to granularity of rounding the three-fourths to an integer number of states). If the chance per state is lower (like 60%; a weak majority), then the chance to ratify crashes precipitously with more states; but if the chance per state is higher (like 80%; a stronger majority), then the chance to ratify increases to near-certainty.
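
The simple binomial model can be sketched in Python as follows. This is my own reconstruction under the stated assumptions (independent states, one fixed per-state probability, three-quarters rounded up to a whole number of states); it is not the linked spreadsheet, and the function names are hypothetical.

```python
from math import comb


def states_needed(n_states):
    """Smallest whole number of states that is at least three-quarters of n_states."""
    return -(-3 * n_states // 4)  # ceiling of 3n/4 using integer arithmetic


def ratify_probability(n_states, p_state):
    """P(at least three-quarters of the states ratify) under a binomial model."""
    need = states_needed(n_states)
    return sum(
        comb(n_states, k) * p_state ** k * (1 - p_state) ** (n_states - k)
        for k in range(need, n_states + 1)
    )


if __name__ == "__main__":
    for p_state in (0.60, 0.75, 0.80):
        row = [round(ratify_probability(n, p_state), 3) for n in (13, 26, 50)]
        print(p_state, row)
```

Comparing the rows for different per-state probabilities shows the pattern described above: below three-quarters the chance falls as states are added, and above it the chance rises.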

Open Document spreadsheet with these calculations.


Prioritize Questions

On Cutting Content to Prioritize Student Questions

For the last few weeks in my summer courses, I've been frustrated that I'm constantly running over time by about 5 minutes (every night in a 3-hour session). Not great, particularly when the session ends with the culminating problem of the evening, and I really want students to have time to try one while I'm present to help them out.

A related issue is that I've been entertaining more questions at the start of class than maybe I'm accustomed to in the recent past (frequently the first 30 min out of the 3 hours). So between the two -- and for a long time I've thought that this is actually the fundamental task of a college instructor -- I sit down at the end of the night and try to CUT OUT something from my presentation for the next time. (At this point, over the years, it's been cut so much to the bare bones that it feels like the only thing left is amputating formerly key concepts or problems.)

But it has to be that way. Principle: Of everything I can do during class time, answering student questions is the most important and must be prioritized. If I'm fortunate enough to have a student who's done homework, and arrives interested and asking good questions (not all questions being good ones), then the best thing I can do is to highlight and answer those questions. That's the ultimate competitive advantage we have in teaching classes with a live, expert instructor present.


Backup Parachutes

I usually try to teach ways to double-check any calculation procedure (by means of initial estimation, making a graph/sketch, etc.). Frequently students resist this: they ask if they can skip the estimation step and just do the direct calculation alone. That's problematic, because it's the estimation which is perhaps the better test of actual comprehension of the overall concepts involved.

New analogy: The double-check is like a backup parachute. Most of the time the main parachute works, but no sane skydiver would take a leap out of a plane without the backup if they can avoid it. If they did, they'd be one tiny glitch away from total disaster.

Edit: I suppose an even better analogy would be the little "drogue parachute" that is deployed for slowdown & stability prior to the main parachute (granted that estimation should occur at the beginning and only partly suggest the full answer). But not everyone knows about that, so I went with "backup parachute" instead.


Teach Logic

Recently I've been coming to the opinion that we need to teach basic logic at a young age, as was done in classical education. Ultimately, it's the foundation for all of math and the scientific method. If the first time you study logic it is in college, then all of your education is really built on shifting sands.

A couple of related thoughts: I'm coming to this largely because of how many of my students in any class (even sophomore statistics) get helplessly tangled up over something as simple as an if/then statement. Or a subset relationship (e.g., normal curves are bell-shaped, but bell-shaped is not the same thing as normal). Or an "and" statement (the z-interval procedure requires a simple random sample, known population standard deviation, and a normal sampling distribution of the mean... the last of which can be established by either a normal population or a large sample size).
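
To make the logical structure of that last example explicit, here is a small Python sketch of the "and"/"or" combination. The function name and the large-sample cutoff of 30 are assumptions for illustration (a common rule of thumb, not something stated above).

```python
def z_interval_applicable(simple_random_sample, sigma_known,
                          normal_population, n, large_n=30):
    """Check the z-interval requirements as a compound logical statement.

    large_n is an assumed rule-of-thumb cutoff for a "large" sample.
    """
    # "or": either condition establishes a normal sampling distribution of the mean.
    normal_sampling_distribution = normal_population or n >= large_n
    # "and": all three requirements must hold at once.
    return simple_random_sample and sigma_known and normal_sampling_distribution


if __name__ == "__main__":
    print(z_interval_applicable(True, True, False, 50))   # large sample suffices
    print(z_interval_applicable(True, True, False, 10))   # neither "or" branch holds
    print(z_interval_applicable(True, False, True, 10))   # one "and" requirement fails
```

The point is only that the whole requirement is one "and" of three parts, where the third part is itself an "or" of two.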

I'm reminded of my first programming book in the 6th grade which introduced "and" and "or" operators and just said, "the meaning of these should be obvious", with an example of each. It may not be a priori obvious to everyone, but it really shouldn't take very long, and could pay off enormous benefits later.

Coincidentally, I just came across a delightful blog post by John Barnes on the same subject titled, "The Hobo Queen of the Sciences". Here are a few terrific highlight quotes:
And then I got Ms. Pounding Shouter... She thumped the podium, she pointed at people and accused them of not understanding her, she ordered them to believe what she told them to... "I was totally logical. I pointed things out real loud and told people they were dumb if they didn't believe it, and I yelled so they'd get the point."
And also:
Last and far from least, in a related course where I used to teach listening for logic as a way of improving listening comprehension and retention, one student asked me at the end of the class, "Why wasn't I taught this in fourth grade?"

Of course, to his credit John goes on to explain the vested interests that don't want fourth-graders -- or jury members -- knowing the basics of logic and reasoning.

Read it here.