## Thursday, October 4, 2012

### Bungled Election Probability

Here's a common malformed math problem that really irks me: the idea that in a voting situation, the ratio of voters indicates the probability of one party winning. For example, a problem might say that out of a group of 50 people, 30 favor candidate A and 20 favor candidate B, so candidate A has a 60% chance of winning an election. Obviously, this can only be the case if the election is decided by a single voter chosen at random, which is not remotely how any election actually works.

I've seen this pop up in one or more publisher-provided test banks that I use. And yes, it currently also appears in the Udacity Statistics 101 course (Unit 24.4, et al.; hopefully fixed soon?). For god's sake, people, please don't do that.

1. For the record, what's the correct solution? If I were handed this problem, I would probably go back to election records and see how often an X% lead, Y days before the election, led to candidate Z winning. Then you could simply match up the cases, and there's probably a best-fit line in the historical data somewhere. Do you have a solution based on a rule rather than on historical data?

Also, Nate Silver (fivethirtyeight.blogs.nytimes.com) does this sort of thing a lot, and he runs a good blog about turning polls into election probabilities.

1. As you might guess, the order-zero response would be "Not enough information". The order-one response would be "Lots more likely than the plurality amount" (like, 60% favorability would be pretty much an ironclad election win). More sophisticated analyses are extremely complicated and subject to debate.
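To put a rough number on that order-one response, here is a minimal sketch under a deliberately simplistic assumption that is not part of the original problem: every voter independently votes for candidate A with probability 0.6, and A wins with a strict majority. The function name `win_probability` and the electorate sizes are made up for illustration, and real elections violate the independence assumption badly, so treat this as a toy model only.

```python
from math import comb

def win_probability(n_voters, p_support):
    """Chance that A gets a strict majority, assuming (hypothetically) that
    each of n_voters independently votes for A with probability p_support."""
    needed = n_voters // 2 + 1  # smallest number of votes that is a strict majority
    return sum(
        comb(n_voters, k) * p_support**k * (1 - p_support)**(n_voters - k)
        for k in range(needed, n_voters + 1)
    )

# Odd electorate sizes avoid ties. Kept small so the int-to-float
# conversion of the binomial coefficients does not overflow.
for n in (1, 51, 501):
    print(n, round(win_probability(n, 0.6), 4))
```

With a single voter the result is just 0.6, which is the "one voter chosen at random" case from the post. As the electorate grows, the win probability in this toy model climbs well past 60% and approaches certainty.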

Yahoo has a Wharton PhD economist writing a regular column on the issue: http://news.yahoo.com/signal/

Or consider how networks call elections: http://today.ucla.edu/portal/ut/forum-discusses-how-networks-call-238556.aspx

2. Follow-up: According to that latter article, the threshold that the networks use on election night is around a 5-6 point advantage (like 47% to 53%). Above that, they call the state as having been won. Below that, they wait for more data.