The Sample Truth: How Online Opinion Polls Are Different From Traditional Polls, and Sometimes Better
bob@cringely.com
This column has an international readership, but I live in the United States, and from time to time, I've come across a story that has impact on only this one, rather large, country. That's the case this week when we consider the curious role of opinion polling in the current presidential contest between George W. Bush and Al Gore. I have been reading all these stories in newspapers and on the Internet about how one candidate or the other is ahead in the latest tracking poll, and just wondered what the heck that's all about. How is it, short of being outed as a pedophile or a tax cheat, that the support of a candidate can change so quickly week to week? Are we really so shallow in our political beliefs, or are the numbers simply not real?
Having talked this week with a few pollsters, I find that this flopping around of numbers is really a lot less dramatic than it appears, and stems from two different effects. First (and this is a legitimate effect with real meaning for the final election) is the role of undecided voters. Some people don't make up their minds until they are in the polling booth. And in the weeks before the election, these folks sway in the wind of public opinion. It is a lot easier to sway an undecided voter than a voter who has already decided, especially when the election is weeks away. And so they sway week to week, attracted or repelled by whatever the week's news brings. These folks tend to enjoy their indecision. It certainly gives them something to talk about, but the real question is, "Are these people really undecided or just enjoying the chase?" If, rather than polling them, these people were shoved toward a ballot box on week two or week eight of a campaign, would their votes actually change? Nobody knows.
The other effect that keeps these poll numbers churning comes from the realm of what Benjamin Disraeli (who knew a thing or two about elections) is said to have called the three kinds of lies: "lies, damned lies, and statistics." The pollsters are so good about mentioning in every case that their margin of error is plus or minus, say, four percentage points, that this admonishment goes unnoticed, just like the fine print at the bottom of car ads or the software license agreements that we click on without ever really knowing what they say. Plus or minus four percentage points is an eight-point spread. So if the numbers change by five percentage points from one week to the next, does it really mean anything? Maybe it does, maybe it doesn't.
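For readers who want to check that arithmetic, here is a minimal Python sketch. It assumes a simple random sample, a 95 percent confidence level, and two independent polls, none of which the pollsters I spoke with specified; the function names are made up for illustration. Under those assumptions, a sample of about 600 people gives the plus-or-minus-four figure, and the uncertainty on a week-to-week change is larger than the uncertainty on either poll alone.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level (an assumption, not from the pollsters)


def margin_of_error(sample_size, share=0.5, z=Z_95):
    """Margin of error, in percentage points, for one candidate's share.

    Uses the textbook formula for a simple random sample; share=0.5 is the
    worst case and is what published margins usually assume.
    """
    return 100 * z * math.sqrt(share * (1 - share) / sample_size)


def margin_of_change(moe_first, moe_second):
    """Margin of error, in points, for the shift between two independent polls."""
    return math.hypot(moe_first, moe_second)


def shift_is_meaningful(shift_points, moe_first, moe_second):
    """True only if the shift is larger than the combined margin of error."""
    return abs(shift_points) > margin_of_change(moe_first, moe_second)


print(round(margin_of_error(600), 1))        # 4.0
print(round(margin_of_change(4.0, 4.0), 1))  # 5.7
print(shift_is_meaningful(5.0, 4.0, 4.0))    # False
```

Under these assumptions, a five-point swing between two such polls falls inside the noise, while a six-point swing would not.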
With all this in mind, I called my favorite high-tech pollster, Ann Stephens, empress of PC Data. Along with its arch-rival Media Metrix, PC Data is in the heavyweight division of companies tracking Web surfing, and I thought this might give them some insight into political behavior, too. More specifically, Ms. Stephens had claimed to me during a previous election that PC Data's poll numbers were remarkably accurate. What were they learning about the current fight between Bush and Gore?
This is Internet polling, of course. It is accomplished by either sending out a few thousand e-mails or by giving Web site visitors or ISP log-ins a chance to share their views. Online polls are cheaper and faster to do, but are they more or less accurate? That's what I wanted to know.
"Online polls are, well, different," said Stephens, "and that's because people behave differently online than they do on the phone or in front of a pollster who visits their home. I'm not saying they are less accurate, but you have to understand what you are working with. Part of it has to do with the kind of people you meet online. They tend to be a bit more conservative. There are more Rush Limbaugh than Ralph Nader fans online. And the number of people who will participate in an online poll is also smaller than for a comparable phone survey. Fewer take part. But those who do take part tend to be passionate about their beliefs, or at least they are more willing to express that passion."
So does that make online polls less accurate?
"Not if you take these factors into account in designing your poll and selecting your sample," said Stephens. "We are at the point where the Internet community is more than half of the voting population, and that's a pretty good sample. And Internet polling has some real advantages over telephone polling, which leaves out entire professions. Telephone polls, which are done mostly in the early evening, completely ignore people who are working at that hour, skewing the sample toward mainstream nine-to-fivers and away from people who work nights. Yet both groups vote."
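One common way to take a skewed sample into account is to weight each group's answers by that group's share of the voting population. The Python sketch below illustrates the idea only. It is not PC Data's method, which Stephens did not describe, and every number and name in it is hypothetical.

```python
def raw_support(responses):
    """Unweighted support: every respondent counts equally."""
    total_respondents = sum(count for count, _ in responses.values())
    supporters = sum(count * support for count, support in responses.values())
    return supporters / total_respondents


def weighted_support(responses, population_share):
    """Support after reweighting each group to its share of the electorate."""
    return sum(
        population_share[group] * support
        for group, (_count, support) in responses.items()
    )


# Hypothetical online sample: group -> (respondents, fraction backing candidate A).
# Conservatives are over-represented relative to the assumed electorate.
responses = {"conservative": (600, 0.30), "other": (400, 0.60)}

# Hypothetical share of each group among actual voters.
population_share = {"conservative": 0.45, "other": 0.55}

print(round(raw_support(responses), 3))                         # 0.42
print(round(weighted_support(responses, population_share), 3))  # 0.465
```

In this made-up case, the raw online tally understates candidate A's support by about four and a half points, and the weighting recovers it.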
What about the story I remembered in which PC Data polls were supposed to have been extraordinarily accurate?
"That was during the congressional election in 1998," Stephens recalled. "We were just getting our panel up and running then, and were trying to fine-tune our sample size. It made sense to run some political polls so we could compare them to the more traditional polls being announced every week. And of course, the election itself would be the final determinant of how good a sample we had. If the election mirrored our poll results, we would be right on."
"The problem was that our poll results didn't match the other polls at the time. This was 1998, Clinton had been impeached, and the mainstream polls were predicting a Republican surge in Congress with an expected gain of up to 20 seats (out of 435). But our polls showed no such landslide. They showed that a lot of people were angry — the passion that is probably masked in other polling techniques — but there was no Republican triumph in the works. Thinking that the other polls were right, this meant our sample was off somehow. But when the actual vote came, it turned out that we were right and the other polls were wrong. Every poll was off except ours. Our sample was good."
So what does that mean for the current election? Where does PC Data say the votes will fall? They don't say. The company is not doing any election polling this year and the reason why might surprise you.
"Yes, in a way it is a shame we don't do those polls anymore," said Stephens. "They are quick and cheap to do. We can poll a sample of 10,000 voters and have results in 18 hours. But online polling is not a good business — there is no money in it — and it actually puts us at risk. The companies doing online polls are losing their shirts. What we measure are online transactions, where there is a lot of money to be made. We want PC Data to be perceived as an online transaction tracker, not an online pollster. In the current market, a transaction company is worth three times as much as a pollster. We have to be very clear about this, even though the polls are interesting and fun to do."