Bad Research; No Biscuit.

April 19, 2011 · 7:00 am

Fun With Crosstabs

A couple of thoughts on PPP’s bombshell last week about interracial marriage in Mississippi, which, if you missed it, said that 46% of “usual Republican Primary voters” there say interracial marriage should be against the law (full PDF of results here):

First, at the very outset, this: I like PPP quite a lot. They do solid work. They release more data on their polls than many of their competitors. They are a Democratic pollster, but if you look at the data Nate Silver puts together for FiveThirtyEight, PPP’s results are both pretty accurate and pretty unbiased. Yes, one can argue that being partisan gives them an agenda, and while I’m sure they’re not sad to have uncovered this data point, at the end of the day, a pollster is only as good as his numbers, and PPP’s numbers are good. If anything, I’d say their partisan leaning gives them some “cover” when they want to ask something controversial, like they did here. Actual question wording: “Do you think interracial marriage should be legal or illegal?” I think a good number of pollsters would have difficulty asking that question; it seems, to me at least, to be too likely to cause offense. More on that in a minute, though.

Needless to say, the fact that 46% of their respondents said “illegal” made a lot of news last week. PPP says they asked the question of Democrats (and I assume independents) as well, and will be releasing that part of the data shortly, and I’m sure that will make some news, too. I’m not naive enough to believe that racism is dead, and it’s worth noting that while the US Supreme Court struck down state laws banning interracial marriages back in 1967, neighboring Alabama didn’t get around to removing the anti-miscegenation bit from its state constitution until the year 2000, but I still think these numbers are surprisingly high. These are findings that produce coverage because people talk about them; in this case, a lot of sad head-shaking mixed with gleeful potshots from one side, and a lot of unoriginal pollster attacks from the other, mostly of the usual tired “but it’s IVR, it could have been an 8-year-old taking the poll” sort. (Because it would be a better or easier-to-stomach finding if 46% of Mississippi 8-year-olds felt this way?)

Still, I think there are about three perfectly reasonable reasons to question these results, so let’s get to it:

First, there’s this: “legal” and “illegal” sound really similar on the phone, no matter how carefully the announcer (or the live interviewer) says the words. Everyone does it, because workaround wording is cumbersome and ends up being something like “should so-and-so be allowed by law? Or against the law,” which is problematic because people generally just don’t say “allowed by law,” so you end up causing some confusion there, as well. Instead, it might be better, somewhat illogically, to go with something even longer, like “should Mississippi allow interracial couples to get married? Or should interracial marriage be against the law?” It’s impossible to mis-hear that, except for the next thing:

There isn’t (or, there wasn’t until this poll was released) any real debate in this country right now about interracial marriage. There is, however, one about same-sex marriage. I think it’s very possible that some percentage of respondents simply misunderstood the question. This was raised, to an extent, in the comments on the original PPP post on this poll, and I think PPP’s response might have been a little too hasty, though I understand where they’re coming from. To be very, very clear: I do not think the respondents on this poll, or southerners, or Republicans, or southern Republicans, or any people in general, are stupid. I do not think people are hearing the word “interracial” and failing to understand what it means. I think they’re just not hearing the word at all. Look: people are not taking polls in ideal lab conditions; they’re taking them in real life, in rooms that contain television sets, and children, and the internet, and in some cases they’re taking the polls in their cars on their cell phones, despite the pollster’s best intentions to not accidentally break the law by calling any wireless numbers. No matter how clearly worded and how well-read the question is, the pollster is battling an infinite number of distractions for each respondent’s time, and can’t expect each respondent to be giving them his or her undivided attention. In 2011, in an election poll, if the 14th question you’re asked, coming immediately after several “how would you vote” questions, is “Do you think interracial marriage should be legal or illegal,” I think you’re going to have one of three reactions, and only one of them is good.

You might hear the question correctly and answer it. Spoiler alert: that’s the good one.

You might hear the question incorrectly and answer it. Another spoiler: that’s bad.

You might get so offended that you hang up the phone. This is so bad that it’s the topic of the next paragraph.

If you ask a question that deeply offends some of your respondents, they’re going to hang up on you. That’s going to leave only the non-offended to answer that question. What does that do to your results? If you, out of the blue, put a question in the middle of your Mississippi Republican primary election poll, a question that carries the pretty clear subtext of “are you a racist,” I think there’s a reasonable chance that many non-racist respondents will decide they know where this is headed — to a bad place — and drop out of the poll. I actually think a similar thing may be driving some of the polls showing large numbers of Republicans in the “birther” camp — a lot of non-birthers are hearing that question as “Now I’d like to ask you something to try to make people who share some of your views on politics look really, really stupid. Where do you think Barack Obama was born?” Reasonable Republicans may be dropping out at this point, which not only is inflating the number of Republican birthers, but it’s also having the side effect of inflating the poll numbers for “candidates” like Donald Trump, because, in most cases, if you hang up during the course of a poll, none of your previous answers count. If Romney voters start dropping out of a poll near the tail end, their answers up at the top end of the poll are also going to go in the garbage, and that’s going to give a bump to the fringe respondents’ favorite candidates.
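To make the dropout mechanism concrete, here is a rough back-of-the-envelope sketch. Every number in it is invented purely for illustration (nothing here comes from PPP’s data), and the function name is my own; it simply assumes that people who don’t hold a view hang up at a higher rate than people who do, and shows what that does to the share among completed interviews.

```python
def observed_share(true_share, dropout_holders, dropout_others):
    """Share of COMPLETED interviews holding a view after differential dropout.

    true_share:      fraction of everyone reached who holds the view
    dropout_holders: fraction of view-holders who hang up at the question
    dropout_others:  fraction of everyone else who hang up at the question
    """
    holders_left = true_share * (1 - dropout_holders)
    others_left = (1 - true_share) * (1 - dropout_others)
    return holders_left / (holders_left + others_left)


# Hypothetical inputs: 30% hold the view; 5% of them hang up,
# but 40% of everyone else hangs up when the question is read.
share = observed_share(0.30, 0.05, 0.40)
print(f"{share:.1%}")  # 40.4%, up from a "true" 30%
```

The same arithmetic applies to any earlier question in the poll: if the people who hang up lean toward one candidate, that candidate’s share shrinks among the completes.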

All this is testable, of course, with some a/b sample splitting. I think the hardest thing is making it absolutely clear we’re talking about interracial marriage (as opposed to same-sex marriage), which might require, ironically enough, asking about same-sex marriage first to make it more abundantly clear, in the subsequent interracial marriage questions, that we’re now talking about something else.
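For what it’s worth, a split-sample test along these lines might be sketched roughly as follows. This is only an illustration under my own assumptions: respondents are randomly assigned to wording A or wording B, and the share answering “illegal” in each half is compared with an ordinary two-proportion z-test. The function names and the counts are hypothetical.

```python
import math
import random


def assign_version(rng):
    """Randomly assign a respondent to question wording A or B."""
    return rng.choice(["A", "B"])


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """z statistic and two-sided p-value for the difference of two proportions."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard normal
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 92 of 200 say "illegal" under wording A, 70 of 200 under B.
z, p = two_proportion_z(92, 200, 70, 200)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A gap between the two halves that is large relative to its standard error would suggest the wording, and not just the underlying opinion, is driving some of the answers. Comparing break-off rates at the question across the two halves would speak to the hang-up hypothesis in the same way.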

I look forward to seeing what data PPP shows for the rest of Mississippi respondents from this poll; offhand I’d expect a somewhat smaller number, but still larger than many will be comfortable with. Either way, I’d take it all with a grain of salt until someone is able to do some testing of my hypothesis here.

Filed under election polling, IVR, Public Opinion Polling

April 6, 2011 · 4:00 am

Yet Another Example of How Not To Use The Internet To Conduct Research

(edit, April 6, 2011: Over a year since I posted this, and I just took another Zogby poll (now an “Ibope Zogby poll,” by the way), and they’re still asking this question the same way. And I still, despite being pretty politically aware, knowing my congressman’s name, and having even written the guy and gotten a response on at least one occasion, have absolutely no idea what district number I live in. Everything below this parenthetical addition is old content, so if you have seen it before, sorry.)

This is from a couple of weeks ago, and I’m just now getting a chance to post it.

88% of Americans live in a state with fewer than 53 US congressional districts in it. Only California has that many; Texas comes in second with 32.

And yet, here’s how the good folks at Zogby Interactive ask what congressional district you live in:

That’s right. Zogby asks what state you live in, and then asks you, regardless of how many districts your state contains, which of 53 districts you live in. This is terrible for a lot of reasons, beginning with what should be obvious to everyone: it’s really lazy.

Looking at this from a practical political standpoint, though, it’s a mess. Folks just don’t think about their congressional district that way. Many (certainly not all) will know the name of their representative — or at least be able to pick the name from a short list of possibilities — but the odds of them knowing the actual district number aren’t great.

That being said: it can be problematic to ask people who their representative is if you’re then going to ask whether they’ll vote for that person. Doing so creates a priming effect, reminding respondents (or informing those less focused on politics) of incumbency, and makes it difficult to do a clean “would you vote for x or y” question. While I didn’t get that question as a follow-up, it’s possible some respondents did, though I somewhat doubt it this far out.

A much better way to ask this question is to ask for zip code, which will let you look up the right district in most cases; a simpler method (for the respondent), and one that might feel less personally intrusive, is to remember that this is the internet and present a state map, on which the respondent can zoom in and actually CLICK WHERE HE LIVES.
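The zip code approach could look something like the sketch below. The lookup table is entirely made up (placeholder zip codes and district labels), and the function names are mine; a real version would load a proper zip-to-district file. The point is just that most respondents never need to see a district number, and only those whose zip code straddles more than one district get a follow-up.

```python
# Hypothetical lookup table: zip code -> congressional district(s).
# Real data would come from a maintained zip-to-district file.
ZIP_TO_DISTRICTS = {
    "00001": ["XX-01"],
    "00002": ["XX-01", "XX-02"],  # a zip code split across two districts
}


def districts_for_zip(zip_code):
    """Return the possible districts for a zip code (empty list if unknown)."""
    return ZIP_TO_DISTRICTS.get(zip_code.strip()[:5], [])


def needs_follow_up(zip_code):
    """True when the zip code alone does not pin down a single district."""
    return len(districts_for_zip(zip_code)) != 1


for z in ["00001", "00002-1234", "99999"]:
    print(z, districts_for_zip(z), needs_follow_up(z))
```

For the ambiguous cases, the follow-up could be a short list of representatives’ names or the clickable map, with the priming caveat above in mind.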

And, saying what should be obvious, but maybe isn’t: if you structure your research in such a way that only the very very very top-most super-engaged respondents are qualified to answer a follow-up, your results are only going to reflect that tiny slice of the population.

Pathetic, and sadly, about what one would expect.

For Better or For Worse, Twitter.

Honestly, I don’t know if it’s a good thing that the vast majority of my thoughts come in little snippets no longer than 140 characters at a time, but it seems to be the case.

I can’t promise every one (or even necessarily most) of my tweets will be about the mess that is market research these days, but for better or for worse, it’s where I mostly am. I’ll of course continue to post longer items here when the mood strikes me, but if you’re not following me there, you may want to.

Databases Are Your Friend!

I’ve ranted about this so many times that it’s a true pleasure to see it being done the right way by YouGov/Polling Point here:

(obviously, I blanked out the zip code.)

Compare and contrast with other folks’ ongoing aggravation of asking me to pick which country I live in (from a list of about a billion, though the US at least is at the top), then pick my state, and then enter my zip code. Harris does this pretty frequently, though I just saw an example the other day where they instead asked if I still lived in the United States, and then (without asking for state or zip) asked me to type the name of the city where I live, which I found somewhat unusual.

Anyway, nice to see this happening. We’re already on the panel, so you already know all this background info!

Filed under databases are your friend, Doing it right, Market Research, web research, YouGov

February 8, 2011 · 10:46 am

Taking a Hatchet to Your Matrix

Not a newsflash: I hate matrixes. That being said, I acknowledge they’re sometimes going to be necessary. If you’ve got to use one, though, I think it’s in everyone’s interest to keep each one as small as possible, and to use as few of them as possible.

There’s often a point in web surveys where the respondent is asked whether or not he has heard of a number of different items – brands of orange juice, for instance, to use my favorite example. That’s followed by another question asking which of the brands the respondent has personally tried.

Then come the matrixes, where respondents are asked to rate each of the brands that they’ve heard of – not just the subset they’ve personally tried – across a number of rating criteria, each one likely being its own matrix on its own page. This is the point where the respondent suddenly regrets being so honest about the brands he’s seen in the grocery store or advertised on TV, because he suddenly realizes he’s going to be spending the next fifteen minutes of his life clicking “don’t know” or “not applicable” on matrix after matrix inquiring about the best flavor, the least pulp, the nicest packaging, and so on. I get, very clearly, that as researchers, this isn’t entirely a waste of time – we can give our clients a report that shows the attitudes crosstabbed by both active users and those who are just aware of each brand. It has the added “bonus” of letting us inflate the number of respondents — you get to tell your client that you asked the evaluation questions of significantly more people than you would have if you’d only included those who use the brands in question. (This is the product research version of asking unlikely voters how they’ll be voting.) And, of course, it’s possible that some respondents will have differing levels of familiarity with the products they don’t themselves use, and may actually be able to provide useful feedback nevertheless. But, still:

I’m writing this, actually, as I take a break from a piece of research I’m in the middle of taking. I think I’m on about the sixth matrix page. I’ve got 8 columns going across – 7 point Likert plus a “not sure” – and 10 rows of brands going down, only 1 of which is asking me about something I truly have knowledge of – the other 9 are things I’ve heard of, but have no ability to evaluate. I don’t want to go into specifics, but let’s pretend it’s about travel, and that it first asked me which foreign cities I’d ever considered traveling to, and then asked which ones I’d actually visited — and now it’s asking me about every city I’d considered going to, to rate the quality of its museums, central train station, hotels, safety, and so on. There might be the occasional question I can answer based on something a friend told me or based on something I vaguely remember reading on Wikipedia or in a Rough Guide, but in general, I’m just not able to comment on the friendliness of the Dublin populace, you know?

Not only is this frustrating, but I’m also (and this wouldn’t apply to an ordinary respondent) acutely aware that my speeding through page after page, clicking “not sure” for 9 of the 10 choices and then assigning an answer choice to the one thing I’m familiar with is probably going to result in my responses being discarded anyway.

I have a sense, based on the level of detail each matrix has gone into, that I’m going to have another 4 or 5 of these waiting for me, and honestly, I’m hoping I time out while I write this; if I do, I’m done.

Is an aggravated respondent really in anyone’s best interest?
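One way to take the hatchet to it, sketched very roughly: build each matrix only from the brands the respondent says he has actually tried, and optionally cap the number of aware-only brands that ride along. The function and its `max_untried` parameter are my own invention for illustration, not anything a particular survey platform provides.

```python
def matrix_rows(aware, tried, max_untried=0):
    """Rows for a rating matrix: brands tried, plus at most
    max_untried brands the respondent has only heard of."""
    tried_rows = [b for b in aware if b in tried]
    untried_rows = [b for b in aware if b not in tried][:max_untried]
    return tried_rows + untried_rows


aware = ["Brand A", "Brand B", "Brand C", "Brand D"]
tried = {"Brand C"}

print(matrix_rows(aware, tried))     # ['Brand C']
print(matrix_rows(aware, tried, 1))  # ['Brand C', 'Brand A']
```

With a cap in place, the client still gets some aware-only evaluations, but nobody faces ten rows of “not sure.”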

Filed under bad user experiences, data quality, Market Research, matrixes make me cry, web research

February 7, 2011 · 8:02 am

(Nit)Picking on Harris

So, I dipped my toe back into web research this morning. Since I generally have decent experiences with Harris polls, I decided to give them a shot. As usual with them, it was a fairly painless experience; nothing really glaringly wrong or obnoxious, so let’s not dwell too much on either of these things.

First, from a pure design standpoint, I don’t understand the point of these massively over-wide columns. If you’re going to answer true for some and false for some, it’s really a lot of left-right mouse or trackpad motion — enough that it created a minor annoyance for me. In a 3-question true/false setup like this, it’s really not terrible — but in a longer series of questions, it might drive me to drink:

Wouldn’t this shopped version be easier to use?

Like I said, pretty minor. Which brings us to my second and final observation on this poll:

Who the heck are the numbers for?

So, yes: all pretty minor.

Filed under Harris, jargon, Market Research, silly nitpicking, web research

February 1, 2011 · 7:48 am

Feeling the Urge…

I’m feeling the urge to get back at this. Going to try to sit through some Toluna research today, see if it’s any better than the last time I looked, and then report back with some dire pronouncements about the future of market research.

Though I basically just said it in the comments on Ray’s excellent post.

Carry on; I’ll be along presently.

Filed under administrivia, general, Market Research

June 21, 2010 · 11:19 am

Obscure AND Potentially Personally Identifying? Let’s Ask It!

Sent in by a reader; click to embiggen:

Bad enough they’re asking for something few people would know offhand — and who wants to go fetch a piece of mail to get the answer — but I think there’s an equally bad issue here regarding respondent confidentiality, at least theoretically. A quick search of census data for some five-digit zip codes chosen at random from among those I’m familiar with around the country shows between about 8,500 and 16,000 occupied households in each. (I wouldn’t call that an average, as it’s practically anecdotal, but it’ll do for now, since I can’t find exactly what I’m looking for.) A zip+4, though, is designed to be reflective of a much, much smaller geography. According to the US Postal Service:

The 4-digit add-on number identifies a geographic segment within the 5-digit delivery area, such as a city block, office building, individual high-volume receiver of mail, or any other unit that would aid efficient mail sorting and delivery

How small are those “geographic segments?” You can use this USPS lookup tool to get a sense of it. I live on a suburban street; my house is on a corner. My immediate neighbor around the corner has a different zip+4; the people across the street have a different zip+4; the house immediately behind me has a different zip+4. The house next door to me, though, and the two houses that follow it going down to the end of the block all have the same zip+4 as mine. Apparently, my personal zip+4 will narrow you down to one of four homes.

Now, presumably, you gave your full mailing address when you signed up for this panel, so it’s not as if the research company doesn’t already know exactly who you are and where you live, and it’s not as if telephone research doesn’t contain your even more personally identifiable phone number right there in the data. But still, this makes me uncomfortable. Rather than using back-end databases to append that information in post-production (which, for the millionth time, would be the ideal way to deal with this situation), we’re instead outright asking for something that both makes your data pretty easy to tie back to you and which you don’t know in the first place. (I actually thought I knew mine, and I don’t, though I was fairly close.)

All in all, this strikes me as a really bad question. What do you think?

Filed under bad user experiences, data quality, databases are your friend, ethics, Market Research, redundant questions, web research

May 11, 2010 · 11:23 am

LA Times: What the What?

So for a couple of weeks now, I’ve been getting emails from the Los Angeles Times about how my email newsletter subscriptions are about to end. I’ve been ignoring them, because I don’t think I actually get any emails from the Los Angeles Times. I suppose I must have registered with a real email address on their site to read a story once, years ago, before BugMeNot and their Firefox extension made such things unnecessary. In any case, I don’t care, fine, whatever, stop sending me those newsletters you’re not actually sending me, I’ll find a way to survive, despite the longing I shall forever feel in my heart.

Just now, though, I got this brilliant piece of email from them:

“Why have we stopped sending you emails?” WHAT DO YOU THINK THIS THING IS? IT’S AN EMAIL! THAT YOU’RE SENDING ME! ABOUT HOW YOU’VE STOPPED SENDING ME EMAILS WHICH IN ACTUALITY YOU NEVER WERE SENDING ME IN THE FIRST PLACE!

It boggles the mind.