Vue October 2011 Special Feature


Like a Survey

Annie Pettit shows how social media research is not to be feared but rather has so much in common with survey research practices that you already know how to make the leap to using social media for research purposes.

Annie Pettit

A strange, frightening beast prowls loudly about. You can't quite see it, you're not sure if you want to touch it, but it's intriguing. Something about it seems oddly familiar, but you can't quite put your finger on it. The beast, or social media research as we've now learned to call it, shouldn't be scary and strange. In fact, it's just like a survey, the surveys you already know and love, and here is why.

Data Collection

The first step of data collection in the survey world is finding people from whom to gather opinions. Survey panels are among the most popular resources, with every panel working on its own secret sauce to attract a wide range of panelists from a wide range of sources. They may use various techniques, such as Internet advertising, banners, and website partners, to ensure they have reached as many types of people as possible. This ensures that their panels have as little bias or skew as possible.

In the social media research space, data collection is viewed in a very similar way. Attempts are made to find people's opinions (e.g., status updates, tweets, messages, comments, replies) from as wide a range of sources as possible. While most clients are interested in opinions collected from Twitter and Facebook, the Internet space of social media opinions is much larger. Care is taken to seek out opinions from blogs (e.g., Blogger, WordPress), video sites (e.g., YouTube, Metacafe), news sites (e.g., CNN, Fox News), and many other places across the Internet that permit people to share their opinions online. By accessing as many different types of websites as possible, data collectors can ensure there is as little bias or skew as possible.



Data Quality

In the survey and focus group world, we've spent decades identifying and fighting myriad data quality issues. We've identified the issue of straightlining and designed checks to catch people who straightline in inappropriate places. We've identified the issue of people rushing through surveys and designed data quality checks to weed out speeders. We've created red herring tests, age and gender matching tests, verbatim tests, and incidence rate checks. We've almost gotten to the point where we've created checks to check the checks.

Once again, social media data is like a survey in that data quality issues abound. The issues may be slightly different, but researchers are continuously working to identify them and create precise quality checks to deal with them. For instance, certain categories of data, such as financial and pharmaceutical data, are known to be heavy with spam. Anyone who writes or reads an unmoderated blog or forum knows all about social media comments like, "You have most wonderful blog about Adidas shoes buy Viagra." These comments fill up the blog page in the hope that readers will realize they do want to buy Viagra. Of course, comments like these, though they mention the desired brand name, aren't actually about the brand, and social media systems strive to identify and remove them.

There are also data quality concerns around astroturfing, a problem whereby people are paid to deliberately make positive comments about a brand on as many websites as possible. The hope is that people reading those websites will think that a large group of happy consumers love the brand. Fortunately, these types of comments can also be caught via carefully tuned automated systems. For sure, more data quality issues will arise but, like surveys, we'll continue to hunt and eradicate them.

Sampling

Survey sampling is an extremely important process that develops out of the research objective. Whether we use a survey panel or in-house client lists, researchers have a specific sampling goal to achieve. Perhaps we need to speak with a demographically diverse sample of people, or perhaps we need to speak with a sample of pre-qualified, targeted groups of people. Either way, the sampling process allows us to ensure that the people participating in our research are relevant for our work and have the knowledge and opinions required.

Identifying people who are relevant for a research objective can be a difficult process, but in the social media space, we have data on our side. The mere act of mentioning a brand name in social media means that the brand is somehow relevant to the person. They like it, they hate it, they buy it, they avoid it, they've heard of it, they've been told to check it out, they've heard about it on TV, they're wondering what it is. There is a reason the brand name was mentioned, and that makes the person's opinion important to understand. Social media data benefits from a unique perspective: instead of survey researchers trying to identify the target audience, the social media audience identifies itself to the data collector.

Weighting

As hard as researchers try, it's almost always impossible to collect a set of data that perfectly matches a sampling matrix. Weighting allows us to ensure that, even when opinions are not collected in proportions that reflect our desired population, the end results will reflect those proportions. For example, if an attempt to collect a census-representative sample of research participants resulted in a sample that was 75% female and 25% male, our weighting processes allow us to interpret the data as if it were 50% female and 50% male.

Like surveys, social media research depends on weighting processes, though the variables used are different. Instead of weighting opinions based on the demographic profile of the research participant, opinions are weighted based on the source of the opinion. Weighting becomes extremely important when you consider that different websites attract very different users and produce very different amounts of data. For instance, Twitter attracts early adopters, encourages flaring tempers, and produces a lot of data. On the other hand, Blogger attracts people who think out their ideas carefully in well-considered blog postings, and they produce less data. Thus, even when Twitter data accounts for 50% of a data set, researchers can consider weighting those responses down to account for the fact that only about 13% of Internet users actually use Twitter. Failing to consider whether weighting is an appropriate tool for a specific job may render results that are not generalizable to the intended Internet population.
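The weighting arithmetic described above is straightforward to sketch. The following illustration uses hypothetical numbers and invented function names, not any particular vendor's system: each source's weight is the ratio of its target share of the population to its share of the collected data, and those weights then adjust a sentiment average.

```python
# Illustrative sketch of source-based weighting (hypothetical numbers).
# Weight for a source = target share of the population / share of the data set.

def source_weights(sample_share, target_share):
    """Return a per-source weight for each source in the sample."""
    return {src: target_share[src] / sample_share[src] for src in sample_share}

def weighted_mean(scores, weights):
    """Average (source, score) pairs, weighting each score by its source."""
    total = sum(weights[src] * score for src, score in scores)
    norm = sum(weights[src] for src, _ in scores)
    return total / norm

# Twitter makes up 50% of our data set but only ~13% of Internet users;
# blogs make up the other 50% of the data set but (say) 87% of users.
sample_share = {"twitter": 0.50, "blogs": 0.50}
target_share = {"twitter": 0.13, "blogs": 0.87}
w = source_weights(sample_share, target_share)  # twitter ≈ 0.26, blogs ≈ 1.74

scores = [("twitter", -1), ("twitter", -1), ("blogs", 1), ("blogs", 1)]
print(round(weighted_mean(scores, w), 2))  # 0.74, not the unweighted 0.0
```

Down-weighting the over-represented Twitter opinions moves the average toward the blog opinions, just as demographic weighting moves survey results toward the census proportions.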

Scaling

Thinking about words as numbers may feel like a foreign concept, but it is actually an everyday process for every market researcher. When a survey is set up, we make many decisions about how we want people to answer the questions. We decide whether to use three-point, five-point, seven-point, or ten-point scales. We decide what labels to put on those scales, whether they measure likelihood to recommend, likelihood to purchase, satisfaction, or agreement. Once we have made our decisions, we then let research participants select the numbers on our scale that best reflect their opinions. For instance, a research participant may feel that he or she is "Very Likely" to "Intend to purchase Brand A."

In the social media data space, the same theories apply. The researcher selects the most appropriate scale, whether it is three points (positive, neutral, negative), five points (strongly positive, somewhat positive, neutral, somewhat negative, strongly negative), or something else. Because sentiment can be scored on a continuous scale, researchers can decide exactly which scale is most appropriate for their work. The major difference, however, is that instead of the data contributors deciding which number best suits their opinion, the researcher decides which number best suits the opinion. For example, the researcher may decide that "totally gonna buy Brand A" best suits a coding of "Strongly positive." On the other hand, the researcher may decide that "don't buy that crap" falls into "Somewhat negative." The addition of a swear word may render "don't buy that f****** crap" a "Strongly negative." Like a survey, scaling is inherent in the social media research process.
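A hedged sketch of how a researcher-chosen scale might be applied in code follows; the phrase lists and function name are invented for illustration, not an actual scoring system, but they mirror the examples above, including the swear word that escalates a negative comment.

```python
# Toy sentiment scaler: the researcher, not the contributor, picks the point
# on a five-point scale that best suits each verbatim.
# Phrase lists below are illustrative assumptions only.

POSITIVE = ("gonna buy", "love")
NEGATIVE = ("don't buy", "hate", "crap")
SWEARS = ("f******",)

def score_verbatim(text):
    """Return a five-point sentiment label for one social media comment."""
    t = text.lower()
    strong = any(s in t for s in SWEARS)
    if any(p in t for p in POSITIVE):
        return "Strongly positive" if strong or "totally" in t else "Somewhat positive"
    if any(n in t for n in NEGATIVE):
        return "Strongly negative" if strong else "Somewhat negative"
    return "Neutral"

print(score_verbatim("totally gonna buy Brand A"))    # Strongly positive
print(score_verbatim("don't buy that crap"))          # Somewhat negative
print(score_verbatim("don't buy that f****** crap"))  # Strongly negative
```

Real systems score sentiment on a continuous scale and then bin it into whichever scale the researcher chooses, but the principle is the same: the number is assigned to the opinion, not selected by its author.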

Variables

We hope that every survey is designed with specific research objectives in mind. Whether those objectives result in a usage and attitude study, a segmentation study, or a tracker, they require specific questions to be asked. To meet these needs, researchers design questions about purchasing behaviours, recommendation behaviours, and trial and usage behaviours. We frame those questions such that they will work well with the scales we have designed. Thus, a typical purchase question may be written as, "How likely are you to purchase Brand A?"

Variables are also an important component of social media data. Once again, however, the variables are determined by the data contributors, who choose what topics are important enough for them to talk about. Once this data is collected, researchers can create variables to match the topics of conversation. For example, verbatims that use words or phrases like "gonna buy," "want to purchase," and "gonna lay down some cash" would be classified into the purchasing variable. Verbatims that use phrases like "you gotta buy that" or "I'd totally recommend that" would be classified into the recommendation variable. And, when combined with sentiment information, social media variables plus sentiment scores are a direct match to questions on a survey.
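To make that classification step concrete, here is a minimal sketch; the phrase lists and variable names are assumptions chosen to match the examples above, not a production taxonomy.

```python
# Toy variable classifier: route verbatims into survey-style variables
# based on the phrases contributors happened to use.
# Phrase lists are illustrative assumptions, not a real taxonomy.

VARIABLES = {
    "purchasing": ("gonna buy", "want to purchase", "lay down some cash"),
    "recommendation": ("you gotta buy", "i'd totally recommend"),
}

def classify(verbatim):
    """Return the set of variables one verbatim contributes to."""
    t = verbatim.lower()
    return {var for var, phrases in VARIABLES.items()
            if any(p in t for p in phrases)}

print(classify("I'm gonna buy that tomorrow"))       # {'purchasing'}
print(classify("I'd totally recommend Brand A"))     # {'recommendation'}
```

Paired with a sentiment score, each classified verbatim then plays the same role as an answer to a scaled survey question about purchasing or recommendation.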

Ethical Standards

The ethics of using social media data for research purposes are still under massive debate, but there are a number of issues to consider as you develop your own opinions. As with survey research, the overriding principle is respect for the research participant. We do our best in survey research to avoid abusing our respondents, though we sometimes fail when we request their participation in 60-minute surveys with massive grids. And we avoid interacting with, interfering with, or talking to people when we conduct observational research in shopping malls and stores. Similarly, with social media data, we must ensure that we always treat data contributors with respect. We must remember that they have not volunteered to give us data, they have not given us permission to quote their comments directly, and they have not invited us into their Twitter streams. We are outside observers and must act respectfully while we do our work.

Conclusions

New methodologies can seem overwhelming at first. But once they are framed around familiar words and processes, what was once new and scary becomes familiar and easy to understand. Data collection, data quality, sampling, weighting, scaling, and variables are processes we already know and understand, and they apply to social media data as well. Like a survey, social media research is just a good ol' friend.

Annie Pettit, PhD, is VP Research Standards at Research Now and Chief Research Officer of Conversition Strategies, a social media research company. She is an online market researcher who specializes in social media market research, survey research, and data quality. Annie is a member of the CASRO, MRA, and ESOMAR social media research committees. She tweets at @LoveStats and maintains the Conversition and the LoveStats marketing research blogs where she occasionally showcases her attempts at being a better baker and gardener.

