Introduction to Research Methods in Political Science: The POWERMUTT* Project (for use with SPSS Version 21) *Politically-Oriented Web-Enhanced Research Methods for Undergraduates — Topics & Tools Resources for introductory research methods courses in political science and related disciplines

V. VARIETIES OF DATA

Introduction

This topic addresses some of the different types of data analyzed by political scientists.  Research in which experiments are conducted is addressed first, not only because it is important in its own right, but because it is in key ways the “gold standard,” while other methods are often attempts to adapt the underlying logic of experimental design to conditions that do not lend themselves to experimentation.  Survey research is addressed next, since it is so widely used and since it illustrates very well the opportunities and problems that result when research is taken out of the lab and into the real world.  Finally, to illustrate the wide range of approaches to data collection in the study of politics, brief descriptions are provided of three other methods: focus groups, content analysis, and participant observation.

Experimental Design

In a classic experimental design, the objects to be studied are randomly assigned to one of two or more groups.  (The concept of randomness will be described more fully later in this topic).  The value of what we have called the independent variable (sometimes referred to in other disciplines by other terms such as the “stimulus” or “experimental” variable) is then manipulated in order to measure its impact on the dependent variable (also known by terms such as “response” or “criterion” variable).  For example, a medical researcher investigating a possible treatment for cancer may randomly assign mice with cancer to two different groups, one of which (the experimental group) will receive the treatment while the other (the control group) will not.  The researcher will then measure, for example, differences in survival rates between the two groups.
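Random assignment itself is easy to carry out in software.  The following is a minimal sketch in Python, assuming twenty hypothetical subject labels and an even split into two groups; the function name, the labels, and the seed are illustrative choices and do not come from any study described here.

```python
import random


def assign_groups(subjects, seed=None):
    """Randomly split subjects into an experimental and a control group."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(subjects)      # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject labels, invented for illustration.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]
experimental, control = assign_groups(subjects, seed=42)

print("Experimental group:", sorted(experimental))
print("Control group:     ", sorted(control))
```

Because chance alone decides who lands in which group, the two groups should, on average, be alike in every respect other than the treatment.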

In a pure experimental design[1] one would have:

• Randomly assigned subjects (R)
• Pre-observations (O1) of the subject group
• An intervention by the researcher (eXperiment)
• Post-observations of the subject group (O2)
• A control group’s observations (O3, O4)

The classical empirical research design is often summarized as:

R  O1  X  O2
R  O3     O4

While it is relatively uncommon in social science research, this design does express an ideal world for the researcher.  The researcher can examine the state or condition of the subjects prior to an intervention (O1) and compare it with the state or condition afterwards (O2).  The assumption is that the changes between O1 and O2 are due to “X”.  However, there are often competing explanations for the differences between O1 and O2, so the researcher uses the second group (R  O3  O4) as a mechanism to control for all those explanations the researcher cannot foresee.  This latter group is called the control group.

If we wanted to prove that a research methods course improved a student’s ability to read and understand the literature in political science, then we could randomly assign college students to research classes and other students to non-research courses. Prior to the beginning of classes all students would be assessed as to their reading and understanding of the literature. The randomly assigned students in the research course would then be given a full term of instruction.  At the end of the term all students would again be evaluated as to their reading and comprehension of the literature.  This would be a classical research design and —

• Ideally O1 would be roughly equal to O3
• Ideally O2  would be higher than O1
• Ideally O4 would be higher than O3 because some improvement in reading the literature should be developed in all courses.
• Ideally O2 would be higher than O4
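These four expectations can be checked with simple arithmetic on group averages.  The sketch below is only an illustration: the scores are invented, the function name is hypothetical, and a real analysis would add a test of statistical significance.

```python
from statistics import mean

# Hypothetical assessment scores (0-100), invented for illustration only.
o1 = [52, 48, 55, 50, 47]   # research class, before the term
o2 = [71, 66, 74, 69, 65]   # research class, after the term
o3 = [51, 49, 54, 50, 48]   # control group, before the term
o4 = [57, 54, 60, 55, 53]   # control group, after the term


def design_summary(o1, o2, o3, o4):
    """Compare average gains in the experimental and control groups."""
    gain_treated = mean(o2) - mean(o1)
    gain_control = mean(o4) - mean(o3)
    return {
        "baseline_gap": mean(o1) - mean(o3),         # ideally close to zero
        "gain_treated": gain_treated,                # O2 minus O1
        "gain_control": gain_control,                # O4 minus O3
        "net_effect": gain_treated - gain_control,   # gain attributable to X
    }


summary = design_summary(o1, o2, o3, o4)
for name, value in summary.items():
    print(f"{name:>13}: {value:6.1f}")
```

Subtracting the control group's gain from the experimental group's gain removes the improvement that would have occurred in any course.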

The design, along with the statistical analysis of the assessments, enables us to see if the research class (X) caused a change.  Cause is established by three factors. Did the researcher’s “X” precede the outcome (O2)?  Is the X statistically associated with the changes in O2?  Has the researcher established there are no competing explanations because the random changes are accounted for by O3 and O4?  So while this would be the ideal situation we understand why this type of research design is not more widely used because:

• Students sign up for classes and are not randomly assigned.  Therefore “R” is not feasible.
• Testing everyone prior to the academic term would be costly. The O1 and O3 options — while not impossible — are unlikely.
• At best we might find a comparison group at the end of students who would be willing to participate — but their investment might be considerably less.

Consequently, while social science researchers know what an ideal research design is, it is often the case that the context is not conducive to the ideal plan.  Therefore, in social science we talk about variables influencing other variables.  The independent variable “research class” influences the assessment scores of students’ comprehension, but it may not be the cause because we have not met the scientific standard for cause.

Experimental design does play an important role in political research, both in the laboratory and in field studies.  An example of a subject well suited to laboratory research is the study of political campaign ads.  Stevens, for example, conducted a laboratory experiment in which undergraduates enrolled in introductory political science classes were shown negative (attack) ads.[2]  He found that these ads increased the information levels of sophisticated subjects (those scoring high on an index of political interest and knowledge), but produced little or no gain among those less sophisticated.  (He found similar results from non-experimental analysis using data from the American National Election Study.)  An example of field research is the study by Gerber and Green of the impact of different methods of encouraging voter turnout.[3]  The researchers randomly divided registered voters into different groups, one (the control group) that received no stimulus, the others (the experimental groups) that were contacted by mail, by phone, or in person with “get out the vote” messages.  Gerber and Green found that face-to-face contact had the greatest impact, while telephone calls had essentially none.

Other examples of field experiments involve question wording in public opinion surveys.  Sometimes researchers will randomly split a sample into two or more groups, each of which will be asked a question using slightly different words.  Most surveys overestimate voting participation, since many respondents are reluctant to admit not having voted.  In the 2004 American National Election Study, half of all respondents were simply asked whether they had voted in the November election. The other half of the sample was asked to choose from among four responses: 1) "I did not vote (in the election this November)," 2) "I thought about voting this time, but didn't," 3) "I usually vote, but didn't this time," and 4) "I am sure I voted."  Compare the "valid percents" in the two tables below.[4]  Note: actual turnout was 55 percent of the voting age population.[5]

Survey Research and Sampling

Survey research (or “public opinion polling”) is another “scientific” approach to the study of human behavior.  However, it also serves to demonstrate ways in which social research is “as much an art as a science.”  Good survey research goes to great lengths to employ rigorous sampling methods.  Equal care is devoted to careful design and pre-testing of questionnaires, training of interviewers, and processing and analyzing data.  On the other hand, survey research involves a great deal of uncertainty, and requires making a number of judgment calls about which reasonable people will disagree.

Sampling is very important in social science research.  It is often impossible to study all of the cases we might wish to, and so we instead take a sample from a larger population (also called a universe).  Survey research is the most common, though not the only, social science application of sampling techniques.

Ideally, a sample should be random.  A random sample is one such that each item in the population from which the sample is drawn has an equal probability of being included in the sample.[6]  The reason why this is important is that a random sample provides an unbiased estimate of the characteristics of the population, so that respondents will constitute a representative sample of the population.  Put another way, if 60 percent of a random sample of voters favor candidate X then, although it would be impractical to interview all voters, our best guess is that 60 percent of all voters also favor candidate X.

The reliability of this best guess will increase with the size of the sample.  If we have only interviewed a sample of 10 voters our results will be a lot less reliable than if we have interviewed a thousand.  Ninety-five times out of a hundred, a random sample of 1,000 will be accurate to within about 3 percentage points.  Put a bit more formally, such a sample has a margin of error, or confidence interval, of approximately plus or minus (±) 3 percentage points at a 95 percent confidence level.  If a random sample of 1,000 voters shows that 60 percent favor candidate X, there is a 95 percent chance that the real figure in the population is somewhere in the range of about 57 to 63 percent.
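The figures above follow from the standard textbook formula for the margin of error of a proportion in a simple random sample.  The sketch below assumes that formula and a very large population; the function name is an illustrative choice.

```python
import math

Z_95 = 1.96  # z-score corresponding to a 95 percent confidence level


def margin_of_error(sample_size, proportion=0.5, z=Z_95):
    """Margin of error for a proportion, assuming a simple random sample
    drawn from a population much larger than the sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


for n in (10, 100, 1000, 2000):
    points = 100 * margin_of_error(n)
    print(f"n = {n:>4}: plus or minus {points:.1f} percentage points")
```

With a sample of 10 the margin is about 31 points, with 1,000 it is about 3.1, and doubling the sample to 2,000 narrows it only to about 2.2, because the margin shrinks with the square root of the sample size.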

Creative Research Systems has provided an elegant on-line sample size calculator (http://www.surveysystem.com/sscalc.htm).  In the first dialog box, select the 95% confidence level, enter 3 for the confidence interval and 20000 for the population, and then click on “Calculate.”  What sample size do you need?  Without changing anything else, add a zero to the population size, changing it to 200000.  How much does the needed sample size increase?  Add three more zeroes, making the population size 200000000 (two hundred million).  By now you should have reached the somewhat counterintuitive conclusion that, beyond a certain point, the size of the population makes little difference.  If you were sampling people in a small town, you would need almost as large a sample as you would if your population consisted of the entire country.
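The same pattern can be reproduced with the usual sample size formula plus a finite population correction.  This is a sketch under that assumption; it is not taken from the calculator itself, whose rounding may differ slightly, and the function name is hypothetical.

```python
import math


def required_sample_size(population, margin=0.03, proportion=0.5, z=1.96):
    """Sample size needed for a given margin of error, applying the
    standard finite population correction."""
    # Size needed if the population were effectively infinite.
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # The correction matters only when the population is small.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


for population in (20_000, 200_000, 200_000_000):
    print(f"population {population:>11,}: sample of {required_sample_size(population):,}")
```

Multiplying the population by ten thousand adds only a few dozen interviews to the required sample.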

In the second dialog box, select the 95% confidence level, enter 1000 as the sample size and 20000000 as the population.  (Leave “Percentage” at 50.  This refers to the estimated percent of the population having the characteristic you are sampling, and 50 is the most conservative option.)  Click on “Calculate.”  What is the confidence interval?  Without changing anything else, double the sample size to 2000 and again click on “Calculate.”  Notice that the confidence interval is not reduced dramatically.  That is why surveys usually don’t include more than about 1,500 respondents.  The number of interviews is a dominant factor in driving the costs of a survey, and beyond a certain point increasing this number is not cost effective, since costs will increase almost proportionately, but the margin of error will be reduced only a little.

Often it is not practical to carry out a pure random sample.  One common shortcut is the area cluster sample.  In this approach, a number of Primary Sampling Units (PSUs) are selected at random within a larger geographic area.  For example, a study of the United States might begin by choosing a subset of congressional districts.  Within each PSU, smaller areas may be selected in several stages down to the individual household.  Within each household, an individual respondent is then chosen.  Ideally, each stage of the process is carried out at random.  Even when this is done, the resulting sampling error will tend to be a little higher than in a pure random sample,[7] but the cost savings may make the trade-off well worthwhile.
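A two-stage version of this procedure can be sketched as follows.  The districts and households are invented, the function name is hypothetical, and the sketch assumes clusters of equal size; real designs usually adjust for clusters that differ in size.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Pick clusters at random, then pick households at random within each."""
    rng = random.Random(seed)
    # Stage 1: choose the Primary Sampling Units.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        households = clusters[name]
        k = min(n_per_cluster, len(households))
        # Stage 2: choose households within the selected unit.
        sample.extend((name, household) for household in rng.sample(households, k))
    return sample


# Hypothetical frame: 30 districts with 200 households each.
clusters = {
    f"district_{d:02d}": [f"d{d:02d}_household_{h:03d}" for h in range(200)]
    for d in range(1, 31)
}
sample = two_stage_cluster_sample(clusters, n_clusters=5, n_per_cluster=20, seed=7)
print(len(sample), "households drawn from", len({name for name, _ in sample}), "districts")
```

Interviewers need to travel to only five districts rather than thirty, which is where the cost savings come from.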

Somewhat similar to a cluster sample is a stratified sample.  Suppose, for example, that you were studying opinion among students at a university, and wanted to be sure that the numbers of lower division, upper division, and graduate students were representative of the student body as a whole.  You might proceed by dividing the student body into three strata representing these categories, and then select students at random from each of the strata in proportion to their share of the student population.  Sometimes, a research design will call for deliberate oversampling of some small groups so that there are sufficient cases to permit reliable analysis.  (If the university consists mostly of undergraduates, you might need to oversample grad students in order to have enough of them on which to base comparisons with undergrads.)  However, any time analysis is carried out that combines the different strata, cases must be weighted in order to correct for this oversampling.  (In our example, failure to weight cases would result in overestimating the graduate student component of student body opinion.)
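The effect of oversampling and weighting can be illustrated with a short sketch.  The enrollment figures and sample sizes are invented, and the function names are hypothetical; each respondent's weight is the number of students in the stratum that he or she stands in for.

```python
import random


def stratified_sample(strata, sample_sizes, seed=None):
    """Draw a random sample within each stratum and attach a weight."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        n = sample_sizes[name]
        weight = len(members) / n   # students represented by each respondent
        for person in rng.sample(members, n):
            sample.append({"id": person, "stratum": name, "weight": weight})
    return sample


def weighted_share(sample, stratum):
    """Share of the weighted sample that belongs to one stratum."""
    total = sum(r["weight"] for r in sample)
    return sum(r["weight"] for r in sample if r["stratum"] == stratum) / total


# Hypothetical student body of 10,000, mostly undergraduates.
strata = {
    "lower division": [f"L{i}" for i in range(5000)],
    "upper division": [f"U{i}" for i in range(4000)],
    "graduate": [f"G{i}" for i in range(1000)],
}
# Graduate students are deliberately oversampled.
sample_sizes = {"lower division": 200, "upper division": 200, "graduate": 200}
sample = stratified_sample(strata, sample_sizes, seed=1)

unweighted = sum(r["stratum"] == "graduate" for r in sample) / len(sample)
print(f"Graduate share, unweighted: {unweighted:.1%}")
print(f"Graduate share, weighted:   {weighted_share(sample, 'graduate'):.1%}")
```

Unweighted, graduate students make up a third of the sample; weighted, they return to their true one-tenth share of the student body.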

The essential difference between area cluster sampling and stratified sampling is their purpose.  An area cluster sample is appropriate when it would be impractical to conduct a random sample over the entire population being studied.  A stratified sample is appropriate when it is important to ensure inclusion in the sample of sufficient numbers of respondents within subcategories of the population.  An important point to stress is that sampling is conducted at random within each cluster or stratum.

Many so-called public opinion polls fail to employ random sampling methods.  As a citizen, as well as a student of political science, it is important that you be able to recognize such polls, and their severe limitations.  You may have been stopped at a shopping mall by someone with a clipboard, and asked some questions about your shopping habits.  Maybe while watching a college football game on TV you have been asked to call a toll-free (or toll) number to vote for your choice for the Heisman Trophy or, while reading a story online, clicked on a question about policy in the Middle East.  Perhaps you have clipped, filled out, and mailed in a questionnaire in a magazine.

None of these surveys employ anything like random sampling.  Most rely primarily on self-selection, and those who opt to call, click, or mail in a survey may well differ systematically in their views from those who do not.  Even if those questioning customers at the mall are careful to include representative numbers of men and women, older and younger shoppers, people of different races, etc., this approach, called a quota sample, should not be confused with a stratified random sample, since there is no guarantee of representativeness within the various groups questioned.  Those visiting the mall may or may not have different views from demographically similar people who, for example, shop online or at neighborhood markets.

Statistical methods used to make inferences about populations based on samples cannot legitimately be applied to non-probability based samples.  Such samples should be avoided if possible.  When you see surveys based on non-probability based samples reported in the media, you may find them interesting or entertaining, but should take their findings with a grain of salt.

Even in the best designed surveys, strict random sampling is a goal that can almost never be fully achieved under real world conditions, resulting in non-random (or “systematic”) error.  Let us assume that a survey is being conducted by phone.  Not everyone has a phone.  Not all who do are home when called.  A version of the questionnaire may not be available in a language spoken by the person to be interviewed.  People may refuse to participate, especially if they have had their patience tried by telemarketers.  The resulting sample of people who are willing and able to participate may differ in systematic ways from other potential respondents.

Apart from non-randomness of samples, there are other sources of systematic error in surveys.  Slight differences in question wording may produce large differences in how questions are answered.  The order in which questions are asked may influence responses.  Respondents may lie.

Journalists who use polls to measure the “horse race” aspect of a political campaign face additional problems.  One is trying to guess which respondents will actually turn out to vote.  Pollsters have devised various methods for isolating the responses of “likely voters,” but these are basically educated guesses.  Exit polls, in which voters are questioned as they leave the voting area, avoid this problem, but the widespread use of absentee voting in many states creates new problems.  These issues are less of a problem for academic survey research, where the focus is usually on analyzing existing patterns rather than on predicting future events.  Some such surveys are even conducted after an election is over.  The American National Election Study, for example, includes both pre- and post-election interviews.  Post-election surveys are not without their own pitfalls, however.  Respondents will sometimes have a tendency to report voting for the winner, even when they did not.

While surveys can be conducted by mail, these usually yield very low and often unrepresentative response rates, and so the preferred survey methods are face-to-face or via telephone.  The General Social Survey still employs face-to-face interviews wherever feasible.[8]  The American National Election Study split the 2000 sample between face-to-face and telephone interviews, went to an all-telephone survey in 2002, and returned to face-to-face interviews in 2004.[9]

In general, however, telephone surveys have increasingly become the method of choice.  The biggest advantage is cost.  The per-interview cost of telephone interviews is simply far less than what is required when interviewers are sent door-to-door (thus spending more time getting to interview sites and incurring travel expenses).  There are other factors favoring the use of the telephone.  Interviewers can be more easily trained and more closely supervised.  Problems that arise can be dealt with on the spot.  Computer Assisted Telephone Interviewing (CATI) technology can be employed for such things as random-digit dialing, call-backs, and data entry.

A disadvantage of telephone surveys compared to door-to-door, face-to-face interviews is their relatively low response rates.  When the American National Election Study split its sample between face-to-face and telephone interviews for its 2000 pre-election survey, it obtained a response rate of 64.8 percent for the former, compared to 57.2 percent for the latter.  An analysis of a number of telephone and face-to-face surveys showed that face-to-face surveys were generally more representative of the demographic characteristics of the general population.[10]  Note that most telephone surveys produce response rates far lower than that obtained by the ANES, and "telephone survey response rates are on a precipitous decline."[11]  Another problem that has surfaced in recent years is that many (especially younger) voters rely entirely on cell phones rather than on land lines, and calling the former is more costly since federal law requires that they be called manually.

These difficulties have led to at least two alternative approaches, the "robo" poll and the online poll, both often referred to as "interactive" surveys.[12]  A robo poll, known more formally as an Interactive Voice Response (IVR) poll, is a telephone survey in which the questions are asked by a computer using the voice of a professional actor.  Though these polls have very poor response rates (because it's much easier to hang up on a computer than on a fellow human being), the per-interview costs are much lower.  The hope is that larger sample sizes can compensate for samples that are less random.

Another approach is the online poll, in which the "interaction" is conducted over the Internet.  Like robo polls, online polls are also less expensive than traditional telephone surveys, and so larger samples are feasible.  Because they require respondents to "opt in," however, the results are not really random samples.

When samples, however obtained, differ from known characteristics of the population (as shown, for example, by comparing the sample to recent census figures), they can be weighted to compensate for under- or over-representation of certain groups.  There is still no way of knowing, however, whether respondents and non-respondents within these groups differ in their political attitudes and behavior.
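A minimal sketch of this kind of weighting follows.  The age groups, population shares, sample counts, and support rates are all invented for illustration (they are not census or survey figures), and the function names are hypothetical.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each group: its population share divided by its sample share."""
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] / (sample_counts[group] / n)
        for group in sample_counts
    }


def weighted_mean(values, counts, weights):
    """Average of group-level values, weighting each respondent by group weight."""
    numerator = sum(values[g] * counts[g] * weights[g] for g in counts)
    denominator = sum(counts[g] * weights[g] for g in counts)
    return numerator / denominator


# Hypothetical sample of 1,000 in which younger respondents are under-represented.
sample_counts = {"18-29": 100, "30-64": 550, "65+": 350}
population_shares = {"18-29": 0.20, "30-64": 0.60, "65+": 0.20}   # assumed, not real
support = {"18-29": 0.70, "30-64": 0.55, "65+": 0.40}             # assumed, not real

weights = poststratification_weights(sample_counts, population_shares)
equal_weights = {group: 1.0 for group in sample_counts}

print("Weights:", {g: round(w, 2) for g, w in weights.items()})
print(f"Unweighted support: {weighted_mean(support, sample_counts, equal_weights):.1%}")
print(f"Weighted support:   {weighted_mean(support, sample_counts, weights):.1%}")
```

The weights correct the mix of groups in the sample, but they cannot correct for differences between those who responded and those who did not within each group.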

Focus Groups

Focus groups and surveys have some similarities, and are often used by the same researchers.  They are, however, quite different and it is important not to confuse them.

Specific differences between a survey and a focus group include the following:

• A survey will typically include about 1,000-1,500 respondents.  A focus group will typically include 8-10 participants.
• Respondents to a survey are, insofar as possible, selected at random so as to constitute a representative sample of the population from which they are drawn.  Members of a focus group are chosen because the members have something relevant in common.  For example, they may be first time voters.  To obtain a range of perspectives, a study may include a number of different focus groups, each with different common characteristics.  Even taken together, however, participants in focus groups cannot be considered a representative sample of the population.
• A survey will consist of a carefully structured questionnaire in order to obtain comparable responses from all people surveyed.  A focus group will be much freer flowing in order to probe more deeply into people's opinions and feelings.
• A survey is most useful for testing hypotheses.  A focus group is most useful for generating hypotheses.
• Information gathered from surveys lends itself to quantitative analysis.  Information gathered from focus groups lends itself to qualitative analysis.

Surveys and focus groups are complementary rather than competing methods and are often used in combination.  For example, before a survey questionnaire is put together, researchers may conduct a series of focus groups. In these groups, an attempt will be made to get a better sense of what issues are of greatest concern to participants, how they react to certain words or concepts, or how they perceive different candidates and parties.  Researchers may then be better able to know which questions to include in a survey, and how best to word them.  Avoid the temptation to treat focus groups as a less expensive alternative to surveys.

Content Analysis

The study of texts or documents of one sort or another has long been important in political research.  Content analysis attempts to make such study more rigorous.  An example is the Penn State Event Data Project.  Researchers there have developed software to read, code, and analyze extensive electronic document collections (e.g., the Reuters wire service reports) in order to study patterns of interaction between nations.  Their hope is that understanding these interactions better might help avoid international conflict.

Participant Observation

This approach was originally developed by anthropologists such as Margaret Mead.  These scholars sought to understand cultures (in Mead’s case, those of tribal societies in the South Pacific) by immersing themselves in those cultures as fully as possible over extended periods of time.  An example of participant observation in recent political science is the work of Richard Fenno, who studied members of Congress by following them around on their visits to their states and districts, “soaking and poking” (in Fenno’s words), and trying to blend in with the members’ environments.  Such research is, by its nature, largely qualitative, and better suited to generating than to testing hypotheses.

Key Concepts

Exercises

1.   Assume that you wish to survey students at your college or university regarding their opinions on various issues, their political party loyalties, and their voting intentions in the next election.  Design an appropriate questionnaire, decide how many students you will need for your sample, and spell out how the sampling will be done.  (If you are actually planning to carry out such a survey, be aware that your institution has, or should have, rigorous legal and ethical standards for conducting research involving human subjects.  Before carrying out your research, your research design will need to be approved by your campus's Institutional Review Board (IRB).  Allow plenty of time to find out what the standards are, and be sure to incorporate them into your design.)

The following two exercises are intended to provide extremely simple examples of content analysis using only commonly available software.

The next two exercises require that your library subscribes to LexisNexis Academic. In LexisNexis, choose "Power Search," "Terms and Connectors," and "All News (English)."  Then try the following:

2. Former California Governor Arnold Schwarzenegger has often been called a RINO (Republican in Name Only) because of his liberal stands on some issues at some times. On other issues or at other times, however, he has been conservative.  To see how his image may have changed over time, type "Schwarzenegger W/5 liberal" into the search box, and specify that you wish to search between January 1 and December 31 of 2002.  ("Schwarzenegger W/5 liberal" means that you are looking for articles in which "liberal" and "Schwarzenegger" appear within five words of each other.)   How many hits did you get?  Repeat, but substitute "conservative" for "liberal."  Repeat for both "liberal" and "conservative," but for each year since 2002.  Do you see any patterns?

3. Sometimes the media use different terms to describe the same or similar concepts, and sometimes terms go in or out of fashion for one reason or another. For each of the past 10 years, compare the relative frequencies of the following:

• "global warming" vs. "climate change";
• "gun control" vs. "gun rights";
• "gay marriage" vs. "same sex marriage."

4. This exercise requires a word processor or other software capable of searching text files. Go to http://www.csupomona.edu/~jlkorey/205/ and download and open the files named "Biden.doc," "McCain.doc," "Obams.doc," and "Palin.doc." These are the acceptance speeches delivered by the presidential and vice presidential candidates of the two major parties in 2008. What words or terms do you think were especially closely linked to each of the candidates? Search each of the four documents to see how often this word or term was used by each of the speakers.
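The counting involved in these exercises can also be done with a short script.  The sketch below assumes the texts have been saved as plain text; the two sample passages are invented placeholders rather than quotations from any speech or article, the function names are hypothetical, and the proximity check is only one reasonable reading of a "W/5" search, not the rule LexisNexis itself applies.

```python
import re


def count_term(text, term):
    """Count whole-word occurrences of a word or phrase, ignoring case."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def within_n_words(text, term_a, term_b, n=5):
    """True if two single words appear within n words of each other."""
    words = re.findall(r"[a-z']+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= n for i in positions_a for j in positions_b)


# Invented placeholder passages, for illustration only.
texts = {
    "Speaker A": "Change is coming. We need change, and change we will have.",
    "Speaker B": "Country first. I will always fight for my country.",
}

for term in ("change", "country"):
    counts = {name: count_term(text, term) for name, text in texts.items()}
    print(term, counts)

headline = "Critics called Schwarzenegger too liberal on the environment."
print(within_n_words(headline, "Schwarzenegger", "liberal"))
```

Raw counts are only a starting point; the same word may be used approvingly by one speaker and critically by another.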

For Further Study

Experiments:

Colorado State University, “Overview: Experimental and Quasi-Experimental Research,” http://writing.colostate.edu/guides/research/experiment/.

Survey Research:

Nelson, Elizabeth N., and Edward E. Nelson, “Survey Research Design and Quantitative Methods of Analysis for Cross-Sectional Data,” California Opinions on Women's Issues -- 1985-1995. http://www.ssric.org/trd/modules/cowi/chapter3

Palmquist, Ruth A., Survey Methods. http://www.gslis.utexas.edu/~palmquis/courses/survey.html.

Research Randomizer. http://randomizer.org.

The Roper Center, Polling 101: Fundamentals of Polling. http://www.ropercenter.uconn.edu/education/polling_fundamentals.html.

The Roper Center, Polling 201: Analyzing Polls. http://www.ropercenter.uconn.edu/education/analyzing_polls.html.

There are several good sources that "aggregate" polling data available online, including:

Focus Groups:

Center for Institutional Research, Evaluation, and Planning, University of Texas at El Paso, "The Who, What, When, Where, Why, and How (many) of Focus Groups." http://irp.utep.edu/Default.aspx?tabid=57975.

Luntz, Frank I., “Focus Groups in American Politics,” PollingReport.com. http://www.pollingreport.com/focus.htm.

Content Analysis:

Colorado State University, "Overview: Content Analysis," http://writing.colostate.edu/guides/research/content/.

[1]  This paragraph, and the five that follow it, were written by Sandra Emerson, and are used (in slightly modified form) with permission.

[2] Daniel Stevens, "Separate and Unequal Effects: Information, Political Sophistication and Negative Advertising in American Elections," Political Research Quarterly 58 (September 2005): 413-425.

[3]  Alan S. Gerber and Donald P. Green, “The Effects of Personal Canvassing, Telephone Calls, and Direct Mail on Voter Turnout: A Field Experiment,” American Political Science Review 94 (Sept. 2000): 653-664.

[4]  In these tables, a weight variable was used.

[5]  The American Presidency Project, "Voter Turnout in Presidential Elections, 1829-2008," http://www.presidency.ucsb.edu/data/turnout.php. Accessed February 20, 2013.

[6] Strictly speaking, a distinction needs to be made between a “simple random sample,” in which 1) each item and 2) each combination of items in the population have an equal probability of being included in the sample, and a “systematic random sample,” which meets only the first of these conditions.  An example of a systematic random sample would be one chosen by selecting every 100th name from a phone directory.  In this case, two persons who are listed adjacent to one another in the directory would never end up in the same sample.  For most purposes, however, this distinction is of no great practical importance.

[7] Herbert M. Blalock, Jr., Social Statistics revised 2nd ed. NY: McGraw Hill, 1979: 568-569.

[8] General Social Survey, "FAQs: 6. How is the GSS administered?", http://publicdata.norc.org:41000/gssbeta/faqs.html#6. Accessed February 20, 2013.

[9] Nancy Burns, Donald R. Kinder, Steven J. Rosenstone, Virginia Sapiro, and the National Election Studies. National Election Studies, 2000: Pre-/Post-Election Study [codebook].  Ann Arbor, MI: University of Michigan, Center for Political Studies, 2001.

[10] Charles H. Ellis and Jon A. Krosnick, “Comparing Telephone and Face-to-Face Surveys in Terms of Sample Representativeness: A Meta-Analysis of Demographic Characteristics,”  National Election Studies: Technical Report #59. http://www.electionstudies.org/resources/papers/documents/nes010871.pdf.  April 1999.  Accessed February 20, 2013.

[11] Marketing Charts, "Telephone Survey Response Rates Dropping; Accuracy Remains High" http://www.marketingcharts.com/wp/direct/telephone-survey-response-rates-dropping-accuracy-remains-high-22107/

[12] Among organizations conducting robo polls are Rasmussen Reports and SurveyUSA.  Online polls include those by Harris Interactive and CONNECTA.  For discussions of these types of polls, see http://www.pollster.com/blogs/ivr_polls/ and http://www.pollster.com/blogs/internet_polls/.

Last updated April 19, 2013.