Data Collection Mode Effect on Abortion Questions: A Comparison of Face-to-Face and Web Surveys

Introduction
Public opinion on abortion has been changing over time (Shaw, 2003). Attitudes toward abortion are complex around the world, and in the U.S. they are largely polarized. The debate over pro-choice versus pro-life has long been a central topic of discussion (Medoff, 2013), and public opinion has become even more divided. For example, Shaw (2003) showed that in 1992, 35% of the public were pro-life and 59% were pro-choice. In 2003, the difference was much smaller (45% pro-life versus 48% pro-choice). The acceptance of abortion varies depending on several factors, such as pre-adulthood factors (Pacheco & Kreitzer, 2016), demographics (Woodhams, Hill, Fabiyi, & Gilliam, 2016), media consumption (Altshuler, Gerns Storey, & Prager, 2015), and occupation (Begun, Kattari, McKay, Winter, & O'Neill, 2017; Sjöström, Essén, Sydén, Gemzell-Danielsson, & Klingberg-Allvin, 2014). Along with the change in public opinion on abortion, the estimated U.S. abortion rate per 1,000 women aged 15 to 44 years declined from 19.4 to 14.6 between 2008 and 2014, a 25% change (Jones & Jerman, 2017). However, the rate of decline varied with the demographics of women, such as age, race and ethnicity, and income.
It is important to note that both public opinion research on abortion and population estimates of abortion are often based on survey data. Studies of this type can have a profound impact on policy making when policies are informed by the findings of public opinion surveys. Given that, research on the measurement of the relevant questions will benefit the understanding of public opinion as it relates to abortion. More specifically, a solid understanding of mode effects on abortion attitudes has implications for how researchers, practitioners, and policy makers measure public opinion on this important topic. To date, survey researchers have devoted limited effort to examining the survey methodology of abortion-related attitudinal questions. An early study on measuring abortion in surveys of U.S. women analyzed three major surveys (the National Survey of Family Growth, the National Surveys of Young Women, and the National Longitudinal Surveys of Work Experience of Youth) and found that self-reports of abortion were highly deficient, especially among nonwhite women (Jones & Forrest, 1992). In a survey conducted in Mexico, Lara, Strickler, Olavarrieta, & Ellertson (2004) tested four data collection modes, namely face-to-face, audio computer-assisted self-interview, paper-based self-administered questionnaire, and a randomized response technique. They found that the randomized response technique yielded the highest level of self-reported abortion attempts. A meta-analysis also showed that the randomized response technique increased self-reports to sensitive questions, including abortion, compared to other data collection modes (Lensvelt-Mulders, Hox, Van der Heijden, & Maas, 2005). A recent study by Singer and Couper (2014) compared using "baby" versus "fetus" when asking about attitudes toward abortion and found no significant difference in abortion preferences, although the preference for prenatal testing for genetic defects differed by question wording.
The choice of data collection mode is a crucial factor when it comes to measuring attitudes and opinions toward abortion. The literature shows that survey responses vary depending on the survey mode, especially for sensitive questions like the ones examined in this study (Kreuter, Presser, & Tourangeau, 2008).
There is some literature on mode effects between face-to-face and Web surveys, and some consistent findings can be drawn from previous studies. First, response rates tend to be higher for face-to-face than for Web surveys. This is possibly due to the different levels of interviewer contact in the two modes (Christensen, Ekholm, Glümer, & Juel, 2014; Heerwegh & Loosveldt, 2008; Manfreda et al., 2008; Revilla & Saris, 2012). Interviewers in face-to-face surveys recruit respondents, introduce the survey, address their concerns, and persuade them to participate, whereas Web survey respondents are usually recruited by email or mail and initiate the survey themselves. A Web survey involves very little direct interpersonal interaction between respondents and the survey organization. Second, face-to-face surveys suffer from a higher level of social desirability bias than Web surveys, especially when asking sensitive questions (Heerwegh, 2009). Social desirability bias refers to the phenomenon of overreporting socially desirable attitudes and behaviors while underreporting socially undesirable ones (Callegaro, 2008). It is most prevalent when respondents answer questions asking for sensitive information or questions with a potentially socially desirable response (Christensen et al., 2014; Duffy, Smith, Terhanian, & Bremer, 2005). Face-to-face respondents are more likely to provide a socially desirable answer in order to present themselves in a favorable light or to avoid tension and negative judgment from the interviewer or from other people present during the survey. The relatively higher level of anonymity and confidentiality in a self-administered Web survey can increase the disclosure of undesirable responses. Third, the evidence on data quality for the two modes is mixed. While Web surveys tend to show a higher level of item nonresponse in general and more non-differentiation on rating scales, face-to-face surveys are susceptible to more extreme response bias (Beukenhorst et al., 2014; Goldenbeld & de Craen, 2013; Heerwegh, 2009; Heerwegh & Loosveldt, 2008). Given the mixed findings, more comparative research is urgently needed to examine mode effects on measurement and data quality between face-to-face and Web surveys.
The current study intends to expand the existing literature on the measurement of abortion questions by examining the data collection mode effect on attitudes toward abortion. In particular, this study examines the responses to eight attitudinal questions on abortion from two nationally representative surveys, one conducted face-to-face and one conducted through the Web. More specifically, this study compares the substantive responses and item nonresponse rates for abortion questions between these two modes of data collection. The eight questions asked for the respondent's opinion about abortion in eight scenarios: nonfatal health risk to the pregnant woman, fatal health risk to the pregnant woman, incest, rape, birth defect of the fetus, financial hardship, the child not being the sex the woman wants it to be, and the woman's choice (see Appendix for exact wordings).
Why should we expect a mode effect in abortion questions? On the one hand, attitudinal questions on abortion are typically seen as sensitive, and the differential levels of social desirability bias between face-to-face and Web surveys are likely to result in different patterns of response. Specifically, there are two possibilities. First, respondents may hide their real attitudes and provide more acceptable, albeit untruthful, responses (Christensen et al., 2014; Duffy et al., 2005; Liu & Wang, 2015). In this case, we should expect to observe a different response pattern to the abortion questions. Second, respondents may withhold their opinion completely by not offering a substantive response (Christensen et al., 2014). This results in item nonresponse, including "don't know" and "refusal" responses. The interviewer involvement in a face-to-face survey is likely to increase social desirability bias, and hence face-to-face respondents are more likely to provide socially acceptable responses or not to provide answers at all. On the other hand, the higher motivation resulting from interviewer involvement in face-to-face interviews is likely to result in more thoughtful and conscientious responses compared to Web surveys (Beukenhorst et al., 2014; Goldenbeld & de Craen, 2013; Heerwegh, 2009; Heerwegh & Loosveldt, 2008). Under this account, face-to-face respondents will provide fewer ambiguous answers (i.e., the middle option) and fewer non-substantive answers (i.e., item nonresponse).
This study also examines whether the survey mode effect differs by gender. Research has shown gender differences in attitudes toward abortion (Finlay, 1981; Lohan, Cruise, O'Halloran, Alderdice, & Hyde, 2011; Schwandt et al., 2013). Women hold more liberal attitudes and show higher approval of women's autonomy in abortion decisions than men (Patel & Johns, 2009). It is possible that when a topic is more relevant to the respondents, as abortion is to women, respondents are likely to possess a well-formed attitude and to be less susceptible to the impact of the survey mode. By contrast, men's attitudes toward abortion may not be as firm, and hence men may be more likely to edit their responses based on the sense of privacy, anonymity, and confidentiality that the survey mode provides.

Study population and data collection
This study examines the data collection mode effect using the 2012 American National Election Studies (ANES). The 2012 ANES is a nationally representative survey examining the general population's electoral participation, voting behavior, and public opinion. The target population is U.S. citizens aged 18 or older as of the 2012 Election Day. The 2012 ANES includes two waves of data collection, namely a pre-election study and a post-election study, and the same respondents were interviewed twice. The field period for the pre-election study was between September and November 2012, and the field period for the post-election study was between November 2012 and January 2013. The 2012 ANES innovatively conducted two parallel surveys, one through a face-to-face mode and one through the Web, using two independent national samples and one identical questionnaire.
The Web survey was conducted using the GfK Knowledge Panel, which is a nationally representative online panel. The panelists were recruited through address-based sampling and random-digit dialing. All household members were enumerated at the recruiting stage, and demographic information was collected before any survey. Respondents for the 2012 ANES were selected from this probability-based panel. The face-to-face survey used an address-based, stratified, multi-stage cluster sample. The first stage of sampling consisted of stratifying the 48 contiguous states and the District of Columbia into nine regions corresponding to Census Divisions, which constitute this study's strata. Within each of the nine regions, census tracts were then randomly selected in proportion to the region's share of the U.S. adult population. In the second stage, residential addresses within each tract were randomly selected. In the third stage, one eligible person per household was randomly selected. The sample included a main sample and two oversamples, one of African Americans and one of Hispanic Americans. Since these two probability samples both target the same U.S. general population, they should have comparable coverage.

Measures
As mentioned, eight abortion questions were asked about a third of the way into the post-election survey (see Appendix). The order of the first seven questions was randomized, and the question on the woman's choice was always asked last. Three response options were provided: favor, oppose, or neither favor nor oppose. Face-to-face respondents could answer "don't know" or refuse to answer any question; both were coded as item nonresponse. Web respondents could skip a question, which was also coded as item nonresponse. In the analysis, I calculated the percentage of item nonresponse for each of the eight questions and compared the percentages across the two modes.
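The coding rule described above can be illustrated with a minimal sketch. It is written in Python for illustration only, and the response labels (`dont_know`, `refused`, `skipped`) and the toy response lists are hypothetical; they are not the actual ANES variable codes or data.

```python
# Hypothetical labels: the actual ANES codes are not given in the text.
SUBSTANTIVE = {"favor", "oppose", "neither"}


def item_nonresponse_rate(responses):
    """Return the percentage of responses that are not substantive.

    Anything outside the three substantive options (for example
    "dont_know" or "refused" in face-to-face, "skipped" on the Web)
    is counted as item nonresponse.
    """
    if not responses:
        raise ValueError("responses must not be empty")
    missing = sum(1 for r in responses if r not in SUBSTANTIVE)
    return 100.0 * missing / len(responses)


# Toy data for one question in each mode (illustrative only).
ftf = ["favor", "oppose", "dont_know", "neither", "refused",
       "favor", "oppose", "favor", "oppose", "favor"]
web = ["favor", "skipped", "neither", "neither", "oppose",
       "favor", "favor", "neither", "oppose", "favor"]

print(item_nonresponse_rate(ftf))  # 20.0
print(item_nonresponse_rate(web))  # 10.0
```

In this sketch the two modes differ only in which raw codes count as missing; the rate itself is computed the same way for both.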

Analytical approach
The analyses contain two parts. First, the percentages of item nonresponse are compared between face-to-face and Web surveys using a chi-square test. Second, the response distribution of each question is compared between the two modes using a chi-square test. Both analyses are performed for the whole samples and for males and females separately. Considering the quasi-experimental nature of the survey data, I analyzed the data using a propensity score weighting technique. Specifically, I first fitted a propensity model to predict participation in the face-to-face versus the Web survey, using variables that were potentially correlated with the response propensity for both data collection modes. The variables used in the propensity model included the respondent's gender, age, marital status, education level, employment status, social class, race and ethnicity, number of children, home internet access, household income, home ownership, and years lived at the current address. Taking the predicted probability of participating in the face-to-face versus the Web survey, I created a propensity weight for each respondent. Last, I calculated the weighted distributions for both the substantive responses and item nonresponse, and performed the weighted statistical tests accordingly. All the results shown in this paper were adjusted with the propensity weight. All analyses were conducted in R.
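The text does not state how the propensity weight was derived from the predicted probability. As a minimal sketch of one common construction, the Python example below assumes inverse-probability weighting: a face-to-face respondent with predicted probability p of being in the face-to-face sample receives weight 1/p, and a Web respondent receives weight 1/(1 - p). The function names and the toy probabilities are hypothetical, and the study's own analyses were run in R.

```python
def propensity_weights(p_ftf, is_ftf):
    """Inverse-probability weights from predicted P(face-to-face).

    Assumption: weight = 1/p for face-to-face respondents and
    1/(1 - p) for Web respondents. The paper does not specify
    its exact weighting formula.
    """
    weights = []
    for p, ftf in zip(p_ftf, is_ftf):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity must lie strictly between 0 and 1")
        weights.append(1.0 / p if ftf else 1.0 / (1.0 - p))
    return weights


def weighted_distribution(answers, weights):
    """Weighted share of each answer category; shares sum to 1."""
    totals = {}
    for answer, w in zip(answers, weights):
        totals[answer] = totals.get(answer, 0.0) + w
    grand_total = sum(totals.values())
    return {answer: t / grand_total for answer, t in totals.items()}


# Toy predicted probabilities and mode indicators (illustrative only).
p_ftf = [0.5, 0.25, 0.75, 0.5]
is_ftf = [True, True, False, False]
answers = ["favor", "oppose", "favor", "neither"]

w = propensity_weights(p_ftf, is_ftf)
print(w)  # [2.0, 4.0, 4.0, 2.0]

# Weighted response distribution among the face-to-face respondents only.
ftf_dist = weighted_distribution(answers[:2], w[:2])
print(ftf_dist)
```

Under this assumption, respondents who are unlikely to appear in their own mode's sample are weighted up, which is what allows the two weighted samples to resemble each other on the covariates in the propensity model.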

Demographic distributions
The unweighted demographic distributions differ significantly between face-to-face and Web (Table 1). Once weights are applied, the demographic distributions no longer differ significantly between modes. For both surveys, the weighted analyses show that over 70% of the respondents are non-Hispanic white, over 53% are married, and about half have a household income below $50,000.

Item nonresponse
As a first step in assessing data quality for the two modes, I calculated the percentage of item nonresponse for each of the eight questions and compared the percentages across the two modes. Across both modes and all questions, the item nonresponse rate is between 8.2% and 10.6%. For six of the eight questions, the Web survey has slightly more missing data than the face-to-face survey, although none of the differences is statistically significant. Similarly, when comparing the item nonresponse rates between the two data collection modes for males and females separately, there is no statistically significant difference for any of the questions. This suggests a lack of mode effect on item nonresponse for abortion-related questions. Respondents are neither more nor less likely to provide a non-substantive response, including "don't know" and "refuse," to these eight questions in either mode.

Attitudes toward abortion
I next present the results of the mode effect on substantive responses for the eight questions in Table 3. For each abortion question, I compared the percentages of favor, oppose, and neither favor nor oppose under each mode. The distribution comparisons of the substantive responses between the two modes reveal a fairly consistent pattern. In particular, the chi-square tests for these eight questions across the two modes indicate that significant mode effects exist on all questions at the p < .0001 level.
For the "oppose" response option, seven of the eight questions show a higher percentage for this negative category under face-to-face interviews than under Web interviews, whereas the remaining question (rape) shows no difference between face-to-face and Web surveys. A close look at the distribution comparison shows that opposition is high and quite disparate for the child gender wrong (face-to-face 86% oppose vs. Web 74% oppose), financial hardship (face-to-face 61% oppose vs. Web 50% oppose), and nonfatal health risk (face-to-face 40% oppose vs. Web 26% oppose) scenarios. The percentages opposing are middling for the woman's choice (face-to-face 42% oppose vs. Web 37% oppose), incest (face-to-face 32% oppose vs. Web 23% oppose), and birth defect (face-to-face 29% oppose vs. Web 22% oppose) scenarios, and the differences between face-to-face and Web surveys are smaller than for the first three questions. The fatal health risk and rape scenarios both receive low opposition, and the differences between the two survey modes are small.
For the "neither favor nor oppose" response, which is also regarded as the neutral option, all the questions show a higher percentage for Web than for face-to-face surveys. The questions about the birth defect (Web 29% vs. face-to-face 15%), incest (Web 28% vs. face-to-face 14%), nonfatal health risk (Web 28% vs. face-to-face 17%), and financial hardship (Web 26% vs. face-to-face 12%) scenarios received substantially more "neither favor nor oppose" answers from Web respondents than from face-to-face respondents. The higher percentages at the two end points of the rating scale in the face-to-face survey indicate that face-to-face respondents tend to express more divided attitudes toward abortion than Web respondents. Web respondents, in contrast, provided answers that are more neutral or ambiguous, suggesting that Web surveys tend to elicit less clear-cut opinions toward abortion than face-to-face surveys do.
I also conducted the analyses separately by the respondent's gender (Tables 2 and 3). For the item nonresponse analysis, as in the whole-sample analysis, the differences between the face-to-face and Web surveys for male and for female respondents are small and not statistically significant. For substantive responses, the patterns of response differences between face-to-face and Web for both genders are similar to those for the combined sample. All differences are statistically significant. Both male and female face-to-face respondents are more in favor of abortion in six of the eight scenarios (the nonfatal health risk and child gender wrong scenarios show little to no difference between modes), while more Web respondents select the middle option. The patterns in the level of favoring or opposing across all the items for both genders are also similar to those for the combined sample.
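The comparison of response distributions across modes can be sketched as a Pearson chi-square test on a 2 x 3 table (mode by favor, oppose, neither). The Python example below is a minimal, unweighted illustration: the counts are hypothetical, and it does not reproduce the propensity-weighted tests used in the study, for which a design-adjusted statistic would typically be preferred. With two degrees of freedom, the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def chi_square_2x3(table):
    """Pearson chi-square test of independence for a 2x3 table.

    Returns (statistic, p_value). With df = (2 - 1) * (3 - 1) = 2,
    the p-value is exactly exp(-statistic / 2).
    """
    if len(table) != 2 or any(len(row) != 3 for row in table):
        raise ValueError("expected a table with 2 rows and 3 columns")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic, math.exp(-statistic / 2.0)


# Hypothetical counts: rows are modes, columns are favor / oppose / neither.
counts = [
    [50, 40, 10],  # face-to-face
    [40, 30, 30],  # Web
]
stat, p_value = chi_square_2x3(counts)
print(round(stat, 4), p_value < 0.01)
```

A table in which both modes have the same response proportions yields a statistic of zero and a p-value of one, which is a quick sanity check on the implementation.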

Discussion
This study set out to examine differences in abortion attitudes between face-to-face and Web surveys using two independent national samples. The questions under study are sensitive, which can lead to social desirability bias. When facing abortion-related attitudinal questions, if the respondent feels that his or her real underlying attitude is not in accordance with the social norm, one of the choices is to withhold his or her opinion, which results in item nonresponse.
Considering the relatively higher level of anonymity and confidentiality in a self-administered Web survey, I expected lower item nonresponse in the Web survey than in the face-to-face survey. In other words, I expected the Web survey respondents to be more forthcoming when responding to sensitive questions. However, the results showed that the item nonresponse rates in the two modes were similar and that the differences were not statistically significant. This suggests that respondents in the face-to-face survey were no more likely than Web survey respondents to withhold their opinions when faced with abortion-related questions.
The responses to the eight abortion questions in this survey show significant mode effects. Web respondents are more likely than face-to-face respondents to choose the neither favor nor oppose option for all of the questions. Face-to-face respondents, on the other hand, are more likely than Web respondents to choose favor or oppose. This is in line with a previous study, which reports that face-to-face respondents select more extreme answers from ordinal rating scales than Web respondents (Goldenbeld & de Craen, 2013). These findings suggest that mode effects exist in people's responses to abortion-related attitudinal questions, and that the estimates of abortion attitudes drawn from face-to-face interviews and Web interviews are not entirely comparable.
One possible explanation for the mode effect is the respondent's motivation. The presence of an interviewer in a face-to-face survey is likely to enhance the motivation of the respondents; consequently, respondents are likely to take the survey more seriously and hence give more informative answers (such as whether they favor or oppose abortion) than Web respondents (Christensen et al., 2014; Heerwegh, 2009). The middle option is more difficult to interpret than the other two options, which clearly show where the respondents stand on the topic. More Web respondents endorse the middle option, possibly due to the lack of motivation in the self-administered interview. Social desirability bias may also contribute to the response difference. However, both directions of response could be seen as socially desirable, depending on what the respondents think the interviewer or society in general perceives as the norm. Another possibility is the narrow range of response options, which may have forced respondents who mildly favor or mildly oppose the statement to choose the middle option, since otherwise they would risk seeming hard-line.
The specific question topic may also contribute to the response differences between face-to-face and Web respondents. For the scenarios where abortion is more acceptable, such as fatal health risk, the oppose rate is low overall and little difference exists between the surveys. For the less socially acceptable scenarios, such as incorrect fetus gender, most respondents disapprove of abortion and relatively large differences exist between the surveys. One may also argue that Web panel respondents have more survey experience than cross-sectional face-to-face respondents and that this may contribute to the difference. However, the literature shows that survey experience has little impact on survey responses (Toepoel, Das, & Soest, 2008).
Although the results show significant differences for many items, the absolute differences are not large for many of them. In addition, when ranking the scenarios from the most favored to the least favored, the patterns for the face-to-face and Web surveys are almost identical. In many situations, such as policy making, the general degree of favoring or opposing, rather than the exact number, is of the most interest. In that case, there is very little mode difference in the impression gained of public attitudes regarding the comparative acceptability of the various reasons for abortion.
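The similarity of the rank orderings can be checked with a Spearman rank correlation. The Python sketch below is an illustration rather than part of the original analysis. It uses the opposition percentages reported in the results for the six scenarios where both modes' figures are given, as a proxy for the favoring order, since the favor percentages are not reported in the text. The helper functions are hypothetical and assume no tied values.

```python
def ranks(values):
    """Rank values from largest (rank 1) to smallest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation for equal-length sequences without ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Percent opposing, as reported in the results for six scenarios.
scenarios = ["child gender wrong", "financial hardship",
             "nonfatal health risk", "woman's choice",
             "incest", "birth defect"]
ftf_oppose = [86, 61, 40, 42, 32, 29]
web_oppose = [74, 50, 26, 37, 23, 22]

rho = spearman_rho(ftf_oppose, web_oppose)
print(rho)  # 1.0
```

For these six scenarios the two modes order opposition identically, which is consistent with the observation that the relative acceptability of the scenarios is much the same in both surveys.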
Future work is needed to further examine the mode effect on abortion questions using other data sources. When asking abortion-related attitudinal questions, I encourage future researchers to explore other modes. Since both interviewer-induced motivation and privacy are potential factors contributing to measurement bias, one should consider a combination of modes that can maximize the benefits of both. For example, computer-assisted self-interviews could be a worthwhile research effort. Similarly, a leave-behind self-administered questionnaire for sensitive questions after a face-to-face survey is also a potential approach.
One major limitation of the 2012 ANES is the quasi-experimental nature of the survey and the low response rates for both modes of data collection, and for the Web survey in particular. A low response rate is potentially associated with higher nonresponse bias. Selection bias is another possible source of error for the observed mode difference, given the response rate difference between the two surveys (Vannieuwenhuyze & Revilla, 2013). The data at hand do not allow us to tease apart the mode difference from the selection bias. Therefore, the real mode difference may be smaller or even nonexistent. Future surveys should consider a strict randomized experiment for studying face-to-face vs. Web differences, and should use techniques to improve the response rates and make them comparable between the modes under comparison. Another limitation is the limited scale type analyzed in the study. It is entirely possible that the mode effect interacts with rating scale characteristics, and variations in scales are necessary to draw more general conclusions about the mode effect on abortion questions. As previous research shows, the way people respond to answer scales can differ by data collection mode (Liu, Conrad, & Lee, 2016; Weijters, Schillewaert, & Geuens, 2008). Therefore, future research should test a variety of scales on the same topic between modes to see whether the results reported here still hold. Last, the survey is just one of the methodologies for collecting public opinion on abortion. In fact, there is a long-running debate between quantitative and qualitative research methods on issues like the one studied here (for examples, see Jayaratne & Stewart, 1991; Lawson, 1995; Westmarland, 2001). Whether different survey modes will lead to conclusions that are similar to or different from those of qualitative studies is unknown but worth exploring.
Regardless of the limitations, this is the first study to report potential differences in abortion attitudes between national face-to-face and Web surveys. Slightly more Web respondents than face-to-face respondents did not provide an answer to most of the questions, although none of the differences is statistically significant. Also, larger response differences between modes are observed for scenarios that are less socially acceptable, while smaller differences exist for scenarios that are more acceptable. The relative degrees of favoring or opposing across the various scenarios are quite similar between the two survey modes. On the one hand, the differences between these two modes call for caution when directly comparing results collected through the two modes. On the other hand, the mode differences do not pose a serious threat to measuring the general population's opinions toward abortion.

APPENDIX
Question wordings used in the analysis.
Do you favor, oppose, or neither favor nor oppose abortion being legal if staying pregnant would hurt the woman's health but is very unlikely to cause her to die?
Do you favor, oppose, or neither favor nor oppose abortion being legal if staying pregnant could cause the woman to die?
Do you favor, oppose, or neither favor nor oppose abortion being legal if the pregnancy was caused by the woman having sex with a blood relative?
Do you favor, oppose, or neither favor nor oppose abortion being legal if the pregnancy was caused by the woman being raped?
Do you favor, oppose, or neither favor nor oppose abortion being legal if the fetus will be born with a serious birth defect?
Do you favor, oppose, or neither favor nor oppose abortion being legal if having the child would be extremely difficult for the woman financially?
Do you favor, oppose, or neither favor nor oppose abortion being legal if the child will not be the sex the woman wants it to be?
Do you favor, oppose, or neither favor nor oppose abortion being legal if the woman chooses to have one?

Table 2. Item nonresponse to abortion questions by mode of data collection, 2012 American National Election Studies (weighted results)

Table 3. Attitudes toward abortion by mode of data collection, 2012 American National Election Studies (weighted results)

Liu, M. Gender and Women's Studies. 2018, 1(1):2.