© Journal of the American Statistical Association, June 1972, Volume 67, Number 338, Applications Section
Effects of Question Length on Reporting Behavior in the Survey Interview
ANDRÉ LAURENT*
The effects of questions which differ in form and length on completeness and accuracy of information reported in household interviews were explored in a series of four field studies. Question length and structure were varied systematically while keeping constant the demand characteristics. Questions differed also in the recall time offered and the amount of redundancy contained. The analyses were based on tape-recorded interviews containing questions on health events and behaviors. The longer questions elicited more information than short ones. Indications are that the data were also more accurate. Several hypotheses are offered to explain these effects.
1. INTRODUCTION
Among the standard instructions for constructing survey interviews is the admonition that questions should be brief and to the point. Although there is general agreement among survey technicians that short questions are desirable, to our knowledge no systematic studies have been conducted to substantiate this principle. In fact, some empirical studies have shown a balance or matching in the level of verbal output between the interviewer and respondent and suggest that the principle of brevity may be an over-simplification. The purpose in this account is to investigate the effects of question length on response duration and on the quantity and accuracy of information reported in the survey interview. Our investigation will focus primarily on the impact of question length on the report of factual information (acute illnesses, chronic conditions and various health-related behaviors) rather than opinions or attitudes.

In a study by Cannell, Fowler, and Marquis [1] detailed data were recorded by an observer of a large variety of behavior displayed during the survey interview by the interviewer and the respondent. The behavioral data were then related to the number of events reported by the respondent. This analysis showed a clear, positive association between the total behavioral activity of the respondent and the number of items reported. Furthermore, a high correlation was found between the behavior activity level of the interviewer and that of the respondent. Thus, respondents behaved more (and presumably reported more) when interviewer behavior levels were high, and they showed less behavior (and reported fewer items) when interviewer behavior levels were low. It was not possible to say who set the behavior levels, but it did seem that the amount of verbal behavior of one participant was a direct function of the behavior level of the other person.

These findings led to speculation that the interviewer and respondent each sought for cues from the other about the degree of effort to put into their respective roles. If this cue search process does in fact account for the modeling of the respondent behavior, then inferences can be drawn about question length. Logically, short questions should elicit short answers and longer questions should yield longer responses.

This inference is supported by a series of studies on interview speech behavior conducted by Matarazzo and his colleagues [4]. Briefly stated, these studies (of employment interviews) found that an increase in interviewer speech duration resulted in a significant increase in respondent speech duration. For instance, in a 45-minute interview divided into three 15-minute periods where the interviewers' utterances averaged 5.0, 15.2, and 5.5 seconds, the respondents' utterances averaged 30.9, 64.5, and 31.9 seconds, respectively. In other experiments of this series, the researchers varied the schedule of the interview sequence of utterances, both in range and direction. They also controlled for number and type of questions, for topics discussed, and for interviewer differences. It was found consistently that whenever the interviewer was doing more talking, so was the respondent.¹

Ways of inducing greater respondent verbalization are of particular interest since this might help to improve reporting accuracy in survey interviews. If longer questions do elicit longer responses, the probability that these longer responses will contain more information appears as an interesting hypothesis.

* André Laurent is assistant professor, Department of Organizational Behavior, INSEAD (European Institute of Business Administration), Fontainebleau, France. At the time of the study reported here, the author was Study Director, Survey Research Center, Institute for Social Research, University of Michigan. The main study from which these data came was done under contract with the National Center for Health Statistics, U.S. Public Health Service, Ph-48-68-209. Further analysis of these data was supported through Public Health Service Grant HS00252 from the National Center for Health Research and Development, Charles F. Cannell, Principal Investigator. At the time of this study the research team was composed of Cannell, Laurent and Marquis.

¹ Replications of this finding have been obtained under other interviewing situations such as the astronaut-ground communicator conversations [5], the Kennedy news conferences [7], and the experiment by Heller and his colleagues on interviewer style [6].
2. POTENTIAL EFFECTS OF QUESTION LENGTH ON REPORTING BEHAVIOR
First, we need to ascertain if one does in fact get a matching of speech duration in survey interviews. If so, is the increase in respondent speech duration a direct result of increases in interviewer speech duration or the result of greater information demands made upon the respondent by more elaborate questions? If A is asked the question, "Tell me a little bit about your job," and B is asked the long question, "Tell me everything you may think of about your job; I am interested in as many details as you can provide to describe it," one expects B to talk longer than A simply because of differences in the demand characteristics of the question. To find out whether changes in respondent speech duration are a function of changes in interviewer speech duration and independent of changes in the information demands of the question, one needs to modify the length of the question while holding constant the amount of information it asks for. Such a design was used throughout the series of experiments described here.

A second objective is to find out whether increases in question length, without changes in the information demanded, have any effect on the answer content, independent of variations in respondent speech duration. A mere increase in question length might result in an increase in amount of reported information, with or without a correspondent increase in respondent speech duration. This hypothesis is based upon the cue-search model of the interview just described, in which it is assumed that the respondent looks at the interviewer as a source of cues. A longer question might provide the respondent with cognitive and motivational cues conveying the idea that a fuller report is desired. The final reporting performance might change, although this change might not necessarily be reflected in the physical length of the answer.

Third, it is important to ascertain what effect question length has upon the validity of the reported information. Validity might be improved by the use of longer questions, since the interviewer cues might transmit a request for completeness and accuracy of report.

Finally, since a question can be lengthened in many ways, it is worth exploring how reporting behavior may be differently affected by at least a few of the ways through which questions can be lengthened.

3. THE EXPERIMENTS
A series of four field interviewing experiments was designed and conducted in sequence, each experiment suggesting the next design and the new hypotheses to be tested. All questionnaires investigated health behavior. The objective of the first experiment was to generate at least preliminary answers to the following questions: what happens to the respondent when the survey interviewer starts asking lengthy questions? Does he tend to match the interviewer's volubility by giving lengthy responses? How much information does he report? The second experiment focused on validity of report: in the case of lengthy questions, how accurate is the reported information? Experiment 3 was an attempt to identify the operative factors, if any, of a longer question by manipulating question design or structure along with question length. Finally, a fourth experiment was aimed at stabilizing, replicating and refining findings from the preceding studies.

Experiment 1: Effects of Question Length on Answer Duration and Reporting Frequency
A pilot field experiment was designed to test the effects of interviewer speech duration upon respondent speech duration and reporting frequency. In order to increase control over the content and non-content aspects of interviewer verbal behavior, variation in speech duration was created by the use of questionnaires with short and long questions. Furthermore, the lengthening of questions was done so as to keep the information demand constant in short and long questions. This was to avoid obtaining longer answers to longer questions simply because these questions explicitly asked for more information.

An interview containing 28 questions was created which asked for a report of various health events (acute illnesses, injuries or accidents, chronic conditions, etc.) and health-related activities (medicines taken, doctor visits, etc.) that occurred during various periods of time (last two weeks, last four weeks, last six months, etc.). Various types of open and closed questions were used. Then, each of these questions was written in a long form according to the following procedures. A long question was composed of three sentences:

1. An introductory statement describing partially the topic of the question, using the same terms as used in the short question, but with a different grammatical structure;
2. An intermediary statement conveying some more information already contained in the short question, but not yet presented in the introductory statement, and usually introduced by a cliché; or a filler, introducing some extraneous information of obvious and inconsequential nature about the survey, unlikely to affect the meaning of the question;
3. The question itself in its short form.
The following example illustrates the question-writing procedure:

Q17-Short Form: Have you ever had any trouble hearing? (7 words)

Q17-Long Form: Trouble hearing is the last item of this list. We are looking for some information about it. Have you ever had any trouble hearing? (24 words)
Thus, length was added to questions by introducing redundancy, clichés, and extraneous information. These were assumed not to alter the objective or meaning of the question. The short form of the question was always just the last sentence of the long question (as shown in Q17). On the average, short questions contained 14 words, and long questions 38 words. This contrast in question length was assumed to be large enough to ensure a substantial variation in interviewer speech duration, in spite of expected individual differences in speed of reading.
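The word counts quoted for the Q17 example can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `word_count` is ours, and counting whitespace-separated tokens is an assumption about how the words were tallied, not a documented part of the study.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words (assumed counting rule)."""
    return len(text.split())


# Short form of Q17, as printed in the example.
Q17_SHORT = "Have you ever had any trouble hearing?"

# Long form: introductory statement + filler + the short form itself.
Q17_LONG = (
    "Trouble hearing is the last item of this list. "
    "We are looking for some information about it. "
    + Q17_SHORT
)

short_n = word_count(Q17_SHORT)
long_n = word_count(Q17_LONG)
print(short_n, long_n)  # 7 24
```

For this single item the long form is roughly three and a half times the short form, a somewhat larger contrast than the reported averages of 14 and 38 words.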
Three questionnaires (X, Y, and Z) were designed. The 28 questions appeared in the same order in all three questionnaires. Questionnaire Z (Control) consisted of short-form questions only. Questionnaires X and Y consisted of blocks of long and short-form questions, alternated so that each block of questions asked in the long form on Questionnaire X was asked in the short form on Questionnaire Y and vice versa.

Two female interviewers were employed in the study and were instructed to read the questions exactly as worded, not to engage in any unnecessary speech, and not to probe. Interviews were taken using a random sample of dwelling units in two moderate-income census tracts of a medium size midwestern city. Questionnaires X, Y and Z were randomly assigned to the sample addresses and to the two interviewers. Eligible respondents were white, married females. A total of 27 interviews were taken (nine of each form) and tape-recorded with cassette-type machines.

Two dependent variables, assumed to be affected by the length of the question asked, were measured:

1. The duration of respondent answers to each question: this measure was defined as the number of seconds from the end of the question to the end of the response, minus any irrelevant interruption or additional interviewer intervention.²
2. The percentage of questions which elicited one or more items of the requested health information.
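The two dependent variables can be expressed as small functions. The sketch below is an assumption-laden illustration, not the coding procedure actually used: the function names, the representation of interruptions as (start, stop) pairs in seconds, and the sample values are all hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def answer_duration(
    question_end: float,
    response_end: float,
    excluded: Iterable[Tuple[float, float]] = (),
) -> float:
    """Seconds from end of question to end of response, minus excluded spans.

    `excluded` holds (start, stop) times of irrelevant interruptions or
    extra interviewer interventions; they are assumed to fall inside the
    answer window and not to overlap one another.
    """
    total = response_end - question_end
    for start, stop in excluded:
        total -= stop - start
    return total


def percent_eliciting(items_per_question: Sequence[int]) -> float:
    """Percentage of questions that elicited one or more reported items."""
    hits = sum(1 for n in items_per_question if n >= 1)
    return 100.0 * hits / len(items_per_question)


# Hypothetical timings: question ends at 10.0 s, answer ends at 18.0 s,
# with a 1.5 s interruption in between.
print(answer_duration(10.0, 18.0, [(12.0, 13.5)]))  # 6.5
# Hypothetical item counts for four questions.
print(percent_eliciting([0, 2, 1, 0]))  # 50.0
```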
Looking at the average time the respondent took to answer a question as a function of the question length, two findings emerged. First, it was clear that when interviews with both short and long questions (X and Y) were used, the longer questions did not elicit answers any longer than short questions did. Answers to long and short questions averaged 5.6 and 5.7 seconds respectively. Second, interviews with short questions only (Z) did not elicit substantially shorter answers than did long questions in interviews X and Y. The average answer length in interview Z was 5.3 seconds against 5.6 for the same questions asked in a long form in interviews X and Y. The difference is not statistically significant³ and may be considered inconsequential.

These results are far from the approximately 100 percent increase in answer duration repeatedly obtained in other studies where comparable lengthening of interviewer speech duration had been used. In this limited exploration, the matching effect between interviewer and respondent speech duration did not appear. It was further verified that the matching effect was absent in both open and closed questions.

Furthermore, when interviews contained an equal mix of short and long questions (X and Y), the length of these questions did not affect the frequency of health information report. About 38 percent of the short-form questions elicited health information against 40 percent of the equivalent questions written in a long form. The difference is inconsequential. However, when the entire interview was composed of short-form questions (Z), only 29 percent of the questions elicited health report. This proportion differs significantly⁴ from those obtained with either long or short questions in the two other interviews (X and Y).

These results mean that lengthening of half of the questions in interviews X and Y led to a significant increase in the frequency of response to the health information items regardless of whether these health questions were lengthened themselves. Thus the effect of longer questions in eliciting more report of relevant information carries over to short questions when both are used in the same interview. The effect was present in answers to both closed and open questions.

In summary, the results of this pilot experiment suggest the following conclusion: when information demand is held constant, longer questions do not produce noticeably longer responses, but do obtain a greater number of relevant items of information. Under these circumstances, increases in interviewer speech duration affect the content of respondent speech without affecting its duration.

Experiment 2: Effects of Question Length on Validity of Report
While the findings in Experiment 1 imply that increases in question length result in greater reporting frequency, no data are available on the validity of the extra information obtained by the use of long questions. It may be that the increased reports are largely false positives or overreports of health conditions. However, we have no particular reason to predict that overreporting would increase as a function of question length. On the contrary, we can hypothesize (drawing again from the cue-search model of the interview) that respondents may interpret longer questions as calling for more completeness and accuracy of report. This would decrease underreporting, and possibly overreporting, thus improving the overall validity of the data.

Experiment 2 was designed to ascertain if longer questions affect the validity of reporting of health information.⁵ A sample of respondents was drawn from a population of patients who had visited a physician in a pre-paid clinic during a six-month period prior to the survey. For each visit, the physician was asked to fill out a checklist form of 13 chronic conditions, indicating whether the patient had or did not have each listed condition, or whether no sufficient information was available. Information about the patient was obtained by the physician from the patient's record and from his own knowledge of the patient's health. A weighted sample of patients was used in which 88 percent had at least one chronic condition and 12 percent had none of the listed conditions. Respondents were female, white, 18 to 60 years old, and living in the greater Detroit Metropolitan area.
² Answer duration was timed from the tapes by a single coder.
³ p > .05 (based on one-tailed t assuming simple random sample).
⁴ p ≤ .05 (one-tailed, based on Z).
⁵ For a full report of this experiment, which also included an investigation of the effects of verbal reinforcement and reinterview, see Cannell et al. [2].
A questionnaire was prepared using standard short-form questions which asked about various health conditions and behaviors. In the middle of the questionnaire, checklist-type questions were introduced which asked about the presence of the 13 chronic conditions listed on the physician summary form. A long-question version of the questionnaire was prepared by adding extra words, extraneous phrases, etc., to the short questions.

However, whereas in the pilot experiment the total questionnaire length was rather short and interviews lasted only 10 to 15 minutes, in this study the questionnaires were much longer and interviews lasted 30 to 45 minutes, more like most survey research interviews.

Ten female interviewers were employed in the study. Question-length interview treatments (long and short) were assigned at random within geographic clusters of respondents; each interviewer administered both types of interviews. 106 persons were interviewed with the short procedure and 96 with the long one.

Since errors could exist in the physician's report as well as in the respondent's report, a high degree of agreement between the two sources was not expected. However, it was assumed that higher validity of the respondent report would be reflected by a greater agreement between the two sources.

Probability of agreement between physician and respondent on the presence or absence of the listed 13 chronic conditions was computed on the basis of match and mismatch in "yes" and "no" provided by the two information sources. Excluding cases where the physician or respondent declared he did not have enough information to answer "yes" or "no" on the presence of a condition, the following four possibilities of match or mismatch existed for each chronic condition:

                  Physician
Respondent      YES      NO
YES              A        C
NO               B        D

The following data were tabulated for both interview forms:

1. The probability that chronic conditions checked as present by the physician were reported as present by the respondent: A/(A+B).
2. The probability that chronic conditions reported as present by the respondent were checked as present by the physician: A/(A+C).
3. Finally, the overall probability of agreement between physician and respondent for chronic conditions which have been mentioned as present by either one of them: A/(A+B+C).

Table 1. AGREEMENT BETWEEN PHYSICIAN AND RESPONDENT ON CHRONIC CONDITIONS BY AGREEMENT RATES AND LENGTH OF QUESTIONSᵃ (Experiment 2)

                          Questionnaire procedure                    Percentage increase
Type of agreement rate    Short question   Long question  Difference due to question length
A/(A+B)                   .537             .622           .085ᵇ      +17
A/(A+C)                   .477             .516           .039       +8
A/(A+B+C)                 .338             .392           .054ᶜ      +16

A/(A+B): Probability for chronic conditions checked as present by the physician to be reported as present by the respondent.
A/(A+C): Probability for chronic conditions reported as present by the respondent to have been checked as present by the physician.
A/(A+B+C): Overall probability of agreement between physician and respondent for chronic conditions mentioned as present by either one of them.
ᵃ Number of persons interviewed were 106 in short-question procedure and 96 in long-question procedure.
ᵇ p ≤ .05, one-tailed; based on Z.
ᶜ p ≤ .10, one-tailed; based on Z.

Table 1 shows that whether one starts from Row 1 (physician data), from Row 2 (respondent data), or from Row 3 (both), the probability of agreement on the existence of a chronic condition is consistently higher with long-question interviews than it is with short-question interviews. The increase in overall probability of agreement obtained with long-question interviews amounts to 16 percent.

Comparable agreement rates were computed on the absence of chronic conditions.⁶ While these were not noticeably enhanced by the use of long questions, at least question length did not produce any detrimental effect on this type of agreement.

Since reinterviewing was another variable to be investigated in the larger project, 50 percent of the original respondents were reinterviewed two weeks later. Respondents were reinterviewed in the same format (long or short question) as they were the first time. The reinterview data confirmed the findings from the original interviews. All rates of agreement between physician and respondent reports were higher in the long-question group than in the short-question group, and the improvement reached a 5 percent level of significance on several rates.

While Experiment 1 shows that more information was elicited by longer questions, Experiment 2 demonstrates that longer questions also elicit information of higher validity.
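The three agreement rates used in Experiment 2 are simple ratios of cell counts, where A counts conditions marked present by both physician and respondent, B those marked present by the physician only, and C those marked present by the respondent only. The sketch below illustrates the arithmetic; the function names and the cell counts passed to `agreement_rates` are hypothetical, since the raw counts are not given in the text. Only the rates .338 and .392 come from the reported results.

```python
def agreement_rates(a: int, b: int, c: int) -> dict:
    """Return the three agreement rates on the presence of a condition.

    a: present according to both physician and respondent
    b: present according to the physician only
    c: present according to the respondent only
    """
    return {
        "A/(A+B)": a / (a + b),        # physician-based rate
        "A/(A+C)": a / (a + c),        # respondent-based rate
        "A/(A+B+C)": a / (a + b + c),  # overall agreement rate
    }


def percent_increase(short_rate: float, long_rate: float) -> float:
    """Relative increase of the long-question rate over the short-question rate."""
    return 100.0 * (long_rate - short_rate) / short_rate


# Hypothetical counts, for illustration only.
rates = agreement_rates(a=50, b=30, c=40)
print(rates["A/(A+B)"])  # 0.625

# Overall agreement rates reported for short and long questions.
print(round(percent_increase(0.338, 0.392)))  # 16
```

Expressing the gain relative to the short-question rate is what turns a difference of .054 into the 16 percent increase quoted for overall agreement.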
Experiment 3: Effects of Various Question Lengths and Structures on Reporting Frequency
The next study was an attempt to identify the source of the effects of question length and to replicate some of the earlier findings. Three alternate hypotheses were investigated:

1. Question length per se improves reporting through cognitive or motivational cue-giving.
2. The influential variable is not length per se but the additional time it provides for search activity, thus improving recall.
3. Redundancy in question wording is the determining factor, acting either by improving clarity, providing increased exposure time to the stimulus or bringing a repeated-trials effect.

⁶ Agreement rates on absence of chronic conditions were the following:
In the previous experiments, all three dimensions (question length, recall time and redundancy) had been varied simultaneously so that their specific effect could not be isolated. The design of Experiment 3 was aimed at testing the effects of variations in total question length and recall time, while partially controlling for redundancy. For this purpose, experimental questions were designed using introductory and redundant statements in various arrangements. In any given question, each of these statements was kept roughly equal in length.

Q = Question in its short standard form.
F = Filler statement introducing extraneous information of inconsequential character and unrelated to the specific question demand.
q = Introductory statement describing the topic of the question in a manner sufficient to stimulate the starting of relevant recall activity.
Four experimental questionnaires were constructed, each using one of the following question structures as illustrated here:

Questionnaire A. This served as a control treatment; question length and recall time are low, and no redundancy is introduced.
Q: "What are the things you do to protect your health?"

Questionnaire B. The question length is doubled, but the recall time is unchanged and low since the filler statement (F) allows no start on specific recall activity.
F + Q: "In addition we are gathering a few data on the coming subject. What are the things you do to protect your health?"

Questionnaire C. Here the recall time is increased, since the introductory clause (q) provides a stimulus for relevant recall activity by stating the essence of the question objective; on the other hand, the question length is kept equal to the B type.
q + Q: "Our next question asks those things people do to protect their health. What are the things you do to protect your health?"

Questionnaire D. This increases further both the recall time and the question length by introducing a filler (F) between the introductory statement (q) and the question (Q).
q + F + Q: "Our next question asks those things people do to protect their health. This is an additional subject we are gathering a few data on. What are the things you do to protect your health?"
Redundancy occurs whenever a question uses both statements q and Q, so that this variable was controlled within groups of treatments A and B (where there was no redundancy) and C and D, where redundancy was introduced. The design would have been more powerful with the use of a fifth treatment (F+q+Q), which was contemplated but not used in this study. Instead, a Questionnaire E was designed in which blocks of q+Q and Q question types were alternated in order to explore further the length carry-over effect found in Experiment 1. To study the carry-over effect five identical short questions were also included near the end of each questionnaire.

The questionnaires contained 26 questions which concerned health topics similar to those in Experiment 1. The questionnaires included a balance of open and closed, forced choice questions. A standard probe ("anything else?") was used after responses to the open questions in all treatments.

Four female interviewers were employed in the study. The study was designed to ensure that each interviewer took an equal number of interviews and an equal number of each form. Respondents were female, white, 18 to 65 years old, and were randomly assigned to the experimental treatments. Dwelling units were randomly selected among 10 average-income blocks of the greater Detroit Metropolitan area based on 1960 Census data. A total of 60 interviews (12 in each experimental group) were taken and tape-recorded.

The dependent variable was the number of discrete items of information reported per questionnaire relevant to each question's objectives. A special coding system was developed and specific rules spelled out to ensure the reliability of this content measure. Coding was done directly from the tapes. Check coding of 25 percent of the sample showed a percentage agreement of over 90 percent.

The mean number of items reported per person by questionnaire procedure is shown in Table 2. Also indicated are the average number of words per question,⁷ the type of question structure and the number of statements per question for each treatment.

Table 2. MEAN NUMBER OF ITEMS REPORTED PER PERSON BY QUESTIONNAIRE PROCEDURE (Experiment 3)

                                          Questionnaire procedure
Number of items reported                  A        D        B        E        C
Mean number of items reported             34.21    42.00    36.58    37.04    35.96
Mean difference from A                             7.79ᵃ    2.37     2.83     1.75
Mean percent increase over A                       23       7        8        5
(Average number of words per question)
(Type of question structure)              Q        qFQ      FQ       qQ & Q   qQ
(Number of statements per question)       (1)      (3)      (2)      (1 & 2)  (2)
(Number of persons)                       (12)     (11)     (12)     (12)     (12)

ᵃ D > A, p ≤ .05 (based on one-tailed t assuming simple random sample).

The results did not provide any great insight into the effects of the experimental variables. They suggest that a moderate increase in question length (two-statement structure), regardless of the lengthening strategy used, does not have any substantial effect upon reporting frequency. While increase in the number of reported items does occur in experimental treatments B, C and E, the observed differences with Treatment A are not significant. Only Treatment D resulted in a significant increase in the number of items reported. Respondents reported on the average 23 percent more items in D than in A.

Treatment D (q+F+Q) used significantly longer questions than any of the other treatments, since each question in D contained three statements, while B and C had two statements each, and A only one. Since various combinations of two-statement questions have been tried in B, C and E, it is hypothesized that the three-statement strategy used in Questionnaire D represents a critical lengthening of the questions necessary to obtain the predicted effect. A similar lengthening ratio had been used in Experiment 1 where similar results had been obtained.

If the differences between the slight increases in reporting obtained in treatments B, C and E can be interpreted in spite of the small sample size, it seems that neither redundancy nor recall time as such are the operating variables. Treatment C (q+Q), which incorporated both of these dimensions, obtained the smallest increase in report. The present data suggest that the cuing properties of a significantly longer question represent the most promising explanation. We foresee that an experimental design which will vary the order of statements q, F, and Q (while keeping constant the three-statement length) will be able to better identify the structural variables, if any, that bring about improved reporting.

⁷ The average number of words per question is computed on the basis of all 26 questions; that is, it includes the five questions kept short and identical in each treatment, which tends to slightly lower the actual contrast in length between procedures.
Experiment 4: Replication of the Effects of Question Length on Reporting Frequency and Answer Duration

This final experiment (an extension of Experiment 3) was designed to ascertain further the effects obtained in Treatment D. For this purpose, 12 more interviews were taken in both Group A (short question) and Group D (long question) by two of the former interviewers. All field interviewing rules were the same except that more training was given on question objectives (two questions had raised some interpretation problems on the part of the interviewers in the first wave). This training might account for a slight overall increase in number of items reported under both treatments A and D in the second interviewing wave, as shown in Table 3. The major findings are briefly discussed here. First we verified that the question-length effect of increasing the number of items reported was present through the whole questionnaire and quite evident at each question level from the first to the last question. However, the effect looked slightly stronger in the first half of the questionnaire than in the second half.

We also verified that a significant increase in the amount of reported information had been obtained by both open and closed long questions; the effect was stronger for closed (34 percent increase) than for open questions (18 percent increase). Within these categories of open and closed questions, data were also analyzed by subcategories of question objectives. This showed that the extent of the question length effect was dependent upon the nature of the information being asked. Further work in this direction could lead to principles for a selective lengthening of questions, based upon their type and objective.

The carryover of the question-length effect (which was predicted to occur in the block of five questions kept short in Treatment D) was verified and found to be significant (p ≤ .05) for those questions which were open (21 percent increase). The effect was smaller (9 percent increase) and not significant for closed questions. More work is needed on the carry-over effect since it probably relates to other variables, such as location of the block of short questions within the questionnaire, size and frequency of blocks, etc.

Data were also analyzed as a function of whether information was reported before or after the standard probes were used in the open questions. The effect of question length was found to be independent of the effect of probing. The same 18 percent reporting increase due to question length was found when treating the data before or after probing. The reporting increase due to probing was the same (19 percent) for short and long questions. The question-length effect on reporting had as much influence in these data as classical probing had.

3. MEAN NUMBER OF ITEMS REPORTED PER PERSON BY INTERVIEWING WAVE AND BY QUESTIONNAIRE PROCEDURE (Experiment 4)

                     A (short question)           D (long question)
Interviewing wave    No. of    Mean no. of        No. of    Mean no. of        Difference       pᵃ     Percentage
                     persons   items reported     persons   items reported     between means           increase
Wave 1               12        34.21              11        42.00              7.79             .05    23
Wave 2               11        36.56              12        44.00              7.44             .05    20
Both waves           23        35.33              23        43.04              7.71             .01    22

ᵃ p based on one-tailed t assuming simple random sample.
NOTE: The results from the second interviewing wave clearly replicated the first-wave results (Table 3). Since this study was an extension of Experiment 3 and methods were kept the same, it was possible to combine the data from both studies for comparable groups. This provides better stability in the data.
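The differences and percentage increases in Table 3 can be re-derived from the tabled group sizes and means. The following is a minimal sketch of that arithmetic, not the author's procedure; the variable and function names are illustrative assumptions, and the "Both waves" row is reproduced here as a size-weighted mean of the two waves, which is an assumption about how the pooled figures were obtained.

```python
# Figures transcribed from Table 3: (n_A, mean_A, n_D, mean_D) per wave.
# All names below are illustrative and do not come from the article.
table3 = {
    "Wave 1": (12, 34.21, 11, 42.00),
    "Wave 2": (11, 36.56, 12, 44.00),
}


def difference_and_increase(mean_a, mean_d):
    """Return (D - A, percentage increase of D over A)."""
    diff = mean_d - mean_a
    return diff, 100.0 * diff / mean_a


def pooled_mean(groups):
    """Size-weighted mean of (n, mean) pairs (assumed pooling rule)."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


rows = {}
for wave, (n_a, mean_a, n_d, mean_d) in table3.items():
    rows[wave] = difference_and_increase(mean_a, mean_d)

# Pool the two waves within each questionnaire procedure.
pooled_a = pooled_mean([(v[0], v[1]) for v in table3.values()])
pooled_d = pooled_mean([(v[2], v[3]) for v in table3.values()])
rows["Both waves"] = difference_and_increase(pooled_a, pooled_d)

for label, (diff, pct) in rows.items():
    print(f"{label}: difference = {diff:.2f}, increase = {pct:.0f}%")
```

Under the weighted-mean assumption the pooled means agree with the tabled 35.33 and 43.04 to two decimals. The p values cannot be recomputed this way, since the article does not report the within-group variances.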
Finally, to replicate the findings on answer duration obtained in Experiment 1, all taped interviews from Treatments A and D were coded for speech and silence durations with electronic timers. Efforts were made to obtain as accurate measures as possible; check-timing confirmed high reliability. A coder first recorded the total amount of time the interviewer spoke in the interview (from Q1 to Q26). The same measure was then taken for the respondent. Irrelevant interruptions such as telephone calls were excluded and a total interview time (from Q1 to Q26) was recorded. Subtracting all speech durations
from the total interview time gave an approximate measure of silence time for each interview. All measures were initially taken in tenths of seconds. Table 4 presents the mean duration in seconds per interview of interviewer speech, respondent speech, silences, and total interview.

4. MEAN DURATION (in Seconds) OF INTERVIEWER AND RESPONDENT SPEECH AND SILENCES PER INTERVIEW BY QUESTIONNAIRE PROCEDURE (Experiment 4)

                                             Questionnaire procedure
Speech duration                              A (short     D (long      Difference   pᵃ      Percent
                                             question)    question)    (D - A)              increase
Interviewer speech                           117          279          162          .002    +138
Respondent speech                            213          231          18           NS      +8
Silences                                     175          169          -6           NS      -3
Total interview time                         504
(Total number of words in questions)         (347)
(Question structure)                         Q            qFQ
(Number of persons)                          (20)         (22)

ᵃ p based on one-tailed t assuming simple random sample.
It is interesting to notice that the percent increase in interviewer speech duration created experimentally by the lengthening of the questions in Treatment D is exactly the same (138 percent) as the percent increase in actual number of words contained in the D questions. As the table shows, this increase in interviewer speech duration did not produce the matching increase of the respondent speech duration found in other research. Responses to long questions were only 8 percent longer in time than responses to short questions. No correlation was found between interviewer and respondent speech durations.⁶ Respondents on the average only spoke 18 seconds more per interview, and this difference is not statistically significant. Table 4 also shows that silence duration (time between end of question and beginning of response and time between end of response and beginning of next question) was also nearly identical in the two treatments, and thus was apparently unaffected by the question length. The figures presented in Table 4 enable us to compute that 93 percent of the total increase in D interview time is accounted for by the experimental lengthening of the interviewer speech duration. Even though respondents reported significantly more information in answer to longer questions, they did not consume significantly more time to do it. The effectiveness of their verbal output was apparently enhanced in response to longer questions. These results replicate on a more controlled basis the findings of Experiment 1.

⁶ Correlation figures were .20 in Treatment A and .08 in Treatment B.
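The 93 percent figure quoted above follows directly from the three component durations in Table 4. The sketch below reproduces that arithmetic under the assumption that total interview time is the sum of interviewer speech, respondent speech, and silences; the dictionary layout and names are illustrative, not taken from the article.

```python
# Mean durations in seconds per interview, transcribed from Table 4.
# Names and structure are illustrative assumptions.
durations = {
    "interviewer speech": {"A": 117, "D": 279},
    "respondent speech": {"A": 213, "D": 231},
    "silences": {"A": 175, "D": 169},
}

# Difference (D - A) for each component.
differences = {name: d["D"] - d["A"] for name, d in durations.items()}

# Percent increase of D over A for each component.
percent_increase = {
    name: 100.0 * differences[name] / d["A"] for name, d in durations.items()
}

# Total increase in interview time, assumed to be the sum of the components.
total_increase = sum(differences.values())

# Share of the total increase attributable to interviewer speech.
interviewer_share = 100.0 * differences["interviewer speech"] / total_increase

for name in durations:
    print(f"{name}: {differences[name]:+d} s ({percent_increase[name]:+.0f}%)")
print(f"interviewer share of total increase: {interviewer_share:.0f}%")
```

The components for Treatment A sum to 505 seconds against the tabled total of 504, a gap that is presumably due to rounding of the individual means.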
4. DISCUSSION
On the basis of the findings generated by the four experiments presented here, three major propositions can be made. When the length of questions is at least doubled by introducing the subject of inquiry and adding redundant remarks prior to the actual question:

1. No appreciable increase is obtained in response duration,
2. Yet the response contains more information, and
3. The reported information is more valid.
The first proposition does not conform to findings of other studies on the speech-length matching effect in the interview. A major departure of the present research was to control the demand characteristics of the questions so that the long form of a question would not give or ask any more relevant information than would the short form. It seems likely that this factor is responsible for the observed discrepancy between our findings and those of Matarazzo and others. The absence of the anticipated length-matching supports our interpretation of the Matarazzo findings: variations in respondent speech duration may be the result of uncontrolled changes in information demanded by questions of various lengths and be independent of variations in interviewer speech duration. The present study suggests that longer questions do not elicit longer responses when their information demand is controlled. One could argue that Matarazzo's findings only apply to conversational-type interviews dealing with discussion items as opposed to more structured interviews dealing with factual items reporting. However, Jaffe and Feldstein, using sophisticated methods for the recording and analysis of conversational rhythms, have demonstrated recently that "there is no correlation in unstructured dialogue between the vocalization lengths of the two speakers" [3].

The notion that longer questions might elicit more information and a more accurate report disagrees with common assumptions and current survey methodology. However, the evidence in this paper implies that lengthy and redundant questions, as used in this research, provide larger quantity and higher accuracy of report, even though the time duration of the responses is not appreciably longer. Moreover, the effect of increasing reporting frequency carries over to short questions included in the same interview. The mechanisms by which lengthy questions bring about better validity of report are not clear yet.
The available data do not seem to indicate any major or obvious influence played by either increased time for recall activity or the redundancy aspect of the question, although another experimental design (suggested previously) would be necessary to discard more firmly these possibilities. The present findings show that a substantial lengthening of the question is necessary to obtain a significant improvement in reporting. The most tenable hypothesis for these findings appears to be that long questions act as cues or teaching devices conveying in an effective and integrated way the message that the respondent reporting job
is important, deserves time and consideration, and requires serious efforts and thoroughness. Beyond their impact at a cognitive level, these cues become motivators for higher reporting performance. In addition to teaching and motivating for higher task achievement through modeling with the interviewer's behavior, a longer question may inform the respondent that the interviewer is not in a hurry, thus releasing perception of time constraints which might be detrimental to adequate search activity. Apparently longer questions do not produce more elaboration on one particular recall item but retrieval of more items. Finally, the responding behavior may also gain in effectiveness because some of the initial ruminating-type activity has already taken place during the time the question was being asked. The respondent is given more time for rehearsal activity and response polishing. A longer question may therefore provide cognitive and motivational cues to more adequate performance and, in the meantime, prepare the respondent to engage in a more efficient verbal behavior.

More work is certainly needed to better identify the variables that operate when question structure and length are varied and to explore further the effects of various lengthening strategies. One could also investigate further the carryover effect of long questions: what proportion or schedule of longer questions is necessary in an interview to create and maintain the effects obtained in the experiments? Comparisons could be made between the effectiveness of long questions and that of a large number of short questions aiming at the same information for an equal overall interviewing time. If responding to cues is thought of as a form of acquiescence, then variations in question length might also have more or less effect depending on the respondent's perception of the status difference between himself and the interviewer. Would various categories of respondents (housewives, businessmen, physicians, etc.) react similarly to long questions? How would the effects vary with the type of information being asked?

The present research has been concerned with the impact of question length upon the report of factual-type data. One may wonder what the impact would be on attitudinal data. To the extent that attitudes are based on facts and events, one can assume that the above findings also have implications for interviews investigating attitudes and opinions. Possibly longer questions would invite respondents to retrieve and use more facts as a basis for stating their opinions and feelings. Or, on the other hand, longer questions might give respondents more time to think up defenses, to organize their responses around social desirability or to cover up controversial feelings.

This study brings more evidence of the fact that questions serve much broader purposes in the interview than merely communicating an objective. They may provide cues for adequate role behavior, establish expectations for respondent performance and stimulate higher activity level. This complexity is not well understood, often overlooked and definitely calls for more investigation of the role of questions.⁹
[Received April 1971. Revised December 1971.]

REFERENCES

[1] Cannell, Charles F., Fowler, Floyd J. and Marquis, Kent H., "The Influence of Interviewer and Respondent Psychological and Behavioral Variables on the Reporting in Household Interview," Vital and Health Statistics, Public Health Service, Series 2, No. 26 (March 1968).
[2] ----, Marquis, Kent H. and Laurent, André, "An Experimental Study of the Effects of Reinforcement, Question Length, and Reinterviews on Reporting Selected Chronic Conditions in Household Interviews," Survey Research Center, The University of Michigan, Ann Arbor, Michigan, 1969.
[3] Jaffe, J. and Feldstein, S., Rhythms of Dialogue, New York and London: Academic Press, 1970, 107.
[4] Matarazzo, J.D., Wiens, A.N. and Saslow, G., "Studies of Interview Speech Behavior," in L. Krasner and L.P. Ullmann, eds., Research in Behavior Modification, New York: Holt, Rinehart and Winston, 1965, 179-210.
[5] ----, Wiens, A.N., Saslow, G., Dunham, R.H. and Voas, R.B., "Speech Durations of Astronaut and Ground Communicator," Science, 143 (1964), 148-50.
[6] Heller, K., David, J.D. and Myers, R.A., "The Effects of Interviewer Style in a Standardized Interview," Journal of Consulting Psychology, 30, 6 (1966), 501-08.
[7] Ray, M.L. and Webb, E.J., "Speech Duration Effects in the Kennedy News Conferences," Science, 153 (1966), 899-901.

⁹ Findings obtained since this article was completed suggest that the "long question effect" is a complex one. For example, there is evidence that higher educated groups show greater effects than lower. Further research is under way to study the effect more completely.