Volume 66 / Number 1 / 2019
Experimental Psychology
Editors-in-Chief Andreas Eder Christian Frings Editors Tom Beckers Arndt Bröder Gesine Dreisbach Manuel Perea Jörg Rieskamp James Schmidt Alexander Schütz Samuel Shaki Sarah Teige-Mocigemba Matthias Wieser
Editors
A. Eder, Würzburg, Germany
C. Frings, Trier, Germany
Associate Editors
T. Beckers, Leuven, Belgium A. Bröder, Mannheim, Germany G. Dreisbach, Regensburg, Germany M. Perea, Valencia, Spain J. Rieskamp, Switzerland
J. Schmidt, Dijon, France A. Schütz, Marburg, Germany S. Shaki, Samaria, Israel S. Teige-Mocigemba, Marburg, Germany M. Wieser, Rotterdam, The Netherlands
Editorial Board
U. J. Bayen, Düsseldorf, Germany H. Blank, Portsmouth, UK J. De Houwer, Ghent, Belgium R. Dell’Acqua, Padova, Italy G. O. Einstein, Greenville, SC, USA E. Erdfelder, Mannheim, Germany M. Goldsmith, Haifa, Israel D. Hermans, Leuven, Belgium R. Hertwig, Berlin, Germany J. L. Hicks, Baton Rouge, LA, USA P. Juslin, Uppsala, Sweden Y. Kareev, Jerusalem, Israel D. Kerzel, Geneva, Switzerland A. Kiesel, Freiburg, Germany K. C. Klauer, Freiburg, Germany R. Kliegl, Potsdam, Germany I. Koch, Aachen, Germany J. I. Krueger, Providence, RI, USA S. Lindsay, Victoria, BC, Canada
E. Loftus, Irvine, CA, USA T. Meiser, Mannheim, Germany K. Mitchell, West Chester, PA, USA N. W. Mulligan, Chapel Hill, NC, USA B. Newell, Sydney, Australia K. Oberauer, Zürich, Switzerland F. Parmentier, Palma, Spain M. Regenwetter, Champaign, IL, USA R. Reisenzein, Greifswald, Germany J. N. Rouder, Irvine, CA, USA D. Shanks, London, UK M. Steffens, Landau, Germany S. Tremblay, Quebec, Canada C. Unkelbach, Köln, Germany M. Waldmann, Göttingen, Germany E. Walther, Trier, Germany P. A. White, Cardiff, UK D. Zakay, Tel Aviv, Israel
Publisher
Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail publishing@hogrefe.com, Web http://www.hogrefe.com
Production
Regina Pinks-Freybott, Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail production@hogrefe.com
Subscriptions
Hogrefe Publishing, Herbert-Quandt-Str. 4, D-37081 Göttingen, Germany, Tel. +49 551 99950-956, Fax +49 551 99950-998
Advertising/Inserts
Melanie Beck, Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail marketing@hogrefe.com
ISSN
ISSN-L 1618-3169, ISSN-Print 1618-3169, ISSN-Online 2190-5142
Copyright Information
© 2019 Hogrefe Publishing. The journal as well as the individual contributions to it are protected under international copyright law. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, digital, mechanical, photocopying, microfilming or otherwise, without prior written permission from the publisher. All rights, including translation rights, are reserved.
Publication
Published in six issues per annual volume. Experimental Psychology is the continuation of Zeitschrift für Experimentelle Psychologie (ISSN 0949-3964), the last annual volume of which (Volume 48) was published in 2001.
Subscription Prices
Calendar year subscriptions only. Rates for 2019: Institutions – from US $326.00/€249.00 (print only; pricing for online access can be found in the journals catalog at hgf.io/journals2019); Individuals – US $259.00/€185.00 (print & online). Postage and handling – US $24.00/€18.00. Single copies – US $85.00/€66.50 + postage & handling.
Payment
Payment may be made by check, international money order, or credit card to Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, or, for customers in North America, to Hogrefe Publishing, Inc., Journals Department, 7 Bulfinch Place, 2nd floor, Boston, MA 02114, USA.
Electronic Full Text
The full text of Experimental Psychology is available online at http://econtent.hogrefe.com/loi/zea
Abstracting/Indexing Services
Experimental Psychology is abstracted/indexed in Current Contents/Social & Behavioral Sciences (CC/S&BS), Social Science Citation Index (SSCI), Medline, PsyJOURNALS, PsycINFO, PSYNDEX, ERIH, Scopus, and EMCare. 2017 Impact Factor 1.206, Journal Citation Reports (Clarivate Analytics, 2018)
Contents Research Articles
Short Research Articles
The Role of Talker Familiarity in Auditory Distraction Brittan A. Barker and Emily M. Elliott
1
And Remember the Truth That Once Was Spoken: Knowledge of Having Disclosed Private Information to a Stranger Is Retrieved Automatically Franziska Schreckenbach, Klaus Rothermund, and Nicolas Koranyi
12
The Effect of Outcome Probability on Generalization in Predictive Learning Hadar Ram, Dieter Struyf, Bram Vervliet, Gal Menahem, and Nira Liberman
23
Effect of Foreign Accent on Immediate Serial Recall Kit Ying Chan, Ming Ming Chiu, Brady A. Dailey, and Daroon M. Jalil
40
Task Switching Hurts Memory Encoding Michèle C. Muhmenthaler and Beat Meier
58
On the Lack of Real Consequences in Consumer Choice Research: And Its Consequences Sina A. Klein and Benjamin E. Hilbig
68
The Effect of a Verbal Concurrent Task on Visual Precision in Working Memory Ed D. J. Berry, Richard J. Allen, Amanda H. Waterman, and Robert H. Logie
77
Affective Influence on Context-Specific Proportion Congruent (CSPC) Effect: Neutral or Affective Facial Expressions as Context Stimuli Jinhui Zhang, Andrea Kiesel, and David Dignath
86
Research Article
The Role of Talker Familiarity in Auditory Distraction
Brittan A. Barker¹ and Emily M. Elliott²
¹ Department of Communicative Disorders and Deaf Education, Utah State University, Logan, UT, USA
² Department of Psychology, Louisiana State University, Baton Rouge, LA, USA
Abstract: The current research employed a classic irrelevant sound effect paradigm and investigated the talker-specific content of the irrelevant speech. Specifically, we aimed to determine if the participants’ familiarity with the irrelevant speech’s talker affected the magnitude of the irrelevant sound effect. Experiment 1 was an exploration of talker familiarity established in a natural listening environment (i.e., a university classroom) in which we manipulated the participants’ relationships with the talker. In Experiment 2, we manipulated the participants’ familiarity with the talker via 4 days of controlled exposure to the target talker’s audio recordings. For both Experiments 1 and 2, a robust effect of irrelevant speech was found; however, regardless of the talker manipulation, talker familiarity did not influence the size of the effect. We interpreted the results within the processing view of the auditory distraction effect and highlighted the notion that talker familiarity may be more vulnerable than once thought.
Keywords: irrelevant sound effect, irrelevant speech, attention, working memory, talker familiarity
Views of Auditory Distraction
It is well established that background noise negatively affects people’s capacity to immediately recall visually presented information (Colle & Welsh, 1976), and this auditory distraction phenomenon is often referred to as the irrelevant sound effect (ISE). The ISE is one example of an auditory distraction paradigm that provides researchers with an ecologically valid construct to explore theoretical aspects of cognition intertwined with short-term memory and selective attention, while simultaneously working toward an understanding of its underlying neurological mechanisms (for a review, see Hughes & Jones, 2001). As a result, researchers explore the ISE from a number of perspectives via various manipulations, one of which is to investigate the acoustic and semantic content of the irrelevant speech in an attempt to determine which of its properties affect the magnitude of the ISE (e.g., Röer, Körner, Buchner, & Bell, 2017; Tremblay & Jones, 1999). Despite the fact that the research on this particular auditory distraction paradigm extends over 40 years, researchers do not agree on the precise mechanism(s) of the disruption to serial recall performance. The current research drew upon findings from the field of hearing science to understand the role of the talker-specific content of the irrelevant speech, thus utilizing an interdisciplinary approach to gain a clearer understanding of the cause of this distraction.
While it is well accepted that the degree of acoustic variability of the irrelevant sounds is critical for the disruptive effect on serial recall to occur (i.e., the changing-state effect; Ellermeier & Zimmer, 2014; Jones, Madden, & Miles, 1992), the role of attentional factors has been called into question. For example, an individual’s working memory capacity does not have a relationship with the size of their ISE (see Beaman, 2004; Elliott & Briganti, 2012). This lack of a relationship between the size of the ISE and working memory capacity is in contrast to research employing an auditory deviant paradigm. The serial recall version of the deviant paradigm is similar in design to the ISE; however, research participants are presented with irrelevant sounds that contain an unexpected and deviant item (e.g., a male voice changing to a female voice). The magnitude of the deviant effect in the auditory deviant paradigm was shown to correlate with working memory capacity (e.g., Sörqvist & Rönnberg, 2014; although see Körner, Röer, Buchner, & Bell, 2017 for an opposing view). Hughes (2014) built upon the divergence of these two signature findings, the changing-state effect (Jones et al., 1992) and the auditory deviant effect (Hughes, Vachon, & Jones, 2007), to support the duplex model of auditory distraction. Within this model, it is hypothesized that the degree of disruption in the changing-state effect is caused by the overlap in processing demands resulting from the obligatory order cues in to-be-ignored auditory stimuli, conflicting with the order cues of the to-be-remembered stimuli, especially when serial order recall is required (i.e., interference by process). The changing acoustic features of the to-be-ignored stimuli are referred to as a form of precategorical auditory distraction (Marsh et al., 2018). In contrast, postcategorical auditory distraction results from semantic or phonological features of the to-be-ignored stimuli that may result in attention capture (e.g., Röer et al., 2017).
Experimental Psychology (2019), 66(1), 1–11 https://doi.org/10.1027/1618-3169/a000425
Prior researchers have also examined the role of attentional factors in auditory distraction effects and proposed a functional view of auditory distraction (e.g., Allport, 1993; Hughes & Jones, 2003; Neumann, 1996). Within this view, it is argued that people must maintain a balance of attentional selectivity. That is, they must be able to focus on processing the experimental task in the forefront (e.g., visual digit-span task), while also allowing for some monitoring or awareness of the irrelevant auditory channel’s content. This internal monitoring promotes human survival. The openness to incoming auditory information allows an individual to perceive and process something critical, like the sound of an emergency vehicle’s siren or a building’s fire alarm.
Supporting this functional view, Röer et al. (2017) conducted experiments using an ISE serial recall paradigm with taboo words presented in the irrelevant channel. When a variety of taboo words were presented, as opposed to the repetition of the same taboo word or neutral words, the participants’ serial recall was significantly more disrupted by the changing taboo words than changing neutral words. The finding – changing taboo words yield the greatest disruption – is a critical piece of evidence that people process the semantic content of the irrelevant sounds. Because the participants in their study were instructed to ignore the auditory information, without a functional system of selectivity, the effect of the changing taboo words should have been no different than the changing neutral words.
Röer et al. (2017; Experiment 2) further suggested that the disruptive effect of taboo words was not related to individual levels of working memory capacity. The authors’ overall interpretation of the findings reflected upon the flexibility of the cognitive system, as opposed to viewing the disruption from the changing taboo words as a failure of the system. While researchers have shown a number of sounds cause disruption to focal task performance, such as tones or alarms, the authors discussed the potential importance of speech as a distracting stimulus. As stated by Röer et al., “it seems plausible that speech sounds (and sounds that resemble speech in their physical properties) play a special role” (p. 748). Examining the components of the speech sounds in the irrelevant channel may provide new insights into the processing of irrelevant information. This was the case in the present study, when we manipulated the research participants’ familiarity with the speaker of the irrelevant speech.
B. A. Barker & E. M. Elliott, Irrelevant Speech From a Familiar Talker
The Role of Talker Familiarity
Generally speaking, speech is made up of two components: linguistic information and indexical or talker-specific information (Abercrombie, 1967; Pisoni, 1997). The linguistic information refers to the segmental speech components such as phonemes and morphemes, while the talker-specific information includes components reflected in the speech signal that are particular to an individual talker’s voice (e.g., accent, gender, age) as opposed to any linguistic content. Historically, many studies in the field of speech perception focused on the perception, encoding, processing, and retrieval of the linguistic information (e.g., Kuhl, 1993; Liberman, 1957; Nearey, 1990), while the talker-specific information in the speech signal was often unaccounted for or dismissed from the problem space. Over the past decades, however, researchers showed that the talker-specific information is neurologically coupled with its linguistic information (Chandrasekaran, Chan, & Wong, 2011; Naoi et al., 2012) and often affects the perception and processing of linguistic information across a variety of listeners and a number of laboratory tasks. In particular, when a listener is familiar with a talker’s voice, the listener’s ability to perceive and process the talker’s speech is often better than if they are listening to a stranger (e.g., Nygaard & Pisoni, 1998; Souza, Gehani, Wright, & McCloy, 2013). Furthermore, the talker familiarity effect seems strongest when listeners are explicitly reminded of their relationship with the talker before the experimental task begins (Newman & Evers, 2007). Interestingly, despite past work examining talker variability in the to-be-ignored auditory stimuli (Hughes, Marsh, & Jones, 2009, 2011), no prior studies on the ISE have attempted to manipulate the relationship of the research participant’s familiarity with the talker.
The Current Study
The work of Röer and colleagues suggested that certain factors increase the allocation of attentional resources toward the irrelevant channel (taboo words, Röer et al., 2017; one’s own name, Röer, Bell, & Buchner, 2013), whereas an interference-by-process view instead places emphasis on the processes needed to complete the task at hand (Hughes, 2014) and the precategorical features of the irrelevant speech (Marsh et al., 2018). Based upon the findings of Röer and colleagues, in addition to the known effects of talker familiarity on speech perception (Nygaard & Pisoni, 1998; Souza et al., 2013), we predicted that the greater the degree of familiarity of the participant with the talker, the greater the resulting size of the disruption to serial recall performance. This talker familiarity effect would be further increased when the participant was made explicitly aware of their relationship with the talker, relative to being uninformed about the talker’s identity, or having no familiarity at all. Alternatively, the interference-by-process view would only predict an effect of talker familiarity if the task required a specific interaction with the characteristics of the talker’s voice, which is not necessary to complete a serial recall task. This prediction indicates that in a typical ISE task, the familiarity of the participant with the talker’s voice producing the irrelevant speech would be processed as a postcategorical feature.
Experiment 1
In Experiment 1, we explored the role of talker familiarity established in a natural listening environment – a college classroom (see also Newman & Evers, 2007). We did so within the context of a classic ISE task, and we recruited students from introductory psychology classes to serve as participants. One-third of the participants were a part of the control group and were unfamiliar with the irrelevant talker’s speech; the remaining participants were familiar with the irrelevant talker’s speech because it was recorded by their introductory psychology class instructor. Of these remaining participants, half of them were informed that they were familiar with the talker and half of them were not informed that they were familiar with the talker.
Method
Participants
Two hundred seven undergraduate students from a large, state university enrolled in introductory psychology classes participated in the experiment. We recruited participants through the university’s Department of Psychology research participant pool and compensated them with either course credit or extra credit in psychology classes. We quasi-randomly assigned participants to 1 of 3 groups based on their familiarity with the background talker. The final sample (N = 179) included n = 96 for the unfamiliar talker condition, n = 40 for the informed “familiar” talker condition, and n = 43 for the uninformed “familiar” talker condition. Participant inclusion criteria were as follows: normal or corrected-to-normal vision; hearing within-typical-limits (5 excluded); and native English speaker (7 excluded). Of the total number of original participants, 19 were removed from data analyses based on answers to the postexperiment questionnaire (see Results section for details).
Experimental Design
We utilized a 2 × 3 mixed design with auditory condition (silent, speech) as a within-subjects factor and talker familiarity (unfamiliar, uninformed, informed) as a between-subjects factor.
Materials
We used E-Prime software 2.0 (2012) to conduct the experiment on Dell desktop computers equipped with 17-inch monitors and Sony circumaural headphones. The experimental setup was located in a sound-treated room. The E-Prime code and stimuli used for Experiment 1 can be secured by researchers for replication purposes by e-mailing the corresponding author.
Visual Stimuli
Seven target digits (1–7) served as the visual stimuli. Digits were printed in black typeface and were comfortably viewed by the participants. All digits were centered on the monitor and displayed on a white background.
Auditory Stimuli
Talker details. A female instructor for an “Introduction to Psychology” course at the university recorded the distractor sentences. We expected participants in the familiar condition to be familiar with this voice because the talker was their introductory psychology instructor. The participants were invited to assist with the study approximately midway through the semester, to ensure that they had ample time to become familiar with the instructor’s voice. The participants in the unfamiliar condition were enrolled in introductory psychology course sections other than the target talker’s.
Distractor sentences. Sentences from the Revised List of Phonetically Balanced Sentences: Harvard Sentences (IEEE, 1969) served as the distractor sentences for all participants. The sentences were combined and edited into sound files approximately 6 s in length using the Adobe Audition 2.0 (2005) sound editing software on a Dell Optiplex 740 computer with a Delta 1010LT PCI sound card. The recordings were edited and equated across root-mean-square (RMS). These edited sentences served as the distractor sentences for all participants.
Procedure
Each participant was tested individually at a personal computer using a typical ISE paradigm. A researcher obtained informed consent prior to the experiment.
The participants in the unfamiliar and uninformed groups were given no information about the irrelevant speech’s talker. However, for participants in the informed group, prior to the beginning of the experiment, “The voice you will hear is your PSYC 2000 professor, [INSTRUCTOR’S NAME], so you should be familiar with this voice.” appeared on the computer monitor. The experiment began with 3 practice trials, followed by the 40 test trials presented randomly. The 7 target digits were presented visually on the computer monitor at a rate of 750 ms each, while the distractor sentences were simultaneously presented at a comfortable listening level via headphones worn by the participant. The total duration of stimuli presentation was approximately 6,000 ms. Of the 40 experimental trials, 20 were silent trials and 20 were auditory distractor trials; the different trial types were intermixed across the 40 total trials. Each participant was instructed to remember the digits that they saw on the screen and to ignore the sentences they heard over the headphones. For each trial, after the digits were presented on the monitor, the participant was instructed to recall the digits in the order in which they were presented and use the number keypad to record their responses. After the completion of all trials, the participant answered follow-up, multiple-choice questions via the computer about the distractor talker. Follow-up questions can be found in Appendix A. Finally, the experimenter debriefed the participant and thanked them for their participation.
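The trial structure and recall task described in this procedure can be sketched in a few lines. This is a minimal illustration only, not the authors’ E-Prime implementation: the function names `make_trials` and `score_serial_recall` are hypothetical, presenting each digit once per trial is an assumption, and strict position-by-position scoring is also an assumption, since the text states only that digits were recalled in presentation order.

```python
import random

DIGITS = [1, 2, 3, 4, 5, 6, 7]   # the seven target digits named in the Method
N_SILENT, N_SPEECH = 20, 20      # 40 test trials in total


def make_trials(seed=None):
    """Build 40 test trials: 20 silent and 20 speech, randomly intermixed.

    Assumption: each digit appears once per trial in a random order; the
    source states only that seven target digits (1-7) were presented.
    """
    rng = random.Random(seed)
    conditions = ["silent"] * N_SILENT + ["speech"] * N_SPEECH
    rng.shuffle(conditions)
    return [
        {"condition": c, "digits": rng.sample(DIGITS, len(DIGITS))}
        for c in conditions
    ]


def score_serial_recall(presented, recalled):
    """Proportion of digits recalled in their original serial position.

    Assumption: strict serial scoring, where a response counts only if it
    matches the presented digit at the same position.
    """
    hits = sum(1 for p, r in zip(presented, recalled) if p == r)
    return hits / len(presented)
```

For example, under this assumed scoring rule, recalling `[3, 1, 2, 4, 7, 5, 6]` after seeing `[3, 1, 4, 2, 7, 5, 6]` yields 5 of 7 positions correct.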
Results and Discussion
Assessing Talker Familiarity
For our first step in data analysis, we examined the participants’ responses to the follow-up questionnaire. It was critical that participants in the informed condition, for example, could correctly recognize the talker as familiar and provide the correct name of their instructor (recall, they were told her name at the beginning of the experiment). From the original sample, 4 individuals in the informed group responded, “I don’t know” to the question asking about the familiarity of the to-be-ignored voice. We removed those individuals from further data analyses since the voice was not “explicitly familiar” to these participants. Interestingly, in the unfamiliar condition, in which participants were recruited from a section of “Introduction to Psychology” that was not taught by the to-be-ignored talker, 15 participants reported that they were familiar with the voice. Given that these individuals may have tried to attach a label to the voice – because it is highly unlikely that they were actually familiar with the correct identity of the talker – we excluded these participants from further data analyses as well. Recall, the uninformed group was given no instructions regarding the identity of the speaker, and it was possible that they may have recognized her voice correctly, as she was their instructor.
Assessing the Effect of Talker Familiarity on the Irrelevant Speech Effect
With the final sample established (N = 179), we conducted a 2 × 3 mixed-model analysis of variance (ANOVA) with the within-participants factor of auditory condition (silent, speech) and the between-participants factor of talker familiarity (unfamiliar, uninformed, informed). A main effect of auditory condition was found, F(1, 176) = 428.19, MSE = .007, p < .01, ηp² = .71, with higher serial recall performance in the silent condition (M = 0.79) than in the presence of speech (M = 0.59). However, there was no main effect of talker familiarity (p = .63) and no interaction of talker familiarity with auditory condition (p = .41; see Figure 1). We then collapsed the two familiarity conditions into one group (n = 83) and created a difference score to assess the size of the disruption created by the irrelevant speech. The familiar group (Mdiff = 0.20) did not demonstrate a significantly larger difference score than the unfamiliar group (Mdiff = 0.19) [t(177) = .73, p = .46]. It is possible that the talker – despite being the participants’ professor and accurately identified by name via a post-experiment questionnaire – was not actually familiar to either the uninformed or informed “familiar” groups of listeners. An alternative possibility is that the demands of the task, ignoring the auditory information and performing serial recall on visually presented digits, did not involve any overlap in processing between the irrelevant speech (which was possibly spoken by a familiar talker) and the required response. Thus, in an attempt to differentiate between these two theoretical explanations of the results of Experiment 1, we conducted Experiment 2 and directly manipulated the familiarity of the talker’s voice.
Figure 1. Mean proportion correct serial recall performance of the talker familiarity groups, by auditory condition. Error bars represent standard error of the mean. White bars represent trials in silence and gray bars represent trials presented in the presence of irrelevant speech.
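The effect size and degrees of freedom reported for Experiment 1 can be cross-checked from the published values alone. The sketch below relies on the standard identity that partial eta squared equals (F × df_effect) / (F × df_effect + df_error); the helper name `partial_eta_squared` is hypothetical and is not part of the authors’ analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Group sizes reported for the final sample of Experiment 1.
group_n = {"unfamiliar": 96, "informed": 40, "uninformed": 43}
total_n = sum(group_n.values())                            # 179

# Error df for the 2 x 3 mixed ANOVA: N minus the number of groups.
df_error_anova = total_n - len(group_n)                    # 176

# Collapsing the two familiar groups leaves a two-group comparison.
familiar_n = group_n["informed"] + group_n["uninformed"]   # 83
df_t = total_n - 2                                         # 177

eta_p2 = partial_eta_squared(428.19, 1, df_error_anova)
print(round(eta_p2, 2))  # 0.71, matching the reported effect size
```

The recovered values agree with the reported F(1, 176), t(177), the collapsed familiar group of 83, and the partial eta squared of .71.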
Experiment 2
Given the lack of a significant talker familiarity effect and the little experimental control available regarding listener familiarity in Experiment 1, we conducted Experiment 2 to address these concerns. We familiarized listeners with a specific talker’s voice via four days of training that included 30 min/day of narrative exposure, followed by an interactive voice detection task. We hypothesized that this voice training would provide controlled exposure to and interaction with the target talker’s voice and subsequently affect the magnitude of the ISE, thus yielding a greater possibility of talker familiarity effects that were not found in Experiment 1. By directly manipulating the participants’ familiarity with the talker’s voice, the likelihood of familiarity leading to an attention capture response (i.e., the strength of the postcategorical feature to disrupt serial recall performance) is increased. In contrast, if the precategorical features remain the dominant cause of disruption, then the familiarity of the talker’s voice will not influence the size of the ISE.
Method
Participants
Sixty-three undergraduate students from a large, state university were recruited through the university’s Department of Psychology research participant pool to participate in this experiment. We quasi-randomly assigned participants to either undergo or forgo talker familiarity training. The final sample included n = 29 for the talker training condition and n = 31 for the control condition (N = 60; 53 females, Mage = 23 years). Participant inclusion criteria were as follows: normal or corrected-to-normal vision; self-reported hearing within-typical-limits; and a native English speaker. We compensated participants in the control group with either course credit or extra credit; we compensated participants undergoing talker training with $25.
Experimental Design
We employed a 2 × 2 × 2 mixed design with auditory condition (silent, speech) and time of testing (pre, post) as within-subjects factors; talker training (training, control) was the between-subjects factor.
Materials
Irrelevant Sound Task
We used the same materials and stimuli from Experiment 1 for Experiment 2’s irrelevant sound task.
Talker Training
We used E-Prime software 2.0 (2012) to conduct the talker training portion of the experiment on Dell desktop computers equipped with 17-inch monitors and Sony circumaural headphones. The setup was located in a sound-treated room. The E-Prime code and stimuli used for talker training can be secured by researchers for replication purposes by e-mailing the corresponding author.
Talker Exposure Task’s Auditory Stimuli
A total of 14 published short stories were recorded by the same, female talker who recorded the irrelevant sound task’s distractor sentences. We recorded the stories in the same manner as the distractor sentences and used them as stimuli for the talker exposure task. Stories included such pieces as “Eleonora” by Edgar Allan Poe (1841) and “Experience” by Tessa Hadley (2013). A researcher uploaded the stories’ audio recordings to a Dell Optiplex 740 computer equipped with a Delta 1010LT PCI sound card. They then edited and equated the recordings across total RMS using Adobe Audition 2.0 (2005) sound editing software. The stories were combined to yield four different, 30-minute blocks of story audio. Each block of story audio included a total of three or four stories and was used as stimuli for the 4-day-long talker exposure task.
Voice Detection Task The same female talker who recorded the aforementioned distractor sentences and stories also recorded 10 additional Harvard sentences. An additional 12 native English-speaking females recorded a total of 60 Harvard sentences. A researcher uploaded these audio recordings to the same Dell computer. They again edited and equated across total RMS using the Adobe Audition 2.0 (2005) sound editing software for these stimuli. The sentences were utilized to yield four different sets of 15 voice detection trials. Each sentence set was used as stimuli for the voice detection task which served as the final component of the 4-day-long training protocol. Procedure Experiment 2’s procedure consisted of three components: baseline, pretraining ISE task (pretest); talker training; and post-training ISE task (posttest). The talker training component of this experiment had two phases: (1) talker exposure and (2) voice detection task; talker training occurred four times over a series of 4 consecutive days. The pretest and posttest components had one phase each (see Figure 2 for a schema). We administered all components of the experiment in a sound-treated room at a personal computer equipped with headphones; we trained and tested all participants individually. All of the speech stimuli were presented at the listener’s maximum comfort level. All components of the experiment were administered during 4 consecutive days. We counterbalanced talker-, digit-, and sentencepresentation across all irrelevant sound tasks and participants. Irrelevant Sound Task We used ISE task procedures identical to those of Experiment 1. A researcher obtained informed consent prior to the experiment. The participants in the control condition Experimental Psychology (2019), 66(1), 1–11
Figure 2. Schema of three-component procedure for Experiment 2.
completed an ISE task on Day 1 and Day 4 of the 4 consecutive days (see Figure 2); we gave them no information about the irrelevant speech’s talker. The participants in the training condition also completed an ISE task on Day 1 and Day 4 of the 4 consecutive days (see Figure 2); we familiarized them with the irrelevant speech’s talker over the 4 consecutive days of training.

Talker Training
Participants in the talker training condition underwent talker training on Days 1–4. On Day 1, training occurred after the ISE task; on Day 4, training occurred before the ISE task (see Figure 2). Training began with talker exposure, followed by a voice detection task.

Talker Exposure
Talker exposure began with the following message written across the computer’s monitor: “Sally will read you 3 stories. Your job is to relax and listen.” The short story’s title and its author then appeared printed across the screen, and the story’s audio began. The participant listened via the computer’s headphones to a total of 30 min of stories, all read by the talker who produced the irrelevant speech in the ISE tasks. The stories were not repeated across the 4 days of training. The voice detection task began immediately after the participant completed their 30 min of talker exposure.

Voice Detection Task
The voice detection task began with the following instructions written across the computer’s monitor: “For this next task, you will hear 3 spoken sentences. Sally may or may not speak one of the sentences. Your job is to indicate whether or not you heard Sally’s voice . . . make your judgement as quickly and accurately as possible.” Then, the voice detection task began. If the participant heard Sally (i.e., the talker who read the stories during talker exposure) speak one of the three sentences, they pressed the letter “Y” on the computer’s keyboard. If the participant did not detect Sally’s voice, they pressed the letter “N” on the computer’s keyboard. The voice detection task consisted of a total of 15 trials; the trials were never duplicated across the 4 days of training.
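The Materials section states that the recordings were equated across total RMS in Adobe Audition. The operation itself can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names, the choice of the mean RMS as the default target, and the toy sample values are all assumptions made for illustration.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def equate_rms(recordings, target_rms=None):
    """Scale every recording so that all share the same total RMS.

    `target_rms` is a hypothetical parameter: when omitted, the mean RMS
    of the input recordings is used as the common level.
    """
    levels = [rms(rec) for rec in recordings]
    if any(level == 0 for level in levels):
        raise ValueError("cannot rescale a silent recording")
    if target_rms is None:
        target_rms = sum(levels) / len(levels)
    return [
        [s * (target_rms / level) for s in rec]
        for rec, level in zip(recordings, levels)
    ]


# Toy example: a quiet and a loud "recording" (invented values).
quiet = [0.1, -0.1, 0.1, -0.1]
loud = [0.5, -0.5, 0.5, -0.5]
equated = equate_rms([quiet, loud])
```

After scaling, both toy recordings have the same RMS, so differences in overall level between talkers cannot serve as a cue in the voice detection task.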
Figure 3. Mean proportion of correct serial recall for both groups of listeners (talker training, control) across time of testing, with standard error of the mean represented by the error bars. White bars represent trials in silence and gray bars represent trials presented in the presence of irrelevant speech. Dotted bars on the left represent listeners who underwent talker familiarity training.
Results and Discussion

Assessing the Effect of Talker Familiarity on the Irrelevant Speech Effect
We scored performance on the serial recall task using a proportion correct method, in which each item correctly recalled was counted independently of whether the entire list was correct or not. Strict serial recall scoring was used, such that an item had to be recalled in its correct serial position to be counted as accurate. Descriptive statistics are illustrated in Figure 3. We conducted a 2 × 2 × 2 mixed ANOVA. The between-subjects factor of participant group was not significant, F(1, 58) = 0.161, p = .69, whereas both within-subjects factors yielded significant main effects: auditory condition, F(1, 58) = 181.16, MSE = .01, p < .001, ηp² = .76, and time of testing, F(1, 58) = 33.76, MSE = .009, p < .001, ηp² = .37. These main effects reflect an overall detriment to recall in the presence of speech (M = 0.69), as compared to silence (M = 0.86), and an improvement from pretest performance (M = 0.75) to posttest performance (M = 0.80). In other words, in concert with the data from Experiment 1, these data demonstrated robust irrelevant sound and practice effects. However, despite the experimental manipulation of talker familiarity via talker training in Experiment 2, there was no effect of talker familiarity.

Talker Training: Voice Detection Data
Overall, voice detection performance was high (M = 0.88, SD = 0.08). The high level of accuracy suggests that participants encoded talker-specific, acoustic characteristics to
memory and later used that information to determine whether or not the target talker’s voice was present among the three audio recordings presented at test. We also examined the four different story sets to test for story-specific effects on performance. Across the 4 days of talker training sessions, three participants each missed one session. We therefore excluded these participants from the repeated-measures ANOVA exploring the effect of story during talker exposure (N = 26). The use of the four different story sets across the 4 days of training did not result in any significant differences in voice detection accuracy, F(3, 75) = 1.55, MSE = .01, p = .21, ηp² = .06 (see Appendix B for descriptive statistics for the voice detection task associated with each story set).

Recall that we conducted Experiment 2 because we questioned whether or not the irrelevant talker (i.e., the listeners’ professor) was truly “familiar” to the listeners in Experiment 1. Without such familiarity, the precategorical features of the auditory distractors would continue to drive the participants’ performance without any influence from the postcategorical features. In Experiment 2, we included a 4-day-long talker training component (i.e., talker familiarization) prior to the ISE task. Training included 30 min of exposure to the ISE talker’s voice via story narratives, followed by a voice detection task including the target voice. Listening to the stories likely involved passive encoding of the talker-specific characteristics of the voice, while the voice detection task involved a more active, deeper level of processing on the listeners’ part (Schneider & Shiffrin, 1977). Consequently, we predicted that the irrelevant talker’s voice would be familiar to the trained participants and affect the magnitude of their ISE. This was not the case. The size of the ISE was statistically indistinguishable in pretesting and in posttesting, before and after voice detection training was completed.
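The proportion-correct, strict serial position scoring described in the Results can be sketched in Python as follows. This is an illustrative sketch, not the authors' scoring script: the function names are hypothetical, and the eight-digit list in the example is invented, since the list length is not restated in this section.

```python
def proportion_correct(presented, recalled):
    """Strict serial scoring for one list.

    An item counts as correct only if it is recalled in the same serial
    position in which it was presented; each item is scored independently
    of whether the whole list was recalled correctly.
    """
    hits = sum(
        1
        for position, item in enumerate(presented)
        if position < len(recalled) and recalled[position] == item
    )
    return hits / len(presented)


def mean_proportion_correct(trials):
    """Average the per-list scores over (presented, recalled) pairs."""
    scores = [proportion_correct(p, r) for p, r in trials]
    return sum(scores) / len(scores)


# Invented example: two adjacent digits are transposed at recall.
presented = [4, 9, 1, 7, 3, 8, 2, 5]
recalled = [4, 9, 7, 1, 3, 8, 2, 5]
score = proportion_correct(presented, recalled)  # 6 of 8 positions match
```

Under this scheme a transposition costs two items even though both digits were remembered, which is what distinguishes strict serial scoring from free-recall scoring.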
Familiarity with the irrelevant talker did not modulate auditory distraction in Experiment 2’s ISE task, suggesting that the precategorical features remained the dominant contributor to the distraction effects we observed.
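The partial eta-squared values reported for Experiment 2 can be recovered from the F statistics and their degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies it to the three effects reported above; the function name is an assumption made for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text.
auditory_condition = partial_eta_squared(181.16, 1, 58)
time_of_testing = partial_eta_squared(33.76, 1, 58)
story_set = partial_eta_squared(1.55, 3, 75)
```

Rounded to two decimals, these reproduce the reported effect sizes of .76, .37, and .06, a quick consistency check between the F ratios and the effect sizes.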
General Discussion

Across two experiments, participants’ serial recall was significantly hindered in the presence of irrelevant background speech, compared to serial recall in quiet (ISE; Colle & Welsh, 1976; Elliott & Briganti, 2012). Furthermore, these data showed that a person’s familiarity with the talker did not influence the size of the ISE, whether the participants were informed of the talker’s identity (Experiment 1) or received 4 days of talker training that required them to attend to and interact with talker-specific
characteristics of the voice (Experiment 2). These findings are in contrast to our hypotheses and to previous speech perception research comparing listeners’ perception of words spoken by familiar and unfamiliar voices, in which familiarity with the talker’s voice typically yields a perceptual advantage (e.g., Barker & Newman, 2004; Souza et al., 2013; Yonan & Sommers, 2000). However, it is important to note that the majority of these studies exploring talker familiarity used talkers with whom the listeners had long-standing, personal relationships (e.g., a parent or spouse). It is probable that the degree of familiarity one has with a talker is what enhances one’s ability to perceive that talker’s speech. For example, Newman and Evers (2007) conducted a study employing a perceptual shadowing task in which participants shadowed the speech of one talker while ignoring the speech of a second talker. Similar to our current study, Newman and Evers’ familiar talker was the participants’ introductory psychology instructor. In their Experiment 1, the listeners were familiar with the voice of the talker to be shadowed, and the results showed a talker familiarity effect (i.e., familiarity with the target talker significantly improved the listeners’ shadowing accuracy). In their Experiment 2, the listeners were familiar with the to-be-ignored talker’s voice, and familiarity did not affect the listeners’ shadowing accuracy, just as the listeners’ familiarity with the to-be-ignored talker did not affect the magnitude of the ISE in the present study. The processing requirements of ignoring a voice while shadowing one of two speech streams are arguably similar to those of ignoring a voice while remembering visually presented digits, and these task similarities may explain the lack of a talker familiarity effect.
The emphasis on the overlap between the processing demands of the task and the to-be-ignored stimuli, within an ISE paradigm and within a speech shadowing task, fits with the duplex-mechanism account of auditory distraction and, specifically, the interference-by-process view of the changing-state effect (Hughes, 2014). The current findings suggest that in these two experiments, the precategorical features of auditory distraction were the dominant cause of the distraction observed. In this precategorical/postcategorical description of the duplex-mechanism account (Marsh et al., 2018), the lack of an effect of talker familiarity is not necessarily problematic for the “functional” view of auditory distraction of Röer et al. (2017). The current stimuli were not emotionally arousing as a result of their semantic context or the talker-specific information. It is possible that if we increased the degree of familiarity our participants had with the to-be-ignored voice even further (e.g., if the talker was the participants’ close friend or family member), the talker-specific information may have affected the participants’ serial recall. Further manipulating the concept of familiarity,
and thus strengthening the potency of the postcategorical features in future research, may ultimately give us new insights into both the “functional” view of auditory distraction and how listeners use talker familiarity cues during everyday communication in noisy listening environments. These data also remind us that talker familiarity is a more vulnerable construct than once thought (Drouin, Monto, & Theodore, 2017). Some researchers argue that attention modulates talker familiarity effects across a variety of speech perception paradigms. In other words, “when processing time during retrieval is decoupled from encoding factors, it fails to predict the emergence of talker-specificity effects. Rather attention during encoding appears to be the putative variable” (Theodore, Blumstein, & Luthra, 2015, p. 1674). This is known as the attention-driven specificity hypothesis (Maibauer, Markis, Newell, & McLennan, 2014; Theodore et al., 2015). Our data from Experiment 1 support this perspective. It seems probable that during their introductory psychology class, students were not attending to the instructor’s identity and voice characteristics (e.g., gender, shimmer, quality, accent); rather, their attentional resources were allocated to learning and understanding class material. Because the students were not explicitly attending to (and encoding) the talker characteristics of their instructor’s voice during class, when they participated in Experiment 1 they had few detailed, talker-specific exemplars available for recall and subsequent interference (Church & Schacter, 1994); this applies even to the participants in the informed familiarity condition. Therefore, familiarity with the irrelevant speech’s talker had no significant effect on the magnitude of the ISE. Our data from Experiment 2 counter this argument.
In the future, if we employed a similar design and ISE methodology, but trained the student participants to document (i.e., actively attend to) specific vocal characteristics of their instructor during each class and prior to the experiment, we may gain further insight into whether or not attention allocation is driving talker-specific effects. In summary, our study replicated the robust data already supporting the ISE but failed to show a talker-specific effect of familiarity in the context of ISE. These data suggest that talker familiarity is a more fragile construct than the speech perception literature originally suggested (Drouin et al., 2017), while also supporting the interference-by-process view of the changing-state effect (Hughes, 2014). Further research is needed to help resolve when talker-specific information is perceived and utilized by the listener. This knowledge will be key to confirming the theoretical explanations of the irrelevant speech effect and working toward a biologically plausible theory of speech perception that includes both linguistic and indexical information.
Electronic Supplementary Materials
The electronic supplementary material is available with the online version of the article at https://doi.org/10.1027/1618-3169/a000425
ESM 1. Data (.xls)
Raw data of Experiment 1.
ESM 2. Data (.xls)
Raw data of Experiment 2.
ESM 3. Data (.xlsx)
Raw data of the talker training.
References

Abercrombie, D. (1967). Elements of general phonetics. Chicago, IL: Aldine.
Adobe Systems. (2005). Adobe Audition CC (Version 2) [Computer software]. San Jose, CA: Adobe Systems.
Allport, A. (1993). Attention and control: Have we been asking the wrong questions? A critical review of twenty-five years. In E. Meyer & S. Kornblum (Eds.), Attention and performance XVI: Synergies in experimental psychology, artificial intelligence, and cognitive neuroscience (pp. 182–218). Cambridge, MA: MIT Press.
Barker, B. A., & Newman, R. S. (2004). Listen to your mother! The role of talker familiarity in infant streaming. Cognition, 94, B45–B53. https://doi.org/10.1016/j.cognition.2004.06.001
Beaman, C. P. (2004). The irrelevant sound phenomenon revisited: What role for working memory capacity? Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1106–1118. https://doi.org/10.1037/0278-7393.30.5.1106
Chandrasekaran, B., Chan, A. H. D., & Wong, P. C. M. (2011). Neural processing of what and who information in speech. Journal of Cognitive Neuroscience, 23, 2690–2700. https://doi.org/10.1162/jocn.2011.21631
Church, B. A., & Schacter, D. L. (1994). Perceptual specificity of auditory priming: Implicit memory for voice intonation and fundamental frequency. Journal of Experimental Psychology: Learning, Memory, & Cognition, 20, 521–533.
Colle, H. A., & Welsh, A. (1976). Acoustic masking in primary memory. Journal of Verbal Learning and Verbal Behavior, 15, 17–31. https://doi.org/10.1016/S0022-5371(76)90003-7
Drouin, J. R., Monto, N. R., & Theodore, R. M. (2017). Talker-specificity effects in spoken language processing: Now you see them, now you don’t. In A. Lahiri & S. Kotzer (Eds.), The speech processing lexicon: Neurocognitive and behavioural approaches (pp. 107–128). Berlin, Germany/Boston, MA: Walter de Gruyter GmbH.
Ellermeier, W., & Zimmer, K. (2014). The psychoacoustics of the irrelevant sound effect. Acoustical Science and Technology, 35, 10–16. https://doi.org/10.1250/ast.35.10
Elliott, E. M., & Briganti, A. M. (2012). Investigating the role of attentional resources in the irrelevant speech effect. Acta Psychologica, 140, 64–74. https://doi.org/10.1016/j.actpsy.2012.02.009
Hadley, T. (2013). Experience. In The New Yorker. New York, NY: Condé Nast.
Hughes, R. W. (2014). Auditory distraction: A duplex-mechanism account. PsyCh Journal, 3, 30–41. https://doi.org/10.1002/pchj.44
Hughes, R. W., & Jones, D. M. (2001). The intrusiveness of sound: Laboratory findings and their implications for noise abatement. Noise & Health, 13, 55–74.
Hughes, R. W., & Jones, D. M. (2003). Indispensable benefits and unavoidable costs of unattended sound for cognitive functioning. Noise & Health, 6, 63–76.
Hughes, R. W., Marsh, J. E., & Jones, D. M. (2009). Perceptual-gestural (mis)mapping in serial short-term memory: The impact of talker variability. Journal of Experimental Psychology: Learning, Memory, & Cognition, 35, 1411–1425. https://doi.org/10.1037/a0017008
Hughes, R. W., Marsh, J. E., & Jones, D. M. (2011). Role of serial order in the impact of talker variability on short-term memory: Testing a perceptual organization-based account. Memory & Cognition, 39, 1435–1447. https://doi.org/10.3758/s13421-011-0116-x
Hughes, R. W., Vachon, F., & Jones, D. M. (2007). Disruption of short-term memory by changing and deviant sounds: Support for a duplex-mechanism account of auditory distraction. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 1050–1061. https://doi.org/10.1037/0278-7393.33.6.1050
IEEE. (1969). IEEE recommended practice for speech quality measurements. IEEE Transactions on Audio and Electroacoustics, 17, 225–246. https://doi.org/10.1109/TAU.1969.1162058
Jones, D. M., Madden, C., & Miles, C. (1992). Privileged access by irrelevant speech to short-term memory: The role of changing state. The Quarterly Journal of Experimental Psychology. Section A, 44, 645–669. https://doi.org/10.1080/14640749208401304
Körner, U., Röer, J. P., Buchner, A., & Bell, R. (2017). Working memory capacity is equally unrelated to auditory distraction by changing-state and deviant sounds. Journal of Memory and Language, 96, 122–137. https://doi.org/10.1016/j.jml.2017.05.005
Kuhl, P. K. (1993). Early linguistic experience and phonetic perception: Implications for theories of developmental speech perception. Journal of Phonetics, 21, 125–139.
Liberman, A. M. (1957). Some results of research on speech perception. Journal of the Acoustical Society of America, 29, 117–123. https://doi.org/10.1121/1.1908635
Maibauer, A. M., Markis, T. A., Newell, J., & McLennan, C. T. (2014). Famous talker effects in spoken word recognition. Attention, Perception, & Psychophysics, 76, 11–18. https://doi.org/10.3758/s13414-013-0600-4
Marsh, J. E., Yang, J., Qualter, P., Richardson, C., Perham, N., Vachon, F., & Hughes, R. W. (2018). Postcategorical auditory distraction in short-term memory: Insights from increased task load and task type. Journal of Experimental Psychology: Learning, Memory, & Cognition, 44, 882–897. https://doi.org/10.1037/xlm0000492
Naoi, N., Minagawa-Kawai, Y., Kobayashi, A., Takeuchi, K., Nakamura, K., Yamamoto, J.-I., & Shozo, K. (2012). Cerebral responses to infant-directed speech and the effect of talker familiarity. NeuroImage, 59, 1735–1744. https://doi.org/10.1016/j.neuroimage.2011.07.093
Nearey, T. M. (1990). The segment as a unit of speech perception. Journal of Phonetics, 18, 347–373.
Newman, R. S., & Evers, S. (2007). The effect of talker familiarity on stream segregation. Journal of Phonetics, 35, 85–103. https://doi.org/10.1016/j.wocn.2005.10.004
Neumann, O. (1996). Theories of attention. In O. Neumann & W. Prinz (Eds.), Handbook of perception and action Vol. 3: Attention (pp. 389–446). San Diego, CA: Academic Press.
Nygaard, L. C., & Pisoni, D. B. (1998). Talker-specific learning in speech perception. Perception & Psychophysics, 60, 355–376.
Pisoni, D. B. (1997). Some thoughts on “normalization” in speech perception. In K. Johnson & J. W. Mullennix (Eds.), Talker variability in speech processing (pp. 9–32). San Diego, CA: Academic Press.
Poe, E. A. (1841). Eleonora: A fable. The gift: A Christmas and New Year’s present for 1842 (pp. 17–322). Philadelphia, PA: Carey and Hart.
Psychology Software Tools, Inc. [E-Prime 2.0]. (2012). Retrieved from http://www.pstnet.com
Röer, J. P., Bell, R., & Buchner, A. (2013). Self-relevance increases the irrelevant sound effect: Attentional disruption by one’s own name. Journal of Cognitive Psychology, 25, 925–931. https://doi.org/10.1080/20445911.2013.828063
Röer, J. P., Körner, U., Buchner, A., & Bell, R. (2017). Attentional capture by taboo words: A functional view of auditory distraction. Emotion, 17, 740–750. https://doi.org/10.1037/emo0000274
Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing: I. Detection, search, and attention. Psychological Review, 84, 1–66. https://doi.org/10.1037/0033-295X.84.1.1
Souza, P., Gehani, N., Wright, R., & McCloy, D. (2013). The advantage of knowing the talker. Journal of the American Academy of Audiology, 24, 689–700. https://doi.org/10.3766/jaaa.24.8.6
Sörqvist, P., & Rönnberg, J. (2014). Individual differences in distractibility: An update and a model. PsyCh Journal, 3, 42–57. https://doi.org/10.1002/pchj.47
Theodore, R. M., Blumstein, S. E., & Luthra, S. (2015). Attention modulates specificity effects in spoken word recognition: Challenges to the time-course hypothesis. Attention, Perception, & Psychophysics, 77, 1674–1684. https://doi.org/10.3758/s13414-015-0854-0
Tremblay, S., & Jones, D. M. (1999). Change of intensity fails to produce an irrelevant sound effect: Implications for the representation of unattended sound. Journal of Experimental Psychology: Human Perception and Performance, 25, 1005–1015. https://doi.org/10.1037/0096-1523.25.4.1005
Yonan, C. A., & Sommers, M. S. (2000). The effects of talker familiarity on spoken word identification in younger and older listeners. Psychology & Aging, 15, 88–99. https://doi.org/10.1037/0882-7974.15.1.88
History
Received September 28, 2017
Revision received May 30, 2018
Accepted July 16, 2018
Published online February 19, 2019

Acknowledgments
The authors would like to thank Annie Schubert for her help with sound editing and E-Prime programming and Noelle Brown for recording the irrelevant speech stimuli for the study. Alicia Briganti assisted with data collection.

Open Data
Raw data are available in the Electronic Supplementary Materials, ESM 1–3.

Authorship
All authors made substantial contributions to conception and design; Jort de Vreeze designed the study, collected the data, and wrote the manuscript; Christina Matschke designed the study and revised the manuscript; both authors provided final approval of the version to be published.

ORCID
Brittan Barker https://orcid.org/0000-0001-9327-7057
Brittan Barker
Department of Communicative Disorders and Deaf Education
Utah State University
2620 Old Main Hill
Logan, UT 84342-2620
USA
brittan.barker@usu.edu
Appendix A

Follow-Up Questions Used to Assess Participants’ Familiarity With the Spoken Voice

1. Was the voice speaking the sentences to you familiar?
(A) Yes
(B) No

2. If the voice was familiar, attempt to identify the voice by selecting the letter next to the person’s last name.
(A) Brown
(B) Knapp
(C) Munson
(D) Toussaint
(E) Vigna
(F) The voice is not familiar to me

3. If you are taking PSYC 2000, how often a week do you attend class?
(A) Once a week
(B) Two times a week
(C) Three times a week
Appendix B

Table B1. Descriptive statistics for Experiment 2’s talker training voice detection data from each story session

Training Session    n     M       SD      Min     Max
A                   26    0.85    0.17    0.27    1.00
B                   27    0.90    0.09    0.67    1.00
C                   28    0.90    0.12    0.47    1.00
D                   28    0.87    0.15    0.47    1.00
Research Article
And Remember the Truth That Once Was Spoken
Knowledge of Having Disclosed Private Information to a Stranger Is Retrieved Automatically
Franziska Schreckenbach, Klaus Rothermund, and Nicolas Koranyi
Department of General Psychology II, Institute of Psychology, Friedrich Schiller University Jena, Germany
Abstract: Whenever individuals reveal personally relevant information to a stranger, they have to remember their self-disclosure for future interactions. Relying on instance-based theories of automaticity, we hypothesized that knowledge about having revealed private information to someone unfamiliar is retrieved automatically whenever this person is encountered again. In two studies, participants were orally interviewed by two different persons and instructed to be honest to one of them and to lie to the other. This instruction was either related to the identity of the interviewers (Experiment 1) or their gender (Experiment 2). Afterward, the target words honest and dishonest had to be identified in a categorization task in which pictures of the interviewers and of unknown persons served as task-irrelevant prime stimuli. In line with the hypothesis, results revealed congruence effects, indicating faster identification of the target word honest following the picture of a person to whom one had told the truth.
Keywords: self-disclosure, lying, automatic processes, instance-based learning
Typically, individuals entrust intimate details about the incidents in their lives, their thoughts and feelings only to very few people in their social environment (DePaulo & Kashy, 1998; Dindia, Fitzpatrick, & Kenny, 1997). These confidants are usually close friends or family members, like our partner, our siblings, or our parents. These social bonds are based on feelings of intimacy and a long history of mutual trust and reciprocity (Neyer, Wrzus, Wagner, & Lang, 2011). However, on some occasions, individuals may feel the urge to talk to someone about a personal issue, for instance, because none of their intimates is available or because they want to talk about a secret they want to hide away from these very persons. If such a situation arises, people sometimes confide in a total stranger, which is known as the “stranger on a train” phenomenon (Rubin, 1975). For instance, this might happen when meeting an interesting person at a party and immediately starting an intense discussion about a personally relevant topic, or when simply sitting alone with another person in a train coach while having the need to talk about a major problem. Experimental Psychology (2019), 66(1), 12–22 https://doi.org/10.1027/1618-3169/a000427
Whenever individuals have spoken openly and frankly to an unfamiliar person, they are well advised to remember that they have been honest in order to better regulate future interactions with this person, who has become an insider. For instance, the insider might use the private information to one’s disadvantage, and it is therefore important to keep an eye on him or her. Alternatively, the insider might also be more open or more loyal to us because we have been equally open before. Finally, when meeting the person again and acting according to the default mode (e.g., withholding personally relevant information), the person might be irritated and might even identify the incorrectness of one’s statements due to the incongruence with the original conversation. In the present research, we want to test the idea that remembering one’s self-disclosure toward someone unfamiliar is achieved by an automatic mechanism that comes into effect as soon as one has confided in an unknown person. Specifically, we test the hypothesis that a previously unknown conversation partner and the fact that one has been honest are bound together, forming an episodic unit that is stored in memory. When the person is encountered again, the information about having been honest is automatically retrieved from the stored episode. The broader theoretical framework behind this idea is provided by instance-based or episodic theories of memory
storage and retrieval (Logan, 1988; see also Denkinger & Koutstaal, 2009; Hommel, 1998, 2004; Rothermund, Wentura, & De Houwer, 2005). Within these theories, it is proposed that when an action is performed, information about this action is stored together with situational cues as an episodic unit (alternatively called an event file, Hommel, 1998, 2004; an instance, Logan, 1988; or a stimulus-response episode, see Rothermund et al., 2005). These episodic units become automatically reactivated when encountering a similar situation again and therefore enable people to react fast and consistently across different situations. Furthermore, recent findings suggest that not only the motoric response toward a specific stimulus itself is encoded as part of an episode, but that relevant meta-knowledge regarding the task context (Horner & Henson, 2009, 2011; Waszak, Hommel, & Allport, 2003, 2005), the semantic meaning (Giesen & Rothermund, 2016; Moutsopoulou, Yang, Desantis, & Waszak, 2015), and the veracity of an action (Koranyi, Schreckenbach, & Rothermund, 2015) is likewise encoded in episodic format and is later retrieved when the eliciting situation is encountered again. A second theoretical idea that led to our research hypothesis relates to the finding that lying is a widespread and sometimes even adaptive phenomenon, with more than 90% of all people using deception at least once a week and with lies possibly helping to avoid conflicts and to maintain positive relationships (e.g., DePaulo, Kashy, Kirkendol, Wyer, & Epstein, 1996; Ennis, Vrij, & Chance, 2008; Metts, 1989; Serota, Levine, & Boster, 2010). There is also evidence that lies are told more frequently to strangers than to friends (DePaulo & Kashy, 1998).
Furthermore, the widespread assumption that lying is cognitively more demanding than telling the truth has recently been challenged by results from various studies (e.g., Barnes, Schaubroeck, Huth, & Ghumman, 2011; McCornack, Morrison, Palik, Wisner, & Zhu, 2014; Yam, Chen, & Reynold, 2014). These lines of research provide some evidence for our assumption that telling an unknown person something truthful is not necessarily the default option in every conversation and may even be experienced as being more demanding, especially when personal information is concerned. Given these findings and the fact that distinct or unusual events are more likely to be encoded in memory than common events (Cohen & Carr, 1975; see also Brandt, Gardiner, & Macrae, 2006), we assume that knowledge about having disclosed some personally relevant truth to a stranger should become stored in memory because of its distinctiveness and should be automatically activated on a later occasion. We test this idea in two experiments with the same core paradigm, consisting of two parts: First, participants either had to tell the truth or had to lie to different unfamiliar persons in oral interviews. Afterward, facial photographs of their interviewers served as task-irrelevant prime stimuli
during a classification task in which the target words dishonest and honest had to be identified. Based on the assumption that a person becomes associated with the truth status of the responses that were given to this individual, we predicted a congruence effect, that is, faster identification of the target words following a matching picture prime. Specifically, we predicted faster identification of the word honest compared to the word dishonest after the presentation of a picture of a person to whom one had told the truth before, and faster identification of the word dishonest compared to the word honest after the presentation of a picture of a person one has lied to before. In this study, we report all measures, manipulations, and exclusions.
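The predicted congruence effect can be expressed as a single contrast over the four prime-by-target cells: mean response time on incongruent trials minus mean response time on congruent trials. The Python sketch below is only an illustration under stated assumptions. The function name, the dictionary layout, and the response times are invented, and the sketch is not the analysis reported by the authors.

```python
from statistics import mean


def congruence_effect(rts):
    """Return mean incongruent RT minus mean congruent RT (in ms).

    `rts` maps (prime, target) pairs to lists of response times, with
    primes "truth"/"lie" (the interviewer one told the truth or lied to)
    and targets "honest"/"dishonest". A positive value means faster
    identification of the target that matches the prime.
    """
    congruent = [mean(rts[("truth", "honest")]), mean(rts[("lie", "dishonest")])]
    incongruent = [mean(rts[("truth", "dishonest")]), mean(rts[("lie", "honest")])]
    return mean(incongruent) - mean(congruent)


# Invented response times for one hypothetical participant.
example_rts = {
    ("truth", "honest"): [500, 520],
    ("lie", "dishonest"): [530, 550],
    ("truth", "dishonest"): [560, 580],
    ("lie", "honest"): [540, 560],
}
effect = congruence_effect(example_rts)
```

Averaging the cell means, rather than pooling trials, keeps each cell equally weighted even if the number of valid trials differs between cells.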
Experiment 1

Experiment 1 was a direct test of our hypothesis that knowledge about having told the truth to someone unfamiliar before can be activated automatically in a subsequent similar situation. We also examined whether this retrieval process is specific for truthful situations or deceptive ones by including not only pictures of interviewers to whom participants had previously lied or told the truth, but also pictures of unknown persons.
Method
Participants and Design
The sample size was determined on the basis of a previous experiment by Koranyi et al. (2015), who found an effect size of ηp² = .16 in a similar paradigm. Accordingly, a power analysis was conducted based on this value, α = .05, and a power of 1 − β = .8, leading to a sample size of 45. To account for possible exclusions, we recruited 50 students from a German university. The data of two subjects were excluded because of extremely slow reactions, with their mean response time lying more than three interquartile ranges above the third quartile of the mean reaction times of the sample. The final sample thus consisted of 48 subjects (33 female) with an average age of M = 22.0 years (SD = 3.16). They participated in exchange for partial course credit and a chocolate bar. Recruiting was finished after the required number had been reached. We used a 3 × 2 factorial design with the within-subjects factors prime picture (facial photograph of an interviewer whom one had told the truth vs. lied to vs. not encountered before) and target word (honest vs. dishonest).
Procedure and Materials
Upon arrival, participants were seated in front of a computer and received instructions on the screen. As picture
primes in the priming task, we used facial photographs of the four interrogators in which they wore the same clothes, hairstyle, and accessories as in the actual interviews. All pictures had a size of 400 × 600 px, with each person’s eyes placed at the same position in the upper half of the picture (approx. 200 px below the top).
Interviews
First, participants were informed that they would be interviewed twice by different interrogators and that their task was to answer all questions in one interview truthfully, but to lie in response to all questions in the other one. Additionally, participants were prompted not to reveal their dishonesty to the respective interviewer and to act so as to convince both interviewers of the truthfulness of their statements. In order to achieve this goal, participants were instructed to always wait for some seconds before providing each answer. Afterward, a picture of the first interviewer appeared on the left side of the screen, while on the right side the interviewer’s name, the topic that participants would be asked about, and the instruction whether to tell the truth or to lie were presented (e.g., “This is Sascha. Sascha is going to ask you some questions on the subject of your family. Please always tell the truth.”). Complementary information was given about the second interviewer (e.g., “This is Sarah. Sarah is going to ask you some questions on the subject of your studies at the university. Please always respond with a lie.”). Participants were then guided to separate rooms where the interviews took place. Every participant was interrogated by one of two possible couples of interviewers, with pictures of the other couple serving as a neutral baseline in the subsequent priming task. Each couple consisted of a man and a woman. The assignment of truthfulness and lying to interviewer and topic was counterbalanced across participants, as was the order of interviews.
Four questions were related to the topic family; the others referred to the topic university (see Electronic Supplementary Material, ESM 1, for a complete list of all questions). Each question was created to touch on a personally important issue but at the same time was not supposed to be too intimate, in order to prevent participants from answering untruthfully even when they were instructed to tell the truth.
Priming Task
After the interviews, participants performed a priming task on the computer to assess whether the specific persons from the interview automatically trigger retrieval of knowledge about having previously told the truth or a lie, respectively. Each trial had the same temporal structure (see Figure 1): A fixation cross (500 ms) was followed by a prime picture, showing one of the four interviewers. After 300 ms, the target word appeared on top of the prime picture, approximately at the position where the eyes in the pictures were located (200 px below the top). Participants were asked to decide as fast as possible whether a presented target was the word ehrlich (English: honest) or gelogen (English: dishonest, having lied) by pressing the “J” key for honest and the “F” key for dishonest. To ensure that target words were encoded and identified not only on a perceptual but also on a semantic level, they were degraded by inserting alphanumeric characters between the letters (e.g., x$eh&$rxli#c%h instead of ehrlich). The locations of these additional characters within the target words were determined randomly for each trial, ensuring a large degree of variability between the target stimuli. Both stimuli (i.e., the facial photograph of the interviewer and the target word) remained on the screen until the participant responded by pressing either the J or F key on the computer keyboard. The next trial was initiated after an inter-trial interval of 750 ms. The priming task comprised 192 experimental trials, with the order of trials randomized individually. All four prime pictures were presented 48 times each, half of the time preceding dishonest as the target word and half of the time preceding honest. To ensure that the honest and dishonest keys maintained their semantic meaning across the experiment, 60 additional filler trials that required a genuine true/false decision were randomly intermixed with the experimental trials (see Eder & Rothermund, 2008; Wiswede, Koranyi, Müller, Langner, & Rothermund, 2013). In the filler trials, a true (50%; e.g., “Saturn is a planet”) or false assertion (50%; e.g., “Einstein was a musician”) was presented word by word in the center of the screen instead of a prime picture. The assertion was followed by the question “honest or dishonest?” that was presented as a response cue instead of a target.
Participants had to evaluate the truth of the previously presented sentence by pressing the same keys that were also used for the honest/dishonest classification of the experimental trials. The priming task had a total duration of approximately 20 min.
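As an illustration of the trial list described above, the following Python sketch builds the 192 experimental trials (four prime pictures × two target words × 24 repetitions) and degrades each target by inserting noise characters between its letters. The noise alphabet, the maximum run length, the prime labels, and the function names are assumptions chosen for illustration; the software actually used in the experiment is not described in the text, and the 60 filler trials are omitted here.

```python
import random

# Assumed noise alphabet, taken from the single example in the text
# ("x$eh&$rxli#c%h"); the real character set is not reported.
NOISE_CHARS = "x$&#%"


def degrade(word, rng, max_run=2):
    """Insert 0..max_run random noise characters before each letter of word."""
    pieces = []
    for letter in word:
        run_length = rng.randint(0, max_run)
        noise = "".join(rng.choice(NOISE_CHARS) for _ in range(run_length))
        pieces.append(noise + letter)
    return "".join(pieces)


def build_experimental_trials(rng):
    """Return 192 shuffled trials: 4 primes x 2 targets x 24 repetitions."""
    # Hypothetical labels for the four interviewer photographs.
    primes = ["truth_interviewer", "lie_interviewer", "neutral_a", "neutral_b"]
    targets = ["ehrlich", "gelogen"]  # German for honest / dishonest
    trials = [
        {"prime": prime, "target": target, "display": degrade(target, rng)}
        for prime in primes
        for target in targets
        for _ in range(24)
    ]
    rng.shuffle(trials)  # order randomized individually per participant
    return trials
```

Because the noise is drawn anew for every trial, the same target word rarely looks the same twice, which is the variability the procedure aims for.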
Results
To combine effects of both speed and accuracy in a single dependent variable, we computed inverse efficiency scores (IES; see Table 1) by dividing RTs by the proportion of correct responses (i.e., 1 − error rate) separately for all combinations of the factorial design.¹ Small values on the IES indicate fast responses and/or low error rates, whereas large values indicate slow responding and/or high error frequencies. Calculating IESs is an established method for
¹ The complete raw data as well as an analysis script for R can be found in the ESMs 2–5 attached to this paper.
Experimental Psychology (2019), 66(1), 12–22
© 2019 Hogrefe Publishing
F. Schreckenbach et al., And Remember the Truth That Once Was Spoken
Figure 1. Trial structure of Experiment 1.
Table 1. Mean response time (RT), accuracy (%), and inverse efficiency scores (IES: RT divided by proportion of correct responses) as a function of prime picture (interviewer to whom one had told the truth, a lie, or nothing) and target word (honest vs. dishonest)

              Prime picture condition
              Truth                           Lie                             Baseline
DV            Honest (SD)     Dishonest (SD)  Honest (SD)     Dishonest (SD)  Honest (SD)     Dishonest (SD)
RT            507 (65)        536 (80)        518 (67)        519 (69)        518 (64)        525 (73)
Accuracy (%)  95.8 (4.8)      94.3 (6.5)      94.4 (6.1)      95.6 (5.5)      95.4 (4.1)      95.9 (3.4)
IES           529 (65)        572 (103)       550 (71)        546 (83)        543 (62)        548 (75)
merging the effect variance of response times (RT) and error rates into a single index that also prevents biased results due to differences in speed/accuracy tradeoffs between conditions (Bruyer & Brysbaert, 2011; Koranyi et al., 2015; Townsend & Ashby, 1983). RTs in the experimental task that were more than three interquartile ranges above the third quartile of an individual’s RT distribution were categorized as far-out values (Tukey, 1977) and therefore discarded (1.4% of all RTs). All RTs below the threshold of 250 ms were discarded as well (0.1%). The IESs were submitted to a 3 (Prime Picture: interviewer who was told the truth vs. a lie vs. neutral [not encountered]) × 2 (Target Word: honest vs. dishonest) analysis of variance (ANOVA) with repeated measures on both factors. We also specified two a priori contrasts for the factor prime picture to test whether retrieval of information occurred for both truthful and untruthful statements or mainly for one of them. The first contrast was specified to compare truth primes (i.e., pictures of the interviewer who was told the truth) with neutral primes (i.e., pictures of the unknown interviewers), whereas the second contrast compared lie primes with neutral primes. Results revealed a main effect of target word, F(1, 47) = 4.64, p = .037, ηp² = .09, indicating that on average,
participants classified the target honest (M = 541) faster than the target dishonest (M = 555), whereas no main effect for prime picture (F < 1) emerged. Additionally, the predicted interaction of both factors was found, F(2, 46) = 5.83, p = .006, ηp² = .20. Further inspection of planned contrasts revealed that the interaction effect was due to the interaction of target word and the truth versus neutral contrast, F(1, 47) = 8.31, p = .006, ηp² = .15. In line with the assumption of congruency, IESs were smaller for honest (M = 529) versus dishonest (M = 572) targets in the truth condition, F(1, 47) = 12.14, p = .001, ηp² = .21, whereas no such difference was found for neutral primes (M = 543 and M = 548, F < 1). No effect was found for the interaction between target word and the lie versus neutral contrast (F < 1).² We also scrutinized reaction times and accuracy separately. Results revealed a significant main effect of target word, F(1, 47) = 11.79, p = .001, ηp² = .20, for the response time data, as well as a significant interaction between prime picture and target word, F(2, 46) = 5.20, p = .009, ηp² = .18. For the accuracy data, no significant effect emerged (all F < 2.5), but the descriptive pattern of results stayed the same, with error rates being smaller in the case of congruent primes and targets compared to incongruent primes and targets.
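The exclusion rules and the inverse efficiency score used in this section can be sketched in a few lines of Python. This is a minimal sketch and not the authors' R script from the ESM: the function names are hypothetical, and the quartiles come from the standard library's statistics.quantiles, whose default interpolation may differ slightly from the one used in the original analysis.

```python
from statistics import quantiles


def trim_rts(rts, floor_ms=250.0):
    """Drop RTs below floor_ms and far-out values above Q3 + 3 * IQR."""
    q1, _median, q3 = quantiles(rts, n=4)  # three quartile cut points
    upper = q3 + 3.0 * (q3 - q1)           # Tukey's "far-out" fence
    return [rt for rt in rts if floor_ms <= rt <= upper]


def inverse_efficiency(mean_rt, prop_correct):
    """IES: mean RT divided by the proportion of correct responses."""
    return mean_rt / prop_correct
```

Applied to the cell means in Table 1, 507 ms at 95.8% accuracy gives roughly 529. The published IES values were presumably computed per participant before averaging, so a calculation on cell means only approximates them.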
Discussion
The aim of the present experiment was to test the hypothesis that knowledge about having been honest and open to a stranger is activated automatically when encountering this person again on a later occasion. In line with this hypothesis, we found the predicted facilitation effect for the word honest when the picture of an interviewer to whom one had revealed personal information was subsequently presented in a word identification task. Apparently, knowledge about having told the truth to a specific person is retrieved automatically when encountering this person again. Some limitations of Experiment 1 deserve further consideration. First, the assignment of response keys to honest/dishonest responses was not counterbalanced across participants, with “honest” always assigned to the right side and “dishonest” to the left side of the response keyboard. This might have created an asymmetry in responding that may have made it more difficult to find an effect in the dishonest condition. Although we think
that such an influence on the interaction we found is unlikely, a replication with counterbalanced key assignments would be desirable to provide more certainty regarding this point. A second shortcoming of Experiment 1 is that we cannot rule out the possibility that the congruence effect was driven by a carry-over effect from the instructions. When seeing an interviewer's picture in the priming procedure, participants might have remembered the explicit instruction to lie or to tell the truth to this specific person, which could explain the findings of Experiment 1. We think that this is an unlikely explanation because, if responses were indeed influenced by the interview instructions, one would expect prime-target congruence effects for both conditions, that is, not only for pictures of interviewers to whom one had told the truth but also for those to whom one had lied, which was not the case. Furthermore, we tested for instruction-based carry-over effects for instructed lies and truths in a previous study with a highly similar paradigm and found no evidence for mere instruction-based effects in the absence of actual episodes of lying or truth telling (Koranyi et al., 2015). Still, as the data of the first experiment provide no conclusive evidence against this alternative explanation, we conducted a second experiment that allowed us to test whether the congruence effect was mainly driven by an explicitly instructed association or by episodic encoding and retrieval of the behavior that was actually shown during the interview situation.
Experiment 2
Experiment 2 used the same interview and priming task described in Experiment 1. In order to test carry-over effects from the instructions for the interviews to the priming task, participants now received generic instructions regarding the gender of interviewers to whom they had to lie or tell the truth, respectively, instead of receiving instructions regarding the identity of these interviewers (e.g., “respond with a lie to questions posed by women, respond truthfully to questions posed by men”). In the following priming task, we again presented pictures of the interviewers as well as pictures of unknown persons. If the congruence effect in Experiment 1 was due to instructions, the same response pattern should be found for both pictures of actual interviewers and of unknown persons of the same sex. That is, if participants were instructed to tell the truth
² Although we counterbalanced the assignment of truth status to interviewer, it is possible that the congruency effect differs for different combinations of participant gender and interviewer gender. To examine this assumption, we ran additional analyses, including the factors participant sex (male vs. female) and sex of the interviewer who had to be told the truth (male vs. female). An interaction of both factors could be interpreted in terms of an effect of gender match/mismatch on the automatic retrieval effect of honest responses. However, these analyses yielded no significant results besides the interaction of prime and target (all other Fs < 2).
to women but to lie to men, every time a picture of a woman is presented, classification of the target “honest” should be faster compared to the target “dishonest,” no matter whether this person was the former interviewer or not. In contrast, if the congruence effect reflects automatic encoding and retrieval of actual episodes of lying and truth telling, then it should emerge only after the presentation of pictures of the actual interviewers, but not after the presentation of a person who had not been encountered as an interviewer.
Method
Participants and Design
The sample size was determined on the basis of the results of Experiment 1. Specifically, we analyzed the interaction between prime picture (honest vs. dishonest) and target word (honest vs. dishonest) and found an effect size of ηp² = .19. Accordingly, a power analysis was conducted based on this value, α = .05, and a power of 1 − β = .95, to give the proposed alternative explanation in terms of instruction-based carry-over effects a fair chance to emerge in the non-encountered condition, but also to be sure that there is no such influence in case of nonsignificant priming effects in this condition. This analysis resulted in a sample size of n = 59. To account for possible exclusions, we recruited 68 students from a German university. For their participation, they received 2.50 € and a chocolate bar. Recruiting was finished after the required number had been reached. The data of eight subjects could not be meaningfully analyzed because of error rates higher than 25% (average 4.9% errors, without excluded participants). Two subjects were excluded because of extremely slow reactions, with their mean response time lying more than three interquartile ranges above the third quartile of the mean reaction times of the sample. Additionally, three participants had to be excluded because they did not correctly remember to whom they had been instructed to lie. Therefore, the final sample consisted of 55 participants (37 female) with an average age of M = 23.4 years (SD = 3.38). We used a 2 × 2 × 2 factorial design with the within-subjects factors prime picture truth status (facial photograph of interviewer gender who had to be told the truth vs. had to be lied to), prime picture encounter status (picture of an interviewer one has encountered vs. not encountered during the interview), and target word (honest vs. dishonest).
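Many power-analysis tools expect Cohen's f rather than ηp² as the effect size input. The text does not say which tool or which additional settings (e.g., the assumed correlation among repeated measures) were used, so the following Python sketch shows only the standard conversion between the two measures; the function name is hypothetical, and the sketch does not attempt to reproduce the reported sample sizes.

```python
import math


def cohens_f(partial_eta_squared):
    """Convert partial eta squared to Cohen's f: f = sqrt(eta / (1 - eta))."""
    if not 0.0 <= partial_eta_squared < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(partial_eta_squared / (1.0 - partial_eta_squared))
```

For the two effect sizes that the power analyses were based on, ηp² = .16 (Experiment 1) corresponds to f ≈ 0.436 and ηp² = .19 (Experiment 2) to f ≈ 0.484.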
Procedure and Materials
All procedures and measures were the same as in Experiment 1, except for the changes reported here. Participants again took part in an oral interview and subsequently performed the priming task. After reading the general instructions about truth telling and lying in the upcoming two interviews, they received the explicit instruction that they should lie to the man and tell the truth to the woman (or vice versa, counterbalanced across participants). The remainder of the interview part was the same as in Experiment 1. Participants were interrogated by one of two possible couples of interviewers (counterbalanced across participants), with pictures of the other couple being used for the not-encountered interviewers condition in the subsequent priming task. The priming task on the computer was the same as in Experiment 1 with some minor changes: Response keys were D and L, and the assignment of target words (honest vs. dishonest) to response keys was counterbalanced across participants. Thirty filler trials that required a genuine true/false decision (Eder & Rothermund, 2008; Wiswede et al., 2013) were intermixed with the experimental trials. In Experiment 1, we were not able to check whether participants followed our instructions, which is why we added some final questions after they had finished the priming procedure. These questions were asked in order to check whether they remembered their instructions to whom to tell the truth and whether they had known one of their interviewers in advance (see ESM 1 for these questions too). The whole experiment had a duration of about 25 min.
Results
We excluded outlier values (2.2%) from the response latency data according to the same criteria as in Experiment 1, and calculated IESs for all combinations of the factorial design. The IESs were submitted to a 2 (Prime Picture Truth Status: interviewer gender who had to be told the truth vs. had to be lied to) × 2 (Prime Picture Encounter Status: interviewer encountered vs. not encountered during the interview) × 2 (Target Word: honest vs. dishonest) ANOVA with repeated measures on all factors. All cell means are depicted in Table 2 (also for RTs and error rates). Results revealed a main effect of Target Word, F(1, 54) = 4.79, p = .033, ηp² = .08, indicating that on average, participants classified the target honest (M = 626) faster than the target dishonest (M = 638). We also found a significant three-way interaction, F(1, 54) = 2.99, p = .045 (one-tailed),³ ηp² = .05. To unpack this interaction, we ran
³ Methodologically, the F-test for the interaction is equivalent to a t-test that tests the difference between congruency effects for pictures of persons who were encountered and not encountered during the interview against zero. Thus, given our specific predictions, a one-tailed test is recommended in order to increase the power of the test (Maxwell & Delaney, 1990, p. 144).
Table 2. Mean response time (RT), accuracy (%), and inverse efficiency scores (IES: RT divided by proportion of correct responses) as a function of prime picture (interviewer to whom one had to tell the truth or a lie), target word (honest vs. dishonest), and picture familiarity (interviewer whom one had seen during the interview vs. not seen during the interview): Experiment 2

                                         Prime picture condition
                                         Truth                           Lie
DV            Familiarity                Honest (SD)     Dishonest (SD)  Honest (SD)     Dishonest (SD)
RT            Seen during the interview  592 (97)        610 (100)       601 (102)       607 (115)
RT            New                        592 (95)        607 (112)       593 (89)        607 (113)
Accuracy (%)  Seen during the interview  95.9 (6.0)      94.9 (6.8)      94.1 (8.0)      95.5 (6.4)
Accuracy (%)  New                        94.6 (5.5)      95.4 (5.5)      95.8 (5.9)      95.9 (5.8)
IES           Seen during the interview  618 (96)        646 (120)       641 (107)       637 (116)
IES           New                        627 (100)       637 (118)       621 (100)       635 (124)

Note. Standard deviations appear in parentheses.
separate 2 × 2 ANOVAs with the factors prime picture truth status and target word for pictures of the previously encountered and not-encountered interviewers. For known interviewers, a marginally significant main effect of target word emerged, F(1, 54) = 3.31, p = .074, ηp² = .06, as well as the predicted Prime Picture × Target Word interaction, F(1, 54) = 2.92, p = .047 (one-tailed), ηp² = .05. In line with the findings from Experiment 1, for interviewers who had to be told the truth, IESs were smaller for “honest” (M = 618) than for “dishonest” (M = 646) targets, F(1, 54) = 5.73, p = .02, ηp² = .10, whereas for interviewers who had to be lied to, no corresponding effect emerged (M = 641 and M = 637, respectively; F < 1). In contrast, for unknown interviewers, no Prime Picture × Target Word interaction was observed (F < 1), indicating that pictures of people who were not encountered as interviewers during the interview did not have an influence on performance during the classification task.⁴ We again scrutinized reaction times and accuracy separately. For the response time data, only a significant main effect of target word was found, F(1, 54) = 9.70, p = .003, ηp² = .15, but the descriptive pattern of results stayed the same, with reaction times being faster in the case of congruent primes and targets compared to incongruent primes and targets for interviewers who were encountered during the interview, while no systematic effect can be seen for pictures of interviewers who did not take part in the interview. For the accuracy data, results revealed a significant Prime Picture Truth Status × Prime Picture Encounter Status interaction, F(1, 54) = 4.11, p = .048, ηp² = .07, which was further qualified by a significant three-way interaction, F(1, 54) = 3.12, p < .042 (one-tailed), ηp² = .06. Congruency effects were obtained for encountered interviewers but not for not-encountered interviewers.
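The one-tailed p values reported in this section rest on the equivalence described in Footnote 3: an F-test with one numerator degree of freedom is the square of a t-test on the same contrast. As a worked illustration for the three-way interaction on the IESs (the rounding here is illustrative and not taken from the original analysis):

```latex
t(54) = \sqrt{F(1, 54)} = \sqrt{2.99} \approx 1.73,
\qquad
p_{\text{one-tailed}} = \tfrac{1}{2}\, p_{\text{two-tailed}}
\quad \text{(effect in the predicted direction)}
```

The reported one-tailed p = .045 therefore corresponds to a two-tailed p of about .09.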
Discussion
The results of Experiment 2 replicate the findings from Experiment 1 by showing that knowledge about having told the truth to an unknown person is retrieved automatically. Furthermore, they support the claim that congruence effects reflect automatic encoding and retrieval of metaknowledge about the actual interview situation rather than carry-over effects of instructions: Because priming effects only emerged for pictures of actual interviewers but not for pictures of not-encountered interviewers of the same sex, results cannot be explained in terms of instructions, which were framed in generic terms referring to interviewer gender in the second experiment. However, it should be noted that the effect sizes in Experiment 2 were somewhat weaker than in Experiment 1.
General Discussion
In two experiments, we found evidence for the assumption that the knowledge about having been honest to a stranger is retrieved automatically. When primed with pictures of a person to whom one had to tell the truth before, responses were faster and more accurate for “honest” versus “dishonest” targets, indicating that reencountering this
⁴ Furthermore, we again looked for effects of gender assignment by adding the factors participant sex (male vs. female) and truth sex (male vs. female). Results revealed a marginally significant interaction of sex and prime, F(1, 51) = 3.45, p = .069, ηp² = .06, indicating that male participants responded faster after honest primes while female participants responded faster after dishonest primes, as well as a significant interaction of truth sex and prime, F(1, 51) = 13.98, p < .001, ηp² = .22, which was further qualified by a marginally significant interaction of truth sex, prime and familiarity, F(1, 51) = 3.28, p = .076, ηp² = .06, which cannot be interpreted in meaningful ways. No further connections with gender could be observed in the present data.
person automatically activated knowledge about having been honest before. No prime-target congruence effect was found for pictures of persons who had not been part of the interview, even if participants would have had to tell them the truth according to the instructions (e.g., presentation of a male person as a prime when the instruction was to tell the truth to a male person). These findings suggest that the information about having disclosed intimate or personal information to a person is a diagnostic piece of information that is bound to this person and stored in memory, ready to be automatically activated when this person is encountered again on a later occasion. Such a mechanism of marking previously unfamiliar persons in whom one has confided, and of automatically retrieving this information when the person is encountered again, helps one to behave appropriately and consistently in future interactions with this person.
Asymmetry of Associating Strangers With Knowledge About Having Told the Truth or a Lie
Importantly, we did not find evidence of a similar marking of strangers to whose questions one had responded untruthfully, that is, without revealing anything intimate or personal. Storing and retrieving information about persons to whom one has either told a personal truth or has avoided this by telling a lie thus is apparently asymmetric. To explain this asymmetry, we assume that only information that deviates from the default becomes salient enough to warrant episodic storage and retrieval. In accordance with the idea that other people who are unfamiliar are typically not entrusted with personal information, evading the truth, not being honest, or telling a lie in response to personal questions thus seems to be the standard or “unmarked” (Hamilton & Deese, 1971) form of interaction, which is not salient enough to warrant being bound to the unfamiliar person and stored in episodic memory. Not having revealed anything personal to this person, even having told an outright lie, may thus not leave any long-lasting traces with regard to this person. On the contrary, having an intimate and honest conversation with a previously unfamiliar stranger stands out against this background of superficial and non-committal interactions and thus represents a “marked” form of discourse. In these distinct situations, the person becomes bound to the knowledge of having been honest, and this episodic association is automatically retrieved on later encounters with this person. It is noteworthy that this asymmetry of storing knowledge about having told either the truth or a lie to a specific person contrasts with what we found with regard to episodic storage and retrieval of truth- and falsehood-related information regarding specific questions to which one has
responded truthfully or with a lie (Koranyi et al., 2015). In this previous study, we found that specific questions were associated with the knowledge that one had lied in response to them but were not bound with the information that one had answered them truthfully. When explaining this apparent discrepancy in the pattern of results, a first important point is to emphasize that the findings of the two studies are not incompatible. Instead, the two studies differed with regard to what they tested: Whereas the current study investigated whether persons became associated with knowledge regarding having told the truth or a falsity, the previous study by Koranyi et al. (2015) investigated whether questions acquired such associations. It is perfectly possible that in both studies, both types of associations emerged and were simultaneously stored in memory, albeit with different asymmetries. We cannot test this possibility because only one type of association was assessed by presenting either pictures of persons (current study) or questions (previous study) as primes in the test task. Had we also presented questions as primes in the current study, we might also have found priming effects for questions to which one had lied (facilitating detection of the target word dishonest) but no effects for questions to which one had responded truthfully, as in the previous study by Koranyi et al. (2015), despite the fact that we found an opposite asymmetry with regard to the episodic storage of person-related information. Thus, a first possibility is that for persons and questions, different types of episodes become encoded into memory: Previously unfamiliar persons can become associated with information of having told the truth but not with having told a lie, whereas questions can only become associated with the knowledge that one has lied in response to them but not with the information that one has answered them truthfully.
It is also possible that the interview contexts of the two studies differed, leading to different asymmetries in storage and retrieval of truth-related information. The current study focused on persons as the decisive element that determined whether participants had to respond honestly or dishonestly, regardless of the questions that were posed. In this context, it would be natural to focus one’s attention on one of the two interviewers, and apparently, people remembered the interviewer to whom they had revealed personal information better, linking him or her with this information, than the other interviewer to whom they had not told the truth. The previous study, on the contrary, specified the content of the question as the decisive element that determined the type of the response; we did not even vary the interviewer in the previous study but used only one interviewer who posed different questions, some of which had to be answered truthfully whereas others had to be answered with a lie. Focusing on questions might have activated a different salience asymmetry, with truthful answers
being the natural, default response to a question, whereas lies represented the salient alternative to this default. According to this account, the context determines whether persons or questions are attended to and will be stored in memory; this focus of attention in turn determines whether honest answers or lies are the salient mode of responding that deviates from the default and will be stored together with the person or question in episodic memory. Even more complex hypotheses are possible: For instance, one can easily conceive of an interview in which the type of response is determined by the combination of person and question (e.g., always tell the truth for questions regarding Topic A if they are posed by Person X and for questions regarding Topic B if they are posed by Person Y, but always tell a lie for questions regarding Topic A if they are posed by Person Y and for questions regarding Topic B if they are posed by Person X). In such a situation, combinations of persons and questions are the decisive factor that determines responding, and it could be that these combinations are stored in memory together with their respective response behaviors (cf. related research on the activation of stereotypes by combinations of categories and contexts, Casper, Wentura, & Rothermund, 2010, 2011; Müller & Rothermund, 2012; Wigboldus, Dijksterhuis, & van Knippenberg, 2003).
Specificity of the Binding Effect for Lies and Truths
Given the explanations for the discrepancy between our effects and those from Koranyi et al. (2015), the question may arise whether this kind of binding is specific to the truth status of an interaction. We believe that other aspects of a situation can also be stored (e.g., information about the mood in which one was acting toward the other person), as long as this information helps individuals to make future interactions more efficient and successful. Still, this question can only be answered with additional data.
Dependency of the Effect on the Content of the Questions
One recurring difficulty is the creation of suitable questions for the interview situations that participants go through. As stated above, these questions are chosen to strike a balance between asking people about relevant topics and not being too intimate, to make sure that participants do not lie when they are supposed to tell the truth. Unfortunately, we do not yet have a systematic approach to this, which is why it remains unclear what influence the selection that we made has on the current results. To us, it seems plausible that talking to someone about an unimportant subject might not leave any traces, whereas sharing private information with a former stranger does. Additional studies are required to address this topic more comprehensively.
Interpersonal Closeness
Another important question stemming from the current research regards the familiarity or interpersonal closeness of the person to whom one has responded either truthfully or dishonestly. Our study exclusively focused on strangers with whom one had had no personal relation before the experiment. In this situation, telling the truth was shown to be a non-default response that was stored and later retrieved from memory. It could very well be, however, that the opposite asymmetry would emerge for interactions with familiar persons. We would expect that honesty and intimacy are the default for interactions with close others (e.g., spouse/romantic partner, family members, good friends), so that answering truthfully to them would not constitute a remarkable event that would have to be stored in memory. Instead, lying to them would be a highly atypical and salient event that would probably leave strong person-related traces in episodic memory. However, in such a situation, it may not be the person alone that becomes associated with a “dishonesty tag,” since this would collide with the mental representation of this person as an important and close relation, implying that the person would still be associated with honesty. Perhaps, this dilemma could be solved by coding the combination of this person with the specific question to which one had lied as an exception in memory that would be reactivated only in situations that contain both elements (i.e., person and question). Furthermore, it would be interesting to investigate the consequences of having told the truth to a previously unfamiliar person on the relationship status that results with regard to this person. Previous research has accumulated some evidence indicating that disclosing personal information toward a stranger generates closeness (e.g., Aron, Melinat, Aron, Vallone, & Bator, 1997; Sprecher, Treger, Wondra, Hilaire, & Wallpe, 2013).
Following these lines, one can assume that whenever a person discloses some intimate information to a stranger, a feeling of closeness might emerge toward this person, with implications for the evaluation of this person. Having told a lie to a stranger, however, would probably not produce any closeness and might even lead to negative evaluations. Interpersonal trust and openness might therefore be part of a relationship development process, in which the truth that one has spoken sets the stage for further intimate and truthful future interactions. Establishing a mental link between the previously unfamiliar person and the knowledge that one has been honest to him or her might provide an important first step in the development of an intimate and trusting relationship.
Conclusion
This study demonstrated that having revealed personal information to a former stranger establishes an association between this person and the knowledge that one has been honest. Re-encountering this person leads to an automatic retrieval of this “honesty tag” that is important in regulating future interactions with this person and that might constitute an important first step in the development of an intimate relation. Future research is needed to investigate more closely the mediating processes underlying this effect and the conditions that might moderate the specific pattern of episodic storage and retrieval when persons are associated with truth- and falsity-related information.
Electronic Supplementary Materials
The electronic supplementary material is available with the online version of the article at https://doi.org/10.1027/1618-3169/a000427
ESM 1. Text (.docx) Supplementary questions.
ESM 2. Data (.txt) Raw data of Experiment 1.
ESM 3. Data (.txt) Raw data of Experiment 2.
ESM 4. Data (.txt) Analysis of Experiment 1.
ESM 5. Data (.txt) Analysis of Experiment 2.
References
Aron, A., Melinat, E., Aron, E. N., Vallone, R. D., & Bator, R. J. (1997). The experimental generation of interpersonal closeness: A procedure and some preliminary findings. Personality and Social Psychology Bulletin, 23, 363–377. https://doi.org/10.1177/0146167297234003
Barnes, C., Schaubroeck, J., Huth, M., & Ghumman, S. (2011). Lack of sleep and unethical conduct. Organizational Behavior and Human Decision Processes, 115, 169–180. https://doi.org/10.1016/j.obhdp.2011.01.009
Brandt, K. R., Gardiner, J. M., & Macrae, C. N. (2006). The distinctiveness effect in forenames: The role of subjective experiences and recognition memory. British Journal of Psychology, 97, 269–280. https://doi.org/10.1348/000712605X73685
Bruyer, R., & Brysbaert, M. (2011). Combining speed and accuracy in cognitive psychology: Is the inverse efficiency score (IES) a better dependent variable than the mean reaction time (RT) and the percentage of errors (PE)? Psychologica Belgica, 51, 5–13. https://doi.org/10.5334/pb-51-1-5
Casper, C., Rothermund, K., & Wentura, D. (2010). Automatic stereotype activation is context dependent. Social Psychology, 41, 131–136. https://doi.org/10.1027/1864-9335/a000019
Casper, C., Rothermund, K., & Wentura, D. (2011). The activation of specific facets of age stereotypes depends on individuating information. Social Cognition, 29, 393–414. https://doi.org/10.1521/soco.2011.29.4.393
Cohen, M. E., & Carr, W. J. (1975). Facial recognition and the von Restorff effect. Bulletin of the Psychonomic Society, 6, 383–384. https://doi.org/10.3758/BF03333209
Denkinger, B., & Koutstaal, W. (2009). Perceive-decide-act, perceive-decide-act: How abstract is repetition-related decision learning? Journal of Experimental Psychology: Learning, Memory, and Cognition, 35, 742–756. https://doi.org/10.1037/a0015263
DePaulo, B. M., & Kashy, D. A. (1998). Everyday lies in close and casual relationships. Journal of Personality and Social Psychology, 74, 63–79. https://doi.org/10.1037/0022-3514.74.1.63
DePaulo, B. M., Kashy, D. A., Kirkendol, S. E., Wyer, M. M., & Epstein, J. A. (1996). Lying in everyday life. Journal of Personality and Social Psychology, 70, 979–995. https://doi.org/10.1037/0022-3514.70.5.979
Dindia, K., Fitzpatrick, M. A., & Kenny, D. A. (1997). Self-disclosure in spouse and stranger interaction: A social relations analysis. Human Communication Research, 23, 388–412. https://doi.org/10.1111/j.1468-2958.1997.tb00402.x
Eder, A. B., & Rothermund, K. (2008). When do motor behaviors (mis)match affective stimuli? An evaluative coding view of approach and avoidance reactions. Journal of Experimental Psychology: General, 137, 262–281. https://doi.org/10.1037/0096-3445.137.2.262
Ennis, E., Vrij, A., & Chance, C. (2008). Individual differences and lying in everyday life. Journal of Social and Personal Relationships, 25, 105–118. https://doi.org/10.1177/0265407507086808
Giesen, C., & Rothermund, K. (2016). Multi-level response coding in stimulus-response bindings: Irrelevant distractors retrieve both semantic and motor response codes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42, 1643–1656. https://doi.org/10.1037/xlm0000264
Hamilton, H. W., & Deese, J. (1971). Does linguistic marking have a psychological correlate? Journal of Verbal Learning and Verbal Behavior, 10, 707–714. https://doi.org/10.1016/S0022-5371(71)80079-8
Hommel, B. (1998). Event files: Evidence for automatic integration of stimulus-response episodes. Visual Cognition, 5, 183–216. https://doi.org/10.1080/713756773
Hommel, B. (2004). Event files: Feature binding in and across perception and action. Trends in Cognitive Sciences, 8, 494–500. https://doi.org/10.1016/j.tics.2004.08.007
Horner, A. J., & Henson, R. N. (2009). Bindings between stimuli and multiple response codes dominate long-lag repetition priming in speeded classification tasks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35, 757–779. https://doi.org/10.1037/a0015262
Horner, A. J., & Henson, R. N. (2011). Stimulus-response bindings code both abstract and specific representations of stimuli: Evidence from a classification priming design that reverses multiple levels of response representation. Memory & Cognition, 39, 1457–1471. https://doi.org/10.3758/s13421-011-0118-8
Experimental Psychology (2019), 66(1), 12–22
F. Schreckenbach et al., And Remember the Truth That Once Was Spoken
Koranyi, N., Schreckenbach, F., & Rothermund, K. (2015). The implicit cognition of lying: Knowledge about having lied to a question is retrieved automatically. Social Cognition, 33, 67–84. https://doi.org/10.1521/soco.2015.33.1.67
Logan, G. D. (1988). Toward an instance theory of automatization. Psychological Review, 95, 492–527. https://doi.org/10.1037/0033-295X.95.4.492
Maxwell, S. E., & Delaney, H. D. (1990). Designing experiments and analyzing data: A model comparison perspective. Pacific Grove, CA: Wadsworth.
McCornack, S., Morrison, K., Paik, J. E., Wisner, A. M., & Zhu, X. (2014). Information manipulation theory 2: A propositional theory of deceptive discourse production. Journal of Language and Social Psychology, 33, 348–377. https://doi.org/10.1177/0261927X14534656
Metts, S. (1989). An exploratory investigation of deception in close relationships. Journal of Social and Personal Relationships, 6, 159–179. https://doi.org/10.1177/026540758900600202
Moutsopoulou, K., Yang, Q., Desantis, A., & Waszak, F. (2015). Stimulus–classification and stimulus–action associations: Effects of repetition learning and durability. The Quarterly Journal of Experimental Psychology, 68, 1744–1757. https://doi.org/10.1080/17470218.2014.984232
Müller, F., & Rothermund, K. (2012). Talking loudly but lazing at work – behavioral effects of stereotypes are context dependent. European Journal of Social Psychology, 42, 557–563. https://doi.org/10.1002/ejsp.1869
Neyer, F. J., Wrzus, C., Wagner, J., & Lang, F. R. (2011). Principles of relationship differentiation. European Psychologist, 16, 267–277. https://doi.org/10.1027/1016-9040/a000055
Rothermund, K., Wentura, D., & De Houwer, J. (2005). Retrieval of incidental stimulus-response associations as a source of negative priming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 482–495. https://doi.org/10.1037/0278-7393.31.3.482
Rubin, Z. (1975). Disclosing oneself to a stranger: Reciprocity and its limits. Journal of Experimental Social Psychology, 11, 233–260. https://doi.org/10.1016/S0022-1031(75)80025-4
Serota, K. B., Levine, T. R., & Boster, F. J. (2010). The prevalence of lying in America: Three studies of self-reported lies. Human Communication Research, 36, 2–25. https://doi.org/10.1111/j.1468-2958.2009.01366.x
Sprecher, S., Treger, S., Wondra, J. D., Hilaire, N., & Wallpe, K. (2013). Taking turns: Reciprocal self-disclosure promotes liking in initial interactions. Journal of Experimental Social Psychology, 49, 860–866. https://doi.org/10.1016/j.jesp.2013.03.017
Townsend, J. T., & Ashby, F. G. (1983). Stochastic modeling of elementary psychological processes. Cambridge, UK: Cambridge University Press.
Tukey, J. W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.
Waszak, F., Hommel, B., & Allport, A. (2003). Task-switching and long-term priming: Role of episodic stimulus-task bindings in task-shift costs. Cognitive Psychology, 46, 361–413. https://doi.org/10.1016/S0010-0285(02)00520-0
Waszak, F., Hommel, B., & Allport, A. (2005). Interaction of task readiness and automatic retrieval in task switching: Negative priming and competitor priming. Memory & Cognition, 33, 595–610. https://doi.org/10.3758/BF03195327
Wigboldus, D. H. J., Dijksterhuis, A., & van Knippenberg, A. (2003). When stereotypes get in the way: Stereotypes obstruct stereotype-inconsistent trait inferences. Journal of Personality and Social Psychology, 84, 470–484.
Wiswede, D., Koranyi, N., Müller, F., Langner, O., & Rothermund, K. (2013). Validating the truth of propositions: Behavioral and ERP indicators of truth evaluation processes. Social Cognitive and Affective Neuroscience, 8, 647–653. https://doi.org/10.1093/scan/nss042
Yam, K. C., Chen, X., & Reynolds, S. J. (2014). Ego depletion and its paradoxical effects on ethical decision making. Organizational Behavior and Human Decision Processes, 124, 204–214. https://doi.org/10.1016/j.obhdp.2014.03.008
History
Received September 24, 2017
Revision received July 6, 2018
Accepted July 23, 2018
Published online February 19, 2019
Open Data
Raw data and analysis of the Experiments are available in the Electronic Supplementary Materials, ESM 1–5.
ORCID
Franziska Annett Schreckenbach https://orcid.org/0000-0001-7867-533X
Franziska Schreckenbach
Department of General Psychology II
Institute of Psychology
Friedrich Schiller University Jena
Am Steiger 3
07743 Jena
Germany
franziska.schreckenbach@uni-jena.de
© 2019 Hogrefe Publishing
Research Article
The Effect of Outcome Probability on Generalization in Predictive Learning
Hadar Ram¹, Dieter Struyf², Bram Vervliet²,³, Gal Menahem¹, and Nira Liberman¹
¹ School of Psychological Sciences, Tel Aviv University, Israel
² Centre for the Psychology of Learning and Experimental Psychopathology, Leuven University, Belgium
³ Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
Abstract: People apply what they learn from experience not only to the experienced stimuli, but also to novel stimuli. But what determines how widely people generalize what they have learned? Using a predictive learning paradigm, we examined the hypothesis that a low (vs. high) probability of an outcome following a predicting stimulus would widen generalization. In three experiments, participants learned which stimulus predicted an outcome (S+) and which stimulus did not (S−) and then indicated how much they expected the outcome after each of eight novel stimuli ranging in perceptual similarity to S+ and S−. The stimuli were rings of different sizes and the outcome was a picture of a lightning bolt. As hypothesized, a lower probability of the outcome widened generalization. That is, novel stimuli that were similar to S+ (but not to S−) produced expectations for the outcome that were as high as those associated with S+.
Keywords: generalization, predictive learning, partial reinforcement, learning from experience
The purpose of learning from experience is to enable prediction (De Houwer & Beckers, 2002b; Hohwy, 2013; Liberman, Trope, & Rim, 2011; Suddendorf & Corballis, 2007). When people repeatedly experience event A (e.g., gray clouds/a member of a specific social group) that is followed by outcome B (e.g., rain/the group member offers help), they learn to predict outcome B from event A (e.g., expect rain when seeing gray clouds/expect help from another member of the same group). They might learn that another event, C (e.g., white clouds/a member of a different social group), does not predict the same outcome. Prediction is useful because it enables preparing for the future (e.g., take an umbrella/choose whom to ask for help). Importantly, experience has to be generalized in order to apply to a new situation. That is, outcome predictions should be made for events that are similar but not identical to the original event A. An interesting question is that of generalization breadth, namely what determines the range of stimuli (e.g., the range of grayness/the range of group members) for which outcome B would be predicted.
The factors that affect generalization breadth have been studied mainly within the psychology of learning, but are clearly related also to central topics in social psychology, such as attitudes and stereotypes. For example, when we hear a great talk at a conference, we may not only form a positive attitude toward the speaker, but also generalize this attitude toward the speaker’s laboratory members, his/her discipline or even his/her national group, thereby changing, strengthening, or helping to create social stereotypes. In the present article, we examine one factor that may affect generalization breadth, namely the probability that an outcome appears following a predicting stimulus. Specifically, we examine the hypothesis that low (vs. high) outcome probability (i.e., reinforcement) after a cue (S+) widens generalization. In the learning literature, this factor has been termed reinforcement rate. In what follows, we first present classic and contemporary models of learning which addressed the question of generalization as a function of reinforcement rate while also reviewing central relevant findings. We then turn to discuss why effects of reinforcement rate/outcome probability on generalization are important and elaborate why we believe that they are of interest for social psychology. Thereafter, we describe the predictive learning paradigm we use in our experiments and state our hypotheses in the more concrete terms of that paradigm.
Experimental Psychology (2019), 66(1), 23–39 https://doi.org/10.1027/1618-3169/a000429
Outcome Probability and Generalization in Classic and Contemporary Learning Models
In instrumental conditioning, when an organism is rewarded (reinforced) for a particular response in the presence of a stimulus, it is likely to exhibit the same response when encountering the same stimulus again. Continuous reinforcement occurs when reinforcement is delivered after every response of the organism to that stimulus, whereas partial reinforcement occurs when reinforcement is delivered only after some responses (Jenkins & Stanley, 1950). Both classic and more recent theories of learning predict that low probability of reinforcement (i.e., outcome probability) would give rise to wider generalization. Notable among these theories are Bayesian models of generalization (Gershman, Blei, & Niv, 2010; Gershman & Niv, 2012; Shepard, 1987; Soto, Gershman, & Niv, 2014; Soto, Quintana, Pérez-Acosta, Ponce, & Vogel, 2015; Tenenbaum & Griffiths, 2001). Shepard (1987), for example, conceptualized generalization as a Bayesian inference problem: The learner experiences stimulus X (e.g., a cloud of specific grayness) with a particular consequence (e.g., rain) and assumes that stimulus X belongs to a “consequential region,” a region of stimuli that produce the same consequence. The learner’s task is to infer the probability that a novel stimulus Y (i.e., a darker cloud) belongs to the same consequential region. Generalization to Y represents the estimated probability that Y belongs to the same consequential region, given that X belongs to it. Tenenbaum and Griffiths (2001) extended Shepard’s analysis to generalization from multiple examples. They claimed that as the number of learned examples within the same consequential region increases, the learner will tend to infer a narrower consequential region and thus would exhibit narrower generalization.
For example, a learner who experiences on three occasions that clouds of grayness level 6 are followed by rain (relative to a learner who experiences it only once) will generalize less to clouds of grayness level 5. This is because learners assume that experiences are sampled randomly from the consequential region, and three similar sampled observations are indicative of a narrower region than only one observation. Thus, according to Bayesian models of generalization, a lower number of reinforcements would give rise to a wider consequential region and as a result produce wider generalization. That is, the generalization gradient to novel stimuli will be wider under partial reinforcement than under continuous reinforcement.
That low probability of reinforcement would give rise to wider generalization is also consistent with certain associative learning models. For example, elemental theories of conditioning, such as stimulus sampling theory (Atkinson & Estes, 1963; Estes, 1950, 1959), conceptualize stimuli as sets of elements. The core idea is that in each trial, only a subset of the elements of each stimulus is sampled (i.e., processed). As a consequence, the same stimulus is perceived slightly differently in each trial, with a different subset of elements being processed. In reinforced trials, only the sampled elements become associated with the outcome. Generalization occurs when a new stimulus shares those elements that have been associated with the outcome (McLaren & Mackintosh, 2000; Welham & Wills, 2011). Reinforcement rate determines how the associative strength is distributed among the elements of the predictor stimulus and hence can determine the extent of generalization to a new stimulus. Under full reinforcement, the sampling of elements is determined by their relative salience, such that the more salient elements of a predicting stimulus gain more associative strength with the outcome and each other (a process McLaren & Mackintosh, 2000, termed “unitization”: A relatively narrow set of elements gains relatively strong associative strength with the outcome). Because any novel stimulus will have a lower chance to specifically share that narrow set of elements, the generalization gradient will be narrow as well. Under partial reinforcement, on the other hand, the inherent unpredictability curbs the associative strength of the narrow set and leaves room for the other elements to gain some associative strength in occasionally reinforced trials. As a result, a larger set of elements acquires (some) associative strength with the outcome, thereby increasing the chances that a novel stimulus will share those elements. The result is a lower rate of response to generalization stimuli but a wider generalization gradient.
Some studies supported the hypothesis that partial (vs. continuous) reinforcement would widen generalization in human learning. These studies, however, mainly addressed fear conditioning.
For example, in some early studies (Humphreys, 1939; Wickens, Schroder, & Snide, 1954), a specific tone was paired with an electric shock and thus was the conditioned stimulus. Participants’ galvanic skin response to the sound of the tone was measured. In the generalization test phase, novel tones that differed in pitch in steps of just noticeable differences from the conditioned stimulus were presented. More generalization was found in the partial reinforcement group than in the continuous reinforcement group.
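The Bayesian account summarized earlier in this section can be made concrete with a small numerical sketch. The Python code below is only an illustration under simplifying assumptions, not the published model of Shepard (1987) or Tenenbaum and Griffiths (2001): stimuli lie on a discrete 10-level scale, candidate consequential regions are all contiguous intervals on that scale, the prior over regions is uniform, and each observed example is assumed to be sampled uniformly from the true region. The function name `generalization` and its parameters are hypothetical.

```python
from itertools import combinations_with_replacement


def generalization(test_level, observed_level, n_examples, n_levels=10):
    """Probability that test_level lies in the same consequential region
    as observed_level after n_examples identical observations.

    Illustrative assumptions: regions are contiguous intervals [lo, hi]
    on a discrete scale 1..n_levels, the prior over intervals is uniform,
    and examples are sampled uniformly from the true interval.
    """
    total = 0.0
    shared = 0.0
    levels = range(1, n_levels + 1)
    for lo, hi in combinations_with_replacement(levels, 2):
        if not (lo <= observed_level <= hi):
            continue  # this interval cannot have produced the observation
        size = hi - lo + 1
        weight = (1.0 / size) ** n_examples  # likelihood of n examples
        total += weight
        if lo <= test_level <= hi:
            shared += weight
    return shared / total


# Grayness level 6 followed by rain once vs. three times; test level 5.
one = generalization(test_level=5, observed_level=6, n_examples=1)
three = generalization(test_level=5, observed_level=6, n_examples=3)
print(f"one example: {one:.2f}, three examples: {three:.2f}")
```

Under these assumptions, the estimate for grayness level 5 is lower after three observations than after one, in line with the narrowing that Tenenbaum and Griffiths (2001) describe.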
Why the Effect of Outcome Probability on Generalization Is Important
We think that the study of how outcome probability affects generalization is of great importance and relevance to how people understand their physical and social world and how they behave in it. In our example, when rain always follows the appearance of clouds of grayness level 6 (high probability), then rain would be associated with this specific level of grayness. However, when rain only sometimes follows the appearance of clouds of grayness level 6 (low probability), then rain would probably not become associated with the specific level of grayness, but rather with a wider range of levels of gray. The learner in this example would assume a wider range of grayness values when making predictions and deciding on actions. To use an example from the social world, when a member of a social group consistently offers help (high probability), observers would probably infer that he/she is kindhearted. However, if a member of a social group only sometimes offers help (low probability), observers would be more likely to form a milder positive evaluation that would apply to the entire social group.
Additionally, probability is viewed within Construal Level Theory (CLT; Liberman & Trope, 2008, 2014; Trope & Liberman, 2010) as a dimension of psychological distance, along with temporal, spatial, and social distances (Todorov, Goren, & Trope, 2007; Wakslak & Trope, 2009; Wakslak, Trope, Liberman, & Alony, 2006). If found, the hypothesized effect of outcome probability on generalization would open the possibility that temporal, spatial, and social distances would have similar effects. We will return to this point in the general discussion, when we also discuss how the present results may be viewed within the CLT framework.
As mentioned, while there are findings that lend initial support to our hypothesis that lower probability of an outcome after a learned stimulus will widen generalization, these findings are mainly related to the domain of human fear conditioning. We thought that because generalization is a basic process of attitude formation, it is important to examine this hypothesis also with more neutral, not fear-related stimuli.
To that end, we used a modified version of the predictive learning paradigm (Struyf, Iberico, & Vervliet, 2014), which we now turn to describe.
The Predictive Learning Paradigm: The Present Studies
In the predictive learning paradigm (Struyf et al., 2014, Experiment 1), the participants’ goal is to learn which stimuli predict the appearance of an outcome. The stimuli are rings of different sizes, and the outcome is a picture of a lightning bolt. A medium ring (S+) is followed by the outcome, and a large ring (S−) is never followed by the outcome. In each trial, one ring appears on the computer screen and participants indicate their prediction regarding the appearance of the lightning bolt on an 11-point rating scale, ranging from 0 (= certainly no lightning), via 5 (= uncertain), to 10 (= certainly lightning). Afterward, the ring and the scale disappear and the lightning bolt is either presented or not presented. Thus, participants learn from experience that S+ is followed by the outcome whereas S− is not followed by the outcome.
The paradigm includes two phases, the acquisition phase and the generalization phase. The acquisition phase presents S+ and S− equally often: S+ is paired with the outcome whereas S− is never paired with the outcome. At the generalization phase, eight novel rings varying in size are presented. The S+, being the middle-sized ring, is placed in the middle of the generalization dimension, whereas S−, being the largest ring, is placed at the right edge. Half of the generalization rings are larger than S+ (i.e., between S+ and S−). Responses to these rings are affected by both excitatory generalization from S+ (which would call for predicting the outcome) and inhibitory generalization from S− (which would call for predicting no outcome). The other half of the generalization rings are smaller than S+ (i.e., on the side of S+ that is opposite to S−). This allows us to test for generalization from S+ that is less influenced by generalization from S− (McLaren & Mackintosh, 2002; Pearce, 1987; Spence, 1937).¹ Our hypothesis concerned these latter generalization rings.
We introduced into this paradigm a manipulation of outcome probability by varying the probability of the outcome following S+ during the acquisition phase. In the high-probability condition, this probability was 83% whereas, in the low-probability condition, it was 42%.
We operationalized generalization as a tendency to predict the outcome after a generalization ring smaller than the original learned ring (S+), that is, generalization rings that are on the side of S+ that is opposite to S−. As mentioned, these generalization rings reflect generalization mostly from S+ and less from S−. We hypothesized that low (vs. high) probability of the outcome would cause higher predictions for rings smaller than S+ (i.e., on the side of S+ that is opposite to S−).
Paradigms of discrimination learning typically confound outcome probability given S+ with contingency between stimulus and outcome, such that low outcome probability given S+ coincides with lower contingency between stimulus (S+ vs. S−) and outcome (outcome vs. no outcome). Because contingency might affect learning (Allan, 1980; De Houwer & Beckers, 2002a; Shanks, 1995), we need to examine whether higher generalization might, in fact, be the result of low contingency rather than low conditional probability (of the outcome given S+). We would like to discuss this in light of the results and thus defer our answer to that question until the general discussion.
¹ Learning situations that include both S+ and S− (often referred to as discrimination learning) tend to give rise to a peak shift effect, whereby response is most frequent not to the learned stimulus itself, but rather to a generalization stimulus next to S+, on the side opposite to S−. Discussing this effect is beyond the scope of the current article (for reviews, see Honig & Urcuioli, 1981; Purtle, 1973; Spence, 1937; Struyf et al., 2014).
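The confound between outcome probability and contingency described above can be illustrated with the ΔP index, one common measure of contingency between a binary cue and a binary outcome, defined as P(outcome | S+) minus P(outcome | S−). The Python sketch below applies it to the two reinforcement schedules of the present paradigm (10 or 5 reinforced S+ trials out of 12, and no reinforced S− trials). The function name `delta_p` is hypothetical, and the choice of ΔP as the contingency index is an assumption made for illustration.

```python
def delta_p(outcomes_given_cue, cue_trials, outcomes_given_no_cue, no_cue_trials):
    """Contingency index: P(outcome | S+) - P(outcome | S-)."""
    p_outcome_cue = outcomes_given_cue / cue_trials
    p_outcome_no_cue = outcomes_given_no_cue / no_cue_trials
    return p_outcome_cue - p_outcome_no_cue


# S+ and S- are each shown 12 times; S- is never followed by the outcome.
high = delta_p(10, 12, 0, 12)  # high-probability condition
low = delta_p(5, 12, 0, 12)    # low-probability condition
print(f"high: {high:.2f}, low: {low:.2f}")  # prints "high: 0.83, low: 0.42"
```

Because S− is never reinforced, ΔP equals the outcome probability given S+ in both conditions, so the two quantities cannot be separated within this design.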
Experiment 1
Experiment 1 was designed to test how generalization is affected by outcome probability, using a modified version of the predictive learning paradigm described previously. In the high-probability condition, S+ was presented 12 times during the acquisition phase, and 10 of these presentations were followed by the outcome (83%). In the low-probability condition, S+ was also presented 12 times during the acquisition phase, but only five of these presentations were followed by the outcome (42%). In the generalization phase, which was identical for both conditions, the S+, S−, and eight generalization rings appeared multiple times. The S+ was reinforced in 50% of its presentations. This feature was part of the original procedure and was intended to counter extinction during generalization. We followed the suggestion of Vervliet, Iberico, Vervoort, and Baeyens (2011) and analyzed the first block of the generalization phase, which shows effects of learning that are relatively clean of extinction (but suffer from a low number of trials), and only then moved to examine the entire generalization phase.² We hypothesized that in the first generalization block, participants would predict the outcome following novel rings that are on the side of S+ opposite to S− more in the low-probability condition than in the high-probability condition.
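The acquisition schedules of the two conditions can be sketched as trial lists. The Python code below is a minimal illustration, not the authors' experiment script: the function name `acquisition_trials`, the tuple representation, and the fully random trial order are assumptions, since the trial order is not specified here.

```python
import random


def acquisition_trials(n_reinforced, n_per_stimulus=12, seed=None):
    """Build one acquisition phase as (stimulus, outcome_shown) pairs.

    Assumption: trials are presented in a fully random order.
    """
    trials = [("S+", i < n_reinforced) for i in range(n_per_stimulus)]
    trials += [("S-", False)] * n_per_stimulus  # S- is never reinforced
    random.Random(seed).shuffle(trials)
    return trials


high = acquisition_trials(n_reinforced=10, seed=1)  # 10 of 12 reinforced
low = acquisition_trials(n_reinforced=5, seed=1)    # 5 of 12 reinforced

for name, trials in (("high", high), ("low", low)):
    reinforced = sum(shown for stim, shown in trials if stim == "S+")
    print(name, len(trials), reinforced)
```

Each list holds 24 trials, and only the number of reinforced S+ trials differs between the two conditions.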
Method Participants Seventy undergraduates from Tel Aviv University participated in the experiment in return for payment. The sample size was based on previous studies with the same paradigm (Struyf et al., 2014). Participants were randomly assigned to experimental conditions. To test whether each participant learned to differentiate between S+ and S , we computed the difference between the last two S+ and the last two 2 3
S− trials in the acquisition phase. A lower score indicated less differentiation between S+ and S−. Eleven participants were excluded because their difference score was below 1 (i.e., they failed to learn the difference between S+ and S− by the end of the acquisition phase). The final sample consisted of 59 participants (M_age = 23.61, 47 women; N_high-probability = 30, N_low-probability = 29).

Stimuli

The experimental stimuli were 10 rings varying in size. The diameter of the smallest ring (R1) was 2.00 cm, and each successive ring’s diameter increased by approximately 15% (R2: 2.30 cm, R3: 2.60 cm, R4: 2.90 cm, R5: 3.20 cm, R6: 3.50 cm, R7: 3.80 cm, R8: 4.10 cm, R9: 4.40 cm, R10: 4.70 cm). The intermediate-size ring (R5) served as S+, and the largest ring (R10) served as S−.³ All the other rings served as generalization stimuli. The outcome was a drawing of a white lightning bolt on a black background. All stimuli were presented on a computer screen that was placed in front of the participants.

Procedure

First, participants signed a consent form. Then, the experiment began with an instruction screen, in which participants were informed that a number of figures would appear on the screen and that some of these figures would be followed by a lightning bolt. They were told that their goal was to learn which figure would be followed by the lightning bolt. The experimental procedure included two phases. The first phase was acquisition, in which S+ and S− were each presented 12 times. The number of times that S+ was followed by the outcome differed between conditions: S+ was followed by the outcome in 10 of its presentations in the high-probability condition (83%), but only in five of its presentations in the low-probability condition (42%). None of the presentations of S− were followed by the outcome. The generalization phase was identical in the two conditions: It consisted of six identical blocks.
In each block, there were two presentations of S+ (one of which was followed by the outcome), two presentations of S−, and one presentation of each of the eight generalization rings. The generalization rings and S− were never followed by the outcome. Each trial started with a computer screen that said: “The next trial starts now.” Then, one stimulus was presented for 500 ms, and a rating scale appeared at the bottom of the screen, ranging from 0 (= certainly no lightning), via 5 (= uncertain), to 10 (= certainly lightning). Participants indicated their prediction on the scale by clicking on it with the
² We will return and elaborate on the issue of extinction in the Discussion section.
³ We did not counterbalance the S− to be either the smallest or the largest ring, to prevent participants from inferring a rule such as “the larger the ring, the higher the probability of the outcome appearance.” Therefore, S− was always the largest ring (Ghirlanda & Enquist, 2003; Struyf et al., 2014).
Experimental Psychology (2019), 66(1), 23–39
© 2019 Hogrefe Publishing
Figure 1. Mean predictions (on a 0–10 scale) during the acquisition phase by outcome probability, trial number, and stimulus type. Error bars depict standard errors.
[Plot not reproduced. Y-axis: Predictions (0–10); x-axis: Trials; series: High-probability S+, High-probability S−, Low-probability S+, Low-probability S−.]
computer mouse. Afterward, the stimulus and the scale disappeared and the lightning bolt was either presented for 1,500 ms or was not presented at all. The inter-trial interval was always 3 s. Upon completing the task, participants responded to the following questions, which served as control variables: interest (“How interesting was the task for you?”), enjoyment (“How much did you enjoy the task?”), difficulty (“How difficult did you find the task?”), motivation (“How motivated did you feel to perform the task well?”), importance (“How important was it for you to perform the task well?”), and perceived competence (“How well do you feel that you did on the task?”) on scales that ranged from 1 (= not at all) to 7 (= very much). General mood was also assessed (“Generally, how do you feel right now?” 1 = very bad, 7 = very good), followed by eight specific emotions (“How sad/loose/tense/relaxed/nervous/happy/ joyful/depressed do you feel right now?” 1 = not at all, 7 = very much). Finally, participants answered a demographic questionnaire.
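The trial schedules described in the Procedure can be sketched as follows. This is a minimal illustration, not the authors' experiment script: the function names are hypothetical, and the random trial order (and the fixed seed) is an assumption, since the text does not state how trials were ordered.

```python
import random


def acquisition_trials(n_reinforced, n_per_stimulus=12, seed=0):
    """Build an acquisition list of (stimulus, outcome_shown) pairs."""
    s_plus = ([("S+", True)] * n_reinforced
              + [("S+", False)] * (n_per_stimulus - n_reinforced))
    s_minus = [("S-", False)] * n_per_stimulus  # S- is never reinforced
    trials = s_plus + s_minus
    random.Random(seed).shuffle(trials)  # assumption: randomized order
    return trials


def generalization_block(seed=0):
    """One of the six identical generalization blocks (12 trials)."""
    rings = [f"R{i}" for i in (1, 2, 3, 4, 6, 7, 8, 9)]
    trials = [("S+", True), ("S+", False), ("S-", False), ("S-", False)]
    trials += [(ring, False) for ring in rings]  # rings never reinforced
    random.Random(seed).shuffle(trials)  # assumption: randomized order
    return trials


high = acquisition_trials(n_reinforced=10)  # 10/12, about 83%
low = acquisition_trials(n_reinforced=5)    # 5/12, about 42%
block = generalization_block()
```

Each acquisition list holds 24 trials, and each generalization block holds 12 trials, of which exactly one S+ presentation is followed by the outcome.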
Results

The raw data (of all three experiments), including the analysis script, are provided in the Electronic Supplementary Materials, ESM 1–4.

Acquisition

We analyzed the outcome predictions in a 12 × 2 × 2 mixed-design analysis of variance (ANOVA), with trial (12) and stimulus type (S− vs. S+) as within-subject factors and outcome probability (high-probability vs. low-probability) as
a between-subject factor. There was a main effect of trial, F(11, 627) = 2.77, p = .002, ηp² = .05, and a main effect of stimulus type, F(1, 57) = 385.80, p < .001, ηp² = .87, which showed that participants predicted the outcome more after S+ than after S−. That is, participants learned to differentiate between S+ and S−. An interaction between stimulus type and trial, F(11, 627) = 54.79, p < .001, ηp² = .49, showed that the difference between the predictions for S+ and S− developed over trials (Figure 1). A main effect of outcome probability, F(1, 57) = 14.06, p < .001, ηp² = .20, was qualified by an interaction with stimulus type, F(1, 57) = 25.29, p < .001, ηp² = .31, which indicated that participants’ predictions for S+ reflected the actual probability of the outcome, which was higher in the high-probability condition than in the low-probability condition.⁴ The interaction between trial and outcome probability was not significant, F(11, 627) = 0.87, p = .57. The three-way interaction between stimulus type, trial, and outcome probability, F(11, 627) = 5.10, p < .001, ηp² = .08, indicated that the difference in the outcome predictions between the two conditions did not exist initially, but rather emerged over trials.

Generalization

Generalization is defined as giving a similar conditioned response to a novel stimulus as to the learned stimulus (Shepard, 1958; for a review, see Ghirlanda & Enquist, 2003). In other words, we were interested in examining the response to the novel stimuli relative to the response to the original, learned stimulus. We thus computed the difference between the predictions given to each stimulus presented during the generalization phase and the predictions given to S+ in the last two trials of the acquisition phase (for a similar conceptualization of generalization, see
⁴ This result is consistent with earlier findings on partial versus full reinforcement, in which the final level of response is higher under continuous reinforcement than under a partial reinforcement schedule (for a review, see Jenkins & Stanley, 1950).
Blough, 1975; Fazio, Shook, & Eiser, 2004; Hovland, 1937). A zero difference score indicated that the prediction for a novel stimulus was similar to that for S+. A lower difference score indicated lower predictions of the outcome and thus lower generalization. We used the same dependent variable in this and all subsequent studies in this paper. We analyzed generalization scores in a 10 × 2 mixed-design ANOVA with stimuli (10: R1, R2, R3, R4, S+, R6, R7, R8, R9, S−) as a within-subject factor and outcome probability (high-probability vs. low-probability) as a between-subject factor. As explained above, we first analyzed the first block of generalization and then moved to examine the entire generalization phase (see Vervliet et al., 2011, for a similar procedure).

First Generalization Block

A main effect of stimuli, F(9, 513) = 41.5, p < .001, ηp² = .42, indicated that the outcome was predicted after some stimuli more than after other stimuli. A main effect of outcome probability, F(1, 57) = 10.07, p = .002, ηp² = .15, indicated that generalization scores in the low-probability condition were higher (M = −2.99, SD = 1.98) than in the high-probability condition (M = −4.49, SD = 1.64) (see Figure 2). There was no interaction between stimuli and outcome probability, F(9, 513) = 1.69, p = .09. To examine our hypothesis, we analyzed the generalization scores only for the generalization stimuli located on the side of S+ opposite to S− (i.e., rings smaller than S+). We conducted a 4 × 2 mixed-design ANOVA with similarity to S+ (four levels, from most similar to S+ to least similar to S+) as a within-subject factor and outcome probability (high-probability vs. low-probability) as a between-subject factor. The hypothesized effect of outcome probability, F(1, 57) = 6.64, p = .013, ηp² = .10, indicated that generalization scores in the low-probability condition (M = −1.39, SD = 2.93) were higher than in the high-probability condition (M = −3.17, SD = 2.35).
A main effect of similarity to
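Figure 2. Mean generalization scores for all stimuli during the first generalization block by outcome probability. Error bars depict standard errors.
[Plot not reproduced. Y-axis: Predictions minus last two S+ trials; x-axis: Stimuli (R1–R4, S+, R6–R9, S−); series: Low-probability, High-probability.]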
S+, F(3, 171) = 5.56, p = .001, ηp² = .09, showed that generalization scores varied across the different generalization stimuli. The interaction between similarity and outcome probability was not significant, F(3, 171) = 1.77, p = .150.

All Generalization Blocks

We repeated the analysis for all six generalization blocks. The 10 stimuli (R1, R2, R3, R4, S+, R6, R7, R8, R9, S−) × 2 outcome probability (high-probability vs. low-probability) ANOVA revealed a significant main effect of outcome probability, F(1, 57) = 15.56, p < .001, ηp² = .21, demonstrating that generalization scores in the low-probability condition were higher (M = −4.12, SD = 2.20) than in the high-probability condition (M = −6.10, SD = 1.62). A significant main effect of stimuli, F(9, 513) = 70.55, p < .001, ηp² = .55, indicated that the outcome was predicted after some stimuli more than after other stimuli. There was no interaction between stimuli and outcome probability, F(9, 513) = 0.70, p = .700 (Figure 3). To examine our hypothesis, we analyzed the generalization scores only for the generalization stimuli located on the side of S+ opposite to S−. We conducted a 4 × 2 mixed-design ANOVA with similarity to S+ (four levels, from most similar to S+ to least similar to S+) as a within-subject factor and outcome probability (high-probability vs. low-probability) as a between-subject factor. In line with our hypothesis, a significant effect of outcome probability, F(1, 57) = 12.41, p = .001, ηp² = .18, indicated that generalization scores in the low-probability condition were higher (M = −3.49, SD = 2.71) than in the high-probability condition (M = −5.61, SD = 1.84). A main effect of similarity, F(3, 171) = 45.56, p < .001, ηp² = .44, showed that generalization scores for these generalization stimuli decreased as the similarity to S+ decreased. The interaction between similarity and outcome probability was not significant, F(3, 171) = 1.73, p = .160.
Figure 3. Mean generalization scores for all stimuli during all generalization blocks by outcome probability. Error bars depict standard errors.
[Plot not reproduced. Y-axis: Predictions minus last two S+ trials; x-axis: Stimuli (means for R1–R4, S+, R6–R9, S−); series: Low-probability, High-probability.]
There were no significant differences between the two outcome probability conditions in any of the mood measures or other measures.⁵
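The two per-participant quantities used above, the learning criterion and the generalization score, both rest on the last two acquisition trials. The sketch below shows one way to compute them. It assumes the "last two trials" are averaged (the text does not say whether they are averaged or summed), and all ratings are hypothetical illustration values, not data from the experiment.

```python
from statistics import mean


def difference_score(s_plus_ratings, s_minus_ratings):
    """Last two S+ predictions minus last two S- predictions (means)."""
    return mean(s_plus_ratings[-2:]) - mean(s_minus_ratings[-2:])


def generalization_scores(s_plus_ratings, generalization_ratings):
    """Prediction for each test stimulus minus the end-of-acquisition
    S+ baseline. Zero means 'same as S+'; lower means less generalization."""
    baseline = mean(s_plus_ratings[-2:])
    return {stimulus: mean(ratings) - baseline
            for stimulus, ratings in generalization_ratings.items()}


# Hypothetical ratings for one participant on the 0-10 scale.
s_plus = [5, 6, 7, 8, 8, 9]
s_minus = [5, 4, 2, 1, 0, 0]

keep_participant = difference_score(s_plus, s_minus) >= 1  # exclusion rule
scores = generalization_scores(s_plus, {"R4": [7], "R1": [3], "S-": [0, 1]})
```

With these illustrative numbers the S+ baseline is 8.5, so a rating of 7 for R4 gives a score of −1.5, and the scores become more negative as the ratings fall further below the baseline.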
Discussion

Experiment 1 examined the effect of outcome probability on generalization by manipulating the probability of the outcome after a predicting stimulus. Participants’ learning accurately reflected the experimental conditions, such that participants in the high-probability condition predicted the outcome after S+ with more certainty than participants in the low-probability condition. Importantly, the generalization results were consistent with our hypothesis. Participants in the low-probability condition showed wider generalization than participants in the high-probability condition, making predictions more similar to those for S+ after seeing a novel predictor that was similar to S+ (but not to S−). These results were apparent not only in the first block, but also when we analyzed all of the generalization blocks. Note that in this experiment, the high-probability and low-probability conditions differed not only in the probability of the outcome appearance after S+, but also in the number of pairings between S+ and the outcome. In Experiment 2, we controlled for this possible confound by including two low-probability conditions: one in which, similar to Experiment 1, the overall number of acquisition trials in the low-probability condition was similar to the
high-probability condition (but naturally, the number of pairings between S+ and the outcome was lower), and one in which the number of pairings between S+ and the outcome was equated between the low-probability and the high-probability conditions (as a consequence, there were more acquisition trials in this low-probability condition than in the high-probability condition).
Experiment 2

We replicated Experiment 1 but included an additional low-probability condition that equated the number of pairings of S+ and the outcome between the low-probability and the high-probability conditions. This new condition, which we term the “low-probability-long” condition, had twice the number of acquisition trials as the high-probability and the original low-probability conditions. In this “low-probability-long” condition, the overall number of pairings of S+ and the outcome matched the high-probability condition. However, the probability of the outcome appearance following S+ was still low (42%) and thus matched the original low-probability condition in Experiment 1. We term the original low-probability condition that is similar to that of Experiment 1 the “low-probability-short” condition. As in Experiment 1, we hypothesized that low outcome probability would widen generalization.
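The logic of the three-condition design can be laid out as a small table of S+ trial counts and pairings. The sketch below is only an illustration: the `Condition` class and its field names are hypothetical, and the counts are those stated for the high-probability and low-probability conditions (12 S+ trials) and implied for the low-probability-long condition (twice as many trials, same pairings as high-probability).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    name: str
    s_plus_trials: int  # number of S+ presentations during acquisition
    pairings: int       # S+ presentations followed by the outcome

    @property
    def outcome_probability(self) -> float:
        return self.pairings / self.s_plus_trials


conditions = [
    Condition("high-probability", s_plus_trials=12, pairings=10),
    Condition("low-probability-short", s_plus_trials=12, pairings=5),
    Condition("low-probability-long", s_plus_trials=24, pairings=10),
]

for c in conditions:
    print(f"{c.name}: {c.pairings}/{c.s_plus_trials} "
          f"= {c.outcome_probability:.0%}")
```

Laid out this way, the low-probability-short condition matches the high-probability condition on trial count, whereas the low-probability-long condition matches it on the number of pairings. Both low-probability conditions share the same outcome probability of about 42%.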
⁵ Although some directional differences emerged, they did not survive a Bonferroni correction for multiple comparisons. Moreover, such differences did not emerge in Experiments 2 and 3 and will not be discussed any further. ESM 5 presents a full report of the descriptive and inferential statistics of these measures. Importantly, when these measures were entered as covariates, they neither reduced the effect of outcome probability nor interacted with outcome probability, either in the analysis of the first generalization block or in the analysis of all blocks.
Figure 4. Mean predictions (on a 0–10 scale) during the acquisition phase by outcome probability, trial number, and stimulus type. Error bars depict standard errors.
[Plot not reproduced. Y-axis: Predictions (0–10); x-axis: Trials (1–24); series: High-probability S+, High-probability S−, Low-probability-short S+, Low-probability-short S−, Low-probability-long S+, Low-probability-long S−.]
Method

Participants

Ninety-one undergraduates from Tel Aviv University participated in the experiment voluntarily. The sample size was chosen according to previous studies with the same paradigm (Struyf et al., 2014) as well as according to Experiment 1. Participants were randomly assigned to one of the three conditions. One participant was excluded because he did not finish the task, and another participant was excluded because she wrote down the stimuli and their outcomes for herself during the task. As in Experiment 1, to test whether each participant learned to differentiate between S+ and S−, we computed the difference between the last two S+ trials and the last two S− trials in the acquisition phase. Twenty participants were excluded because their difference score was below 1. The final sample consisted of 69 participants (M_age = 25.41, 40 women; N_high-probability = 25, N_low-probability-short = 23, N_low-probability-long = 21).

Procedure

As in Experiment 1, participants first signed a consent form and read the task instructions on the computer screen. During acquisition, S+ and S− were each presented 12 times in the high-probability and low-probability-short conditions and 24 times in the low-probability-long condition. S+ was followed by the outcome 10 times in the high-probability condition (83%) and in the low-probability-long condition (42%), but only five times in the low-probability-short condition (42%). S− was never followed by the outcome. The generalization phase was identical for all three conditions and was similar to Experiment 1. After completing the task, participants responded to the same control and demographic questions as in Experiment 1.

Results

Acquisition

We analyzed outcome predictions in a 12 × 2 × 3 mixed-design ANOVA with trial (12) and stimulus type (S− vs. S+) as within-subject factors and outcome probability (high-probability vs. low-probability-short vs. low-probability-long) as a between-subject factor. Since the low-probability-long condition had 24 trials, only the first 12 trials of this condition were analyzed. A main effect of stimulus type, F(1, 66) = 352.95, p < .001, ηp² = .84, showed that participants learned to differentiate between S+ and S−, namely, they predicted the outcome more after S+ than after S−. An interaction between stimulus type and trial, F(11, 726) = 32.01, p < .001, ηp² = .33, showed that the difference between the predictions for S+ and S− developed over trials. A main effect of outcome probability, F(2, 66) = 8.64, p < .001, ηp² = .21, was qualified by an interaction with stimulus type, F(2, 66) = 8.13, p = .001, ηp² = .20, which indicated that participants’ predictions accurately reflected the experimental condition. Specifically, predictions were higher in the high-probability condition than in the two low-probability conditions. Indeed, a simple effect analysis revealed that for S+, predictions were higher in the high-probability condition than in the low-probability-short condition, p < .001, and the low-probability-long condition, p = .001, which did not differ from each other, p = .190 (Figure 4). An interaction between outcome probability and trials, F(22, 726) = 2.37, p < .001, ηp² = .67, indicated that the differences in predictions between the three outcome probability conditions emerged over trials. The three-way interaction between stimulus type, trial, and outcome probability was not significant, F(22, 726) = 1.30, p = .160. We were also interested in comparing performance at the end of the acquisition phase between the three outcome
Figure 5. Mean generalization scores for all stimuli during the first generalization block by outcome probability. Error bars depict standard errors.
[Plot not reproduced. Y-axis: Predictions minus last two S+ trials; x-axis: Stimuli (R1–R4, S+, R6–R9, S−); series: Low-probability-short, Low-probability-long, High-probability.]
probability conditions. Therefore, we analyzed the mean outcome predictions in the last two S+ trials and the last two S− trials in a 2 × 3 mixed-design ANOVA with stimulus type (S− vs. S+) as a within-subject factor and outcome probability (high-probability vs. low-probability-short vs. low-probability-long) as a between-subject factor. A main effect of stimulus type, F(1, 66) = 575.00, p < .001, ηp² = .90, indicated that participants learned to differentiate between S+ and S−. A significant effect of outcome probability, F(2, 66) = 11.49, p < .001, ηp² = .26, was qualified by an interaction between stimulus type and outcome probability, F(2, 66) = 4.77, p = .012, ηp² = .13, which suggested that at the end of the acquisition phase, the difference between conditions in actual probability of outcomes was reflected in participants’ predictions. Indeed, a simple effect analysis revealed that for S+, predictions were higher in the high-probability condition than in the low-probability-short condition, p = .007, and the low-probability-long condition, p < .001, which did not differ from each other, p = .160.

Generalization

First Generalization Block

As in Experiment 1, generalization scores were computed by subtracting participants’ predictions for the last two S+ trials in the acquisition phase from the predictions in the generalization phase. First, we analyzed these generalization scores in a 10 × 3 mixed-design ANOVA with stimuli (10: R1, R2, R3, R4, S+, R6, R7, R8, R9, S−) as a within-subject factor and outcome probability (high-probability vs. low-probability-short vs. low-probability-long) as a between-subject factor. There was no effect of outcome probability, F(2, 66) = 0.49, p = .620. A significant effect of stimuli, F(9, 594) = 30.4, p < .001, ηp² = .31, was qualified
by an interaction between stimuli and outcome probability, F(18, 594) = 1.7, p = .035, ηp² = .05. This interaction indicated that generalization scores in the two low-probability conditions were higher than in the high-probability condition only for stimuli smaller than S+ (on the side of S+ opposite to S−) but not for stimuli larger than S+ (located between S+ and S−) (Figure 5). To examine our hypothesis, we analyzed the generalization scores only for the generalization stimuli located on the side of S+ that is opposite to S−. We conducted a 4 × 3 mixed-design ANOVA with similarity to S+ (four levels, from most similar to S+ to least similar to S+) as a within-subject factor and outcome probability (high-probability vs. low-probability-short vs. low-probability-long) as a between-subject factor. There was a main effect of similarity to S+, F(3, 198) = 4.51, p = .004, ηp² = .06, which indicated that generalization scores varied across the different generalization stimuli. There was a significant effect of outcome probability, F(2, 66) = 5.19, p = .008, ηp² = .14. Consistent with our hypothesis, a planned contrast analysis confirmed that generalization was higher in the two low-probability conditions (M = −0.85, SD = 2.92) than in the high-probability condition (M = −2.86, SD = 2.38), F(1, 66) = 9.38, p = .003. Also as predicted, there was no significant difference between the two low-probability conditions, F(1, 66) = 1.18, p = .280. The interaction between similarity and outcome probability was not significant, F(6, 198) = 0.33, p = .920.

All Generalization Blocks

As in Experiment 1, we repeated the generalization analyses for all six generalization blocks. Specifically, we analyzed the generalization scores in a 10 × 3 mixed-design ANOVA with stimuli (10: R1, R2, R3, R4, S+, R6,
Figure 6. Mean generalization scores for all stimuli during all generalization blocks by outcome probability. Error bars depict standard errors.
[Plot not reproduced. Y-axis: Predictions minus last two S+ trials; x-axis: Stimuli (means for R1–R4, S+, R6–R9, S−); series: Low-probability-short, Low-probability-long, High-probability.]
R7, R8, R9, S−) as a within-subject factor and outcome probability (high-probability vs. low-probability-short vs. low-probability-long) as a between-subject factor. A significant main effect of stimuli, F(9, 594) = 60.60, p < .001, ηp² = .48, indicated that the outcome was predicted after some stimuli more than after others. A main effect of outcome probability, F(2, 66) = 3.44, p = .038, ηp² = .09, indicated that generalization scores in the two low-probability conditions were higher than in the high-probability condition for all stimuli. There was no interaction between stimuli and outcome probability, F(18, 594) = 1.29, p = .190 (Figure 6). To examine our hypothesis, we analyzed the generalization scores only for the generalization stimuli located on the side of S+ opposite to S−. We conducted a 4 × 3 mixed-design ANOVA with similarity to S+ (four levels, from most similar to S+ to least similar to S+) as a within-subject factor and outcome probability (high-probability vs. low-probability-short vs. low-probability-long) as a between-subject factor. As before, there was a main effect of similarity, F(3, 198) = 28.78, p < .001, ηp² = .30, which indicated that generalization scores for these generalization stimuli varied. The effect of outcome probability was significant, F(2, 66) = 4.68, p = .013, ηp² = .12. A planned contrast analysis between the two low-probability conditions and the high-probability condition indicated that, as hypothesized, the generalization scores in the two low-probability conditions were higher (M = −2.86, SD = 2.34) than the generalization scores in the high-probability condition (M = −4.68, SD = 2.49), F(1, 66) = 8.2, p = .006, with no difference between the two low-probability conditions, F(1, 66) = 1.33, p = .250. The interaction between similarity and outcome probability was not significant, F(6, 198) = 0.71, p = .640.
There were no significant differences between the three conditions in any of the mood measures or any of the control measures. Table 2 in ESM 5 presents the complete descriptive and inferential statistics for these measures.
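The partial eta squared values reported alongside each F test can be recovered from the F value and its degrees of freedom. The sketch below uses the standard identity for partial eta squared, SS_effect / (SS_effect + SS_error), rewritten in terms of F. The helper name is hypothetical, and the example inputs are F values reported in the Experiment 2 results.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F test.

    Follows from F = (SS_effect / df_effect) / (SS_error / df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values reported in the Experiment 2 results.
stimulus_type = partial_eta_squared(352.95, 1, 66)    # acquisition
outcome_prob = partial_eta_squared(8.64, 2, 66)       # acquisition
stimuli_all_blocks = partial_eta_squared(60.60, 9, 594)

print(round(stimulus_type, 2),
      round(outcome_prob, 2),
      round(stimuli_all_blocks, 2))
```

These three values round to .84, .21, and .48. The same identity gives about .07 for F(22, 726) = 2.37, which may be worth comparing against the value reported for the outcome probability by trial interaction.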
Discussion

Experiment 2 replicated Experiment 1 with another low-probability condition. In both low-probability conditions, the probability of the outcome after S+ was lower (42%) compared to the high-probability condition (83%). However, in the low-probability-short condition, the number of pairings between S+ and the outcome was half that of the high-probability condition (but the overall number of trials was similar), whereas in the low-probability-long condition, it was similar to the high-probability condition (but the overall number of trials was doubled). Importantly, these two low-probability conditions yielded very similar results, at both acquisition and generalization. At acquisition, participants gave higher predictions for S+ in the high-probability condition than in the two low-probability conditions, accurately reflecting the higher probability of the outcome appearance in the high-probability condition. At generalization, participants in both low-probability conditions, compared to those in the high-probability condition, generalized more broadly: for novel rings located next to S+ on the side opposite to S−, their predictions were more similar to what they predicted for S+. There was no difference between the two low-probability conditions. These results suggest that reduced probability of the outcome appearance, rather than reduced number of reinforcements, is responsible for broadening generalization. These results are consistent with our hypothesis and fully replicate Experiment 1.
Given the current design of Experiments 1 and 2, our hypothesis concerned only part of the generalization stimuli, namely only the novel stimuli that are not located between S+ and S−. To investigate the effect of outcome probability on the entire generalization gradient, we conducted Experiment 3, in which S− differed from S+ by shape rather than by size.
Experiment 3

Experiment 3 replicated Experiment 1, but we replaced S− with a square the same size as S+. Thus, the dimension relevant to learning was the shape of the figure, whereas the dimension relevant to generalization was the size of the ring. As a result, the inhibitory generalization from S− to the novel rings no longer existed. Generalization to novel rings should be made only from the S+ (i.e., excitatory generalization), regardless of side, that is, regardless of whether the new ring was larger or smaller than S+ (Spence, 1937; McLaren & Mackintosh, 2002). As in Experiments 1 and 2, we hypothesized that low outcome probability would widen generalization. Unlike Experiments 1 and 2, the setup of Experiment 3 allowed us to examine the entire generalization gradient. We thus hypothesized that participants in the low-probability condition, more than those in the high-probability condition, would give higher predictions for the outcome following novel generalization rings, whether they were smaller or larger than S+.

Method

Participants

One hundred undergraduates from Tel Aviv University participated in the experiment in return for payment (N = 86) or credit points (N = 12). Because we introduced an important change to the paradigm, we increased the sample size according to the recommendation of Ledgerwood (2015) for experiments with unknown effects. We planned to recruit 50 participants per experimental condition. Participants were randomly assigned to one of the two conditions. One participant was excluded because he did not finish the task, and another participant was excluded because he answered a phone call during the task. As in the previous experiments, to test whether each participant learned to differentiate between S+ and S−, we computed the difference between the last two S+ trials and the last two S− trials in the acquisition phase. Twelve participants were excluded because their difference score was below 1. The final sample consisted of 86 participants (M_age = 24.19, 58 women; N_high-probability = 43, N_low-probability = 43).

Stimuli and Procedure

The experimental procedure and stimuli were similar to Experiment 1, except that S− was a square similar in size to S+ (a square with 3.20 cm sides, matching the 3.20 cm diameter of the S+ ring used in Experiment 1).

Results
Acquisition

We analyzed the outcome predictions in a 12 × 2 × 2 mixed-design ANOVA with trial (12) and stimulus type (S− vs. S+) as within-subject factors and outcome probability (high-probability vs. low-probability) as a between-subject factor. There was a main effect of trial, F(11, 924) = 4.63, p < .001, ηp² = .05. A main effect of stimulus type, F(1, 84) = 652.81, p < .001, ηp² = .89, showed that participants predicted the outcome more after S+ than after S−; that is, participants learned to differentiate between S+ and S−. An interaction between stimulus type and trial, F(11, 924) = 55.26, p < .001, ηp² = .40, showed that the difference between the predictions for S+ and S− developed over trials. A main effect of outcome probability, F(1, 84) = 14.57, p < .001, ηp² = .15, was qualified by an interaction between outcome probability and stimulus type, F(1, 84) = 27.69, p < .001, ηp² = .25, which indicated that, as could be expected, participants in the high-probability condition predicted the outcome after S+ with more certainty than participants in the low-probability condition. A significant interaction between trial and outcome probability, F(11, 924) = 1.96, p = .030, ηp² = .02, indicated that the difference in the predictions between the two outcome probability conditions increased over trials. The three-way interaction between stimulus type, trial, and outcome probability was not significant, F(11, 924) = 1.64, p = .082 (Figure 7).

Generalization

First Generalization Block

As in the previous experiments, generalization scores were computed by subtracting participants’ predictions for the last two S+ trials in the acquisition phase from the predictions in the generalization phase. We first analyzed these generalization scores in a 10 × 2 mixed-design ANOVA with stimuli (10: R1, R2, R3, R4, S+, R6, R7, R8, R9, S−) as a within-subject factor and outcome probability (high-probability vs. low-probability) as a between-subject factor.
There was an effect of outcome probability, F(1, 84) = 12.17, p = .001, ηp² = .13, demonstrating that generalization scores in the low-probability condition were higher (M = −2.12, SD = 3.45) than in the high-probability condition (M = −4.02, SD = 3.71). A significant effect of stimuli, F(9, 756) = 22.02, p < .001, ηp² = .21, indicated that the outcome was predicted more after some stimuli than others (Figure 8). The interaction between stimuli and outcome probability was not significant, F(9, 594) = 1.7, p = .080.

To examine our hypothesis, we analyzed the generalization scores in a 2 × 4 × 2 mixed-design ANOVA with both side of the gradient (bigger than S+ vs. smaller than S+) and similarity to S+ (four levels, from most similar to S+ to least similar to S+) as within-subject factors and outcome probability (high-probability vs. low-probability) as a between-subject factor. The results revealed a main effect of outcome probability, F(1, 84) = 11.84, p < .001, ηp² = .12, and no effect of side of the gradient, F(1, 84) = 0.03, p = .860, indicating that, as hypothesized, generalization scores in the low-probability condition (M = −1.55, SD = 3.23) were higher than in the high-probability condition (M = −3.34, SD = 3.50) along the entire generalization gradient, that is, for stimuli both smaller and larger than S+. A main effect of similarity to S+, F(3, 252) = 7.57, p < .001, ηp² = .08, indicated that generalization scores varied across the different generalization stimuli. No interaction was significant: side of the gradient × outcome probability, F(1, 84) = 1.20, p = .280; outcome probability × similarity to S+, F(3, 252) = 1.11, p = .350; side of the gradient × similarity to S+, F(3, 252) = 1.39, p = .250; side of the gradient × outcome probability × similarity to S+, F(3, 252) = 2.06, p = .110.

H. Ram et al., The Effect of Outcome Probability on Generalization
Experimental Psychology (2019), 66(1), 23–39

Figure 7. Mean predictions (on a 0–10 scale) during the acquisition phase by outcome probability, trial number, and stimulus type. Error bars depict standard errors. [Axes: Trials (1–12) vs. Predictions (0–10); series: low-probability S+, high-probability S+, low-probability S−, high-probability S−.]

Figure 8. Mean generalization scores for all stimuli during the first generalization block by outcome probability. Error bars depict standard errors. [Axes: Stimuli (R1–R4, S+, R6–R9, S−) vs. Predictions minus last two S+ trials (0 to −9); series: low-probability, high-probability.]
Figure 9. Mean generalization scores for all stimuli during all generalization blocks by outcome probability. Error bars depict standard errors. [Axes: Stimuli (R1–R4, S+, R6–R9, S−) vs. Predictions minus last two S+ trials (0 to −9); series: low-probability, high-probability.]
All Generalization Blocks
We repeated the analyses for all six generalization blocks. A 10 (stimuli: R1, R2, R3, R4, S+, R6, R7, R8, R9, S−) × 2 (outcome probability: high-probability vs. low-probability) ANOVA revealed a significant main effect of outcome probability, F(1, 84) = 22.90, p < .001, ηp² = .21, demonstrating more generalization in the low-probability condition (M = −3.42, SD = 2.64) than in the high-probability condition (M = −5.67, SD = 2.63). An effect of stimuli, F(9, 756) = 45.37, p < .001, ηp² = .35, was qualified by a marginally significant interaction between stimuli and outcome probability, F(9, 756) = 1.85, p = .056, ηp² = .02, which indicated that the effect of outcome probability was stronger as similarity to S+ decreased (Figure 9).

To examine our hypothesis, we analyzed the generalization scores in a 2 × 4 × 2 mixed-design ANOVA with both side of the gradient (opposite to S− vs. between S+ and S−) and similarity to S+ (four levels, from most similar to S+ to least similar to S+) as within-subject factors and outcome probability (high-probability vs. low-probability) as a between-subject factor. The results revealed a main effect of outcome probability, F(1, 84) = 22.66, p < .001, ηp² = .21, and no effect of side of the gradient, F(1, 84) = 2.04, p = .16, indicating that, as hypothesized, generalization scores in the low-probability condition (M = −2.91, SD = 2.35) were higher than in the high-probability condition (M = −4.95, SD = 2.39) along the entire generalization gradient. There was a main effect of similarity to S+, F(3, 252) = 44.18, p < .001, ηp² = .34, which was qualified by an interaction between similarity to S+ and side of the gradient, F(3, 252) = 5.98, p = .001, ηp² = .07, indicating lower generalization scores for stimuli less similar to S+, especially for stimuli that were larger than S+.
No other interaction was significant: side of the gradient × outcome probability, F(1, 84) = 2.04, p = .070; outcome probability × similarity to S+, F(3, 252) = 2.00, p = .110; side of the gradient × outcome probability × similarity to S+, F(3, 252) = 0.69, p = .560. There were no significant differences between the two conditions in any of the mood measures or any of the control measures. Table 3 in ESM 5 presents the complete descriptive and inferential statistics for these measures.
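Readers who wish to check the effect sizes reported in this section can recover partial eta squared from an F ratio and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies it to three of the reported tests. The function name is hypothetical, and the code is a consistency check rather than part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom taken from the results reported above
checks = {
    "stimulus type (acquisition)": (652.81, 1, 84),
    "stimulus type x trial (acquisition)": (55.26, 11, 924),
    "outcome probability (all blocks)": (22.90, 1, 84),
}
for label, (f_value, df_effect, df_error) in checks.items():
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 2))
```

Rounded to two decimals, these reproduce the reported values of .89, .40, and .21.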
Discussion
Experiment 3 replicated Experiment 1 with S− as a stimulus from a different category. During training, the relevant dimension was the shape (a ring as S+ vs. a square as S−), whereas during generalization the relevant dimension was the size of the ring, as all the generalization stimuli were rings of varying size. Because S− did not belong to the generalization dimension, there was no inhibitory generalization from S− to novel stimuli, and generalization to novel stimuli, both smaller and larger than S+, was only made from S+ (McLaren & Mackintosh, 2002; Spence, 1937). As in Experiments 1 and 2, during acquisition, participants learned to predict the outcome appearance according to the experimental conditions. In the high-probability condition, participants predicted the outcome after S+ with more certainty than in the low-probability condition. Most importantly, generalization was higher along the entire generalization gradient for participants in the low-probability condition than for those in the high-probability condition. These results are consistent with our hypothesis.
General Discussion
Three experiments examined the hypothesis that low probability of an outcome following a cue would widen generalization. This hypothesis is consistent with associative and Bayesian theories of learning and generalization (Blough, 1975; Gershman & Niv, 2012; Shepard, 1987; Soto et al., 2014, 2015; Tenenbaum & Griffiths, 2001). To test our hypothesis, we used a predictive learning paradigm. Experiment 1 introduced two conditions: the high-probability condition and the low-probability condition, which matched the high-probability condition in the number of acquisition trials. We found wider generalization in the low-probability condition than in the high-probability condition. Experiment 2 extended Experiment 1 by adding a second low-probability condition, in which the number of times the outcome followed the predicting stimulus S+ was similar to the high-probability condition, but the outcome probability was still as low as in the original low-probability condition. Experiment 2 replicated Experiment 1 and further supported our hypothesis that low outcome probability increases generalization. Furthermore, there were no differences between the two low-probability conditions in the generalization phase, suggesting that the probability of the outcome after the cue, rather than the number of their pairings, widens generalization. In Experiments 1 and 2, the relevant dimension to both learning and generalization was the same, namely the size of the ring. Specifically, S+, which predicted the appearance of the outcome, was a medium ring, whereas S−, which predicted the absence of the outcome, was a large ring. The generalization stimuli were rings of different sizes, such that half of them were between S+ and S− and thus could be affected by both generalization from S+ and generalization from S−. The other half of the generalization stimuli were placed on the side of S+ that was opposite to S−. Because these latter generalization stimuli were less influenced by generalization from S−, our hypothesis concerned primarily these stimuli.
Experiment 3 allowed us to examine the effect of outcome probability on generalization for all of the generalization stimuli by using S− from a different category (i.e., a square) that was not supposed to affect generalization (McLaren & Mackintosh, 2002; Spence, 1937; Vervliet et al., 2011). In line with our hypothesis, generalization was wider in the low-probability condition across the entire generalization gradient.
Generalization, Outcome Probability, and Contingency
As mentioned in the introduction, our studies confound low outcome probability given S+ with low contingency. Because low contingency might result in more similar predictions for S+ and S−, it could have also led participants to make more similar predictions for all stimuli, including not only S+ and S−, but also the generalization stimuli. Was it the case, then, that higher generalization in the low-probability conditions was actually caused by low contingency? We believe that several aspects of our results make this interpretation unlikely.

First, more similar predictions for S+ and S− (less discrimination) should have been manifested not only in lower (more regressive) predictions for S+, but also in higher (more regressive) predictions for S−. In other words, in the high-probability/contingency conditions there is stronger anti-correlation between S− and the outcome, compared to the low-probability/contingency conditions. As a result, outcome predictions following S− should have been more similar to S+ in the low-probability/contingency conditions (i.e., weaker anti-correlation) than in the high-probability/contingency conditions (i.e., stronger anti-correlation). In all three experiments, however, predictions for S− are similar between the high- and the low-probability/contingency conditions, both at the end of learning and in the first generalization block.⁶

Second, if responses to generalization stimuli were a result of the relatively low contingency, then they should have seemed more similar not only to S+ but also to S− in the low-probability conditions (compared to the high-probability conditions). This, however, was not the case for the novel stimuli that were the focus of our predictions (stimuli to the left of S+ in Experiments 1 and 2, and all novel stimuli in Experiment 3). These stimuli actually yielded responses less similar to S− in the low-probability conditions than in the high-probability conditions. Future studies should examine whether behavior would differ depending on whether people are asked to predict the outcome (as in our experiments) as opposed to indicating whether the stimulus caused the outcome.
Potentially, the latter question would be more sensitive to contingency than the former, which we anticipate to be more reflective of conditional probability. For example, consider an experimental design in which S+ is always followed by the outcome, but the outcome also occurs in between S+ presentations, thus weakening the contingency between S+ and the outcome. Potentially, this would affect causal judgments (participants will be less convinced that S+ is the cause of the outcome) more than predictive judgments (participants will not be less convinced that the outcome would follow S+).
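The distinction between conditional probability and contingency in the example above can be made concrete with a small Python sketch. It uses ΔP, a common contingency index (see Allan, 1980, in the reference list), defined as P(outcome | cue) − P(outcome | no cue). The function names and the trial counts are hypothetical, and the sketch assumes that the intervals between S+ presentations can be counted as discrete cue-absent observations.

```python
def conditional_probability(cue_outcome, cue_no_outcome):
    """P(outcome | cue) from the counts of cue-present trials."""
    return cue_outcome / (cue_outcome + cue_no_outcome)


def delta_p(cue_outcome, cue_no_outcome, no_cue_outcome, no_cue_no_outcome):
    """Delta-P contingency index: P(outcome | cue) - P(outcome | no cue)."""
    p_given_cue = cue_outcome / (cue_outcome + cue_no_outcome)
    p_given_no_cue = no_cue_outcome / (no_cue_outcome + no_cue_no_outcome)
    return p_given_cue - p_given_no_cue


# Hypothetical counts. Design A: the outcome occurs only after S+.
# Design B: S+ is always followed by the outcome, but the outcome also
# occurs on half of the cue-absent observations.
print(conditional_probability(10, 0), delta_p(10, 0, 0, 10))  # 1.0 1.0
print(conditional_probability(10, 0), delta_p(10, 0, 5, 5))   # 1.0 0.5
```

In both hypothetical designs the probability of the outcome given S+ is 1, whereas the contingency drops from 1 to 0.5 in the second design. This is the dissociation the paragraph above proposes to test with predictive versus causal judgments.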
⁶ This pattern is easier to detect when looking at the raw data presented in ESM 5 (Figures 10, 12, 14) rather than Figures 2, 4, and 6.

Generalization and Extinction
The Partial Reinforcement Extinction Effect (PREE; Atkinson, Atkinson, Smith, Bem, & Nolen-Hoeksema, 1995; Baron & Kalsher, 2000; Hartman & Grant, 1960; Grant & Schipper, 1952) refers to the finding that a partial reinforcement schedule produces more resistance to extinction than continuous reinforcement. It is likely that PREE occurs because the difference in reinforcement rates between acquisition and extinction is less abrupt in partial reinforcement than in continuous reinforcement (Capaldi, 1966). Importantly, resistance to extinction can be viewed as an instance of wider generalization across contexts: The conditioned response or the conditioned stimulus in the context of acquisition is perceived as similar to a conditioned response or stimulus in the context of extinction. In this view, the PREE could be seen as reflecting broader generalization in partial compared to continuous reinforcement. This is, of course, consistent with our hypothesis in the present paper. Can the reverse hold? Namely, can PREE explain our results? We think that this is not the case. First, we primarily relied on analyzing the first generalization block, in which extinction effects have not yet emerged (note that in both conditions, there were non-reinforced trials during acquisition). Second, we examined generalization to novel stimuli with systematic variation in similarity to the conditioned stimuli, which differs from generalization to the same conditioned stimulus in a different context, as is the case in PREE.
Implications and Future Directions
Construal Level Theory (CLT; Liberman & Trope, 2008, 2014; Trope & Liberman, 2010) addresses the question of how human beings mentally travel along four dimensions of psychological distance: plan for the future and remember the past (i.e., temporal distance), think about spatially remote places (i.e., spatial distance), consider other people’s points of view (i.e., social distance), and think about both likely and less likely, improbable situations (i.e., hypotheticality). Because outcome probability corresponds to distance⁷ between the predictor and the outcome on the dimension of hypotheticality, a hypothesis regarding its effect on generalization can be derived from CLT. Indeed, psychological distance is related to generalization in a fundamental way, as any act of prediction involves a tradeoff between accuracy and applicability (Liberman et al., 2011). For example, if I experienced rain with clouds of grayness level 6, then the prediction of rain only with that specific level of grayness is likely to be accurate, but is unlikely to apply to many situations. In contrast, predicting rain for any level of grayness would apply to many situations, but is less likely to be accurate. Because psychological distance increases uncertainty, the learned stimulus needs to be categorized widely to be applicable. “Clouds of grayness level 6” might be too specific a category to apply across different times, places, perspectives, and less likely situations. In other words, the variability inherent in distancing calls for using broader generalization to maintain applicability.

We see merit in the fact that learning theories and a social-psychological theory converge on a similar prediction. The experiments we presented here fit both frameworks, but future research could move to manipulations of distance that would be within the realm of CLT but outside the realm of traditional learning experiments. For example, we could introduce temporal distance by telling participants that the test phase will follow the training phase immediately versus much later. To the best of our knowledge, such a manipulation has never been used in learning experiments (perhaps because it would be difficult to implement with animals) and thus moves us more into the social-cognitive domain and to the important question of how communicated top-down information interacts with experience-based learning. Finding similar results of enhanced generalization with more distance with such paradigms would speak to the robustness of the effect of psychological distance on generalization.
Conclusions
Three experiments demonstrated that low probability of an outcome after a cue widens generalization. Understanding what affects generalization breadth is important in most basic sub-fields of psychology – learning, cognition, social psychology, and decision making. Needless to say, it is also important in many applied fields, such as education, work, organizational behavior, and public policy. In some real-life situations, such as in stereotyping and prejudice, policy makers would be mostly interested in narrowing generalization. In other situations, however, such as school learning and personnel development, authorities might primarily seek to enhance generalization. In both cases, better understanding the factors that affect generalization is crucial. We hope that the present paper made a modest step in that direction.
⁷ Notably, in Construal Level Theory (Liberman & Trope, 2008), psychological distance encompasses social, spatial, and temporal distances as well as hypotheticality and refers to the extent of divergence from the direct experience of me here and now. However, in classic learning theories (see Shepard, 1958b), “distance” refers to perceptual similarity, namely distance between stimuli that are represented as points in a continuous metric of a psychological space.
Electronic Supplementary Materials
The electronic supplementary material is available with the online version of the article at https://doi.org/10.1027/1618-3169/a000429
ESM 1. Data (.sav) Raw data of Experiment 1.
ESM 2. Data (.sav) Conditions 3–2 Experiment
ESM 3. Data (.sav) Experiment 3 probability square.
ESM 4. Data (.sps) Analysis scripts.
ESM 5. Tables and Figures (.docx) Tables of descriptive and inferential statistics for the control variables in Experiments 1–3. Figures of participants’ predictions for all stimuli in Experiments 1–3.
References
Allan, L. G. (1980). A note on measurement of contingency between two binary variables in judgment tasks. Bulletin of the Psychonomic Society, 15, 147–149. https://doi.org/10.3758/BF03334492
Atkinson, R. C., & Estes, W. K. (1963). Stimulus sampling theory. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 2, pp. 121–268). New York, NY: Wiley.
Atkinson, R. L., Atkinson, R. C., Smith, E. E., Bem, D. J., & Nolen-Hoeksema, S. (1995). Introduction to psychology (10th ed.). Orlando, FL: Harcourt College.
Baron, R. A., & Kalsher, M. J. (2000). Psychology (5th ed.). Boston, MA: Allyn & Bacon.
Blough, D. S. (1975). Steady state data and a quantitative model of operant generalization and discrimination. Journal of Experimental Psychology: Animal Behavior Processes, 1, 3. https://doi.org/10.1037/0097-7403.1.1.3
Capaldi, E. J. (1966). Partial reinforcement: A hypothesis of sequential effects. Psychological Review, 73, 459. https://doi.org/10.1037/h0023684
De Houwer, J., & Beckers, T. (2002a). A review of recent developments in research and theories on human contingency learning. The Quarterly Journal of Experimental Psychology: Section B, 55, 289–310. https://doi.org/10.1080/02724990244000034
De Houwer, J., & Beckers, T. (2002b). Higher-order retrospective revaluation in human causal learning. The Quarterly Journal of Experimental Psychology: Section B, 55, 137–151. https://doi.org/10.1080/02724990143000216
Estes, W. K. (1950). Toward a statistical theory of learning. Psychological Review, 57, 94. https://doi.org/10.1037/h0058559
Estes, W. K. (1959). The statistical approach to learning theory. In S. Koch (Ed.), Psychology: A study of a science (Vol. 2, pp. 380–491). New York, NY: McGraw-Hill.
Fazio, R. H., Eiser, J. R., & Shook, N. J. (2004). Attitude formation through exploration: Valence asymmetries. Journal of Personality and Social Psychology, 87, 293. https://doi.org/10.1037/0022-3514.87.3.293
Gershman, S. J., Blei, D. M., & Niv, Y. (2010). Context, learning, and extinction. Psychological Review, 117, 197. https://doi.org/10.1037/a0017808
Gershman, S. J., & Niv, Y. (2012). Exploring a latent cause theory of classical conditioning. Learning & Behavior, 40, 255–268. https://doi.org/10.3758/s13420-012-0080-8
Ghirlanda, S., & Enquist, M. (2003). A century of generalization. Animal Behaviour, 66, 15–36. https://doi.org/10.1006/anbe.2003.2174
Grant, D. A., & Schipper, L. M. (1952). The acquisition and extinction of conditioned eyelid responses as a function of the percentage of fixed-ratio random reinforcement. Journal of Experimental Psychology, 43, 313. https://doi.org/10.1037/h0057186
Hartman, T. F., & Grant, D. A. (1960). Effect of intermittent reinforcement on acquisition, extinction, and spontaneous recovery of the conditioned eyelid response. Journal of Experimental Psychology, 60, 89. https://doi.org/10.1037/h0039832
Hohwy, J. (2013). The predictive mind. Oxford, UK: Oxford University Press.
Honig, W. K., & Urcuioli, P. J. (1981). The legacy of Guttman and Kalish (1956): 25 years of research on stimulus generalization. Journal of the Experimental Analysis of Behavior, 36, 405–445. https://doi.org/10.1901/jeab.1981.36-405
Hovland, C. I. (1937). The generalization of conditioned responses. IV. The effects of varying amounts of reinforcement upon the degree of generalization of conditioned responses. Journal of Experimental Psychology, 21, 261. https://doi.org/10.1037/h0061938
Humphreys, L. G. (1939). Generalization as a function of method of reinforcement. Journal of Experimental Psychology, 25, 361. https://doi.org/10.1037/h0057941
Jenkins, W. O., & Stanley, J. C., Jr. (1950). Partial reinforcement: A review and critique. Psychological Bulletin, 47, 193. https://doi.org/10.1037/h0060772
Ledgerwood, A. (2015). Practical and painless: Five easy strategies to transition your lab. Talk presented in a symposium on best practices at the annual conference of the Society for Personality and Social Psychology, Long Beach, CA.
Liberman, N., & Trope, Y. (2008). The psychology of transcending the here and now. Science, 322, 1201–1205. https://doi.org/10.1126/science.1161958
Liberman, N., & Trope, Y. (2014). Traversing psychological distance. Trends in Cognitive Sciences, 18, 364–369. https://doi.org/10.1016/j.tics.2014.03.001
Liberman, N., Trope, Y., & Rim, S. (2011). Prediction: A construal level perspective. In M. Bar (Ed.), Prediction in the brain: Using the past to generate the future (pp. 144–158). New York, NY: Oxford University Press.
McLaren, I. P. L., & Mackintosh, N. J. (2000). An elemental model of associative learning: I. Latent inhibition and perceptual learning. Animal Learning & Behavior, 28, 211–246. https://doi.org/10.3758/BF03200258
McLaren, I. P. L., & Mackintosh, N. J. (2002). Associative learning and elemental representation: II. Generalization and discrimination. Animal Learning & Behavior, 30, 177–200. https://doi.org/10.3758/BF03192828
Pearce, J. M. (1987). A model for stimulus generalization in Pavlovian conditioning. Psychological Review, 94, 61. https://doi.org/10.1037/0033-295X.94.1.61
Purtle, R. B. (1973). Peak shift: A review. Psychological Bulletin, 80, 408. https://doi.org/10.1037/h0035233
Shanks, D. R. (1995). The psychology of associative learning. Cambridge, UK: Cambridge University Press.
Shepard, R. N. (1958). Stimulus and response generalization: Tests of a model relating generalization to distance in psychological space. Journal of Experimental Psychology, 55, 509. https://doi.org/10.1037/h0042354
Shepard, R. N. (1987). Toward a universal law of generalization for psychological science. Science, 237, 1317–1323. https://doi.org/10.1126/science.3629243
Soto, F. A., Gershman, S. J., & Niv, Y. (2014). Explaining compound generalization in associative and causal learning through rational principles of dimensional generalization. Psychological Review, 121, 526. https://doi.org/10.1037/a0037018
Soto, F. A., Quintana, G. R., Pérez-Acosta, A. M., Ponce, F. P., & Vogel, E. H. (2015). Why are some dimensions integral? Testing two hypotheses through causal learning experiments. Cognition, 143, 163–177. https://doi.org/10.1016/j.cognition.2015.07.001
Spence, K. W. (1937). The differential response in animals to stimuli varying within a single dimension. Psychological Review, 44, 430. https://doi.org/10.1037/h0062885
Struyf, D., Iberico, C., & Vervliet, B. (2014). Increasing predictive estimations without further learning: The peak-shift effect. Experimental Psychology, 61, 134–141. https://doi.org/10.1027/1618-3169/a000233
Suddendorf, T., & Corballis, M. C. (2007). The evolution of foresight: What is mental time travel, and is it unique to humans? Behavioral and Brain Sciences, 30, 299–313. https://doi.org/10.1017/S0140525X07001975
Tenenbaum, J. B., & Griffiths, T. L. (2001). Generalization, similarity, and Bayesian inference. Behavioral and Brain Sciences, 24, 629–640. https://doi.org/10.1017/S0140525X01000061
Todorov, A., Goren, A., & Trope, Y. (2007). Probability as a psychological distance: Construal and preferences. Journal of Experimental Social Psychology, 43, 473–482. https://doi.org/10.1016/j.jesp.2006.04.002
Trope, Y., & Liberman, N. (2010). Construal-level theory of psychological distance. Psychological Review, 117, 440. https://doi.org/10.1037/a0018963
Vervliet, B., Iberico, C., Vervoort, E., & Baeyens, F. (2011). Generalization gradients in human predictive learning: Effects of discrimination training and within-subjects testing. Learning and Motivation, 42, 210–220. https://doi.org/10.1016/j.lmot.2011.03.004
Wakslak, C., & Trope, Y. (2009). The effect of construal level on subjective probability estimates. Psychological Science, 20, 52–58. https://doi.org/10.1111/j.1467-9280.2008.02250.x
Wakslak, C. J., Trope, Y., Liberman, N., & Alony, R. (2006). Seeing the forest when entry is unlikely: Probability and the mental representation of events. Journal of Experimental Psychology: General, 135, 641. https://doi.org/10.1037/0096-3445.135.4.641
Welham, A. K., & Wills, A. J. (2011). Unitization, similarity, and overt attention in categorization and exposure. Memory & Cognition, 39, 1518. https://doi.org/10.3758/s13421-011-0124-x
Wickens, D. D., Schroder, H. M., & Snide, J. D. (1954). Primary stimulus generalization of the GSR under two conditions. Journal of Experimental Psychology, 47, 52. https://doi.org/10.1037/h0053617
History
Received September 17, 2018
Revision received August 7, 2018
Accepted August 16, 2018
Published online February 19, 2019

Acknowledgment
The data reported in this manuscript were presented at a conference (European Social Cognition Network Transfer of Knowledge Conference, Lisbon, Portugal, July 2016).

Open Data
Raw data, conditions, analysis scripts, and additional materials are available in the Electronic Supplementary Materials, ESM 1–5.

Funding
This work was supported by the I-CORE Program of the Planning and Budgeting Committee and the Israel Science Foundation (Grant 51/11) and by a Center for Excellence grant from the University of Leuven – KU Leuven (PF/10/005).

ORCID
Hadar Ram https://orcid.org/0000-0003-0079-9425

Hadar Ram
School of Psychological Sciences
Tel Aviv University
PO Box 39040
Tel Aviv 69978
Israel
ramhadar5@gmail.com
Research Article
Effect of Foreign Accent on Immediate Serial Recall
Kit Ying Chan1, Ming Ming Chiu2, Brady A. Dailey3, and Daroon M. Jalil4
1 Department of Social and Behavioural Sciences, City University of Hong Kong, Hong Kong
2 Department of Special Education and Counselling, The Education University of Hong Kong, Hong Kong
3 Department of Linguistics, Boston University, Boston, MA, USA
4 Department of Psychology, Old Dominion University, Norfolk, VA, USA

Abstract: This study disentangled factors contributing to impaired memory for foreign-accented words – misperception and disruption of encoding. When native English and Cantonese-accented words were presented auditorily for serial recall (Experiment 1), intrusion errors for accented words were higher across all serial positions (SPs). Participants made more intrusion errors during auditory presentation than visual and auditory presentation, and more errors for accented words than native words. Lengthening the interstimulus intervals in Experiment 2 reduced intrusion, repetition, order, and omission errors in the middle and late SPs during accented word recall, suggesting that extra time is required for identification and encoding of accented words into memory. Analyses of the intrusions showed that a majority of them were misperceptions and sounded similar to the stimulus words. These findings suggest that effortful perceptual processing of accented speech can induce perceptual difficulty and interfere with downstream memory processes by exhausting the shared pool of working memory.

Keywords: serial recall, foreign accent, speech perception, short-term memory, listening effort
Foreign accent refers to the extent to which the pronunciation of second language (L2) learners deviates from native speaker norms (Munro & Derwing, 1995a). The acousticphonological deviations include different subsegmental (Caramazza, Yeni-Komshian, Zurif, & Carbone, 1973), segmental (Flege & Hillenbrand, 1984; Munro & Derwing, 1995a), suprasegmental (Reed, 2000; Riazantseva, 2001), and temporal characteristics (Munro & Derwing, 1998; Temple, 2000). These deviations induce a mismatch between the speech inputs and the native listener’s representations, resulting in increased misperceptions, processing time, and vulnerability to noise compared to native speech (Van Wijngaarden, 2001). Most research has focused on the accent-induced perception costs and the perceptual learning of accented speech (Clarke & Garrett, 2004; Reinisch & Holt, 2014; Witteman, Weber, & McQueen, 2013). Few studies have examined the influence of foreign accent on memory. Gill (1994) studied how regional and foreign accents affected comprehension, and subsequent recall. Lectures by native North American English speakers were rated as most comprehensible, followed by those of British English speakers and then by Experimental Psychology (2019), 66(1), 40–57 https://doi.org/10.1027/1618-3169/a000430
those of Malaysian English speakers. Native listeners recalled significantly more information from native North American English speakers than from British or Malaysian English speakers. Gill (1994) suggested that comprehending messages from unfamiliar regional or foreign-accented speakers requires more cognitive resources, resulting in fewer available resources to encode information for subsequent recall. Furthermore, Pickel and Staller (2012) showed that a perpetrator’s accent influenced witnesses’ memories of the perpetrator’s message and physical appearance. Witnesses listening to a message spoken by a native-speaking perpetrator rather than by a foreign-accented perpetrator performed significantly better on a secondary visual search task, suggesting that processing foreign-accented speech is more effortful. Witnesses subsequently recalled more correct details and fewer incorrect details from messages by native-speaking perpetrators. Pickel and Staller (2012) also proposed that processing accented speech demands more cognitive resources, leaving fewer cognitive resources for remembering the speech. However, these studies did not examine the intelligibility of the foreign-accented stimuli. For less intelligible foreign-accented words, initial lexical access failure rather than subsequent memory processes might account for the poor recall. A study by Cho and Feldman (2013) accounted for word intelligibility, and their participants were given 2 s Ó 2019 Hogrefe Publishing
K. Y. Chan et al., Foreign Accent and Serial Recall
to process indexical information of the native and accented stimuli. Then participants either repeated the word or did nothing, followed by a visual presentation of the stimulus for 1.5 s. The free recall of foreign-accented words was superior to that of native words, and memory for less intelligible words was better. Cho and Feldman (2013) suggested that the visual feedback after the initial auditory presentation might make the less intelligible words more salient and memorable. More elaborated traces were formed for accented words due to the more variable indexical information. However, this study could not disentangle the impact of accent-induced recognition difficulty and indexical processing on memory.
Past studies showed that speech recognition interfaces with memory functions (Mattys, Davis, Bradlow, & Scott, 2012), especially when the speech is degraded enough to raise working memory (WM) demands (Francis & Nusbaum, 2009; Surprenant, 1999, 2007). Rabbitt (1968) auditorily presented listeners with two lists of four digits either with or without noise at a masking level that permitted correct identification. Regardless of the initial presentation condition, recall of the first list was poorer when the second list was presented with noise than without it. Rabbitt (1968) proposed that even though recognition of the noise-degraded second list was successful, it required effort and depleted processing resources, which would otherwise be available for effective encoding and rehearsal of the first list. Increased listening effort in adverse conditions can interfere with downstream higher-level cognitive processing (the effortfulness hypothesis). The perceptual processing of accented words, whether correct lexical access is achieved or not, demands increased listening effort (Pickel & Staller, 2012; Van Engen & Peelle, 2014) and can interfere with downstream memory encoding.
As a continuous accented speech stream unfolds, recognition and encoding of the prior words might not be completed before presentation of the next word, which might cause loss of maintained items in short-term memory (STM). Hence, both accented word recognition failure and impaired memory encoding might contribute to poor accented word recall. The current study aimed to disentangle these two potential explanations by contrasting the recall of foreign-accented words in serial recall (SR) tasks with an auditory and a bimodal presentation mode, respectively (Experiment 1). Recall performance at the initial, middle, and final serial positions (SPs) reflects different memory processing. For example, recall performance at initial SPs reflects transfer of information to long-term memory (LTM), and recall performance at final SPs reflects retrieval from STM (Glanzer & Cunitz, 1966; Murdock, 1962). Foreign-accented words with varying intelligibility were used as stimuli, and their intelligibility was measured with a perceptual identification (PI) task. During auditory presentation, low-intelligibility accented
words might induce recognition difficulty that contributes to poor serial recall. During bimodal presentation, the target word was simultaneously presented visually and auditorily to increase correct accented word recognition, so recall performance likely reflects memory processing alone. Contrasting the two conditions can help determine the contribution of accent-induced perceptual difficulty to poor accented word recall. To further test whether accents impair memory encoding, a slower presentation rate was adopted in the auditory SR task in Experiment 2 to provide extra time for the perceptual processing and encoding to be fully completed before the arrival of the next word. The extra time also allows indexical processing of the stimuli. Contrasting Experiments 1 and 2 helped identify the contribution of impaired memory processing to the poor recall of accented words, as well as the influence of indexical processing on accented word recall. Recall errors were categorized as omissions, intrusions, orders, or repetitions (Hurlstone, Hitch, & Baddeley, 2014). Accent and task manipulations might differentially impact different error types, which reflect different cognitive processes. For example, recall of items absent from the original list (i.e., intrusion) could be due to successful recall of misperceptions. Recall in an incorrect SP (order) reflects lower efficiency in encoding relational information for items. Erroneous repetition of an item during recall (repetition) reflects lower efficiency in suppressing representation of an item after recall (Henson, 1998). No response during recall (omission) reflects forgetting or retrieval failure. Error analyses provide further insights into whether the accented words tend to be misperceived, and how accent impacts different perceptual and memory processes. 
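The four error categories described above can be made concrete with a small scoring routine. The following is a minimal sketch, not the authors' scoring script: the function name `classify_recall`, the use of an empty string for a blank response, and the rule that a word's second appearance is the one counted as the repetition are all assumptions made for illustration.

```python
from collections import Counter

def classify_recall(presented, recalled):
    """Label each output position as correct, omission, intrusion, order, or repetition.

    Assumes both lists have the same length and that a blank response is "".
    """
    allowed = Counter(presented)  # how often each word occurred in the study list
    used = Counter()              # how often each word has been recalled so far
    labels = []
    for position, response in enumerate(recalled):
        if not response:
            labels.append("omission")        # no response at this position
        elif response not in allowed:
            labels.append("intrusion")       # word absent from the study list
        else:
            used[response] += 1
            if used[response] > allowed[response]:
                labels.append("repetition")  # recalled more often than presented
            elif presented[position] == response:
                labels.append("correct")
            else:
                labels.append("order")       # list word in the wrong serial position
    return labels

# Hypothetical trial built from the nine stimulus words.
presented = ["bud", "cot", "dog", "fin", "gas", "job", "lice", "pool", "soak"]
recalled = ["bud", "dog", "cot", "fan", "gas", "", "lice", "bud", "soak"]
labels = classify_recall(presented, recalled)
```

In this hypothetical trial, "fan" is scored as an intrusion, which is the error type that a misperception of "fin" would be expected to produce.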
Chan and Vitevitch (2015) showed that a particular word (rather than many different similar sounding words) accounted for a majority of the misperceptions of accented words. Results from Cho and Feldman’s (2013) memory recognition task also showed that both native and accented stimuli activated phonologically similar words, but accented words did not broadly activate more phonological neighbors. These previous findings suggested that misperceptions of accented words are likely to be similar sounding. To test how likely intrusions of accented words were actually misperceptions, intrusions from the SR tasks and misperceptions from the PI task were subjected to phonological analyses and their phonological similarity with the stimuli was examined.
Experiment 1
Experiment 1 examined the impact of accents on the recognition and immediate recall of spoken words using a PI task
and SR tasks with two presentation modes, auditory and bimodal, adopted from Frankish’s (2008) study. In the SR task, the auditory group only heard the stimulus words, whereas the bimodal group simultaneously heard and saw the stimulus words. The auditory group was expected to have difficulty identifying the accented stimuli. The bimodal group could derive the identity of the stimuli from the synchronized visual input and was expected to approach perfect identification. The intelligibility of the accented items was expected to vary depending on their lexical characteristics (Chan & Vitevitch, 2015; Imai, Walley, & Flege, 2005). This experiment took advantage of this variability to examine whether the intelligibility of the accented stimuli was a significant predictor of intrusions and omissions, which are related to accent-induced misperception or recognition failure, in the auditory group. Without identification difficulty, the bimodal group was expected to commit fewer intrusions than the auditory group. Comparing the performances of the two groups can help determine the extent to which accent-induced identification difficulties contribute to poor accented word recall. By comparing the native and accented word recall in the initial and final SPs, we examined the impact of accents on the transfer of an item into LTM and its retrieval from STM (Glanzer & Cunitz, 1966; Murdock, 1962). Based on previous findings about effortful processing for foreign-accented speech (Gill, 1994; Pickel & Staller, 2012), we predicted that fewer resources would remain for encoding the temporal relation of the items, transferring items into LTM, and maintaining items in STM. Hence, more omission or order errors were predicted for accented words.
Method
Participants
Fifty-three native speakers of American English were recruited from the Introductory Psychology participant pool at James Madison University in Virginia. All participants were right-handed and reported no history of speech or hearing disorders. They were randomly assigned to the auditory group (27) or the bimodal group (26).
Materials
The nine English stimulus words, bud, cot, dog, fin, gas, job, lice, pool, and soak, were used to construct 36 stimulus lists for each accent condition. Eight 9 × 9 Latin squares were used to make the stimulus lists, so that each stimulus word appeared twice at each SP. Presentation of the stimulus lists was blocked by accent. To prevent participants from knowing the identity of the stimulus word before hearing the accented stimuli, the accented block was always presented before the native block. Potential practice effects
are modeled as a control variable in the analysis. The sequence of lists within each block was randomized under the constraint that no stimulus word appeared at the same SP on two successive trials. For the PI task, the same accented stimulus words were used.
Survey
The survey asked about participants’ first language, their fluency in any other language(s), any history of hearing and speech disorder, history of studying Cantonese, any family members or close friends with a Cantonese accent, and regular contact with non-native speakers of English.
Speakers
Two female students were recruited from James Madison University to record the stimuli. The native speaker of American English was from the South Atlantic region of the United States. The non-native speaker was from the southern part of China, with Cantonese as her native language and English as an L2 spoken with a strong accent. The stimuli were recorded digitally at a 44.1 kHz sampling rate using Adobe Audition CS5 and the PreSonus AudioBox Studio Set (PreSonus Audio Electronics, Inc.) connected to a Dell PC (Dell Inc.). The amplitude of the individual sound files was increased to their maximum without distortion using Praat (Boersma & Weenink, 2009). The duration of the native (M = 562 ms, SD = 55.6) and accented stimuli (M = 499 ms, SD = 112) did not differ significantly, F(1, 8) = 2.31, p > .05.
Procedure
Each participant sat in front of a Dell PC with a set of Sennheiser HD 25-SP II headphones (Sennheiser Electronic GmbH & Co. KG). The presentation of stimuli and collection of responses were controlled by Paradigm 2.0 (Perception Research Systems, 2007). The whole experiment lasted about 60 min. Participants first completed a language background questionnaire followed by the SR task and the PI task. For the SR task, each participant received 36 accented trials followed by 36 native trials separated by a short break.
Each trial started with a 500 ms warning tone with a “+” sign appearing simultaneously on the computer screen, followed by a blank screen of 1 s. Participants in the auditory group heard the word list over the headphones. For the bimodal group, a word was presented auditorily and a synchronized visual display of that word appeared on screen for 500 ms. The bimodal group was instructed to attend to both the visual and auditory channels. The stimulus words were separated by a 150-ms interstimulus interval (ISI). List presentation was followed by an 18-s interval for written recall of the words in the order of their presentation. Participants were instructed to fill in the nine spaces on the response sheet from left to right, and to guess when
Table 1. Analytic difficulties about the outcomes and explanatory variables in the current data and the statistics strategy adopted

Outcome variables
Analytic difficulty | Statistics strategy
Nested data (trials within people) | Multilevel analysis (aka hierarchical linear modeling; Goldstein, 2011)
Differences across serial positions | Multilevel cross-classification (Goldstein, 2011)
Discrete variable (yes/no) | Logit/Probit
Multiple dependent variables (Y1, Y2, ...) | Multivariate outcome models (Goldstein, 2011)

Explanatory variables
Analytic difficulty | Statistics strategy
Cross-level moderation (e.g., Trial × Serial position) | Random effects model (Goldstein, 2011)
False positives (Type I errors) | Two-stage linear step-up procedure (Benjamini et al., 2006)
Robustness of results | Separate multilevel, single outcome models; analyses of subsets of the data
uncertain. Recall was monitored by the experimenter to ensure participants’ compliance with these instructions. A practice trial was presented before the experiment and was excluded from data analyses. The PI task was the same for both groups. It included a randomized presentation of 90 trials consisting of 10 repetitions of the 9 accented words. Each trial started with the word “READY” appearing on the screen for 500 ms followed by the stimulus word presented over the headphones. Participants were given as much time as they needed to type the word that they heard on the keyboard. They could see and correct their responses on the screen before hitting the ENTER key to initiate the next trial. Participants had five practice trials that were excluded from data analyses.
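The list construction described under Materials relies on the defining property of a Latin square: every word occupies every serial position exactly once per square. The sketch below illustrates that property and the successive-trial constraint. It is only an illustration: the source does not say how the squares were generated, so the cyclic construction and the helper names `cyclic_latin_square` and `repeats_position` are assumptions.

```python
WORDS = ["bud", "cot", "dog", "fin", "gas", "job", "lice", "pool", "soak"]

def cyclic_latin_square(items):
    """Build an n x n Latin square by shifting the item order one step per row."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

def repeats_position(first, second):
    """Return True if any word sits at the same serial position in both lists."""
    return any(a == b for a, b in zip(first, second))

square = cyclic_latin_square(WORDS)  # nine lists of nine words

# Each list (row) and each serial position (column) contains every word exactly once.
rows_ok = all(sorted(row) == sorted(WORDS) for row in square)
columns_ok = all(sorted(col) == sorted(WORDS) for col in zip(*square))

# Successive lists from one cyclic square never share a word at the same position.
successive_ok = not any(repeats_position(a, b) for a, b in zip(square, square[1:]))
```

Lists drawn from different squares can still collide at a serial position, so a randomized trial order would need a check such as `repeats_position` between each pair of neighbouring trials.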
Data Analysis
For the PI task, a response was scored as correct if its phonological transcription matched the stimulus. For the SR tasks, a response was scored as correct only if the target word or a phonetically equivalent spelling was recalled in the correct SP. Recall errors were further classified with respect to output SP as omission, intrusion, order, or repetition errors following the same scoring criteria from McCormack, Brown, Vousden, and Henson (2000). An omission error was recorded for any non-response. An intrusion error was recorded for recall of any word outside the study list. An order error was recorded when a word was recalled in an incorrect SP. A repetition error was recorded for any erroneously repeated recall of any word beyond its number of occurrences. Data for the perceptual identification tasks and the serial recall tasks for each condition can be found in the Electronic Supplementary Material, ESM 1.
We addressed analytic difficulties in the outcomes (nested data, differences across serial positions, discrete variables, and multiple dependent variables) with multivariate outcome, multilevel, cross-classification Logit/Probit analyses (see Equation 1 in Appendix A) and Wald tests (see Table 1; Goldstein, 2011). Furthermore, we dealt with analytic difficulties in the explanatory variables (cross-level moderation, false positives, robustness) with a random effects model (Goldstein, 2011), the two-stage linear step-up procedure (Benjamini, Krieger, & Yekutieli, 2006), separate multilevel, single outcome models, and analyses of subsets of the data (see Table 1). Variables were centered, and we used MLwiN 3.00 software (Charlton, Rasbash, Browne, Healy, & Cameron, 2017). The explanatory variables included: word identification error rate, accented speech, bimodal presentation, time of trial, serial position, regular contact with a non-native speaker, and interactions among variables at the same trial level: Accented Speech × Time of Trial. Practice effects are controlled for with the explanatory variable time of trial.
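The two-stage linear step-up procedure cited for controlling false positives (Benjamini, Krieger, & Yekutieli, 2006) can be sketched in a few lines. This is a generic illustration of the published procedure, not the authors' analysis code: the function names, the default level q = 0.05, and the example p-values are assumptions. Stage 1 runs the Benjamini-Hochberg step-up rule at the reduced level q / (1 + q); stage 2 reruns it at a level scaled up by the estimated proportion of true null hypotheses.

```python
def bh_reject_count(pvalues, q):
    """Number of hypotheses rejected by the Benjamini-Hochberg step-up rule at level q."""
    m = len(pvalues)
    count = 0
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank * q / m:
            count = rank  # step-up: keep the largest rank that passes
    return count

def two_stage_step_up(pvalues, q=0.05):
    """Return a reject/retain flag per p-value, in the original order."""
    m = len(pvalues)
    q1 = q / (1 + q)                  # stage 1 level
    r1 = bh_reject_count(pvalues, q1)
    if r1 == 0:
        return [False] * m            # nothing rejected, stop
    if r1 == m:
        return [True] * m             # everything rejected, stop
    m0_hat = m - r1                   # estimated number of true nulls
    q2 = q1 * m / m0_hat              # stage 2 level
    r2 = bh_reject_count(pvalues, q2)
    cutoff = sorted(pvalues)[r2 - 1]  # largest rejected p-value
    return [p <= cutoff for p in pvalues]

# Hypothetical p-values: stage 1 rejects one hypothesis, stage 2 rejects three.
example = two_stage_step_up([0.04, 0.9, 0.001, 0.03])
```

With these made-up p-values, only 0.001 survives stage 1, but the stage 2 level is large enough to also reject 0.03 and 0.04, which shows where the procedure gains power over a single Benjamini-Hochberg pass.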
Results and Discussion
Table 2 shows the mean identification error rates and standard deviations per stimulus word in the PI task. The mean identification error rates showed a large variability in the auditory group, ranging from 0% to 41.1%. Compared to native words, an accuracy rate of about 60% seems low, but it is near the typical range for accented speech: 52.1% for isolated words in Chan and Vitevitch (2015), 52.8% for isolated words embedded in noise in Imai et al. (2005), and 51–54% for sentences in Bent and Bradlow (2003). The bimodal group almost achieved perfect identification with minimal variability. Statistical power differs for each level. For an effect size of 0.1, statistical power exceeded .99 at both the serial position level (34,344 potential errors) and at the trial level (3,816 trials). For 53 participants, however, statistical power is only .86 for an effect size of 0.4. Accented word identification error rate, accent, bimodal presentation, time of trial, serial position, and their interactions were all linked to recall errors (see Table 3). Figure 1 displays the mean accuracy
Table 2. Mean identification error rates (%) and standard deviations (SD) for each of the nine stimulus words in the perceptual identification task, Experiments 1–2

Word | Exp. 1 Auditory M | Exp. 1 Auditory SD | Exp. 1 Bimodal M | Exp. 1 Bimodal SD | Exp. 2 Accented M | Exp. 2 Accented SD | Exp. 2 Native M | Exp. 2 Native SD
Bud | 4.44 | 19.00 | 0.80 | 2.70 | 9.13 | 28.70 | 1.20 | 2.70
Cot | 38.50 | 48.00 | 0.00 | 0.00 | 53.50 | 49.90 | 20.80 | 1.69
Dog | 0.00 | 0.00 | 0.00 | 0.00 | 1.30 | 3.44 | 0.00 | 0.00
Fin | 41.10 | 46.00 | 2.70 | 4.50 | 48.30 | 50.70 | 20.40 | 1.26
Gas | 1.85 | 4.80 | 1.20 | 3.30 | 2.17 | 5.18 | 0.80 | 1.69
Job | 1.48 | 5.30 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
Lice | 5.56 | 20.00 | 0.40 | 2.00 | 13.90 | 34.20 | 9.20 | 6.27
Pool | 15.20 | 36.00 | 0.00 | 0.00 | 9.57 | 28.70 | 1.20 | 1.93
Soak | 8.15 | 27.00 | 0.00 | 0.00 | 17.80 | 38.60 | 0.00 | 0.00
rates (Figure 1A), mean error rates for intrusion (Figure 1B), omission (Figure 1C), repetition (Figure 1D), and order (Figure 1E) as a function of output SP for the two accent conditions in the two groups. Recall performance was lower for accented words than for native words across all SPs in the auditory group. A higher accented word identification error rate was associated with higher rates of intrusion (β = 0.012, SE = 0.001, p < .001), omission (β = 0.003, SE = 0.001, p < .05), and order errors (β = 0.003, SE = 0.001, p < .01). In other words, the less intelligible the accented words were, the higher the intrusion, omission, and order error rates were. The auditory group made significantly more intrusions during recall of accented words than native words, across all SPs, β = 0.318, SE = 0.084, p < .001. These results suggest that less intelligible accented stimuli might contribute to misperceptions and successful recall of the misperception, thereby yielding intrusions. Figure 1C suggested a higher omission error rate for accented words than native words, especially in late SPs, regardless of presentation mode. However, the regression model showed no significant accent effect on omission error rates in the auditory group, except at SPs 4 and 9, controlling for other explanatory variables. The other explanatory variable that might account for this discrepancy is the significant interaction between time of trial and SP at middle and late SPs; see Table 3 for details. As the practice effect was larger at middle and late SPs, and native trials were presented after the accented trials, it might have partially accounted for the fewer omissions observed for native word recall at middle and late SPs. Contrary to our predictions, order errors for accented words and native words were not significantly different in the auditory group.
One possible explanation is that accented words recalled in an incorrect order were also misperceived and counted as intrusions instead.
The bimodal group made fewer intrusions than the auditory group, for both accented words, β = −1.143 (= −2.016 + 0.318 + 0.555), p < .001, and native words, β = −2.016, SE = 0.249, p < .001, which showed a larger reduction, β = 0.555, SE = 0.105, p < .001. As the intelligibility of accented stimuli was lower than that of native stimuli in this study, this latter result was unexpected. Even without recognition difficulty, the bimodal group still made significantly more intrusions for recalling accented words than native words (−1.143 > −2.016; βaccented = −1.143 [= −2.016 + 0.318 + 0.555]; βnative = −2.016, respectively). This result suggests that accents might exert other detrimental effects on memory in addition to recognition difficulty. More omissions occurred in the bimodal group than the auditory group at SPs 7–9, βsp7 = 0.791, SE = 0.084; βsp8 = 1.339, SE = 0.086; βsp9 = 1.383, SE = 0.089; all ps < .001; during recall of both native words, β = 1.748, SE = 0.731, p < .05, and accented words, β = 1.546 (= 1.748 − 0.027 − 0.175), p < .05. Increased omissions in final SPs suggest that the additional information in the bimodal presentation interferes with direct retrieval of information from STM or its maintenance. The bimodal group also showed significantly more order errors than the auditory group at SPs 3–6 (βsp3 = 0.380, SE = 0.079; βsp4 = 0.271, SE = 0.079; βsp5 = 0.439, SE = 0.079; βsp6 = 0.239, SE = 0.080; all ps < .001). Encoding of relational information for items at middle SPs was poorer as they were less likely to be rehearsed enough to enter LTM or to be retrieved directly from STM. Taken together, these results suggest that integrating the auditory and visual cues from bimodal information might increase participants’ cognitive load regardless of accent type, which can yield poor encoding of the relational information among middle items, as well as poor maintenance of information in STM.
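The combined coefficients reported for the bimodal group can be checked with simple arithmetic. The sketch below assumes the signs that make the reported sums balance: a negative bimodal coefficient for intrusions, and negative accent and interaction adjustments for omissions. The variable names and the `odds_ratio` helper are illustrative, and reading exp(β) as an odds ratio additionally assumes a logit rather than a probit link, which the text does not settle.

```python
import math

# Intrusion coefficients (log-odds scale), relative to native words heard auditorily.
bimodal = -2.016             # bimodal presentation
accented = 0.318             # accented speech
accented_x_bimodal = 0.555   # Accented x Bimodal interaction
intrusion_accented_bimodal = bimodal + accented + accented_x_bimodal

# Omission coefficients combined the same way.
omission_accented_bimodal = 1.748 - 0.027 - 0.175

def odds_ratio(beta):
    """Convert a logit coefficient into an odds ratio (assumes a logit link)."""
    return math.exp(beta)

# An odds ratio below 1 means fewer intrusions than the native auditory baseline.
intrusion_odds_ratio = odds_ratio(intrusion_accented_bimodal)
```

Under these assumptions the intrusion sum reproduces the reported −1.143 and the omission sum reproduces the reported 1.546.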
The time of trial effect was significant for intrusions (β = 0.007, SE = 0.002, p < .001), omissions (β = 0.010, SE = 0.002, p < .001), and order errors (β = 0.005, SE = 0.001, p < .001). Participants were slightly less likely to make these errors in later trials, suggesting a small practice effect less than a tenth the size of other significant regression coefficients. The practice effect was larger for accented words than native words for omission, β = 0.013, SE = 0.004, p < .01. Participants were slightly more likely to commit repetitions in later trials (β = 0.010, SE = 0.001, p < .001), suggesting participants’ weaker response suppression in later trials. Controlling for this practice effect, the other explanatory variables still showed significant effects. Other variables and interactions were not significant. The current results suggest that accent-induced identification difficulty accounted for the lower recall performance of accented words compared to native words. In bimodal
Table 3. Summary of multivariate outcome, 3-level cross-classification analyses of intrusion, omission, repetition, and order errors for Experiment 1 Experiment 1 Explanatory variable Constant
Intrusion (SE)
Omission (SE)
Repetition (SE)
Order (SE)
2.862*** (0.125)
3.284*** (0.413)
3.162*** (0.172)
1.247*** (0.128)
Word identification error rate
0.012*** (0.001)
Accented
0.318*** (0.084)
Bimodal
2.016*** (0.249)
Contact w non-native Time of trial SP 2
0.007*** (0.002)
(0.001)
0.003** (0.001)
(0.082)
1.748*
(0.731)
0.126
(0.826)
0.010*** (0.002)
(0.063)
0.236
(0.344)
0.010*** (0.001)
(0.225) (0.255)
0.005*** (0.001)
(0.153)
0.905*** (0.070)
0.368*** (0.056)
0.967*** (0.141)
1.462*** (0.068)
SP 4
0.993*** (0.064)
1.152*** (0.138)
1.693*** (0.067)
1.553*** (0.061)
1.803*** (0.141)
1.711*** (0.067)
0.214** (0.062)
0.435**
0.373 0.206
SP 3 SP 5
0.187**
0.003* 0.027
SP 6
0.312*** (0.061)
1.795*** (0.060)
2.055*** (0.138)
1.466*** (0.067)
SP 7
0.044
1.599*** (0.061)
2.249*** (0.136)
1.139*** (0.068)
1.669*** (0.062)
2.217*** (0.136)
0.607*** (0.071)
1.485*** (0.062)
1.142*** (0.149)
0.992*** (0.079)
(0.087)
SP 8 SP 9 Accented Bimodal
0.203** (0.069) 0.555*** (0.105)
Accented Time of trial
0.175*
(0.084)
0.013** (0.004)
Accented SP 2 Accented SP 3 Accented SP 4
0.381*
(0.171)
0.391*
(0.164)
0.198*
(0.097)
Accented SP 5 Accented SP 6 Accented SP 7 Accented SP 8 Accented SP 9 Bimodal SP 2 Bimodal SP 3
0.380*** (0.079)
Bimodal SP 4
0.271** (0.079)
Bimodal SP 5
0.439*** (0.079)
Bimodal SP 6
0.293*** (0.080)
Bimodal SP 7
0.500** (0.168)
0.791*** (0.084)
Bimodal SP 8
1.339*** (0.086)
Bimodal SP 9
1.383*** (0.089)
Contact w non-native SP 2
0.275*
Contact w non-native SP 3
0.382** (0.126)
Contact w non-native SP 4
0.635*** (0.120)
Contact w non-native SP 5
0.821*** (0.113)
(0.130)
0.571*** (0.125) 0.423*
(0.175)
0.872*** (0.125)
Contact w non-native SP 6
0.948*** (0.111)
0.565** (0.166)
0.957*** (0.125)
Contact w non-native SP 7
1.139*** (0.113)
0.662*** (0.160)
0.911*** (0.126)
Contact w non-native SP 8
1.493*** (0.113)
0.903*** (0.160)
0.706*** (0.132)
Contact w non-native SP 9
1.631*** (0.114)
0.834*** (0.202)
Time of trial SP 2 Time of trial SP 3 Time of trial SP 4
0.015*** (0.004)
Time of trial SP 5
0.011*** (0.002)
Time of trial SP 6
0.012*** (0.002)
Time of trial SP 7
0.017*** (0.002)
Time of trial SP 8
0.018*** (0.002)
Time of trial SP 9
0.030*** (0.004) (Continued on next page)
Table 3. (Continued) Experiment 1 Explanatory variable
Intrusion (SD)
Omission (SD)
Repetition (SD)
Order (SD)
Variance at each level Subject Trial SP
36%
65%
25%
14%
–
4%
–
6%
64%
31%
75%
80%
0.603
0.125
0.084
0.076
–
0.000
0.037
0.100
0.123
0.158
0.240
0.111
0.113
0.137
0.245
1.267
0.375
0.873
Explained variance at each level Subject Trial SP Total variance explained BIC
–
0.000
Notes. The default categories for comparison are: Accent type – Native; Presentation mode – Auditory; SP – SP1; Contact w non-native – No regular contact with non-native speakers. BIC = Bayesian information criterion. Initially, nonsignificant explanatory variables were removed to preserve degrees of freedom without increasing omitted variable bias. Some explanatory variables were initially significant but were no longer significant after addition of interaction terms; these variables remain in the model for proper interpretation of the results. SE = standard error. *p < .05; **p < .01; ***p < .001.
presentation, participants showed fewer intrusions for both native and accented words. With the visual display to reduce misperception, participants were more likely to successfully identify, encode, and retrieve words. However, recall of accented words still showed significantly more intrusions than native words, suggesting that the detrimental effects of accents go beyond just misidentification of words. Omissions and order errors occurred more often at late and middle SPs, respectively, in the bimodal condition than in the auditory condition. In the bimodal condition, integrating the visual display with the auditory stimuli might increase the overall cognitive load. This might cause poorer encoding of the relational information among middle items, as well as poorer maintenance of information in STM regardless of accent. Overall, this pattern of results suggested that recognition difficulty induced by accents contributed to increased intrusions during recall. Apart from misperception, recognition difficulty induced by acoustic-phonetic deviations in accented speech might also disrupt encoding of the stimulus into memory. To test whether foreign accent also disrupts memory processing, we increased ISI in Experiment 2 to allow extra time for processing and encoding of the foreign-accented words.
Experiment 2
Experiment 2 aimed to examine whether foreign accents exert detrimental effects on memory processing in addition to misidentification of words. A foreign accent induces mismatches between the speech input and the representations stored in listeners’ memories, so more processing time might be required to resolve these mismatches during accented word recognition (Munro & Derwing, 1995b; Van Engen & Peelle, 2014). With only 150-ms ISI in Experiment 1, identification and phonological encoding of the accented words might be incomplete and disrupted by successive stimuli. To test whether accents incur extra processing costs on the phonological encoding and rehearsal of stimuli, ISI was increased to 4 s in Experiment 2. The longer ISIs provided participants sufficient time to finish recognizing, encoding, and rehearsing accented words to facilitate later retrieval. Improvement of accented word recall performance in this experiment compared to the auditory condition in Experiment 1 would reflect the processing costs induced by accent on phonological encoding, as well as the benefit of having extra time for rehearsal. However, long ISIs do not help participants comprehend accented words that would otherwise be misrecognized, so accent-induced misperception was expected to remain. A native condition with longer ISIs serves as a baseline for comparison with the accented condition, in which accent-induced misperceptions and recognition failures cause errors. Increasing the ISIs increased the duration of the whole experiment. To reduce participants’ potential fatigue, we kept the total duration of the whole experiment comparable with Experiment 1 by collecting data on the native and accented conditions from two randomly assigned, separate groups of participants.
Method
Participants
Fifty participants with the same profile of attributes described in Experiment 1 were recruited for Experiment 2. Participants were randomly assigned to the two conditions (25 participants in each condition).
[Figure 1, panels A–E. x-axis: Output Serial Position (1–9); y-axis: Mean Accuracy Rate (%) in panel A and Mean Error Rate (%) in panels B–E, each scaled from 0 to 100; plotted series: Auditory Accent, Auditory Native, Bimodal Accent, Bimodal Native.]
Figure 1. (A) Mean accuracy rates (%), (B) mean error rates (%) for intrusion, (C) omission, (D) repetition, and (E) order error with the error bars representing 95% confidence intervals are plotted as a function of output SP for the two accent conditions in the serial recall task with auditory and bimodal presentation, Experiment 1.
Materials
The same set of native and accented stimulus words from Experiment 1 was used as stimuli in the SR and PI tasks.
Procedure
The procedure was identical to the auditory condition of Experiment 1 except for the following. Each participant only received the first 36 trials in the SR task with 4-s ISIs. For the native condition, native stimuli from the SR task were used as stimuli in the PI task.
Results and Discussion
Data for the perceptual identification tasks and the serial recall tasks for each condition can be found in ESM 1. The scoring criteria in Experiment 1 were also used in Experiment 2. The mean identification error rate and standard deviation for the PI task are shown in Table 2. The native words were highly intelligible with a mean identification error rate of 5.95%, ranging from 0% to 20.8%. As in Experiment 1, the mean identification error rates of the accented words varied substantially, ranging from 0% to 53.5%.
Comparison of Accented Conditions Across Experiments
For the SR task, we pooled only the participants in the accented condition across Experiments 1 and 2 during the data analysis. For an effect size of 0.1, statistical power exceeded .99 at both the SP level (25,272 potential errors) and trial level (2,808 trials). For 78 participants, statistical power is .95 for an effect size of 0.4. The analysis was the same as Equation 1 except that the explanatory variable, 4-second ISI, was added, and accented was omitted, along with their interaction variables. Figure 2 displays the mean accuracy rates (Figure 2A), mean error rates for intrusion (Figure 2B), omission (Figure 2C), repetition (Figure 2D), and order (Figure 2E) as a function of output SP for the accent conditions in the SR task with auditory presentation and 150-ms ISI in Experiment 1 and 4-s ISI in Experiment 2. A summary of the multivariate outcome, 3-level cross-classification analyses of intrusion, omission, repetition, and order errors for Experiment 2 is shown in Table 4. As the bimodal condition results were the same as those in Experiment 1, we focus on the results related to Experiment 2. Consistent with Experiment 1, accented word identification error rate significantly predicted intrusion (β = 0.019, SE = 0.001, p < .001), omission (β = 0.002, SE = 0.001, p < .01), and order error rates (β = 0.002, SE = 0.001, p < .01).
The less intelligible the accented words were, the higher the likelihood of intrusion, omission, or order errors during recall. Long ISIs in Experiment 2 resulted in significantly fewer intrusions and repetitions during accented word recall, βintrusion = 0.591, SE = 0.252, p < .05; βrepetition = 0.734, SE = 0.248, p < .01. Interactions between ISI and SP were significant for all types of errors during accented word recall. The auditory presentation mode with a longer 4-s ISI resulted in fewer intrusions during accented word recall at SPs 6 and 7, βSP6 = 0.333; βSP7 = 0.283, and fewer repetitions at SPs 6, 8, and 9, βSP6 = 0.540; βSP8 = 0.742; βSP9 = 1.687.
Experimental Psychology (2019), 66(1), 40–57
K. Y. Chan et al., Foreign Accent and Serial Recall
Compared to 150-ms ISIs, 4-s ISIs resulted in significantly fewer order errors during accented word recall at SPs 5, 8, and 9, βSP5 = 0.330; βSP8 = 0.410; βSP9 = 0.455, as well as fewer omissions at SPs 5 and 9, βSP5 = 0.623; βSP9 = 0.576. The intrusion and omission results implied that with short ISIs, listeners struggled to identify and encode the accented words into memory, especially those in the middle SPs, as the extra processing costs induced by accents accumulate during the progressive stimulus presentation. Longer ISIs provided more time for resolving mismatches induced by accent so that identification and encoding of the accented words could be completed without interference from incoming stimuli. Results from Experiment 2 demonstrated that the disruption of phonological encoding incurred by accents can partially account for performance deficits for recall of accented words presented auditorily. Experiment 2 consistently shows that longer ISIs aid encoding and recall of middle and late items. Fewer repetitions during accented word recall occurred with 4-s ISIs, especially in the middle and late SPs. This implies that with longer ISIs, participants could better encode the accented stimuli and were less likely to erroneously repeat an item that had been recalled earlier. Also, fewer order errors occurred in middle and late SPs during accented word recall with 4-s ISI than with 150-ms ISI. This suggests that longer ISIs allowed participants to better encode the relational information among the middle and late items. As people’s recall for middle items is typically worse than for items in the initial and final SPs, it is not surprising that longer ISIs benefit the middle items more than others (Glanzer & Cunitz, 1966).
With the build-up of cognitive load from rehearsing early items and interference from incoming stimuli, the middle items are typically not rehearsed enough to be transferred to LTM nor maintained long enough for retrieval from STM (Glanzer & Cunitz, 1966). With longer ISIs, the middle items were more likely to be processed completely and rehearsed enough to be transferred to LTM. The fewer order and omission errors at late SPs with longer rather than shorter ISIs might be explained by the listeners using the additional acoustically coded representation of the final item stored in a separate sensory buffer store, namely echoic memory (Neisser, 2014) or precategorical acoustic storage (Crowder & Morton, 1969) with the longer ISIs. As the final list item was not followed by other stimuli, this additional acoustically coded representation of the final item stored in echoic memory (Neisser, 2014) was not overwritten by subsequent auditory events. With longer ISIs, there was sufficient time for the listeners to process the indexical information of the final stimuli in echoic memory, such as the gender, voice, and accent of the
© 2019 Hogrefe Publishing
[Figure 2: five line plots, panels (A) to (E). x-axis: Output Serial Position (1–9). y-axis: Mean Accuracy Rate (%) in panel A and Mean Error Rate (%) in panels B to E, each scaled 0–100. Series in every panel: 150-ms ISI Accent, 150-ms ISI Native, 4-s ISI Accent, 4-s ISI Native.]
Figure 2. (A) Mean accuracy rates (%), (B) mean error rates (%) for intrusion, (C) omission, (D) repetition, and (E) order error with the error bars representing 95% confidence intervals are plotted as a function of output SP for the two accent conditions in the serial recall task with auditory presentation and 150 ms interstimulus interval (ISI) in Experiment 1 and 4 s ISI in Experiment 2.
speaker (Nygaard, Sommers, & Pisoni, 1995). This additional indexical information might make the final items more temporally distinctive than the prior items. Therefore, the final items were more likely to be accurately retrieved without being mis-ordered or lost. The time of trial effect was significant for omissions (β = 0.013, SE = 0.002, p < .001) and order errors (β = 0.006, SE = 0.002, p < .01). Participants were slightly less likely to make omission and order errors in later trials, showing small practice effects. The significant time of trial and SP interaction at SPs 4–9 for omission suggests fewer omission errors for middle and late SPs at later trials, βSP4 = 0.019; βSP5 = 0.025; βSP6 = 0.019; βSP7 = 0.018; βSP8 = 0.025; βSP9 = 0.045. The time of trial effect was also significant for repetitions (β = 0.010, SE = 0.003, p < .01), suggesting participants’ weaker response suppression in later trials. Other variables and interactions were not significant.
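The four error types analysed above can be made concrete with a small scoring sketch. The scoring criteria actually used in this study are defined with Experiment 1 and are not reproduced in this section, so the rules below are assumptions chosen only for illustration: a blank response counts as an omission, a list item produced in the wrong position as an order error, an item already produced earlier in the same trial as a repetition, and any item that was never presented as an intrusion. The function name `score_trial` and the example words are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence


def score_trial(presented: Sequence[str],
                recalled: Sequence[Optional[str]]) -> List[str]:
    """Label each output serial position under the hypothetical rules above."""
    labels: List[str] = []
    already_recalled = set()
    for position, target in enumerate(presented):
        response = recalled[position] if position < len(recalled) else None
        if not response:
            labels.append("omission")      # blank or missing response
            continue
        if response == target:
            labels.append("correct")       # right item in the right position
        elif response in already_recalled:
            labels.append("repetition")    # item already produced in this trial
        elif response in presented:
            labels.append("order")         # list item recalled in the wrong position
        else:
            labels.append("intrusion")     # item that was never presented
        already_recalled.add(response)
    return labels


# Hypothetical four-item trial (the study itself used nine serial positions).
presented = ["cat", "dog", "sun", "pen"]
recalled = ["cat", "sun", None, "cat"]
labels = score_trial(presented, recalled)
print(labels)           # ['correct', 'order', 'omission', 'repetition']
print(Counter(labels))  # per-trial tally of each label
```

Tallying such labels by output serial position across trials would give error-rate curves of the kind plotted in Figure 2, although the published rates depend on the study's own criteria rather than on these assumed ones.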
Table 4. Summary of multivariate outcome, 3-level cross-classification analyses of intrusion, omission, repetition, and order errors for Experiments 1 and 2, accented conditions only Explanatory variable Constant Word identification error rate
Intrusion (SE)
Omission (SE)
Repetition (SE)
Order (SE)
2.817*** (0.132)
2.312*** (0.340)
3.595*** (0.141)
1.221*** (0.120)
0.019*** (0.001)
0.002** (0.001)
0.002** (0.001)
Bimodal
1.624*** (0.257)
1.406*
(0.616)
0.407
(0.211)
4-second interstimulus interval
0.591*
(0.252)
0.814
(0.621)
0.734** (0.248)
0.219
(0.214)
0.414
(0.574)
0.107
0.096
(0.200)
0.002
(0.002)
Regular contact with non-native speaker Time of trial
0.013*** (0.002)
(0.260)
0.010** (0.003)
SP 2
0.006** (0.002) 0.848*** (0.078)
SP 3 SP 4
0.402*** (0.070)
0.788***
1.347*** (0.077)
0.931*** (0.071)
1.022*** (0.134)
1.543*** (0.075)
SP 5
0.166*
(0.067)
1.137*** (0.069)
1.438*** (0.125)
1.408*** (0.078)
SP 6
0.281*** (0.068)
1.625*** (0.065)
1.240*** (0.146)
1.183*** (0.076)
SP 7
0.138*
(0.070)
1.799*** (0.077)
1.714*** (0.129)
0.855*** (0.080)
SP 8
0.125
(0.067)
1.737*** (0.072)
1.500*** (0.140)
0.183*
1.288*** (0.082)
0.284
SP 9
(0.245)
(0.087)
1.364*** (0.110)
Bimodal SP 2 Bimodal SP 3
0.426*** (0.092)
Bimodal SP 4 Bimodal SP 5 Bimodal SP 6
0.607*** (0.104)
Bimodal SP 7
0.943*** (0.123)
Bimodal SP 8
1.227*** (0.108)
Bimodal SP 9
1.108*** (0.127)
4-second interstimulus interval SP 2 4-second interstimulus interval SP 3 4-second interstimulus interval SP 4 4-second interstimulus interval SP 5
0.623*** (0.115)
4-second interstimulus interval SP 6
0.333*
(0.132)
4-second interstimulus interval SP 7
0.283*
(0.136)
0.330** (0.098) 0.540* (0.224) 0.194
0.385** (0.128)
4-second interstimulus interval SP 8
(0.104)
0.742*** (0.210)
0.410** (0.123)
1.687*** (0.443)
0.455*
(0.199)
Contact w non-native SP 2
0.314*
(0.146)
Contact w non-native SP 3
0.469** (0.143)
4-second interstimulus interval SP 9
0.576*** (0.148)
Contact w non-native SP 4
0.337** (0.120)
0.494*** (0.141)
Contact w non-native SP 5
0.764*** (0.142)
Contact w non-native SP 6
0.698*** (0.143)
Contact w non-native SP 7
0.320** (0.114)
Contact w non-native SP 8
0.713*** (0.116)
Contact w non-native SP 9
0.918*** (0.120)
0.411*
(0.175)
0.544*
(0.226)
0.686*** (0.145) 0.626*** (0.154)
Time of trial SP 2 Time of trial SP 3 Time of trial SP 4
0.019** (0.006)
Time of trial SP 5
0.025*** (0.006)
Time of trial SP 6
0.019*** (0.005)
Time of trial SP 7
0.018** (0.005)
Time of trial SP 8
0.015*
(0.006)
Time of trial SP 9
0.025*** (0.005) 0.045*** (0.006)
Variance at each level Subject Trial SP
32%
54% 6%
–
6%
68%
40%
78%
81%
–
22%
13%
Explained variance at each level Subject
0.515
0.081
0.184
0.105
Trial
–
0.000
–
0.000 0.162
SP Total variance explained BIC
0.024
0.112
0.104
0.183
0.088
0.122
0.145
0.073
0.697
0.702
0.871
Notes. The default category for comparison are: Accent type – Native; Presentation mode – Auditory; SP = SP1; Contact w non-native – No regular contact with non-native speakers. BIC = Bayesian information criterion; SE = standard error. Initially, nonsignificant explanatory variables were removed to preserve degrees of freedom without increasing omitted variable bias. Some explanatory variables were initially significant but were no longer significant after addition of interaction terms; these variables remain in the model for proper interpretation of the results. *p < .05; **p < .01; ***p < .001.
Comparison of the Accented and Native Conditions
Participants from the native and accented conditions were pooled. The analysis equation is the same as Equation 1 except that bimodal and its interactions were removed. For an effect size of 0.1, statistical power exceeded .99 at the SP level (16,800 potential errors) and is .99 at the trial level (1,800 trials). For 25 participants, statistical power is only .52 for an effect size of 0.4. Accent, serial position, time of trial, and their interactions were linked to recall errors. Participants with higher word identification error rates had more intrusions, β = 0.027, SE = 0.001, p < .001 (see Table 5). Participants made more intrusions when recalling accented words than native words, β = 0.798, SE = 0.276, p < .01. Even though word recognition was more likely to be completed with 4-s ISIs, misperceptions and recognition failure still occurred for accented words with low intelligibility. There were significantly more omissions, but fewer order errors, for accented words than native words at SPs 8 and 9 (omission: βsp8 = 0.572, SE = 0.140, p < .001; βsp9 = 0.589, SE = 0.168, p < .001; order: βsp8 = 0.370, SE = 0.139, p < .01; βsp9 = 0.550, SE = 0.216, p < .05). Accented words were more likely than native words to be mis-recognized, so they were more likely to be omitted rather than mis-ordered during recall. Time of trial was significant for intrusions (β = 0.007, SE = 0.003, p < .05) and order errors (β = 0.010, SE = 0.003, p < .001). In later trials, participants were slightly more likely to make intrusions, but less likely to make order errors. Time of trial was significant for omissions at SPs 4–9, showing fewer omissions in later trials at the middle and late SPs. The interaction between time of trial and accent was significant for repetition, β = 0.019, SE = 0.004, p < .01, suggesting that participants showed weaker response suppression in later trials for accented words than native words.
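The participant-level power figures quoted for the two pooled analyses (.95 for 78 participants and .52 for 25 participants at an effect size of 0.4) can be checked for plausibility with a short calculation. The text does not say how power was computed, so the sketch below rests on assumptions that are not taken from the article: the effect size is treated as a correlation-type r, the test is two-tailed with α = .05, and the Fisher z approximation is used. The function name `correlation_power` is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-tailed power to detect a correlation r with n cases.

    Assumption (not from the article): Fisher z approximation, in which
    atanh(r) is roughly normal with standard error 1 / sqrt(n - 3).
    """
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    noncentrality = atanh(r) * sqrt(n - 3)
    upper_tail = 1 - standard_normal.cdf(z_crit - noncentrality)
    lower_tail = standard_normal.cdf(-z_crit - noncentrality)
    return upper_tail + lower_tail


for n in (78, 25):
    print(n, round(correlation_power(0.4, n), 2))
```

Under these assumptions the approximation gives about .96 for 78 participants and about .51 for 25 participants, close to but not identical with the reported .95 and .52; the small gap is what one would expect if the original values came from tabled or exact calculations rather than this approximation.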
Phonological Analyses of Intrusions and Misperceptions
Previous research showed that misperceptions for native words (Vitevitch & Luce, 1999) and foreign-accented words (Chan & Vitevitch, 2015; Cho & Feldman, 2013) are likely to sound similar to the target words. To determine whether intrusions were likely a result of successful recall of misperceptions, intrusions from SR tasks were matched with misperceptions from the PI tasks and their phonological similarity to the stimuli was examined. Details of the phonological transcription and matching on similarity are shown in Appendix B. Intrusions in Experiments 1 and 2 were categorized into either matching with misperceptions or not, and the corresponding frequency distribution is displayed in Table 6. As expected, chi-square tests of independence showed a significant association between accent and matching with misperceptions for intrusions during the auditory presentation with 150-ms ISIs, χ²(1, N = 2,994) = 232.0, p < .001; during the bimodal presentation with 150-ms ISIs, χ²(1, N = 561) = 17.8, p < .001; and the auditory presentation with 4-s ISIs, χ²(1, N = 1,897) = 71.1, p < .001. A higher proportion of intrusions from accented words than native words matched with misperceptions, confirming that intrusions from accented words were more likely to stem from misperception. For intrusions from accented words in Experiment 1, a chi-square test of independence also showed a significant association between presentation mode and matching with misperceptions, χ²(1, N = 2,171) = 148.5, p < .001. Compared to bimodal presentation, a higher proportion of intrusions from auditory presentation matched with misperceptions, confirming that accented stimuli were more likely to be misperceived with auditory presentation, and their successful recall manifested as intrusions. Aligned with findings from Chan and Vitevitch (2015) and Cho and Feldman (2013), a majority of the misperceptions for accented words sounded similar to the stimulus words: 72.7% and 85.9% for the auditory groups with 150-ms ISIs and 4-s ISIs, respectively. These contrast with only 40% for the bimodal group. Also, only 31% of the misperceptions for native words sounded similar to the stimuli. Intrusions in Experiments 1 and 2 were categorized as sounding similar to the stimuli or not, and further
Table 5. Summary of multivariate outcome, 3-level cross-classification analyses of intrusion, omission, repetition, and order errors for Experiment 2 Explanatory variable Constant
Intrusion (SE)
Omission (SE)
Repetition (SE)
2.847*** (0.138)
2.804*** (0.264)
3.887*** (0.145)
Word identification error rate
0.027*** (0.001)
Accented
0.798** (0.276)
Order (SE) 1.352*** (0.126) 0.406
0.140
(0.527)
0.104 (0.285)
(0.252)
0.004
(0.003)
0.002 (0.004)
SP 3
0.484*** (0.093)
0.502** (0.185)
0.876*** (0.083)
SP 4
0.942*** (0.088)
0.576** (0.181)
1.205*** (0.081)
SP 5
1.203*** (0.086)
1.110*** (0.162)
1.012*** (0.082)
SP 6
1.389*** (0.085)
0.662*** (0.178)
1.003*** (0.082)
SP 7
1.377*** (0.085)
1.004*** (0.165)
SP 8
0.855*** (0.089)
0.711*** (0.176)
0.115
SP 9
0.082
0.710** (0.274)
1.449*** (0.123)
Regular contact with non-native speaker Time of trial
0.007*
(0.003)
SP 2
0.010*** (0.003) 0.681*** (0.084)
(0.102)
Accented Time of trial
0.548*** (0.085) (0.091)
0.019* (0.009)
Accented SP 2 Accented SP 3 Accented SP 4 Accented SP 5 Accented SP 6 Accented SP 7 Accented SP 8
0.572*** (0.140)
0.370** (0.139)
Accented SP 9
0.589*** (0.168)
0.550*
(0.216)
Contact w non-native SP 2 Contact w non-native SP 3 Contact w non-native SP 4 Contact w non-native SP 5 Contact w non-native SP 6 Contact w non-native SP 7 Contact w non-native SP 8 Contact w non-native SP 9 Time of trial SP 2 Time of trial SP 3 Time of trial SP 4
0.016*
Time of trial SP 5
0.037*** (0.007)
Time of trial SP 6
0.030*** (0.007)
Time of trial SP 7
0.037*** (0.007)
Time of trial SP 8
0.057*** (0.008)
Time of trial SP 9
0.070*** (0.009)
(0.008)
0.018** (0.006)
Variance at each level Subject Trial SP
31%
44%
20%
15%
–
8%
–
12%
69%
48%
80%
73%
0.456
0.028
0.070
0.047
Explained variance at each level Subject Trial SP Total variance explained BIC
–
0.000
–
0.000
0.054
0.077
0.068
0.158
0.179
0.053
0.069
0.122
0.223
0.509
1.150
0.743
Notes. The default category for comparison are: Accent type – Native; Presentation mode – Auditory; SP = SP1; Contact w non-native – No regular contact with non-native speakers. BIC = Bayesian information criterion; SE = standard error. Initially, nonsignificant explanatory variables were removed to preserve degrees of freedom without increasing omitted variable bias. Some explanatory variables were initially significant but were no longer significant after addition of interaction terms; these variables remain in the model for proper interpretation of the results. *p < .05; **p < .01; ***p < .001.
Table 6. The frequency distribution (relative frequency in parentheses) of intrusions in each of the conditions in Experiments 1 and 2 across matching with misperceptions or not
Accent    Presentation mode  ISI     Matched with misperceptions  Not matched with misperceptions
Accented  Auditory           150 ms  1,060 (59.4%)                724 (40.6%)
Native    Auditory           150 ms  376 (31.1%)                  834 (68.9%)
Accented  Bimodal            150 ms  98 (25.3%)                   289 (74.7%)
Native    Bimodal            150 ms  17 (9.8%)                    157 (90.2%)
Accented  Auditory           4 s     1,188 (84.0%)                227 (16.0%)
Native    Auditory           4 s     318 (66.0%)                  164 (34.0%)
Note. ISI = interstimulus interval.
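The chi-square tests of independence reported for accent and matching with misperceptions can be recomputed directly from the counts in Table 6. The following is a minimal sketch rather than the authors' analysis script: it assumes a Pearson chi-square on each 2 × 2 table without continuity correction, and the function and variable names are hypothetical. For one degree of freedom the upper-tail probability has the closed form erfc(√(χ²/2)), so only the standard library is needed.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (no correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, n


# Rows: accented, native. Columns: matched, not matched (counts from Table 6).
conditions = {
    "auditory, 150-ms ISI": ((1060, 724), (376, 834)),
    "bimodal, 150-ms ISI": ((98, 289), (17, 157)),
    "auditory, 4-s ISI": ((1188, 227), (318, 164)),
}

for label, table in conditions.items():
    chi2, p_value, n = chi_square_2x2(table)
    print(f"{label}: chi2(1, N = {n}) = {chi2:.1f}, p = {p_value:.3g}")
```

Under these assumptions the three statistics round to 232.1, 17.8, and 71.1, in line with the values reported in the text up to rounding.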
Table 7. The frequency distribution (relative frequency in parentheses) of intrusions in each of the conditions in Experiments 1 and 2 across matching with misperceptions or not and similar sounding to the stimulus words or not
                                     Similar sounding                               Dissimilar sounding
Accent    Presentation mode  ISI     Matched         Not matched   Total           Matched        Not matched   Total
Accented  Auditory           150 ms  881 (49.40%)    99 (5.55%)    980 (54.9%)     179 (10.00%)   625 (35.0%)   804 (45.1%)
Native    Auditory           150 ms  366 (30.20%)    87 (7.19%)    453 (37.4%)     10 (0.83%)     747 (61.7%)   757 (62.6%)
Accented  Bimodal            150 ms  97 (25.10%)     95 (24.50%)   192 (49.6%)     1 (0.26%)      194 (50.2%)   195 (50.4%)
Native    Bimodal            150 ms  17 (9.77%)      70 (40.20%)   87 (50.0%)      0 (0.00%)      87 (50.0%)    87 (50.0%)
Accented  Auditory           4 s     1,052 (74.40%)  22 (1.55%)    1,074 (75.9%)   136 (9.61%)    205 (14.5%)   341 (24.1%)
Native    Auditory           4 s     318 (66.00%)    25 (5.19%)    343 (71.2%)     0 (0.00%)      139 (28.8%)   139 (28.8%)
Note. ISI = interstimulus interval. Matched and Not matched refer to matching with misperceptions.
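Table 7 is a finer breakdown of the same intrusions counted in Table 6, so summing its cells over the similar and dissimilar categories should recover the Table 6 counts. The sketch below illustrates that consistency check and derives the share of similar sounding intrusions per condition. The counts are copied from the two tables; the dictionary layout and function names are hypothetical and chosen only for this illustration.

```python
# Table 7 cells per condition: (matched, not matched) counts for
# similar-sounding and dissimilar-sounding intrusions.
TABLE_7 = {
    ("Accented", "Auditory", "150 ms"): {"similar": (881, 99), "dissimilar": (179, 625)},
    ("Native", "Auditory", "150 ms"): {"similar": (366, 87), "dissimilar": (10, 747)},
    ("Accented", "Bimodal", "150 ms"): {"similar": (97, 95), "dissimilar": (1, 194)},
    ("Native", "Bimodal", "150 ms"): {"similar": (17, 70), "dissimilar": (0, 87)},
    ("Accented", "Auditory", "4 s"): {"similar": (1052, 22), "dissimilar": (136, 205)},
    ("Native", "Auditory", "4 s"): {"similar": (318, 25), "dissimilar": (0, 139)},
}

# Table 6 counts per condition: (matched, not matched).
TABLE_6 = {
    ("Accented", "Auditory", "150 ms"): (1060, 724),
    ("Native", "Auditory", "150 ms"): (376, 834),
    ("Accented", "Bimodal", "150 ms"): (98, 289),
    ("Native", "Bimodal", "150 ms"): (17, 157),
    ("Accented", "Auditory", "4 s"): (1188, 227),
    ("Native", "Auditory", "4 s"): (318, 164),
}


def collapse_over_similarity(cells):
    """Sum matched and not-matched counts across the two similarity categories."""
    matched = cells["similar"][0] + cells["dissimilar"][0]
    not_matched = cells["similar"][1] + cells["dissimilar"][1]
    return matched, not_matched


def similar_share(cells):
    """Proportion of a condition's intrusions that sounded similar to the stimuli."""
    similar_total = sum(cells["similar"])
    return similar_total / (similar_total + sum(cells["dissimilar"]))


for condition, cells in TABLE_7.items():
    consistent = collapse_over_similarity(cells) == TABLE_6[condition]
    print(condition, f"{100 * similar_share(cells):.1f}% similar", "consistent:", consistent)
```

Every condition collapses to its Table 6 row, and the derived shares (for example 54.9% for accented words with auditory presentation and 150-ms ISIs) agree with the Total columns of Table 7.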
categorized into matching with misperceptions or not; the corresponding frequency distribution is displayed in Table 7. As expected, chi-square tests of independence showed significant associations between accent and similarity with stimuli in intrusions during the auditory presentations with 150-ms ISIs, χ²(1, N = 2,994) = 88.4, p < .001, and 4-s ISIs, χ²(1, N = 1,897) = 4.27, p < .038. A higher proportion of intrusions was phonologically similar to the stimuli for accented words than native words in both auditory groups regardless of ISIs. Intrusions for accented words during auditory or bimodal presentation did not differ with respect to similarity to stimuli, χ²(1, N = 2,171) = 3.62, p > .05. Compared to the bimodal presentation, a much higher proportion of similar sounding intrusions for accented words during auditory presentation matched with misperceptions, χ²(1, N = 1,172) = 180, p < .00001. This result confirms that similar sounding intrusions for accented words from the auditory presentation condition were likely misperceptions. On the other hand, intrusions from the bimodal condition were equally likely to sound similar to the stimuli even though they did not stem from misperceptions. This intrusion of similar sounding words during the bimodal condition likely occurred during memory processing rather than perceptual processing. This implies that the similar sounding words were strongly activated during recall. This result is consistent with Cho and Feldman’s (2013) finding in a memory recognition task: false alarms of phonologically similar words for both native and accented words presented bimodally.
General Discussion
This study disentangled the factors that might contribute to impaired memory for foreign-accented words – misperception/recognition failure and disruption of memory encoding. The auditory group in Experiment 1 tended to misidentify accented words and showed more intrusions in the recall of accented words than native words. With the synchronized visual display aiding recognition in the SR task, the bimodal group identified the accented words almost perfectly. The bimodal group showed fewer intrusions than the auditory group, supporting our hypothesis that increased intrusions in accented word recall were partially due to successful recall of misperceptions. However, the bimodal group still showed more intrusions for accented words than native words, implying that foreign accents exert other detrimental effects on recall in addition to inducing recognition difficulty. The bimodal condition yielded more omission and order errors at the late and middle SPs, respectively. Experiment 2 used longer ISIs and demonstrated that the extra processing costs incurred by accents on phonological encoding also account for the poorer accented word recall. Participants showed fewer intrusions and repetitions during accented word recall, especially in the middle and late SPs. During accented word recall, fewer order and omission errors occurred in middle and late SPs, respectively. Longer ISIs provided sufficient time to resolve accent-induced mismatches, so that participants could complete identification, encoding, and rehearsal of the accented words, particularly in the middle SPs, without interference from incoming stimuli. Successful recall of misperceptions accounted for a higher proportion of intrusions for accented words than for native words, especially during auditory presentation. Intrusions for accented words from the auditory and bimodal presentation modes did not differ with respect to similarity to stimuli, but a much higher proportion of these similar sounding intrusions matched with misperceptions during auditory presentation than bimodal presentation. These similar sounding intrusions for accented words in the bimodal group were not a result of misperceptions, implying that the similar sounding words were strongly activated during recall. Consistent with past research by Gill (1994) and Pickel and Staller (2012), the current results showed that foreign accents impaired memory. As the intelligibility of accented words in these previous studies was not measured, the possibility that accented word “perception” failure partially contributed to impaired recall performance could not be excluded. The current results suggest that accented words were likely to be misperceived or not recognized. Although these are actually perception errors, subsequent accurate encoding and retrieval of the misperceived words would then be incorrectly considered memory errors. Thus, it is particularly crucial for memory researchers studying foreign-accented speech to measure the intelligibility of the foreign-accented stimuli.
The current findings contrast with previous findings from Cho and Feldman (2013) that recall of foreign-accented words and less intelligible words was better. A possible explanation for this discrepancy with our findings is that participants in that study were given visual feedback on word identification and given enough time for recognizing and encoding the words into memory. This study disentangled the impacts of visual feedback and extra processing time on accented word recall. With only synchronized visual display without extra processing time, recall of accented words still showed more intrusions compared to native words in the current study. Aligned with the effortfulness hypothesis, even when recognition was successful, foreign-accented speech induced effortful perceptual processing, like noise-degraded or synthetic speech. The increased effort in perceptual processing of accented speech appeared to drain the cognitive resources deployed for phonological encoding of words and interfere with subsequent memory processes (Cousins, Dar, Wingfield, & Miller, 2014; Francis & Nusbaum, 2009; Wild et al., 2012). This effort likely results in poorer representation of accented words in memory, which might explain the mis-recall of similar sounding words that were not misperceptions in the bimodal presentation. Further research is needed to determine whether the intrusion of similar sounding words occurs during encoding, storage, or retrieval. This study also isolated the impact of accents on disrupting the encoding of items into memory. With shorter ISIs, there were more intrusions and repetitions for accented word recall, especially in middle and late SPs, and more order and omission errors in middle and late SPs. Without adequate time to process accented words, subsequent items can disrupt phonological encoding of earlier items into memory, as well as encoding of relational information between items. The observed trade-off between processing and storage in handling foreign-accented speech can be explained by the Ease of Language Understanding (ELU) model (Ronnberg et al., 2013). The ELU model emphasizes the important role of WM in online language processing and its interaction with LTM, especially for listening in adverse conditions. According to the ELU model, WM capacity is required for explicit compensatory processing, such as inference-making, semantic integration, and inhibiting irrelevant information. When lexical access is delayed by a mismatch between the speech signal and the listeners’ representation in LTM (Ronnberg et al., 2013), these explicit processing mechanisms are slower (in seconds) and are supported by the modality-general verbal WM that is limited and shared by memory operations and other higher-level cognitive functions. The ELU model can also account for additional omission and order errors in the bimodal condition compared to the auditory condition in Experiment 1.
Integrating the visual cue with the auditory input in the bimodal condition places extra demands on the shared pool of modality-general WM, thereby leaving less WM for phonological encoding and encoding relational information among items. This finding showed that synchronized visual cues might not be the best compensatory strategies for improvised accented speech. Other compensatory strategies that do not draw from the same pool of WM resources would be preferable.
We explore whether the current findings can be explained by the item-order trade-off observed in serial recall and recognition of long and short words (Hendry & Tehan, 2005). Item errors include intrusions, omissions, and repetitions. Serial recall of short words was more accurate than long words, as less time is needed for processing short words, leaving more time available for encoding their order information. But long words were recognized more accurately than short words, likely due to the additional time needed for processing long words, resulting in substantial item processing. The item-order trade-off implies that increased item processing comes at the expense of order processing (Hendry & Tehan, 2005). In the current study, accented words were more difficult to process than native words, so more item processing was needed for accented words. If there is an item-order trade-off, accented words are expected to have fewer item errors and more order errors than native words. But accented words showed more intrusions than native words, with no significant difference in order errors in either presentation mode with 150-ms ISIs. Accented word recall showed more intrusions and omissions, but fewer order errors than native word recall with 4-s ISIs. The item-order trade-off does not seem to account for the different results in the auditory and bimodal conditions either. The bimodal presentation had a differential influence across types of item errors. Based on Figure 1, the bimodal group had fewer intrusions across all SPs but more omissions only at late SPs. With the aid of visual display, the bimodal condition was expected to reduce overall item processing, leaving more resources for processing order information. Contrary to this prediction, more order errors were observed in the bimodal condition.
The current findings also have implications for echoic memory. With long ISIs, participants showed fewer order and omission errors at the final SPs. This result aligns with the findings by Nygaard et al. (1995) that variation in talker characteristics resulted in improved serial recall at 4-s ISIs. Superior recall of words at the final SPs can be attributed to their additional acoustic representation in echoic memory (Conrad & Hull, 1968).
When given sufficient time, listeners could fully encode details of the speaker’s accent in echoic memory and use it as distinctive temporal order cues for serial recall (Nygaard et al., 1995).
Findings from this study should be considered in light of some limitations. Like many previous studies on serial recall (Frankish, 2008; Roodenrys & Miller, 2008; Vitevitch, Chan, & Roodenrys, 2012), this study used the same set of stimuli across trials to facilitate comparison across experiments. This might create an interference effect that worsens recall performance (Baddeley, 1966). Further studies can use a larger set of stimuli spoken by multiple speakers to increase generalizability of the findings to other speakers, accents, and words, and to minimize the influence of perceptual adaptation to speakers and accent on performance.
In summary, the present findings suggest that foreign accents impact word recognition and serial recall by causing misperception and disrupting memory encoding. Effortful perceptual processing of accented speech can interfere with subsequent memory processes by exhausting the limited shared pool of modality-independent WM. Given the crucial role of WM in processing foreign-accented speech, further studies can examine compensatory strategies for accented speech processing that require less engagement of WM. The relation between individual differences in WM capacity and their variability in accented speech recall also needs further studies.
Electronic Supplementary Material
The electronic supplementary material is available with the online version of the article at https://doi.org/10.1027/1618-3169/a000430
ESM 1. Data (.xlsx)
The study design and data for the perceptual identification tasks and the serial recall tasks for each condition in Experiments 1 and 2.
References Baddeley, A. D. (1966). The influence of acoustic and semantic similarity on long-term memory for word sequences. The Quarterly Journal of Experimental Psychology, 18, 302–309. https://doi.org/10.1080/14640746608400047 Benjamini, Y., Krieger, A. M., & Yekutieli, D. (2006). Adaptive Linear Step-up Procedures That Control the False Discovery Rate. Biometrika, 93(3), 491–507. Bent, T., & Bradlow, A. R. (2003). The interlanguage speech intelligibility benefit. Journal of the Acoustical Society of America, 114, 1600–1610. https://doi.org/10.1121/1.1603234 Boersma, P., & Weenink, D. (2009). Praat: Doing phonetics by computer. (Version 5.1.05) [Computer program]. Retrieved from http://www.praat.org/ Caramazza, A., Yeni-Komshian, G. H., Zurif, E. B., & Carbone, E. (1973). The acquisition of a new phonological contrast: The case of stop consonants in French-English bilinguals. Journal of the Acoustical Society of America, 54, 421–428. https://doi. org/10.1121/1.1913594 Chan, K. Y., & Vitevitch, M. S. (2015). The influence of neighborhood density on the recognition of Spanish-accented words. Journal of Experimental Psychology: Human Perception and Performance, 41, 69–85. https://doi.org/10.1037/a0038347 Charlton, C., Rasbash, J., Browne, W. J., Healy, M., & Cameron, B. (2017). MLwiN (Version 3.00). Bristol, UK: Centre for Multilevel Modelling, University of Bristol. Cho, K. W., & Feldman, L. B. (2013). Production and accent affect memory. The Mental Lexicon, 8, 295–319. https://doi.org/ 10.1075/ml.8.3.02cho Clarke, C. M., & Garrett, M. F. (2004). Rapid adaptation to foreignaccented English. The Journal of the Acoustical Society of America, 116, 3647–3658. https://doi.org/10.1121/1.1815131 Conrad, R., & Hull, A. J. (1968). Input modality and the serial position curve in short-term memory. Psychonomic Science, 10, 135–136. https://doi.org/10.3758/bf03331446 Cousins, K. A., Dar, H., Wingfield, A., & Miller, P. (2014). 
Acoustic masking disrupts time-dependent mechanisms of memory encoding in word-list recall. Memory & Cognition, 42, 622–638. https://doi.org/10.3758/s13421-013-0377-7
Experimental Psychology (2019), 66(1), 40–57
Crowder, R. G., & Morton, J. (1969). Precategorical acoustic storage (PAS). Perception & Psychophysics, 5, 365–373. https://doi.org/10.3758/bf03210660 Flege, J. E., & Hillenbrand, J. (1984). Limits on phonetic accuracy in foreign language speech production. The Journal of the Acoustical Society of America, 76, 708–721. https://doi.org/ 10.1121/1.391257 Francis, A. L., & Nusbaum, H. C. (2009). Effects of intelligibility on working memory demand for speech perception. Attention, Perception, & Psychophysics, 71, 1360–1374. https://doi.org/ 10.3758/app.71.6.1360 Frankish, C. (2008). Precategorical acoustic storage and the perception of speech. Journal of Memory and Language, 58, 815–836. https://doi.org/10.1016/j.jml.2007.06.003 Gill, M. M. (1994). Accent and stereotypes: Their effect on perceptions of teachers and lecture comprehension. Journal of Applied Communication Research, 22, 348–361. https://doi. org/10.1080/00909889409365409 Glanzer, M., & Cunitz, A. R. (1966). Two storage mechanisms in free recall. Journal of Verbal Learning and Verbal Behavior, 5, 351–360. https://doi.org/10.1016/S0022-5371(66)80044-0 Goldstein, H. (2011). Multilevel statistical models (Vol. 922). West Sussex, United Kingdom: John Wiley & Sons. Hendry, L., & Tehan, G. (2005). An item/order trade-off explanation of word length and generation effects. Memory, 13, 364–371. https://doi.org/10.1080/09658210344000341 Henson, R. N. A. (1998). Item repetition in short-term memory: Ranschburg repeated. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1162–1181. https://doi. org/10.1037/0278-7393.24.5.1162 Hurlstone, M. J., Hitch, G. J., & Baddeley, A. D. (2014). Memory for serial order across domains: An overview of the literature and directions for future research. Psychological Bulletin, 140, 339–373. https://doi.org/10.1037/a0034221 Imai, S., Walley, A. C., & Flege, J. E. (2005). 
Lexical frequency and neighborhood density effects on the recognition of native and Spanish-accented words by native English and Spanish listeners. The Journal of the Acoustical Society of America, 117, 896–907. https://doi.org/10.1121/1.1823291 Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and hearing, 19, 1–36. https://doi.org/10.1097/00003446-199802000-00001 Mattys, S. L., Davis, M. H., Bradlow, A. R., & Scott, S. K. (2012). Speech recognition in adverse conditions: A review. Language and Cognitive Processes, 27, 953–978. https://doi.org/10.1080/ 01690965.2012.705006 McCormack, T., Brown, G. D. A., Vousden, J. I., & Henson, R. N. A. (2000). Children’s serial recall errors: Implications for theories of short-term memory development. Journal of Experimental Child Psychology, 76, 222–252. https://doi.org/10.1006/jecp. 1999.2550 Munro, M. J., & Derwing, T. M. (1995a). Foreign accent, comprehensibility, and intelligibility in the speech of second language learners. Language Learning, 45, 73–97. https://doi.org/ 10.1111/j.1467-1770.1995.tb00963.x Munro, M. J., & Derwing, T. M. (1995b). Processing time, accent, and comprehensibility in the perception of native and foreignaccented speech. Language and Speech, 38, 289–306. https:// doi.org/10.1177/002383099503800305 Munro, M. J., & Derwing, T. M. (1998). The effects of speaking rate on listener evaluations of native and foreign-accented speech. Language Learning, 48, 159–182. https://doi.org/10.1111/ 1467-9922.00038 Murdock, B. B. J. (1962). The serial position effect of free recall. Journal of Experimental Psychology, 64, 482–488. https://doi. org/10.1037/h0045106
K. Y. Chan et al., Foreign Accent and Serial Recall
Neisser, U. (2014). Cognitive psychology: Classic edition. New York, NY: Taylor & Francis. Nygaard, L. C., Sommers, M. S., & Pisoni, D. B. (1995). Effects of stimulus variability on perception and representation of spoken words in memory. Perception & Psychophysics, 57, 989–1001. https://doi.org/10.3758/bf03205458 Perception Research Systems. (2007). Paradigm. Retrieved from http://www.paradigmexperiments.com Pickel, K. L., & Staller, J. B. (2012). A perpetrator’s accent impairs witnesses’ memory for physical appearance. Law and Human Behavior, 36, 140–150. https://doi.org/10.1037/ h0093968 Rabbitt, P. M. A. (1968). Channel-capacity, intelligibility and immediate memory. The Quarterly Journal of Experimental Psychology, 20, 241–248. https://doi.org/10.1080/14640746808400158 Reed, M. (2000). He who hesitates: Hesitation phenomena as quality control in speech production, obstacles in non-native speech perception. Journal of Education, 182, 67–91. https:// doi.org/10.1177/002205740018200306 Reinisch, E., & Holt, L. L. (2014). Lexically guided phonetic retuning of foreign-accented speech and its generalization. Journal of Experimental Psychology: Human, Perception and Performance, 40, 539–555. https://doi.org/10.1037/ a0034409 Riazantseva, A. (2001). Second language proficiency and pausing: A study of Russian speakers of English. Studies in Second Language Acquisition, 23, 497–526. https://doi.org/10.1017/ S027226310100403X Ronnberg, J., Lunner, T., Zekveld, A., Sorqvist, P., Danielsson, H., Lyxell, B., . . . Rudner, M. (2013). The Ease of Language Understanding (ELU) model: Theoretical, empirical, and clinical advances. Frontiers in Systems Neuroscience, 7, 31. https://doi. org/10.3389/fnsys.2013.00031 Roodenrys, S., & Miller, L. M. (2008). A constrained Rasch model of trace redintegration in serial recall. Memory & Cognition, 36, 578–587. https://doi.org/10.3758/mc.36.3.578 Surprenant, A. M. (1999). The effect of noise on memory for spoken syllables. 
International Journal of Psychology, 34, 328–333. https://doi.org/10.1080/002075999399648 Surprenant, A. M. (2007). Effects of noise on identification and serial recall of nonsense syllables in older and younger adults. Aging, Neuropsychology, and Cognition, 14, 126–143. https:// doi.org/10.1080/13825580701217710 Temple, L. (2000). Second language learner speech production. Studia Linguistica, 54, 288–297. https://doi.org/10.1111/14679582.00068 Van Engen, K. J., & Peelle, J. E. (2014). Listening effort and accented speech. Frontiers in Human Neuroscience, 8, 577. https://doi.org/10.3389/fnhum.2014.00577 Van Wijngaarden, S. J. (2001). Intelligibility of native and nonnative Dutch speech. Speech Communication, 35, 103–113. https://doi.org/10.1016/S0167-6393(00)00098-4 Vitevitch, M. S., Chan, K. Y., & Roodenrys, S. (2012). Complex network structure influences processing in long-term and short-term memory. Journal of Memory and Language, 67, 30–44. https://doi.org/https://doi.org/10.1016/j.jml.2012.02. 008 Vitevitch, M. S., & Luce, P. A. (1999). Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language, 40, 374–408. https://doi. org/10.1006/jmla.1998.2618 Wild, C. J., Yusuf, A., Wilson, D. E., Peelle, J. E., Davis, M. H., & Johnsrude, I. S. (2012). Effortful listening: The processing of degraded speech depends critically on attention. The Journal of Neuroscience, 32, 14010–14021. https://doi.org/10.1523/ JNEUROSCI.1528-12.2012
© 2019 Hogrefe Publishing
Witteman, M. J., Weber, A., & McQueen, J. M. (2013). Tolerance for inconsistency in foreign-accented speech. Psychonomic Bulletin & Review, 21, 512–519. https://doi.org/10.3758/s13423-013-0519-8
History
Received January 19, 2018
Revision received September 14, 2018
Accepted September 17, 2018
Published online February 19, 2019

Acknowledgment
The authors also wish to thank several undergraduate research assistants, including Emily Wingate, Catherine Mathers, Ashley Heberling, Mariah Hawes, Erin Lee, and Allison Isrin, for their help with data collection.

Open Data
The study design and data are available in the Electronic Supplementary Material, ESM 1.

ORCID
Kit Ying Chan https://orcid.org/0000-0002-5386-9020

Kit Ying Chan
Department of Social and Behavioural Sciences
Academic 1 Y7419
City University of Hong Kong
Tat Chee Avenue
Kowloon
Hong Kong
vivien.chanky@cityu.edu.hk

Appendix B
Details for Phonological Analyses of Intrusions and Misperceptions

Misperceptions from the perceptual identification tasks and intrusions from the serial recall tasks in Experiments 1 and 2 were phonologically transcribed and compared with those of the nine stimulus words. Misspellings, transpositions of letters, and typographical errors that involved a single letter in the intrusion were cleaned up and corrected in specific conditions: (a) the omission of a letter in a word was corrected only if the response did not form another English word, and (b) the transposition or addition of a single letter in the word was corrected if the letter was within one key of the target letter on the keyboard. For each of the conditions in Experiments 1 and 2, phonological transcriptions of the intrusions, misperceptions, and the nine stimulus words were then compared to determine whether they were an exact match or phonologically similar. Two words are considered to be phonologically similar if the addition, deletion, or substitution of a phoneme in one word forms the other word (Luce & Pisoni, 1998). For example, the word cat is phonologically similar to the words _at, scat, fat, cot, and cap.
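The one-phoneme similarity rule described above can be sketched in a few lines of Python. This is an illustration only, not the authors' analysis code: the function names are hypothetical, and the phoneme symbols are ad hoc ASCII placeholders rather than the transcription system used in the study.

```python
def is_phonologically_similar(a, b):
    """Return True if adding, deleting, or substituting exactly one phoneme
    in sequence `a` yields sequence `b` (the one-phoneme neighbor rule)."""
    a, b = tuple(a), tuple(b)
    if a == b:
        return False  # identical forms are exact matches, not neighbors
    if len(a) == len(b):
        # Substitution: exactly one position differs.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Addition or deletion: removing one phoneme from the longer form
    # must give the shorter form.
    shorter, longer = sorted((a, b), key=len)
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def classify_response(response, stimuli):
    """Label a transcribed response relative to a set of stimulus transcriptions."""
    response = tuple(response)
    if any(response == tuple(s) for s in stimuli):
        return "exact match"
    if any(is_phonologically_similar(response, s) for s in stimuli):
        return "phonologically similar"
    return "other"


# Placeholder transcriptions for the example words in the text (assumed symbols).
CAT = ("k", "ae", "t")
AT = ("ae", "t")
SCAT = ("s", "k", "ae", "t")
FAT = ("f", "ae", "t")
COT = ("k", "aa", "t")
CAP = ("k", "ae", "p")
DOG = ("d", "ao", "g")
```

With these placeholder transcriptions, each of the five example words counts as similar to cat, whereas a word such as dog does not.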
Appendix A
Analysis Equation for Experiment 1

For the vector $\mathrm{Error}_{yijk}$, the error type $y$ (intrusion, omission, repetition, order) at serial position $i$ in trial $j$ by person $k$ occurs with an expected value via the Logit or Probit link function ($F$) of the grand mean intercept $\beta_{y}$, with unexplained components (or residuals) at the person, trial, and serial-position levels for the outcome variable $y$ ($g_{yk}$, $f_{yjk}$, $e_{yijk}$).

$$
\begin{aligned}
\mathrm{Error}_{yijk} = {} & \beta_{y} + e_{yijk} + f_{yjk} + g_{yk}
 + \beta_{y1}\,\mathrm{Contact\_Nonnative\_Speaker}_{yk} \\
& + \beta_{y2k}\,\mathrm{Word\_Identification\_Error\_Rate}_{yjk}
 + \beta_{y3k}\,\mathrm{Accented\_Speech}_{yjk}
 + \beta_{y4k}\,\mathrm{Bimodal}_{yjk} \\
& + \beta_{y5k}\,\mathrm{Time\_of\_trial}_{yjk}
 + \beta_{y6jk}\,\mathrm{Serial\_Position}_{yijk}
 + \beta_{yxk}\,\mathrm{Trial\_interactions}_{yjk}.
\end{aligned}
\tag{1}
$$
Research Article
Task Switching Hurts Memory Encoding
Michèle C. Muhmenthaler and Beat Meier
Institute of Psychology, University of Bern, Switzerland
Abstract: Research consistently shows that task switching slows down performance on switch compared to repeat trials, but the consequences on memory are less clear. In the present study, we investigated the impact of task switching on subsequent memory performance. Participants had to switch between two semantic classification tasks. In Experiment 1, the stimuli were univalent; in Experiment 2, the stimuli were bivalent (relevant for both tasks). The aim was to disentangle the conflicts triggered by task switching and bivalency. In both experiments, recognition memory for switch and repeat stimuli was tested subsequently. During encoding, task switching produced switch costs. Critically, subsequent memory was lower for switch compared to repeat stimuli in both experiments, and this effect was increased in Experiment 2 with bivalent material. We suggest that the requirement to switch tasks hurts the encoding of task-relevant information and thus impairs subsequent memory performance.

Keywords: cognitive control, memory, univalent stimuli, bivalent stimuli, response compatibility, memory selectivity
Experimental Psychology (2019), 66(1), 58–67
https://doi.org/10.1027/1618-3169/a000431
© 2019 Hogrefe Publishing. Distributed under the Hogrefe OpenMind License http://dx.doi.org/10.1027/a000001

With the beginning of the industrial world, finding the most efficient way to execute work procedures became a major issue. According to Taylorism, the parsing of a procedure into small parts and the repetition of those small elements by eliminating all unnecessary movements was this “one best way” (Kanigel, 2005). However, in order to specify how goal-directed behavior is implemented, in the quest to understand cognitive processing, Miller, Galanter, and Pribram (1960) suggested a “test-operate-test-exit” (TOTE) unit, which, by definition, includes task switches as an optimal way to efficient performance. While successful performance necessarily requires flexibility, research on the consequences of switching tasks for memory has only just begun. In laboratory situations, such behavior is typically explored with the task-switching paradigm (e.g., Allport, Styles, & Hsieh, 1994; Jersild, 1927; Rogers & Monsell, 1995). The main goal of the present study was to investigate how task switching affects subsequent memory performance.

Cognitive control refers to the ability to form a plan, to maintain it in the face of distraction, and to adjust behavior appropriately in case of cognitive conflict (Norman & Shallice, 1986; Posner & Snyder, 1975; Botvinick, Braver, Barch, Carter, & Cohen, 2001). Task switching is a typical example in which cognitive control is necessary. The increase in cognitive control associated with the requirement to switch between two tasks usually results in slower and less accurate performance compared to repeating the same task (e.g., Rogers & Monsell, 1995). The conflict produced by task switching is assumed to reflect the involvement of endogenous control processes that are needed to reconfigure the task set (Vandierendonck, Liefooghe, & Verbruggen, 2010; Rogers & Monsell, 1995). The requirement for cognitive control is further enhanced when the material involves bivalent stimuli, that is, stimuli that can be used to perform both tasks, rather than univalent stimuli. For example, if one task requires participants to classify animals as birds or mammals and the other task requires participants to classify objects as musical instruments or kitchen utensils, a sparrow would be a univalent stimulus because it can only be used for the animal task but not for the object task. In contrast, if one task requires participants to classify a stimulus by size (e.g., as bigger or smaller than a soccer ball) and the other task requires participants to classify a stimulus by animacy (i.e., as living or non-living), a sparrow would be a bivalent stimulus because it can be used for both the size and the animacy task. Bivalent stimuli create an additional conflict because they require participants not only to switch tasks but also to select which task to perform (Allport et al., 1994; Woodward, Meier, Tipper, & Graf, 2003). Responding to bivalent stimuli causes slower reaction times compared to responding to univalent stimuli and even leads to long-lasting slowing on subsequent performance (i.e., the “bivalency effect,” Meier, Woodward, Rey-Mermet, & Graf, 2009; Woodward et al., 2003). Both types of conflicts – task switching and bivalency – contribute to “switch costs” as
they both slow down reaction times and increase error rates (Jersild, 1927; Rogers & Monsell, 1995). However, as most task-switching experiments involve bivalent stimuli, the effects of switching and bivalency on switch costs are typically confounded. By using one experiment with univalent stimuli and one experiment with bivalent stimuli, we aimed to assess the separate impact of task switching and bivalency on subsequent memory performance in the present study. So far, only a few studies have examined the effect of task switching on memory and all of them used bivalent stimuli. Reynolds, Donaldson, Wagner, and Braver (2004) investigated encoding processes during switching and repeating a task. In the study phase, participants performed two semantic classification tasks with single words. In two blocks, they performed one of the tasks alone (single-task condition), and in one block, they switched between the two tasks (task-switching condition). In a subsequent memory test, more words from the single-task compared to the task-switching condition were recognized correctly. Thus, memory performance was lower when control demands were higher. More interestingly for the purpose of the present study, within the task-switching blocks, memory performance for repeat stimuli was better than for switch stimuli, suggesting not only a block-specific but also a trial-specific effect. Together, the higher cognitive demands associated with task switching reduced memory performance. Richter and Yeung (2012) also investigated the effect of task switching on memory. They used compound stimuli consisting of pictures and words and participants had to switch between classifying them. Thus, each trial consisted of task-relevant (target) and task-irrelevant (distractor) information. The results showed that task switching compared to task repetition impaired memory performance for targets, but improved memory performance for distractors. 
The authors explained the latter with interference from previously active task sets (i.e., task-set inertia; Allport et al., 1994). Due to residual attention to the competing, now-irrelevant task, encoding of the distractor would be facilitated in switch trials (Yeung, Nystrom, Aronson, & Cohen, 2006). In contrast, attention toward task-relevant information was unimpeded in repeat trials, resulting in better encoding for targets in repeat compared to switch trials. In a follow-up study, Richter and Yeung (2015) replicated these results. Chiu and Egner (2016) focused on task-irrelevant stimulus features by investigating two distractor categories. In one group, participants switched between two classification tasks; the distractors were relevant in one task and irrelevant in the competing task. In the other group, the distractors (objects in the background) were never task relevant. The results showed better memory for distractors which
were task relevant in one of the two tasks on switch compared to repeat trials, indicating that task-set inertia enhanced distractor encoding (Yeung et al., 2006). In the other condition with the truly irrelevant distractors, the results showed that memory for distractors was lower in switch than in repeat trials, indicating that the higher cognitive demands associated with task switching reduced encoding of completely irrelevant information (Jenkins, Lavie, & Driver, 2005). Together, these findings suggest that task switching affects incidental memory performance. The interference associated with task switching results in less focused attention toward task-relevant information, leading to lower memory performance (Richter & Yeung, 2012, 2015). However, as all the previous studies have used bivalent stimuli, task switching and stimulus bivalency were confounded. In order to address the pure impact of task switching, we used univalent stimuli in Experiment 1 of the present study. Moreover, all the previous studies have used a task-cueing procedure in which a cue signals which task is to be performed such that switch and repeat trials appear in a random order (e.g., Shaffer, 1965). Task cueing requires the active maintenance of both task sets and may thus present additional attentional monitoring demands (Braver, Reynolds, & Donaldson, 2003). In contrast, in the present study, we used the alternating run paradigm in which switch and repeat trials appear in a predictable order (e.g., AABB) in order to reduce these demands (cf. Rogers & Monsell, 1995).
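To make the alternating-runs logic concrete, the following minimal Python sketch generates a predictable AABB task sequence and labels each trial as a switch or a repeat. It is an illustration under stated assumptions, not the software used in the study; the function names and the task labels "A" and "B" are hypothetical.

```python
def alternating_runs(n_trials, tasks=("A", "B"), run_length=2):
    """Build a predictable task sequence such as A, A, B, B, A, A, ..."""
    return [tasks[(i // run_length) % len(tasks)] for i in range(n_trials)]


def label_trials(sequence):
    """Label each trial relative to its predecessor.

    The first trial has no predecessor, so it is labeled "first"
    rather than being counted as a switch or a repeat.
    """
    labels = ["first"]
    for previous, current in zip(sequence, sequence[1:]):
        labels.append("repeat" if current == previous else "switch")
    return labels


sequence = alternating_runs(8)
labels = label_trials(sequence)
for task, label in zip(sequence, labels):
    print(task, label)
```

With a run length of two, every second trial after the first is a switch trial, so switch and repeat trials occur equally often and the upcoming task is always predictable without a cue.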
The Present Study

We present two task-switching experiments, one with univalent and one with bivalent stimuli. In the study phase of both experiments, participants had to switch between two semantic classification tasks. Then, a surprise memory test took place. We hypothesized that memory performance for switch trials would be lower than for repeat trials in both experiments (i.e., with univalent and bivalent stimuli) due to the higher control demands for task switching compared to task repetition. The enhanced cognitive demands impair target encoding by affecting stimulus-processing priorities (Lavie, Hirst, De Fockert, & Viding, 2004). In Experiment 2, we expected more interference in switch trials due to the additional requirement to counteract the between-task interference associated with bivalent stimuli (Allport & Wylie, 1999; Rey-Mermet & Meier, 2012), which has been shown to impair the encoding of task-relevant information (cf. Richter & Yeung, 2012, 2015). In both experiments, we used the remember/know procedure to assess the contribution of recollection and familiarity to recognition memory performance (Tulving, 1985; Yonelinas, 2002). As switching tasks requires attention and
dividing attention reduces recollection (Yonelinas, 2002; Gardiner & Parkin, 1990), we expected that the difference between switch and repeat stimuli would be mainly expressed in remember responses.
Experiment 1

The aim of Experiment 1 was to test whether the conflict triggered by task switching affects subsequent recognition memory performance. Participants performed two different tasks (animal and object classification) in a regular AABB order. For half of the participants, the stimuli were presented as words, and for the other half, they were presented as pictures. Importantly, all the stimuli were univalent.
Method

Participants
The participants were 80 volunteers (43 male and 37 female) from the general population, recruited by word of mouth. All of them were German speaking and aged 18 to 35 years (M = 24.70, SD = 4.51). The study was approved by the local ethical committee of the University of Bern; all participants gave written consent.

Material
For the condition with pictures, the material consisted of 160 photographs of easy-to-name stimuli. The pictures were collected from a web search. Half were animals (mammals or birds), and the other half were objects (musical instruments or kitchen utensils). The size of the photographs was approximately 300 × 300 pixels. For the condition with words, 160 words were used. They were typical exemplars of the same four categories and consisted of 3–10 letters. The words were displayed in black letters against a white background in Courier New font (see Footnote 1). For each material type, the stimuli were divided into two lists of 80 items, each containing an equal number of stimuli from the four categories. One of the lists was used in the study phase, and both lists were presented in the test phase. The stimuli were counterbalanced across participants, so that each stimulus occurred equally often in the repeat and switch condition.

Procedure
One half of the participants were tested with words and the other half with pictures; they were randomly assigned to each condition and were tested individually in a computer
laboratory. In the study phase, they were instructed to categorize the stimuli as quickly and correctly as possible. For animals, participants had to classify them as mammal or bird, and for objects, they had to classify them as musical instrument or kitchen utensil. The stimuli were presented in random order in the middle of the screen, each task twice in succession (see Figure 1). After a practice phase with 10 trials, participants performed the study phase with 80 trials. They responded on a standard computer keyboard using their index fingers. They had to press the a-key when the stimulus was either a mammal or a musical instrument and the l-key when the stimulus was either a bird or a kitchen utensil. The stimuli were presented until a response key was pressed, and then the next stimulus was presented after 200 ms of blank screen.

Following the study phase, participants had to complete a demanding reading span task (Daneman & Carpenter, 1980). The main purpose of this task was to create a filled retention interval between study and test phase. Participants had to read a series of two to six sentences. For each sentence, they had to indicate whether it was meaningful and they had to recall the last word of the sentence. Reading span was defined as the size of the largest set in which all words were correctly recalled in at least three of the five consecutive trials.

The third part of the experiment involved an incidental recognition memory test and an additional remember/know judgment (cf. Meier, Rey-Mermet, Rothen, & Graf, 2013). Participants had to indicate whether they had already seen a stimulus during the task-switching phase by pressing the j-key for “old” stimuli or the n-key for “new” stimuli. In case of an “old” response, they were required to give an additional remember/know judgment by pressing the 1-key for “remember” or the 2-key for “know” on the number pad.
For each trial, the stimulus was presented in the middle of the screen until a response key was pressed. The stimuli appeared in randomized order with an interval of 200 ms. One half of the stimuli were old (presented in the study phase) and the other half new (unseen). The entire experiment lasted about 25 min. All raw data for Experiment 1 are listed in the Electronic Supplementary Material, ESM 1.

Figure 1. Predictable AABB study trial sequence of Experiment 1.

Footnote 1. Materials used to conduct the research (including analysis code) will be made available to other researchers for purposes of replicating the procedure or reproducing the results by email to the corresponding author.

Analysis
For the study phase, mean reaction times and accuracy in the task-switching phase were analyzed separately using an analysis of variance (ANOVA) with the within-subject factor trial type (repeat vs. switch) and the between-subject factor material (words vs. pictures). For the test phase, the hits and the false alarms for each participant were computed. As it was not possible to assign the false alarm rates to the repeat or switch condition, we used hit rates only as recognition scores (cf. Ortiz-Tudela, Milliken, Botta, LaPointe, & Lupiañez, 2016). Memory performance was also analyzed with the within-subject factor trial type (repeat vs. switch) and the between-subject factor material (words vs. pictures). In addition, remember and know responses were analyzed separately. Reading span score was correlated with accuracy, reaction times, and the hit rate. We excluded one participant with an error rate > 30% in the study phase. An α level of .05 was used. Effect sizes are expressed as ηp2 values.
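The hit-rate scoring described in the Analysis section can be sketched as follows. This is a hypothetical illustration, not the authors' analysis code: the record layout (the keys `trial_type` and `response`) and the toy data are assumptions made for the example. Only old items enter the computation, because new items cannot be assigned to the repeat or switch condition.

```python
from collections import defaultdict


def hit_rates(old_item_trials):
    """Compute hit, remember, and know proportions per study trial type.

    Each record describes the test response to one OLD item:
      trial_type: "repeat" or "switch" (status of the item at study)
      response:   "remember", "know", or "new" (a "new" response is a miss)
    """
    counts = defaultdict(lambda: {"n": 0, "remember": 0, "know": 0})
    for trial in old_item_trials:
        cell = counts[trial["trial_type"]]
        cell["n"] += 1
        if trial["response"] in ("remember", "know"):
            cell[trial["response"]] += 1
    return {
        trial_type: {
            "hit": (cell["remember"] + cell["know"]) / cell["n"],
            "remember": cell["remember"] / cell["n"],
            "know": cell["know"] / cell["n"],
        }
        for trial_type, cell in counts.items()
    }


# Toy data for one imaginary participant (not taken from the study).
toy_trials = [
    {"trial_type": "repeat", "response": "remember"},
    {"trial_type": "repeat", "response": "remember"},
    {"trial_type": "repeat", "response": "know"},
    {"trial_type": "repeat", "response": "new"},
    {"trial_type": "switch", "response": "remember"},
    {"trial_type": "switch", "response": "know"},
    {"trial_type": "switch", "response": "new"},
    {"trial_type": "switch", "response": "new"},
]
rates = hit_rates(toy_trials)
```

Because the remember and know proportions share the same denominator, they sum to the overall hit rate for each trial type, which is how the stacked bars in a remember/know figure are usually built.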
Results

Study Phase
As expected, participants were faster to respond to repeat (M = 1,110 ms, SE = 29) than to switch trials (M = 1,234 ms, SE = 37), F(1, 77) = 41.5, p < .001, ηp2 = .35. Overall, participants were faster to respond to pictures (M = 970 ms, SE = 46) than to words (M = 1,375 ms, SE = 45), F(1, 77) = 39.6, p < .001, ηp2 = .34, but the interaction was not significant, F(1, 77) = 0.37, p = .543, ηp2 = .05. The same ANOVA on the accuracy data revealed that performance was lower on switch (M = 0.93, SE = 0.01) than on repeat trials (M = 0.95, SE = 0.01), F(1, 77) = 10.1, p = .002, ηp2 = .12. Accuracy was lower for words (M = 0.92, SE = 0.01) than for pictures (M = 0.96, SE = 0.01), F(1, 77) = 16.3, p < .001, ηp2 = .18, but the interaction was not significant, F(1, 77) = 3.02, p = .086, ηp2 = .04, indicating that switch costs were not different for words and pictures. Together, our results showed typical switch costs.

Test Phase
Overall, the proportion of hits was M = 0.71, SE = 0.14, and the proportion of false alarms was M = 0.23, SE = 0.13. The ANOVA with the factors trial type and materials revealed that memory was significantly better for repeat (M = 0.72, SE = 0.13) than for switch trials (M = 0.70, SE = 0.17), F(1, 77) = 6.8, p = .011, ηp2 = .08. Words and pictures did not differ, F(1, 77) = 1.74, p = .19, ηp2 = .02, and the interaction was not significant, F(1, 77) = 1.37, p = .245, ηp2 = .02. The critical analysis is depicted in Figure 2. To assess the contribution of remember and know judgments to memory performance, additional ANOVAs with the same design were conducted. Significantly more remember responses were associated with repeat (M = 0.53, SE = 0.02) than with switch trials (M = 0.49, SE = 0.02), F(1, 77) = 12.75, p = .001, ηp2 = .14; know responses did not vary with trial type, F(1, 77) = 1.50, p = .225, ηp2 = .02. No other effect was significant, F < 2.06, p > .155. Thus, the difference in memory performance between switch and repeat trials was driven by recollection rather than familiarity.

Figure 2. Memory performance in Experiment 1. Mean proportion of hits as a function of task switching with univalent stimuli. The shaded areas reflect remember responses; the solid areas represent know responses. Error bars represent standard errors.

Follow-up Analysis
In order to explore the relationship between the task switching and memory results and working memory capacity, we analyzed the reading span task. The average reading span was 2.72 (SD = .95). This score was not significantly correlated with hits (r = .04), accuracy (r = .11), or reaction times (r = .16). Therefore, working memory capacity did not seem to be related to task or memory performance.

Discussion
The goal of Experiment 1 was to examine whether the conflict produced by task switching affects subsequent memory performance. We used univalent stimuli to test the pure effect of task switching, unconfounded by stimulus bivalency. In the study phase, we found the expected switch costs; thus, the enhanced demands of task switching were associated with an increased encoding time. More importantly, in the test phase, recognition memory was better for repeat than for switch trials, indicating that the conflict triggered by task switching affected subsequent memory performance. Thus, task switching hurts memory encoding for task-relevant information even for univalent stimuli. As expected, this effect was mainly expressed in remember responses. In Experiment 2, we investigated how the conflict triggered by bivalency further affects memory performance. Toward this goal, we designed an experiment similar to Experiment 1, but we used bivalent material.
Experiment 2

In Experiment 2, we used pictures as stimuli and participants had to classify them as smaller or bigger than a soccer ball or as living or non-living. As all the stimuli could be used for both tasks, they were bivalent. Moreover, as we used the same set of response keys for both tasks, a third kind of conflict occurred on some trials, that is, response incompatibility. If a stimulus required the same key for both tasks, for example, the a-key to classify a picture of an elephant as bigger than a soccer ball in the size task and as living in the animacy task, the response mapping was compatible. In contrast, when the stimulus required different response keys for each of the tasks, for example, the a-key to classify a house as bigger than a soccer ball and the l-key to classify it as non-living, the response mapping was incompatible. For incompatible response mappings, the inappropriate response has to be suppressed and this usually slows down performance (Gade & Koch, 2007; Kornblum, Hasbroucq, & Osman, 1990). We expected lower memory performance for incompatible and switch stimuli due to the presence of conflict. Moreover, we expected a stronger effect for bivalent compared to univalent materials because of the between-task conflict with bivalent materials (Allport et al., 1994; Meier et al., 2009).
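The compatibility coding can be illustrated with a short Python sketch. It is not the authors' code: the dictionary and function names are hypothetical, and the key assignment follows the examples in the text (a-key for "bigger" and "living"), with the l-key assumed for the opposite categories.

```python
# Key assignment for the two tasks (a-key: bigger / living; l-key: smaller / non-living).
SIZE_KEYS = {"bigger": "a", "smaller": "l"}
ANIMACY_KEYS = {"living": "a", "non-living": "l"}


def response_compatibility(size, animacy):
    """Classify a bivalent stimulus by whether both tasks call for the same key."""
    same_key = SIZE_KEYS[size] == ANIMACY_KEYS[animacy]
    return "compatible" if same_key else "incompatible"


# The two examples from the text.
elephant = response_compatibility("bigger", "living")
house = response_compatibility("bigger", "non-living")
print(elephant, house)
```

Under this mapping, a stimulus is compatible exactly when it is bigger and living or smaller and non-living, so compatibility is a fixed property of each stimulus rather than of the task performed on a given trial.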
Method
Participants and Design
The participants were 40 undergraduate students (4 male and 36 female) from the University of Bern, and all of them were German speaking. The age ranged from 19 to 33 years (M = 21.79, SD = 2.75), and they participated in the study for course credits. The study was approved by the local ethics committee of the University of Bern, and all participants gave written consent.
© 2019 Hogrefe Publishing Distributed under the Hogrefe OpenMind License http://dx.doi.org/10.1027/a000001
M. C. Muhmenthaler & B. Meier, Task Switching Hurts Memory Encoding
Material
A total of 128 colored photographs were used which were collected from a web search (see Footnote 1). They could be classified both as smaller or bigger than a soccer ball and as living or non-living. The stimuli were arranged in separate lists of 64 pictures, counterbalanced across category and trial type, such that each stimulus occurred equally often in the repeat and switch condition and in each task. One of the lists was used in the study phase, and both lists were presented in the test phase. Lists were counterbalanced across participants.
Procedure
The procedure was identical to that of Experiment 1 with the following exceptions. Participants were instructed to perform the size task when the stimulus appeared in the upper part of the screen and to perform the animacy task when it appeared in the lower part. The stimuli were presented clockwise, beginning in the upper half on the left, which led to a predictable AABB sequence of the two tasks as depicted in Figure 3. Participants had to press the a-key when an object was bigger than a soccer ball or living and the l-key when the object was smaller than a soccer ball or non-living. After a brief practice phase with 8 trials, participants performed the study phase with 64 trials. After the reading span task, which was identical to that of Experiment 1,
the recognition memory test was administered with 128 stimuli, half of them old and the other half new. The entire experiment lasted about 25 min. All raw data for Experiment 2 are listed in ESM 2.
Analysis
For the study phase, task-switching performance was analyzed using a 2 (Trial Type: switch vs. repeat) × 2 (Response Type: compatible vs. incompatible) ANOVA for both reaction times and accuracy. For the test phase, the proportion of hits and the false alarms were analyzed. As it was not possible to assign the false alarm rates to repeat or switch trials, we used hit rates only as recognition scores (cf. Ortiz-Tudela et al., 2016). Memory performance and the remember/know judgments were analyzed using the same two factors trial type and response type. One participant was excluded because reaction time performance was more than 3 SD slower than all other participants. An α level of .05 was used. Effect sizes are expressed as ηp² values.
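The recognition score described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (trial type, response type, recognized) and the function name are hypothetical, and the example responses are made up rather than taken from the raw data. New test items were never shown in the study phase and therefore carry no trial type, which is why only hits enter the design cells.

```python
from collections import defaultdict


def hit_rates(old_items):
    """Compute the proportion of hits per (trial type, response type) cell.

    old_items: iterable of (trial_type, response_type, recognized) tuples,
    one per studied ("old") item shown again in the recognition test.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [hits, number of old items]
    for trial_type, response_type, recognized in old_items:
        cell = (trial_type, response_type)
        counts[cell][0] += int(recognized)
        counts[cell][1] += 1
    return {cell: hits / total for cell, (hits, total) in counts.items()}


# Made-up responses of one hypothetical participant (two old items per cell).
example = [
    ("repeat", "compatible", True),
    ("repeat", "compatible", True),
    ("repeat", "incompatible", True),
    ("repeat", "incompatible", False),
    ("switch", "compatible", True),
    ("switch", "compatible", False),
    ("switch", "incompatible", False),
    ("switch", "incompatible", True),
]
rates = hit_rates(example)
```

The resulting per-participant cell means would then serve as the dependent variable in the 2 × 2 repeated-measures ANOVA; the ANOVA itself is not shown here.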
Results Study Phase Reaction time analysis revealed that the participants responded significantly faster on repeat (M = 1,098 ms,
Figure 3. Predictable AABB study trial sequence in Experiment 2.
SE = 41) than on switch trials (M = 1,536 ms, SE = 64), F(1, 38) = 118.72, p < .001, ηp² = .76. Response type, F(1, 38) = .30, p = .59, ηp² = .01, and the interaction between trial type and response type, F(1, 38) < .01, p = .99, ηp² < .01, were not significant. Accuracy analysis revealed that participants were more accurate on repeat (M = 0.95, SE = 0.01) than on switch trials (M = 0.92, SE = 0.01), F(1, 38) = 10.15, p = .003, ηp² = .21. Response type, F(1, 38) = 1.96, p = .170, ηp² = .05, and the interaction between response type and trial type, F(1, 38) < 1, p = .922, ηp² < .01, were not significant.
Test Phase
The proportion of hits was M = 0.76, SE = 0.16 and the proportion of false alarms was M = 0.08, SE = 0.07. Only hit rates were analyzed further for each conflict type, and the results are presented in Figure 4. The ANOVA revealed that repeat stimuli were better recognized (M = 0.80, SE = 0.02) than switch stimuli (M = 0.73, SE = 0.02) as indicated by a main effect of trial type, F(1, 38) = 18.23, p < .001, ηp² = .32. Neither the main effect of response type, F(1, 38) = .01, p = .92, ηp² < .01, nor the interaction between trial type and response type, F(1, 38) = .56, p = .46, ηp² < .01, was significant. To assess the contribution of recollection and familiarity to memory performance, additional ANOVAs with the same design were conducted. Significantly more remember responses were associated with repeat (M = 0.63, SE = 0.03) than with switch trials (M = 0.56, SE = 0.03), F(1, 38) = 11.7, p < .01, ηp² = .24. In contrast, know responses did not vary with trial type, F(1, 38) = 0.11, p = .744, ηp² < .01. No other effect was significant, F < 2.88, p > .098. Thus, as in Experiment 1, the difference between switch and repeat trials was due to recollection rather than familiarity.
Follow-Up Analysis
The average reading span was 2.59 (SD = 0.68). This score was not significantly correlated with the scores of hits (r = .02), accuracy (r = .17), or reaction times (r = .12). Again, working memory capacity did not seem to be related to task or memory performance.
Discussion
Experiment 2 replicated and extended the results of Experiment 1. As in Experiment 1, in the study phase, responses were slower and less accurate for switch than for repeat trials. Moreover, the switch costs in Experiment 2 were much larger than in Experiment 1. Crucially, we again found better memory for repeat than for switch trials, as in Experiment 1. In fact, this effect was much stronger with bivalent stimuli (i.e., ηp² = .32) than with univalent stimuli (i.e., ηp² = .08). As partial eta squared is a reliable measure to compare the effect size of a manipulation across studies (Cohen, 1973; cf. Pedhazhur, 1977), this comparison indicates that the memory effect is four times larger with bivalent stimuli than with univalent stimuli. This suggests that with bivalent switch stimuli, encoding of task-relevant information was additionally impaired. In contrast, the conflict produced by response type affected neither task nor memory performance, suggesting that this conflict was too weak to affect performance. As in Experiment 1, the difference between repeat and switch stimuli was mainly expressed in remember responses and the contribution was stronger with bivalent stimuli (i.e., ηp² = .24) than with univalent stimuli (i.e., ηp² = .14). This corroborates that switching tasks requires attention, and this requirement is enhanced with bivalent stimuli.
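As a worked check of the effect sizes compared above, partial eta squared can be recovered from a reported F value and its degrees of freedom via the standard identity F·df_effect / (F·df_effect + df_error). The sketch below applies it to the Experiment 2 values reported in the Results; the function and variable names are hypothetical, and the Experiment 1 value of .08 is taken from the text as given rather than recomputed.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Experiment 2: main effect of trial type on hit rates, F(1, 38) = 18.23
memory_effect_exp2 = partial_eta_squared(18.23, 1, 38)

# Experiment 2: main effect of trial type on remember responses, F(1, 38) = 11.7
remember_effect_exp2 = partial_eta_squared(11.7, 1, 38)

# Ratio of the reported (rounded) memory effect sizes, bivalent vs. univalent
ratio = 0.32 / 0.08
```

Rounded to two decimals, the two computed values agree with the reported .32 and .24, and the ratio of the rounded memory effect sizes is about 4, which is the basis of the "four times larger" statement.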
General Discussion
Figure 4. Memory performance in Experiment 2. Mean proportion of hits as a function of task switching with bivalent stimuli. The shaded areas reflect remember; the solid areas represent know responses. Error bars represent standard errors.
The aim of the study was to investigate the impact of task switching on subsequent memory performance. In two experiments, we combined a task-switching procedure with an incidental recognition memory test. The stimuli were either univalent (Experiment 1) or bivalent (Experiment 2); switch and bivalent stimuli were considered as conflict stimuli. Another conflict was induced by incompatible stimulus-response mappings. The conflict produced by task switching impaired memory performance in both experiments, as memory was lower for switch than for repeat stimuli. As there is no between-task conflict with univalent materials (Mayr & Keele, 2010; Wylie & Allport, 2000), the requirement to reconfigure the task set in switch trials may have produced
this effect in Experiment 1 (Rogers & Monsell, 1995). In Experiment 2, bivalency further impaired memory performance for switch trials, reflected in a larger switch effect than in Experiment 1. The results are in line with the studies by Reynolds et al. (2004) and Richter and Yeung (2012, 2015). They also found lower memory performance with task-relevant switch stimuli. As all the previous studies used bivalent stimuli, our study is the first that provides evidence that even univalent task switching hurts memory encoding for target events. We suggest that task switching produced interference which resulted in less focused attention toward the target events (Lavie et al., 2004), rather than diminishing a general encoding capacity. In other words, the selectivity of memory encoding was reduced under high cognitive control demands (Richter & Yeung, 2012, 2015). The results of the remember/know procedure revealed that in both experiments, fewer “remember” responses were given for switch than for repeat trials. In contrast, “know” responses did not vary according to the encoding condition. Moreover, the effect of “remember” responses regarding the difference between switch and repeat trials was stronger in Experiment 2 than in Experiment 1. This corresponds with the idea that attention was more focused in repeat than in switch trials, rendering participants more certain about their decisions. Recollection is found to be sensitive to attention manipulations (Yonelinas, 2002). For example, in an experiment by Gardiner and Parkin (1990), participants learned word lists in a full and a divided attention condition. The following word recognition test showed that divided attention reduced the “remember” responses while the “know” responses did not differ. 
The same pattern was found in our results: Stimuli from repeat trials, in which attention was unimpeded, led to more remember responses than stimuli from switch trials, in which attention had to be shared between target processing and task switching. This effect was more pronounced with bivalent materials, as selecting the appropriate task required more attention due to overlapping stimulus features (Allport et al., 1994; Woodward et al., 2003). In summary, both task switching and bivalency impair memory. Interestingly, this does not generalize to all kinds of conflict. Studies on the effects of Stroop conflict on subsequent memory performance found improved memory performance for Stroop compared to non-conflicting stimuli. For example, in a study by Krebs, Boehler, De Belder, and Egner (2015), faces were presented in a study phase either with congruent information (the word man over a male face) or with incongruent information (the word woman over a male face). The subsequent face recognition test showed that irrelevant incongruent information improved subsequent memory for faces, that is, a conflict-induced memory benefit. Similar results were reported by
Rosner, D’Angelo, MacLellan, and Milliken (2015). Their participants had to read one word of a word pair. Half of the items were congruent (the words had the same identity), and the other half were incongruent (the words had different identities). The results of the subsequent recognition test showed better memory for incongruent than for congruent stimuli. Crucially, in these studies the conflict arose from the coactivation of two incompatible responses (Egner, Delano, & Hirsch, 2007), for example, the picture of a woman with the superimposed word “man” (cf. Krebs et al., 2015). In Stroop conflict, the focus of attention is strategically directed at the target in order to avoid errors (Botvinick et al., 2001; Verguts & Notebaert, 2009). As a consequence, encoding mechanisms are up-regulated, leading to better memory performance for targets. In contrast, in the present study, the conflict arose from selecting the relevant task set in a task-switching environment. When participants have to switch tasks, the focus of attention toward the target is reduced because attention is required for selecting the appropriate task. Therefore, memory performance is reduced in switch trials. In the case of bivalent stimuli, even more attention is required for selecting the relevant task due to the overlapping stimulus features and thus memory performance is further affected.
Conclusion
Finding the most efficient way to execute work procedures is a major concern of humankind. To be efficient, most approaches, for example, the TOTE unit (Miller et al., 1960), favor fast and flexible shifts. While goal-directed performance can be improved by switching tasks, our results suggest that this may be unprofitable for memory: The experiments presented here provide evidence that task switching impairs memory performance for task-relevant materials. Moreover, our study is the first that provides evidence that even task switching with univalent stimuli affects memory encoding.
Electronic Supplementary Materials
The electronic supplementary material is available with the online version of the article at https://doi.org/10.1027/1618-3169/a000431
ESM 1. Data (.sav) Raw data of Experiment 1.
ESM 2. Data (.sav) Raw data of Experiment 2.
References
Allport, A., Styles, E. A., & Hsieh, S. (1994). Shifting intentional set: Exploring the dynamic control of tasks. In C. Umilta & M. Moscovitch (Eds.), Conscious and nonconscious information processing: Attention and performance XV (pp. 421–452). Cambridge, MA: MIT Press.
Allport, A., & Wylie, G. (1999). Task switching: Positive and negative priming of task-set. In G. W. Humphreys, J. Duncan, & A. M. Treisman (Eds.), Attention, space and action: Studies in cognitive neuroscience (pp. 273–296). Oxford, UK: Oxford University Press.
Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychological Review, 108, 624–652. https://doi.org/10.1037/0033-295X.108.3.624
Braver, T. S., Reynolds, J. R., & Donaldson, D. I. (2003). Neural mechanisms of transient and sustained cognitive control during task switching. Neuron, 39, 713–726. https://doi.org/10.1016/S0896-6273(03)00466-5
Chiu, Y. C., & Egner, T. (2016). Distractor-relevance determines whether task-switching enhances or impairs distractor memory. Journal of Experimental Psychology: Human Perception and Performance, 42, 1–5. https://doi.org/10.1037/xhp0000181
Cohen, J. (1973). Eta-squared and partial eta-squared in fixed factor ANOVA designs. Educational and Psychological Measurement, 33, 107–112.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450–466.
Egner, T., Delano, M., & Hirsch, J. (2007). Separate conflict-specific cognitive control mechanisms in the human brain. NeuroImage, 35, 940–948.
Gade, M., & Koch, I. (2007). Cue-task associations in task switching. The Quarterly Journal of Experimental Psychology, 60, 762–769. https://doi.org/10.1080/17470210701268005
Gardiner, J. M., & Parkin, A. J. (1990). Attention and recollective experience in recognition memory. Memory & Cognition, 18, 579–583.
Jenkins, R., Lavie, N., & Driver, J. (2005). Recognition memory for distractor faces depends on attentional load at exposure. Psychonomic Bulletin & Review, 12, 314–320. https://doi.org/10.3758/bf03196378
Jersild, A. T. (1927). Mental set and shift. Archives of Psychology, 89, 1–81.
Kanigel, R. (2005). The one best way: Frederick Winslow Taylor and the enigma of efficiency, Vol. 1, (1st ed.). Cambridge, MA: The MIT Press.
Kornblum, S., Hasbroucq, T., & Osman, A. (1990). Dimensional overlap: Cognitive basis for stimulus–response compatibility – A model and taxonomy. Psychological Review, 97, 253–270.
Krebs, R. M., Boehler, C. N., De Belder, M., & Egner, T. (2015). Neural conflict–control mechanisms improve memory for target stimuli. Cerebral Cortex, 25(3), 833–843. https://doi.org/10.1093/cercor/bht283
Lavie, N., Hirst, A., De Fockert, J. W., & Viding, E. (2004). Load theory of selective attention and cognitive control. Journal of Experimental Psychology: General, 133, 339–354. https://doi.org/10.1037/0096-3445.133.3.339
Mayr, U., & Keele, S. W. (2010). Changing internal constraints on action: The role of backward inhibition. Journal of Experimental Psychology: General, 129, 4–26.
Meier, B., Rey-Mermet, A., Rothen, N., & Graf, P. (2013). Recognition memory across the lifespan: The impact of word frequency and study-test interval on estimates of familiarity and recollection. Frontiers in Psychology, 4, 1–15. https://doi.org/10.3389/fpsyg.2013.00787
Meier, B., Woodward, T. S., Rey-Mermet, A., & Graf, P. (2009). The bivalency effect in task switching: General and enduring. Canadian Journal of Experimental Psychology – Revue canadienne de psychologie expérimentale, 63, 201–210.
Miller, G. A., Galanter, E., & Pribam, K. (1960). Plans and the structure of behavior. New York, NY: Holt, Rinehart & Winston.
Norman, D. A., & Shallice, T. (1986). Attention to action: Willed and automatic control of behaviour. In R. J. Davidson, R. Schwartz, & D. Shapiro (Eds.), Consciousness and self-regulation (Vol. 4, pp. 1–18). New York, NY: Plenum Press.
Ortiz-Tudela, J., Milliken, B., Botta, F., LaPointe, M., & Lupiañez, J. (2016). A cow on the prairie vs. a cow on the street: Long-term consequences of semantic conflict on episodic encoding. Psychological Research, 81, 1264–1275. https://doi.org/10.1007/s00426-016-0805-y
Pedhazhur, E. J. (1977). Multiple regression in behavioral research. Fort Worth, TX: Harcourt Brace.
Posner, M. I., & Snyder, C. R. R. (1975). Attention and cognitive control. In R. L. Solso (Ed.), Information processing and cognition: The Loyola Symposium (pp. 55–85). Hillsdale, NJ: Erlbaum.
Rey-Mermet, A., & Meier, B. (2012). The bivalency effect: Evidence for flexible adjustment of cognitive control. Journal of Experimental Psychology: Human Perception and Performance, 38, 213–221. https://doi.org/10.1037/a0026024
Reynolds, J. R., Donaldson, D. I., Wagner, A. D., & Braver, T. S. (2004). Item- and task-level processes in the left inferior prefrontal cortex: Positive and negative correlates of encoding. NeuroImage, 21, 1472–1483. https://doi.org/10.1016/j.neuroimage.2003.10.033
Richter, F. R., & Yeung, N. (2012). Memory and cognitive control in task switching. Psychological Science, 23, 1256–1263. https://doi.org/10.1177/0956797612444613
Richter, F. R., & Yeung, N. (2015). Corresponding influences of top-down control on task switching and long-term memory. The Quarterly Journal of Experimental Psychology, 68, 1124–1147. https://doi.org/10.1080/17470218.2014.976579
Rogers, R. D., & Monsell, S. (1995). Costs of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124, 207–231.
Rosner, T. M., D’Angelo, M. C., MacLellan, E., & Milliken, B. (2015). Selective attention and recognition: Effects of congruency on episodic learning. Psychological Research, 79, 411–424. https://doi.org/10.1007/s00426-014-0572-6
Shaffer, L. H. (1965). Choice reaction with variable S–R mapping. Journal of Experimental Psychology, 70, 284–288.
Tulving, E. (1985). Memory and consciousness. Canadian Psychology, 26, 1–12.
Vandierendonck, A., Liefooghe, B., & Verbruggen, F. (2010). Task switching: Interplay of reconfiguration and interference control. Psychological Bulletin, 136, 601–626. https://doi.org/10.1037/a0019791
Verguts, T., & Notebaert, W. (2009). Adaptation by binding: A learning account of cognitive control. Trends in Cognitive Sciences, 13, 252–257. https://doi.org/10.1016/j.tics.2009.02.007
Woodward, T. S., Meier, B., Tipper, C., & Graf, P. (2003). Bivalency is costly: Bivalent stimuli elicit cautious responding. Experimental Psychology, 50, 233–238.
Wylie, G., & Allport, A. (2000). Task switching and the measurement of “switch costs”. Psychological Research, 63, 212–233.
Yeung, N., Nystrom, L. E., Aronson, J. A., & Cohen, J. D. (2006). Between-task competition and cognitive control in task switching. Journal of Neuroscience, 26, 1429–1438. https://doi.org/10.1523/JNEUROSCI.3109-05.2006
Yonelinas, A. P. (2002). The nature of recollection and familiarity: A review of 30 years of research. Journal of Memory and Language, 46(3), 441–517. https://doi.org/10.1006/jmla.2002.2864
History
Received October 15, 2017
Revision received August 31, 2018
Accepted September 26, 2018
Published online February 19, 2019
Acknowledgments
We thank Simone Aellen, Janira Perrotta, Teodora Popa, Ellen Surdel, and Anja Zahnd for running Experiment 1 and Stefan Walter for programming support.
Open Data Raw data are available in the Electronic Supplementary Materials, ESM 1 and 2. ORCID Beat Meier https://orcid.org/0000-0003-3303-6854 Beat Meier Institute of Psychology University of Bern Fabrikstr. 8 3012 Bern Switzerland beat.meier@psy.unibe.ch
Short Research Article
On the Lack of Real Consequences in Consumer Choice Research And Its Consequences Sina A. Klein and Benjamin E. Hilbig Cognitive Psychology Lab, Department of Psychology, University of Koblenz-Landau, Germany
Abstract: Experimental tasks measure actual behavior when the consequences that follow actions and choices mirror those of real-life behavior. Consequently, choice tasks in consumer research would need to include both costs (losing a previously earned endowment) and gains (actually receiving what was chosen) to structurally resemble real-life consumer choices. A literature review of studies (k = 446) in consumer research confirms that full implementation of consequences is rare. The extent to which presence versus absence of these consequences systematically affects observable behavior is tested in an experiment (N = 669) comparing a fully consequential (cost and gain consequences), a partially consequential (gain consequence only), and a hypothetical (no consequences) consumer choice task. Results show that consequences, once real, affect both the general willingness to purchase and the relative preferences for different products. Hence, it would seem advisable to more carefully consider the role of consequences in future consumer research. Keywords: consumer choice, research practices, literature review, food choice
Many subfields of psychology ultimately aim to explain and predict behavior. That is, they intend to draw conclusions about what people might actually do in “real life” (and why they would do so) from different kinds of observations such as participants’ responses on a self-report questionnaire or responses in some laboratory task. As has been repeatedly argued (Baumeister, Vohs, & Funder, 2007; Funder, 2009a, 2009b; Furr, 2009), many of the observations psychologists predominantly rely on are more or less strongly removed from the to-be-explained behavior. For example, several groups of authors (Baumeister et al., 2007; Furr, 2009; Meredith, Dicks, Noel, & Wagstaff, 2017; Patterson, 2008; Patterson, Giles, & Teske, 2011) argue – and demonstrate in literature reviews – that vast portions of recent psychological research rely on observations that cannot be considered “actual behavior.” Thus, Baumeister et al. (2007) provocatively state that much of psychology has become “the science of self-reports and finger movements” (p. 396). Importantly, the core argument is not that questionnaire responses or button presses are, per se, poor examples of behavior. Anyone who ever filled out an immigration or tax form (a questionnaire) or clicked a Website’s “buy” Experimental Psychology (2019), 66(1), 68–76 https://doi.org/10.1027/1618-3169/a000420
button for a hugely expensive product will indubitably agree that these actions – while essentially being self-reports and finger movements – entail a lot of behavior. What then sets apart these examples from the omnipresent self-report personality questionnaires, hypothetical scenarios, or reaction time tasks that Baumeister et al. (2007) and others have convincingly argued do not represent observations of actual behavior? We argue that the core distinguishing aspect is whether the consequences a research participant faces (conditional on her and potentially others’ actions) match or at least approximate the consequences faced by agents in the corresponding real-life situations in a structurally comparable way. If the tasks given to participants “carry some form of consequence (e.g., social, financial, effort, time, self-efficacy)”, these will typically be “substantially more informative of real [. . .] behavior” (Morales, Amir, & Lee, 2017). Correspondingly, Lewandowski and Strohmetz (2009) have argued that consequences for the self or others are one defining element of behavioral choice: “Rather than ask participants to self-report what they believe they would choose, behavioral choice focuses on what participants actually select as the dependent variable” (p. 998). Similarly, Diederich (2003a, 2003b) argues that real consequences ought to be implemented to induce choice conflict in multi-attribute decision tasks. Indeed, this very principle – that well-specified consequences help transform researchers’ observations from some artificial task into truly behavioral observations – dates
S. A. Klein & B. E. Hilbig, Consequences in Consumer Choice
back to the very origins of modern empirical psychology (one might say that Behaviorism was marked by an almost obsessive focus on consequences) and is now one of the cornerstone practices of experimental economics (Camerer & Hogarth, 1999; Hertwig & Ortmann, 2001). In early discussions about economics becoming an experimental science, Smith (1982) specified certain conditions or “precepts” that make an economic experiment valid although it does not fully represent the natural setting. In line with this, Plott (1991) argued that economic experiments need not mirror real-life settings exactly, only the features essential to test a theory. As a consequence, experimental economists predominantly study fully specified games with well-defined structures of consequences (most commonly monetary gains or losses) that mirror the essential features of corresponding real-life situations. For example, the essence of real-life behaviors such as giving to charity is mirrored in the structure of consequences built into the Dictator Game (e.g., Forsythe, Horowitz, Savin, & Sefton, 1994) in which an individual in the role of the dictator is free to allocate an actual valued resource between herself and a passive recipient. In this simple economic game, the consequences mirror those of the real-life behavior targeted: If a dictator does not give anything, she will face the consequence of having/keeping the entire endowment to herself; to the extent that she allocates part of the resource to another, she bears the consequence of having/keeping less of the valued endowment. Thus, whereas the specific actions performed by the participant (e.g., typing a number into a box on the computer screen) do not necessarily match those involved in the real-life situation (e.g., placing money in a collecting tin of some charity on the street), the consequences attached to her actions do.
Hence, participants in economic games “make real decisions with potent consequences” (Murnighan & Wang, 2016, p. 81) which is exactly why these games have long been considered to yield relatively objective observations of actual behavior (e.g., Pruitt & Kimmel, 1977). Actual decision behavior occurs daily, and one prominent and frequent example is consumer choice. In light of the above arguments, one may well ask whether and to what extent the field of consumer choice research actually studies consumer behavior or, in Lewandowski and Strohmetz’s (2009) terminology, behavioral choice. In other words, do the paradigms most heavily relied on in consumer choice research involve the types of consequences that define real-life consumer behavior? Arguably, real-life consumer choice behavior most commonly involves choosing from a selection of goods/products/services and spending a valuable resource (usually money) to then receive what was
chosen. More specifically, actual consumer choices typically involve some endowment (money) which was previously earned in some form and which must be invested so as to purchase a product. Thus, as in the above example of clicking a Website’s “buy” button, real-life consumer choices bear two relevant sets of consequences: First, these choices are almost always costly in that deciding to buy a product is accompanied by a reduction or loss of one’s endowment. Second, they involve positive consequences or gains in that one actually receives the good/product/service and may consume or in some other way profit from it. Correspondingly, one can broadly classify the consumer choice scenarios studied in consumer choice research into three different categories – depending on which of the consequences as defined above are actually present. First, a fully consequential consumer choice situation is characterized by both consequences, namely costs (e.g., losing something that was previously earned) and gains, meaning actually receiving the good/product/service. Second, a partially consequential consumer choice situation involves only one of the consequences, most typically¹ actually receiving the product but without bearing actual costs; in other words, whereas the positive consequence of consuming the chosen product is real, the negative consequence of losing (part of) an endowment that was obtained through effort is absent. Strictly speaking, choice situations in which a participant is simply given an endowment as a windfall without investing time or effort are also classified as only partially consequential – given that whatever is potentially spent was not previously earned. Third, hypothetical consumer choice situations involve neither of the consequences, that is, there are no costs associated with the choices made, but neither does one receive the chosen product.
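The three categories can be summarized as a simple decision rule. The sketch below is one possible reading of the classification, not the coding scheme actually used by the raters; the function name and its three boolean parameters are hypothetical. It treats a cost as real only if the participant pays from an endowment that was previously earned, which follows the remark on windfall endowments above.

```python
def classify_choice_task(pays_for_choice: bool,
                         endowment_was_earned: bool,
                         receives_product: bool) -> str:
    """Classify a consumer choice task by the consequences it implements.

    Assumption: spending a windfall endowment does not count as a real cost,
    because whatever is spent was not previously earned.
    """
    real_cost = pays_for_choice and endowment_was_earned
    real_gain = receives_product
    if real_cost and real_gain:
        return "fully consequential"
    if real_cost or real_gain:
        return "partially consequential"
    return "hypothetical"


# Earned money is spent and the product is received
earned_and_received = classify_choice_task(True, True, True)
# A windfall endowment is spent and the product is received
windfall_and_received = classify_choice_task(True, False, True)
# The product is received without any payment
free_product = classify_choice_task(False, False, True)
# Neither payment nor product
stated_choice = classify_choice_task(False, False, False)
```

The cost-only case (real cost, no product) also falls into the partially consequential category under this rule, although, as the footnote in the text points out, no study implementing it appears to exist.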
Literature Review
To gain firmer insight into common research practices, we conducted a literature review including all studies on consumer choice published between February 2012 and April 2013 in the Journal of Consumer Research. Raters classified all studies within each published paper into four categories: (1) fully consequential choice, for example, buying a product with money that had to be earned in a previous task, (2) partially consequential choice, for example, selecting one out of several products and actually receiving/consuming it but without having to pay for it, (3) hypothetical consumer choice (including willingness to purchase), and (4) other non-choice tasks such as product evaluations, for example, on some scales with product features. Out of all 446 studies that were coded, 281 (63.0%)
¹ In principle, one could also think of a scenario in which only the costs are real, whereas the gains are not; however, we are not aware of any study actually implementing such a situation, arguably because participants would be unwilling to engage in such a study given its incentive structure.
used measures falling at least into one of these categories and were hence included in the analyses below.² The detailed coding sheet is available online (https://osf.io/z5tn6/). Out of all studies, about two-thirds used choice tasks (63.5%) and about one-third used product evaluations (36.5%). Within the studies that used choice tasks, more than two-thirds (69.1%) of studies used hypothetical choices and only a fourth (24.9%) used partially consequential choices. No more than 6.0% of studies used fully consequential choices involving both cost and gain consequences. In conclusion, real consequences are not common in the study of consumer choice and fully consequential choice studies remain a rare exception. Whereas the above evidence clearly indicates that (full) consequences are rarely implemented in consumer choice research, previous research suggests that the absence of both cost and gain consequences might indeed pose a problem. First, with regard to the cost consequence, Moser, Raffaelli, and Notaro (2013) point out that it is important to let participants use their own money instead of an endowment given by the experimenter as money received as a windfall might increase participants’ willingness to purchase. This idea is based on findings showing that, under some circumstances, people are more attached to their own, previously earned money than to money that was gained before the decision (e.g., Carlsson, He, & Martinsson, 2013; Cherry, Frykblom, & Shogren, 2002; Smith, 2010; Thaler & Johnson, 1990) which would also be predicted by effort justification, a mechanism to reduce cognitive dissonance (Festinger, 1957). However, while one group of participants in Moser et al.’s (2013) study actually had to use their own money, the only comparison made was to a hypothetical choice task.
Therefore, evidence on consumer choices differing in whether money had to be previously earned or not (and thus, whether the choice is actually costly or not) is still missing (Moser et al., 2013). Second, with regard to the gain consequence, several meta-analyses (e.g., Foster & Burrows, 2017; Harrison & Rutström, 2008; Murphy, Allen, Stevens, & Weatherhead, 2005) comparing hypothetical and actual choice behavior to investigate hypothetical bias showed that the use of hypothetical versus actual preference or choice variables does indeed yield different results. Specifically, hypothetical measures lead to an overstatement of preferences and values for products and goods (e.g., in terms of how much money one would pay for a product) in comparison with
S. A. Klein & B. E. Hilbig, Consequences in Consumer Choice
actual measures (Murphy et al., 2005). Hypothetical bias typically refers to a choice between a certain product or good and money. Taking this one step further, we investigate whether there is also some form of hypothetical bias in a choice between different products. Taken together, previous research indicates that the presence versus absence of consequences in choice tasks may alter the observable behavior systematically. This implies that tasks that do not model the real-life behavior one is intending to draw conclusions on will often have to be treated with caution (Morales, Amir, & Lee, 2017). However, to the best of our knowledge, studies have not directly compared the effects of cost and gain consequences and their interplay on these systematic behavioral changes. Therefore, it remains unclear to what degree cost and gain consequences contribute to the systematic change of behavior and hence which specific aspects of choice tasks alter which specific aspect of behavior. Ultimately, it remains difficult to judge whether certain consumer choice tasks will provide reliable estimates of people’s willingness to purchase and their relative preferences for specific options. To close this gap, we conducted an experiment that directly compares a fully consequential, a partially consequential, and a hypothetical choice task in the same setting.
Experiment

Specifically, two comparisons between these different choice tasks can reveal whether and, if so, to which degree the two types of consequences alter choice behavior. First, the fully and partially consequential choice tasks only differ with respect to the presence of the cost consequence, namely the fact that one has to invest effort and earn the money before being able to spend it. Thus, a comparison between these two conditions will provide insight into whether the presence of costs alters the willingness to spend the money and purchase the product and hence choice behavior. Second, the two consequential tasks differ from the hypothetical task only with respect to the presence of the gain consequence, namely actually receiving the product after making a choice. Thus, a comparison between the hypothetical condition and each of the two consequential conditions will test the presence of a hypothetical bias and hence provide insight into whether the possibility to actually consume the product alters the probability of selecting certain options.
² Studies/articles that could not be assigned to any of the four categories were, for example, theoretical articles or studies with dependent variables such as creativity, perceived time, or mood. In addition to the analysis reported in the main text, we used a more liberal criterion for the inclusion of such studies. For example, we excluded evaluations of similarity of products from the conservative categorization but included them as product evaluations in the more liberal categorization. Studies from two retracted papers were also included in the more liberal categorization. In total, 314 studies were included in this second analysis. Results did not differ by more than 3% from the conservative analysis.
Methods

Design and Procedure

An experiment with a one-factorial between-subjects design was conducted. Participants were assigned to one of three conditions in which they were faced with a fully consequential, a partially consequential, or a hypothetical consumer choice. The real-life consumer choice modeled in this study resembled a grocery shopping situation which involved what may be termed a “considerate” (i.e., organic and fair trade) and a “non-considerate” product option (see Reese & Kohlmann, 2015, for a similar choice task). Participants decided whether to invest a monetary endowment to buy one of two chocolate options or whether to keep the money. The two chocolate options were one organic, fair trade chocolate bar of 100 g, or two nonorganic, non-fair trade chocolate bars of 100 g each. The specific chocolate bars were highly similar on all attributes except for organic and fair trade production. The monetary equivalent of the chocolate options was roughly the same. Compared to keeping the money, the chocolate options involved a 10–25% discount so as to make choosing either chocolate option more attractive.

In the fully consequential condition, participants first had to earn the monetary endowment for the choice task. Specifically, they briefly worked on unrelated, mildly effortful cognitive tasks for 3 min, such as disentangling alphabetic strings, identifying words within a list of non-words, or solving simple rule-of-three calculations (see Figure A1 in the Appendix for an example). If they solved at least four out of nine tasks (83.1% of the participants assigned to this condition were successful), they earned the 3€ endowment. Then, they were asked to choose whether they wanted to keep the money or spend it to buy chocolate, that is, either of the two chocolate options.
In the partially consequential condition, participants were given 3€ without having to work for it and – exactly parallel to participants in the fully consequential condition – then chose whether they wanted to keep the endowment or spend it to buy one of the two chocolate options. In both the fully and the partially consequential conditions, participants actually received their chosen option, whether it was their monetary endowment or whichever chocolate option they had chosen. In the hypothetical condition, participants indicated whether they would keep their 3€ endowment or spend it to buy one of the two chocolate options; they were asked to decide as if the choice was fully consequential, but fully aware in
advance that neither consequence would actually be materialized. The study was conducted in different places on campus, for example, next to the cafeteria or the main entrance. Passing individuals were asked whether they wanted to participate in the study. After providing informed consent, participants were seated in separate booths, ensuring that their choice was confidential. In the fully consequential condition, participants first worked on the separate cognitive tasks. If they passed, the remaining procedure was the same as for participants in the other conditions: The experimenter placed a questionnaire, the three chocolate bars, and 3€ in cash in front of the participant. The questionnaire consisted of demographics, the consumer choice task, and the following screening checks (the complete questionnaires and the cognitive tasks for the fully consequential condition are available at https://osf.io/z5tn6/): Participants indicated whether they had food intolerance to chocolate and whether they consumed chocolate in general. If they indicated food intolerance or indicated that they did not eat chocolate in general, they were excluded from the sample (see below). After completing the questionnaire, participants were given their chosen chocolate or money (only in the two consequential conditions), debriefed and dismissed.
Analytical Strategy, Power Analysis, and Sample

Since the task structure of the present experiment is conditional in nature (choosing either chocolate option is conditional on deciding to spend the endowment), the appropriate analysis takes the dependence of observations into account. A straightforward approach is to model observed choices with a two-stage multinomial processing tree (MPT) model³ (Batchelder & Riefer, 1999; Erdfelder et al., 2009) with separate probabilities for (i) keeping versus spending the monetary endowment, and, conditionally on the latter, (ii) making a considerate or a non-considerate product choice. Thus, the MPT model as depicted in Figure 1 comprises two parameters: First, parameter k denotes the probability to keep the endowment and thus distinguishes between the willingness to keep the money (probability k) and spending the money to buy a product (probability 1 − k). Second, and conditionally on the willingness to buy a product, parameter c describes the probability of a considerate product choice and distinguishes between
³ Alternatively, ordinary chi-square tests could be used; however, they do not directly account for the conditional structure, which is why the MPT model is to be preferred. Nonetheless, we additionally conducted all analyses using standard chi-square tests. An overall test, a test without the monetary option, and a test without the monetary option and both consequential conditions analyzed together all revealed significant differences between conditions (p < .05) and thus corroborate the analyses in the MPT framework. Results of these analyses are available at https://osf.io/z5tn6/.
Figure 1. Graphical representation of the baseline multinomial processing tree model (for one condition).
choosing one organic and fair trade chocolate bar (probability c) and choosing two normal chocolate bars (probability 1 − c). Models of this type have been used in other contexts in which the choice structure is a conditional one (e.g., Klein, Hilbig, & Heck, 2017). The complete MPT model across all three conditions correspondingly consists of three trees as depicted in Figure 1, one for each condition. Each tree, and hence each condition, has distinct parameters ($k_h$ and $c_h$, $k_{pc}$ and $c_{pc}$, $k_{fc}$ and $c_{fc}$, with subscripts denoting conditions, that is, h for hypothetical, pc for partially consequential, and fc for fully consequential). The full model equations are available at https://osf.io/z5tn6/.

To estimate parameters and test whether there are differences in choice behavior across conditions (i.e., likelihood ratio tests), we used the software multiTree (Moshagen, 2010). To compare conditions, we first implemented the parameter restriction $k_h = k_{pc} = k_{fc}$ to test for overall differences in parameter k (and hence the choice between keeping the 3€ endowment vs. “buying” either chocolate) and the parameter restriction $c_h = c_{pc} = c_{fc}$ to test for overall differences in parameter c (and hence the choice between considerate and non-considerate chocolate consumption). Second, whenever an overall test indicated significant differences between conditions, we conducted pairwise comparisons between all specific parameters ($k_h = k_{pc}$, $k_h = k_{fc}$, $k_{pc} = k_{fc}$ and $c_h = c_{pc}$, $c_h = c_{fc}$, $c_{pc} = c_{fc}$).

In order to set a lower-bound sample size, we computed an approximate⁴ a priori power analysis in the MPT framework using multiTree (Moshagen, 2010). Specifically, we aimed to detect a difference in c-parameters of $c_h = .60$, $c_{pc} = .50$, and $c_{fc} = .40$ (conservatively assuming that the k-parameter is .40 in each condition), which corresponds to a small effect of Cohen’s ω = .13, with a power of 1 − β = .80 (with α = .05). The required overall sample size was N = 599, which we thus set as our lower bound.
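The two-stage structure described above can be written out explicitly. The following is a sketch reconstructed from the verbal description and Figure 1, not a copy of the model file on OSF; the condition index $j$ and the generic parameter vector $\theta$ are notation introduced here for illustration.

```latex
% Category probabilities for condition j in {h, pc, fc}
\begin{align}
P_j(\text{money}) &= k_j \\
P_j(\text{considerate}) &= (1 - k_j)\, c_j \\
P_j(\text{non-considerate}) &= (1 - k_j)\,(1 - c_j)
\end{align}

% Likelihood ratio test of a parameter restriction, e.g. k_h = k_{pc} = k_{fc}
\begin{equation}
\Delta G^2 = 2\,\bigl[\ln L(\hat{\theta}_{\text{unrestricted}}) - \ln L(\hat{\theta}_{\text{restricted}})\bigr]
\end{equation}
```

The three category probabilities sum to 1 within each condition. The degrees of freedom of $\Delta G^2$ equal the number of independent restrictions, which gives df = 2 for the overall tests and df = 1 for each pairwise comparison.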
In total, 850 participants completed the study. We excluded all participants who indicated a food intolerance against chocolate (n = 117), indicated they did not consume chocolate in general (n = 31), misunderstood the instructions and clearly indicated this toward the experimenter (n = 29), and/or had missing values in either one of the screening variables (n = 16) or the choice task (n = 3). Additionally, we excluded participants who noted on the questionnaire that they were vegan (n = 3). The final sample consisted of N = 669 participants aged between 18 and 54 years (M = 22.35, SD = 3.97). Slightly more than half of participants were female (n = 387), and most were students (n = 641).
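The lower bound of N = 599 against which this final sample is compared can be roughly retraced by hand. The sketch below is not the multiTree routine. It assumes equal group sizes, takes the noncentrality parameter to be N times the G²-type discrepancy between the assumed parameter values and the restricted model (equal c across conditions), and evaluates the noncentral chi-square distribution with df = 2 as a Poisson mixture. All function names are hypothetical.

```python
import math


def category_probs(k, c):
    """Category probabilities of one tree: money, considerate, non-considerate."""
    return [k, (1 - k) * c, (1 - k) * (1 - c)]


def g2_discrepancy(k_values, c_values):
    """Per-observation G^2 discrepancy between the assumed parameters and the
    restriction that c is equal in all conditions (equal group sizes assumed)."""
    weight = 1 / len(k_values)
    # Best-fitting common c under the restriction: spending-weighted mean of c.
    spend = [1 - k for k in k_values]
    c0 = sum(s * c for s, c in zip(spend, c_values)) / sum(spend)
    total = 0.0
    for k, c in zip(k_values, c_values):
        for p1, p0 in zip(category_probs(k, c), category_probs(k, c0)):
            if p1 > 0:
                total += weight * p1 * math.log(p1 / p0)
    return 2 * total


def power_df2(noncentrality, alpha=0.05):
    """Power of a chi-square test with df = 2, using the Poisson-mixture
    representation of the noncentral chi-square distribution."""
    half_crit = -math.log(alpha)       # half the critical value for df = 2
    half_lam = noncentrality / 2
    power, partial, term = 0.0, 0.0, 1.0
    pois = math.exp(-half_lam)         # Poisson weight for j = 0
    for j in range(200):
        partial += term                # sum over i <= j of half_crit**i / i!
        power += pois * alpha * partial  # alpha equals exp(-half_crit)
        term *= half_crit / (j + 1)
        pois *= half_lam / (j + 1)
    return power


def required_n(k_values, c_values, target=0.80):
    """Smallest total N reaching the target power, plus the implied omega."""
    w2 = g2_discrepancy(k_values, c_values)
    n = 1
    while power_df2(n * w2) < target:
        n += 1
    return n, math.sqrt(w2)
```

Called with the values assumed in the power analysis, `required_n([.40, .40, .40], [.60, .50, .40])` should give an effect size near ω = .13 and a required sample size close to the reported lower bound. Small deviations from the multiTree result are possible because the exact settings used there are not stated in the text.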
Results

Across conditions, choices were relatively evenly distributed across options, with keeping one’s monetary endowment turning out to be the most frequently chosen option (38%), followed by organic, fair trade chocolate (33%) and normal chocolate (29%). The raw data and choice proportions per condition are reported in Table A1 in the Appendix. Figure 2 summarizes parameter estimates from the MPT model.⁵ As can be seen, there were noteworthy differences between conditions: Once the choice was fully consequential, 55% of participants decided to keep their money, whereas only about 30% of participants made the same decision when the choice was either hypothetical or only partially consequential (i.e., the money was given as a windfall and not previously earned through effort). Correspondingly, a test for overall differences between k-parameters turned out significant (ΔG²(df = 2) = 38.85, p < .001, Cohen’s ω = .24). To follow up, we conducted
⁴ Within the MPT framework, power analysis requires the full specification of model parameters and thus assumptions about the parameter values in the population. Correspondingly, without strong a priori knowledge on the to-be-expected parameter values, such a power analysis necessarily remains approximate.
⁵ Note that, by necessity, parameter estimates are directly implied by the choice proportions (and vice versa). For example, the observed proportion of choices for the considerate option is, by definition, (1 − k) · c.
Figure 2. Parameter estimates for parameter k (keeping money vs. spending money) and parameter c (considerate vs. non-considerate product choice) across conditions. Error bars represent one standard error of the parameter estimate.
pairwise comparisons to investigate which specific differences drive the overall test result. The parameters $k_h$ and $k_{pc}$ did not differ significantly (ΔG²(df = 1) = 0.13, p = .718, Cohen’s ω = .01), indicating that the probability of keeping one’s money was comparable in the hypothetical and partially consequential conditions. By contrast, both $k_h$ and $k_{fc}$ (ΔG²(df = 1) = 30.87, p < .001, Cohen’s ω = .21) and $k_{pc}$ and $k_{fc}$ (ΔG²(df = 1) = 26.74, p < .001, Cohen’s ω = .20) differed significantly. Thus, the fully consequential condition differed significantly from the other two in that more people chose to keep their monetary endowment as compared to the hypothetical and partially consequential conditions.

As can also be seen in Figure 2, once choosing to buy chocolate (i.e., conditional on 1 − k), 46% of participants made a considerate choice in the hypothetical condition, whereas roughly 60% of participants made a considerate choice in each of the consequential conditions. An overall test confirmed that parameter c differed across conditions (ΔG²(df = 2) = 6.55, p = .038, Cohen’s ω = .10). More specifically, parameters $c_{pc}$ and $c_{fc}$ did not differ significantly (ΔG²(df = 1) = 0.02, p = .897, Cohen’s ω = .01), indicating that considerate choices (for the organic, fair trade chocolate) were comparably likely in the two consequential conditions. By contrast, parameters $c_h$ and $c_{pc}$ (ΔG²(df = 1) = 5.48, p = .019, Cohen’s ω = .09) and $c_h$ and $c_{fc}$ differed significantly (ΔG²(df = 1) = 3.78, p = .052, Cohen’s ω = .08), although the latter comparison only borders the conventional level of significance. In summary, in the hypothetical condition,
participants were significantly less likely to select the organic, fair trade chocolate than those in either of the consequential conditions (which, in turn, did not differ).
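Because the model is saturated within each condition, the parameter estimates and likelihood ratio tests reported above can be retraced from the frequencies in Table A1 without specialized software. The following is a minimal sketch under that assumption, not the multiTree analysis itself. The function names are hypothetical, and ω is computed here as the square root of ΔG²/N with N = 669, which is an assumption that happens to match the reported effect sizes.

```python
import math

# Choice frequencies from Table A1: (money, considerate, non-considerate)
COUNTS = {
    "fc": (122, 58, 42),  # fully consequential
    "pc": (68, 90, 63),   # partially consequential
    "h": (66, 73, 87),    # hypothetical
}
N_TOTAL = sum(sum(row) for row in COUNTS.values())


def estimates(cond):
    """Closed-form estimates of k (keep money) and c (considerate given spending)."""
    money, cons, non = COUNTS[cond]
    k = money / (money + cons + non)
    c = cons / (cons + non)
    return k, c


def binom_loglik(successes, failures, p):
    """Binomial log-likelihood up to a constant that cancels in Delta G^2."""
    return successes * math.log(p) + failures * math.log(1 - p)


def delta_g2(pairs):
    """Likelihood ratio statistic for the restriction that one probability is
    equal across groups; pairs is a list of (successes, failures)."""
    pooled = sum(s for s, _ in pairs) / sum(s + f for s, f in pairs)
    free = sum(binom_loglik(s, f, s / (s + f)) for s, f in pairs)
    restricted = sum(binom_loglik(s, f, pooled) for s, f in pairs)
    return 2 * (free - restricted)


def k_pairs(conds):
    """(kept money, spent money) per condition."""
    return [(COUNTS[c][0], COUNTS[c][1] + COUNTS[c][2]) for c in conds]


def c_pairs(conds):
    """(considerate, non-considerate) per condition, among spenders only."""
    return [(COUNTS[c][1], COUNTS[c][2]) for c in conds]


def p_value(g2, df):
    """Upper tail of the chi-square distribution (closed forms for df 1 and 2)."""
    if df == 1:
        return math.erfc(math.sqrt(g2 / 2))
    if df == 2:
        return math.exp(-g2 / 2)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def omega(g2):
    """Effect size, assumed here to be sqrt(Delta G^2 / N)."""
    return math.sqrt(g2 / N_TOTAL)
```

For example, `delta_g2(k_pairs(["h", "pc", "fc"]))` should land close to the reported overall test for k, and `delta_g2(c_pairs(["h", "pc", "fc"]))` close to the one for c. Pairwise comparisons follow by passing two condition labels instead of three.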
Discussion

All too often, conclusions about behavior are drawn from research tasks and paradigms that do not (or only incompletely) match the to-be-modeled real-life situations in terms of a correspondence in the structure of possible consequences (Morales et al., 2017). The upshot that research often does not provide observations of “actual behavior” (Baumeister et al., 2007) appears to hold for consumer choice research as evidenced by our literature review: The vast majority of choice tasks used in recent consumer choice research were hypothetical or only partially consequential in nature, whereas fully consequential tasks – involving both costs and gains and thus mirroring real-life consumer choice situations – are exceptionally rare. At the same time, several meta-analyses imply that hypothetical and actual choice tasks lead to differences in choice behavior (Foster & Burrows, 2017; Harrison & Rutström, 2008; Murphy et al., 2005). However, until now, there was insufficient evidence on how and to which degree consequences in terms of both costs (losing at least part of an endowment that was previously earned) and gains (actually receiving the chosen option) drive such differences.
Hence, the experiment reported on herein compared a fully consequential (cost and gain consequences), a partially consequential (gain consequence only), and a hypothetical choice task (no consequences) in the same setting. Results revealed that costs – that is, having to work for the endowment – and indeed only costs reduced the willingness to spend the endowment. This is in line with previous research regarding earned endowments versus endowments received as a windfall (Carlsson et al., 2013; Cherry et al., 2002; Festinger, 1957; Smith, 2010; Thaler & Johnson, 1990) and provides direct evidence for corresponding assumptions about consumer choice behavior (Moser et al., 2013). Furthermore, and arguably more problematically, the possibility to actually consume the products – that is, the reality of gain consequences – increased the probability of selecting a considerate (fair trade, organic) product over a non-considerate one. In other words, participants’ preferences given the exact same options depended on whether the corresponding consequences were going to be materialized. Taken together and in the extreme, the absence of both (cost and gain) consequences led to twice as many participants choosing a non-considerate product (39%) in comparison with when both consequences were present (19%). This supports the findings of several meta-analyses showing the existence of hypothetical bias in choice tasks (Foster & Burrows, 2017; Harrison & Rutström, 2008; Murphy et al., 2005) and extends these findings to choices between different products.

In summary, our experiment demonstrates that both costs and gains are important consequences that can alter choice behavior. Whereas cost consequences affect the general willingness to purchase and thus whether a product is bought or not, gain consequences actually affect specific preferences, that is, which options are chosen.
In turn, these results provide some guidelines for the implementation of consequences in consumer choice tasks: First, as the presence of cost consequences affects whether a product is purchased or not, it also determines whether the choice between different products is actually relevant or not. If the willingness to purchase a product approaches zero, it is neither relevant which out of different products is chosen nor how this choice can be influenced (e.g., through some experimental manipulation). Therefore, before implementing only gain consequences in experiments, it is important to demonstrate that one might reasonably expect at least some willingness to purchase (at cost). Second, whenever the preferences for certain options – and how these may be influenced – are of interest, it would seem highly advisable to implement the corresponding gain consequences or, at a minimum, demonstrate that these preferences are independent of whether or not the consequences are hypothetical versus real.
Of course, it must also be acknowledged that there are practical limitations to the consequences that can realistically be implemented in consumer choice research. In particular, it will simply be unrealistic in studies specifically focusing on expensive or otherwise impractical goods/products/services (we can neither actually give a car or holiday trip to participants nor have them pay for it). However, we would argue that research is certainly not exclusively concerned with goods/products/services of this nature; rather, the focus is most commonly on underlying mechanisms and principles of consumer behavior. The latter can be studied using those types of goods/products/services that actually lend themselves to implementing fully consequential choice tasks. Thus, rather than calling for universal or even compulsory implementation of cost and gain consequences, we merely emphasize that the issue of consequences should not simply be ignored.

To conclude, the prevalent use of hypothetical choices in consumer choice research can and will likely lead to systematically inaccurate predictions for both the willingness to buy a product and the relative proportion of choices among different products. Although using fully consequential choice tasks which include both cost and gain consequences might be more complicated and costly to the researcher than hypothetical tasks, the more accurate estimation of choice behavior should outweigh these additional expenses. In the long run, a shift toward more commonly implementing both cost and gain consequences will help foster exchange with researchers from other, related fields for whom “actual behavior” is the imperative criterion (most notably, behavioral economics) and help counteract reducing psychology to “the science of self-reports and finger movements” (Baumeister et al., 2007).
References

Batchelder, W. H., & Riefer, D. M. (1999). Theoretical and empirical review of multinomial process tree modeling. Psychonomic Bulletin & Review, 6, 57–86. https://doi.org/10.3758/BF03210812
Baumeister, R. F., Vohs, K. D., & Funder, D. C. (2007). Psychology as the science of self-reports and finger movements: Whatever happened to actual behavior? Perspectives on Psychological Science, 2, 396–403. https://doi.org/10.1111/j.1745-6916.2007.00051.x
Camerer, C. F., & Hogarth, R. M. (1999). The effects of financial incentives in experiments: A review and capital-labor-production framework. Journal of Risk and Uncertainty, 19, 7–42. https://doi.org/10.1023/A:1007850605129
Carlsson, F., He, H., & Martinsson, P. (2013). Easy come, easy go. The role of windfall money in lab and field experiments. Experimental Economics, 16(2), 190–207. https://doi.org/10.1007/s10683-012-9326-8
Cherry, T. L., Frykblom, P., & Shogren, J. F. (2002). Hardnose the dictator. The American Economic Review, 92, 1218–1221. https://doi.org/10.1257/00028280260344740
Diederich, A. (2003a). Decision making under conflict: Decision time as a measure of conflict strength. Psychonomic Bulletin & Review, 10, 167–176. https://doi.org/10.3758/BF03196481
Diederich, A. (2003b). MDFT account of decision making under time pressure. Psychonomic Bulletin & Review, 10, 157–166. https://doi.org/10.3758/BF03196480
Erdfelder, E., Auer, T.-S., Hilbig, B. E., Aßfalg, A., Moshagen, M., & Nadarevic, L. (2009). Multinomial processing tree models. A review of the literature. Zeitschrift für Psychologie/Journal of Psychology, 217, 108–124. https://doi.org/10.1027/0044-3409.217.3.149
Festinger, L. (1957). A theory of cognitive dissonance. Stanford, CA: Stanford University Press.
Forsythe, R., Horowitz, J. L., Savin, N. E., & Sefton, M. (1994). Fairness in simple bargaining experiments. Games and Economic Behavior, 6, 347–369. https://doi.org/10.1006/game.1994.1021
Foster, H., & Burrows, J. (2017). Hypothetical bias: A new meta-analysis. In D. McFadden & K. Train (Eds.), Contingent valuation of environmental goods (pp. 270–291). Cheltenham, UK: Edward Elgar Publishing.
Funder, D. C. (2009a). Naive and obvious questions. Perspectives on Psychological Science, 4, 340–344. https://doi.org/10.1111/j.1745-6924.2009.01135.x
Funder, D. C. (2009b). Persons, behaviors and situations: An agenda for personality psychology in the postwar era. Journal of Research in Personality, 43, 120–126. https://doi.org/10.1016/j.jrp.2008.12.041
Furr, R. M. (2009). Personality psychology as a truly behavioral science. European Journal of Personality, 23, 369–401. https://doi.org/10.1002/per.724
Harrison, G. W., & Rutström, E. E. (2008). Experimental evidence on the existence of hypothetical bias in value elicitation methods. In C. R. Plott & V. L. Smith (Eds.), Handbook of experimental economics results (pp. 752–767). Amsterdam, The Netherlands: Elsevier.
Hertwig, R., & Ortmann, A. (2001). Experimental practices in economics: A methodological challenge for psychologists? Behavioral and Brain Sciences, 24, 383–451.
Klein, S. A., Hilbig, B. E., & Heck, D. W. (2017). Which is the greater good? A social dilemma paradigm disentangling environmentalism and cooperation. Journal of Environmental Psychology, 53, 40–49. https://doi.org/10.1016/j.jenvp.2017.06.001
Lewandowski, G. W., Jr., & Strohmetz, D. B. (2009). Actions can speak as loud as words: Measuring behavior in psychological science. Social and Personality Psychology Compass, 3, 992–1002. https://doi.org/10.1111/j.1751-9004.2009.00229.x
Meredith, S. J., Dicks, M., Noel, B., & Wagstaff, C. R. D. (2017). A review of behavioural measures and research methodology in sport and exercise psychology. International Review of Sport and Exercise Psychology, 11, 25–46. https://doi.org/10.1080/1750984X.2017.1286513
Morales, A. C., Amir, O., & Lee, L. (2017). Keeping it real in experimental research – Understanding when, where, and how to enhance realism and measure consumer behavior. Journal of Consumer Research, 44, 465–476. https://doi.org/10.1093/jcr/ucx048
Moser, R., Raffaelli, R., & Notaro, S. (2013). Testing hypothetical bias with a real choice experiment using respondents’ own money. European Review of Agricultural Economics, 41, 25–46. https://doi.org/10.1093/erae/jbt016
Moshagen, M. (2010). multiTree: A computer program for the analysis of multinomial processing tree models. Behavior Research Methods, 42, 42–54. https://doi.org/10.3758/BRM.42.1.42
Murnighan, J. K., & Wang, L. (2016). The social world as an experimental game. Organizational Behavior and Human Decision Processes, 136, 80–94. https://doi.org/10.1016/j.obhdp.2016.02.003
Murphy, J. J., Allen, P. G., Stevens, T. H., & Weatherhead, D. (2005). A meta-analysis of hypothetical bias in stated preference valuation. Environmental and Resource Economics, 30, 313–325. https://doi.org/10.1007/s10640-004-3332-z
Patterson, M. L. (2008). Back to social behavior: Mining the mundane. Basic and Applied Social Psychology, 30, 93–101. https://doi.org/10.1080/01973530802208816
Patterson, M. L., Giles, H., & Teske, M. (2011). The decline of behavioral research? Examining language and communication journals. Journal of Language and Social Psychology, 30, 326–340. https://doi.org/10.1177/0261927X11407174
Plott, C. R. (1991). Will economics become an experimental science? Southern Economic Journal, 57, 901–919. https://doi.org/10.2307/1060322
Pruitt, D. G., & Kimmel, M. J. (1977). Twenty years of experimental gaming: Critique, synthesis, and suggestions for the future. Annual Review of Psychology, 28, 363–392.
Reese, G., & Kohlmann, F. (2015). Feeling global, acting ethically: Global identification and fairtrade consumption. The Journal of Social Psychology, 155, 98–106. https://doi.org/10.1080/00224545.2014.992850
Smith, V. L. (1982). Microeconomic systems as an experimental science. The American Economic Review, 72, 923–955.
Smith, V. L. (2010). Theory and experiment: What are the questions? Journal of Economic Behavior and Organization, 73, 3–15. https://doi.org/10.1016/j.jebo.2009.02.008
Thaler, R. H., & Johnson, E. J. (1990). Gambling with the house money and trying to break even: The effects of prior outcomes on risky choice. Management Science, 36, 643–660. https://doi.org/10.1287/mnsc.36.6.643

History
Received January 30, 2018
Revision received April 5, 2018
Accepted May 22, 2018
Published online February 19, 2019

Acknowledgments
The authors thank Anja Humbs, Theresa Behringer, and Janine Rispler for their help with coding the studies for the literature review.

Conflict of Interest
None.
Open Data
The detailed coding sheet, the complete questionnaires, the cognitive tasks for the fully consequential condition, and the full model equations are available at https://osf.io/z5tn6/.

Funding
This work was supported by the German Research Foundation [grant number HI-1600/6-1].

ORCID
Sina A. Klein https://orcid.org/0000-0002-8154-5429

Sina A. Klein
Cognitive Psychology Lab
Department of Psychology
University of Koblenz-Landau
Fortstraße 7
76826 Landau
Germany
klein@uni-landau.de
Appendix

Table A1. Frequency (row-wise proportion) of choices across conditions

| Condition | n | Money | Considerate chocolate option | Non-considerate chocolate option |
|---|---|---|---|---|
| Fully consequential | 222 | 122 (.55) | 58 (.26) | 42 (.19) |
| Partially consequential | 221 | 68 (.31) | 90 (.41) | 63 (.29) |
| Hypothetical | 226 | 66 (.29) | 73 (.32) | 87 (.39) |
Figure A1. Examples for tasks in the fully consequential condition.
Short Research Article
The Effect of a Verbal Concurrent Task on Visual Precision in Working Memory
Ed D. J. Berry¹, Richard J. Allen¹, Amanda H. Waterman¹, and Robert H. Logie²
¹ School of Psychology, University of Leeds, UK
² Department of Psychology, University of Edinburgh, UK
Abstract: By investigating the effect of individualized verbal load on a visual working memory task, we investigated whether working memory is better captured by modality-specific stores or a general attentional resource. A visual measure was used that allows for the precision of representations in working memory to be quantified. Bayesian analyses were employed to contrast the likelihood of our data assuming a small versus a large effect, as predicted by the differing accounts. We found evidence that the effect of verbal load on visual precision and binary feature recall was small. The results were indeterminate for the size of the dual task effect on verbal accuracy and the probability of recalling a continuous target feature. These results, in part, support a multiple component account of working memory. An analysis of how the chosen effect intervals affect the results is also reported, highlighting the importance of making specific predictions in the literature.
Keywords: working memory, dual task, cognitive load, visual memory, short-term memory
Dual task paradigms, in which participants complete two tasks individually and then concurrently, have been used since the inception of working memory research (Baddeley & Hitch, 1974). Such paradigms have likely endured due to the simplicity of their logic: If two tasks draw on the same parts of the cognitive system, then people should be worse at doing them together than carrying out the single tasks alone. This logic has been pivotal in establishing multiple component accounts of working memory (Baddeley, 2012). Such accounts vary but share the view that storage in working memory (WM) is served by distinct stores for phonological versus visuospatial information, together with an executive resource that co-ordinates the domain-specific stores. In addition, Baddeley (2000) proposed an amodal store, the episodic buffer, to store integrated items. Others have suggested that binding is served by communication between domain-specific resources without the need for the concept of an executive resource or an episodic buffer (Logie, 2016) and that there may be multiple “executive” resources, each of which supports a specific function, including task switching, updating, and inhibition (e.g., Miyake et al., 2000), as well as communication between domain-specific stores, and implementation of mnemonic strategies.

Classically, the observation that dual task interference is limited when each task involves different modalities has been taken as evidence for modality-specific storage capacities. For example, Cocchini, Logie, Della Sala, MacPherson, and Baddeley (2002) asked participants to retain a sequence of digits with sequence length set at the span of each participant. During a 15-second retention interval, participants either saw a blank screen or were shown a series of random square matrix patterns in which half the squares were black and half white. Following each pattern, they were shown a blank matrix and were asked to recall which squares had previously been shown in black. After the blank or filled retention interval, participants were asked to recall the digit sequence. The matrix recall task was also performed without the verbal memory preload. Recall of the digits was unaffected by the matrix recall task during the retention interval, and recall of the matrix patterns was unaffected by having a digit memory preload. In contrast, when a single matrix pattern was used as a memory preload, and the 15-second retention interval was filled by a perceptuo-motor tracking task, there was significant disruption of recall of the matrix pattern, relative to the condition with a blank retention interval. Digit recall was unaffected when the retention interval was filled with the tracking task.

© 2019 Hogrefe Publishing. Distributed under the Hogrefe OpenMind License (http://dx.doi.org/10.1027/a000001)
Experimental Psychology (2019), 66(1), 77–85. https://doi.org/10.1027/1618-3169/a000428
Ed D. J. Berry et al., The Effect of a Verbal Concurrent Task on Visual Precision
This picture has been complicated recently by the observation of asymmetric dual task costs between verbal and visual domains, where the effect of a verbal load on visual working memory was found to be larger than the effect of a visual load on verbal memory (e.g., Morey, Morey, van der Reijden, & Holweg, 2013; see Morey, 2018 for a review). This contrasts with previous studies that have shown the domain-specific verbal and visual dual task costs to be symmetric (Farmer, Berman, & Fletcher, 1986; Logie, 1986; Logie, Zucco, & Baddeley, 1990). The manipulation of the cognitive load of a secondary task (e.g., Doherty & Logie, 2016; Logie, Cocchini, Della Sala, & Baddeley, 2004) has also been important for multiple component models. The load can be set so that it is below, at, or above each participant’s capacity to store information in service of a single task. Here, the concern is not the “code” in which information is stored but the role of executive resources in a task. Any detrimental effects on a concurrent task of increasing cognitive load above the capacity of a passive store might imply a role for executive resources in that task, for example, to implement mnemonic strategies. In contrast to the multiple component accounts of working memory, the embedded processes account does not make use of modality-specific stores (Cowan, 2005). Rather, as items are presented their features are automatically activated in long-term memory (LTM). Embedded within this activated memory, a small number of integrated items can be represented in a domain-general focus of attention. Executive functions are not part of the model as such but play a role in influencing what enters the focus of attention. In contrast to the Cocchini et al. (2002) study, work informed by the embedded processes account has demonstrated substantial dual task interference between modalities (Morey & Cowan, 2004, 2005). 
The memoranda in such cases are thought to share the limited focus of attention, resulting in the observed drop in performance. Proponents of embedded processes do not deny the possibility of passive storage contributing to performance in working memory (e.g., Morey & Cowan, 2005), drawing on item features that are currently activated in LTM but are outside the focus of attention. The question is whether all storage in WM can be accounted for by activated LTM, or if there are, in addition, domain-specific stores that provide the primary hosts for temporary memory within a multiple component working memory (Logie, 2016; see also Norris, 2017). Here, we aimed to contrast the predictions of Logie’s (2011, 2016) multiple component account and a general attentional resource account using a dual task paradigm, within the context of a continuous response task. Following Gorgoraptis, Catalao, Bays, and Husain (2011, Experiment 2), our visual task required participants to view
an array of colored bars, each shown in a different orientation, and subsequently to recall the angle of orientation of a target bar by circular, analog adjustment. By requiring analog recall of the angle of orientation of a target stimulus, we obtained a fine-grained measure of the quality of representations in WM (Zokaei, Burnett Heyes, Gorgoraptis, Budhdeo, & Husain, 2015). Such a measure may be more sensitive to potential dual task interference than measures of item recall that are widely used in dual task studies. In addition, we required participants to judge which of two colors was present in a test array. This allowed us to explore binary recall of a feature (color) alongside analog recall of another feature for the same item (e.g., Pertzov, Heider, Liang, & Husain, 2015). As well as performing this task on its own, participants performed it while maintaining lists of letters in memory. Crucially, here we titrated the verbal load such that the list length was set to each participant’s span (e.g., Cocchini et al., 2002; Logie et al., 2004). Participants completed the letter recall task with and without the interleaved visual memory task. Given that span is assumed to reflect the maximum capacity for immediate memory, a general attentional resource theory (e.g., Cowan, 2005) would assume that storing a letter sequence at span should use most, if not all, of the focus of attention. Therefore, combining the verbal and visual tasks should result in a substantial cost to performance on one or both memory tasks. The multiple component account of working memory assumes that there are domain-specific immediate memory systems, respectively, for temporary visual storage (the visual cache, Logie, 1995) and for phonologically based temporary verbal storage (the phonological loop, Baddeley, 1992).
A strict interpretation of this theoretical framework would predict that there would be no reduction in performance under dual task conditions, if we assume that span for any given task is a pure measure of the capacity of one domain-specific system that supports performance on that task. However, in the multiple component framework (Logie, 2016), it is assumed that when one component reaches its capacity limit, other components of the system are recruited to support performance. Span provides a measure of the capacity of the cognitive system to perform the task, and this may reflect the use of more than one working memory component to support performance. For example, if a set of letters is presented visually for serial ordered recall, then it is well established that participants typically rely on a phonologically based representation of the letters (e.g., Conrad, 1964), assumed to involve the phonological loop. However, several studies have demonstrated that participants may also retain a representation of the visual appearance of the letters (e.g., Logie, Della Sala, Wynn, & Baddeley, 2000; Logie, Saito, Morita, Varma, & Norris, 2016). That is, span for visually presented letters may
involve both the phonological loop and at least some of the capacity of the visual cache. If memory for the letters is then combined with another task that involves visual memory or visual processing, not all of the capacity of the visual component would be available. So, there would be a small overall reduction in dual task compared with single task performance. This effect could be mitigated by using auditory presentation of the verbal memoranda, coupled with a visually presented, nonverbal task. This is the approach used in the experiment that we report here. However, a task designed to test visual working memory might gain support from some verbal storage or processing (e.g., names of shapes or colors, or spatial orientations), even if the main load is on a specific visual component of working memory, so there may still be a small dual task cost even when using different input modalities. Previous studies providing evidence for domain-specific components of working memory almost invariably show such a cost, interpreted as above. However, most striking is that the dual task cost, even if statistically robust, tends to be small compared to the residual levels of performance on each task when they are performed concurrently (e.g., Cocchini et al., 2002; Duff & Logie, 2001; Logie et al., 1990, 2004). Very much larger dual task costs are observed when both tasks are chosen to rely primarily on the same component of working memory (e.g., Logie et al., 1990). Based on these assumptions, the current study was designed to test distinct (preregistered) predictions concerning dual task costs, derived from different theoretical approaches to working memory, with the multi-component model predicting a small effect size associated with dual task costs, and the embedded processes model a medium to large effect size.
Method
This study was preregistered on the Open Science Framework. The preregistration form and a time-stamped archive of the task and analysis scripts can be found at https://osf.io/e5bkg. The data, materials, and all the analysis scripts can be found at https://osf.io/59c4g/.
Participants
Thirty participants were tested (mean age = 22.97 years; range = 19–30). Participants were paid £7 for participating and had normal or corrected-to-normal vision. Participants would have been excluded if they had a letter span of less than three or a known cognitive difficulty (e.g., dyslexia). No participants were excluded. Ethical approval was obtained from the School of Psychology Ethics Committee, University of Leeds, UK.
Materials
All tasks were written in PsychoPy 1.84 (Peirce, 2007). The code for all the tasks is available at https://osf.io/59c4g/.
Letter Span
Each participant’s verbal letter span was determined using a staircase procedure. Letters were presented over headphones at a rate of one per second followed by an 8-second retention interval. Participants then orally recalled the letters in order, with the experimenter typing responses on a second hidden screen. Participants began with a pair of trials with lists of five letters. If 80% or more of the items were correctly recalled (in the correct list position), then the list length was increased by one; otherwise, list length was decreased by one. Participants continued this procedure for eight pairs of trials. If a participant achieved over 80% of items correct on their final pair of trials, and it was the highest list length they had reached, then additional trials were presented until less than 80% of items were correctly recalled. A participant’s span was the longest list length at which 80% or more of the items were correctly recalled. Letters were randomly selected from a pool of 18 letters that excluded vowels and “y” (b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, and x) (Figure 1).
Figure 1. A trial of the staircase span/letter recall tasks.
Letter Recall Task
For the letter recall task, participants completed 20 trials in which lists of letters were presented aurally for spoken recall. The list length on all trials was set at the previously measured span for each individual participant. The timings were identical to the letter span task.
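The staircase rule used to measure letter span can be illustrated with a short Python sketch. This is not the authors' PsychoPy code (which is available from the OSF link above); it is a minimal simulation under stated assumptions. The function names, the `recall_fn` callback that stands in for a participant, sampling letters without replacement, pooling the two trials of a pair when applying the 80% criterion, and extending the final pair by one item at a time are all assumptions made for illustration.

```python
import random

# The 18 consonants listed in the text (vowels and "y" excluded).
LETTER_POOL = list("bcdfghjklmnpqrstvx")


def proportion_correct(presented, recalled):
    """Proportion of items recalled in their correct serial position."""
    hits = sum(1 for i, item in enumerate(presented)
               if i < len(recalled) and recalled[i] == item)
    return hits / len(presented)


def run_pair(length, recall_fn, rng):
    """Present two lists of equal length; return the pooled proportion correct."""
    scores = []
    for _ in range(2):
        # Assumption: letters are drawn without replacement within a list.
        presented = rng.sample(LETTER_POOL, length)
        scores.append(proportion_correct(presented, recall_fn(presented)))
    return sum(scores) / 2


def staircase_span(recall_fn, start_length=5, n_pairs=8, criterion=0.8, seed=None):
    """Return the longest list length at which the criterion was met, or None."""
    rng = random.Random(seed)
    length = start_length
    highest = start_length
    passed = []            # list lengths at which the criterion was met
    last_length = length
    last_passed = False
    for _ in range(n_pairs):
        highest = max(highest, length)
        last_length = length
        last_passed = run_pair(length, recall_fn, rng) >= criterion
        if last_passed:
            passed.append(length)
            length = min(length + 1, len(LETTER_POOL))
        else:
            length = max(length - 1, 1)
    # Assumed form of the extension rule: if the final pair was passed at the
    # highest length reached, keep adding one item until the criterion is missed.
    extend = last_passed and last_length == highest
    while extend and last_length < len(LETTER_POOL):
        last_length += 1
        extend = run_pair(last_length, recall_fn, rng) >= criterion
        if extend:
            passed.append(last_length)
    return max(passed) if passed else None
```

As a usage example, a simulated participant who always recalls only the first six letters in order, `staircase_span(lambda presented: presented[:6])`, passes lists of seven (6/7 correct) but fails lists of eight (6/8 correct), so the estimated span is 7.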
Orientation Recall Task
Participants were presented with 3-item arrays of colored bars measuring 2° × 0.3° of visual angle at different orientations (see Figure 2). The colors of the three items were randomly selected from a set of eight easily distinguishable colors (red, orange, yellow, green, cyan, blue, pink, and purple). The orientations were randomly selected such that no two bars in an array were within 10° of one another. The items were presented at a random subset of eight possible locations equidistant on an invisible circle around fixation with a radius of 6° of visual angle. The study array was presented for 1 s followed by a 1-second retention interval. Following the retention interval, two colored bars were presented and participants had to indicate which of the colors was present in the first display. The target was always one of the two colors presented in this recognition phase. After participants made their response, and following a further 0.5 s delay, a bar of the same color that the participant selected was displayed in the center of the screen. Participants were required to recall the orientation of the bar of that color in the 3-item array using the “left” and “right” arrow keys on a keyboard to adjust the orientation. Each participant completed 60 trials.
Figure 2. A trial of the orientation recall task. For the dual task, such a trial was completed in place of the 8-second retention interval of the letter recall task. Shades of gray represent different colors. Not to scale.
Dual Task
For the dual task, participants were presented with an at-span list of letters and completed a trial of the orientation recall task in place of the 8-second retention interval. The letters were then recalled orally. Participants completed 60 such trials.
Design
All participants completed all tasks in a single session lasting approximately 1 hr with the span task first and the dual task last. The order of the two single tasks was counterbalanced.
Results
Outcome Measures
Following our preregistration (https://osf.io/e5bkg), four outcome measures were used to evaluate the dual task interference effect. The first two measures were selected to reflect previous work quantifying the fidelity with which continuous, or analog, features are represented in working memory (e.g., Bays, Catalao, & Husain, 2009; Gorgoraptis et al., 2011).
1. Precision: the reciprocal of the circular SD of the response errors (target orientation minus response orientation), corrected for guessing by subtracting the precision expected under a uniform response distribution.
2. Probability of making a target response: estimated using the mixture model described in Bays et al. (2009).
3. Color judgment accuracy: the proportion of correct absent/present color judgments for the visual task.
4. Letter recall accuracy: the proportion of items correctly recalled for the letter recall task.
Code to calculate the first two outcome variables is implemented in Matlab by Paul Bays (http://www.paulbays.com/code/JV10/index.php) and has been translated into R by EDJB (https://github.com/eddjberry/precision-mixture-model).
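The precision measure can be illustrated with a short Python sketch. It is not a port of the Matlab or R code cited above. It assumes that the circular SD is defined as sqrt(-2 ln R), where R is the mean resultant length, that angles are given in radians on the full circle (how bar orientations are mapped onto the circle is not stated here), and it estimates the chance-level precision by simulating uniform responses rather than by whatever correction the cited code applies. All function names are hypothetical.

```python
import math
import random


def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def circular_sd(errors):
    """Circular SD, assumed here to be sqrt(-2 ln R) with R the resultant length."""
    n = len(errors)
    mean_cos = sum(math.cos(e) for e in errors) / n
    mean_sin = sum(math.sin(e) for e in errors) / n
    resultant = min(math.hypot(mean_cos, mean_sin), 1.0)
    if resultant == 0.0:
        return math.inf
    return math.sqrt(-2.0 * math.log(resultant))


def raw_precision(errors):
    """Reciprocal of the circular SD of the errors."""
    sd = circular_sd(errors)
    return math.inf if sd == 0.0 else 1.0 / sd


def chance_precision(n_trials, n_sims=2000, seed=0):
    """Monte Carlo estimate of the precision expected under pure guessing."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        errors = [rng.uniform(-math.pi, math.pi) for _ in range(n_trials)]
        total += raw_precision(errors)
    return total / n_sims


def precision(responses, targets):
    """Precision corrected for guessing, for paired response and target angles."""
    errors = [wrap(t - r) for r, t in zip(responses, targets)]
    return raw_precision(errors) - chance_precision(len(errors))
```

With this correction, a participant who responds at random has a corrected precision close to zero, whereas tightly clustered errors give a large positive value. The correction depends on the number of trials because the raw precision of a small uniform sample is biased upward.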
Confirmatory Analysis
Analysis was carried out using the BayesFactor package (Morey & Rouder, 2015) in R (R Core Team, 2016). First, for each outcome measure, the posterior estimates and 95% Bayesian credible interval for the mean difference between the single and dual task conditions are reported. A 95% credible interval that excludes zero could be interpreted as suggesting a difference between the two conditions (Kruschke, 2014). Second, Bayes Factors are used to determine how likely the data are under a model assuming a small dual task interference effect versus a model assuming a medium to large effect. A small effect was defined as the interval from 0 to 0.3 for a standardized effect size. A medium to large effect was defined as the interval from 0.5 to infinity (e.g., R. D. Morey, 2014). Figure 3 shows how the prior density was distributed over each of these effect size intervals in the analysis. It is clear from Figure 3 that while the large effect interval was from 0.5 to infinity, the prior density approaches zero as effect sizes approach infinity. For the Bayes Factor analysis, values greater than 1 indicate support for a small effect over a large effect. A Bayes Factor of 5 in favor of either model was selected a priori to indicate substantive support from the data. We acknowledge that Bayes Factors should primarily be used to inform the relative plausibility of competing models but suggest that their use as the basis for decision criteria can be instructive (e.g., Jeffreys, 1961). Finally, posterior estimates of the effect size for the difference between the single and dual task condition for each outcome measure are reported (see also Figure 4).
Figure 3. The two effect size intervals for the small and large effect models. As can be seen, for the large effect model the majority of the prior density was distributed over values of less than 2.
Figure 4. Violin plots for the distribution of posterior estimates of the crossmodal interference effect size. Horizontal lines represent 2.5, 50, and 97.5th quantiles.
The median posterior estimate for the mean difference between precision for the single (M = 0.61, SD = 0.3) and dual task (M = 0.54, SD = 0.32) conditions was 0.07
(95% credible interval [−0.0064, 0.15]). The data were 6.29 times more likely under a model assuming a small versus a large effect. The median posterior estimate for the effect size of the change in precision between the single and dual task conditions was 0.32 (95% credible interval [−0.027, 0.68]).
For the probability of recalling the target orientation, the median estimate of the mean difference between the single (M = 0.83, SD = 0.17) and dual task (M = 0.71, SD = 0.27) conditions was 0.11 (95% credible interval [0.015, 0.2]). The Bayes Factor in support of a small versus a large effect was 1.93. Finally, the median estimate for the effect size was 0.41 (95% credible interval [0.054, 0.79]).
The median estimate for the difference in color judgment accuracy between the single (M = 0.91, SD = 0.072) and dual task (M = 0.9, SD = 0.06) conditions was 0.014 (95% credible interval [−0.011, 0.039]). The Bayes Factor in support of a small versus a large effect was 24.09. The median estimate for the effect size was 0.2 (95% credible interval [−0.16, 0.57]).
The median estimate for the difference in the proportion of letters correctly recalled between the single (M = 0.79, SD = 0.12) and dual task (M = 0.76, SD = 0.13) conditions was 0.031 (95% credible interval [0.0071, 0.056]). The Bayes Factor in support of a small versus a large effect was 1.02. The median estimate for the effect size was 0.46 (95% credible interval [0.1, 0.84]).
The average span for participants was 6.23 (SD = 1.05; range = 5–9). Overall, the Bayes Factors and effect size estimates are generally in alignment, indicating some support for small rather than medium-large dual task effects, and effect size estimates of between 0.2 and 0.5.
Table 1. The Bayes Factors (BF) in support of a small versus large dual task effect for different effect size intervals. Values lower than 1 indicate support for the embedded processes rather than the multiple component model effect size interval. Asterisks mark cases where either interval is supported by a BF of at least 5.
| Multiple component interval | Embedded processes interval | Color judgment | Precision | Probability target | Verbal accuracy |
|---|---|---|---|---|---|
| (0, 0.1) | (0.3, Inf) | 4.27 | 1.15 | 0.37 | 0.19* |
| (0, 0.2) | (0.3, Inf) | 5.09* | 1.71 | 0.66 | 0.38 |
| (0, 0.3) | (0.3, Inf) | 5.50* | 2.28 | 1.04 | 0.66 |
| (0, 0.1) | (0.5, Inf) | 18.71* | 3.17 | 0.69 | 0.30 |
| (0, 0.2) | (0.5, Inf) | 22.29* | 4.74 | 1.22 | 0.58 |
| (0, 0.3) | (0.5, Inf) | 24.09* | 6.29* | 1.93 | 1.02 |
| (0, 0.1) | (0.8, Inf) | 1,002.19* | 85.12* | 8.28* | 2.37 |
| (0, 0.2) | (0.8, Inf) | 1,194.33* | 127.15* | 14.64* | 4.60 |
| (0, 0.3) | (0.8, Inf) | 1,290.44* | 168.92* | 23.17* | 8.03* |
Note. Inf = Infinity.
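The comparison of a small-effect interval against a large-effect interval can be sketched in Python. The sketch is not the BayesFactor package's own computation. It assumes that posterior samples of the standardized effect size are available from an unrestricted model with a zero-centered Cauchy prior, and it uses the identity that the Bayes Factor for an interval against the unrestricted model equals the posterior mass in the interval divided by its prior mass. The prior scale of sqrt(2)/2 and all function names are assumptions; the prior scale used in the analysis is not stated in the text.

```python
import math


def cauchy_mass(lo, hi, scale):
    """Prior probability that a Cauchy(0, scale) effect size lies in (lo, hi)."""
    def cdf(x):
        return 0.5 + math.atan(x / scale) / math.pi
    return cdf(hi) - cdf(lo)


def interval_bayes_factor(posterior_samples, interval_a, interval_b,
                          scale=math.sqrt(2) / 2):
    """Bayes Factor for 'effect in interval_a' versus 'effect in interval_b'.

    posterior_samples: draws of the standardized effect size from the
    unrestricted model (assumed Cauchy(0, scale) prior).
    """
    n = len(posterior_samples)

    def posterior_mass(interval):
        lo, hi = interval
        return sum(lo < d < hi for d in posterior_samples) / n

    post_a = posterior_mass(interval_a)
    post_b = posterior_mass(interval_b)
    if post_b == 0:
        raise ValueError("no posterior samples fall in interval_b")
    posterior_odds = post_a / post_b
    prior_odds = cauchy_mass(*interval_a, scale) / cauchy_mass(*interval_b, scale)
    # Ratio of posterior odds to prior odds for the two intervals.
    return posterior_odds / prior_odds
```

For example, `interval_bayes_factor(samples, (0, 0.3), (0.5, math.inf))` returns a value above 1 when the posterior has shifted toward the small-effect interval relative to the prior. Because the estimate depends on how many samples fall in each interval, a large number of posterior draws is needed when one interval has little posterior mass.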
Exploratory Analysis
Given the reduction in the probability of recalling the orientation of the target bar between the single and dual task conditions, we explored whether this accompanied an increase in the probability of recalling the orientation of the other items in the array (non-targets) or guessing. The median estimate for the difference in the probability of recalling a non-target orientation between the single (M = 0.0000088, SD = 0.000018) and dual task (M = 0.000059, SD = 0.00015) conditions was −0.000045 (95% credible interval [−0.0001, 0.0000077]). The median estimate for the difference in the probability of a uniform response distribution, that is, guessing, between the single (M = 0.17, SD = 0.17) and dual task (M = 0.29, SD = 0.27) conditions was −0.11 (95% credible interval [−0.2, −0.014]).
While we feel the intervals chosen for the multiple component and embedded processes models are justified, we acknowledge others may disagree and prefer alternate intervals. To accommodate this, Table 1 repeats the confirmatory Bayes Factor analysis while varying the intervals chosen to represent the two models. As the upper limit of the multiple component model interval is reduced, the evidence in favor of that model decreases. We have also created a Shiny web application (Chang, Cheng, Allaire, Xie, & McPherson, 2018), available at https://edjberry.shinyapps.io/BF_intervals/, where readers can select their own intervals for the two models.
Finally, the analysis was rerun using only those trials where verbal accuracy was 100%. This criterion is commonly used when evaluating dual task effects (e.g., Morey & Cowan, 2004). The mixture model could not be used for this subset as there were insufficient trials for the model to converge. Nevertheless, the analyses of precision and color recall accuracy could still be carried out. For precision, the Bayes Factor in support of a small versus a large effect was 323.9. The median estimate for the dual task effect size was 0.04 (95% credible interval [−0.30, 0.38]). For color recall accuracy, the small effect model was supported by a Bayes Factor of 1,994.5 (median effect size estimate: −0.13; 95% credible interval [−0.49, 0.22]).
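The target, non-target, and guessing probabilities discussed in this section are the three weights of the mixture model described in Bays et al. (2009). A minimal Python sketch of the density that such a model assigns to a single response is given below. It is an illustration, not the fitting code used in the study: the function names are hypothetical, angles are assumed to be in radians on the full circle, and parameter estimation (for example, by maximum likelihood) is omitted.

```python
import math


def bessel_i0(x, tol=1e-16):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    term, total, k = 1.0, 1.0, 0
    while term > tol * total:
        k += 1
        term *= (x / 2) ** 2 / k ** 2
        total += term
    return total


def von_mises_pdf(angle, kappa):
    """Von Mises density centered on zero with concentration kappa."""
    return math.exp(kappa * math.cos(angle)) / (2 * math.pi * bessel_i0(kappa))


def mixture_density(response, target, nontargets, kappa, p_target, p_nontarget):
    """Density of one response under the three-component mixture.

    p_target:    probability of responding to the target orientation
    p_nontarget: probability of responding to one of the non-target orientations
    The remainder, 1 - p_target - p_nontarget, is the guessing (uniform) weight.
    """
    p_uniform = 1.0 - p_target - p_nontarget
    density = p_target * von_mises_pdf(response - target, kappa)
    if nontargets:
        # Non-target responses are spread equally over the other array items.
        density += p_nontarget * sum(
            von_mises_pdf(response - nt, kappa) for nt in nontargets
        ) / len(nontargets)
    density += p_uniform / (2 * math.pi)
    return density
```

Under this formulation, a drop in the target weight must be absorbed by the non-target weight, the uniform weight, or both, which is why the exploratory analysis examines those two probabilities.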
Discussion
This study investigated the magnitude of the cross-modal interference effect for concurrently remembering verbal (letter sequences) and visual information (orientation of colored bars). Bayes Factor analyses supported the prediction that there is a small, rather than a large, reduction in
the precision with which items are represented in visual working memory when required to concurrently maintain verbal information. This analysis also supported the prediction that there would be a small reduction in the accuracy of recalling categorical color information for visual items. For the probability of recalling the target orientation, the data were more likely under a small effect model but failed to meet our a priori cut-off. The results for the letter recall were also indeterminate with respect to our two predictions. Thus, while not all our measures reached the preregistered cut-off for providing evidence to differentiate between the two models, on balance our results provide more support for multiple component accounts of WM and little clear support for a domain-general resource account of storage in WM. The exploratory analyses showed that the reduction in the probability of recalling the target orientation in the dual task condition resulted in an increased probability of making a uniform response (i.e., guessing), rather than participants being more likely to recall a non-target orientation. Thus, a verbal load does not appear to increase the probability of mis-binding errors due to item features interfering in visual WM. When discussing dual task effects, it is useful to distinguish between perceptual-motor and cognitive processes (e.g., Thalmann & Oberauer, 2016). Dual task costs are typically largest, both within and between modalities, when the interfering tasks involve overlapping perceptual-motor processing (Thalmann & Oberauer, 2016). For example, one would expect interference to be greater if both tasks require participants to make verbal responses. In contrast, dual task interference is generally smaller when the two tasks share only cognitive processes. This distinction fits nicely with our results as we ensured there was no overlap in perceptual-motor processes for our tasks. 
With our verbal task, stimuli were presented aurally and responses were spoken. For the visual task, stimuli were presented visually with manual responses. Therefore, our results are not contaminated by within-modality perceptual-motor interference inflating supposed cross-modal interference effects. This distinction is important where we want to isolate our inquiry to cognitive processes distinct from the attentional bottlenecks at input or during response output. One possible limitation of this work could be that 3-item visual arrays are insufficient to capture most or all of a domain-general storage capacity in working memory. This would leave additional capacity to maintain verbal items, resulting in the small dual task costs we observe. However, the difficulty of the verbal task was set individually at each participant’s measured span precisely to address this concern. It was assumed that participants would use any domain-general storage capacity, in addition to passive storage, to maximize performance on the letter span task.
This domain-general capacity would then not be available to the same extent in the dual task condition, resulting in a large deterioration in performance. While there was a reduction in performance, this did not meet our preregistered decision rule for a medium to large effect. This highlights the need for the field to focus on effect size predictions rather than simply whether effects are observed or not. We did not titrate the visual task given the difficulty of doing so with a continuous (analog) response task: performance can only be evaluated from a large number of trials, meaning that any staircase procedure would be prohibitively long to complete.
The exploratory analysis reported in Table 1 illustrates how the strength of evidence can shift when the small and large effect size intervals are varied. The evidence for a small dual task effect ranged from “decisive” to “barely worth mentioning” (Jeffreys, 1961), depending on the particular pair of intervals that was selected. However, on balance, the preregistered intervals indicate some support for small dual task effects and no evidence for large effects. The Shiny app we have created allows readers to choose their own pair of intervals and see how the results are affected (see Table 1).
Although the outcomes of this study somewhat favor a multiple component approach, they do not decisively distinguish between this and a general resource account of WM. Challenges remain in identifying effective methodological tools to cleanly distinguish between theoretical accounts. For example, the overall pattern of relatively small dual task costs observed in the present study might be captured by appealing to alternative distinctions between different forms of storage (namely the focus of attention and activated LTM) described within embedded processes accounts. Thus, the visual task could be accomplished via the focus of attention, while the letter stimuli in the verbal serial recall task are held in activated LTM.
However, this speculative explanation is unlikely, not least given the recent persuasive arguments by Norris (2017) that LTM does not provide a plausible way of retaining serial order information. More broadly, it is notable that Cowan, Saults, and Blume (2014) modified the original Cowan (2005) theoretical framework by arguing for a peripheral component of working memory that functions like the phonological loop in the multiple component models and is separate from the focus of attention. This suggests that the embedded processes and the multi-component accounts might be starting to resolve their differences (see Baddeley, 2012; Gray et al., 2017; Hu, Hitch, Baddeley, Zhang, & Allen, 2014). The two accounts could still be contrasted by increasing the number of to-be-remembered items in the visual task. At least one multiple component account (Duff & Logie, 2001; Logie, 2011; Logie et al., 2004) explains dual task interference by suggesting that a fixed amount of general
processing resource is required when completing two tasks simultaneously. This means the magnitude of the interference effect should remain constant under an increase in the overall demand of the two tasks. Some evidence for this was reported by Logie et al. (2004) but with small sample sizes in a study focused on contrasting healthy aging with Alzheimer’s disease. Crucially, in that study, the visual processing task involved following a moving target around a computer screen, so engaged perceptuo-motor processing load rather than memory for visual items. As noted earlier, memory for visual material may be supplemented by using verbal codes. For example, in the current experiment, approximate orientations might be coded as points on a compass (north, northeast, southwest, etc.). This might help explain why some of our measures (namely probability of target orientation recall and letter recall) did not reach the preregistered cut-off for small dual task effects. Future studies could adopt different methodologies, for example, using difficult-to-name colors or different shades of the same color as the visual memoranda, which could further minimize the potential contribution of verbal coding. If the storage of verbal and visual items is dissociable, then increasing the number of visual items should simply result in more visual information being forgotten in both the single and dual task conditions. This would not be affected by the imposition of verbal load over and above the cost associated with performing the two tasks that we observe here. On the other hand, if a domain-general storage capacity supports storage in WM, then the dual task cost should be larger when the number of items for the visual task is increased. If a larger total number of items draws on a shared storage capacity, the reduction in performance under dual task load should be more pronounced, supporting a domain-general account of WM. 
Future work could also investigate whether other factors that affect dual task interference, such as verbal rehearsal of the concurrent task (Morey & Cowan, 2005), generalize to precision measures. This study represents an attempt to quantify the magnitude of dual task costs that emerge between verbal and (continuous and categorical) visual memory and compare these against preregistered predictions derived from multi-component and embedded processes accounts of working memory. Overall, two of our measures (visual precision and color judgment accuracy) produced evidence for small dual task costs as predicted by the multi-component approach, while the remaining two were equivocal and did not reach our preregistered criteria for either small or medium to large effects. The outcomes of this study, as well as the opportunity to contrast how the adoption of differing effect size intervals provides shifting evidence for different models, should connect to and inform the ongoing movement to more robustly and transparently test theoretical accounts of working memory.
References
Baddeley, A. D. (1992). Working memory. Science, 255, 556–559.
Baddeley, A. D. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4, 417–423. http://doi.org/10.1016/S1364-6613(00)01538-2
Baddeley, A. D. (2012). Working memory: Theories, models, and controversies. Annual Review of Psychology, 63, 1–29. https://doi.org/10.1146/annurev-psych-120710-100422
Baddeley, A. D., & Hitch, G. (1974). Working memory. Psychology of Learning and Motivation, 8, 47–89. https://doi.org/10.1016/S0079-7421(08)60452-1
Bays, P. M., Catalao, R. F. G., & Husain, M. (2009). The precision of visual working memory is set by allocation of a shared resource. Journal of Vision, 9, 7.1–7.11. https://doi.org/10.1167/9.10.7
Chang, W., Cheng, J., Allaire, J., Xie, Y., & McPherson, J. (2018). shiny: Web application framework for R. R package version 1.2.0. Retrieved from https://CRAN.R-project.org/package=shiny
Cocchini, G., Logie, R. H., Della Sala, S., MacPherson, S. E., & Baddeley, A. D. (2002). Concurrent performance of two memory tasks: Evidence for domain-specific working memory systems. Memory & Cognition, 30, 1086–1095. https://doi.org/10.3758/BF03194326
Conrad, R. (1964). Acoustic confusions in immediate memory. British Journal of Psychology, 55, 75–84. https://doi.org/10.1111/j.2044-8295.1964.tb00899.x
Cowan, N. (2005). Working memory capacity. New York, NY: Psychology Press.
Cowan, N., Saults, J. S., & Blume, C. L. (2014). Central and peripheral components of working memory storage. Journal of Experimental Psychology: General, 143, 1806–1836. https://doi.org/10.1037/a0036814
Doherty, J. M., & Logie, R. H. (2016). Resource-sharing in multiple-component working memory. Memory & Cognition, 44, 1157–1167. https://doi.org/10.3758/s13421-016-0626-7
Duff, S. C., & Logie, R. H. (2001). Processing and storage in working memory span. The Quarterly Journal of Experimental Psychology, 54A, 31–48. http://doi.org/10.1080/02724980042000011
Farmer, E., Berman, I., & Fletcher, Y. (1986). Evidence for a visuospatial scratch-pad in working memory. The Quarterly Journal of Experimental Psychology, 38A, 675–688. https://doi.org/10.1080/14640748608401620
Gorgoraptis, N., Catalao, R. F. G., Bays, P. M., & Husain, M. (2011). Dynamic updating of working memory resources for visual objects. Journal of Neuroscience, 31, 8502–8511. https://doi.org/10.1523/JNEUROSCI.0208-11.2011
Gray, S., Green, S., Alt, M., Hogan, T., Kuo, T., Brinkley, S., & Cowan, N. (2017). The structure of working memory in young children and its relation to intelligence. Journal of Memory and Language, 92, 183–201. https://doi.org/10.1016/j.jml.2016.06.004
Hu, Y., Hitch, G. J., Baddeley, A. D., Zhang, M., & Allen, R. J. (2014). Executive and perceptual attention play different roles in visual working memory: Evidence from suffix and strategy effects. Journal of Experimental Psychology: Human Perception and Performance, 40, 1665–1678. https://doi.org/10.1037/a0037163
Jeffreys, H. (1961). Theory of probability. Oxford, UK: Oxford University Press.
Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Cambridge, MA: Academic Press.
Logie, R. H. (1986). Visuo-spatial processing in working memory. The Quarterly Journal of Experimental Psychology, 38A, 229–247. https://doi.org/10.1080/14640748608401596
© 2019 Hogrefe Publishing. Distributed under the Hogrefe OpenMind License http://dx.doi.org/10.1027/a000001
Ed D. J. Berry et al., The Effect of a Verbal Concurrent Task on Visual Precision
Logie, R. H. (1995). Visuo-spatial working memory. Hove, UK: Erlbaum.
Logie, R. H. (2011). The functional organisation and the capacity limits of working memory. Current Directions in Psychological Science, 20(4), 240–245. https://doi.org/10.1177/0963721411415340
Logie, R. H. (2016). Retiring the central executive. The Quarterly Journal of Experimental Psychology, 69, 2093–2109. https://doi.org/10.1080/17470218.2015.1136657
Logie, R. H., Cocchini, G., Della Sala, S., & Baddeley, A. D. (2004). Is there a specific executive capacity for dual task coordination? Evidence from Alzheimer’s disease. Neuropsychology, 18, 504–513. https://doi.org/10.1037/0894-4105.18.3.504
Logie, R. H., Della Sala, S., Wynn, V., & Baddeley, A. D. (2000). Visual similarity effects in immediate verbal serial recall. The Quarterly Journal of Experimental Psychology, 53A, 626–646. https://doi.org/10.1080/713755916
Logie, R. H., Saito, S., Morita, A., Varma, S., & Norris, D. (2016). Recalling visual serial order for verbal sequences. Memory & Cognition, 44, 590–607. https://doi.org/10.3758/s13421-015-0580-9
Logie, R. H., Zucco, G., & Baddeley, A. D. (1990). Interference with visual short-term memory. Acta Psychologica, 75, 55–74. https://doi.org/10.1016/0001-6918(90)90066-O
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex “frontal Lobe” tasks: A latent variable analysis. Cognitive Psychology, 41, 49–100. https://doi.org/10.1006/cogp.1999.0734
Morey, C. C. (2018). The case against specialized visual-spatial short-term memory. Psychological Bulletin, 144, 849–883. https://doi.org/10.1037/bul0000155
Morey, C. C., & Cowan, N. (2004). When visual and verbal memories compete: Evidence of cross-domain limits in working memory. Psychonomic Bulletin & Review, 11, 296–301. https://doi.org/10.3758/BF03196573
Morey, C. C., & Cowan, N. (2005). When do visual and verbal memories conflict? The importance of working-memory load and retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 703–713. https://doi.org/10.1037/0278-7393.31.4.703
Morey, C. C., Morey, R. D., van der Reijden, M., & Holweg, M. (2013). Asymmetric cross-domain interference between two working memory tasks: Implications for models of working memory. Journal of Memory and Language, 69, 324–348. https://doi.org/10.1016/j.jml.2013.04.004
Morey, R. D. (2014). Bayes factor t tests, part 2: Two-sample tests [Blog post]. Retrieved from http://bayesfactor.blogspot.co.uk/2014/02/bayes-factor-t-tests-part-2-two-sample.html
Morey, R. D., & Rouder, J. N. (2015). BayesFactor: Computation of Bayes Factors for common designs. Retrieved from https://cran.r-project.org/package=BayesFactor
Norris, D. (2017). Short-term memory and long-term memory are still different. Psychological Bulletin, 143, 992–1009. https://doi.org/10.1037/bul0000108
Peirce, J. W. (2007). PsychoPy-psychophysics software in Python. Journal of Neuroscience Methods, 162, 8–13. https://doi.org/10.1016/j.jneumeth.2006.11.017
Pertzov, Y., Heider, M., Liang, Y., & Husain, M. (2015). Effects of healthy ageing on precision and binding of object location in visual short term memory. Psychology and Aging, 30, 26–35. https://doi.org/10.1037/a0038396
R Core Team. (2016). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.r-project.org/
Thalmann, M., & Oberauer, K. (2016). Domain-specific interference between storage and processing in complex span is driven by cognitive and motor operations. The Quarterly Journal of Experimental Psychology, 0218, 1–18. https://doi.org/10.1080/17470218.2015.1125935
Zokaei, N., Burnett Heyes, S., Gorgoraptis, N., Budhdeo, S., & Husain, M. (2015). Working memory recall precision is a more sensitive index than span. Journal of Neuropsychology, 9, 319–329. http://doi.org/10.1111/jnp.12052
History
Received January 18, 2018
Revision received July 6, 2018
Accepted August 1, 2018
Published online February 19, 2019

Funding
This work was supported by a Study Visit grant to Ed D. J. Berry from the Experimental Psychology Society, UK. Robert H. Logie’s research is funded by a grant from the UK Economic and Social Research Council, number ES/N010728/1. Some of the experimental tasks were adapted, with permission, from code written by Jason M. Doherty and Stephen Rhodes for the “Working memory across the adult lifespan: An adversarial collaboration” project (http://womaac.psy.ed.ac.uk/).

Open Data
All tasks were written in PsychoPy 1.84 (Peirce, 2007). The code for all the tasks is available at https://osf.io/59c4g/.

ORCID
Ed D. J. Berry https://orcid.org/0000-0003-3456-4122

Ed D. J. Berry
School of Psychology
University of Leeds
Lifton Place
Leeds, West Yorkshire LS2 9JT
UK
eddjberry@gmail.com
Short Research Article
Affective Influence on Context-Specific Proportion Congruent (CSPC) Effect
Neutral or Affective Facial Expressions as Context Stimuli
Jinhui Zhang, Andrea Kiesel, and David Dignath
Department of Psychology, University of Freiburg, Germany
Abstract: Congruency effects diminish in contexts associated with mostly incongruent trials compared with contexts associated with mostly congruent trials. Here, we aimed to assess affective influences on this context-specific proportion congruent (CSPC) effect. We presented either neutral or affective faces as context stimuli in a Flanker task and associated mostly incongruent trials with male/female faces for a neutral-context group and with angry/happy faces for an affective-context group. To assess general influences of affective valence, we compared CSPC effects between the neutral-context group and the affective-context group. To assess valence-specific influences, we compared the size of CSPC effects – for the affective-context group only – between participants for whom mostly incongruent trials were associated with angry faces and participants for whom mostly incongruent trials were associated with happy faces. However, the modulating influence on the CSPC effect from affective versus neutral contexts or from valence-proportion mappings was not statistically significant.
Keywords: cognitive control, affective valence, Flanker task, context-specific proportion congruent effect
The Context-Specific Proportion Congruent (CSPC) Effect
Cognitive control is necessary when people face competing response tendencies. For example, when a person who plans to lose weight has to choose between high-calorie, tasty food and fat-free but less tasty food, cognitive control helps him or her to behave in a goal-oriented way. Typical protocols to investigate cognitive control in the laboratory include so-called interference tasks such as Stroop, Flanker, or Simon tasks (see Egner, 2008, for a brief introduction to these tasks). For instance, in a color-word Stroop task, participants respond to the ink color of words while ignoring the semantic meaning of these color words. Typically, congruent combinations (e.g., RED in red ink), in which ink color and word meaning match, are responded to faster and with higher accuracy than incongruent combinations (e.g., RED in blue ink), in which ink color and word meaning do not match. The difference in reaction time (RT) and error rate between incongruent trials and congruent trials is termed the congruence effect (CE), which provides an index of cognitive control: A smaller CE signals more cognitive control.
Experimental Psychology (2019), 66(1), 86–97 https://doi.org/10.1027/1618-3169/a000436
Recent findings suggest that cognitive control can be highly context-sensitive. For instance, Crump, Gong, and Milliken (2006) used a version of the Stroop task, in which participants classified a color patch following an irrelevant prime color word. They manipulated the proportion of congruent to incongruent trials depending on locations (i.e., contextual cues). For example, while presentation of a color patch in the upper part of the screen comprised mostly congruent combinations (e.g., a blue patch followed BLUE), presentation of a color patch in the lower part of the screen comprised mostly incongruent combinations (e.g., a red patch followed BLUE). Results revealed that even though the overall proportion of congruent to incongruent trials across both locations was balanced, the biased proportions of congruent to incongruent trials at each location had a strong influence on the magnitude of CEs: The CE at the mostly-congruent location (i.e., the location associated with mostly congruent trials) was larger than the CE at the mostly-incongruent location (i.e., the location associated with mostly incongruent trials). This context-specific proportion congruent (CSPC) effect has been reproduced with other contextual cues such as shape (Crump, Vaquero, & Milliken, 2008),
color (Lehle & Hübner, 2008), foreperiod (Wendt & Kiesel, 2011), and semantic categories (Cañadas, Lupiáñez, Kawakami, Niedenthal, & Rodríguez-Bailón, 2016; Cañadas, Rodríguez-Bailón, Milliken, & Lupiáñez, 2013). The CSPC effect suggests that different control settings are applied in different contexts. Because participants are not explicitly informed about the proportion manipulation, researchers assume that participants build up associations between contextual cues and control settings (e.g., a control setting used on incongruent trials becomes bound to the mostly-incongruent context). Once this association has been learned, encountering a context stimulus acts as a retrieval cue that allows participants to recall the associated control setting (Crump, 2016). Therefore, the CSPC effect indicates not just learning of control settings and contextual cues, but also the implementation and retrieval of these control settings. Alternatively, it has been suggested that the CSPC effect might be driven by feature binding or contingency learning (e.g., Schmidt, Lemercier, & Houwer, 2014). However, the CSPC effect has also been reproduced with unbiased items, for which the proportion of congruent to incongruent trials was 1:1 (e.g., Crump & Milliken, 2009). Such a transfer effect (reflected by a CSPC effect in the unbiased items) suggests that contextual cues are associated with abstract control settings rather than simple stimulus-response mappings. In addition, generalization of control to inconsistent members or new members of a category provides further evidence that the CSPC effect cannot be attributed to specific context-stimulus-response bindings (Cañadas et al., 2013, 2016; see also Weidler & Bugg, 2016).
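The two quantities defined in this section are simple differences of condition means: the CE is the mean incongruent RT minus the mean congruent RT, and the CSPC effect is the CE in the mostly-congruent context minus the CE in the mostly-incongruent context. The following Python sketch illustrates this arithmetic only. The RT values are invented for illustration, and the function names `congruence_effect` and `cspc_effect` are hypothetical, not part of any analysis code reported here.

```python
from statistics import mean


def congruence_effect(rt_incongruent, rt_congruent):
    """CE in ms: mean incongruent RT minus mean congruent RT."""
    return mean(rt_incongruent) - mean(rt_congruent)


def cspc_effect(ce_mostly_congruent, ce_mostly_incongruent):
    """CSPC effect in ms: CE in the mostly-congruent context
    minus CE in the mostly-incongruent context."""
    return ce_mostly_congruent - ce_mostly_incongruent


# Invented correct RTs (ms), keyed by (context, trial type).
rts = {
    ("mostly_congruent", "congruent"): [400, 410, 420],
    ("mostly_congruent", "incongruent"): [500, 510, 520],
    ("mostly_incongruent", "congruent"): [430, 440, 450],
    ("mostly_incongruent", "incongruent"): [470, 480, 490],
}

ce_mc = congruence_effect(
    rts[("mostly_congruent", "incongruent")],
    rts[("mostly_congruent", "congruent")],
)
ce_mi = congruence_effect(
    rts[("mostly_incongruent", "incongruent")],
    rts[("mostly_incongruent", "congruent")],
)
cspc = cspc_effect(ce_mc, ce_mi)

print(f"CE mostly-congruent: {ce_mc} ms")
print(f"CE mostly-incongruent: {ce_mi} ms")
print(f"CSPC effect: {cspc} ms")
```

A positive CSPC value in this sketch corresponds to the typical pattern described above, that is, a smaller CE in the mostly-incongruent context.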
Affective General Influence on the CSPC Effect: Affective Versus Neutral Context Stimuli
In this research, we are particularly interested in affective influences on cognitive control. There is ample evidence that emotion affects cognitive control (e.g., Hart, Green, Casp, & Belger, 2010; Kanske & Kotz, 2011). While in most previous studies researchers focused on affective influences on the CE, our aim is to assess affective influences on the CSPC effect. We will specifically investigate general influences on the CSPC effect from affective context stimuli compared with neutral context stimuli. Based on recent theories, two opposing hypotheses can be formulated. On the one hand, Pessoa (2009) proposed that task-irrelevant affective stimuli usually divert mental resources needed for cognitive control away from task-relevant stimuli and thus impair performance. Accordingly, we predict that affective context stimuli compared with neutral ones impair the implementation or retrieval of control
states. Therefore, the size of the CSPC effect should become smaller for affective context relative to neutral context stimuli. We dub this prediction the affective-impairment hypothesis. On the other hand, Verguts and Notebaert (2009) proposed that cognitive control effects are rooted in associative learning and that the conflict induced by an incongruent combination (e.g., RED in blue ink) triggers an arousal response which then facilitates associative learning. According to this view, arousal-inducing affective stimuli compared with neutral stimuli should facilitate the learning of context-control associations and therefore increase the CSPC effect. We term this prediction the affective-facilitation hypothesis.
Affective Valence-Specific Influence on the CSPC Effect: Positive Versus Negative Affect
In addition to affective general influences, there are also reasons to assume more valence-specific influences on the CSPC effect. For instance, several theoretical accounts suggest that conflict is experienced as a negative affective state (Dignath & Eder, 2015; Dreisbach & Fischer, 2012) and that this negative signal is used to guide control adjustment (Botvinick, 2007; Dreisbach & Fischer, 2015; Inzlicht, Bartholow, & Hirsh, 2015). Based on this idea, it has been hypothesized that a negative signal triggered by conflict in incongruent trials might be counteracted by a positive signal induced by the affective context, which then decreases the degree of cognitive control on incongruent trials (Van Steenbergen, Band, & Hommel, 2009). Consistent with this assumption, findings have shown that the induction of positive affect via task-irrelevant affective stimuli impairs cognitive control (e.g., Van Steenbergen et al., 2009; Van Steenbergen, Band, & Hommel, 2012; but see Dignath, Janczyk, & Eder, 2017). According to this affective-signal neutralization hypothesis, we predict that the CSPC effect should decrease when mostly incongruent trials are paired with positive context stimuli (compared with a situation when mostly incongruent trials are paired with negative context stimuli). However, there is also evidence pointing in the opposite direction: Conflict stimuli in a negative context failed to trigger control adaptation whereas conflict stimuli in a neutral/positive context were effective (Dreisbach, Reindl, & Fischer, 2016; Fritz, Fischer, & Dreisbach, 2015). In the CSPC paradigm, task-irrelevant affect could serve as a background against which the actual conflict signal becomes relatively weaker (e.g., against a negative background) or relatively stronger (e.g., against a positive background).
Indeed, research on “hedonic contrast” effects has shown
that affective evaluations of stimuli are not absolute but relative to evaluations of alternative stimuli or “background” affect (Eder & Dignath, 2014; Larsen & Norris, 2009). For this affective contrast hypothesis, we assume that the size of the CSPC effect becomes smaller when mostly incongruent trials are paired with negative context stimuli (compared with a situation when mostly incongruent trials are paired with positive context stimuli).
Previous Evidence for Affective Influences on the CSPC Effect
Currently, there is no empirical evidence of affective general influences on the CSPC effect. However, two published studies tested affective valence-specific influences on the CSPC effect (Cañadas et al., 2016; Dreisbach et al., 2016). Dreisbach et al. (2016) used locations (i.e., stimuli presented above or below the center of a screen) as contextual cues in a Simon task. Within a typical CSPC design, mostly incongruent trials were presented above (below) while mostly congruent trials were presented below (above). As a result, a significant CSPC effect was observed only when mostly incongruent trials were presented in the upper location. Based on the assumption that the lower location is associated with more negative affect relative to the upper location, the finding of Dreisbach et al. (2016) supports the affective contrast hypothesis. A study by Cañadas et al. (2016) manipulated valence more directly by using pictures of facial expressions (including angry faces and happy faces) as context stimuli in an arrow Flanker task. In a CSPC design, angry (happy) faces were paired with mostly incongruent trials while happy (angry) faces were paired with mostly congruent trials. However, Cañadas et al. (2016) observed no valence-specific influences on the size of CSPC effects.
Current Experiment
In the current experiment, we adopted the more direct way of affective manipulation used by Cañadas et al. (2016), to provide a further test of valence-specific influences on the CSPC effect. In addition, we also aimed to assess whether affect in general (positive and negative) influences the CSPC effect. Specifically, we used pictures of facial expressions as context stimuli in a letter Flanker task (for evidence of CSPC effects with facial expressions as context stimuli, see Cañadas et al., 2013, 2016). In the Flanker task, participants identified a central target letter accompanied by four lateralized distracting letters. The context stimuli were either pictures of neutral male faces and neutral female
faces (the neutral-context group) or pictures of angry faces and happy faces (the affective-context group). The neutral-context group consisted of a male-high-conflict group, for which neutral male faces were paired with mostly-incongruent trials, and a female-high-conflict group for which neutral female faces were paired with mostly-incongruent trials. Similarly, the affective-context group consisted of a negative-high-conflict group, for which angry faces were paired with mostly-incongruent trials, and a positive-high-conflict group, for which happy faces were paired with mostly-incongruent trials. To probe affect-general influences on the CSPC effect, we compared the CSPC effects between the neutral-context group and the affective-context group. To probe affective valence-specific influences on the CSPC effect, we compared the CSPC effects (for the affective-context group only) between the negative-high-conflict condition and the positive-high-conflict condition. Regarding the affective general influence on the CSPC effect, we hypothesized that if CSPC effects were larger for the neutral-context group than for the affective-context group, this would speak in favor of the affective-impairment hypothesis and suggest that cognitive control is disrupted by affective context stimuli. Instead, if CSPC effects were larger for the affective-context group, this would be more consistent with the affective-facilitation hypothesis and suggest that the arousal induced by affective context stimuli facilitates cognitive control. Regarding the valence-specific influence on the CSPC effect, we hypothesized that if CSPC effects were decreased for the pairing of mostly-incongruent contexts with positive affect (i.e., happy faces), this would be consistent with the affective-signal neutralization hypothesis.
Consequently, such a pattern of results would support the idea that the negative signals triggered by conflict are crucial for control adjustment and that positive signals induced by affective contexts diminish control. Instead, if CSPC effects were reduced for the pairing of mostly-incongruent contexts with negative affect, this would be consistent with an affective contrast hypothesis, in support of the idea that aversive conflict signals become less effective in negative contexts relative to positive contexts. Furthermore, as explained above in more detail, it has been claimed that different control mechanisms feed into the CSPC effect. Therefore, we used biased versus unbiased items, since the CSPC effect for biased items has been attributed to stimulus-response learning (Schmidt et al., 2014), while the CSPC effect for unbiased items is assumed to reflect a purer measure of attentional control processes (see Crump & Milliken, 2009). Although it was not our intention to disentangle these two (not mutually exclusive; see Egner, 2014) accounts, we included both item types for exploratory reasons.
Method
Participants
Using the CSPC effect size (ηp² = .17) observed by Cañadas et al. (2016, Experiment 1), we estimated that 42 participants are required for power of .8 to reproduce the CSPC effect for biased items in a similar design. Ninety-six volunteers (24 men, Mage = 24.4 years, age range: 19–45 years) participated for €7 or course credit. Each participant was quasi-randomly assigned to one of two groups: a neutral-context group (N = 48), for which pictures of neutral female and neutral male faces served as context stimuli, or an affective-context group (N = 48), for which pictures of angry faces and happy faces served as context stimuli. For half of the neutral-context group (n1 = 24), neutral female faces were used as context stimuli and paired with mostly-incongruent trials (female-high-conflict group). For the other half of the neutral-context group (n2 = 24), neutral male faces were used as context stimuli and paired with mostly-incongruent trials (male-high-conflict group). In addition, the affective-context group was further divided into two subgroups. For half of the participants in the affective-context group (n3 = 24), angry faces were used as context stimuli and paired with mostly-incongruent trials (negative-high-conflict group), while for the other half of participants of the affective-context group (n4 = 24) happy faces were used as context stimuli and paired with mostly-incongruent trials (positive-high-conflict group). One participant of the negative-high-conflict group did not finish the experiment due to technical problems. Furthermore, data of one participant in the negative-high-conflict group were excluded because of an exceptionally high error rate (M = 33%) deviating from the mean error rate of the overall sample (M = 9%) by more than 3 SD. As a result, the female-, male-, positive-, and negative-high-conflict groups included, respectively, data of 24, 24, 24, and 22 participants that were further analyzed.
Participants were all right-handed, had normal or corrected-to-normal vision, and reported no color blindness.
Apparatus and Stimuli
The experiment was programmed with E-Prime 2.0 and presented on 1,920 × 1,080 LCD monitors (IIYAMA GB2488HSU-B1). Pictures of two female and two male characters were selected from Karolinska Directed Emotional Faces (KDEF; Lundqvist, Flykt, & Öhman, 1998) and resized to 479 × 650 pixels. For the neutral-context group, context stimuli were pictures of neutral facial expressions (female face: AF14NES and AF01NES; male face: AM09NES and AM11NES). For the affective-context group, context stimuli were pictures of affective facial expressions
of female characters (happy face: AF01HAS and AF14HAS; angry face: AF01ANS and AF14ANS). The average arousal of neutral expressions on a 9-point scale (1 = calm, and 9 = aroused) was 2.52, and the average arousal of affective expressions was 3.65 (for ratings of arousal, see Goeleven, De Raedt, Leyman, & Verschuere, 2008). In addition, pictures of happy faces (AF01HAS, AF14HAS, AM09HAS, and AM11HAS) were used as context stimuli in a practice block for the neutral-context group, and pictures of neutral expressions (AF01NES and AF14NES) were used as context stimuli in the practice block for the affective-context group. For the Flanker task stimuli ([225, 0, 0]; Arial, 36), the letters H and S composed biased items (congruent combinations: HHHHH and SSSSS; incongruent combinations: HHSHH and SSHSS); E and A composed unbiased items (congruent combinations: EEEEE and AAAAA; incongruent combinations: EEAEE and AAEAA). Each Flanker task stimulus extended 1 cm in height and 4.5 cm in width. Responses were collected with a standard QWERTZ keyboard with the keys Y and M. Mappings of responses and stimuli were counterbalanced across participants whereby each response key was always mapped to one biased and one unbiased item. The background remained black during the entire experiment.
Procedure
Participants were instructed to respond always to the central letter as quickly and correctly as possible and were not informed about the proportion manipulation. Each trial began with a centered picture of a face. After 600 ms, Flanker task stimuli were presented centrally, superimposed on the face with an SOA of 100 ms between the distractor letters and the central letter to maximize the Flanker congruence effect (see, e.g., Wendt, Kiesel, Geringswald, Purmann, & Fischer, 2014 for a similar procedure). Flanker stimuli disappeared after 200 ms, while the picture of the face remained on the screen until response (maximum: 1,200 ms from the onset of the central target letter). If there was no response registered within the response window, participants received feedback (“Bitte schneller! [too slow]”) for 500 ms; if participants responded incorrectly, they received feedback (“Fehler! [error]”) for 500 ms together with an acoustic error signal. The next trial started after a variable inter-trial-interval between 1,000 and 1,500 ms. The trial procedure was identical in all blocks. There were 13 blocks of 80 trials each (as shown in Table 1). Block 1 was considered practice, in which participants learned the mappings between stimuli and response keys. Blocks 2–5 were a training phase to foster learning between control settings and context stimuli (Dreisbach et al., 2016; Lehle & Hübner, 2008), in which mostly-incongruent
Table 1. The number of trials of each combination per block. In the leftmost column are the proportions congruent (i.e., proportions of congruent trials) associated with contexts. Context stimuli in Block 1 were equally associated with congruent and incongruent combinations, so the corresponding proportion congruent is 50%; Blocks 2 and 5 consisted of mostly-incongruent contexts, so the corresponding proportion congruent is 20%; Blocks 3 and 4 consisted of mostly-congruent contexts, so the proportion congruent stated is 80%; Blocks 6–13 consisted of both mostly-incongruent contexts and mostly-congruent contexts.

                        Biased item                      Unbiased item
                        Congruent       Incongruent      Congruent       Incongruent
Proportion congruent    HHHHH   SSSSS   SSHSS   HHSHH    EEEEE   AAAAA   AAEAA   EEAEE
Block 1
  50%                   10      10      10      10       10      10      10      10
Blocks 2 and 5
  20%                   4       4       28      28       4       4       4       4
Blocks 3 and 4
  80%                   28      28      4       4        4       4       4       4
Blocks 6–13
  20%                   2       2       14      14       2       2       2       2
  80%                   14      14      2       2        2       2       2       2
contexts and mostly-congruent contexts were, respectively, paired with 20% and 80% congruent trials. Each training block consisted of either mostly-incongruent contexts or mostly-congruent contexts only (blocked context presentation or list-wide proportion congruency effect, Logan & Zbrodoff, 1979). Blocks 6–13 were the test phase in which context stimuli were presented randomly and equally often. During the training and the test phases, biased items were presented four times as often as unbiased items (see Table 1 for details). For biased items, the proportion of congruent to incongruent trials differed between contexts, with mostly-congruent contexts paired with 87.5% congruent trials and mostly-incongruent contexts paired with 12.5% congruent trials. For unbiased items, the proportion of congruent to incongruent trials was 1:1 in both contexts. This resulted in an overall proportion of 80% congruent trials for mostly-congruent contexts and 20% congruent trials for mostly-incongruent contexts.
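The percentages in this paragraph follow directly from the trial counts in Table 1. As a check, the following Python sketch recomputes them for a single test block (Blocks 6–13). The counts are taken from Table 1; the data layout and the names `TEST_BLOCK_COUNTS`, `proportion_congruent`, and `summarize` are illustrative assumptions rather than part of the original experiment code.

```python
# Trial counts per context within one test block (Blocks 6-13), read from Table 1.
# Each count is summed over the two letter strings of that type
# (e.g., HHHHH and SSSSS for congruent biased items).
TEST_BLOCK_COUNTS = {
    "mostly_incongruent": {
        "biased": {"congruent": 2 + 2, "incongruent": 14 + 14},
        "unbiased": {"congruent": 2 + 2, "incongruent": 2 + 2},
    },
    "mostly_congruent": {
        "biased": {"congruent": 14 + 14, "incongruent": 2 + 2},
        "unbiased": {"congruent": 2 + 2, "incongruent": 2 + 2},
    },
}


def proportion_congruent(cells):
    """Share of congruent trials across the given item-type cells."""
    congruent = sum(cell["congruent"] for cell in cells)
    total = sum(cell["congruent"] + cell["incongruent"] for cell in cells)
    return congruent / total


def summarize(context):
    """Return (biased, unbiased, overall) proportion congruent for one context."""
    cells = TEST_BLOCK_COUNTS[context]
    return (
        proportion_congruent([cells["biased"]]),
        proportion_congruent([cells["unbiased"]]),
        proportion_congruent(list(cells.values())),
    )


for context in TEST_BLOCK_COUNTS:
    biased, unbiased, overall = summarize(context)
    print(f"{context}: biased={biased:.3f}, unbiased={unbiased:.3f}, overall={overall:.3f}")
```

In each context, the 32 biased trials and 8 unbiased trials also reproduce the 4:1 ratio of biased to unbiased items stated above.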
Analysis
To probe the valence-specific influence on the CSPC effect, correct RTs (and error rates) in the test blocks of the affective-context group were submitted to a 2 (Proportion Congruent: 20% vs. 80%) × 2 (Congruence: incongruent vs. congruent) × 2 (Valence of Mostly-Incongruent Context: positive vs. negative) mixed factors analysis of variance (ANOVA), with valence of mostly-incongruent context as a between-subjects factor and proportion congruent and congruency as within-subjects factors. Biased items and unbiased items were analyzed separately with identical ANOVAs. To probe the affective general influence on the
CSPC effect, correct RTs (and error rates) in the test blocks of both the affective-context and the neutral-context groups were submitted into a 2 (Proportion Congruent: 20% vs. 80%) 2 (Congruence: incongruent vs. congruent) 2 (Context Type: neutral vs. affective) mixed factors ANOVA, with context type as a between-subjects factor and proportion congruent (i.e., the proportion of congruent trials associated with contexts) and congruency as within-subjects factors. Data of the training blocks were analyzed in the same method as in the test blocks. Data were analyzed with IBM SPSS Statistics 22 and R version 3.3.3 (https://www.r-project.org). Significance criterion was 0.05 for the analysis of valence-specific effects and adjusted to 0.025 for the analysis of valence-general effects due to multiple testing on the same data set. Before analysis of RTs, the first trial in each block (1.25%), error trials (8.35%), post-error trials (7.25%), and trials with RTs above or below 3 SDs from the cell mean for each condition (calculated separately for each participant, 1.25%) were discarded. Before analysis of error rates, the first trial in each block and post-error trials were excluded. Additionally, Bayes factors for the null hypothesis were computed (BF01; Rouder, Speckman, Sun, Morey, & Iverson, 2009) for all theoretically relevant comparisons of CSPC effects (affective general influence: affective context vs. neutral context; affective valence-specific influence: negative mostly-incongruent context vs. positive mostly-incongruent context). A BF01 larger than 3 is considered to provide sufficient or positive evidence for the null hypothesis that the CSPC effect does not differ between groups (for more details, see Jarosz & Wiley, 2014). To calculate the BF01, r was set to 1 (for more details, see Rouder et al., 2009). Ó 2019 Hogrefe Publishing
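The RT exclusion steps described in this section can be sketched as follows. This is a minimal illustration, not the authors' script: the field names (`participant`, `block`, `cell`, `rt`, `correct`) are hypothetical, trials are assumed to be sorted in presentation order, and `cell` stands for whatever condition combination defines the cell mean.

```python
from collections import defaultdict
from statistics import mean, stdev

def trim_rts(trials, sd_cutoff=3.0):
    """Drop first-in-block, error, post-error, and outlier trials (hypothetical fields)."""
    kept, prev = [], None
    for t in trials:
        same_block = prev is not None and (
            (prev["participant"], prev["block"]) == (t["participant"], t["block"]))
        post_error = same_block and not prev["correct"]
        # Keep only correct trials that are neither first in a block nor post-error.
        if same_block and t["correct"] and not post_error:
            kept.append(t)
        prev = t
    # Collect the remaining RTs per participant and condition cell.
    cells = defaultdict(list)
    for t in kept:
        cells[(t["participant"], t["cell"])].append(t["rt"])
    # Compute the mean +/- cutoff * SD window for each cell.
    limits = {}
    for key, rts in cells.items():
        spread = sd_cutoff * stdev(rts) if len(rts) > 1 else 0.0
        limits[key] = (mean(rts) - spread, mean(rts) + spread)
    return [t for t in kept
            if limits[(t["participant"], t["cell"])][0] <= t["rt"]
            <= limits[(t["participant"], t["cell"])][1]]
```

Whether the outlier window is computed before or after the other exclusions is not stated in the text; the sketch applies it last.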
J. Zhang et al., Affective Influence on CSPC Effect
Results
Training Block
Affective Valence-Specific Influence
Biased Items
For RTs, the CE was significant, F(1, 44) = 266.69, p < .001, ηp² = .858. The mean RT for incongruent trials (487 ms) was longer than that for congruent trials (417 ms). The main effect of proportion congruent was significant, F(1, 44) = 5.30, p = .026, ηp² = .108, with a longer mean RT for mostly-incongruent contexts (456 ms) than for mostly-congruent contexts (448 ms). The main effect of valence of mostly-incongruent context was not significant, F(1, 44) = 0.40, p = .533. The CE was modulated by proportion congruent, F(1, 44) = 100.65, p < .001, ηp² = .696. The CE in mostly-incongruent contexts (36 ms) was smaller than that in mostly-congruent contexts (105 ms). Neither the Proportion Congruent × Valence of Mostly-Incongruent Context interaction nor the Congruence × Valence of Mostly-Incongruent Context interaction was significant, F(1, 44) = 0.36, p = .550, and F(1, 44) = 0.37, p = .544, respectively. The three-way interaction of Congruence × Proportion Congruent × Valence of Mostly-Incongruent Context was not significant, F(1, 44) = 0.16, p = .687, BF01 = 4.25.
For error rates, the main effects of proportion congruent and congruence were significant, F(1, 44) = 16.82, p < .001, ηp² = .277, and F(1, 44) = 46.83, p < .001, ηp² = .516, respectively. The mean error rate for mostly-incongruent contexts (7.0%) was smaller than for mostly-congruent contexts (10.7%). The mean error rate for incongruent trials (12.8%) was larger than for congruent trials (4.9%). The main effect of valence of mostly-incongruent context was not significant, F(1, 44) = 0.90, p = .347. The interaction of Congruence × Proportion Congruent was significant, F(1, 44) = 39.46, p < .001, ηp² = .401. The CE in mostly-incongruent contexts (2.4%) was smaller than in mostly-congruent contexts (13.4%). Neither the Proportion Congruent × Valence of Mostly-Incongruent Context interaction nor the Congruence × Valence of Mostly-Incongruent Context interaction was significant, F(1, 44) = 3.74, p = .060, and F(1, 44) = 0.95, p = .336, respectively. The three-way interaction of Congruence × Proportion Congruent × Valence of Mostly-Incongruent Context was not significant, F(1, 44) = 0.96, p = .333, BF01 = 2.99.
Unbiased Items
For RTs, the CE was significant, F(1, 44) = 90.05, p < .001, ηp² = .672. The mean RT for incongruent trials (536 ms) was longer than that for congruent trials (485 ms). Neither the main effect of proportion congruent nor that of valence of mostly-incongruent context was significant, F(1, 44) = 3.47, p = .069, and F(1, 44) = 0.33, p = .571, respectively. The CE was modulated by proportion congruent, F(1, 44) = 4.89, p = .032, ηp² = .100. The CE in mostly-incongruent contexts (38 ms) was smaller than the CE in mostly-congruent contexts (64 ms). Neither the Proportion Congruent × Valence of Mostly-Incongruent Context interaction nor the Congruence × Valence of Mostly-Incongruent Context interaction was significant, F(1, 44) = 0.10, p = .759, and F(1, 44) = 0.48, p = .491, respectively. The three-way interaction of Congruence × Proportion Congruent × Valence of Mostly-Incongruent Context was not significant, F(1, 44) = 1.47, p = .232, BF01 = 2.39.
For error rates, the main effects of proportion congruent and congruence were significant, F(1, 44) = 15.51, p < .001, ηp² = .261, and F(1, 44) = 8.69, p = .005, ηp² = .165, respectively. The mean error rate for mostly-incongruent contexts (12.0%) was smaller than for mostly-congruent contexts (17.6%); the mean error rate for incongruent trials (16.7%) was larger than for congruent trials (12.9%). The main effect of valence of mostly-incongruent context was not significant, F(1, 44) = 0.26, p = .615. The interaction of Congruence × Valence of Mostly-Incongruent Context was significant, F(1, 44) = 6.21, p = .017, ηp² = .124. The CE was smaller when mostly-incongruent contexts were negative (0.6%) than when they were positive (7.1%). Neither the Proportion Congruent × Valence of Mostly-Incongruent Context interaction nor the Congruence × Proportion Congruent interaction was significant, F(1, 44) = 0.01, p = .921, and F(1, 44) = 0.77, p = .385, respectively. The three-way interaction of Congruence × Proportion Congruent × Valence of Mostly-Incongruent Context was not significant, F(1, 44) = 3.98, p = .052, BF01 = 0.83.
Affective Valence-General Influence
Biased Items
For RTs, the main effect of proportion congruent was significant, F(1, 92) = 22.03, p < .001, ηp² = .193, with a longer mean RT for mostly-incongruent contexts (467 ms) than for mostly-congruent contexts (453 ms). The CE was significant, F(1, 92) = 520.88, p < .001, ηp² = .850, with a longer mean RT for incongruent trials (492 ms) than for congruent trials (428 ms). The main effect of context type was not significant, F(1, 92) = 2.01, p = .160. The CE was modulated by context type, F(1, 92) = 6.46, p = .013, ηp² = .066, with a larger CE in affective contexts (70 ms) than in neutral contexts (56 ms). The CE was also modulated by proportion congruent, F(1, 92) = 161.19, p < .001, ηp² = .637, with a smaller CE in mostly-incongruent contexts (27 ms) than in mostly-congruent contexts (100 ms). The Proportion Congruent × Context Type interaction was not significant at the adjusted criterion of .025, F(1, 92) = 4.30, p = .041. The three-way interaction of Congruence × Proportion Congruent × Context Type was not significant, F(1, 92) = 0.78, p = .378, BF01 = 4.38.
The CE was also mirrored in error rates, with a higher error rate for incongruent trials (12.5%) than for congruent trials (5.4%), F(1, 92) = 65.38, p < .001, ηp² = .415. The main effect of proportion congruent was significant, F(1, 92) = 13.72, p < .001, ηp² = .130. The error rate in mostly-incongruent contexts (7.6%) was smaller than in mostly-congruent contexts (10.3%). The main effect of context type was not significant, F(1, 92) < 1, p = .990. The two-way interaction of Congruence × Proportion Congruent was significant, F(1, 92) = 51.30, p < .001, ηp² = .358. The CE in mostly-incongruent contexts (1.8%) was smaller than in mostly-congruent contexts (12.2%). The interaction of Proportion Congruent × Context Type was not significant, F(1, 92) = 2.20, p = .142, nor was Congruence × Context Type, F(1, 92) = 1.19, p = .278. The three-way interaction of Congruence × Proportion Congruent × Context Type was not significant, F(1, 92) = 0.21, p = .650, BF01 = 6.18.
Unbiased Items
For RTs, the CE was significant, F(1, 92) = 172.48, p < .001, ηp² = .652, with a longer mean RT for incongruent trials (548 ms) than for congruent trials (497 ms). The main effect of proportion congruent was not significant, F(1, 92) = 1.31, p = .256, nor was the main effect of context type, F(1, 92) = 2.45, p = .121. The two-way interaction of Congruence × Proportion Congruent was significant, F(1, 92) = 17.31, p < .001, ηp² = .158, with a smaller CE in mostly-incongruent contexts (44 ms) than in mostly-congruent contexts (68 ms). Neither the Proportion Congruent × Context Type interaction nor the Congruence × Context Type interaction was significant, F(1, 92) = 1.85, p = .177, and F(1, 92) < 1, p = .988, respectively. The three-way interaction of Congruence × Proportion Congruent × Context Type was not significant, F(1, 92) = 1.20, p = .276, BF01 = 3.61.
For error rates, the CE was significant, F(1, 92) = 27.39, p < .001, ηp² = .229, with a higher error rate for incongruent trials (17.6%) than for congruent trials (12.7%). The main effect of proportion congruent was also significant, F(1, 92) = 14.14, p < .001, ηp² = .133, with a smaller error rate in mostly-incongruent contexts (13.4%) than in mostly-congruent contexts (16.9%). The main effect of context type was not significant, F(1, 92) = 0.01, p = .924. The two-way interaction of Congruence × Proportion Congruent was not significant, F(1, 92) = 3.68, p = .058, ηp² = .058, though the CE in mostly-incongruent contexts (3.3%) was numerically smaller than in mostly-congruent contexts (6.6%). The interaction of Proportion Congruent × Context Type was not significant, F(1, 92) = 3.25, p = .075, nor was Congruence × Context Type, F(1, 92) = 0.69, p = .409. The three-way interaction of Congruence × Proportion Congruent × Context Type was not significant, F(1, 92) = 0.56, p = .457, BF01 = 4.88.
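The effect sizes in this section can be cross-checked against the F statistics, because for a single effect partial eta squared equals F·df1 / (F·df1 + df2). The sketch below applies that identity to two of the RT effects reported for biased items in the training blocks of the affective-context group; the function name is chosen here for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Congruence effect, F(1, 44) = 266.69
print(round(partial_eta_squared(266.69, 1, 44), 3))  # 0.858
# Congruence x Proportion Congruent interaction, F(1, 44) = 100.65
print(round(partial_eta_squared(100.65, 1, 44), 3))  # 0.696
```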
Test Block
Affective Valence-Specific Influence
Biased Items
For RTs, the CE was significant, F(1, 44) = 660.65, p < .001, ηp² = .938, with a longer mean RT for incongruent trials (475 ms) than for congruent trials (403 ms). Neither the main effect of proportion congruent nor that of valence of mostly-incongruent context was significant, F(1, 44) = 2.21, p = .144, and F(1, 44) = 0.34, p = .561, respectively. The two-way interaction of Congruence × Proportion Congruent was not significant, F(1, 44) = 1.54, p = .221, though the CE was numerically smaller in mostly-incongruent contexts (69 ms) than in mostly-congruent contexts (75 ms; see Figure 1). Neither the Proportion Congruent × Valence of Mostly-Incongruent Context interaction nor the Congruence × Valence of Mostly-Incongruent Context interaction was significant, F(1, 44) = 1.48, p = .230, and F(1, 44) = 0.77, p = .386, respectively. The three-way interaction of Congruence × Proportion Congruent × Valence of Mostly-Incongruent Context was not significant, F(1, 44) = 0.03, p = .863, BF01 = 4.51.
For error rates, the CE was significant, F(1, 44) = 37.42, p < .001, ηp² = .460, with a higher error rate for incongruent trials (10.9%) than for congruent trials (4.2%). Neither the main effect of proportion congruent nor that of valence of mostly-incongruent context was significant, F(1, 44) = 0.36, p = .552, and F(1, 44) = 0.83, p = .366, respectively. The interaction of Congruence × Proportion Congruent was not significant, F(1, 44) = 0.05, p = .826. Neither the Proportion Congruent × Valence of Mostly-Incongruent Context interaction nor the Congruence × Valence of Mostly-Incongruent Context interaction was significant, F(1, 44) = 0.72, p = .400, and F(1, 44) = 0.47, p = .497, respectively. The three-way interaction of Congruence × Proportion Congruent × Valence of Mostly-Incongruent Context was not significant, F(1, 44) = 1.53, p = .222, BF01 = 2.32.
Figure 1. Congruence effect in milliseconds as a function of proportion congruent (20% vs. 80%) associated with contexts, valence of mostly-incongruent contexts (negative vs. positive), and item type (biased vs. unbiased). Error bars attached to each column represent 95% CIs.
Unbiased Items
For RTs, the CE was significant, F(1, 44) = 168.93, p < .001, ηp² = .793, with a longer mean RT for incongruent trials (527 ms) than for congruent trials (462 ms). The main effect of proportion congruent was not significant, F(1, 44) = 0.19, p = .664, nor was the main effect of valence of mostly-incongruent context, F(1, 44) = 0.21, p = .650. The two-way interaction of Congruence × Proportion Congruent was not significant, F(1, 44) = 0.04, p = .841, nor was Proportion Congruent × Valence of Mostly-Incongruent Context, F(1, 44) = 0.25, p = .620, or Congruence × Valence of Mostly-Incongruent Context, F(1, 44) = 0.27, p = .603. The three-way interaction of Congruence × Proportion Congruent × Valence of Mostly-Incongruent Context was not significant, F(1, 44) = 2.53, p = .119, BF01 = 1.51.
For error rates, the CE was significant, F(1, 44) = 21.68, p < .001, ηp² = .330, with a higher error rate for incongruent trials (16.5%) than for congruent trials (10.3%). Neither the main effect of proportion congruent nor that of valence of mostly-incongruent context was significant, F(1, 44) = 0.41, p = .525, and F(1, 44) = 0.75, p = .390, respectively. The interaction of Congruence × Proportion Congruent was not significant, F(1, 44) = 1.18, p = .282. Neither the Proportion Congruent × Valence of Mostly-Incongruent Context interaction nor the Congruence × Valence of Mostly-Incongruent Context interaction was significant, F(1, 44) = 1.57, p = .217, and F(1, 44) = 0.35, p = .556, respectively. The three-way interaction of Congruence × Proportion Congruent × Valence of Mostly-Incongruent Context was not significant, F(1, 44) = 0.23, p = .635, BF01 = 4.13.
Affective Valence-General Influence
Biased Items
For RTs, the main effect of congruence (i.e., CE) was significant, F(1, 92) = 1,186.87, p < .001, ηp² = .928, with longer RTs for incongruent trials (477 ms) than for congruent trials (407 ms).
Neither the main effect of proportion congruent nor that of context type was significant, F(1, 92) = 1.96, p = .165, and F(1, 92) = 0.37, p = .543, respectively. The CE was modulated by proportion congruent, F(1, 92) = 9.42, p = .003, ηp² = .093. The CE was larger in mostly-congruent contexts (74 ms) than in mostly-incongruent contexts (66 ms), indicating a significant CSPC effect (see Figure 2). Neither the Proportion Congruent × Context Type interaction nor the Congruence × Context Type interaction was significant, F(1, 92) = 0.85, p = .359, and F(1, 92) = 0.85, p = .360, respectively. More important for the present research question, the three-way interaction of Congruence × Proportion Congruent × Context Type was not significant, F(1, 92) = 1.19, p = .278, BF01 = 3.62, indicating that the CSPC effect was not further modulated by context type.
The CE was also mirrored in the analysis of error rates, F(1, 92) = 69.62, p < .001, ηp² = .431, with a higher error rate for incongruent trials (10.6%) than for congruent trials (3.9%; see Table 2). Neither the main effect of proportion congruent nor that of context type was significant, F(1, 92) = 2.13, p = .148, and F(1, 92) = 0.33, p = .570, respectively. The interaction of Congruence × Proportion Congruent was not significant, F(1, 92) = 0.45, p = .505, nor was Proportion Congruent × Context Type, F(1, 92) = 0.40, p = .529, or Congruence × Context Type, F(1, 92) < 0.01, p = .964. The three-way interaction of Congruence × Proportion Congruent × Context Type was not significant, F(1, 92) = 0.19, p = .662, BF01 = 5.77.
Unbiased Items
For RTs, the CE was significant, F(1, 92) = 435.55, p < .001, ηp² = .826, with longer RTs for incongruent trials (535 ms) than for congruent trials (470 ms). The main effect of proportion congruent was not significant, F(1, 92) = 1.44, p = .233, nor was the main effect of context type, F(1, 92) = 1.72, p = .193. The two-way interaction of Congruence × Proportion Congruent was not significant, F(1, 92) = 0.41, p = .524, nor was Proportion Congruent × Context Type, F(1, 92) = 0.23, p = .631, or Congruence × Context Type, F(1, 92) < 0.01, p = .961. The three-way interaction of Congruence × Proportion Congruent × Context Type was not significant, F(1, 92) = 0.08, p = .779, BF01 = 6.08.
For error rates, the CE was significant, F(1, 92) = 47.60, p < .001, ηp² = .341, with a higher error rate for incongruent trials (16.2%) than for congruent trials (10.1%). Neither the main effect of proportion congruent nor that of context type was significant, F(1, 92) = 1.97, p = .164, and F(1, 92) = 0.08, p = .772, respectively. The two-way interaction of Congruence × Proportion Congruent was not significant, F(1, 92) = 1.86, p = .176, nor was Proportion Congruent × Context Type, F(1, 92) = 0.19, p = .664, or Congruence × Context Type, F(1, 92) = 0.01, p = .912. The three-way interaction of Congruence × Proportion Congruent × Context Type was not significant, F(1, 92) = 0.15, p = .697, BF01 = 5.88.
Figure 2. Congruence effect in milliseconds as a function of proportion congruent (20% vs. 80%) associated with contexts, context type (neutral vs. affective), and item type (biased vs. unbiased). Error bars attached to each column represent 95% CIs.
Table 2. Mean error rate (M; %) and standard error (SE) as a function of item type (biased vs. unbiased), congruence (incongruent vs. congruent), proportion congruent (20% vs. 80%), and context type (neutral vs. emotional) or, for the emotional-context group only, valence of mostly-incongruent context (negative vs. positive)
                                          20%                              80%
                               Incongruent     Congruent        Incongruent     Congruent
Item                           M      SE       M      SE        M      SE       M      SE
Context type
  Neutral
    Biased                     9.6    1.2      3.4    0.7       10.8   1.5      3.7    0.6
    Unbiased                   15.0   1.7      9.6    1.3       16.8   1.8      10.2   1.3
  Emotional
    Biased                     10.7   1.2      4.1    0.7       11.1   1.6      4.3    0.7
    Unbiased                   15.7   1.7      10.5   1.3       17.4   1.8      10.1   1.4
Valence of mostly-incongruent context
  Negative
    Biased                     8.9    1.8      3.8    1.1       10.5   2.1      3.8    0.9
    Unbiased                   14.9   2.5      10.1   2.1       14.9   2.8      9.0    1.9
  Positive
    Biased                     12.4   1.7      4.4    1.1       11.7   2.0      4.8    0.9
    Unbiased                   16.4   2.4      10.9   2.0       19.6   2.7      11.2   1.8
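For reference, the CE and the CSPC effect reported throughout the Results are simple difference scores. The sketch below, with function names chosen here for illustration, computes them for error rates from the Table 2 means of biased items in the neutral-context group, and for RTs from the two CEs reported for biased items in the test blocks.

```python
def congruence_effect(incongruent, congruent):
    """CE: performance cost on incongruent relative to congruent trials."""
    return incongruent - congruent

def cspc_effect(ce_mostly_congruent, ce_mostly_incongruent):
    """CSPC effect: CE in the 80% context minus CE in the 20% context."""
    return ce_mostly_congruent - ce_mostly_incongruent

# Error rates (%), neutral-context group, biased items (Table 2).
ce_20 = congruence_effect(9.6, 3.4)
ce_80 = congruence_effect(10.8, 3.7)
print(round(ce_20, 1), round(ce_80, 1), round(cspc_effect(ce_80, ce_20), 1))  # 6.2 7.1 0.9

# RTs (ms), biased items, both groups combined: reported CEs of 74 ms and 66 ms.
print(cspc_effect(74, 66))  # 8
```

A positive value means that the congruence effect was larger in the mostly-congruent context, which is the direction the CSPC effect predicts.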
General Discussion
In the current experiment, we investigated whether and how affect influences the CSPC effect. In a typical CSPC design, we used pictures of facial expressions as context stimuli and manipulated the affective expressions of the faces across participants. To probe affect-general influences on the CSPC effect, we compared the size of CSPC effects between a neutral-context group and an affective-context group. To probe valence-specific influences on the CSPC effect, we compared, for the affective-context group only, the size of CSPC effects between participants for whom negative faces were contextual cues of mostly incongruent trials and participants for whom positive faces were contextual cues of mostly incongruent trials. While we observed a significant CSPC effect for biased items, results showed no affect-general influence on the CSPC effect. This interpretation was further supported by BFs (biased items: 3.62 for RTs and 5.77 for error rates; unbiased items: 6.08 for RTs and 5.88 for error rates) that provide sufficient or positive evidence (Jarosz & Wiley, 2014) for the null hypothesis that affect in general did not modulate the size of the CSPC effect. Therefore, the present results speak against the affective-impairment hypothesis and the affective-facilitation hypothesis. Instead, results suggest that affective cues (e.g., negative or positive facial expressions) are as efficient as non-affective cues (e.g., male or female faces) for the learning and retrieval of control settings. In the current experiment, we used affective expressions and gender categories, both of which are social categories. It is possible that there is little difference in efficacy between social categories (gender categories and affective expressions) as cues in the CSPC effect, which would also account for the absent valence-general influence on the CSPC effect. Therefore, it might be interesting to compare non-social, neutral cues with non-social, affective cues (e.g., via conditioning). It might also be interesting for future studies to compare social cues (faces with features in their typical position) with non-social cues (feature-scrambled faces; cf. Taubert, Aagten-Murphy, & Parr, 2012).
Regarding our second hypothesis, we did not observe evidence for a valence-specific influence on the CSPC effect. Regarding this question, BFs (biased items: 4.51 for RTs and 2.32 for error rates; unbiased items: 1.51 for RTs and 4.13 for error rates) provide only insufficient evidence to support the null hypothesis. Thus, the present results do not allow a decision for or against the affective-signal neutralization hypothesis and the affective-contrast hypothesis. A power analysis (using G*Power 3.1.9.2) suggested that the required sample size for each group to detect a small effect (d = 0.20) with a power of .80 would be N = 394. Therefore, the present study cannot rule out that, with sufficient power, it might be possible to detect a small effect. However, our results replicate a previous study by Cañadas et al. (2016), who observed a significant CSPC effect but no valence-specific modulation of the CSPC effect.
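The order of magnitude of this sample size can be checked with the standard normal approximation for a two-sided, two-sample t test, n per group ≈ 2(z_{1−α/2} + z_{1−β})² / d². The sketch assumes α = .05 and equal group sizes, neither of which is stated explicitly in the text, and the function name is illustrative. The approximation gives about 393 per group, one fewer than the 394 reported from G*Power, which works with the noncentral t distribution.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, power=0.80, alpha=0.05):
    """Normal-approximation sample size per group for a two-sided two-sample t test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

print(n_per_group(0.20))  # 393
```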
Regarding the data of the training blocks, a proportion congruence effect (PCE) was observed for both biased and unbiased items, suggesting that the PCE resulted from block-wise, top-down control rather than from contingency learning (cf. Bugg, 2012; see also Bugg & Chanani, 2011). Additionally, the CE in RTs of biased items was larger in the affective-context group than in the neutral-context group. This result is consistent with the idea of Pessoa (2009) that task-irrelevant emotional stimuli usually divert control resources away from the task and thus impair task performance. To further test valence-specific influences on the CE during training, we additionally submitted RTs of the biased items in the mostly-incongruent context only (and, separately, in the mostly-congruent context only) of the affective-context group to a 2 (Congruence) × 2 (Context Valence: negative vs. positive) ANOVA, with congruence as a within-subjects factor and context valence as a between-subjects factor. The two-way interaction of Context Valence × Congruence was not significant in either the mostly-incongruent context, F(1, 44) = 0.46, p = .500, BF01 = 3.72, or the mostly-congruent context, F(1, 44) = 0.06, p = .804, BF01 = 4.44, suggesting that affective contexts, compared with neutral contexts, increased the CE irrespective of the affective valence of the context stimuli.
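Bayes factors of this kind can be approximated from the test statistic alone. The sketch below evaluates the JZS Bayes factor of Rouder et al. (2009) with r = 1 for a two-sample t test by numerical integration. It assumes that an interaction with a two-level between-subjects factor is treated as a t test on difference scores (t² = F) and that the two valence groups were of equal size (23 each, inferred from the error degrees of freedom); neither assumption is confirmed by the text, so the output need not match the reported values exactly.

```python
from math import exp, pi, sqrt

def jzs_bf01(t, n1, n2, steps=20000):
    """BF01 for a two-sample t test under the JZS prior with scale r = 1."""
    n_eff = n1 * n2 / (n1 + n2)   # effective sample size
    nu = n1 + n2 - 2              # degrees of freedom
    null_like = (1 + t * t / nu) ** (-(nu + 1) / 2)
    alt_like = 0.0
    for i in range(steps):
        u = (i + 0.5) / steps     # midpoint rule on (0, 1)
        g = u / (1 - u)           # maps (0, 1) onto (0, infinity)
        a = 1 + n_eff * g
        # Inverse chi-square(1) prior on g, as in the JZS formulation.
        prior = g ** -1.5 * exp(-1 / (2 * g)) / sqrt(2 * pi)
        like = a ** -0.5 * (1 + t * t / (a * nu)) ** (-(nu + 1) / 2)
        alt_like += like * prior / (1 - u) ** 2 / steps
    return null_like / alt_like

# Follow-up interaction in the mostly-incongruent context: F(1, 44) = 0.46.
print(jzs_bf01(sqrt(0.46), 23, 23))
```

Values above 3 would be read as positive evidence for the null hypothesis under the criterion used in this article.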
Limitations of the Present Study
In the present research, we observed a significant CSPC effect only for biased items, but not for unbiased items (see also Hutcheon & Spieler, 2017). This is in line with recent research suggesting that detecting a CSPC effect in unbiased items requires a considerably large sample size (Crump, Brosowsky, & Milliken, 2017). For biased items, it remains unclear to what extent other processes such as contingency learning or feature binding contribute to the CSPC effect (Schmidt et al., 2014). Furthermore, the present study did not include a manipulation check to verify whether and how strongly affective faces elicited affective responses. While we presented affective stimuli that have been used successfully in previous research to provoke affective responses even after many repetitions of the same stimulus material (e.g., Richards, Holmes, Pell, & Bethell, 2013), future studies would benefit from more controlled manipulation checks (e.g., electromyography of the corrugator supercilii). Finally, although we did not assess directly whether participants paid attention to the affective faces (e.g., with catch trials), an overall significant CSPC effect combined with the absence of significant differences in the CSPC effect between the affective-context and the neutral-context groups provides evidence that participants of the affective-context group successfully encoded and used the affective expression of the faces as contextual cues.
Conclusion
In summary, the present study replicated previous findings that affective faces can act as context cues for the allocation of attention as indicated by the CSPC effect. While results were inconclusive regarding affective valence-specific influences on the CSPC effect, evidence suggests that the magnitude of CSPC effects did not differ between affective context cues and neutral cues.
References
Botvinick, M. M. (2007). Conflict monitoring and decision making: Reconciling two perspectives on anterior cingulate function. Cognitive, Affective, & Behavioral Neuroscience, 7, 356–366. https://doi.org/10.3758/CABN.7.4.356
Bugg, M. J. (2012). Dissociating levels of cognitive control: The case of Stroop interference. Current Directions in Psychological Science, 21, 302–309. https://doi.org/10.1177/0963721412453586
Bugg, M. J., & Chanani, S. (2011). List-wide control is not entirely elusive: Evidence from picture-word Stroop. Psychonomic Bulletin & Review, 18, 930–936. https://doi.org/10.3758/s13423-011-0112-y
Cañadas, E., Lupiáñez, J., Kawakami, K., Niedenthal, P. M., & Rodríguez-Bailón, R. (2016). Perceiving emotions: Cueing social categorization processes and attentional control through facial expressions. Cognition and Emotion, 30, 1149–1163. https://doi.org/10.1080/02699931.2015.1052781
Cañadas, E., Rodríguez-Bailón, R., Milliken, B., & Lupiáñez, J. (2013). Social categories as a context for the allocation of attentional control. Journal of Experimental Psychology: General, 142, 934–943. https://doi.org/10.1037/A0029794
Crump, M. J. C. (2016). Learning to selectively attend from context-specific attentional histories: A demonstration and some constraints. Canadian Journal of Experimental Psychology, 70, 59–77. https://doi.org/10.1037/cep0000066
Crump, M. J. C., Brosowsky, N. P., & Milliken, B. (2017). Reproducing the location-based context-specific proportion congruent effect for frequency unbiased items: A reply to Hutcheon and Spieler (2016). The Quarterly Journal of Experimental Psychology, 70, 1792–1807. https://doi.org/10.1080/17470218.2016.1206130
Crump, M. J. C., Gong, Z., & Milliken, B. (2006). The context-specific proportion congruent Stroop effect: Location as a contextual cue. Psychonomic Bulletin & Review, 13, 316–321. https://doi.org/10.3758/BF03193850
Crump, M. J. C., & Milliken, B. (2009). The flexibility of context-specific control: Evidence for context-driven generalization of item-specific control settings. The Quarterly Journal of Experimental Psychology, 62, 1523–1532. https://doi.org/10.1080/17470210902752096
Crump, M. J. C., Vaquero, J. M. M., & Milliken, B. (2008). Context-specific learning and control: The roles of awareness, task relevance, and relative salience. Consciousness and Cognition, 17, 22–36. https://doi.org/10.1016/j.concog.2007.01.004
Dignath, D., & Eder, A. B. (2015). Stimulus conflict triggers behavioral avoidance. Cognitive, Affective, & Behavioral Neuroscience, 15, 822–836. https://doi.org/10.3758/s13415-015-0355-6
Dignath, D., Janczyk, M., & Eder, A. B. (2017). Phasic valence and arousal do not influence post-conflict adjustments in the Simon task. Acta Psychologica, 174, 31–39. https://doi.org/10.1016/j.actpsy.2017.01.004
Dreisbach, G., & Fischer, R. (2012). Conflicts as aversive signals. Brain and Cognition, 78, 94–98. https://doi.org/10.1016/j.bandc.2011.12.003
Dreisbach, G., & Fischer, R. (2015). Conflicts as aversive signals for control adaptation. Current Directions in Psychological Science, 24, 255–260. https://doi.org/10.1177/0963721415569569
Dreisbach, G., Reindl, A. L., & Fischer, R. (2016). Conflict and disfluency as aversive signals: Context-specific processing adjustments are modulated by affective location associations. Psychological Research, 82, 324–336. https://doi.org/10.1007/s00426-016-0822-x
Egner, T. (2008). Multiple conflict-driven control mechanisms in the human brain. Trends in Cognitive Sciences, 12, 374–380. https://doi.org/10.1016/j.tics.2008.07.001
Egner, T. (2014). Creatures of habit (and control): A multi-level learning perspective on the modulation of congruency effects. Frontiers in Psychology, 5, 1247. https://doi.org/10.3389/fpsyg.2014.01247
Eder, A. B., & Dignath, D. (2014). I like to get nothing: Implicit and explicit evaluation of avoided negative outcomes. Journal of Experimental Psychology: Animal Learning and Cognition, 40, 55–62. https://doi.org/10.1037/xan0000005
Fritz, J., Fischer, R., & Dreisbach, G. (2015). The influence of negative stimulus features on conflict adaption: Evidence from fluency of processing. Frontiers in Psychology, 6, 1–9. https://doi.org/10.3389/fpsyg.2015.00185
Goeleven, E., De Raedt, R., Leyman, L., & Verschuere, B. (2008). The Karolinska directed emotional faces: A validation study. Cognition and Emotion, 22, 1094–1118. https://doi.org/10.1080/02699930701626582
Hart, S. J., Green, S. R., Casp, M., & Belger, A. (2010). Emotional priming effects during Stroop task performance. NeuroImage, 49, 2662–2670. https://doi.org/10.1016/j.neuroimage.2009.10.076
Hutcheon, T., & Spieler, D. (2017). Limits on the generalizability of context-driven control. The Quarterly Journal of Experimental Psychology, 70, 1292–1304. https://doi.org/10.1080/17470218.2016.1182193
Inzlicht, M., Bartholow, B. D., & Hirsh, J. B. (2015). Emotional foundations of cognitive control. Trends in Cognitive Sciences, 19, 126–132. https://doi.org/10.1016/j.tics.2015.01.004
Jarosz, A. F., & Wiley, J. (2014). What are the odds? A practical guide to computing and reporting Bayes factors. Journal of Problem Solving, 7, 2–9. https://doi.org/10.7771/1932-6246.1167
Kanske, P., & Kotz, S. A. (2011). Conflict processing is modulated by positive emotion: ERP data from a flanker task. Behavioural Brain Research, 219, 382–386. https://doi.org/10.1016/j.bbr.2011.01.043
Larsen, J. T., & Norris, J. I. (2009). A facial electromyographic investigation of affective contrast. Psychophysiology, 46, 831–842. https://doi.org/10.1111/j.1469-8986.2009.00820.x
Lehle, C., & Hübner, R. (2008). On-the-fly adaptation of selectivity in the flanker task. Psychonomic Bulletin & Review, 15, 814–818. https://doi.org/10.3758/PBR.15.4.814
Logan, G. D., & Zbrodoff, N. J. (1979). When it helps to be misled: Facilitative effects of increasing the frequency of conflicting stimuli in a Stroop-like task. Memory & Cognition, 7, 166–174. https://doi.org/10.3758/BF03197535
Lundqvist, D., Flykt, A., & Öhman, A. (1998). The Karolinska Directed Emotional Faces – KDEF [CD ROM]. Stockholm, Sweden: Department of Clinical Neuroscience, Psychology section, Karolinska Institute.
Pessoa, L. (2009). How do emotion and motivation direct executive control? Trends in Cognitive Sciences, 13, 160–166. https://doi.org/10.1016/j.tics.2009.01.006
Richards, A., Holmes, A., Pell, P. J., & Bethell, E. J. (2013). Adapting effects of emotional expression in anxiety: Evidence for an enhanced late positive potential. Social Neuroscience, 8, 650–664. https://doi.org/10.1080/17470919.2013.854273
Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16, 225–237. https://doi.org/10.3758/PBR.16.2.225
Schmidt, J. R., Lemercier, C., & Houwer, J. D. (2014). Context-specific temporal learning with not-conflict stimuli: Proof-of-principle for a learning account of context-specific proportion congruent effects. Frontiers in Psychology, 5, 1–10. https://doi.org/10.3389/fpsyg.2014.01241
Taubert, J., Aagten-Murphy, D., & Parr, L. A. (2012). A comparative study of face processing using scrambled faces. Perception, 41, 460–473. https://doi.org/10.1068/p7151
Van Steenbergen, H., Band, G. P. H., & Hommel, B. (2009). Reward counteracts conflict adaptation: Evidence for a role of affect in executive control. Psychological Science, 20, 1473–1477. https://doi.org/10.1111/j.1467-9280.2009.02470.x
Van Steenbergen, H., Band, G. P. H., & Hommel, B. (2012). Reward valence modulates conflict-driven attentional adaptation: Electrophysiological evidence. Biological Psychology, 90, 234–241. https://doi.org/10.1016/j.biopsycho.2012.03.018
Verguts, T., & Notebaert, W. (2009). Adaptation by binding: A learning account of cognitive control. Trends in Cognitive Sciences, 13, 252–257. https://doi.org/10.1016/j.tics.2009.02.007
Weidler, J. B., & Bugg, M. J. (2016). Transfer of location-specific control to untrained locations. The Quarterly Journal of Experimental Psychology, 69, 2202–2217. https://doi.org/10.1080/17470218.2015.1111396
Wendt, M., & Kiesel, A. (2011). Conflict adaptation in time: Foreperiods as contextual cues for attentional adjustment. Psychonomic Bulletin & Review, 18, 910–916. https://doi.org/10.3758/s13423-011-0119-4
Wendt, M., Kiesel, A., Geringswald, F., Purmann, S., & Fischer, R. (2014). Attentional adjustment to conflict strength: Evidence from the effects of manipulating flanker-target SOA on response times and prestimulus pupil size. Experimental Psychology, 61, 55–67. https://doi.org/10.1027/1618-3169/a000227
History
Received November 10, 2017
Revision received July 17, 2018
Accepted July 18, 2018
Published online February 19, 2019

Acknowledgments
The authors would like to thank Sari Alsalti, Julia Ditz, and Patrik Seuling for help with the data collection and Gesine Dreisbach and Juan Lupiáñez for helpful comments on an earlier version of this manuscript.
© 2019 Hogrefe Publishing
Open Data
Experimental materials, data, and analysis scripts can be retrieved from the Open Science Framework: https://osf.io/2dzuc/?view_only=52a75b55897149be8a57f70e975dc2fe

Funding
This research was supported by the China Scholarship Council (Scholarship No. 201507040020) and by grants within the Priority Program SPP 1772 from the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG), Grant No. KI1388/8-1 and Grant No. DI 2126/1-1.
Jinhui Zhang
Department of Psychology
University of Freiburg
Engelbergerstraße 41
79085 Freiburg
Germany
jinhui.zhang@psychologie.uni-freiburg.de
ORCID Jinhui Zhang https://orcid.org/0000-0002-3997-1686
Experimental Psychology (2019), 66(1), 86–97
Instructions to Authors

Experimental Psychology publishes innovative, original, high quality experimental research. The scope of the journal is defined by experimental methodology and thus papers based on experiments from all areas of psychology are welcome. To name just a few fields and domains of research, Experimental Psychology considers manuscripts reporting experimental work on learning, memory, perception, emotion, motivation, action, language, thinking, problem-solving, judgment and decision making, social cognition, and neuropsychological aspects of these topics. Apart from the use of experimental methodology, a primary criterion for publication is that research papers make a substantial contribution to theoretical research questions. For research papers that have a mainly applied focus, Experimental Psychology is not the appropriate outlet.

Experimental Psychology publishes the following types of articles: Research Articles, Short Research Articles, Theoretical Articles, and Registered Reports. Replication studies should be submitted as a Registered Report.

Manuscript Submission. All manuscripts should in the first instance be submitted electronically at http://www.editorialmanager.com/exppsy. Detailed instructions to authors are provided at http://www.hogrefe.com/j/exppsy

Copyright Agreement. By submitting an article, the author confirms and guarantees on behalf of him-/herself and any coauthors that he or she holds all copyright in and titles to the submitted contribution, including any figures, photographs, line drawings, plans, maps, sketches, tables, raw data, and other electronic supplementary material (ESM), and that the article and its contents do not infringe in any way on the rights of third parties. ESM and raw data files will be published online as received from the author(s) without any conversion, testing, or reformatting. They will not be checked for typographical errors or functionality. The author indemnifies and holds harmless the publisher from any third party claims. The author agrees, upon acceptance of the article for publication, to transfer to the publisher the exclusive right to reproduce and distribute the article and its contents, both physically and in nonphysical, electronic, and other form, in the journal to which it has been submitted and in other independent publications, with no limits on the number of copies or on the form or the extent of the distribution. These rights are transferred for the duration of copyright as defined by international law. Furthermore, the author transfers to the publisher the following exclusive rights to the article and its contents:
1. The rights to produce advance copies, reprints, or offprints of the article, in full or in part, to undertake or allow translations into other languages, to distribute other forms or modified versions of the article, and to produce and distribute summaries or abstracts.
2. The rights to microfilm and microfiche editions or similar, to the use of the article and its contents in videotext, teletext, and similar systems, to recordings or reproduction using other media, digital or analogue, including electronic, magnetic, and optical media, and in multimedia form, as well as for public broadcasting in radio, television, or other forms of broadcast.
3. The rights to store the article and its content in machine-readable or electronic form on all media (such as computer disks, compact disks, magnetic tape), to store the article and its contents in online databases belonging to the publisher or third parties for viewing or downloading by third parties, and to present or reproduce the article or its contents on visual display screens, monitors, and similar devices, either directly or via data transmission.
4. The rights to reproduce and distribute the article and its contents by all other means, including photomechanical and similar processes (such as photocopying or facsimile), and as part of so-called document delivery services.
5. The right to transfer any or all rights mentioned in this agreement, as well as rights retained by the relevant copyright clearing centers, including royalty rights to third parties.

Online Rights for Journal Articles

Hogrefe will send the corresponding author of each accepted paper free of charge an e-offprint (PDF) of the published version of the paper when it is first released online. This e-offprint is provided exclusively for the author’s personal use, including for sharing with coauthors. Other uses of the e-offprint/published version of record, including but not limited to the following, are not permitted except with the express written permission of the publisher: posting the e-offprint/published version of record to a personal or institutional website or to an institutional or disciplinary repository; changing or modifying the digital file; reproducing, distributing, or licensing the article in whole or in part for commercial use. If you wish to post the article to your personal or institutional website or to archive it in an institutional or disciplinary repository, please use either a pre-print or a post-print of your manuscript in accordance with the publication release for your article and the document “Guidelines on sharing and use of articles in Hogrefe journals” on the journal’s web page at www.hogrefe.com/j/exppsy.

February 2019
Current research on the myriad ways in which memory, conscious and unconscious, impacts everyday life
Topics covered include
• Prospective memory
• Distributed practice effects
• Interactions between long-term and working memory
• Memory errors
• Applied memory research in educational settings
• The importance of cues for real-life memory effects
Melody Wiseheart (Editor)
Applied Memory Research
Zeitschrift für Psychologie, Vol. 222/2 2014, iv + 52 pp., large format US $49.00 / € 34.95 ISBN 978-0-88937-462-1
Applied memory research investigates the practical aspects of one of the crucial facets of human life experience: Memory. Our ability to make informed, conscious decisions in the present depends on the existence of memories, both consciously and unconsciously obtained, as do the many seemingly automatic tasks we perform every day. From the earliest days of experimental psychology, researchers have attempted to understand the “laws” of memory with a view to application in fields as varied as improvement of learning processes, advertising, mental ability examination, and legal system evaluations. This volume provides the reader with a general sense of the current state of the field of applied memory research. It covers learning and memory within the field of educational psychology, error sources in forensic psychology, language impairment within clinical psychology, and interactions between social and technical systems within industrial-organizational psychology, as well as memory-related aspects of architectural and industrial design.
www.hogrefe.com
So that’s how my mind works – Now I get it!
“As a fan of PSI theory for more than 20 years, I am very happy seeing it translated for popular consumption! The book makes the theory reasonably simple, with lots of fun illustrations.” Kennon M. Sheldon, PhD, Professor of Psychological Sciences, University of Missouri, Columbia, MO
Johannes Storch / Corinne Morgenegg / Maja Storch / Julius Kuhl
Now I Get It!
Understand Yourself and Take Charge of Your Behavior
2018, vi + 248 pp. US $34.80 / € 27.95 ISBN 978-0-88937-541-3 Also available as eBook
Using the example of four colleagues working together in a small company, Now I Get It! shows us the main personality types and their strengths and weaknesses in such a way that we gain real “now I get it!” insights into what is going on in our own and others’ subconscious. How does my mind work and what kind of personality do I have? When we can answer these questions and have come to terms with who we are, then the solutions to many issues that arise in everyday life will fall into place. What sort of people do I get on with best and how can I best deal with the others? Are there recurring stressful situations in my professional or private life, and how do I resolve them? This humorously written and illustrated book, by the world’s leading experts in personality systems interaction (PSI) theory and the Zurich Resource Model (ZRM), gives us profound insights into our and other people’s subconscious thoughts – so we can adapt our own behavior and interactions to improve our quality of life. Cartoons and worksheets help us on our way.
www.hogrefe.com
Assessment and treatment of Internet addiction
“This excellent book is a pleasure to read. At a time when clinicians are scrambling to learn what they can about the rapidly developing problem of Internet addiction, this book offers them an excellent place to start.” Hilarie Cash, PhD, Chief Clinical Officer and Co-Founder of reSTART Life, PLLC, Fall City, WA – the first residential treatment program for Internet addiction in the US
Daria J. Kuss / Halley M. Pontes
Internet Addiction
(Series: Advances in Psychotherapy – Evidence-Based Practice – Volume 41) 2019, iv + 86 pp. US $29.80 / € 24.95 ISBN 978-0-88937-501-7 Also available as eBook
This book examines how you can identify, assess, and treat Internet addiction in the most effective manner. Internet use has become an integral part of our daily lives, but at what point does it become problematic? What are the different kinds of Internet addiction? And how can professionals best help clients? This compact, evidence-based guide written by leading experts from the field helps disentangle the debates and controversies around Internet addiction, including social media addiction and Internet gaming disorder, and outlines the current assessment and treatment methods. The book presents a 12–15 session treatment plan for Internet and gaming addiction using the method and setting with the best evidence: group CBT. Printable tools in the appendix help clinicians implement therapy. This accessible book is essential reading for clinical psychologists, psychiatrists, psychotherapists, counsellors, social workers, teachers, as well as students.
www.hogrefe.com
The latest knowledge on how to tackle the complexities of hoarding disorder
“If you wish to help those who suffer with the debilitating problem of hoarding, get this book and learn from these experienced scientist–practitioners.” Michael A. Tompkins, PhD, ABPP, Co-Director, San Francisco Bay Area Center for Cognitive Therapy; Assistant Clinical Professor, University of California at Berkeley
Gregory S. Chasson / Jedidiah Siev
Hoarding Disorder
(Series: Advances in Psychotherapy – Evidence-Based Practice – Volume 40) 2019, viii + 76 pp. US $29.80 / € 24.95 ISBN 978-0-88937-407-2 Also available as eBook
Hoarding disorder, classified as one of the obsessive-compulsive and related disorders in the DSM-5, presents particular challenges in therapeutic work, including treatment ambivalence and lack of insight of those affected. This evidence-based guide written by leading experts presents the latest knowledge on assessment and treatment of hoarding disorder. The reader gains a thorough grounding in the treatment of choice for hoarding – a specific form of CBT interweaved with psychoeducational, motivational, and harm-reduction approaches to enhance treatment outcome. Rich anecdotes and clinical pearls illuminate the science, and the book also includes information for special client groups, such as older individuals and those who hoard animals. Printable handouts help busy practitioners. This book is essential reading for clinical psychologists, psychiatrists, psychotherapists, and practitioners who work with older populations, as well as students.
www.hogrefe.com