Experimental Psychology 2020


Volume 67 / Number 1 / 2020


Experimental Psychology


Editors-in-Chief: Andreas Eder, Christian Frings
Editors: Tom Beckers, Ullrich Ecker, Manuel Perea, Jörg Rieskamp, James Schmidt, Alexander Schütz, Samuel Shaki, Sarah Teige-Mocigemba, Julia Vogt, Matthias Wieser


State-of-the-art research synthesis methods for psychologists. Out April 2020.

Michael Bošnjak, Nadine Wedderhoff (Editors)

Hotspots in Psychology 2020. Zeitschrift für Psychologie, Volume 43, 2020, iv/64 pp., large format. US $49.00 / €34.95. ISBN 9780889375741

This fourth Hotspots in Psychology issue is devoted to systematic reviews and meta-analyses in research-active fields that have generated a considerable number of primary studies. The common denominator of the contributions is their research synthesis nature, not a specific psychological topic or theme that all articles address. This issue explores methodological advances in research synthesis methods relevant to any subfield of psychology.


Zeitschrift für Psychologie

Founded by Hermann Ebbinghaus and Arthur König in 1890

www.hogrefe.com

The contributions include: the application of a network meta-analytic approach to analyze the effect of transcranial direct current stimulation on memory; analyzing the performance of a meta-analytic structural equation modeling approach when variables in primary studies have been artificially dichotomized; assessing quality-related aspects of systematic reviews with AMSTAR and AMSTAR2; as well as a graphical approach to depict study-level statistical power in the context of meta-analysis.

About the Journal

The Zeitschrift für Psychologie publishes high-quality research from all branches of empirical psychology that is clearly of international interest and relevance, and does so in four topical issues per year. The guest editors and the editorial team are assisted by an experienced international editorial board and external reviewers to ensure that the journal's strict peer-review process is in keeping with its long and honorable tradition of publishing only the best of psychological science. The subjects covered are determined by the editorial team after consultation within the scientific community, thus ensuring topicality.


Experimental Psychology

Volume 67 / Number 1 / 2020


Editors

A. Eder, Würzburg, Germany

C. Frings, Trier, Germany

Associate Editors

T. Beckers, Leuven, Belgium; U. Ecker, Perth, Australia; M. Perea, Valencia, Spain; J. Rieskamp, Basel, Switzerland; J. Schmidt, Dijon, France; A. Schütz, Marburg, Germany; S. Shaki, Samaria, Israel; S. Teige-Mocigemba, Marburg, Germany; J. Vogt, Reading, UK; M. Wieser, Rotterdam, The Netherlands

Editorial Board

U. J. Bayen, Düsseldorf, Germany; H. Blank, Portsmouth, UK; A. Bröder, Mannheim, Germany; J. De Houwer, Ghent, Belgium; R. Dell'Acqua, Padova, Italy; G. O. Einstein, Greenville, SC, USA; E. Erdfelder, Mannheim, Germany; M. Goldsmith, Haifa, Israel; D. Hermans, Leuven, Belgium; R. Hertwig, Berlin, Germany; J. L. Hicks, Baton Rouge, LA, USA; P. Juslin, Uppsala, Sweden; Y. Kareev, Jerusalem, Israel; D. Kerzel, Geneva, Switzerland; A. Kiesel, Freiburg, Germany; K. C. Klauer, Freiburg, Germany; R. Kliegl, Potsdam, Germany; I. Koch, Aachen, Germany; J. I. Krueger, Providence, RI, USA; S. Lindsay, Victoria, BC, Canada; E. Loftus, Irvine, CA, USA; T. Meiser, Mannheim, Germany; K. Mitchell, West Chester, PA, USA; N. W. Mulligan, Chapel Hill, NC, USA; B. Newell, Sydney, Australia; K. Oberauer, Zürich, Switzerland; F. Parmentier, Palma, Spain; M. Regenwetter, Champaign, IL, USA; R. Reisenzein, Greifswald, Germany; J. N. Rouder, Irvine, CA, USA; D. Shanks, London, UK; C. Stahl, Köln, Germany; M. Steffens, Landau, Germany; S. Tremblay, Quebec, Canada; C. Unkelbach, Köln, Germany; M. Waldmann, Göttingen, Germany; E. Walther, Trier, Germany; P. A. White, Cardiff, UK; D. Zakay, Tel Aviv, Israel

Publisher

Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail publishing@hogrefe.com, Web http://www.hogrefe.com

Production

Regina Pinks-Freybott, Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail production@hogrefe.com

Subscriptions

Hogrefe Publishing, Herbert-Quandt-Str. 4, D-37081 Göttingen, Germany, Tel. +49 551 99950-956, Fax +49 551 99950-998, E-mail zeitschriftenvertrieb@hogrefe.de

Advertising/Inserts

Melanie Beck, Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail marketing@hogrefe.com

ISSN

ISSN-L 1618-3169, ISSN-Print 1618-3169, ISSN-Online 2190-5142

Copyright Information

© 2020 Hogrefe Publishing. The journal as well as the individual contributions to it are protected under international copyright law. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, digital, mechanical, photocopying, microfilming or otherwise, without prior written permission from the publisher. All rights, including translation rights, are reserved.

Publication

Published in six issues per annual volume. Experimental Psychology is the continuation of Zeitschrift für Experimentelle Psychologie (ISSN 0949-3964), the last annual volume of which (Volume 48) was published in 2001.

Subscription Prices

Calendar year subscriptions only. Rates for 2020: Institutions – from US $483.00/€370.00 (print only; pricing for online access can be found in the journals catalog at hgf.io/journalscatalog); Individuals – US $264.00/€199.00 (print & online). Postage and handling – US $24.00/€18.00. Single copies – US $85.00/€66.50 + postage & handling.

Payment

Payment may be made by check, international money order, or credit card to Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, or, for customers in North America, to Hogrefe Publishing, Inc., Journals Department, 361 Newbury Street, 5th Floor, Boston, MA 02115, USA.

Electronic Full Text

The full text of Experimental Psychology is available online at http://econtent.hogrefe.com/loi/zea

Abstracting/Indexing Services

Experimental Psychology is abstracted/indexed in Current Contents/Social & Behavioral Sciences (CC/S&BS), Social Science Citation Index (SSCI), Medline, PsyJOURNALS, PsycINFO, PSYNDEX, ERIH, Scopus, and EMCare. 2018 Impact Factor: 1.000 (Journal Citation Reports, Clarivate Analytics, 2019)

Experimental Psychology (2020), 67(1)

© 2020 Hogrefe Publishing


Contents

Editorial

Experimental Psychology in the Year 2020: Where We Stand and Where to Go
Andreas B. Eder and Christian Frings (p. 1)

Research Articles

Experienced Category Variability Modulates the Impact of Context on Evaluative Judgments
Marília Prada and Teresa Garcia-Marques (p. 5)

Short Research Articles

Registered Report


Your Face and Moves Seem Happier When I Smile: Facial Action Influences the Perception of Emotional Faces and Biological Motion Stimuli
Fernando Marmolejo-Ramos, Aiko Murata, Kyoshiro Sasaki, Yuki Yamada, Ayumi Ikeda, José A. Hinojosa, Katsumi Watanabe, Michal Parzuchowski, Carlos Tirado, and Raydonal Ospina (p. 14)

Why People With High Alexithymia Make More Utilitarian Judgments: The Role of Empathic Concern and Deontological Inclinations
Xiangyi Zhang, Zhihui Wu, Shenglan Li, Ji Lai, Meng Han, Xiyou Chen, Chang Liu, and Daoqun Ding (p. 23)

Stroke Encoding Processes of Chinese Character During Sentence Reading
Mingjun Zhai, Hsuan-Chih Chen, and Michael C. W. Yip (p. 31)

Parsing for Position
David J. Lobina, José E. García-Albea, and Josep Demestre (p. 40)

Effects of Input Modality on Vocal Effector Prioritization in Manual–Vocal Dual Tasks
Mareike A. Hoffmann, Melanie Westermann, Aleks Pieczykolan, and Lynn Huestegge (p. 48)

Does Object Size Matter With Regard to the Mental Simulation of Object Orientation?
Sau-Chin Chen, Bjorn B. de Koning, and Rolf A. Zwaan (p. 56)




Editorial

Experimental Psychology in the Year 2020: Where We Stand and Where to Go

Andreas B. Eder¹ and Christian Frings²

¹Department of Psychology, University of Würzburg, Germany
²Department of Psychology, University of Trier, Germany

Experimental Psychology (2020), 67(1), 1–4, https://doi.org/10.1027/1618-3169/a000471

With the year 2020, the journal Experimental Psychology entered a new decade, and our term as Editors-in-Chief has already reached its halfway point: time for a résumé of what we have achieved so far and what we have planned for the future.

Where Do We Stand?

In our previous Editorial (Eder & Frings, 2018), we identified the following five criteria of excellence for a research journal that would serve as benchmarks for our own editorial practice:

Standard 1: A good journal has a specialty.
Standard 2: A good journal has rigorous peer review.
Standard 3: A good journal is transparent.
Standard 4: A good journal honors the value of reproducible data.
Standard 5: A good journal is author-friendly.

Our activities were geared toward strengthening these standards, resulting in several tweaks and changes to the review and publication procedures at Experimental Psychology. To highlight the relevance of our journal's subject (experimental research in psychology), we invited several distinguished scientists to write articles on significant advances in influential areas of experimental psychology. We are very happy that several renowned scientists followed our invitation. Bernhard Hommel from Leiden University (The Netherlands) wrote an article in which he extended the Theory of Event Coding to representations of the self and others, highlighting the relevance of experimental research for our understanding of social communication (Hommel, 2018). Charles Spence from the

University of Oxford (United Kingdom) published a theoretical review on the relationship between color and taste/flavor, highlighting the importance of experimental research on multisensory perception (Spence, 2019). Jan De Houwer from Ghent University (Belgium) discussed in his article the importance of relational knowledge for our understanding of classic "associative" phenomena, proposing an alternative to dual-system accounts (De Houwer, 2019). Additional prominent researchers have agreed to submit articles for coming issues. Sign up to our electronic journal (https://econtent.hogrefe.com/) so that you will not miss them!

Rigorous peer review is the most important instrument of quality assurance, and we implemented this measure without compromise. Every submission was reviewed, without exception, by independent experts in the field. As a consequence of this rigorous policy, not every submission could make it into print: the acceptance rate was 37% last year, while the rate of submissions remained constant over the last 2 years. We place great importance on the high quality of the papers published in Experimental Psychology. In this regard, we are extremely fortunate to have the support of a professional Editorial Board that has served this journal excellently for many years. They are the beating heart of this journal, and we want to thank them here for their continued dedication. Our special thanks go to Arndt Bröder (University of Mannheim, Germany) and Gesine Dreisbach (University of Regensburg, Germany), who decided to step down from the board after having served as editors for many years. We are fortunate to have replaced them with excellent new editors who will reinforce our Editorial Board with their expertise in selected subfields of experimental psychology. Ullrich Ecker is Associate Professor of Psychology at the University of Western Australia.
His research investigates human memory, memory updating, and the processing of misinformation. Julia Vogt is Associate Professor of Psychology at the University of Reading (United Kingdom). Her research focuses on motivation, emotion, and social cognition. Welcome to our Editorial Board!

Other editorial activities aimed to strengthen transparency about decisions and publication practices. Progress in this direction was made with the award of Open Science badges to articles in acknowledgment of open science practices (open data, open materials, preregistration). The badges were provided by the Center for Open Science, and they appear in both the print journal and online articles. A study showed that acknowledging open science practices with digital badges improves the public accessibility of the data and materials used in scientific research (Kidwell et al., 2016). Experimental Psychology is one of the few scientific journals that has made publication of raw data in a public data archive obligatory (except for noted reasons). Since its official introduction, the Open Data badge has been awarded to every empirical article published in our journal. The Open Material badge was awarded to 27% and the Preregistration badge to 14% of the articles, attesting to a high degree of transparency.

In addition to Open Science badges, we also made efforts to increase the number of open-access publications. Specifically, one article per issue was published open access, based on the editors' pick and at no additional cost to the authors. Since the start of this initiative, six further articles (in addition to the three invited articles noted above) have been made open access (Arnold, Heck, Bröder, Meiser, & Boywitt, 2019; Berry, Allen, Waterman, & Logie, 2019; Genschow, Schuler, Cracco, Brass, & Wänke, 2019; Hussey & De Houwer, 2018; Kranz, Nadarevic, & Erdfelder, 2019; Zhang, Phan, Li, & Guo, 2018).
We thank Hogrefe Publishing for the generous support of this publication model, and we would like to see more articles published under the Hogrefe OpenMind License in the future.

In times of a noted "replication crisis" in psychology, a good journal should honor the value of reproducible data. Experimental Psychology acknowledged the importance of preregistering study plans and publishing negative study results very early, with the introduction of the Registered Report article type. A Registered Report is a preregistered study plan detailing the theoretical background, empirical hypotheses, methods, and data-analytic strategies for a planned but not yet conducted experiment. The study plan is evaluated by scientific peers and, importantly, an editorial decision on acceptance is made before the results of the experiment are known. The preregistration prevents hypothesizing after the results are known and the withholding of negative results from publication. In addition, the format is particularly suitable for the preregistration of direct replication studies, which often fail to


provide affirmative results (see, e.g., Mieth, Bell, & Buchner, 2019). We are very happy that several Registered Report articles have appeared in our journal since the start of our editorial term (e.g., Frech, Loschelder, & Friese, 2019; Grange, 2018) and that more are under revision at the moment.

Last, but not least, a good journal has to be author-friendly. A service typically unnoticed by outsiders is the careful screening of incoming manuscripts for formal errors by our Editorial Assistants, Anand Krishna and Sarah Schäfer. Their diligent proofreading and constructive feedback to authors are an invaluable asset to our editorial work. In our previous Editorial, we also stated the goal of reducing the turnaround time from completion of submission to the first decision after review. In 2017, the average turnaround time was 64 days. In 2018, we reduced this time to 49.5 days (SD = 28.7), and in 2019 even further, to 39 days (SD = 23.4)! We are extremely happy that we were able to reduce turnaround times to below 40 days, and we thank our editorial team for making this possible. In addition to reducing turnaround times, we also made efforts to facilitate advance online publication after acceptance of manuscripts. We are aware that authors want their papers published as soon as possible after acceptance. We therefore made an agreement with Hogrefe Publishing that advance online publication in the paper's final form is guaranteed within 12 weeks of acceptance (provided that there are no significant delays in the approval of proofs for release). Furthermore, Hogrefe Publishing has also revised the guidelines on the sharing and use of articles submitted to Experimental Psychology.
The manuscript version accepted by the journal for publication can now be shared and posted at any time after acceptance, including on authors' personal websites, in their own institutional repositories, in not-for-profit subject-based repositories, and in scholarly communication networks such as ResearchGate or Academia.edu. In addition, the final version of the article as published in the journal (the version of record) can be shared at any time with individuals upon request, for their personal scientific or other noncommercial research uses, and as part of a grant application or the submission of a thesis or doctorate. For more information, the interested reader is referred to Hogrefe Publishing's guidelines on the sharing and use of articles in Hogrefe scientific journals.

Where to Go?

Since the start of our editorial term, we have thus noted several positive trends in the development of Experimental



Psychology, such as shortened turnaround times, pervasive use of open science practices, and the removal of embargo policies. These trends must be consolidated and further strengthened in the coming years. However, there were also negative developments. Particularly alarming is the steady decline of the journal's impact factor (IF), from 1.83 in 2016 to 1.21 in 2017 and 1.00 in 2018. While the interpretation of the IF is complex (for an in-depth discussion, see our previous Editorial; Eder & Frings, 2018), it is a quantitative measure of a journal's visibility in the scientific community. By this standard, we deem an IF of 1.00 unacceptably low. Since the start of our editorial term, we have launched several initiatives that should increase the recognition of our journal (e.g., invited articles by distinguished scientists, special issues), the most important being the rigorous quality assurance of journal contributions (see our acceptance rates above). Thus, we are confident that the negative trend has already been stopped and that a turnaround has been reached. In addition, we now also use social media to reach a broader audience. Readers are invited to follow us on Twitter to get all updates about the journal, such as announcements of the latest publications, important news, and reminders regarding special issues.

Call for Registered Replication Studies

Replication lies at the heart of the scientific method, and cumulative knowledge is only possible on the basis of confirmed and reproducible scientific findings. Experimental Psychology therefore wants to encourage researchers to carry out replication research using its Registered Report format. Replication studies can be direct (using the same research protocol as the original study) or conceptual (investigating the same research question with a different research protocol). Using a two-stage review system, study protocols are reviewed by peers before data collection commences, and results from accepted studies will be published regardless of the direction or statistical significance of the outcome (for more information, see our Author Instructions). As an additional incentive, the editors of Experimental Psychology and its publisher, Hogrefe Publishing, will award grants of €500 for each of two replication studies that pass Stage 1 of the tiered review process. The grant money may be used for any purpose related to the replication, and it is paid even in the unlikely case that the submitted article is rejected at Stage 2. Proposals can be submitted at any time, and there are no additional requirements for


participation. Please indicate in your cover letter whether you want to apply for the grant (for detailed instructions, see hogrefe.com). Grants will be awarded on a first-come, first-served basis each year for a period of 2 years, starting May 1, 2020.

Call for Special Issues

Experimental Psychology invites the submission of proposals for thematic special issues on a wide range of topics in experimental psychology, particularly those focusing on timely or emergent research areas. Consistent with the journal's priorities, articles must meet the journal's primary criteria, namely the rigorous use of experimental methodology and/or a strong and innovative theoretical contribution to experimental psychology as a basic science. A special issue typically comprises a review of the special issue topic as well as empirical research papers or articles on methodological innovations (see, e.g., the forthcoming special issue on Stress & Cognition in Humans, guest-edited by Gregor Domes and Christian Frings). A target article might also be published together with one or more invited comments. Proposals can be submitted at any time (for details, see hogrefe.com/j/exppsy).

References

Arnold, N. R., Heck, D. W., Bröder, A., Meiser, T., & Boywitt, C. D. (2019). Testing hypotheses about binding in context memory with a hierarchical multinomial modeling approach. Experimental Psychology, 66(3), 239–251. https://doi.org/10.1027/1618-3169/a000442

Berry, E. D. J., Allen, R. J., Waterman, A. H., & Logie, R. H. (2019). The effect of a verbal concurrent task on visual precision in working memory. Experimental Psychology, 66(1), 77–85. https://doi.org/10.1027/1618-3169/a000428

De Houwer, J. (2019). Moving beyond System 1 and System 2. Experimental Psychology, 66(4), 257–265. https://doi.org/10.1027/1618-3169/a000450

Eder, A. B., & Frings, C. (2018). What makes a quality journal? Experimental Psychology, 65(5), 257–262. https://doi.org/10.1027/1618-3169/a000426

Frech, M.-L., Loschelder, D. D., & Friese, M. (2019). How and why different forms of expertise moderate anchor precision in price decisions. Experimental Psychology, 66(2), 165–175. https://doi.org/10.1027/1618-3169/a000441

Genschow, O., Schuler, J., Cracco, E., Brass, M., & Wänke, M. (2019). The effect of money priming on self-focus in the imitation-inhibition task. Experimental Psychology, 66(6), 423–436. https://doi.org/10.1027/1618-3169/a000466

Grange, J. A. (2018). Does task activation in task switching influence inhibition or episodic interference? Experimental Psychology, 65(6), 393–404. https://doi.org/10.1027/1618-3169/a000423




Hommel, B. (2018). Representing oneself and others. Experimental Psychology, 65(6), 323–331. https://doi.org/10.1027/1618-3169/a000433

Hussey, I., & De Houwer, J. (2018). Implicit association test as an analogical learning task. Experimental Psychology, 65(5), 272–285. https://doi.org/10.1027/1618-3169/a000416

Kidwell, M. C., Lazarević, L. B., Baranski, E., Hardwicke, T. E., Piechowski, S., Falkenberg, L.-S., …, Nosek, B. A. (2016). Badges to acknowledge open practices: A simple, low-cost, effective method for increasing transparency. PLOS Biology, 14(5), e1002456. https://doi.org/10.1371/journal.pbio.1002456

Kranz, D., Nadarevic, L., & Erdfelder, E. (2019). Bald and bad? Experimental evidence for a dual-process account of baldness stereotyping. Experimental Psychology, 66(5), 340–354. https://doi.org/10.1027/1618-3169/a000457

Mieth, L., Bell, R., & Buchner, A. (2019). The "mnemonic time-travel effect". Experimental Psychology, 66(6), 437–442. https://doi.org/10.1027/1618-3169/a000461

Spence, C. (2019). On the relationship(s) between color and taste/flavor. Experimental Psychology, 66(2), 99–111. https://doi.org/10.1027/1618-3169/a000439

Zhang, Y., Phan, Z., Li, K., & Guo, Y. (2018). Self-serving bias in memories: Selectively forgetting the connection between negative information and the self. Experimental Psychology, 65(4), 237–245. https://doi.org/10.1027/1618-3169/a000409



Published online April 30, 2020

Andreas B. Eder
Department of Psychology
Universität Würzburg
Röntgenring 10
97070 Würzburg
Germany
andreas.eder@uni-wuerzburg.de

Christian Frings
Department of Psychology
Universität Trier
Universitätsring 15
54286 Trier
Germany
chfrings@uni-trier.de



Research Article

Experienced Category Variability Modulates the Impact of Context on Evaluative Judgments

Marília Prada¹ and Teresa Garcia-Marques²

¹Instituto Universitário de Lisboa (ISCTE-IUL), CIS-IUL, Lisbon, Portugal
²ISPA – Instituto Universitário, William James Center for Research, Lisbon, Portugal

Experimental Psychology (2020), 67(1), 5–13, https://doi.org/10.1027/1618-3169/a000469

Abstract. Data from two experiments show that the experienced structure of a category (i.e., as having high vs. low variability) modulates the impact of context on evaluative judgments of individual exemplars. Target objects (unfamiliar in Experiment 1 and familiar in Experiment 2) were primed with positive and negative images while varying the number (Experiment 1) or typicality (Experiment 2) of exemplars known from a category prior to the judgment task. The results show that evaluations of object valence were more influenced by valenced context cues in high- than in low-variability category conditions. These results are taken as evidence that more varied exemplar-based category representations facilitate context effects on stimulus evaluation.

Keywords: exemplar variability, judgment malleability, context influences, evaluation

According to exemplar models, the information stored in our memory includes distinct exemplars, and categorization occurs by considering how similar these stored exemplars are to a given percept (Hintzman, 1986; Medin & Schaffer, 1978; Nosofsky, 1986). Not only do these exemplar models account for the categorization process itself, but they also help us understand how we judge such categories and their elements (e.g., Lord & Lepper, 1999; Sia, Lord, Blessum, Thomas, & Lepper, 1999). From the understanding provided by exemplar models, we learn that the variability of exemplars is highly relevant for the process of mental categorization (e.g., Cohen, Nosofsky, & Zaki, 2001; Rips, 1989; Sloman, 1994). For instance, increased perceived variability has been shown to lead to optimal categorization (Ashby & Gott, 1988; Fried & Holyoak, 1984). Also, an increase in exemplar variability during category learning led individuals to perform better in the classification of novel exemplars (e.g., Cohen et al., 2001; Fried & Holyoak, 1984; Lambert & Wyer, 1990; Lively, Logan, & Pisoni, 1993; Sakamoto, Jones, & Love, 2008; Stewart & Chater, 2002; Wahlheim & DeSoto, 2017; Wahlheim, Finn, & Jacoby, 2012). Thus, category variability affects how easily categories are learned, with highly varied categories being

more difficult to acquire than less varied categories (Hahn, Bailey, & Elvin, 2005; Homa & Vosburgh, 1976). Category variability also affects the learning of abstraction (e.g., Zentall, Wasserman, Lazareva, Thompson, & Rattermann, 2008) and promotes generalization in language development (Bowerman & Choi, 2001; Waxman & Klibanoff, 2000) and in the nonverbal domain of object categorization (e.g., Ribar, Oakes, & Spalding, 2004; Vukatana, Graham, Curtin, & Zepeda, 2015). All these studies document that exemplar variability is a relevant feature for basic cognitive processes. However, to our knowledge, few if any studies (see Hahn et al., 2005, and Nosofsky & Kantner, 2006, for memory studies) address how the variability of learned exemplars may moderate other related cognitive processes. In this paper, we focus on the variability of exemplars, arguing that variability is a feature of knowledge that constrains how context is able to bias our thoughts. This argument is corroborated by two studies showing that our experience of the variability of a given category (i.e., less vs. more variable) modulates the impact of context on evaluative judgments of specific category exemplars.

Judging Category Exemplars

Why can exemplars' category variability be relevant to how we evaluate exemplars? The answer to this question



relies on how we make evaluative judgments and how category features contribute to the evaluative process. Some perspectives suggest that our evaluative judgments may be computed on the spot, at the moment the evaluative goal is established, with currently available information (e.g., Schwarz, 2007) and based on the information that is made accessible in our memory (e.g., Bassili & Brown, 2005). As such, the evaluation of the overall category and of category exemplars depends on which exemplars are accessible in our minds. The process of evaluating an exemplar on the spot is simultaneously influenced by the context and constrained by different features of previous knowledge (for a review, see Barsalou, 1987, 1989). Both factors determine what is accessible in our memory at the time of judgment. What is accessible is partially determined by the context in which the judgment occurs (Garcia-Marques & Mackie, 1999; Kahneman & Miller, 1986). The context primes specific exemplars, and these provide support for the overall evaluation of a category and its exemplars. It is supposedly because of this that the activation of a specific exemplar, such as a politician (e.g., Bill Clinton), was shown to influence the evaluation of the whole category of politicians (Sia et al., 1999). The category of Black people was also shown to be evaluated more positively when participants were previously exposed to admired Black exemplars (e.g., Martin Luther King; Dasgupta & Greenwald, 2001; cf. Joy-Gaba & Nosek, 2010). Thus, changes in the context in which the judgment occurs may lead to changes in the relative impact of the specific exemplars that support that evaluative judgment (Sia et al., 1999). But what is accessible is also constrained by features of our knowledge structure.
For instance, having typical (or prototypical) exemplars available leads to a less context-dependent (and thus more stable) evaluation, as this type of exemplar is easier to retrieve and more likely to support subsequent judgments (Lord, Desforges, Ramsey, Trezza, & Lepper, 1991). In the same line of thought, besides prototypicality, other features that determine exemplars' accessibility should also affect context sensitivity. Here, we focus on the impact of the variability of the exemplars stored in our memory on this process. Specifically, the variability of accessible exemplars is important, given that processing an exemplar depends on the specific features of the subset of exemplars that, at a particular moment, compose the representation of the category (Janczura & Nelson, 1999; Medin & Shoben, 1988; Roth & Shoben, 1983). Context effects on exemplar (or category) evaluations are determined by the exemplars that are made accessible in memory. However, "what can be made accessible" in memory also depends on what is already stored. For instance, it is because we actually know some pleasant insects and some unpleasant flowers that the normative

M. Prada & T. Garcia-Marques, The Importance of Variability

preference for flowers over insects may be reversed when the insects made accessible are positive (e.g., butterfly) and the flowers negative (e.g., weeds; Govan & Williams, 2004). Wisniewski's (1995) studies offered indirect support for this hypothesis (see also Loken, Joiner, & Peck, 2002), showing that students perceived "a new instrument" to "clean up pollution" to be "a vacuum cleaner" or "a sponge" depending on the context in which the instrument was presented. In the first case, the object was perceived to be more efficient when presented near roadside trash (vs. an ocean spill), and in the second, when presented near the ocean spill. Moreover, Herr (1989; Experiment 2) illustrated how previous knowledge about a category (cars) imposes restrictions on specific "exemplar activation," leading to a modulation of context effects on category evaluation. Context effects on evaluative judgments were stronger for participants who reported higher (vs. lower) knowledge of the category (for evidence regarding moderation by level of expertise, see also Bettman & Sujan, 1987; Yi, 1993). That may be why evaluations deriving from a homogeneous category of exemplars tend to be more stable (Armitage & Conner, 2004), and also why we propose that knowledge variability is likely to modulate exemplar evaluation in general. However, no study has yet directly isolated this phenomenon, stressing the underlying feature of variability as highly relevant for how knowledge constrains context biases.

Current Studies

Assuming that category representations are highly flexible structures, varying widely across contexts (for reviews, see Barsalou, 1987, 1993; Garcia-Marques, Santos, Mackie, Hagá, & Palma, 2017), we argue that increasing the variability of the exemplars currently representing the category in working memory increases the likelihood of a contextual match, which facilitates the occurrence of context effects. We examined this hypothesis in two experiments, testing whether the number of exemplars of an unfamiliar category (Experiment 1) and the similarity between exemplars of familiar categories (Experiment 2) influence the sensitivity of exemplar judgments to context effects.

© 2020 Hogrefe Publishing

Experiment 1

Participants and Design

Participants were 107 university students (74.8% women; Mage = 22.44, SD = 5.10) who volunteered for the study in return for partial course credit. Participants were randomly assigned to the levels of the first factor of the following design: 3 (category variability: low; high; control) × 2 (context cue valence: negative prime; positive prime). The second factor was manipulated within participants, so the test of our hypothesis corresponds to a within-between interaction effect. We assumed these effects to be of moderate magnitude; for α = .05 and power 1 – β = .80, the recommended sample size was N = 66 (following Faul, Erdfelder, Lang, & Buchner, 2007).
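This kind of a priori power analysis can be approximated in code with a noncentral-F computation, as G*Power does for within-between interactions. In the sketch below, the repeated-measures correlation (ρ = 0.5) and the effect-size convention are our assumptions, since the paper does not report them, so the function need not reproduce the exact N = 66.

```python
from scipy.stats import f as f_dist, ncf

def interaction_power(n_total, f_effect, k_between, m_within, rho, alpha=0.05):
    """Approximate power for the within-between interaction of a mixed
    ANOVA via the noncentral-F approach (G*Power-style); rho is the
    assumed correlation among the repeated measures."""
    df1 = (k_between - 1) * (m_within - 1)        # interaction df
    df2 = (n_total - k_between) * (m_within - 1)  # error df
    # Noncentrality: the effect size f is amplified by the number of
    # repeated measures and shrunk by their (assumed) correlation.
    lam = f_effect**2 * n_total * m_within / (1 + (m_within - 1) * rho)
    f_crit = f_dist.ppf(1 - alpha, df1, df2)
    return 1 - ncf.cdf(f_crit, df1, df2, lam)

# Moderate effect (f = 0.25), 3 between-participants groups, 2 within
# levels; rho = 0.5 is a conventional default (an assumption here).
power_66 = interaction_power(66, 0.25, 3, 2, 0.5)
```

Varying `n_total` shows how power grows with sample size under these assumptions.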


Materials

Two sets of pretested images supported our experimental paradigm: (a) images of unfamiliar objects represented the target category, and (b) affectively charged images served as primes to promote context influences on participants' evaluations. A set of 50 images of objects supported the variability manipulation (for examples, see Figure 1): half depicted familiar domestic objects (e.g., spoon or comb; Prada & Garcia-Marques, 2006) and the other half depicted objects pretested as unfamiliar (Prada & Ricot, 2010). In the low variability condition, before the evaluation task, participants saw 30 images: 22 known domestic objects randomly mixed with eight unfamiliar objects. In the high variability condition, the 30 images comprised 12 known domestic objects randomly mixed with 18 unfamiliar target objects. All images were presented in grayscale at the same resolution (800 × 600). We collected ratings of perceived diversity and general liking of the objects in each set using 7-point rating scales. The images in the low variability set were perceived as more homogeneous (M = 5.47, SD = 1.70) than those in the high variability set (M = 4.00, SD = 2.25), t(31) = 2.13, p = .042, d = 0.77. No significant difference was detected in liking of the target objects between the low (M = 3.18, SD = 1.07) and high variability sets (M = 3.63, SD = 1.09), t(31) = 1.19, p = .243, d = 0.43. A different set of 20 images of objects pretested as unfamiliar (Mfamiliarity = 3.16, SD = 0.73; 9-point familiarity rating scale; Prada & Ricot, 2010) served as exemplars of the category to be evaluated (target stimuli). The 20 affectively charged images (context cues) were unrelated to the target category (i.e., none represented objects): half were pretested as negative (e.g., graveyard or skull; Mvalence = 3.28, SD = 0.25) and the remaining half as positive (e.g., strawberry or smiling toddler; Mvalence = 8.06, SD = 0.32; 9-point valence rating scale; Prada & Garcia-Marques, 2006).

Procedure

Participants arrived at the laboratory in groups of 3–6 people after providing informed consent. The experiment ran on a computer (E-Prime software), with participants randomly assigned to one of the three category variability conditions. To ensure that participants experienced a high or a low variability category before the evaluative task, they were presented with the high or the low variability set of images in what was described as a "categorization task." In this task, the 30 images were presented one at a time in the center of the screen, and participants categorized each, as quickly as possible, as either a "domestic" (S key) or "nondomestic" (L key) object. This task led those in the high variability condition to represent the target category of objects with a higher variability of exemplars than those in the low variability condition. Those in the control condition formed no category representation prior to the evaluation task, as they performed only the evaluative task. All participants then received instructions for the evaluative task (20 trials), in which pairs of images were shown sequentially and participants evaluated, as fast as possible, the object in the second image of each pair using a 5-point rating scale (from 1 = I do not like it to 5 = I like it a lot). Each trial started with an attention-fixation sign (+, 500 ms), followed by the valenced image (200 ms) implementing the context cue manipulation, a blank screen (100 ms), and then the target (category exemplar). The target remained visible until a response was registered (keys 1–5).

Figure 1. Examples of the materials used in the variability manipulation.






To calibrate response times (RTs), participants first performed a training task (five trials) with neutral geometric shapes as targets. In this training phase, feedback was provided to encourage target evaluations faster than 1,500 ms. Each session took about 10 min. At the end of the session, participants were thanked and debriefed.

Results

Categorization Task

Accuracy in the categorization task was the proportion of known domestic objects classified as domestic and of unfamiliar objects classified as nondomestic. Participants in the high variability condition (M = 0.89, SD = 0.07) were more accurate than those in the low variability condition (M = 0.84, SD = 0.09), t(67) = 2.27, p = .027, d = 0.38. Response latencies in this task did not vary significantly across experimental conditions (Mlow = 1,475 ms, SE = 142; Mhigh = 1,444 ms, SE = 144), F(1, 67) = 0.024, p = .858, ηp² = 0.000. However, participants were slower to categorize unfamiliar objects (M = 1,675 ms, SE = 144) than domestic objects (M = 1,244 ms, SE = 67), F(1, 67) = 18.75, MSE = 6,395,281, p < .001, ηp² = 0.219.

Evaluation Task

We tested our hypothesis directly by analyzing the impact of contextual variability and context cue valence on exemplar evaluations (the dependent measure). Specifically, we conducted a mixed ANOVA (3 context variability × 2 context cue valence) with the second factor defined as a repeated measure. Evidence of context effects was provided by a significant main effect of context cue valence on exemplars' evaluation, F(1, 104) = 142.52, MSE = 59.30, p < .001, ηp² = 0.578, suggesting an assimilation effect: targets preceded by positive stimuli (M = 2.95, SE = 0.09) were rated more positively than targets preceded by negative stimuli (M = 1.90, SE = 0.06). Importantly, as hypothesized, the context effect was qualified by the experimental conditions, as shown by the interaction between cue valence and contextual variability, F(2, 104) = 4.49, MSE = 1.87, p = .013, ηp² = 0.079 (see Figure 2). Decomposing the interaction into two orthogonal contrasts, we found that (a) the context effect was stronger in the high variability condition than in the other two conditions combined, t(104) = 2.76, p = .007, d = 0.54, and (b) no significant difference emerged between those two conditions (low variability and control), t(104) = 1.23, p = .223, d = 0.24. The main effect of variability condition on target ratings did not reach significance, F(1, 104) = 3.02, MSE = 2.54, p = .053, ηp² = 0.055. Still, the pattern of means suggests that participants in the high variability condition evaluated targets more positively (M = 2.64, SE = 0.11) than those in the low variability (M = 2.37, SE = 0.11) and control conditions (M = 2.27, SE = 0.11).

Figure 2. Target ratings according to context cue valence (negative vs. positive) and contextual variability (low, control, high). Error bars represent standard errors.

Although the results were as expected, some aspects should be addressed critically. These mainly concern the use of a categorization task to manipulate target category variability, the reliance on the quantity of exemplars as the variability manipulation, and the possibility that the mere greater prior exposure to unfamiliar objects in the high variability condition interfered with our results. The categorization task may have cued participants to form two categories, domestic and nondomestic objects, which may have become relevant in the evaluation task. Although the creation of a category of unknown objects was intended, its contrast with domestic objects was irrelevant to our goals and was therefore kept constant across the two variability conditions. Using the number of exemplars as a source of variability is not problematic in itself, given that sample size is a known source of variability. However, it can restrict our conclusions, forcing a different theoretical explanation for our results if the effect does not generalize to other sources of variability. That is why we changed this manipulation in Experiment 2. By doing so, we also countered the criticism that mere exposure effects (e.g., Bornstein, 1989; Zajonc, 1968) could be influencing our results. If that were the case, however, exposure would have produced a main effect of experimental conditions (which did not occur) rather than the observed interaction. In Experiment 2, variability was thus manipulated through the typicality of the exemplars activated in memory.
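The contrast decomposition used in the Experiment 1 results can be sketched on per-participant context-effect scores (rating after a positive cue minus rating after a negative cue). The data below are simulated for illustration only; just the contrast weights and the group sizes follow the design described in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated per-participant context-effect scores, one array per
# condition. Values are illustrative, not the study's data.
order = ("low", "control", "high")
effect = {
    "low":     rng.normal(0.8, 1.0, 36),
    "control": rng.normal(0.9, 1.0, 35),
    "high":    rng.normal(1.6, 1.0, 36),
}

# Contrast (a): high variability vs. the pooled low/control conditions.
# Contrast (b): low variability vs. control. The weight vectors are
# orthogonal, so together they partition the 2-df condition effect.
w_a = np.array([-1.0, -1.0, 2.0])
w_b = np.array([1.0, -1.0, 0.0])
assert w_a @ w_b == 0  # orthogonality check

means = np.array([effect[c].mean() for c in order])
ns = np.array([len(effect[c]) for c in order])

# Pooled error variance (df = N - k, i.e., 104 in this design) and the
# usual contrast t statistic.
pooled_var = sum((n - 1) * effect[c].var(ddof=1)
                 for c, n in zip(order, ns)) / (ns.sum() - len(order))
t_a = (w_a @ means) / np.sqrt(pooled_var * np.sum(w_a**2 / ns))
print("contrast (a): estimate = %.2f, t = %.2f" % (w_a @ means, t_a))
```

The same machinery evaluates contrast (b) by substituting `w_b`.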



Experiment 2

Participants and Design

Participants were 56 university students (87.5% women; Mage = 22.23, SD = 5.78) who volunteered for the study in return for partial course credit; two additional participants reported that they did not understand the first task and were excluded from the sample before data analysis. Participants were randomly distributed across the conditions of the following design: 2 (contextual variability: low; high) × 2 (context cue valence: negative; positive) × 2 (target category: snakes; spiders). The inclusion of two target categories (i.e., snakes and spiders) served only a replication purpose: the specific target category had no main effect on evaluations and did not interact with prime valence or variability, all Fs < 1, and it was therefore dropped from all analyses. Only context cue valence was manipulated within participants.

Materials

A set of 60 pretested images of snakes or spiders was printed in color on small cards (5.5 cm × 8 cm). All images had previously been evaluated for typicality and valence using 9-point scales. All images were negative but differed in typicality. Based on the typicality ratings, two sets of 18 images were created. The low variability set included only exemplars previously evaluated as typical of their category (e.g., house spider). The high variability set included six typical and 12 less typical exemplars (e.g., tarantula). As intended, the low variability set contained exemplars pretested as more typical (M = 4.75, SD = 2.30) than the high variability set (M = 4.21, SD = 2.10), t(29) = 2.93, p < .001, d = 1.09. As expected, all stimuli were evaluated as negative, and although the low variability set was rated as somewhat more negative (Mlow = 2.44, SD = 1.77) than the high variability set (Mhigh = 2.63, SD = 2.12), the difference in likeability ratings was not significant, t(29) = 1.33, p = .193, d = 0.49. For the evaluation task, the target stimuli were a different set of images depicting snakes or spiders (Mtypicality = 4.81, SD = 2.36) presented in grayscale. The prime images supporting the context cue manipulation were the same as in Experiment 1.

Procedure

Participants arrived at the laboratory in groups of 3–6 people and performed the experiment in individual


booths, where instructions were provided on a computer screen. First, participants completed an "image perception and categorization" task: they were handed a set of 18 cards (displaying either snakes or spiders) and asked to sort the cards according to their own liking/preference into piles, which could number from 1 (i.e., all images equally liked or disliked) to 18 (i.e., each image evaluated differently). Instructions specified that all images in a given cluster should be similar in preference and should be placed inside the same envelope. Afterward, participants wrote a number on each envelope, ranking the piles by preference (e.g., envelope number 1 held the most disliked cluster). The maximum duration of the task was 7 min (a warning sound after 5 min signaled that the task should be completed quickly). Participants then proceeded to the evaluative task. Instructions and procedure for this task were identical to Experiment 1, except that the targets were either 20 images of snakes or 20 images of spiders, presented randomly. At the end of the session, participants were thanked and debriefed.

Results

Clustering Task

To check the contextual variability manipulation, we computed the "probability of differentiation" index (Linville, Salovey, & Fischer, 1986) for each condition. This index of perceived variability takes into account both the number of piles participants used in the first task and the number of exemplars they placed in each pile. As expected, the probability of differentiation was (marginally) higher in the high variability condition (M = 0.50, SD = 0.29) than in the low variability condition (M = 0.36, SD = 0.32), t(54) = 1.92, p = .060, d = 0.52.
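Linville et al.'s index is Pd = 1 − Σᵢ pᵢ², where pᵢ is the proportion of exemplars a participant placed in pile i; it equals the probability that two exemplars drawn at random (with replacement) fall into different piles. A minimal sketch (the function name is ours):

```python
def probability_of_differentiation(pile_sizes):
    """Linville et al.'s (1986) Pd index: the probability that two
    randomly drawn exemplars were sorted into different piles,
    Pd = 1 - sum(p_i**2), with p_i the proportion of items in pile i."""
    total = sum(pile_sizes)
    return 1 - sum((n / total) ** 2 for n in pile_sizes)

# One pile of 18 cards -> no differentiation at all.
print(probability_of_differentiation([18]))      # 0.0
# 18 singleton piles -> maximal differentiation (17/18 ~ 0.944).
print(probability_of_differentiation([1] * 18))
```

More piles with more evenly spread exemplars yield higher Pd, matching the index's role as a perceived-variability check.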

Evaluation Task

To test our moderation hypothesis, target evaluations were analyzed in a mixed ANOVA (2 variability × 2 context cue valence) with context cue valence as a repeated factor. As expected, valence affected target evaluation in an assimilation pattern: targets preceded by positive cues (M = 2.85, SE = 0.15) were rated more positively than targets preceded by negative cues (M = 1.83, SE = 0.11), F(1, 54) = 49.72, MSE = 28.92, p < .001, ηp² = 0.479. Supporting our main hypothesis, and replicating



Figure 3. Target ratings according to context cue valence (negative vs. positive) and contextual variability (low vs. high). Error bars represent standard errors.

Experiment 1, this effect was qualified by the contextual variability manipulation, as shown by a significant interaction between the two factors, F(1, 54) = 10.87, MSE = 6.33, p = .002, ηp² = 0.168. As Figure 3 shows, the effect was stronger in the high variability condition (although the contrast in the low variability condition was also significant, t(54) = 2.70, p = .009, d = 0.73). Thus, by increasing the variability of the exemplars accessible to participants, we increased their sensitivity to contextual influences. In contrast to Experiment 1, the main effect of the variability manipulation was nonsignificant, F(1, 54) = 0.171, MSE = 0.24, p = .681, ηp² = 0.003, meaning that different experiences of variability associated with the target category did not lead to different evaluations of the exemplars.

Discussion

In two experiments, we examined whether (and how) contextual variability modulates the impact that a context cue's affective tone exerts on the evaluation of exemplars. The assumption was that contextual variability of a category gives access to a more heterogeneous set of exemplars, which should facilitate the occurrence of context effects. The contextual variability of exemplars was manipulated prior to the evaluation task by exposing participants either to few versus many exemplars of an unfamiliar category (Experiment 1) or to more versus less typical exemplars of a familiar category (Experiment 2). Overall, the results suggest that, although the contextual activation of an affective tone affected the evaluation of category exemplars in both cases, this assimilative effect increased under higher contextual variability. These results add to the literature showing the importance of exemplar variability for classification and learning, with evidence that exemplar variability


also modulates context effects. This is assumed to occur because an increase in the variability of exemplars currently representing the category in working memory increases the likelihood of a contextual match, which, in turn, facilitates the occurrence of context effects. Our findings also add to previous demonstrations that increased sensitivity of evaluation to context effects is observed when the information set available in memory is larger (e.g., higher knowledge; Herr, 1989). According to our results, it is likely that such moderation occurred due to the greater heterogeneity of exemplars available to those participants. Hence, what supports this increased sensitivity to context effects is the contextual variability of the exemplars that can be activated in memory. If this is the case, all sources of contextual variability (not only prior knowledge) are likely to have similar effects. This is a research path worth exploring in the future, including the question of which levels of variability favor or mitigate this increase in sensitivity to context effects. Underlying cue context effects on evaluation is a mechanism that increases the accessibility of category exemplars congruent with the target valence (Bassili & Brown, 2005; Conrey & Smith, 2007; Rydell & Gawronski, 2009; Schwarz & Bless, 1992; Sia et al., 1999). For instance, Schwarz and Bless's (1992) inclusion–exclusion model suggests that priming effects depend on the ease with which the prime feature (valence) is activated and incorporated into a target evaluation. Framed by these perspectives, we theorized that a more variable knowledge structure is likely to facilitate the emergence of such assimilation effects, whereas a narrow category could reduce assimilation, giving rise to contrast effects (Schwarz & Bless, 1992).
In line with this assumption, research has also shown that individuals who encounter more diverse categories perceive distances between exemplars differently (Hahn et al., 2005), which may contribute to increased assimilation. In our view, this is the mechanism underlying our results. However, there are alternative pathways through which variability could increase context sensitivity. For instance, the observed increase in sensitivity may occur because higher variability changes individuals' response criteria. That is, the mechanism may be the same one previously detected for other effects of category variability: highly variable categories are associated with looser processing, and people set a lower criterion for classifying objects into highly variable categories than into low variability categories (Cohen et al., 2001; Nosofsky & Kantner, 2006). Future studies should address this alternative explanation by testing whether increased context sensitivity is directly related to the use of a lower classification criterion.



Also, our hypothesized explanatory mechanism should be contrasted with alternative processes. For instance, the mere subjective experience of variability might be sufficient to promote increased sensitivity to the context. This would be somewhat related to other phenomena such as "priming by variance" (i.e., the impact of the variance of a visual array on responses to a subsequently presented visual array with similar levels of feature variance; Michael, de Gardelle, & Summerfield, 2014) or "priming variability" (i.e., exposure to variability increases preference for evaluating diverse over nondiverse samples; Rhodes & Brickman, 2010). Another possibility is that our manipulations of variability selectively primed a more global type of processing, which is known to increase assimilation effects (Förster, Liberman, & Kuschel, 2008). Thus, although our data clearly suggest that an experience with more varied exemplars of a category increases the context sensitivity of evaluations of similar exemplars, several alternative explanations should be examined in future research. Future research should also seek to generalize the effect to other types of tasks, procedures, and materials. Although we show that the effect occurs with different manipulations of variability (one based on the number of exemplars and the other on the type of exemplars made accessible prior to the evaluation task) and with different target categories (one unfamiliar and the other familiar), evidence for the "variability effect" should also be obtained using different procedures (e.g., Fazio's affective priming task, Fazio, Sanbonmatsu, Powell, & Kardes, 1986; or Payne's affect misattribution procedure, Payne, Cheng, Govorun, & Stewart, 2005) and materials. One important point is how context was defined in this paper. Our manipulation is similar to those used in priming procedures.
This may be a special way of inducing a context, since it may engage other mechanisms, such as activating a response tendency, that are not necessarily present in all types of tasks. Future research should also aim to overcome the current studies' limitations. For example, to control for valence effects, we worked exclusively with negative stimuli, which limits the generality of the effect. However, because positive information shares more similarities, resulting in a higher density of memory representations (Koch, Alves, Krüger, & Unkelbach, 2016; Unkelbach, Fiedler, Bayer, Stegmüller, & Danner, 2008), it is possible that the effect can only be properly tested in a negative context. Future studies should address this possibility. We manipulated the variability of the activated exemplars accessible in memory by manipulating either their number or their level of perceived similarity. Although these two concepts are not independent, given that low variability


classes require a high threshold for similarity (Hampton, 2015), one was manipulated with regard to a new category and the other with regard to a well-established category. It could thus be argued that we have no guarantee that the two manipulations are good operationalizations of one, and only one, construct: category variability. Although this is an empirical question to be addressed in the future, an argument in favor of their consistency is that both increased context sensitivity, regardless of whether we manipulated the number of exemplars of a newly learned category or the perceived homogeneity of a known category. This suggests that it is the variability of the exemplars made more accessible in memory that modulates cognition. Thus, our focus has been on representations of categories that are episodically constructed from exemplars made relatively more or less accessible through a combination of long-term stored knowledge, recent experiences, and the immediate context. As such, future studies should explore whether our results relate only to the category structure resulting from these processing features or to the knowledge structure itself.

Conclusion

The main goal of the present research was to examine the impact of features of knowledge distribution, in particular variability, on contextual influences. Our findings indicate that higher variability of the currently accessible exemplars promotes those effects, constituting the first direct evidence that variability exerts such modulation. Paradoxically, these data suggest that the more we know about the diversity of the world, the less stable our evaluations are, as they become more susceptible to the influence of small cues occurring in our context.

References Armitage, C. J., & Conner, M. (2004). The effects of attitudinal ambivalence on attitude intention behavior relations. In G. Haddock & G. R. O. Maio (Eds.), Contemporary perspectives on the psychology of attitudes (pp. 121–144). Hove, UK: Psychology Press. Ashby, F. G., & Gott, R. E. (1988). Decision rules in the perception and categorization of multidimensional stimuli. Journal of Experimental Psychology. Learning, Memory, and Cognition, 14, 33–53. https://doi.org/10.1037//0278-7393.14.1.33 Barsalou, L. W. (1987). The instability of graded structure: Implications for the nature of concepts. In U. Neisser (Ed.), Concepts and conceptual development: Ecological and intellectual factors in categorization (pp. 101–140). New York, NY, US: Cambridge University Press.

Experimental Psychology (2020), 67(1), 5–13


12

Barsalou, L. W. (1989). Intraconcept similarity and its implications for interconcept similarity. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 76–121). New York, NY, US: Cambridge University Press. https://doi.org/10.1017/ cbo9780511529863.006 Barsalou, L. W. (1993). Flexibility, structure, and linguistic vagary in concepts: Manifestations of a compositional system of perceptual symbols. In A. F. Collins, S. E. Gathercole, M. A. Conway, & P. E. Morris (Eds.), Theories of memory (pp. 29–101). Hillsdale, NJ: Lawrence Erlbaum Associates. Bassili, J. N., & Brown, R. D. (2005). Implicit and explicit attitudes: Research, challenges, and theory. In The handbook of attitudes (pp. 543–574). Mahwah, NJ: Lawrence Erlbaum Associates. Bettman, J. R., & Sujan, M. (1987). Effects of framing on evaluation of comparable and noncomparable alternatives by expert and novice consumers. Journal of Consumer Research, 14, 141–154. https://doi.org/10.1086/209102 Bornstein, R. F. (1989). Exposure and affect: Overview and metaanalysis of research, 1968–1987. Psychological Bulletin, 106, 265–289. https://doi.org/10.1037/0033-2909.106.2.265 Bowerman, M., & Choi, S. (2001). Shaping meanings for language: Universal and language – specific in the acquisition of spatial semantic categories. In M. Bowerman & C. S. Levinson (Eds.), Language acquisition and conceptual development (pp. 475–511). New York, NY: Cambridge University Press. https://research.vu. nl/en/publications/shaping-meanings-for-language-universaland-language-specific-inCohen, A. L., Nosofsky, R. M., & Zaki, S. R. (2001). Category variability, exemplar similarity, and perceptual classification. Memory & Cognition, 29, 1165–1175. https://doi.org/10.3758/bf03206386 Conrey, F. R., & Smith, E. R. (2007). Attitude representation: Attitudes as patterns in a distributed, connectionist representational system. Social Cognition, 25, 718–735. https://doi.org/10. 
1521/soco.2007.25.5.718 Dasgupta, N., & Greenwald, A. G. (2001). On the malleability of automatic attitudes: Combating automatic prejudice with images of admired and disliked individuals. Journal of Personality and Social Psychology, 81, 800–814. https://doi.org/10.1037/ 0022-3514.81.5.800 Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175–191. https://doi.org/10.3758/bf03193146 Fazio, R. H., Sanbonmatsu, D. M., Powell, M. C., & Kardes, F. R. (1986). On the automatic activation of attitudes. Journal of Personality and Social Psychology, 50, 229–238. https://doi.org/ 10.1037/0022-3514.50.2.229 Förster, J., Liberman, N., & Kuschel, S. (2008). The effect of global versus local processing styles on assimilation versus contrast in social judgment. Journal of Personality and Social Psychology, 94, 579–599. https://doi.org/10.1037/0022-3514.94.4.579 Fried, L. S., & Holyoak, K. J. (1984). Induction of category distributions: A framework for classification learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 234–257. https://doi.org/10.1037/0278-7393.10.2.234 Garcia-Marques, L., & Mackie, D. M. (1999). The impact of stereotype-incongruent information on perceived group variability and stereotype change. Journal of Personality and Social Psychology, 77(5), 979–990. https://doi.org/10.1037/0022-3514.77.5.979 Garcia-Marques, L., Santos, A. S., Mackie, D. M., Hagá, S., & Palma, T. A. (2017). Cognitive malleability and the wisdom of independent aggregation. Psychological Inquiry, 28, 262–267. https:// doi.org/10.1080/1047840X.2017.1373558 Govan, C. L., & Williams, K. D. (2004). Changing the affective valence of the stimulus items influences the IAT by re-defining the

Experimental Psychology (2020), 67(1), 5–13

M. Prada & T. Garcia-Marques, The Importance of Variability

category labels. Journal of Experimental Social Psychology, 40, 357–365. https://doi.org/10.1016/j.jesp.2003.07.002 Hahn, U., Bailey, T. M., & Elvin, L. B. C. (2005). Effects of category diversity on learning, memory, and generalization. Memory & Cognition, 33, 289–302. https://doi.org/10.3758/bf03195318 Hampton, J. A. (2015). Categories, prototypes and exemplars. In N. Riemer (Ed.), The Routledge Handbook of Semantics (pp. 125–141). New York, NY, US: Routledge. https://doi.org/10. 4324/9781315685533-20 Herr, P. M. (1989). Priming price: Prior knowledge and context effects. Journal of Consumer Research, 16, 67–75. https://doi. org/10.1086/209194 Hintzman, D. L. (1986). “Schema abstraction” in a multiple-trace memory model. Psychological Review, 93, 411–428. https://doi. org/10.1037/0033-295x.93.4.411 Homa, D., & Vosburgh, R. (1976). Category breadth and the abstraction of prototypical information. Journal of Experimental Psychology: Human Learning and Memory, 2, 322–330. https:// doi.org/10.1037/0278-7393.2.3.322 Janczura, G. A., & Nelson, D. L. (1999). Concept accessibility as the determinant of typicality judgments. The American Journal of Psychology, 112, 1–19. https://doi.org/10.2307/1423622 Joy-Gaba, J. A., & Nosek, B. A. (2010). The surprisingly limited malleability of implicit racial evaluations. Social Psychology, 41, 137–146. https://doi.org/10.1027/1864-9335/a000020 Kahneman, D., & Miller, D. T. (1986). Norm theory: Comparing reality to its alternatives. Psychological Review, 93(2), 136–153. https:// doi.org/10.1037/0033-295x.93.2.136 Koch, A., Alves, H., Krüger, T., & Unkelbach, C. (2016). A general valence asymmetry in similarity: Good is more alike than bad. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42, 1171–1192. https://doi.org/10.1037/xlm0000243 Lambert, A. J., & Wyer, R. S. (1990). Stereotypes and social judgment: The effects of typicality and group heterogeneity. Journal of Personality and Social Psychology, 59, 676–691. 
https://doi. org/10.1037/0022-3514.59.4.676 Linville, P. W., Salovey, P., & Fischer, G. W. (1986). Stereotyping and perceived distributions of social characteristics: An application to ingroup-outgroup perception. In J. F. Dovidio & S. L. Gaertner (Eds.), Prejudice, discrimination, and racism (pp. 165–208). San Diego, CA: Academic Press. Lively, S. E., Logan, J. S., & Pisoni, D. B. (1993). Training Japanese listeners to identify English /r/ and /l/. II: The role of phonetic environment and talker variability in learning new perceptual categories. Journal of the Acoustical Society of America, 94, 1242–1255. https://doi.org/10.1121/1.408177 Loken, B., Joiner, C., & Peck, J. (2002). Category attitude measures: Exemplars as inputs. Journal of Consumer Psychology, 12(2), 149–161. https://doi.org/10.1207/s15327663jcp1202_07 Lord, C. G., Desforges, D. M., Ramsey, S. L., Trezza, G. R., & Lepper, M. R. (1991). Typicality effects in attitudebehavior consistency: Effects of category discrimination and category knowledge. Journal of Experimental Social Psychology, 27, 550–575. https:// doi.org/10.1016/0022-1031(91)90025-2 Lord, C. G., & Lepper, M. R. (1999). Attitude Representation Theory. Advances in Experimental Social Psychology, 31, 265–343. https://doi.org/10.1016/s0065-2601(08)60275-0 Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification learning. Psychological Review, 85, 207–238. https://doi. org/10.1037/0033-295x.85.3.207 Medin, D. L., & Shoben, E. J. (1988). Context and structure in conceptual combination. Cognitive Psychology, 20, 158–190. https://doi.org/10.1016/0010-0285(88)90018-7 Michael, E., de Gardelle, V., & Summerfield, C. (2014). Priming by the variability of visual information. Proceedings of the National

© 2020 Hogrefe Publishing


M. Prada & T. Garcia-Marques, The Importance of Variability




History
Received July 19, 2019
Revision received January 7, 2020
Accepted February 11, 2020
Published online June 9, 2020

Authorship
Both authors contributed to the study design. Testing and data collection were performed by M. Prada. Both authors performed the data analysis and interpretation and approved the final version of the manuscript for submission.

Open Data
All original materials used to conduct the research will be made available to other researchers for purposes of replicating the procedure or reproducing the results at https://osf.io/m7azk/.

Funding
This research was funded by Fundação para a Ciência e Tecnologia with a research grant UID/PSI/04810/2013 awarded to the second author.

ORCID
Marília Prada, https://orcid.org/0000-0002-6845-8881

Marília Prada
Instituto Universitário de Lisboa (ISCTE-IUL), CIS-IUL
Av. das Forças Armadas, Office AA.110
1649-026 Lisbon, Portugal
marilia_prada@iscte-iul.pt

Experimental Psychology (2020), 67(1), 5–13




Research Article

Your Face and Moves Seem Happier When I Smile: Facial Action Influences the Perception of Emotional Faces and Biological Motion Stimuli

Fernando Marmolejo-Ramos1, Aiko Murata2, Kyoshiro Sasaki3,4,5, Yuki Yamada4, Ayumi Ikeda6, José A. Hinojosa7,8,9, Katsumi Watanabe3,10, Michal Parzuchowski11, Carlos Tirado12, and Raydonal Ospina13

1 Centre for Change and Complexity in Learning, The University of South Australia, Australia
2 NTT Communication Science Laboratories, Kyoto, Japan
3 Faculty of Science and Engineering, Waseda University, Tokyo, Japan
4 Faculty of Arts and Science, Kyushu University, Fukuoka, Japan
5 Japan Society for the Promotion of Science, Tokyo, Japan
6 Graduate School of Human-Environment Studies, Kyushu University, Japan
7 Instituto Pluridisciplinar, Universidad Complutense de Madrid, Spain
8 Dpto. Psicología Experimental, Procesos Cognitivos y Logopedia, Universidad Complutense de Madrid, Spain
9 Facultad de Lenguas y Educación, Universidad de Nebrija, Madrid, Spain
10 Art & Design, University of New South Wales, Australia
11 Centre of Research on Cognition and Behaviour, SWPS University of Social Sciences and Humanities, Sopot, Poland
12 Gösta Ekman Laboratory, Department of Psychology, Stockholm University, Sweden
13 Departamento de Estatística, CAST Laboratory, Universidade Federal de Pernambuco, Brazil

Abstract. In this experiment, we replicated the effect of muscle engagement on perception such that the recognition of another’s facial expressions was biased by the observer’s facial muscular activity (Blaesi & Wilson, 2010). We extended this replication to show that such a modulatory effect is also observed for the recognition of dynamic bodily expressions. Via a multilab and within-subjects approach, we investigated the emotion recognition of point-light biological walkers, along with that of morphed face stimuli, while subjects were or were not holding a pen in their teeth. Under the “pen-in-the-teeth” condition, participants tended to lower their threshold of perception of happy expressions in facial stimuli compared to the “no-pen” condition, thus replicating the experiment by Blaesi and Wilson (2010). A similar effect was found for the biological motion stimuli such that participants lowered their threshold to perceive happy walkers in the pen-in-the-teeth condition compared to the no-pen condition. This pattern of results was also found in a second experiment in which the no-pen condition was replaced by a situation in which participants held a pen in their lips (“pen-in-lips” condition). These results suggested that facial muscular activity alters the recognition of not only facial expressions but also bodily expressions. Keywords: face, emotions, biological motion, mirror neurons, embodied cognition

Experimental Psychology (2020), 67(1), 14–22
https://doi.org/10.1027/1618-3169/a000470

The two-way relationship between action and perception has demonstrated that perception affects motor actions (e.g., Salgado-Montejo et al., 2016) and that motor actions affect perception (e.g., Bach-Y-Rita, Collins, Saunders,

White, & Scadden, 1969; Gonzalo-Fonrodona & Porras, 2013; Yonemitsu, Sung, Naka, Yamada, & Marmolejo-Ramos, 2017). Thus, a crossmodal correspondence seems to exist between perception and action. This is a central tenet of embodied cognition. It took a century for the theory of emotions of James (1890) to be supported by evidence from neuroscience regarding the bodily feedback hypothesis. The James–Lange theory of emotion suggested that emotions could be either


F. Marmolejo-Ramos et al., Emotional Faces and Biomotion

suppressed or intensified by the body's cardiovascular, visceral, or muscular feedback. In 2003, researchers tested that claim: asking subjects to imitate facial expressions triggered neural activation in limbic regions such as the amygdala (Carr, Iacoboni, Dubeau, Mazziotta, & Lenzi, 2003). In a related experiment, Hennenlotter et al. (2009) tested their cosmetic treatment patients either two weeks prior to or after their scheduled injection of botulinum toxin into the frown muscles, thus blocking those muscles' ability to contract. The procedure involved asking patients to imitate angry facial expressions. The results showed a clear pattern: blockage of the frown muscles reduced the activation of the left amygdala during the imitation. Blaesi and Wilson (2010) provided evidence in favor of the effect of motor actions on perception by covertly manipulating facial expressions and measuring the impact on the perception of emotional faces. Their Experiment 1 showed that when participants were covertly set to smile, their threshold to recognize a face as portraying happiness was lowered. The researchers induced a covert smile by having participants hold a pen in their teeth (see Figure 1). The covert facial expression set by this act has been shown to benefit the processing of emotionally matching valenced stimuli (e.g., Buck, 1980; Davis, Winkielman, & Coulson, 2017; Niedenthal, 2007; Strack, Martin, & Stepper, 1988;

Figure 1. Illustration of the way a pen is held between the teeth in order to induce a covert smile: (A) a frontal view and (B) a profile view (©Daniela Álvarez, 2020).



see also Marmolejo-Ramos & Dunn, 2013). Studies of the effects of facial manipulation by means other than the "pen-in-the-teeth" condition have shown that doing so influences the processing of emotional stimuli (e.g., Rhodewalt & Comer, 1979; Wood, Lupyan, Sherrin, & Niedenthal, 2016; Wood, Rychlowska, Korb, & Niedenthal, 2016; Havas, Glenberg, Gutowski, Lucarelli, & Davidson, 2010; Parzuchowski & Szymków-Sudziarska, 2008). Emotional information can also be extracted from sources other than the face. In communicative contexts, people use language buttressed with paralinguistic (e.g., pitch and prosody) and kinesic cues (e.g., body postures and gestures; see Cevasco & Marmolejo-Ramos, 2013; Holler & Levinson, 2019; Parzuchowski & Wojciszke, 2014). Each cue is a source of emotional information (e.g., see Adolphs, 2002, for evidence as to how emotional information can be extracted from prosody). Research in biological motion (Johansson, 1973) has shown that emotional information can indeed be extracted from kinesics (e.g., Clarke, Bradshaw, Field, Hampson, & Rose, 2005; Ikeda & Watanabe, 2009). Recent evidence has further shown that covertly adopting a negatively laden walking style (i.e., the walking style of a depressed person) leads to recalling more negative than positive words. The recall pattern reversed when participants adopted a positively laden walking style (i.e., the style of a happy person; Michalak, Rohde, & Troje, 2015). In a nutshell, perceptual and motor systems are particularly intertwined during the processing of emotionally valenced stimuli (see Holstege, 1992, for the coupling between emotions and motor systems, and Satpute et al., 2015, for a meta-analysis of evidence favoring a coupling between emotions and sensory and perceptual systems).
Neuroscientific evidence further suggests that facial expressions, e.g., an involuntary smile, engage not only premotor and face motor areas but also brain areas involved in social cognition (Schilbach, Eickhoff, Mojzisch, & Vogeley, 2008). The activation of premotor areas has been linked, in turn, with the activation of mirror neurons (see Molenberghs, Cunnington, & Mattingley, 2009). These are multimodal association neurons that match action perception and action execution (see Gallese, 2009; Keysers, 2009; Wilson & Knoblich, 2005) and are indeed needed in social cognitive processes (see Spaulding, 2013). Given the participation of emotions in social processes, it follows that emotion processing consists of integrating multimodal perceptual and motor components such that one component can activate another co-occurring component in order to predict and comprehend emotional states (see Wood, Rychlowska, et al., 2016). A likely multimodal brain area for the integration of perceptual and motor information related to emotional stimuli is the superior temporal sulcus (STS).



The STS participates in the processing of biological motion (Grossman, Battelli, & Pascual-Leone, 2005) and facial expressions (Tseng et al., 2015). Thus, it can be entertained that the engagement of a specific motor system related to a specific emotion could affect the perception of emotionally valenced stimuli that mirror the specific engaged motor system as well as other motor systems that resonate with that emotion. Specifically, eliciting a covert smile could cause the online processing of stimuli (e.g., ambiguous faces or body movements) to be perceived in a more positive manner. Much of the embodiment literature focuses on offline effects, i.e., how the bodily cues can change the recall of stored representations (Niedenthal, Barsalou, Winkielman, Krauth-Gruber, & Ric, 2005). The design of our experiments allows the tracing of online processing. That is, we seek to track the ongoing change of perception of the presented stimuli. Therefore, the goal of this experiment is to show the effects of action on perception by replicating the findings of Blaesi and Wilson (2010) and to extend their results to the case of biological motion stimuli. Relatedly, a recent multi-lab replication of the facial feedback hypothesis (Wagenmakers et al., 2016) did not replicate the phenomenon that covertly manipulating facial action affects the perceived funniness of cartoons (Strack et al., 1988). However, the experimental social psychology community has debated whether duplicating this well-known experiment was the best test to replicate the theory that underpins the facial feedback hypothesis (Strack, 2016, 2017). Indeed, a new meta-analysis (Coles, Larsen, & Lench, 2019) of 286 effect sizes (from 138 studies) found weak support for the claim that facial feedback does influence emotional experience. 
We believe that the experiments reported below test the facial feedback theory directly as the stimuli we used were more relevant for guiding the participants’ behavior and more engaging to assess their valence. After all, judgments of cartoon funniness are less relevant for future behavior than assessing the valence of an ambivalent face or a silhouette of a person walking toward you.

Experiment 1

Methods

Participants
Volunteers were university students from the area around Waseda University in Tokyo and psychology undergraduate students from the SWPS University of Social Sciences and Humanities (Sopot) and Stockholm


University (Stockholm; see Table 1). The ethics committee boards of each university approved the experiment. Participants gave written informed consent in accordance with the principles of the Declaration of Helsinki (WMA, 2013).

Materials
In the facial expression evaluation task, the stimuli proposed by Blaesi and Wilson (2010) were used. They consisted of 11 pictures of the same face morphed on a continuum ranging from frowning to smiling. In the bodily expression evaluation task, 11 video clips of point-light biological walkers, ranging on a continuum from sad walking to happy walking, were presented. These video clips were prepared using the BML publisher software (Troje, 2002, 2008). The experiment was implemented in PsychoPy (Peirce, 2007). The participants' facial muscular activity was manipulated by requesting that they hold a pen horizontally with their teeth. Disposable wooden chopsticks were used by the Japanese and Polish participants, and disposable wooden pencils were used by the Swedish participants. These items were considered more convenient and hygienic substitutes for the traditional ballpoint pen. Also, wooden chopsticks and pencils have smooth porous surfaces that prevent them from slipping from the mouth. However, we refer to these items as pens to follow the jargon of the field.

Procedure
Upon arrival at the laboratory, each participant was ushered into a room and seated in front of a computer. The pen-in-the-teeth and "no-pen" conditions were compared in a within-subjects fashion. In the pen-in-the-teeth condition, participants were instructed to hold a pen horizontally with their teeth, without touching the pen with their lips (see Figure 1). In the no-pen condition, participants did not hold a pen in their mouth. Participants were told that the aim of the experiment was to test their ability to multitask rather than to test the effect of making a smile.
The experiment consisted of two large blocks: facial expression stimuli (shown for 750 ms each) were presented in one block, and biological motion videos (shown for 950 ms each) were presented in the other. Within each block, there were seven sub-blocks of 22 randomly shown stimuli; i.e., each stimulus was presented twice, and the instructions on the screen asked participants to alternate between the pen-in-the-teeth and no-pen conditions for each trial. Thus, each participant underwent 308 trials (22 stimuli × 7 sub-blocks × 2 large blocks; 154 trials in the facial expression task and 154 trials in the biological motion task). The responses sad and happy were mapped onto the leftward (←) and rightward (→) arrow navigation keys; the upward navigation key (↑) was used to call up the trials. The order of the large blocks and the key allocation




Table 1. Demographics of the participants in Experiment 1

Laboratory   Males: n (Mdn ± MAD) [range]   Females: n (Mdn ± MAD) [range]   Total n   Age range   Age Mdn ± MAD
Japan        22 (20.5 ± 1.23) [18–23]       18 (20 ± 1.72) [18–28]           40        18–28       20 ± 1.45
Poland       19 (23 ± 2.11) [19–36]         21 (21 ± 2.57) [19–33]           40        19–36       23 ± 2.55
Sweden       14 (26 ± 2.07) [21–31]         26 (26 ± 2.81) [19–37]           40        19–37       26 ± 2.55
Total        55                             65                               120       18–37       22 ± 3.08

Note. MAD = median absolute deviation; Mdn = median. Age statistics are given in years.

Figure 2. Illustration of the experimental sequence in each task.

were counterbalanced; thus, participants were randomly allocated to four counterbalanced conditions. The timeline of each trial is presented in Figure 2. Each trial started with a screenshot instructing participants to hold or not hold the pen (pen for the pen-in-the-teeth condition and free for the no-pen condition). After this instruction, a fixation cross (+) was presented for 1,000 ms. Immediately after, and depending on the experimental block, the emotional expression stimulus or the bodily expression stimulus was presented. Participants were asked to evaluate as quickly and accurately as possible the emotion of each stimulus as sad or happy. The next trial started after a 2,000 ms intertrial interval.
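The trial structure described above (2 large blocks × 7 sub-blocks × 22 stimuli, with the pen instruction alternating on every trial) can be sketched as follows. This is an illustrative reconstruction in Python, not the authors' PsychoPy code; all names are hypothetical.

```python
import random

def build_trials(seed=0):
    """Sketch of the reported design: 2 large blocks (faces, walkers),
    7 sub-blocks each, 11 morph levels shown twice per sub-block,
    with the pen instruction alternating on every trial."""
    rng = random.Random(seed)
    trials = []
    for block in ("faces", "walkers"):
        for sub_block in range(7):
            stimuli = list(range(11)) * 2       # each morph level appears twice
            rng.shuffle(stimuli)                # random order within the sub-block
            for i, level in enumerate(stimuli):
                condition = "pen-in-teeth" if i % 2 == 0 else "no-pen"
                trials.append({"block": block, "sub_block": sub_block,
                               "level": level,  # 0 = fully sad ... 10 = fully happy
                               "condition": condition})
    return trials

trials = build_trials()
assert len(trials) == 308  # 22 stimuli x 7 sub-blocks x 2 blocks, as reported
```

Block order and response-key allocation were counterbalanced across participants in the actual experiment; that layer is omitted here for brevity.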

Design and Statistical Analyses
We estimated each participant's point of subjective equality (PSE) for perceiving the expression as happy in each of the two conditions (pen-in-the-teeth vs. no-pen) in each task. The PSE estimates were obtained by using a psychometric function, in a similar vein to the experiment by Blaesi and Wilson (2010). This function was attained by a logit model analysis (models were fitted using a binomial distribution and logit link function in R) with stimulus number 0 (fully sad) to 10 (fully happy) as the independent variable. The estimated PSEs were defined as the stimulus value yielding 50% happy responses and were used as the dependent variable in the analyses. PSEs originating from participants who wrongly mapped the response keys, as well as PSEs with values outside the 0–10 range, were excluded from the analyses. To verify whether a distinct pattern of responses occurred for each condition and whether these occurrences were influenced by gender and/or laboratory, we conducted a 2ws × 2bs × 3bs (ws = within-subjects; bs = between-subjects; condition × gender × laboratory) mixed ANOVA on the PSEs of each task (R code, stimuli, and data sets for the current and supplementary analyses can be found at https://figshare.com/projects/emotional_faces_and_biological_motion_study/71441).

Results
Data from four Swedish participants were excluded from our analysis because their response functions were opposite to what would be expected had they understood the mapping between their recognition and key responses (i.e., either they did not follow the instructions correctly, or they wrongly mapped the response keys). Figure 3 shows the distribution of each participant's PSE estimates as a function of gender and laboratory in the facial expression evaluation and in the bodily expression evaluation tasks. The ANOVA indicated that only the factor condition was significant in both tasks (facial




Figure 3. Distribution of PSE thresholds as functions of gender and laboratory in the facial and bodily expression evaluation tasks. The solid horizontal lines represent mean values. The gray-dotted line represents the grand mean.

expression task: F(1, 110) = 19.63, p = 2.22 × 10⁻⁵, η² = 0.016; bodily expression task: F(1, 109) = 19.29, p = 2.6 × 10⁻⁵, η² = 0.017). These ANOVAs were carried out using the aov function in R with the following formula: aov(PSE ~ laboratory * gender * condition + Error(participant.ID), data = xx), where xx stands for the facial expression or the biological motion dataset. The effect sizes were obtained via the function anova_stats in the sjstats R package. Note that the same results are obtained by using a linear mixed model, as there is only one PSE per participant per condition (i.e., lmer(PSE ~ laboratory * gender * condition + (1 | participant.ID), data = xx) via the lmerTest R package). That is, participants' threshold to label a face as happy was lowered in the pen-in-the-teeth condition (facial expression task: Mpen-in-the-teeth = 4.91 ± 0.69, Mno-pen = 5.08 ± 0.66; bodily expression task: Mpen-in-the-teeth = 4.58 ± 1.09, Mno-pen = 4.86 ± 1.05). In other words, when participants were holding a pen in their teeth, they tended to label the observed stimuli as happy more frequently than when no pen was held. While the current results refer to effects that are specific to one emotional expression, namely, happiness, it could be argued that the experimental manipulations do not allow us to determine whether the effect was caused by the induction of a specific (smiling) facial expression or by a more general effect related to the contraction versus relaxation of facial muscles. Hence, we conducted a second experiment in which the no-pen condition was substituted by a condition in which participants held a pen in their lips, i.e., the "pen-in-lips" condition.
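The PSE estimation used above can be sketched as follows. The authors fitted binomial logit models in R; the snippet below is an illustrative, self-contained re-implementation (Newton–Raphson on a two-parameter logistic) run on synthetic data, not the authors' code. Under a logit model, the PSE — the stimulus level yielding 50% "happy" responses — equals −b0/b1.

```python
import math

def fit_logistic(levels, n_happy, n_total, iters=25):
    """Fit P(happy) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson;
    analogous in spirit to a binomial glm with a logit link in R."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k, n in zip(levels, n_happy, n_total):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            r = k - n * p                 # residual: observed - expected "happy" counts
            w = n * p * (1.0 - p)         # binomial weight for the Hessian
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det  # Newton step: H^-1 times gradient
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Synthetic observer whose true PSE is at morph level 5 (14 trials per level)
levels = list(range(11))                  # 0 = fully sad ... 10 = fully happy
n_total = [14] * 11
n_happy = [round(14 / (1 + math.exp(-1.2 * (x - 5)))) for x in levels]

b0, b1 = fit_logistic(levels, n_happy, n_total)
pse = -b0 / b1                            # stimulus level where P(happy) = .5
```

A lowered PSE in the pen-in-the-teeth condition corresponds to calling more of the morph continuum "happy"; as in the paper, PSEs falling outside the 0–10 stimulus range would be excluded.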

Experiment 2 Method Participants Volunteers were university students from Kyushu University (Japan), SWPS University of Social Sciences and Humanities (Poland), and Universidad Complutense de Madrid (Spain, see Table 2). We strived for sampling new participants from the same countries listed in Experiment 1, but it was not possible to secure participants from Sweden; hence, we sampled from another language/ culture (i.e., Spain). The ethics committee boards from each university approved the experiment. Participants gave written informed consent to abide by the principles of the Declaration of Helsinki (WMA, 2013). Materials Same materials as those in Experiment 1. Procedure Same procedure as that in Experiment 1 with the only difference that the no-pen condition was replaced by condition in which participants held a pen in their lips (i.e., pen-in-lips condition; see Figure 4). Design and Statistical Analyses Same design and statistical analyses as those reported in Experiment 1. The only difference was that PSEs were estimated for the pen-in-the-teeth (same condition © 2020 Hogrefe Publishing


F. Marmolejo-Ramos et al., Emotional Faces and Biomotion

19

Table 2. Demographics of the participants in Experiment 2 Gender (Mdnage ± MAD) [rangeage] Laboratory Japan

Males 22 (19 ± 1.48) [18–23]

Age

Females

Total

Range

Mdn ± MAD

24 (19 ± 1.48) [18–23]

46

18–23

19 ± 1.48

Poland

3 (24 ± 7.41) [19–50]

42 (24.5 ± 8.15) [18–50]

45

18–50

24 ± 7.41

Spain

8 (28 ± 8.89) [18–36]

37 (20 ± 2.96) [18–33]

45

18–36

21 ± 4.44

Total Total (gender)

136 33

103

Total age range

18–50

Total Mdnage ± MAD

20.5 ± 3.70

Note. MAD = median absolute deviation; Mdn = median.

10 4, η2 = 0.013; bodily expression task: F(1, 122) = 14.92, p = 1.81 × 10 4, η2 = 0.017). That is, as in the previous experiment, participants’ threshold to label a face as happy lowered when they were in the pen-in-the-teeth condition (facial expression task: Mpen-in-the-teeth = 4.91 ± 0.73, Mpen-in-lips =5.08±0.75;bodilyexpressiontask:Mpen-in-the-teeth = 4.65 ± 1.30, Mpen-in-lips = 4.99 ± 1.29). In other words, when participants were holding a pen in their teeth, they tended to label the observed stimuli as happy more frequently than when holding a pen in their lips (see Figure 5).

Discussion

Figure 4. Illustration of the way a pen is held between the lips in order to prevent smile: (A) a frontal view and (B) a profile view (©Daniela Álvarez, 2020).

featured in Experiment 1) and pen-in-lips (in lieu of the nopen condition) conditions.

Results A 2ws × 2bs × 3bs (ws = within-subjects; bs = betweensubjects; condition × gender × laboratory) mixed ANOVA on the PSEs of the pen-in-lips and pen-in-the-teeth conditions showed that only the factor condition was significant (facial expression task: F(1, 126) = 14.40, p = 2.28 × © 2020 Hogrefe Publishing

The first multilaboratory experiment replicated the findings that a covert smile lowers the threshold to recognize faces as happy, while not sustaining a smile does not lead to such an effect. The experiment further showed that a covert smile lowers the threshold to recognize a biological walking motion as more positive than when no smile is sustained. Furthermore, a second multilaboratory experiment corroborated these findings when the no-pen condition was replaced by a condition in which participants were prevented from smiling, i.e., pen-in-lips condition. These results provide further evidence in favor of the bidirectional link between perception and action. We used the dynamic biological motions as the stimuli of bodily expressions, whereas the stimuli of facial expressions were still photographs. In the context of facial mimicry research, it has been shown that dynamic facial expressions elicit greater facial mimicry compared to static presentations of faces (Sato & Yoshikawa, 2007). In an experiment of biological motion, Atkinson, Tunstall, and Dittrich (2007) noted the importance of kinematic movement for emotional recognition of bodily expression by showing that emotion classification accuracy was impaired when the movie clips of bodily expressions were inverted or played backward. Furthermore, an experiment Experimental Psychology (2020), 67(1), 14–22


20

F. Marmolejo-Ramos et al., Emotional Faces and Biomotion

Figure 5. Distribution of PSE thresholds as functions of gender and laboratory in the facial and bodily expression evaluation tasks. The solid horizontal lines represent mean values. The gray-dotted line represents the grand mean.

using functional magnetic resonance imaging has indicated that some neural circuits are specifically activated when watching dynamic bodily movements that express angry states (e.g., the hypothalamus, the ventromedial prefrontal cortex, the temporal pole, and the premotor cortex; Pichon, de Gelder, & Grèzes, 2008). Future studies could investigate differences between the recognition of emotion from dynamic and static stimuli in order to assess the modulation effect observed in the present experiment. Indeed, a more robust examination of the role of facial mimicry in emotion understanding would require testing these hypotheses with individuals with Moebius syndrome, who suffer from facial paralysis that prevents them from forming facial expressions (De Stefani, Nicolini, Belluardo, & Ferrari, 2019). A recent large-scale, multilaboratory replication experiment showed that a covert smile has no effect on the processing of emotionally valenced stimuli. Specifically, Wagenmakers et al. (2016) found that a covert smile elicited by holding a pen in the teeth did not lead cartoons to be rated as funnier than when a covert frown was elicited by holding a pen in the lips. This result thus did not replicate the original experiment of Strack et al. (1988). Although that replication experiment and ours seem to follow a similar logic and use similar manipulations, they do differ. For example, unlike our first experiment, the procedure in the 1988 experiment required holding a pen between the lips (a control condition that prevents a smile) or between the teeth (to induce a smile). We addressed this issue, though, in a second experiment.

Importantly, there was a large discrepancy between the study of Wagenmakers et al. and our studies in the dependent variable (and hence the underlying process). In the 1988 procedure, as well as in the replication attempt by Wagenmakers et al. (2016), the stimuli were supposedly already mildly positive. Thus, what was hypothesized was that the facial feedback from the unobtrusive smile create[d] a shift in funniness ratings that corresponded with the positivity bias. In our procedure, however, we showed the effectiveness of the manipulation in disambiguating ambivalent stimuli. That is, the bodily cues of unobtrusive smiles were only taken into account when deciding whether an emotionally ambivalent stimulus was positive or not. Furthermore, Strack (2016) argued that the replicators tested the funniness of cartoons from the 1980s that most likely were not understood in the same way 30 years later. However, Wagenmakers et al. (2016) used different comic strips and ensured that they were as moderately funny as the original strips. Our results add evidence to this debate: The stimuli do not need to be positive in the first place. On the contrary, we argue that the sensorimotor interference should be observed especially for stimuli that are ambiguous. As this kind of interference might influence early stages of perceptual processing (Price, Dieckman, & Harmon-Jones, 2012), it should give an additional clue about the valence of the perceived stimuli and should disambiguate the neutral from the mildly amusing stimuli. In our facial expression categorization task (original files from Emmorey & McCullough, 2009),



the presented face was neither smiling nor sad but was digitally transformed into an ambivalent morph. This is where we observed the strongest PSE shift due to facial feedback.

Conclusion

The current experiments suggest that facial motor activation engages motor systems that are not confined to the face but extend to other body parts that resonate with the implied emotional state. That is, the emotion implied by the covert facial expression seems to engage a wide range of motor systems that, taken together, are representative of the ongoing emotional state.

References

Adolphs, R. (2002). Neural systems for recognizing emotion. Current Opinion in Neurobiology, 12(2), 169–177. https://doi.org/10.1016/s0959-4388(02)00301-x
Atkinson, A. P., Tunstall, M. L., & Dittrich, W. H. (2007). Evidence for distinct contributions of form and motion information to the recognition of emotions from body gestures. Cognition, 104(1), 59–72. https://doi.org/10.1016/j.cognition.2006.05.005
Bach-y-Rita, P., Collins, C. C., Saunders, F. A., White, B., & Scadden, L. (1969). Vision substitution by tactile image projection. Nature, 221, 963–964. https://doi.org/10.1038/221963a0
Blaesi, S., & Wilson, M. (2010). The mirror reflects both ways: Action influences the perception of others. Brain and Cognition, 72, 306–309. https://doi.org/10.1016/j.bandc.2009.10.001
Buck, R. (1980). Nonverbal behavior and the theory of emotion: The facial feedback hypothesis. Journal of Personality and Social Psychology, 38(5), 811–824. https://doi.org/10.1037/0022-3514.38.5.811
Carr, L., Iacoboni, M., Dubeau, M. C., Mazziotta, J. C., & Lenzi, G. L. (2003). Neural mechanisms of empathy in humans: A relay from neural systems for imitation to limbic areas. Proceedings of the National Academy of Sciences of the United States of America, 100, 5497–5502. https://doi.org/10.1073/pnas.0935845100
Cevasco, J., & Marmolejo-Ramos, F. (2013). The importance of studying the role of prosody in the comprehension of spontaneous spoken discourse. Revista Latinoamericana de Psicología, 45(1), 21–33.
Clarke, T. J., Bradshaw, M. F., Field, D. T., Hampson, S. E., & Rose, D. (2005). The perception of emotion from body movement in point-light displays of interpersonal dialogue. Perception, 34, 1171–1180. https://doi.org/10.1068/p5203
Coles, N. A., Larsen, J. T., & Lench, H. C. (2019). A meta-analysis of the facial feedback literature: Effects of facial feedback on emotional experience are small and variable. Psychological Bulletin, 145(6), 610–651. https://doi.org/10.1037/bul0000194
Davis, J. D., Winkielman, P., & Coulson, S. (2017). Sensorimotor simulation and emotion processing: Impairing facial action increases semantic retrieval demands. Cognitive, Affective, & Behavioral Neuroscience, 17(3), 652–664. https://doi.org/10.3758/s13415-017-0503-2

© 2020 Hogrefe Publishing


De Stefani, E., Nicolini, Y., Belluardo, M., & Ferrari, P. F. (2019). Congenital facial palsy and emotion processing: The case of Moebius syndrome. Genes, Brain, and Behavior, 18(1), e12548. https://doi.org/10.1111/gbb.12548
Emmorey, K., & McCullough, S. (2009). The bimodal bilingual brain: Effects of sign language experience. Brain and Language, 109(2–3), 124–132. https://doi.org/10.1016/j.bandl.2008.03.005
Gallese, V. (2009). Mirror neurons, embodied simulation, and the neural basis of social identification. Psychoanalytic Dialogues, 19(5), 519–536. https://doi.org/10.1080/10481880903231910
Gonzalo-Fonrodona, I., & Porras, M. A. (2013). Scaling effects in crossmodal improvement of visual perception by motor system stimulus. Neurocomputing, 114, 76–79. https://doi.org/10.1016/j.neucom.2012.06.047
Grossman, E. D., Battelli, L., & Pascual-Leone, A. (2005). Repetitive TMS over STS disrupts perception of biological motion. Vision Research, 45(22), 2847–2853. https://doi.org/10.1016/j.visres.2005.05.027
Havas, D. A., Glenberg, A. M., Gutowski, K. A., Lucarelli, M. J., & Davidson, R. J. (2010). Cosmetic use of botulinum toxin-A affects processing of emotional language. Psychological Science, 21(7), 895–900. https://doi.org/10.1177/0956797610374742
Hennenlotter, A., Dresel, C., Castrop, F., Ceballos-Baumann, A. O., Wohlschläger, A. M., & Haslinger, B. (2009). The link between facial feedback and neural activity within central circuitries of emotion – New insights from botulinum toxin-induced denervation of frown muscles. Cerebral Cortex, 19, 537–542. https://doi.org/10.1093/cercor/bhn104
Holler, J., & Levinson, S. C. (2019). Multimodal language processing in human communication. Trends in Cognitive Sciences, 23(8), 639–652. https://doi.org/10.1016/j.tics.2019.05.006
Holstege, G. (1992). The emotional motor system. European Journal of Morphology, 30(1), 67–79. PMID: 1642954
Ikeda, H., & Watanabe, K. (2009). Anger and happiness are linked differently to the explicit detection of biological motion. Perception, 38, 1002–1011. https://doi.org/10.1068/p6250
James, W. (1890). The principles of psychology. New York, NY: Holt.
Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception & Psychophysics, 14(2), 201–211. https://doi.org/10.3758/BF03212378
Keysers, C. (2009). Mirror neurons. Current Biology, 19(21), R971–R973. https://doi.org/10.1016/j.cub.2009.08.026
Marmolejo-Ramos, F., & Dunn, J. (2013). On the activation of sensorimotor systems during the processing of emotionally laden stimuli. Universitas Psychologica, 12(5), 1511–1542. https://doi.org/10.11144/Javeriana.UPSY12-5.assp
Michalak, J., Rohde, K., & Troje, N. F. (2015). How we walk affects what we remember: Gait modifications through biofeedback change negative affective memory bias. Journal of Behavior Therapy and Experimental Psychiatry, 46, 121–125. https://doi.org/10.1016/j.jbtep.2014.09.004
Molenberghs, P., Cunnington, R., & Mattingley, J. B. (2009). Is the mirror neuron system involved in imitation? A short review and meta-analysis. Neuroscience and Biobehavioral Reviews, 33, 975–980. https://doi.org/10.1016/j.neubiorev.2009.03.010
Niedenthal, P. M. (2007). Embodying emotion. Science, 316, 1002–1005. https://doi.org/10.1126/science.1136930
Niedenthal, P. M., Barsalou, L. W., Winkielman, P., Krauth-Gruber, S., & Ric, F. (2005). Embodiment in attitudes, social perception, and emotion. Personality and Social Psychology Review, 9, 184–211. https://doi.org/10.1207/s15327957pspr0903_1
Parzuchowski, M., & Szymków-Sudziarska, A. (2008). Well, slap my thigh: Expression of surprise facilitates memory of surprising material. Emotion, 8(3), 430–434. https://doi.org/10.1037/1528-3542.8.3.430

Experimental Psychology (2020), 67(1), 14–22



Parzuchowski, M., & Wojciszke, B. (2014). Hand over heart primes moral judgments and behavior. Journal of Nonverbal Behavior, 38, 145–165. https://doi.org/10.1007/s10919-013-0170-0
Peirce, J. W. (2007). PsychoPy – Psychophysics software in Python. Journal of Neuroscience Methods, 162(1–2), 8–13. https://doi.org/10.1016/j.jneumeth.2006.11.017
Pichon, S., de Gelder, B., & Grèzes, J. (2008). Emotional modulation of visual and motor areas by dynamic body expressions of anger. Social Neuroscience, 3(3–4), 199–212. https://doi.org/10.1080/17470910701394368
Price, T. F., Dieckman, L. W., & Harmon-Jones, E. (2012). Embodying approach motivation: Body posture influences startle eyeblink and event-related potential responses to appetitive stimuli. Biological Psychology, 90(3), 211–217. https://doi.org/10.1016/j.biopsycho.2012.04.001
Rhodewalt, F., & Comer, R. (1979). Induced-compliance attitude change: Once more with feeling. Journal of Experimental Social Psychology, 15(1), 35–47. https://doi.org/10.1016/0022-1031(79)90016-7
Salgado-Montejo, A., Marmolejo-Ramos, F., Alvarado, J. A., Arboleda, J. C., Suarez, D. R., & Spence, C. (2016). Drawing sounds: Representing tones and chords spatially. Experimental Brain Research, 234(12), 3509–3522. https://doi.org/10.1007/s00221-016-4747-9
Sato, W., & Yoshikawa, S. (2007). Spontaneous facial mimicry in response to dynamic facial expressions. Cognition, 104(1), 1–18. https://doi.org/10.1016/j.cognition.2006.05.001
Satpute, A. B., Kang, J., Bickart, K. C., Yardley, H., Wager, T. D., & Barrett, L. F. (2015). Involvement of sensory regions in affective experience: A meta-analysis. Frontiers in Psychology, 6, 1860. https://doi.org/10.3389/fpsyg.2015.01860
Schilbach, L., Eickhoff, S. B., Mojzisch, A., & Vogeley, K. (2008). What's in a smile? Neural correlates of facial embodiment during social interaction. Social Neuroscience, 3(1), 37–50. https://doi.org/10.1080/17470910701563228
Spaulding, S. (2013). Mirror neurons and social cognition. Mind & Language, 28(2), 233–257. https://doi.org/10.1111/mila.12017
Strack, F. (2016). Reflection on the smiling registered replication report. Perspectives on Psychological Science, 11(6), 929–930. https://doi.org/10.1177/1745691616674460
Strack, F. (2017). From data to truth in psychological science. A personal perspective. Frontiers in Psychology, 8, 702. https://doi.org/10.3389/fpsyg.2017.00702
Strack, F., Martin, L. L., & Stepper, S. (1988). Inhibiting and facilitating conditions of the human smile: A nonobtrusive test of the facial feedback hypothesis. Journal of Personality and Social Psychology, 54(5), 768–777. https://doi.org/10.1037/0022-3514.54.5.768
Troje, N. F. (2002). Decomposing biological motion: A framework for analysis and synthesis of human gait patterns. Journal of Vision, 2(5), 371–387. https://doi.org/10.1167/2.5.2
Troje, N. F. (2008). Retrieving information from human movement patterns. In T. F. Shipley & J. M. Zacks (Eds.), Understanding events: From perception to action (pp. 308–334). New York, NY: Oxford University Press.
Tseng, L.-Y., Tseng, P., Liang, W.-K., Hung, D. L., Tzeng, O. J. L., Muggleton, N. G., & Juan, C.-H. (2015). The role of superior temporal sulcus in the control of irrelevant emotional face processing: A transcranial direct current stimulation study. Neuropsychologia, 64, 124–133. https://doi.org/10.1016/j.neuropsychologia.2014.09.015



Wagenmakers, E.-J., Beek, T., Dijkhoff, L., Gronau, Q. F., Acosta, A., Adams, R. B. Jr., …, Zwaan, R. A. (2016). Registered replication report: Strack, Martin, & Stepper (1988). Perspectives on Psychological Science, 11(6), 917–928. https://doi.org/10.1177/1745691616674458
Wilson, M., & Knoblich, G. (2005). The case for motor involvement in perceiving conspecifics. Psychological Bulletin, 131(3), 460–473. https://doi.org/10.1037/0033-2909.131.3.460
Wood, A., Lupyan, G., Sherrin, S., & Niedenthal, P. (2016). Altering sensorimotor feedback disrupts visual discrimination of facial expressions. Psychonomic Bulletin & Review, 23(4), 1150–1156. https://doi.org/10.3758/s13423-015-0974-5
Wood, A., Rychlowska, M., Korb, S., & Niedenthal, P. (2016). Fashioning the face: Sensorimotor simulation contributes to facial expression recognition. Trends in Cognitive Sciences, 20(3), 227–240. https://doi.org/10.1016/j.tics.2015.12.010
World Medical Association (2013). Ethical principles for medical research involving human subjects. Journal of the American Medical Association, 310(20), 2191–2194. https://doi.org/10.1001/jama.2013.281053
Yonemitsu, F., Sung, Y., Naka, K., Yamada, Y., & Marmolejo-Ramos, F. (2017). Does weight lifting improve visual acuity? A replication of Gonzalo-Fonrodona and Porras (2013). BMC Research Notes, 10, 362. https://doi.org/10.1186/s13104-017-2699-1

History
Received May 9, 2017
Revision received January 31, 2020
Accepted February 3, 2020
Published online May 11, 2020

Acknowledgments
Fernando Marmolejo-Ramos thanks Rosie Gronthos and Susan Brunner for proofreading earlier versions of this manuscript and Daniela Álvarez (Alvarezlleras.daniela@gmail.com) for the illustrations. Michal Parzuchowski thanks Kamil Tomaszewski for his assistance with the data collection.

Open Data
R codes, stimuli, and data sets for the current and supplementary analyses can be found at https://figshare.com/projects/emotional_faces_and_biological_motion_study/71441.
Funding
Michal Parzuchowski was funded by Narodowe Centrum Nauki, Poland (Grant NCN 2012/04/A/HS6/0581). Kyoshiro Sasaki was funded by the Japan Society for the Promotion of Science (JSPS KAKENHI Nos. 17J05236 and 19K14482). Yuki Yamada was funded by the Japan Society for the Promotion of Science (JSPS KAKENHI Nos. 15H05709, 17H00875, 18H04199, and 18K12015). José A. Hinojosa was funded by the Ministerio de Ciencia, Innovación y Universidades of Spain (PGC2018-098558-B-I00) and by the Comunidad de Madrid (H2019/HUM-5705).

Fernando Marmolejo-Ramos
Centre for Change and Complexity in Learning
The University of South Australia
160 Currie Street, David Pank Bldg. DP1-02
Adelaide, SA 5000, Australia
fernando.marmolejo-ramos@unisa.edu.au



Short Research Article

Why People With High Alexithymia Make More Utilitarian Judgments: The Role of Empathic Concern and Deontological Inclinations

Xiangyi Zhang1,2, Zhihui Wu1, Shenglan Li1, Ji Lai1, Meng Han1, Xiyou Chen3, Chang Liu4, and Daoqun Ding1,2

1 Department of Psychology, School of Education Science, Hunan Normal University, Changsha, PR China
2 Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, PR China
3 Changsha Experimental High School, Changsha, PR China
4 Department of Criminal Justice, Ningxia Police Vocational College, Ningxia, PR China

Abstract. Although recent studies have investigated the effect of alexithymia on moral judgments, such an effect remains elusive. Furthermore, moral judgments have been conflated with the moral inclinations underlying those judgments in previous studies. Using a process dissociation approach to independently quantify the strength of utilitarian and deontological inclinations, the present study investigated the effect of alexithymia on moral judgments. We found that deontological inclinations were significantly lower in the high alexithymia group than in the low alexithymia group, whereas the difference in the utilitarian inclinations between the two groups was nonsignificant. Furthermore, empathic concern and deontological inclinations mediated the association between alexithymia and conventional relative judgments (i.e., more utilitarian judgments over deontological judgments), showing that people with high alexithymia have low empathic concern, which, in turn, decreases deontological inclinations and contributes to conventional relative judgments. These findings underscore the importance of empathy and deontological inclinations in moral judgments and indicate that individuals with high alexithymia make more utilitarian judgments over deontological judgments possibly due to a deficit in affective processing.

Keywords: alexithymia, moral judgment, utilitarian inclinations, deontological inclinations, empathic concern

Experimental Psychology (2020), 67(1), 23–30
https://doi.org/10.1027/1618-3169/a000474

Moral judgment is defined as an evaluation of the actions and character of oneself or others (Avramova & Inbar, 2013). The dual-process theory suggests that automatic emotional responses to causing harm motivate characteristically deontological judgments (e.g., killing one to save five is unacceptable), whereas deliberative cost–benefit reasoning motivates characteristically utilitarian judgments (e.g., killing one to save five is acceptable; Greene, 2007, 2013). Recently, considerable evidence has supported the claim that affective processes largely promote deontological responses, though some also increase utilitarian responses (e.g., Białek & De Neys, 2017; Reynolds & Conway, 2018); similarly, cognitive deliberation largely promotes utilitarian responses, though it can also increase

deontological responses (e.g., Byrd & Conway, 2019; Gawronski, Armstrong, Conway, Friesdorf, & Hütter, 2017). A great deal of research has found that individuals who make deontological responses in moral dilemmas tend to score high on measures of affective processing, such as empathic concern (Conway, Goldstein-Greenwood, Polacek, & Greene, 2018) and personal distress (Gleichgerrcht, Tomashitis, & Sinay, 2015). Conversely, clinical patients who exhibit deficits in affective processing, such as those with frontal traumatic brain injuries (Martins, Faisca, Esteves, Muresan, & Reis, 2012) or lesions in the ventromedial prefrontal cortex (Koenigs et al., 2007), tend to make utilitarian responses in moral dilemmas. Moreover, nonclinical individuals with high alexithymia also tend to make utilitarian judgments in moral dilemmas (Gleichgerrcht et al., 2015; Patil & Silani, 2014a, 2014b). Alexithymia is a multidimensional personality trait characterized by difficulty in experiencing and expressing emotions (Taylor, Bagby, & Parker, 1999; Xu,



Opmeer, Van Tol, Goerlich, & Aleman, 2018). Individuals with high alexithymia are aware that they experience an emotion, but they cannot determine whether the emotion is sadness, fright, or anger (Patil & Silani, 2014a; Taylor et al., 1999). They often score low on measures of empathy, such as empathic concern and perspective taking (Gleichgerrcht et al., 2015; Patil & Silani, 2014b). Several researchers have recently begun to investigate the effect of alexithymia on moral judgments. For instance, Patil and Silani (2014a) required participants to complete moral acceptability ratings on personal dilemmas (e.g., pushing one person to death to save five people) and impersonal dilemmas (e.g., hitting a switch that will divert toxic fumes from a room with five patients to a room with only one patient). They found that a high alexithymia score is associated with enhanced utilitarian responses to emotionally aversive personal moral dilemmas and that people with high alexithymia have reduced empathic concern, which in turn leads to a high endorsement of utilitarian responses to personal moral dilemmas. Using a similar moral dilemma task, Patil et al. (2016) found that increased alexithymia scores lead to decreased empathic concern for others' welfare, which in turn contributes to more utilitarian responses. Moreover, some researchers have shown that moral acceptability judgments are predicted by high levels of alexithymia in healthy individuals, but not in patients with autism (Brewer et al., 2015) or multiple sclerosis (Gleichgerrcht et al., 2015). However, Cecchetto et al. (2018) found that alexithymia is characterized by diminished physiological activation (i.e., skin conductance) during moral judgments but normal self-report (i.e., valence and arousal) ratings. They argued that alexithymia shapes emotional reactions to moral judgments but does not influence individuals' moral judgments.
Although several studies have examined the effect of alexithymia on moral judgments, the relationship between alexithymia and moral judgments remains elusive. Furthermore, little is known about the underlying mechanism of such relationships. More importantly, moral judgments have been conflated with the moral inclinations underlying those judgments in previous studies (e.g., Gleichgerrcht et al., 2015; Patil & Silani, 2014a). Identifying overt judgments with their underpinning inclinations (i.e., identifying utilitarian judgments with utilitarian inclinations and deontological judgments with deontological inclinations) entails an inverse relationship between the two kinds of inclinations, in which stronger utilitarian inclinations signify weaker deontological inclinations and vice versa (Conway & Gawronski, 2013). However, some researchers have argued that deontological and utilitarian inclinations are separate constructs (Conway et al., 2018). Moreover, utilitarian inclinations correlate positively with conventional relative judgments, whereas deontological

X. Zhang et al., The Effect of Alexithymia on Moral Judgments

inclinations correlate negatively (Reynolds & Conway, 2018). Individuals make relatively utilitarian judgments due to either weak deontological inclinations or strong utilitarian inclinations (Conway & Gawronski, 2013). Therefore, using a process dissociation approach that can independently quantify the relative strength of deontological and utilitarian inclinations (Byrd & Conway, 2019; Conway & Gawronski, 2013), the present study aims to investigate the effect of alexithymia on moral judgments and the role of the two inclinations and empathic concern in this effect. Deontological inclinations involve relatively more affective responses to harmful actions, whereas utilitarian inclinations involve relatively more deliberative reasoning focused on costs and benefits (Conway & Gawronski, 2013). Some studies have shown that suppressing emotional expression and blunted emotional reactions diminish deontological inclinations while leaving utilitarian inclinations unaffected (Hayakawa, Tannenbaum, Costa, Corey, & Keysar, 2017; Lee & Gino, 2015). Individuals with high alexithymia have a deficit in emotional expression and experience, which leads to reduced emotional responses in moral dilemmas (Cecchetto et al., 2018; Xu et al., 2018). Therefore, we hypothesize that deontological inclinations are lower for individuals with high alexithymia than for those with low alexithymia, whereas utilitarian inclinations do not vary between individuals with high and low alexithymia.

Methods

Participants

In the first phase of this research, 310 undergraduate students (234 females; age range 16–31 years, Mage = 19.72 years, SDage = 2.48) were recruited to complete the Toronto Alexithymia Scale-20 (TAS-20; Bagby, Taylor, & Parker, 1994). The TAS-20 comprises 20 items, each coded from 1 (strongly disagree) to 5 (strongly agree). Cronbach's alpha was 0.84 in this study. A power analysis conducted in G*Power (Version 3.1.9.2; Faul, Erdfelder, Lang, & Buchner, 2007) indicated that the minimum required total sample size was N = 82 to achieve sufficient power (1 − β = 0.90) with a medium effect size of f = 0.25. On the basis of their scores on the TAS-20, the individuals who scored in the top 15% of the distribution (n = 47) were classified as the high alexithymia group, and those who scored in the bottom 15% of the distribution (n = 47) were classified as the low alexithymia group. In the second phase of this research, the participants from these two groups were invited to participate in the



subsequent experiment. Because two high alexithymia individuals declined to participate, the remaining 45 high alexithymia participants (38 females; Mage = 18.58 years, SDage = 1.69) and 47 low alexithymia participants (37 females; Mage = 19.15 years, SDage = 1.32) took part. The TAS-20 scores of the two groups are provided in the Electronic Supplementary Material, ESM 1 (Table E1). The two groups did not differ in age, t(90) = 1.82, p = .073, Cohen's d = 0.38, or gender, χ2(1, N = 92) = 0.50, p = .48. This study was approved by the Research Ethics Committee of Hunan Normal University. Written informed consent was obtained from all participants involved in this study.


Experimental Task and Procedure

Participants responded to 10 incongruent and 10 congruent moral dilemmas (see Conway & Gawronski, 2013; available at osf.io/nm7hy; the Chinese version of the moral dilemmas is available at https://doi.org/10.17605/osf.io/wjev8). In incongruent dilemmas, deontological and utilitarian inclinations drive different responses. For example, in the incongruent dilemma of a crying baby, participants were asked whether smothering a crying baby to death to save oneself and the other townspeople from being killed is acceptable. In this case, deontological inclinations would lead people to reject smothering the crying baby because the action violates deontological rules, whereas utilitarian inclinations would lead people to accept the action because it maximizes net welfare. In congruent dilemmas, by contrast, deontological and utilitarian inclinations drive the same response. For example, in the congruent dilemma of a crying baby, participants were asked whether smothering a crying baby to death to save oneself and the other townspeople from being captured is acceptable. In this case, both inclinations would lead people to reject smothering the baby because dealing harm leads to worse outcomes overall. In other words, utilitarian inclinations lead people to accept harm in incongruent dilemmas and reject harm in congruent dilemmas because the action can maximize net welfare, whereas deontological inclinations lead people to reject harm in both types of dilemma because the action violates deontological rules.

We employed a process dissociation approach to independently estimate the contributions of deontological and utilitarian inclinations to moral judgments (for more details, see Conway & Gawronski, 2013). The utilitarian (U) parameter was calculated by subtracting the proportion of "unacceptable" responses in incongruent dilemmas from the proportion of "unacceptable" responses in congruent dilemmas:

U = p(unacceptable | congruent) − p(unacceptable | incongruent).  (1)

A high U parameter reflects that participants tend to accept harm when it maximizes good outcomes (i.e., incongruent dilemmas) but reject harm when it fails to maximize good outcomes (i.e., congruent dilemmas). Scores on U range from −1 to 1. The deontological (D) parameter was calculated by dividing the proportion of "unacceptable" responses in incongruent dilemmas by the proportion of nonutilitarian responses:

D = p(unacceptable | incongruent) / (1 − U).  (2)

A high D parameter reflects that participants tend to reject causing harm even when doing so would maximize overall outcomes. Scores on D range from 0 to 1.

The experiment consisted of one block of 20 moral dilemma trials presented in a pseudorandom order; no more than three consecutive trials came from the same (incongruent or congruent) condition (see Conway & Gawronski, 2013; available at osf.io/nm7hy). At the beginning of each trial, a fixation cross was presented in the center of the screen for a duration varying from 600 ms to 1,000 ms. Subsequently, a moral dilemma and a question about the acceptability of the relevant action were presented simultaneously on one screen. Participants indicated whether the described action was acceptable or unacceptable by pressing the number key 1 (yes, this is acceptable) or the number key 0 (no, this is unacceptable). The moral dilemmas and questions remained on the screen until the participants made a choice. After the response, each trial ended with a blank screen presented for 800 ms to 1,200 ms (see Figure 1). After the experimental task, participants completed the empathic concern subscale of the Interpersonal Reactivity Index (Davis, 1983) to evaluate feelings of concern for the suffering of others (e.g., I am often quite touched by things that I see happen). This subscale consists of seven items, each coded from 1 (does not describe me well) to 5 (describes me very well). Cronbach's alpha was 0.65 in this study. Alexithymia is thought to be associated with impairment in affective empathy more than in cognitive empathy (Oakley, Brewer, Bird, & Catmur, 2016; Takamatsu & Takai, 2017).
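The U and D computations from Equations 1 and 2 can be sketched in a few lines of code. The proportions below are hypothetical values for a single illustrative participant, not data from the study:

```python
def pd_parameters(p_unacc_congruent, p_unacc_incongruent):
    """Process dissociation parameters (Conway & Gawronski, 2013).

    U = p(unacceptable | congruent) - p(unacceptable | incongruent)
    D = p(unacceptable | incongruent) / (1 - U)
    """
    u = p_unacc_congruent - p_unacc_incongruent
    d = p_unacc_incongruent / (1.0 - u)
    return u, d

# Hypothetical participant: rejects harm in 80% of congruent dilemmas
# but in only 40% of incongruent (outcome-maximizing) dilemmas.
u, d = pd_parameters(0.80, 0.40)
print(round(u, 2), round(d, 2))  # 0.4 0.67
```

A participant who rejects harm equally often in both dilemma types would have U = 0, with D then reducing to the overall rejection rate.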
Furthermore, only empathic concern can significantly predict moral judgment in sacrificial dilemmas, although empathic concern and personal distress are both components of affective empathy




Figure 1. Timeline of each trial.

(Gleichgerrcht & Young, 2013; Patil & Silani, 2014a, 2014b). Therefore, we focused only on the role of empathic concern in the effect of alexithymia on moral judgments in the present study.

Results

Overall, harmful actions were judged to be more acceptable in incongruent dilemmas (M = 59%, SD = 16%) than in congruent dilemmas (M = 29%, SD = 16%), t(90) = 11.21, p < .001, Cohen's d = 1.24.

Conventional Analysis

First, we computed the number of times that participants accepted causing outcome-maximizing harm in incongruent dilemmas as a measure of conventional relative judgments: Higher scores reflect relatively more utilitarian judgments, whereas lower scores reflect relatively more deontological judgments. The analysis revealed that participants in the high alexithymia group (M = 0.60, SD = 0.15) showed a stronger preference for utilitarian judgments over deontological judgments than those in the low alexithymia group (M = 0.53, SD = 0.17), t(90) = 2.05, p = .043, Cohen's d = 0.43.

Process Dissociation Analysis

Before the process dissociation analysis, we conducted preliminary analyses in which we computed the correlations between all unstandardized variables (see Table 1). Consistent with past work (Conway et al., 2018), conventional relative judgments correlated positively with the U parameter, r = .652, p < .001, and negatively with the D parameter, r = −.646, p < .001. However, the

Table 1. Descriptive statistics and correlations between all variables

Variables               M       SD      1          2        3          4
1. Relative judgments   0.56    0.16
2. U parameter          0.27    0.22    .652***
3. D parameter          0.61    0.17    −.646***   .131
4. Alexithymia          52.17   12.27   .215*      −.061    −.360***
5. Empathic concern     3.66    0.49    −.222*     .063     .356**     −.42***

Note. N = 92. D parameter = deontological parameter; U parameter = utilitarian parameter. *p < .05; **p < .01; ***p < .001.

U and D parameters were statistically independent from each other, r = .131, p = .212. In addition, alexithymia correlated negatively with the D parameter, r = −.360, p < .001, and had a weak negative correlation with the U parameter, r = −.061, p = .566. Moreover, these two correlations were statistically different, z = 2.11, p = .035. These findings suggest that each parameter taps an independent process and that the two processes jointly contribute to conventional relative judgments.

Next, the two parameter scores were standardized, and a 2 (group: high alexithymia vs. low alexithymia) × 2 (standardized process dissociation parameter: deontological vs. utilitarian) mixed-model ANOVA was conducted with group as a between-subjects factor and standardized process dissociation parameter as a within-subjects factor. A significant interaction between group and process dissociation parameter was found, F(1, 90) = 4.41, p = .039, ηp² = .05 (see Figure 2). Post hoc analyses indicated that deontological inclinations (i.e., the deontological parameter) were significantly lower in the high alexithymia group (M = −0.35, SD = 1.01) than in the low alexithymia group (M = 0.33, SD = 0.87), F(1, 90) = 11.92, p = .001, ηp² = .12. However, the difference in utilitarian inclinations (i.e., the utilitarian parameter) between the high alexithymia group (M = −0.06, SD = 0.86) and the low alexithymia group (M = 0.06, SD = 1.12) was nonsignificant, F(1, 90) = 0.30, p = .585, ηp² = .003.
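The group × parameter interaction in a 2 × 2 mixed design like this one is equivalent to an independent-samples t-test on each participant's within-person D − U difference score, with F = t². A hedged sketch, using simulated scores rather than the study's data, follows:

```python
import numpy as np

# Hedged sketch (not the authors' code): testing the group x parameter
# interaction via difference scores. Group means loosely mimic those
# reported in the text; all scores are simulated.
rng = np.random.default_rng(42)
d_high, u_high = rng.normal(-0.35, 1.0, 46), rng.normal(-0.06, 1.0, 46)
d_low,  u_low  = rng.normal( 0.33, 0.9, 46), rng.normal( 0.06, 1.1, 46)

def pooled_t(a, b):
    # Student's t with pooled variance (equal-variance assumption)
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))

t_int = pooled_t(d_high - u_high, d_low - u_low)  # interaction as a t-test
F_int = t_int ** 2                                # equals the interaction F
```

This equivalence is a convenient sanity check; a full mixed-model ANOVA additionally yields the main effects.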



Figure 2. Mean standardized utilitarian (U) and deontological (D) process dissociation parameters in the high and low alexithymia groups. Error bars indicate standard errors.

Mediation Analysis

Finally, we examined whether empathic concern and deontological inclinations mediated the relationship between alexithymia and conventional relative judgments. A bootstrap procedure with 5,000 resamples was adopted to examine the significance of the indirect effects. The results revealed that the total effect of alexithymia on relative judgments was significant, β = 0.22, SE = 0.101, 95% CI = [0.013, 0.405]. This coefficient was nonsignificant (i.e., the direct effect) when the mediators were included in the analysis, β = −0.02, SE = 0.09, 95% CI = [−0.188, 0.165]. The indirect effect of alexithymia on relative judgments via deontological inclinations was significant, β = 0.167, SE = 0.082, 95% CI = [0.028, 0.355]. Alexithymia had a significant negative effect on deontological inclinations (β = −0.26, p = .018), which in turn had a highly significant negative effect on relative judgments (β = −0.65, p = .001). The indirect effect of alexithymia on relative judgments via empathic concern and deontological inclinations was also significant, β = 0.068, SE = 0.032, 95% CI = [0.017, 0.149].


Specifically, alexithymia had a highly significant negative effect on empathic concern (β = −0.42, p < .001), and empathic concern had a significant positive effect on deontological inclinations (β = 0.25, p = .03), which in turn had a highly significant negative effect on relative judgments (β = −0.65, p = .001; see Figure 3). Note that a very similar pattern of results was observed when including the U parameter as a covariate. Full details are provided in the Electronic Supplementary Material (ESM 1, Text E1).

In addition, we examined whether empathic concern and utilitarian inclinations mediated the relationship between alexithymia and conventional relative judgments. The results revealed that the indirect effect of alexithymia on relative judgments via empathic concern was significant, β = 0.08, SE = 0.039, 95% CI = [0.014, 0.166]. Alexithymia had a significant negative effect on empathic concern (β = −0.42, p < .001), which in turn had a significant negative effect on relative judgments (β = −0.19, p = .031; see Figure 4). However, the indirect effect of alexithymia on relative judgments via utilitarian inclinations was nonsignificant, β = −0.028, SE = 0.072, 95% CI = [−0.175, 0.105]. Moreover, the indirect effect via empathic concern and utilitarian inclinations was nonsignificant, β = −0.013, SE = 0.036, 95% CI = [−0.088, 0.051]. When the D parameter was included as a covariate, the indirect effect of alexithymia on relative judgments through empathic concern became nonsignificant, β = 0.004, SE = 0.011, 95% CI = [−0.016, 0.025]. Moreover, the indirect effect through the U parameter remained nonsignificant, β = −0.031, SE = 0.077, 95% CI = [−0.180, 0.125], as did the indirect effect through empathic concern and the U parameter, β = −0.014, SE = 0.039, 95% CI = [−0.094, 0.059].
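A percentile-bootstrap test of an indirect effect of the kind reported above can be sketched as follows. The data are simulated, the model here has a single mediator (the paper's model included two), and the variable names only echo the constructs in the text:

```python
import numpy as np

# Hedged sketch (not the authors' code): percentile bootstrap of a simple
# indirect effect X -> M -> Y with 5,000 resamples.
rng = np.random.default_rng(1)
n = 92
x = rng.normal(size=n)                       # e.g., alexithymia (standardized)
m = -0.4 * x + rng.normal(size=n)            # e.g., empathic concern
y = -0.3 * m + 0.0 * x + rng.normal(size=n)  # e.g., relative judgments

def slope(pred, out):
    # simple OLS slope of out on pred
    return np.cov(pred, out, ddof=1)[0, 1] / np.var(pred, ddof=1)

def indirect(idx):
    a = slope(x[idx], m[idx])  # a path: X -> M
    # b path: partial slope of Y on M, controlling for X
    X = np.column_stack([np.ones(len(idx)), m[idx], x[idx]])
    b = np.linalg.lstsq(X, y[idx], rcond=None)[0][1]
    return a * b

boot = np.array([indirect(rng.integers(0, n, n)) for _ in range(5000)])
lo, hi = np.percentile(boot, [2.5, 97.5])  # 95% percentile CI
significant = not (lo <= 0 <= hi)          # CI excludes zero
```

The indirect effect is deemed significant when the bootstrap confidence interval excludes zero, which matches how the CIs in the text are interpreted.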

Figure 3. Results of the mediation analysis of empathic concern and deontological inclinations. The standardized coefficients when including empathic concern and deontological inclinations as mediators in the model are presented above the arrows. *p < .05; **p < .01; ***p < .001.

Figure 4. Results of the mediation analysis of empathic concern and utilitarian inclinations. The standardized coefficients when including empathic concern and utilitarian inclinations as mediators in the model are presented above the arrows. *p < .05; **p < .01; ***p < .001.

Discussion

Our conventional analysis revealed that participants in the high alexithymia group showed a stronger preference for utilitarian judgments over deontological judgments than those in the low alexithymia group. Note that conventional analysis entails treating utilitarian judgments as the pure inverse of deontological judgments (i.e., more utilitarian judgments imply fewer deontological judgments, and vice versa) and thus cannot distinguish increased concern about maximizing good outcomes (i.e., high U) from reduced concern about causing harm (i.e., low D; Conway et al., 2018). However, our process dissociation analysis provided a more nuanced pattern of results, showing that deontological inclinations were significantly lower in the high alexithymia group than in the low alexithymia group, whereas the difference in utilitarian inclinations between the two groups was nonsignificant. Furthermore, the mediation analysis revealed that empathic concern and deontological inclinations mediated the association between alexithymia and conventional relative judgments. Our findings indicate that individuals with high alexithymia favor utilitarian over deontological judgments because of a deficit in emotional processing.

The finding that participants in the high alexithymia group preferred utilitarian over deontological judgments is in line with previous studies showing that high alexithymia predicts the moral acceptability of utilitarian judgments and that individuals with high alexithymia make more utilitarian judgments due to diminished empathic concern for victims (Patil & Silani, 2014a, 2014b). Furthermore, our study revealed that deontological inclinations were significantly lower in the high alexithymia group than in the low alexithymia group, whereas the difference in utilitarian inclinations between the two groups was nonsignificant. Previous studies failed to independently quantify the strength of utilitarian and deontological inclinations within individuals with high alexithymia.
Thus, it has been difficult to determine whether the emotional deficits of individuals with high alexithymia affect their moral judgment by perturbing one or both of these inclinations (Gleichgerrcht et al., 2015; Patil & Silani, 2014a, 2014b). Using a process dissociation approach, we found a significant discrepancy in deontological inclinations between the high and low alexithymia groups. This finding indicates that high levels of alexithymia could affect individual moral judgments by perturbing deontological inclinations (i.e., reducing the affective response to causing harm).

Several recent studies provide converging evidence that supports our findings. For example, Cecchetto et al. (2018) found that alexithymia is characterized by diminished physiological activation (i.e., skin conductance) during moral judgments and argued that alexithymia shapes emotional reactions to moral judgments. In addition, Xu et al. (2018) showed that alexithymia is associated with smaller gray matter volumes in core brain areas of affective processing (e.g., the amygdala, insula, and ventral striatum). They argued that smaller volumes in these areas may result in deficiencies in properly identifying and expressing emotions in individuals with high alexithymia.

The present study also revealed that individuals with high alexithymia have low empathic concern, which in turn decreases deontological inclinations and contributes to a preference for utilitarian over deontological judgments. Given that individuals with high alexithymia often have difficulty identifying and recognizing emotions and a reduced ability to reflect on their internal states (Taylor et al., 1999; Xu et al., 2018), it is unsurprising that they have relatively low empathic concern for the victim (Patil & Silani, 2014a).
Moreover, empathic concern correlates positively with deontological but not utilitarian inclinations (Byrd & Conway, 2019; Conway & Gawronski, 2013; Reynolds & Conway, 2018), and deontological inclinations correlate negatively with relative judgments because they reflect the degree to which individuals consistently avoid causing harm, regardless of whether doing so maximizes good outcomes (Conway et al., 2018). Therefore, the finding that individuals with high alexithymia made more utilitarian judgments than those with low alexithymia reflects a relative absence of concern about causing harm (i.e., low deontological inclinations) rather than heightened concern for maximizing good outcomes (i.e., high utilitarian inclinations).

Our findings also have potential implications for understanding the social relationships of individuals with high alexithymia. These individuals often have difficulty identifying emotions and empathizing with others (Xu et al., 2018), which may lead to maladaptive responses to others' emotions and increase the tendency to cause distress to others during social interactions. Moral action is critical for developing and maintaining social relationships (Brewer et al., 2015). Our findings imply that increased alexithymia may lead to atypical moral judgments, which may compound the social difficulties experienced by individuals with high alexithymia.

The current study leaves several open questions for future research. First, it relies solely on behavioral data to explore the effect of alexithymia on moral judgments. Future research could investigate the neural mechanisms underlying the differences in moral judgments between individuals with high and low alexithymia. Second, the number of female participants was larger than the number of male participants, so we could not determine whether gender plays a role in moral judgments for individuals with high alexithymia. Future studies could investigate the effect of gender on moral judgment in this population. Finally, the participants in this study were undergraduate students; whether our findings generalize to other samples should be examined in future studies.
Despite these limitations, our findings provide initial evidence for why people with high alexithymia prefer utilitarian over deontological judgments: Individuals with high alexithymia have low empathic concern, which in turn decreases deontological inclinations and contributes to a preference for utilitarian over deontological judgments. Our findings shed new light on the moral judgment of individuals with high alexithymia.

Electronic Supplementary Material

The electronic supplementary material is available with the online version of the article at https://doi.org/10.1027/1618-3169/a000474

ESM 1. Additional table and additional analysis.

© 2020 Hogrefe Publishing


References

Avramova, Y. R., & Inbar, Y. (2013). Emotion and moral judgement. Wiley Interdisciplinary Reviews: Cognitive Science, 4(2), 169–178. https://doi.org/10.1002/wcs.1216

Bagby, R. M., Taylor, G. J., & Parker, J. D. (1994). The twenty-item Toronto Alexithymia Scale—II. Convergent, discriminant, and concurrent validity. Journal of Psychosomatic Research, 38(1), 33–40. https://doi.org/10.1016/0022-3999(94)90006-x

Białek, M., & De Neys, W. (2017). Dual processes and moral conflict: Evidence for deontological reasoners' intuitive utilitarian sensitivity. Judgment and Decision Making, 12(2), 148–167.

Brewer, R., Marsh, A. A., Catmur, C., Cardinale, E. M., Stoycos, S., Cook, R., & Bird, G. (2015). The impact of autism spectrum disorder and alexithymia on judgments of moral acceptability. Journal of Abnormal Psychology, 124(3), 589–595. https://doi.org/10.1037/abn0000076

Byrd, N., & Conway, P. (2019). Not all who ponder count costs: Arithmetic reflection predicts utilitarian tendencies, but logical reflection predicts both deontological and utilitarian tendencies. Cognition, 192, 103995. https://doi.org/10.1016/j.cognition.2019.06.007

Cecchetto, C., Korb, S., Rumiati, R. I., & Aiello, M. (2018). Emotional reactions in moral decision-making are influenced by empathy and alexithymia. Social Neuroscience, 13(2), 226–240. https://doi.org/10.1080/17470919.2017.1288656

Conway, P., & Gawronski, B. (2013). Deontological and utilitarian inclinations in moral decision making: A process dissociation approach. Journal of Personality and Social Psychology, 104(2), 216–235. https://doi.org/10.1037/a0031021

Conway, P., Goldstein-Greenwood, J., Polacek, D., & Greene, J. D. (2018). Sacrificial utilitarian judgments do reflect concern for the greater good: Clarification via process dissociation and the judgments of philosophers. Cognition, 179, 241–265. https://doi.org/10.1016/j.cognition.2018.04.018

Davis, M. H. (1983). Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology, 44(1), 113–126. https://doi.org/10.1037/0022-3514.44.1.113

Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191. https://doi.org/10.3758/bf03193146

Gawronski, B., Armstrong, J., Conway, P., Friesdorf, R., & Hütter, M. (2017). Consequences, norms, and generalized inaction in moral dilemmas: The CNI model of moral decision-making. Journal of Personality and Social Psychology, 113(3), 343–376. https://doi.org/10.1037/pspa0000086

Gleichgerrcht, E., Tomashitis, B., & Sinay, V. (2015). The relationship between alexithymia, empathy and moral judgment in patients with multiple sclerosis. European Journal of Neurology, 22(9), 1295–1303. https://doi.org/10.1111/ene.12745

Gleichgerrcht, E., & Young, L. (2013). Low levels of empathic concern predict utilitarian moral judgment. PLoS One, 8(4), e60418. https://doi.org/10.1371/journal.pone.0060418

Greene, J. D. (2007). Why are VMPFC patients more utilitarian? A dual-process theory of moral judgment explains. Trends in Cognitive Sciences, 11(8), 322–323. https://doi.org/10.1016/j.tics.2007.06.004

Greene, J. D. (2013). Moral tribes: Emotion, reason, and the gap between us and them. New York, NY: Penguin.

Hayakawa, S., Tannenbaum, D., Costa, A., Corey, J. D., & Keysar, B. (2017). Thinking more or feeling less? Explaining the foreign-language effect on moral judgment. Psychological Science, 28(10), 1387–1397. https://doi.org/10.1177/0956797617720944

Koenigs, M., Young, L., Adolphs, R., Tranel, D., Cushman, F., Hauser, M., & Damasio, A. (2007). Damage to the prefrontal cortex increases utilitarian moral judgements. Nature, 446(7138), 908–911. https://doi.org/10.1038/nature05631

Lee, J. J., & Gino, F. (2015). Poker-faced morality: Concealing emotions leads to utilitarian decision making. Organizational Behavior and Human Decision Processes, 126, 49–64. https://doi.org/10.1016/j.obhdp.2014.10.006

Martins, A. T., Faisca, L., Esteves, F., Muresan, A., & Reis, A. (2012). Atypical moral judgements following traumatic brain injury. Judgment and Decision Making, 7(4), 478–487.

Oakley, B. F. M., Brewer, R., Bird, G., & Catmur, C. (2016). Theory of mind is not theory of emotion: A cautionary note on the Reading the Mind in the Eyes Test. Journal of Abnormal Psychology, 125, 818–823. https://doi.org/10.1037/abn0000182

Patil, I., Melsbach, J., Hennig-Fast, K., & Silani, G. (2016). Divergent roles of autistic and alexithymic traits in utilitarian moral judgments in adults with autism. Scientific Reports, 6, 23637. https://doi.org/10.1038/srep23637

Patil, I., & Silani, G. (2014a). Reduced empathic concern leads to utilitarian moral judgments in trait alexithymia. Frontiers in Psychology, 5, 501. https://doi.org/10.3389/fpsyg.2014.00501

Patil, I., & Silani, G. (2014b). Alexithymia increases moral acceptability of accidental harms. Journal of Cognitive Psychology, 26(5), 597–614. https://doi.org/10.1080/20445911.2014.929137

Reynolds, C. J., & Conway, P. (2018). Not just bad actions: Affective concern for bad outcomes contributes to moral condemnation of harm in moral dilemmas. Emotion, 18(7), 1009–1023. https://doi.org/10.1037/emo0000413

Takamatsu, R., & Takai, J. (2019). With or without empathy: Primary psychopathy and difficulty in identifying feelings predict utilitarian judgment in sacrificial dilemmas. Ethics and Behavior, 29(1), 71–85. https://doi.org/10.1080/10508422.2017.1367684

Taylor, G. J., Bagby, R. M., & Parker, J. D. (1999). Disorders of affect regulation: Alexithymia in medical and psychiatric illness. Cambridge, UK: Cambridge University Press.

Xu, P., Opmeer, E. M., van Tol, M.-J., Goerlich, K. S., & Aleman, A. (2018). Structure of the alexithymic brain: A parametric coordinate-based meta-analysis. Neuroscience and Biobehavioral Reviews, 87, 50–55. https://doi.org/10.1016/j.neubiorev.2018.01.004

History
Received October 29, 2019
Revision received April 6, 2020
Accepted April 9, 2020
Published online June 9, 2020

Publication Ethics
This study was approved by the Research Ethics Committee of Hunan Normal University. Written informed consent was obtained from all participants involved in this study.

Authorship
Xiangyi Zhang and Zhihui Wu contributed equally to this study.

Open Data
Raw data are available at https://doi.org/10.17605/osf.io/wjev8

Funding
This work was supported by the National Social Science Foundation of China (19BSH127) to Daoqun Ding.

Daoqun Ding
Department of Psychology
School of Education Science
Hunan Normal University
36 Lushan Road
Changsha 410081, Hunan
PR China
psychding@hunnu.edu.cn



Short Research Article

Stroke Encoding Processes of Chinese Character During Sentence Reading

Mingjun Zhai¹, Hsuan-Chih Chen¹, and Michael C. W. Yip²

¹Department of Psychology, The Chinese University of Hong Kong, Hong Kong SAR
²Department of Psychology, The Education University of Hong Kong, Hong Kong SAR

Abstract. The present study examined whether traditional and simplified Chinese readers (TCRs and SCRs) differ in stroke encoding during character processing, using an eye-tracking experiment. We recruited 66 participants (32 TCRs and 34 SCRs) to read sentences comprising characters with different proportions and types of strokes removed, in order to explore whether a visual complexity effect exists in the processing of simplified and traditional Chinese characters. The study found a cross-script visual complexity effect: SCRs were more influenced by visual complexity change in lexical access than were TCRs. In addition, the stroke-order effect appeared to be more salient for TCRs than for SCRs.

Keywords: parsing, monitoring task, verb position, word order change

Chinese, one of the most widely used languages worldwide, has many unique psycholinguistic properties that have attracted psychologists' attention for decades. One distinctive characteristic of the Chinese language is that it has two orthographies in concurrent use: traditional Chinese (TC) and simplified Chinese (SC). TC is a set of written scripts of the Chinese language that evolved from characters used 2,000 years ago. In the 1950s, the Mainland Chinese government simplified the written system in order to increase literacy. SC is used officially in Mainland China, Singapore, and Malaysia nowadays, while TC is still used widely in Hong Kong and Taiwan. Since the simplification process reduced the number of characters and the number of strokes in some characters that remain in use (McBride-Chang, Chow, Zhong, Burgess, & Hayward, 2005), the two orthographic forms of the same character identity often have different numbers of constituent strokes. Simplified characters have a mean of 10.8 strokes (Su, 2001), approximately 22.5% fewer than traditional characters (Gao & Kao, 2002). The stroke number effect has been found consistently across several studies and paradigms of Chinese character processing.

In the literature, Just, Carpenter, and Wu (cited in Just & Carpenter, 1987) found in their eye-tracking study that the duration of readers' gaze at Chinese characters increased with the number of strokes comprising the characters. Similarly, Yang and McConkie (1999) found that Chinese characters were less likely to be skipped, were more likely to be refixated, and had longer gaze durations when the characters were more complex, that is, contained more strokes (Chen, Song, Lau, Wong, & Tang, 2003; Tsai & McConkie, 2003; Wang, Pomplun, Chen, Ko, & Rayner, 2010). The same result was obtained by Tan and Peng (1990) in their lexical decision experiments. Furthermore, Cheng (1981) found, in a tachistoscopic identification task, that the response time of adults was longer for high-stroke than for low-stroke characters (see also Cao & Shen, 1963; Cheng & Fu, 1986; Yeh & Liu, 1972). These findings were later replicated by Leong, Cheng, and Mulcahy (1987) and other researchers (Yu & Cao, 1992; Zhu, 1991) using a naming task. Other studies have shown that the number of strokes within Chinese characters can influence reading accuracy (Huang & Hsu, 2005): the fewer the strokes, the fewer the recognition errors in reading. Therefore, stroke effects (a type of visual complexity) influence Chinese character processing to a large extent.

Experimental Psychology (2020), 67(1), 31–39. https://doi.org/10.1027/1618-3169/a000478

A theoretical framework for the effects of visual complexity in Chinese reading was first developed by Wang (1973, 1981) from a linguistic point of view and by Leong (1986) from a psycholinguistic perspective. From their studies, they illustrated that the visual features of a target Chinese character would be decoded to activate its orthographic representation, which subsequently activates its conceptual representation. Chen and Kao (2002) reported that the visual–spatial properties of Chinese characters also play a role in efficient character recognition by providing a perceptual basis for orthographic processing: the more visual–spatial properties characters have, the greater the facilitating effect on their orthographic processing. Nonetheless, high visual complexity may inhibit character processing because of the increased decoding burden for visual features (Yang & McConkie, 1999). Moreover, differentiating between characters can be even harder within stroke-number ranges in which many characters cluster. Thus, visual similarity among simplified characters is higher than among traditional characters.

Whether the differences in appearance between the two scripts of the Chinese language affect the reading process of their users has gone largely unanswered by previous studies. Some researchers argued that no difference in reading or spelling skills could be attributed to the script used among children in Hong Kong and Beijing (Chan & Wang, 2003). However, others suggested that scripts do make a difference. For example, Chen and Yuen (1991) found that children from Mainland China were more likely to make visual errors in character recognition than children from Hong Kong. They argued that it might be more difficult for children to distinguish similar characters because of the reduced stroke number of simplified characters. McBride-Chang et al. (2005) compared the reading abilities, phonological awareness, and visual skills of two groups of 5-year-old children: TC learners and SC learners.
They proposed that the group difference found in visual skills might be due to the different scripts the children were learning: since simplified characters have fewer visual features for discriminating among characters, simplified Chinese readers (SCRs) must develop stronger visual skills to achieve a given proficiency.

While the within-script visual complexity effect has been discussed by numerous researchers, few studies have examined the cross-script effect. The significant differences between languages in both visuo-perceptual and linguistic factors make it difficult to examine the visual complexity effect while controlling other influencing factors such as orthographic depth (the connection between letters and sounds). However, Abdelhadi, Ibrahim, and Eviatar (2011) managed to investigate the effect of orthographic visual complexity on the reading process by comparing Arabic and Hebrew. These two languages are similar in many ways, with the exception of visual complexity: letter shapes and word forms in Arabic are more visually complex than those in Hebrew. This enables researchers to separate visual complexity from other factors that may cause differences in processing efficiency. The authors argued that previous findings that reading Arabic is slower than reading Hebrew, English, and other alphabetic languages (Frost, Katz, & Bentin, 1987; Ibrahim, Eviatar, & Aharon-Peretz, 2007; Katz & Frost, 1992) can be explained, at least partly, by the visual complexity of the orthography. Arabic and Hebrew are both alphabetic languages with fewer complicated orthographic features than a logographic language like Chinese. Comparing TC and SC offers a chance to examine the cross-script effect further in a logographic language. Studying the difference between two orthographies that are homogeneous at a linguistic level but different at a perceptual level enables us to differentiate the effects of linguistic factors and perceptual factors (e.g., visual complexity) on language processing. A cross-script visual complexity effect in the Chinese orthographies would provide clear evidence for the important role that visuo-perceptual factors play in language processing.

In Chinese character processing, the differences resulting from the simplification of traditional to simplified characters can be studied by examining the strokes that are encoded systematically in character recognition (Yan et al., 2012). When writing Chinese characters, their constituent strokes are laid down in a certain order. For example, the character 口 consists of three strokes that are produced in the sequence 丨, ㄱ, 一. Previous studies (Tseng, Chang, & Hsiang, 1965; Flores d'Arcais, 1994; Yan et al., 2012) found that strokes are not of equal importance in Chinese character identification. These researchers observed a robust effect of stroke type: removing beginning strokes is more disruptive than removing ending strokes, while the least disruptive is to remove strokes that do not affect the overall configuration of the character. Nevertheless, characters with a certain proportion of strokes removed can still be identified as if they were intact.
Therefore, it seems that not all constituent strokes are necessary to identify this type of character during reading. Yan et al. (2012) found that SC characters with 15% of their strokes removed are as easy to read as normal characters. The fact that TC characters contain even more strokes than their simplified counterparts raises the possibility that they may also contain more strokes that are redundant for character identification. Tseng et al. (1965) carried out a stroke removal experiment in the middle of the simplification procedure, using experimental materials comprising a mixture of simplified and TC characters. They found that, until 50% or more of the strokes were removed from a character, character identification could proceed as normal. If traditional characters have more redundant constituent strokes than simplified ones, a higher percentage of strokes could likely be removed from them with participants still reading as normal. Moreover, when a certain percentage of strokes is removed from a character, the eight basic strokes are more likely to remain in traditional characters than in simplified ones (Yip, 2000). Therefore, we expected that, in this study, traditional Chinese readers (TCRs) would be less affected by the same percentage of stroke removal. We also anticipated that insights into research on stroke encoding would be obtained from the simplification procedure. Therefore, we considered it important to examine the role of orthography type (traditional or simplified) in stroke encoding.

So far, no research has investigated stroke encoding using both versions (traditional and simplified) of the same set of materials, so the present study attempted to fill this knowledge gap. Hence, the main objective of the present study was to determine whether different encoding strategies in reading are used by SC and TC users. As noted above, TC and SC are two different orthographies of one single language; in other words, they are homogeneous at a linguistic level but different at a perceptual level. Studying the cognitive processing differences involved in reading between the two groups of readers enabled us to tease apart the unique effects of linguistic factors and perceptual factors (e.g., visual complexity) on language processing. If the complexity effect exists across scripts, a difference between traditional and simplified character users should be observed in reading words that are orthographically different in the two scripts (e.g., 無, 无). The difference in stroke number between the traditional and simplified versions of those characters leads to two opposite predictions. The higher complexity of traditional characters may increase readers' perceptual load, thus increasing the difficulty of processing those characters for TC users. Alternatively, traditional characters, which are more visually complex, may provide more information that facilitates character recognition (Tsai, Kliegl, & Yan, 2012).
If the advantage of the additional information provided by traditional characters outweighs the disadvantage of the heavier perceptual load due to their higher complexity, readers of traditional characters should find the reading task easier than users of simplified characters. It is also possible that the two groups of readers have developed different strategies owing to the orthographic differences between the scripts they use. TC might train its readers to be experts in processing characters that comprise more strokes, whereas SC users may have developed a more holistic way of processing characters because of the less regular radicals. In this case, words that are orthographically the same in the two scripts (e.g., 是, 是) would also be processed differently by the two groups of users. Likewise, if strokes are processed in a similar way in the two orthographies, the performance of the two groups of participants should be similar when strokes are removed from the characters. Otherwise, there may be differences between the two scripts in the efficiency with which the same type of strokes is processed and in the roles or status of those strokes.


To investigate this question carefully, we adopted the stroke removal paradigm to determine whether the type of removed stroke (beginning strokes, ending strokes, and strokes that did not seriously affect the configuration of the character) and the percentage of removed strokes (15%, 30%, and 50%) would have different impacts on the reading performance of the two groups of readers (traditional or simplified character users). With the stroke removal paradigm, we were able to directly manipulate the strokes involved in the Chinese character recognition process. Simplified and traditional versions of the same set of materials were used; that is, the materials were identical in character identity but different at a visual–perceptual level. Therefore, any observed differences would result from the different scripts used by the two groups of readers, traditional character users (Hong Kong students) and simplified character users (Mainland China students). We used the eye-tracking technique to measure the readers' performance in reading sentences. A large body of evidence suggests that eye movement behavior is highly sensitive to the influences of lexical processing and that eye movement measures reflect online processing during reading (Magnuson, 2019; Rayner, 1998, 2009). Furthermore, since eye tracking is a noninvasive research technique, it is appropriate for investigating sentence reading. In this study, we used global analyses to track the changes induced by stroke removal in general reading performance, and local analyses to further examine the content words in the experimental sentences.
Orthographic processing occurs in the early stage of visual word identification (Dufau, Grainger, & Holcomb, 2008). Therefore, if visual complexity effects exist, group differences should be observed in local measures reflecting the early stages of lexical access, such as first fixation duration (the duration of the first fixation on a target word), gaze duration (the summed duration of all fixations on a target word before a saccade is made away from the word), and skipping rate (the percentage probability of skipping a target word), but not in measures reflecting late stages of lexical activation, such as second-pass reading time (the sum of all fixations on a target word during the second pass) and total reading time (the summed duration of all fixations made on a target word, including refixations).
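As a concrete illustration of these measures, the following sketch (hypothetical data and simplified definitions, not the authors' analysis code) computes first fixation duration, gaze duration, and total reading time from a chronological list of fixations:

```python
# Illustrative only: simplified definitions of common eye-movement measures.
# Each fixation is (word_index, duration_ms), in chronological order.

def first_pass(fixations, target):
    """Fixations on `target` before the eyes first leave it (first-pass reading)."""
    run = []
    seen = False
    for word, dur in fixations:
        if word == target:
            seen = True
            run.append(dur)
        elif seen:
            break  # a saccade away from the target ends the first pass
    return run

def first_fixation_duration(fixations, target):
    fp = first_pass(fixations, target)
    return fp[0] if fp else 0

def gaze_duration(fixations, target):
    return sum(first_pass(fixations, target))

def total_reading_time(fixations, target):
    # All fixations on the target, including refixations after regressions.
    return sum(d for w, d in fixations if w == target)

# Hypothetical trial: word 2 is fixated twice, left, then refixated later.
trial = [(1, 210), (2, 240), (2, 180), (3, 230), (2, 150), (4, 200)]
print(first_fixation_duration(trial, 2))  # 240
print(gaze_duration(trial, 2))            # 240 + 180 = 420
print(total_reading_time(trial, 2))       # 240 + 180 + 150 = 570
```

Gaze duration counts only the first-pass fixations, whereas total reading time also includes the later refixation, which is why the two diverge for word 2.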

Experiment

Method

Participants
Thirty-two Hong Kong university students (TC users) and 34 Mainland Chinese university students (SC users) were
Experimental Psychology (2020), 67(1), 31–39




recruited to participate in this experiment. The participants' experience of using their own familiar orthographies was assessed with a modified version of the Language Experience and Proficiency Questionnaire (Marian, Blumenfeld, & Kaushanskaya, 2007). Three participants were excluded because they blinked so frequently that more than 70% of the local data were missing. Four traditional readers were excluded because of their low reading efficiency (measured by multiplying reading speed by the accuracy on the corresponding comprehension questions) in the control condition (no stroke removal). After these exclusions, 27 TCRs and 32 SCRs remained, matched in reading efficiency in the control condition (mean (SCR) = 432 characters per minute, mean (TCR) = 387 characters per minute, t(57) = 1.096, p = .278).

Design and Materials
One hundred and fifty sentences and corresponding yes-or-no comprehension questions were created for this experiment. Each sentence had two versions: a TC version for the Hong Kong students and an SC version for the Mainland students. Both groups of readers read in their familiar orthographies. The experimental materials were rated by users of the two orthographies on a 6-point scale (1 = very unfamiliar, 6 = very familiar; 1 = very difficult, 6 = very easy). No significant differences were found between the two types of materials in familiarity (mean (TC) = 5.27, mean (SC) = 5.23, p > .05) or readability (mean (TC) = 5.31, mean (SC) = 5.38, p > .05). Three categories of strokes were removed: (1) beginning strokes, (2) ending strokes, and (3) strokes that did not seriously affect the configuration of the character. In the third condition, the shortest strokes were removed first because they had the least impact on the configuration of the character. For example, the first stroke to be removed might be a dot; the next stroke to be removed might be slightly longer, such as a short horizontal line.
Thus, the longest and most significant strokes within the characters were kept. Additionally, to better control the testing materials, we avoided deleting the first or final strokes of the characters, as well as deleting strokes such that the stroke-deleted character became another character. The proportion of strokes removed from the characters was also manipulated: in different conditions, 15%, 30%, or 50% of the strokes were removed. The number of strokes to be removed from each character was computed and rounded off individually based on the corresponding experimental condition (i.e., 15%, 30%, or 50%). Examples are shown in Figure 1. The experimental design was a 3 (stroke type) × 3 (percentage of removed strokes) × 2 (reader group) mixed-factor design; the former two factors were within-subject

Figure 1. Example simplified (upper panel) and traditional (lower panel) Chinese stimuli for the experimental and control conditions. From top to bottom, sentences are for control, 15% beginning, 15% ending, 15% configuration retaining, 30% beginning, 30% ending, 30% configuration retaining, 50% beginning, 50% ending, and 50% configuration retaining.
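The condition structure and the per-character rounding step described in the Design and Materials section can be sketched as follows (hypothetical stroke counts; "rounded off" is taken here as standard rounding, which the paper does not specify further):

```python
# Illustrative sketch (not the authors' code): the 3 x 3 condition structure
# plus control, and how many strokes to delete from a given character.

STROKE_TYPES = ("beginning", "ending", "configuration-retaining")
PROPORTIONS = (0.15, 0.30, 0.50)

# Ten blocks in total: nine test conditions plus one control block.
conditions = [("control", 0.0)] + [(t, p) for t in STROKE_TYPES for p in PROPORTIONS]

def strokes_to_remove(n_strokes, proportion):
    """Number of strokes to delete from a character with n_strokes strokes."""
    return round(n_strokes * proportion)

print(len(conditions))              # 10
print(strokes_to_remove(12, 0.15))  # a hypothetical 12-stroke character loses 2
```

With 15 sentences per block, this yields the 150 experimental sentences reported above.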

variables, and the third one was a between-subject variable. Apart from the nine test conditions, we also included a control condition in which no strokes were deleted. There were 15 sentences in each condition, and each condition served as one block. The sequence of the ten blocks was counterbalanced using a Latin square. Within each block, the sentences were presented randomly.

Apparatus
A Dell 19ʺ SVGA monitor was used to display the stimuli. All stimuli were presented in white against a black background on the computer screen. The participants' eye movements were recorded with an EyeLink 1000 eye tracker that recorded the x- and y-coordinates of the readers' point of fixation every 2 ms. The computer monitor was positioned approximately 60 cm away from




the participants. All sentences were presented in 44-point Kai-Ti font, with 0.3 cm between the individual characters. The sentences were always displayed in the center of the screen.

Procedure
The participants were tested individually. They were instructed to read the experimental sentences, written in the script with which they were familiar, to comprehend them to the best of their ability, and to answer the question that appeared after the sentence when applicable. The participants were required to read the sentences silently; after reading a sentence, they pressed a button to terminate the display of the sentence and then answered the comprehension question where necessary. The participants answered yes or no to the questions, and the responses were recorded by the computer. If they did not understand the sentence, they were required to choose "Don't know" by pressing the spacebar. Before the formal experiment started, participants first went through a 9-point calibration procedure to ensure that the recordings of the eye tracker were accurate. After successful calibration, the sentences were presented one by one. Recalibration was carried out whenever necessary. Participants were allowed to take a short break before each sentence was shown.
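The block-order counterbalancing mentioned in the design description can be illustrated with a cyclic Latin square, one common construction (the paper does not specify which Latin square was used):

```python
# Illustrative sketch: a cyclic Latin square for ordering the ten blocks
# (nine test conditions plus the control block) across participants.

def cyclic_latin_square(n):
    """Row i gives the block order for participant (group) i; every block
    appears exactly once in each row and once in each column position."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

square = cyclic_latin_square(10)

# Check the defining property: each block occupies every ordinal position once.
for pos in range(10):
    assert sorted(row[pos] for row in square) == list(range(10))
print("balanced")
```

This guarantees that, across participants, no block systematically appears earlier or later in the session.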

Results

The accuracy rate on the comprehension questions was relatively high in all conditions, except when 50% of the beginning strokes were removed. The accuracies are shown in Table 1. Global analyses were carried out on whole-sentence measures of reading behavior: sentence reading speed (the number of constituent characters divided by the reading time from sentence onset to the button-press response), sentence reading efficiency (sentence reading speed multiplied by the accuracy rate on the comprehension questions), average fixation duration (the mean of all fixation durations for each sentence), average saccade size (the mean of all saccade sizes for each sentence), number of fixations (the total number of fixations for each sentence), number of forward saccades (the total count of rightward saccades), and number of regressive saccades (the total count of leftward saccades).

We conducted 3 (stroke type) × 3 (proportion of removed strokes) × 2 (reader group) mixed-factor repeated measures ANOVAs. Analyses of the differences between the performance of the two groups of readers revealed that the interaction between type of stroke removed and subject group was significant for sentence reading speed (F(2, 56) = 4.08, p < .05, η2 = 0.51) and marginally significant for reading efficiency (F(2, 56) = 3.10, p = .05, η2 = 0.42). No significant interactions were found for the other global measures. Overall, for TCRs, processing characters with beginning strokes removed was more difficult than processing characters with ending strokes removed (reading speed: 235 vs. 255 characters per minute; reading efficiency: 187 vs. 240 characters per minute), but a different pattern was found for the SCRs (reading speed: 282 vs. 274 characters per minute; reading efficiency: 232 vs. 237 characters per minute): there were no significant differences between the beginning- and ending-stroke-removed conditions. The interaction effects between stroke type and subject type for all global measures are shown in Figure 2.

The two-character words in the experimental sentences were also analyzed to determine whether the stroke removal would exert any differential effects on the two groups of readers' word identification. One to three two-character words were selected from each sentence. These target words never appeared at the beginning or the end of the sentences.
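The whole-sentence reading speed and reading efficiency measures defined above amount to the following simple computation (hypothetical numbers, not the authors' analysis code):

```python
# Illustrative sketch: whole-sentence reading speed and reading efficiency.

def reading_speed(n_characters, reading_time_s):
    """Characters per minute, from sentence onset to the button-press response."""
    return n_characters / reading_time_s * 60

def reading_efficiency(speed_cpm, accuracy):
    """Reading speed weighted by comprehension accuracy (proportion correct)."""
    return speed_cpm * accuracy

# A hypothetical sentence of 19 characters read in 4.0 s with 90% accuracy:
speed = reading_speed(19, 4.0)
print(speed)                            # 285.0 characters per minute
print(reading_efficiency(speed, 0.90))  # 256.5
```

Weighting speed by accuracy penalizes fast but inaccurate reading, which is why efficiency rather than raw speed was used to match the two groups.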
In the control conditions, significant group differences were observed in first fixation duration (t(57) = 2.115, p < .05) and gaze duration (t(57) = 2.508, p < .05), indicating that the SCRs tended to make shorter

Table 1. The accuracy rate of comprehension questions

                       Beginning      Ending         Configuration retained   Control
Simplified
  15% strokes removed  89.6% (3.1%)   90.6% (3.1%)   99.0% (1.0%)             94.8% (2.2%)
  30% strokes removed  93.8% (2.3%)   83.3% (3.9%)   79.2% (4.4%)
  50% strokes removed  53.8% (4.4%)   90.6% (2.7%)   87.1% (3.6%)
Traditional
  15% strokes removed  79.5% (4.7%)   90.1% (3.5%)   85.8% (3.8%)             94.5% (2.7%)
  30% strokes removed  87.7% (3.6%)   80.2% (4.1%)   76.5% (3.9%)
  50% strokes removed  39.5% (6.2%)   87.7% (4.8%)   76.5% (5.3%)




Figure 2. The interaction effect between type of strokes removed and subject type for global eye movement measures. Error bars represent the standard error of the mean.

Figure 3. The interaction effect between type of strokes removed and subject type for local eye movement measures. Error bars represent the standard error of the mean.





fixations than the traditional readers. Nonetheless, no significant group differences were found for skipping rate, second-pass reading time, or total reading time (ps > .05). The interaction effects between stroke type and subject type for all local measures are shown in Figure 3. We conducted a 3 (stroke type) × 3 (proportion of removed strokes) × 2 (reader group) mixed-factor repeated measures ANCOVA using the control condition as the covariate. Significant interaction effects between stroke type and subject group were found for gaze duration (F(2, 54) = 3.2, p < .05, η2 = 0.54) and skipping rate (F(2, 54) = 4.001, p < .05, η2 = 0.42). Pairwise comparisons suggested that the group differences observed in the control condition for first fixation duration and gaze duration disappeared in the beginning and ending conditions (ps > .05) but persisted in the configuration-retained condition (ps < .05). The group difference in skipping rate was also larger for the configuration-retained condition than for the beginning or ending conditions (ps < .05). We noticed that SCRs had longer gaze durations and lower skipping rates when ending strokes were removed than when beginning strokes were removed (ps < .05), but the advantage of the beginning condition in early stage measures seemed to be countered in the late stage of lexical access, as reflected in measures such as second-pass reading time and total reading time.

Discussion

The present study investigated the cross-script effect by examining two groups of readers with matched performance in normal reading. Their reading was affected significantly by stroke removal, especially when beginning or ending strokes were removed. Global analyses indicated a tendency for the differences between the beginning and ending stroke conditions to be more salient in character encoding for the TCRs than for the SCRs. Several possible explanations might account for these results. First, as Yan et al. (2012) argued, the mental representation of stroke order must be activated during character identification. If the representation of stroke order is not activated equally in the two scripts, there may be a group difference in the effect of stroke order. Second, since the stroke-order effect might be related to the fundamental relationship between production and comprehension (Yan et al., 2012), the group difference in the status of strokes may reflect that the linkage between written word processing and production is stronger in TCRs than in SCRs. It might take children more time and effort to practice written character production when learning traditional characters than simplified ones,


resulting in a stronger association between written word processing and production. Third, it is possible that the beginning part of TC characters contains more informative visual features than that of SC characters. Future studies are needed to test these possibilities. A significant group difference was found in early stage reading-time measures when both groups were reading normal sentences. Readers of the less complex characters made shorter fixations than readers of the more complex characters. Nevertheless, this group difference was no longer observable when some of the strokes were removed from the characters, suggesting that the early stage lexical access of the SCRs was more influenced by stroke removal than was the case for the TCRs. Therefore, the results of the present study imply that cross-orthography visual complexity differences exist between simplified and traditional Chinese, and that the SCRs' orthographic processing is more sensitive to the stroke removal manipulation than the traditional readers'. The TCRs, on the other hand, were less affected, as we hypothesized earlier. When comparing the results for the SCRs with the findings reported by Yan et al. (2012), we found some inconsistencies. Both global and local analyses suggested that the stroke-order effect was less robust in the present study. In addition, in the study by Yan et al., the participants read characters with 15% of any type of strokes removed as easily as normal characters, whereas the reading performance of our participants when reading characters with 15% of beginning or ending strokes removed dropped significantly compared to reading normal sentences (ps < .05). This indicates that the participants in the present experiment were influenced more by the stroke removal manipulation than those in the Yan et al. study. Moreover, the overall comprehension accuracy in the present study (86.2%) was lower than the 92.2% reported by Yan et al.
One possible explanation for these discrepancies could be the difficulty of the experimental sentences. The experimental sentences in the present study were longer (about 19 characters) than those used by Yan et al. (2012) (14 characters). This probably resulted in more complicated sentential structures that were harder to understand without full orthographic information. It is possible that, when the overall task difficulty increased to a certain level, even the ending-stroke-removed sentences became harder to recognize; therefore, the relative difference between the impacts of removing beginning and ending strokes may appear smaller. The overall difficulty of the sentential context might thus be a factor to take into account when investigating stroke encoding in simplified or traditional Chinese character identification using the stroke removal paradigm. In conclusion, the present study has clearly indicated a difference between the two groups of Chinese readers in



the relative status of different types of strokes. The beginning strokes played a relatively more significant role in traditional than in SC processing. Furthermore, eye movement measures indicated a cross-script visual complexity effect between the two concurrent orthographies of the Chinese language: the SCRs were influenced more by stroke removal than the traditional readers in the early stage of lexical access. These results provide insights into the differences in processing the two orthographies of the Chinese language. The results revealed that both traditional and simplified characters were processed with similar efficiency during lexical access. It seems that the perceptual load caused by visual complexity was higher for traditional than for simplified characters, but the visual features that facilitate word identification might overcome the disadvantage of complexity in character processing. The different relative status of beginning and ending strokes in the two scripts suggests a few potential differences between the two orthographies, including the informativeness of strokes and the strength of the association between the production and identification of characters. One implication of the present pattern of results is that simplified characters might be easier to learn in general but, at the same time, deep processing (i.e., accessing all the orthographic, phonological, and semantic information) of the characters might be affected greatly if certain critical (but not redundant) strokes are removed. So far, only a few studies have focused on this research topic. Further investigations of related issues (perceptual span, the role of different types of radicals in character processing, etc.) are needed before any final conclusion can be drawn about the differences between SCRs and TCRs in the cognitive processing of Chinese characters.

References

Abdelhadi, S., Ibrahim, R., & Eviatar, Z. (2011). Perceptual load in the reading of Arabic: Effects of orthographic visual complexity on detection. Writing Systems Research, 3, 117–127. https://doi.org/10.1093/wsr/wsr014
Cao, C. Y., & Shen, Y. (1963). Preliminary research about the tachistoscopic identification of Chinese characters of children [in Chinese]. Acta Psychologica Sinica, 7, 271–279.
Chan, L., & Wang, L. (2003). Linguistic awareness in learning to read Chinese: A comparative study of Beijing and Hong Kong children. In C. McBride-Chang & H.-C. Chen (Eds.), Reading development in Chinese children (pp. 91–106). Westport, CT: Praeger.
Chen, X., & Kao, H. S. (2002). Visual-spatial properties and orthographic processing of Chinese characters. In H. S. R. Kao, C.-K. Leong, & D.-G. Gao (Eds.), Cognitive neuroscience studies of the Chinese language (pp. 175–194). Hong Kong: Hong Kong University Press.



Chen, H.-C., Song, H., Lau, W. Y., Wong, K. F. E., & Tang, S. L. (2003). Developmental characteristics of eye movements in reading Chinese. In C. McBride-Chang & H.-C. Chen (Eds.), Reading development in Chinese children (pp. 159–169). Westport, CT: Praeger.
Chen, M. J., & Yuen, J. C.-K. (1991). Effects of pinyin and script type on verbal processing: Comparisons of China, Taiwan, and Hong Kong experience. International Journal of Behavioral Development, 14, 429–448. https://doi.org/10.1177/016502549101400405
Cheng, C.-M. (1981). Perception of Chinese characters. Acta Psychologica Taiwanica, 23, 137–153.
Cheng, C.-M., & Fu, G.-L. (1986). The recognition of Chinese characters and words under divided visual field presentation. Linguistics, Psychology, and the Chinese Language, 23–37.
Dufau, S., Grainger, J., & Holcomb, P. (2008). An ERP investigation of location invariance in masked repetition priming. Cognitive, Affective, & Behavioral Neuroscience, 8, 222–228. https://doi.org/10.3758/cabn.8.2.222
Flores d'Arcais, G. B. (1994). Order of strokes writing as a cue for retrieval in reading Chinese characters. European Journal of Cognitive Psychology, 6, 337–355. https://doi.org/10.1080/09541449408406519
Frost, R., Katz, L., & Bentin, S. (1987). Strategies for visual word recognition and orthographical depth: A multilingual comparison. Journal of Experimental Psychology: Human Perception and Performance, 13, 104–115. https://doi.org/10.1037/0096-1523.13.1.104
Gao, D.-G., & Kao, H. S. R. (2002). Psycho-geometric analysis of commonly used Chinese characters. In H. S. R. Kao, C.-K. Leong, & D.-G. Gao (Eds.), Cognitive neuroscience studies of the Chinese language (pp. 195–206). Hong Kong: Hong Kong University Press.
Huang, K.-C., & Hsu, S.-H. (2005). Effects of numbers of strokes on Chinese character recognition during a normal reading condition. Perceptual and Motor Skills, 101, 845–852. https://doi.org/10.2466/pms.101.3.845-852
Ibrahim, R., Eviatar, Z., & Aharon-Peretz, J. (2007). Metalinguistic awareness and reading performance: A cross language comparison. Journal of Psycholinguistic Research, 36, 297–317. https://doi.org/10.1007/s10936-006-9046-3
Just, M. A., & Carpenter, P. A. (1987). Orthography: Its structure and effects on reading. In M. A. Just & P. A. Carpenter (Eds.), The psychology of reading and language processing (pp. 287–325). Boston, MA: Allyn & Bacon.
Katz, L., & Frost, R. (1992). The reading process is different for different orthographies: The orthographic depth hypothesis. Advances in Psychology, 94, 67–84. https://doi.org/10.1016/s0166-4115(08)62789-2
Leong, C. K. (1986). What does accessing a morphemic script tell us about reading and reading disorders in an alphabetic script? Annals of Dyslexia, 36, 82–102. https://doi.org/10.1007/bf02648023
Leong, C. K., Cheng, P.-W., & Mulcahy, R. (1987). Automatic processing of morphemic orthography by mature readers. Language and Speech, 30, 181–196. https://doi.org/10.1177/002383098703000207
Magnuson, J. S. (2019). Fixations in the visual world paradigm: Where, when, why? Journal of Cultural Cognitive Science, 3, 111–139. https://doi.org/10.1007/s41809-019-00035-3
Marian, V., Blumenfeld, H. K., & Kaushanskaya, M. (2007). The Language Experience and Proficiency Questionnaire (LEAP-Q): Assessing language profiles in bilinguals and multilinguals. Journal of Speech, Language, and Hearing Research, 50, 940–967. https://doi.org/10.1044/1092-4388(2007/067)
McBride-Chang, C., Chow, B. W., Zhong, Y., Burgess, S., & Hayward, W. G. (2005). Chinese character acquisition and visual skills in two Chinese scripts. Reading and Writing, 18, 99–128. https://doi.org/10.1007/s11145-004-7343-5




Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422. https://doi.org/10.1037/0033-2909.124.3.372
Rayner, K. (2009). The 35th Sir Frederick Bartlett Lecture: Eye movements and attention in reading, scene perception, and visual search. The Quarterly Journal of Experimental Psychology, 62, 1457–1506. https://doi.org/10.1080/17470210902816461
Su, P. C. (2001). The 20th century's study on contemporary Chinese characters contents [in Chinese]. Shanxi: Shuhai Press.
Tan, L. H., & Peng, D. L. (1990). The effects of semantic context on the feature analyses of single Chinese characters [in Chinese]. Journal of Psychology, 4, 5–10.
Tsai, J.-L., Kliegl, R., & Yan, M. (2012). Parafoveal semantic information extraction in traditional Chinese reading. Acta Psychologica, 141, 17–23. https://doi.org/10.1016/j.actpsy.2012.06.004
Tsai, J.-L., & McConkie, G. W. (2003). Where do Chinese readers send their eyes? In J. Hyönä, R. Radach, & H. Deubel (Eds.), The mind's eye: Cognitive and applied aspects of eye movement research (pp. 159–176). North Holland: Elsevier.
Tseng, S. C., Chang, L., & Hsiang, W. C. C. (1965). An informational analysis of the Chinese language: I. The reconstruction of the removed strokes of the ideograms in printed sentence-texts. Acta Psychologica Sinica, 9, 281–290.
Wang, W. S.-Y. (1973). The Chinese language. Scientific American, 228, 50–60. https://doi.org/10.1038/scientificamerican0273-50
Wang, W. S.-Y. (1981). Language structure and optimal orthography. In O. J. L. Tzeng & H. Singer (Eds.), Perception of print: Reading research in experimental psychology (pp. 223–236). Hillsdale, NJ: Lawrence Erlbaum Associates.
Wang, H.-C., Pomplun, M., Chen, M., Ko, H., & Rayner, K. (2010). Estimating the effect of word predictability on eye movements in Chinese reading using latent semantic analysis and transitional probability. The Quarterly Journal of Experimental Psychology, 63, 1374–1386. https://doi.org/10.1080/17470210903380814
Yan, G. L., Bai, X. J., Zang, C. L., Bian, Q., Cui, L., Qi, W., Rayner, K., & Liversedge, S. P. (2012). Using stroke removal to investigate Chinese character identification during reading: Evidence from eye movements. Reading and Writing, 25, 951–979. https://doi.org/10.1007/s11145-011-9295-x
Yang, H.-M., & McConkie, G. W. (1999). Reading Chinese: Some basic eye-movement characteristics. In J. Wang, H. C. Chen, R. Radach, & A. Inhoff (Eds.), Reading Chinese script: A cognitive analysis (pp. 207–222). Hove, UK: Psychology Press.



Yeh, J. S., & Liu, I. M. (1972). Factors affecting recognition thresholds of Chinese characters. Acta Psychologica Taiwanica, 14, 113–117.
Yip, P. C. (2000). The Chinese lexicon: A comprehensive survey. Hove, UK: Psychology Press.
Yu, B. L., & Cao, H. Q. (1992). The effect of the stroke-number disposition on Chinese character recognition. Psychological Science, 4, 5–10.
Zhu, X. (1991). The effect of Chinese sentence context on word recognition. Acta Psychologica Sinica, 23, 145–152.

History
Received December 18, 2019
Revision received April 10, 2020
Accepted April 16, 2020
Published online June 9, 2020

Acknowledgments
We thank the anonymous reviewers for their constructive comments.

Conflict of Interest
The authors declare that they have no conflict of interest.

Publication Ethics
Informed consent was obtained from all participants in the present study.

Open Data
The raw data of this study will be available in the Open Science Framework repository (https://osf.io/pd8jh/) or can be requested from the corresponding author.

Funding
This research was supported by a postgraduate fellowship from the Chinese University of Hong Kong to the first author, and it formed part of a master's thesis submitted to the Chinese University of Hong Kong. This research was partially supported by funding from the Research Grants Council of the Hong Kong University Grants Committee (GRF No. 845613) to the third author.

Michael C. W. Yip
Department of Psychology
The Education University of Hong Kong
10 Lo Ping Road
Tai Po
New Territories
Hong Kong SAR
mcwyip@eduhk.hk



Short Research Article

Parsing for Position

David J. Lobina¹, José E. García-Albea², and Josep Demestre²

¹ Department of Philosophy, University of Barcelona, Barcelona, Spain
² Department of Psychology and CRAMC, Universitat Rovira i Virgili, Tarragona, Spain

Abstract. Monitoring tasks have long been employed in psycholinguistics, and the end-of-clause effect is possibly the best-known result of using this technique in the study of parsing. Recent results with the tone-monitoring task suggest that tone position modulates cognitive load, as reflected in reaction times (RTs): the earlier the tone appears in a sentence, the longer the RTs. In this study, we show that verb position is also an important factor. In particular, changing the time/location at which verb–noun(s) dependencies are computed during the processing of a sentence has a clear effect on cognitive load and, as a result, on the resources that can be devoted to monitoring and responding to a tone. This study is based on two pieces of evidence. We first report the acceptability ratings of six word orders in Spanish and then present monitoring data with three of these word orders. Our results suggest that RTs tend to be longer if the verb is yet to be processed, pointing to the centrality of a sentence's main verb in parsing in general.

Keywords: parsing, monitoring tasks, verb position, word order change

Introduction

Monitoring tasks have long been employed in psycholinguistics, and the end-of-clause effect is possibly the best-known result of using this technique in the study of parsing. According to the now-classic studies of Abrams and Bever (1969) and Bever and Hurtig (1975), when participants monitor for a tone while listening to a sentence and are required to press a button as soon as they hear the tone, the end of a clause has a discernible effect on performance. This is evidenced in the results reported by these two studies: reaction times (RTs) to a tone were found to be longer at the end of the first clause of a biclausal sentence than in between clauses or at the beginning of the second clause. The end of a clause seems to exert a particular cognitive load on parsing, the primary task in these experiments, thereby impinging on the response to the tones, the secondary task. This effect is argued to be the result of closing off the various syntactic phrases the parser opens up during the processing of a sentence – the result of a "wrap-up" operation – and has become a feature of a number of parsing theories, most notably Frazier and Fodor's (1978) sausage machine and the syntax-last model of Townsend and Bever (2001).
Experimental Psychology (2020), 67(1), 40–47 https://doi.org/10.1027/1618-3169/a000477

In a recent study with the tone-monitoring technique, Lobina, Demestre, and García-Albea (2018) point out that the data reported in Abrams and Bever (1969) and Bever and Hurtig (1975) may not be as robust as commonly thought: some important factors were not considered at the time, and as a result, the experimental manipulation then used may have been confounded. Monitoring tasks in general exhibit a tendency for RTs to decrease across a sentence (Cutler & Norris, 1979), a factor that was not controlled for in those studies. In fact, the tone that both Abrams and Bever (1969) and Bever and Hurtig (1975) placed at the end of clauses was typically the first tone in a series, and their data clearly show a decrease in RTs from the first to the last tone position. Lobina et al. (2018) ran a set of experiments to reevaluate some of these results and found that the decreasing tendency is very strong in the tone-monitoring paradigm and can be in conflict with structural factors such as the wrap-up effect, which applies at the end of clauses and sentences. In particular, this study reports an experiment in which two types of simple, monoclausal Spanish sentences with three tone positions were used, as shown below in (1), where the | symbol indicates where the tone was placed and the numbers in parentheses are the RTs, in milliseconds, to each position.

(1) a. El candidato del pa|rtido se pre|paró el pró|ximo discurso. “The party’s candidate prepared his next speech.” (257 ms; 222; 206)
b. El candidato ha pre|parado un di|scurso sobre| la sanidad. “The candidate has prepared a speech about the health service.” (252 ms; 217; 205)

© 2020 Hogrefe Publishing


D.J. Lobina et al., Parsing for Position

Lobina et al. (2018) report two main results: There was a general decrease in RTs for each sentence type, and there were no significant differences across sentence types, modulo a small effect regarding the appearance of the verb. Lobina et al. (2018) accounted for the data by appealing to the additive effects of two independent factors: one perceptual (a tone position effect indicating uncertainty; viz., when is the tone going to appear?) and the other psycholinguistic (the predictions of the parser, or incrementality, indicating a linguistic kind of uncertainty; i.e., what linguistic material is left to process?). In fact, Lobina et al. (2018) were able to disentangle the two factors by recording different event-related-potential components for each – the N1 wave, suggestive of perceptual uncertainty and thus of a (tone) position effect, and the P3 wave, related to cognitive load in dual tasks such as tone monitoring and thus interpretable as reflecting incrementality – both of which correlated with RTs in the predicted direction. Our study focuses on the issue of verb position, reported in Lobina et al. (2018) as a small across-sentence-type effect and thus suggestive of a possible structural effect. Under most linguistic theories, the verb is the central element of a sentence, and it is no less important in parsing theories. According to Gibson (1998), for instance, the storing and integration of dependencies between verbs and nouns (mostly, subjects and objects) constitute some of the parser’s most important operations, a perspective that was in fact at the heart of the hypotheses Lobina et al. (2018) drew in their study. We follow a similar strategy here, based on the expectation that processing a sentence’s main verb, along with computing the relevant verb–noun(s) dependencies, engages significant cognitive resources and may further interact with the position effect that has proved so significant in tone-monitoring tasks.
In order to explore this issue, we manipulated the position of the verb by using different word orders. We used Spanish for this purpose, as it offers great flexibility in this respect (though see Fernández Soriano (1993) for a discussion of some of the limits of free word order in Spanish). Along with the canonical subject–verb–object (SVO) order of Spanish, we employed a verb-initial as well as a verb-final word order, which allowed us to present the verb at three different times. We expected that each verb location would involve a different cognitive load and that this variability would have some sort of effect on the parser’s operations, especially in terms of how and when verb–noun dependencies are computed, a phenomenon that ought to be detectable with the tone-monitoring paradigm. Thus, the general prediction of our study was that the stronger the expectation that the verb comes next, the faster the responses to tones should be – and faster still once the verb has been processed.


The next section describes a study we conducted on the acceptability of six word orders in Spanish, which formed the basis for the materials we selected for the experiment reported in Experimental Data. The final section discusses the results of both the acceptability rating study and the experimental data and considers some of the potential work to be carried out in the future.

Word Order Acceptability Data

For the purposes of this study, we modified the sentences in (1) by constructing complex subjects and objects for each, thus obtaining greater distance between verbs and nouns and a more intricate structure overall (by complex subjects and objects, we mean noun phrases modified by a prepositional phrase). The result can be observed in (2), where the specifications of the six possible word orders are shown within parentheses.

(2) a. El candidato del partido ha preparado un discurso sobre la sanidad (SVO). “The party’s candidate has prepared a speech about the health service.”
b. El candidato del partido un discurso sobre la sanidad ha preparado (SOV).
c. Ha preparado el candidato del partido un discurso sobre la sanidad (VSO).
d. Ha preparado un discurso sobre la sanidad el candidato del partido (VOS).
e. Un discurso sobre la sanidad el candidato del partido ha preparado (OSV).
f. Un discurso sobre la sanidad ha preparado el candidato del partido (OVS).

The result is not only greater distance between the verb and nouns but also greater uncertainty as to whether all these sentences are, in fact, grammatical, as some of them sound somewhat awkward. In order to confirm whether all these sentences are acceptable, we ran a questionnaire in which we asked 72 native speakers of Spanish to judge the acceptability of 60 sentences similar to those in (2) on a scale from 1 (unacceptable) to 5 (perfectly acceptable). An analysis of the data indicates that native speakers prefer the canonical SVO order over all others (4.4 rating on average), with the rest following in this order: VSO (3.25), OVS (2.96), VOS (2.87), OSV (2.28), and SOV (2.02). Pairwise comparisons showed that rating differences between these word orders were all statistically significant (ps < .05), except in the following four pairs: VSO–OVS, VSO–VOS, OVS–VOS, and OSV–SOV.
We selected the canonical SVO order as a baseline condition for our experiment and decided to employ two more



word orders, which had to meet the following two conditions: (a) in one of these word orders, the verb would have to appear at the beginning of the sentence and in the other at the end, and (b) the verb should always appear adjacent to its object, as separating them would introduce an unwelcome variable – verbs and objects tend to go together, not least because of thematic restrictions, and we wanted to keep this constant. The VOS and SOV strings were the only word orders meeting these criteria and were thus selected as two additional conditions for the experiment. We should note a couple of things about the experimental conditions before moving on to a description of the actual experiment. There is, first of all, some distance between the ratings for SVO, the canonical order, and the other two orders, VOS and SOV. The verb-final SOV order, in fact, received the lowest acceptability rating of all the orders, and it does sound slightly unnatural, though it did not prove to be totally unacceptable, considering that it received a rating above 2 on a 1–5 scale. The other verb-final order, OSV, sounds somewhat better than SOV, though it ranks second lowest in the ratings, and moreover, there was no statistical difference between OSV and SOV. The OSV order, in addition, breaks the verb–object union, failing to meet condition (b). In any case, and as described below, we took the acceptability ratings of the experiment’s sentence types into consideration in the analyses.

Experimental Data

(3) below shows the three experimental conditions, with the | symbol indicating the approximate placements of the tones (the second tone in VOS seems to appear earlier than the second tone in SVO and SOV, but this is an artifact of this particular example; length was controlled for in terms of the number of syllables, as explained below).

(3) SVO: El candidato del partido ha pr|eparado un discurso sobr|e la sanidad.
SOV: El candidato del partido un di|scurso sobre la sanidad ha pr|eparado.
VOS: Ha preparado un discurso sobr|e la sanidad el c|andidato del partido.

Accordingly, in the SVO condition, the verb appears roughly in the middle of the sentence (as measured in number of syllables), with one tone placed on the verb and another inside the following complex object. In the SOV condition, the verb appears at the end, with one tone placed at the beginning of the preceding object and another on the verb toward the end of the sentence. In the VOS condition, the verb appears at the beginning, with one tone placed on the following object and another in the subject toward the end of the sentence. Thus, there are three types of sentences (SVO, SOV, and VOS) and two tone positions (1 and 2).

Regarding our predictions, these are based on the assumption that the parser ought to be especially sensitive to the verb’s position in computing verb–noun dependencies, and thus, cognitive load will be influenced by the position of the verb – namely, whether the verb has already appeared and been processed. In particular, if the parser has processed a noun phrase, it will now expect a verb to appear next, as in SVO sentences. If the parser has instead processed a complex noun phrase (a subject) followed by another complex noun phrase (an object), as it would in SOV sentences, then the prediction that the verb comes next would probably be even stronger. But if the parser is fed a verb to begin with, then the expectation would be that the verb’s arguments are to follow, as in VOS sentences. The actual predictions, then, are as follows. For the first tone position, in VOS sentences the verb has already been processed, whereas in SVO sentences it is being processed and in SOV sentences it is yet to appear. Thus, RTs to the first tone position should be faster in VOS sentences than in either SVO or SOV – or put otherwise, P1 SVO/SOV > P1 VOS, where P stands for (tone) position and > means “longer RTs than.” As for the second tone position, the verb has already been processed in both SVO and VOS sentences, whereas it is being processed in SOV sentences, and thus, RTs to this position in SOV sentences should be longer – that is, P2 SOV > P2 SVO/VOS.

Method

Participants

Sixty psychology students (13 male, 47 female) from the Rovira i Virgili University in Tarragona, Spain, participated in the experiment for course credit. The mean age was 22 years, and none of the participants had any known hearing impairments. All were native speakers of Spanish.

Materials

Three variants of monoclausal, active, declarative Spanish sentences were constructed, totaling 60 experimental sentences. All sentences were unambiguous and composed of high-frequency words according to the corpora in Almela, Cantos, Sánchez, Sarmiento, and Almela (2005) (which was cross-checked with both Sebastián-Gallés, Martí, Carreiras, and Cuetos (2000) and Duchon,



Perea, Sebastián-Gallés, Martí, and Carreiras (2013)). There were three sentence types (SVO, SOV, and VOS) and two tone positions (1 and 2). All sentences exhibited complex subject and object arguments, and the tones were placed at roughly the same point in each sentence, measured in terms of the number of syllables. The sentences were recorded in stereo with a normal but subdued intonation by a male native speaker of Spanish using the Praat software on a Windows computer. The software Cool Edit Pro (Version 2.0, Syntrillium Software Corporation, Phoenix, US) was used to generate and superimpose tones with a frequency of 1,000 Hz, a duration of 25 ms, and a peak amplitude equal to that of the most intense sound in the materials (namely, 80 dB). Every sentence carried one tone only. A further 60 sentences acted as fillers, 24 of which did not carry a tone.
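The tone parameters above (1,000 Hz, 25 ms) are easy to reproduce. As a rough illustration, a pure tone with those values could be synthesized as follows; this is a sketch, not the Cool Edit Pro procedure actually used, and the 44.1 kHz sample rate and unit amplitude are our own assumptions (amplitude matching to the 80 dB peak of the speech materials is not modeled).

```python
import math

def make_tone(freq_hz=1000.0, dur_s=0.025, sample_rate=44100):
    """Synthesize a pure sine tone as a list of float samples in [-1, 1].

    freq_hz and dur_s follow the tone specification in the text
    (1,000 Hz, 25 ms); the 44.1 kHz sample rate is an assumption.
    """
    n_samples = int(dur_s * sample_rate)  # 25 ms at 44.1 kHz -> 1102 samples
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]

tone = make_tone()
```

In practice such a tone would also be given a short onset/offset ramp to avoid audible clicks; the sketch omits this.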

Procedure

The design of the experiment was a 2 (tone position) by 3 (sentence type) within-participants, within-items factorial, and therefore, 6 lists of stimuli were created, each list containing all experimental conditions (10 items per condition). Each list was arranged according to a Latin square (blocking) design so that the items were randomized within and between blocks. Participants were randomly assigned to the experimental lists, 10 participants per list. The experiment was designed and run with the DMDX software (Forster & Forster, 2003) and administered in a sound-proof laboratory with low to normal illumination. The sentences were presented binaurally over headphones, and participants were instructed to hold a keypad with their dominant hand in order to press a button as soon as they heard the tone. They were told to be as quick as possible, but to avoid guessing. The DMDX software was used to measure and record RTs. The experimental session lasted around 20 min.
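The Latin-square list construction described above can be illustrated with a small sketch. The rotation rule below (item i in list l receives condition (i + l) mod 6) is an illustrative choice of our own, not necessarily the assignment actually used; it simply shows how 60 items and 6 lists yield 10 items per condition in every list, with each list containing each item exactly once.

```python
from collections import Counter
from itertools import product

# Six conditions: 3 sentence types x 2 tone positions.
CONDITIONS = ["{}-P{}".format(s, p)
              for s, p in product(["SVO", "SOV", "VOS"], [1, 2])]

def build_lists(n_items=60, n_lists=6):
    """Rotate items through conditions: item i in list l gets condition
    (i + l) mod 6 (an illustrative Latin-square-style rule)."""
    k = len(CONDITIONS)
    return [[(item, CONDITIONS[(item + lst) % k]) for item in range(n_items)]
            for lst in range(n_lists)]

lists = build_lists()
for lst in lists:
    counts = Counter(cond for _, cond in lst)
    assert all(counts[c] == 10 for c in CONDITIONS)  # 10 items per condition
```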

Data Analysis

Prior to carrying out the analyses, we excluded trials without a response, affecting 0.39% of the data. Following current guidelines on how to analyze skewed dependent variables such as RTs with linear mixed-effects models (Balota, Aschenbrenner, & Yap, 2013), RTs were inverse-normalized (−1,000/RT), as a graphical inspection indicated that this transformation approximated normality. Hence, inverse-transformed RTs served as the dependent measure in the analyses below.
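The inverse normalization is just a reciprocal transform with the sign flipped so that the direction of effects is preserved. A minimal sketch, using illustrative RT values:

```python
def inverse_transform(rt_ms):
    """Inverse-normalize an RT in milliseconds: -1000/RT compresses the
    long right tail of the RT distribution while preserving order
    (a longer RT yields a larger, i.e., less negative, value)."""
    if rt_ms <= 0:
        raise ValueError("RTs must be positive")
    return -1000.0 / rt_ms

# Illustrative RTs (ms): a short, a medium, and a long response.
print([round(inverse_transform(rt), 3) for rt in (206, 342, 353)])
# -> [-4.854, -2.924, -2.833]
```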


A number of linear mixed-effects models were created to analyze the data using the statistical program R (v. 3.5.1; R Core Team, 2013), in particular the lmer function from the lme4 package (v. 1.1-21; Bates et al., 2015). Two fixed factors and their interaction were included in the main 3 × 2 analysis: sentence type (three within-participants/between-items levels: VOS, SVO, and SOV) and tone position (two within-participants/within-items levels: position 1 and position 2). This model also included acceptability ratings as a covariate, since, as mentioned previously, the three types of sentences we employed differed significantly in acceptability. By including this covariate, we aimed to examine whether the pattern of results was due to the two experimental manipulations or instead to differences in the acceptability ratings of the three levels of the sentence type factor. Regarding the random structure of the model, we followed current guidelines in the psycholinguistics literature (Barr, Levy, Scheepers, & Tily, 2013). A maximal model with a fully specified random effects structure was initially built. This model included the two experimental factors, their interaction, and the covariate as fixed effects, as well as by-participants and by-items random intercepts and slopes for the experimental factors. This model, however, failed to converge and had to be gradually reduced to a model with by-participant and by-item random intercepts, but no slopes. Hence, in the notation used in the lme4 package, the following model was the maximally converging one:

RT ~ acceptability + sentence type + tone position + sentence type : tone position + (1 | participant) + (1 | item)

Backward elimination model comparisons were conducted with likelihood-ratio tests using the analysis of variance (ANOVA) function in the R base package.
This method compares a model with all fixed effects, interactions, and covariates to smaller models (one without the covariate, one with only one of the fixed effects, one with the interaction removed, etc.). For example, a model containing the main effects of sentence type and tone position and their interaction was compared to a model with only the two main effects, an analysis that explores whether the two-way interaction significantly improves the fit of the model. The ANOVA function then determines whether the simpler model is significantly worse at fitting/explaining the data than the more complex model, as indicated by the chi-squared test and its associated p-value. A significant value for this test would suggest that the




interaction between sentence type and tone position adds significantly to the explanatory power (or fit) of the model above and beyond the two main effects alone. A backward elimination model comparison was first conducted to explore the effect of acceptability ratings; model comparisons were then conducted to explore the interaction between the sentence type and tone position factors and, finally, their two main effects. The main effect of sentence type and the interaction between sentence type and tone position were explored with pairwise comparisons performed using the emmeans package in R (Lenth, 2019). Degrees of freedom were approximated by the Kenward–Roger method, and pairwise comparisons were interpreted using Bonferroni-corrected p-values.
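The logic of these likelihood-ratio comparisons can be sketched independently of lme4: twice the difference in log-likelihoods between the full and the reduced model is referred to a chi-squared distribution whose degrees of freedom equal the number of parameters dropped. The log-likelihood values below are hypothetical, chosen only so that the statistic matches the interaction test reported in the Results (χ2 = 7.14, p = .028, with df = 2); the helper implements the chi-squared survival function for df = 1 and df = 2 only.

```python
import math

def lrt_pvalue(loglik_reduced, loglik_full, df):
    """Likelihood-ratio test for nested models.

    The statistic 2 * (ll_full - ll_reduced) is referred to a chi-squared
    distribution; closed-form tail probabilities are used for df = 1
    (erfc-based) and df = 2 (exponential).
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p = math.exp(-stat / 2.0)
    else:
        raise NotImplementedError("sketch covers df = 1 and df = 2 only")
    return stat, p

# Hypothetical log-likelihoods yielding the reported interaction test.
stat, p = lrt_pvalue(-100.0, -96.43, df=2)
print(round(stat, 2), round(p, 3))  # -> 7.14 0.028
```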

Results

Mean RTs and standard deviations are presented in Table 1, while Figure 1 shows a raincloud plot of the raw data as well as the median and interquartile range for RTs. The raincloud plot was produced using the method and code provided by Allen, Poggiali, Whitaker, Marshall, and Kievit (2019). As can be observed in Table 1, at first sight the predictions seem to have been borne out. That is, RTs to the first position in VOS sentences were faster than RTs to the first position in either SVO or SOV sentences, while RTs to the second position in SOV sentences were longer than RTs to the second position in either SVO or VOS sentences. The analyses show that even though sentences with lower acceptability ratings received longer RTs (i.e., slower responses), the main effect of sentence acceptability did not significantly improve the fit of the model (χ2 = 0.19, p = .89), and therefore, this factor was removed from the model. The main effect of tone position significantly improved the fit of the model (χ2 = 37.78, p < .0001), with shorter RTs to tones in the second position (342 ms) compared to the first position (353 ms). The main effect of sentence type also

Table 1. Mean RTs per tone position per sentence type (standard deviations in parentheses)

Sentence type    Tone position 1    Tone position 2
SVO              356.05 (76)        336.68 (53)
SOV              356.78 (72)        347.60 (65)
VOS              347.50 (76)        337.38 (56)

Note. SVO = subject–verb–object; SOV = subject–object–verb; VOS = verb–object–subject.


significantly improved the fit of the model (χ2 = 23.78, p < .0001). Planned Bonferroni-corrected pairwise comparisons showed that RTs to tones in VOS sentences (343 ms) were significantly shorter than RTs to tones in SVO (347 ms, estimate = 0.038, SE = 0.015, t(3,472) = 2.48, p = .039) and SOV sentences (352 ms, estimate = 0.075, SE = 0.015, t(3,472) = 4.88, p < .0001). Moreover, RTs to tones in SVO sentences were shorter than RTs to tones in SOV sentences (estimate = 0.037, SE = 0.015, t(3,472) = 2.40, p = .049). The interaction between tone position and sentence type significantly improved the fit of the model too (χ2 = 7.14, p = .028). In order to explore the two-way interaction further, we analyzed RTs to tones in positions 1 and 2 separately. Bonferroni-corrected pairwise comparisons showed that in the first position RTs to tones in VOS sentences (347 ms) were significantly shorter than RTs to tones in both SVO (356 ms, estimate = 0.079, SE = 0.022, t(3,472) = 3.61, p = .0009) and SOV sentences (357 ms, estimate = 0.089, SE = 0.022, t(3,472) = 4.065, p = .0001). Moreover, RTs to tones in SVO sentences did not differ from RTs to tones in SOV sentences (estimate = 0.010, SE = 0.022, t(3,472) = 0.455, p = 1.000). In the second tone position, while RTs to tones in VOS sentences did not differ (339 ms) from RTs to tones in SVO sentences (339 ms, estimate = 0.002, SE = 0.022, t(3,472) = 0.09, p = 1.000), they were significantly shorter than RTs to tones in SOV sentences (348 ms, estimate = 0.062, SE = 0.022, t(3,472) = 2.85, p = .013). Moreover, RTs to tones in SVO sentences were significantly shorter than RTs to tones in SOV sentences (estimate = 0.064, SE = 0.022, t(3,472) = 2.94, p = .009). We also explored the tone position effect at each level of the sentence type factor. 
Bonferroni-corrected pairwise comparisons showed that the tone position effect was marginally significant in VOS sentences (estimate = 0.042, SE = 0.022, t(3,472) = 1.19, p = .055) and significant in both SVO (estimate = 0.123, SE = 0.022, t(3,472) = 5.62, p < .0001) and SOV sentences (estimate = 0.068, SE = 0.022, t = 3.141, p = .002).
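The Bonferroni adjustment applied throughout these pairwise comparisons is simple: each raw p-value is multiplied by the number of comparisons in the family and capped at 1. The sketch below mirrors that adjustment with illustrative raw p-values for a family of three comparisons; it is not the emmeans implementation itself.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a family of p-values: multiply each raw
    p-value by the family size, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Three illustrative raw p-values for the three pairwise comparisons
# among the sentence-type levels (e.g., VOS-SVO, VOS-SOV, SVO-SOV).
print([round(p, 4) for p in bonferroni([0.013, 0.0001, 0.6])])
# -> [0.039, 0.0003, 1.0]
```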

Discussion

As with previous monitoring studies, there was a perceptible decrease in RTs in each sentence type, in line with the data reported in Lobina et al. (2018). Thus, our results also suggest a position effect – a perceptual phenomenon – which here too had an impact on participants’ performance. Nevertheless, the decrease in RTs is more pronounced in the baseline condition (the SVO order) than in the other word orders, and this must have been due to the employment of noncanonical




Figure 1. Raincloud plot (with boxplots) for RTs on tone position 1 (left) and tone position 2 (right) for the sentence type factor (level 1: VOS, level 2: SVO, and level 3: SOV). The plot displays the raw data and the probability distribution. The boxplots display sample medians alongside interquartile ranges at 25, 50, and 75%. SVO = subject–verb–object; SOV = subject–object–verb; VOS = verb–object–subject.

sentences, which seems to have disrupted the position effect somewhat, an eventuality we were aiming for, as argued at the beginning of this paper. In this sense, we also obtained structural effects, highlighting the operations of the parser, the second major factor in determining performance in monitoring tasks. Indeed, the interplay between incrementality and perceptual uncertainty vis-à-vis the position of the verb had a clear effect on our results. By varying the position of the verb, we obtained the expected results, with RTs going in the hypothesized directions. This was evident from different viewpoints. First, both the sentence type and the tone position factors significantly added to our model. This was expected, as these two factors tend to be rather strong and, moreover, independent of each other in tone-monitoring tasks. In addition, the interaction between sentence type and tone position also added significantly to the fit of our statistical model, and while this effect was more modest, it is clearly relevant. Second, and as mentioned, the decrease in RTs is more pronounced in SVO sentences than

in either verb-final (SOV) or verb-initial (VOS) sentences. This appears to be a genuine structural effect and cannot be attributed to the acceptability preferences of the different word orders, as the acceptability factor did not significantly add to our model and was in fact discarded. Furthermore, the preferred SVO order did not, as a matter of fact, receive the fastest RTs. Finally, and perhaps more transparently, our analyses show that SVO sentences behave like SOV sentences in the first tone position, where the verb is yet to appear in either condition, while in the second tone position SVO sentences now behave like VOS sentences, and in this position, the verb has been processed in both cases (further note that the highest RTs in the second tone position are those in the SOV condition, where the verb has not appeared yet). In fact, the decreasing tendency from the first to the second tone position in the VOS and SOV conditions is very similar (though RTs are longer in SOV, given that the verb has not been processed), whereas the decrease is significantly more marked in the SVO condition.



According to the explanation put forward by Lobina et al. (2018), the steeper decrease in RTs in SVO sentences would be accounted for in terms of the additive but independent effects of perceptual and linguistic factors. In the first tone position, the verb is yet to appear in either SVO or SOV sentences while perceptual uncertainty is at its highest, and this would explain why RTs are similar in these two conditions. The situation is very different in the second position, especially in the SVO condition. In this case, the effect of both perceptual uncertainty and the parser’s cognitive load would have decreased by the second tone – the tone appears toward the end of the sentence and the verb has already been processed – and therefore, more resources can be employed to react to this tone than in SOV sentences, where linguistic uncertainty would still operate (an uncertainty as to when the verb is to be processed), thus explaining why RTs to the second tone are this time similar in SVO and VOS sentences. Put together, our results lend support to both the explanatory framework of Lobina et al. (2018) and the central role that processing a sentence’s main verb plays in an account of parsing. In particular, the combination of tone monitoring, the position effect, and the manipulation of verb position resulted in an experimental setting that allowed us to gauge cognitive load fairly accurately, and this should bode well for future employment of the tone-monitoring paradigm.

Final Remarks

The study reported here was meant to complement the picture provided by Lobina et al. (2018) on the use of monitoring tasks in the study of parsing. To that end, we probed the effect the position of the verb has on the availability of resources to parse a sentence while monitoring a tone and found that this factor can highlight structural effects. In particular, we found that cognitive load and linguistic uncertainty are both greater prior to the processing of a sentence’s main verb. This indicates that changing the time/location at which verb–noun(s) dependencies are computed during the processing of a sentence has an effect on cognitive load and, as a result, on the resources that can be devoted in dual tasks to monitoring and responding to a tone. Put together, two interrelated issues are especially noteworthy. First, manipulating these factors appropriately can unearth structural effects with the tone-monitoring technique, vindicating the use of an experimental paradigm that is not currently widely used in the study of parsing (but perhaps it is making a comeback;


see Lobina et al. (2018) for some comments). Naturally, the significance of the different factors we have discussed in this paper means that scholars will have to take them into consideration when designing a monitoring experiment, and this brings us to the second point of note. Specifically, there is an important methodological lesson to learn regarding the use of monitoring tasks: It is imperative that scholars know which factors have the greatest impact on participants’ performance in such tasks so that clearer results can be obtained with the paradigm. This point, of course, applies to many other experimental paradigms, but we hazard that it is especially true of the tone-monitoring technique, given its history and how past results have been interpreted and implemented. The paradigm should not be avoided or discarded; it should instead be embraced once again, though extra care and attention need to be employed in designing this type of experiment. We submit that our results justify this optimism, but much work remains to be done, not least with more complex sentences. We hope to move on to this question soon.

References

Abrams, K., & Bever, T. G. (1969). Syntactic structure modifies attention during speech perception and recognition. The Quarterly Journal of Experimental Psychology, 21(3), 280–290. https://doi.org/10.1080/14640746908400223
Allen, M., Poggiali, D., Whitaker, K., Marshall, T. R., & Kievit, R. A. (2019). Raincloud plots: A multi-platform tool for robust data visualization. Wellcome Open Research, 4(63), 1–40. https://doi.org/10.12688/wellcomeopenres.15191.1
Almela, R., Cantos, P., Sánchez, A., Sarmiento, R., & Almela, M. (2005). Frecuencias del Español. Diccionario y estudios léxicos y morfológicos. Madrid, Spain: Editorial Universitas.
Balota, D. A., Aschenbrenner, A. J., & Yap, M. J. (2013). Additive effects of word frequency and stimulus quality: The influence of trial history and data transformations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39, 1563–1571. https://doi.org/10.1037/a0032186
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255–278. https://doi.org/10.1016/j.jml.2012.11.001
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
Bever, T. G., & Hurtig, R. R. (1975). Detection of a nonlinguistic stimulus is poorest at the end of a clause. Journal of Psycholinguistic Research, 4(1), 1–7. https://doi.org/10.1007/bf01066985
Cutler, A., & Norris, D. (1979). Monitoring sentence comprehension. In W. E. Cooper & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 113–134). Hillsdale, NJ: Lawrence Erlbaum.
Duchon, A., Perea, M., Sebastián-Gallés, N., Martí, A., & Carreiras, M. (2013). EsPal: One-stop shopping for Spanish word




properties. Behavioral Research Methods, 45(4), 1246–1258. https://doi.org/10.3758/s13428-013-0326-1
Fernández Soriano, O. (1993). Sobre el orden de palabras en español. Dicenda, 11, 113–152.
Forster, K. I., & Forster, J. C. (2003). DMDX: A Windows display program with millisecond accuracy. Behavior Research Methods, Instruments, & Computers, 35, 116–124. https://doi.org/10.3758/bf03195503
Frazier, L., & Fodor, J. D. (1978). The sausage machine: A new two-stage parsing model. Cognition, 6, 291–325. https://doi.org/10.1016/0010-0277(78)90002-1
Gibson, E. (1998). Linguistic complexity: Locality of syntactic dependencies. Cognition, 68, 1–76. https://doi.org/10.1016/s0010-0277(98)00034-1
Lenth, R. (2019). emmeans: Estimated marginal means, aka least-squares means. R package version 1.3.5. [Computer software manual]. Vienna, Austria: CRAN. Retrieved from https://cran.r-project.org/web/packages/emmeans/emmeans.pdf
Lobina, D. J., Demestre, J., & García-Albea, J. E. (2018). Disentangling perceptual and psycholinguistic factors in syntactic processing: Tone monitoring via ERPs. Behavior Research Methods, 50, 1125–1140. https://doi.org/10.3758/s13428-017-0932-4
R Core Team. (2013). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria: R Core Team. Retrieved from http://www.R-project.org/


History
Received March 18, 2019
Revision received April 14, 2020
Accepted April 16, 2020
Published online June 9, 2020

Open Data
The raw data as well as the scripts we used can be accessed at http://www.osf.io/w4mtv/

Funding
This research was funded by two AGAUR research grants (2011-BPA-00127 and 2014-SGR-1444).

David J. Lobina
Department of Philosophy
University of Barcelona
08001 Barcelona, Spain
dj.lobina@gmail.com

Experimental Psychology (2020), 67(1), 40–47


Short Research Article

Effects of Input Modality on Vocal Effector Prioritization in Manual–Vocal Dual Tasks

Mareike A. Hoffmann1, Melanie Westermann1, Aleks Pieczykolan2, and Lynn Huestegge1

1 Institute of Psychology, University of Würzburg, Würzburg, Germany
2 Human Technology Center, RWTH Aachen University, Aachen, Germany

Abstract. Doing two things at once (vs. one in isolation) usually yields performance costs. Such decrements are often distributed asymmetrically between the two actions involved, reflecting different processing priorities. A previous study (Huestegge & Koch, 2013) demonstrated that the particular effector systems associated with the two actions can determine the pattern of processing priorities: Vocal responses were prioritized over manual responses, as indicated by smaller performance costs (associated with dual-action demands) for the former. However, this previous study only involved auditory stimulation (for both actions). Given that previous research on input–output modality compatibility in dual tasks suggested that pairing auditory input with vocal output represents a particularly advantageous mapping, the question arises whether the observed vocal-over-manual prioritization was merely a consequence of auditory stimulation. To resolve this issue, we conducted a manual–vocal dual task study using either only auditory or only visual stimuli for both responses. We observed vocal-over-manual prioritization in both stimulus modality conditions. This suggests that input–output modality mappings can (to some extent) attenuate, but not abolish or reverse, effector-based prioritization. Taken together, effector system pairings appear to have a more substantial impact on capacity allocation policies in dual-task control than input–output modality combinations. Keywords: cognitive control, dual-task performance, vocal prioritization, capacity allocation, effector systems

Experimental Psychology (2020), 67(1), 48–55. https://doi.org/10.1027/1618-3169/a000479. Distributed under the Hogrefe OpenMind License (https://doi.org/10.1027/a000001).

Typical experiments in multitasking research often focus on tasks involving a rather restricted range of effector systems (mostly manual key presses; see, e.g., Pashler, 1994). While such restrictions can be helpful to ensure a highly controlled experimental situation, everyday life often confronts us with challenges requiring the coordination of different effector systems simultaneously (cross-modal action; see Huestegge & Hazeltine, 2011; Huestegge, Pieczykolan, & Koch, 2014). However, the impact of (combinations of) effector systems on multiple-action (or dual-task) control has largely been disregarded in previous research and theories (e.g., Logan & Gordon, 2001; Meyer & Kieras, 1997; Navon & Miller, 2002; Tombu & Jolicœur, 2003). A study that explicitly focused on the impact of effector system combinations on multiple-action control was conducted by Huestegge and Koch (2013). They had participants respond to a single auditory stimulus (presented to the left/right ear) with either a single oculomotor, vocal

or manual response, or with two of these responses simultaneously. An analysis of the pattern of performance costs (i.e., response time [RT] difference between single- and dual-response conditions for each effector system) revealed an asymmetrical distribution of these costs throughout all pairwise combinations of effector systems: Oculomotor responses were associated with smaller costs than vocal and manual responses, while vocal costs were only large when combined with oculomotor (but not with manual) responses. Finally, manual costs were substantial throughout. Interestingly, this pattern could not be explained in terms of the overall response time levels of the effector systems (e.g., vocal responses were slower than manual responses, but nevertheless associated with smaller performance costs). Therefore, Huestegge and Koch (2013) interpreted these findings as evidence for a generic capacity allocation policy among responses based on an ordinal effector system hierarchy (see also Pieczykolan & Huestegge, 2014): Oculomotor responses are assumed to be prioritized over vocal and manual responses, while vocal responses are prioritized over manual responses. The specific effector systems are probably anticipated early during task processing, and a corresponding capacity allocation policy is implemented accordingly to eventually


M.A. Hoffmann et al., Input Modality and Vocal Effector Prioritization


Table 1. Overview of possible stimuli and responses (left/right manual key press or vocal utterance "left"/"right").

Visual stimulus modality
  Possible stimuli: ⅃, L, Я, R
  Responses as a function of instruction (manipulated blockwise):
    Manual according to letter identity
    Vocal according to letter orientation
    Manual according to letter identity and vocal according to letter orientation
    Manual according to letter orientation
    Vocal according to letter identity
    Manual according to letter orientation and vocal according to letter identity

Auditory stimulus modality
  Possible stimuli: low pitch tone on left ear, low pitch tone on right ear, high pitch tone on left ear, high pitch tone on right ear
  Responses as a function of instruction (manipulated blockwise):
    Manual according to frequency
    Vocal according to tone location
    Manual according to frequency and vocal according to tone location
    Manual according to tone location
    Vocal according to frequency
    Manual according to tone location and vocal according to frequency

select and execute these responses (i.e., an anticipatory mechanism similar to action effect anticipation as assumed by ideomotor theories; see, e.g., Badets, Koch, & Philipp, 2016; Pfister, 2019, for reviews). However, a potential alternative explanation of these findings comes from studies on input–output modality compatibility (IOMC) effects. IOMC effects refer to the influence of the combination of sensory systems and effector systems on dual-task performance. A dual-task setting involving a visual–manual task in combination with an auditory–vocal task (referred to as compatible mapping) has been reported to yield smaller dual-task costs than a dual-task setting with a reversed (incompatible) modality mapping (i.e., visual–vocal and auditory–manual; Göthe, Oberauer, & Kliegl, 2016; Halvorson, Ebner, & Hazeltine, 2013; Stelzel & Schubert, 2011; Stelzel, Schumacher, Schubert, & D'Esposito, 2006). Analogous findings have been observed in other multitasking paradigms such as psychological refractory period studies (Maquestiaux, Ruthruff, Defer, & Ibrahime, 2018) and task-switching studies with respect to switch costs (Stephan & Koch, 2010, 2011; Stephan, Koch, Hendler, & Huestegge, 2013) and mixing costs (Hazeltine, Ruthruff, & Remington, 2006; Schacherer & Hazeltine, 2019). The advantage of compatible mappings has usually been explained by referring to the similarity of stimuli to typical effects associated with certain actions: For example, vocal actions are typically followed by auditory effects (ideomotor account; see Greenwald, 1972, 2003; Stephan & Koch, 2010, 2011). Given that Huestegge and Koch (2013) only utilized auditory stimuli, it is possible that this setting created a particular IOMC-like advantage for vocal (vs. manual) action demands, which may have resulted in the prioritization of vocal-over-manual actions.
This potential alternative explanation receives further credibility from other previous reports: Some studies, in fact, reported greater

dual-task costs for vocal than for manual responses (e.g., Fagot & Pashler, 1992; Holender, 1980; Schumacher et al., 2001), and these studies involved visual stimuli (or only one fixed assignment of stimulus-to-response modalities in the case of Schumacher et al., 2001). Therefore, a systematic examination of the role of stimulus modality in effector prioritization in vocal–manual dual tasks is still pending. The present study was conducted to rule out that the vocal-over-manual prioritization reported previously (e.g., Huestegge & Koch, 2013) was simply due to the use of auditory stimuli (i.e., due to IOMC-like effects). Therefore, we conducted a study requiring manual and vocal responses, in which we explicitly manipulated the input modality by using only visual stimuli in one condition and only auditory stimuli in another condition. In principle, two study design options appeared feasible: First, it is possible to closely replicate the setup used by Huestegge and Koch (2013), in which one aspect of a single stimulus determined both actions in dual conditions (e.g., a left tone requires participants to always respond with both a left key press and uttering the word left). Second, it is possible to implement a more typical dual-task setup, in which two stimulus aspects independently determine the actions required in the two effector systems (e.g., a high frequency tone on the left ear requires pressing the right key but uttering left). As we recently demonstrated that the same effector-based hierarchy can be observed in both types of setups (Hoffmann, Pieczykolan, Koch, & Huestegge, 2019), we decided to follow the second approach here. Such a typical dual-task setup is probably more relatable for the majority of current dual-task researchers and theories because these theories usually assume two independent response selection processes.
Across blocks of trials, participants either responded with single vocal, single manual, or with both responses to


50

either a visual or an auditory stimulus. Single-task blocks involved responding to only one dimension of a stimulus, while dual-task blocks involved responding to two different dimensions of that same stimulus. In the auditory stimulus condition, a tone was presented, and tone pitch and location were each assigned to one of the two effector systems. In the visual stimulus condition, the letters "L" and "R" were presented either in correct or in mirrored orientation so that letter identity and orientation were distinct visual stimulus dimensions each assigned to one effector system (see Table 1 for an illustration of all possible stimuli in the visual or auditory domain and the possible instructed responses). If IOMC-like effects were the main reason behind vocal-over-manual prioritization, one would expect manual-over-vocal prioritization (indexed by relatively smaller dual-task costs) when visual stimuli trigger both responses. However, if effector system pairings are stronger determinants of capacity allocation than IOMC mappings, one would expect vocal-over-manual prioritization, irrespective of stimulus modality. Nevertheless, in the latter case, it is still possible that IOMC-like effects attenuate the strength of vocal-over-manual prioritization.
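The dual-task logic described above, in which two independent stimulus dimensions each drive one effector system, can be sketched as follows for the visual condition (a minimal illustration, not the authors' code; the function and parameter names are hypothetical):

```python
# Sketch of the visual-condition S-R mapping (all names hypothetical):
# two stimulus dimensions (letter identity, letter orientation) are each
# assigned to one effector system; mapping is spatially congruent
# ('L' -> left, 'R' -> right; mirrored -> left, normal -> right).

def required_responses(letter, mirrored, manual_dim="identity", vocal_dim="orientation"):
    """Return the required (manual, vocal) responses for one visual stimulus."""
    def respond(dim):
        if dim == "identity":
            return "left" if letter == "L" else "right"
        # orientation dimension
        return "left" if mirrored else "right"
    return respond(manual_dim), respond(vocal_dim)

# A mirrored R (Я): manual (identity) -> right, vocal (orientation) -> left,
# i.e., a response-response incompatible dual-task trial.
assert required_responses("R", mirrored=True) == ("right", "left")
```

Because the two dimensions vary independently, half of the dual-task trials pair spatially incompatible manual and vocal response codes, as in the example above.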

Method

Participants

A power analysis (using ηp² = .38 and a between-measurement correlation of .014, as observed by Huestegge & Koch, 2013, in the relevant vocal–manual combination group; α = 5%, 1 − β = 95%) suggested a minimum sample size of 13 participants. Due to counterbalancing and because we were also interested in a potential interaction of effector system, task condition, and stimulus modality, 32 participants took part. Four participants were excluded because they produced too many (>33%) invalid trials (i.e., trials involving omission/commission errors, outliers, or trials in which a saccade was executed prior to the required response, as such eye movements were shown to affect response latencies in other effector systems; see Huestegge, 2011; Huestegge & Adam, 2011). To ensure full counterbalancing, we recollected these data by testing four new participants. The final sample (26 females) had a mean age of 29.5 years (SD = 10.2). All had normal or corrected-to-normal vision and hearing, were right-handed, and naïve regarding the purpose of the study. Participants gave informed consent and received a monetary reward or course credits for participation.
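A power analysis of this kind can be sketched with the noncentral F distribution. This is only an illustration: the authors do not report their exact procedure, and the noncentrality formula below (with the m/(1 − ρ) correction for repeated measures) follows a common G*Power-style convention, which is an assumption here.

```python
# Hedged sketch of a repeated-measures power analysis (illustrative only;
# the exact convention the authors used is not reported in the paper).
from scipy import stats

eta_p2 = 0.38   # partial eta squared from Huestegge & Koch (2013)
rho = 0.014     # between-measurement correlation from the same study
alpha = 0.05
f2 = eta_p2 / (1 - eta_p2)   # Cohen's f^2, roughly 0.613

def power(n, m=2):
    """Approximate power of the within-subject F-test for n participants, m levels."""
    lam = f2 * n * m / (1 - rho)          # noncentrality (G*Power-style, assumed)
    df1, df2 = m - 1, (m - 1) * (n - 1)
    f_crit = stats.f.ppf(1 - alpha, df1, df2)
    return 1 - stats.ncf.cdf(f_crit, df1, df2, lam)

p13 = power(13)  # power at the suggested minimum sample size
p32 = power(32)  # power at the sample size actually tested
```

The exact minimum n depends on which software convention is used for the noncentrality parameter; with an effect this large, power is high for small samples either way.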


Apparatus and Stimuli

Participants were seated 67 cm in front of a 21″ cathode ray tube screen (temporal resolution: 100 Hz, spatial resolution: 1,024 × 768 pixels) with a standard German QWERTZ keyboard and in front of a Sennheiser e835-S microphone (Sennheiser electronic GmbH & Co KG, Wedemark, Germany). Participants wore supra-aural headphones (Sennheiser PMX 95). Experiment Builder (version 2.1.140, SR Research Ltd., Ottawa, Ontario, Canada) was used to run the experiment and to log response events (left and right arrow key presses, both operated by the participant's right index finger, and vocal RTs by utilizing the integrated voice key functionality). The content of each vocal response was recorded and registered online by the experimenter. An eye tracker (Eyelink 1000, SR Research) with a sampling rate of 1,000 Hz registered eye movements of the right eye in order to control for saccade occurrence (see the Participants section). During all blocks, a green fixation cross (approximate size = 0.4° of visual angle) on black background was present at the screen center. To the left and right, two green squares (also with a size of approximately 0.4°) were displayed at an eccentricity of 8.5°, but these were irrelevant in the context of the present study (they were included to allow comparison of the present results with similar experiments from our lab involving instructed eye movements). The capital letters R and L served as visual stimuli (size: 0.6°, displayed about 0.4° above the fixation cross). They were either mirrored (pointing to the left side: ⅃ and Я) or not (pointing to the right side: L and R). Auditory stimuli were easily distinguishable sinusoidal tones of either high (1,000 Hz) or low (400 Hz) frequency presented to either the left or the right ear.

Procedure and Design

At the beginning of each block, participants received both written and oral instructions, followed by a three-point horizontal calibration routine of the eye tracker. In each block, visual or auditory stimuli were presented, and participants were instructed to respond vocally, manually, or both (each as fast and accurately as possible, but without any instructions regarding response order, grouping, or prioritization). The instructions included information about the stimulus modality and the assignment of stimulus dimension to effector system (i.e., which stimulus component was assigned to which effector, e.g., responding manually to letter orientation and vocally to letter identity). In each trial, the stimulus was presented for 80 ms. All stimulus components (L vs. R, mirrored vs. not, high vs. low frequency, and presented to the left vs. right



ear) occurred equally often in random order in each experimental condition. Participants responded by pressing the right or left arrow key, by uttering the word links or rechts (German for left/right), or both, depending on the current block. Responses always had to be spatially congruent with the respective stimulus component (e.g., a left response to L, to a mirrored letter orientation, to a sound presented to the left ear, or to a low frequency, the latter analogous to pitch-location mappings on a piano keyboard). Half of the dual-task trials involved response–response compatibility in the sense that a left key press was combined with uttering "left", while the other half was incompatible. Trials were separated by an interstimulus interval of 3,000 ms. All participants experienced all 12 different block types twice: 3 (single manual, single vocal, dual task) × 2 (auditory, visual stimuli) × 2 (two possible assignments of stimulus component to effector system per stimulus modality). In total, each participant completed 24 blocks, each consisting of 32 trials. The sequence of conditions was counterbalanced across participants apart from three restrictions introduced to reduce confusion for participants: All participants started with one of the single-task conditions, followed by either the other single-task condition and then the dual-task condition or vice versa, involving the same stimulus modality and the same assignment of stimulus component to task. Then, these three conditions were repeated once. This was followed by six blocks in the respective other stimulus modality condition with the same sequence of task conditions (e.g., single manual – single vocal – dual task). The sequence of task conditions stayed constant within participants. Next, the stimulus modality switched again, but now the stimulus component to effector system assignment was reversed compared to the first six blocks.
The same applied to the six final blocks regarding the second stimulus modality. The experimental 2 × 2 × 2 design involved the independent within-subject variables: effector system (manual vs. vocal), task condition (single vs. dual), and stimulus modality (auditory vs. visual). RTs and error rates served as dependent variables.
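The block structure described above can be enumerated directly (a sketch with illustrative labels, not the authors' experiment code; "assignment A/B" stands for the two possible stimulus-component-to-effector mappings):

```python
# Enumerate the 3 x 2 x 2 block types of the design; each block type
# occurs twice, with 32 trials per block (labels are illustrative).
from itertools import product

task_conditions = ["single manual", "single vocal", "dual task"]
modalities = ["auditory", "visual"]
assignments = ["assignment A", "assignment B"]  # which dimension drives which effector

block_types = list(product(task_conditions, modalities, assignments))
blocks = block_types * 2      # every block type is run twice
trials_per_block = 32

assert len(block_types) == 12
assert len(blocks) == 24
assert len(blocks) * trials_per_block == 768  # trials per participant
```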

Results

Data Treatment

Trials involving omission or commission errors (in manual or vocal responses, 2.1%) and all trials in which a saccade was registered prior to the execution of the required manual and/or vocal response (8.4%) were defined as invalid and discarded. The same applied to outliers, defined as responses executed faster or slower than two SDs of the individual mean in each condition (5.3%). This resulted in 84.4% valid data. Finally, directional errors (4.6% of valid data) in manual and/or vocal responses (e.g., uttering right instead of left) were excluded from RT data analyses.

Response Times

Absolute RT data are illustrated in Figure 1, while dual-task costs are depicted in Figure 2. RT data and error rates including dual-task costs are reported in Table 2. Data are publicly available at https://doi.org/10.5281/zenodo.3756790. Results of 2 × 2 × 2 analyses of variance (ANOVAs) on RT data and error rates are depicted in Table 3. The analysis of RT data revealed a significant main effect of effector system, indicating that manual responses (765 ms) were overall faster than vocal responses (994 ms). There was a significant main effect of task condition, indicating that dual-task conditions yielded overall performance costs of 373 ms (dual-task RTs: 1,066 ms vs. single-task RTs: 693 ms). We also observed a significant main effect of stimulus modality, indicating overall lower RTs in response to a visual (850 ms) than to an auditory stimulus (909 ms).

Figure 1. Mean RTs as a function of effector system, task condition, and stimulus modality. Error bars represent mean standard errors. RT = response time.

Figure 2. Dual-task costs as a function of effector system and stimulus modality. Error bars represent mean standard errors. RT = response time.

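The exclusion steps described under Data Treatment (error trials, trials with a pre-response saccade, and ±2-SD outliers per participant and condition) can be sketched as follows (a simplified illustration with hypothetical field names, not the authors' pipeline):

```python
# Illustrative RT exclusion pipeline (hypothetical trial-record fields):
# drop error trials and trials with a saccade before the response, then
# remove RTs beyond 2 SDs of the mean for one participant-by-condition cell.
import numpy as np

def valid_rts(trials):
    """trials: dicts with 'rt', 'error', 'saccade_before_response' for one cell."""
    clean = [t for t in trials if not t["error"] and not t["saccade_before_response"]]
    rts = np.array([t["rt"] for t in clean], dtype=float)
    m, sd = rts.mean(), rts.std(ddof=1)
    return rts[np.abs(rts - m) <= 2 * sd]

trials = (
    [{"rt": 600 + i, "error": False, "saccade_before_response": False} for i in range(20)]
    + [{"rt": 5000, "error": False, "saccade_before_response": False}]  # outlier
    + [{"rt": 650, "error": True, "saccade_before_response": False}]    # error trial
)
kept = valid_rts(trials)  # the outlier and the error trial are removed
```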



Table 2. Mean RTs, error rates, and dual-task costs (with SE in parentheses) across effector systems, stimulus modalities, and task conditions.

                               RTs (ms)                                Error rates (%)
Stimulus modality  Effector    Single     Dual        Costs            Single     Dual       Costs
Auditory           Manual      532 (18)   1,036 (42)  505 (39)         2.2 (0.6)  7.7 (1.4)  5.5 (1.3)
Auditory           Vocal       870 (27)   1,196 (36)  326 (24)         1.5 (0.5)  7.2 (1.5)  5.6 (1.4)
Visual             Manual      558 (15)   933 (32)    375 (28)         1.8 (0.4)  7.8 (1.4)  6.0 (1.1)
Visual             Vocal       811 (19)   1,098 (31)  287 (24)         1.4 (0.3)  5.0 (1.0)  3.6 (0.9)

Note. RT = response time. Costs = dual-task costs (dual minus single).
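The dual-task costs in Table 2 are simply the dual minus single RT means per cell; a quick consistency check (values copied from the table; agreement is to within 1 ms because the tabulated means are rounded) also reproduces the cost asymmetries tested post hoc below:

```python
# Reproduce Table 2 dual-task costs from the single- and dual-task means.
rt = {  # (modality, effector): (single RT, dual RT) in ms, from Table 2
    ("auditory", "manual"): (532, 1036),
    ("auditory", "vocal"): (870, 1196),
    ("visual", "manual"): (558, 933),
    ("visual", "vocal"): (811, 1098),
}
costs = {k: dual - single for k, (single, dual) in rt.items()}

reported = {("auditory", "manual"): 505, ("auditory", "vocal"): 326,
            ("visual", "manual"): 375, ("visual", "vocal"): 287}
# Rounded table means reproduce the reported costs to within 1 ms.
assert all(abs(costs[k] - reported[k]) <= 1 for k in reported)

# Cost asymmetry favoring vocal responses in both modalities,
# matching the 178-ms (auditory) and 88-ms (visual) differences below.
assert costs[("auditory", "manual")] - costs[("auditory", "vocal")] == 178
assert costs[("visual", "manual")] - costs[("visual", "vocal")] == 88
```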

Table 3. Overview of statistical test results (three-way ANOVAs) regarding RTs and error rates.

                                                       RT data                     Error rates
Source of variation                                    F(1, 31)  p       ηp²      F(1, 31)  p       ηp²
Effector system                                        107.45    <.001   .78      4.84      .035    .14
Task condition                                         415.54    <.001   .93      26.30     <.001   .46
Stimulus modality                                      10.65     .003    .26      1.37      .251    .04
Effector system × Task condition                       12.55     .001    .29      2.52      .123    .08
Effector system × Stimulus modality                    22.48     <.001   .42      2.59      .118    .08
Task condition × Stimulus modality                     11.30     .002    .27      0.65      .427    .02
Effector system × Task condition × Stimulus modality   21.11     <.001   .41      6.30      .017    .17

Note. RT = response time.

Crucially, the interaction of effector system and task condition was significant, indicating a difference in dual-task costs between effector systems. Specifically, we observed smaller dual-task costs for vocal (306 ms, single-task RTs: 841 ms vs. dual-task RTs: 1,147 ms) than for manual responses (440 ms, single-task RTs: 545 ms vs. dual-task RTs: 985 ms). This was further qualified by a significant three-way interaction: Dual-task cost differences between effector systems varied between stimulus conditions. Note that this pattern persisted when excluding trials with an inter-response interval below 100 ms, ensuring that the results were not driven by trials in which response grouping may have occurred. Post-hoc paired sample t-test comparisons revealed significantly smaller vocal than manual dual-task costs in both stimulus conditions. In the auditory condition, we observed a difference of 178 ms, t(31) = 4.45, p < .001, d = 0.95, while in the visual condition, the difference in dual-task costs between effector systems amounted to 88 ms, t(31) = 2.35, p = .026, d = 0.60. Moreover, post-hoc paired sample t-test comparisons revealed that manual dual-task costs were significantly greater in the auditory than in the visual stimulus condition, t(31) = 4.58, p < .001, d = 0.63, while vocal dual-task costs did not significantly differ as a function of stimulus modality, t(31) = 1.54, p = .134, d = 0.29.

Additionally, the interaction of effector system and stimulus modality and the interaction of task condition and stimulus modality were significant. Post-hoc contrasts showed faster vocal RTs to visual stimuli (954 ms) than to auditory stimuli (1,033 ms), F(1, 31) = 16.76, p < .001, ηp² = .35, which was also observed for manual RTs, F(1, 31) = 4.71, p = .038, ηp² = .13 (784 ms for auditory, 746 ms for visual stimuli). The interaction of task condition and stimulus modality revealed overall smaller dual-task costs in the visual stimulus condition (331 ms, single-task RTs: 685 ms vs. dual-task RTs: 1,016 ms) than in the auditory stimulus condition (415 ms, single-task RTs: 701 ms vs. dual-task RTs: 1,116 ms).

Error Rates

Errors occurred relatively rarely (4.6% in total). Nevertheless, there was a significant main effect of effector system, indicating more errors for manual (4.9%) than for vocal (3.8%) responses, and a main effect of task condition, indicating more errors in dual-task conditions (6.9%) than in single-task conditions (1.7%). The three-way interaction of effector system, task condition, and stimulus modality was significant, too. Post-hoc paired sample t-test comparisons revealed greater dual-task costs for manual



responses than for vocal responses in the visual stimulus condition, t(31) = 3.21, p = .003, d = 0.41, while there was no significant difference in dual-task costs between the two effector systems in the auditory stimulus condition, t(31) = 0.14, p = .887, d = 0.02.

Discussion

We compared dual-task costs associated with manual and vocal responses between conditions involving either only visual or only auditory stimuli. We used stimuli with two independent aspects (i.e., identity and orientation of letters in the visual domain, location and pitch in the auditory domain) in order to trigger the two responses independently. Therefore, we were able to examine and compare effector-based task prioritization effects for visual and auditory stimulation conditions. Generally, our findings revealed significant dual-task costs in both effector systems throughout all conditions. The observation of significant dual-task costs for vocal responses (which were executed second in 74% of trials) differs from Huestegge and Koch (2013), who did not observe any performance costs for vocal responses (most likely because in this previous study the vocal response was always of the same identity, e.g., left, as the manual response, and hence there was no need to independently select the correct spatial code for the vocal response). Most importantly, the present results confirm the effector-based prioritization pattern reported in Huestegge and Koch (2013) across all conditions. Specifically, there was a significant prioritization of vocal-over-manual responses (as indexed by greater dual-task costs in RTs for the latter) in both stimulus modality conditions, and this observation was not compromised by any reversed pattern in the error data. Thus, we can rule out that vocal-over-manual prioritization can only be observed with auditory stimuli as a result of IOMC-like effects. The present results therefore also suggest that effector system pairings have a greater effect on dual-task capacity allocation policies than input–output modality mappings, as the latter had only a negligible effect on the performance cost pattern. We assume that this prioritization is rooted in an effector-based allocation of capacity to response selection processes.
As the effector system associated with a response is essentially an execution-related characteristic, it appears likely that the effector system associated with a response is already anticipated at an early stage during task processing, prior to assigning capacity to the individual response selection processes (see also Hoffmann et al., 2019). This view is in line with other suggestions implying that response-related features (e.g., proximal and


distal effects associated with responses as in ideomotor theories) are anticipated and thereby influence response selection (see, e.g., Badets et al., 2016; Pfister, 2019, for reviews). While our general observation of greater dual-task costs for manual (vs. vocal) responses is nicely in line with Huestegge and Koch (2013), there is still a discrepancy with respect to reports of other previous studies that observed a reversed manual versus vocal dual-task cost pattern using either visual or (simultaneous) visual and auditory stimuli (e.g., Fagot & Pashler, 1992; Holender, 1980; Schumacher et al., 2001; Stelzel et al., 2006). However, note that there are numerous methodological differences between our present setting and these earlier studies (as well as among them), and it would take a large set of experiments to pinpoint the crucial differences that may turn vocal-over-manual prioritization into manual-over-vocal prioritization. Most importantly, however, our present results clearly demonstrated that effects of IOMC are not a central causal factor in this context. Our results support theoretical accounts that suggest capacity sharing or resource scheduling between tasks in dual-task control (Meyer & Kieras, 1997; Navon & Miller, 2002; Tombu & Jolicœur, 2003). Moreover, the results further specify such models in that they suggest that the allocation of capacity is also determined by task characteristics such as the associated (anticipated) effector systems. Specifically, it appears conceivable to incorporate effector-based attentional weighting parameters in computational theories of dual-task control such as executive control of the theory of visual attention (ECTVA; Logan & Gordon, 2001; see Hoffmann et al., 2019; Huestegge & Koch, 2013; Pieczykolan & Huestegge, 2019, for further discussion). It should be noted that vocal-over-manual prioritization in RTs was more pronounced in the auditory (vs. visual) stimulus condition.
At first sight, this might indicate that IOMC-like effects (e.g., Hazeltine et al., 2006; Stelzel et al., 2006) at least modulate the extent of the effector-based prioritization effect. However, one issue speaks against such a clear conclusion here: Our analysis of the error rates pointed in the opposite direction. Specifically, while the difference in dual-task costs was more pronounced in auditory (vs. visual) stimulus conditions in the RT data, it was more pronounced in visual (vs. auditory) stimulus conditions in the error rates. Therefore, a speed-accuracy trade-off compromises any clear conclusion regarding the direction of a potential modulation of effector prioritization by IOMC-like phenomena. Additionally, it is important to keep in mind that our present study differs from typical IOMC studies in that we compared one dual-task condition involving only visual stimuli with another dual-task condition involving only auditory stimuli (intra-modal stimulation, while typical IOMC studies



usually compare two different input–output modality mapping conditions using bimodal stimulation; see also Hoffmann et al., 2019). Thus, we conclude that specific input–output modality mappings can (to some extent) modulate dual-task performance (at least by affecting speed-accuracy policies), but not abolish or reverse effector-based prioritization (here: of vocal-over-manual responses) in dual-task control. Overall, effector system pairings appear to have a more substantial impact on capacity allocation policies in dual-task control than input–output modality mappings.
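The capacity-sharing interpretation favored above can be illustrated with a deliberately simple toy model (all parameters hypothetical and chosen only for illustration, not fitted to the data): if selection time scales inversely with allocated capacity and a fixed policy grants the vocal effector the larger dual-task share, vocal responses incur smaller costs even though they are slower overall.

```python
# Toy capacity-sharing model (all numbers hypothetical): response selection
# time scales inversely with allocated capacity; a fixed allocation policy
# favors the vocal effector under dual-task demands.
def selection_time(base_ms, capacity):
    return base_ms / capacity

base = {"manual": 500, "vocal": 800}      # single-task times (full capacity)
share = {"manual": 0.40, "vocal": 0.60}   # dual-task allocation policy

single = {e: selection_time(base[e], 1.0) for e in base}
dual = {e: selection_time(base[e], share[e]) for e in base}
costs = {e: dual[e] - single[e] for e in base}
# Vocal responses are slower overall yet show the smaller dual-task cost,
# mirroring the qualitative pattern in the RT data.
assert single["vocal"] > single["manual"]
assert costs["vocal"] < costs["manual"]
```

An effector-based weighting parameter of this kind is what the text above suggests adding to computational accounts such as ECTVA.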

References

Badets, A., Koch, I., & Philipp, A. M. (2016). A review of ideomotor approaches to perception, cognition, action, and language: Advancing a cultural recycling hypothesis. Psychological Research, 80, 1–15. https://doi.org/10.1007/s00426-014-0643-8
Fagot, C., & Pashler, H. (1992). Making two responses to a single object: Implications for the central attentional bottleneck. Journal of Experimental Psychology: Human Perception and Performance, 18, 1058–1079. https://doi.org/10.1037/0096-1523.18.4.1058
Göthe, K., Oberauer, K., & Kliegl, R. (2016). Eliminating dual-task costs by minimizing crosstalk between tasks: The role of modality and feature pairings. Cognition, 150, 92–108. https://doi.org/10.1016/j.cognition.2016.02.003
Greenwald, A. G. (1972). On doing two things at once: Time sharing as a function of ideomotor compatibility. Journal of Experimental Psychology, 94, 52–57. https://doi.org/10.1037/h0032762
Greenwald, A. G. (2003). On doing two things at once: III. Confirmation of perfect timesharing when simultaneous tasks are ideomotor compatible. Journal of Experimental Psychology: Human Perception and Performance, 29, 859–868. https://doi.org/10.1037/0096-1523.29.5.859
Halvorson, K. M., Ebner, H., & Hazeltine, E. (2013). Investigating perfect timesharing: The relationship between IM-compatible tasks and dual-task performance. Journal of Experimental Psychology: Human Perception and Performance, 39, 413–432. https://doi.org/10.1037/a0029475
Hazeltine, E., Ruthruff, E., & Remington, R. (2006). The role of input and output modality pairings in dual-task performance: Evidence for content-dependent central interference. Cognitive Psychology, 52, 291–345. https://doi.org/10.1016/j.cogpsych.2005.11.001
Hoffmann, M. A., Pieczykolan, A., Koch, I., & Huestegge, L. (2019). Motor sources of dual-task interference: Evidence for effector-based prioritization in dual-task control. Journal of Experimental Psychology: Human Perception and Performance, 45, 1355–1374. https://doi.org/10.1037/xhp0000677
Holender, D. (1980). Interference between a vocal and a manual response to the same stimulus. In G. E. Stelmach & J. Requin (Eds.), Tutorials in motor behavior (Advances in Psychology, Vol. 1, pp. 421–431). Amsterdam: North-Holland. https://doi.org/10.1016/S0166-4115(08)61959-7
Huestegge, L. (2011). The role of saccades during multitasking: Towards an output-related view of eye movements.

Experimental Psychology (2020), 67(1), 48–55

M.A. Hoffmann et al., Input Modality and Vocal Effector Prioritization

Psychological Research, 75, 452–465. https://doi.org/10.1007/s00426-011-0352-5
Huestegge, L., & Adam, J. J. (2011). Oculomotor interference during manual response preparation: Evidence from the response-cueing paradigm. Attention, Perception, & Psychophysics, 73, 702–707. https://doi.org/10.3758/s13414-010-0051-0
Huestegge, L., & Hazeltine, E. (2011). Crossmodal action: Modality matters. Psychological Research, 75, 445–451. https://doi.org/10.1007/s00426-011-0373-0
Huestegge, L., & Koch, I. (2013). Constraints in task-set control: Modality dominance patterns among effector systems. Journal of Experimental Psychology: General, 142, 633–637. https://doi.org/10.1037/a0030156
Huestegge, L., Pieczykolan, A., & Koch, I. (2014). Talking while looking: On the encapsulation of output system representations. Cognitive Psychology, 73, 72–91. https://doi.org/10.1016/j.cogpsych.2014.06.001
Logan, G. D., & Gordon, R. D. (2001). Executive control of visual attention in dual-task situations. Psychological Review, 108, 393–434. https://doi.org/10.1037/0033-295x.108.2.393
Maquestiaux, F., Ruthruff, E., Defer, A., & Ibrahime, S. (2018). Dual-task automatization: The key role of sensory–motor modality compatibility. Attention, Perception, & Psychophysics, 80, 752–772. https://doi.org/10.3758/s13414-017-1469-4
Meyer, D. E., & Kieras, D. E. (1997). A computational theory of executive cognitive processes and multiple-task performance: Part I. Basic mechanisms. Psychological Review, 104, 3–65. https://doi.org/10.1037/0033-295x.104.1.3
Navon, D., & Miller, J. (2002). Queuing or sharing? A critical evaluation of the single-bottleneck notion. Cognitive Psychology, 44, 193–251. https://doi.org/10.1006/cogp.2001.0767
Pashler, H. (1994). Dual-task interference in simple tasks: Data and theory. Psychological Bulletin, 116, 220–244. https://doi.org/10.1037/0033-2909.116.2.220
Pfister, R. (2019). Effect-based action control with body-related effects: Implications for empirical approaches to ideomotor action control. Psychological Review, 126, 153–161. https://doi.org/10.1037/rev0000140
Pieczykolan, A., & Huestegge, L. (2014). Oculomotor dominance in multitasking: Mechanisms of conflict resolution in cross-modal action. Journal of Vision, 14(13), 18. https://doi.org/10.1167/14.13.18
Pieczykolan, A., & Huestegge, L. (2019). Action scheduling in multitasking: A multi-phase framework of response-order control. Attention, Perception, & Psychophysics, 81, 1464–1487. https://doi.org/10.3758/s13414-018-01660-w
Schacherer, J., & Hazeltine, E. (2019). How conceptual overlap and modality pairings affect task-switching and mixing costs. Psychological Research, 83, 1020–1032. https://doi.org/10.1007/s00426-017-0932-0
Schumacher, E. H., Seymour, T. L., Glass, J. M., Fencsik, D. E., Lauber, E. J., Kieras, D. E., & Meyer, D. E. (2001). Virtually perfect time sharing in dual-task performance: Uncorking the central cognitive bottleneck. Psychological Science, 12, 101–108. https://doi.org/10.1111/1467-9280.00318
Stelzel, C., & Schubert, T. (2011). Interference effects of stimulus–response modality pairings in dual tasks and their robustness. Psychological Research, 75, 476–490. https://doi.org/10.1007/s00426-011-0368-x
Stelzel, C., Schumacher, E. H., Schubert, T., & D'Esposito, M. (2006). The neural effect of stimulus–response modality compatibility on dual-task performance: An fMRI study. Psychological Research, 70, 514–525. https://doi.org/10.1007/s00426-005-0013-7
Stephan, D. N., & Koch, I. (2010). Central cross-talk in task switching: Evidence from manipulating input–output modality compatibility. Journal of Experimental Psychology: Learning,

© 2020 Hogrefe Publishing Distributed under the Hogrefe OpenMind License (https://doi.org/10.1027/a000001)



Memory, and Cognition, 36, 1075–1081. https://doi.org/10.1037/a0019695
Stephan, D. N., & Koch, I. (2011). The role of input–output modality compatibility in task switching. Psychological Research, 75, 491–498. https://doi.org/10.1007/s00426-011-0353-4
Stephan, D. N., Koch, I., Hendler, J., & Huestegge, L. (2013). Task switching, modality compatibility, and the supra-modal function of eye movements. Experimental Psychology, 60, 90–99. https://doi.org/10.1027/1618-3169/a000175
Tombu, M., & Jolicœur, P. (2003). A central capacity sharing model of dual-task performance. Journal of Experimental Psychology: Human Perception and Performance, 29, 3–18. https://doi.org/10.1037/0096-1523.29.1.3

History
Received July 1, 2019
Revision received February 5, 2020
Accepted April 14, 2020
Published online June 9, 2020



Open Data
Data are publicly available at https://doi.org/10.5281/zenodo.3756790.

Funding
This work was supported by the Deutsche Forschungsgemeinschaft (German Research Foundation, HU 1847/4-1) to Lynn Huestegge.

ORCID
Mareike A. Hoffmann, https://orcid.org/0000-0003-1028-5049

Mareike A. Hoffmann
Institute of Psychology
University of Würzburg
Röntgenring 11
97070 Würzburg
Germany
mareike.a.hoffmann@gmail.com



Registered Report

Does Object Size Matter With Regard to the Mental Simulation of Object Orientation?

Sau-Chin Chen¹, Bjorn B. de Koning², and Rolf A. Zwaan²

¹ Department of Human Development and Psychology, Tzu-Chi University, Taiwan
² Department of Psychology, Education & Child Studies, Erasmus University Rotterdam, The Netherlands

Abstract. Language comprehenders have been argued to mentally represent the implied orientation of objects. However, compared to the effects of shape, size, and color, the effect of orientation is rather small. We examined a potential explanation for the relatively low magnitude of the orientation effect: Object size moderates the orientation effect. Theoretical considerations led us to predict a smaller orientation effect for small objects than for large objects in a sentence–picture verification task. We furthermore investigated whether this pattern generalizes across languages (Chinese, Dutch, and English) and tasks (picture-naming task). The results of the verification task show an overall orientation effect that is moderated neither by object size (contrary to our hypothesis) nor by language (consistent with our hypothesis). Meanwhile, the preregistered picture–picture verification task showed the predicted interaction between object size and the orientation effect. We conducted exploratory analyses to address additional questions.

Keywords: sentence–picture verification task, match advantages, object orientation, object size, language aspects

Theories of mental simulation propose that language comprehension involves the reactivation of perceptual experiences (e.g., Barsalou, 1999). In the past few decades, these theories have acquired empirical evidence from a number of tasks, including the sentence–picture verification task. Studies using this task have found that reading a probe sentence that implies a particular perceptual feature facilitates verification responses for pictures representing the target objects with that particular feature (Connell, 2007; de Koning, Wassenburg, Bos, & van der Schoot, 2017a; Stanfield & Zwaan, 2001; Zwaan & Madden, 2005; Zwaan & Pecher, 2012; Zwaan, Stanfield, & Yaxley, 2002).

Comparisons of Match Advantages

Consider object shape. The probe sentence "He saw the eagle in the sky" implies the shape of a flying eagle, whereas the sentence "He saw the eagle in the nest" implies the shape of a perched eagle. Importantly, these sentences do not explicitly state that the eagle was flying or perched, respectively. The shape can only be inferred from the eagle's location. Moreover, the task was to indicate whether the depicted entity was mentioned in the sentence, whereby a sentence could be followed by a picture that matched the shape implied by the sentence (e.g., a picture of a flying eagle after the sky sentence) or by a picture that mismatched the shape implied by the sentence (e.g., a picture of a flying eagle after the nest sentence), which means that yes responses were required irrespective of the picture. Nevertheless, responses were faster to shape-matching pictures than to mismatching ones (Zwaan et al., 2002). This match advantage is consistent with the idea of mental simulation, according to which the sentence reactivates the relevant perceptual experiences during language comprehension.

Experimental Psychology (2020), 67(1), 56–72 https://doi.org/10.1027/1618-3169/a000468

Besides shape, three other perceptual features (orientation, color, and size) have been investigated in the literature. Orientation has shown the weakest effects compared to the other features. For object color, the initial findings suggested a mismatch advantage (Connell, 2005, 2007). However, using the same materials, Zwaan and


S.-C. Chen et al., Simulation of Object Orientation Across Size

Pecher (2012) obtained a match advantage of object color in an experiment with a larger number of participants. Whereas Connell tested 40–60 participants, Zwaan and Pecher recruited 152 participants in their online study. Recently, the match advantage of color was confirmed in two separate laboratory studies using different sets of materials (de Koning et al., 2017a; Hoeben Mannaert, Dijkstra, & Zwaan, 2017).

Two lines of studies have investigated the mental simulation of object size. The first line of research used probe sentences that described the distance between a reference point and the object (Vukovic & Williams, 2015; Winter & Bergen, 2012). These probe sentences implied a far or a near distance, and the pictures were presented in a correspondingly small or large size. Winter and Bergen's probe sentences implied the absolute distance between the observer and the object (e.g., "... the milk bottle in the fridge" vs. "... the milk bottle on the end of the counter"), and they found a match advantage, whereas Vukovic and Williams' probe sentences implied the relative distance between the observer and the object (e.g., "In front of you, ..." vs. "In the distance, ..."), and they found a mismatch advantage. The second line of research manipulated the physical appearance of the object (de Koning, Wassenburg, Bos, & van der Schoot, 2017b). The target pictures of an object had the same appearance but differed in size. The probe sentence implied a large object, as in "... the bone of a dinosaur," or a small object, as in "... the bone of a rabbit." Consistent with the study by Winter and Bergen, de Koning et al. found a match advantage of object size. Thus, studies of object size show a robust match advantage for sentences that imply absolute object size, whether through physical appearance or absolute distance.
The match advantages for orientation were smaller than those for the other perceptual features, and the findings were less consistent across studies, although it is unclear how these differences have come about (de Koning et al., 2017a). Stanfield and Zwaan (2001) reported the first match advantage of object orientation. Their study used sentences that implied horizontal and vertical orientations of objects. For example, the sentence "Frank placed the iron onto the shelf, hoping he wouldn't be late" implies a vertically oriented iron, whereas the sentence "Frank pressed the iron onto his pants, hoping he wouldn't be late" implies a horizontal iron. Using materials such as these, Stanfield and Zwaan found a 44-ms match advantage of orientation in their laboratory study involving 40 participants.

Subsequent studies have investigated object orientation with slightly modified designs and obtained an inconsistent pattern of results. Using identical materials as Stanfield and Zwaan (2001) and a larger number of participants recruited


on the internet (n = 336), Zwaan and Pecher (2012) obtained a roughly equal match advantage (35 ms). Using a memory task and Dutch materials, Pecher, van Dantzig, Zwaan, and Zeelenberg (2009) found match advantages for object shape and orientation. However, other studies failed to obtain significant match advantages for object orientation. With the same Dutch materials as in Pecher et al. (2009), Rommers, Meyer, and Huettig (2013) found a nonsignificant 1-ms match advantage of object orientation in their sentence–picture verification study. Recently, de Koning et al. (2017a) also used Dutch sentences and likewise failed to find a significant orientation match advantage with a different set of materials: Participants were only 7 ms faster on matching items than on mismatching items.

Studies involving primary school children (8–12 years old) also showed inconsistent findings for object shape and orientation. Similar to Rommers et al. (2013) and de Koning et al. (2017a), Engelen, Bouwmeester, de Bruin, and Zwaan (2011) investigated the match advantages of object orientation and shape by intermixing orientation trials and shape trials. Engelen et al. found an average match advantage of 74 ms, but the intermixed-trial design made it difficult to infer whether the match advantage was due to the orientation trials or the shape trials.

Two things set the above-mentioned studies (e.g., Engelen et al., 2011; de Koning et al., 2017a) apart from the original study and its replications. First, the former did not employ a task that kept participants focused on the meaning of the sentences. Stanfield and Zwaan (2001) had participants recall the target sentence after a certain number of trials, while Zwaan and Pecher (2012) used a sentence comprehension test. It is thus possible that the small effect sizes in the other studies are due to the lack of a task focusing on sentence comprehension. This is important because the effect is assumed to occur as a result of sentence comprehension. If participants are not prompted to comprehend the sentences, a reduction in effect size is plausible (Zwaan, 2014). The second difference between these studies, on the one hand, and the studies by Stanfield and Zwaan (2001) and Zwaan and Pecher (2012), on the other hand, is that the former studies presented the sentences in Dutch, whereas the original study and its direct replications presented the stimuli in English.

These differences among orientation studies notwithstanding, the question remains why orientation yields a smaller effect than shape, size, or color when the same task is used. Consistent with the summary of Zwaan and Pecher (2012), Cohen's d for orientation (0.13) is less than half of that for shape (0.31). de Koning et al. (2017a) also indicated that, using a direct comparison in a within-subjects



design, the smallest effect sizes were related to object orientation (0.07) and size (0.07) compared to color (0.48) and shape (0.27).

We hypothesize that the relatively small effect of orientation is due to the nature of the objects used in the orientation experiments. All of these studies used the visual stimuli from the original study (Stanfield & Zwaan, 2001) or stimuli that were based on, or were Dutch translations of, the original ones. A common feature of the objects described and depicted in these stimuli is that they can be manipulated with a single hand. Most (but not all) of these stimuli represent easily manipulable objects such as a hairbrush or a pencil. Thus, for these items, the critical visual feature (orientation) can be changed by a simple manipulation during real-world action. This stands in contrast with the objects used in the shape (e.g., perched vs. flying eagle), color (e.g., red vs. green stoplight), and size (e.g., dinosaur vs. rabbit bone) experiments, where such a featural change through manipulation is impossible.

That the past orientation studies employed small objects is relevant for at least two reasons. First, in the real world, such small objects are constantly seen changing orientation. Take for instance a pen, which is frequently seen lying flat on a desk, standing upright in a pen holder, being held by someone writing a note, or even being twiddled idly in someone's hand. This means that we have a great deal of experience seeing small objects rapidly change from one orientation to the next and back. Nonmanipulable objects, or those that can only be manipulated with two hands, are usually not observed changing orientation in close temporal succession (e.g., a street lantern is typically seen standing upright; even when it is transported on a truck to a new location, it does not rapidly change orientation). As a result, our visual experience with small manipulable objects differs from that with larger objects that are more difficult or impossible to manipulate.

Second, for objects that are manipulable with one hand, it is relatively easy to obtain a different orientation by either physically or mentally transforming the object. There is behavioral and neuroimaging evidence that visual mental rotation and manual rotation rely on overlapping neural substrates (e.g., Parsons et al., 1995; Wexler, Kosslyn, & Berthoz, 1998; Windischberger, Lamm, Bauer, & Moser, 2003; Wohlschläger & Wohlschläger, 1998). It might therefore be hypothesized that visual mental rotation is facilitated by motor mental rotation. As a result, participants should be able to quickly recover from seeing a mismatching picture, which would reduce the advantage of a matching picture. For example, a pen can easily be turned from a vertical to a horizontal position with the hand, or one can imagine doing so. These lines of thinking lead us to predict a shorter simulation time for the orientation of small objects compared to large objects.


Exploration of a Mental Rotation Account

A second aim of the current study is to explore to what extent mental rotation can account for the failure to find a match advantage of object orientation. de Koning et al. (2017a) suggested that mental rotation, as an alternative process to mental simulation, could quickly erase a mismatched orientation, replacing it with the orientation that matches the one described in the sentence (Cohen & Kubovy, 1993; Yaxley & Zwaan, 2007). However, so far, this suggestion is speculative at best: de Koning et al. did not test it directly, and their findings could hardly speak to it because they only had results for small, manipulable objects.

To put the suggestion of de Koning et al. to the test, we draw upon mental rotation research. Among the various mental rotation paradigms (Cohen & Kubovy, 1993; Shepard & Metzler, 1971; Zwaan & Taylor, 2006), the reaction time to verify whether two figures presented in different orientations are the same or different has typically been used to measure mental rotation. In Study 2 reported in this paper, we used a version of this standard mental rotation paradigm (Shepard & Metzler, 1971) and employed the real-world objects from the sentence–picture verification task as our stimuli. The task was thus a picture–picture verification task in which participants verified whether two pictures, presented simultaneously side by side, were the same or different, either in the same orientation (horizontal–horizontal; vertical–vertical) or in different orientations (horizontal–vertical; vertical–horizontal). If the mental rotation account is plausible, we expect the large, nonmanipulable objects to require more rotation time than the small, manipulable objects. Empirical support for this hypothesis would provide converging evidence for the hypothesis that mental simulation effects will be smaller for small items than for large items.

The Present Study

The present study aims to extend prior research on the mental simulation of object orientation during language comprehension. In three experiments, we tested the preregistered predictions outlined above using the sentence–picture verification task and the picture–picture verification task. We tested these predictions in three languages, Chinese, Dutch, and English, of which English (Stanfield & Zwaan, 2001; Zwaan & Pecher, 2012) and Dutch



(de Koning et al., 2017a, 2017b; Rommers et al., 2013) have been studied before. However, while the shape effect has been replicated and extended in Chinese (Gao & Jiang, 2018), we know of no research on the orientation effect in Chinese. Thus, including Chinese allows us to further investigate the generalizability of the orientation effect across languages.

In Study 1, we investigated whether verification times for pictures depicting horizontal and vertical objects in the sentence–picture verification task were shorter for large objects than for small objects and whether this was consistent across the three languages. In Study 2, we investigated the mental rotation account using a picture–picture verification task and tested whether verifying that two pictures match in orientation produces a larger match advantage for large than for small objects (again with speakers of the three languages). In response to a suggestion by an anonymous reviewer of the preregistration, in Study 3 we used a sentence–picture naming task, similar to Study 1, in which participants had to make a vocal response. A picture-naming task was used because it provides a stronger test of the mental simulation hypotheses than the verification task in that it does not call for a comparison between the sentence and the picture.

General Method

To test the plausibility of our hypothesis, we constructed a stimulus set including small manipulable objects (e.g., a pen) as well as large nonmanipulable objects (e.g., a boat or a missile) and tested it in an initial study. In this initial study, we examined the match advantage for orientation across large and small objects and across three languages: English, Dutch, and Chinese (see the data and the summary in Electronic Supplementary Materials, ESM 1 and 2). Effects for orientation were found in English but not in Dutch. In line with our prediction, a meta-analysis of this initial study showed a significant match advantage for large objects but no match advantage for small objects. In the present study, we built on this initial study regarding the design, materials, and experimental procedure.

Experimental Procedures

Three groups of native speakers (English, Dutch, and Chinese) participated in the sentence–picture verification task (Study 1) and the picture–picture verification task (Study 2) in a single experimental session. The sentence–picture naming task (Study 3) took place in a separate experimental session and involved participants from two groups of native speakers (English and Chinese). We used the


same design and materials that were employed in the sentence–picture verification task and the picture–picture verification task in our initial study, except that we revised the probe sentences to eliminate the orientation implications of Dutch verbs. Specifically, the initial study yielded two unforeseen issues. First, the Dutch participants surprisingly showed a marginally significant match advantage for small objects. Closer inspection of the items revealed that some of the Dutch sentences contained verbs implying a particular object orientation. Orientational verbs such as liggen (to lie) and staan (to stand) provide an additional, and more explicit, clue about the object's orientation than does the object's location. Although the use of these verbs is natural in Dutch (in fact, more natural than using the less specific is), their use in the present experiment undermines the original goal of letting the orientation of the target object be determined by its location (described in a prepositional phrase). It stands to reason that providing an additional orientation cue would increase the size of the match advantage. We addressed this issue in the present study by replacing the orientation verbs with orientation-neutral verbs such as is.

In the sentence–picture verification task, each experimental session began after six practice trials. A trial started with a left-justified and vertically centered fixation point for 1,000 ms, which was immediately followed by the probe sentence, presented in the same location as the fixation point. Participants pressed the space bar when they had read the sentence. Immediately thereafter, a horizontally and vertically centered fixation point appeared for 500 ms, after which the target picture was presented. Participants pressed the j key if they thought the depicted object was mentioned in the preceding sentence or the f key if they thought it was not.
They were instructed to verify the target picture as quickly and accurately as possible.

The picture–picture verification task was based on the sentence–picture verification task but did not include the probe sentences. There were two further differences. First, only one horizontally and vertically centered fixation point appeared before the target pictures. Second, the two target pictures appeared next to the fixation point (one on each side) until a response was made or until 2 s had passed.

The picture-naming task was identical to the sentence–picture verification task except for the mode of responding. Instead of pressing a key on the keyboard, the participant read the object name aloud as quickly as possible within 3 s after the object picture had been presented. Upon completion of the recording, an evaluation screen appeared that presented the object picture and required participants to evaluate their response using one of four options: right, wrong, no response, and recording




failed. We managed all the tasks in Gorilla.sc (Anwyl-Irvine, Massonnié, Flitton, Kirkham, & Evershed, 2019). All the materials are available in Gorilla Open Materials (see the public link in Appendix A).
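The trial sequence described above can be restated as data, which makes the timing parameters easy to check. This is an illustrative sketch only (the authors ran the tasks in Gorilla.sc, so none of this is their code; the `Event` class and function name are ours):

```python
# Illustrative sketch, not the authors' Gorilla.sc implementation:
# the event sequence of one sentence-picture verification trial.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Event:
    name: str
    duration_ms: Optional[int]           # None = waits for a key press
    response_keys: Tuple[str, ...] = ()

def verification_trial() -> list:
    """One sentence-picture verification trial, as described in the text."""
    return [
        Event("fixation_left", 1000),               # left-justified fixation, 1,000 ms
        Event("probe_sentence", None, ("space",)),  # sentence shown until space bar
        Event("fixation_center", 500),              # centered fixation, 500 ms
        Event("target_picture", None, ("j", "f")),  # j = mentioned, f = not mentioned
    ]
```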

Material and Design

A total of 128 pairs of sentences and gray-colored pictures (scaled to 240 × 240 pixels) were used. Sixty-four pairs were the critical items, and the other 64 sentence–picture pairs were fillers. Sixteen pictures of large objects were obtained from various sources, and the pictures of 16 small objects and the 64 fillers were obtained from standardized stimulus pools (Bonin, Peereman, Malardier, Méot, & Chalard, 2003; Brodeur, Guérard, & Bouras, 2014) and from the internet. The fillers consisted of a sentence accompanied by an unrelated picture.

The critical items were created by crossing three within-participant variables: size (large vs. small), orientation (horizontal vs. vertical), and match (matching vs. mismatching of probe sentence and target picture). Of the 64 critical items, 32 described large objects and the other 32 described small objects. Object size was determined according to whether the object can be manipulated with a single hand (small) or only by heavy machinery or a force of nature (large). The object terms were the grammatical subject of every critical sentence. The critical sentences implied the object orientations by means of a prepositional phrase at the end of the sentence. For example, "The pen is on the table" implies a horizontal orientation of the pen, and "The pen is in the container" implies a vertical orientation; "The missile was flying over the sea" implies a horizontal orientation, and "The missile was launched from the submarine" implies a vertical orientation. The sentences were written in Chinese, Dutch, and English. For each critical item, one target picture presenting the horizontal or vertical orientation matched one of the two critical sentences but mismatched the other. For reasons of counterbalancing, we created two stimulus lists, each containing two of the four sentence–picture combinations. Each participant was randomly exposed to only a single horizontal or vertical picture of a specific target object.

The picture–picture verification task included two within-participant variables: object size (large vs. small) and match (identical vs. different). All target objects were presented in the default orientation and had two companions: one picture presented in the identical orientation and one presented in a different orientation. Each target object was presented twice with its companions in this task. In order to balance yes–no responses, 64 filler items involving pairs of different objects were selected from the stimulus set with the same object size. A total of 256 trials were presented to each participant in randomized order.

Study 1: Sentence–Picture Verification

The first aim of this study was to test the hypothesis that the match advantage for orientation items is larger for large objects than for small objects. The second aim was to test whether the results were similar across Chinese, Dutch, and English. Earlier studies had found effects for English but not for Dutch stimuli; to our knowledge, Chinese had not been tested. We modified some items for our Chinese participants. Whisk was replaced by chopsticks: Although a whisk is a common kitchen utensil for English and Dutch participants, it is unfamiliar to Chinese participants. Four large objects (drawbridge, wine barrel, steel barrel, and bottle) were also deemed unfamiliar to Chinese participants, so we revised the Chinese sentences describing these objects to make them comprehensible.

To foreshadow, with regard to the second hypothesis, the results of the Dutch participants deviated from those of the English and Chinese participants. We suspected that language-specific knowledge might have affected the Dutch participants' performance. Therefore, we collected additional data from a group of Dutch native speakers who were fluent in English and who were presented with the English version of the sentence–picture verification task. This additional data collection followed the preregistered data collection plan but was not itself preregistered.
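The factorial structure of the critical items described under Material and Design can be sketched as follows. This is an assumed reconstruction of the item layout, not the authors' stimulus lists; the function name and the item dictionaries are ours:

```python
# Illustrative sketch (not the authors' materials): the 64 critical items
# cross size x orientation x match, filling the 8 design cells evenly.
from itertools import product

SIZES = ("large", "small")
ORIENTATIONS = ("horizontal", "vertical")
MATCHES = ("matching", "mismatching")

def build_critical_items(n_items: int = 64) -> list:
    cells = list(product(SIZES, ORIENTATIONS, MATCHES))  # 8 design cells
    assert n_items % len(cells) == 0                     # even fill per cell
    return [
        {"item": i, "size": s, "orientation": o, "match": m}
        for i, (s, o, m) in enumerate(cells * (n_items // len(cells)))
    ]
```

The assertions below check the counts stated in the text: 32 large and 32 small objects, with yes/no (match/mismatch) responses balanced.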

Methods

Sampling Plan

We used G*Power (Faul, Erdfelder, Buchner, & Lang, 2009) to estimate the expected effect size and sample size based on a one-tailed t-test with a .05 significance level and 80–90% power. According to the preregistered plan, we conducted a sequential analysis based on Bayes factor estimation alongside data collection (Morey, Rouder, Love, & Marwick, 2015; Schönbrodt, Wagenmakers, Zehetleitner, & Perugini, 2017). We decided to use BF10 = 10 as the criterion to stop data collection for two reasons: (1) the collective evidence (de Koning et al., 2017a; Zwaan & Pecher, 2012) moderately supports the mental simulation of object orientation, BF10 = 20.944; (2) a BF10 between 10 and 30 implies strong


S.-C. Chen et al., Simulation of Object Orientation Across Size

evidence (Jeffreys, 1961). For each participant group (i.e., language), we computed a Bayesian t-test on the data of the sentence–picture verification task after each set of 40 participants. If the resulting BF10 was smaller than 10, we continued data collection with the next 40 participants. Once the BF10 for one of the match advantages exceeded 10, data collection for that group ended. For each group, data collection thus continued until BF10 exceeded 10 or the predetermined maximum of 160 participants was reached. The permanent link to the preregistered plan is https://osf.io/4z9my/.

Participants
This study recruited native speakers of English, Chinese, and Dutch. English-speaking participants were recruited via Prolific Academic, Chinese participants via Bounty Workers, and Dutch participants from the Erasmus University Rotterdam psychology participant pool. The age range of the English and Chinese participants was 18–40 years, and the age range of the Dutch participants was 18–30 years. In January 2019, Prolific Academic had 136 eligible Dutch participants and 344 eligible Chinese participants. The Chinese participants registered there were from various Asian countries where people usually intermix Traditional and Simplified Chinese, whereas Taiwanese readers are accustomed to Traditional Chinese. We therefore decided to recruit via Bounty Workers (https://bountyworkers.net/), as many of its registered participants are native speakers from Taiwan.

Preanalysis Processing
After data collection, we first removed from the dataset the data of participants who did not complete the task or who had no correct responses for at least one condition. This resulted in the removal of 22 (of 104) participants from the English sample, 24 (of 117) from the Chinese sample, 8 (of 121) from the Dutch sample, and 22 (of 121) from the Dutch sample doing the English task.
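The sequential stopping rule described above can be sketched in a few lines. The sketch below uses the BIC approximation to the Bayes factor for a one-sample t-test rather than the JZS default prior computed by the BayesFactor package, so exact values differ from the preregistered analyses; the function names and synthetic data are illustrative only.

```python
import math

def bf10_one_sample(data):
    """Approximate BF10 for a one-sample t-test of the mean match
    advantage against zero, via the BIC approximation:
    BF01 ~= sqrt(n) * (1 + t^2 / (n - 1)) ** (-n / 2)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)
    t = mean / math.sqrt(var / n)
    bf01 = math.sqrt(n) * (1 + t * t / (n - 1)) ** (-n / 2)
    return 1 / bf01

def run_sequential(batches, criterion=10.0, max_n=160):
    """Stopping rule: test after each batch of 40 participants; stop
    once BF10 exceeds the criterion or the maximum sample is reached."""
    data = []
    for batch in batches:
        data.extend(batch)
        bf = bf10_one_sample(data)
        if bf > criterion or len(data) >= max_n:
            break
    return len(data), bf
```

With a large, consistent match advantage, the rule stops at the first batch of 40; with no advantage, it runs to the 160-participant cap with the Bayes factor favoring the null.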

Results

Overall Accuracy of Verification and Comprehension
Table 1 summarizes the average accuracy on the sentence–picture verification task and the intertrial comprehension task for each language group. A between-participant analysis of variance showed a significant difference among the groups in the verification task, F(3, 459) = 14.04, MSE = 148.36, p < .001, ηp² = .084, and in the comprehension task, F(3, 459) = 8.44, MSE = 185.90, p < .001, ηp² = .052. Specifically, Dutch participants had a higher accuracy on the sentence–picture verification task than did the


Table 1. Average accuracy (in %) on the sentence–picture verification task and comprehension task (standard errors in parentheses)

Group           Verification    Comprehension
Chinese         83.05 (1.17)    70.35 (1.46)
Dutch–Dutch     93.06 (0.70)    77.71 (0.90)
Dutch–English   89.02 (1.12)    78.26 (1.14)
English         86.75 (1.50)    74.43 (1.55)

Chinese and English participants. Additionally, the Dutch participants' comprehension responses were more accurate than those of the English and Chinese participants. Based on this, we excluded participants in the English and Chinese groups whose accuracy scores were below 75% and 70%, respectively. For the Dutch group, participants whose accuracy was below 80% were excluded. The remaining dataset contained data of 82 English participants, 93 Chinese participants, 113 Dutch participants, and 104 Dutch participants doing the English task.

Sequential Analysis of Verification Response Times
We conducted the preregistered sequential analyses on the response time data of the sentence–picture verification task by language group. The results showed that the English and Chinese samples met the preregistered criterion of BF10 > 10 at half the preregistered maximum sample size. Figure 1 illustrates the sequential analysis of each group separately for large and small objects.

Verification Responses by Condition
Table 2 summarizes the mean reaction times and accuracy as a function of language group, object size, and match. In contrast to the Dutch participants, the English and Chinese participants showed match advantages in the mean reaction times. The Chinese participants performed worse than the other groups. We conducted three sets of statistical analyses as per the preregistered plans: (1) a three-way mixed analysis of variance with language group as between-participants factor and object size and match as within-participants factors; (2) a meta-analysis of the match advantage across language groups and object sizes; and (3) linear mixed-effect models with random intercepts for participants and items. Because the English version of the task was ultimately administered to two samples (native English speakers and Dutch speakers), there were four language groups in each analysis.
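The group-specific accuracy screening can be stated compactly; the sketch below is our own illustration of the reported rule, not the authors' code.

```python
# Group-specific verification-accuracy cutoffs reported in the Results.
THRESHOLDS = {"English": 0.75, "Chinese": 0.70, "Dutch": 0.80}

def keep_participant(group, accuracy, thresholds=THRESHOLDS):
    """Return True if a participant's verification accuracy meets the
    cutoff for their language group and should enter the analyses."""
    return accuracy >= thresholds[group]
```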
Three-Way Mixed ANOVA
The preregistered mixed ANOVA on the verification times showed that, at a significance level of .05, there were main effects of language group, F(3, 383) = 12.14, MSE = 84,138.96, p < .001, ηp² = .087; object size, F(1, 383) = 216.50, MSE = 6,053.32, p < .001, ηp² = .361; and match




Figure 1. Bayesian sequential analysis of sentence–picture verification response times. Chi = Chinese group; E = excluded low-accuracy data; Eng = English group; I = included low-accuracy data; L = large objects; NL-Dut = Dutch group, Dutch study; NL-Eng = Dutch group, English study; S = small objects.

advantage, F(1, 383) = 13.40, MSE = 6,166.75, p < .001, ηp² = .034. The main effects of object size and match advantage were consistent with our preregistered prediction. The main effect of language group was also significant but requires further exploration in light of the interactions. The interaction between object size and language group showed that Chinese participants required more time to verify the large objects than did the other groups, F(3, 383) = 4.24, MSE = 6,053.32, p = .006, ηp² = .032. There was also a significant interaction between language group and match, F(3, 383) = 2.90, MSE = 6,166.75, p = .035, ηp² = .022. The interaction between object size and match advantage was not significant, F < 1.

Also, the three-way interaction of language group with match and object size was not significant, F(3, 383) = 0.89, MSE = 5,892.41, p = .448, ηp² = .007. A three-way mixed ANOVA on the accuracy scores showed significant main effects of language group, F(3, 383) = 20.76, MSE = 3,621.59, p < .001, ηp² = .140, and match, F(1, 383) = 63.55, MSE = 951.25, p < .001, ηp² = .142. Unlike the response time data, the accuracy scores showed no effect of object size, F(1, 383) = 0.29, MSE = 629.80, p = .592, ηp² = .001. Consistent with the verification times, the interaction of object size and language group indicated that the Chinese participants made more errors for the large objects than did the other language groups,




Table 2. Mean reaction times and accuracy of the sentence–picture verification task (standard errors in parentheses)

Group               Object size   Matching   Mean response time   Accuracy
Chinese (N = 93)    Large         N          828.23 (24.80)       219.35 (7.57)
                    Large         Y          799.19 (19.14)       264.31 (4.14)
                    Small         N          765.30 (23.19)       239.31 (7.59)
                    Small         Y          728.21 (19.91)       266.94 (4.12)
Dutch (N = 113)     Large         N          726.96 (13.69)       278.60 (2.28)
                    Large         Y          725.65 (15.21)       277.27 (2.22)
                    Small         N          651.81 (11.47)       278.43 (2.07)
                    Small         Y          649.53 (12.44)       276.60 (2.08)
Dutch (N = 99)a     Large         N          690.47 (14.02)       271.59 (2.41)
                    Large         Y          677.56 (13.68)       280.87 (2.17)
                    Small         N          646.17 (12.80)       267.42 (3.08)
                    Small         Y          641.75 (13.10)       274.05 (2.48)
English (N = 82)    Large         N          722.27 (18.49)       271.19 (3.82)
                    Large         Y          693.14 (15.09)       281.02 (2.45)
                    Small         N          656.77 (14.92)       270.73 (4.46)
                    Small         Y          655.26 (15.57)       276.22 (3.95)

Note. Standard errors of response time and accuracy are in the parentheses. aDutch participants who joined the English study.


F(3, 383) = 8.08, MSE = 629.80, p < .001, ηp² = .060. The interaction of language group and match advantage also indicated that the Chinese participants made more errors for the mismatching object orientations, F(3, 383) = 27.63, MSE = 951.25, p < .001, ηp² = .178. This analysis indicated no interaction of object size and match and no three-way interaction, Fs < 1.
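As a quick consistency check, the reported effect sizes above can be recovered from each F statistic and its degrees of freedom; this is a generic identity, not code from the study.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from a reported F statistic:
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)
```

For example, F(3, 459) = 14.04 yields ηp² = .084, matching the value reported for the verification-accuracy ANOVA.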

Meta-Analysis on the Match Advantage
Following the preregistered analysis plan, we compared the match advantages across language groups. The meta-analysis showed that the English group had the largest effect size for large objects, M = 29.13, 95% CI [10.48, 47.78], while the Chinese group had the largest effect size for small objects, M = 37.09, 95% CI [11.18, 62.99]. Additionally, as planned, we tested whether the effect size for large objects was larger than that for small objects. The meta-analysis showed a match advantage for large objects, M = 16.54, 95% CI [3.10, 29.97], but a null effect for small objects, M = 8.53, 95% CI [−4.20, 21.27]. Figure 2 shows the results of this meta-analysis in a forest plot. The preregistered hypothesis for this study predicted that object size should moderate the orientation match advantage. To test this hypothesis, we conducted a moderator analysis with the effect size of the match advantage as the dependent measure, language group as the independent variable, and object size as the moderator. We ran this analysis without and with heterogeneous residuals. In the analysis without heterogeneous residuals, the coefficient for object size was estimated to be b = 8.09 (SE = 9.42) and was above the preregistered significance level, p = .39. The analysis with heterogeneous residuals returned a similar result: the estimated coefficient was b = 8.01 (SE = 9.45) and was above the preregistered significance level, p = .40.
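The two steps of this analysis can be sketched as inverse-variance (fixed-effect) pooling of per-group match advantages plus a Wald z test on the moderator coefficient. The study used a metafor-style meta-regression, and the group-level standard errors are not reported in the text, so any pooling inputs here are hypothetical; the z-test check, however, uses the reported b = 8.09 and SE = 9.42.

```python
import math

def pool_fixed(effects, ses):
    """Fixed-effect (inverse-variance) pooling of per-group effects:
    returns the pooled mean and its 95% confidence interval."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    return mean, (mean - 1.96 * pooled_se, mean + 1.96 * pooled_se)

def moderator_z_test(b, se):
    """Wald z test for a meta-regression (moderator) coefficient,
    with a two-sided normal p value."""
    z = b / se
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p
```

moderator_z_test(8.09, 9.42) reproduces the reported p = .39 for the object-size moderator; moderator_z_test(16.38, 4.89) reproduces the z = 3.35 reported in Study 2.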

Linear Mixed-Effect Model on the Verification Times
The preregistered analysis plan aimed to explore the interactions of the three fixed effects (language group, object size, and matching of sentence and object). Based on the lowest Akaike information criterion (Akaike, 1974; Burnham & Anderson, 2010), we selected the model including the trial sequence and the correlation of trial sequence and items. Appendix B summarizes the model selection. Table 3 summarizes the fixed effects in the linear mixed-effect model. The mixed-effect model showed main effects of matching and language group below the preregistered significance level of p < .05, but neither the main effect of object size nor any of the interactions reached significance. In particular, the mixed-effect model showed no interaction of the match advantage with object size (consistent with the ANOVA and the meta-analysis) and no interaction with language group. Table 4 summarizes the coefficients of the random effects in the linear model, reported following the suggestion of Barr, Levy, Scheepers, and Tily (2013). The variances of the random effects are critical for controlling Type 1 error and statistical power.
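Selection by lowest AIC, as used above, can be illustrated with plain least squares; this is a deliberately simplified stand-in for the lme4-style mixed models (data and names are invented), showing only how the criterion trades fit against parameter count.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, RSS)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, rss

def aic_gaussian(rss, n, k):
    """AIC for a Gaussian model, up to an additive constant:
    n * ln(RSS / n) + 2k, where k is the number of parameters."""
    return n * math.log(rss / n) + 2 * k
```

For reaction times that drift with trial number, the model including the trial predictor attains the lower AIC despite its extra parameter, which is the logic behind retaining trial sequence in the reported models.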






Figure 2. Meta-analysis on the mean differences of match advantages.

Study 2: Picture–Picture Verification

The aim of Study 2 was to examine the mental rotation hypothesis. We assumed that large, nonmanipulable objects require longer rotation times than small, manipulable objects. Therefore, we predicted that participants would take more time to verify large objects presented in different orientations.

Method
Participants
Same as in Study 1.

Results
The analysis includes data from participants who had high accuracy (English: >75%, Chinese: >70%, Dutch: >80%) in the sentence–picture verification task. Table 5 summarizes the descriptive statistics as a function of language group, object size, and match. Each language group showed a longer mean reaction time and a lower accuracy for large objects. Participants made faster and more accurate responses for object pairs presented in the same orientation, especially for large objects.

Three-Way ANOVA
Consistent with the preregistered plan, we conducted a mixed ANOVA on the correct reaction times as a function of language group, object size, and match. This analysis showed significant main effects of language group, F(3, 457) = 7.32, MSE = 28,383.87, p < .001, ηp² = .046; object size, F(1, 457) = 411.90, MSE = 870.22, p < .001, ηp² = .474; and match, F(1, 457) = 1,208.09, MSE = 854.53, p < .001, ηp² = .726. Of more importance to our hypotheses, the interaction between object size and match was significant, F(1, 457) = 46.16, MSE = 673.50, p < .001, ηp² = .092: the cost of a mismatching orientation was larger for large objects (55 ms) than for small objects (39 ms). The interaction between language group and object size was not significant, F(3, 457) = 1.37, MSE = 870.22, p = .252, ηp² = .009, but there was a significant interaction between language group and match, F(3, 457) = 5.77, MSE = 854.53, p = .001, ηp² = .036. The nonsignificant interaction with object size suggests that the effect of object size generalizes across the three languages. The mixed ANOVA on the accuracy scores showed significant main effects of object size, F(1, 457) = 729.85,





Table 3. Fixed effects of critical variables in the sentence–picture verification, picture–picture verification, and sentence–picture naming tasks

                                         Sentence–picture verification    Picture–picture verification    Sentence–picture naming
Term                                     Estimate (SE)      t             Estimate (SE)      t            Estimate (SE)      t
Intercept                                972.03 (69.93)     13.90***      831.49 (20.84)     39.91***     968.77 (30.64)     31.61***
Trial                                    4.49 (0.17)        26.07***      0.17 (0.04)        4.20***      0.71 (0.36)        1.96
Group (English)                          85.30 (20.40)      4.18***       64.29 (21.86)      2.94**       4.24 (36.07)       0.12
Group (Dutcha)                           51.48 (19.77)      2.60**        44.44 (19.82)      2.24*        —                  —
Group (Dutchb)                           64.92 (21.36)      3.04**        112.44 (22.55)     4.99***      —                  —
Object size                              65.29 (95.13)      0.69          161.31 (18.03)     8.95***      133.56 (23.62)     5.65***
Matchingc                                27.89 (9.35)       2.98**        67.51 (18.76)      3.60***      3.37 (12.98)       0.26
Object size: matchingc                   15.53 (12.83)      1.21          93.14 (25.38)      3.67***      10.49 (13.75)      0.76
Group (English): small size              16.88 (12.41)      1.36          92.57 (21.91)      4.22***      11.36 (19.56)      0.58
Group (Dutcha): small size               11.31 (12.56)      0.90          44.70 (21.20)      2.11*        —                  —
Group (Dutchb): small size               4.91 (13.39)       0.37          39.84 (16.08)      2.48*        —                  —
Group (English): matchingc               22.60 (12.19)      1.85          36.95 (22.87)      1.62         13.14 (17.63)      0.75
Group (Dutcha): matchingc                2.37 (13.13)       0.18          59.65 (22.51)      2.65**       —                  —
Group (Dutchb): matchingc                10.04 (19.99)      0.50          41.81 (33.20)      1.26         —                  —
Group (English): small size: matchingc   15.47 (19.29)      0.80          39.72 (26.31)      1.51         18.95 (28.23)      0.67
Group (Dutcha): small size: matchingc    2.42 (17.07)       0.14          34.28 (32.33)      1.06         —                  —
Group (Dutchb): small size: matchingc    32.74 (18.42)      1.78          14.92 (20.04)      0.74         —                  —

Note. *p < .05, **p < .01, ***p < .001. aDutch participants who finished the Dutch study. bDutch participants who finished the English study. cMatching of sentence and picture (Studies 1 and 3) and of target objects (Study 2).

Table 4. Random effects in the sentence–picture verification, picture–picture verification, and sentence–picture naming tasks

                          Sentence–picture verification    Picture–picture verification    Sentence–picture naming
Term                      Variance        Corr.            Variance        Corr.           Variance        Corr.
Participant (intercept)   34,931.03       NA               22,441.47       NA              62,029.52       NA
Participant: trial        8.16            0.55             NA              NA              0.76            0.22
Item (intercept)          73,845.79       NA               2,311.71        NA              5,945.33        NA
Residual                  50,671.09       NA               26,228.04       NA              122,131.56      NA


MSE = 26.26, p < .001, ηp² = .615, and match, F(1, 457) = 646.19, MSE = 45.65, p < .001, ηp² = .586. Unlike the analysis of response times, the main effect of language group was not significant, F(3, 457) = 1.26, MSE = 117.05, p = .289, ηp² = .008. Consistent with the reaction time analysis, the analysis of response accuracy showed a significant interaction of object size and match, F(1, 457) = 929.43, MSE = 22.55, p < .001, ηp² = .670. This interaction indicated that participants made more errors for the large objects presented in different orientations. The other interactions were above the predefined significance level of .05.

Meta-Analysis on the Verification Times
In line with the preregistered hypothesis for the first study, this study investigated whether object size moderates the mental rotation of target objects. We conducted the


moderator analysis with the effect size of match as the dependent measure, language group as the independent variable, and object size as the moderator. We ran this analysis without and with heterogeneous residuals. In the analysis without heterogeneous residuals, the coefficient for object size was estimated to be b = 16.38 (SE = 4.89) and was below the preregistered significance level, z = 3.35, p = .001. The analysis with heterogeneous residuals returned a similar result: the estimated coefficient was b = 16.42 (SE = 4.89) and was below the preregistered significance level, z = 3.36, p = .001.

Mixed-Effect Model
As in Study 1, we selected the model including the trial sequence and the correlation of trial sequence and participants as the best-fitting model. Table 3 summarizes the coefficients of the fixed effects. In




Table 5. Mean reaction times and accuracy (in %) of the picture–picture verification task (standard errors in parentheses)

Group               Object size   Matching   Mean RT          Accuracy (%)
Chinese (N = 116)   Large         FALSE      674.96 (9.48)    83.22 (0.77)
                    Large         TRUE       614.90 (7.72)    96.90 (0.57)
                    Small         FALSE      633.33 (9.03)    95.69 (0.60)
                    Small         TRUE       596.54 (9.09)    96.55 (0.61)
Dutch (N = 121)     Large         FALSE      650.00 (8.37)    82.23 (0.81)
                    Large         TRUE       603.66 (7.53)    96.26 (0.39)
                    Small         FALSE      619.18 (7.38)    95.27 (0.44)
                    Small         TRUE       586.37 (6.92)    96.07 (0.38)
English (N = 120)   Large         FALSE      638.01 (8.22)    80.60 (1.14)
                    Large         TRUE       584.15 (7.67)    96.25 (0.47)
                    Small         FALSE      603.57 (7.60)    94.09 (0.93)
                    Small         TRUE       565.45 (6.90)    96.20 (0.43)
Dutch (N = 104)a    Large         FALSE      695.01 (9.60)    81.61 (1.02)
                    Large         TRUE       632.74 (8.49)    97.33 (0.47)
                    Small         FALSE      657.27 (8.82)    95.46 (0.74)
                    Small         TRUE       608.26 (7.84)    96.72 (0.44)

Note. Standard errors of RT and accuracy are in the parentheses. aDutch participants who joined the English study.

addition to the main effects of match, object size, and language group, this mixed-effect model confirmed the larger effect of match for the large objects, β = 93.14, t = 3.67, p < .01. The model also indicated an interaction between match and language, with a larger match effect for Dutch participants than for Chinese participants, β = 59.65, t = 2.65, p < .01.

Study 3: Naming Study

As in Study 1, the aims of Study 3 were to test whether there was a larger match advantage for large than for small objects and whether this advantage was similar across languages. Study 3 focused on the language groups that showed the orientation effects in Study 1: English and Chinese. If the match advantage obtained in Study 1 were also found in this study, this would provide corroborating evidence that mental simulations are performed when processing large objects.

Method
Procedure
The procedure was similar to that in Study 1 except that participants said the name of the object in the target picture aloud instead of pressing a response key. Before participants commenced with the experiment, they tested their

audio recording function on a calibration page. In each practice and experimental trial, participants named the object aloud within 3 s. The Gorilla website, which was used for presenting the stimuli, stored each vocal response in MP3 format. Participants then verified their voice response with four options: right, wrong, no response, and recording failed. After half of the filler trials, participants completed comprehension questions about the probe sentences, which were used to check whether participants were reading these sentences for meaning. As in the sentence–picture verification task, participants filled in the postsurvey about the study at the end.

Naming Latency Coding
The sound files were archived in mono MP3 format at 128 kb/s. We extracted the naming latency of each voice response in two phases. In the first phase, we used a Praat (Boersma, 2001) script to determine the naming latency of sound files without noise. In the second phase, we checked the participants' evaluations of their own recordings. Fifteen English participants had indicated that more than 50% of their responses failed to record, but these sound files turned out to contain accurate recordings of the participants' voices. Ten Chinese participants' sound files included background noise, which caused the Praat script to compute incorrect naming latencies. Prior to the data analysis, we therefore reset the 15 English participants' evaluations and manually coded the 10 Chinese participants' naming latencies.
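The core of latency coding is voice-onset detection. The sketch below is a deliberately simplified amplitude-threshold stand-in for the Praat script (which works on intensity contours and handles noise floors); the function name, units, and parameters are our own.

```python
def naming_latency(samples, sample_rate, threshold):
    """Return the voice-onset latency in milliseconds: the time of the
    first sample whose absolute amplitude exceeds the threshold, or
    None if the trial contains no detectable speech."""
    for i, amplitude in enumerate(samples):
        if abs(amplitude) > threshold:
            return 1000.0 * i / sample_rate
    return None
```

Background noise that exceeds the threshold before speech onset would yield a spuriously early latency, which is why the noisy Chinese recordings described above had to be coded manually.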




Table 6. Mean reaction times and accuracy (in %) of the sentence–picture naming task (standard errors in parentheses)

Group              Object size   Matching   Mean RT          Accuracy (%)
Chinese (N = 87)   Large         N          912.59 (23.36)   93.97 (0.81)
                   Large         Y          907.11 (21.58)   94.25 (0.79)
                   Small         N          800.38 (17.93)   97.70 (0.54)
                   Small         Y          815.43 (20.43)   97.41 (0.65)
English (N = 76)   Large         N          873.08 (37.85)   95.48 (0.95)
                   Large         Y          877.94 (32.85)   96.13 (0.97)
                   Small         N          771.28 (31.79)   96.38 (0.89)
                   Small         Y          801.32 (27.47)   96.63 (0.90)

Note. Standard errors of RT and accuracy are in the parentheses.

Participants
English and Chinese participants from the same participant pools as in Study 1 (English: Prolific Academic; Chinese: Bounty Workers) took part. Thirty-two participants whose responses contained over 50% unrecognizable sounds or who were not native speakers were removed from the dataset. The data analysis included the responses of 76 English participants and 87 Chinese participants.

Results
Table 6 summarizes the mean reaction times and accuracy of the participants. Both English and Chinese participants showed a weak match disadvantage in the picture-naming task, and they required more time to name the large objects than the small objects. The preregistered plan designated the analyses of the picture-naming responses as exploratory. We tested for a difference between large and small objects in a mixed ANOVA with language, object size, and match as factors, and we compared the responses to large and small objects using a mixed-effect model approach.

Three-Way Mixed ANOVA on the Naming Latency
The analysis of naming latencies yielded a significant main effect of object size, F(1, 138) = 166.84, MSE = 8,399.33, p < .001, ηp² = .547. All other main effects and interactions had p > .1 and ηp² ≤ .005.

Linear Mixed-Effect Model
The fixed effect of object size was the only coefficient below the preregistered significance level of p < .05. The confidence intervals of the naming times indicated that large objects required more time to name, M = 973.91, 95% CI [904.45, 1,042.78], than small objects, M = 833.28, 95% CI [780.43, 886.14]. Tables 3 and 4 summarize the coefficients of the fixed effects and random effects, respectively.

Discussion

We set out to investigate to what extent inconsistent findings previously reported for the mental simulation of object orientation are related to the size of objects. In addition, we examined whether our findings would generalize across languages (Chinese, Dutch, and English). We conducted three preregistered studies in which we manipulated the size of objects to be either small or large. Participants made verification judgments (Studies 1 and 2) or named the depicted objects (Study 3). We hypothesized that a larger match advantage would be obtained for large objects than for small objects. We performed a confirmatory test on the verification task (based on earlier research using this task) and explored the effects in the picture-naming task. Additionally, we hypothesized that the match advantage should be similar across languages. Moreover, we tested a mental rotation account predicting that large and nonmanipulable objects should take longer to mentally rotate than small and manipulable objects, which would result in a smaller match advantage for small objects than for large ones. In the following discussion, we separate conclusions about our confirmatory analyses from exploratory analyses and comments.

Confirmatory Analyses
Our first preregistered hypothesis was that the orientation effect would be larger for large than for small objects. Contrary to this hypothesis, the preregistered mixed ANOVA showed a null interaction of object size and object orientation in the sentence–picture verification task. The meta-analysis of the sentence–picture verification times (Study 1) showed that although there was a larger match advantage for large than for small objects, the difference between these meta-analytic effects was not significant (see the Study 1 Results and Meta-Analysis on the Match Advantage sections).



The meta-analysis suggests that the pattern is more complex than we assumed; we return to this in the exploratory section of this discussion. Additionally, in contrast to the sentence–picture verification experiment (Study 1), the picture–picture verification experiment (Study 2) did show the predicted interaction between object size and match. Thus, without a sentence context, pictures of large objects do yield a larger orientation effect than do pictures of small objects. To summarize, the evidence regarding our first preregistered hypothesis is mixed. Only the most purely visual task (Study 2) showed the predicted interaction between size and orientation; the lack of such an interaction in the language-based task (Study 1) suggests that object size cannot explain the relatively small size of the orientation effect compared with other perceptual features such as color, actual object size (per se), or shape. Our second hypothesis pertained to the generalizability of the orientation effect across languages, specifically in the sentence–picture verification task, as this is the task used in virtually all previous studies. Although the meta-analysis of the Study 1 results shows a main effect of match, the pattern of orientation effects differed across languages in a rather complex manner. The interaction itself is not very strong (p = .035), and object size is not a moderator in the meta-analysis. As a supplement to our first hypothesis, we investigated in Study 2 whether differences could be found in picture verification times for large and small objects, in an attempt to examine the role of mental rotation in the orientation effect. The mixed ANOVA showed longer verification times for larger objects, as we predicted. Also, the meta-analysis indicated that object size moderated the verification times for target pictures. This suggests that mental rotation is a process that participants engage in during a picture–picture verification task.

Exploratory Analyses
As shown in the meta-analysis of Study 1, there was an effect of orientation across languages and object sizes. This extends the original finding by Stanfield and Zwaan (2001). Nevertheless, the meta-analytic effect is small (12 ms) and considerably smaller than the original effect of 44 ms or the Zwaan and Pecher (2012) replication effect of 35 ms. Importantly, the present study is not a direct replication. First, different materials were used. Second, three language groups were included. However, even if one takes only the native English speakers, the effect (across large and small objects) is 15 ms. The match advantage for the Chinese participants is 35 ms. The Dutch participants


showed no orientation effect when the stimuli were presented in Dutch (2 ms across large and small objects). This is consistent with earlier findings of Rommers et al. (2013) and de Koning et al. (2017a), who also tested the orientation effect using Dutch stimuli. Combined with these earlier findings, a pattern seems to be emerging. In English, a small but reliable match advantage is found. In Dutch, however, no effect is found, while in Chinese a larger effect is found than in either English or Dutch (in the present study). As alluded to earlier, the lack of an effect in Dutch might be due to the nature of the stimuli. In Dutch it is common, at least according to the second and third authors' intuitions as native speakers of Dutch, to use a verb that describes the orientation of an object when describing that object's location. Thus, Het boek staat op de plank (The book stands on the shelf) is more common than Het boek is op de plank (The book is on the shelf). It is possible that the lack of an orientation verb in our stimulus sentences threw our Dutch speakers off in that it violated their expectations. It is conceivable that Dutch speakers are used to relying on the orientation verb as the primary source of information about an object's orientation, rather than inferring it from a prepositional phrase, as our stimuli required. Perhaps, then, the lack of an orientation effect in Dutch is due to the relative unusualness of our Dutch stimulus sentences. Two aspects of our data are consistent with this tentative explanation. First, as noted before, in our pilot study we obtained an effect in a Dutch–Dutch sample for small objects. When we checked the stimuli, we noticed that many stimulus sentences contained an orientation verb. These sentences had been constructed by a Dutch research assistant, who had translated them from English examples.
This means that the research assistant spontaneously used orientation verbs when translating sentences that did not contain one. Apparently, the assistant considered this the best translation rather than a more literal one using an orientation-neutral verb, which was deliberately used in the original study by Stanfield and Zwaan (2001) precisely because of this characteristic. Moreover, we obtained an orientation effect with these sentences containing orientation verbs. Thus, the lack of an orientation effect in Dutch sentences might be attributed to the absence of orientation verbs. The meta-analytic data provide a second hint that is consistent with this idea. When tested in English, the Dutch participants showed a pattern that lies somewhere between the English and Dutch data, at least with regard to the large objects. The Dutch–Dutch sample has an orientation effect of 1 ms, while the English sample has 29 ms. The Dutch–English sample has a difference of 13 ms.



To be sure, this analysis is highly speculative (hence its presence in a section titled Exploratory Analyses), but it paves avenues for further research. The Chinese sample shows the largest effects of the three language groups and is, in fact, the only sample showing an effect for small objects. We are not sure why this is the case. One thing to note, however, is that, as mentioned earlier, this sample shows by far the largest variability in response times. One plausible factor is that many Taiwanese participants were likely not as experienced as the English and Dutch participants in online psychological experiments. Nevertheless, even though the Chinese data showed the largest variability in this study, our best-fitting mixed-effect model did not show a significant interaction between language group and match advantage. A further observation concerns the putative role of mental rotation in the various tasks. The meta-analysis of Study 1 suggests that object size did not moderate the match advantage of object orientation, although we found a significant match advantage for large objects in English and for small objects in Chinese. The meta-analysis of Study 2, on the other hand, indicates that object size did moderate the mental rotation speed of objects. These results demonstrate that the mental rotation hypothesis provides a better account of the perception task (i.e., picture–picture verification) than of the reading task (i.e., sentence–picture verification). The preregistered hypothesis was supported only by the results from the picture–picture verification task. The results suggest that mental rotation might play a role in the picture–picture verification task, but not in the sentence–picture verification task. Hence, the present study does not confirm the suggestion of de Koning et al.
(2017a) that the failure to find an orientation effect in a sentence–picture verification task might be due to participants' engagement in mental rotation processes, whereby a mismatched orientation is quickly transformed into the orientation of the object as described in the sentence.

According to our preregistered plan, the findings of the sentence–picture naming task (Study 3) provide clues for exploring theoretical and methodological issues beyond the confirmatory analysis. Our picture-naming task showed a main effect of object size but a null effect of object orientation. This result speaks to the theoretical distinction between extrinsic and intrinsic properties (Scorolli, 2014). Embodied cognition researchers have classified object orientation as an extrinsic property and object size as an intrinsic property (e.g., de Koning et al., 2017a). This classification rests on the assumption that simulating an intrinsic property requires only the visual system, whereas simulating an extrinsic property requires both the visual and the motor systems. The finding that supports this


distinction is that English readers, in feature-generation tasks, produced more intrinsic than extrinsic properties of target objects (McRae, Cree, Seidenberg, & McNorgan, 2005; Wu & Barsalou, 2009). One account to be examined is that the object name alone would suffice to initiate the simulation of the extrinsic property, whereas simulating the intrinsic property would require understanding of the sentence context. It is premature to claim that our findings support the theoretical distinction between extrinsic and intrinsic properties, but our analysis provides clues for further research. One potential topic is to isolate the two kinds of properties. In a study measuring two object properties, an orthogonal manipulation of the properties would be the appropriate method. Take object size and orientation as an example. Researchers could test the target picture rocket in four scenarios: large object and vertical "The rocket that will carry the satellite has been placed on the launching platform"; large object and horizontal "The rocket that is being transferred to the base will carry the satellite"; small object and vertical "In that diorama, a rocket has been placed on the launching platform"; and small object and horizontal "The rocket on that table will be placed in the diorama on the launching platform." With a number of sentence–picture sets like this one, researchers could examine whether there is an interaction between size and orientation and evaluate to what extent the sentence context influences the retrieval of object names.
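As a concrete illustration of the proposed orthogonal manipulation, the four cells of the 2 × 2 design can be enumerated programmatically. The sketch below is ours, not the authors' (variable names and the use of Python are illustrative assumptions), and it reads the fourth example sentence as the small/horizontal cell:

```python
# Hypothetical sketch: crossing implied object size with implied orientation
# for the target picture "rocket", using the sentences proposed in the text.
from itertools import product

sentences = {
    ("large", "vertical"): ("The rocket that will carry the satellite "
                            "has been placed on the launching platform."),
    ("large", "horizontal"): ("The rocket that is being transferred to "
                              "the base will carry the satellite."),
    ("small", "vertical"): ("In that diorama, a rocket has been placed "
                            "on the launching platform."),
    ("small", "horizontal"): ("The rocket on that table will be placed "
                              "in the diorama on the launching platform."),
}

# A fully crossed design covers every size x orientation combination exactly once.
conditions = list(product(["large", "small"], ["vertical", "horizontal"]))
assert set(conditions) == set(sentences)

for size, orientation in conditions:
    print(f"{size}/{orientation}: {sentences[(size, orientation)]}")
```

With item sets constructed this way, a 2 (size) × 2 (orientation) interaction can be tested directly in a mixed-effect analysis.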

Conclusions

In this preregistered report, we tested two hypotheses about the orientation effect in the sentence–picture verification task. The first hypothesis was that the match advantage should be larger for large nonmanipulable objects than for small manipulable objects. The second hypothesis was that this effect should generalize across the languages we tested: Chinese, Dutch, and English. Neither hypothesis was supported by the data. Although the match effect was numerically larger for large objects than for small objects, there was no significant interaction between object size and match. Therefore, we cannot conclude that object manipulability is a factor in the orientation effect, nor that the typically smaller size of the orientation effect, relative to the effects for shape, color, and size, is due to manipulability. Contrary to our second hypothesis, the orientation effect did vary across languages: it was largest in Chinese, smaller in English, and absent in Dutch.



In exploratory analyses, we attempted to explain all our findings and suggested a design for exploring further theoretical questions. We tentatively explain the lack of an orientation effect in Dutch by noting that the experimental sentences are somewhat unusual in Dutch compared with English. A future (preregistered) experiment could examine this idea further. These results suggest that it may not be as straightforward as researchers think to generalize effects from one language to other languages and from one task to other tasks.

We should note that there are several limitations to our approach. First, we explored the cognitive aspects of simulating object orientation and size in terms of specific tasks, namely, the sentence–picture verification task (our primary experiment), the picture–picture verification task, and the sentence–picture naming task. These tasks did not show similar patterns of results, suggesting that the results are, to some extent, task-specific. The sentence–picture verification task, our primary task, showed the most complex pattern of results, which we have tried to explain above. The sentence–picture naming task showed mainly null results. We are not sure whether this is because we collected the naming data online or simply because the picture-naming task may not be sensitive enough to detect an orientation effect, given that the orientation effect is small to begin with and naming effects tend to be smaller than verification effects (Zwaan, 2014). The picture–picture verification task showed the clearest results. This is perhaps not surprising given that it is the only language-independent task and does not rely on inferential (or knowledge-activation) processes. Remarkably, this task showed an interaction between object size and the match advantage; this interaction might have to do with the size of the depicted objects, but it is also open to a more mundane explanation, for example, that the pictures depicting larger objects had more visual features to be verified than the pictures depicting small objects.

The current study set out to investigate key issues that have been unexplored in embodied cognition research (see Ostarek & Huettig, 2019). Although not fully conclusive, our results point to two aspects that could help advance research and theorizing regarding the mental simulation of object properties. First, we investigated whether and how variation in the characteristics of objects (i.e., size) within one object property (i.e., orientation) affects the mental simulation of objects. Second, we investigated whether the mental simulation of object orientation is language independent. Our findings suggest that differences in mental simulation exist across languages, which is important to take into account when predicting, comparing, or interpreting mental simulation findings from various languages. Together, the findings of this study provide a first step in furthering our


understanding of the mental simulation of object properties and will hopefully inspire other researchers to contribute to further developments. Along these lines, Chen et al. (2018) assessed the replicability of the match advantage of object orientation in more than 14 languages. These and other initiatives can build on our findings to explore novel questions and further refine theories of mental simulation in language comprehension.

Electronic Supplementary Materials

The electronic supplementary material is available with the online version of the article at https://doi.org/10.1027/1618-3169/a000468

ESM 1. Data and summary of the stimulus set
ESM 2. Data and summary of the stimulus set

References

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. https://doi.org/10.1109/TAC.1974.1100705
Anwyl-Irvine, A. L., Massonnié, J., Flitton, A., Kirkham, N., & Evershed, J. K. (2019). Gorilla in our midst: An online behavioral experiment builder. Behavior Research Methods, 52, 388–407. https://doi.org/10.3758/s13428-019-01237-x
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255–278. https://doi.org/10.1016/j.jml.2012.11.001
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577–660. https://doi.org/10.1017/S0140525X99002149
Boersma, P., & Weenink, D. (2001). Praat, a system for doing phonetics by computer. Glot International, 5(9/10), 341–345.
Bonin, P., Peereman, R., Malardier, N., Méot, A., & Chalard, M. (2003). A new set of 299 pictures for psycholinguistic studies: French norms for name agreement, image agreement, conceptual familiarity, visual complexity, image variability, age of acquisition, and naming latencies. Behavior Research Methods, Instruments, & Computers, 35(1), 158–167. https://doi.org/10.3758/BF03195507
Brodeur, M. B., Guérard, K., & Bouras, M. (2014). Bank of standardized stimuli (BOSS) phase II: 930 new normative photos. PLoS One, 9(9), e106953. https://doi.org/10.1371/journal.pone.0106953
Burnham, K. P., & Anderson, D. R. (2010). Model selection and multimodel inference: A practical information-theoretic approach (2nd ed.). New York, NY: Springer.
Chen, S.-C., Szabelska, A., Chartier, C. R., Kekecs, Z., Lynott, D., Bernabeu, P., …, Schmidt, K. (2018). Investigating object orientation effects across 14 languages. https://doi.org/10.31234/osf.io/t2pjv
Cohen, D., & Kubovy, M. (1993). Mental rotation, mental representation, and flat slopes. Cognitive Psychology, 25, 351–382. https://doi.org/10.1006/cogp.1993.1009

© 2020 Hogrefe Publishing


S.-C. Chen et al., Simulation of Object Orientation Across Size

Connell, L. (2005). Colour and stability in embodied representations. In B. Bara, L. W. Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th annual conference of the Cognitive Science Society (pp. 482–487). Mahwah, NJ: Lawrence Erlbaum.
Connell, L. (2007). Representing object colour in language comprehension. Cognition, 102, 476–485. https://doi.org/10.1016/j.cognition.2006.02.009
De Koning, B. B., Wassenburg, S. I., Bos, L. T., & van der Schoot, M. (2017a). Mental simulation of four visual object properties: Similarities and differences as assessed by the sentence–picture verification task. Journal of Cognitive Psychology, 29(4), 420–432. https://doi.org/10.1080/20445911.2017.1281283
De Koning, B. B., Wassenburg, S. I., Bos, L. T., & van der Schoot, M. (2017b). Size does matter: Implied object size is mentally simulated during language comprehension. Discourse Processes, 54(7), 493–503. https://doi.org/10.1080/0163853X.2015.1119604
Engelen, J. A. A., Bouwmeester, S., de Bruin, A. B. H., & Zwaan, R. A. (2011). Perceptual simulation in developing language comprehension. Journal of Experimental Child Psychology, 110(4), 659–675. https://doi.org/10.1016/j.jecp.2011.06.009
Faul, F., Erdfelder, E., Buchner, A., & Lang, A. G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41, 1149–1160. https://doi.org/10.3758/BRM.41.4.1149
Gao, X., & Jiang, T. (2018). Sensory constraints on perceptual simulation during sentence reading. Journal of Experimental Psychology: Human Perception and Performance, 44(6), 848–855. https://doi.org/10.1037/xhp0000475
Hoeben Mannaert, L. N., Dijkstra, K., & Zwaan, R. A. (2017). Is color an integral part of a rich mental simulation? Memory & Cognition, 45(6), 974–982. https://doi.org/10.3758/s13421-017-0708-1
Jeffreys, H. (1961). Theory of probability (3rd ed.). Oxford, UK: Oxford University Press.
McRae, K., Cree, G. S., Seidenberg, M., & McNorgan, C. (2005). Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods, 37(4), 547–559. https://doi.org/10.3758/BF03192726
Morey, R. D., Rouder, J. N., Love, J., & Marwick, B. (2015). BayesFactor: 0.9.12-2 CRAN. Zenodo. https://doi.org/10.5281/zenodo.31202
Ostarek, M., & Huettig, F. (2019). Six challenges for embodiment research. Current Directions in Psychological Science, 28(6), 593–599. https://doi.org/10.1177/0963721419866441
Parsons, L. M., Fox, P. T., Downs, J. H., Glass, T., Hirsch, T. B., Martin, C. C., …, Lancaster, J. L. (1995). Use of implicit motor imagery for visual shape discrimination as revealed by PET. Nature, 375(6526), 54–58. https://doi.org/10.1038/375054a0
Pecher, D., van Dantzig, S., Zwaan, R. A., & Zeelenberg, R. (2009). Language comprehenders retain implied shape and orientation of objects. Quarterly Journal of Experimental Psychology, 62(6), 1108–1114. https://doi.org/10.1080/17470210802633255
Rommers, J., Meyer, A. S., & Huettig, F. (2013). Object shape and orientation do not routinely influence performance during language processing. Psychological Science, 24(11), 2218–2225. https://doi.org/10.1177/0956797613490746
Schönbrodt, F. D., Wagenmakers, E.-J., Zehetleitner, M., & Perugini, M. (2017). Sequential hypothesis testing with Bayes factors: Efficiently testing mean differences. Psychological Methods, 22(2), 322–339. https://doi.org/10.1037/met0000061
Scorolli, C. (2014). Embodiment and language. In L. Shapiro (Ed.), The Routledge handbook of embodied cognition (pp. 127–138). New York, NY: Routledge.
Shepard, R. N., & Metzler, J. (1971). Mental rotation of three-dimensional objects. Science, 171(3972), 701–703. https://doi.org/10.1126/science.171.3972.701



Stanfield, R. A., & Zwaan, R. A. (2001). The effect of implied orientation derived from verbal context on picture recognition. Psychological Science, 12(2), 153–156. https://doi.org/10.1111/1467-9280.00326
Vukovic, N., & Williams, J. N. (2015). Individual differences in spatial cognition influence mental simulation of language. Cognition, 142, 110–122. https://doi.org/10.1016/j.cognition.2015.05.017
Wexler, M., Kosslyn, S. M., & Berthoz, A. (1998). Motor processes in mental rotation. Cognition, 68(1), 77–94. https://doi.org/10.1016/S0010-0277(98)00032-8
Windischberger, C., Lamm, C., Bauer, H., & Moser, E. (2003). Human motor cortex activity during mental rotation. NeuroImage, 20(1), 225–232. https://doi.org/10.1016/s1053-8119(03)00235-0
Winter, B., & Bergen, B. (2012). Language comprehenders represent object distance both visually and auditorily. Language and Cognition, 4(1), 1–16. https://doi.org/10.1515/langcog-2012-0001
Wohlschläger, A., & Wohlschläger, A. (1998). Mental and manual rotation. Journal of Experimental Psychology: Human Perception and Performance, 24(2), 397–412.
Wu, L. L., & Barsalou, L. W. (2009). Perceptual simulation in conceptual combination: Evidence from property generation. Acta Psychologica, 132(2), 173–189. https://doi.org/10.1016/j.actpsy.2009.02.002
Yaxley, R. H., & Zwaan, R. A. (2007). Simulating visibility during language comprehension. Cognition, 105(1), 229–236. https://doi.org/10.1016/j.cognition.2006.09.003
Zwaan, R. A. (2014). Replications should be performed with power and precision: A response to Rommers, Meyer, and Huettig (2013). Psychological Science, 25(1), 305–307. https://doi.org/10.1177/0956797613509634
Zwaan, R. A., & Madden, C. J. (2005). Embodied sentence comprehension. In D. Pecher & R. A. Zwaan (Eds.), Grounding cognition: The role of perception and action in memory, language, and thinking (pp. 224–245). Cambridge, UK: Cambridge University Press.
Zwaan, R. A., & Pecher, D. (2012). Revisiting mental simulation in language comprehension: Six replication attempts. PLoS One, 7, e51382. https://doi.org/10.1371/journal.pone.0051382
Zwaan, R. A., Stanfield, R. A., & Yaxley, R. H. (2002). Language comprehenders mentally represent the shapes of objects. Psychological Science, 13, 168–171. https://doi.org/10.1111/1467-9280.00430
Zwaan, R. A., & Taylor, L. J. (2006). Seeing, acting, understanding: Motor resonance in language comprehension. Journal of Experimental Psychology: General, 135(1), 1–11. https://doi.org/10.1037/0096-3445.135.1.1

History
Received June 16, 2017
Revision received December 18, 2019
Accepted December 20, 2019
Published online May 11, 2020

Acknowledgments
We thank David Feinberg of McMaster University for providing assistance and suggestions for extracting the naming latencies (see the contact information of Feinberg's laboratory at https://osf.io/jc2hk/). We thank Falk Huettig and an anonymous reviewer for their comments on our Stage 1 proposal. David Feinberg and Jianan Li provided assistance with naming data processing.

Authorship
Sau-Chin Chen prepared the materials and scripts and collected and analyzed the data. Bjorn de Koning and Rolf Zwaan translated and revised the materials and managed the Dutch participant pool. All authors contributed to the writing of the proposal and the final report.

Experimental Psychology (2020), 67(1), 56–72




Open Data
See information in Appendix A.

Funding
This research was supported by grants from the Ministry of Science and Technology, Taiwan, R.O.C. (MOST 105-2918-I-320002; MOST 105-2410-H-320-001) and Tzu-Chi University (TCMRC-P104002; TCMRC-P-107013).

ORCID
Sau-Chin Chen
http://orcid.org/0000-0001-6092-6049

Correspondence
Sau-Chin Chen
Department of Human Development and Psychology
Tzu-Chi University
No. 67, Jie-Ren St.
Hualien 97004
Taiwan
csc2009@mail.tcu.edu.tw

Appendix A

Guideline for Replication and Reproduction

Researchers planning to replicate the studies can download the experimental scripts, sentence sheets, and picture files from the open materials repository (permanent link: https://gorilla.sc/openmaterials/39336). The data files and analytical scripts are accessible in the project repository (https://osf.io/auzjk/). Readers who want to reproduce the data analysis should first download the files in this repository. The guideline file datafile_guide.md (packaged in includes.zip) summarizes the files needed to reproduce the data analysis of this project. Given the large number and volume of voice files, we packaged the recorded voices of the English and Chinese participants in Study3_voice.7z together with the Praat script. This file is available in the project repository (direct access link: https://osf.io/sc59e/).

Appendix B

Selections of Mixed-Effect Models

The mixed-effect models in each study considered the inclusion of trial sequence and the constituents of the random effects. There were four types of models, each including the three fixed effects (language group, object size, and match condition) and the two random effects (participants, target objects): (a) a model that does not include trial sequence; (b) a model that includes trial sequence; (c) a model that includes trial sequence and its correlation with participants; and (d) a model that includes trial sequence and its correlation with items. Table B1 summarizes the statistical information for each model. Model (c) had the smallest AIC in the sentence–picture verification and sentence–picture naming studies; in the picture–picture verification study, model (d) had the smallest AIC.

Table B1. Statistical information of mixed-effect models

Study                          Model   df   AIC
Sentence–picture verification  a       19   307,588.56
                               b       20   305,454.49
                               c       22   305,093.07
                               d       22   305,464.32
Picture–picture verification   a       19   171,204.32
                               b       20   171,169.64
                               c       22   171,145.05
                               d       22   171,119.18
Sentence–picture naming        a       11   146,054.77
                               b       12   146,048.33
                               c       14   145,887.37
                               d       14   146,052.12
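The model-selection logic behind Table B1 can be sketched in a few lines: compute each candidate's AIC and prefer the model with the smallest value. The snippet below is an illustrative sketch, not the authors' analysis code (which was presumably run in dedicated mixed-effect modeling software); the `aic` dictionary simply restates the values reported in Table B1, and `best_model` is a hypothetical helper name.

```python
# Illustrative sketch (not the authors' code): model selection by AIC.
# AIC values are those reported in Table B1; labels (a)-(d) follow Appendix B.
aic = {
    "sentence-picture verification": {"a": 307588.56, "b": 305454.49,
                                      "c": 305093.07, "d": 305464.32},
    "picture-picture verification": {"a": 171204.32, "b": 171169.64,
                                     "c": 171145.05, "d": 171119.18},
    "sentence-picture naming": {"a": 146054.77, "b": 146048.33,
                                "c": 145887.37, "d": 146052.12},
}

def best_model(aic_by_model):
    """Return the label with the smallest AIC plus each model's delta-AIC."""
    best = min(aic_by_model, key=aic_by_model.get)
    deltas = {m: round(v - aic_by_model[best], 2)
              for m, v in aic_by_model.items()}
    return best, deltas

for study, values in aic.items():
    print(study, *best_model(values))
```

Running this reproduces the pattern visible in Table B1: model (c) has the smallest AIC in the sentence–picture verification and naming studies, while model (d) is slightly lower in the picture–picture verification study.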



Instructions to Authors

Experimental Psychology publishes innovative, original, high-quality experimental research. The scope of the journal is defined by experimental methodology, and papers based on experiments from all areas of psychology are therefore welcome. To name just a few fields and domains of research, Experimental Psychology considers manuscripts reporting experimental work on learning, memory, perception, emotion, motivation, action, language, thinking, problem solving, judgment and decision making, social cognition, and neuropsychological aspects of these topics. Apart from the use of experimental methodology, a primary criterion for publication is that research papers make a substantial contribution to theoretical research questions. Experimental Psychology is not the appropriate outlet for research papers with a mainly applied focus.

Experimental Psychology publishes the following types of articles: Research Articles, Short Research Articles, Theoretical Articles, and Registered Reports. Replication studies should be submitted as Registered Reports.

Manuscript Submission. All manuscripts should in the first instance be submitted electronically at http://www.editorialmanager.com/exppsy. Detailed instructions to authors are provided at http://www.hogrefe.com/j/exppsy.

Copyright Agreement. By submitting an article, the author confirms and guarantees on behalf of him-/herself and any coauthors that he or she holds all copyright in and titles to the submitted contribution, including any figures, photographs, line drawings, plans, maps, sketches, tables, raw data, and other electronic supplementary material, and that the article and its contents do not infringe in any way on the rights of third parties. ESM and raw data files will be published online as received from the author(s) without any conversion, testing, or reformatting. They will not be checked for typographical errors or functionality.
The author indemnifies and holds harmless the publisher from any third party claims. The author agrees, upon acceptance of the article for publication, to transfer to the publisher the exclusive right to reproduce and distribute the article and its contents, both


physically and in nonphysical, electronic, and other form, in the journal to which it has been submitted and in other independent publications, with no limits on the number of copies or on the form or the extent of the distribution. These rights are transferred for the duration of copyright as defined by international law. Furthermore, the author transfers to the publisher the following exclusive rights to the article and its contents: 1. The rights to produce advance copies, reprints, or offprints of the article, in full or in part, to undertake or allow translations into other languages, to distribute other forms or modified versions of the article, and to produce and distribute summaries or abstracts. 2. The rights to microfilm and microfiche editions or similar, to the use of the article and its contents in videotext, teletext, and similar systems, to recordings or reproduction using other media, digital or analogue, including electronic, magnetic, and optical media, and in multimedia form, as well as for public broadcasting in radio, television, or other forms of broadcast. 3. The rights to store the article and its content in machinereadable or electronic form on all media (such as computer disks, compact disks, magnetic tape), to store the article and its contents in online databases belonging to the publisher or third parties for viewing or downloading by third parties, and to present or reproduce the article or its contents on visual display screens, monitors, and similar devices, either directly or via data transmission. 4. The rights to reproduce and distribute the article and its contents by all other means, including photomechanical and similar processes (such as photocopying or facsimile), and as part of so-called document delivery services. 5. The right to transfer any or all rights mentioned in this agreement, as well as rights retained by the relevant copyright clearing centers, including royalty rights to third parties. 
Online Rights for Journal Articles

Guidelines on authors’ rights to archive electronic versions of their manuscripts online are given in the document “Guidelines on sharing and use of articles in Hogrefe journals” on the journal’s web page at www.hogrefe.com/j/exppsy. January 2020



A concise guide to the assessment and treatment of insomnia for busy professionals “This book provides an excellent concise introduction to insomnia and its treatment with cognitive behavioral therapy. A great addition to any therapist’s bookshelf!” Philip Gehrman, PhD, CBSM, Associate Professor, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA

William K. Wohlgemuth / Ana Imia Fins

Insomnia
(Series: Advances in Psychotherapy – Evidence-Based Practice – Volume 42)
2019, viii + 94 pp.
US $29.80 / € 24.95
ISBN 978-0-88937-415-7
Also available as eBook

About 40% of the population experiences difficulty falling or staying asleep at some time in a given year, while 10% of people suffer from chronic insomnia. This concise reference, written by leading experts for busy clinicians, provides practical and up-to-date advice on current approaches to the assessment, diagnosis, and treatment of insomnia. Professionals and students learn to correctly identify and diagnose insomnia and gain hands-on information on how to

www.hogrefe.com

carry out treatment with the best evidence base: cognitive behavioral therapy for insomnia (CBT-I). The American Academy of Sleep Medicine (AASM) and the American College of Physicians (ACP) both recognize CBT-I as the first-line treatment approach to insomnia. Appendices include useful resources for the assessment and treatment of insomnia, which readers can copy and use in their clinical practice.


Help medical and other health care students prepare for behavioral science examinations

"Targeted specifically for medical students with the goal of helping them pass the behavioral science section of the USMLE, this book is extraordinarily useful for all those interested in the interaction of behavior and medicine." Bradley R. Cutler, MD, Rush University Medical Center, in Doody's Book Reviews (reviewing the 5th edition)

Danny Wedding, PhD / Margaret L. Stuber, MD (Editors)

Behavior and Medicine
6th edition
2020, xvi + 354 pp.
US $69.00 / € 59.95
ISBN 978-0-88937-560-4

The latest edition of this popular textbook on the behavioral and social sciences in medicine has been fully revised and updated to meet the latest teaching requirements recommended by the National Academy of Medicine (NAM). It is an invaluable resource for behavioral science foundation courses and exam preparation in the fields of medicine and health, including the USMLE Step 1. Its 23 chapters are divided into five core sections: mind–body interactions in health and disease, patient behavior, the physician's role, physician–patient interactions, and social and cultural issues in health care. Under the careful guidance and editing of Danny Wedding, PhD, Distinguished Consulting Faculty Member, Saybrook University, Oakland, CA, and Margaret L. Stuber, MD, Professor of Psychiatry and Biobehavioral Sciences at UCLA, nearly 40 leading educators from major medical

www.hogrefe.com

Out March 2020

faculties have contributed to produce this well-designed textbook. The following unique features of Behavior and Medicine make it one of the most popular textbooks for teaching behavioral sciences:
• Based on the core topics recommended by the NAM
• Numerous case examples, tables, charts, and boxes for quick access to information
• Resources for students and instructors, including USMLE-style review Q & As
• Specific "Tips for the Step" in each chapter guide learning
• The use of works of art, poetry, and aphorisms "humanize" the material
• Comprehensive, trustworthy, and up-to-date
• Competitive price


Psychological Test Adaptation and Development Official Open Access Organ of the European Association of Psychological Assessment (EAPA)

"PTAD will be an important outlet for everyone interested in assessment!" Matthias Ziegler, Editor-in-Chief, Humboldt University Berlin

Psychological Test Adaptation and Development Editor-in-Chief Matthias Ziegler

Official Open Access Organ of the European Association of Psychological Assessment

New OA Journal

About the journal

PTAD is the first open access, peer-reviewed journal publishing papers that present the adaptation of tests to specific needs (e.g., cultural), test translations, or further developments of existing measures. Moreover, the focus is on the empirical testing of the psychometric quality of these measures. The journal provides a paper template, and registered reports are strongly encouraged. It is a unique outlet for research papers portraying adaptations (e.g., translations) and developments (e.g., state to trait) of individual tests – the backbone of assessment. The expert editor-in-chief is supported by a stellar cast of internationally renowned associate editors. A generous APC waiver program is available for eligible authors.

Benefits for authors:
• Clear guidance on the structure of papers helps you write good papers
• Fast peer review, aided by the clear structure of your paper
• With the optional registered report format you can get expert advice from seasoned reviewers to help improve your research
• Open access publication, with a choice of Creative Commons licenses
• Widest possible dissemination of your paper – and thus of qualified information about your test and your research
• Generous APC waiver program and discounts for members of selected associations

The journal welcomes your submissions! All manuscripts should be submitted online via Editorial Manager, where full instructions to authors are also available: https://eu.hogrefe.com/j/ptad

