European Journal of Psychological Assessment
Volume 33 / Number 1 / 2017
Editor-in-Chief Samuel Greiff Associate Editors Nicolas Becker Gary N. Burns Laurence Claes Marjolein Fokkema David Gallardo-Pujol Dragos Iliescu Christoph Kemper Stefan Krumm Lena Lämmle Anastasiya Lipnevich René T. Proyer Ronny Scherer Matthias Ziegler
Official Organ of the European Association of Psychological Assessment
The multiaxial diagnostic system based on psychodynamic principles, now for children and adolescents OPD-CA-2 Task Force / Franz Resch / Georg Romer / Klaus Schmeck / Inge Seiffge-Krenke (Editors)
OPD-CA-2 Operationalized Psychodynamic Diagnosis in Childhood and Adolescence Theoretical Basis and User Manual
2017, xvi + 334 pp., US $69.00 / € 54.95, ISBN 978-0-88937-489-8
Following the success of the Operationalized Psychodynamic Diagnosis for Adults (OPD-2), this multiaxial diagnostic and classification system based on psychodynamic principles has now been adapted for children and adolescents by combining psychodynamic, developmental, and clinical psychiatric perspectives. The OPD-CA-2 is based on four axes that are aligned with the new dimensional approach in the DSM-5: I = interpersonal relations, II = conflict, III = structure, and IV = prerequisites for treatment. After an initial interview, the clinician (or researcher) can evaluate the patient's psychodynamics according to these axes to get a comprehensive psychodynamic view of the patient. Easy-to-use checklists and evaluation forms are provided. The set of tools and procedures the OPD-CA-2 manual provides has been widely used for assessing indications for therapy, treatment planning, and measuring change, as well as providing information for parental work.
www.hogrefe.com
European Journal of Psychological Assessment
Volume 33, No. 1, 2017
Official Organ of the European Association of Psychological Assessment
Editor-in-Chief
Samuel Greiff, Cognitive Science and Assessment, ECCS unit, 11, Porte des Sciences, 4366 Esch-sur-Alzette, Luxembourg (Tel. +352 46 6644-9245, E-mail samuel.greiff@uni.lu)
Editors-in-Chief (past)
Karl Schweizer, Germany (2009–2012), E-mail k.schweizer@psych.uni-frankfurt.de Matthias Ziegler, Germany (2013–2016), E-mail zieglema@hu-berlin.de
Editorial Assistant
Katharina Herborn, Cognitive Science and Assessment, ECCS unit, 11, Porte des Sciences, 4366 Esch-sur-Alzette, Luxembourg, (Tel. +352 46 6644-5578, E-mail katharina.herborn@uni.lu)
Associate Editors
Nicolas Becker, Germany; Gary N. Burns, USA; Laurence Claes, Belgium; Marjolein Fokkema, The Netherlands; David Gallardo-Pujol, Spain; Dragos Iliescu, Romania; Christoph Kemper, Luxembourg; Stefan Krumm, Germany; Lena Lämmle, Germany; Anastasiya Lipnevich, USA; René Proyer, Germany; Ronny Scherer, Norway; Matthias Ziegler, Germany
Consulting Editors
Paul De Boeck, USA; Christine DiStefano, USA; Anastasia Efklides, Greece; Rocío Fernández-Ballesteros, Spain; Brian F. French, USA; David Kaplan, USA; Klaus Kubinger, Austria; Kerry Lee, Singapore; Helfried Moosbrugger, Germany; Janos Nagy, Hungary; Tuulia Ortner, Austria; Willibald Ruch, Germany; Manfred Schmitt, Germany; Stéphane Vautier, France; Fons J. R. van de Vijver, The Netherlands; Alina von Davier, USA; Cilia Witteman, The Netherlands
Founders
Rocío Fernández-Ballesteros and Fernando Silva
Supporting Organizations
The journal is the official organ of the European Association of Psychological Assessment (EAPA). The EAPA was founded to promote the practice and study of psychological assessment in Europe as well as to foster the exchange of information on this discipline around the world. Members of the EAPA receive the journal in the scope of their membership fees. Further, the Division for Psychological Assessment and Evaluation, Division 2, of the International Association of Applied Psychology (IAAP) is sponsoring the journal: Members of this association receive the journal at a special rate (see below).
Publisher
Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 999-500, Fax +49 551 999-50111, E-mail publishing@hogrefe.com, Web http://www.hogrefe.com North America: Hogrefe Publishing, 7 Bulfinch Place, 2nd floor, Boston, MA 02114, USA, Tel. +1 866 823-4726, Fax +1 617 354-6875, E-mail customerservice@hogrefe-publishing.com, Web http://www.hogrefe.com
Production
Regina Pinks-Freybott, Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 999-500, Fax +49 551 999-50111, E-mail production@hogrefe.com
Subscriptions
Hogrefe Publishing, Herbert-Quandt-Strasse 4, D-37081 Göttingen, Germany, Tel. +49 551 50688-900, Fax +49 551 50688-998, E-mail zeitschriftenvertrieb@hogrefe.de
Advertising/Inserts
Melanie Beck, Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 999-500, Fax +49 551 999-50111, E-mail marketing@hogrefe.com
ISSN
ISSN-L 1015-5759, ISSN-Print 1015-5759, ISSN-Online 2151-2426
Copyright Information
© 2017 Hogrefe Publishing. This journal as well as the individual contributions and illustrations contained within it are protected under international copyright law. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without prior written permission from the publisher. All rights, including translation rights, reserved.
Publication
Published in 6 issues per annual volume (new in 2017; 4 issues from 2004 to 2016)
Subscription Prices
Annual subscription, Institutions (2017): €398.00 / US $508.00. Annual subscription, Individuals (2017): €199.00 / US $254.00. Postage and handling: €12.00 / US $16.00. Special rates: IAAP/Colegio Oficial de Psicólogos members: €129.00 / US $164.00 (+ €12.00 / US $16.00 postage and handling); EAPA members: included in membership. Single issues: €66.50 / US $85.00 (+ postage and handling).
Payment
Payment may be made by check, international money order, or credit card, to Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, or, for North American customers, to Hogrefe Publishing, 7 Bulfinch Place, 2nd floor, Boston, MA 02114, USA.
Electronic Full Text
The full text of the European Journal of Psychological Assessment is available online at http://econtent.hogrefe.com and in PsycARTICLES.
Abstracting/Indexing Services
The journal is abstracted/indexed in Current Contents / Social & Behavioral Sciences (CC/S&BS), Social Sciences Citation Index (SSCI), Social SciSearch, PsycINFO, Psychological Abstracts, PSYNDEX, ERIH, and Scopus. Impact Factor (2015): 1.969.
Contents

Editorial
The Field of Psychological Assessment: Where it Stands and Where it's Going – A Personal Analysis of Foci, Gaps, and Implications for EJPA
Samuel Greiff, p. 1

Original Articles
Testing the Incremental Value of a Separate Measure for Secure Attachment Relative to a Measure for Attachment Anxiety and Avoidance: A Study in Middle Childhood and Early Adolescence
Katrijn Brenning, Bart Soenens, and Caroline Braet, p. 5

Mindful Attention Awareness in Spanish Palliative Care Professionals: Psychometric Study With IRT and CFA Models
Laura Galiana, Amparo Oliver, Noemí Sansó, M. Dolores Sancerni, and José M. Tomás, p. 14

Impact of Differential-Item-Functioning on the Personal Epistemological Beliefs for Senior High School Students
Jia-Jia Syu and Liang-Cheng Zhang, p. 22

Anxiety Sensitivity or Interoceptive Sensitivity: An Analysis of Feared Bodily Sensations
Peter J. Norton and Katharine Sears Edwards, p. 30

Inconsistency Index for the Zuckerman-Kuhlman-Aluja Personality Questionnaire (ZKA-PQ)
Anton Aluja, Angel Blanch, Maite Martí-Guiu, and Eduardo Blanco, p. 38

Exploring Occasion Specificity in the Assessment of DSM-5 Maladaptive Personality Traits: A Latent State-Trait Analysis
Johannes Zimmermann, Axel Mayer, Daniel Leising, Tobias Krieger, Martin grosse Holtforth, and Johanna Pretsch, p. 47

Assessing Perceived Ability to Cope With Trauma: A Multigroup Validity Study of a 7-Item Coping Self-Efficacy Scale
Mark W. G. Bosmans, Ivan H. Komproe, Nancy E. van Loey, Leontien M. van der Knaap, Charles C. Benight, and Peter G. van der Velden, p. 55

Incremental Validity of the Trait Emotional Intelligence Questionnaire-Adolescent Short Form (TEIQue-ASF)
Alex B. Siegling, Ashley K. Vesely, Donald H. Saklofske, Norah Frederickson, and K. V. Petrides, p. 65

Multistudy Reports
Editorial

The Field of Psychological Assessment: Where it Stands and Where it's Going – A Personal Analysis of Foci, Gaps, and Implications for EJPA

Samuel Greiff
Cognitive Science and Assessment, University of Luxembourg, Esch, Luxembourg
An Initial Overview

About 5 years ago, I submitted for the very first time a paper that I had coauthored with a colleague to EJPA. I scanned the types of articles that EJPA usually published at that time, and I honestly wasn't sure whether our article's content area and assessment focus would fit the journal's scope. Now, as the new editor of EJPA, I am faced with the very same question on a somewhat different level: Which articles will be interesting to the community and to the journal, and which will not? And even further, where do potential gaps in the field of assessment exist, and how can these gaps be filled? To obtain a broad overview, I ran a crude and thus admittedly not very comprehensive search in the Web of Science by combining the search terms psychological and assessment with various other search terms (acknowledging that different databases might lead to different search results; Bakkalbasi, Bauer, Glover, & Wang, 2006). To allow for a comparison of time-related developments, I ran the search first for the years 2014–2016 and then for the years 2008–2010. The results of this search are displayed in Figure 1. In looking at the numbers, there were two main messages I took from them. First, the field of psychological assessment is on the rise. The number of hits from 2014 to 2016 (1,411 hits) had increased by almost 50% compared with 2008 to 2010 (976 hits), as can be seen by comparing
the left and right sides of Figure 1. Second, the focus on assessment in the fields of clinical, cognitive, and educational psychology is strong (with a slight increase in the number of hits for the last of these), followed by articles on methodological topics in assessment and personality assessment (but note that in the search, none of these categories were mutually exclusive). There were surprisingly few hits when I conducted a search on industrial and organizational assessment. This most likely reflects the particular search terms I used rather than a genuinely small number of publications in this area, but it might also indicate that this area could be more strongly represented in psychological assessment journals. In addition to this issue, the relative frequencies for the different fields remained virtually unchanged between the first and second time periods, as can be seen by comparing the pattern of the unshaded bars on the left for 2014–2016 with the ones on the right for 2008–2010. After running this search and taking a look at what has been published in EJPA in the past, I felt that, overall, the journal comprehensively reflects the diversity of the field. That is, the core mission of the journal is to advance psychological assessment across content disciplines, and this is what it will continue to strive for in the coming years. Stated differently, submissions from all psychological content areas are very welcome as long as their focus is on advancements in the field of assessment. Had I known this when submitting my first paper 5 years ago, I would have
Figure 1. Number of publications in the field of psychological assessment and its subdisciplines. Shaded bars represent absolute frequencies (left y-axis), and unshaded bars represent relative frequencies (right y-axis). The search was conducted on January 8, 2017 in the Web of Science. It combined the search terms "psychological" and "assessment" with "clinic*" for clinical assessment, "education*" for educational assessment, "industrial organizational" for I/O, "personality" for personality, "methodolog*" for methodology, and "cogniti*" for cognitive. The relative frequencies do not sum to 100% because not all of the overall hits were matched with one of the subcategories, and the subcategories were not mutually exclusive.
been a bit more confident about sending it to EJPA, but in my letter to the editor, I might also have stressed more explicitly how the article provided a good match with the journal’s focus on assessment.
Tradition and Innovation

We live in a world of constant change, and technology has significantly altered the way our society functions. The field of psychological assessment is not immune to these influences. Innovative assessment instruments that employ computer-based simulations, that use behavioral and process-related data to improve the assessment process (cf. stealth assessment; Shute & Ventura, 2013), or that widen the extent to which existing instruments are able to capture constructs are only a few examples of what this area might contribute to the field. For instance, in the OECD's educational large-scale assessment program known as the Program for International Student Assessment (PISA), computer-administered science tasks (including simulations and interactive tasks) are employed to allow for a more realistic and diverse set of science literacy tasks (e.g., OECD, 2016). Paper-pencil-based instruments such as classical tests of intelligence and personality questionnaires have been,
currently are, and will remain the backbone of psychological assessment and the science surrounding it. In fact, almost all submissions to EJPA fall into this category (for two recent examples, see Schult, Fischer, & Hell, 2016; Smits, Timmerman, Barelds, & Meijer, 2015). In addition to these kinds of papers, the journal explicitly invites submissions that target innovative assessment approaches, whether they involve technological devices such as computers and tablets or some other sort of innovation. Of course, it is my personal choice to explicitly mention innovative assessment and computer-based testing as areas of major development that are welcomed by the journal. However, this is not meant as a shift in focus but rather as an extension of the existing focus. In fact, I was surprised to see the small extent to which innovative assessment methods and instruments were represented when I conducted my literature search. As an add-on analysis, I combined the two major search terms (psychological and assessment) with one of the following terms: 21st century skills, stealth, computer based, computer assisted, and tablet. The overall hit rate for all of them was less than 1% (and this included even my own first submission to EJPA). With this low level of representation in the back of our minds, it is my vision for the field that we will experience some advancements in the coming years that
will utilize new technologies in such a way that they will complement assessment theory and practice and that this will, in turn, lead to more valid assessment procedures in research and in applied settings.
Implications for Authors and for the Publication Process

Irrespective of content, the mission of EJPA (and, arguably, the field of psychological assessment as an entity) is built on three distinct cornerstones: a firm connection to the field of psychological science, a focus on assessment and the substantial advancement of knowledge in the field, and a commitment to the highest levels of quality and transparency in the empirical aspects of the contributions. Strong ties to psychological science: Saying that any assessment instrument needs to be grounded in psychological theory might be viewed by some people as a statement of the obvious, but it is surprising how often this simple and yet fundamental prerequisite is not fulfilled. Too often, an assessment instrument's name or label is mistaken for its actual content, while it remains unclear how the items are actually mapped onto the underlying theoretical definition. There will be many cases in which not all of the relevant aspects of a theory can be adequately represented by a set of tasks (and the more complex the target construct is, the more prevalent this issue will likely be). In these cases, a clear distinction between the theoretical framework and the assessment framework might help to clarify where the assessment instrument contains blind spots – even if they are deliberate – with regard to construct coverage (cf. Michie, Johnston, Francis, Hardeman, & Eccles, 2008, for an interesting analysis of how to connect theory with intervention). Without giving precise meaning to an assessment instrument through psychological theory, developing and validating such an instrument remains essentially an empty exercise. Focus on assessment and advancement of knowledge: The number of submissions to EJPA has consistently been on the rise, mirroring the increase in the number of articles published in the field of psychological assessment (Figure 1). This increase in submissions, in turn, means that selection criteria need to be applied, and thus the amount of new knowledge an article generates from an assessment perspective is an important criterion. That is, submissions that mainly target substantive research questions with only a secondary focus on assessment will have a better fit with content-focused journals, of which there are many excellent ones out there. Obviously, an evaluation of the amount of new knowledge created by an article is to some extent a subjective one, but authors need only to peruse the journal if they wish to find a plethora of good examples of articles that generate considerable amounts of new knowledge. However, articles that target specific research questions might be relevant to EJPA as well. Brief reports were introduced a couple of years ago as a consequence of the need for a dedicated format for specific research questions. In this, EJPA offers a variety of distinct formats
(i.e., original articles, brief reports, and multistudy reports). As communicated in previous editorials, there are some topics that are usually of little interest to EJPA, such as papers that are primarily methodological in nature or papers that offer only translations of existing instruments (Ziegler & Bensch, 2013). Quality and transparency: EJPA is dedicated to upholding scientific standards of the highest possible quality and to adhering to the process of rigorous peer review. We ensure these aspects by attracting highly committed associate editors, an experienced board of consulting editors, and hand-picked external reviewers. At the same time, we acknowledge that transparency is becoming increasingly important – perhaps particularly in the field of psychological science – and that sometimes the final manuscript as the "end product" is not sufficient for communicating a complete understanding of what was done. In psychological assessment, it is probably the actual empirical analyses and how they were implemented that are key to the findings. With this in mind, authors will now have the option to submit their code and results (i.e., the inputs and outputs from statistical software packages such as Mplus, R, SAS, SPSS, and so forth) along with their manuscripts. This information will then be passed on to the external reviewers for further inspection. As an additional measure, when an article is accepted for publication in EJPA, authors will henceforth be required to submit both their inputs and outputs along with a brief description as Electronic Supplementary Material (ESM), and this information will be published along with the final article. With this first editorial, it was my goal to cover some general developments in the field of psychological assessment and aspects that are fundamental for publishing in EJPA. There are probably few surprises with regard to what is deemed important, but sometimes it can be helpful to explicate the implicit. In fact, thinking back to my first submission, the article underwent several rounds of revisions before it was finally accepted. Had I been aware of all the points mentioned in this editorial, it might have spared me (and the reviewers, for that matter) at least one round of revision. It is with this hope and spirit that this editorial was written: to provide some general information that is broad and yet relevant and that might help to guide authors' decisions about whether EJPA is a good outlet for their research, or, stated differently, to help them determine how they can maximize their chance of success in EJPA. As in the past, future editorials will continue to address diverse topics such as assessment-relevant methodological aspects, journal-related policy information, or thoughts and opinions with respect to the field of psychological assessment.
References

Bakkalbasi, N., Bauer, K., Glover, J., & Wang, L. (2006). Three options for citation tracking: Google Scholar, Scopus, and Web of Science. Biomedical Digital Libraries, 3, 7. doi: 10.1186/1742-5581-3-7
Michie, S., Johnston, M., Francis, J., Hardeman, W., & Eccles, M. (2008). From theory to intervention: Mapping theoretically derived behavioural determinants to behaviour change techniques. Applied Psychology, 57, 660–680. doi: 10.1111/j.1464-0597.2008.00341.x
OECD. (2016). PISA 2015 – Assessment and analytical framework: Science, reading, mathematic and financial literacy. Paris, France: OECD Publishing. doi: 10.1787/9789264255425-en
Schult, J., Fischer, F. T., & Hell, B. (2016). Tests of scholastic aptitude cover reasoning facets sufficiently. European Journal of Psychological Assessment, 32, 215–219. doi: 10.1027/1015-5759/a000247
Shute, V. J., & Ventura, M. (2013). Measuring and supporting learning in games: Stealth assessment. Cambridge, MA: The MIT Press.
Smits, I. A. M., Timmerman, M. E., Barelds, D. P. H., & Meijer, R. R. (2015). The Dutch Symptom Checklist-90-Revised: Is the use of the subscales justified? European Journal of Psychological Assessment, 31, 263–271. doi: 10.1027/1015-5759/a000233
Ziegler, M., & Bensch, D. (2013). Lost in translation: Thoughts regarding the translation of existing psychological measures into other languages. European Journal of Psychological Assessment, 29, 81–83. doi: 10.1027/1015-5759/a000167

Samuel Greiff
Cognitive Science & Assessment
University of Luxembourg
11, Porte des Sciences
4366 Esch-sur-Alzette
Luxembourg
E-mail samuel.greiff@uni.lu
Original Article
Testing the Incremental Value of a Separate Measure for Secure Attachment Relative to a Measure for Attachment Anxiety and Avoidance: A Study in Middle Childhood and Early Adolescence

Katrijn Brenning, Bart Soenens, and Caroline Braet
Ghent University, Belgium

Abstract. Research on attachment in middle childhood and early adolescence has typically relied on either unidimensional measures of attachment security (vs. insecurity) or on differentiated measures of attachment anxiety and avoidance. This study addressed the question of whether there is a need to add an explicit measure of security when operationalizing parent-child attachment in terms of anxiety and avoidance. Both dimensional (i.e., regression analyses) and person-centered analyses (i.e., cluster analysis) were used in this study (N = 276, 53% boys, mean age = 10.66 years) to examine the incremental value of a scale for attachment security (in this study, the Security Scale) in addition to a scale for attachment anxiety and avoidance (in this study, the Experiences in Close Relationships Scale-Revised – Child version; ECR-RC). The present results suggest that an assessment of anxious and avoidant attachment (using the ECR-RC) may suffice to capture the quality of parent-child attachment in middle childhood and early adolescence.

Keywords: ECR-RC, attachment security, cluster analysis
For quite a long time, middle childhood and early adolescence have been relatively neglected developmental periods in attachment research (Dwyer, 2005; Kerns, Tomich, Aspelmeier, & Contreras, 2000). Moreover, attachment research in this age period has typically relied on broad assessments of attachment security versus insecurity using instruments such as the Security Scale (SS; Kerns, Klepac, & Cole, 1996) and the Inventory of Parent and Peer Attachment (IPPA; Armsden & Greenberg, 1987). This reliance on unidimensional measures of attachment security is in contrast with research on attachment in early childhood and adulthood, two life periods in which it is more common to distinguish between two fundamental and qualitatively different dimensions of attachment, that is, attachment anxiety and avoidance. A number of efforts have been made to introduce the distinction between attachment anxiety and avoidance into research on middle childhood and early adolescence (Dwyer, 2005; Yunger, Corby, & Perry, 2005), using, for instance, the Experiences in Close Relationships Scale-Revised Child version (ECR-RC; Brenning, Soenens, Braet, & Bosmans, 2011a). The present research
aimed to investigate whether this two-dimensional approach can largely replace the unidimensional secure-insecure approach to the assessment of attachment, or whether we need to combine both approaches. More specifically, the present research aimed to investigate whether the dimensions of anxious and avoidant attachment, as assessed with the ECR-RC, suffice to capture the quality of parent-child attachment in middle childhood and early adolescence or whether an explicit measure of secure attachment needs to be added for the purpose of incremental validity.
The Anxiety-Avoidance Model of Attachment

Particularly in research on adult attachment, there is increasing consensus that, in order to model the quality of attachment, it is necessary to assess both anxiety and avoidance (see Mikulincer & Shaver, 2007, 2012). Attachment anxiety involves
a preoccupation with social support, jealousy, fear, and vigilance concerning abandonment and rejection. Attachment avoidance involves avoidance of intimacy, discomfort with closeness, and self-reliance. Attachment anxiety and attachment avoidance are considered the underlying dimensions of earlier categorical approaches to adult attachment (Bartholomew & Horowitz, 1991; Hazan & Shaver, 1987). Specifically, by crossing these two dimensions, four categorical attachment orientations can be distinguished: secure attachment (low on both dimensions), anxious or preoccupied attachment (high on anxiety and low on avoidance), dismissive-avoidant attachment (low on anxiety and high on avoidance), and fearful-avoidant attachment (high on both dimensions). The usefulness of the anxiety-avoidance distinction has received abundant support in research on adult attachment (for a review, see Mikulincer & Shaver, 2012). Much of this research has relied on the ECR-R (Fraley, Waller, & Brennan, 2000), which is one of the best validated self-report measures for attachment anxiety and avoidance (Sibley, Fischer, & Liu, 2005; Tsagarakis, Kafetsios, & Stalikas, 2007). Both anxious and avoidant attachment were found to predict unique variance in well-being and psychopathology and were found to be associated with these outcomes via different mechanisms of emotion regulation (Shaver & Mikulincer, 2002). In particular, anxious attachment has been found to be primarily linked to hyperactivation of emotions, whereas avoidant attachment has been found to be primarily linked to emotional deactivation (Mikulincer & Shaver, 2012).
Applying the Anxiety-Avoidance Model to Attachment in Middle Childhood and Early Adolescence

In line with research on adults, the distinction between anxious and avoidant attachment has also been introduced in middle childhood and early adolescence (e.g., Brenning et al., 2011a; Finnegan, Hodges, & Perry, 1996). Although relatively less common than in research with adults, recent research with children in middle childhood and with early adolescents showed that both attachment anxiety and avoidance are uniquely related to well-being and psychopathology (e.g., Brenning, Soenens, Braet, & Bal, 2012). Further, in line with the model by Shaver and Mikulincer (2002), findings showed that children and adolescents scoring higher on attachment anxiety showed more dysregulation of negative emotions (i.e., an emotion regulation strategy conceptually similar to hyperactivation), whereas youngsters scoring higher on attachment avoidance used relatively more suppressive strategies to deal with their emotions (i.e., an emotion regulation strategy conceptually similar to deactivation) (Brenning, Soenens, Braet, & Bosmans, 2012). To tap into children's and adolescents' anxious and avoidant attachment representations, a number of attachment instruments have been developed (for an overview, see Dwyer, 2005), including the Preoccupied and Avoidant Coping Questionnaire (PACQ; Finnegan et al., 1996), and
more recently, the ECR-RC (Brenning et al., 2011a). The PACQ has continuous scales for children's preoccupied attachment (which is similar to anxious attachment) and for avoidant attachment. Interestingly, the PACQ also contains a third, separate, and explicit measure of secure attachment. The ECR-RC is a child version of the ECR-R that consists of two continuous scales to measure children's and adolescents' anxious and avoidant attachment. Remarkably, in contrast to the PACQ, the ECR-RC contains no separate scale to measure secure attachment. It is assumed that secure attachment is represented by low scores on both attachment anxiety and avoidance. An important question, which will be addressed in the current study, is whether this assumption is valid or whether, in contrast, an explicit measure of secure attachment needs to be added for a complete assessment of the quality of attachment in this age period. Notably, the question of whether the distinction between anxiety and avoidance sufficiently captures the domain of attachment is not unique to middle childhood and early adolescence. For instance, in discussing limitations of the ECR-R for adults, Fraley et al. (2000) raised concerns that the ECR-R may not sufficiently capture the secure region of the two-dimensional space (p. 364). The question of whether a separate measure of secure attachment is needed is all the more important in middle childhood and early adolescence because, as discussed before, research on the differentiation between attachment anxiety and avoidance is less well established in this developmental period.
The Present Study

The current study aimed to investigate whether the addition of a scale for secure attachment (i.e., the SS) would have added value when measuring attachment anxiety and avoidance (as assessed with the ECR-RC). We chose the ECR-RC because it is a recently developed but very promising self-report questionnaire, analogous to the "gold standard" that taps into anxious and avoidant attachment representations in adulthood. Both ECR-RC attachment scales were administered with regard to the mother-child relationship because this relationship still represents a key resource for psychosocial adjustment during the life period of middle childhood and early adolescence (Allen, 2008). First, using regression analyses, we aimed to investigate whether attachment security would explain additional variance in conceptually important and frequently studied outcomes of attachment. In examining this question, we focused not only on maladaptive outcomes (i.e., depressive symptoms, emotional dysregulation, and inhibition) but also on more adaptive outcomes, because one might argue that measures of anxiety and avoidance merely capture the dark side of human functioning and fail to capture the bright side. On the basis of this argument, one might predict that a measure of attachment security might be of particular added value when studying adaptive developmental outcomes. Based on previous research (e.g., Mikulincer & Shaver, 2012; Sroufe, Egeland, Carlson, & Collins,
2005), we included self-esteem and adaptive emotion regulation as two additional outcomes in this study. In a second step, cluster analysis was used to identify attachment profiles on the basis of the ECR-RC. We expected that this cluster analysis would yield each of the four attachment types proposed by Bartholomew and Horowitz (1991). Next, we examined whether the addition of a measure of attachment security to the cluster analysis would result in a different cluster solution or even in an extended cluster solution (e.g., one in which an additional profile of attachment emerged). If the ECR-RC sufficiently captures the quality of parent-child attachment in middle childhood and early adolescence, the addition of a measure of secure attachment should not give rise to additional clusters. Also, between-cluster differences in psychosocial outcomes (i.e., depressive symptoms, self-esteem, and emotion regulation strategies) should be virtually identical irrespective of whether security of attachment is added as a clustering variable.
Materials and Method

Participants and Procedure
The data for this study were used and reported in Brenning et al. (2011a). As such, the present research involves a reanalysis of existing data. The sample consisted of 276 participants (52.9% boys) with a mean age of 10.65 years (SD = 0.92; range = 8–13 years) from nine elementary schools. All participants voluntarily and anonymously completed a battery of questionnaires during class periods and in the presence of a research assistant. All families had a middle-class background. In terms of family structure, 80.3% of the participants came from intact families, whereas the remaining participants were from divorced families (19.3%) or families where the father was deceased (0.4%).
Statistical Analyses
Our main goal was to examine the incremental value of the SS in addition to the ECR-RC using both dimensional (i.e., regression analysis) and person-centered analyses (i.e., cluster analysis). With regard to the dimensional analysis, two separate series of hierarchical linear regression analyses (using the enter method) were used. Specifically, we examined whether the attachment security score (Block 2) would contribute to the prediction of several outcome variables beyond the variance explained by the scores for attachment anxiety and avoidance (Block 1). Vice versa, we also examined whether the scores for attachment anxiety and avoidance (Block 2) would contribute to the variance explained in several outcome variables beyond the attachment security score (Block 1). When interpreting the results of these regression analyses, the statistical significance of the standardized regression coefficients and the effect sizes were inspected. According to Cohen (1988), 0.2 is a small effect size, 0.5 is a medium or moderate effect, and 0.8 is a large effect.
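For readers who want to mirror this two-block procedure outside SPSS, the following minimal sketch computes the R²-change F-test in Python with statsmodels. It runs on simulated stand-in data, not the study's data file, and all variable names are illustrative placeholders, not the authors' syntax.

```python
# Minimal sketch of a two-block hierarchical regression with an R^2-change
# F-test, on simulated stand-in data (all names are placeholders).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(0)
n = 276  # matches the study's sample size, but the values are simulated
anxiety = rng.normal(size=n)
avoidance = 0.5 * anxiety + rng.normal(size=n)
security = -0.6 * (anxiety + avoidance) + rng.normal(size=n)
depression = 0.4 * anxiety + 0.2 * avoidance + rng.normal(size=n)
df = pd.DataFrame({"anxiety": anxiety, "avoidance": avoidance,
                   "security": security, "depression": depression})

def r2_change_test(data, outcome, block1, block2):
    """Fit outcome ~ block1, then outcome ~ block1 + block2, and test
    the R^2 increment with the usual F-change statistic."""
    m1 = smf.ols(f"{outcome} ~ {' + '.join(block1)}", data=data).fit()
    m2 = smf.ols(f"{outcome} ~ {' + '.join(block1 + block2)}", data=data).fit()
    num_df = len(block2)          # number of predictors added in Block 2
    den_df = m2.df_resid          # residual df of the full model
    f_change = ((m2.rsquared - m1.rsquared) / num_df) / (
        (1 - m2.rsquared) / den_df)
    p_value = stats.f.sf(f_change, num_df, den_df)
    return m1.rsquared, m2.rsquared - m1.rsquared, f_change, p_value

# Does security add anything beyond anxiety and avoidance, and vice versa?
for b1, b2 in ([["anxiety", "avoidance"], ["security"]],
               [["security"], ["anxiety", "avoidance"]]):
    r2, r2_chg, f_chg, p = r2_change_test(df, "depression", b1, b2)
    print(f"Block 1 {b1}: R2 = {r2:.2f}; Block 2 {b2}: "
          f"dR2 = {r2_chg:.2f}, F change = {f_chg:.2f}, p = {p:.3f}")
```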
Next, cluster analysis was used to identify attachment profiles on the basis of the dimensions of anxious and avoidant attachment (as assessed with the ECR-RC), both alone and in combination with an explicit measure of secure attachment (as assessed with the SS). Based on the recommendations by Gore (2000), cluster analysis was performed in two steps. In a first step, the standardized scale scores for attachment were entered in a hierarchical cluster analysis. Second, nonhierarchical k-means cluster analysis was performed to optimize the hierarchical solution, using the initial seed points of the best cluster solution as derived from the hierarchical cluster solution in Step 1. Two separate cluster analyses were conducted: (a) one in which only the ECR-RC scores for anxiety and avoidance were used as clustering variables and (b) one in which the SS security score was added as a clustering variable. In a final analysis, we examined between-cluster differences in the outcome variables using both solutions. A series of ANOVAs was conducted with cluster membership as the independent variable and the outcomes (i.e., self-esteem, depressive symptoms, and emotion regulation strategies) as dependent variables. For all analyses in this study, SPSS 22 was used.
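As an illustration of the two-step procedure just described (Ward's hierarchical clustering, then k-means seeded with the Ward centroids), here is a hedged Python sketch using scipy and scikit-learn rather than SPSS. The data are simulated stand-ins for the standardized attachment scores; the final check mirrors the at-least-50%-variance-explained criterion applied in the Results.

```python
# Sketch of the Gore (2000) two-step clustering: Ward's method first,
# then k-means initialized with the Ward cluster centroids.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
X = rng.normal(size=(276, 2))   # stand-in for z-scored anxiety/avoidance

# Step 1: Ward's hierarchical clustering, cut at k = 5 clusters.
k = 5
ward_labels = fcluster(linkage(X, method="ward"), t=k, criterion="maxclust")

# The centroids of the Ward solution serve as initial seeds for k-means.
seeds = np.vstack([X[ward_labels == c].mean(axis=0) for c in range(1, k + 1)])

# Step 2: nonhierarchical k-means starting from the Ward centroids.
km = KMeans(n_clusters=k, init=seeds, n_init=1, random_state=1).fit(X)

# Variance-explained check: R^2 per clustering variable is
# 1 - SS_within / SS_total for that variable.
ss_total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
ss_within = sum(
    ((X[km.labels_ == c] - X[km.labels_ == c].mean(axis=0)) ** 2).sum(axis=0)
    for c in range(k))
print("R2 per clustering variable:", 1 - ss_within / ss_total)
```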
Measures

Experiences in Close Relationships Scale-Revised Child Version
The ECR-RC (Brenning et al., 2011a) was used to assess attachment anxiety (e.g., "I worry about being abandoned") and attachment avoidance (e.g., "I prefer not to show how I feel deep down"). Items are rated on a 7-point scale ranging from "not at all" to "very much." The reliability and validity of the ECR-RC have been evidenced in several studies. In terms of reliability, the ECR-RC showed high levels of internal consistency (e.g., Brenning & Braet, 2013; Brenning, Soenens, Braet, & Bosmans, 2011b). In terms of construct validity, research by Brenning et al. (2011a) showed expected associations between the ECR-RC and other attachment measures (e.g., the Relationship Questionnaire). Further, the predictive validity of the ECR-RC was supported by theoretically plausible associations with assessments of both depressive symptoms (Brenning, Soenens, Braet, & Bal, 2012) and strategies of emotion regulation (Brenning, Soenens, Braet, & Bosmans, 2012). In the present study, Cronbach's alphas of the ECR-RC were .83 and .85 for anxious and avoidant attachment, respectively.

Security Scale
The SS (Kerns et al., 1996) is a 15-item measure that taps into children's felt security. More specifically, items assess the degree to which children believe an attachment figure is available and responsive, the child's use of the attachment figure as a safe haven, and the child's report of open communication with the attachment figure (Dwyer, 2005). For each item, participants are asked to choose between
one of two response options (e.g., "Some kids are not sure if they can trust their mom BUT other kids find it easy to trust their mom") and, next, to indicate their level of agreement with that option (i.e., "sort of true for me" or "really true for me"). As such, each item can be scored from 1 to 4, with a higher score indicating greater perceptions of security. The SS has good psychometric properties, including high internal consistency, test-retest reliability, and good construct validity (Kerns et al., 1996; Van Ryzin & Leve, 2012). In this study, Cronbach's alpha for the SS was .76.

Child Depression Inventory
The CDI (Kovacs, 1985) is an adaptation of the Beck Depression Inventory for use with children 7–17 years of age. The scale has 27 items dealing with sadness, self-blame, loss of appetite, insomnia, interpersonal relationships, and school adjustment. For each item, respondents choose one of three responses that best describes them (e.g., "I am sad sometimes," "I am often sad," or "I am always sad"). Acceptable levels of internal consistency, test-retest reliability, and validity have been established (e.g., Saylor, Finch, Spirito, & Bennett, 1984). Cronbach's alpha in the current study was .88.

Self-Perception Profile for Children
Participants completed the Global Self-Worth subscale of the Self-Perception Profile for Children (Harter, 1985). This subscale consists of five items (e.g., "I am pretty happy with myself"), which can be scored on a scale ranging from 1 (= not at all) to 5 (= very much). The questionnaire has good internal consistency, convergent validity, and factorial validity (see Van den Bergh & Marcoen, 1999). In the current sample, Cronbach's alpha was .81.
Children's Sadness Management Scale
Based on the CSMS (Zeman, Shipman, & Penza-Clyve, 2001), 14 items were administered that tap into three dimensions of sadness management: (a) inhibition (5 items), which refers to the deactivation or suppression of sadness expression (e.g., "I get sad inside but don't show it"), (b) dysregulated expression (4 items), defined as expressing sadness in nonconstructive, hyperactivating ways (e.g., "I whine/fuss about what's making me sad"), and (c) adaptive emotion regulation coping (5 items), which refers to intentional, proactive, and targeted strategies for coping with sadness (e.g., "I try to calmly deal with what is making me feel sad"). Children responded to the items on a 3-point scale (1 = hardly ever, 2 = sometimes, 3 = often). Research has shown moderate internal consistency for the three subscales, and construct validity has been established in relation to self- and other-report measures of sadness regulation and children's psychological and social functioning (Zeman et al., 2001). Cronbach's alphas in the present study were as follows: .72 for inhibition, .58 for dysregulation, and .61 for adaptive emotion regulation. These values are similar to those obtained in previous research (Zeman et al., 2001).
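The subscale reliabilities reported throughout this Measures section are Cronbach's alphas, which can be computed directly from a respondents-by-items matrix. A minimal sketch, with simulated item responses standing in for the questionnaire data:

```python
# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the
# total score). Rows are respondents, columns are items.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(2)
latent = rng.normal(size=(276, 1))                 # shared "true" score
responses = latent + rng.normal(size=(276, 5))     # 5 correlated items
print(f"alpha = {cronbach_alpha(responses):.2f}")
```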
Results

Preliminary Analyses
Preliminary analyses were conducted to examine differences in the study variables in terms of children's age, gender, and family structure. A multivariate analysis of variance (MANOVA) was conducted with gender and family structure as fixed factors and age as a covariate for each of the study variables. No significant multivariate effects were obtained for gender, Wilks' Lambda = .95, F(8, 205) = 1.32, p > .05, η² = .05, family structure, Wilks' Lambda = .94, F(16, 410) = 0.85, p > .05, η² = .03, or age, Wilks' Lambda = .96, F(8, 205) = 1.19, p > .05, η² = .05. Accordingly, these variables were not considered further in the main analyses. Correlations between the study variables are shown in Table 1.
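A hedged sketch of this kind of preliminary MANOVA in Python (statsmodels reports Wilks' Lambda among its multivariate tests); the data frame below is simulated stand-in data, not the study's sample:

```python
# Sketch of a preliminary MANOVA with gender and family structure as
# factors and age as a covariate, on simulated stand-in data.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(3)
n = 276
df = pd.DataFrame({
    "gender": rng.choice(["boy", "girl"], size=n),
    "family": rng.choice(["intact", "divorced", "deceased"], size=n),
    "age": rng.uniform(8, 13, size=n),
    "anxiety": rng.normal(size=n),
    "avoidance": rng.normal(size=n),
    "depression": rng.normal(size=n),
})
mv = MANOVA.from_formula(
    "anxiety + avoidance + depression ~ gender + family + age", data=df)
print(mv.mv_test())  # reports Wilks' Lambda, F, and p for each effect
```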
Regression Analyses
A first set of regression analyses included the ECR-RC attachment dimensions in Block 1 and self-esteem, depressive symptoms, and strategies of emotion regulation as dependent variables (see Table 2).
Table 1. Means, standard deviations, and correlations among all study variables

Variable            1        2        3        4        5       6      7      8
1. ECR-RC anx       –
2. ECR-RC avoid    .53***    –
3. SS-security     .65***   .68***    –
4. Self-esteem     .37***   .40***   .37***    –
5. Depression      .49***   .36***   .44***   .57***    –
6. Dysregulation   .24***   .12      .12      .32***   .47***   –
7. Inhibition      .21**    .34***   .26***   .23***   .30***  .04     –
8. Adaptive ER     .11      .22***   .14*     .11      .15*    .10    .03     –
M                  2.26     2.59    26.35     3.90    36.74   1.80   2.04   2.22
SD                 0.78     0.89     6.29     0.80     6.52   0.47   0.50   0.44

Notes. *p < .05. **p < .01. ***p < .001.
The two attachment dimensions explained a significant portion of variance in the prediction of each of the outcomes (R² ranged from .04 to .25). Specifically, the results showed that both anxious and avoidant attachment had independent negative associations with self-esteem and positive associations with depressive symptoms. As for the emotion regulation strategies, attachment anxiety was related uniquely to dysregulation, whereas attachment avoidance was related uniquely to inhibition. Note that attachment anxiety was significantly related to inhibition when using a correlational analysis (r = .21, p < .01), while this association was not significant when using a regression analysis with both attachment anxiety and attachment avoidance as independent variables. This result is in line with theory (Shaver & Mikulincer, 2002) and previous research findings (see, e.g., Brenning & Braet, 2013; Brenning, Soenens, Braet, & Bosmans, 2012) suggesting that attachment anxiety does not have a unique association with emotional inhibition. With regard to the link with adaptive emotion regulation strategies, attachment avoidance showed a unique and negative association, whereas the association with attachment anxiety was nonsignificant. When added in a second step, the score for secure attachment did not account for significant incremental variability in self-esteem or any of the emotion regulation variables (R²change < .01). Secure attachment did explain a statistically significant but small (R²change = .02) amount of variance in depression scores above and beyond the variability explained by ECR-RC anxiety and avoidance. To further examine the relative contribution of scores for attachment anxiety/avoidance and security, we also performed a series of regression analyses in which the order of entry into the equations was reversed. As can be seen in Table 2, SS attachment security (Block 1) was associated with each of the outcome variables (R² ranged from .02 to .20), except for dysregulation (R² = .01). When entered in a second block, the scores for anxious and avoidant attachment did account for significant incremental variability in the outcome variables (R²change ranged from .02 to .07). More specifically, attachment anxiety and avoidance both contributed independently to the prediction of self-esteem. For each of the other outcome variables, it was either anxiety (for depression and dysregulation) or avoidance (for inhibition and adaptive emotion regulation) that contributed. As such, ECR-RC anxiety and avoidance accounted for additional variance once the SS security scale had been allowed to account for as much variance as possible. Further (not shown in Table 2), the initially significant
associations between secure attachment and all of the outcome variables dropped to nonsignificance (except for the negative association with depressive symptoms, which remained significant). Overall, these results suggest that attachment anxiety and avoidance have the most robust and unique associations with the developmental outcomes.¹
Cluster Analyses

Cluster Analysis on the ECR-RC Scores Alone
To determine the number of clusters to be retained for the cluster analysis, we inspected the scree plot of cluster properties (Mooi & Sarstedt, 2011). In the current study, the scree plot showed a clear elbow after the first five clusters. Based on this scree plot, we estimated a cluster solution with five clusters and inspected the percentage of variance explained. One commonly used criterion is that the cluster solution should explain at least 50% of the variance in each of the defining variables (Milligan & Cooper, 1985). This was indeed the case for our five-cluster solution (R² = .85 for attachment anxiety and .67 for attachment avoidance). To optimize the five-cluster solution and to address its interpretability, in a second step, the five cluster centers derived from Ward's method were used as the initial cluster centers for a nonhierarchical k-means clustering procedure. We examined the interpretability of the cluster solution by inspecting the z-scores of the clustering variables within each of the clusters. These z-scores, which represent the distances between the cluster means and the total sample standardized mean, in standard deviation units, can be interpreted as effect sizes (0.2 is a small effect, 0.5 is a moderate effect, 0.8 is a large effect). As presented in Figure 1, the results show a distinction between profiles of attachment that can largely be understood using the model by Bartholomew and Horowitz (1991). Participants in a first "secure attachment" cluster scored low on both anxiety and avoidance (z = −0.77 and z = −1.14, respectively). Participants in a second "preoccupied attachment" cluster scored high on anxiety and average on avoidance (z = 1.25 and z = 0.02, respectively). Participants in a third "dismissive-avoidant attachment" cluster scored average on anxiety and high on avoidance (z = 0.03 and z = 0.98, respectively). Participants in a fourth "fearful-avoidant attachment" cluster scored high on both attachment anxiety and avoidance (z = 1.38 and z = 1.56, respectively). Finally, an undifferentiated cluster emerged with scores that were average on both dimensions (z = 0.42 and z = 0.16, respectively).
¹ As recommended by a reviewer, all abovementioned analyses were also performed using latent variables. The results of these analyses provided further evidence that secure attachment adds little to the measurement of attachment anxiety and avoidance, as none of the effects of secure attachment on the outcome variables were significant after taking into account the effects of attachment anxiety and avoidance. However, the effects of attachment anxiety and avoidance on the outcome variables also became nonsignificant when using latent variables rather than manifest variables. Further inspection revealed that this was due to the very high correlations between the latent scores for attachment anxiety and avoidance on the one hand and the latent score for secure attachment on the other hand. These very high correlations caused problems of multicollinearity, resulting in unstable parameter estimates in the structural models. In our view, the very high correlations between secure attachment and the two attachment scores derived from the ECR-RC (anxiety and avoidance) further underscore the conclusion of this paper. That is, low scores on anxiety and avoidance are largely redundant with high scores on secure attachment. As such, a separate score for secure attachment does not add much to the scores for anxiety and avoidance.
Figure 1. z-scores for Anxious (Anx) and Avoidant attachment (Av). Five-cluster solution: Secure attachment cluster (Sec.); Preoccupied attachment cluster (Preocc.); Dismissive-avoidant attachment cluster (Dismiss.); Fearful-avoidant attachment cluster (Fearful); Undifferentiated attachment cluster (Undiff.).

Table 2. Results of regression analyses: The contribution of attachment security beyond attachment anxiety and avoidance to the prediction of the outcomes

                                   Self-esteem        Depression         Dysregulation      Inhibition         Adaptive emotion regulation
Block 1: ECR-RC anxiety            .23** (η² = .04)   .42*** (η² = .14)  .25** (η² = .04)   .08 (η² < .01)     .03 (η² = .00)
         ECR-RC avoidance          .25*** (η² = .07)  .13* (η² = .02)    .05 (η² = .00)     .27*** (η² = .08)  .22** (η² = .04)
Block 2: SS-security               .10 (η² = .01)     .19* (η² = .02)    .07 (η² = .00)     .08 (η² = .00)     .02 (η² = .00)
Block 1: R² ECR-RC scales          .19***             .25***             .05**              .10***             .04**
Block 2: R²change SS scale         .00                .02*               .00                .00                .00
Block 1: F(2, 233) ECR-RC scales   26.37***           39.02***           6.19**             13.46***           5.16**
Block 2: F(3, 232) SS scale        18.00***           28.00***           4.28**             9.21***            3.44*
Fchange (1, 232)                   1.21               4.71*              0.48               0.72               0.05

Block 1: SS-security               .37*** (η² = .14)  .44*** (η² = .19)  .10 (η² = .01)     .27*** (η² = .07)  .14* (η² = .02)
Block 2: ECR-RC anxiety            .19* (η² = .02)    .34*** (η² = .08)  .28** (η² = .04)   .05 (η² = .00)     .04 (η² = .00)
         ECR-RC avoidance          .21* (η² = .04)    .04 (η² = .00)     .02 (η² = .00)     .23** (η² = .04)   .21* (η² = .03)
Block 1: R² SS scale               .14***             .20***             .01                .08***             .02*
Block 2: R²change ECR-RC scales    .05**              .07***             .04**              .03*               .02*
Block 1: F(1, 235) SS scale        36.95***           57.67***           2.37               18.96***           4.88*
Block 2: F(3, 232) ECR-RC scales   18.00***           28.00***           4.28**             9.21***            3.44*
Fchange (2, 232)                   7.40**             10.86***           5.20**             4.12*              2.69

Note. Coefficients shown are standardized regression coefficients (effect sizes). *p < .05. **p < .01. ***p < .001.
Figure 2. z-scores for Anxious (Anx), Avoidant attachment (Av), and Secure attachment. Five-cluster solution: Secure attachment cluster (Sec.); Preoccupied attachment cluster (Preocc.); Dismissive-avoidant attachment cluster (Dismiss.); Fearful-avoidant attachment cluster (Fearful); Undifferentiated attachment cluster (Undiff.).

Cluster Analysis on the ECR-RC and SS
Analogous to the ECR-RC cluster analyses, five clusters could be retained after inspecting the scree plot of cluster properties. Again, this five-cluster solution explained at least 50% of the variance in each of the defining variables (R² = .73 for attachment anxiety, .58 for attachment avoidance, and .78 for attachment security).
Table 3. Univariate ANOVAs and post hoc cluster comparisons

Variable (measure)    Secure          Preoccupied     Dismissive-avoidant  Fearful-avoidant  Undifferentiated  F-value                R²
Self-esteem (1)       4.28 (0.65)c    3.63 (0.64)ab   3.56 (0.89)a         3.43 (0.72)a      4.05 (0.75)bc     F(4, 254) = 11.16***   .15
Self-esteem (2)       4.16 (0.74)c    3.54 (0.66)a    3.60 (0.86)ab        3.30 (0.76)a      4.04 (0.69)bc     F(4, 252) = 10.66***   .15
Depression (1)        33.43 (4.84)a   41.49 (6.95)c   38.41 (5.94)bc       41.27 (8.89)c     35.27 (5.04)ab    F(4, 253) = 17.19***   .21
Depression (2)        34.02 (5.21)a   42.15 (6.13)c   38.44 (6.11)bc       40.76 (8.17)c     35.96 (5.84)ab    F(4, 250) = 15.59***   .20
Adaptive ER (1)       2.28 (0.44)b    2.20 (0.35)ab   2.07 (0.41)ab        2.00 (0.49)a      2.29 (0.44)b      F(12, 649) = 4.69***   .07
Adaptive ER (2)       2.27 (0.41)ab   2.15 (0.42)ab   2.05 (0.41)ab        2.01 (0.47)a      2.30 (0.46)b      F(12, 619) = 4.13***   .07
Dysregulation (1)     1.73 (0.48)ab   2.00 (0.37)b    1.93 (0.50)ab        1.84 (0.53)ab     1.71 (0.45)a      F(12, 649) = 4.69***   .07
Dysregulation (2)     1.73 (0.46)a    2.04 (0.41)b    1.97 (0.48)ab        1.73 (0.54)a      1.76 (0.46)ab     F(12, 619) = 4.13***   .07
Inhibition (1)        1.79 (0.47)a    2.13 (0.45)b    2.17 (0.50)b         2.28 (0.37)b      2.07 (0.52)ab     F(12, 649) = 4.69***   .07
Inhibition (2)        1.89 (0.50)a    2.15 (0.43)ab   2.17 (0.51)ab        2.30 (0.40)b      2.07 (0.48)ab     F(12, 619) = 4.13***   .07

Notes. Values are M (SD). 1 = ECR-RC clusters; 2 = ECR-RC/SS clusters. Means not sharing subscripts differ significantly, as indicated by post hoc contrasts (Tukey, p < .05). ***p < .001.
In a second step, this five-cluster solution was subjected to the k-means clustering procedure. This procedure (see Figure 2) resulted in a very similar pattern of results as the cluster analysis relying only on the ECR-RC scores (see Figure 1). Participants in a first "secure attachment" cluster scored low on both attachment anxiety and avoidance and high on security (z = −0.71, z = −0.83, and z = 0.83, respectively). Participants in a second "preoccupied attachment" cluster scored high on anxiety but average on avoidance and low on security (z = 1.29, z = 0.39, and z = −1.13, respectively). Participants in a third "dismissive-avoidant attachment" cluster scored high on avoidance but average on anxiety and security (z = 0.97, z = 0.04, and z = 0.06, respectively). Participants in a fourth "fearful-avoidant attachment" cluster scored high on both attachment anxiety and avoidance but low on security (z = 1.06, z = 1.68, and z = −1.69, respectively). Finally, an undifferentiated cluster again emerged with scores that were average on all dimensions (z = 0.04, z = 0.15, and z = 0.06, for anxiety, avoidance, and security, respectively).

Between-Cluster Differences in Outcome Variables
When examining between-cluster differences in the outcome variables using the ECR-RC solution (see Table 3), the secure attachment cluster showed a more adaptive psychological profile for self-esteem, depression, and inhibition than the insecure clusters. For dysregulation and adaptive emotion regulation, the secure cluster did not consistently do better than the insecure clusters. Importantly, these results of between-cluster differences were practically identical when the cluster analysis on the ECR-RC and SS was used. Moreover, the percentages of explained variance were essentially identical (see Table 3).
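The between-cluster comparisons follow the standard ANOVA-plus-Tukey pattern described in the Statistical Analyses section. A minimal Python sketch on simulated stand-in data (the cluster labels and outcome here are placeholders, not the study's variables):

```python
# Sketch of the between-cluster comparisons: a one-way ANOVA per outcome,
# followed by Tukey HSD post hoc contrasts, on simulated stand-in data.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(4)
clusters = rng.choice(["secure", "preoccupied", "dismissive",
                       "fearful", "undifferentiated"], size=276)
# Simulated outcome with a modest advantage for the secure cluster.
self_esteem = rng.normal(size=276) + (clusters == "secure") * 0.6

groups = [self_esteem[clusters == c] for c in np.unique(clusters)]
f_value, p_value = stats.f_oneway(*groups)
print(f"F = {f_value:.2f}, p = {p_value:.3f}")
print(pairwise_tukeyhsd(self_esteem, clusters, alpha=0.05))
```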
Discussion

The results of this study suggest that the addition of a separate score for secure attachment (i.e., the SS) contributes little to the assessment of mother-child attachment in addition to scores for attachment anxiety and avoidance obtained using the ECR-RC. In terms of assessment, the present results provide support that the ECR-RC contains enough items that tap into the secure region of the anxiety/avoidance spectra. Three different results support this conclusion: (a) secure attachment generally did not add to the prediction of relevant outcomes beyond the variance explained by anxiety and avoidance, (b) the same attachment profiles were identified using cluster analysis on the ECR-RC scores alone or cluster analysis on both the ECR-RC and SS scores, and (c) similar between-cluster differences were found in the outcome variables when relying on both cluster solutions. In spite of the clear-cut results obtained in the present study, we want to highlight a number of important cautionary notes. First, it is important to bear in mind that the
present results were obtained with the specific self-report measures used in this study. As such, until other studies using a wider variety of measures (e.g., the IPPA) replicate the current findings, the conclusions drawn here apply only to the measures used in this study. Second, as the internal consistency of the CSMS Dysregulation scale was only borderline acceptable, the relationships obtained with this scale may have been less reliable. This may explain why the Tukey contrasts for dysregulation did not indicate significant differences between the ECR-RC-based secure cluster and any of the three insecure clusters. Third, to allow more general conclusions, future research including a larger number of participants and using more sophisticated latent statistical methods (e.g., factor mixture modeling) should try to replicate the findings. Future research may also extend our findings to other developmental periods (e.g., early childhood and adulthood), to other attachment figures (e.g., fathers, peers, and teachers), and to a broader range of adjustment measures (e.g., children's social adjustment).
References

Allen, J. P. (2008). The attachment system in adolescence. In J. Cassidy & P. R. Shaver (Eds.), Handbook of attachment: Theory, research, and clinical applications (pp. 419–435). New York, NY: The Guilford Press. Armsden, G. C., & Greenberg, M. T. (1987). The inventory of parent and peer attachment: Individual differences and their relationship to psychological well-being in adolescence. Journal of Youth and Adolescence, 16, 427–454. doi: 10.1007/bf02202939 Bartholomew, K., & Horowitz, L. M. (1991). Attachment styles among young adults: A test of a four-category model. Journal of Personality and Social Psychology, 61, 226–244. doi: 10.1037/0022-3514.61.2.226 Brenning, K., & Braet, C. (2013). The emotion regulation model of attachment: An emotion-specific approach. Personal Relationships, 20, 107–123. doi: 10.1111/j.1475-6811.2012.01399.x Brenning, K., Soenens, B., Braet, C., & Bal, S. (2012). The role of parenting and mother-adolescent attachment in the intergenerational similarity of internalizing symptoms. Journal of Youth and Adolescence, 41, 802–816. doi: 10.1007/s10964-011-9740-9 Brenning, K., Soenens, B., Braet, C., & Bosmans, G. (2011a). An adaptation of the Experiences in Close Relationships Scale-Revised for use with children and adolescents. Journal of Social and Personal Relationships, 28, 1048–1072. doi: 10.1177/0265407511402418 Brenning, K., Soenens, B., Braet, C., & Bosmans, G. (2011b). The role of depressogenic personality and attachment in the intergenerational similarity of depressive symptoms: A study with early adolescents and their mothers. Personality and Social Psychology Bulletin, 37, 284–297. doi: 10.1177/0146167210393533 Brenning, K., Soenens, B., Braet, C., & Bosmans, G. (2012). Attachment and depressive symptoms in middle childhood and early adolescence: Testing the validity of the emotion regulation model of attachment. Personal Relationships, 19, 445–464. doi: 10.1111/j.1475-6811.2011.01372.x Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Dwyer, K. M. (2005). The meaning and measurement of attachment in middle and late childhood. Human Development, 48, 155–182. doi: 10.1159/000085519 Finnegan, R. A., Hodges, E. V. E., & Perry, D. G. (1996). Preoccupied and avoidant coping during middle childhood. Child Development, 67, 1318–1328. doi: 10.1111/j.1467-8624.1996.tb01798.x Fraley, R. C., Waller, N. G., & Brennan, K. A. (2000). An item response theory analysis of self-report measures of adult attachment. Journal of Personality and Social Psychology, 78, 350–365. doi: 10.1037/0022-3514.78.2.350 Gore, P. A. (2000). Cluster analysis. In H. E. A. Tinsley & S. D. Brown (Eds.), Handbook of applied multivariate statistics and mathematical modeling (pp. 297–321). San Diego, CA: Academic Press. Harter, S. (1985). Manual for the self-perception profile for children. Denver, CO: University of Denver. Hazan, C., & Shaver, P. (1987). Romantic love conceptualized as an attachment process. Journal of Personality and Social Psychology, 52, 511–524. doi: 10.1037/0022-3514.52.3.511 Kerns, K. A., Klepac, L., & Cole, A. K. (1996). Peer relationships and preadolescents' perceptions of security in the child-mother relationship. Developmental Psychology, 32, 457–466. doi: 10.1037/0012-1649.32.3.457 Kerns, K. A., Tomich, P. L., Aspelmeier, J. E., & Contreras, J. M. (2000). Attachment-based assessments of parent-child relationships in middle childhood. Developmental Psychology, 36, 614–626. doi: 10.1037/0012-1649.36.5.614 Kovacs, M. (1985). The Children's Depression Inventory (CDI). Psychopharmacology Bulletin, 21, 995–998. Mikulincer, M., & Shaver, P. R. (2007). Attachment in adulthood: Structure, dynamics, and change. New York, NY: Guilford. Mikulincer, M., & Shaver, P. R. (2012). An attachment perspective on psychopathology. World Psychiatry, 11, 11–15. Milligan, G. W., & Cooper, M. C. (1985). An examination of procedures for determining the number of clusters in a data set. Psychometrika, 50, 159–179. doi: 10.1007/bf02294245 Mooi, E., & Sarstedt, M. (2011). Chapter 9: Cluster analysis. In E. Mooi & M. Sarstedt (Eds.), A concise guide to market research: The process, data, and methods using IBM SPSS statistics. Berlin, Germany: Springer. Saylor, C. F., Finch, A. J., Spirito, A., & Bennett, B. (1984). The Children's Depression Inventory: A systematic evaluation of psychometric properties. Journal of Consulting and Clinical Psychology, 52, 955–967. doi: 10.1037/0022-006x.52.6.955 Shaver, P. R., & Mikulincer, M. (2002). Attachment-related psychodynamics. Attachment and Human Development, 4, 133–161. Sibley, C. G., Fischer, R., & Liu, J. H. (2005). Reliability and validity of the revised experiences in close relationships (ECR-R) self-report measure of adult romantic attachment. Personality and Social Psychology Bulletin, 31, 1524–1536. doi: 10.1177/0146167205276865 Sroufe, L. A., Egeland, B., Carlson, E. A., & Collins, W. A. (2005). The development of the person: The Minnesota study of risk and adaptation from birth to adulthood. New York, NY: Guilford Press. Tsagarakis, M., Kafetsios, K., & Stalikas, A. (2007). Reliability and validity of the Greek version of the revised experiences in close relationships measure of adult attachment. European Journal of Psychological Assessment, 23, 47–55. doi: 10.1027/1015-5759.23.1.47 Van den Bergh, B. R. H., & Marcoen, A. (1999). Harter's self-perception profile for children: Factor structure, reliability, and convergent validity in a Dutch-speaking Belgian sample of fourth, fifth and sixth graders.
Psychologica Belgica, 39, 29–47.
Van Ryzin, M. J., & Leve, L. D. (2012). Validity evidence for the security scale as a measure of perceived attachment security in adolescence. Journal of Adolescence, 35, 425–431. doi: 10.1016/j.adolescence.2011.07.014 Yunger, J. L., Corby, B. C., & Perry, D. G. (2005). Dimensions of attachment in middle childhood. In K. A. Kerns & R. A. Richardson (Eds.), Attachment in Middle Childhood (pp. 46–70). New York, NY: The Guilford Press. Zeman, J., Shipman, K., & Penza-Clyve, S. (2001). Development and initial validation of the children’s sadness management scale. Journal of Nonverbal Behavior, 25, 187–205. doi: 10.1023/a:1010623226626
Katrijn Brenning
Department of Developmental, Personality and Social Psychology
Ghent University
H. Dunantlaan 2
9000 Ghent
Belgium
E-mail Katrijn.Brenning@Ugent.be
Date of acceptance: January 10, 2015 Published online: June 26, 2015
Original Article
Mindful Attention Awareness in Spanish Palliative Care Professionals: Psychometric Study With IRT and CFA Models

Laura Galiana,1 Amparo Oliver,1 Noemí Sansó,2 M. Dolores Sancerni,1 and José M. Tomás1

1Department of Methodology for the Behavioral Sciences, University of Valencia, Spain, 2Ibsalut, Palliative Care Program of the Balearic Islands, Spain, and University of the Balearic Islands, Spain

European Journal of Psychological Assessment 2017; Vol. 33(1):14–21 DOI: 10.1027/1015-5759/a000265

Abstract. Mindfulness is conceived as a state in which the individual pays full attention to everything that is happening around him or her. The Mindful Attention Awareness Scale (MAAS) is the most popular instrument for assessing mindfulness, but studies on its structure have shown some conflicting results. This study aims to offer new evidence on the dimensionality and reliability of the MAAS in palliative care professionals, using both SEM and IRT procedures. The sample was composed of 385 professionals from a national online survey. First, two Confirmatory Factor Analyses (CFAs), with one- and two-factor structures, respectively, were specified, estimated, and tested. Second, the Graded Response Model (GRM) was used, and the accuracy of the MAAS was estimated using information functions. Results showed appropriate fit for the two CFA models. As the correlation between the two factors in the two-factor model was extremely high and the original authors posited a one-factor solution, this structure was retained for parsimony. The GRM also supported this structure, but found that the scale offered more information on professionals with lower levels of mindfulness, pointing to items 1, 2, 6, and 15 as the least discriminative, in line with the lower CFA factor loadings for these very same items.

Keywords: Mindful Attention Awareness Scale (MAAS), structural equation models, Graded Response Models, dimensionality, information function
Mindfulness is conceived as a state in which the individual pays full and minute attention to everything that is happening around him or her (León, Fernández, Grijalvo, & Núñez, 2013). Traditionally, mindfulness has referred to the awareness that emerges from paying attention, on purpose, to the present moment and from focusing on the unfolding of the individual's immediate experience (Kabat-Zinn, 1982, 1990); that is to say, it involves sustained attention, but not an assessment of what is processed (Grossman, Niemann, Schmidt, & Walach, 2004). Mindfulness has therefore been identified as a skill that can be taught and that can thereby improve health in the context of stress reduction (Baer, 2006; Grossman et al., 2004; Veehof, Oskam, Schreurs, & Bohlmeijer, 2011; Zeidan, Gordon, Merchant, & Goolkasian, 2010). Mindfulness has also been considered an important key to maintaining and improving well-being (Grossman et al., 2004). In the specific area of palliative care, well-being is of great importance: These professionals face extreme psychological challenges and experience emotional distress on a regular basis, experiences that may lead to burn-out (Cole & Carlin, 2009; Meier, Back, & Morrison, 2001; Pereira, Fonseca, & Carvalho, 2011; Peters et al., 2012).
A closer look at the protective factors promoting palliative care professionals' well-being and work satisfaction has revealed the importance of developing the clinicians' mindful awareness (Cole, 1997; Dobkin, 2011; Epstein, 1999; Novack, Epstein, & Paulsen, 1999; Novack et al., 1997). Mindful awareness is, then, a construct to be considered when studying palliative care professionals' well-being, burn-out, and job performance. Several measurement instruments have been developed to assess mindfulness; the debate continues with regard to its measurement and conceptualization (Sauer, Ziegler, Danay, Ives, & Kohls, 2013), and evidence on the validity of existing measures is needed. Some examples are the Freiburg Mindfulness Inventory (Buchheld, Grossman, & Walach, 2001), the Kentucky Inventory of Mindfulness Skills (Baer, Smith, & Allen, 2004), the Five Facet Mindfulness Questionnaire (Baer, Smith, Hopkins, Krietemeyer, & Toney, 2006), the Cognitive and Affective Mindfulness Scale-Revised (Feldman, Hayes, Kumar, Greeson, & Laurenceau, 2007), the Southampton Mindfulness Questionnaire (Chadwick et al., 2008), and the Toronto Mindfulness Scale (Davis, Lau, & Cairns, 2009). Among them, the Mindful Attention Awareness Scale (MAAS; Brown & Ryan, 2003) is the most popular one, with already more than 350 citations in the Web of Science (Grossman, 2011).
The MAAS is a 15-item self-report instrument assessing the frequency of mindful states in daily life (Brown & Ryan, 2003). Its first validation, in a sample of undergraduate students, offered evidence of a one-factor latent structure (Brown & Ryan, 2003). Although the scale has been subject to several criticisms (Grossman, 2008, 2011), it remains prevalent in the scientific literature, with over 1,100 citations in the Web of Science (three times the number reported by Grossman in his 2011 criticism). Since its first validation, the MAAS has been translated into and validated in many languages, including Spanish (Cebolla, Luciano, Piva DeMarzo, Navarro-Gil, & Garcia Campayo, 2013; Inchausti, Prieto, & Delgado, 2014; León et al., 2013; Soler et al., 2012). However, studies on the Spanish version of the MAAS have shown inconsistent results as regards its factor structure. Soler et al. (2012) and León et al. (2013), using confirmatory factor analysis, found support for the one-factor structure of the scale in both clinical and nonclinical samples. Meanwhile, Inchausti et al. (2014), applying Item Response Theory models, found evidence for a one-factor structure, but only after removing three items. Cebolla et al. (2013), in turn, recently called for a more comprehensive study of the scale, as their own research offered inconsistent evidence supporting a two-factor structure that was, however, very close in fit to the one-factor solution. Scales for which psychometric results are controversial or lack consensus, like this one, may benefit from adding item-focused evidence of validity (Elosúa, 2003) to the usual structural equation modeling (SEM) framework. Item Response Theory (IRT) models, which are concerned more with the items' psychometric behavior than with the psychometric behavior of the scale as a whole, are appropriate for the assessment of construct validity. In this context, this study aims to offer new evidence on the dimensionality and reliability of the MAAS, using both SEM and IRT procedures, and validating it in a population understudied in this research arena: palliative care professionals.
Method

Design, Procedure, and Sample

A cross-sectional survey of Spanish palliative care professionals was conducted using the online platform Survey Monkey, which meets standard confidentiality and security requirements: It does not save respondents' computer IDs and prevents users from entering the platform a second time once the survey has been answered. All medical doctors, nurses, nursing assistants, psychologists, and social workers who were associate members of the Spanish Society of Palliative Care (SECPAL; population = 1,309) were invited to answer the survey, which included, among other instruments, the MAAS. The response rate was 33.07% (433 respondents), 385 of whom met the inclusion criterion of being palliative care professionals. Most of the sample (77.5%) were women; 40.3% were doctors, 33.1% nurses, 14.2% psychologists, 4.8% nursing assistants, and 4.0% social workers, and the remaining 0.8% had more than one of these professions. Participants completed a web-based survey of about 30 min, which included demographic, personal, and professional data, as well as several scales related to professionals' well-being.

Instruments

Among the several instruments used in this research, the Mindful Attention Awareness Scale (MAAS; Cebolla et al., 2013) is the one under scrutiny. It is a 15-item instrument measuring the general tendency to be aware and conscious of one's own experiences of daily life. Sample items are: "I could experience an emotion and not be conscious of it until later" and "I find it difficult to stay focused on what is happening in the present." Items are rated on a 6-point Likert-type scale ranging from 1 (= almost always) to 6 (= almost never). Alpha for the scale in this sample was .90.

Data Analyses

For validation of the scale, a dual strategy for the statistical analyses was employed: structural equation models combined with item response theory models. This strategy has been used several times (see, e.g., Reise, Widaman, & Pugh, 1993; Van Dam, Earleywine, & Borders, 2010; Wang & Russell, 2005). Two structural equation models, specifically two confirmatory factor analyses (CFAs), were specified, estimated, and tested: the first with an a priori one-dimensional structure based on the original structure proposed by the authors of the scale (Brown & Ryan, 2003), and the second with a two-factor structure following the results obtained by Cebolla et al. (2013). Given the 6-point Likert-type scale, the estimation method was maximum likelihood with robust corrections of standard errors and polychoric correlations, as recommended for ordinal and non-normal data (Finney & DiStefano, 2006). Goodness-of-fit was assessed using multiple criteria (Hu & Bentler, 1999; Tanaka, 1993): (a) the robust or scaled chi-square statistic, with a significant test statistic casting doubt on the model specification; (b) the Comparative Fit Index (CFI; Bentler, 1990), with values of more than .90 (and, ideally, greater than .95; Hu & Bentler, 1999) indicating good fit; (c) the Root Mean Squared Error of Approximation (RMSEA; Steiger & Lind, 1980), with values of .05 or less indicative of adequate fit; and (d) the Standardized Root Mean Square Residual (SRMR), with values of less than .08 indicative of good fit (Kline, 2011). A model with a CFI of at least .95 together with an RMSEA of less than .08 for small samples (N < 250) or less than .06 for large samples would indicate a good fit between the hypothesized model and the data (Hu & Bentler, 1999). The same authors also proposed a two-index strategy in which both a CFI > .95 and an SRMR < .08 reveal adequate fit.
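To make the CFA specification concrete, here is a minimal Python sketch using the semopy package. This is an assumption-laden illustration: the article estimated these models in EQS 6.1, not Python, and the data file and item column names maas1 to maas15 are hypothetical.

```python
# One-factor CFA sketch for the MAAS with semopy; the two-factor model
# would simply split the items across two latent variables.
import pandas as pd
import semopy

df = pd.read_csv("maas_items.csv")  # hypothetical file, columns maas1..maas15

items = " + ".join(f"maas{i}" for i in range(1, 16))
model = semopy.Model(f"mindfulness =~ {items}")
model.fit(df)

# calc_stats reports chi-square, CFI, RMSEA, AIC, and related fit indices.
print(semopy.calc_stats(model).T)
```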
Differences between the CFIs of the two models were calculated in order to compare them. Additionally, the Akaike Information Criterion (AIC) and the Consistent Akaike Information Criterion (CAIC), which incorporates a penalty for model complexity, were employed for model comparison. Whereas some authors consider a CFI difference of .05 or less negligible (Little, 1997), others suggest that this difference should not exceed .01 (Cheung & Rensvold, 2002). Models with good fit and the lowest AIC and CAIC are preferred. Regarding the IRT approach, the Graded Response Model (GRM) was used. This is an extension of the two-Parameter Logistic Model (2PLM), and in this case the stochastic local independence assumption was tested using the residual correlation matrix. The GRM was used to assess the metric quality of the response categories because the estimated item discriminations ranged from .25 to .73; therefore, the equal-discrimination assumption of the polytomous extension of the Rasch model was not tenable. The GRM is, as Hambleton, van der Linden, and Wells (2010) stated, "a simple yet elegant extension of the 2PLM, and one of the most popular IRT models to address polytomous data" (p. 27). Additional support for choosing a two-parameter model over a one-parameter one came from the previous CFA models, which showed clear differences among the factor loadings; since factor loadings in CFA are analogous to item discrimination parameters in IRT (Chan, 2000; Ferrando, 1996; Widaman & Reise, 1997), an assumption of equal discrimination did not seem plausible. Although a CFA was previously used to assess the scale's dimensionality, this model is not appropriate to fully evaluate IRT model requirements (Smith, 2002). In order to test unidimensionality and local independence, and following Abad, Olea, Ponsoda, and García (2011), Exploratory Factor Analysis (EFA) and residual correlation analyses were used. The unidimensionality assumption was examined via parallel analysis (Horn, 1965; O'Connor, 2000), and local independence via residual correlation analyses. Both assumptions were supported, and the graded response model (Samejima, 1970, 1997) was used. This model is a generalization of the 2PLM to ordered polytomous items. It describes the behavior of each item using a parameters for discrimination and b parameters indicating the positions of the operating characteristic curves, that is, the probability of choosing a given response category or a higher one. Thus, the value of each of these parameters reflects the trait level required to have a .50 probability of choosing the alternative or the alternatives above it. Additionally, the model provides goodness-of-fit measures for items (chi-square test) and for the full scale (chi-square test and alpha coefficient). The accuracy of measurement was also examined using information functions. Finally, a sex-based Differential Item Functioning (DIF) analysis was conducted to determine the effect on the scale of potential latent factors outside the construct; the Mantel-Haenszel procedure for polytomous items was used. Structural equation models were estimated in EQS 6.1 (Bentler, 1999), XCalibre 4.1.8 (Guyer & Thompson, 2012) was used for IRT-based estimation, and parallel analysis was undertaken via O'Connor's procedure in SPSS 19 (O'Connor, 2000).
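The a and b parameters just described correspond to Samejima's cumulative logistic formulation. For reference (the formula itself is not printed in the article), for item i with ordered categories k = 1, ..., m:

```latex
% Cumulative probability of responding in category k or above:
P^{*}_{ik}(\theta) = \frac{1}{1 + \exp\left[-a_i\left(\theta - b_{ik}\right)\right]},
% probability of responding exactly in category k:
P_{ik}(\theta) = P^{*}_{ik}(\theta) - P^{*}_{i,k+1}(\theta),
% with the conventions P*_{i1}(theta) = 1 and P*_{i,m+1}(theta) = 0.
```

At theta = b_{ik}, the probability of choosing category k or a higher one is exactly .50, which is the interpretation of the b parameters given above.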
Results

Confirmatory Factor Analyses

The two confirmatory factor analyses shown in Figure 1 were specified, estimated, and tested. Table 1 shows the fit indexes for both models. Overall fit indexes showed appropriate and almost identical fit for the two models. A chi-square difference test slightly favored the two-factor model (Δχ² = 3.84, Δdf = 1, p = .049). However, the correlation between the two factors in the two-factor model was extremely high (r = .964; 95% confidence interval .912–1), showing a lack of discriminant validity that casts doubt on the existence of two different latent variables. The CFIs were excellent, the RMSEAs were close to the strictest cut-off, and the SRMRs were also adequate. The difference between the two CFIs was negligible (ΔCFI = .001), and the lowest CAIC was that of the one-factor solution.
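As a quick illustration of the chi-square difference logic, here is a generic Python sketch. Note one caveat: with robust (Satorra-Bentler scaled) chi-squares, as used in this article, the proper comparison is a scaled difference test rather than the raw subtraction shown below.

```python
# Chi-square difference test between two nested models (plain version).
from scipy.stats import chi2

def chi2_difference(chi2_restricted, df_restricted, chi2_full, df_full):
    """Return (delta chi2, delta df, p) for the nested-model comparison."""
    delta = chi2_restricted - chi2_full
    ddf = df_restricted - df_full
    return delta, ddf, chi2.sf(delta, ddf)

# The article reports Delta chi2 = 3.84 with Delta df = 1:
print(round(chi2.sf(3.84, 1), 3))  # about .05, consistent with p = .049
```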
Graded Response Model

Two IRT models were considered: the 1PLM and the 2PLM. In order to test which model better fitted the data (taking parsimony into account), a chi-square difference test was calculated, and the significant result showed that the 2PLM was needed: Δχ² = 237.185, Δdf = 15, p < .001. Moreover, the AIC, BIC, and CAIC were calculated to compare the models; the 1PLM criteria (AIC = 8,960, BIC = 8,962, CAIC = 8,965.92) were larger than those for the 2PLM (AIC = 8,546, BIC = 8,553.84, CAIC = 8,555.84), thus reinforcing the conclusion that the 2PLM was adequate. Accordingly, the 2PLM (Graded Response Model) was used. As the Graded Response Model requires participants' answers at every point of the scoring scale, and some were missing for items 1, 2, 5, and 6, the scale format was reduced from six to four alternatives by collapsing alternatives one and two and alternatives five and six. This procedure was carried out for GRM estimation purposes only. The assumptions of local independence and unidimensionality were verified by principal component analysis and supported by the confirmatory factor analysis results presented in the previous section. According to the principal component analysis, the correlation matrix was appropriate given the high KMO value (.91), and the dominant factor explained 44.78% of the variance (eigenvalue = 6.72). The second factor had an eigenvalue of 1.16, explaining 7.78% of the variance. A parallel analysis based on 1,000 replications found that the minimum value for this second eigenvalue to represent a substantive factor should have been 1.32; therefore, the scale may be considered unidimensional according to this procedure.
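For readers who want to reproduce the parallel-analysis criterion, a minimal NumPy sketch follows. The sample size and item count match the study (385 respondents, 15 items), but the observed eigenvalues would of course come from the real data matrix, which is not available here:

```python
# Horn's (1965) parallel analysis via Monte Carlo, as in O'Connor (2000).
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 385, 15, 1000
random_eigs = np.empty((reps, p))
for r in range(reps):
    x = rng.standard_normal((n, p))
    # Eigenvalues of the correlation matrix of pure-noise data, descending.
    random_eigs[r] = np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))[::-1]

# 95th-percentile thresholds: an observed eigenvalue must exceed its
# threshold to indicate a substantive factor. The article reports a
# threshold of 1.32 for the second eigenvalue, which 1.16 did not reach.
print(np.percentile(random_eigs, 95, axis=0).round(2))
```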
Figure 1. Standardized factor loadings for the MAAS confirmatory factor analysis. All factor loadings were statistically significant (p < .01). For the sake of clarity, errors are not shown.
Table 1. Fit indexes for the two confirmatory models of MAAS

Model         χ²        df    p       AIC     CAIC      CFI    SRMR   RMSEA   RMSEA 90% CI
One-factor    253.327   90    < .01   73.32   365.163   .980   .052   .072    .061–.082
Two-factor    249.103   89    < .01   71.10   362.515   .981   .053   .071    .061–.082

Notes. df = degrees of freedom; CI = confidence interval; AIC = Akaike Information Criterion; CAIC = Consistent Akaike Information Criterion; CFI = Comparative Fit Index; SRMR = Standardized Root Mean Square Residual; RMSEA = Root Mean Square Error of Approximation.
In addition to all this evidence, the decision to maintain a one-factor solution was corroborated by Cattell's criterion. With respect to local independence, this assumption is tenable if residual correlations are close to zero (and not above .10). Only some residuals were greater than .10, namely for items 1, 2, and 15, those with the lowest loadings in the EFA. The other items had residual correlations lower than .10, and therefore local independence was tenable. The MAAS with its 15 items showed an adequate fit (χ²(615) = 579.89, p = .842), and alpha had a very good value: .886. Item functioning is shown in Table 2. As shown there, the discrimination parameter values (a) ranged from 0.91 to 2.47, with items 1, 2, 6, and 15 being the least discriminative. Misfit was found for items 6 and 11. Table 3 shows the trait position parameters (b). As can be seen, the b parameters for every item were not close to each other, supporting the suitability of the response alternatives. Estimation errors were slightly higher for the lowest response option, due to the negative asymmetry of the distribution, but were always less than 0.30 and, therefore, adequate. The test information function (Figure 2) had its maximum value at a trait level of -1.5, indicating that the instrument was especially useful for people with a low positioning on the trait or, in other words, with a relatively low level of mindfulness. The accuracy decreased at higher levels of the trait, as shown in the information function displayed in Figure 2.
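To illustrate how an information function like the one in Figure 2 follows from the parameters in Tables 2 and 3, here is a small NumPy sketch for a single item. Item 8's discrimination is taken from Table 2, but the threshold signs are illustrative assumptions (the extracted tables print unsigned values), and this is a sketch rather than the authors' code:

```python
# Category probabilities and item information for one graded response item,
# with the information function obtained by numerical differentiation.
import numpy as np

def grm_probs(theta, a, bs):
    """P_k(theta) for each ordered category of a GRM item."""
    theta = np.asarray(theta, dtype=float)
    cum = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - np.asarray(bs))))
    cum = np.hstack([np.ones((len(theta), 1)), cum, np.zeros((len(theta), 1))])
    return cum[:, :-1] - cum[:, 1:]

theta = np.linspace(-4, 4, 801)
a = 2.473                         # item 8, Table 2
bs = [-2.656, -1.450, -0.262]     # Table 3 values with assumed (negative) signs
h = 1e-5
p = grm_probs(theta, a, bs)
dp = (grm_probs(theta + h, a, bs) - grm_probs(theta - h, a, bs)) / (2 * h)
info = (dp ** 2 / p).sum(axis=1)  # item information function
print(round(theta[np.argmax(info)], 2))  # peaks at a low (negative) theta
```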
Table 2. Item parameters for all calibrated items of MAAS

Item   Mean    R      a      a SE    χ²       p
1      3.243   0.349  0.757  0.048   52.178   0.113
2      3.639   0.381  0.972  0.068   46.525   0.255
3      3.559   0.578  1.553  0.107   31.043   0.870
4      2.992   0.668  1.549  0.097   36.162   0.685
5      3.123   0.598  1.356  0.086   36.433   0.674
6      2.772   0.443  0.910  0.057   64.389   0.011
7      3.448   0.740  2.170  0.147   20.037   0.998
8      3.426   0.788  2.473  0.170   24.910   0.978
9      3.485   0.676  1.908  0.130   44.784   0.316
10     3.432   0.795  2.591  0.179   31.125   0.868
11     3.198   0.597  1.366  0.086   61.897   0.019
12     3.573   0.582  1.557  0.108   26.371   0.963
13     3.432   0.658  1.726  0.117   32.459   0.827
14     3.493   0.739  2.233  0.154   34.036   0.771
15     3.672   0.340  0.987  0.069   37.541   0.625

Notes. R = correlation; a = discrimination parameter; SE = standard error.
Thus, the results indicate that the MAAS does not discriminate well at high trait levels. Finally, the Mantel-Haenszel (MH) procedure for polytomous items showed an absence of DIF in all items except item 6. Delta MH indicates DIF if values are higher than 1.5; item 6 had a Delta MH of 1.52, whereas all the other values ranged from 0.43 to 1.43. This item thus showed inadequate functioning across the different analyses performed.
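For reference, the Delta MH statistic used here is the ETS rescaling of the Mantel-Haenszel common odds ratio; in its basic dichotomous form (the polytomous variant used in the article generalizes the same idea):

```latex
\hat{\alpha}_{MH} = \frac{\sum_{j} A_j D_j / T_j}{\sum_{j} B_j C_j / T_j},
\qquad
\Delta_{MH} = -2.35 \,\ln \hat{\alpha}_{MH},
```

where, at each matched trait-score level j, A_j and B_j are the reference group's correct and incorrect response counts, C_j and D_j are the corresponding focal-group counts, and T_j is the total number of respondents at that level; absolute Delta MH values above 1.5 are conventionally read as sizable DIF, as in the text.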
Discussion
Figure 2. Test Information Function. Reproduced with permission from Assessment Systems Corporation, 2012.
Mindfulness has been considered a protective factor against burn-out, as it can help professionals to maintain and improve their well-being. Its study is therefore of great interest in a particularly stressful context like palliative care. Several instruments have been developed to assess mindful attention, the most widely used being the Mindful Attention Awareness Scale (MAAS). This measure has been used to predict several psychological outcomes, and "research is accumulating to show that the scale predicts overt behavior related to attention and behavior regulation" (Brown, Ryan, Loverich, Biegel, & West, 2011, p. 1045). The aim of this study was to clarify the dimensionality of its Spanish version in a sample of palliative care professionals, combining two techniques: structural equation models and item response theory models. Regarding the confirmatory factor analyses, two a priori models were tested: one with a one-factor structure, as proposed by the original authors of the scale (Brown & Ryan, 2003) and supported by previous evidence in the Spanish scientific literature (León et al., 2013; Soler et al., 2012), and a second one with a two-factor structure, following Cebolla et al.'s (2013) research. Results showed a good fit for both CFA models. Although, from a statistical perspective, the chi-square difference supported the two-factor model (p = .049), it is well recognized that chi-square is extremely sensitive, and practical fit criteria have therefore been defended (Little, 1997; Cheung & Rensvold, 2002). The negligible differences in CFIs, the lower CAIC value for the one-dimensional solution, the lower SRMR, the extremely high factor correlation (not significantly different from one), and theoretical considerations (Brown & Ryan, 2003) made us take the one-factor model as the more appropriate representation of the data, owing to its parsimony. As regards the information offered by the Graded Response Model, although the fit of the model was appropriate, some items of the scale were not especially discriminative, as was the case for items 1, 2, 6, and 15. The items with the highest discrimination parameters were items 7, 8, 9, 10, and 14, the same ones found by Van Dam et al. (2010). Although the original authors of the scale have argued that "states reflecting less mindlessness are likely more accessible to most individuals" (Brown & Ryan, 2003, p. 826), the items with higher discrimination power were the more general ones, related to functioning on automatic pilot.
Table 3. Category statistics of MAAS

Item   b1      b1 SE   b2      b2 SE   b3      b3 SE
1      4.650   0.334   2.252   0.160   0.268   0.134
2      4.865   0.422   2.906   0.172   1.225   0.117
3      3.120   0.215   1.832   0.101   0.784   0.078
4      1.833   0.109   0.705   0.070   0.237   0.071
5      2.206   0.129   1.072   0.082   0.086   0.080
6      2.143   0.146   0.501   0.096   0.803   0.107
7      2.746   0.183   1.483   0.083   0.325   0.063
8      2.656   0.173   1.450   0.078   0.262   0.058
9      2.928   0.201   1.755   0.097   0.406   0.069
10     2.459   0.149   1.412   0.075   0.302   0.057
11     2.535   0.154   1.224   0.088   0.046   0.081
12     3.108   0.206   2.064   0.112   0.704   0.080
13     2.379   0.136   1.425   0.079   0.507   0.069
14     2.627   0.166   1.559   0.084   0.423   0.062
15     4.618   0.372   2.920   0.166   1.425   0.118

Note. b = difficulty parameter; SE = standard error.
[Figure 2 plots the test information function (TIF), on a scale up to 15.0, against trait levels θ from -4 to 4.]
Following Van Dam and colleagues' argument, this leads us to believe that "a reversal of total scale score seems an unlikely means of measuring the opposite (mindfulness) of what the items represent" (Van Dam et al., 2010, p. 808). Thus, an overall examination of the scale indicated that the MAAS is especially suited to detecting lower levels of awareness, or mindlessness. The scale has been criticized for using only negatively worded items, which may lead it to measure mindlessness instead of mindfulness (Van Dam et al., 2010); in response, recent research (Höfling, Moosbrugger, Schermelleh-Engel, & Heidenreich, 2011) has formulated a new version of the MAAS with both positively and negatively worded items. Although some response bias effects were found in this version, a single trait mindfulness factor was found to strongly affect all items, whether positive or negative. It is worth noting that the items with lower discrimination power were also the ones with the lowest (although adequate) factor loadings in the structural analysis and, vice versa, items with higher discrimination parameters were those with higher factor loadings. This is not surprising, since several authors have noted that factor loadings in CFA are analogous to item discrimination parameters in IRT (Chan, 2000; Ferrando, 1996; Widaman & Reise, 1997). Therefore, the two statistical models reinforce each other's conclusions. The current results provide some evidence toward disentangling the question that has emerged on the structure of the Spanish version of the MAAS (Brown & Ryan, 2003). As follows from what has been reported here, the scale, generally considered to measure mindful attention awareness, emerges in its Spanish version as unidimensional. These outcomes are in line with the original presentation of the scale and with the Spanish evidence gathered in the studies of Soler et al. (2012) and León et al. (2013). Although this somewhat contrasts with the structure proposed by Cebolla et al. (2013), it should be borne in mind that their study was carried out on patients with fibromyalgia, a different population, and that their two-factor model did not substantially differ from the unidimensional solution. Thus, particularities in the MAAS structure could be due to the context of its use, and new validations of the scale in different populations would be welcome. This approach to validity is in line with the one proposed by Kane (2013), who advocated validating the interpretations and uses of scores and offered specific guidelines for concrete populations. A further conclusion is drawn from the study of the scale information function. The use of the MAAS is especially appropriate in populations with lower levels of awareness, as the instrument is not discriminative enough at higher scores. This result accords with one of Grossman's concerns: He has recently pointed to the inappropriateness of the negative formulation of the questions in the MAAS (Grossman, 2011). As Reise and Waller (2009) explained, "variation at the low end of the scale is less informative in both substantive as well as a psychometric sense" (p. 31). The consequences of these results are interesting for the applied context, as they point directly to the need to strengthen items with greater discriminative power at the highest levels of awareness; in other words, there is a need for reviewing the scale's validity. As this latter conclusion derives from the use of Graded Response Models, one of the most used and structured models for the behavior of ordinal items (Ponsoda, Revuelta, & Abad, 2006), it seems advisable to direct future research on the dimensionality and psychometric properties of measurement instruments toward the use not only of Structural Equation Modeling techniques but also of models within the Item Response Theory framework. This paper has the aforementioned strengths but also some drawbacks. One limitation is inherent to the use of the MAAS. As Grossman (2008, 2011) has explained in depth, some specific concerns regarding this instrument are its content validity, the absence of a consensus on the definition of mindfulness, the problems of measuring self-perceptions, and the selection of inadequate populations (i.e., students). It must be borne in mind, however, that some of these drawbacks are common to most mindfulness instruments, as Grossman (2008) himself acknowledges, and to many psychological constructs. Another limitation is that the results are restricted to the Spanish adaptation of the MAAS in a specific sample of palliative care professionals, and generalization of the results to the original version can only be speculative.
References

Abad, F. J., Olea, J., Ponsoda, V., & García, C. (2011). Medición en ciencias sociales y de la salud [Measurement in social sciences and health]. Madrid, Spain: Síntesis. Baer, R. A. (2006). Mindfulness-based treatment approaches: Clinician's guide to evidence base and applications. London, UK: Academic Press. Baer, R. A., Smith, G. T., & Allen, K. B. (2004). Assessment of mindfulness by self-report: The Kentucky inventory of mindfulness skills. Assessment, 11, 191–206. Baer, R. A., Smith, G., Hopkins, T., Krietemeyer, J., & Toney, L. (2006). Using self-report assessment methods to explore facets of mindfulness. Assessment, 13, 27–45. Bentler, P. M. (1990). Comparative fit indices in structural models. Psychological Bulletin, 107, 238–246. Bentler, P. M. (1999). EQS 6 Structural Equations Program Manual. Encino, CA: Multivariate Software. Brown, K. W., & Ryan, R. M. (2003). The benefits of being present: Mindfulness and its role in psychological well-being. Journal of Personality and Social Psychology, 84, 822–848. Brown, K. W., Ryan, R. M., Loverich, T. M., Biegel, T. M., & West, A. M. (2011). Out of the armchair and into the streets: Measuring mindfulness advances knowledge and improves interventions: Reply to Grossman (2011). Psychological Assessment, 23, 1041–1046. Buchheld, N., Grossman, P., & Walach, H. (2001). Measuring mindfulness in insight meditation (vipassana) and meditation-based psychotherapy: The development of the Freiburg Mindfulness Inventory (FMI). Journal for Meditation and Meditation Research, 1, 11–34. Cebolla, A., Luciano, J. V., Piva DeMarzo, M., Navarro-Gil, M., & Garcia Campayo, J. (2013). Psychometric properties of the Spanish version of the Mindful Attention Awareness Scale (MAAS) in patients with fibromyalgia. Health and Quality of Life Outcomes, 11, 6. Chadwick, P., Hember, M., Symes, J., Peters, E., Kuipers, E., & Dagnan, D. (2008). Responding mindfully to unpleasant thoughts and images: Reliability and validity of the
Southampton Mindfulness Questionnaire (SMQ). The British Journal of Clinical Psychology, 47, 451–455. Chan, D. (2000). Detection of differential item functioning on the Kirton Adaptation-Innovation Inventory using multiple-group mean and covariance structure analysis. Multivariate Behavioral Research, 35, 169–199. Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255. Cole, R. (1997). Meditation in palliative care – a practical tool for self-management. Palliative Medicine, 11, 411–413. Cole, T. R., & Carlin, N. (2009). The suffering of physicians. The Lancet, 374, 1414–1415. Davis, K. M., Lau, M. A., & Cairns, D. R. (2009). Development and preliminary validation of a trait version of the Toronto Mindfulness Scale. Journal of Cognitive Psychotherapy, 23, 185–197. Dobkin, P. L. (2011). Mindfulness and whole person care. In T. A. Hutchinson (Ed.), Whole person care: A new paradigm for the 21st century (pp. 69–82). New York, NY: Springer Science. Elosúa, P. (2003). Sobre la validez de los tests [On tests' validity]. Psicothema, 15, 315–321. Epstein, R. M. (1999). Mindful practice. JAMA, 282, 833–839. Feldman, G., Hayes, A., Kumar, S., Greeson, J., & Laurenceau, J. P. (2007). Mindfulness and emotional regulation: The development and initial validation of the Cognitive and Affective Mindfulness Scale-Revised (CAMS-R). Journal of Psychopathology and Behavioral Assessment, 29, 177–190. Ferrando, P. J. (1996). Calibration of invariant item parameters in a continuous item response model using the extended LISREL measurement submodel. Multivariate Behavioral Research, 31, 419–439. Finney, S. J., & DiStefano, C. (2006). Non-normal and categorical data in SEM. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (pp. 269–314). Greenwich, CT: Information Age Publishing. Grossman, P. (2008). On measuring mindfulness in psychosomatic and psychological research. Journal of Psychosomatic Research, 64, 405–408. Grossman, P. (2011). Defining mindfulness by how poorly I think I pay attention during everyday awareness and other intractable problems for psychology's (re)invention of mindfulness: Comment on Brown et al. (2011). Psychological Assessment, 23, 1034–1040. Grossman, P., Niemann, L., Schmidt, S., & Walach, H. (2004). Mindfulness-based stress reduction and health benefits: A meta-analysis. Journal of Psychosomatic Research, 57, 35–43. Guyer, R., & Thompson, N. A. (2012). User's manual for Xcalibre item response theory calibration software, version 4.1.8. St. Paul, MN: Assessment Systems Corporation. Hambleton, R. K., van der Linden, W. J., & Wells, C. S. (2010). IRT models for the analysis of polytomously scored data: Brief and selected history of model building advances. In M. L. Nering & R. Ostini (Eds.), Handbook of polytomous item response models (pp. 21–42). New York, NY: Routledge. Höfling, V., Moosbrugger, H., Schermelleh-Engel, K., & Heidenreich, T. (2011). Mindfulness or mindlessness? A modified version of the Mindful Attention and Awareness Scale (MAAS). European Journal of Psychological Assessment, 27, 59–64. Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179–185. Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.
Inchausti, F., Prieto, G., & Delgado, A. R. (2014). Análisis de la versión española de la escala Mindful Attention Awareness Scale en una muestra clínica [Rasch analysis of the Spanish version of the Mindful Attention Awareness Scale (MAAS) in a clinical sample]. Revista de Psiquiatría y Salud Mental, 7, 32–41. Kabat-Zinn, J. (1982). An outpatient program in behavioral medicine for chronic pain patients based on the practice of mindfulness meditation: Theoretical considerations and preliminary results. General Hospital Psychiatry, 4, 33–47. Kabat-Zinn, J. (1990). Full catastrophe living: Using the wisdom of your body and mind to face stress, pain, and illness. New York, NY: Delta. Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50, 1–73. Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: The Guilford Press. León, J., Fernández, C., Grijalvo, F., & Núñez, J. L. (2013). Assessing mindfulness: The Spanish version of the Mindfulness Attention Awareness Scale. Estudios de Psicología, 34, 175–184. Little, T. D. (1997). Mean and covariance structures (MACS) analyses of cross-cultural data: Practical and theoretical issues. Multivariate Behavioral Research, 32, 53–76. Meier, D. E., Back, A. L., & Morrison, R. S. (2001). The inner life of physicians and care of the seriously ill. JAMA, 286, 3007–3014. Novack, D. H., Epstein, R. M., & Paulsen, R. H. (1999). Toward creating physician-healers: Fostering medical students' self-awareness, personal growth, and well-being. Academic Medicine, 74, 516–520. Novack, D. H., Suchman, A. L., Clark, W., Epstein, R. M., Najberg, E., & Kaplan, C. (1997). Calibrating the physician: Personal awareness and effective patient care. JAMA, 278, 502–509. O'Connor, B. P. (2000). SPSS and SAS programs for determining the number of components using parallel analysis and Velicer's MAP test. Behavior Research Methods, Instruments, & Computers, 32, 396–402. Pereira, S. M., Fonseca, A. M., & Carvalho, A. S. (2011). Burnout in palliative care: A systematic review. Nursing Ethics, 18, 317–326. Peters, L., Cant, R., Sellick, K., O'Connor, M., Lee, S., Burney, S., & Karimi, L. (2012). Is work stress in palliative care nurses a cause for concern? A literature review. International Journal of Palliative Nursing, 18, 561–567. Ponsoda, V., Revuelta, J., & Abad, F. J. (2006). Modelos politómicos de respuesta al ítem [Polytomous item response models]. Spain: La Muralla. Reise, S. P., & Waller, N. G. (2009). Item response theory and clinical measurement. Annual Review of Clinical Psychology, 5, 27–48. Reise, S. P., Widaman, K. F., & Pugh, R. H. (1993). Confirmatory factor analysis and item response theory: Two approaches for exploring measurement invariance. Psychological Bulletin, 114, 552–566. Samejima, F. (1970). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 35, 139. Samejima, F. (1997). Graded response model. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 85–100). New York, NY: Springer. Sauer, S., Ziegler, M., Danay, E., Ives, J., & Kohls, N. (2013). Specific objectivity of mindfulness: A Rasch analysis of the Freiburg Mindfulness Inventory. Mindfulness, 4, 45–54. Smith, E. V. (2002). Understanding Rasch measurement: Detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals.
Journal of Applied Measurement, 3, 205–231.
Soler, J., Tejedor, R., Feliu-Soler, A., Pascual, J. C., Cebolla, A., Soriano, J., . . . Perez, V. (2012). Psychometric properties of Spanish version of Mindful Attention Awareness Scale (MAAS). Actas Españolas de Psiquiatría, 40, 19–26. Steiger, J. H., & Lind, C. (1980). Statistically based tests for the number of common factors. Paper presented at the annual meeting of the Psychometric Society, Iowa City, IA. Tanaka, J. S. (1993). Multifaceted conceptions of fit in structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 10–39). Newbury Park, CA: Sage. Van Dam, N. T., Earleywine, M., & Borders, A. (2010). Measuring mindfulness? An Item Response Theory analysis of the Mindful Attention Awareness Scale. Personality and Individual Differences, 49, 805–810. Veehof, M. M., Oskam, M. J., Schreurs, K. M., & Bohlmeijer, E. T. (2011). Acceptance-based interventions for the treatment of chronic pain: A systematic review and metaanalysis. Pain, 152, 533–542. Wang, M., & Russell, S. S. (2005). Measurement equivalence of the job descriptive index across Chinese and American workers: Results from confirmatory factor analysis and item response theory. Educational and Psychological Measurement, 65, 709–732.
Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of psychological instruments: Applications in the substance use domain. In K. J. Bryant, M. Windle, & S. G. West (Eds.), The science of prevention (pp. 281–324). Washington, DC: American Psychological Association. Zeidan, F., Gordon, N. S., Merchant, J., & Goolkasian, P. (2010). The effects of brief mindfulness meditation training on experimentally induced pain. Journal of Pain, 11, 199–209.
Date of acceptance: January 13, 2015
Published online: June 26, 2015

Amparo Oliver
Department of Methodology for the Behavioral Sciences
University of Valencia
Blasco Ibanez, 21
46010 Valencia
Spain
Tel. +34 96 386-4468
E-mail oliver@uv.es
Original Article
Impact of Differential-Item-Functioning on the Personal Epistemological Beliefs for Senior High School Students

Jia-Jia Syu1 and Liang-Cheng Zhang2

1School of Public Health, University of Queensland, Herston, QLD, Australia, 2Department of Accounting, Finance and Economics, Griffith Business School, Griffith University, Australia

European Journal of Psychological Assessment 2017; Vol. 33(1):22–29 DOI: 10.1027/1015-5759/a000268
Abstract. The purpose of this study is to investigate whether the wording of items used to examine personal epistemological beliefs could affect the response probabilities of compared groups and thereby the research outcomes. A differential-item-functioning (DIF) analysis by school type and school location, based on the Rasch model, was performed. Students from nonacademically inclined schools (n = 212) and academically inclined schools (n = 197) were selected to complete the questionnaire, which consisted of three dimensions of beliefs about knowing and two dimensions of beliefs about learning. The results of the DIF analysis suggested that the items in the knowing dimensions favor academically inclined students and students from schools located in the northern areas of Taiwan, whereas the items in the learning dimensions favor nonacademically inclined students and students from schools located in the southern areas of Taiwan. The group comparisons differed between the scale that included the DIF items and the scale that excluded them. The discussion addresses the value of precisely detecting inappropriate items and the effect of academic achievement on completing the survey. Keywords: differential-item-functioning, personal epistemology, scale development
Personal epistemology denotes thinking about knowledge and learning (Schommer, 1990). Some researchers (Cano, 2005; Chen & Pajares, 2010; Schommer-Aikins, Duell, & Hutter, 2005) suggest that students with strong personal epistemological beliefs perform better in learning and have higher academic achievement. However, these positive relationships could also arise from the misuse of questionnaires on personal epistemological beliefs; that is, the results may be confounded by students' academic performance owing to the cognitive difficulty of some of the item wordings. Therefore, the aim of this study is to propose a differential-item-functioning (DIF) analysis of the Personal Epistemological Beliefs Questionnaire for senior high school students (SH Personal Epistemological Beliefs Questionnaire; Syu & Chan, 2010) to detect the effect of school type and location for senior high school students in Taiwan. In addition, the effects of DIF are reported by including and excluding DIF items in mean comparisons. The development of the concept of personal epistemology can be divided into two main approaches (DeBacker, Crowson, Beesley, Thoma, & Hestevold, 2008). One approach suggests understanding the individual's personal epistemology from the developmental view, which focuses on the essence of knowledge and knowing. Perry (1968) proposes four stages (dualism, multiplicity, relativism, and commitment within relativism) comprising nine positions. This approach considers personal epistemology in a holistic sense and recognizes that people are at different stages in their personal epistemological beliefs. However, it provides only a limited contribution to understanding the content of personal epistemology, since it focuses only on the stages of development. Schommer (1990) suggests a multidimensional perspective for understanding personal epistemology. To conform to the conditions of education, she adds the dimension of learning to the theory of personal epistemology. Through this approach, the concept of personal epistemology comprises not only beliefs about knowledge but also beliefs about learning (Hofer & Pintrich, 1997). Five factors are included in personal epistemology: the stability of knowledge, the structure of knowledge, the source of knowledge, the ability to learn, and the speed of learning. These factors are independent to some degree. This means that a change in an individual's personal epistemological beliefs in one factor does not necessarily lead to a simultaneous change of their beliefs in another factor (Duell & Schommer-Aikins, 2001). This relationship among the factors implies that personal epistemology does not develop in stages, but can be seen as a continuum within each dimension.
The present study adopts Schommer's (1990) theory of personal epistemological beliefs. The investigation of personal epistemological beliefs has been valued since Schommer (1990) proposed her theory. Such beliefs have been found to have a positive influence on learning-related abilities, such as self-regulated learning, motivation for learning, meta-cognition, and academic achievement (Bråten & Strømsø, 2005; Chen & Pajares, 2010). Studies have indicated that individuals with sophisticated personal epistemological beliefs are more likely to perform better in learning and academic achievement. Moreover, a significant relationship has been found between achievement and personal epistemological beliefs (Trautwein & Ludtke, 2007). However, this association should be considered carefully. In most survey studies, several questionnaires are developed to measure the constructs and abilities of interest; however, participants' responses may be affected by factors not addressed by the studies. Rijkeboer, van den Bergh, and van den Bout (2012) note that educational attainment might cause DIF: The equality of responses on self-report questionnaires may be affected by participants having different educational backgrounds. Similarly, given the abstract features of personal epistemology, some items of a personal epistemology questionnaire are rather sophisticated and make cognitive demands on participants. Academically inclined students might be favored by the item wording, which would further affect the research results. (In this study, the word "favor" describes the situation in which students in a certain type of school benefit from a factor irrelevant to the trait addressed by the study, making the items easier for them to answer; in other words, these students have a higher probability of scoring higher on DIF items.) Several studies have examined the relationship between achievement and personal epistemological beliefs (Bråten & Strømsø, 2005; Topcu & Yilmaz-Tuzun, 2009); however, no research has explored the possibility that the items in a personal epistemological beliefs questionnaire function differently depending on academic performance. This study aims to examine DIF in the SH Personal Epistemological Beliefs Questionnaire (Syu & Chan, 2010). This questionnaire measures five factors: (1) the stability of knowledge (the belief that knowledge is certain or tentative); (2) the structure of knowledge (the belief that knowledge is isolated or connected to other knowledge); (3) the source of knowledge (the belief that knowledge is transmitted by authority or by reasoning); (4) the ability to learn (the belief that the ability to learn is fixed or could be improved); and (5) the speed of learning (the belief that the speed of learning is quick or progressive). All the dimensions are somewhat independent, and each can be interpreted independently.

The manner in which the level of participants' academic achievement is defined should also be considered. Studies that focus on academic achievement tend to group students by their performance within a class or school. However, when apparent differences in performance exist among the compared schools, this classification becomes problematic: The true ability of students in the same classified group, such as a high-achiever group, might not be distributed equally among schools. In Taiwan, as in most Eastern cultures, academic achievement is encouraged, and some schools are even characterized by the outstanding academic performance of their students. It is highly possible that students in high-achieving schools perform better in academic learning than students in other schools; that is, a discrepancy in academic achievement can be found between different types of schools. Thus, using school type as a classification variable is an appropriate manner in which to define academic achievement. The location of schools is another factor of concern. The northern and southern areas of Taiwan have significantly different levels of development (Pai, Huang, & Chang, 2012), which may cause differences in culture and may have an effect on education. This potential influence has been recognized in studies that included the variable of school location to balance its possible effect (Lou & Lin, 2004). Numerous studies have suggested a strong relationship between academic achievement and personal epistemological beliefs (e.g., Cano, 2005; Chen & Pajares, 2010; Trautwein & Ludtke, 2007). However, a procedure to assess construct validity and reliability does not imply item invariance across groups, and a lack of invariance might lead to erroneous conclusions in further mean comparisons (Rijkeboer et al., 2012). Before the results of such comparisons can be reported correctly, it is important to consider the issue of measurement bias first.
Research Question

This study proposes several questions to guide the research: Does DIF exist in the SH Personal Epistemological Beliefs Questionnaire across school types and locations in Taiwan? If so, to what extent do the DIF items affect the conclusions? DIF analysis was adopted to detect DIF items in the SH Personal Epistemological Beliefs Questionnaire. Mean comparisons across school types and locations were then conducted with the DIF items included and excluded, to assess the effect of the DIF items. The results of these mean comparisons may provide further evidence for understanding the effect of DIF items.
Method

Participants

The data (N = 409) were collected in 2007 from senior high schools in the northern (n = 138, 33.7%), middle (n = 121, 29.6%), and southern (n = 150, 36.7%) areas of Taiwan. One academically focused school and one nonacademically focused school were selected from each area.
Table 1. Description and example item for each questionnaire factor

The stability of knowledge (8 items)
  Description: Knowledge is certain or tentative
  Example item(a): The truth that people believe today would possibly change tomorrow.
The structure of knowledge (7 items)
  Description: Knowledge is isolated or connected to other knowledge
  Example item: Most words have only a single meaning.
The source of knowledge (5 items)
  Description: Knowledge is transmitted by authority or by reasoning
  Example item: I believe that only experts can discover new phenomena and build a new theory.
The ability to learn (7 items)
  Description: The ability to learn is fixed or could be improved
  Example item: People are unable to improve their learning ability.
The speed of learning (8 items)
  Description: The speed of learning is quick or progressive
  Example item: Successful people learn quickly.

Note. (a) Item descriptions are tentatively translated from Chinese to exemplify content.
Schools with an academic focus included the top three schools in the respective areas. The nonacademically focused schools were selected from the remaining schools in the respective areas. A total of 212 (51.8%) students from nonacademically focused schools and 197 (48.2%) students from academically focused schools participated in this study. The sample comprised 197 (48.2%) male students and 209 (51.1%) female students; 146 (35.7%) were in the first grade of senior high school (aged between 15 and 16 years) and 263 (64.3%) were in the second grade (aged between 16 and 17 years).
Questionnaire

The Personal Epistemological Beliefs Questionnaire for senior high school students (SH Personal Epistemological Beliefs Questionnaire; Syu & Chan, 2010) is based on Schommer's (1990) theory of personal epistemology. It contains five factors with a total of 35 items; each subscale has five to eight items. Table 1 lists a description and an example item for each factor. The questionnaire is rated on a 4-point Likert scale, with answers ranging from 1 (= strongly agree) to 4 (= strongly disagree). Scores are keyed in the naïve direction: higher scores represent a tendency to hold naïve personal epistemological beliefs, whereas lower scores represent a tendency to hold sophisticated personal epistemological beliefs. People who hold naïve epistemological beliefs tend to believe that knowledge is certain, isolated from other knowledge, and transmitted by an authority; they are also inclined to believe that the ability to learn is fixed and that learning occurs quickly. In the reliability and validity analyses, Cronbach's α coefficients for the subscales ranged from .667 to .784, and α for the total scale was .905. A second-order confirmatory factor analysis was employed for validation. The model was assessed using common indices, and satisfactory model fit was confirmed: χ² = 877.75 (p < .01), root mean square error of approximation (RMSEA) = 0.051, comparative fit index (CFI) = 0.97, and root mean square residual (RMR) = 0.035. For more detailed information on these analyses, please refer to Syu and Chan (2010).
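As a point of reference, the reported internal consistency coefficient can be computed directly from an item-response matrix. The following is a minimal sketch, using simulated 4-point Likert responses rather than the study's data:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Simulated 4-point Likert responses to a hypothetical 7-item subscale
rng = np.random.default_rng(1)
trait = rng.normal(size=(409, 1))                  # shared latent trait
noise = rng.normal(size=(409, 7))                  # item-specific noise
sim = np.clip(np.round(2.5 + trait + noise), 1, 4) # discretize to 1-4
print(round(cronbach_alpha(sim), 3))
```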
Differential Item Functioning (DIF)

In contrast to effects at the level of the whole test, DIF refers to differences in the way a single test item functions across demographic groups. A measure without DIF can be formulated as Equation 1, which states that the functional relationship between the probability of a given response and the underlying ability is identical for the two groups. Suppose that Y denotes the response to a particular test item, G represents the grouping variable, and the test's latent construct is denoted by θ; then the probability distribution of Y given θ can be expressed as

f(Y | θ, G = R) = f(Y | θ, G = F),   (1)
where R corresponds to the reference group and F corresponds to the focal group. DIF is present when examinees with the same trait or ability level, but belonging to different groups, have dissimilar probability distributions on Y (Osterlind & Everson, 2009). There are two types of DIF: uniform and nonuniform. Uniform DIF is the simplest form: it is identified when the probability of success on the flagged item for one group (i.e., the reference group) is consistently higher than that for the other group (i.e., the focal group) across all trait levels (Mellenbergh, 1982). When the conditional dependence changes in degree or direction at different points on the θ continuum, nonuniform DIF occurs (Osterlind & Everson, 2009); nonuniform DIF is related to the discrimination parameter. Several methods have been developed for the measurement of DIF. Generally, they can be classified as non-item-response-theory (non-IRT)-based methods and IRT-based methods (Yu, 2009; Osterlind & Everson, 2009). Some non-IRT-based methods are based on statistics, such as logistic regression (Swaminathan & Rogers, 1990) and the Mantel-Haenszel method (Holland & Thayer, 1988).
Others fall under the nonparametric scope and do not rely on population assumptions for detection, such as the simultaneous item bias test (Shealy & Stout, 1993). The IRT-based methods fall under the scope of IRT but vary depending on the adopted model. Two main frameworks are used to interpret DIF: (1) differences in an item's parameters and (2) differences in the item characteristic curves (ICCs). The IRT-based methods allow for a more extensive investigation, both theoretically and procedurally, than methods under classical test theory (Osterlind & Everson, 2009). Within the scope of the IRT-based methods, two approaches are used to detect DIF. The area-measure approach quantifies DIF by calculating the discrepant area between the ICCs for the two groups (Raju, 1988). One limitation of this approach is that the distribution of examinees across the θ scale often fails to be considered, resulting in misleading interpretations (Osterlind & Everson, 2009). Another approach is to test the differences in an item's parameters between groups. Lord (1980) proposes testing the null hypothesis that the b parameters are equal for the reference and focal groups, using

d = (b̂_R − b̂_F) / SE(b̂_R − b̂_F),   (2)

SE(b̂_R − b̂_F) = sqrt( [SE(b̂_R)]² + [SE(b̂_F)]² ),   (3)

where R is the reference group and F represents the focal group. The statistic d is distributed approximately as standard normal and provides a test of the null hypothesis H0: b_R = b_F. This method was adopted in the present study.

Software

The Rasch model method of estimating the differences in item parameters was adopted for the DIF analysis. Accordingly, Winsteps 3.57 (2005) was used to perform the DIF analysis for school types and locations. The Statistical Package for the Social Sciences (SPSS) 18 was used to compare mean differences among the groups.
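To make Equations 2 and 3 concrete, a minimal sketch of Lord's d in Python follows; the item difficulty estimates and standard errors below are hypothetical values of the kind a Rasch program such as Winsteps reports, not figures from this study:

```python
import math

def lord_d(b_ref, se_ref, b_foc, se_foc):
    """Lord's (1980) d statistic for one item: the difference in estimated
    item difficulty between groups, divided by the standard error of that
    difference (Equations 2 and 3)."""
    se_diff = math.sqrt(se_ref**2 + se_foc**2)
    return (b_ref - b_foc) / se_diff

# Hypothetical Rasch difficulty estimates (in logits) for one item
d = lord_d(b_ref=-0.31, se_ref=0.12, b_foc=0.23, se_foc=0.13)
print(round(d, 2))  # |d| > 1.96 would flag the item at alpha = .05
```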
Results

DIF Analysis on Different School Types

Table 2 summarizes the results of the DIF analysis for school types. Overall, the DIF items in the dimensions concerning beliefs about knowing (certain knowledge and simple knowledge) favored students from academically inclined schools (i.e., students from academically focused schools had a greater probability of obtaining higher scores), whereas the DIF items in the dimensions concerning beliefs about learning favored students from nonacademically inclined schools. The DIF identified was uniform. Among the five factors, nine items were identified as exhibiting DIF. The subscale of fixed ability had the highest percentage of DIF items (43%) and favored nonacademically inclined school students. The second highest percentage of DIF items (38%) was found in the subscale of quick learning, which also favored nonacademically inclined school students. The subscales of certain knowledge and simple knowledge favored academically inclined school students, with 13% and 29% DIF items, respectively. The subscale of authority had no DIF items.
DIF Analysis on School Locations

The results demonstrated that DIF items existed in each subscale (Table 3). Within the subscales of certain knowledge and simple knowledge, two of eight items favored students from schools located in the northern areas of Taiwan. However, the DIF items in the subscales of authority, fixed ability, and quick learning did not favor students from schools located in the northern areas of Taiwan. The subscale of quick learning consistently favored students from schools located in the southern areas of Taiwan.
Table 2. DIF analysis on academically inclined and nonacademically inclined schools

Dimension           Item no.   Contrast   Group favored   DIF items (%)
Certain knowledge   30         .54        Academic        13
Simple knowledge    2          .62        Academic        29
                    35         .73        Academic
Fixed ability       9          .53        Nonacademic     43
                    24         .45        Nonacademic
                    32         .60        Nonacademic
Quick learning      10         .46        Nonacademic     38
                    15         .65        Nonacademic
                    29         .42        Nonacademic

Note. Reference group: academically inclined; focal group: nonacademically inclined.
Table 3. DIF results by locations of school

Dimension              Item no.   Contrast       Results          DIF items (%)
Certain knowledge      1          0.53           N > S            50
                       6          0.50           N > S
                       11         0.46           M > S
                       34         0.53/0.62(a)   N > M / N > S
Simple knowledge       2          0.98/0.93      N > M / N > S    57
                       7          0.47           M > S
                       27         0.70/0.46      N > S / M > S
                       35         0.54           N > S
Omniscient authority   3          0.49           M > S            25
                       23         0.48           S > M
Fixed ability          9          0.67           S > N            29
                       32         0.66/0.96      M > N / S > N
Quick learning         15         0.51           S > N            25
                       33         0.44/0.60      S > N / S > M

Notes. N = the northern area; M = the middle area; S = the southern area. (a) The two results are due to pairwise comparisons in the analysis.
Among the five dimensions, the subscales of certain knowledge and simple knowledge had the highest percentage of DIF items (at least 50%). The percentage of DIF items for the other three subscales was approximately 25%.
Effect of the DIF Items

To investigate the effect of the DIF items, t tests and analyses of variance (ANOVA) were conducted on the mean comparisons among the school types and locations. The tests were conducted for each factor and for the whole SH Personal Epistemological Beliefs Questionnaire. There were four types of scores: scores that included all items, scores that excluded school-type DIF items, scores that excluded location DIF items, and scores that excluded both school-type and location DIF items. As presented in Table 4, students in the two types of schools generally exhibited significant differences across the subscales and on the total scale. Academically inclined school students tended to hold more sophisticated beliefs than nonacademically inclined school students. However, the results of the t tests changed among the four types of scores when different DIF items were included. With alpha set at 0.05, the results of the t test changed from nonsignificant to significant between the DIF-items-included and DIF-items-excluded conditions in the subscales of fixed ability (t = 0.977 with the DIF items included; t = 2.162 with the DIF items excluded) and quick learning (t = 1.944 with the DIF items included; t = 2.866 with the DIF items excluded). For the other three subscales and the whole scale, the mean differences narrowed gradually from the DIF-items-included to the DIF-items-excluded conditions. Figure 1 presents the reduction in distance among the six lines. The discrepancies among the six lines when excluding the location DIF items (N.L. DIF) were closer to those when excluding all the DIF items (EXCL. DIF) than to those when excluding the school-type DIF items
(N.S. DIF); this indicates that location had a greater effect than school type in the mean comparison analysis. ANOVA revealed that students from schools in different areas exhibited significant differences on only two subscales. The DIF items consistently favored students from schools located in the northern areas of Taiwan over those located in the southern areas of Taiwan (see Table 5). Overall, regarding the effect of the DIF items (see Figure 2), the lines inclined downward from left to right. The DIF items changed the results of the group comparison: excluding the DIF items reduced the mean differences among the three groups. In the subscale of certain knowledge, the comparison results changed from a significant difference to no significant difference. Additionally, when the location DIF items were excluded, results similar to those obtained when excluding both sets of DIF items were generated. The considerable effect of the school location was thus established.
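As an illustration of this included-versus-excluded comparison, a minimal sketch follows. The response matrix and school-type indicator are simulated stand-ins, and only the flagged item numbers 9, 24, and 32 come from Table 2; the remaining fixed-ability item numbers are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical response matrix: rows = students, columns = the 7
# fixed-ability items; `academic` marks school-type membership.
rng = np.random.default_rng(0)
responses = rng.integers(1, 5, size=(409, 7)).astype(float)  # 4-point Likert
academic = rng.random(409) < 0.48

fixed_ability_items = [9, 24, 32, 4, 13, 20, 26]  # last four item numbers hypothetical
dif_items = {9, 24, 32}                           # flagged in the school-type analysis

def subscale_mean(resp, items, exclude=()):
    """Mean subscale score per student, optionally dropping flagged items."""
    cols = [i for i, item in enumerate(items) if item not in exclude]
    return resp[:, cols].mean(axis=1)

# t test with all items vs. with school-type DIF items excluded
for label, excl in [("DIF included", ()), ("DIF excluded", dif_items)]:
    scores = subscale_mean(responses, fixed_ability_items, excl)
    t, p = stats.ttest_ind(scores[academic], scores[~academic])
    print(label, round(t, 3), round(p, 3))
```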
Discussion

This study aimed to investigate the effect of academic achievement on group comparisons of personal epistemological beliefs. The DIF technique was employed for different school types and locations in Taiwan. The results suggest that DIF was detected for both school types and locations, and that the set of DIF items included in the score influences further interpretation of the results. Additionally, DIF functions inconsistently among the five dimensions. The DIF items in the subscales of certain knowledge and simple knowledge favor academically inclined school students and disadvantage students from schools located in the southern areas of Taiwan. That is, among students with the same level of personal epistemological beliefs, those from schools located in the southern areas of Taiwan have a higher probability of obtaining lower scores on these two subscales.
Table 4. Mean comparison between the two types of school students

                                         Academic (n = 197)   Nonacademic (n = 212)
Dimension              Condition         M       SD           M       SD           t
Certain knowledge      DIF               1.76    .369         1.97    .355         5.943***
                       N.S. DIF          1.79    .376         1.99    .366         5.629***
                       N.L. DIF          1.77    .395         1.96    .430         4.826***
                       EXCL. DIF         1.83    .419         2.01    .455         3.994***
Simple knowledge       DIF               1.59    .335         1.82    .374         6.543***
                       N.S. DIF          1.61    .363         1.81    .382         5.297***
                       N.L. DIF          1.63    .416         1.79    .447         3.618***
                       EXCL. DIF         1.63    .416         1.79    .447         3.618***
Omniscient authority   DIF               1.73    .367         1.90    .417         4.282***
                       N.S. DIF          1.73    .367         1.90    .417         4.282***
                       N.L. DIF          1.74    .429         1.90    .493         3.397***
                       EXCL. DIF         1.74    .429         1.90    .493         3.397***
Fixed ability          DIF               1.89    .452         1.94    .502         0.977
                       N.S. DIF          1.73    .415         1.83    .516         2.162*
                       N.L. DIF          1.76    .435         1.84    .516         1.732
                       EXCL. DIF         1.73    .415         1.83    .516         2.162*
Quick learning         DIF               1.87    .369         1.95    .433         1.944
                       N.S. DIF          1.72    .370         1.85    .448         3.246**
                       N.L. DIF          1.84    .392         1.92    .462         1.825
                       EXCL. DIF         1.69    .379         1.81    .450         2.866**
Whole scale            DIF               1.77    .294         1.91    .323         4.730***
                       N.S. DIF          1.72    .293         1.88    .312         5.351***
                       N.L. DIF          1.76    .316         1.89    .364         3.763***
                       EXCL. DIF         1.72    .306         1.86    .353         4.196***

Notes. DIF: include all items; N.S. DIF: exclude school-type DIF items; N.L. DIF: exclude location DIF items; EXCL. DIF: exclude both school-type and location DIF items. *p < .05. **p < .01. ***p < .001.
The DIF items in the subscales about learning, namely fixed ability and quick learning, tend to favor nonacademically inclined school students and disadvantage students from schools located in the northern areas of Taiwan.
Revising item wording is one method of reducing DIF effects (Rupp, 2013). Participants may have interpreted some of the DIF items in this study differently because of the manner in which they were worded, particularly given that some terms were somewhat unclear. For example, in the subscale of certain knowledge, the item translated as "I believe the knowledge in every discipline will not change over time" used ambiguous wording such as "every discipline." Although the word "every" was used, not every student would consider the same disciplines when answering the scale. Similarly, in the dimension of quick learning, instead of specific terms such as "first time," unclear terms such as "too much time" and "very soon" were used in the DIF items. In addition, it was found that learning experiences might cause DIF across different types of schools. Students from academically inclined schools were more likely to endorse the DIF item "I hope the teacher can provide an overview of the entire concept to help my learning" than students from nonacademically inclined schools. Terms such as "entire concept" relate to their learning experience: if participants had not encountered this before, it would be difficult for them to understand and answer the question appropriately. Conversely, the results of DIF indicate that teaching content or pedagogy may need to be modified (Huang & Li, 1999). During this study, it became apparent that offering a more comprehensive and holistic approach to teaching could assist disadvantaged groups.
Figure 1. t values in the four conditions of DIF items included. DIF: include all items; N.S. DIF: exclude school-type DIF items; N.L. DIF: exclude location DIF items; EXCL. DIF: exclude both school-type and location DIF items; C: Certain knowledge; S: Simple knowledge; A: Omniscient authority; F: Fixed ability; Q: Quick learning; EBS: whole scale.
Table 5. Mean comparison among three locations of school students

                                  North (n = 138)   Middle (n = 121)   South (n = 150)
Dimension           Condition     M       SD        M       SD         M       SD        F value (post hoc comparison)
Certain knowledge   DIF           1.78    .375      1.86    .316       1.95    .407      7.137*** (S > N)
                    N.S. DIF      1.80    .382      1.89    .312       1.98    .421      7.452*** (S > N)
                    N.L. DIF      1.82    .431      1.87    .392       1.91    .441      1.773
                    EXCL. DIF     1.87    .469      1.94    .388       1.95    .464      1.431
Simple knowledge    DIF           1.61    .334      1.68    .314       1.82    .423      12.080*** (S > N; S > M)
                    N.S. DIF      1.64    .356      1.66    .336       1.82    .426      9.210*** (S > N; S > M)
                    N.L. DIF      1.67    .417      1.64    .405       1.81    .470      5.904** (S > N; S > M)
                    EXCL. DIF     1.67    .417      1.64    .405       1.81    .470      5.904*** (S > N; S > M)

Note. DIF: include all items; N.S. DIF: exclude school-type DIF items; N.L. DIF: exclude location DIF items; EXCL. DIF: exclude both school-type and location DIF items. **p < .01. ***p < .001.
Figure 2. F values in the four conditions of DIF items included. DIF: include all items; N.S. DIF: exclude school-type DIF items; N.L. DIF: exclude location DIF items; EXCL. DIF: exclude both school-type and location DIF items; C: Certain knowledge; S: Simple knowledge.

The causes of DIF are not limited to the manner in which the DIF items are worded. Multidimensionality is also one of the causes of DIF (de Ayala, 2009). When scores reflect not only the construct of interest but also variables unrelated to the study (e.g., format, content, and gender), the conclusions of the study will be misleading (Osterlind & Everson, 2009). In this study, the DIF items in the subscales about knowing tend to favor academically inclined school students, and those in the subscales about learning tend to favor nonacademically inclined school students. This tendency reveals that more than one construct is required to answer those items, and it also supports our hypothesis regarding the potential impact of individuals' academic achievement on their responses to the SH Personal Epistemological Beliefs Questionnaire. Differences in the effect of the DIF items were found when including and excluding them. In the subscales of fixed ability and quick learning, the mean comparisons between school types changed from nonsignificant to significant. Similarly, in the subscale of certain knowledge, the results changed to show no mean difference among the locations. These distinct changes in the results highlight the importance of performing a DIF analysis before conducting further analyses to avoid distortion of
the results. Although there are true mean differences among the groups, the reduced significance when comparing the DIF-items-included with the DIF-items-excluded conditions indicates that part of the differences is due to DIF. These results remind researchers to consider whether any such factors may affect their results. For example, studies in education are often interested in academic achievement, and the results of this study suggest that the wording of items may provoke DIF related to students' academic achievement.
Conclusion

This study used DIF analysis to reexamine the validity of the SH Personal Epistemological Beliefs Questionnaire, which had been supported through conventional validation procedures in previous studies. The results provide meaningful information for modifying and developing this scale in the future, and they highlight the necessity of conducting DIF analysis. Additionally, it is recommended that DIF analysis be applied to any variable that may affect respondents' answers, rather than being limited to conventional variables such as gender.
Limitations and Future Research

This study has a number of limitations. First, it uses only one index to examine DIF. Several statistics have been developed to detect DIF, and no single method has been proven superior to the others; as such, the use of multiple methods is recommended (Hambleton, 2006), and some studies argue that more than one index is required to obtain a rigorous outcome. However, given the difficulty of reconciling the inconsistent outcomes of different DIF indices, this study included only the most popular index as an indicator. The possible causes of DIF also deserve further investigation. According to the results of this study, the learning experience and the manner in which the DIF
items are worded appear to be plausible causes of DIF, although this cannot be established here. Future studies could use qualitative methods to clarify the exact problem and to develop a suitable DIF detection method for this scale.
Acknowledgments

The authors would like to thank Dr. Remo Ostini for his helpful suggestions on the revision of this manuscript. We thank the editor and two anonymous reviewers for their valuable comments, which helped us to improve the manuscript. This research was supported in part by a grant from the Ministry of Science and Technology of Taiwan (NSC103-2917-I-564-045).
References

Bråten, I., & Strømsø, H. I. (2005). The relationship between epistemological beliefs, implicit theories of intelligence, and self-regulated learning among Norwegian post-secondary students. The British Journal of Educational Psychology, 75, 539–565.
Cano, F. (2005). Epistemological beliefs and approaches to learning: Their change through secondary school and their influence on academic performance. The British Journal of Educational Psychology, 75, 203–221.
Chen, J. A., & Pajares, F. (2010). Implicit theories of ability of Grade 6 science students: Relation to epistemological beliefs and academic motivation and achievement in science. Contemporary Educational Psychology, 35, 75–87.
de Ayala, R. J. (2009). The theory and practice of item response theory. New York, NY: Guilford Publications.
DeBacker, T. K., Crowson, H. M., Beesley, A. D., Thoma, S. J., & Hestevold, N. L. (2008). The challenge of measuring epistemic beliefs: An analysis of three self-report instruments. The Journal of Experimental Education, 76, 281–312.
Duell, O. K., & Schommer-Aikins, M. (2001). Measures of people's beliefs about knowledge and learning. Educational Psychology Review, 13, 419–449.
Hambleton, R. (2006). Good practices for identifying differential item functioning. Medical Care, 44, 182–188.
Hofer, B. K., & Pintrich, P. R. (1997). The development of epistemological theories: Beliefs about knowledge and knowing and their relation to learning. Review of Educational Research, 67, 88–140.
Holland, P. W., & Thayer, D. T. (1988). Differential item functioning and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129–145). Hillsdale, NJ: Erlbaum.
Huang, T. W., & Li, H. H. (1999). Gender DIF/DBF analysis: Application of poly-SIBTEST. Psychological Testing, 46, 45–60.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Lou, S. J., & Lin, Y. Y. (2004). Research of the hairdressing industry's demands for the vocational high school and junior college graduate students' professional competencies. Kaohsiung Normal University Journal, 17, 115–137.
Mellenbergh, G. J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105–118.
Osterlind, S. T., & Everson, H. T. (2009). Differential item functioning (2nd ed.). Thousand Oaks, CA: Sage.
Pai, T. I., Huang, T. C., & Chang, C. Y. (2012). The relationships among job satisfaction, professional commitment and organizational commitment of college students enrolled in an intern practice course. Chung Yuan Physical Educational Journal, 1, 95–104.
Perry, W. G. (1968). Patterns of development in thought and values of students in a liberal arts college: A validation of a scheme (ERIC Document Reproduction Service No. ED 024315). Cambridge, MA: Bureau of Study Counsel, Harvard University.
Raju, N. S. (1988). The area between two item characteristic curves. Psychometrika, 53, 495–502.
Rijkeboer, M. M., van den Bergh, H., & van den Bout, J. (2012). Item bias analysis of the Young Schema-Questionnaire for psychopathology, gender, and educational level. European Journal of Psychological Assessment, 27, 65–70.
Rupp, A. A. (2013). A systematic review of methodology for person fit research in item response theory: Lessons about generalizability of inferences from the design of simulation studies. Psychological Test and Assessment Modeling, 55, 3–38.
Schommer, M. (1990). Effects of beliefs about the nature of knowledge on comprehension. Journal of Educational Psychology, 82, 498–504.
Schommer-Aikins, M., Duell, O. K., & Hutter, R. (2005). Epistemological beliefs, mathematical problem-solving beliefs, and academic performance of middle school students. The Elementary School Journal, 105, 289–304.
Shealy, R., & Stout, W. A. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DIF as well as item bias/DIF. Psychometrika, 58, 159–194.
Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.
Syu, J. J., & Chan, J. C. (2010). Developing and validating the Personal Epistemological Beliefs Scale for senior high school students in Taiwan. Psychological Testing, 57, 433–458.
Topcu, M. S., & Yilmaz-Tuzun, O. (2009). Elementary students' metacognition and epistemological beliefs considering science achievement, gender and socioeconomic status. Elementary Education Online, 8, 676–693.
Trautwein, U., & Ludtke, O. (2007). Epistemological beliefs, school achievement, and college major: A large-scale longitudinal study on the impact of certainty beliefs. Contemporary Educational Psychology, 32, 348–366.
Yu, M. N. (2009). Item response theory. Taipei, Taiwan: Psychological Publishing.
Date of acceptance: January 26, 2015
Published online: June 26, 2015

Jia-Jia Syu
School of Public Health
University of Queensland
Level 2, Public Health Bldg.
Herston Road
Herston, QLD 4006
Australia
Tel. +61 42 662-6541
E-mail debbysyu@gmail.com
Original Article
Anxiety Sensitivity or Interoceptive Sensitivity: An Analysis of Feared Bodily Sensations

Peter J. Norton(1) and Katharine Sears Edwards(2)

(1) Monash University, Clayton, VIC, Australia; (2) Stanford University School of Medicine, Stanford, CA, USA
Abstract. The construct of anxiety sensitivity (AS) – the fear of anxiety-related symptoms – has been highly influential in current conceptualizations of anxiety disorders in general, and panic disorder specifically. However, given documented associations between AS and both non-anxiety psychological disorders and medical/health conditions, the extent to which measures of AS assess a specific fear of anxiety symptoms versus a broader fear of interoceptive or bodily sensations is unclear. Confirmatory factor analysis of data from 373 participants did not clearly indicate whether fears of anxiety-related symptoms were factorially distinct from fears of non-anxiety-related bodily sensations. Further analyses indicated that while fears of anxiety-related symptoms were more closely associated with panic disorder severity than were fears of non-anxiety-related symptoms, both were similarly and strongly associated with hypochondriacal fears. Implications for the construct of AS, and for the broader construct of somatic fears, are discussed.
Keywords: anxiety sensitivity, interoceptive fears, hypochondriasis, panic disorder
The construct of anxiety sensitivity (AS) was originally proposed by Reiss and McNally (1985) to explain data pertaining to the treatment of anxiety disorders (Reiss, 1999). It was presented as part of an expectancy theory of anxiety to explain the psychological factors involved in human motivation to avoid a feared stimulus. AS refers to a fear of experiencing anxiety and is considered to be one of the "fundamental fears," or main intrinsic motivators that guide avoidance behavior. AS has been defined as the fear of anxiety-related sensations due to the belief that anxiety and its sensations may have harmful personal consequences (Reiss, 1987, 1991; Reiss & McNally, 1985; Reiss, Peterson, & Gursky, 1988; Reiss, Peterson, Gursky, & McNally, 1986). AS has been most closely implicated in the development and maintenance of panic disorder. Early clinical findings suggested a strong association between AS and the occurrence of panic attacks and panic disorder (e.g., Cox, Borger, & Enns, 1999; Cox, Fuentes, Borger, & Taylor, 2001; Taylor, Koch, & McNally, 1992). There is now evidence that prospectively measured AS may be a risk factor for developing panic disorder: Maller and Reiss (1992) found that AS strongly predicted the frequency and intensity of later panic attacks, and Schmidt, Lerew, and Jackson (1997, 1999) demonstrated a link between scores on a measure of AS and subsequent spontaneous panic attacks in a nonclinical sample, after controlling for panic history and trait anxiety. Schmidt, Zvolensky, and Maner (2006) replicated these findings with a nonclinical community sample composed of participants designated "at risk" on the basis of a simple screening instrument; they found that AS predicted the incidence of spontaneous panic attacks during a 2-year follow-up.

In addition to panic disorder, AS has been closely linked to other anxiety disorders, as well as to other disorders of negative affect such as major depression. A study by Taylor et al. (1992) compared AS levels across six anxiety disorders. All disorders, with the exception of specific phobia, showed higher levels of AS than normal controls. Panic disorder patients had marginally higher AS scores than those with posttraumatic stress disorder (PTSD), and significantly higher scores than those with generalized anxiety disorder, obsessive-compulsive disorder, social phobia, and specific phobia (Taylor et al., 1992). A more recent meta-analysis confirmed this stronger association of AS with panic and with PTSD, compared with the other anxiety disorders (Olatunji & Wolitzky-Taylor, 2009). AS has been found to be elevated in depressed patients and appears to decrease with pharmacological treatment of depression (Otto, Pollack, Fava, Uccello, & Rosenbaum, 1995). Otto and his colleagues consistently found AS elevations among depressed patients, with scores similar to those of individuals who had non-panic anxiety disorders (Otto, Demopulos, McLean, Pollack, & Fava, 1998; Otto et al., 1995). Taylor (1996) later claimed that the overlap between AS and depression is strongest for items on a measure of AS (Anxiety Sensitivity Index [ASI]; Reiss et al., 1986) that represent a fear of cognitive dyscontrol, as opposed to ASI
items that represent a fear of somatic sensations or publicly observable symptoms. Further research on AS and depression has confirmed a significant relationship between ASI scores and major depressive symptomatology (Zinbarg, Brown, Barlow, & Rapee, 2001). AS has also been found to be a predictor of hypochondriacal concerns (Otto, Pollack, Sachs, & Rosenbaum, 1992), even when the possible confound of panic symptoms has been removed (Otto et al., 1998). DSM-IV-TR hypochondriasis is defined as a preoccupation with fears of having, or the idea that one has, a serious disease based on the person's misinterpretation of bodily symptoms despite appropriate medical evaluation and reassurance (American Psychiatric Association [APA], 2000). Earlier versions of the DSM included similar definitions, all focused on unrealistic or exaggerated interpretations of one's own bodily sensations, as do the DSM-5 somatic symptom disorder and illness anxiety disorder (APA, 2013). Given that individuals with hypochondriacal concerns tend to interpret many physical sensations as negative and potentially harmful, it is no surprise that they may also endorse items on the ASI (e.g., item 11: "When my stomach is upset, I worry that I may be seriously ill"). Otto et al. (1992) found significant correlations between the ASI and several subscales of a measure of hypochondriasis, the Illness Attitudes Scale (IAS; Kellner, Abbott, Winslow, & Pathak, 1987; Kellner, Wiggins, & Pathak, 1986), in a panic disorder sample. Later, they reported that AS was also the strongest predictor of hypochondriacal concerns in a sample of depressed patients with no history of panic disorder (Otto et al., 1998). More recently, Cox and colleagues found significant correlations between the ASI and IAS subscales among undergraduates, and a large correlation between the ASI and an additional 8-item measure of hypochondriacal fears (Cox et al., 2001). Beyond the apparent role that AS plays in panic disorder, other anxiety disorders, and related disorders of negative affect, several research teams have found evidence that AS may be implicated in several nonpsychiatric medical conditions (see Asmundson, Wright, & Hadjistavropoulos, 2000). In particular, AS has been implicated in acute (Keogh & Mansoor, 2001) and chronic pain (Asmundson & Taylor, 1996), asthma (Carr, Lehrer, & Hochron, 1995; Carr, Lehrer, Rausch, & Hochron, 1994), gastrointestinal disorders (Norton, Norton, Asmundson, Thompson, & Larsen, 1999), chronic headache (Asmundson, Norton, & Veloso, 1999), vestibular dysfunction and tinnitus (Andersson & Vretblad, 2000), and several other medical or health conditions. Given the overlap between AS and a variety of emotional problems, more recent studies have examined the latent structure of AS and how it may relate more broadly to psychological distress. Lewis et al. (2010) took a structural equation modeling approach to investigate how AS relates to other vulnerability factors for psychological disorders. They found that the facet of AS related to cognitive dyscontrol seemed to relate to a general distress factor that underlies symptoms of most mood and anxiety disorders, whereas the facet of AS related to physical worries was more specifically associated with a mid-level "fears" factor
thought to underlie social phobia, panic, and agoraphobia. Another study using a combination of factor-analytic and taxometric approaches suggested that AS may, in fact, be taxonic rather than dimensional; AS may occur in adaptive (low-risk, normal) and nonadaptive (high-risk, oversensitive) forms and may take on a different structure in each form, impacting how AS relates to other distress constructs depending on whether it is low or high (Bernstein et al., 2007).
Potential Problems With the AS Construct

Given the nature of the fears characterizing each of the abovementioned constructs, and the possibility that these fears may be tapped by measures of AS, there has been substantial debate over what the ASI is measuring. Otto et al. (1992) explained the high degree of overlap between ASI and IAS scores among panic patients by suggesting that the two measures assessed "different aspects of a more general tendency to become fearful and aroused in response to somatic sensations" (p. 100). They stated that the IAS likely measured concerns about disease, while the ASI measured concerns associated with anxiety-related sensations. Taylor (1994) countered that the ASI and IAS are unlikely to assess distinct fears, because panickers could be responding to IAS items in terms of their anxiety-related fears without necessarily having a broader tendency to fear bodily symptoms. Related to this, Watt, Stewart, and Cox (1998) found that university students who scored high on the ASI reported learning histories that included messages about the danger of bodily symptoms in general, as opposed to messages about the danger of anxiety-related symptoms in particular. Perhaps this is because the ASI taps into a broader set of beliefs about the harmfulness of all interoceptive sensations, rather than specifically anxiety sensations. Since most ASI item stems make no mention of the reason behind the respondent's fears, individuals could endorse the same symptom fear for different reasons. Thus, students in Watt et al.'s study could have shown high AS scores due to more general hypochondriacal fears. Indeed, various researchers have found strong associations between ASI scores and the syndrome of hypochondriasis (e.g., Otto et al., 1992) and have recognized the need to control for somatization in measuring AS (e.g., Hiller, Leibbrand, Rief, & Fichter, 2005). Although AS has helped yield new models of the relation between emotional sensitivities and both psychiatric and physical health conditions, the extent to which the AS construct being measured reflects a fear of anxiety-related sensations specifically, or a more general fear of interoceptive sensations (of which anxiety-related sensations are a subset), remains unclear. The vast majority of research examining AS has utilized one of three measurement instruments: the ASI, the Anxiety Sensitivity Index-Revised (ASI-R; Taylor & Cox, 1998), and the ASI-3 (Taylor et al., 2007). Each was theoretically constructed and contains a range of statements asserting a negative consequence of experiencing various symptoms or sensations. Respondents endorse how strongly each statement applies
to them (e.g., "It scares me when I feel 'shaky'," "When I notice that my heart is beating rapidly, I worry I might have a heart attack," or "Unusual body sensations scare me"). Each of the items on the ASI, ASI-R, and ASI-3 concerns symptoms commonly experienced during states of fear or anxiety. The purpose of the present study is, therefore, to (1) evaluate the extent to which items on a measure of AS are factorially distinct from, or common with, items on a measure of non-anxiety-symptom sensitivity, and (2) examine differential associations between measures of AS and non-AS with measures of panic disorder and hypochondriacal fears. Based on the assumption that the ASI may tap into broader beliefs about the harmfulness of all interoceptive sensations, it was hypothesized that both AS and non-AS feared symptoms would load onto a single factor. It was further hypothesized that both AS-related and non-AS-related feared sensations would be similarly associated with an index of hypochondriacal fears. Finally, it was expected that AS-related feared sensations, as a result of their arousal-reactive nature, would be more strongly associated with an index of panic disorder severity than would non-AS-related feared sensations.
Method

Participants

Undergraduate students from introductory psychology courses (n = 373, 67.8% female) consented to participate in a questionnaire study. No specific inclusion or exclusion criteria were employed, other than the participants' willingness to complete a series of online questionnaires within a reasonable amount of time to receive extra course credit. Participants ranged in age from 18 to 48 years, with 93.5% falling within the 18- to 21-year-old range. The sample was broadly diverse, with 26.1% Hispanic, 23.2% African American, 22.9% Asian, 19.4% Caucasian, 2.4% Middle Eastern, and 0.5% Native American/American Indian students, as well as 5.4% who reported their race/ethnicity as "other" or "mixed." All participants were volunteers, provided informed consent, and received partial academic credit.
Measures

All participants were asked to complete a battery of questionnaires assessing AS and fears of non-anxiety-related symptoms, as well as measures of panic disorder severity and hypochondriacal fears. Two investigational measures, the Anxiety Sensitivity Index-Revised-Modified and the
Somatic Symptoms List, were developed specifically for the purposes of the current study. The complete set of measures that participants completed is described below.

Anxiety Sensitivity Index-Revised-Modified (ASI-R-M; Unpublished)

The ASI-R-M is a 25-item measure, based on the ASI-R,(1) which was designed to remove any specific reasons why individuals might fear anxiety-related symptoms. Some ASI-R-M items are identical to ASI-R items, but those with forced interpretations (e.g., "When my throat feels tight, I worry that I could choke to death") were revised to present only the relevant symptom (e.g., "I worry when my throat feels tight"). This was done to ensure we assessed fear of the particular symptom (e.g., "I worry when my head is pounding") rather than fear of a particular symptom because of a specific reason (e.g., "When my head is pounding, I worry I could have a stroke"), as respondents may fear particular symptoms for other reasons (e.g., "a tumor"). Similarly, some items were consolidated into a single statement about the relevant symptom (e.g., "When my chest feels tight, I get scared that I won't be able to breathe properly" and "When I feel a pain in my chest, I worry that I'm having a heart attack" were consolidated into "I worry when I feel pain or tightness in my chest"), and three new items were added to broaden the coverage of arousal-reactive symptoms (hot flushes, paresthesia, blurry or distorted vision) based on the DSM-IV symptoms of panic disorder. Finally, seven strictly social items were removed (e.g., "It is important to me not to appear nervous") based on the results of previous ASI-R research (see Taylor, 1996). Each item is rated on the same 5-point Likert scale used in the original ASI and ASI-R. The ASI-R-M demonstrated good reliability in the current sample (α = .91).

Somatic Symptoms List (SSL; Unpublished)

The SSL is a 30-item measure of fear of non-anxiety-related sensations that was constructed for the purposes of this study. Symptoms were specifically chosen to represent bodily sensations not associated with anxiety or physiological arousal. Items were drawn from several somatic measures, including the Hopkins Symptoms Checklist (Derogatis, Lipman, Rickels, Chlenhuth, & Covi, 1974), the Somatic Symptoms Inventory (Abdel-Khalek, 2003), and the DSM-IV-TR diagnostic criteria for somatization disorder (APA, 2000). All items are rated on the same 5-point Likert scale as the ASI-R-M for rating fear of the specified symptom. The SSL items showed good internal consistency in the current sample (α = .91).
(1) At the time of initial data collection, the Anxiety Sensitivity Index-3 (Taylor et al., 2007) – a revised version of the ASI and ASI-R designed to overcome their limitations – was not yet published. As such, we opted to retain the ASI-R-M in its original format to maintain consistency in our assessment protocol throughout the data collection process.
Panic Disorder Severity Scale

The PDSS (Houck, Spiegel, Shear, & Rucci, 2002; Shear et al., 1997) is a 7-item measure designed as a brief screening measure for panic disorder. The self-report version of the PDSS (Houck et al., 2002) was used in the current study; it has shown reliability, validity, and clinical sensitivity comparable to the original clinician-rated PDSS (Houck et al., 2002). Respondents rated the frequency of panic attacks, distress during panic attacks, agoraphobic fear and avoidance, body sensation fear and avoidance, anticipatory anxiety, and impairment in work and social functioning on a 5-point ordinal scale yielding a maximum total score of 28. It has also shown acceptable agreement with a corresponding interview instrument (weighted kappas = .51 and .71) and has demonstrated good internal consistency, consecutive-day test-retest reliability, and validity (Houck et al., 2002; Shear et al., 2001). The PDSS items showed good internal consistency in the current sample (α = .87).

Illness Attitudes Scale

The IAS (Kellner et al., 1986, 1987) is a 29-item instrument developed to assess the beliefs, fears, and attitudes associated with hypochondriasis. The scale yields a total score that may be used as a general measure of hypochondriasis. The instrument has demonstrated good test-retest reliability over periods ranging from 1 week to 6 months, internal consistency among medical outpatients, and good internal consistency in the general population (see Sirri, Grandi, & Fava, 2008); it has also shown excellent convergent validity with other measures of hypochondriasis (Speckens, Spinhoven, Sloekers, Bolk, & van Hemert, 1996). The IAS demonstrated acceptable reliability in the current sample (α = .89).

Procedures

Power analysis preceded data collection, and data were analyzed only after all data collection was completed. When presented to subjects, items on the SSL were interspersed with items on the ASI-R-M on a single questionnaire, in order to limit a priori groupings and order effects, yielding a 55-item total scale. This scale was subjected to confirmatory factor analysis to determine whether items representing fear of anxiety-related sensations (ASI-R-M) and items representing fear of non-anxiety-related sensations (SSL) loaded onto separate factors. Second, to examine whether symptom fear ratings such as those found on the ASI-R-M and SSL may principally reflect interoceptive anxiety, ASI-R-M and SSL scores were correlated with measures of panic disorder and hypochondriasis to explore the extent to which they are differentially associated with panic disorder severity and hypochondriacal fears.

Results

After screening for data errors and outliers, ASI-R-M and SSL items were combined to create a 55-item scale, and two confirmatory factor analyses of this combined scale were run to determine whether endorsements of anxiety-related and non-anxiety-related symptoms fell into separate factors. Finally, ASI-R-M and SSL total scores were correlated with measures of panic disorder severity and hypochondriasis, and the magnitudes of these correlations were compared.

Data Screening and Outlier Analysis

All data were screened, and any cases with more than 50% missing data (n = 18) were deleted (Human Subjects Ethics at our institution requires the option to choose not to respond to any item). Only these 18 cases were excluded from analysis. Logistic regression analyses were performed to determine whether missing data were systematic for the remaining cases, but no interpretable patterns were found. Missing data (1.3%) were then replaced via linear trend imputation, and continuous values were rounded up or down to retain the ordinal structure of the data. Total scores for all measures were entered into multiple regression equations to compute Mahalanobis distance. A chi-square cutoff of p < .001 was used to identify multivariate outliers (Tabachnick & Fidell, 2001), and these (4.2%) were omitted from further analyses. Univariate outliers were defined as any data points exceeding a distance of 1.5 times the interquartile range above the upper quartile or below the lower quartile; univariate outliers were identified on most scales and Winsorized by replacing the outlying data with nonoutlying values while retaining the sequential order among the outliers (Hoaglin, Mosteller, & Tukey, 1983). All variables showed univariate normal distributions with skewness < ±0.8.

Confirmatory Factor Analysis

The data were analyzed using two confirmatory factor analyses (CFAs). We sought to evaluate the fit of our data to a unifactorial model (Figure 1a) and a two-factor model (Figure 1b) wherein ASI-R-M and SSL items loaded onto distinct but correlated factors. No additional model specifications (e.g., correlated error terms) were applied. Model fit was examined using the root mean square error of approximation (RMSEA; ideally .02–.07; Browne & Cudeck, 1993) and the Comparative Fit and Tucker-Lewis Indices (CFI and TLI; ideally > .90; Marsh & Hau, 1996). All analyses were conducted using Mplus (version 7.1; Muthén & Muthén, 2013), and item-level data were treated as ordinal variables and analyzed using a weighted least squares (WLSMV) estimator. Given that the two models were not nested, and that information criterion estimates are not computable with WLSMV estimators, the models were compared using the relative strength of the three fit indices.
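For readers unfamiliar with these indices, the sketch below shows the textbook chi-square-based formulas for RMSEA, CFI, and TLI. The target-model values in the example call are the two-factor chi-square and degrees of freedom reported in the Results; the baseline (independence) model chi-square is a placeholder, and WLSMV-scaled chi-squares would require additional corrections not shown here:

```python
import math

def fit_indices(chi2, df, chi2_base, df_base, n):
    """Chi-square-based fit indices for a single-group model.
    chi2/df: target model; chi2_base/df_base: baseline (independence)
    model; n: sample size."""
    # RMSEA: per-degree-of-freedom misfit, floored at zero
    rmsea = math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))
    # CFI: relative reduction in noncentrality versus the baseline model
    cfi = 1.0 - max(chi2 - df, 0.0) / max(chi2_base - df_base, chi2 - df, 0.0)
    # TLI: compares chi2/df ratios, penalizing model complexity
    tli = ((chi2_base / df_base) - (chi2 / df)) / ((chi2_base / df_base) - 1.0)
    return rmsea, cfi, tli

# Target values from the two-factor model below; baseline chi-square is
# hypothetical (df_base = 55 * 54 / 2 = 1485 for a 55-item scale).
print(fit_indices(chi2=2253.94, df=1429, chi2_base=25000.0, df_base=1485, n=373))
```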
Figure 1. Unifactorial (A) and two-factor (B) models tested using WLSMV CFA. AS = Anxiety Sensitivity, SSL = Somatic Symptoms, IS = Interoceptive Sensitivity.
The two-factor model showed good fit to the data, χ²(1429) = 2,253.94, RMSEA = 0.044, CFI = .91, TLI = .91. ASI-R-M items showed significant standardized loadings on the AS factor (0.373–0.720), and SSL items showed significant standardized loadings on the SSL factor (0.406–0.741). The AS and SSL factors, however, were extremely highly correlated, r = .92. The unifactorial model also showed good fit to the data, χ²(1430) = 2,280.96, RMSEA = 0.044, CFI = .91, TLI = .90. Both ASI-R-M items (0.351–0.702) and SSL items (0.398–0.730) showed significant standardized loadings on the single Interoceptive Sensitivity factor. Comparison of fit indices between the one- and two-factor models did not suggest the superiority of either model, likely because the very high factor intercorrelation renders the two-factor model nearly equivalent to the one-factor model.
Table 1. Correlations among measures of anxiety sensitivity, somatic sensitivity, panic disorder severity, and hypochondriacal fears
35
The goal of the current study was to evaluate the specificity of measures of AS in assessing fears of anxiety-related symptoms versus fears of non-anxiety-related sensations. Although AS is associated with panic disorder and other anxiety disorders, an abundance of research has implicated AS in emotional reactions based on general health concerns. While some models of health condition-related anxiety (e.g., Asmundson, Norton, & Norton, 1999) have suggested that the specific fear of anxiety-related symptoms may be integral to the maintenance of the health anxiety, others (e.g., Taylor, 1994) have submitted that elevations on measures of AS may be capturing a broader tendency to fear somatic symptoms more generally, rather than anxiety symptoms specifically. Therefore, we sought to evaluate the extent to which items on a measure of AS were
factorially distinct or common with items on a measure of non-anxiety-symptom sensitivity, and examine the potential differential associations between measures of AS and non-AS-related fears and measures of panic disorder and hypochondriacal fears. Overall, the results of the confirmatory factor analyses were equivocal. Both the two-factor and one-factor models showed good and highly similar fit to the data, although the extremely high intercorrelation between the factors in the two-factor model suggests a great deal of commonality. However, when examining the relationships of AS feared symptoms and non-AS feared symptoms with a measure of hypochondriacal fears, no differential associations were observed. This may suggest that both AS and nonAS-related fears may be capturing aspects of a single broader ‘‘health-related somatic fears’’ construct rather than distinct phenomena. Conversely, when examining associations with a measure of panic disorder severity, a much stronger association was observed with the measure of fears of AS-related symptoms than with the measure of fears of non-AS-related symptoms. This suggests that while AS and non-AS somatic fears may both capture a broader fear of unusual bodily sensations, fears of arousal-reactive symptoms may be a defined subset of somatic fears. One implication of these findings – one which has been raised previously (see Taylor & Fedoroff, 1999) – is that AS may not be a fundamental unreducible fear. According to Reiss’ expectancy theory, fundamental fears are those that are distinct from, and unreducible to, other basic fears of inherently aversive stimuli. If this broader tendency is what ASI-R items actually measure, then its overlap with measures of hypochondriasis and health conditions would be expected to be high. Supporting this strength of association, other studies have found stronger relationships between ASI and measures of health anxiety than panic (see Fergus, 2014; Norton, Sexton, Walker, & Norton, 2005; Sexton, Norton, Walker, & Norton, 2003). Still, the lack of superior fit of the single-factor model over the two (AS and non-AS) factor model may imply a more complex relationship. Some limitations of the present study should be considered. First, the use of a nonclinical undergraduate sample means that these findings cannot automatically generalize to clinical populations, or even to the general population. The nature of beliefs behind AS fears may be quite different for individuals with clinical levels of panic, social phobia, and/or other disorders. Similarly, the nature of beliefs driving non-anxiety-related symptom fears might be different in clinical populations and show a distribution that is different from those driving anxiety-related fears. The use of an undergraduate student population is not limiting, however, when it comes to the detection of other basic fears that may be driving AS endorsements; Reiss (1991) described fundamental fears as aversive for most people, indicating that research aimed at identifying these fears may be conducted with samples that reflect the general population. Further research using clinical and population-based samples is clearly warranted. Finally, the most recent tool for assessing AS, the ASI-3 (Taylor et al., 2007) was not utilized as the project was initially conceptualized and initiated prior to the widespread adoption of the ASI-3.
Table 1. Intercorrelations among study measures

            ASI-R-M    SSL          PDSS         IAS
ASI-R-M     –          .79 (.95)    .33 (.41)    .58 (.71)
SSL                    –            .23 (.29)    .54 (.67)
PDSS                                –            .33 (.42)
IAS                                              –

Notes. All correlations significant at p < .001. ASI-R-M = Anxiety Sensitivity – Experimental; SSL = Somatic Sensations List; PDSS = Panic Disorder Severity Scale; IAS = Illness Attitudes Scale. Values in parentheses represent correlation coefficients corrected for scale reliabilities.
As such, Hypothesis 1 – that both AS and non-AS feared symptoms would load onto a single factor – was neither clearly supported nor rejected. Therefore, we next sought to explore whether each scale would relate differentially to constructs of interest, specifically panic disorder severity and hypochondriacal fears.
Strength of Relation With Other Constructs

To explore whether AS feared symptoms (ASI-R-M) and non-AS feared symptoms (SSL) were differentially related to panic disorder severity and hypochondriacal fears, a series of correlations was computed (Table 1) and compared using Steiger's Z-test of correlated correlations. The ASI-R-M and SSL were both strongly and positively associated with total scores on the PDSS and IAS. Consistent with the second hypothesis, the correlation between the ASI-R-M and PDSS was significantly higher than the correlation between the SSL and PDSS, Z(302) = 2.90, d = .34, p = .004, but no difference in magnitude was observed between the correlations of the IAS with the ASI-R-M and with the SSL, Z(302) = 1.24, d = .14, p = .215.
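As an illustration of this kind of comparison, the following is a minimal Python sketch (not the authors' code) using the Meng, Rosenthal, and Rubin (1992) approximation to Steiger's test for two dependent correlations sharing one variable. Here n = 305 is inferred from the reported degrees of freedom, and the published Z may differ slightly depending on the exact variant used.

import numpy as np
from scipy.stats import norm

def dependent_corr_z(r1, r2, r12, n):
    """Compare r(X, Z) = r1 with r(Y, Z) = r2, where r12 = r(X, Y)."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)  # Fisher z transforms
    r_sq_bar = (r1**2 + r2**2) / 2.0
    f = min((1.0 - r12) / (2.0 * (1.0 - r_sq_bar)), 1.0)
    h = (1.0 - f * r_sq_bar) / (1.0 - r_sq_bar)
    z = (z1 - z2) * np.sqrt((n - 3) / (2.0 * (1.0 - r12) * h))
    p = 2.0 * (1.0 - norm.cdf(abs(z)))
    return z, p

# ASI-R-M vs. SSL as correlates of the PDSS (uncorrected values from Table 1)
z, p = dependent_corr_z(r1=0.33, r2=0.23, r12=0.79, n=305)
print(f"Z = {z:.2f}, p = {p:.4f}")

With these inputs the sketch yields Z close to 2.8, in line with (though not identical to) the reported value of 2.90.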
Discussion

The goal of the current study was to evaluate the specificity of measures of AS in assessing fears of anxiety-related symptoms versus fears of non-anxiety-related sensations. Although AS is associated with panic disorder and other anxiety disorders, an abundance of research has implicated AS in emotional reactions based on general health concerns. While some models of health condition-related anxiety (e.g., Asmundson, Norton, & Norton, 1999) have suggested that the specific fear of anxiety-related symptoms may be integral to the maintenance of health anxiety, others (e.g., Taylor, 1994) have submitted that elevations on measures of AS may be capturing a broader tendency to fear somatic symptoms in general, rather than anxiety symptoms specifically. Therefore, we sought to evaluate the extent to which items on a measure of AS were factorially distinct from, or common with, items on a measure of non-anxiety-symptom sensitivity, and to examine the potential differential associations between measures of AS-related and non-AS-related fears and measures of panic disorder and hypochondriacal fears.

Overall, the results of the confirmatory factor analyses were equivocal. Both the two-factor and one-factor models showed good and highly similar fit to the data, although the extremely high intercorrelation between the factors in the two-factor model suggests a great deal of commonality. However, when examining the relationships of AS feared symptoms and non-AS feared symptoms with a measure of hypochondriacal fears, no differential associations were observed. This may suggest that both AS- and non-AS-related fears capture aspects of a single broader "health-related somatic fears" construct rather than distinct phenomena. Conversely, when examining associations with a measure of panic disorder severity, a much stronger association was observed with the measure of fears of AS-related symptoms than with the measure of fears of non-AS-related symptoms. This suggests that while AS and non-AS somatic fears may both capture a broader fear of unusual bodily sensations, fears of arousal-reactive symptoms may be a defined subset of somatic fears.

One implication of these findings – one which has been raised previously (see Taylor & Fedoroff, 1999) – is that AS may not be a fundamental, irreducible fear. According to Reiss' expectancy theory, fundamental fears are those that are distinct from, and irreducible to, other basic fears of inherently aversive stimuli. If this broader tendency is what ASI-R items actually measure, then their overlap with measures of hypochondriasis and health conditions would be expected to be high. Supporting this strength of association, other studies have found stronger relationships between the ASI and measures of health anxiety than with measures of panic (see Fergus, 2014; Norton, Sexton, Walker, & Norton, 2005; Sexton, Norton, Walker, & Norton, 2003). Still, the lack of superior fit of the single-factor model over the two-factor (AS and non-AS) model may imply a more complex relationship.

Some limitations of the present study should be considered. First, the use of a nonclinical undergraduate sample means that these findings cannot automatically be generalized to clinical populations, or even to the general population. The nature of the beliefs behind AS fears may be quite different for individuals with clinical levels of panic, social phobia, and/or other disorders. Similarly, the nature of the beliefs driving non-anxiety-related symptom fears might differ in clinical populations and show a distribution different from that of the beliefs driving anxiety-related fears. The use of an undergraduate student population is not limiting, however, when it comes to the detection of other basic fears that may be driving AS endorsements; Reiss (1991) described fundamental fears as aversive for most people, indicating that research aimed at identifying these fears may be conducted with samples that reflect the general population. Further research using clinical and population-based samples is clearly warranted. Finally, the most recent tool for assessing AS, the ASI-3 (Taylor et al., 2007), was not utilized, as the project was conceptualized and initiated prior to the widespread adoption of the ASI-3.
Although the ASI-3 was developed, in part, to stabilize the structure of the AS subfactors – the subfactors were not used in the current analyses – it is possible that the relationship between AS and non-AS feared symptoms might have been impacted by our use of the ASI-R.

Overall, the results of this study provide preliminary evidence that, although fears of anxiety-related symptoms may be more closely related to panic disorder than fears of non-anxiety-related bodily sensations, the construct of AS may be a specific subset of the broader fear of internal bodily sensations in general. As such, psychosocial treatments designed to specifically target AS-related fears (e.g., Watt & Stewart, 2008) may be amenable to modification to more broadly target the interoceptive fears that are seen as underlying a range of clinical presentations, such as the DSM-5 (APA, 2013) somatic symptom disorder and illness anxiety disorder.
Acknowledgment

Portions of this study were completed as part of Katharine C. Sears' doctoral dissertation.

References

Abdel-Khalek, A. M. (2003). The Somatic Symptoms Inventory (SSI): Development, parameters, and correlates. Current Psychiatry, 10, 114–129.
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text revision). Washington, DC: Author.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Washington, DC: Author.
Andersson, G., & Vretblad, P. (2000). Tinnitus and anxiety sensitivity. Scandinavian Journal of Behaviour Therapy, 29, 57–64.
Asmundson, G. J. G., Norton, P. J., & Norton, G. R. (1999). Beyond pain: The role of fear and avoidance in chronicity. Clinical Psychology Review, 19, 97–119.
Asmundson, G. J. G., Norton, P. J., & Veloso, F. (1999). Anxiety sensitivity and fear of pain in patients with recurring headaches. Behaviour Research and Therapy, 37, 703–713.
Asmundson, G. J. G., & Taylor, S. (1996). Role of anxiety sensitivity in pain-related fear and avoidance. Journal of Behavioral Medicine, 19, 573–582.
Asmundson, G. J. G., Wright, K. D., & Hadjistavropoulos, H. D. (2000). Anxiety sensitivity and disabling chronic health conditions: State of the art and future directions. Scandinavian Journal of Behaviour Therapy, 29, 100–117.
Bernstein, A. B., Zvolensky, M. J., Norton, P. J., Schmidt, N. B., Taylor, S., Forsyth, J. P., … Leen-Feldner, E. (2007). Taxometric and factor analytic models of anxiety sensitivity: Integrating approaches to latent structural research. Psychological Assessment, 19, 74–87.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Beverly Hills, CA: Sage.
Carr, R. E., Lehrer, P. M., & Hochron, S. M. (1995). Predictors of panic-fear in asthma. Health Psychology, 14, 421–426.
Carr, R. E., Lehrer, P. M., Rausch, L. L., & Hochron, S. M. (1994). Anxiety sensitivity and panic attacks in an asthmatic population. Behaviour Research and Therapy, 32, 411–418.
Cox, B. J., Borger, S. C., & Enns, M. W. (1999). Anxiety sensitivity and emotional disorders: Psychometric studies and their theoretical implications. In S. Taylor (Ed.), Anxiety sensitivity (pp. 115–148). Mahwah, NJ: Erlbaum.
Cox, B. J., Fuentes, K., Borger, S. C., & Taylor, S. (2001). Psychopathological correlates of anxiety sensitivity: Evidence from clinical interviews and self-report measures. Anxiety Disorders, 15, 317–332.
Derogatis, L. R., Lipman, R. S., Rickels, K., Uhlenhuth, E. H., & Covi, L. (1974). The Hopkins Symptom Checklist (HSCL): A self-report inventory. Behavioral Science, 19, 1–15.
Fergus, T. A. (2014). Health-related dysfunctional beliefs and health anxiety: Further evidence of cognitive specificity. Journal of Clinical Psychology, 70, 248–259.
Hiller, W., Leibbrand, R., Rief, W., & Fichter, M. M. (2005). Differentiating hypochondriasis from panic disorder. Journal of Anxiety Disorders, 19, 29–49.
Hoaglin, D. C., Mosteller, F., & Tukey, J. W. (1983). Understanding robust and exploratory data analysis. New York, NY: Wiley.
Houck, P. R., Spiegel, D. A., Shear, M. K., & Rucci, P. (2002). Reliability of the self-report version of the Panic Disorder Severity Scale. Depression and Anxiety, 15, 183–185.
Kellner, R., Abbott, P., Winslow, W. W., & Pathak, D. (1987). Fears, beliefs, and attitudes in DSM-III hypochondriasis. The Journal of Nervous and Mental Disease, 176, 20–25.
Kellner, R., Wiggins, R. G., & Pathak, D. (1986). Hypochondriacal fears and beliefs in medical and law students. Archives of General Psychiatry, 43, 487–489.
Keogh, E., & Mansoor, L. (2001). Investigating the effects of anxiety sensitivity and coping strategy on the perception of cold pressor pain in healthy women. European Journal of Pain, 5, 11–25.
Lewis, A. R., Zinbarg, R. E., Mineka, S., Craske, M. G., Epstein, A., & Griffith, J. W. (2010). The relationship between anxiety sensitivity and latent symptoms of emotional problems: A structural equation modeling approach. Behaviour Research and Therapy, 48, 761–769.
Maller, R. G., & Reiss, S. (1992). Anxiety sensitivity in 1984 and panic attacks in 1987. Journal of Anxiety Disorders, 6, 241–247.
Marsh, H. W., & Hau, K.-T. (1996). Assessing goodness of fit: Is parsimony always desirable? The Journal of Experimental Education, 64, 364–390.
Muthén, L. K., & Muthén, B. (2013). Mplus user's guide (Version 7.1). Los Angeles, CA: Muthén & Muthén.
Norton, G. R., Norton, P. J., Asmundson, G. J. G., Thompson, L. A., & Larsen, D. K. (1999). Neurotic butterflies in my stomach: The role of anxiety, anxiety sensitivity, and depression in functional gastrointestinal disorders. Journal of Psychosomatic Research, 47, 233–240.
Norton, P. J., Sexton, K. A., Walker, J. R., & Norton, G. R. (2005). Hierarchical model of vulnerabilities for anxiety: Replication and extension with a clinical sample. Cognitive Behaviour Therapy, 34, 49–63.
Olatunji, B. O., & Wolitzky-Taylor, K. B. (2009). Anxiety sensitivity and the anxiety disorders: A meta-analytic review and synthesis. Psychological Bulletin, 135, 974–999.
Otto, M. W., Demopulos, C. M., McLean, N. E., Pollack, M. H., & Fava, M. (1998). Additional findings on the association between anxiety sensitivity and hypochondriacal concerns: Examination of patients with major depression. Journal of Anxiety Disorders, 12, 225–232.
Otto, M. W., Pollack, M. H., Fava, M., Uccello, R., & Rosenbaum, J. F. (1995). Elevated Anxiety Sensitivity Index scores in patients with major depression: Correlates and changes with antidepressant treatment. Journal of Anxiety Disorders, 9, 117–123.
Otto, M. W., Pollack, M. H., Sachs, G. S., & Rosenbaum, J. F. (1992). Hypochondriacal concerns, anxiety sensitivity, and panic disorder. Journal of Anxiety Disorders, 6, 93–104.
Reiss, S. (1987). Theoretical perspectives on the fear of anxiety. Clinical Psychology Review, 7, 585–596.
Reiss, S. (1991). The expectancy model of fear, anxiety and panic. Clinical Psychology Review, 11, 141–153.
Reiss, S. (1999). The sensitivity theory of aberrant motivation. In S. Taylor (Ed.), Anxiety sensitivity (pp. 35–58). Mahwah, NJ: Erlbaum.
Reiss, S., & McNally, R. J. (1985). Expectancy model of fear. In S. Reiss & R. R. Bootzin (Eds.), Theoretical issues in behavior therapy (pp. 107–121). San Diego, CA: Academic Press.
Reiss, S., Peterson, R. A., & Gursky, D. M. (1988). Anxiety sensitivity, injury sensitivity, and individual differences in fearfulness. Behaviour Research and Therapy, 26, 341–345.
Reiss, S., Peterson, R. A., Gursky, D. M., & McNally, R. J. (1986). Anxiety sensitivity, anxiety frequency, and the prediction of fearfulness. Behaviour Research and Therapy, 24, 1–8.
Schmidt, N. B., Lerew, D. R., & Jackson, R. J. (1997). The role of anxiety sensitivity in the pathogenesis of panic: Prospective evaluation of spontaneous panic attacks during acute stress. Journal of Abnormal Psychology, 106, 355–364.
Schmidt, N. B., Lerew, D. R., & Jackson, R. J. (1999). Prospective evaluation of anxiety sensitivity in the pathogenesis of panic: Replication and extension. Journal of Abnormal Psychology, 108, 532–537.
Schmidt, N. B., Zvolensky, M. J., & Maner, J. K. (2006). Anxiety sensitivity: Prospective prediction of panic attacks and Axis I pathology. Journal of Psychiatric Research, 40, 691–699.
Sexton, K. A., Norton, P. J., Walker, J. R., & Norton, G. R. (2003). Hierarchical model of generalized and specific vulnerabilities in anxiety. Cognitive Behaviour Therapy, 32, 82–94.
Shear, K. M., Brown, T. A., Barlow, D. H., Money, R., Sholomskas, D. E., Woods, S. W., … Papp, L. A. (1997). Multicenter collaborative panic disorder severity scale. American Journal of Psychiatry, 154, 1571–1575.
Shear, K. M., Frank, E., Rucci, P., Williams, J., Grochocinski, V. J., Vander Bilt, J., … Wang, T. (2001). Reliability and validity of the panic disorder severity scale: Replication and extension. Journal of Psychiatric Research, 35, 293–296.
Sirri, L., Grandi, S., & Fava, G. A. (2008). The Illness Attitude Scales. Psychotherapy and Psychosomatics, 77, 337–350.
Speckens, A. E. M., Spinhoven, P., Sloekers, P. P. A., Bolk, J. H., & van Hemert, A. M. (1996). A validation study of the Whitely Index, the Illness Attitude Scales, and the Somatosensory Amplification Scale in general medical and general practice patients. Journal of Psychosomatic Research, 40, 95–104.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics. New York, NY: Harper Collins.
Taylor, S. (1994). Comment on Otto et al. (1992): Hypochondriacal concerns, anxiety sensitivity, and panic disorder. Journal of Anxiety Disorders, 8, 97–99.
Taylor, S. (1996). Nature and measurement of anxiety sensitivity: Reply to Lilienfeld, Turner, and Jacob (1996). Journal of Anxiety Disorders, 10, 425–451.
Taylor, S., & Cox, B. J. (1998). An expanded Anxiety Sensitivity Index: Evidence for a hierarchic structure in a clinical sample. Journal of Anxiety Disorders, 12, 463–483.
Taylor, S., & Fedoroff, I. C. (1999). The expectancy theory of fear, anxiety, and panic: A conceptual and empirical analysis. In S. Taylor (Ed.), Anxiety sensitivity: Theory, research, and treatment of the fear of anxiety (pp. 17–33). Mahwah, NJ: Erlbaum.
Taylor, S., Koch, W. J., & McNally, R. J. (1992). How does anxiety sensitivity vary across the anxiety disorders? Journal of Anxiety Disorders, 6, 249–259.
Taylor, S., Zvolensky, M. J., Cox, B. J., Deacon, B., Heimberg, R. G., Ledley, D. R., & Jurado, S. (2007). Robust dimensions of anxiety sensitivity: Development and initial validation of the Anxiety Sensitivity Index-3. Psychological Assessment, 19, 176–188.
Watt, M. C., & Stewart, S. H. (2008). Overcoming the fear of fear: How to reduce anxiety sensitivity. Oakland, CA: New Harbinger.
Watt, M. C., Stewart, S. H., & Cox, B. J. (1998). A retrospective study of the learning history origins of anxiety sensitivity. Behaviour Research and Therapy, 36, 505–525.
Zinbarg, R. E., Brown, T. A., Barlow, D. H., & Rapee, R. M. (2001). Anxiety sensitivity, panic, and depressed mood: A reanalysis teasing apart the contributions of the two levels in the hierarchical structure of the anxiety sensitivity index. Journal of Abnormal Psychology, 110, 372–377.
Date of acceptance: January 28, 2015
Published online: June 26, 2015
Peter J. Norton
School of Psychological Sciences
Monash University
Clayton, Victoria 3168
Australia
Tel. +61 3 9905 1709
Fax +61 3 9905 3948
E-mail Peter.Norton@monash.edu
Original Article
Inconsistency Index for the Zuckerman-Kuhlman-Aluja Personality Questionnaire (ZKA-PQ)

Anton Aluja,1,2 Angel Blanch,1,2 Maite Martí-Guiu,1,2 and Eduardo Blanco1,2

1 University of Lleida, Catalonia, Spain; 2 Institute of Biomedical Research of Lleida, Catalonia, Spain

Abstract. The purpose of this study was to develop an index to assess inconsistency in responses to the Zuckerman-Kuhlman-Aluja Personality Questionnaire (ZKA-PQ) in order to identify and discard inconsistent subjects in applied settings such as clinical, forensic, or personnel selection contexts. The procedure is based on 10 pairs of highly correlated items in a large sample of anonymous volunteers of both sexes (n = 5,644). We inserted random cases into the original data in order to obtain simulated inconsistency scores, and we established a cut-off criterion, corresponding to a T score of 70, to discriminate between consistent and inconsistent subjects. A score higher than 10 points flagged 3.7% of the subjects. The average Cronbach's alpha across ZKA-PQ facets was calculated for subjects scoring up to 8 (α = .79), 9–10 (α = .67), and above 10 points (α = .50) on the inconsistency index. The Feldt test indicated that these alpha differences were significant. The inconsistency score did not affect the factorial structure of the ZKA-PQ. We discuss the utility of this index for identifying inconsistent subjects with the ZKA-PQ, for instance, those with individual difficulties (a limited vocabulary, poor verbal comprehension, an idiosyncratic way of interpreting item meanings, carelessness, inattentiveness...). Keywords: personality test, inconsistency index, alpha reliability
The present study was designed to develop an inconsistency index to detect incoherent responses to the Zuckerman-Kuhlman-Aluja Personality Questionnaire (ZKA-PQ; Aluja, Kuhlman, & Zuckerman, 2010). Researchers who have developed personality tests have designed validity scales and indices to control several answer distortions. However, some authors argue that these approaches have limited success (Zickar & Drasgow, 1996). Uziel (2010) reviewed the main techniques for assessing the utility of these scales and highlighted their value as measures of something inherent to personality traits, but concluded that they are ineffective when used merely as validity control tools, as most of them fail to detect biases in questionnaire responses and are suitable on only a limited number of occasions. While most validity scales of personality questionnaires are developed to obtain information about the intentional distortion of answers, inconsistency indices allow the detection of incoherent responses. These are due to random answers or individual difficulties such as a limited vocabulary, poor verbal comprehension, an idiosyncratic way of interpreting item meanings, carelessness, inattentiveness, blank answers, misreading items, answering in the wrong areas of the answer sheet, and/or using the same response category repeatedly without reading the item (Kurtz & Parrish, 2001). In order to detect unintended distortion or random responses, researchers have developed several inconsistency
indices. The MMPI-2 incorporates the True Response Inconsistency (TRIN), the Variable Response Inconsistency (VRIN), and the Fb (Back F) scales (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989). The TRIN and VRIN were based on the Multidimensional Personality Questionnaire (MPQ; Tellegen, 1982) and were developed to detect patients with inconsistent responses; an elevated Fb scale, for instance, indicates that the respondent stopped paying attention and began providing random answers. Costa and McCrae (1992) did not create validity scales for the NEO Personality Inventory-Revised (NEO-PI-R), arguing that the use of distortion scales reduces the validity of personality measures (e.g., McCrae et al., 1989). Instead of using validity scales, Costa and McCrae (1992) suggested considering the number of missing responses and the tendencies toward acquiescence, nay-saying, and random responding. Schinka, Kinder, and Kremer (1997) developed a set of validity research scales for the NEO-PI-R: Positive Presentation Management (PPM), Negative Presentation Management (NPM), and Inconsistency (INC). PPM correlated strongly with E (0.42) and C (0.56), and NPM with E (−0.48), A (−0.42), and C (−0.49) (Blanch, Aluja, Gallart, & Dolcet, 2009). The heterogeneous content of these scales affected their internal consistency negatively (0.56 and 0.67, respectively). However, the INC was not related to the personality scales.
Validity scales are affected by personality (Joseph, Thomas, & Roopa, 2005). Subjects faking good on the 16 PF motivational distortion scale appear less anxious and more extraverted because they consider items related to anxiety and introversion not to be socially desirable (Karson & O'Dell, 1976). Salgado, Remeseiro, and Iglesias (1996) found that Extraversion and Conscientiousness are related to test-taking motivation, as conceptualized in the five-factor model. Social desirability also has a higher association with Neuroticism, Conscientiousness, and Agreeableness (Ones, Viswesvaran, & Reiss, 1996), and inconsistency is related positively to Neuroticism and negatively to Openness to Experience (Johnson, 2005). Johnson (2005) estimated the relative incidence of invalidated protocols due to linguistic incompetence, carelessness, and inattentiveness in Web-based and paper-and-pencil personality measures, and found no significant differences between the answer formats, as both were affected. Even so, the widespread use of the Internet to collect data with the ZKA-PQ suggests that precautions should be taken to identify subjects who may respond randomly or with low motivation (Blanch, Aluja, & Gallart, 2013). Moreover, an inconsistency index can also be useful for personality assessment with the ZKA-PQ in several contexts (clinical, personnel selection...), as has been demonstrated by the VRIN and TRIN inconsistency scales of the MMPI-2.

The main aim of this study was the development of an index to detect inconsistent subjects with the Zuckerman-Kuhlman-Aluja Personality Questionnaire (ZKA-PQ; Aluja et al., 2010; Zuckerman, Kuhlman, Joireman, Teta, & Kraft, 1993). As inconsistency affects the reliability of the ZKA-PQ scales, we compared alpha reliability in different groups defined by this inconsistency index. In addition, we were interested in knowing whether the factorial structure could be distorted by inconsistency. Moreover, we explored whether there was a characteristic personality profile for the inconsistent subjects.
Table 1. INC means, standard deviations, and T scores with random response insertion

Random insertion (%)    M       SD      T score
0                       4.86    2.71    50.01
10                      6.22    4.28    55.18
20                      7.21    4.90    58.71
30                      7.98    5.20    61.52
40                      8.59    5.34    63.72
50                      9.04    5.33    65.45
60                      9.45    5.43    67.01
70                      9.81    5.41    68.27
80                      10.10   5.40    69.35

Note. INC = Inconsistency scale.
Method

Subjects
There were 5,654 Spanish subjects (2,306 males and 3,338 females) from the general population. The mean age was 39.83 (SD = 16.54) for men and 36.32 (SD = 15.99) for women. The age range was from 17 to 88 years: 17–22 (24.4%), 23–35 (27.2%), 36–49 (22.5%), and 50–80 (25.9%). The subjects were recruited through university students who collaborated in the collection of data. Participation was anonymous and voluntary.
Development of the Inconsistency Index (INC)
Data were collected through pencil-and-paper forms, between 2008 and 2011, in different studies with the ZKA-PQ and other tests in the Spanish general population (age range: 17–88 years). The 10 pairs of items used to develop the INC were selected by choosing the item pairs with the highest correlations, belonging to the same facet and recoded in the same direction.1 This procedure was similar to that of Schinka et al. (1997), which determined a cut-off of r > 0.40 for the selected 10 item pairs. Scores on the INC scale were calculated such that higher scores reflected inconsistency in responses to questions with similar content. The INC was the sum of the absolute differences between the values of each pair of items. Theoretically, the index can vary from 0 to 30 because the item score range goes from 1 to 4 (see Appendix). With the aim of discriminating between consistent and inconsistent subjects, we propose a criterion based on a cut-off corresponding to a T score of 70, obtained by combining real and simulated data (Handel, Ben-Porath, Tellegen, & Archer, 2010) (Table 1).

1 Pairs of items and correlations: 1–61 (0.58), 4–24 (0.65), 49–19 (0.53), 21–121 (0.57), 30–50 (0.57), 80–180 (0.70), 121–141 (0.71), 14–94 (0.46), 120–160 (0.59), 21–161 (0.42) (n = 5,644).

Simulated Data
With the purpose of estimating the mean inconsistency score in the ZKA-PQ under random responding, we used a simulated database of 10,000 subjects. First, we generated 200 random variables with values ranging from 0 to 1, and second, we assigned 1, 2, 3, or 4 points according to the four quartiles of these values, so that each response option had a frequency of 25%. The SPSS syntax is shown in the Appendix. This database was used to simulate the INC mean obtained through the gradual insertion of random data into the original sample.
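The simulation can be sketched compactly in Python (illustrative names, not the authors' code; `real_data` stands for the observed n × 200 response matrix, and the mixing rule reflects one reading of the procedure). The item pairs and the seed value are taken from the Appendix.

import numpy as np

rng = np.random.default_rng(42019831)  # seed borrowed from the Appendix syntax

PAIRS = [(1, 61), (4, 24), (49, 19), (21, 121), (30, 50),
         (80, 180), (121, 141), (14, 94), (120, 160), (21, 161)]

def inc_scores(data):
    """INC = sum of absolute differences over the 10 item pairs (range 0-30)."""
    return sum(np.abs(data[:, i - 1] - data[:, j - 1]) for i, j in PAIRS)

def random_protocols(n_cases, n_items=200):
    """Uniform draws cut at quartiles, i.e., options 1-4 with p = .25 each."""
    return np.digitize(rng.uniform(0, 1, size=(n_cases, n_items)),
                       bins=[0.25, 0.50, 0.75]) + 1

def mean_inc_with_insertion(real_data, pct):
    """Mean INC after adding pct% (of n) random cases to the real sample."""
    k = int(real_data.shape[0] * pct / 100)
    mixed = np.vstack([real_data, random_protocols(k)])
    return inc_scores(mixed).mean()

Looping `mean_inc_with_insertion` over 0–80% mirrors the logic behind Table 1, though the exact tabled values depend on the real data and on the authors' exact mixing rule.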
Figure 1. Inconsistency index plot scores and frequency distribution of ZKA-PQ random sample.
Instrument

The ZKA-PQ (Aluja et al., 2010; Zuckerman & Aluja, 2014) has 200 items which measure five personality dimensions: Sensation Seeking (SS), Neuroticism (NE), Aggression (AG), Activity (AC), and Extraversion (EX). The factorial structure is based on 20 facets of 10 items each, and the answer format is a 4-point Likert scale. Each dimension has four facets:
– Sensation Seeking (SS): SS1 (Thrill and Adventure Seeking), SS2 (Experience Seeking), SS3 (Disinhibition), and SS4 (Boredom Susceptibility/Impulsivity);
– Neuroticism (NE): NE1 (Anxiety), NE2 (Depression), NE3 (Dependency), and NE4 (Low Self-Esteem);
– Aggression (AG): AG1 (Physical Aggression), AG2 (Verbal Aggression), AG3 (Anger), and AG4 (Hostility);
– Extraversion (EX): EX1 (Positive Emotions), EX2 (Social Warmth), EX3 (Exhibitionism), and EX4 (Sociability);
– Activity (AC): AC1 (Work Compulsion), AC2 (General Activity), AC3 (Restlessness), and AC4 (Work Energy).

In the original validation study, alphas for the factor scores were around .87 in two samples, and the 10-item facets had average alphas of .75 and .76 (Spanish and American samples, respectively). Only three facets in the Spanish sample (Boredom Susceptibility/Impulsiveness, Hostility, and Restlessness) and two facets in the American sample (Boredom Susceptibility/Impulsiveness and Restlessness) yielded alphas below .70 (Aluja et al., 2010).

Results

Inconsistency Index (INC)
The mean of the INC random scores was 13.80 (SD = 3.37), with a range of 3–27 points. Figure 1 displays the frequency plot and the distribution of the random scores. The inconsistency index mean for the full sample was 4.86 (SD = 2.71), with a range from 0 to 22 points. Using a simulation procedure similar to that of Handel et al. (2010), we inserted 10% of the randomly generated data into the original data, adding 10% more each time until 80% of the simulated data had been added, at which point 10 points corresponded to a T score of 69.35. Notice that a T score above 70 corresponds to subjects more than two standard deviations above the mean. Therefore, we established a cut-off of 11 or more points to identify and discard inconsistent subjects. The graphical analysis of the INC frequencies likewise supports a cut-off of 10 points to distinguish between consistent and inconsistent subjects (see Figure 2). For the consistent group (≤ 10 points), the distribution was normal (skewness = 0.32; kurtosis = 0.44; n = 5,444), whereas subjects with 11 or more points showed a deviated, marginal representation. These marginal and non-normal scores (the inconsistent group) belonged to 3.7% of the total sample (skewness = 1.49; kurtosis = 2.53; n = 209). The INC mean was 4.57 (SD = 2.27, range 1–10) for the consistent group and 12.79 (SD = 2.02, range 11–22) for the inconsistent group.
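As a quick arithmetic check (a sketch, not the authors' computation), the T scores in Table 1 are reproduced, up to rounding of the reported means, by the standard linear T transformation using the original-sample mean and standard deviation:

$$T = 50 + 10 \cdot \frac{\mathrm{INC} - M}{SD}, \qquad T(10.10) = 50 + 10 \cdot \frac{10.10 - 4.86}{2.71} \approx 69.3,$$

which matches the 69.35 reported for the 80% insertion row.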
Figure 2. Inconsistency index plot scores and frequency distribution of ZKA-PQ sample.

Inconsistency Index and Alpha Reliability of ZKA-PQ Scales

In order to study the relationship between the inconsistency index and the alpha reliability of the scales and facets of the ZKA-PQ, the subjects were divided into three groups: (a) a group scoring 8 points or less, n = 5,136 (M = 4.28; SD = 1.99); (b) a group scoring between 9 and 10 points, n = 309 (M = 9.42; SD = 0.49); and (c) a group scoring above 10 points, n = 209 (M = 12.79; SD = 2.02). An ANOVA with Scheffé comparisons showed that the group means were statistically different (p < .001). In order to test the differences between the group alphas, we used the ALPHATEST program, which evaluates the equality of alpha coefficients in two independent samples (Feldt, 1969; Merino & Lautenschlager, 2003). The alpha reliability of the 20 facets and five dimensions of the ZKA-PQ was smaller in the highest INC group (Table 2).
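For illustration, the classic two-independent-samples form of the Feldt (1969) test can be sketched in Python as an F ratio of the alpha complements; this is an approximation for readers, not a reproduction of the ALPHATEST chi-square statistics reported in Table 2.

from scipy.stats import f

def feldt_test(alpha1, n1, alpha2, n2):
    """H0: alpha1 == alpha2; W = (1 - alpha1)/(1 - alpha2) ~ F(n1 - 1, n2 - 1)."""
    w = (1.0 - alpha1) / (1.0 - alpha2)
    df1, df2 = n1 - 1, n2 - 1
    p = 2.0 * min(f.cdf(w, df1, df2), f.sf(w, df1, df2))  # two-sided p value
    return w, p

# Aggressiveness total: alpha = .92 in G-1 (n = 5,136) vs. .69 in G-3 (n = 209)
w, p = feldt_test(0.92, 5136, 0.69, 209)
print(f"W = {w:.3f}, p = {p:.6f}")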
Factorial Differences

We applied a principal axis factor analysis with varimax rotation to the ZKA-PQ facets for the consistent and inconsistent groups (Table 3). The variance accounted for was 68.26% in the consistent group and 56.30% in the inconsistent group (the Kaiser-Meyer-Olkin measures of sampling adequacy were 0.84 and 0.76, and Bartlett's tests of sphericity yielded approximate χ² values of 53,409.836 and 1,071.281; df = 190; p < .001). The factorial congruence between the consistent and inconsistent subjects' matrices was good, with a global congruence of 0.95. Only the Restlessness facet obtained an unsatisfactory congruence. Nevertheless, the factorial loadings were smaller in the inconsistent group: the average loading of the four facets defining each of the five factors was 0.69 for the consistent group and 0.54 for the inconsistent group.
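The usual index behind such congruency coefficients is Tucker's congruence coefficient; assuming that is the index used here, a minimal Python sketch (with hypothetical loading matrices as inputs) is:

import numpy as np

def tucker_phi(x, y):
    """Congruence of two loading vectors: sum(xy) / sqrt(sum(x^2) * sum(y^2))."""
    return np.sum(x * y) / np.sqrt(np.sum(x**2) * np.sum(y**2))

def factor_congruence(loadings_a, loadings_b):
    """Column-wise (per-factor) congruence between two facet-by-factor matrices."""
    return np.array([tucker_phi(loadings_a[:, j], loadings_b[:, j])
                     for j in range(loadings_a.shape[1])])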
Inconsistency Index and Personality

Table 4 displays a comparison of the means for age and the ZKA-PQ dimensions across the three groups. As can be observed, mean age increases significantly as INC scores increase (r = .11; p < .001). More inconsistent individuals scored higher in Aggressiveness (r = .23; p < .001) and Neuroticism (r = .12; p < .001), and lower in Extraversion (r = −.19; p < .001).

Discussion
Self-report personality inventories provide an individualized personality profile based on the subjects' answers. Thus, it is important that subjects respond honestly and do not deliberately distort their answers. The validity or control indicators of personality questionnaires have been widely studied since the development of the MMPI (Hathaway & McKinley, 1943). Our study considers only the consistency of the answers as an indicator of the global validity of the ZKA-PQ. The results indicate that, using a cut-off of 11 or more points, high INC scores were found in 3.7% of the subjects who responded anonymously and voluntarily to the ZKA-PQ. The INC mean scores of the inconsistent subjects were similar to those obtained from a random sample. For this reason, in an anonymous and voluntary sample, it is possible that between 3% and 4% of subjects should be discarded, as these subjects can distort the data of the rest of the participants. The use of the ZKA-PQ in other samples, such as recruitment processes or clinical individuals, might yield higher inconsistency indices. In order to minimize inconsistency in comparative cross-cultural studies in different countries and/or languages, it is recommended to consider only those subjects with an inconsistency index below 11 points.
Table 2. Internal consistency (alpha) by inconsistency group and Feldt tests of alpha differences

                              G-1    G-2    G-3    G1–G2            G1–G3            G2–G3
                              α1     α2     α3     χ²       p<      χ²       p<      χ²      p<
AG1 physical aggression       .82    .66    .51    57.89    .0001   111.35   .0001   6.90    .009
AG2 verbal aggression         .76    .64    .44    22.00    .0001   76.06    .0001   10.10   .002
AG3 anger                     .85    .71    .50    62.62    .0001   171.09   .0001   15.39   .0001
AG4 hostility                 .70    .58    .33    14.85    .0001   67.49    .0001   11.30   .0008
Aggressiveness                .92    .83    .69    97.85    .0001   263.57   .0001   21.79   .0001
AC1 work compulsion           .77    .79    .57    0.96     .33     38.76    .0001   26.67   .0001
AC2 general activity          .82    .75    .45    14.12    .0002   9.75     .002    32.27   .0001
AC3 restlessness              .64    .56    .32    5.08     .02     40.18    .0001   9.81    .002
AC4 work energy               .81    .76    .64    6.95     .008    40.61    .0001   8.50    .004
Activity                      .88    .85    .73    7.35     .007    80.17    .0001   20.86   .0001
EX1 positive emotions         .80    .63    .54    53.84    .0001   73.18    .0001   2.43    .12
EX2 social warmth             .87    .79    .66    31.45    .0001   101.39   .0001   12.03   .0005
EX3 exhibitionism             .82    .70    .40    36.01    .0001   171.09   .0001   24.95   .0001
EX4 sociability               .80    .69    .49    25.95    .0001   95.38    .0001   12.84   .0003
Extraversion                  .92    .86    .77    50.99    .0001   146.47   .0001   14.86   .0001
NE1 anxiety                   .77    .55    .37    65.13    .0001   112.95   .0001   5.84    .02
NE2 depression                .78    .58    .36    60.03    .0001   129.18   .0001   9.18    .002
NE3 dependence                .74    .60    .41    25.02    .0001   70.54    .0001   7.81    .005
NE4 low self-esteem           .89    .71    .47    147.96   .0001   325.57   .0001   18.87   .0001
Neuroticism                   .93    .84    .77    120.18   .0001   193.49   .0001   7.91    .005
SS1 thrill and adventure      .82    .70    .57    36.01    .0001   80.92    .0001   6.69    .01
SS2 experience seeking        .79    .66    .39    31.77    .0001   128.76   .0001   17.73   .0001
SS3 disinhibition             .80    .71    .53    18.30    .0001   77.51    .0001   12.08   .0005
SS4 boredom susceptibility    .66    .60    .64    3.29     .07     0.27     .60     0.56    .45
Sensation seeking             .91    .85    .76    41.89    .0001   123.50   .0001   13.31   .0003
Average (facets only)         .79    .67    .50    27.72    .0001   80.23    .0001   8.93    .003

Notes. G-1: INC ≤ 8 (n = 5,136); G-2: INC 9–10 (n = 309); G-3: INC > 10 (n = 209).
As expected, the inconsistent subjects obtained lower alpha reliabilities on all the facets and dimensions. Alpha internal reliability is based on the average correlation among the pairs of items forming the scale (Cronbach, 1951), and Cronbach's alpha can be biased by inconsistent responses (Fong, Ho, & Lam, 2010). If subjects were inconsistent in the pairs of correlated items of the ZKA-PQ, we can expect them to also be inconsistent in the items of the facets. Therefore, inconsistent responses affect the facet scores, which deteriorates the alpha reliability. An advantage of the INC is that a score can be obtained for each individual, whereas Cronbach's alpha can only be obtained for a group of subjects. Inconsistent subjects presented smaller loadings in the factorial structure of the ZKA-PQ, but the global structure was not affected. Similar results were obtained by Johnson (2005).

In order to unfold the personality pattern related to inconsistent answers, the mean scores of the ZKA-PQ facets and dimensions were compared between the different INC groups. Subjects with higher INC scores were older, scored higher in Aggressiveness and Neuroticism, and scored lower in Activity and Extraversion. Notice that Neuroticism was also positively related to inconsistency in Johnson's (2005) study. Nevertheless, it would be advisable to replicate these findings with different samples, as the personality characteristics of the inconsistent subjects could be questionable. Dawson and Schuerger (2007) examined the relationship between the Response Inconsistency Scale (RINC) of the Adolescent Personality Questionnaire and the Variable Response Inconsistency Scale (VRIN) of the Minnesota Multiphasic Personality Inventory for Adolescents (MMPI-A). Scores on both inconsistency scales correlated negatively with Extraversion, Independence, and Self-control scores. Salgado et al. (1996) found that motivation correlated significantly with Extraversion and Conscientiousness. These findings are in line with our results.

The main purpose of this paper was to provide a useful tool for ZKA-PQ researchers to detect inconsistent subjects. The procedure is relatively simple, and the proposed statistics can be easily programmed (see Appendix). The INC may be useful to discard subjects with random responses, low verbal comprehension, a low level of motivation, or inattention in different contexts. Moreover, this index could also be used in research fields in which answers are obtained through the Internet and access to the questionnaires is not controlled or restricted. Likewise, the detection of subjects assessed as inconsistent with the ZKA-PQ is advisable in clinical practice. Further cross-sectional and experimental studies are needed to unfold the relationship between the inconsistency index and external variables, such as motivation or attention.
Table 3. Factorial structure matrices for the consistent and inconsistent groups, congruency coefficients (CC), and loading differences (Diff.) by factor

                              Consistent (INC ≤ 10; n = 5,435)   Inconsistent (INC > 10; n = 209)
Facet                         I     II    III   IV    V          I     II    III   IV    V          CC     Diff.
AG1 physical aggression       .58   .02   .16   .04   .24        .42   .13   .16   .00   .43        .92    .16
AG2 verbal aggression         .74   .02   .16   .05   .22        .64   .03   .15   .01   .02        .91    .10
AG3 anger                     .82   .05   .05   .32   .02        .50   .02   .26   .22   .13        .97    .32
AG4 hostility                 .64   .04   .28   .44   .11        .30   .08   .30   .49   .22        .95    .34
AC1 work compulsion           .05   .59   .01   .06   .07        .03   .53   .08   .07   .17        .91    .06
AC2 general activity          .03   .60   .06   .04   .16        .03   .61   .11   .03   .02        .95    .08
AC3 restlessness              .25   .56   .15   .11   .18        .13   .47   .10   .08   .08        .87    .09
AC4 work energy               .13   .60   .21   .29   .29        .15   .59   .33   .17   .24        .98    .01
EX1 positive emotions         .18   .22   .61   .44   .06        .04   .22   .66   .19   .19        .93    .05
EX2 social warmth             .15   .05   .75   .13   .09        .08   .05   .65   .19   .12        .96    .10
EX3 exhibitionism             .22   .08   .52   .06   .31        .32   .03   .37   .04   .36        .98    .15
EX4 sociability               .05   .08   .76   .14   .24        .01   .12   .58   .16   .17        .97    .18
NE1 anxiety                   .28   .19   .08   .72   .07        .16   .05   .15   .54   .07        .99    .18
NE2 depression                .16   .03   .21   .82   .01        .02   .01   .27   .49   .04        .98    .33
NE3 dependence                .03   .00   .01   .76   .13        .03   .01   .00   .71   .08        .99    .05
NE4 low self-esteem           .01   .11   .26   .83   .04        .11   .03   .48   .51   .03        .94    .32
SS1 thrill and adventure      .10   .13   .03   .14   .70        .10   .06   .01   .13   .62        .99    .08
SS2 experience seeking        .02   .01   .18   .00   .73        .15   .01   .06   .09   .58        .95    .15
SS3 disinhibition             .18   .02   .19   .05   .79        .19   .14   .06   .07   .66        .94    .13
SS4 boredom susceptibility    .27   .09   .02   .08   .55        .19   .25   .27   .05   .44        .90    .11
Congruency Coefficient                                           .97   .96   .94   .96   .91        .95    .14

Note. Factorial loadings ≥ .40 are in bold in the original table.
Table 4. Age and personality means comparison between INC groups and INC correlations with age and the ZKA-PQ factors

                      (1) INC ≤ 8       (2) INC 9–10      (3) INC > 10
                      M       SD        M       SD        M       SD        F       p<      Scheffé (p < .05)    r (INC)   p<
Age                   37.22   15.45     41.12   18.04     46.24   18.92     38.12   .001    1 < 2,3; 2 < 3       .11       .001
Sensation seeking     89.26   17.10     88.67   5.42      92.65   13.71     4.43    .012    1 < 3; 2 < 3         .04       .001
Aggressiveness        87.01   17.03     94.53   14.23     95.95   11.90     54.56   .001    1 < 2, 3             .23       .001
Activity              110.55  14.45     109.14  15.19     104.44  12.94     18.83   .001    1 > 2; 2 > 3         −.07      .001
Neuroticism           92.04   18.44     94.99   14.56     96.81   12.29     10.37   .001    1 < 2, 3             .12       .001
Extraversion          118.70  16.02     112.51  14.94     108.34  13.51     62.51   .001    1 > 2,3; 2 > 3       −.19      .001
References

Aluja, A., Kuhlman, M., & Zuckerman, M. (2010). Development of the Zuckerman-Kuhlman-Aluja Personality Questionnaire (ZKA-PQ): A factor/facet version of the Zuckerman-Kuhlman Personality Questionnaire (ZKPQ). Journal of Personality Assessment, 92, 416–431.
Blanch, A., Aluja, A., & Gallart, S. (2013). Personality assessment through Internet: Factor analyses by age groups of the ZKA Personality Questionnaire. Psychologica Belgica, 53, 101–119.
Blanch, A., Aluja, A., Gallart, S., & Dolcet, J. M. (2009). A review on the use of NEO-PI-R validity scales in normative, job selection, and clinical samples. European Journal of Psychiatry, 23, 121–129.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A., & Kaemmer, B. (1989). Manual for the restandardized Minnesota Multiphasic Personality Inventory: MMPI-2. Minneapolis, MN: University of Minnesota Press.
Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI): Professional manual. Odessa, FL: Psychological Assessment Resources.
Cronbach, L. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.
Dawson, K. A., & Schuerger, J. M. (2007). Adolescent personality and two measures of response inconsistency. Psychological Reports, 100, 113–114.
Feldt, L. S. (1969). A test of the hypothesis that Cronbach's alpha or Kuder-Richardson coefficient twenty is the same for two tests. Psychometrika, 34, 363–373.
Fong, D. Y., Ho, S. Y., & Lam, T. H. (2010). Evaluation of internal reliability in the presence of inconsistent responses. Health and Quality of Life Outcomes, 8, 27.
Handel, R. W., Ben-Porath, Y. S., Tellegen, A., & Archer, R. (2010). Psychometric functioning of the MMPI-2-RF VRIN-r and TRIN-r scales with varying degrees of randomness, acquiescence, and counter-acquiescence. Psychological Assessment, 22, 87–95.
Hathaway, S. R., & McKinley, J. C. (1943). Manual for the Minnesota Multiphasic Personality Inventory. New York, NY: Psychological Corporation.
Johnson, J. A. (2005). Ascertaining the validity of individual protocols from Web-based personality inventories. Journal of Research in Personality, 39, 103–129.
Joseph, C., Thomas, B., & Roopa, G. G. (2005). Test taking response styles and associated personality traits in aircrew during evaluation. Indian Journal of Aerospace Medicine, 49, 1–10.
Karson, S., & O'Dell, J. W. (1976). A guide to the clinical use of the 16 PF. Champaign, IL: Institute for Personality and Ability Testing.
Kurtz, J. E., & Parrish, C. L. (2001). Semantic response consistency and protocol validity in structured personality assessment: The case of the NEO-PI-R. Journal of Personality Assessment, 76, 315–332.
McCrae, R. R., Costa, P. T., Dahlstrom, W. G., Barefoot, J. C., Siegler, I. C., & Williams, R. B. (1989). A caution on the use of the MMPI K-correction in research on psychosomatic medicine. Psychosomatic Medicine, 51, 58–65.
Merino, C., & Lautenschlager, G. J. (2003). Statistical comparison of Cronbach's alpha reliability: Applications in educational and psychological measurement. Revista de Psicología, 2, 127–136.
Ones, D. S., Viswesvaran, C., & Reiss, A. D. (1996). Role of social desirability in personality testing for personnel selection: The red herring. Journal of Applied Psychology, 81, 660–679.
Salgado, J. F., Remeseiro, C., & Iglesias, M. (1996). Personality and test taking motivation. Psicothema, 8, 553–562.
Schinka, J. A., Kinder, B. N., & Kremer, T. (1997). Research validity scales for the NEO-PI-R: Development and initial validation. Journal of Personality Assessment, 68, 127–138.
Tellegen, A. (1982). Brief manual of the Multidimensional Personality Questionnaire. Unpublished manuscript.
Uziel, L. (2010). Rethinking social desirability scales: From impression management to interpersonally oriented self-control. Perspectives on Psychological Science, 5, 243–262.
Zickar, M. J., & Drasgow, F. (1996). Detecting faking on a personality instrument using appropriateness measurement. Applied Psychological Measurement, 20, 71–87.
Zuckerman, M., & Aluja, A. (2014). Measures of sensation seeking. In G. J. Boyle, D. H. Saklofske, & G. Matthews (Eds.), Measures of personality and social psychological constructs (pp. 352–378). San Diego, CA: Academic Press.
Zuckerman, M., Kuhlman, D. M., Joireman, J., Teta, P., & Kraft, M. (1993). A comparison of three structural models for personality: The Big Three, the Big Five, and the Alternative Five. Journal of Personality and Social Psychology, 65, 757–768.
Date of acceptance: January 29, 2015
Published online: June 30, 2015
Anton Aluja
Department of Psychology
University of Lleida
Avda. Estudi General, 4
25001 Lleida
Catalonia
Spain
Tel. +34 973 706-529
E-mail aluja@pip.udl.cat
Appendix

************************
Random generated variables for 10,000 cases (SPSS syntax): 1, 2, 3, or 4 points per item
************************

* 1st part.
* Set the random seed once, before generating cases.
SET SEED 42019831.
INPUT PROGRAM.
LOOP I=1 TO 10000.
COMPUTE x1 = RV.UNIFORM(0,1).
COMPUTE x2 = RV.UNIFORM(0,1).
...
COMPUTE x200 = RV.UNIFORM(0,1).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
FREQUENCIES VARIABLES=x1 x2 x3 x4.

* 2nd part: recode each uniform draw into a 1-4 response (p = .25 per option).

*Item 1.
IF (x1 <= 0.25) R1=1.
IF (x1 > 0.25 AND x1 <= 0.50) R1=2.
IF (x1 > 0.50 AND x1 <= 0.75) R1=3.
IF (x1 > 0.75) R1=4.

*Item 2.
IF (x2 <= 0.25) R2=1.
IF (x2 > 0.25 AND x2 <= 0.50) R2=2.
IF (x2 > 0.50 AND x2 <= 0.75) R2=3.
IF (x2 > 0.75) R2=4.

...

*Item 200.
IF (x200 <= 0.25) R200=1.
IF (x200 > 0.25 AND x200 <= 0.50) R200=2.
IF (x200 > 0.50 AND x200 <= 0.75) R200=3.
IF (x200 > 0.75) R200=4.
EXECUTE.
***********************
INCONSISTENCY INDEX FOR ZKA-PQ (SPSS syntax)
***********************
* Each PAIR variable is the absolute difference between two highly correlated items.
COMPUTE PAIR_1 = ABS(Z1 - Z61).
COMPUTE PAIR_2 = ABS(Z4 - Z24).
COMPUTE PAIR_3 = ABS(Z49 - Z19).
COMPUTE PAIR_4 = ABS(Z21 - Z121).
COMPUTE PAIR_5 = ABS(Z30 - Z50).
COMPUTE PAIR_6 = ABS(Z80 - Z180).
COMPUTE PAIR_7 = ABS(Z121 - Z141).
COMPUTE PAIR_8 = ABS(Z14 - Z94).
COMPUTE PAIR_9 = ABS(Z120 - Z160).
COMPUTE PAIR10 = ABS(Z21 - Z161).
* INC is the sum over the 10 pairs (theoretical range 0-30).
COMPUTE INC = PAIR_1 + PAIR_2 + PAIR_3 + PAIR_4 + PAIR_5 + PAIR_6 + PAIR_7 + PAIR_8 + PAIR_9 + PAIR10.
EXECUTE.
Original Article
Exploring Occasion Specificity in the Assessment of DSM-5 Maladaptive Personality Traits: A Latent State-Trait Analysis

Johannes Zimmermann,1,* Axel Mayer,2,* Daniel Leising,3 Tobias Krieger,4 Martin grosse Holtforth,4 and Johanna Pretsch5

1 Department of Psychology, University of Kassel, Germany; 2 Department of Data Analysis, Ghent University, Belgium; 3 Department of Psychology, Technische Universität Dresden, Germany; 4 Department of Clinical Psychology and Psychotherapy, University of Bern, Switzerland; 5 Department of Psychology, University of Landau, Germany

* Johannes Zimmermann and Axel Mayer contributed equally to this manuscript and share first authorship.

Abstract. The alternative classification system for personality disorders in DSM-5 features a hierarchical model of maladaptive personality traits. This trait model comprises five broad trait domains and 25 specific trait facets that can be reliably assessed using the Personality Inventory for DSM-5 (PID-5). Although there is a steadily growing literature on the validity of the PID-5, issues of temporal stability and situational influences on test scores are currently unexplored. We addressed these issues using a sample of 611 research participants who completed the PID-5 three times, with time intervals of 2 months. Latent state-trait (LST) analyses for each of the 25 PID-5 trait facets showed that, on average, 79.5% of the variance was due to stable traits (i.e., consistency), and 7.7% of the variance was due to situational factors (i.e., occasion specificity). Our findings suggest that the PID-5 trait facets predominantly capture individual differences that are stable across time. Keywords: maladaptive personality traits, latent state-trait theory
The classification of personality disorders is currently shifting toward a dimensional trait model (Krueger & Markon, 2014; Skodol, 2012; Tyrer et al., 2011). A landmark step in this ongoing process was the development of a hierarchical model of maladaptive personality traits that is part of the alternative classification system for personality disorders in DSM-5 Section III (American Psychiatric Association, 2013). This model comprises five broad trait domains, namely Negative Affectivity, Detachment, Antagonism, Disinhibition, and Psychoticism. Each of these domains is further specified by several trait facets reflecting more narrowly defined maladaptive dispositions. This hierarchical model was developed in conjunction with a self- and informant-report questionnaire for assessing the 25 trait facets: the Personality Inventory for DSM-5 (PID-5; Krueger, Derringer, Markon, Watson, & Skodol, 2012). A growing body of research shows that the five-factor structure of the PID-5 is relatively stable across samples, languages, and raters, and that the trait facets and domains are differentially related to a range of other relevant constructs (Krueger & Markon, 2014). One of the most important findings in this regard is that the PID-5 domains can be
broadly conceived of as maladaptive variants of the Five-Factor Model (FFM) traits: Studies have repeatedly found strong positive associations between Negative Affectivity and Neuroticism, and strong negative associations between Detachment and Extraversion, Antagonism and Agreeableness, and Disinhibition and Conscientiousness (whereas the association between Psychoticism and Openness was less clear; e.g., Watson, Stasik, Ro, & Clark, 2013; Zimmermann et al., 2014).

A major issue in the assessment of personality traits is the influence of situational factors (e.g., current mood) on test scores (e.g., Clark, Vittengl, Kraft, & Jarrett, 2003; Naragon-Gainey, Gallagher, & Brown, 2013). This issue has not been systematically addressed so far with respect to the PID-5 and will constitute the focus of the present paper. Personality traits are assumed to be the most stable component of personality disorders as compared to more transient symptomatic expressions such as suicidal behavior (Morey & Hopwood, 2013). Thus, people's manifest PID-5 test scores should ideally reflect their stable latent traits and remain relatively unaffected by fluctuations of current mood or external circumstances. A recent
meta-analysis reported that rank-order stability coefficients for general personality traits in adult samples ranged from .56 to .73, with older samples showing higher estimates (Ferguson, 2010). This suggests that "normal" individual differences in manifest test scores are relatively stable across measurement occasions. Similar estimates have been reported for the two-year stability of dimensional scores of personality disorders and maladaptive personality traits (Lenzenweger, 1999; Morey et al., 2007; Samuel et al., 2011).

A conceptual and methodological framework that is ideally suited to separating consistent and occasion-specific variance in test scores is latent state-trait theory (LST theory; Steyer, Ferring, & Schmitt, 1992; Steyer, Geiser, & Fiege, 2012; Steyer, Schmitt, & Eid, 1999) and its revision (LST-R theory; Steyer, Mayer, Geiser, & Cole, 2015).1 In LST theory, it is argued that every manifest measure or test score can be decomposed into a latent state variable, also called the true score variable at a specific time in a specific situation, and measurement error. The latent state variable itself is a compound of trait components and situation-specific components, also called state residuals. An important goal in LST models is to separate these components from one another, and thereby determine the extent to which manifest measures actually reflect attributes of persons (i.e., trait components). The different components can only be separated by using longitudinal research designs, where multiple indicators of the same construct are measured at multiple occasions. The present study incorporates such a design.

1 Another model that allows for considering traits and states simultaneously is the trait-state-error model (Kenny & Zautra, 1995).

In LST theory, the variance of every manifest variable is additively decomposed into the variance of the latent trait variable, the variance of the latent state residual variable, and measurement error variance. The reliability coefficient Rel(Yit) is defined for the ith manifest variable at occasion t, Yit, and represents the proportion of latent state variance relative to the total variance Var(Yit); a high reliability coefficient indicates a low proportion of measurement error variance in the measured variable Yit. The consistency coefficient Con(Yit) is defined as the proportion of latent trait variance relative to the total variance Var(Yit) of each manifest variable, and thus quantifies how well the manifest variable Yit measures the latent trait variable. The occasion specificity coefficient Spe(Yit) is defined as the proportion of state residual variance relative to the total variance Var(Yit); that is, it indicates which proportion of the variance of the measured variable Yit is due to the specific situation in which it is measured. Precise definitions of latent trait variables, latent state variables, and latent state residuals, as well as of the reliability, consistency, and occasion specificity coefficients, are given in Sidebar 3.1 in Steyer et al. (2015).
Materials and Methods Procedure We invited students at several universities in Germany, Austria, and the German-speaking part of Switzerland to participate in a questionnaire study of personality and mental health (see Zimmermann et al., 2014). Students who were interested in participating wrote an e-mail to the study
team, and subsequently received e-mails containing personalized links to an online platform. In addition, we posted a hyperlink to the study on the website of a popular psychology magazine in Germany. After obtaining informed consent, all participants were asked to complete a questionnaire battery including questions regarding demographic information (e.g., age, gender, education), the Patient Health Questionnaire (PHQ; Gräfe, Zipfel, Herzog, & Löwe, 2004), and the PID-5 (time 1). Participants were excluded if they reported being younger than 18 years old. Participants who did not complete the online survey within 7 days received weekly reminders. Completers received a second invitation 2 months later, again containing personalized links to the online platform where they were asked to complete the PID-5 (time 2). This procedure was repeated another 2 months later (time 3). Participants who completed all three assessments automatically entered a lottery, in which they could win coupons from an online bookseller worth 50 euros. Participants studying psychology received course credit. The study protocol was approved by the Ethics Committee of the University of Kassel.
Measures

Personality Inventory for DSM-5 (PID-5)

The PID-5 is a 220-item questionnaire for assessing maladaptive personality traits according to the DSM-5 trait model (Krueger et al., 2012; German version: Zimmermann et al., 2014). Items are presented with a 4-point response format ranging from "very false or often false" (0) to "very true or often true" (3). For the majority of items, higher values reflect higher levels of personality pathology, with only 16 items being reverse coded. The PID-5 comprises 25 trait facet scales, each consisting of between four and 14 items. For both the English and the German language versions, facet scales show acceptable to good internal consistencies (with median Cronbach's alphas of around .85), as well as unique and high loadings on the five factors of Negative Affectivity, Detachment, Antagonism, Disinhibition, and Psychoticism (Krueger & Markon, 2014; Zimmermann et al., 2014).
Patient Health Questionnaire (PHQ)

The PHQ is a widely used screening measure for mental disorders according to DSM-IV (Spitzer, Kroenke, & Williams, 1999; German version: Gräfe et al., 2004). Specifically, the PHQ screens for eight Axis I disorders, which are further specified as "threshold disorders" (including major depressive disorder, panic disorder, and bulimia nervosa) or "subthreshold disorders" (including other depressive disorder, other anxiety disorder, alcohol abuse or dependence, somatoform disorder, and binge eating disorder). The PHQ shows good sensitivity and specificity in identifying persons with interview-based DSM-IV diagnoses (Gräfe et al., 2004).
Sample

The initial sample consisted of 618 participants who completed assessments at time 1. We excluded 7 participants with a substantial (> 10%) number of missing PID-5 item values. Thus, the following analyses are based on data from 611 participants. At the subsequent assessments, we received full responses (i.e., responses with ≤ 10% missing PID-5 item values) from 506 (time 2) and 485 (time 3) participants. In the final sample of 611 participants, 511 (83.6%) were female, 86 (14.1%) were male, and 14 (2.3%) did not report their gender. Participants' age ranged from 18 to 61 years, with a mean of 25.7 years (SD = 7.9). As this sample was predominantly recruited at universities, 575 participants (94.1%) had one of the highest possible secondary school degrees within the academic systems of the German-speaking countries ("Abitur" or "Matura"), so the participants were, on average, very well educated. Five hundred ninety-three participants (97.1%) also completed the PHQ at time 1. Of these, 276 participants (46.5%) were probably suffering from at least one mental health problem – a prevalence rate that is quite typical for German university samples when the PHQ is used as a screening instrument (Bailer, Schwarz, Witthöft, Stübinger, & Rist, 2008). Specifically, 133 participants (22.4%) had a PHQ diagnosis of depression (major depressive disorder or other depressive disorder), 51 (8.6%) had a PHQ diagnosis of anxiety disorder (panic disorder and/or other anxiety disorder), 61 (10.3%) had a PHQ diagnosis of eating disorder (bulimia nervosa or binge eating disorder), 101 (17.0%) showed symptoms of alcohol abuse, and 102 (17.2%) had major somatic complaints.
Statistical Analyses
In a preparatory step, we constructed two parallel "parcels" of items (Little, Rhemtulla, Gibson, & Schoemann, 2013) for each PID-5 trait facet, based on corrected item-whole correlations, and used these two parcels as indicators in the LST models. Specifically, we ranked the items of each facet according to their corrected item-whole correlation in the long-format data set, and then assigned the items with the highest, fourth-highest, fifth-highest, eighth-highest, etc., correlations to the first parcel and the items with the second-highest, third-highest, sixth-highest, seventh-highest, etc., correlations to the second parcel (an ABBAABBA... pattern). This way, we ensured that the average item-total correlations within the two parcels were approximately the same.
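A minimal sketch of this parceling step in R is given below (our reconstruction, not the authors' code). The data frame `facet_items`, holding one facet's item responses, is hypothetical, and forming parcels as item means rather than sums is our assumption.

```r
# Corrected item-whole correlation: each item correlated with the sum of the others
corrected_itc <- sapply(seq_along(facet_items), function(j) {
  cor(facet_items[[j]], rowSums(facet_items[-j], na.rm = TRUE),
      use = "pairwise.complete.obs")
})
ranked  <- order(corrected_itc, decreasing = TRUE)           # item indices, best first
pattern <- rep(c(1, 2, 2, 1), length.out = length(ranked))   # ABBAABBA ... assignment
parcel1 <- rowMeans(facet_items[, ranked[pattern == 1], drop = FALSE], na.rm = TRUE)
parcel2 <- rowMeans(facet_items[, ranked[pattern == 2], drop = FALSE], na.rm = TRUE)
```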
In order to estimate the variances of latent trait variables and latent state variables, we used a multistate-multitrait model with indicator-specific traits (MSMT model; Eid & Hoffmann, 1998) for each of the 25 PID-5 trait facets. Note that, for each of the 25 trait facets, six manifest variables were available, that is, two item parcels i = 1, 2 at three occasions of measurement t = 1, 2, 3. The model equation for our MSMT model for any given facet is:

$$
\begin{pmatrix} Y_{11}\\ Y_{21}\\ Y_{12}\\ Y_{22}\\ Y_{13}\\ Y_{23} \end{pmatrix}
=
\begin{pmatrix} 1 & 0\\ 0 & 1\\ 1 & 0\\ 0 & 1\\ 1 & 0\\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \theta_1\\ \theta_2 \end{pmatrix}
+
\begin{pmatrix} 1 & 0 & 0\\ 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \zeta_1\\ \zeta_2\\ \zeta_3 \end{pmatrix}
+
\begin{pmatrix} \epsilon_{11}\\ \epsilon_{21}\\ \epsilon_{12}\\ \epsilon_{22}\\ \epsilon_{13}\\ \epsilon_{23} \end{pmatrix}
\qquad (1)
$$
where θ_i denotes the latent trait variable for the ith item parcel, ζ_t denotes the state residual for occasion t, and ε_it are measurement error variables. Figure 1 shows a path diagram for this model. Additionally, we assume that the variances of the latent trait variables are equal, Var(θ1) = Var(θ2); that the variances of the state residuals are equal, Var(ζ1) = Var(ζ2) = Var(ζ3); and that the variances of the measurement error variables are equal, Var(ε11) = Var(ε21) = ... = Var(ε23). Based on our MSMT model, we can identify the variances of the latent trait variables and latent state variables and compute reliability, consistency, and occasion specificity coefficients as follows:

$$\mathrm{Rel}(Y_{it}) = \frac{\mathrm{Var}(\theta_i) + \mathrm{Var}(\zeta_t)}{\mathrm{Var}(Y_{it})} \qquad (2)$$

$$\mathrm{Con}(Y_{it}) = \frac{\mathrm{Var}(\theta_i)}{\mathrm{Var}(Y_{it})} \qquad (3)$$

$$\mathrm{Spe}(Y_{it}) = \frac{\mathrm{Var}(\zeta_t)}{\mathrm{Var}(Y_{it})} \qquad (4)$$
Note that reliability, consistency, and occasion specificity coefficients are identical for all six manifest variables, because we assumed equal variances of the measurement error variables, latent trait variables, and state residual variables. We applied the MSMT model to each of the 25 PID-5 trait facets separately. In addition, we estimated multiple-group MSMT models to compare reliability, consistency, and occasion specificity coefficients in the 276 participants with mental health problems and the 317 participants without mental health problems (according to the PHQ at time 1). The parameters of all models were estimated by full information maximum likelihood using the lavaan package (Rosseel, 2012) for R (R Core Team, 2013).
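A lavaan specification along these lines might look as follows. This is a sketch of Equation 1 under the stated equality constraints, not the authors' exact syntax; the manifest variable names y11–y23 (parcel i at occasion t) and the data frame `facet_data` are hypothetical.

```r
library(lavaan)

msmt <- '
  # indicator-specific trait variables (all loadings fixed at 1)
  theta1 =~ 1*y11 + 1*y12 + 1*y13
  theta2 =~ 1*y21 + 1*y22 + 1*y23
  # occasion-specific state residuals (all loadings fixed at 1)
  zeta1 =~ 1*y11 + 1*y21
  zeta2 =~ 1*y12 + 1*y22
  zeta3 =~ 1*y13 + 1*y23
  # equal variances within each class of latent variables (labels vt, vs, ve)
  theta1 ~~ vt*theta1;  theta2 ~~ vt*theta2
  zeta1 ~~ vs*zeta1;  zeta2 ~~ vs*zeta2;  zeta3 ~~ vs*zeta3
  y11 ~~ ve*y11;  y21 ~~ ve*y21;  y12 ~~ ve*y12
  y22 ~~ ve*y22;  y13 ~~ ve*y13;  y23 ~~ ve*y23
  # state residuals are uncorrelated with each other and with the traits
  zeta1 ~~ 0*zeta2 + 0*zeta3;  zeta2 ~~ 0*zeta3
  theta1 ~~ 0*zeta1 + 0*zeta2 + 0*zeta3
  theta2 ~~ 0*zeta1 + 0*zeta2 + 0*zeta3
'
fit <- sem(msmt, data = facet_data, missing = "fiml")  # FIML, as in the paper

# Equations 2-4: under the model, Var(Y_it) = vt + vs + ve
pe <- parameterEstimates(fit)
vt <- pe$est[pe$label == "vt"][1]
vs <- pe$est[pe$label == "vs"][1]
ve <- pe$est[pe$label == "ve"][1]
rel <- (vt + vs) / (vt + vs + ve)  # reliability,        Equation 2
con <- vt / (vt + vs + ve)         # consistency,        Equation 3
spe <- vs / (vt + vs + ve)         # occasion specificity, Equation 4
```

The multiple-group version follows by adding a grouping variable to the `sem()` call (e.g., `group = "phq_status"`, a hypothetical name) and fitting the same constrained model in both groups.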
Figure 1. Path diagram for the multistate-multitrait model with two manifest indicators Y_it (item parcels) at each of three occasions of measurement t, occasion-specific state residuals ζ_t, indicator-specific trait variables θ_i, and measurement error variables ε_it. All loadings are fixed at one.
Results and Discussion

Table 1 shows the reliability, consistency, and occasion specificity coefficients for each of the trait facets as well as model fit information based on the full sample. The model fit indices reported in Table 1 indicated an acceptable model fit for the MSMT model for most of the trait facets. The average reliability across the 25 trait facets was .872, which is well in line with prior findings regarding the internal consistency (i.e., Cronbach's alpha) of PID-5 scales (Krueger et al., 2012; Krueger & Markon, 2014; Zimmermann et al., 2014). The average consistency was .795, and the average occasion specificity was .077, which is close to what was expected based on prior studies of general personality traits (Deinzer et al., 1995; Schmukle & Egloff, 2005).² Multiple-group MSMT models suggested that the results were very similar in participants with and without current mental health problems: In the former group, the average consistency was .795 and the average occasion specificity was .080; in the latter group, the average consistency was .759 and the average occasion specificity was .087.
² Some researchers prefer consistency and specificity coefficients that use the latent state variance in the denominator instead of the total variance of Y_it. Note that the ratio of trait variance to latent state variance can be computed from the numbers in Table 1 as Con(Y_it)/Rel(Y_it). Similarly, the ratio of situation-specific variance to latent state variance can be obtained as Spe(Y_it)/Rel(Y_it).
Table 1. Reliability, occasion specificity, consistency, and model fit for each of the trait facets in the full sample

Domain   Facet                           Rel(Y)  Spe(Y)  Con(Y)  χ²       CFI  TLI  RMSEA
NA       Separation insecurity           .88     .08     .80      81.57   .98  .99  .07
NA       Anxiousness                     .92     .09     .83      88.11   .98  .99  .07
NA       Emotional lability              .89     .07     .82      91.72   .98  .98  .07
NA       Submissiveness                  .80     .09     .71      60.10   .98  .99  .06
NA       Perseveration                   .84     .11     .73      68.80   .98  .99  .06
DET      Withdrawal                      .93     .05     .88     164.53   .97  .98  .11
DET      Intimacy avoidance              .90     .07     .83     116.12   .97  .98  .09
DET      Anhedonia                       .91     .09     .82      53.71   .99  .99  .05
NA/DET   Restricted affectivity          .86     .07     .79     107.53   .97  .98  .08
DET/NA   Depressivity                    .96     .08     .88     260.18   .95  .97  .14
DET/NA   Suspiciousness                  .81     .07     .74     140.55   .95  .97  .10
ANT      Manipulativeness                .77     .05     .72      77.23   .97  .98  .07
ANT      Deceitfulness                   .86     .08     .78      67.01   .98  .99  .06
ANT      Grandiosity                     .82     .07     .75      40.98   .99  .99  .04
ANT      Callousness                     .84     .10     .74     220.44   .92  .94  .12
ANT      Attention seeking               .88     .08     .80      61.20   .99  .99  .06
NA/ANT   Hostility                       .87     .07     .80      80.52   .98  .99  .07
DIS      Impulsivity                     .82     .08     .74      55.51   .99  .99  .05
DIS      Irresponsibility                .78     .06     .72      57.85   .98  .99  .05
DIS      Rigid perfectionism             .89     .07     .82      56.36   .99  .99  .05
DIS      Distractibility                 .91     .09     .82      79.67   .98  .99  .07
DIS      Risk taking                     .90     .06     .84      69.60   .99  .99  .06
PSY      Unusual beliefs & experiences   .86     .08     .79     149.11   .95  .97  .10
PSY      Perceptual dysregulation        .90     .07     .83     174.70   .95  .97  .11
PSY      Eccentricity                    .95     .08     .87     120.52   .98  .99  .09

Note. N = 611. NA = Negative Affectivity; DET = Detachment; ANT = Antagonism; DIS = Disinhibition; PSY = Psychoticism; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; RMSEA = Root Mean Square Error of Approximation. A minus sign (–) in the first column indicates that the facet is a negative marker of the domain. A slash (/) in the first column indicates that the facet is a marker of two domains. All models have 21 degrees of freedom, and the p values of the χ² tests of model fit were smaller than .01 for all models.
Detailed results on the reliability, consistency, and occasion specificity coefficients for specific trait facets in these two groups can be found in Table 2. Although the majority of trait facets showed an occasion specificity coefficient between .07 and .09, a few trait facets stood out from this general pattern (see Table 1). For example, the occasion specificity coefficient was smaller than .06 for the facets of "manipulativeness" and "withdrawal," and as large as .11 for the facet of "perseveration." The relatively small amount of occasion specificity for withdrawal is in line with previous findings suggesting that the measurement of extraversion (which is empirically the opposite pole of withdrawal; see, e.g., Zimmermann et al., 2014) is hardly influenced by current mood states (Schmukle & Egloff, 2005). In contrast, the relatively large amount of occasion specificity in perseveration might be due to a specific proneness of this trait facet to situational activation. For example, the costs of one's own attempts at getting something exactly right (see PID-5 item 51) might be more easily experienced after working hard for some time than after a relaxing holiday. However, such post hoc explanations regarding the differential influence of current mood states or recent experiences remain speculative and should be treated with caution. In sum, the results of our study support the assumption that the PID-5 predominantly captures individual
differences that are stable across time, at least when considering shorter time periods (e.g., several months). This finding ties in with previous studies showing that the rank-order stability of dimensional scores of personality disorders and maladaptive personality traits across 2 years is high (Lenzenweger, 1999; Morey et al., 2007; Samuel et al., 2011). It is also in line with the reasoning of the DSM-5 Work Group on Personality and Personality Disorder that incorporating a dimensional trait model may help address the discrepancy between the definition of personality disorders as enduring patterns and the empirical reality of considerable instability for categorical personality disorder diagnoses (Skodol, 2012). Our study makes an important contribution to the knowledge base as it is the first to demonstrate high stability for the DSM-5 trait facets. This is important because one of the main arguments against adopting the alternative personality disorder model for the official classification (i.e., in DSM-5 Section II) is that little research exists with regard to the exact set of variables contained in the model (Skodol, Morey, Bender, & Oldham, 2013). Thus, showing that DSM-5 trait facet scores are hardly influenced by situational factors has the potential to actually influence the diagnostic nomenclature (i.e., with accumulating evidence, the alternative personality disorder model may migrate into Section II of a planned DSM-5.1).
Table 2. Reliability, occasion specificity, consistency, and model fit for each of the trait facets in participants with and without mental health problems

                                          Healthy group           Distressed group        Model fit
Domain   Facet                           Rel(Y)  Spe(Y)  Con(Y)  Rel(Y)  Spe(Y)  Con(Y)  χ²       CFI  TLI   RMSEA
NA       Separation insecurity           .86     .08     .78     .87     .09     .78      94.45   .98  .99   .06
NA       Anxiousness                     .90     .10     .80     .92     .11     .81      93.40   .99  .99   .06
NA       Emotional lability              .87     .07     .80     .89     .09     .80     120.52   .97  .98   .08
NA       Submissiveness                  .79     .08     .70     .81     .10     .71      92.84   .98  .98   .06
NA       Perseveration                   .81     .13     .69     .85     .13     .72     121.41   .97  .98   .08
DET      Withdrawal                      .92     .07     .85     .94     .05     .89     209.57   .96  .97   .12
DET      Intimacy avoidance              .88     .11     .77     .90     .05     .86     130.28   .97  .98   .08
DET      Anhedonia                       .87     .11     .76     .92     .09     .82      70.62   .99  .99   .05
NA/DET   Restricted affectivity          .87     .07     .80     .85     .07     .78     152.33   .96  .97   .09
DET/NA   Depressivity                    .92     .12     .80     .96     .09     .87     308.29   .94  .96   .15
DET/NA   Suspiciousness                  .75     .07     .68     .83     .08     .75     179.82   .94  .96   .11
ANT      Manipulativeness                .76     .05     .70     .78     .05     .73      96.04   .97  .98   .07
ANT      Deceitfulness                   .84     .09     .76     .87     .09     .78      83.50   .98  .99   .06
ANT      Grandiosity                     .82     .06     .76     .83     .09     .74      58.08   .99  1.00  .04
ANT      Callousness                     .81     .11     .70     .87     .10     .76     323.81   .88  .91   .15
ANT      Attention seeking               .86     .09     .77     .90     .07     .83      77.85   .99  .99   .05
NA/ANT   Hostility                       .84     .08     .76     .87     .07     .80     116.19   .97  .98   .08
DIS      Impulsivity                     .78     .10     .68     .84     .08     .76      93.55   .98  .98   .06
DIS      Irresponsibility                .71     .06     .65     .79     .07     .72      72.31   .98  .99   .05
DIS      Rigid perfectionism             .90     .08     .82     .88     .07     .81      75.35   .99  .99   .05
DIS      Distractibility                 .88     .12     .76     .90     .09     .81     116.06   .98  .98   .08
DIS      Risk taking                     .88     .06     .82     .89     .05     .84      73.78   .99  .99   .05
PSY      Unusual beliefs & experiences   .84     .08     .76     .87     .08     .79     155.72   .96  .97   .10
PSY      Perceptual dysregulation        .85     .10     .75     .90     .07     .83     178.15   .95  .97   .10
PSY      Eccentricity                    .94     .10     .84     .94     .08     .87     168.34   .97  .98   .10

Note. N = 593 (317 in the healthy group and 276 in the distressed group). NA = Negative Affectivity; DET = Detachment; ANT = Antagonism; DIS = Disinhibition; PSY = Psychoticism; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; RMSEA = Root Mean Square Error of Approximation. Healthy versus distressed group membership was defined by the presence of at least one diagnosis based on the Patient Health Questionnaire at time 1. A minus sign (–) in the first column indicates that the facet is a negative marker of the domain. A slash (/) in the first column indicates that the facet is a marker of two domains. All models have 42 degrees of freedom, and the p values of the χ² tests of model fit were smaller than .01 except for grandiosity (p = .05).
This study has two main limitations: First, the sample consisted predominantly of self-selected, female psychology students. Thus, the generalizability of our findings is limited. However, our findings were virtually identical for participants with and without mental health problems, suggesting that they may generalize to clinical samples. In fact, the stability of DSM-5 trait facets seems to be rather independent of more acute mental health problems such as those that were formerly diagnosed on Axis I. Second, we only used a self-report form of the PID-5, so the results pertain mainly to the stability of people’s self-images regarding their own maladaptive personality traits. Given the considerable intervals between assessments that were used in the present study, it is unlikely that this stability was inflated by factors such as memory effects. However, the extent to which the results of the present study would generalize across different sources of information (e.g., informant reports) remains unclear for now and should be the subject of further research (Markon, Quilty, Bagby, & Krueger, 2013). Nevertheless, the findings of the present study are still highly relevant, given that most research in the domain
of personality pathology is indeed carried out using self-report measures such as the PID-5 (Krueger & Markon, 2014; Widiger & Samuel, 2005). Despite these limitations, our study provides strong evidence for the temporal stability of DSM-5 trait facets. Note, however, that this does not imply that maladaptive personality traits cannot change in the long run, or that personality pathology is a life sentence. For example, there might be trait changes when clinical samples are followed over longer periods of time (Morey & Hopwood, 2013), or when the effects of evidence-based treatments for personality disorders are considered (Budge et al., 2013). In fact, clinical interventions for personality disorders typically aim at trait changes rather than at mere short-term state changes. This has been one of the key arguments that led to a revision of LST theory (Steyer et al., 2015) to more accurately reflect the intuitive idea that persons' traits can change over time. Steyer et al. (2015) describe various models that allow for examining mean trait level changes, interindividual differences in trait changes, and long-term effects of interventions on latent traits in randomized experiments and
observational studies. Applying these models in future studies with a larger number of assessments over a longer period of time could provide a more detailed picture of stability and change in maladaptive personality traits.
References

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders: DSM-5 (5th ed.). Arlington, VA: American Psychiatric Association.
Bailer, J., Schwarz, D., Witthöft, M., Stübinger, C., & Rist, F. (2008). Prävalenz psychischer Syndrome bei Studierenden einer deutschen Universität [Prevalence of mental disorders among college students at a German university]. Psychotherapie, Psychosomatik, medizinische Psychologie, 58, 423–429. doi: 10.1055/s-2007-986293
Brown, T. A. (2007). Temporal course and structural relationships among dimensions of temperament and DSM-IV anxiety and mood disorder constructs. Journal of Abnormal Psychology, 116, 313–328. doi: 10.1037/0021-843X.116.2.313
Budge, S. L., Moore, J. T., Del Re, A., Baardseth, T. P., Wampold, B. E., & Nienhuis, J. B. (2013). The effectiveness of evidence-based treatments for personality disorders when comparing treatment-as-usual and bona fide treatments. Clinical Psychology Review, 33, 1057–1066. doi: 10.1016/j.cpr.2013.08.003
Clark, L. A., Vittengl, J., Kraft, D., & Jarrett, R. B. (2003). Separate personality traits from states to predict depression. Journal of Personality Disorders, 17, 152–172. doi: 10.1521/pedi.17.2.152.23990
Deinzer, R., Steyer, R., Eid, M., Notz, P., Schwenkmezger, P., Ostendorf, F., & Neubauer, A. (1995). Situational effects in trait assessment: The FPI, NEOFFI, and EPI questionnaires. European Journal of Personality, 9, 1–23. doi: 10.1002/per.2410090102
Eid, M., & Hoffmann, L. (1998). Measuring variability and change with an item response model for polytomous variables. Journal of Educational and Behavioral Statistics, 23, 193–215.
Ferguson, C. J. (2010). A meta-analysis of normal and disordered personality across the life span. Journal of Personality and Social Psychology, 98, 659–667. doi: 10.1037/a0018770
Gräfe, K., Zipfel, S., Herzog, W., & Löwe, B. (2004). Screening psychischer Störungen mit dem "Gesundheitsfragebogen für Patienten (PHQ-D)" [Screening for psychiatric disorders with the Patient Health Questionnaire (PHQ). Results from the German validation study]. Diagnostica, 50, 171–181. doi: 10.1026/0012-1924.50.4.171
Kenny, D. A., & Zautra, A. (1995). The trait-state-error model for multiwave data. Journal of Consulting and Clinical Psychology, 63, 52–59. doi: 10.1037/0022-006X.63.1.52
Krueger, R. F., Derringer, J., Markon, K. E., Watson, D., & Skodol, A. E. (2012). Initial construction of a maladaptive personality trait model and inventory for DSM-5. Psychological Medicine, 42, 1879–1890. doi: 10.1017/S0033291711002674
Krueger, R. F., & Markon, K. E. (2014). The role of the DSM-5 personality trait model in moving toward a quantitative and empirically based approach to classifying personality and psychopathology. Annual Review of Clinical Psychology, 10, 477–501. doi: 10.1146/annurev-clinpsy-032813-153732
Lenzenweger, M. F. (1999). Stability and change in personality disorder features: The Longitudinal Study of Personality Disorders. Archives of General Psychiatry, 56, 1009–1015. doi: 10.1001/archpsyc.56.11.1009
Little, T. D., Rhemtulla, M., Gibson, K., & Schoemann, A. M. (2013). Why the items versus parcels controversy needn't be one. Psychological Methods, 18, 285–300. doi: 10.1037/a0033266
Markon, K. E., Quilty, L. C., Bagby, R. M., & Krueger, R. F. (2013). The development and psychometric properties of an informant-report form of the Personality Inventory for DSM-5 (PID-5). Assessment, 20, 370–383. doi: 10.1177/1073191113486513
Morey, L. C., & Hopwood, C. J. (2013). Stability and change in personality disorders. Annual Review of Clinical Psychology, 9, 499–528. doi: 10.1146/annurev-clinpsy-050212-185637
Morey, L. C., Hopwood, C. J., Gunderson, J. G., Skodol, A. E., Shea, M. T., Yen, S., ... McGlashan, T. H. (2007). Comparison of alternative models for personality disorders. Psychological Medicine, 37, 983–994. doi: 10.1017/S0033291706009482
Naragon-Gainey, K., Gallagher, M. W., & Brown, T. A. (2013). Stable "trait" variance of temperament as a predictor of the temporal course of depression and social phobia. Journal of Abnormal Psychology, 122, 611–623. doi: 10.1037/a0032997
R Core Team. (2013). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.R-project.org
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48, 1–36.
Samuel, D. B., Hopwood, C. J., Ansell, E. B., Morey, L. C., Sanislow, C. A., Markowitz, J. C., ... Grilo, C. M. (2011). Comparing the temporal stability of self-report and interview assessed personality disorder. Journal of Abnormal Psychology, 120, 670–680. doi: 10.1037/a0022647
Schmukle, S. C., & Egloff, B. (2005). A latent state-trait analysis of implicit and explicit personality measures. European Journal of Psychological Assessment, 21, 100–107. doi: 10.1027/1015-5759.21.2.100
Schmukle, S. C., Egloff, B., & Burns, L. R. (2002). The relationship between positive and negative affect in the Positive and Negative Affect Schedule. Journal of Research in Personality, 36, 463–475. doi: 10.1016/S0092-6566(02)00007-7
Skodol, A. E. (2012). Personality disorders in DSM-5. Annual Review of Clinical Psychology, 8, 317–344. doi: 10.1146/annurev-clinpsy-032511-143131
Skodol, A. E., Morey, L. C., Bender, D. S., & Oldham, J. M. (2013). The ironic fate of the personality disorders in DSM-5. Personality Disorders: Theory, Research, and Treatment, 4, 342–349. doi: 10.1037/per0000029
Spitzer, R. L., Kroenke, K., & Williams, J. B. (1999). Validation and utility of a self-report version of PRIME-MD: The PHQ primary care study. JAMA: The Journal of the American Medical Association, 282, 1737–1744.
Steyer, R., Ferring, D., & Schmitt, M. (1992). States and traits in psychological assessment. European Journal of Psychological Assessment, 8, 79–98.
Steyer, R., Geiser, C., & Fiege, C. (2012). Latent state-trait models. In H. Cooper (Ed.), APA handbook of research methods in psychology (pp. 291–308). Washington, DC: American Psychological Association.
Steyer, R., Mayer, A., Geiser, C., & Cole, D. A. (2015). A theory of states and traits: Revised. Annual Review of Clinical Psychology, 11, 71–98. doi: 10.1146/annurev-clinpsy-032813-153719
Steyer, R., Schmitt, M., & Eid, M. (1999). Latent state-trait theory and research in personality and individual differences. European Journal of Personality, 13, 389–408. doi: 10.1002/(SICI)1099-0984(199909/10)13:5<389::AID-PER361>3.0.CO;2-A
Steyer, R., Schwenkmezger, P., & Auer, A. (1990). The emotional and cognitive components of trait anxiety: A latent state-trait model. Personality and Individual Differences, 11, 125–134. doi: 10.1016/0191-8869(90)90004-B
Tyrer, P., Crawford, M., Mulder, R. T., Blashfield, R. K., Farnam, A., Fossati, A., ... Reed, G. M. (2011). The rationale for the reclassification of personality disorder in the 11th revision of the International Classification of Diseases (ICD-11). Personality and Mental Health, 5, 246–259. doi: 10.1002/pmh.190
Watson, D., Stasik, S. M., Ro, E., & Clark, L. A. (2013). Integrating normal and pathological personality: Relating the DSM-5 trait-dimensional model to general traits of personality. Assessment, 20, 312–326. doi: 10.1177/1073191113485810
Widiger, T. A., & Samuel, D. B. (2005). Evidence-based assessment of personality disorders. Psychological Assessment, 17, 278–287. doi: 10.1037/1040-3590.17.3.278
Zimmermann, J., Altenstein, D., Krieger, T., Grosse Holtforth, M., Pretsch, J., Alexopoulos, J., ... Leising, D. (2014). The structure and correlates of self-reported DSM-5 maladaptive personality traits: Findings from two German-speaking samples. Journal of Personality Disorders, 28, 518–540. doi: 10.1521/pedi_2014_28_130
Date of acceptance: January 29, 2015
Published online: June 30, 2015

Johannes Zimmermann
Department of Psychology
University of Kassel
34127 Kassel
Germany
Tel. +49 561 804-3331
E-mail johannes.zimmermann@uni-kassel.de
Multistudy Report
Assessing Perceived Ability to Cope With Trauma: A Multigroup Validity Study of a 7-Item Coping Self-Efficacy Scale

Mark W. G. Bosmans¹, Ivan H. Komproe², Nancy E. van Loey³,⁴, Leontien M. van der Knaap¹, Charles C. Benight⁵, and Peter G. van der Velden¹,⁶

¹ INTERVICT, Tilburg University, The Netherlands
² Department of Research and Development, HealthNet TPO, Amsterdam, The Netherlands
³ Department of Psychosocial and Behavioural Research, Association of Dutch Burns Centres, Beverwijk, The Netherlands
⁴ Department of Clinical and Health Psychology, Utrecht University, Utrecht, The Netherlands
⁵ Department of Psychology, University of Colorado, Colorado Springs, CO, USA
⁶ Institute for Psychotrauma, Diemen, The Netherlands

Abstract. The aim of the present study was to examine the construct validity of the trauma-related coping self-efficacy (CSE) scale. While assessing the psychometric properties of this 20-item scale in four different samples (514 victims of a disaster, 1,325 bereaved individuals, 512 victims of acute critical incidents, and 169 severe burn victims), we found no measurement equivalence across groups. A shortened version was therefore composed, using only those items that were applicable to all types of potentially traumatic events (PTEs). In contrast to the CSE-20, the CSE-7 has a robust factor structure: factor structure and factor loadings were similar across the study samples, indicating that it measured the same construct across different PTEs. These results offer strong support for the cross-event construct validity of the CSE-7. Associations of the CSE-7 with posttraumatic stress symptoms showed the same pattern as those of the CSE-20, indicating that the reduction in items did not diminish the scale's power to predict posttraumatic stress.

Keywords: coping self-efficacy, posttraumatic stress, construct validity, psychometric properties, measurement equivalence
Coping self-efficacy (CSE) is the "perceived capability to manage one's personal functioning and the myriad environmental demands of the aftermath occasioned by a traumatic event" (Benight & Bandura, 2004, p. 1130), a cognition strongly related to psychosocial functioning after exposure to potentially traumatic events (PTEs). CSE is a perception of being able to deal with all the consequences of a PTE and to resume one's normal life. More specifically, trauma-related CSE comprises perceptions of being able to deal with reminders of the event, being able to deal with any negative emotions associated with the event, being able to employ active coping strategies, and being able to resume normal functioning (Benight & Bandura, 2004). Coping is to be interpreted in a broad sense, encompassing both behaviors and cognitions employed in an effort to deal effectively with the external (practical) and/or internal (emotional) demands of the event. The assessment of available coping options depends not only on which responses would be effective, but also on the individual's believed ability to employ
these responses successfully (Lazarus & Folkman, 1984). Stress is the result of a mismatch between the demands posed by a personally threatening event or its consequences and the coping options available to deal with them (Bandura, 1997; Lazarus & Folkman, 1984). Furthermore, CSE may affect the long-term stressfulness of a PTE by affecting the motivation to employ and sustain effective coping efforts; low levels of CSE are associated with avoidant coping strategies (Benight, Ironson, et al., 1999). Finally, CSE may affect how existing (initial) stress reactions are perceived: The belief that one can relieve unpleasant emotional states, whatever their source, makes them less aversive (Kent, 1987; Kent & Gibbons, 1987). To sum up, CSE reflects the perceived level of ability to deal effectively with the event and its consequences, and determines the appraisal of both the event itself and its consequences. Numerous studies have shown that higher levels of trauma-related CSE perceptions are consistently associated with lower posttraumatic stress symptoms, both concurrently
and over time, across a wide range of traumatic experiences, including natural disasters, terrorist attacks, war combat, and motor vehicle accidents (Benight, Cieslak, Molton, & Johnson, 2008; Benight, Freyaldenhoven, Hughes, Ruiz, & Zoschke, 2000; Benight, Ironson, et al., 1999; Luszczynska, Benight, & Cieslak, 2009). Because of the important role of CSE in posttraumatic recovery, a valid and reliable instrument to measure CSE across a wide range of PTEs would be a very useful tool for screening victims exposed to PTEs for risk of long-term psychopathology.

Benight and colleagues (Benight, Ironson, & Durham, 1999; Benight et al., 2000, 2008) developed several scales to measure CSE after specific PTEs, such as hurricane CSE (Benight, Ironson, & Durham, 1999) and domestic violence CSE (Benight, Harding-Taylor, Midboe, & Durham, 2004). For the Hurricane CSE measure, internal consistency was high (alphas in two hurricane samples were above .90). Exploratory factor analysis showed that the measure was composed of a single factor in two separate samples of hurricane survivors: one composed of survivors of Hurricane Andrew (1992, N = 165) and the other of survivors of Hurricane Opal (1995, N = 63). For the two samples, the explained variance of the one-factor structure was 52% and 60%, respectively, with factor loadings of the separate items all higher than .60. Factor analysis in this study was limited to exploratory factor analysis (with principal component analysis); no confirmatory factor analysis (i.e., testing the equivalence of the factor structure across different groups of victims) was performed to confirm the resemblance with the factor structure found in the original exploratory factor analysis. The Domestic Violence CSE measure provided similar results: a high internal consistency of .97 and a single factor, found in an exploratory factor analysis (principal component analysis), explaining 56% of the variance. In this study, a single sample of domestic violence victims (N = 283) was assessed. Again, no confirmatory factor analyses were performed.

Whereas these previous measures were specific to one particular type of event, more recently a more general trauma-related CSE scale was created to measure CSE relevant for victims of various PTEs (Benight, unpublished raw data, November 28, 2012). The existence of a general trauma-related CSE measure allows (mental) health care workers and policymakers to quickly screen an exposed population. Because large-scale (and small-scale) catastrophic events generally happen without warning and can have a variety of consequences that are difficult to predict, often there is no predesigned specific CSE measure that takes into account the variation in coping demands that is specific to each event. An additional benefit of a general trauma-related CSE measure is that its scores, as well as its effect on posttrauma recovery, can be compared across events. Because the impact and coping demands can vary widely across PTEs, a general trauma-related CSE measure needs to focus on a more essential, generic capacity of being able to deal with the demands of the aftermath of a PTE and to resume daily life and normal functioning. Specific tasks like being able to clear debris, or being able to overcome specific feelings like anger or
shame will not be relevant for all victims of PTEs (e.g., Amstadter & Vernon, 2008). In order to develop such a general measure, the items with the strongest correlations across a number of studies testing CSE and traumatic stress adaptation were identified by Benight and colleagues (cf. Benight & Bandura, 2004; Luszczynska et al., 2009). After this process, 20 items were included (CSE-20). Internal reliability estimates across four initial samples were all above .93. Exploratory factor analyses demonstrated a single factor solution within each sample, explaining a range of variance from 50% to 63% (Benight, unpublished raw data, November 28, 2012). To date, the validity of the CSE-20 scale across different types of PTEs has not been tested.
The Present Study

The original aim of the present study was to examine the psychometric properties and construct validity of the trauma-related CSE-20 among four samples of victims of PTEs. We assessed whether the concept of CSE has cross-event construct validity by examining whether there was measurement equivalence across victims of different types of PTEs. This was tested among very different groups with respect to the type of PTE experienced and the time that had passed since the event. The determination of measurement equivalence among these samples is a very strong argument for the wide applicability of the measure. Initial findings showed poor measurement equivalence of the CSE-20 across samples, indicating that the CSE-20 measures (slightly) different concepts after different PTEs (factor loadings across samples were not equal) and that the scale did not measure general trauma-related CSE. Therefore, we screened the existing measure for items that might not apply across all types of PTEs and all victims, followed by a selection of items that are relevant across all events and victims while still capturing the full scope of trauma-related CSE. We performed the same analyses on this abbreviated version of the trauma-related CSE scale. Finally, we compared the convergent validity of the two scales by examining their associations with posttraumatic stress symptoms.
Materials and Methods

Participants

Respondents in the four samples of the present study were confronted with PTEs (see below for details). The PTEs were events in which respondents – in accordance with the criteria that are more or less shared across the varying and changing definitions of PTSD in DSM-III, DSM-III-R, DSM-IV, and DSM-5 – were confronted with death, threatened death, or threatened serious injury (direct exposure or witnessing). However, we did not assess the presence of PTSD as a diagnosis but rather PTSD symptom severity.
Sample 1: Victims of a Fireworks Disaster
The first sample consisted of 514 adult residents affected by the Enschede fireworks disaster (response rate 54.8%). The disaster (May 13th, 2000) was caused by a massive explosion in a fireworks storage facility that destroyed a residential area. Survivors were assessed 2–3 weeks (T1), 18 months (T2), 4 years (T3), and 10 years (T4) postdisaster. CSE was assessed at T4 (2010). The study was approved by a medical ethical committee. All participants signed an informed consent form. Questionnaires were filled out on printed forms and online.
Sample 2: Victims of Various Acute Potentially Traumatic Events in the Past 2 Years

The second sample was drawn from the longitudinal LISS panel (Longitudinal Internet Studies for the Social sciences), a nationally representative sample of the Dutch population (for a full description of the panel and the procedure, see Scherpenzeel, 2011). Informed consent was obtained during recruitment for the panel. The total sample of the current survey on trauma (April, 2012) comprised 5,879 respondents (out of 7,495 panel members who were approached; response = 78.4%). Panel members received and filled out the questionnaires online. In the case of more than one PTE, respondents were asked to focus on the most severe event. The research was approved by the MESS (Measurement and Experimentation in the Social Sciences) board of overseers. Respondents of the LISS panel who had been confronted with acute PTEs in the past 2 years were administered the CSE scale (N = 512). They had been confronted with events such as intentional violence/threats, accidents, fires, and property crimes.
Sample 3: People Who Lost a Significant Other in the Past 2 Years
The third sample was also derived from the trauma survey within the LISS panel. This sample is composed of people who lost a significant other in the past 2 years (N = 1,325). There is no overlap between this sample and sample 2; all samples are mutually exclusive.
Sample 4: Burn Victims
The last sample consisted of 169 burn victims (response = 69.4%) who were admitted to one of three burn centers in the Netherlands or one burn center in Belgium between April 2010 and September 2012. The study was approved by an ethical committee. Patients were invited to participate in the study by a local researcher. After providing written informed consent, patients received printed questionnaires 2–4 weeks after the event.
Measures

The 20-item trauma-related CSE scale developed by Benight (Benight, Ironson, & Durham, 1999; Benight et al., 2000, 2008) was administered to assess CSE in all samples. For each item, respondents rated their perceived self-efficacy for dealing with different consequences of the PTE they experienced (e.g., resuming a normal life; dealing with frightening images or dreams about the event; being optimistic since the event) on a 7-point Likert scale (see Table 1). The scores range from 1 (= I am completely incapable of) to 7 (= I am perfectly capable of). The original English version of the CSE scale was translated by a professional translation agency (Overtaal Language Services, Division of Transperfect Global Headquarters, London). Translation was performed in multiple rounds of forward and backward translations, using different Dutch and English native speakers, to ensure that the meaning of the translated items was exactly the same as in the original version.

Table 1. Items in the 20-item CSE scale

No.  Item wording
1    Dealing with my emotions (anger, sadness, depression, fear) since the traumatic experience.
2*   Dealing with the impact that the traumatic experience has had on my life.
3*   Carrying on with my everyday life.
4*   Talking about the traumatic experience.
5    Accepting what happened.
6    Finding meaning in what happened.
7    Controlling frightening thoughts about the traumatic experience.
8    Keeping persistent frightening images of the event under control.
9    Making sure I don't have an emotional breakdown.
10   Dealing with thoughts about my own vulnerability.
11*  Dealing with frightening images or dreams about the traumatic experience.
12   Not blaming myself in any way about what happened.
13*  Being optimistic since the traumatic experience.
14   Supporting other people since the traumatic experience.
15   Keeping under control any thoughts that the traumatic experience will happen to me again.
16   Resisting any thoughts that I can no longer cope.
17*  Seeking help from other people because of what happened.
18   Not taking out my anger on other people.
19*  Being emotionally strong.
20   Not criticizing myself because of how I'm dealing with this.

Note. Items marked with an asterisk (*) are the items retained in the CSE-7.

The 15-item Impact of Event Scale (IES; Horowitz, Wilner, & Alvarez, 1979) was used to assess event-related posttraumatic stress symptoms in all samples. The construct validity and reliability of the Dutch version have proven to be acceptable across different traumatic experiences (Van der Ploeg, Mooren, Kleber, van der Velden, & Brom, 2004). For samples 2, 3, and 4 of this study, seven additional items from the IES-R (Weiss & Marmar, 1997) were included, measuring the hyperarousal symptoms of the third symptom cluster of posttraumatic stress disorder (APA, 2000); however, the original scoring system of the IES was retained (cf. Pfefferbaum et al., 2001, 2003). We call this expanded version the IESplus. We did not administer the total IES-R, because the items as well as the scoring system of the intrusions and avoidance subscales were revised and changed, precluding comparisons with previous research on trauma conducted with the IES.

Analysis

Analyses were conducted in four subsequent steps, in addition to data screening. Data screening and exploratory factor analyses were conducted using SPSS 19 (IBM, 2010). We used AMOS 19 (Arbuckle, 1997) for multi-sample confirmatory factor analyses (Jöreskog & Sörbom, 1993). Maximum likelihood estimation was used to estimate the models.

Data Screening

Within all samples, the CSE-20 scale showed similar and very high internal consistency (Cronbach's alphas in all samples were .97). The individual items of the CSE scale showed extreme negative skewness, violating the assumptions of normality. In order to deal with this violation of the assumptions for conducting factor analyses, base-10 log transformations were applied to the CSE items.
After this transformation, the values of skewness (maximum 1.553) and kurtosis (maximum 1.494) were well below the values of univariate nonnormality that cause problems in CFA (Curran, West, & Finch, 1996; Muthen & Kaplan, 1985). We used the transformed data in the following analyses.
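The sketch below illustrates this screening step in R (our reconstruction; the authors used SPSS 19). The data frame `cse`, holding the raw 1–7 item scores, is hypothetical, and reflecting the scale before taking logs is our assumption, since a plain logarithm of negatively skewed scores would not reduce the skew.

```r
cse_t <- log10(8 - cse)   # reflect (8 - x maps 7 -> 1, 1 -> 7), then base-10 log

library(e1071)            # provides skewness() and kurtosis()
apply(cse_t, 2, skewness, na.rm = TRUE)  # commonly cited trouble thresholds:
apply(cse_t, 2, kurtosis, na.rm = TRUE)  # |skew| >= 2, kurtosis >= 7 (Curran et al., 1996)
```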
Step 1: Measurement Equivalence of the CSE-20

Exploratory factor analysis (principal axis factoring) was conducted to explore the factor structure of the CSE scale, using sample 1 as the reference sample. Sample 1 was chosen as the reference sample because it was relatively homogeneous: all respondents had been exposed to the same event, and measurement took place within the same time frame since the event. The KMO measure and Bartlett's test were examined to assess the suitability of the transformed data for factor analysis. Kaiser's criterion (Kaiser, 1960) and visual inspection of the scree plot were used to determine the number of factors to retain for further analyses. Stevens' (2002) guidelines for the strength of factor loadings with respect to sample size were followed: for a sample size of 50, a loading of 0.722 can be considered significant; for a sample of 100, factor loadings should be greater than 0.512; for 200, greater than 0.364; for 300, greater than 0.298; and for 600, greater than 0.21.
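In R, this exploratory step could be sketched with the psych package as follows (an equivalent reconstruction, not the authors' SPSS procedure), assuming `cse_t` holds the transformed reference-sample items from the screening step above.

```r
library(psych)
KMO(cse_t)                                   # Kaiser-Meyer-Olkin sampling adequacy
cortest.bartlett(cor(cse_t, use = "pairwise.complete.obs"), n = nrow(cse_t))
scree(cse_t)                                 # eigenvalues: Kaiser's criterion + scree plot
efa <- fa(cse_t, nfactors = 1, fm = "pa")    # principal axis factoring
print(efa$loadings, cutoff = .30)
```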
We tested the hypotheses about measurement equivalence across the four samples by comparing the absolute fit of the different factor models, with sample 1 (disaster) as the reference sample. An invariant pattern of factor structure and factor loadings across different samples is an indicator of construct validity (Devins et al., 1988). In order to test increasingly strict levels of measurement equivalence, the differences in absolute fit of consecutive hierarchical models were compared. When the difference in χ² values between two models is not significant, the hypothesis that the factor structure is similar is tenable (Jöreskog & Sörbom, 1993). In this study, our main interest was not in the absolute fit of the models, but in the fit of a model compared with other specified models; after all, we wanted to examine whether the instrument functions in the same way in diverse trauma-exposed samples. However, in order to ensure that the final model fits the data well, rather than merely fitting marginally better than poorly fitting comparison models, we also investigated the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), and the Root Mean Square Error of Approximation (RMSEA). Recommended cutoff values are above .90 for the CFI and TLI (Hu & Bentler, 1999) and an upper value of .10 for the 90% CI of the RMSEA (Browne & Cudeck, 1993). In order to establish that the 20 items load on a single factor in all samples (Model A), we performed exploratory factor analyses on all confirmatory samples. If a single factor structure was found, with all items' loadings higher than Stevens' (2002) minimum values, equality of Model A was assumed. The following hierarchical models were used in this study: (1) Model A, in which the number of factors and the pattern of the factor loadings of the individual items were similar across samples; (2) Model B, with the additional constraint of equal factor loadings across samples added to Model A; and (3) Model C, with the additional constraint of equal error variances added to Model B.
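The nested comparisons of Models A–C can be sketched as follows. The authors fit these models in AMOS 19, so the lavaan syntax below is an equivalent reconstruction; the stacked data set `cse_all`, its grouping variable `sample`, and the item names i01–i20 are hypothetical.

```r
library(lavaan)
one_factor <- 'CSE =~ i01 + i02 + i03 + i04 + i05 + i06 + i07 + i08 + i09 + i10 +
                      i11 + i12 + i13 + i14 + i15 + i16 + i17 + i18 + i19 + i20'

fitA <- cfa(one_factor, data = cse_all, group = "sample")  # same factor pattern
fitB <- cfa(one_factor, data = cse_all, group = "sample",
            group.equal = "loadings")                      # + equal factor loadings
fitC <- cfa(one_factor, data = cse_all, group = "sample",
            group.equal = c("loadings", "residuals"))      # + equal error variances

anova(fitA, fitB, fitC)   # chi-square difference tests between nested models
fitMeasures(fitB, c("cfi", "tli", "rmsea", "rmsea.ci.upper"))
```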
Step 2: Item Selection for the New Scale

Reliability analyses of the CSE-20 showed that removal of a number of items would not result in a significant decrease in the value of the reliability coefficient. No items were singled out as having a large individual impact on the Cronbach's alpha of the scale, indicating some redundancy. Furthermore, in all samples all item-rest correlations were above .59. These results gave us no leads as to which items to remove. We found no literature on how to deal with this redundancy problem other than highly data-driven selection techniques that capitalize on chance. Using a data-driven selection carries the added risk of forming a very narrow scale that does not measure the full width of the concept under study (Ziegler, 2014), and we found no well-proven analytic strategy for building a brief version of a questionnaire from a selected set of items of the original questionnaire. Therefore, we decided to reexamine the individual items of the scale and to select a limited number of items that were applicable to victims of all kinds of PTEs as far as possible. This content-driven selection of items was performed with the goal of retaining a scale that measures the broad spectrum of functioning among victims of PTEs (dealing with reminders of the event, dealing with emotions associated with the event, being able to employ active coping strategies, and being able to resume normal functioning). This content-based approach is the soundest from a theoretical perspective because of its focus on the content validity of the scale; furthermore, it does not differ from the normal procedure followed by researchers developing a new scale (Aiken, 1996; Devellis, 1991). Selection of the items to be retained in the short scale was based on the following three criteria: minimizing item overlap (in the case of overlap, the most generically applicable item was selected); the items had to be applicable to all victims of PTEs (regardless of the type of PTE and of the specific coping styles of the subjects); and the items had to cover the entire concept of trauma-related CSE. The selection process was carried out by MB and PV in two phases. In the first phase, the selection was done separately. In the second phase, the selections were compared and brought
together in one final selection. After selection of these items, we re-performed exploratory factor analysis on the reference sample (sample 1) in order to investigate whether the selected items still measure a single underlying construct.
Step 3: Measurement Equivalence of the 7-Item Scale

We reassessed the measurement equivalence of the 7-item scale across the four samples, with sample 1 (disaster) as the reference sample. The same procedure was used as in Step 1.
Step 4: Convergent Validity

Finally, we examined and compared the Pearson correlations of the CSE-20 scale and the brief CSE-7 with the IES total and IESplus subscales, using the Fisher r-to-z transformation (Silver & Dunlap, 1987).
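A minimal sketch of such a comparison in R is given below; all numeric inputs are placeholders, not the study's values. Because the CSE-20 and CSE-7 correlations with the IES come from the same respondents, a test for dependent correlations (e.g., psych::r.test) would be the appropriate tool; the plain Fisher r-to-z difference for independent samples is shown only for illustration.

```r
fisher_z <- function(r) atanh(r)   # 0.5 * log((1 + r) / (1 - r))

# difference between correlations from two independent samples
z_independent <- function(r1, n1, r2, n2) {
  (fisher_z(r1) - fisher_z(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
}
z_independent(-.60, 514, -.55, 512)   # placeholder values

# dependent case (same sample, correlations sharing the IES variable):
# r12 = cor(CSE-20, IES), r13 = cor(CSE-7, IES), r23 = cor(CSE-20, CSE-7)
library(psych)
r.test(n = 514, r12 = -.60, r13 = -.55, r23 = .95)
```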
Results

Demographic information on the four samples and their scores on the IES and IESplus are shown in Table 2. ANOVAs showed that the mean age of participants and the mean scores on the IESplus and its intrusion and avoidance subscales differed significantly between samples. The proportions of men and women were also not equal between samples. Because age and gender were not related to CSE-20 or CSE-7 scores, we did not correct for these differences. Finally, scores on the CSE-20 were significantly dissimilar between samples, while scores on the CSE-7 were not. In order to establish the robustness of the construct within our reference sample, we conducted additional EFAs and CFAs on randomly selected halves. The results showed full (scalar) measurement equivalence between the two halves of the reference sample for both scales (results not shown).

Table 2. Descriptives

                    Sample 1             Sample 2           Sample 3            Sample 4
                    disaster (N = 514)   acute (N = 512)    loss (N = 1,325)    burns (N = 169)
                    M        SD          M        SD        M        SD         M        SD
Age**               42.38    13.75       43.34    17.53     50.85    17.50      40.15     8.59
% male**            41.4     –           48       –         45.1     –          67.1     –
IES total**         11.07    15.64       15.60    17.79     18.44    15.41      16.01    15.81
IESplus             *        *           21.42    24.59     22.59    20.26      24.76    23.59
IES avoidance**     4.68     7.96        7.04     8.58      6.17     6.79       8.08     8.05
IES intrusion**     5.81     7.27        7.08     8.07      10.39    8.18       8.55     8.06
IES hyperarousal    *        *           5.82     7.60      4.15     6.01       5.54     6.26
CSE-20**            119.77   22.23       116.00   23.68     116.45   21.73      115.23   24.14
CSE-7               41.78    8.21        41.13    8.16      40.72    7.76       40.52    8.49

Notes. *Data not available. **Significant differences in mean scores (p < .05).
Step 1: Measurement Equivalence of the CSE-20

Both the KMO measure of sampling adequacy (0.96) and Bartlett's test, χ²(190) = 9,203.264, p < .001, indicated that there were distinct clusters of variables, signifying that the data should yield distinct and reliable factors. The internal consistency of the scale based on the total of 20 items (sample 1, disaster) was very high: Cronbach's α = .97. Results from the exploratory factor analysis of the original CSE-20 scale in the reference sample (sample 1) indicated a single factor structure, accounting for 64.06% of the total variance, with an eigenvalue of 13.16. All factor loadings were well above Stevens' (2002) recommended values (above .64). Model A was similar for all samples: a single factor was found in all confirmatory samples, explaining 59.6% to 66.4% of the variance. In these samples, all factor loadings were also above the recommended values. The results of the tests of equality of factor structure and the absolute fit measures for the models are summarized in Table 3. In each case, the factor structure of one confirmatory sample was compared with that of the reference sample (sample 1, disaster). The absolute fit indices for the unconstrained and constrained models indicated that the models did not fit the data very well; all p values of the χ² comparisons of the models were significant. For all models, the values of the CFI and the TLI fell short of the values recommended by Hu and Bentler (1999). The values of the RMSEA were acceptable, with the upper values of its 90% CI all below .10 (Browne & Cudeck, 1993). These results indicate that the factor structure was not robust across samples exposed to different types of PTEs, and that the constitution of the underlying concept of CSE as measured by the CSE-20 scale was not the same across samples.
Table 3. Tests of the equality of the factor structure of the 20-item CSE scale

Model     χ² total    df    Δχ²       Δdf   p       CFI    TLI    RMSEA (90% CI)

Sample 1 and Sample 2 (disaster and acute events)
Model A   2,495.698   340   –         –     –       .893   .867   .079 (.076–.082)
Model B   2,545.505   359   49.806    19    <.001   .891   .873   .077 (.074–.080)
Model C   2,636.920   379   91.415    20    <.001   .888   .875   .076 (.074–.079)

Sample 1 and Sample 3 (disaster and loss)
Model A   4,464.465   340   –         –     –       .879   .851   .081 (.079–.083)
Model B   4,504.809   359   40.344    19    .003    .878   .858   .079 (.077–.081)
Model C   4,600.167   379   95.358    20    <.001   .876   .863   .078 (.076–.080)

Sample 1 and Sample 4 (disaster and burns)
Model A   1,690.508   340   –         –     –       .890   .864   .076 (.073–.080)
Model B   1,725.492   359   34.984    19    .014    .888   .869   .075 (.071–.078)
Model C   1,909.772   379   184.280   20    <.001   .875   .861   .077 (.074–.080)
Step 2: Item Selection

Content-driven item selection resulted in a final selection of 7 items (see Table 1). Since both MB and PV independently selected the same items, no further selection process was needed.
Content Overlap

The following two item clusters showed considerable overlap: items 1, 9, and 19; and items 7, 8, and 11. Of the cluster of items 1, 9, and 19, item 19 is the most general and neutral: item 1 specifies which emotions are indicated, and item 9 refers to a consequence that may only be relevant for those struggling to deal with intense emotions, whereas item 19 specifies neither which emotions are indicated nor which effect they might have. Therefore, item 19 was retained. Of the cluster of items 7, 8, and 11, item 11 is the most general and neutral. Items 7 and 8 specify being able to control frightening thoughts and images, while item 11 only specifies being able to deal with them; this implies that even while one continues to involuntarily experience frightening images (such as intrusive posttraumatic stress symptoms), one is able to cope with them. Item 11 was therefore retained.
Non-Generic Items
In the next step we excluded items 5, 6, 10, 12, 14, 15, 16, 18, and 20. Items 5 and 6 can be useful in helping someone come to terms with a PTE. However, these items are related to a specific form of coping: Meaning-focused coping (Park & Folkman, 1997). While potentially an important coping mechanism, not all victims of a PTE will (need to) use these emotion-focused coping strategies (Frazier & Schauben, 1994). This can be a necessary component of coping for some, but not everyone may need to do this in order to cope. Furthermore, these items are influenced by the time elapsed since the event. A victim of a PTE is more likely to have given meaning to an event or accepted its occurrence in the longer term. Therefore, these items may not be relevant for all victims. Item 10 may also not be relevant for all victims. A PTE does not necessarily entail threat or harm to the individual him- or herself. Witnessing
Step 2: Item Selection

Non-Generic Items

In the next step we excluded items 5, 6, 10, 12, 14, 15, 16, 18, and 20. Items 5 and 6 can be useful in helping someone come to terms with a PTE, but they relate to a specific form of coping: meaning-focused coping (Park & Folkman, 1997). While potentially an important coping mechanism, not all victims of a PTE will need to use these emotion-focused coping strategies in order to cope (Frazier & Schauben, 1994). Furthermore, these items are influenced by the time elapsed since the event: a victim of a PTE is more likely to have given meaning to an event, or accepted its occurrence, in the longer term. Therefore, these items may not be relevant for all victims. Item 10 may likewise not be relevant for all victims, because a PTE does not necessarily entail threat or harm to the individual him- or herself. Witnessing harm to an infant, for instance, can be very distressing; so can experiencing the sudden loss of a loved one. Yet neither event will necessarily result in victims experiencing increased thoughts about their own vulnerability. Item 12 may also not be relevant for all: self-blame is not always an issue in the aftermath of a traumatic event, even among those who suffer from PTSD symptoms (Cox, Resnick, & Kilpatrick, 2014). Item 16 likewise assumes that someone has specific thoughts, namely the thought that one cannot cope. Item 14 was not included because offering support to others is not a necessary component of an individual's trauma-related CSE: while this externally oriented item might be an indicator of collective CSE, it is not one of individual CSE, and an individual might have been incapable of offering support to others even before the event. Item 15 was not included because not all events are equally likely to invoke a fear of recurrence: while a rape victim, for instance, might experience a high level of fear of the same thing happening again, this may be less relevant for someone who lost a spouse. Item 18 may only be relevant for those experiencing a high level of anger; while this emotion is certainly experienced by victims of PTEs, it is not always present, and even for those who do experience it, externalizing anger by taking it out on others is not the only way of expressing it. Item 20, finally, does not give straightforward information on CSE, because the item can be interpreted in several ways. On the one hand, people who do not cope well could score low on this item because they are aware that they are not handling the event well; on the other hand, someone who is objectively coping rather well, but who is very self-critical or perfectionistic, could still find fault with how he or she is handling the event. This item is therefore potentially influenced by personal characteristics other than CSE.

Content Overlap

The following two item clusters showed considerable overlap: items 1, 9, and 19; and items 7, 8, and 11. Of the cluster of items 1, 9, and 19, item 19 is the most general and neutral: item 1 specifies which emotions are indicated, and item 9 describes a consequence that may only be relevant for those struggling to deal with intense emotions, whereas item 19 specifies neither which emotions are indicated nor which effect they might have. Therefore, item 19 was retained. Of the cluster of items 7, 8, and 11, item 11 is the most general and neutral: items 7 and 8 specify being able to control frightening thoughts and images, while item 11 only specifies being able to deal with them. Item 11 thus implies that even while one continues to involuntarily experience frightening images (such as intrusive posttraumatic stress symptoms), one is able to cope with them.

Generic Items

The items that we retained were items 2, 3, 4, 11, 13, 17, and 19. These items cover the entire concept of trauma-related CSE: dealing with reminders of the event (items 4 and 11), dealing with emotions associated with the event (item 19), being able to employ active coping strategies (item 17), and being able to resume normal functioning (items 2, 3, and 13). Internal consistency for this 7-item scale was high in all four samples, though not as high as for the CSE-20 (sample 1: α = .93; sample 2: α = .93; sample 3: α = .92; sample 4: α = .90).

Content-driven item selection thus resulted in a final selection of 7 items (see Table 1). Since both MB and PV independently selected the same items, no further selection process was needed.
Step 3: Measurement Equivalence of the 7-Item Scale (CSE-7)

Exploratory factor analysis in the reference sample (sample 1, disaster) showed that the seven selected items load on a single factor, explaining 64.96% of the variance, with all item loadings above .71. Exploratory factor analyses in the confirmatory samples also each yielded a single factor, with explained variances ranging from 57.79% to 65.93%. In all samples, factor loadings were above .59. This confirms Model A in all samples. In Table 4, results of the multi-sample confirmatory factor analyses are shown. Measures of overall model fit were good for all models. Results also show that for the confirmatory samples the factor structure was the same as in the exploratory sample 1 (disaster), with no significant differences in factor structure and factor loadings (Model B). The additional constraint of equal error variances (Model C) did result in significantly worse fitting models, indicating that while the factor structure is robust across samples, differences in the unexplained variances of the items exist.
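The Model B and Model C comparisons reported in Tables 3 and 4 are likelihood-ratio (Δχ²) tests between nested multi-group models. Below is a minimal sketch of that test, using only the χ² and df figures from Table 4 (samples 1 and 2); it is an illustration of the test statistic, not the authors' software pipeline.

```python
from scipy.stats import chi2

def delta_chi2_test(chi2_nested: float, df_nested: int,
                    chi2_base: float, df_base: int):
    """Delta chi-square test comparing a more constrained (nested) model
    against a less constrained baseline model."""
    d_chi2 = chi2_nested - chi2_base
    d_df = df_nested - df_base
    return d_chi2, d_df, chi2.sf(d_chi2, d_df)   # sf(x, df) = 1 - CDF

# Table 4, samples 1 and 2: Model B (equal loadings) vs. Model A.
print(delta_chi2_test(217.133, 34, 208.252, 28))   # Δχ² = 8.881, Δdf = 6, p ≈ .18
# Model C (adds equal error variances) vs. Model B.
print(delta_chi2_test(251.102, 41, 217.133, 34))   # Δχ² = 33.969, Δdf = 7, p < .001
```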
Step 4: Convergent Validity
The correlations of the original 20-item version and the 7-item version of the CSE scale with the total scales and subscales of the IESplus are shown in Table 5. Tests of equality of correlation coefficients showed that these correlations did not differ significantly in strength between the original 20-item scale and the 7-item scale. Concurrent validity of the two scales is thus similar; both are equally associated with levels of posttraumatic stress symptoms.
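The equality tests above compare correlations that share a sample (the CSE-20 and CSE-7 are scored from the same respondents), which strictly calls for a dependent-correlations procedure such as Steiger's test; the building block in either case is Fisher's r-to-z transformation (cf. Silver & Dunlap, 1987). A minimal sketch, shown for the simpler independent-samples case and with hypothetical sample sizes:

```python
import math
from scipy.stats import norm

def fisher_z(r: float) -> float:
    """Fisher's r-to-z transformation."""
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_independent_correlations(r1: float, n1: int, r2: float, n2: int):
    """Two-sided z test for correlations from two independent samples.
    Correlations from the same sample need a dependent-correlations test."""
    z = (fisher_z(r1) - fisher_z(r2)) / math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return z, 2 * norm.sf(abs(z))

# The correlations echo Table 5 (sample 1, IES); the n values are hypothetical.
print(compare_independent_correlations(.636, 300, .657, 300))
```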
Table 5. CSE scores and correlations with PSS

                        M        SD      IES   IESplus  Intrusions  Avoidance  Arousal
Sample 1 (disaster)
  CSE 20-item**      119.77    22.23    .636      *        .615        .593       *
  CSE 7-item**        41.78     8.21    .657      *        .640        .616       *
Sample 2 (acute)
  CSE 20-item**      115.97    23.67    .505    .527       .476        .486     .514
  CSE 7-item**        41.13     8.16    .472    .494       .436        .459     .484
Sample 3 (loss)
  CSE 20-item**      116.34    21.71    .451    .497       .373        .438     .520
  CSE 7-item**        40.72     7.76    .458    .502       .378        .449     .517
Sample 4 (burns)
  CSE 20-item**      115.23    24.14    .600    .639       .592        .596     .619
  CSE 7-item**        40.52     8.49    .614    .659       .608        .610     .650

Notes. *Data not available. **Means and standard deviations based on untransformed data.
Discussion

In this study we examined the psychometric properties of two versions of a general trauma-related CSE scale using four samples exposed to varying types of PTEs. Results offer some support for both the 20-item CSE scale (CSE-20) and the 7-item scale (CSE-7). There were no great differences between the two versions in absolute model fit or in the associations with posttraumatic stress symptoms. However, the CSE-7 has the advantage of an identical factor structure across very different samples. This conceptual equivalence is necessary for the instrument to measure general trauma-related CSE across different samples. Since the goal of the scale is to measure event-related CSE for all those exposed to PTEs, revising the total scale by selecting the most essential and generic posttrauma items that apply to all victims, while still measuring the full spectrum of post-event functioning, was theoretically sound. Which events qualify as traumatic events is subject to debate (cf. Rosen & Lilienfeld, 2008), and different choices could have been made with regard to the type of event the trauma-related CSE scale is aimed at. In addition, because selection of items in the CSE-7 was based on their content rather than on statistical criteria, other researchers might have made a different selection.

The samples were very different regarding both the type of event experienced and the time elapsed since the event. There were also differences regarding the average severity of posttraumatic stress symptoms. The fact that the revised 7-item CSE scale (CSE-7) nevertheless demonstrated measurement equivalence across samples indicates that the underlying concept of general posttrauma CSE is constituted in a similar manner in the different samples. To ensure measurement equivalence among all samples, CFAs of the CSE-7 were repeated using the other samples as reference sample, with similar results (data not shown, but available on request). These results offer strong support for the cross-event construct validity of the CSE-7 scale.

Furthermore, the ultimate goal of the CSE scale is to predict posttraumatic stress reactions and recovery of victims of PTEs. Convergent validity of the CSE-7
was equal to that of the 20-item version; the correlations of the CSE-7 scale and the CSE-20 scale with (subscales of) the IESplus did not differ significantly. This indicates that the 7-item version has lost none of its associative power with posttraumatic stress symptoms. A 7-item scale that predicts these stress reactions as reliably as a 20-item scale is preferable, as it places less of a burden on the victims.

The fact that the additional constraints of Model C (equal error variances) led to significantly worse fitting models indicates that the factor structure is not fully identical across samples, but is also affected by contextual influences. This signifies that comparisons of absolute or mean scores on the CSE-7 scale across different types of events should be interpreted with caution. Nevertheless, results demonstrated that the construct as measured by the CSE-7 is very similar across the very diverse types of PTEs experienced by the different samples in this study, demonstrating the validity of the underlying concept of general trauma-related CSE across victims of PTEs in our study, in contrast to the CSE-20.

Some limitations and characteristics of the study need to be discussed. Because we only compared Dutch victims of PTEs, it is possible that, despite the use of forward and backward translations with English and Dutch native speakers, the meaning of the items differs between the Dutch and the American setting; there might be some cultural differences involved. Therefore, future research is needed to investigate the measurement equivalence of the scale across similarly exposed samples in different countries.

Examination of the correlations between the CSE scales and posttraumatic stress symptoms showed that, whereas the 7-item measure performs just as well as the 20-item version, there are considerable differences between the samples. For the disaster and burn victim samples, the correlations with PSS were high, all above .59. Perhaps not surprisingly, the disaster group (for whom the event took place 10 years earlier) had lower symptom levels than the other samples. Yet even 10 years post-disaster, CSE remains strongly associated with symptom levels, corroborating the role of CSE even in the long term. For samples 2
and 3 (acute events and loss), these associations were lower, although still considerable. It is beyond the scope of this study to speculate on the reasons behind this finding.

It is important to note that the main focus of the article was the measurement equivalence of the scale across different PTEs. We used the associations with PSS as a proxy indicator of convergent validity. Other specific post-event mental health problems were not measured in all samples, and were therefore not available for comparison. Further research is warranted to examine and compare the associations with other mental health problems among other more or less homogeneous samples confronted with PTEs, since the impact may differ between events.

An important strength of this study is that we compared the factor structure of the trauma-related CSE scale among groups of victims exposed to different kinds of PTEs. Additionally, there was considerable variation in the time that had elapsed between the event and filling out the questionnaire. For sample 1 (disaster), assessment took place 10 years post-disaster; for samples 2 (acute events) and 3 (loss), assessment took place up to two years post-event; and for sample 4 (burn victims), assessment took place between 2 and 4 weeks after the burn event. This diversity enabled us to determine the degree to which the measure is universally applicable to victims of PTEs, regardless of either the type of PTE or the amount of time elapsed since the event.

Although results from this study support the notion that the CSE-7 scale measures CSE common to those exposed to a wide range of PTEs, further confirmation of these results is necessary. Further validation of the CSE-7 in different groups of victims and among victims of PTEs outside of the Netherlands would offer additional support for the external validity of the measure. Additionally, the predictive influence of CSE perceptions – measured by this general trauma-related instrument – on psychological recovery from PTEs should be investigated. Finally, more research is needed to determine whether certain cutoff points on the scale can flag those at risk of severe posttraumatic stress or other mental health problems such as depression, generalized anxiety, and substance use and abuse.
Acknowledgments

We thank all respondents for their time and effort. This study is part of a large research project on coping self-efficacy, granted by the Victim Support Fund (Fonds Slachtofferhulp), The Netherlands (grant number 13.04.21).

References

Aiken, L. R. (1996). Rating scales and checklists. Canada: Wiley.
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: APA. doi: 10.1176/appi.books.9780890423349
Amstadter, A. B., & Vernon, L. L. (2008). Emotional reactions during and after trauma: A comparison of trauma types. Journal of Aggression, Maltreatment & Trauma, 16, 391–408. doi: 10.1080/10926770801926492
Arbuckle, J. (1997). Amos users' guide version 3.6. Chicago, IL: SPSS.
Bandura, A. (1997). Self-efficacy: The exercise of control. New York, NY: Freeman.
Benight, C. C., & Bandura, A. (2004). Social cognitive theory of posttraumatic recovery: The role of perceived self-efficacy. Behaviour Research and Therapy, 42, 1129–1148. doi: 10.1016/j.brat.2003.08.008
Benight, C. C., Cieslak, R., Molton, I. R., & Johnson, L. E. (2008). Self-evaluative appraisals of coping capability and posttraumatic distress following motor vehicle accidents. Journal of Consulting and Clinical Psychology, 76, 677–685. doi: 10.1037/0022-006x.76.4.677
Benight, C. C., Freyaldenhoven, R. W., Hughes, J., Ruiz, J. M., & Zoschke, T. A. (2000). Coping self-efficacy and psychological distress following the Oklahoma City bombing. Journal of Applied Social Psychology, 30, 1331–1344. doi: 10.1111/j.1559-1816.2000.tb02523.x
Benight, C. C., Harding-Taylor, A. S., Midboe, A. M., & Durham, R. L. (2004). Development and psychometric validation of a domestic violence coping self-efficacy measure. Journal of Traumatic Stress, 17, 505–508. doi: 10.1007/s10960-004-5799-3
Benight, C. C., Ironson, G., & Durham, R. L. (1999). Psychometric properties of a hurricane coping self-efficacy measure. Journal of Traumatic Stress, 12, 379–386. doi: 10.1023/A:1024792913301
Benight, C. C., Ironson, G., Klebe, K., Carver, C. S., Wynings, C., Burnett, K., . . . Schneiderman, N. (1999). Conservation of resources and coping self-efficacy predicting distress following a natural disaster: A causal model analysis where the environment meets the mind. Anxiety, Stress, and Coping, 12, 107–126. doi: 10.1080/10615809908248325
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Beverly Hills, CA: Sage.
Cox, K. S., Resnick, H. S., & Kilpatrick, D. G. (2014). Prevalence and correlates of posttrauma distorted beliefs: Evaluating DSM-5 PTSD expanded cognitive symptoms in a national sample. Journal of Traumatic Stress, 27, 299–306. doi: 10.1002/jts.21925
Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods, 1, 16–29.
DeVellis, R. F. (1991). Scale development: Theory and applications. Applied social research methods series (Vol. 26). Thousand Oaks, CA: Sage.
Devins, G. M., Orme, C. M., Costello, C. G., Binik, Y. M., Frizzell, B., Stam, H. J., & Pullin, W. M. (1988). Measuring depressive symptoms in illness populations: Psychometric properties of the Center for Epidemiological Studies Depression (CES-D) scale. Psychology and Health, 2, 139–156. doi: 10.1080/08870448808400349
Frazier, P. A., & Schauben, L. J. (1994). Causal attributions and recovery from rape and other stressful life events. Journal of Social and Clinical Psychology, 13, 1–14.
Horowitz, M., Wilner, N., & Alvarez, W. (1979). Impact of Event Scale: A measure of subjective stress. Psychosomatic Medicine, 41, 209–218.
Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55. doi: 10.1080/10705519909540118
IBM. (2010). IBM SPSS Statistics for Windows, version 19.0. Armonk, NY: IBM Corp.
Jöreskog, K. G., & Sörbom, D. (1993). Structural equation modeling with the SIMPLIS command language (Vol. 2, pp. 127–180). Boston, MA: Allyn & Bacon.
Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141–151. doi: 10.1177/001316446002000116
Kent, G. (1987). Self-efficacious control over reported physiological, cognitive, and behavioural symptoms of dental anxiety. Behaviour Research and Therapy, 25, 341–347.
Kent, G., & Gibbons, R. (1987). Self-efficacy and the control of anxious cognitions. Journal of Behavior Therapy & Experimental Psychiatry, 18, 33–40.
Lazarus, R. S., & Folkman, S. (1984). Stress, appraisal and coping. New York, NY: Springer.
Luszczynska, A., Benight, C. C., & Cieslak, R. (2009). Self-efficacy and health-related outcomes of collective trauma: A systematic review. European Psychologist, 14, 51–62. doi: 10.1027/1016-9040.14.1.51
Muthén, B., & Kaplan, D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38, 171–189.
Park, C. L., & Folkman, S. (1997). Meaning in the context of stress and coping. Review of General Psychology, 1, 115–144.
Pfefferbaum, B. S., Call, J. A., Lensgraf, S. J., Miller, P. D., Flynn, B. W., Doughty, D. E., . . . Dickson, W. L. (2001). Traumatic grief in a convenience sample of victims seeking support services after a terrorist incident. Annals of Clinical Psychiatry, 13, 19–24. doi: 10.1023/A:1009008614219
Pfefferbaum, B. S., Flynn, G. M., Kearns, B. W., Doughty, L. J., Gurwitch, D. E., Nixon, R. H., & Nawaz, S. J. (2003). Case finding and mental health services for children in the aftermath of the Oklahoma City bombing. The Journal of Behavioral Health Services & Research, 30, 215–227. doi: 10.1007/BF02289809
Rosen, G. M., & Lilienfeld, S. O. (2008). Posttraumatic stress disorder: An empirical evaluation of core assumptions. Clinical Psychology Review, 28, 837–868. doi: 10.1016/j.cpr.2007.12.002
Scherpenzeel, A. (2011). Start of the LISS panel: Sample and recruitment of a probability-based Internet panel. Retrieved from http://www.lissdata.nl/assets/uploaded/Sample_and_Recruitment.pdf
Silver, N. C., & Dunlap, W. P. (1987). Averaging correlation coefficients: Should Fisher's z transformation be used? Journal of Applied Psychology, 72, 146–148. doi: 10.1037/0021-9010.72.1.146
Stevens, J. E. (2002). Applied multivariate statistics for the social sciences (4th ed.). Mahwah, NJ: Erlbaum.
Van der Ploeg, E., Mooren, T. T., Kleber, R. J., van der Velden, P. G., & Brom, D. (2004). Construct validation of the Dutch version of the Impact of Event Scale. Psychological Assessment, 16, 16–26. doi: 10.1037/1040-3590.16.1.16
Weiss, D. S., & Marmar, C. R. (1997). The Impact of Event Scale-Revised. In J. P. Wilson & T. M. Keane (Eds.), Assessing psychological trauma and PTSD (pp. 399–411). New York, NY: Guilford Press.
Ziegler, M. (2014). Comments on item selection procedures. European Journal of Psychological Assessment, 30, 1–2. doi: 10.1027/1015-5759/a000196
Date of acceptance: January 13, 2015
Published online: June 26, 2015
Mark W. G. Bosmans
INTERVICT, Tilburg University
Warandelaan 2
5000 LE Tilburg
The Netherlands
Tel. +31 13 4663503
E-mail m.w.g.bosmans@tilburguniversity.edu
Multistudy Report
Incremental Validity of the Trait Emotional Intelligence Questionnaire-Adolescent Short Form (TEIQue-ASF)

Alex B. Siegling,1,2 Ashley K. Vesely,3 Donald H. Saklofske,3 Norah Frederickson,2 and K. V. Petrides1,2

1 London Psychometric Laboratory, University College London, UK; 2 Research Department of Clinical, Educational, and Health Psychology, University College London, UK; 3 Department of Psychology, University of Western Ontario, London, ON, Canada

Abstract. This study examined the incremental validity of the adolescent short form of the Trait Emotional Intelligence Questionnaire (TEIQue-ASF) in two European secondary-school samples. The TEIQue-ASF was administered as a predictor of socioemotional or academic achievement criteria, along with measures of coping strategies or cognitive ability, respectively. In Dutch high school students (N = 282), the TEIQue-ASF explained variance in all socioemotional criteria, controlling for coping strategies and demographics. In a sample of British preadolescents, the measure showed incremental contributions to academic achievement in the core areas (English, math, and science) of the English curriculum, controlling for cognitive ability subscales and gender (N = 357–491). Implications for the validity and applied utility of the TEIQue-ASF are discussed.

Keywords: Trait Emotional Intelligence Questionnaire, short form, incremental validity, adolescents, trait emotional self-efficacy
Research interest in the field of emotional intelligence (EI) has exploded in recent years, with scores of empirical studies and a growing number of meta-analyses on various topics (e.g., Joseph, Jin, Newman, & O'Boyle, 2014; Malouff, Schutte, & Thorsteinsson, 2014; Martins, Ramalho, & Morin, 2010; Perera & DiGiacomo, 2013). Trait emotional intelligence (trait EI) refers to a constellation of emotional self-perceptions located at the lower levels of personality hierarchies (Petrides, Pita, & Kokkinaki, 2007) and is assessed using typical-performance measures. The construct is distinct from ability EI, which seeks to integrate emotion-related abilities and should be assessed using maximum-performance measures (Petrides & Furnham, 2001). The weak associations between typical- and maximum-performance EI measures illustrate this distinction (e.g., Derksen, Kramer, & Katzko, 2002; Ferrando et al., 2010; Petrides, Frederickson, & Furnham, 2004; Warwick & Nettelbeck, 2004). Furthermore, trait EI provides an interpretive framework for the majority of EI measures, which assess typical performance, even though many of them were originally conceptualized as measuring emotion-related
abilities. The term "EI" has been retained, however, in order to relate the construct to the broader EI literature, from which it derives. Several trait EI measures have been developed (Siegling, Saklofske, & Petrides, 2014), and an impressive line of research has demonstrated their predictive and incremental validity. For example, a recent review of the literature found that the Trait Emotional Intelligence Questionnaire (Petrides, 2009) explained additional criterion variance over broad personality factors (i.e., Big Five or Giant Three) and other emotion-related constructs (e.g., alexithymia, social desirability, and exposure to stress) in 78% of the analyses (N > 100; Andrei, Siegling, Aloe, Baldaro, & Petrides, 2015). In contrast, relatively few trait EI measures have been developed specifically for children or adolescents. The Emotional Quotient Inventory: Youth Version (Bar-On & Parker, 2000) and the Trait Emotional Intelligence Questionnaire Adolescent and Child forms (Petrides, Sangareau, Furnham, & Frederickson, 2006) are the only two established measures, although another measure was developed recently (Billings, Downey, Lomas,
Lloyd, & Stough, 2014). Generally, these measures have been subject to considerably less validation research than their respective adult counterparts. Following a brief review of published studies, the present paper further examines the incremental validity of the Trait Emotional Intelligence Questionnaire-Adolescent Short Form (TEIQue-ASF) over other relevant predictors of socioemotional and educational criteria.
Criterion and Incremental Validity of the TEIQue-ASF

To date, the adolescent form of the TEIQue has been subject to relatively little psychometric research compared to its adult counterpart. Nonetheless, since it consists of similar items to the adult version, rephrased into age-appropriate language, construct validity can, to some extent, be extrapolated from evidence gathered with the adult version. Notably, the adult TEIQue was found to converge strongly with two similar self-report measures (r = .73 and .77; Gardner & Qualter, 2010). Most studies involving the adolescent form have used the 30-item TEIQue-ASF, which has shown good internal reliability in adolescents (α = .83; Mikolajczak, Petrides, & Hurry, 2009) and preadolescents (α = .84; Petrides et al., 2006).

The types of criteria a construct should explain are those that, in theory, are directly influenced by it. For example, proximate outcomes of trait EI are likely to have a pronounced emotional emphasis and revolve around how people manage everyday challenges or function in social situations (e.g., situational frustration, response to stress, or positive and negative affect). Moreover, the value of a construct (or a measure) is considerably enhanced if it can predict broader, long-lasting life outcomes and not just behaviors or mental states of a temporary, psychological nature. Examples of such outcomes are academic and career success, career selection, relationship and family stability, mental health, and even reproductive success. These broader outcomes are influenced, to various extents, by a multitude of psychological constructs, without necessarily being directly related to any of them.

However, predictive and criterion validity are not sufficient for measures of a relatively new construct, such as trait EI. Beyond the ability to explain or predict variance, it is essential for a construct and its measures to explain unique or incremental criterion variance not accounted for by conceptually related and established constructs. Cognate constructs of trait EI include higher-order personality factors and other narrower trait-like factors akin to trait EI (Petrides, Pérez-González, & Furnham, 2007). A prime example of a narrower set of related constructs is coping strategies. In general, coping refers to how people respond to stressful or negative situations and has implications for a range of psychological outcomes, predominantly mental health (Endler & Parker, 1994; Greenaway et al., 2015). Coping strategies are trait-like attributes that partly overlap with trait EI, both conceptually and empirically. For example, trait EI correlates positively
with adaptive and negatively with maladaptive coping strategies (Mavroveli, Petrides, Rieffe, & Bakker, 2007). Evidence also exists that trait EI maximizes the beneficial effects of the former while minimizing the adverse effects of the latter (Davis & Humphrey, 2012a). Moreover, some researchers have conceptualized coping strategies as proximate outcomes of trait EI and found them to statistically mediate its effects on maladaptive behavior (Davis & Humphrey, 2012a; Mikolajczak et al., 2009). Regardless of whether trait EI is an antecedent or an overlapping construct situated at the same ontological level, it should demonstrate incremental validity over conceptually and empirically related constructs, such as coping strategies.

There is general consensus, and good evidence, that trait EI is at most weakly related to cognitive ability (Derksen et al., 2002; Saklofske, Austin, & Minski, 2003; Van der Zee, Thijs, & Schakel, 2002). In fact, some research suggests that cognitive ability and trait EI interact in predicting academic performance, with trait EI showing stronger effects for students at the lower end of cognitive ability (Petrides et al., 2004). Still, to be considered useful, trait EI should explain incremental variance in directly relevant, emotion-laden criteria above cognitive ability. At the same time, any incremental contributions to outcomes primarily linked to cognitive ability would speak to the validity and value of trait EI and its measures. An example of such a criterion is academic achievement, which is a relatively broad and important outcome. The relationship between trait EI and academic achievement has been discussed elsewhere (Ferrando et al., 2010; Petrides et al., 2004).

Evidence for the incremental validity of the English TEIQue-ASF has been reported in three studies on British preadolescents and adolescents. In these samples, the measure accounted for variance in the following criteria: self-reported disruptive behavior and depression when controlling for demographics, the Big Five personality traits, and general cognitive ability (Davis & Humphrey, 2012b); four of five aspects of psychopathology after controlling for gender, an adult trait EI measure (Schutte et al.'s, 1998, Assessing Emotions Scale), and measures of emotional ability (emotion perception, emotion management, using emotions, and facial expression recognition; Williams, Daley, Burnside, & Hammond-Rowley, 2010); and four socioemotional variables (peer-rated social behavior and inclusion, and self-reported adjustment/psychopathology) over the baseline levels of these criteria and general cognitive ability (Frederickson, Petrides, & Simmonds, 2012). Translations of the TEIQue-ASF were assessed for incremental validity in two studies of preadolescents. In these studies, TEIQue-ASF scores explained unique variance in somatic complaints, controlling for depression in a Dutch sample (Mavroveli et al., 2007), and in teacher-rated academic achievement, controlling for cognitive ability, personality, and self-concept in a Spanish sample (Ferrando et al., 2010). Overall, few studies have used the TEIQue-ASF to predict (a) socioemotional criteria, especially operationalized in ways other than self-report while
controlling for relevant predictors, and (b) objectively assessed performance criteria.
Present Study

The present study further investigates the incremental validity of the TEIQue-ASF over and above competing constructs. First, it was examined whether the TEIQue-ASF accounts for unique variance in socioemotional criteria (depression, somatic complaints, and peer-rated social competence) when controlling for a broad set of trait-like predictors (i.e., seven coping strategies), by reanalyzing data presented by Mavroveli et al. (2007). To date, coping strategies have only been used as criteria of TEIQue-ASF scores. Operationalized as relatively stable traits, however, coping strategies qualify as a particularly relevant set of competing predictors beyond which the TEIQue-ASF should demonstrate incremental validity, given their theoretical and empirical relationships with trait EI and their implications for psychological outcomes. Thus, trait EI as well as a subset of coping strategies may be expected to explain variance in the three criteria investigated in this study. Second, using unpublished data on criteria assessed in Frederickson et al.'s (2012) sample, it was examined whether the TEIQue-ASF can explain unique variance in objective academic achievement criteria (end-of-year grade levels in three subjects) when controlling for cognitive ability. One advantage of using objective criteria in this study was the avoidance of common-method variance. In both samples, demographic data (gender and either age or school grade) were also held constant. In Sample 2, the analyses were conducted separately for Grades 7 and 8, since the criterion variables (grade levels) were grade-dependent but the same for all students, as described in the Measures section.
Method

Participants and Procedure

Sample 1 consisted of preadolescents and adolescents (N = 282; 48.2% female), recruited from four Dutch state high schools. It had a mean age of 13.7 years (SD = 0.7, range = 12.0–15.7) and was described as ethnically and socially diverse (Mavroveli et al., 2007). Data from students with special needs, identified by their teachers, had been excluded from the dataset by the researchers who conducted the original study. Since the exact same sample was used in the present study, there were no missing data. Measures were administered during class time.

Sample 2 comprised British preadolescents (46.8% female, age range = 11–13 years) from four secondary schools situated in South East England. The students were in Grades 7 or 8 and predominantly from White English (78.2%) or other White Western European (10.99%) backgrounds. A total of 1,140 students participated in the original study, but the number of students in the analyses
reported ranged from 476 to 491 for seventh graders and from 357 to 469 for eighth graders. By using pairwise deletion for dealing with missing data on some variables, the effective sample size varied from analysis to analysis. Further details about the two samples can be found in previous publications (Frederickson et al., 2012; Mavroveli et al., 2007).
Measures

The TEIQue-ASF comprises 30 items, taken in pairs from each of the 15 facets of the full form. The items are responded to on a 7-point Likert scale (1 = completely disagree, 7 = completely agree). A Dutch translation (Mavroveli et al., 2007) was administered to Sample 1, whereas the original English form was administered to Sample 2. The internal consistency (McDonald's omega) of the TEIQue-ASF scores was .85 in both samples.
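McDonald's omega, reported throughout this section, can be obtained from a one-factor solution as (Σλ)² / ((Σλ)² + Σθ), where λ are the standardized loadings and θ the error variances. A minimal sketch with hypothetical loadings, not the study's estimates:

```python
import numpy as np

def mcdonald_omega(loadings: np.ndarray, error_vars: np.ndarray) -> float:
    """McDonald's omega (total) from a one-factor solution."""
    s = loadings.sum()
    return s**2 / (s**2 + error_vars.sum())

# Hypothetical standardized loadings for the 30 TEIQue-ASF items;
# for standardized items, error variance = 1 - loading**2.
lam = np.full(30, 0.55)
print(round(mcdonald_omega(lam, 1 - lam**2), 2))
```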
Sample 1 Utrecht Coping List for Adolescents (Bijstra, Jackson, & Bosma, 1994) This measure consists of 47 items based on a 4-point Likert scale and assessing seven distinct coping strategies. The subscale names, numbers of items, and internal consistencies on this sample were as follows: confrontation (7 items, x = .82), palliative coping (8 items, x = .80), avoidant coping (8 items, x = .77), seeking social support (6 items, x = .88), depressive coping (7 items, x = .71), showing emotions (3 items, x = .78), and optimistic coping (5 items, x = .81). Three ‘‘spare items’’ are not used in any of the subscales. Children’s Depression Inventory (Kovacs, 1985; Timbremont & Braet, 2001) This Dutch scale consists of 28 items measuring cognitive and somatic symptoms of depression in children. Children answer the items on a 3-point Likert scale of increasing symptom severity. McDonald’s omega on this sample was .87. Somatic Complaints List (Rieffe, Meerum Terwogt, & Bosch, 2004) This is a 10-item Dutch measure of the pain frequency experienced by adolescents and children. Responses are indicated on a 3-point Likert scale. McDonald’s omega on this sample was .85. Guess Who Peer Assessment (Coie & Dodge, 1988; Parkhurst & Asher, 1992) Students were asked to identify classmates whose behavior reflects each of the following descriptors: cooperation, European Journal of Psychological Assessment 2017; Vol. 33(1):65–74
Sample 2

Cognitive Abilities Test (CAT; Lohman et al., 2001)
This test was administered to all participants at age 11, upon entering secondary education. Our focus was on the verbal, quantitative, and nonverbal subscales. The rationale for using subscales is that any one of them may not explain variance in a given criterion, thus weakening the composite's overall explanatory power while inflating that of TEIQue-ASF scores. Only total scale scores of the CAT were available and, thus, internal consistency could not be calculated. However, CAT scores are highly reliable in national samples (Strand, 2004).

National Curriculum Levels
Eight levels covering the ages 5–14 years describe pupils' progress at the end of the academic year, compared to their same-age peers across the country. Level 1 represents the progress of pupils at age five and Level 8 that of the most able pupils at age 14. Each level is divided into three sublevels: C ("has started to work at the level"), B ("working well within the level"), and A ("has reached the top of the level and is working towards the next level"). The levels in the core areas of the curriculum (English, math, and science) were used as criteria of academic achievement. For the analyses in this study, numerical point scores ranging from 1 (representing Level 1C) to 24 (representing Level 8A) were used.
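A sketch of the level-to-points conversion described above, assuming the sublevels are equally spaced; the endpoint anchors (1C maps to 1, 8A maps to 24) are from the text, while the intermediate mapping is an assumption.

```python
def level_to_points(level: int, sublevel: str) -> int:
    """Convert a National Curriculum level (1-8) and sublevel (C < B < A)
    to the 1-24 point scores used in the analyses."""
    offset = {"C": 0, "B": 1, "A": 2}[sublevel.upper()]
    return (level - 1) * 3 + offset + 1

assert level_to_points(1, "C") == 1    # anchor from the text
assert level_to_points(8, "A") == 24   # anchor from the text
print(level_to_points(5, "B"))         # e.g., Level 5B -> 14
```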
Results

Sample 1

Table 1. Sample 1: Descriptive statistics and intercorrelations between study variables
Variables: 1. Depression, 2. Somatic complaints, 3. Social competence, 4. Age, 5. Gender, 6. UCL-A confrontational, 7. UCL-A palliative, 8. UCL-A avoidant, 9. UCL-A social support, 10. UCL-A depressive, 11. UCL-A showing emotions, 12. UCL-A optimistic, 13. TEIQue-ASF. Columns: M, SD, skewness, kurtosis, and intercorrelations among variables 1–13. [Cell values were not reliably recoverable from the extraction.]
Notes. N = 282. UCL-A = Utrecht Coping List for adolescents (Bijstra et al., 1994); TEIQue-ASF = Trait Emotional Intelligence Questionnaire-Adolescent Short Form (Petrides, 2009). *p < .05. **p < .01. ***p < .001.
Table 2. Sample 1: Hierarchical regression analyses predicting socioemotional criteria with demographics (Step 1), UCL-A coping strategies (Step 2), and the TEIQue-ASF (Step 3)

Depression
  Step 1 (age and gender):       F(2, 279) = 13.23***, ΔR² = .087***, R²adj = .08
  Step 2 (UCL-A coping):         F(9, 272) = 16.40***, ΔR² = .265***, R²adj = .33
  Step 3 (TEIQue-ASF):           F(10, 271) = 19.24***, ΔR² = .063***, R²adj = .39
  Step 3 b weights: age .17***, gender .001**, confrontational .02, palliative .17**, avoidant .04, social support .14**, depressive .23***, showing emotions .08, optimistic .08, TEIQue-ASF .33***

Somatic complaints
  Step 1: F(2, 279) = 8.40**, ΔR² = .057***, R²adj = .05
  Step 2: F(9, 272) = 11.50***, ΔR² = .219***, R²adj = .25
  Step 3: F(10, 271) = 11.22***, ΔR² = .017*, R²adj = .27
  Step 3 b weights: age .05, gender .16**, confrontational .02, palliative .09, avoidant .01, social support .03, depressive .36***, showing emotions .06, optimistic .07, TEIQue-ASF .17*

Social competence
  Step 1: F(2, 279) = 13.97***, ΔR² = .091***, R²adj = .08
  Step 2: F(9, 272) = 8.12***, ΔR² = .121***, R²adj = .19
  Step 3: F(10, 271) = 8.12***, ΔR² = .019*, R²adj = .20
  Step 3 b weights: age .12*, gender .20***, confrontational .16*, palliative .15*, avoidant .03, social support .09, depressive .09, showing emotions .24***, optimistic .20**, TEIQue-ASF .18*

Collinearity statistics (identical across the three criteria), in predictor order (age, gender, confrontational, palliative, avoidant, social support, depressive, showing emotions, optimistic, TEIQue-ASF): tolerance .87, .92, .65, .59, .79, .84, .74, .88, .56, .59; VIF 1.15, 1.08, 1.54, 1.68, 1.27, 1.19, 1.36, 1.14, 1.77, 1.70.

Notes. N = 282. UCL-A = Utrecht Coping List for adolescents (Bijstra et al., 1994); TEIQue-ASF = Trait Emotional Intelligence Questionnaire-Adolescent Short Form (Petrides, 2009); VIF = variance inflation factor. *p < .05. **p < .01. ***p < .001.
Histograms indicated that all variables except depression approximated a normal distribution. Table 1 shows the levels of skewness and kurtosis for each variable. These confirmed the non-normality of depression, but also indicated a small degree of skew and a more pronounced degree of kurtosis for social competence. Concerning depression, the positive skew is not surprising, because most children are presumably not depressed. Nonetheless, both skewness (0.71) and kurtosis (−0.17) were within an acceptable range when examining normality without two extreme outliers (z > 3) on the depression scale. Likewise, there were two outliers on social competence (z < −3), whose removal brought skewness and kurtosis down to reasonable levels (−0.50 and 1.80, respectively). It was decided not to remove these cases from the analysis, given that the fairly large sample size should compensate for any outlier effects.

Correlations between the variables were generally weak or moderate, with a maximum absolute value of .54 between the TEIQue-ASF and depression. Thus, the correlations indicated no issues with multicollinearity. Correlations between the TEIQue-ASF and the three criteria were all significant and in the expected direction. The coping strategies showed a mix of significant and nonsignificant associations with the criteria, also in a logical direction. The TEIQue-ASF showed the expected pattern of positive associations with adaptive coping strategies and negative associations with maladaptive coping strategies. It was unrelated to palliative coping.

Regression analysis summaries for Sample 1 are shown in Table 2. Demographics (age and gender) were entered at Step 1, followed by coping strategies at Step 2, and the total TEIQue-ASF score at Step 3. In the interest of space and given the study aims, only beta weights at Step 3 are displayed. Collinearity statistics shown in Table 2 further alleviate any concerns about multicollinearity: variance inflation factors were all between 1 and 2 and tolerance values were all greater than .55, so none of these values were within a critical range.
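A sketch of the normality screen described above, on simulated scores rather than the study data: skewness and excess kurtosis before and after excluding |z| > 3 outliers.

```python
import numpy as np
from scipy.stats import skew, kurtosis, zscore

rng = np.random.default_rng(1)
depression = rng.exponential(scale=5.0, size=282)   # positively skewed scores

print(skew(depression), kurtosis(depression))       # kurtosis() returns excess kurtosis
mask = np.abs(zscore(depression)) <= 3              # flag |z| > 3 as outliers
print(skew(depression[mask]), kurtosis(depression[mask]))
```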
The numbers of coping strategies showing a significant beta weight were one for somatic complaints, three for depression, and four for social competence; criterion variance explained at Step 2 ranged from 12.1% (social competence) to 26.5% (depression). The TEIQue-ASF composite explained unique variance in all three socioemotional criteria, in the expected direction. The additional criterion variance explained by the TEIQue-ASF ranged from 1.7% (somatic complaints) to 6.3% (depression).
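A minimal sketch of the three-step hierarchical regression and collinearity checks reported in Table 2, run on synthetic data with hypothetical variable names:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
n = 282
cols = ["age", "gender"] + [f"coping_{i}" for i in range(1, 8)] + ["teique"]
X = pd.DataFrame(rng.normal(size=(n, len(cols))), columns=cols)
y = X @ rng.normal(size=len(cols)) + rng.normal(size=n)   # synthetic criterion

steps = [cols[:2], cols[:9], cols]   # Step 1: demographics; 2: + coping; 3: + TEIQue
r2_prev = 0.0
for preds in steps:
    fit = sm.OLS(y, sm.add_constant(X[preds])).fit()
    print(f"{len(preds)} predictors: R2 = {fit.rsquared:.3f}, "
          f"delta R2 = {fit.rsquared - r2_prev:.3f}")
    r2_prev = fit.rsquared

exog = sm.add_constant(X)            # VIFs for the full Step 3 model
print([round(variance_inflation_factor(exog.values, i), 2)
       for i in range(1, exog.shape[1])])
```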
Sample 2

Histograms approximated a normal distribution, and statistics of skewness and kurtosis were all within an acceptable range (see Table 3). Correlations were mostly weak to moderate (see Table 3). The maximum correlation between the TEIQue-ASF and the CAT subscales was .20 (Grade 8) and, therefore, multicollinearity was of little concern. The TEIQue-ASF showed significant, albeit weak, associations with the academic achievement criteria as well as with the CAT subscales. In contrast, all three CAT subscales were moderately correlated with the criteria in both grades. Correlations between the TEIQue-ASF and the CAT subscales were weak but consistently significant. All of these associations were positive.

Regression analysis summaries for seventh and eighth graders are shown in Tables 4 and 5, respectively. Once again, multicollinearity statistics gave no reason for concern.
Table 3. Sample 2: Descriptive statistics and intercorrelations between study variables

Grade 7                     N       M       SD
1. End-of-year English     554    28.70    5.13
2. End-of-year math        566    27.53    4.89
3. End-of-year science     569    28.63    4.08
4. Gender                  672     1.49    0.50
5. CAT verbal              651    94.99   11.88
6. CAT quantitative        630    94.74   12.28
7. CAT nonverbal           636    97.95   13.21
8. TEIQue-ASF              614     4.50    0.71

Grade 8                     N       M       SD
1. End-of-year English     421    30.64    4.64
2. End-of-year math        439    28.54    7.27
3. End-of-year science     437    29.13    5.41
4. Gender                  468     1.44    0.50
5. CAT verbal              430    96.61   13.04
6. CAT quantitative        432    93.89   12.40
7. CAT nonverbal           435    99.04   13.20
8. TEIQue-ASF              413     4.41    0.72

[The skewness, kurtosis, and intercorrelation values of this table were not reliably recoverable from the extraction.]

Notes. CAT = Cognitive Abilities Test (Lohman et al., 2001); TEIQue-ASF = Trait Emotional Intelligence Questionnaire-Adolescent Short Form (Petrides, 2009). *p < .05. **p < .01. ***p < .001.
For the most part, incremental effects of the TEIQue-ASF were consistent across the two grades in terms of significance (at p < .05). In Grade 7, TEIQue-ASF scores explained incremental variance over and above the CAT subscales in end-of-year English and science, but not in math, while in Grade 8 they explained incremental variance in all three subjects. The unique contribution of the TEIQue-ASF was 1.3% (English) and 0.6% (science) in Grade 7, and somewhat stronger in Grade 8 at 5.8% (English), 1.0% (math), and 2.2% (science). Of the three CAT subscales, only nonverbal ability had consistent betas across the two grades in terms of significance. As can be expected, nonverbal ability was significant in the regression analyses for math and science, but nonsignificant in the analysis for English. Verbal ability was a significant predictor of English and science in both grades, and of math in Grade 7 only. Quantitative ability explained unique variance in English and math, but not in science, in Grade 7, and in none of the criteria in Grade 8. Variance explained by the CAT subscales ranged from 28.8% (science) to 34.5% (English) in Grade 7, and from 11.5% (math) to 22.4% (science) in Grade 8.
Discussion

This investigation focused on the incremental validity of the TEIQue-ASF. Specifically, it extended this important aspect of construct validity to (a) socioemotional criteria, controlling for a broad set of competing, trait-like attributes (i.e., coping strategies), and (b) objective achievement criteria (end-of-year grade levels), controlling for cognitive ability. Two samples were used, with the analyses concerning academic achievement (Sample 2) split by grade.
The results showed incremental contributions of the TEIQue-ASF to the variance of all three socioemotional criteria (depression, somatic complaints, and social competence) above and beyond coping strategies. Until now, coping strategies had only been examined as criteria of the various TEIQue forms (Mavroveli et al., 2007). However, since they were operationalized as traits (i.e., based on items concerning respondents' general behavior rather than any particular time period), the present study categorized them as concurrent predictors in order to examine the incremental validity of the TEIQue-ASF. Coping strategies represent typical responses to stressful life events (Greenaway et al., 2015) that are highly relevant during adolescence, given the socioemotional and developmental challenges one faces during this formative stage. In that sense, coping strategies may provide a more developmentally appropriate and, perhaps, meaningful proxy conceptualization of personality than, for instance, the Five-Factor Model.

The incremental contributions of the TEIQue-ASF to the variance in these socioemotional criteria are consistent with previous findings demonstrating the measure's unique contributions to self-reported disruptive behavior and depression, controlling for demographics, the Big Five personality traits, and academic achievement (Davis & Humphrey, 2012b). They also build on Frederickson et al.'s (2012) findings of incremental predictive effects on peer-rated social behavior, inclusion, and self-reported psychopathology over the baseline levels of these criteria and general cognitive ability. The present results thus provide further evidence for the incremental validity of the TEIQue-ASF in predicting socioemotional criteria.
Table 4. Sample 2: Hierarchical regression analyses predicting academic achievement criteria of seventh graders with gender (Step 1), CAT subscales (Step 2), and the TEIQue-ASF (Step 3)

End-of-year English
  Step 1: F(1, 474) = 15.14***, ΔR² = .031***, R²adj = .03
  Step 2: F(4, 471) = 71.01***, ΔR² = .345***, R²adj = .37
  Step 3: F(5, 470) = 59.97***, ΔR² = .013***, R²adj = .38
  Step 3 (b / tolerance / VIF): gender .16*** / .97 / 1.03; CAT verbal .37*** / .54 / 1.86; CAT quantitative .23*** / .41 / 2.45; CAT nonverbal .03 / .47 / 2.15; TEIQue-ASF .12** / .96 / 1.05

End-of-year maths
  Step 1: F(1, 486) = .13, ΔR² = .0003, R²adj = .002
  Step 2: F(4, 483) = 59.97***, ΔR² = .332***, R²adj = .33
  Step 3: F(5, 482) = 48.19***, ΔR² = .001, R²adj = .33
  Step 3 (b / tolerance / VIF): gender .02 / .97 / 1.03; CAT verbal .21*** / .55 / 1.83; CAT quantitative .31*** / .42 / 2.38; CAT nonverbal .13* / .47 / 2.12; TEIQue-ASF .04 / .96 / 1.05

End-of-year science
  Step 1: F(1, 489) = 3.92*, ΔR² = .008, R²adj = .01
  Step 2: F(4, 486) = 51.13***, ΔR² = .288***, R²adj = .29
  Step 3: F(5, 485) = 41.98***, ΔR² = .006**, R²adj = .29
  Step 3 (b / tolerance / VIF): gender .07 / .97 / 1.03; CAT verbal .27*** / .54 / 1.84; CAT quantitative .12 / .42 / 2.39; CAT nonverbal .21*** / .47 / 2.14; TEIQue-ASF .08* / .96 / 1.04

Notes. CAT = Cognitive Abilities Test (Lohman et al., 2001); TEIQue-ASF = Trait Emotional Intelligence Questionnaire-Adolescent Short Form (Petrides, 2009); VIF = variance inflation factor. *p < .05. **p < .01. ***p < .001.
Table 5. Sample 2: Hierarchical regression analyses predicting academic achievement criteria of eighth graders with gender (Step 1), CAT subscales (Step 2), and the TEIQue-ASF (Step 3)

End-of-year English
  Step 1: F(1, 355) = 7.20**, ΔR² = .020**, R²adj = .02
  Step 2: F(4, 352) = 25.95***, ΔR² = .208***, R²adj = .22
  Step 3: F(5, 351) = 28.11***, ΔR² = .058***, R²adj = .28
  Step 3 (b / tolerance / VIF): gender .21*** / .96 / 1.04; CAT verbal .33*** / .54 / 1.85; CAT quantitative .10 / .52 / 1.91; CAT nonverbal .03 / .55 / 1.81; TEIQue-ASF .25*** / .94 / 1.06

End-of-year maths
  Step 1: F(1, 367) = 1.11, ΔR² = .003, R²adj = .0003
  Step 2: F(4, 364) = 12.16***, ΔR² = .115***, R²adj = .12
  Step 3: F(5, 363) = 65***, ΔR² = .010*, R²adj = .13
  Step 3 (b / tolerance / VIF): gender .10* / .96 / 1.04; CAT verbal .09 / .52 / 1.93; CAT quantitative .10 / .51 / 1.97; CAT nonverbal .19** / .53 / 1.89; TEIQue-ASF .10* / .94 / 1.07

End-of-year science
  Step 1: F(1, 367) = .20, ΔR² = .001, R²adj = .002
  Step 2: F(4, 364) = 26.29***, ΔR² = .224***, R²adj = .22
  Step 3: F(5, 363) = 23.74***, ΔR² = .022**, R²adj = .24
  Step 3 (b / tolerance / VIF): gender .04 / .96 / 1.04; CAT verbal .31*** / .52 / 1.92; CAT quantitative .04 / .51 / 1.96; CAT nonverbal .16* / .53 / 1.88; TEIQue-ASF .15** / .94 / 1.07

Notes. CAT = Cognitive Abilities Test (Lohman et al., 2001); TEIQue-ASF = Trait Emotional Intelligence Questionnaire-Adolescent Short Form (Petrides, 2009); VIF = variance inflation factor. *p < .05. **p < .01. ***p < .001.
Even though the effect sizes were not particularly large, these results also extend the measure's relatively consistent pattern of unique contributions to more objective criteria when controlling for a relevant and comprehensive set of trait-like attributes. Above and beyond cognitive ability, the TEIQue-ASF also explained unique variance in academic achievement, represented here by British students' end-of-year grade levels in the core areas of the curriculum (English, science, and math). Only one of the six analyses conducted across the two grades (end-of-year math of seventh graders) did not reveal an incremental effect for the TEIQue-ASF. Although the effect sizes for the TEIQue-ASF were modest, these results build on previously observed unique contributions to teacher-rated academic performance after controlling for cognitive ability, personality, anxiety, and self-concept (Ferrando et al., 2010). In that study, the TEIQue-ASF emerged as the only significant predictor of academic achievement other than cognitive ability, despite
the additional predictors. The current results show that the measure also explains unique variance in objective achievement indices, relative to cognitive ability.
Implications

Despite the small effect sizes for the TEIQue-ASF in this investigation, it is important to keep in mind that they were derived with the short form of the instrument, which is less powerful than the full form. Also, where academic achievement is concerned, a small effect size of trait EI can be expected, since trait EI is not theoretically the strongest predictor of achievement (Petrides et al., 2004). Other relatively broad criteria in which trait EI may play a stronger role include interpersonal outcomes (e.g., relationship stability and social loneliness) and intrapersonal outcomes
(e.g., mental disorders and substance dependence). From this point of view, the results are encouraging and speak to the value of the construct. Another reason why the results reported here may underrepresent the true effects of trait EI is emerging evidence indicating that the TEIQue does not represent trait EI optimally (Siegling, Petrides, & Martskvishvili, 2014), even though it has demonstrated superior construct validity relative to other trait EI measures (Freudenthaler, Neubauer, Gabler, Scherl, & Rindermann, 2008; Gardner & Qualter, 2010; Martins et al., 2010). Some of the 15 facets represented by the TEIQue items seem to be redundant and to compromise the validity of the total composite (Siegling, Petrides, et al., 2014). Redundant facets occupy no unique variance of the construct and, therefore, are unable to account for incremental variance in construct-relevant outcomes. On the contrary, the effects of uniquely predictive and non-predictive facets average out when combined into a composite; correlations of this composite with relevant criteria will consequently be lower than those of a composite comprised of predictive facets only (Siegling, Petrides, et al., 2014; Smith, Fischer, & Fister, 2003). Although more psychometric research is needed to confirm these initial results of facet (item) redundancy, stronger incremental effects can be expected with a refined version of the TEIQue. The results convincingly demonstrate that trait EI, and more specifically the TEIQue-ASF, can explain unique variance in construct- and developmentally relevant criteria in adolescents. In conjunction with previous findings, they support the application of trait EI measures in psychoeducational assessments and suggest that even short trait EI forms can have valuable utility in adolescent samples. From the perspective of prediction, the present demonstration of incremental validity is important because it is furnished by a short, convenient, and cost-effective measure. Short forms are often preferred where practical constraints in a research or applied context do not permit the use of the corresponding full forms. Given the enormous effort and resources that go into the prediction of academic performance at every level of education, the ability to improve prediction precision through straightforward means is highly desirable. From the perspective of explanation, our findings further highlight the importance of emotions in the educational process and the need to investigate in greater depth when and why emotion is associated with academic success (Valiente, Swanson, & Eisenberg, 2012).
was certainly not unrepresented among the Sample 1 predictors. Though it would have been ideal to include a trait measure in Sample 2, the analysis was restricted to preexisting data and, therefore, it must be tentatively assumed that the TEIQue-ASF has incremental validity vis-à-vis both personality and cognitive ability, as previous research suggests (Ferrando et al., 2010). Especially the size of the measure’s unique contributions to various criteria remains to be established, using objective achievement data of the kind analyzed in the present investigation.
References
Andrei, F., Siegling, A. B., Aloe, A. M., Baldaro, B., & Petrides, K. V. (2015). The incremental validity of the Trait Emotional Intelligence Questionnaire (TEIQue): A systematic review and meta-analysis. Manuscript under review.
Bar-On, R., & Parker, J. D. A. (2000). Bar-On Emotional Quotient Inventory: Youth Version (BarOn EQ-i:YV) technical manual. Toronto, Canada: Multi-Health Systems.
Bijstra, J. O., Jackson, S., & Bosma, H. A. (1994). De Utrechtse Coping Lijst voor adolescenten [The Utrecht Coping List for adolescents]. Kind en Adolescent, 15, 67–74. doi: 10.1007/BF03060546
Billings, C. E. W., Downey, L. A., Lomas, J. E., Lloyd, J., & Stough, C. (2014). Emotional intelligence and scholastic achievement in pre-adolescent children. Personality and Individual Differences, 65, 14–18. doi: 10.1016/j.paid.2014.01.017
Coie, J. D., & Dodge, K. A. (1988). Multiple sources of data on social behavior and social status in the school: A cross-age comparison. Child Development, 59, 815–829. doi: 10.2307/1130578
Coie, J. D., Dodge, K. A., & Coppotelli, H. (1983). "Dimensions and types of social status: A cross-age perspective": Correction. Developmental Psychology, 19, 224. doi: 10.1037/0012-1649.19.2.224
Davis, S. K., & Humphrey, N. (2012a). Emotional intelligence predicts adolescent mental health beyond personality and cognitive ability. Personality and Individual Differences, 52, 144–149. doi: 10.1016/j.paid.2011.09.016
Davis, S. K., & Humphrey, N. (2012b). The influence of emotional intelligence (EI) on coping and mental health in adolescence: Divergent roles for trait and ability EI. Journal of Adolescence, 35, 1369–1379. doi: 10.1016/j.adolescence.2012.05.007
Derksen, J., Kramer, I., & Katzko, M. (2002). Does a self-report measure for emotional intelligence assess something different than general intelligence? Personality and Individual Differences, 32, 37–48. doi: 10.1016/S0191-8869(01)00004-6
Endler, N. S., & Parker, J. D. A. (1994). Assessment of multidimensional coping: Task, emotion, and avoidance strategies. Psychological Assessment, 6, 50–60. doi: 10.1037/1040-3590.6.1.50
Ferrando, M., Prieto, M. D., Almeida, L. S., Ferrandiz, C., Bermejo, R., Lopez-Pina, J. A., . . . Fernandez, M.-C. C. (2010). Trait emotional intelligence and academic performance: Controlling for the effects of IQ, personality, and self-concept. Journal of Psychoeducational Assessment, 29, 150–159. doi: 10.1177/0734282910374707
Frederickson, N. L., & Furnham, A. F. (1998). Sociometric classification methods in school peer groups: A comparative investigation. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 39, 921–933. doi: 10.1111/1469-7610.00392
Frederickson, N., Petrides, K. V., & Simmonds, E. (2012). Trait emotional intelligence as a predictor of socioemotional outcomes in early adolescence. Personality and Individual Differences, 52, 323–328. doi: 10.1016/j.paid.2011.10.034
Freudenthaler, H. H., Neubauer, A. C., Gabler, P., Scherl, W. G., & Rindermann, H. (2008). Testing and validating the Trait Emotional Intelligence Questionnaire (TEIQue) in a German-speaking sample. Personality and Individual Differences, 45, 673–678. doi: 10.1016/j.paid.2008.07.014
Gardner, K. J., & Qualter, P. (2010). Concurrent and incremental validity of three trait emotional intelligence measures. Australian Journal of Psychology, 62, 5–13. doi: 10.1080/00049530903312857
Greenaway, K. H., Louis, W. R., Parker, S. L., Kalokerinos, E. K., Smith, J. R., & Terry, D. J. (2015). Measures of coping for psychological well-being. In G. J. Boyle, D. H. Saklofske, & G. Matthews (Eds.), Measures of personality and social psychological constructs (pp. 322–351). London, UK: Academic Press. doi: 10.1016/B978-0-12-386915-9.00012-7
Joseph, D., Jin, J., Newman, D., & O'Boyle, E. (2014). Why does self-reported emotional intelligence predict job performance? A meta-analytic investigation of mixed EI. Journal of Applied Psychology, 100, 298–342. Retrieved from http://psycnet.apa.org/psycinfo/2014-39897-001/
Kovacs, M. (1985). The Children's Depression Inventory (CDI). Psychopharmacology Bulletin, 21, 995–998.
Lohman, D. F., Thorndike, R. L., Hagen, E. P., Smith, P., Fernandes, C., & Strand, S. (2001). Cognitive Abilities Test (3rd ed.). Windsor, UK: NFER-Nelson.
Malouff, J. M., Schutte, N. S., & Thorsteinsson, E. B. (2014). Trait emotional intelligence and romantic relationship satisfaction: A meta-analysis. The American Journal of Family Therapy, 42, 53–66. doi: 10.1080/01926187.2012.748549
Martins, A., Ramalho, N., & Morin, E. (2010). A comprehensive meta-analysis of the relationship between emotional intelligence and health. Personality and Individual Differences, 49, 554–564. doi: 10.1016/j.paid.2010.05.029
Mavroveli, S., Petrides, K. V., Rieffe, C., & Bakker, F. (2007). Trait emotional intelligence, psychological well-being and peer-rated social competence in adolescence. British Journal of Developmental Psychology, 25, 263–275. doi: 10.1348/026151006X118577
Mikolajczak, M., Petrides, K. V., & Hurry, J. (2009). Adolescents choosing self-harm as an emotion regulation strategy: The protective role of trait emotional intelligence. The British Journal of Clinical Psychology, 48, 181–193. doi: 10.1348/014466508X386027
Parkhurst, J. T., & Asher, S. R. (1992). Peer rejection in middle school: Subgroup differences in behavior, loneliness, and interpersonal concerns. Developmental Psychology, 28, 231–241. doi: 10.1037/0012-1649.28.2.231
Perera, H. N., & DiGiacomo, M. (2013). The relationship of trait emotional intelligence with academic performance: A meta-analytic review. Learning and Individual Differences, 28, 20–33. doi: 10.1016/j.lindif.2013.08.002
Petrides, K. V. (2009). Psychometric properties of the Trait Emotional Intelligence Questionnaire (TEIQue). In C. Stough, D. H. Saklofske, & J. D. A. Parker (Eds.), Assessing emotional intelligence: Theory, research, and applications (pp. 85–101). New York, NY: Springer Science + Business Media. doi: 10.1007/978-0-387-88370-0
Petrides, K. V., Frederickson, N., & Furnham, A. (2004). The role of trait emotional intelligence in academic performance and deviant behavior at school. Personality and Individual Differences, 36, 277–293. doi: 10.1016/S0191-8869(03)00084-9
Petrides, K. V., & Furnham, A. (2001). Trait emotional intelligence: Psychometric investigation with reference to established trait taxonomies. European Journal of Personality, 15, 425–448. doi: 10.1002/per.416
Petrides, K. V., Pérez-González, J. C., & Furnham, A. (2007). On the criterion and incremental validity of trait emotional intelligence. Cognition & Emotion, 21, 26–55. doi: 10.1080/02699930601038912
Petrides, K. V., Pita, R., & Kokkinaki, F. (2007). The location of trait emotional intelligence in personality factor space. British Journal of Psychology, 98, 273–289. doi: 10.1348/000712606X120618
Petrides, K. V., Sangareau, Y., Furnham, A., & Frederickson, N. (2006). Trait emotional intelligence and children's peer relations at school. Social Development, 15, 537–547. doi: 10.1111/j.1467-9507.2006.00355.x
Rieffe, C., Meerum Terwogt, M., & Bosch, J. (2004). Emotional awareness and somatic complaints in children. European Journal of Developmental Psychology, 1, 31–47.
Roberts, B. W., Caspi, A., & Moffitt, T. E. (2001). The kids are alright: Growth and stability in personality development from adolescence to adulthood. Journal of Personality and Social Psychology, 81, 670–683. doi: 10.1037/0022-3514.81.4.670
Saklofske, D. H., Austin, E. J., & Minski, P. S. (2003). Factor structure and validity of a trait emotional intelligence measure. Personality and Individual Differences, 34, 707–721. doi: 10.1016/S0191-8869(02)00056-9
Schutte, N. S., Malouff, J. M., Hall, L. E., Haggerty, D. J., Cooper, J. T., Golden, C. J., & Dornheim, L. (1998). Development and validation of a measure of emotional intelligence. Personality and Individual Differences, 25, 167–177. doi: 10.1016/S0191-8869(98)00001-4
Siegling, A. B., Petrides, K. V., & Martskvishvili, K. (2014). An examination of a new psychometric method for optimizing multi-faceted assessment instruments in the context of trait emotional intelligence. European Journal of Personality, 29, 42–54. doi: 10.1002/per.1976
Siegling, A. B., Saklofske, D. H., & Petrides, K. V. (2014). Measures of ability and trait emotional intelligence. In G. J. Boyle, D. H. Saklofske, & G. Matthews (Eds.), Measures of personality and social psychological constructs (1st ed., pp. 381–414). Oxford, UK: Academic Press.
Smith, G. T., Fischer, S., & Fister, S. M. (2003). Incremental validity principles in test construction. Psychological Assessment, 15, 467–477. doi: 10.1037/1040-3590.15.4.467
Strand, S. (2004). Consistency in reasoning test scores over time. The British Journal of Educational Psychology, 74, 617–631. doi: 10.1348/0007099042376445
Timbremont, B., & Braet, C. (2001). Psychometrische evaluatie van de Nederlandstalige Children's Depression Inventory [Psychometric assessment of the Dutch version of the Children's Depression Inventory]. Gedragstherapie, 34, 229–242.
Valiente, C., Swanson, J., & Eisenberg, N. (2012). Linking students' emotions and academic achievement: When and why emotions matter. Child Development Perspectives, 6, 129–135. doi: 10.1111/j.1750-8606.2011.00192.x
Van der Zee, K., Thijs, M., & Schakel, L. (2002). The relationship of emotional intelligence with academic intelligence and the Big Five. European Journal of Personality, 16, 103–125. doi: 10.1002/per.434
Warwick, J., & Nettelbeck, T. (2004). Emotional intelligence is. . .? Personality and Individual Differences, 37, 1091–1100. doi: 10.1016/j.paid.2003.12.003
Williams, C., Daley, D., Burnside, E., & Hammond-Rowley, S. (2010). Can trait emotional intelligence and objective measures of emotional ability predict psychopathology across the transition to secondary school? Personality and Individual Differences, 48, 161–165. doi: 10.1016/j.paid.2009.09.014
Date of acceptance: January 13, 2015
Published online: June 26, 2015
Alexander B. Siegling
Research Department of Clinical, Educational, and Health Psychology
University College London
London WC1H 0AP
UK
Tel. +44 020 7538-0703
E-mail alexander.siegling.11@ucl.ac.uk
EAPA
APPLICATION FORM
EAPA membership includes a free subscription to the European Journal of Psychological Assessment. To apply for membership in the EAPA, please fill out this application form and return it together with your curriculum vitae to: David Gallardo-Pujol, PhD (EAPA Secretary General), Dept. of Clinical Psychology & Psychobiology, Campus Mundet, Pg. de la Vall d'Hebron, 171, 08035 Barcelona, Spain, E-mail david.gallardo@ub.edu.
Family name . . . . . . . . . . . . . . . . . . . . . . . .
First name . . . . . . . . . . . . . . . . . . . . . . . .
Affiliation . . . . . . . . . . . . . . . . . . . . . . . .
Address . . . . . . . . . . . . . . . . . . . . . . . .
City . . . . . . . . . . . . . . . . . . . . . . . .
Postcode . . . . . . . . . . . . . . . . . . . . . . . .
Country . . . . . . . . . . . . . . . . . . . . . . . .
Phone . . . . . . . . . . . . . . . . . . . . . . . .
Fax . . . . . . . . . . . . . . . . . . . . . . . .

ANNUAL FEES
◆ EURO 75.00 (US $98.00) – Ordinary EAPA members
◆ EURO 50.00 (US $65.00) – PhD students
◆ EURO 10.00 (US $13.00) – Undergraduate student members

FORM OF PAYMENT
◆ Credit card: VISA / Mastercard/Eurocard / American Express
Number . . . . . . . . . . . . . . . . Expiration date . . . / . . .
CVV2/CVC2/CID# . . . . (IMPORTANT! 3-digit security code in signature field on reverse of card [VISA/Mastercard] or 4 digits on the front [AmEx])
Card holder's name . . . . . . . . . . . . . . . . . . . . . . . .
Signature . . . . . . . . . . . . . . . . Date . . . . . . . . . . . . . . . .
◆ Cheque or postal order: Send a cheque or postal order to the address given above.
Signature . . . . . . . . . . . . . . . . Date . . . . . . . . . . . . . . . .
Instructions to Authors
The main purpose of the European Journal of Psychological Assessment is to present important articles that provide seminal information on both theoretical and applied developments in this field. Articles reporting the construction of new measures or an advancement of an existing measure are given priority. The journal is directed to practitioners as well as to academicians: The conviction of its editors is that the discipline of psychological assessment should, necessarily and firmly, be attached to the roots of psychological science, while going deeply into all the consequences of its applied, practice-oriented development. Psychological assessment is experiencing a period of renewal and expansion, attracting increasing attention from both academic and applied psychology, as well as from political, corporate, and social organizations. The EJPA provides a meeting point for this movement, contributing to the scientific development of psychological assessment and to communication between professionals and researchers in Europe and worldwide.
European Journal of Psychological Assessment publishes the following types of articles: Original Articles, Brief Reports, and Multistudy Reports.
Manuscript submission: All manuscripts should in the first instance be submitted electronically at http://www.editorialmanager.com/ejpa. Detailed instructions to authors are provided at http://www.hogrefe.com/j/ejpa
Copyright Agreement: By submitting an article, the author confirms and guarantees on behalf of him-/herself and any coauthors that the manuscript has not been submitted or published elsewhere, and that he or she holds all copyright in and titles to the submitted contribution, including any figures, photographs, line drawings, plans, maps, sketches, tables, and electronic supplementary material (ESM), and that the article and its contents do not infringe in any way on the rights of third parties. ESM will be published online as received from the author(s) without any conversion, testing, or reformatting, and will not be checked for typographical errors or functionality. The author indemnifies and holds harmless the publisher from any third-party claims.
The author agrees, upon acceptance of the article for publication, to transfer to the publisher the exclusive right to reproduce and distribute the article and its contents, both physically and in nonphysical, electronic, or other form, in the journal to which it has been submitted and in other independent publications, with no limitations on the number of copies or on the form or the extent of distribution. These rights are transferred for the duration of copyright as defined by international law. Furthermore, the author transfers to the publisher the following exclusive rights to the article and its contents:
1. The rights to produce advance copies, reprints, or offprints of the article, in full or in part, to undertake or allow translations into other languages, to distribute other forms or modified versions of the article, and to produce and distribute summaries or abstracts.
2. The rights to microfilm and microfiche editions or similar, to the use of the article and its contents in videotext, teletext, and similar systems, to recordings or reproduction using other media, digital or analog, including electronic, magnetic, and optical media, and in multimedia form, as well as for public broadcasting in radio, television, or other forms of broadcast.
3. The rights to store the article and its contents in machine-readable or electronic form on all media (such as computer disks, compact disks, magnetic tape), to store the article and its contents in online databases belonging to the publisher or third parties for viewing or downloading by third parties, and to present or reproduce the article or its contents on visual display screens, monitors, and similar devices, either directly or via data transmission.
4. The rights to reproduce and distribute the article and its contents by all other means, including photomechanical and similar processes (such as photocopying or facsimile), and as part of so-called document delivery services.
5. The right to transfer any or all of the rights mentioned in this agreement, as well as rights retained by the relevant copyright clearing centers, including royalty rights, to third parties.
Online Rights for Journal Articles: If you wish to post the article to your personal or institutional website or to archive it in an institutional or disciplinary repository, please use either a pre-print or a post-print of your manuscript in accordance with the publication release for your article and the document "Guidelines on sharing and use of articles in Hogrefe journals" on the journal's web page at www.hogrefe.com/j/ejpa.
November 2016
How to assess the social atmosphere in forensic hospitals and identify ways of improving it
"All clinicians and researchers who want to help make forensic treatment environments safe and effective should buy this book." Mary McMurran, PhD, Professor of Personality Disorder Research, Institute of Mental Health, University of Nottingham, UK
Norbert Schalast / Matthew Tonkin (Editors)
The Essen Climate Evaluation Schema – EssenCES: A Manual and More
2016, x + 108 pp. US $49.00 / € 34.95, ISBN 978-0-88937-481-2. Also available as eBook.
The Essen Climate Evaluation Schema (EssenCES) described here is a short, well-validated questionnaire that measures three essential facets of an institution's social atmosphere. An overview of the EssenCES is followed by detailed advice on how to administer and score it and how to interpret findings, as well as reference norms from various countries and types of institutions. The EssenCES "manual and more" is thus a highly useful tool for researchers, clinicians, and service managers working in forensic settings.
www.hogrefe.com
Psychological Assessment – Science and Practice
Book series, edited with the support of the European Association of Psychological Assessment (EAPA)
Editors: Tuulia M. Ortner, PhD, Austria; Itziar Alonso-Arbiol, PhD, Spain; Anastasia Efklides, PhD, Greece; Willibald Ruch, PhD, Switzerland; Fons J. R. van de Vijver, PhD, The Netherlands
New series: Each volume in the series Psychological Assessment – Science and Practice presents the state of the art of assessment in a particular domain of psychology, with regard to theory, research, and practical applications. Editors and contributors are leading authorities in their respective fields. Each volume discusses, in a reader-friendly manner, critical issues and developments in assessment, as well as well-known and novel assessment tools. The series is an ideal educational resource for researchers, teachers, and students of assessment, as well as practitioners.
Volume 1, 2015, vi + 234 pages, US $63.00 / € 44.95, ISBN 978-0-88937-437-9
Volume 2, 2016, vi + 346 pages, US $69.00 / € 49.95, ISBN 978-0-88937-452-2
Volume 3, 2016, vi + 336 pages, US $69.00 / € 49.95, ISBN 978-0-88937-449-2
www.hogrefe.com