Computational Study of Primitive Emotional Contagion in Dyadic Interactions

Giovanna Varni, Isabelle Hupont, Chloé Clavel, and Mohamed Chetouani

Abstract—Interpersonal human-human interaction is a dynamical exchange and coordination of social signals, feelings and emotions usually performed through and across multiple modalities such as facial expressions, gestures, and language. Developing machines able to engage humans in rich and natural interpersonal interactions requires capturing such dynamics. This paper addresses primitive emotional contagion during dyadic interactions in which roles are prefixed. Primitive emotional contagion was defined as the tendency people have to automatically mimic and synchronize their multimodal behavior during interactions and, consequently, to emotionally converge. To capture emotional contagion, a cross-recurrence based methodology that explicitly integrates short and long-term temporal dynamics through the analysis of both facial expressions and sentiment was developed. This approach is employed to assess emotional contagion at unimodal, multimodal and cross-modal levels and is evaluated on the Solid SAL-SEMAINE corpus. Interestingly, the approach is able to show the importance of the adoption of cross-modal strategies for addressing emotional contagion.

Index Terms—Primitive emotional contagion, facial expressions analysis, sentiment analysis, cross-recurrence quantification analysis
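For concreteness, the following is a minimal sketch of the kind of cross-recurrence computation the abstract refers to. It is not the authors' implementation: the function names, the fixed radius eps, and the toy valence signals are illustrative assumptions.

import numpy as np

def cross_recurrence_matrix(x, y, eps=0.1):
    """Cross-recurrence matrix between the (possibly multivariate)
    emotion time series of two interacting partners.
    CR[i, j] = 1 when partner A's state at time i falls within
    radius eps of partner B's state at time j."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    # Pairwise Euclidean distances between the two trajectories.
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    return (dist <= eps).astype(int)

def recurrence_rate(cr):
    """%REC: the fraction of recurrent points, a coarse global index
    of how often the two partners occupy similar emotional states."""
    return float(cr.mean())

# Toy example: two short valence traces, one lagging the other.
t = np.linspace(0, 4 * np.pi, 200)
a = np.sin(t)          # partner A
b = np.sin(t - 0.5)    # partner B, slightly delayed
print(recurrence_rate(cross_recurrence_matrix(a, b, eps=0.1)))

Cross-recurrence quantification analysis (CRQA) then derives further measures (e.g., determinism from diagonal line structures) from this matrix; the sketch shows only the basic construction.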
1 INTRODUCTION
In the last decade, researchers in Human-Machine Interaction (HMI) have worked to endow virtual agents with socio-affective skills, mainly focusing on face-to-face communication [1], [2]. These agents have been employed in a variety of applications such as intelligent tutoring systems [3], serious games [4], and health-care scenarios [5]. Interacting with them, users increase their willingness to disclose, report a feeling of rapport [6], or report a better quality of experience [7]. Current research on virtual agents has mainly focused on the study of single communication modalities, such as gaze and smile [8], [9], and of the relations among intrapersonal modalities, for example language and gestures [10]. However, when people interact in dyads or groups, they consciously and unconsciously exchange feelings and emotions simultaneously through and across multiple modalities (see [11] for a survey on multimodal emotional perception). Therefore, models and technologies allowing virtual agents to engage humans in these sophisticated forms of interpersonal interaction are required [12], [13]. The few interpersonal dynamics models found in the human-agent interaction literature still rely on the classical information-transmission metaphor of communication in which, turn by turn, user and agent produce (encode) and receive (decode) messages that travel across
G. Varni, I. Hupont, and M. Chetouani are with the Institute for Intelligent Systems and Robotics, Sorbonne University, Paris 75005, France. E-mail: {varni, hupont, mohamed.chetouani}@isir.upmc.fr.
C. Clavel is with the Institut Mines-Télécom, Télécom ParisTech, CNRS LTCI, Paris 75013, France. E-mail: chloe.clavel@telecom-paristech.fr.
Manuscript received 6 Mar. 2017; revised 2 Nov. 2017; accepted 21 Nov. 2017. Date of publication 28 Nov. 2017; date of current version 29 May 2020. (Corresponding author: Giovanna Varni.) Recommended for acceptance by A. A. Salah. Digital Object Identifier no. 10.1109/TAFFC.2017.2778154
channels between them [14]. These models therefore fail to fully emulate human communication dynamics and its cross-modal nature. During a real interaction, the partners, as in a dance, continually co-regulate their behaviors [15]. This communication does not necessarily imply the use of the same modalities by each partner but consists in an interpersonal cross-modal exchange, i.e., a dynamic interleaving of several modalities leading to the emergence of communicative behaviors such as engagement and synchrony.

The dynamic interplay of emotions during interactions is referred to in the psychological literature as emotional contagion. Hatfield and colleagues [16] argue that emotional contagion is a multiply determined (i.e., with many possible causes), multilevel (i.e., occurring through different communication modalities) family of phenomena. It “can manifest as similar responses (e.g., as when smiles elicit smiles) or complementary responses (e.g., when the sight of a stroke aimed leads to a drawing back of the site of the blow)”. In particular, Hatfield and colleagues introduce an automatic and largely unconscious form of contagion they call primitive emotional contagion [17], which they define as “the tendency to automatically mimic and synchronize facial expressions, vocalizations, postures, and movements with those of another person’s and, consequently, to converge emotionally”.

An important step toward the development of advanced models of interpersonal human-agent interaction is the measurement of such emotional contagion, especially taking cross-modality into account. A cross-modal sketch of such a measurement is given below. This would make it possible to improve the naturalness and believability of virtual agents, even in long-term interactions, avoiding, for example, the perception of uncanniness [18]. This paper applies a computational approach to investigate the dynamics of primitive emotional contagion1 in face-

1. Note that primitive emotional contagion will hereinafter be referred to as emotional contagion for the sake of simplicity.
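As a hedged illustration of what such a cross-modal measurement could look like (a sketch under assumptions, not the paper's pipeline): the diagonal-wise recurrence profile of a cross-recurrence matrix between one partner's facial valence and the other's spoken sentiment exposes the lag at which the two streams co-occur most often. The input streams and parameters below are hypothetical, and cross_recurrence_matrix is reused from the earlier sketch.

import numpy as np

def diagonal_recurrence_profile(cr, max_lag=40):
    """%REC computed along each diagonal of CR; the lag with the
    highest value hints at who leads whom and by how much (a
    descriptive reading, not a causal one)."""
    return {k: float(np.mean(np.diagonal(cr, offset=k)))
            for k in range(-max_lag, max_lag + 1)}

# Hypothetical, pre-aligned, identically sampled streams in [-1, 1]:
# partner B's sentiment echoes partner A's facial valence 12 frames later.
rng = np.random.default_rng(0)
face_valence_A = np.tanh(np.cumsum(rng.standard_normal(300)) / 10.0)
sentiment_B = np.roll(face_valence_A, 12) + 0.05 * rng.standard_normal(300)

# Assumes cross_recurrence_matrix from the earlier sketch is in scope.
cr = cross_recurrence_matrix(face_valence_A, sentiment_B, eps=0.15)
profile = diagonal_recurrence_profile(cr)
print("peak co-occurrence at lag:", max(profile, key=profile.get))

In this toy setting the profile peaks near lag +12, recovering the built-in delay between the facial and sentiment streams.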