Wearable Audio Journal and Mobile Application to Capture Automatic Thoughts in Patients Undergoing Cognitive Behavioral Therapy

Devyani Jain
Post Graduate Design Student in Apparel Design and Merchandising, National Institute of Design, Gandhinagar, Gujarat, India
Devyanijain.18@gmail.com

Manikandan Hk
Post Graduate Design Student in New Media Design, National Institute of Design, Gandhinagar, Gujarat, India
manihk@gmail.com

Abstract
By replacing the handwritten 'thought records' used by cognitive behavioral therapy (CBT) patients with a wearable audio journal that works in tandem with a smartphone, we can help patients capture their automatic thoughts simply by speaking their minds. Voice data provide more clues about the patient's precise state of mind and thus support better therapy.

Author keywords
Wearable technology; audio journaling; aids for cognitive behavioral therapy; mindfulness; technology-facilitated self-awareness

ACM Classification Keywords
H.5.5 Sound and Music Computing (J.5) Systems.
Introduction
Developed by Dr. Aaron T. Beck [1], Cognitive Behavior Therapy (CBT) is designed to modify the individual's idiosyncratic, maladaptive ideation. One of the most widely used techniques in CBT is reflective journaling in the form of a 'thought record' [2]. Traditionally the thought record is a printed booklet carried by the patient, who is expected to fill it in whenever he or she experiences a negative thought or feeling. These thought records are later analyzed jointly by the therapist and the patient to observe and understand what combination of situations, actions and thoughts led to a particular emotion or feeling. By identifying this pattern, they can take measures to counter the negative feeling. This process of writing down one's emotional states, right at the instant of an emotional shift, increases the self-awareness of the patient. In our short-term design project, we have designed a wearable audio-recording device that can replace the printed thought record and function as an audio journal to capture the patient's verbalized automatic thoughts (commonly referred to as self-talk). The system, in tandem with a mobile application, presents meaningful visualizations of the recordings, helping both the therapist and the patient in effective therapy.

Figure 1. Dimensions of the magnetic leather clip-on device compared with the dimensions of a PAN card, to show its relative size
Emotional Richness in Speech versus Text

Figure 2. The clip-on measures 7.8 inches when unfolded
Table 1. One of the most commonly used thought record tables: the patient fills in the first three columns at the time of the event and later reflects on it with the therapist by filling in the remaining two columns.
Comparative studies of self-expression through speech versus text [3, 4] show that speech and voice data are emotionally richer than the textual medium. Complex expressions, as discussed in [4], are conveyed better in speech than in text. The signifiers used in written language may not capture the signified in its fullest essence, particularly because human emotions do not exist in isolation (for example, we may feel helpless and sorrowful at the same time). The written word lacks many of the emotional markers [3] that are found in spoken language in the form of tonality, frequency and rhythm of speech. This understanding, which emerged from secondary research, was confirmed through personal interviews with practicing psychotherapists in Canada.
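As an illustration of how such spoken markers can be measured, the minimal sketch below extracts a few prosodic features (pitch, loudness and a crude rhythm proxy) from a recorded note. It is illustrative only and not part of the system described here; it assumes a mono WAV file and the open-source librosa library.

```python
# Illustrative sketch: extract simple prosodic "emotional markers"
# (tonality/pitch, loudness, rhythm) that text transcripts lack.
import numpy as np
import librosa

def prosodic_markers(path):
    # Load the recording as mono audio at 16 kHz.
    y, sr = librosa.load(path, sr=16000, mono=True)

    # Pitch contour (tonality): fundamental frequency per frame;
    # f0 is NaN in unvoiced frames.
    f0, voiced, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                 fmax=librosa.note_to_hz("C6"), sr=sr)

    # Loudness: root-mean-square energy per frame.
    rms = librosa.feature.rms(y=y)[0]

    # Crude rhythm proxy: fraction of frames carrying voiced speech.
    return {
        "mean_pitch_hz": float(np.nanmean(f0)),
        "pitch_variability_hz": float(np.nanstd(f0)),
        "mean_energy": float(rms.mean()),
        "voiced_ratio": float(voiced.mean()),
    }
```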
Wearability

Based on Gemperle et al.'s large body of work on wearability [5] and the functional constraint of positioning the device close to the wearer's mouth, both functional and conceptual prototypes were developed. Considering the ergonomics of day-long wear and the ability to wear the device on different kinds of clothing (without muffling it), different clipping modalities were developed, such as a magnetic clip-on, where the device latches onto the edge of the clothing (see Figures 1 and 2), and a metal clip-on, where the device is held onto the clothing by a clip. The choice of form and material was largely driven by the limitations posed by the electronics involved. The smallest prototype was based on the ISD1932 voice-recording chip [see Figure 4]; it runs on a 3 V battery, offers two minutes of recording time, and measures only 3.5 × 2.5 cm. The fully functional prototype measures 4.5 × 4 cm and was made by hacking an off-the-shelf electronic board that runs on a 9 V battery, records onto an SD memory card, and also connects to a smartphone through a micro-USB port [see Figure 5]. Prototypes of different sizes were made to demonstrate the possible miniaturization of the wearable.
Figure 3. Magnetic leather clip-on worn by a user at the neck
Figure 4. Dimensions of the ISD1932 chip board, the smallest that we were able to use
Device Interaction

The user can tap or touch the device to start or stop recording; an LED provides feedback. The recordings stored on the device can then be transferred to the smartphone for analysis and reflection by connecting it through USB.
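The tap-to-toggle interaction can be sketched as a simple firmware loop. The MicroPython sketch below is a hypothetical reconstruction, not the prototype's actual firmware: the pin numbers, the tap sensor wiring, and the start_recording()/stop_recording() hooks are all assumptions, since the functional prototype was built by hacking an off-the-shelf recorder board.

```python
# Hypothetical firmware-style sketch (MicroPython) of the tap-to-record loop.
from machine import Pin
import time

touch = Pin(4, Pin.IN, Pin.PULL_UP)   # assumed tap/touch input (active low)
led = Pin(13, Pin.OUT)                # assumed feedback LED

recording = False
last_tap_ms = 0

def start_recording():
    pass  # hand over to the recorder chip (hardware-specific)

def stop_recording():
    pass  # stop the recorder chip (hardware-specific)

while True:
    if touch.value() == 0:  # tap detected
        now = time.ticks_ms()
        # Debounce: ignore taps within 300 ms of the previous one.
        if time.ticks_diff(now, last_tap_ms) > 300:
            last_tap_ms = now
            recording = not recording
            led.value(1 if recording else 0)  # LED on = recording
            start_recording() if recording else stop_recording()
    time.sleep_ms(10)
```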
Figure 5. Custom-made circuit for our functional prototypes
Smartphone application

The smartphone application serves the following key functions: (a) it provides an interface to store all voice data; (b) it transcribes the voice data into text while retaining the audio for the therapist's analysis; (c) it helps the patient reflect on his or her thoughts through pattern finding; and (d) it acts as a continuous interface between the patient and the therapist. The application uses Google's automatic speech recognition (ASR) [6] plugin, available to the open-source community through the Processing language, to transcribe the recorded voice data into text.
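For illustration, a roughly equivalent transcription step can be written with the Python SpeechRecognition package, which wraps the same Google web speech API. This is a stand-in sketch, not the Processing plugin the project actually uses, and the file name is hypothetical.

```python
# Stand-in sketch for the transcription step, using the Python
# SpeechRecognition package instead of the Processing ASR plugin.
import speech_recognition as sr

def transcribe(wav_path):
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole recording
    try:
        return recognizer.recognize_google(audio)  # Google web speech API
    except sr.UnknownValueError:
        return ""  # speech was unintelligible

text = transcribe("thought_note_001.wav")  # hypothetical recording name
```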
Data Analysis

The system looks for keywords and presents the data based on frequency of use, context of use and nature of use. Figure 6 shows the interface that supports quick skimming of the audio data, and Figure 7 shows the keywords presented as a cloud, giving a sense of the most frequently used terms; a minimal sketch of this counting step follows the figure captions below.

Figure 6. Interface to navigate the audio data, with textual cues on the waveform
Figure 7. Transcribed data presented as Keyword Cloud for analysis
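The keyword cloud in Figure 7 reduces to counting content words across transcripts. The sketch below shows a minimal version of that counting step; the stopword list and sample transcripts are illustrative assumptions, not the system's actual vocabulary handling.

```python
# Minimal sketch of keyword-frequency counting behind a keyword cloud.
import re
from collections import Counter

STOPWORDS = {"i", "the", "a", "an", "and", "to", "of", "was", "it", "that"}

def keyword_counts(transcripts):
    counts = Counter()
    for text in transcripts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1  # count only content words
    return counts

transcripts = ["I felt helpless and sorrowful at the same time",
               "Helpless again after the meeting"]
print(keyword_counts(transcripts).most_common(5))
# Most frequent keywords would be sized largest in the cloud.
```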
At this stage of the project we have not integrated artificial intelligence into the system. As of now, the smartphone application (1) serves as an interface for storing the voice data, instantly accessible to both the therapist and the patient, and (2) presents them both with key visualizations based on prominent patterns of keyword usage, helping them make sense of the large amount of audio data. The app also allows the patient to alert his or her therapist whenever he or she goes through a sudden change of emotional state. Because the therapist has access to the patient's daily recordings, therapy can be broken into smaller consultation sessions, which are more useful than a single long bi-monthly session.
Conclusion

This short-term design project demonstrates the possibilities of wearable devices in the domain of mental health. By combining ubiquitous computing and the quantified self, we are faced with massive amounts of personal data, which could greatly enhance personal healthcare. The project is still in progress, and we are developing the complete ecosystem so that we can perform user testing with real patients.
Acknowledgements

Prof. Kate Hartman, Director of the Social Body Lab at OCAD University, Toronto, was the primary guide and mentor for this project, and Ms. Erin Lewis, Research Assistant in the Social Body Lab, was the secondary mentor and guide.
References

[1] S. D. Hollon and A. T. Beck. Cognitive and cognitive-behavioral therapies. In A. E. Bergin and S. L. Garfield (eds.), Handbook of Psychotherapy and Behavior Change (4th ed.), 1994, pp. 428–466.

[2] Andrew A. Sweet and Andrew L. Loizeaux. Behavioral and cognitive treatment methods: a critical comparative review. Journal of Behavior Therapy and Experimental Psychiatry 22, 3 (September 1991), 159–185.

[3] Ze-Jing Chuang and Chung-Hsien Wu. Multi-modal emotion recognition from speech and text. Computational Linguistics and Chinese Language Processing 9, 2 (August 2004), 45–62.

[4] Barbara L. Chalfonte, Robert S. Fish, and Robert E. Kraut. Expressive richness: a comparison of speech and text as media for revision. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '91), 1991.

[5] Francine Gemperle, Chris Kasabach, John Stivoric, Malcolm Bauer, and Richard Martin. Design for wearability. In Proceedings of the Second International Symposium on Wearable Computers (ISWC '98), IEEE, 1998.

[6] Hasim Sak, Françoise Beaufays, Kaisuke Nakajima, and Cyril Allauzen. Language model verbalization for automatic speech recognition. In Proc. ICASSP, IEEE, 2013.