

Musical Expectation: Evoking Emotions Through Deception

By Riley Murray Smith


Submitted as a senior thesis, 5/8/15




Abstract: Is there anything so mysterious as music? It has an undeniable ability to fill us with powerful emotions, despite being non-referential in nature. Where do these emotions come from? Since the middle of the 20th century, musicologists have been working with neuropsychologists to try to answer this question. This paper focuses in depth on the phenomenon of expectation, and on how violations of expectation in music can create moments of emotional affect in a listener. First, I will cover the biological and cultural components of expectation. Then, after a brief discussion of emotion in music, I will discuss and compare the leading theories in the field of expectation-based emotional response, looking in depth at the contributions of Leonard Meyer and David Huron. Lastly, using this body of research, I will analyze and propose strategies for composing music that targets and manipulates an audience's expectations and emotions.

INTRODUCTION

For many listeners and enthusiasts throughout the centuries, music has held a unique ability to conjure images in the mind. For these people, music can induce flavors, ideas, and textures of the extramusical world. This group is known as the Referentialists (Meyer, 1956). For much of music's history, Referentialism has been the dominant doctrine. A separate branch of thinking, known as Absolutism, holds that music is non-referential and self-contained. Absolutists argue that what makes music special is its unique ability to incite emotional sensations through sound despite being a closed, mathematical-like system (Meyer, 1956). Though these two ideologies have long been at odds with each other, it is not impossible, as Leonard Meyer points out, for them to coexist. In fact, in order to understand the complex and multifaceted ways music creates emotion, I believe it is essential to merge the findings of both groups into a larger framework.

Those who oppose Referentialism argue that the experiences formed by listening to music as a referential force are too subjective to approach in a scholarly way. While these experiences may not be universal, some are cultural. The society and culture we live in provide us with specific schemas, or learned categories, about music and its meaning. Much of what we learn to expect in general comes from our social surroundings. In this paper, I will show how these cultural expectations create emotional responses in listeners. One way to derive what a culture expects from its music is to study the probabilities within that culture's musical genres using mathematical theory, which I will address in detail later on.

Absolutists seek a more universal framework of music and emotion. In an effort to learn how sound produces an emotional response, they turn to biology.
The way the human ear processes sound vibrations and relays them to our conscious awareness is critical to understanding our reaction to music. Memory plays an important role in this process, registering and storing new information while simultaneously using developed schemas to predict the future (Snyder, 2000). Our expectations form based on the constant stream of information that our memory processes, leading to experiences such as anticipation, surprise, fear, and reward (Huron, 2008).

It is these schemas, from both biological and cultural learning, that form the basis of what we expect to hear when we listen to music. It is important to note that these schemas are not actually separated in our minds; biologically derived information influences cultural schemas and vice versa. They have been divided here only for the purposes of organization and critical inspection. First we will turn to the components that make up sound and the biological processes that bring them into our conscious awareness. As we will see, the act of hearing, the formation of memories, and the formation of expectations are all closely related. After that, we will take a closer look at culturally derived expectations.

BIOLOGICAL EXPECTATION

Before this examination of biological expectation, it will be helpful to briefly contextualize the fundamentals that our musical expectations address; this will frame the focus of the paper going forward. When music is broken down to its basic foundation, it is defined as sound over time. Our expectations of music can therefore be broken down into "what" to expect and "when" to expect it.
The "what" of music is sound. In its most elementary form, the phenomenon of sound consists of multiple sound waves occurring simultaneously. Every sound we hear consists of a fundamental frequency wave and a unique series of overtones: quieter waves that ring out at specific frequencies above the fundamental tone at the same time. These overtones resonate based on the physical properties of the sounding object. Every instrument and object produces a signature sonic character, or timbre, based on its shape, size, and material. Throughout our development as listeners, we create a vast library of schemas dedicated to timbre so that when we hear a sound, we can instantly identify its source. Our timbre-related schemas are so powerful that research shows listeners can typically identify the genre of a song after hearing only 250 milliseconds of audio (Huron, 2008). As an aside, if you want to create a new genre of music, this research suggests it may be more effective to base the genre on new or unique combinations of timbres, rather than on a new song structure or rhythmic pattern.

The fundamental wave of a sound produces the resonating frequency we commonly refer to as pitch. A typical human can hear pitches from 20Hz to 20,000Hz. Since the advent of polyphonic music (multiple pitches sounding at once) and tonality (Western harmonic structure), we have developed complex hierarchical schemas for pitch and harmony which also account for what we expect to hear. Since tonality is a cultural schema, I will address it in the following section. Timbre, pitch, and harmony are the elements of sound about which we form expectations.

Because music, like dance or film, is a time-based art form, we also create expectations of when a sound will likely happen. The element of sound responsible for our perception of timing is amplitude, or loudness. Every sound we hear has a unique amplitude shape as well as a timbral signature. A wave's amplitude shape is made up of attack, decay, sustain, and release components. These shapes help us identify the rhythmic content of a musical piece, which typically has a faster percussive attack compared to more lyrical melodic phrases. Most music is temporally organized by evenly spaced pulses that create regularity. We feel these pulses innately as we listen to music (Juslin et al., 2008). Since music tends to have steady rhythmic regularity, rhythm is an influential source of expectation. In Western music, rhythmic expectations are also organized into a hierarchical structure, centered around a conceptual "downbeat" (Snyder, 2000). The temporal expectations we form need not be limited to a periodic pulse. One example of a non-periodic rhythmic expectation is the sound of a bouncing ball (Huron, 2008). When we hear a ball bounce, we expect each successive bounce to come sooner as the ball loses energy and its bounces grow smaller. Although non-periodic rhythms are rare in Western music (especially before the 20th century), they are relatively common in the music of other cultures, such as those of Japan, Tibet, and West Africa (Huron, 2008).

Each of the properties of music discussed so far can be referred to as a "primary parameter" (Snyder, 2000). This means that it is stored in memory through a system of absolute values that are culturally learned. These are essentially the schemas we have for sound. For instance, the acoustic phenomenon of pitch is stored as individual notes. In the Western musical system, notes are exactly one half step apart. The piano's range spans from about 27Hz to 4100Hz, and that large range of frequencies is separated into 88 exactly tuned and distinct notes (Zuckerman, 2007). To listeners familiar with Western music, pitches are stored as notes with absolute values that are identified using the letters A-G.
This means that they can identify notes, the relations between notes (intervals), and patterns of notes (melody), across listening experiences and even between harmonic keys. These absolute values help listeners familiar with the cultural musical language identify patterns and form exact expectations. Like pitch, harmony and rhythm are also primary parameters in Western music (Snyder, 2000).

There are also secondary parameters in music, which cannot be easily recognized in terms of proportional relationships like primary parameters. Musical elements like loudness, tempo, and timbre are considered secondary parameters. Their changes are identified in simple relative terms of "more or less." Throughout a song, it is difficult to give an exact value of how loud an instrument is playing from one moment to the next; however, it is possible to tell when an instrument gets louder or softer. This is because secondary parameters are not stored in memory as absolute values, which means the expectations we form about them are more general. That does not make secondary parameters less important, just less exact (Snyder, 2000).

Our musical listening experience is filled with changes in these primary and secondary parameters. Throughout this paper, I will use the term "event" to signify a change in one or more of these parameters. We form our expectations around changes in these acoustical parameters as we semi-consciously predict what will come next (Snyder, 2000). Now that we have an understanding of these musical parameters, we will discuss the biological functions that help us hear, process, and remember sound. This discussion will help us understand how biological expectations are formed by shedding light on the way that we create schemas, or learned patterns and associations.

It is here, in the formation of biological expectation, that the fields of musicology, neuroscience, and neuropsychology meet. In order to determine how music affects us emotionally, it is necessary to explore how we perceive and identify sound. Neuroscience focuses on the brain's anatomy and the complex neural pathways of the auditory system. Once sonic sensations are identified and categorized into parameters, neuropsychology will direct us to our consciousness and the way our minds both process and predict information.

Physiological and psychological operations both play a role in expectation, an evolutionary trait developed as an extension of memory (Meyer, 1956; Juslin & Sloboda, 2011; Huron, 2008). From an evolutionary perspective, memory itself did not develop in order to store past events with chronological and accurate precision. Rather, memory is a modulatory structure with the explicit purpose of helping humans adapt, learn from past experiences, and plan ahead (Snyder, 2000). Memory is the basis for future thinking. To understand how our memory system takes in music and forms expectations, it is critical to first discuss the anatomical process of listening. Let's examine how sounds enter our brain.

In his book, "This Is Your Brain on Music," Levitin (2006) presents a fitting metaphor for the incredibly difficult task the brain faces in hearing and making sense of the world around it: "Imagine you stretch a pillowcase tightly across the opening of a bucket, and different people throw ping pong balls at it from different distances. Each person can throw as many ping-pong balls as he likes, and as often as he likes.
Your job is to figure out, just by looking at how the pillowcase moves up and down, how many people there are, who they are, and whether they are walking toward you, away from you, or standing still" (Levitin, 2006). This is essentially what the brain does during the first stage of auditory processing. In order to keep track of the information coming in, it is crucial for us to retain the sounds we hear as they excite the hairs in our inner ears and activate special neuron pathways. To do this, our brains utilize a system of temporally based memory, consisting of echoic memory, short-term memory, and long-term memory (Snyder, 2000).

The following table is a useful tool for understanding this system. The table maps out events according to their frequency in order to show how the memory system is structured. In echoic memory, "events" are single cycles of a sound wave. These cycles happen so fast that we perceive their frequency as audible sound; 256 events per second, for example, is roughly the pitch of middle C on a piano. When the periodicity of events slows to fewer than 16 events per second (below the roughly 20Hz lower limit of pitch), each event can be individually identified as distinct. As the table shows, short-term memory takes over at this point. In short-term memory, events take the form of rhythmic and melodic phrases. As we will see, short-term memory groups individual notes into these phrases in order to make them easier to process and remember. When phrases grow longer than short-term memory can account for, at roughly 10 to 12 seconds per event, our long-term memory takes over and analyzes each event as a formal section (Snyder, 2000). In popular Western music, these formal sections take the shape of stanzas and verses.


EVENTS PER SECOND      SECONDS PER EVENT

Echoic memory (pitch, timbre, amplitude):
    16,384             1/16,384
    8,192              1/8,192
    4,096              1/4,096
    2,048              1/2,048
    1,024              1/1,024
    512                1/512
    256                1/256
    128                1/128
    64                 1/64
    32                 1/32

Short-term memory (melody, rhythmic phrases):
    16                 1/16
    8                  1/8
    4                  1/4
    2                  1/2
    1                  1
    1/2                2
    1/4                4
    1/8                8

Long-term memory (formal sections, structures):
    1/16               16
    1/32               32
    1/64               1 min 4 sec
    1/128              2 min 8 sec
    1/256              4 min 16 sec
    1/512              8 min 32 sec
    1/1,024            17 min 4 sec
    1/2,048            34 min 8 sec

Table 1: Three levels of musical experience: memory and auditory processing (Snyder, 2000).
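To make the table's thresholds concrete, the mapping from event rate to memory system can be sketched in a few lines of Python. This is an illustration only: the function name and the exact cutoff values (about 16 events per second and about 12 seconds per event, taken from the discussion above) approximate perceptual boundaries that are fuzzy in reality.

```python
# Approximate boundaries after Snyder (2000); the cutoffs and the function
# name are illustrative, not canonical.

def memory_system(events_per_second: float) -> str:
    """Classify an event rate into the memory system that processes it."""
    if events_per_second > 16:       # faster than the ~20Hz pitch floor: fused into tone
        return "echoic"
    if events_per_second >= 1 / 12:  # events shorter than ~10-12 s: phrases
        return "short-term"
    return "long-term"               # formal sections, verses, movements

print(memory_system(256))    # middle C region -> "echoic"
print(memory_system(2))      # notes in a melody -> "short-term"
print(memory_system(1/64))   # minute-long sections -> "long-term"
```

The boundaries are thresholds of perception rather than hard switches, but the three-way split mirrors the structure of the table.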


Let us now look in depth at the first stage of memory, known as echoic memory (Snyder, 2000). During this stage, sound waves bouncing through the air are sorted and shaped into a perceptual and familiar sonic environment. The brain processes sounds using both "bottom-up" and "top-down" processes simultaneously. The "lower" parts of the brain contain the phylogenetically older systems, the unconscious, instinct-like structures that humans developed early in evolution, which break sound waves down into pitch, timbre, and loudness information. This is known as feature extraction (Snyder, 2000). These three parameters of sound are processed in parallel channels so that, for example, information about pitch is analyzed independently of loudness. This allows us to perceive changes in one component of sound while another stays the same (Levitin, 2006).

These extracted features are sent to the higher, more sophisticated systems of the brain in the frontal lobe, where the sound information is categorized into coherent events in a process known as perceptual binding. These higher systems are fed information from our long-term memory in the hippocampus, which stores the unique sonic features of instruments, voices, and other familiar sounds as schemas (Snyder, 2000). Without these schemas, every sound we hear would be unfamiliar, and we would have no way of understanding the bombardment of noise around us. This information is relayed back to the lower parts of the brain in order to identify what is important to listen to and what is background noise (traffic, crickets, etc.). This process of feature extraction and perceptual binding happens in less than a second (Snyder, 2000).

Because feature extraction uses top-down processes to focus our attention, we immediately hear much more than we can remember.
The reason our brains are wired to filter information in such a sophisticated way is that our conscious awareness is only capable of processing a small number of objects at a time. However, our perception of the world around us is made up of much more than the limited information that enters our consciousness. We process a large amount of sensory information outside of our consciousness in what is known as working memory. "Working memory…consists of immediate perceptions and related activated long-term memories, as well as contextual information that is semi-activated but not in consciousness and information that has just been in consciousness" (Snyder, 2000). One can imagine working memory as a bubble around our short-term memory, managing the contents of our auditory experience that are not immediately available to our conscious minds. Snyder also points out that expectations can be experienced through working memory. This means that we are able to process auditory information, extract its features, and form expectations about future sounds beyond the periphery of our conscious awareness. Whether we experience expectations as conscious thoughts or semi-conscious "feelings" about the future, they are able to affect us emotionally (Huron, 2008). If we have particularly strong expectations, or if our expectations prove to be false, the experience of expectation is more likely to enter our conscious awareness.

The auditory information that does enter our consciousness is mediated by the memory system known as short-term memory. During this stage of memory, sound events that have previously been categorized and prioritized with the help of long-term memory are selected and sent on a path through conscious awareness (Snyder, 2000). Although there is a great deal of information and processing occurring during the course of hearing, we are aware of only the small number of neatly organized events that enter our consciousness. Short-term memory lasts about 3-5 seconds and can stretch to 12 or so seconds at most. During this time, our consciousness can only process about 7-9 items without being overwhelmed (Snyder, 2000). To cope with this perceptual limitation, our brains have developed strategies for "chunking" information into easily digestible groups (Snyder, 2000; Levitin, 2006).

This perceptual limitation is not exclusive to music. To demonstrate how it affects informational intake, it can be helpful to look at language as a metaphor for music, a tactic we will return to throughout this paper. Speech is broken up into short sentences, relaying small chunks of information at a time. Sentence structure is designed to be compatible with our small window of conscious comprehension, lasting around 5 seconds and consisting of a small number of individual words and phrases. Easily comprehended music conforms to these same parameters to make each musical phrase compatible with our memory (Snyder, 2000). This perceptual foundation has therefore also played a role in the development of phrase-based melodic schemas such as pitch proximity, step inertia (Huron, 2008), and gap-fill tendencies (Meyer, 1956), which will be discussed in detail later. The perceptual limitations of short-term memory are at the core of our musical experience: musical organization has naturally conformed itself to these limiting factors in order to remain comprehensible.
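The grouping strategy behind chunking can be illustrated outside of music with a short sketch. The helper function and the 3-3-4 group sizes below are hypothetical, chosen to mirror how North American phone numbers are commonly read aloud:

```python
# Illustrative only: the 3-3-4 pattern is the familiar phone-number grouping.

def chunk(digits: str, sizes=(3, 3, 4)) -> list[str]:
    """Split a digit string into the given group sizes."""
    groups, start = [], 0
    for size in sizes:
        groups.append(digits[start:start + size])
        start += size
    return groups

print(chunk("5551234567"))  # ['555', '123', '4567']
```

Three chunks sit comfortably inside the 7-9 item limit of short-term memory, where ten separate digits would strain it; this is the same economy that musical phrase grouping exploits.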
Besides chunking and organizing information that passes through conscious awareness, short-term memory also keeps track of individual, song-specific expectations, developed while listening through the use of repetition in music, which David Huron (2008) refers to as "dynamic expectations." These expectations are formed by brief moments of exposure to material throughout a piece of music. In fact, the occurrence of any single tone or rhythmic gesture creates an expectation that it will appear again within the same song. Dynamic expectations are experienced in almost every work because of the high level of repetition that exists in music. Huron (2008) and his collaborator, Joy Ollen, concluded after analyzing a large body of cross-cultural musical works that, on average, there is a 94 percent chance that a musical phrase longer than a few seconds will repeat at some point. This extremely high level of repetition is unique to music. Imagine speech with this level of repetition; it would be unclear and intolerable.

This repetition also utilizes another important function of short-term memory: memorization (Snyder, 2000). In order to transfer a moment to long-term memory, where we hold our memories of the past, our consciousness must replay that moment a number of times. Memorizing a phone number, for instance, usually requires repeating the sequence of numbers multiple times out loud or in one's head (Snyder, 2000). Notice that when memorizing a phone number we are compelled to group the numbers into chunks of three or four digits. Instead of trying to fit a 10-digit number into short-term memory, which can only process 7-9 items at once on average, memorization and comprehension become easier with a grouping strategy. As stated before, this process also comes into play while listening to music. Music hijacks this system of repetition-based memorization.
By repeating musical figures throughout a piece, short-term memory engages listeners in the memorization process without their having to consciously replay the figures. As our brains adapt to this repeating information, they form strong expectations of sequence order. Our memory starts to predict the repetitions and sends signals back to echoic memory to attune our focus based on the dynamic expectations we have formed.

Aside from making music easier to memorize and digest, repetition also helps listeners adjust their attention toward the main melodic content. In order to conserve energy and protect ourselves, we naturally attune our senses toward new information and away from prolonged unchanging events (Huron, 2008). This is known as the "habituation response" (Snyder, 2000). In music, composers use the habituation response to distinguish between the background accompaniment and the foreground melody. Typically, accompaniments fall into regularly repeating rhythmic patterns, and as listeners become habituated to these patterns they no longer hold attention, which instead focuses on the new and changing details of a melody. This same response is in effect when we listen to non-musical sounds. Imagine you are driving alone in a car with the radio off. You would hear your car wheels rolling on the pavement and the soft din of traffic around you. When you turn the radio on, your auditory system focuses your hearing on the voices coming through the car speakers. The sound of the cars around you seems to lessen, perhaps only drawing your attention when a siren starts wailing in the distance. It is not that you stopped hearing the traffic (though perhaps you did, depending on how loud you played the radio), but that you became habituated to its unchanging sonic characteristics, which offered no new information.

So far we have seen that auditory processing in music is facilitated by a complex and sophisticated system. Our memory is critical to understanding what we hear. In fact, memory is essential to our entire experience of the present.
We depend on the coordination of echoic, short-term, and long-term memory to receive information from our senses, categorize it, and organize it in a way that our consciousness can understand. According to some research, we live approximately 80 milliseconds in the past (Stevens, 2012). During those 80 milliseconds, our memory works at lightning speed to produce what we experience in our conscious awareness as "the present." The reason I am focusing so closely on memory has to do with the claim I made earlier in this section: the purpose of memory is to facilitate future thinking. I am suggesting that memory processes are critical to both the present and the future, and that, therefore, our expectations of the future have an effect on our present experience. In my research I have found this interconnectivity to be crucial to the powerful effects expectations can create, which we will continue to uncover throughout this paper.

Now that we have addressed the processes of hearing, we will shift our focus to how this auditory information informs the schemas we use to create expectations. When listening to a new song, listeners depend on the internal repetition of the piece to inform their dynamic expectations. The information gathered from this first listen is sent to long-term memory as an episodic memory (Snyder, 2000). Episodic memories are personal memories of specific moments in the past that include the time, place, and other contextual information from that moment in one's life. With repeated listens to the same piece of music, the episodic memories of each listen are stacked together. Since the biological purpose of memory is preparation and not recall, the parts of each episodic memory that differ from one another, such as time and place, become blurry, while the musical information that has remained the same is reinforced. As we continue to hear a piece of music throughout our lives, the memory of each individual listening experience fades and we are left with a musical blueprint (Huron, 2008). For example, most people do not remember the first time they ever heard "Happy Birthday" sung, while simultaneously knowing the exact pitches and rhythm of the song. Our brain uses this musical blueprint to form what Huron (2008) refers to as "veridical" expectations. These expectations are the result of repeated exposure to a single piece, or even a single performance of a piece, of music. For music with which we are very familiar, these expectations are incredibly accurate. The internal repetition and logic of a work, once tracked as dynamic expectations, aids our memory in developing these strong veridical expectations.

Let's say that through this complex biological process, you have developed strong veridical expectations for the rock band AC/DC's song "Back in Black." The structure, melody, and each guitar lick are stored in the blueprint you have created. Now let's say you start regularly listening to the rest of the band's albums, forming detailed veridical expectations for a number of AC/DC songs. By comparing the blueprints of one song to another, you would find certain characteristics common to most AC/DC songs, such as four beats to the measure (meter), distorted guitars (timbre), and guitar solo sections (structure). Based on these shared characteristics you form a category, or schema, in your mind of what "AC/DC" music should sound like. This is the same kind of category that tells you how a flute should sound. These categories are called "schematic" expectations (Huron, 2008).
By continuing to compare other music you have heard with AC/DC, you might create schematic expectations for what "1970s rock" music should sound like, or even how "rock" music in general sounds.

Schematic expectations are the result of the comparison and categorization of our episodic memories into what are known as "semantic" memories. These memories are not tied down to particular events in life like episodic memories; instead they are the result of a synthesis of personal memories into schemas (Huron, 2008). These are as present outside of music as they are within it. Semantic memory includes such information as what a light switch does, what a Starbucks store should look like, and how shoelaces are tied. In the musical world, semantic memory-based schemas are the basis of "style" and "genre" (Levitin, 2006). Schemas can deal with physical acoustics; for example, classical music has an "orchestral" sound palette. Schemas can also store structural details; for instance, a 1-4-5 harmonic progression is typical in the blues. We use these details from our semantic memory to inform the schematic expectations that we bring to our listening experience every time we hear sound (Bruce et al., 2009). For instance, what we expect to hear in a rainforest is different from what we would expect in an office building. If you were on the forest floor and the sound of a fax machine cut through your sound environment, it would certainly violate your schematic expectations.

Because combining episodic memories is the only way to form schemas, one could say that every expectation is essentially a "biological" expectation. In the following section, we will look at what I refer to as "cultural" expectations. As we now know, even expectations we develop as a result of cultural exposure are technically of biological origin. I am forming this distinction in order to show that listeners who are exposed to a specific culture's music will form certain schemas that are shared among all listeners of that culture. The term "biological expectation" references schemas that develop from a listener's specific experience and taste, while "cultural expectations" refers to schemas that develop from general cultural exposure.

To summarize, biological expectations are formed by the events of personal experience. These expectations form over time based on the sounds and music we have heard, and some elements of this process are colored by the limitations of our anatomy. We hear much more than we can process because our short-term memory can only handle small chunks of information at a time. We consciously hear more than we remember because the biological purpose of memory is not recall but preparation, and our memories are therefore malleable and subject to change (Snyder, 2000; Huron, 2008). Both of these factors affect the way that biological expectations are formed, and we will return to them later when discussing tactics for manipulating expectation. Dynamic, veridical, and schematic expectations are the result of separate memory processes. Short-term memory forms dynamic expectations regarding the moment-to-moment intake of sonic information. Veridical expectations are the result of repeated listens to a particular song; they are the blueprints we store for each piece of music, and they grow out of episodic memories. Schematic expectations form by comparing different songs to each other, combining their shared elements into self-made schemas; they are a product of semantic memory.
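As a compact restatement of this summary, the three expectation types and their memory sources could be tabulated as a small data structure. The encoding below is my own sketch; the labels follow Huron (2008) and Snyder (2000):

```python
# Summary encoding of the taxonomy above; the structure itself is illustrative.
EXPECTATIONS = {
    "dynamic":   {"memory": "short-term",
                  "formed_by": "repetition within the current piece"},
    "veridical": {"memory": "episodic (long-term)",
                  "formed_by": "repeated listens to one piece"},
    "schematic": {"memory": "semantic (long-term)",
                  "formed_by": "comparison across many pieces"},
}

for name, info in EXPECTATIONS.items():
    print(f"{name}: via {info['memory']} memory, formed by {info['formed_by']}")
```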

CULTURAL EXPECTATION

"It is pointless to ask what the intrinsic meaning of a single tone or a series of tones is. Purely as physical existence they are meaningless… the relationship between tones themselves or those existing between tones and the thing they designate or connotate, though a product of cultural experience, are real connections existing objectively in culture." (Meyer, 1956)


! Unless we were to subject ourselves to total isolation, void of any human contact, it would be impossible to imagine life without the influence of the societies in which we live. How we speak, what we eat, and the things we value are all determined to some extent by our immediate and extended community. While Biological Expectation stems from the development of memory and personal experience, Cultural Expectation develops as a result of growing up within the confines of cultural norms (Huron, 2008). However, this separating categorization is more symbolic than tangible. In reality, these two facets of our identity inform each other, contributing details from the world around us to form the schemas we develop and use in expectation. Looking at the example I gave earlier about schemas within semantic memory, it is likely that one’s personal experience and cultural upbringing work in tandem to help develop schemas for light switches, Starbucks stores, and shoe tying. While the development of personal experience is complex, it is much easier to examine the construction of memories than it is to approximate the effects of cultural influence. Memory is based on direct exposure to a


specific and finite moment or event, though it becomes more complicated with repeated similar experiences and the other factors mentioned above. Cultural influence, on the other hand, is based on an ambiguous, general-level exposure that cannot be pointed to exactly. One cannot discuss the physiological intake of culture the same way one can the intake of sound.

There are many varied musical cultures around the world, and it is beyond the scope of this paper to account for them all. I will focus on the “Western” musical culture that originated in Western Europe, with roots reaching back to Ancient Greece. This musical culture is responsible for “classical music,” but its influence can still be seen in the popular music of America and Western Europe today. Aside from my considerable familiarity with this musical culture, I am focusing on Western music because of its relationship with scientific theory. Pythagoras, the ancient Greek mathematician, is credited with developing the Western system of notes as we know them. Musical notes in the Western system are organized according to the natural harmonic ratios of vibrating strings. While the tonic/dominant relationship in the Western tradition (discussed in detail below) is a cultural phenomenon, it is rooted in a mathematical theory of acoustics. The fifth scale degree of a key corresponds to the third harmonic of the harmonic series, just as the octave corresponds to the second harmonic. The dominant note of a key thus has not only a culturally held relationship to the tonic, but an acoustical one as well. For these reasons, as well as its established system of written notation, I believe the Western music tradition is best suited for scientific analysis and study. While some of the theories I will discuss throughout this paper may apply to other cultures, I am concerned here with Western cultural music.
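The acoustical relationship described above can be sketched numerically. The short Python example below (my own illustration, not drawn from any of the cited authors) folds the partials of a vibrating string into a single octave and shows that the third harmonic reduces to the 3:2 ratio of the perfect fifth:

```python
from math import log2

def reduce_to_octave(ratio: float) -> float:
    """Fold a frequency ratio into a single octave [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    return ratio

fundamental = 110.0  # an A2 string, chosen arbitrarily for illustration

# The first few partials (harmonics) of the vibrating string.
for n in range(1, 5):
    partial = fundamental * n
    folded = reduce_to_octave(n)
    cents = 1200 * log2(folded)  # interval above the tonic, in cents
    print(f"partial {n}: {partial:.1f} Hz, folded ratio {folded:.3f}, {cents:.0f} cents")

# The 2nd partial folds to the octave itself, while the 3rd partial
# folds to 3/2 -- the perfect fifth, i.e. the dominant scale degree.
assert reduce_to_octave(2) == 1.0
assert reduce_to_octave(3) == 1.5
```

The 3:2 ratio comes to roughly 702 cents, only two cents away from the equal-tempered fifth of 700 cents, which is why the dominant sits so comfortably in both the acoustic and the cultural account.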
In order to experiment with ways to manipulate cultural expectations and the schemas on which they are based, it is paramount first to produce a quantifiable account of cultural exposure in music. While such a system has yet to be explicitly proposed, it is possible to obtain usable data by examining the statistical properties of a culture’s musical canon, using information theory to determine which musical functions are most likely to occur at a given moment. In his book Music and Probability, David Temperley (2010) provides useful background on probability and information theory that can be applied to an investigation of culture-based expectations. In information theory, the term entropy describes the “level of uncertainty” in a body of data; the corresponding per-event quantity, surprisal, is low for a highly probable event and high for a surprising, unlikely one (Temperley, 2010). In other words, these measures quantify the predictability, or expectedness, of events. For example, when rolling two dice, a combined roll of seven (the most probable total) carries little surprise, while a combined roll of two or twelve carries a great deal. In this example the variables (the dice) are independent of each other. Probability gets more complicated when working with variables that depend on each other, which is the domain of conditional probability.

Bayes’ Theorem of conditional probability is an equation used to infer an underlying reality or model given its surface, that is, the conditional probability of “A” given “B” (Temperley, 2010). For our purposes, “A” is a cultural musical schema and “B” is the musical information that listeners hear. We can use conditional probability to try to establish structural models and culture-driven tendencies from the information provided by the Western music tradition. Employing Bayes’ rule on a sample of Western music from the 1600s


to the 1900s (B), we would invariably find an underlying system of functional harmony and tonality (A), the musical guidelines that structured all music from this culture during the given time frame.

Tonality is a framework that organizes the relationships between musical notes into a hierarchy. Tonal scales are collections of notes, typically 7 of the 12 possible notes in Western music, organized around a single note called the tonal center, which is repeated most often in a song. Each note in the scale has a function related to the tonal center. Subdominant chords lead to the dominant (the chord built on the fifth scale degree), which then returns to the tonic, producing an ebb and flow of tension and resolution. When music returns to the tonic, we experience a release of tension; a return to the tonic also tends to indicate a closural moment in music, which we will examine in depth in a later section. As this system developed over time to accommodate polyphonic music (multiple notes at the same time), relationships between stacked notes also developed. Triads—chords built from two stacked thirds, each spanning three or four semitones—took prominence over other note combinations. Even though music progressed and pushed against this system over the centuries, a large part of Western music today still retains the tonal scale system.

Tonality is a musical cultural schema, and therefore a source of cultural expectation (Margulis, 2005; Levitin, 2006; Huron, 2008; Temperley, 2010). Western composers and musicologists have a long history of “tampering with tonality,” manipulating harmonies to work against the entrenched schemas of functional tonal harmony. Some of these tactics have become so common that they have worked their way into the Western cultural schema framework.
For example, in the aptly named deceptive cadence, the dominant leads to the sixth scale degree instead of returning to the tonic, prolonging the tension at the very moment it should have resolved. The deceptive cadence disrupts a cultural prediction about which notes will come next and when the resolution will arrive. The ways in which music uses this expectation-based tension and release will be discussed later on; for now it is important to note that tonality has long been a prime source of Western musical expectations.

In his book Emotion and Meaning in Music, Meyer (1956) claims that within the system of tonality, minor scales are more ambiguous, since there are three different types of minor scale: natural, harmonic, and melodic. Because the 6, flat 6, 7, or flat 7 scale degrees may all occur in these scales, he attributes the pronounced feelings of tension that accompany the minor mode to this ambiguity. Temperley (2010) addresses this claim using probabilistic theory. He agrees that minor scales have enhanced tension-evoking properties, but he argues that this is due to the relatively low probability of their unique scale degrees rather than to their ambiguity. He claims that major scales are in fact more ambiguous, because different major keys typically share many of their notes. For instance, the diatonic hexachord containing C, D, E, G, A, and B exists in both C major and G major. From a probability standpoint, minor scales contain distinctive notes that are less likely to occur, which gives them more potential to break our expectations and arouse our attention. As we will see later, sound


that elicits increased arousal and attention is linked to both emotion and expectation (Meyer, 1956; Huron, 2008; Barrett, 2012).

Huron (2008) has also focused some of his research specifically on the Western tonal schematic system. Through his statistical testing of music, he has developed a model of resting expectation for Western listeners: what Western listeners may expect to hear before any music is played. In his tests, he found evidence that listeners tend to expect the things they have been exposed to most. Using this principle, Huron sampled a breadth of Western music to see which musical features listeners come to expect before they hear a note. He concluded that listeners from Western culture have resting expectations that a song will begin with a root-position F major chord, within a major scale and a binary (4/4 or 2/4) meter (Huron, 2008). His research presents a new possibility for manipulating expectations: accounting for deeply ingrained cultural expectations that color the experience of any listener exposed to Western music.

Temperley (2010) points out that schemas such as tonality are required to successfully create a low-probability event. Without the contextual reference of a schematic structure, there is no way to determine the likelihood of an event occurring. This means that a listener must be able to infer an underlying structure for an unexpected event to carry any weight. This is a fairly intuitive compositional principle, but by understanding its value through a probabilistic lens, we can better appreciate the expansive reach of cultural expectation within music. In order for accurate expectations to form, listeners must be familiar with the culture-based schematic properties of the music to which they are listening, and the music must employ enough schematic regularity to indicate which schemas listeners should apply.
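The dice example from earlier can be made concrete. The Python sketch below (my own illustration; Temperley’s actual analyses are far more involved) computes the probability of each two-dice total, the surprisal of individual outcomes, and the entropy of the whole distribution:

```python
from collections import Counter
from itertools import product
from math import log2

# Distribution of totals for two fair six-sided dice.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))
probs = {t: n / 36 for t, n in totals.items()}

def surprisal(p: float) -> float:
    """Information content of an event with probability p, in bits."""
    return -log2(p)

# A roll of 7 is the most expected outcome; 2 or 12 the least.
print(f"P(7)  = {probs[7]:.3f}, surprisal = {surprisal(probs[7]):.2f} bits")
print(f"P(12) = {probs[12]:.3f}, surprisal = {surprisal(probs[12]):.2f} bits")

# Entropy of the distribution: the average surprisal over all totals.
entropy = sum(p * surprisal(p) for p in probs.values())
print(f"entropy of two-dice totals = {entropy:.2f} bits")
```

The rare totals carry roughly twice the information of the common one, which is the quantitative sense in which an improbable musical event “says more” and arrests our attention.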
Bayes’ Theorem is useful for evaluating one structure, but in order to compare schematic structures and find the one that best predicts a body of data, Temperley (2010) suggests using “cross-entropy” analysis, taken from information theory. Cross-entropy analysis quantitatively shows how accurately a model (A) fits a body of data (B) (for detailed equations, see Temperley, 2010). Using this method of analysis, we can find other structural schemas present in music that form our musical expectations. By testing a model against a surface body of data, we can see how effective the model is at predicting the data and whether it is the best-fitting model or structure for predicting Western music. Tonality is primarily a schema for harmonic structure; the following examples focus on melodic expectations.

The first melodic model we will explore goes by a few different names. Some theorists, like Leonard Meyer (1956), refer to it as the “gap-fill” theory; since the research we are currently analyzing comes from Huron (2008), I will use his terminology: “post-skip reversal.” According to this model, melodies that leap by more than four semitones in one direction will tend to change direction by step afterwards. Through a series of tests using a large sample of Western and non-Western music, Huron and his collaborator, Paul von Hippel, hoped to find evidence that the post-skip reversal model fit the surface-level musical data they were testing. They discovered that in reality melodies tend to follow a pattern of melodic regression towards the tonic: if a note in a melody leaps away from the tonal center, the next note will redirect the melody towards the center by step. If a leap from


below the tonic does not cross over the tonic midline, their data show that the melody will likely continue climbing towards the tonic rather than descending away from it. These results are striking because the post-skip reversal technique is often taught in Western music theory classes. When they tested their new theory of melodic regression on a Western-enculturated listening audience, they found that musician listeners in particular tended to expect post-skip reversal rather than melodic regression, even though melodic regression was the better-fitting model. Von Hippel (2008) suggests this may be because gap-fill theory is taught in most theory classes, so musicians develop gap-fill expectations through educational exposure.

A second melodic model Huron and von Hippel examined was one proposed by Meyer (1956). Meyer suggested that melodic steps, intervals of one or two semitones, tend to be followed by further motion in the same direction. Huron (2008) has dubbed this theory “step inertia.” Consulting an international sample of music, Huron concluded that an ascending step is equally likely to be followed by a step up or a step down. Descending steps, however, were followed by another descending step a convincing 70 percent of the time. For melodies, then, it seems more fitting to use a model of step declination than step inertia, since ascending steps have no statistical disposition towards rising, while descending steps consistently continue downwards. Step declination has been well documented by the ethnomusicologist Curt Sachs (1962), who found what he calls “tumbling melodies” across many musical cultures: melodies that begin at a tonic, leap high above it, and then tumble back down in smaller steps (Huron, 2008). Huron’s work supports Sachs’s finding that melodies typically consist of descending steps.
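The step-declination statistic Huron reports can be illustrated with a short sketch. The melody below is invented purely for demonstration; the function simply tallies how often a descending step is followed by another descent:

```python
# Tally directional continuations after descending steps in a pitch sequence.
# Pitches are MIDI note numbers; a "step" is an interval of 1-2 semitones.

def continuation_after_descending_steps(pitches: list[int]) -> tuple[int, int]:
    """Return (descending continuations, total continuations) after descending steps."""
    down, total = 0, 0
    for prev, cur, nxt in zip(pitches, pitches[1:], pitches[2:]):
        interval = cur - prev
        if -2 <= interval <= -1:          # a descending step
            total += 1
            if nxt < cur:                 # followed by another descent
                down += 1
    return down, total

# A "tumbling" melody: leap up from the tonic, then fall back in steps.
melody = [60, 67, 65, 64, 62, 60, 62, 60, 59, 60]
down, total = continuation_after_descending_steps(melody)
print(f"{down}/{total} descending steps continued downward")
```

Run over a large corpus rather than one invented tune, a tally like this is the kind of surface statistic against which competing melodic models can be scored.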
This evidence seems to indicate that listeners should be more likely to expect consecutive downward notes than upward ones.

Huron and von Hippel (2008) tested this new model on Western music listeners to see whether their subjects’ predictions were in line with step declination. They discovered that listeners actually tend to exhibit expectations in line with the schema of step inertia, not step declination. The collaborators determined that probability plays a role in this incongruity. Since ascending steps are split 50/50 between rising and falling continuations, there is no statistical penalty for listeners who apply a step-inertia model of expectation. By always expecting a melodic step to continue in the same direction, Huron found, listeners would be right 62 percent of the time; if they used the step declination model, they would also be correct 62 percent of the time. Since there is no practical advantage to the step declination model over step inertia, the average listener may implement step inertia out of simplicity: it is easier to always assume continuation in the same direction than to add rules to the model when doing so yields the same predictive outcome (Huron, 2008).

These differences between model accuracy and tested listener expectation are a testament to the complex and interconnected process of Cultural Expectation. The findings suggest that it is not enough to account for the data present in a culture’s music to determine that culture’s musical schemas. The field of statistical music learning requires more in-depth tests, like the work done by Huron and his collaborators, to uncover the


additional probabilistic realities at play in the listening experience beyond exposure to a particular structure.

One reason composers are compelled to write step-declination melodies may have to do with speech. The existence of prosodic features shared between music and speech, such as pitch, duration, timbre, intensity, and accent, has been well documented (Thaut, 2005). Huron (2008) suspects our tendency towards step declination comes from the prosodic pattern known as “declination” (see also Snyder, 2000), in which speech lowers in pitch at the end of a sentence (except in interrogative forms). This is due to a physical drop in air pressure as the lungs empty while speaking.

Cultural rhythmic expectations in music are also formed by the language specific to a culture. Patel and Daniele (2002) found a distinctive difference between the average rhythmic tendencies in melodies by native English- and French-speaking composers. English rhythms tend towards figures that alternate between long and short notes, mirroring the abundance of stressed and unstressed syllable combinations in English words. French melodies, on the other hand, tend to exhibit a more isometric, even rhythm, reflecting the balanced meter of the French language. These findings provide strong evidence for the existence of cultural expectations, and perhaps a new latent approach to expectation manipulation.

Until now, our discussion of cultural expectation has focused solely on harmony and melody. Both of these primary parameters are important factors in creating expectations, though perhaps secondary to our most visceral and instinct-driven expectations, which are based on rhythm (Levitin, 2006; Juslin et al., 2009). As we saw before, the definition of music, put simply, is sound over a period of time.
Rhythm organizes the distribution of sound over time into identifiable structures; it is the source of the regularity and repetition in music that we use to ground ourselves. Cultural genre schemas have their own unique rhythmic figures that help identify them (Huron, 2008). In Western music, the most basic component of rhythm, and the source of our most powerful rhythmic expectations, is the downbeat.

Later on we will explore how the downbeat affects us emotionally; for the purpose of this introduction, it is enough to know that it is a powerful tool in the composer’s arsenal. The downbeat instills a release of tension similar to the one felt when a harmonic progression returns to the tonic center. One could even call the downbeat the rhythmic tonic, the place where harmonic and structural changes are most likely to occur (Snyder, 2000). In the standard Western 4/4 meter, the downbeat is the first and most strongly accented pulse of the measure. Within a measure of music there is typically a hierarchical structure of rhythmic stress on each pulse — “STRONG, weak, Strong, weak.” The third beat receives a smaller accent of its own, though not as strong as the downbeat. This is the typical rhythmic makeup of one measure of music in common time. Interestingly, this structural hierarchy is present across multiple measures as well (Huron, 2008). At the beginning of every four measures there is an extra-strong downbeat, where our expectations are even more primed for an accented pulse. The third measure’s downbeat carries slightly more significance than the second or fourth measure’s, much as the third beat does within a single measure. Every sixteen measures, a downbeat falls with


even greater anticipatory power, and so on. For centuries music has formed and exploited the power of the downbeat, turning it into the musical schema we carry today.

Just as the deceptive cadence works against tonality, there are techniques that go against the grain of the downbeat and have become schemas in their own right. These well-established techniques manipulate the expectation of the downbeat; the two we will specifically address are rubato and syncopation.

In Western music, the nineteenth-century Romantic movement was the heyday of rubato, a phrasing technique in which the musician plays out of time for expressive purposes. Rubato works best in rhythmically simple passages that emphasize the downbeat with long notes and harmonic changes (Temperley, 2010). When playing rubato, the exact location of the downbeat becomes blurred, increasing anticipation and the tension of resolution. The stylistic timing of rubato manipulates our expectations of the basic rhythmic pulse.

Another technique that smudges the rigid lines of rhythmic hierarchy is syncopation. Instead of loosening the pulse as rubato does, syncopation redistributes the power of the hierarchical structure to previously unaccented rhythms. In the Western tradition, jazz is the music best known for this technique. In a syncopated 4/4 meter, rhythmic stress may be placed on beats two and four instead of one and three, and accents may fall between the four beats of the measure. By moving the focus off of the downbeat structure, syncopation creates energy and anticipation as our brains respond to arousing rhythmic stimulation at moments we were not expecting (Temperley, 2010). Since syncopation requires a steady pulse to be most successful and rubato thrives on rhythmic simplicity, the two techniques do not produce the same effect when used together.
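The hierarchical stress pattern that both techniques play against can be sketched computationally. In the toy model below (my own illustration, not a formula from the cited authors), a beat’s metric weight is simply the number of hierarchical levels, from the single beat up to the sixteen-measure hypermeasure, on which it falls:

```python
# Metric accent as "how many levels of the hierarchy coincide here".
# Level periods in quarter-note beats, assuming 4/4: beat, half-measure,
# measure, then 2-, 4-, and 16-measure hypermeasures.
LEVELS = [1, 2, 4, 8, 16, 64]

def metric_weight(beat_index: int) -> int:
    """Count the metric levels on which this beat position falls."""
    return sum(1 for period in LEVELS if beat_index % period == 0)

# Weights for the four beats of the first measure: beat 1 strongest,
# beat 3 next, beats 2 and 4 weakest.
print([metric_weight(b) for b in range(4)])

# Measure 1's downbeat outranks measure 3's, which outranks measure 2's --
# the same strong/weak alternation recurring at the level of whole measures.
assert metric_weight(0) > metric_weight(8) > metric_weight(4)
```

Syncopation, in these terms, places accents on low-weight positions; rubato blurs where the high-weight positions fall in clock time.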
A passage full of both rubato and syncopation would not create the same kind of anticipation, because the rhythm would contain too many ambiguous factors for our brains to form proper expectations. Rubato requires a strong sense of when the downbeat “should” occur, and syncopation requires a steady pulse for “off the beat” accents to work against (Temperley, 2010).

To summarize, we form cultural expectations because we are constantly exposed to the culture in which we live. The way this exposure manifests as predictive schemas is complicated and depends on numerous variables. Probability theory offers some insight into this process by shedding light on what listeners are most accustomed to hearing. Using Bayes’ conditional probability models, theorists can produce quantifiable data on the likelihood of an underlying schema given a musical surface. Once these candidate schemas are discovered, cross-entropy equations can be used to compare the accuracy of different predictive models. Temperley (2010) and Huron (2008) are two researchers currently using information theory and probability to uncover information about our cultural exposure to music, and I believe more work of this nature will help peel back the many layers of cultural expectation that obscure our understanding. Huron and Paul von Hippel tested several models on a participating audience to find out which schemas listeners hold about melody. Unfortunately, their experiments did not differentiate the results of the musical models from one culture to the next, nor did they categorize the listeners in their study into any groups besides musicians and non-musicians. If both of these shortcomings were addressed, their data could have


provided still more valuable information about exposure and expectations across cultures.

Their studies nonetheless yield interesting results. They found that while the statistical evidence favors melodic regression over post-skip reversal, musicians tend to expect post-skip reversal, likely because of their exposure to music theory training. They also discovered that step inertia is a heuristic artifact: a schema obtained through personal experience that is not necessarily statistically accurate (Huron, 2008). Although listeners form expectations based on this model, the underlying structure of music across cultures follows a rule of step declination (likely due to prosodic speech declination). Because there is no statistical advantage to forming expectations based on step declination over step inertia, listeners apply the simpler step-inertia rule. Once we have addressed theories of emotion and expectation, we will return to these results and examine the emotionally affective properties of these cultural schemas.

Tonality and rhythmic hierarchy form the primary cultural expectations for “what will happen” and “when it will happen” in Western music. Tonality focuses on a pitch center, while rhythmic hierarchy focuses on the downbeat, the rhythmic center. Both systems evoke an ebb and flow of tension and release as music strays from and returns to these centers. As Temperley (2010) points out, these patterns and models are essential for creating the perception of predictability, and consequently expectation. Devices such as the deceptive cadence, rubato, and syncopation are exceptions to the rules of these systems that have developed into schemas of their own. Now that we have established what expectations are and how we form them while listening to music, we will focus on the ways that music creates emotional experiences and the role that expectation plays in this process.
In the next section, we will take a brief tour through emotion theory in relation to music.
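Before moving on, the cross-entropy comparison described above can be illustrated with a toy sketch. The sequence and conditional probabilities below are invented for demonstration (chosen to echo the statistics reported earlier); the point is only the mechanics of scoring two predictive models, step inertia versus step declination, against the same data:

```python
from math import log2

# A toy "corpus": the direction of each melodic step, U(p) or D(own),
# invented so that descents tend to continue downward while ascents split.
corpus = list("DDUDDDUUDDDUDDDUDDDD")

def model_prob(model: str, prev: str, cur: str) -> float:
    """P(cur | prev) under each toy model (probabilities are invented)."""
    if model == "inertia":       # expect continuation in the same direction
        return 0.7 if cur == prev else 0.3
    if model == "declination":   # descents continue down; ascents are 50/50
        if prev == "D":
            return 0.7 if cur == "D" else 0.3
        return 0.5
    raise ValueError(model)

def cross_entropy(model: str, data: list[str]) -> float:
    """Average bits per event the model needs to encode the data."""
    bits = [-log2(model_prob(model, p, c)) for p, c in zip(data, data[1:])]
    return sum(bits) / len(bits)

for m in ("inertia", "declination"):
    print(f"{m}: {cross_entropy(m, corpus):.3f} bits/event")
# The lower cross-entropy marks the better-fitting model for this data.
```

On this invented sequence the declination model scores lower, mirroring Huron’s corpus-level finding; on real data the same machinery simply runs over far longer sequences and richer models.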

EMOTION AND MUSIC

Before we begin a discourse on emotion, I want to clarify a few terms to keep our discussion focused and clear. In this paper I will use the definitions put forth by Juslin & Sloboda (2011) and Juslin et al. (2008). The terms “emotion” and “emotional affect” refer to relatively intense, short subjective responses brought on by a specific object or event. This experience should not be confused with “mood,” a less intense, longer-lasting affective episode that is not identifiably evoked by a specific object or event. The term “arousal” will refer to the physiological activation of the autonomic nervous system (ANS), which is responsible for involuntary changes in the cardiovascular, respiratory, and endocrine systems and other organ structures. “Attention” is a psychological phenomenon, often evoked by arousal, which heightens our conscious awareness of the event or object responsible for our physiological arousal.

“We learn most about a thing when we view it under a microscope, as it were, or in its most exaggerated form.” - William James (quoted in Juslin & Sloboda, 2011)

To begin our exploration of emotion in music, we will first briefly observe the phenomenon in its most sensational form, to see what music is capable of conjuring. Gabrielsson (1989, 2003, 2008) conducted an expansive study for precisely this purpose: to record the feelings and states brought on by music during what he called Strong Experiences with Music, or SEM. Recording personal accounts from a diverse sample of listeners, Gabrielsson sought to compare and classify the different affective reactions people had during particularly impactful musical experiences (Juslin & Sloboda, 2011).

Gabrielsson sorted these personal accounts into several groups based on the type of reaction his participants had during SEM: physical, perceptual, cognitive, emotive, existential, and reflective categories. Physical reactions included tears, chills, and dancing. Accounts in the perceptual category noted enhanced auditory focus and the tactile sensation of loud bass frequencies. Cognitive experiences included the activation of memories and associations and an increased awareness of receptivity and expectancy towards sound. Emotive responses to strong musical experiences ran the gamut of the emotional spectrum: listeners reported positive feelings like beauty and rapture, negative feelings like sadness and loneliness, and mixed feelings like nostalgia and bittersweetness. Regardless of the kind of emotion, these listeners reported intense, “overwhelming” feelings as well as tension. Existential reactions included transcendental, religious, and spiritual experiences. Finally, the reflective group voiced feelings of new insight into themselves and into music, as well as renewed confidence and courage (Juslin & Sloboda, 2011). The sheer number of experience types is profound, not to mention the diversified range of affected areas.
In order to maintain a focused trajectory, this paper will concentrate on the physical, cognitive, and emotive categories of musical experience. These categories may appear separate in subjective articulations of personal experience, but in this section we will examine the ways physical, cognitive, and emotive reactions actually interact, and the role that musical expectancy plays in their interaction.

William James (quoted above) and Carl Lange were two nineteenth-century scholars who independently developed similar theories about the origin of emotions. They hypothesized that emotions are the result of physiological arousal, a hypothesis now known as the James-Lange theory (Lang, 1994). Over the past century, the James-Lange theory has been a topic of debate and criticism; many critics, such as Wundt (1896) and Cannon (1927), felt that it was either backwards or overly simplistic (Lang, 1994). Regardless of its current standing, the theory serves as a milestone in emotion studies because it challenged the folk psychology of its time, shifting the field towards a scientific, quantifiable approach (Lang, 1994).

In 1944, Fritz Heider, influenced by James and Lange as well as by gestalt psychology, proposed his own theory of emotions, called “phenomenal causality.” His theory contends that the source of physiological arousal plays a role in its emotional affect. In the same way that a set of


three dots on a page could represent a line or a triangle depending on their orientation, physiological arousal is labeled, or colored, as an emotion through cognitive evaluation. The same visceral feeling, brought on by different sources, will yield different emotions (Lang, 1994). Leonard Meyer (1956) leans on the theory of phenomenal causality in his seminal work on musical expectation. In Meyer’s framework, expectations create tension, and this tension causes a physiological reaction that he calls an undifferentiated emotional experience: “When an organism is in a situation which results in affect, the situation plus the reaction gives us the name or word which characterized the whole as a specific emotion. The reaction itself is not sufficient to differentiate the emotion, the character of the situation is involved in the differentiation” (Meyer, 1956). Meyer argues that since music is a non-referential system, one that does not directly reference concepts or entities of the non-musical world, our emotional response to music tends to feel similarly mysterious and non-referential. If the theory of phenomenal causality holds that our emotional response is conditioned by the source of the physiological stimulus, and music (the source) is non-referential, then it makes sense that music can at times conjure ineffable qualities we cannot fully express.

But what exactly is it in music that causes the tension and visceral stimulation requiring our emotional evaluation? Huron (2008) believes that listeners tend to associate these subjective feeling states, or qualia, with specific notes and key signatures. “Philosophers use the term quale to refer to the subjective feelings that accompany sensory experiences” (Huron, 2008). Every aspect of our sensory experience elicits unique qualia, like the feel of sandpaper, the color blue, or the onset of anger.
Through conscious reflection, one can summon one’s own subjective quale associated with such sensory experiences, but it is difficult to put the exact experience into words. By framing musical emotional experience in terms of qualia, Huron takes into consideration the “mysteriousness” and ineffable quality of music that Meyer references above. Huron (2008) argues that music listeners hold qualia for individual notes within the tonal system: each scale degree elicits its own quale. However, while it may seem to listeners that these emotional properties belong to individual tones, they are actually the product of expectation and statistical learning. It may seem that our emotions are guided by the imagined “stability” of a tonic chord or the perceived “instability” of a leading tone, but through a process Huron calls “misattribution,” listeners systematically mistake musical notes for the source of tension and arousal. In reality, the probability and entropy of notes, learned through repeated exposure to cultural schematic devices (such as tonality), are responsible for musical qualia (Huron, 2008). The source of physiological and visceral arousal in music is actually expectation, and since, as I explained, the cognitive evaluation of arousal creates an emotional reaction, the source of our emotional reaction to music is likewise based on our expectations.

We have seen that our expectations are the result of both culturally acquired schemas and biological processes, and that these expectations are the source of the often ineffable, non-referential subjective feeling states that accompany music. It would seem logical, then, that a theory of emotion, one that turns those feeling states of tension into emotional responses, should account for both the cultural and biological components of expectation. Lisa Barrett (2012), a


psychology professor at Northeastern University, has proposed such a theory. Barrett claims that, “emotions are, at the same time, socially constructed and biologically evident.” Drawing on research from James-Lange to the present, she makes an argument that emotions are real. Barrett believes that an emotional response is a social mechanism brought on by sensory input and physiological arousal. Through this lens, emotions are both a cultural construct and a product of biological processing, not unlike expectation. Barrett’s (2012) hypothesis of emotion is as follows: “a momentary array of sensations from the world (light, sound, smell, touch, and taste) combined with sensations from the body (X) counts as an experience of emotion or a perception of emotion (Y) when categorized as such during a situated conceptualization (C).” Let’s put this hypothesis into the context of a musical experience. A momentary intake of sonic information, combined with feelings of tension and arousal (brought on by expectation), counts as an experience or perception of emotion when categorized as such through conceptualization. Barrett’s theory can be seen as an update of the now-dated theory of phenomenal causality. Her hypothesis assesses the visceral feelings of sensory experience by looking not just at their source, but also at the social/cultural significance of that source. Additionally, the emotion produced is not only a cognitive reaction but also a socially communicable construct that enables us to express our physical experience. ! ! In summary, the integration of scientific methodology within the field of emotion studies has been a century-long process. The James-Lange theory was the first attempt to account for emotions as cognitive interpretations of our physical experience. Since its inception, many researchers have taken to updating this theory, integrating it with new fields of study such as gestalt aesthetics (phenomenal causality theory).
The most updated model we have examined comes from Lisa Barrett. Her theory accounts for emotion within both physical and social realities. We approached the James-Lange model through a musical lens. Using the research of Meyer (1956) and Huron (2008), I have argued that the reactions we feel while listening to music, such as increased alertness, arousal, tension, and even changes in the endocrine system, are due to the expectations we form and test while listening. These changes in physical arousal lead to stimulation of our attention. Once we experience this initial sensation, these experiences of increased arousal and attention are processed in our minds and interpreted into emotions and qualia, which serve the purpose of relating our experience in a social environment (Barrett, 2012; Huron, 2008). The emotions chosen to classify our experiences are based on the source of our affective reactions (Lang, 1994), and since music is a non-referential source, the emotional states it induces can feel like nameless waves of tension (Meyer, 1956). ! ! “Though emotions are real in the social world, they both cause and are caused by changes in the natural world. They can be causally reduced, but not ontologically reduced, to the brain states that create them” (Barrett, 2012). By this Barrett means that although emotions are technically a cultural phenomenon, they are caused by specific brain processes. Following her assertion, in the next section we will look at the brain functions responsible for emotion in music. During this examination, we will see that throughout human evolution, cognitive emotional processes specific to music have developed along the way. These processes or


mechanisms have led to the adaptation of musical expectancy. I will argue that musical expectancy is the newest and most advanced mechanism, and that it utilizes the information from our prior emotional mechanisms. In this way, our examination of musical emotion in the brain is really an examination of the evolution of musical expectation.!

! ! BIOLOGICAL EVOLUTION OF MUSICAL EXPECTATION! !

! In this section we will look at the relationship between emotion and expectation in detail, examining the way our brain functions, develops, and incorporates expectation as a source of emotion in music. The brain mechanisms responsible for musical affect, according to Juslin et al.’s (2008) BRECVEM theory of music and emotion, are laid out in the table below. The theory’s name is an acronym, combining the first letter of each mechanism. The mechanisms of the BRECVEM theory are listed with their corresponding brain region, developmental stage, and emotional induction speed. By induction speed, Juslin et al. (2008) are referring to how quickly a mechanism creates an emotional response to music. According to this theory, expectation is the most advanced form of cognitive musical interaction. After reviewing the mechanisms in this table and their developmental significance, I will argue that expectation utilizes and links together the earlier-developed cerebral processes outlined in Table 2. By showing that musical expectation incorporates many different regions and functions of the brain, I wish to demonstrate that musical expectancy is invariably linked to emotion, and that it is a complex and advanced system. !

!

Mechanism               | Brain Region(s)                                              | Developmental Stage | Induction Speed
Brain Stem Reflex       | Brain Stem                                                   | Prior to Birth      | Fast
Rhythmic Entrainment    | Cerebellum                                                   | Prior to Birth      | Slow
Evaluative Conditioning | Cerebellum, Amygdala                                         | Prior to Birth      | Fast
Contagion               | Basal Ganglia                                                | 1st Year            | Fast
Visual Imagery          | Occipital Cortex, Visual Association Cortex                  | Pre-School Age      | Slow
Episodic Memory         | Medial Temporal Lobe, Hippocampus, Prefrontal Cortex         | 3-4 Years           | Slow
Musical Expectancy      | Perisylvian Cortex, Anterior Cingulate Cortex, Broca’s Area  | 5-11 Years          | Slow*

Table 2: Biological Mechanisms of Emotion (Juslin et al., 2008)!

*This study claims Musical Expectancy has a slow induction rate. I believe Juslin is referring here only to the final stage of expectation, appraisal, which will be discussed in detail later.!
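For readers who prefer to see the relationships in Table 2 explicitly, the table can be encoded as a small data structure. The names and values below are taken directly from the table; the representation itself is only an illustrative sketch, not part of Juslin et al.'s theory.

```python
# Illustrative encoding of Table 2 (Juslin et al., 2008), in developmental order.
# Each entry: mechanism -> (brain regions, developmental stage, induction speed).
BRECVEM = {
    "Brain Stem Reflex": (["Brain Stem"], "Prior to Birth", "Fast"),
    "Rhythmic Entrainment": (["Cerebellum"], "Prior to Birth", "Slow"),
    "Evaluative Conditioning": (["Cerebellum", "Amygdala"], "Prior to Birth", "Fast"),
    "Contagion": (["Basal Ganglia"], "1st Year", "Fast"),
    "Visual Imagery": (["Occipital Cortex", "Visual Association Cortex"],
                       "Pre-School Age", "Slow"),
    "Episodic Memory": (["Medial Temporal Lobe", "Hippocampus", "Prefrontal Cortex"],
                        "3-4 Years", "Slow"),
    "Musical Expectancy": (["Perisylvian Cortex", "Anterior Cingulate Cortex",
                            "Broca's Area"], "5-11 Years", "Slow"),
}

# Mechanisms with a fast induction speed, per the table:
fast = [name for name, (_, _, speed) in BRECVEM.items() if speed == "Fast"]
print(fast)
```

Ordering the mechanisms this way makes the developmental argument of this section visible at a glance: musical expectancy is the last entry, built atop everything before it.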


! ! ! ! “Brain stem reflexes” are the most primitive response to our surroundings. They facilitate general arousal in our autonomic nervous system based on changes in our sound environment, drawing our attention to novel auditory changes. These reflexes are developed prior to birth and, therefore, function independently of cultural influence (Juslin et al., 2008). As the name suggests, these basic reflexes are triggered in the brain stem, which is one of the phylogenetically oldest sections of the human brain. “Rhythmic entrainment” is another mechanism developed prior to birth. Originally acquired to link an infant’s heart rate with its mother’s in utero through the sound of the mother’s heartbeat, rhythmic entrainment is what helps us feel “on the beat” as we listen. Research shows that, through the mediation of the cerebellum, our internal circulatory and respiratory rhythms sync up with rhythmic musical stimulation. Rhythmic entrainment did not appear in Juslin et al.’s original study, but it was added to the BRECVEM theory in Juslin & Sloboda’s (2011) Handbook of Music and Emotion. I think this may be because rhythmic entrainment does not create what might immediately be identified as “emotion”. However, as we have seen, emotions are cognitive evaluations of physiological experiences, and rhythmic entrainment is a powerful tool for affecting a listener’s physiological condition. This mechanism may also help create a sense of community and shared experience during concerts (Juslin & Sloboda, 2011). ! ! “Evaluative conditioning” is a process that binds the experience of listening to music with the song itself. If a listener is repeatedly exposed to a song under pleasant and emotionally positive circumstances, they become conditioned to feel happy when they hear the song outside the context of those pleasant circumstances.
Evaluative conditioning need not be positive; it can also cause us to avoid music that we associate with negative experiences. Citing the work of Martin et al. (1984) and De Houwer et al. (2005), Juslin (2008) points out that evaluative conditioning typically establishes itself and induces emotions outside our conscious awareness, and the activation of awareness may actually lessen the effect of the mechanism. This process is also developed in utero and facilitated by the cerebellum, as well as the amygdala (Juslin et al., 2008), which is responsible for storing the emotions related to specific memories. Later on, we will see that the amygdala is directly connected to musical expectation as well. ! ! “Contagion” is developed in our first year of life and is subsequently our first culturally influenced emotional mechanism. Contagion refers to “a process whereby an emotion is induced by a piece of music because the listener perceives the emotional expression of the music, and then ‘mimics’ this expression internally” (Juslin et al., 2008). In this way contagion is related to the empathetic response. It also creates metaphors between music’s properties and the non-music world. Melodically imitating speech patterns or dynamically imitating a beach’s rolling waves are two examples of contagion. Music philosopher Eduard Hanslick (1884) argued that dynamic imitation was the primary way music elicited emotions. With the benefit of over a century of scientific research since his claim, it is now thought that mirror neurons, also responsible for group cohesion and social interaction, play a role in this process (Juslin et al., 2008). !


! “Visual imagery” is essentially the imagination at work. If contagion creates the metaphoric comparison to an ocean, visual imagery creates the image of rolling waves in the mind. The exact way these imagined images create an emotional experience is still a mystery, but many theorists, including Meyer (1956), have suggested that it likely plays an important role in musical emotion. Some research has been done on the kinds of musical characteristics that help conjure vivid imagery. According to McKinney & Tims (1995), repetition, predictability of musical elements, and slow tempo are especially effective in stimulating vivid imagery (Juslin et al., 2008). We will soon see that predictability in music can elicit a positive emotional response. ! ! As previously discussed, “episodic memory” refers to personal memories that are time and place specific. To me, it seems that episodic memory is a particular case of evaluative conditioning, in which the memory of a specific past event has bound itself to a piece of music and triggers the memory each time it is heard. We not only associate music with a positive or negative emotional state; at times we associate it with a specific episodic memory as well. As we saw in Gabrielsson’s (2008) study, emotion brought on by episodic memory is a trait of strong experiences in music (Juslin and Sloboda, 2011). According to Juslin (2008), the episodic memory mechanism develops in children between the ages of 3 and 4 years old. This might help explain why it is difficult to remember events in our lives prior to turning 3 years old. Now that we have covered the other mechanisms of this chart, we will analyze musical expectancy and see how it utilizes them.! ! From the ages of about 5-11 years, children develop a system of musical expectancy. This system consists of many brain regions working together to create expectations, test them in real time, and enforce correct predictions.
Broca’s area, typically associated with speech production, is thought to store structural information for both language and music (Fadiga et al., 2009). This research suggests Broca’s area may be a center for storing musical schemas. In order to analyze incoming music and apply expectations, the left Perisylvian Cortex is used in auditory short-term memory (Koenigs et al., 2011). Koenigs’ research shows that the left Perisylvian Cortex is important for auditory language comprehension. As we have seen, music, like language, requires organizational grouping in short-term memory in order to easily comprehend what we are hearing. This research suggests that the left Perisylvian Cortex may be involved in this organization process. Once our expectations are tested, the dorsal Anterior Cingulate Cortex, or dACC, uses positive or negative emotional stimuli to enforce the learning of accurate predictions (Brown & Braver, 2005). “The dACC may play a special role in reward circuitry— particularly in reward-based decision making, learning, and the performance of novel (non-automatic) tasks—functions known to be substantially influenced by dopamine” (Bush et al., 2002). These are just a few of the many brain regions associated with the complex system of musical expectation. ! ! According to Juslin’s BRECVEM model, each of these mechanisms is responsible for distinct emotional experiences related to music. While I agree that each mechanism in the model features a unique process and type of emotional experience, I would argue that each mechanism is also a developmental step towards musical expectancy, which incorporates the


brain regions and functions of the earlier developed systems. Let’s now look at the BRECVEM model through this new perspective, in which musical expectancy is the culmination of previous developments. ! ! “Episodic memory,” addressed earlier in the section on biological expectation, has clear and obvious ties to musical expectation. We depend on episodic memories to help create veridical expectations as well as schematic expectations. The memory network in general is paramount in the creation of expectations. The “visual imagery” mechanism seems less related to expectation on the surface. However, as I mentioned before in reference to research conducted by McKinney & Tims (1995), repetition and predictability of melodic, rhythmic, and harmonic content have been shown to increase the production of visual imagery while listening to music. Despite the fact that visual imagery is formed in the occipital cortex (Juslin et al., 2008) and music is experienced through the auditory cortex, there seems to be an underlying and interesting connection between the mechanisms of visual imagery and musical expectancy. As our first emotional process developed after birth, the “contagion” mechanism has clear ties to cultural and social learning. Contagion represents our earliest stages of exposure-based schema learning. In order to form metaphor-like connections between a musical stimulus and an extra-musical experience, we must rely on schemas held within the mind. We use these schemas to determine the properties of the objects we wish to compare, and by finding similarities in their schematic makeup, we can discern metaphoric similarities between those objects. As our minds develop, these schemas become crucial to forming expectations. ! ! “Evaluative conditioning” binds an experience to a specific time-emotion event unconsciously (Juslin et al., 2008). I would argue that this mechanism, developed in utero, is an early stage of memory.
Evaluative conditioning affects our current state by recalling and recounting a previous experience, much like episodic memory. Interestingly, like evaluative conditioning, expectations can also occur unconsciously (Snyder, 2000), and they both involve the amygdala (Huron, 2008). I think this is clear evidence of a relationship between the two mechanisms. “Rhythmic entrainment” is the key to our sense of timing, coordination, and rhythm-based expectations. Processes in the cerebellum are responsible for the feeling of regular pulse we experience when listening to music. Our rhythmic hierarchy, structured around the tension-relieving downbeat (Huron, 2008), owes its power to rhythmic entrainment. Musical expectations depend on rhythmic entrainment to develop a sense of when things are expected to happen. Finally, the “brainstem reflex” mechanism is used to attune our hearing to novel incoming information. These reflexes are critical for expectations because they alert our attention and arousal levels when an unexpected sound occurs. “Sounds that meet certain criteria (e.g., fast, loud, noisy, very low- or high frequency) will therefore produce an increased activation of the central nervous system” (Juslin et al., 2008). Our feeling of surprise is dependent on these unconscious reflexes. !

! ! ! !


TENSION AND RELEASE!

!

! Now that we have explored the biological and cultural processes of expectation, the socio-cultural phenomenon of emotion, and the evolutionary manifestation of musical expectancy, we will turn our attention to some models of expectation to gain a better understanding of what happens at the moment of an expected or surprising event. Previously, I have noted that schematic structures such as tonality and rhythmic hierarchy create tension and release. Now we will explore exactly how each of those experiences manifests in music, and how tension and release are related to expectation and emotional reactions to music. In this section, we will look at theories proposed by Meyer (1956), Huron (2008), and Margulis (2005) regarding tension. After that, I will examine the experience of release. As we will see, release or “closure” in music is as important as tension in creating an emotional experience. In fact, our expectations are strongest during moments of closure (Meyer, 1956; Huron, 2008). This discussion will lead us to an examination, in the next section, of compositional techniques based on the following models and all of the research we have previously explored.! ! Leonard Meyer (1956) was a pioneer in the field of music expectation. In his book, Emotion and Meaning in Music, he developed the argument that musical expectation plays a role in affective musical experiences. As we saw in the previous section, Meyer’s work is influenced by Heider's (1944) phenomenal causality theory. His theory also draws on the research of J.T. MacCurdy (1925), which stipulates that emotions are brought on by an arrest of an automatic instinct. MacCurdy’s framework involves three phases. The first phase is the arousal of nervous energy in connection with an instinct. In the second phase, the instinct is blocked or deferred. Finally, in the third phase this blocked energy becomes conscious and manifests itself as felt emotion.
Meyer (1956) replaces the term instinct with tendency, which he defines as “a patterned reaction that operates, or tends to operate, when activated, in an automatic way.” Meyer’s theory is that these tendencies, either natural or learned, become psychologically ingrained. When tendencies are automatically brought up and fulfilled, the operation is unconscious. When one of these patterned reactions is blocked or disturbed in relation to its structural or temporal properties, the resulting consequence is conscious emotional affect. “Such conscious and self conscious tendencies are often thought of and referred to as ‘expectations’” (Meyer, 1956). To translate his theory into the vernacular of this study, all tendencies are expectations. They come from learned patterns stored in our minds as schemas. Tendencies/expectations are brought up automatically as we listen, as a habituated response pattern. When music takes an unexpected turn and our tendencies/expectations are not fully met, the result is a cognitive emotional response. Meyer notes that unlike most scenarios, in which tendencies are brought up and inhibited by different sources, in music a tendency is brought up, inhibited, and resolved by the same source.! ! David Huron (2008) developed Meyer’s theories on tendency into what he calls the ITPRA model of Expectation. The acronym references five different stages that occur as we form expectations and react to the accuracy of our predictions. His model breaks down the theories of MacCurdy (1925) and Meyer (1956) in an attempt to establish a multi-stage model of


expectation. ITPRA stands for imagination, tension, prediction, reaction, and appraisal. The first two processes occur before the onset of a musical event, in what Huron refers to as the “pre-outcome” period. This period is where our expectations of changes in musical parameters, be they veridical, dynamic, or schematic, are formed using a combination of biological and culturally developed schemas. The next two stages happen at the moment of the musical event, depending on our expectation of that event. If listeners’ expectations are correct, they enter the prediction stage. When their expectations prove to be false, they enter the reaction stage. Finally, after the event has occurred, the listener has a chance to analyze the event in question in relation to his/her perceived expectations, which Huron calls the appraisal stage. To better understand the relationship between these stages and the affective results they produce in a musical context, we will begin a more in-depth analysis of Huron’s ITPRA model, starting with the imagination stage. ! ! The first stage of Huron’s model has an indefinite time length. The “imagination” stage can begin anywhere from a few seconds to years before an event. In a musical setting, expectations formed in the imagination stage can range from something vague—like the content of your favorite band’s forthcoming album—to something precise—like the accuracy of a cover band’s rendition of a song for which you hold strong veridical expectations. Huron’s ITPRA model approaches expectation by uncovering its evolutionary function, a focus we will see return throughout this discussion. Huron explains the purpose of imagination, from an evolutionary standpoint, as a behavioral motivator. When we imagine the outcome of an event, we also imagine the accompanying emotional response we expect to have.
Our brains actually cause us to feel a small amount of this imagined emotion chemically, motivating us to experience or avoid the imagined event depending on whether our emotional prediction of it is positive or negative (Huron, 2008). The reason we listen to music in the first place comes from the expectation that it will be a positive event; the ensuing positive feeling from imagining our expectations fulfilled motivates us to pop in a CD and listen. This stage is also responsible for the human power of deferred gratification. The ability to imagine the positive experience of resolved tension in the future allows us to sit through moments of high tension in music as we wait for resolution. ! ! The next stage in Huron’s model is the “tension” stage. This stage begins moments before an expected event is about to occur. Again, Huron approaches tension through the lens of evolutionary adaptation. Organisms are more likely to survive and reproduce if they can plan ahead and prepare for future events (Huron, 2008). During the moments before an expected event, a well-equipped organism will prepare itself to react at the precise moment with the correct response in order to minimize energy expenditure. To do this, our tension response is calibrated depending on two factors: the uncertainty of “what” or “when” an event will occur, and the perceived importance of that event (Huron, 2008). If listeners are uncertain when the event they expect will occur, they must prepare themselves for it over an extended period of time, raising arousal and attention levels to match the expected outcome. This feeling of heightened energy is what we refer to as tension. !
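Huron's two calibration factors can be caricatured numerically. The sketch below is my own toy formulation, not Huron's: it treats uncertainty as the Shannon entropy of a listener's distribution over possible outcomes and scales it by perceived importance. The probability values are invented for illustration.

```python
import math

def outcome_entropy(probs):
    """Shannon entropy (in bits) of a distribution over possible outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def tension(probs, importance):
    """Toy tension index: uncertainty about what/when an event will occur,
    scaled by its perceived importance (0 to 1). Illustrative only."""
    return outcome_entropy(probs) * importance

# A strongly expected continuation (one outcome dominates) versus an
# ambiguous passage with several competing continuations:
certain = tension([0.9, 0.05, 0.05], importance=1.0)
ambiguous = tension([0.4, 0.3, 0.3], importance=1.0)
print(round(certain, 2), round(ambiguous, 2))
```

Under this caricature, the ambiguous passage yields the higher tension value, and scaling by importance captures why structurally significant downbeats feel more tension-provoking than others.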


! A level of uncertainty about what exactly will occur can also cause tension. In Huron’s (2008) example, a baseball player is standing in the outfield waiting for the batter to hit the ball. He is unsure whether the batter will hit the ball and where the ball will go, but he can correctly predict that the event will only occur after the ball reaches the batter. He expends energy the moment before the ball is hit, positioning his body in a general “ready stance” as he prepares for multiple possible scenarios. Huron stipulates that at times when listening to music we may have multiple competing expectations about what will occur, and, like the outfielder, we experience tension as we weigh the options, uncertain which expectation will come to fruition. Well-prepared organisms must also consider how important an event is in order to conserve energy. It is critical to match our levels of arousal and attention to the perceived importance of an event in order to minimize energy expenditure. This factor is what causes certain downbeats within a hierarchical rhythmic structure to feel more tension-provoking than others. Depending on the pattern of a song, harmonic and structural changes are more likely to occur on certain downbeats than others, evoking a heightened state of tension during the onset of those beats (Huron, 2008). As we will see later on, our expectations are stronger during moments where we expect to experience closure and release. This may be due in part to the fact that we consider moments of closure “important” in the overall structure of a song. ! ! Now that the expected event has occurred, and a shift in primary or secondary musical parameters has taken place, our minds can begin to process the tension (arousal and attention) we experienced as either a positively or negatively valenced event, and subsequently as an emotional reaction.
The first stage we will look at, known as the prediction stage, occurs when our expectations of an event were correct. The prediction stage will always yield positively valenced feeling states (Huron, 2008). When our expectations are met, our brain rewards us with dopamine, reinforcing the schema upon which those expectations were formed. As previously mentioned, the dorsal Anterior Cingulate Cortex (dACC) is known to play a role in this chemically induced positive reward system. From the conception of an expectation during the imagination stage through the prediction stage, the dACC is thought to chemically influence positive expectations and reward us when we are correct (Bush et al., 2001). To analyze this stage in a more musical setting, the prediction phase may be responsible for the pervasive repetition present in music of all cultures. As listeners, we are rewarded when our expectations are met. The satisfaction we feel when a piece of music predictably arrives at the tonic is a product of the prediction effect. Similarly, the recurrence of the downbeat is also gratifying because of this stage (Huron, 2008). In this way, I believe the prediction stage is responsible for our experience of closure. I will expand on this argument momentarily. Simplicity and repetition are both important tools for composers. Knowing what to expect in a song through schema-driven expectations is important for bringing an audience into a piece of music; things start getting more interesting and complicated when those expectations are suddenly or subtly thwarted. ! ! The reaction stage comes into effect the moment we realize our expectations were wrong. Any time our expectations prove false, our immediate reaction is negatively valenced (Huron, 2008). According to the research of LeDoux et al. (1997), the information from our auditory senses delivering this bad news goes straight from the thalamus to the


amygdala (Huron, 2008). Among other functions, such as storing memories associated with emotional experiences, the amygdala mediates the affective significance given to stimuli from our sensory organs. When our expectations fail us, the amygdala alerts the PAG (periaqueductal gray), our central defense system famously responsible for the fight, flight, and freeze reactions (Huron, 2008). This initial fear response is part of the physiological arousal response we have while listening to music. The negatively valenced reaction serves the same function as the positive effect from correct expectations: it helps us learn from our mistakes in order to avoid the negative feeling associated with failure. The reaction stage is immediate, innate, and always negatively valenced. In the non-musical world it is an important mechanism for survival. The ability to react to an unanticipated and potentially threatening situation has played a key role in our evolution, and even though we know music to be nonthreatening, Huron (2008) argues that this mechanism is too important for our survival as a species to ever be manually shut off or ignored. ! ! So how is it that we have come to enjoy and even crave violations of expectation, especially in music? In Huron’s model, he offers his own “contrastive valence” theory. In the final appraisal stage of expectation, occurring typically one to two seconds after the event in question, listeners reflect on the accuracy of their expectations. If expectations were met exactly and the prediction stage was triggered, the event has already produced a positively valenced feeling which has reinforced the responsible schema. In this scenario our expectations may be formed, tested, and confirmed in working memory, outside the small bubble of our conscious awareness, and though we feel the positive reinforcement we may be unaware of its exact cause.
This may lead listeners to attribute to musical notes qualities that are really a result of unconscious expectations. However, when expectations fail and the reaction stage is triggered, the flow of information is actually split into two streams as it leaves the thalamus. As previously mentioned, the immediate reaction response travels through the amygdala to the PAG. The slower appraisal response takes a detour through the cerebral cortex, where the overall situation is assessed with regard to the consequence of our false expectations. If our appraisal stage deems the fear reaction we initially had unfit for the situation, our cerebral cortex will override the reaction response, triggering a positively valenced appraisal response (Huron, 2008). The level of contrast between our initial negative reaction and subsequent positive appraisal causes a pleasurable feeling. ! ! Huron sees this contrastive valence reevaluation as the primary cause of emotion through expectation. Our original negative and primal-like arousal is analyzed and reversed almost simultaneously by the cognitive appraisal response. According to Huron’s model, the reaction response increases our attention and arousal, while the appraisal response assesses the situation and reinterprets the expectational folly as a non-threatening and exciting emotional experience. Huron’s ITPRA model and contrastive valence theory are in line with Barrett’s (2012) findings on emotion in music. Huron (2008) believes that musical emotions are the result of physiological stimulation, caused by expectation. Correctly predicted changes in musical parameters create a dopamine-fueled pleasurable sensation. Incorrect expectations cause an


initial fear response as the PAG triggers an increase in arousal. During the appraisal stage, the results of our expectation are cognitively assessed, resulting in a valenced emotional response. ! ! Huron’s model solidifies the assertion that the emotional reactions we have to music are the result of changes in our arousal and attention. The research presented here shows that these changes in arousal and attention are the result of our expectations of music. Incorrect expectations create tension, while correctly predicted changes in musical parameters trigger release. When our expectations are false, music can surprise us. In order to look closer at this special case of false expectation, we will turn our attention to Margulis’ (2005) theory, which gives insightful details and shades to our experience of surprise. !
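The post-outcome logic of the ITPRA model described above can be sketched as a toy decision procedure. This is my own simplification for illustration, not Huron's formalism; the numeric valence values are arbitrary placeholders.

```python
def post_outcome_response(expected, actual, threatening=False):
    """Toy sketch of ITPRA's post-outcome stages. A correct prediction yields
    an immediate positive valence; a wrong one triggers a negative reaction
    that appraisal may override, with pleasure proportional to the contrast
    between reaction and appraisal (Huron's 'contrastive valence')."""
    if actual == expected:
        # Prediction stage: dopamine-backed reinforcement of the schema.
        return {"stage": "prediction", "valence": 1.0}
    reaction = -1.0  # Reaction stage: immediate, innate, always negative.
    if threatening:
        # Appraisal confirms the threat; the fear response stands.
        return {"stage": "reaction", "valence": reaction}
    appraisal = 1.0              # Cortex deems the event non-threatening.
    contrast = appraisal - reaction  # Larger contrast, more pleasure.
    return {"stage": "appraisal", "valence": contrast}

# A surprising but harmless deceptive cadence ends up positively valenced:
print(post_outcome_response(expected="V-I", actual="V-vi"))
```

The point of the sketch is that the failed expectation ends up with a *higher* final valence than the fulfilled one, which is exactly the counterintuitive result contrastive valence is meant to explain.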

! ! SURPRISE! !

! Elizabeth Margulis (2005) has developed a theory for musical expectation that focuses specifically on melody, called the Melodic Tension Theory. She developed her theory as a way to mathematically compute the expectedness or entropy of each note in a melody, in order to predict the emotional reaction a listener may have to a given melody. Her melodic tension theory breaks surprise into a three-tiered scale, responsible for different kinds of subjective qualia experienced by a listener. Margulis’ theory complements Huron’s ITPRA framework well, adding complexity to his stages of expectation in order to account for the different physiological feelings associated with surprise. The first type of tension, in which the listener correctly predicts a musical event, is referred to as “expectation tension,” the equivalent of Huron’s prediction stage. On top of evoking pleasant feelings through predictive reward circuits involving the dACC, Margulis stipulates that expectation tension is associated with qualia of forward-moving directness in a melody (Margulis, 2005). When our expectations are fulfilled, it creates a feeling of comfortable movement and progress. ! ! The second kind of tension in Margulis’ theory is “denial tension”. This kind of tension occurs when an unlikely musical event unfolds instead of a more expected one. When this situation takes place, the low entropy of the occurrence is experienced as increased intentionality in the melody. When the melodic contour takes an unexpected turn, she claims that our physiological response creates the illusion that the melody is intentional, purposeful, or strong-willed. I believe denial tension may be responsible for the special emotive quality of music in minor scales. As I discussed earlier, Temperley (2010) insists that minor scales are more affective because of the low entropy of their notes. This low entropy results in less expected events occurring in place of more typical ones.
According to Margulis (2005), that means minor modes create a sensation of intentionality and strong-willed music. I find this to be an accurate description of the emotional qualia I associate with music in minor scales, especially the rarer melodic minor. ! ! The third type of tension in her model is “surprise tension.” When a melody suddenly goes in a direction that listeners could not have anticipated using any of their currently held schemas, it evokes surprise tension. According to her model, this type of tension results in


feelings of intensity and dynamism, stimulating increased arousal and closer attention from the listener. Surprise tension shares similarities with Huron’s reaction stage. This implies that Huron’s model refers specifically to complete surprise. By combining his model with Margulis’, we gain better insight into the nuanced and complex relationship between music and expectations. ! ! Huron (2008) also discusses in depth the increased arousal associated with surprise in music. From his lens of evolutionary causality, an unexpected surprise can be fatal for an organism unless it learns to react quickly with a fear response. Huron explains that this increased arousal in reaction to surprising musical events is associated with our fight, flight, and freeze functions of survival. In Huron’s discussion of surprise, he argues that it is capable of producing three kinds of involuntary reactions: frisson, laughter, and awe. He details the potential origins of each of these reactions, but here I will only discuss frisson, as it is often cited as a unique and mysterious force of highly affective music. Frisson is essentially the raising of hairs on the body, or piloerection, used by mammals to appear larger and more intimidating to other creatures they are preparing to fight. Frisson can be induced in music by a sudden loud event, occurring during a modulation with a potentially unanticipated onset (Huron, 2008). The immediate response to this loud and surprising modulation is fear. As our bodies prepare to fight, the appraisal stage reinterprets the musical event as non-threatening, and through contrastive valence we experience the hair raising as pleasant. Frisson is an “exaptation”: a physiological response borrowed and repurposed during mammalian evolution. Originally, mammals adapted piloerection for thermoregulation, using erected hairs to trap a layer of heat around the body. 
Later it was repurposed as a fear-based fight response because the raising of body hair makes mammals appear larger and more formidable (Huron, 2008). During musically-induced frisson, our hairs raise up in a fight response, yet we feel a sensation of cold due to piloerection’s original evolutionary purpose of keeping mammals warm. There is also evidence that putting an audience in a colder environment improves the likelihood of evoking frisson (Huron, 2008).! ! If frisson is a product of surprise, how is it that music we have listened to many times can still produce this pleasant feeling? Huron suggests that while a listener may develop very strong veridical expectations for a specific piece of music, they will still hold onto more general schematic expectations about music at large. Even though we have a veridical prediction of the ensuing events, music that breaks our schematic expectations remains surprising to us. Expectations based on semantic memory are much more fixed than veridical ones (Huron, 2008). It takes a great deal of exposure to alter one’s schematic expectations. This is an example of Margulis’ denial tension at work. The expectations we hold from exposure to a particular piece of music are at odds with the general schematic expectations we have that are based on semantic memory instead of episodic memory. According to Margulis, then, frisson also produces a feeling of increased intentionality and will. ! ! !
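Margulis’ idea of computing the expectedness of each note can be illustrated with a small surprisal calculation. This is only a toy sketch of the general information-theoretic approach, not her actual model, and the probability values below are invented for illustration:

```python
import math

# Invented probability distribution over scale degrees (illustrative only).
note_probs = {"do": 0.25, "re": 0.15, "mi": 0.20, "fa": 0.10,
              "sol": 0.20, "la": 0.06, "ti": 0.04}

def surprisal(note):
    """Surprisal in bits: the less probable a note, the more surprising it is."""
    return -math.log2(note_probs[note])

# A rare scale degree like "ti" carries far more surprise than the tonic "do".
for note in ["do", "sol", "ti"]:
    print(f"{note}: {surprisal(note):.2f} bits")
```

On this toy distribution the tonic yields 2 bits of surprisal while the leading tone yields about 4.6, mirroring the claim that low-probability events are the affectively charged ones.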

! !


CLOSURE!

!

! Our discussion so far has paid closer attention to false expectations than correct ones. Let us now focus on correct expectations, which create release and closure. I will show the emotional importance of closure by relating it to expectation, which, as we have already seen, is responsible for our emotional experience of music. As we discovered from Margulis’ (2005) model of melodic tension, expectation tension creates a sense of forward movement and progress in music. Our correct predictions create the illusion of forward motion in music. When we correctly predict an event, tension is immediately relieved (Huron, 2008). Here I will argue that this relief is felt strongest in moments of closure, when music returns to the tonic, or reaches an important hierarchical downbeat. ! !

| Expectation | Definition |
|---|---|
| Pitch Proximity (Boomsliter & Creed, 1979) | Pitches tend to move by step |
| Step Inertia (Arden, 2003) | Pitches tend to move by step in the same direction |
| Step Declination (Vos & Troost, 1989) | Pitches tend to move by step downwards in the same direction |
| Post-Skip Reversal (Von Hippel and Huron, 2000) | Leaps in pitch (usually upwards) tend to be followed by a step in the opposite direction (usually downwards) |
| Late-Phrase Declination (Arden, 2003) | Pitches tend to move by step downwards at the end of a phrase |

Table 3: Western-Enculturated Melodic Expectations - information from Huron (2008; 2011)!

!

In Table 3 above, I have outlined the findings from our earlier discussion of cultural expectations. While each of these melodic expectations is interesting and important in its own right, I would argue that they are indicative of something larger. Looking closer at the table above, let’s imagine each kind of expectation not as an individual phenomenon, but as a collective development towards a single schema. I have organized the table intentionally to highlight this relationship. Each successive element can be seen as an elaboration of the previous one. In this way, all of the schematic expectations above are related to a single phenomenon: closure. Musical closure may be indicated by a return to the tonic and/or downbeat (Snyder, 2000). As the table above points out, closure may also be indicated by downward melodic motion. As I mentioned before, this kind of motion is inherently linked with closure, because in speech downward motion (due to loss of air pressure) indicates the closing of a sentence. The hierarchical schemas we use to create our expectations in music actually depend on closure to signify changes in structural content (Neil, 1985).! !


!

The experience of closure in music is inherently linked to expectation. “To establish closure, especially at higher levels, we must have some basis for predicting what we think will come next. Although our predictions may be wrong, the very fact that we can have expectations creates a tension that carries us through a sequence and makes closure possible. These expectations may be based either on primitive grouping tendencies (natural closure), on learned patterns (tonal closure), or both” (Snyder, 2000). As Snyder points out, expectations create a sense of forward directness in music, a notion he shares with Margulis. Without our expectations, closing moments in music would pass us by suddenly and without warning.! ! Not only are expectations an important factor for creating the experience of closure, our expectations in music are also directly influenced by their proximity to a closing moment. “The effect of any particular (musical) deviant is a function of its position in the series. A deviant which might have only a slight effect at the beginning of a series, where expectation entertains a greater number of alternatives of approximately equal value, may have a powerful effect towards the end of a series, where expectation is more particular and where the probability of expectation is liable to be greater” (Meyer, 1956). Meyer’s hypothesis has received considerable support since he first proposed this connection between expectation and closure. Let us return once more to language studies to examine a related model that I believe to be directly connected to this experience of closure and expectancy. ! ! In 1987, William Marslen-Wilson proposed what is known as the “Cohort Model” of neurolinguistics. The model describes the process of word retrieval while reading or listening to a familiar language. 
Essentially, the model states that as listeners hear the first phoneme, or linguistic sound, of a word, they begin to access their lexical schema to derive what word will be spoken. As additional phonemes are uttered, the list of potential words the speaker could be saying shrinks until there is only one possible outcome. Huron (2011) gives a good example of this during his video lecture at Rice University on music and memory. He starts to speak a word, beginning by uttering only the phoneme “g.” He repeats the “g” sound, each time adding an additional phoneme to the word he is constructing. “‘G’…‘gl’…‘gli’…‘glim’…” At this point he stops. In the English language the only two words that could be inferred by this progression of sounds are glimpse or glimmer (he does not mention that other verb forms such as glimpsed and glimmering are possible). ! ! Notice that at the beginning of Huron’s example there are thousands of candidate words that begin with the sound “g.” By adding more information, the number of possible words is whittled down to just a few possibilities. This is essentially what Meyer (1956) believes to be true of musical expectations as well. The more auditory information we receive, the more specific our expectations become about what will happen. Logically then, our expectations are the most precise at the closure of a structural event. As we have seen, Huron’s (2008) research on statistical melodic expectations in Table 3 seems to confirm this theory as well. Our culturally held expectations point specifically towards closural moments in music, likely because expectation itself is linked to closure. Additionally, Huron’s (2008) ITPRA model provides support that closural moments in music create more tension. In Huron’s tension stage, he points out that our level of tension is equivalent to the level of expectedness of an event and


its perceived importance. Because moments of closure indicate large changes in musical content, they are very important in music. According to Meyer and the cohort model, because they occur at the end of a series of information, our expectations of closural moments are especially specific. I would argue that these two factors make moments of closure especially tension evoking, and therefore, especially emotional. When a closural event occurs as expected, the release of tension is greater and more satisfying than for ordinary events. Manipulating a listener’s expectation during a moment of closure, by composing an unexpected event to occur over a likely one, will have a greater emotional effect. ! ! To summarize, we have looked at some of the leading models that address emotional affect through the manipulation of expectations. Leonard Meyer (1956), influenced by J. T. MacCurdy (1925) and the gestalt psychology blossoming at the time, applied expectation-emotion research to the field of music. Meyer’s hypothesis proposes that tendencies or listening habits ingrained within us are constantly being brought up while listening to music. When these tendencies are blocked or temporarily withheld, it incites an emotional response. ! ! Considering the other research I have discussed so far in this study, Meyer correctly identifies that tendencies (or expectations) are a powerful source of emotion. Huron (2008) uses the large body of data that has developed since Meyer’s Emotion and Meaning in Music to update Meyer’s hypothesis and turn it into a detailed model. Huron’s ITPRA model splits the event of expectation into multiple stages. Each stage can produce a unique emotional response. In the “imagination” stage, by simply thinking about how we will feel in a situation we experience a taste of the emotion we expect to experience. In music, the imagination stage motivates us to listen to music and accept deferred gratification. 
In the “tension” stage, occurring the moment before we expect the event to happen, we raise our arousal and attention to precisely match the auditory information we expect to take in, in order to minimize energy expenditure. We experience this increased physiological/psychological preparation as tension. The “prediction” stage occurs when our expectations correctly predict the event in question. The result of the prediction stage is a positively valenced feeling, mediated by the dACC. The “reaction” stage, occurring when our expectations are wrong, produces an instinctive and immediate negative response, caused by the amygdala and PAG. In the “appraisal” stage, our analytical cerebral cortex assesses whether the result of our false expectation led to a dangerous or unwanted experience. In the case of musical expectation, the appraisal stage can reinterpret the experience of a false expectation as pleasant if the ensuing event meets our taste. Huron believes that the large contrast between our reaction response and appraisal response creates a positive experience through a process he calls “contrastive valence”. ! ! To augment Huron’s ITPRA theory, we looked briefly at Margulis’ (2005) theory of melodic tension. Her theory adds detail about the kinds of tension we can experience and the qualia they evoke. The experience of tension we have when we correctly predict a melodic event, known as “expectation tension,” evokes a feeling of forward motion and progress. “Denial tension,” which Margulis uses to label experiences where a low probability melodic event occurs over a likely one, creates a feeling of intention and strong will in music. During “surprise tension,” when one fails to anticipate a melodic event altogether, the result is a feeling of intensity and


increased attention toward melodic material, similar to Huron’s (2008) reaction stage. Huron also discusses surprise in greater depth in his book. He believes that musical surprise, in its most intense states, can evoke laughter, awe, and frisson. Huron claims that these experiences are evolutionary extensions of our flight, freeze and fight responses respectively. ! ! Finally, we looked at release and closure during music. I concluded that expectations have a unique relationship with closure. The more auditory information we receive from a song, the more precise our expectations become. This specificity, combined with the fact that closure coincides with important structural changes, means that the tension we feel at the onset of a closural moment is higher. Therefore, we experience more intense affective states during moments of closure, whether our expectations are fulfilled or denied. !
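Huron’s “glimpse/glimmer” demonstration of the Cohort Model can be sketched as simple prefix filtering. The word list below is a tiny hypothetical stand-in for a full English lexicon:

```python
# Tiny stand-in lexicon (a real model would search tens of thousands of words).
LEXICON = ["glimpse", "glimmer", "glitter", "globe", "glove",
           "grape", "great", "ghost", "guitar"]

def cohort(sounds_so_far, lexicon):
    """Return the cohort of words still consistent with the sounds heard."""
    return [w for w in lexicon if w.startswith(sounds_so_far)]

# As each new sound arrives, the cohort of candidate words shrinks.
for heard in ["g", "gl", "gli", "glim"]:
    print(heard, "->", cohort(heard, LEXICON))
```

With each added sound the candidate set narrows, just as Meyer argues that accumulating auditory information makes musical expectations progressively more specific.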

! ! ANALYSIS AND COMPOSITIONAL STRATEGIES ! ! !

So far we have examined how expectations and emotions are formed in music, and how musical expectations create physiological and emotional responses. In this section, I will bring these components together and discuss compositional strategies that are based on the manipulation of expectations. The strategies I discuss here will target both biological and cultural expectations. For biological expectations, I will propose “memory tactics.” As we have seen, memory is crucially linked to our experience of expectations. We depend on our memory system for auditory feature extraction, perceptual binding, conscious experience, memorization, and schema formation (Snyder, 2000; Huron, 2008). The memory tactics I discuss will target and disrupt each of these functions in order to affect our ability to form accurate expectations. By sabotaging memory processes, these tactics attempt to eliminate expectation from an audience’s listening experience. As we discussed above, expectations in music create a sense of forward motion, tension, release, and closure. Therefore, the memory tactics I discuss below will eliminate some of these experiences from music. “Removing memory and anticipation from the situation leaves us with nothing but the present to focus on. The ideal of such music would be not to engage long-term memory at all, a state approaching some kinds of meditation” (Snyder, 2000). As Snyder states, disrupting long-term memory processes can create an “in-the-moment” meditative state. We will see that removing short-term memory from music eliminates phrasing coherence and memorization. Impeding echoic memory targets our perceptual binding process, which can potentially increase tension in music. After we look at these memory tactics that target biological expectations, we will address the manipulation of cultural expectations using statistical tactics.!

! ! ! ! ! !


MEMORY TACTICS!

!

! Our temporal-based memory system that includes echoic, short-term and long-term memory has been optimized throughout humanity’s evolutionary process to help prepare us for the future, but it is far from a perfect system. Each of these memory processes contains limitations that can be targeted using compositional devices. First we will address echoic memory, which is responsible for both feature extraction and perceptual binding (Snyder, 2000). As we saw in Table 1, echoic memory only lasts about 1/16 of a second. During this short period of time, it identifies the pitch, timbre, and loudness of incoming soundwaves (feature extraction), and communicates with long-term memory to apply schematic labels to the sounds we hear (perceptual binding). A tactic that targets echoic memory must therefore inhibit our ability to form clear perceptual categories. This “anti-categorical” music must avoid the use of primary musical parameters such as notes, harmony, and rhythm. Anti-categorical music instead uses gliding pitches, quarter-tones, dissonance, and non-repeating rhythms. All of these techniques essentially target echoic memory. One American composer who writes music that could be described as “anti-categorical” is Gloria Coates. Her music consists almost exclusively of string glissandi, which avoid the categorization of primary parameters. Coates’ music creates a viscerally perceptible sensation of tension. I believe this is due to the fact that her music sabotages echoic memory processes, and consequently our expectations. According to Huron’s (2008) ITPRA model, tension is caused by an inability to predict when an event will occur and what kind of event will transpire. When listening to Coates’ music, our arousal and attention remain stimulated as we are unable to predict when a change will occur or identify what that change may consist of using our primary parameter schemas. 
This suspended tension is interpreted as an unidentifiable emotionality since we are unable to attribute the tension to any specific parameter of the music. In this way, anti-categorical music is an effective strategy for manipulating expectations and exciting emotional reactions.! ! Short-term memory has a different set of limitations that affect our perception of the world. As previously mentioned, our short-term memory (STM) lasts 3-5 seconds on average and can process a maximum of 7-9 objects at once (Snyder, 2000). Compositional techniques can target both of these parameters to create perceptual confusion about musical continuity, patterns, and phrases. In order to form accurate expectations in music, we require that sounds be presented to us in a relatively strict format that conforms to our STM limitations. In a similar way, speech easily becomes unintelligible if a speaker does not follow a basic structure that requires words to be spoken at a regular pace. For this reason, both sentences and musical phrases are an average of 3-5 seconds long (Snyder, 2000). ! ! To manipulate this aspect of STM, a composer might separate each note in a melody by 10-12 seconds of silence. This would extend the onset of each note to a value larger than what is available in our STM, which would eliminate our ability to form relationships between the notes of a musical phrase. Without the ability to reference the notes that have occurred beforehand, music would not make logical sense. Imagine a speaker separating each word of a sentence by 15 seconds of silence. If the sentence requires us to relate adjectives and verbs to


nouns that have already been spoken, it is crucial that we are able to remember those nouns. If words in a sentence are separated so that they stretch beyond the time limit of STM, a logical sentence becomes nonsense simply because we cannot remember enough of the words to relate them to each other. This same approach can be applied to musical phrases. Music that contains excessively long pauses will stretch musical phrases beyond the time limit of STM. This makes memorization and comprehension of musical material exceptionally difficult. Our expectations depend on the information from our STM, and without it our experience of music lacks coherence and structure (Snyder, 2000). ! ! Aside from a time limit, our STM also has a capacity of 7-9 items at once. Certainly one way to target this limit of our memory would be to give 7-9 musicians different pieces of music and have them play together simultaneously. There would be no way to comprehend the overlapping cacophony of noise or form any kind of expectations about it other than that hopefully it would stop or change before too long. However, there is more involved with this capacity limit, and in order to create music which is designed to test this boundary of STM in a more strategic and less chaotic way, we need to consider exactly how STM processes information. Previously in this paper, we discussed that STM organizes auditory information we receive into easily digestible chunks, turning individual events into patterned sequences (Snyder, 2000; Levitin, 2006). Our minds are constantly looking for ways to simplify what we are hearing in order to make it easier to take in. Using an example from Snyder, it would be very difficult for someone to quickly memorize this sequence: “cbdacadcbadbcdab”. However, if the letters are rearranged into “bacdbacdbaadbccd” it becomes easier to remember the letters, once you realize they are organized in groups of equal size that begin with “b” and end with “d”. 
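Snyder’s letter-string example can be made concrete with a small sketch: once the sixteen letters are grouped into equal-sized chunks, only four patterned items need to be held in STM rather than sixteen unrelated ones.

```python
def chunk(sequence, size):
    """Split a sequence into fixed-size chunks, mimicking STM's grouping of patterned input."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]

letters = "bacdbacdbaadbccd"   # Snyder's rearranged sequence
chunks = chunk(letters, 4)
print(chunks)                  # ['bacd', 'bacd', 'baad', 'bccd']
# Every chunk shares the same frame: it begins with "b" and ends with "d".
print(all(c.startswith("b") and c.endswith("d") for c in chunks))
```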
This example exposes something important about STM. Our ability to take in complicated information is rapidly improved when we can identify a pattern of repetition within the information. As I mentioned before, music is highly repetitive. This repetition helps us take in melodic figures and musical phrases, analyze them, and use them to form new expectations. Taking this into account, it is possible to disrupt STM functioning by creating a random sequence of musical events. Rather than using a multi-layered sonic cacophony to exceed the informational capacity of STM, creating music with random, non-repeating rhythms and intervals also targets this capacity. Some 20th century composers, such as John Cage, are famous for using dice and other random “chance-based” tools to develop what is known as aleatoric music. This kind of music inserts random variability into the writing process of notated music and should not be confused with “indeterminacy,” in which the composer gives some choice over to the musical performers by using inexact notational parameters. Aleatoric music avoids forming perceptible patterns in music, and in this way affects our short-term memory’s ability to process the information and form expectations. Similar to extreme length and silence in music, randomly organized sounds disrupt STM’s ability to organize information into simplified structures that our conscious awareness can make sense of. ! ! Lastly, we will look at memory tactics that target our long-term memory, or LTM. As I have mentioned throughout, the purpose of memory is to prepare us for future experiences by learning from our past. If memory were designed with the purpose of recalling every detail of our


past with meticulous accuracy, it is unlikely that our expectations would be as vulnerable to manipulation. I believe that this would actually eliminate much of our emotional experience in music. Luckily, our LTM simplifies the experiences of our past. This facet of our memory can be targeted by utilizing musical phrases which are similar, but not necessarily identical (Snyder, 2000). When repeating musical events that are very similar, they are simplified as they enter LTM, becoming nearly identical. This confusion between similar past and present events makes it difficult to form accurate predictions. We expect most music to go somewhere because our expectations are goal/closure oriented, and, consequently, we prepare ourselves for changes in musical parameters. In repetitive music we form generalized expectations and become susceptible to subtle, nuanced changes. Minimalism, a 20th century genre developed in New York, applies this tactic of perceived similarity. After enough exposure to general repetition, I believe we give up on our goal-oriented expectations and enter a trance-like meditative state like the one I have quoted from Snyder (2000) above. Nuanced changes in minimalist music are crucial because they subtly throw off our expectations and keep us interested. Without any changes, we would become habituated to the music and lose interest. ! ! These subtle changes in minimalism may be described as a kind of structural nuance. Throughout this paper, I have focused exclusively on expectations that are created through compositionally notated means. In other words, “music” has been defined as an exact translation of a written score. In reality, this is clearly not the case. All music that we hear is invariably interpreted by the musician. I wish to take a brief moment to address this by discussing a different kind of nuance, the expressive nuance. ! ! 
When performers interpret a musical score, the music they play is filled with stylistic and expressive nuances that give character to their artistic voice. These nuances are widely regarded as a source of emotional power (Keil et al., 1994; Snyder, 2000). Examining the research I have put forth above, it is easy to see that these emotive nuances get their power from expectation. As we saw with structural nuances, LTM remembers music in a simplified form. “We remember forms as ‘ideal types’ rather than as particular things” (Meyer, 1956). This is what Meyer refers to as a “weakening of shape,” which is a limitation of our long-term memory. Instead of remembering every nuanced detail, our memory system turns sounds into primary and secondary parameters, and in turn we remember sound in this categorical form (Snyder, 2000). When we form our expectations of music, we reach back into our LTM to access our schemas that are based on this categorical memorization. Consequently, nuances have the ability to surprise us even after many listens to the same performance of a song. “Recordings, which freeze the details of particular musical performances, can be listened to many times and continue to seem vital and interesting” (Snyder, 2000). Unless we are intimately familiar with a particular recorded performance of music, the nuanced details in the recording will continue to surprise us and challenge our expectations of the piece because our long-term memory is not accurate enough to recall every detail. Juslin et al. (2008) would argue that “contagion” is also responsible for the power of expressive nuance. Recall that contagion refers to a process in which the listener interprets the emotional expression of music and mimics the feelings


internally. As I pointed out, these interpretations are formed using metaphoric connections that are based on schematic properties, and in this way contagion is also related to LTM. ! ! To summarize, each section of our temporal memory structure contains boundaries and limits that color our conscious experience of the world in which we live. Echoic memory is susceptible to anti-categorical music, which disrupts a listener’s ability to organize sound into primary parameter categories. This tactic can create unresolved tension. Short-term memory is limited by both time and capacity. Music that changes slowly or contains pauses of silence that stretch beyond 10-12 seconds becomes incoherent. Similarly, music that contains random non-repeating notes and rhythms is confusing because it interferes with our short-term memory’s ability to organize musical information into simple structures that can be remembered. Long-term memory stores these simplified structures in a process that weakens the nuanced shape of a musical memory. These simplified memories are susceptible to both structural and expressive nuances, which can create emotional responses due to our overgeneralized schemas and expectations. Repetitive music has the ability to eliminate our sense of past and present, bringing us to a meditative state. However, too much repetition can lead to a habituation response, causing an audience to lose interest in music that offers no new information. !

! ! STATISTICAL TACTICS! !

! So far, we have discussed ways of tampering with the formation of accurate expectations. These tactics have focused on disrupting the formation of expectations by creating music that manipulates our perceptual and memory-forming processes. The next type of tactic we will look at focuses on schema sabotage rather than memory sabotage. I will refer to this type as “statistical tactics” because they are based on the statistical learning of culturally created musical norms. Rather than disrupt memory processes, these tactics depend on and exploit the schemas in our LTM. In this section, we will revisit different kinds of cultural expectations and examine ways to manipulate them in order to target a listener’s emotions. In Table 4 below I have summarized the expectations we discussed earlier in the cultural expectation section. The effects of all these tactics can be analyzed using Huron’s (2008) ITPRA model of expectation and Margulis’ (2005) melodic tension model. I will briefly summarize each tactic, occasionally giving reference to the research we have discussed on emotion in music, as well as these models. !

! ! ! ! !


| Expectation | Definition | Manipulation |
|---|---|---|
| Resting Expectations (Huron, 2008) | F major root position chord, major scale, binary meter | Minor scales, chord inversion, ternary or quintuple meter |
| Tonality | Music based around a tonal center, harmonic hierarchy, tonic-to-dominant relationship | Deceptive cadence, modulation, chromaticism, unique modal scales |
| Rhythmic Hierarchy | Periodic rhythm based on an emphasized downbeat | Rubato, syncopation, non-periodic rhythm |
| Pitch Proximity (Boomsliter & Creed, 1979) | Pitches tend to move by step | Leaping between pitches, gaps larger than major 3rds |
| Step Inertia (Arden, 2003) | Pitches tend to move by step in the same direction | Non-chord-tone neighbor notes in opposite direction |
| Step Declination (Vos & Troost, 1989) | Pitches tend to move by step downwards in the same direction | Rising repeatedly by step, minor or major 2nds |
| Post-Skip Reversal (Von Hippel and Huron, 2000) | Leaps in pitch (usually upwards) tend to be followed by a step in the opposite direction (usually downwards) | Continuing in same direction by step after a leap in pitch |
| Late-Phrase Declination (Arden, 2003) | Pitches tend to move by step downwards at the end of a phrase | Upward pitch motion during cadences/phrase endings |

Table 4: Statistical Tactics - information from Snyder (2000), Huron (2008; 2011), Temperley (2010)!

!

Resting expectations are thought to be present in all Western-enculturated listeners. They are expectations that listeners have before a piece of music even begins. Huron (2008) derived these expectations by finding the musical events and parameters that are most common in the Western tradition. While these expectations may exist for many listeners, they are unfortunately of little consequence. As we have seen, the more information we receive while listening to a piece of music, the more precise our expectations become. As they become more precise, they are infused with more emotion-evoking arousal and attention. For these reasons, the expectations we have at the beginning of a piece of music don’t create much of an emotional response. In fact, I would argue that unless one applied all of the manipulation parameters above, possibly adding extreme loudness to that list, it would be difficult to manipulate someone’s expectations at the beginning of a piece and create a noticeable emotional response. ! ! The cultural schema of tonality has come up often in this paper, likely because it is one of the most ingrained schemas in the Western tradition. We spoke previously about using the deceptive cadence as a manipulation technique for tonality. I’ve added a number of other techniques in the table above that offer different ways to achieve a similar result. Statistical tactics such as modulation, chromaticism, and unique modal scales are strategies that target two elements of tonality, in order to dispel a listener’s expectations. These tactics attempt to


confuse the listener’s notion of a tonal center, and introduce notes that are typically foreign to the perceived musical scale of the piece. As Temperely (2010) points out, unique notes in minor scales result in increased arousal and emotional response. To this, I would add the contributions of Margulis’s (2005) “melodic tension” theory. The unique low entropy of notes in unfamiliar scales can be seen as a form of denial tension. By using modulating and unique scales, composers can introduce notes that are unlikely in a given context. The use of less expected notes creates a quale of increased intentionality and will in the music. If composers enter strange enough harmonic territory, they may incite surprise tension, in which listeners have no way of predicting dramatically unexpected notes. ! ! Rhythmic hierarchy has also been discussed in detail. As previously stated, the downbeat can be considered a rhythmic tonic, around which rhythmic hierarchy is formed. For this reason, many of the manipulation tactics I have listed for this schema attempt to thwart our expectations in ways similar to the ones proposed for tonality. By moving focus away from the downbeat, composers can create a similar denial tension about when an accent will occur. As we have seen, denial tension is a form of Huron’s ITPRA reaction stage. Denial tension creates a physiological sensation triggered by the amygdala and PAG when our expectations are false. This negatively valenced response is conditioned by a contrastively valanced appraisal response as we reinterpret the unexpected and surprising accent as exciting and pleasant rather than threatening. ! ! The rest of the cultural expectations on the table above were covered in our analysis of cross-entropy schematic testing, as well as our discussion of closure in music. Here I have laid them out again with relatively straightforward manipulation tactics. 
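In information-theoretic terms, these unlikely notes are low-probability events, and their surprisal (information content) is correspondingly high. The sketch below is a toy illustration of that idea, assuming a rough, unnormalized pitch-class profile for a C-major context and a small floor probability for chromatic notes; the numbers are my own and are not Temperley's actual model:

```python
import math

# Toy pitch-class profile for a C-major context (my own rough numbers,
# not an empirically derived distribution). Keys are pitch classes
# 0-11 (C = 0); scale tones carry most of the probability mass.
PROFILE = {0: 0.20, 2: 0.12, 4: 0.16, 5: 0.10, 7: 0.18, 9: 0.10, 11: 0.08}
CHROMATIC_FLOOR = 0.01  # small probability assumed for out-of-scale notes

def surprisal(pitch_class):
    """Information content, in bits, of hearing a given pitch class."""
    p = PROFILE.get(pitch_class % 12, CHROMATIC_FLOOR)
    return -math.log2(p)

print(round(surprisal(0), 2))  # tonic C: 2.32 bits
print(round(surprisal(1), 2))  # chromatic C#: 6.64 bits, far more surprising
```

The gap between the two values is the quantitative face of denial tension: the chromatic note is precisely the "unlikely in a given context" event described above.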
I will briefly contextualize these cultural schemas in relation to material we have previously discussed, in order to make their manipulation more emotionally affective. First, as just mentioned, disrupting the schema of tonality works best by incorporating low-entropy notes. This method can also enhance the manipulation of melodic schemas. While the main focus of these schemas is the manipulation of melodic direction, they can be enhanced by additionally stepping and leaping to non-chord tones, or notes outside the currently perceived musical scale. As a compositional technique, this tactic is best used sparingly. Citing Temperley (2010) once more: in order for listeners to experience an unexpected occurrence, low-entropy moments must be placed within a larger context of predictability. Overusing non-chord-tone leaps in unexpected directions may threaten a listener's ability to establish cultural schemas, and the notes may lose their emotional power as a result.

Second, as we have seen, unexpected occurrences that take place closer to moments of closure have the potential to evoke more tension than deviations at the beginnings of sections or phrases. For this reason, it is more effective to manipulate these culturally held melodic schemas toward the end of a phrase, where our expectations of closural events are more precise and of greater importance than at other moments in a piece of music. Again, I would caution against overusing this kind of manipulation. As powerful as denial tension and surprise tension may be, a healthy amount of expectation tension must also occur in music. That is, the prediction effect is also a powerful tool, inciting an immediate release of tension. Predictable moments in music reinforce our schemas through a release of dopamine triggered by the dACC. While they may not cause an increase in arousal and attention, expectations that are formed and met outside the limited space of our conscious awareness are also powerfully affective, especially during moments of closure.


CONCLUSION

Throughout the process of writing this thesis, I have learned a great deal from my research. By critically analyzing expectation, I have discovered many connections among music, memory, and cognition. My view on the importance of memory in music and conscious awareness has changed drastically. Without memory, expectation would not be possible. In fact, without memory, music would not be possible. Life "as we know it" would simply not exist. Every second of our lives would be unexpected, uncategorized, and new. Every sound would be foreign, and every image and object would signify nothing.

Music without expectation may be possible, but based on the research I have done, it would not create an emotional experience. Our experience of musical emotion is connected to expectation in ways I had not previously anticipated or conceived. Although Juslin and Västfjäll (2008) report that musical expectancy does not begin to form until the age of five, the framework we develop for creating expectations begins the moment we take our first breath. The physiological sensations that we associate with musical emotions can even take place in utero, through mechanisms in the brainstem and cerebellum. I had previously assumed tension and release were an inevitable component of music; with the help of this research, I now see their critical connection to expectation.

As a composer, this wealth of new information is both exciting and inspiring. How is it that the artistic mind seamlessly employs so many of these expectation-manipulating techniques? Despite a lack of knowledge about how the mind works, artists have an incredible propensity for understanding how the mind feels. Meyer (1956) puts it best when he explains that emotion and meaning in music are essentially different embodiments of the same experience. If you know what to listen for and can name the devices from which music is made, it becomes an incredibly meaningful experience.
For listeners who lack the proper vernacular to interpret these devices, music is an incredibly emotive force. I was slightly worried that by picking apart music through such meticulous research, it would lose its mystery and awe-inspiring qualities. I am happy to report that quite the opposite has occurred.

I expected my research to lead to new, unanticipated compositional tactics, but what I found was quite the opposite: the compositional techniques laid out in my closing section consist largely of commonly held practices and genres. What I find fascinating is how all of these techniques relate back to our expectations. I believe that seeing composition through this new lens will help me immensely as I pursue a career as a composer. I have also learned to put more trust in my instincts as an artist; this research has shown me how powerful and expressive those instincts can be. I see this new knowledge as a tool, a device I can use to look "under the hood" of a musical piece and analyze its emotive power through expectation, not necessarily as a direct compositional aid. I hope to continue learning about memory and cognition in relation to sound. My thesis research has given me a new way of looking at sounds and my conscious experience, two subjects which I will continue to explore.


Works Cited

Barrett, Lisa Feldman. "Emotions Are Real." Emotion 12.3 (2012): 413-29. Web.

Brown, J. W. "Learned Predictions of Error Likelihood in the Anterior Cingulate Cortex." Science 307.5712 (2005): 1118-121. Web.

Bruce, Neil S., et al. "Expectation as a Factor in the Perception of Soundscapes." Euronoise 2009. Web.

Bush, G. "Dorsal Anterior Cingulate Cortex: A Role in Reward-based Decision Making." Proceedings of the National Academy of Sciences 99.1 (2001): 523-28. Web.

Bush, George, Phan Luu, and Michael I. Posner. "Cognitive and Emotional Influences in Anterior Cingulate Cortex." Trends in Cognitive Sciences 4.6 (2000): 215-22. Web.

Collins, Sean T. "Why Music Gives You The Chills." BuzzFeed. Sept. 2012. Web.

Hanslick, Eduard. The Beautiful in Music. New York: Liberal Arts, 1957. Print.

Huron, David. "Information Theory and Music." Ohio State University. 2001. Web.

Huron, David. Sweet Anticipation: Music and the Psychology of Expectation. Cambridge, MA: MIT, 2006. Print.

Huron, David. What Is a Musical Work? And Other Curiosities of Memory. Rice University, Shepherd School of Music, 20 July 2011. Web.

Juslin, Patrik N., and Daniel Västfjäll. "Emotional Responses to Music: The Need to Consider Underlying Mechanisms." Behavioral and Brain Sciences 31.5 (2008). Web.

Juslin, Patrik N., and John A. Sloboda. Handbook of Music and Emotion: Theory, Research, Applications. Oxford: Oxford UP, 2010. Print.

Keil, Charles, and Steven Feld. Music Grooves: Essays and Dialogues. Chicago: U of Chicago, 1994. Print.

Lang, Peter J. "The Varieties of Emotional Experience: A Meditation on James-Lange Theory." Psychological Review 101.2 (1994): 211-21. Web.

Levitin, Daniel J. This Is Your Brain on Music: The Science of a Human Obsession. New York: Dutton, 2006. Print.

Margulis, Elizabeth Hellmuth. "A Model of Melodic Expectation." Music Perception 22.4 (2005): 663-714. Web.

Meyer, Leonard B. Emotion and Meaning in Music. Chicago: U of Chicago, 1956. Print.

Patel, Aniruddh D., and Joseph R. Daniele. "An Empirical Comparison of Rhythm in Language and Music." Cognition 87.1 (2003): B35-45. Web.

Petrovic, Predrag, et al. "Placebo in Emotional Processing: Induced Expectations of Anxiety Relief Activate a Generalized Modulatory Network." Neuron 46.6 (2005): 957-69. Web.

Schmuckler, Mark A., and Marilyn G. Boltz. "Harmonic and Rhythmic Influences on Musical Expectancy." Perception & Psychophysics 56.3 (1994): 313-25. Web.

Snyder, Bob. Music and Memory: An Introduction. Cambridge, MA: MIT, 2000. Print.

Stevens, Michael. You Live in the Past. Perf. Vsauce. YouTube. 5 Feb. 2012. Web.

Temperley, David. Music and Probability. Cambridge, MA: MIT, 2010. Print.

Thaut, Michael. Rhythm, Music, and the Brain: Scientific Foundations and Clinical Applications. New York: Routledge, 2005. Print.

Todd, Neil. "A Model of Expressive Timing in Tonal Music." Music Perception 3.1 (1985): 33-57. Web.

Tye, Michael. "Qualia." Stanford University. 20 Aug. 1997. Web.

Zuckerman, Theodore. "Piano Pitches." EOM. 2007. Web.

