Listening Experiment Documentation
Anna Weisling - MUS7080
Word Count: 3,263

Table of Contents:
1. Introduction
2. Precedence
3. Hypothesis
4. Conceptualizing
4.1 Identifying Variables
4.2 Identifying Biases
4.3 Stimuli
5. Testing
5.1 Test Design and Creation
5.2 Performing the Test
6. Results
6.1 Gathering Results
6.2 Results Analysis
7. Conclusion
Appendix A
Appendix B
Appendix C
Appendix D
Bibliography
Group Roles
Galloping in the Streams of Perception: Pinpointing Divergences within Auditory Stream Grouping and Segregation
Anna Weisling

Summary
This subjective listening experiment was an attempt to identify the point at which test subjects perceptually group and divide auditory streams based on tempo and semitone separation. Fourteen tests were performed using a modified version of the Schnupp, Nelken, King¹ interface, in which the rate of tone alternation and the pitch difference can be controlled independently of one another. Test subjects were asked to listen to a 2-minute sound file in which alternating tones (grouped in threes, hence the ‘gallop’) increased in both rate and pitch separation. Participants were polled at 30-second intervals and asked to record the time at which the melody (integrated stream) appeared to separate into two separate, unrelated tones (segregated streams). Our hypothesis was that both rate and semitone separation would need to be relatively high, with at least one of them above 5, to cause a perceptual separation. Furthermore, due to the rate/semitone increments, we posited that the majority of participants would mark the first two questions as ‘Melody’ and the second two as ‘Separate Tones.’
1. Introduction
The notion that humans perceive sounds in perceptual groups, or streams, is not new. Central to our test were the “Nonsimultaneous Grouping and Segregation: Streaming” tests done by Schnupp, Nelken, and King.1 Their tests examined the average responses of ninety-one neurons in the brain of their test subjects, measuring the neural response to the A tone and B tone when they were separated first by 1 semitone and then by 9 semitones. What they discovered was a much dampened neural response when tone B was separated by 9 semitones rather than 1. Our experiment, however, was designed to measure a more psychoacoustic variant of this: we took measurements at four separate rate/semitone benchmarks and also recorded the point at which the participant perceived the change from grouped melody to separated streams.
It is well established that sound is grouped perceptually in a number of ways:
• Location: Sounds that reach our ears from different spatial locations are recognized as originating from different sources,2

1 Schnupp et al. 2011, p224-267
2 Moore 2003, p265
Weisling - MUS7080 - Galloping in the Streams of Perception
• Similarity of Timbre and Pitch*: Sounds are separated based on frequency content and on closeness in pitch content and direction,3
• Proximity in Time*: Sounds with similar start and end times will be grouped together, as will sounds that change together and sounds that occur in similar timeframes,³
• Previous Experience: Past experience and memory can influence the perception and identification of stream segregation and grouping.4
*denotes an especially pertinent variable within the context of this experiment.
Within the context of this test, the divisions of rate and semitone introduce many variables: closer pitches will seem related to each other even at high speeds, and slower rates will sound related even with large pitch intervals.5 It is at the point of both faster pacing and larger intervals that the streams begin to separate.
2. Precedence
The first listening tests designed specifically to test stream segregation based on frequency separation and rate were done in the early 1970s by Dr. Leo van Noorden at the Institute for Perceptual Research. These early tests saw two tones, F (for fixed) and V (for variable), played over the course of 80 seconds. Tone V swept from a pitch located far above F to far below and back, and test subjects were asked to report whether they heard an integrated or segregated stream. This was, however, problematic, as listeners were able to hear both states at will. The test was then altered and subjects were asked to consciously try to hear the stream first as integrated the entire time, and then as segregated. The points at which the listeners were unable to force their perception in this way were designated the ‘Temporal Coherence Boundary’ and ‘Fission Boundary,’ respectively. The results of this test confirmed that both rate and semitone separation have an effect on stream integration and segregation: at 150 milliseconds, listeners were able to hear two streams if the distance between the tones was 4 semitones or greater. Similarly, they were able to hear a single stream if the distance was below 13 semitones.5

This test was expanded upon in 1985 by Anstis and Saida, who focused on the effects of speed rather than semitone interval distance. Using a system of voting (for integrated or segregated), subjects listened to a 30-second clip of tones alternating at increasing rates. Their results showed that at a rate of 4 alternations per second, the “probability of hearing a single stream in a continuous alternation of high and low tones [falls] linearly with the size of the frequency separation...in semitones.”7

The concept of separated streams was further explored in the experiments of Bregman and Campbell, who showed that “a sequence of six notes will separate into two streams of three notes as the average frequency separation of the two groups of notes is increased.”6 Our test attempts to pinpoint the average frequency separation and rate at which this split occurs.

Stream segregation has been measured in many different ways, including:
• Method of Adjustment7: Test subjects themselves adjust a property of the test sequence (distance between notes, for example),
• Proportion of Time⁷: Test subjects hold down one button for as long as they hear the streams as integrated and push another when they hear them as segregated,
• Rating Scale⁷: Test subjects rate what they hear on a numerical scale (1 to 5, for example), with 1 being totally integrated and 5 being totally segregated,
• Drawing or Writing8: Test subjects are asked to draw or write what they hear.

There are, of course, problems inherent in any technique used for testing. For example, asking test subjects to rate what they are hearing on a scale from 1 to 5 assumes that there are only 5 states at which a test subject could ‘feel’ their opinion. This method also allows for several states of vacillation, whereas we wanted to test two discrete states (integrated or segregated). Asking participants to draw what they are hearing allows the test subjects themselves to introduce multiple variables into their answers, and having them write what they hear requires them to have a certain level of knowledge about a relatively esoteric subject (auditory scene analysis). Our test uses a ‘Method of Limits,’ whereby listeners signal discrete points at which they experience a change in stimuli. This discriminative method is an attempt to limit the ‘granularity’ of potential answers.

It is clear from the tests of van Noorden, Campbell, Bregman, and Dannenbring, to name a few, that both rate and pitch separation play major roles in the integration and separation of streams. Though the amount of time needed to cause this shift has a large amount of data behind it, the pitch intervals needed remain more elusive. Furthermore, all of the aforementioned conclusions were reached using monaural, non-complex (sine tone) stimuli. Experiments done by Donald Norman using binaural stimuli confirm that “dichotic presentations of the two tones could never lead to perceptual continuity because the tones could never be attended to simultaneously.”9 It was with these factors in mind that we created an experiment that increased rate and interval evenly over the course of the test using monaural sine tones.

3 Bregman, Pinker 1978
3 Bregman 2001, p64-65, p58-60
4 Listeners can also be ‘trained’ to hear audio in different ways, as demonstrated in the sine-wave speech experiments of Remez, et al., in which participants were exposed to the sentence “My dog Bingo ran around the wall” multiple times, first hearing only warbling tones, and then, with training, hearing the words.
5 Schnupp et al. 2010, p252
6 Darwin, C.J. 1997, p327
7 Bregman 2001, p55-57
8 Heise and Miller
9 Bregman 2001, p83
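The two van Noorden boundaries quoted earlier in this section (two streams audible at 4 semitones or more, a single stream audible below 13 semitones, both at 150 milliseconds) can be written out as a small sketch. The function and constant names are hypothetical, and the thresholds are simply the figures reported above, not a general model of streaming.

```python
# Thresholds as reported in the text for a 150 ms tone rate (assumed fixed here).
FISSION_BOUNDARY_ST = 4      # at or above this, two streams can be heard
COHERENCE_BOUNDARY_ST = 13   # below this, a single stream can be heard


def possible_percepts(semitones):
    """Return the set of percepts a listener could hold at this separation."""
    percepts = set()
    if semitones < COHERENCE_BOUNDARY_ST:
        percepts.add("integrated")
    if semitones >= FISSION_BOUNDARY_ST:
        percepts.add("segregated")
    return percepts
```

Separations between the two boundaries return both percepts, which is the ambiguous region where listeners could hear either state at will.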
3. Hypothesis
The goal of this study was to measure subjective segregation or integration under a pre-determined increase in rate and semitone distance. It was our hypothesis that, given the fixed audio file, respondents would pinpoint the time at which they felt the auditory stream of a three-note melody became two separate, unrelated tones at a semitone distance between 4 and 7 and a rate between 8 and 11. This corresponds to a time between :48 and 1:26. See Appendix A.

Hypothesis (H₁): Respondents will mark the point at which the integrated stream becomes segregated between :48 and 1:26.

Null Hypothesis (H₀): Respondents will mark the point at which the integrated stream becomes segregated between :00 and :47 or between 1:27 and 2:00.

4. Conceptualizing

4.1 Identifying Variables
Several potential problems and controls were identified when designing this listening test, including:
• Using the ‘galloping’ test rather than the straight test to increase the likelihood of test subjects hearing the melody,
• Having test subjects watch a clock can divert or break their attention,
• Using the same set of headphones, computer, recording, and space for all tests in order to ensure consistency,
• Performing the test with individuals with varying levels of musical experience, ranging from ‘extremely experienced’ (e.g. members of the SARC Master’s class, musicians) to ‘inexperienced’ (non-musicians),
• Ensuring that all participants have, to the best of their knowledge, no physical or psychological impairments related to hearing (e.g. hearing loss, tinnitus) that might affect the test,
• Performing the test only on subjects in good physical and mental health (not feeling ill, not overly tired).

4.2 Identifying Biases
Potential biases include:
• Sampling bias: The majority of test subjects will have at least a working knowledge of auditory scene analysis, because MA students were the most readily available participants,
• Observer-expectancy bias: Because the test is designed and run by the same individuals, there is a chance that the researchers’ goals will influence the participants’ selections or experiences,
• Response bias: Because many individuals taking this test are also running listening experiments of their own, they may be inclined to answer according to the testers’ needs.

4.3 Stimuli
This test was designed using monaural sine tones presented over the course of 2 minutes with increasing frequency separation and rate of onset.
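As an illustration of the kind of stimulus described here, the following is a minimal sketch of one ‘galloping’ cycle (tone A, tone B, tone A, silence) built from sine tones. It is not the Schnupp interface used in the experiment: the function names, the base frequency, the envelope, and the reading of ‘rate’ as tone slots per second are all assumptions made for the example.

```python
import math


def semitones_to_ratio(semitones):
    """Equal-tempered frequency ratio for a given semitone separation."""
    return 2.0 ** (semitones / 12.0)


def gallop_cycle(base_hz, semitones, rate_hz, sample_rate=44100):
    """Return mono samples for one A-B-A-silence cycle.

    rate_hz is treated as tone slots per second (an assumption); each
    slot holds one enveloped sine tone or one silent gap.
    """
    slot = int(sample_rate / rate_hz)  # samples per tone slot
    tone_b = base_hz * semitones_to_ratio(semitones)
    samples = []
    for freq in (base_hz, tone_b, base_hz, None):
        for n in range(slot):
            if freq is None:
                samples.append(0.0)  # silent gap that creates the gallop
                continue
            # Raised-cosine envelope so each tone starts and ends at zero
            envelope = 0.5 * (1.0 - math.cos(2.0 * math.pi * n / slot))
            samples.append(envelope * math.sin(2.0 * math.pi * freq * n / sample_rate))
    return samples
```

For example, `gallop_cycle(440.0, 5, 10)` would produce a cycle with tone B five semitones above a 440 Hz tone A at ten slots per second; raising `semitones` and `rate_hz` over successive cycles approximates the increasing separation and rate described above.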
5. Testing
5.1 Test Design and Creation
The first step of this listening test required that a sound file be made using the ‘Galloping Stream’ interface available on the Auditory Neuroscience website.10 Initially a test was created using 4 distinct measurement points spread equally across the available variables (Rate: 5, Semitone: 1; Rate: 10, Semitone: 4; Rate: 15, Semitone: 7; Rate: 20, Semitone: 10).11 However, initial testing of this system revealed that these very clear, distinct changes were not only distracting but also revealing as to the nature of the test. It was decided that a gradual ramping up of values was necessary in order to maintain continuity and provide the best possible results.

Figure: Original Schnupp Module, used for this test

By recording a live run-through of this module, from Rate: 1, Semitone: 1 all the way to Rate: 20, Semitone: 10, a more uninterrupted test was made possible. It was then necessary to manually edit the wave file, not only to ensure complete continuity between tests but also to record the division of rate and semitone changes. A complete breakdown of these figures can be found in Appendix A.

Figure: Editing of recorded test with separations at each change of rate and semitone interval

The next step in the creation of this test was to design an interface in which to both run the sound file and collect data. A basic layout was created in a Word document with the same question posed at 30-second intervals: Do you hear a Melody, or Two Unrelated Tones? An Unsure option was also included. Jargon such as ‘harmony,’ ‘segregation,’ and ‘streams’ was avoided to ensure clarity for those test participants who might not be familiar with musical and psychoacoustic terminology. There was also a space at the bottom of the page in which to record the moment (in minutes and seconds) at which the participant felt the Melody became Two Unrelated Tones. See Appendix B for this document.

10 Schnupp, J, Nelken, E, King, A. (2010).
11 It is worth noting that the original interface is capable of reaching 12 semitones. 10 semitones were chosen simply for the sake of ease of division within the 20 available rate options.
It was immediately apparent that a paper version of this test, though convenient for record-keeping, posed several problems:
• A clock would have to be designed that was entirely consistent for each test, as no tester could start a watch flawlessly.
• Test subjects would have to divert their attention multiple times in order to see both the clock and the questionnaire.
• Subjects would be responsible for remembering at what time to record their answers, as no other cues could be employed (headphones prevent auditory cues, and the visual system is already spread across multiple platforms; see the second problem).
In order to remedy these issues, a testing platform would have to be designed that could run the audio file in perfect synchronicity with a clock, visually cue participants as to when they must record their answers, and, preferably, record the results and eliminate the need for multiple platforms. Max/MSP was selected for its flexibility and visual interface, and a patch was created to model the original draft of the test.
5.2 Performing The Test
An interface was designed to simultaneously present instructions, run the judgement-based listening test, and record participant results. See Appendix C for the finished patch as well as patching details. Max/MSP allowed for the construction of a very consistent mode of testing, with minimal distractions and a compact design. Participants are presented with the following instructions:
• You will hear a series of alternating tones.
• Please select which of the options above that you hear (melody, two unrelated tones, or unsure) when you see the circle flash.
• Click the red box at the point at which your opinion changes, if any.

Figure: Listening experiment interface

Once participants verbally acknowledge that they understand the procedures, the test is started, and they log their answers via nominal checkboxes each time they see the circle flash. The software is programmed to signal the participant to check a box at :15, :45, 1:15, and 1:45, and each respondent’s selection is recorded to a text file at these 30-second intervals. The exact time that the red box is selected is also reported in minutes and seconds.

Figure: Selections recorded to text
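The cueing and logging behaviour described in this section can be sketched outside Max/MSP as follows. This is only an illustration: the cue times and the three response options come from the text, while the function names and the layout of the log lines are hypothetical and do not reproduce the format written by the actual patch.

```python
# Cue times in seconds, i.e. :15, :45, 1:15 and 1:45 as stated in the text.
CUE_TIMES_S = [15, 45, 75, 105]
OPTIONS = ("melody", "two unrelated tones", "unsure")


def format_mmss(seconds):
    """Format a time in seconds as m:ss, rounded to the nearest second."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"


def log_session(responses, change_time_s=None):
    """Build log lines for one participant (hypothetical log layout).

    responses holds one option per cue; change_time_s is the moment the
    red box was clicked, or None if the participant never clicked it.
    """
    if len(responses) != len(CUE_TIMES_S):
        raise ValueError("expected one response per cue")
    lines = []
    for cue, response in zip(CUE_TIMES_S, responses):
        if response not in OPTIONS:
            raise ValueError(f"unknown option: {response}")
        lines.append(f"{format_mmss(cue)} {response}")
    if change_time_s is not None:
        lines.append(f"change {format_mmss(change_time_s)}")
    return lines
```

Keeping the cue times in one list means the benchmark schedule and the logged timestamps cannot drift apart, which is the consistency the paper version of the test could not guarantee.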
6. Results

6.1 Gathering Results
Participants were allowed to ask any questions before and after the test, but once the test had started they were not allowed to stop or restart.12 Results were collected after each test both digitally (via the Max patch) and physically (recorded onto paper). The results were then entered into a spreadsheet and double-checked against both sets of records.

6.2 Results Analysis
The results of these 14 tests were as follows:
• Average Separation Rate: 10.64
• Average Semitone Distance: 5
• Average Time of Indicated Separation: 1m 5s

Within the Indicated Time of Separation:
• Mean: 64.571s (1:05)
• Median: 62s (1:02)
• Range: 69s (1:09)
• Standard Deviation: 17.256s
• Variance: 297.80

A further examination of the test data is as follows:

Rate:
• Mean: 10.64286
• Median: 10
• Mode: 9, 10
• Range: 14
• Standard Deviation: 3.499
• Variance: 12.247

Semitone:
• Mean: 5
• Median: 5
• Mode: 5, 6
• Range: 7
• Standard Deviation: 1.664
• Variance: 2.769

Please see Appendix D for raw data.

12 Only one participant needed to re-test, as he had mistakenly turned the test off mid-way through. His initial results were discarded.
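The descriptive statistics reported in this section could be computed with a short helper like the one below. It is a sketch rather than the method actually used: the function names are hypothetical, and it assumes sample (n - 1) standard deviation and variance, which the text does not specify. The raw data from Appendix D is not reproduced here, so the function is meant to be called on that data.

```python
import statistics

# Hypothesised window from section 3, in seconds (:48 to 1:26).
H1_WINDOW_S = (48, 86)


def describe(values):
    """Return the summary statistics listed in section 6.2."""
    ordered = sorted(values)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "modes": statistics.multimode(ordered),    # every most-common value
        "range": ordered[-1] - ordered[0],
        "stdev": statistics.stdev(ordered),        # sample SD (assumption)
        "variance": statistics.variance(ordered),  # sample variance (assumption)
    }


def format_mmss(seconds):
    """Format a time in seconds as m:ss, rounded to the nearest second."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"


def within_h1(seconds):
    """True if a separation time falls inside the hypothesised window."""
    return H1_WINDOW_S[0] <= seconds <= H1_WINDOW_S[1]
```

Applied to the reported mean separation time, `format_mmss(64.571)` gives `1:05`, and `within_h1(64.571)` is true, since 64.571 s lies between 48 s and 86 s.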
7. Conclusion
The results we collected fall well within the scope of our posited hypothesis. As in the Bregman and Campbell study, our test subjects found that the melody of three tones separated into two separate streams of tones at a point of increased frequency separation and rate, at an average time of approximately 1:05. Referencing the graph in Appendix A, we can see that this point falls at a semitone distance of 5 and a rate of 11, roughly the half-way point for both variables. It is worth noting that several respondents experienced a return to a three-tone melody at a point far after the time at which they heard the streams separate. This is reminiscent of the Bethell-Fox study, which demonstrated that, when faced with a polytonal series of pulses, listeners will hear differing results over the course of several cycles.13

Though the data we gathered fell well within the predicted range of results, it was fascinating to observe the variance from subject to subject. Three separate respondents reported hearing a melody at all four benchmarks of the test, indicating that, though they marked a point at which the streams appeared to segregate, they experienced a reintegration after the 1:45 mark. Although this is a phenomenon that deserves further exploration, it is unfortunately beyond the scope of the results of our particular test.

13 Darwin, Bethell-Fox 1977, p666
Appendix A
Breakdown of Changes in Rate and Semitone Intervals over Time

Appendix B
Original Draft of Listening Test

Appendix C
Max/MSP Patch - Finished Interface and Construction

Appendix D
Raw Data
Bibliography
Bregman, A. (2001). Sequential Integration. Auditory Scene Analysis: The Perceptual Organization of Sound. 4th ed. Massachusetts: The MIT Press. 48-97.
Darwin, C.J. and Bethell-Fox, C.E. (1977). Pitch continuity and speech source attribution. Journal of Experimental Psychology: Human Perception and Performance. 3, 665-672.
Darwin, C.J. (1997). Auditory Grouping. Trends in Cognitive Sciences. 1 (9), 327-333.
Moore, B. (2003). An Introduction to the Psychology of Hearing. 5th ed. San Diego: Academic Press.
Schnupp, J, Nelken, E, King, A. (2010). Streaming in the Galloping Rhythm Paradigm. Available: http://mustelid.physiol.ox.ac.uk/drupal/?q=topics/streaming-galloping-rhythm-paradigm. Last accessed 15th Jan 2012.
Thompson, S. K., Carlyon, R. P., & Cusack, R. (2011, April 11). An Objective Measurement of the Build-Up of Auditory Streaming and of Its Modulation by Attention. Journal of Experimental Psychology: Human Perception and Performance. Advance online publication. doi: 10.1037/a0021925