

How Can Elementary Teachers Measure Singing Voice Achievement? A Critical Review of Assessments, 1994-2009

Update XX(X) 1–8
© MENC: The National Association for Music Education 2010
Reprints and permission: http://www.sagepub.com/journalsPermissions.nav
DOI: 10.1177/8755123310378454
http://update.sagepub.com

Karen Salvador1

Abstract

The first content standard of the National Standards for Music Education requires that students sing, alone and with others, a varied repertoire of music. Although state and district elementary music curricula vary widely, many are based on the National Standards for Music Education and therefore include singing as a primary content area and method of teaching and learning music. However, classroom assessments of singing voice achievement and development vary widely, and information about reliability and validity of these assessments is rarely reported. The purpose of this article was to identify and discuss measurements of singing voice achievement for elementary aged students that have been used in research studies from dissertations and refereed music education journals since the publication of the National Standards for Music Education in 1994. The author describes each measurement tool, discusses its validity and reliability, and evaluates the practicality of each measure for classroom use by elementary general music teachers. Finally, the author recommends how one of these measures might be used to improve instruction in an elementary music classroom.

From the time when Lowell Mason convinced the Boston School Board to offer vocal music, singing has been a cornerstone of public school music curricula (Mark & Gary, 1992). The first content standard of the National Standards for Music Education requires that students sing, alone and with others, a varied repertoire of music. The elementary achievement standard states that by fourth grade, students should be able to “sing independently, on pitch and in rhythm, with appropriate timbre, diction, and posture, and maintain a steady tempo” (Consortium of National Arts Education Associations, 1994).
Although state and district elementary music curricula vary widely, many are based on the National Standards for Music Education and therefore include singing as a primary content area and method of teaching and learning music. According to MENC: the National Association for Music Education’s Performance Standards for Music Grades PreK–12: Strategies and Benchmarks for Assessing Progress Toward the National Standards (1996), the purpose of assessment is to improve learning. Furthermore, reliable, valid assessment in music is not only possible but also necessary. When discussing assessments, reliability refers to the consistency of results, whereas validity is the degree to which an assessment measures the skill or ability it was designed to measure. If an assessment is reliable, the same child would score nearly the same if the test were given again or if another judge evaluated her performance. Reliability is necessary for an assessment to be considered valid. For a test of singing voice achievement to be deemed valid, it must actually measure the child’s ability to sing.

Given the importance of singing in elementary music curricula, one may expect widespread reliable, valid assessment of students’ vocal development and singing achievement. However, in a survey of 200 elementary music teachers in Michigan, the 36 responding teachers reported low rates of singing assessment, from 40% in kindergarten to 13% in fourth grade (Talley, 2005). Those who assessed singing reported measuring singing voice development and pitch matching, and used teacher-designed rating scales or rubrics. Talley’s survey asked what areas teachers assessed at which grade levels and the methods they used, but did not address the reliability or validity of the measurements. Shih (1997) undertook a similar survey of fifth grade general music teachers in Texas. Of 59 respondents, approximately 63% reported “always” or “often” teaching singing objectives, and 33% “sometimes” or “rarely” addressed singing in the music classroom. Approximately 93% of respondents stated they assessed singing voice achievement. However, 65.17% of these assessments were “checking group performance,” which did not seem sufficient to provide diagnostic information for

Michigan State University, East Lansing, MI, USA

Corresponding Author: Karen Salvador, 1205 Woodbine Avenue, Lansing, MI 48910, USA Email: huberkar@msu.edu


teachers to help individual students increase their singing achievement (Shih, 1997). Although 26.6% of teachers in this study reported “checking individual performance,” Shih did not ask what measures teachers used to “check” the individual performances.

Hepworth-Osiowy (2004) surveyed 190 elementary music teachers in Winnipeg, Canada, concerning assessment in their classrooms. Her 88 respondents indicated widespread use of a variety of assessment tools, and stated that assessment was most valuable when it informed instruction. The data did not present assessments by content area. Hepworth-Osiowy (2004) concluded,

In order to assess music students successfully, teachers can utilize a variety of tools ranging from traditional testing, checklists, and rating scales to alternative assessment techniques including self and peer assessments, portfolios, and many teacher-created assessments. To experience effective assessment, music teachers need to find a balance of strategies that they can incorporate into their individual programs, which reflect student achievement of curricular goals and objectives. (p. 115)

Information concerning the reliability and validity of classroom assessments was noticeably absent from the above studies. The purpose of this article was to identify and discuss measurements of singing voice achievement for elementary aged students that have been used in research studies from dissertations and refereed music education journals since the publication of the National Standards for Music Education in 1994. I described each measurement tool, discussed its validity and reliability, and evaluated the practicality of each measure for classroom use by elementary general music teachers. To conclude, I made recommendations for how one of these measures might be used to improve instruction in an elementary music classroom.

Review of Assessments

In the music education literature, tools for assessment of singing voice achievement fell into the following categories: (a) acoustic measurements, (b) researcher-as-acoustic measurement tool, (c) researcher-designed measures, and (d) previously published scales. Acoustic measurements evaluated singing voice achievement (specifically pitch accuracy) based on the physical properties of sung responses (i.e., the frequency in hertz). Researcher-as-acoustic measurement tool referred to studies in which researchers measured pitch accuracy by using their perceptual skills as trained music teachers. Researcher-designed measures were used in a number of studies with regard to a variety of aspects of singing voice development. Finally, several researchers used previously published measures of singing voice achievement.

Acoustic Measurements

In studies based on acoustic measurements, researchers relied on computers to assess the accuracy of children’s singing. Most of these studies used hardware and software manufactured by Kay Elemetrics for use in speech-language therapy (Cooper, 1995; Martinez-Castillo & Sotillo, 2008; Western, 2002). In Cooper (1995), 169 first- to fifth-grade children echo-sang patterns in response to a stimulus tape of a child’s voice. Singing was recorded with a throat-contact microphone and analyzed with a Visi-pitch (Kay Elemetrics, model 6087PC, Pine Brook, NJ), yielding a vocal pitch accuracy (VPA) score. Although computerized measurements are inherently reliable, VPA scores may not be a valid measure of singing achievement, because they only revealed an absolute value of how far the sung pitches differed from the criterion pitch. VPA scores represent an average amount of pitch divergence, so this method would not differentiate between a child who sang the same pitch in the middle of the range of a pattern and a child who replicated a pattern exactly but on different pitches. Western (2002) used a real-time pitch program, model 5121 of the Computer Speech Laboratory 4100 (Kay Elemetrics) to measure pitch accuracy in singing. This software package allowed children to see the contour of their singing and speaking, although their performance was still recorded as a single accuracy score, which measured absolute difference from criterion pitches.

Price, Yarbrough, Jones, and Moore (1994) also used computerized methods to assess pitch accuracy as a measure of singing achievement. In this study, inaccurate singers echo-sang pairs of pitches and single pitches into a MacRecorder. To control for differences of pitch on attack and decay, the researchers looped the most stable or sustained part of each sung response (typically the middle third). This looped pitch was measured with a Korg Model AT-12 chromatic tuner.
Using this method, one experimenter analyzed all stimuli and responses, and an independent interreliability observer analyzed 25%. The judges agreed on 76% of responses, and were within an eighth-step increment on 97% of pitches. This method of recording children on a computer, looping the most stable part of the pitch response, and then confirming the accuracy of the looped sample with a tuner was reliable but lacked validity as a measure of singing achievement. This approach also lacked practicality for the classroom setting: It would be too time consuming for teachers, and did not represent an authentic or holistic measure of how well a child actually matched pitch or sang. Looping the most stable portion of a child’s response masked inaccuracies that occurred on attack and decay, which are important to a teacher’s evaluation of singing voice achievement.

Assessment using computerized measurement of pitch accuracy may not be optimal for classroom use by a general music teacher. On initial inspection, a computerized system seemed viable: accurate and convenient. Children could go sing one at a time in response to stimulus recordings, and the computer would impartially record the student’s pitch accuracy. In addition, associated game packages that provided immediate visual feedback (enabling a child to see the contour of his or her vocalizations in real time) presented exciting potential for the remediation of inaccurate singing. However, these packages were expensive. More important, although the computers were highly reliable, VPA scores may not be valid measures of singing achievement because they did not reveal more information than an absolute measure of difference from a set of criterion pitches. VPA did not report important information concerning transposition, contour, register, breathing, tone quality, and so on. To improve instruction, music teachers would benefit from more diagnostic information than computerized methods provided.
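The limitation described above can be illustrated with a small sketch. It assumes, purely for illustration, that a VPA-style score is the mean absolute distance between sung and criterion pitches, and it measures pitch in semitones (MIDI note numbers) rather than hertz. The function names, the pattern, and the two children's responses are hypothetical and are not taken from Cooper (1995) or from any Kay Elemetrics product.

```python
from statistics import mean


def mean_abs_deviation(sung, criterion):
    """Average absolute distance, in semitones, between sung and criterion pitches."""
    if len(sung) != len(criterion):
        raise ValueError("sung and criterion must have the same length")
    return mean(abs(s - c) for s, c in zip(sung, criterion))


def intervals(pitches):
    """Successive intervals in semitones; unchanged by an exact transposition."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


# Hypothetical criterion pattern mi-re-do-re as MIDI note numbers (E4 D4 C4 D4).
criterion = [64, 62, 60, 62]
monotone_child = [62, 62, 62, 62]      # sings one pitch throughout
transposing_child = [63, 61, 59, 61]   # exact pattern, one semitone low

monotone_score = mean_abs_deviation(monotone_child, criterion)
transposed_score = mean_abs_deviation(transposing_child, criterion)

# The averaged score is identical for both children...
print(monotone_score, transposed_score)

# ...but only one of them reproduced the melodic contour.
monotone_keeps_contour = intervals(monotone_child) == intervals(criterion)
transposed_keeps_contour = intervals(transposing_child) == intervals(criterion)
print(monotone_keeps_contour, transposed_keeps_contour)
```

Under these assumptions the two responses receive the same averaged score even though they call for different instruction, which is the kind of diagnostic information a single accuracy number leaves out.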

Researcher-as-Acoustic Measurement Tool

Some researchers used their perceptual skills to test pitch accuracy as a measure of singing voice achievement (Brophy, 1997; Green, 1994; Klinger, Campbell, & Goolsby, 1998; Moore, 1994; Moore, Fyk, Frega, & Brotons, 1995; Muse, 1994; Phillips & Aitchison, 1997a, 1997b; Phillips, Aitchison, & Nompula, 2002). In these studies, pitch accuracy was among the criteria or the sole criterion for the assessment of singing voice achievement, and humans were used as the measurement tool. Responses were not rated with a scale (as in the section “Researcher-Developed Measures”), but were simply marked as accurate or inaccurate. The reliability and validity of these perceptual measurements were dependent on research design. Studies in which multiple, trained judges blindly scored randomized recordings of student performances were more reliable and valid than studies in which a child sang for or with a single researcher who wrote perceptions in real time.

For example, Moore et al. (1995) recorded 120 children echo-singing intervals with a stimulus CD. Two judges listened to the recordings and notated responses, recording not merely whether a response was correct or incorrect but also what pitches were actually sung. Accurate intervals were judged as accurate regardless of pitch. One judge scored all responses, and another scored 25%. The judges agreed 85% of the time on the accuracy of sung responses. This method seemed rigorous and reliable as a method of measuring interval-singing achievement. In contrast, Phillips and Aitchison (1997a) measured the pitch accuracy of 269 children singing four tonal patterns (5-3-1, 5-8-7, 6-4-2, 3-5-1) echoing an electric keyboard. The researcher marked correct pitch or incorrect pitch as the children sang (possible score of 12). The method did not specify if a response would be correct or incorrect if a child started on the wrong pitch but still sang 5-3-1. The article did not indicate that examples were recorded for blind review, and the design did not incorporate intrajudge reliability testing or include other judges for interjudge reliability.

Informal assessment of singing achievement in elementary general music classes may frequently involve “teacher-as-acoustic measurement.” This is the case anytime a teacher uses a game song with a solo part (like Doggie, Doggie, Where’s Your Bone) and marks a plus (+) or a minus (-) in the gradebook if a child sings accurate pitches on the solo portion. Learning sequence activities (LSAs) in music learning theory are also evaluated using the teacher’s perceptual skills. LSAs are sets of tonal and rhythm instructional materials, which children either imitate or respond to in other ways, such as singing the resting tone, adding solfège, or creating a related sung or chanted answer. Although Brophy (1997) found that singing games were a fairly reliable measure of pitch accuracy (interjudge reliability of .73), he concluded that they are best used as part of a larger, multiple context assessment scheme. Teachers who listen to a student in context and attempt to record in the moment (or later) if singing was accurate are vulnerable to biases that reduce the reliability of these assessments. Although teachers are encouraged to find authentic methods of assessment, they must also ensure that assessments are as reliable and valid as possible.
Perhaps, as Brophy suggested, teachers should combine perceptual evaluation of pitch accuracy in game songs (and LSAs) with other measures such as using rating scales on longer recorded examples of individual singing. Teachers could even experiment with blind ratings. Teachers could increase the validity of their testing methods by rating recordings once, letting them sit for a week, and rating them again to see how consistent the ratings were. Teachers could also swap recordings with another teacher to test for reliability in ratings.
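One simple way a teacher might check the consistency of two rating passes (the same recordings rated a week apart, or rated by a colleague) is to count how often the two ratings match exactly and how often they fall within one point of each other. The sketch below is only an illustration: the function name, the 5-point ratings, and the one-point tolerance are assumptions, and percent agreement is a rougher index than the reliability coefficients reported in the studies reviewed here.

```python
def agreement_rates(first, second, tolerance=1):
    """Return (exact agreement, agreement within `tolerance` points) as proportions."""
    if len(first) != len(second) or not first:
        raise ValueError("rating lists must be non-empty and of equal length")
    diffs = [abs(a - b) for a, b in zip(first, second)]
    exact = sum(d == 0 for d in diffs) / len(diffs)
    within = sum(d <= tolerance for d in diffs) / len(diffs)
    return exact, within


# Hypothetical ratings of the same ten recordings on a 5-point scale.
week_one = [5, 4, 3, 3, 2, 4, 1, 5, 3, 2]
week_two = [5, 3, 3, 4, 2, 4, 1, 3, 3, 2]

exact, within_one = agreement_rates(week_one, week_two)
print(f"exact agreement: {exact:.0%}, within one point: {within_one:.0%}")
```

The same function applies unchanged when the second list comes from another teacher rather than from a second listening, so one helper covers both the intrajudge and the interjudge check suggested above.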

Researcher-Developed Measures

Several researchers designed their own rating scales to measure singing voice achievement (Gault, 2000; Guilbaut, 2004; Hornbach & Taggart, 2005; Marshall, 2002; Newlin, 2004; Phillips & Aitchison, 1997a, 1997b; Welch, Sergeant, & White, 1995/1996, 1997). These scales differed from “researcher-as-acoustic measurement tool” because they assessed more than simple pitch accuracy. These rating scales were usually continuous scales, such as the scale in Table 1. In that study, this scale was used by three judges, and attained reliabilities ranging from .76 to .92. These reliabilities were not surprising, because the scale seemed constructed in a way that would be difficult to misinterpret. This scale seemed valid and could be used in the classroom in an authentic manner to evaluate any independent singing.

Table 1. Researcher-Designed Scale (Hornbach & Taggart, 2005)

Rating  Description
5       Child is nearly or totally accurate singer
4       Child sings with some accuracy, beginning in the established key
3       Child sings song with some accuracy, starting in a different key than established, or modulates within the song
2       Child sings/chants melodic shape at significantly different pitch
1       Child sings/chants song with a different melodic contour than the song

Newlin’s Tonal Rating Scale (2004; see Table 2) seemed more vulnerable to disagreements on how to rate a given response. For example, how could a student sing every pitch “correctly” but “lack precise intonation” (rating 4)? Where exactly is the line between “lack[ing] precise intonation” and an incorrect pitch? Newlin’s scale is also problematic for classroom use because rating “1” groups together singers with different needs, rendering the scale less useful as a diagnostic instrument for the improvement of instruction. That is, a student who “sings few pitches correctly” is ready for different modes of singing instruction than a child who “uses a speaking voice.” However, in the study, the scale achieved acceptable interjudge reliabilities (.785 to .878).

Table 2. Researcher-Designed Scale (Newlin, 2004)

Rating  Description
5       Sings every pitch correctly, maintaining the given tonal center with precise intonation
4       Sings every pitch correctly, maintaining the given tonal center, but lacks precise intonation
3       Sings most pitches correctly, maintaining the given tonal center, but lacks precise intonation
2       Sings most or all pitches correctly, but not within the given tonal center
1       Sings few pitches correctly OR sings with an inconsistent or changing tonal center OR uses a speaking voice

I will not describe or evaluate every researcher-designed scale. The above scales were representative of most studies, in which the researchers designed a single scale of some kind to measure singing voice achievement.

Table 3. Boardman Scale (in Mathias, 1997)

Rating  Description
7       Accurate matching of all tones in the pattern without hesitation
6       The child slid into one or more of the pitches in the pattern
5       Exact transposition of the pattern
4       Maintained general contour of pattern, but sang incorrect intervals
3       Maintained general direction of the pattern, but not the exact contour
2       Responses that ignored the contour of pitches
1       Child spoke rather than sang or did not respond at all

Researcher-designed scales could be adapted for classroom use, and may in fact be similar to the self-designed rating scales and rubrics that some teachers already use to assess singing achievement (Talley, 2005). However, in designing or adapting a scale for use in the classroom, teachers must carefully ensure (a) that the scale is valid for the measurement of singing achievement, (b) that the researchers achieved acceptable reliability using the scale and/or that they achieve acceptable reliability in their classroom, (c) that the criteria are unambiguous, (d) that the scale could be incorporated into the classroom setting in the most authentic way possible, and (e) that the scale provides sufficient diagnostic information to show what each child needs to learn next on the path to increased singing achievement.

Previously Published Scales

Researchers also used rating scales developed and published by other researchers, including those by Boardman (1964; cited in Mathias, 1997), Wurgler (1990), and Rutkowski (1990). Mathias (1997) rated singing excerpts from 78 first, third, and fifth graders using a scale developed by Boardman (Table 3). Among three judges rating unlabeled, randomized tapes of pre-, post-, and repeated posttests, 48% of ratings were in perfect agreement, and 80% of ratings were within one point. Although it was not discussed in Mathias’s (1997) article, some of the disagreement may have resulted from confusion about how to rate a child who exactly transposed the pattern (rating 5) but also slid into a pitch, or how much the child needed to “slide” to fall from a (7) to a (6). Furthermore, judges may have struggled to decide between “maintaining the general contour” (rating 4) and singing the correct direction (rating 3). Although the interrater reliabilities indicated that this scale was not as reliable as computerized acoustic measures of pitch matching, it may be a more valid measure of singing achievement because it provided diagnostic information,


based on an authentic sample of singing, that would help a teacher know what a child needed to learn in order to become a better singer. Despite this, classroom teachers may wish to use other scales with clearer criteria and higher reliabilities.

McGraw (1996) used a rating scale devised by Wurgler in 1990 to assess children’s ability to negotiate register changes as a measurement of singing voice achievement. Wurgler’s register transition categories (cited in McGraw, 1996) are shown in Table 4. It was difficult to determine the reliability of this scale. McGraw (1996) rated students using Wurgler’s scale but did not perform intrajudge reliability calculations or have any other judges rate the performances for interjudge reliability. In her dissertation, Wurgler (1990) reported .70 interjudge reliabilities on “a representative sample of items” (p. 75). Wurgler’s scale provided unusually detailed information about the vocal abilities of children. However, the scale seemed designed for use by private voice teachers, with the expertise and the time to coach individuals during the assessment (see Categories 2 and 4 in Table 4). Wurgler seemed to presume an awareness of vocal pedagogy that may be absent in some elementary music teachers, who may not have studied singing enough to recognize register shifts or to evaluate ease of vocal production without additional training. To a teacher with the requisite training and enough time to coach individual students through the test songs and scales, this assessment could provide valuable diagnostic information to help increase singing voice achievement in individual students.

Table 4. Register Transition Categories (Wurgler, 1990)

Category  Description
1         Sings only in one register; range is limited (no register transition on test songs or scales)
2         Independently chooses only one register but can find pitches in second register with help (no register transition on test songs or scales)
3         Uses two (or more) registers with marked breaks in production (can find pitches using a second register, but subject unable to sing continuously throughout the tested range on songs, C1 to E2. Thus, can make a register transition but the “lowest” level of transition skill on the scale)
4         Uses two (or more) registers with ease, but must be reminded to change production (inconsistent performance, but register transition on some test songs and scales)
5         Independently uses two (or more) registers with ease; no production problems related to range

Rutkowski’s Singing Voice Development Measure (SVDM) was the most frequently used published scale (Guerrini, 2002; Jaffurs, 2000; Lange, 1999; Levinowitz et al., 1998; Rutkowski, 1996; Rutkowski, Campbell, & Miller, 2001; Rutkowski & Miller, 1994, 2003). The SVDM was more standardized than other measures identified in this article. Specific test patterns, details concerning test administration, and information concerning the development of the test were described in Rutkowski’s work (e.g., Rutkowski & Miller, 2001) and replicated in the work of others (e.g., Jaffurs, 2000). SVDM consisted of the ratings found in Table 5, although some researchers chose to include only the whole-number ratings. Reliabilities in Rutkowski’s studies ranged from .80 to .99 (e.g., Rutkowski, 1996; Rutkowski & Miller, 1994, 2003). In 1998, Levinowitz et al. investigated the reliability of the SVDM for children in first to sixth grades, and reported composite reliabilities from six judges ranging from .950 to .984. As shown by the citations above, the SVDM was deemed valid to measure singing voice achievement by a number of researchers. SVDM was designed for use with a specific set of tonal material (a song in harmonic minor that is divided into tonal patterns for children to echo) but could also be used to evaluate singing achievement on intact songs (Guerrini, 2002).

Table 5. Singing Voice Development Measure (SVDM)

Rating  Description
1       “Presinger” does not sing but chants the text
1.5     “Inconsistent Speaking-Range Singer” sometimes chants, sometimes sustains tones, and exhibits some sensitivity to pitch but remains in the speaking voice range (usually A2 to C3)
2       “Speaking-Range Singer” sustains tones and exhibits some sensitivity to pitch but remains in the speaking voice range (usually A2 to C3)
2.5     “Inconsistent Limited-Range Singer” wavers between speaking and singing voice and uses a limited range when in singing voice (usually up to F3)
3       “Limited Range Singer” exhibits use of limited singing range (usually D3 to F3)
3.5     “Inconsistent Initial Range Singer” sometimes only exhibits use of limited singing range, but other times exhibits use of initial singing range (usually D3 to A3)
4       “Initial Range Singer” exhibits use of initial singing range (usually D3 to A3)
4.5     “Inconsistent Singer” sometimes only exhibits use of initial singing range, but other times exhibits use of extended singing range (sings beyond the register lift: B3 flat and above)
5       “Singer” exhibits use of extended singing range (sings beyond the register lift: B3 flat and above)

This scale may be well suited to classroom use for a general music teacher who wants to improve instruction. SVDM was practical and clear, and its reliability has been firmly established. In addition, it could be used to measure performance of any song with sufficient range (from D3 to B3 and above). Furthermore, it provided diagnostic information within a developmental framework that indicated what a child needed to learn next in order to increase singing achievement.

Discussion

The purpose of this article was to review the literature on the measurement of singing voice achievement in elementary music settings to help music teachers find and/or design reliable, valid singing voice assessments for use in the classroom. This review revealed several types of singing voice assessments, including acoustic measurements, researcher-as-acoustic measurement tool, researcher-designed measures, and previously published scales. Some measurements focused exclusively (and sometimes acontextually) on pitch-matching, whereas others used more holistic methods that described developmental stage, registration, or other features of a child’s singing voice.

Rutkowski’s (1990) SVDM was studied most frequently by researchers. It is interesting to note that SVDM does not include measurement of pitch accuracy. Rutkowski developed her scale to identify the steps children go through on the path to achieving singing accuracy because she viewed singing to be a developmental skill that required time, context, and maturity. This viewpoint has been supported by additional research since 1990 that indicates singing accurately may be a matter of physical skill related to vocal production in addition to tonal awareness (Pfordresher & Brown, 2007; Phillips & Aitchison, 1997a, 1997b). That is, a child may “hear” the correct pitch but lack the singing skill to produce it. Singing pitches accurately is the endpoint of a developmental progression. Therefore, simply measuring pitch accuracy is not a sufficient measure of singing achievement to allow a teacher to improve instruction, which is the goal of assessment. Furthermore, a child’s ability to match single pitches or acontextual pairs of pitches may not be a valid indicator of singing voice achievement for singing songs or contextual tonal patterns.
Research also indicated that any test of singing achievement must be a sufficiently wide range of pitches (at least D3 to B4) to show if a child can sing accurately in multiple registers to more accurately reflect singing voice development (e.g., Rutkowski, 1990; Wurgler, 1990). According to this review of literature, many elementary music teachers consider singing to be an important curricular area (Shih, 1997) and some teachers assess singing voice achievement or pitch matching (Shih, 1997; Talley, 2005). The National Standards for Music Education require not only that children would sing on pitch (and in rhythm) but also that they would sing with “appropriate timbre, diction, and posture” (Consortium of National Arts Education Associations, 1994). These aspects of

Update XX(X) singing were not measured by the researchers in any of the literature reviewed for this article. Perhaps researchers considered “appropriate” timbre, diction, and posture too subjective, or thought that these goals were secondary to the primary goal of singing accurately. This review was conceived under the premise that consistent use of valid assessment tools could be an important step toward helping students to sing accurately. However, the research revealed that many elementary music teachers do not assess singing voice achievement (Talley, 2005) and/or have negative feelings about assessment (Shih, 1997; Talley, 2005). In contrast, HepworthOsiowy (2004) reported that elementary music teachers in her study: . . . felt assessment is critical for giving direction to one’s teaching and helps teachers to establish direction for instructional planning. They said that assessment provides students with valuable feedback, that it is a way to track individual progress, and that it often motivates student learning. The fact that assessment provides teachers with an opportunity to identify the skills and concepts that need to be addressed was an issue of primary importance to most respondents, as was the idea that assessment was a bona fide method of evaluation programs and the effectiveness of individual teaching practices. (p. 94) To show how formal assessment could be used to improve singing instruction in an elementary music classroom, I have constructed the following fictional vignette. In it, I chose to use SVDM for two reasons: (a) based on my review, SVDM is the scale with the most research to support its use in the classroom (both in terms of the amount of the research as well as the strength of the research support for the reliability and validity of the measure) and (b) SVDM provides a developmental framework that guides instruction. A score on the SVDM essentially tells a teacher what the student needs to learn next. Last week, Mrs. 
Smith administered the SVDM to all her first-grade general music students. Individual students came over to the piano to sing for her while other students worked at music centers. Mrs. Smith marked the SVDM scores in green next to each child’s name in her gradebook. Now, Mrs. Smith is preparing to teach each of these classes after lunch. While she eats, she reviews the SVDM scores. Because most children scored between 2.5 and 4,¹ she plans to continue her daily use of vocal play (sirens, ghosts, etc.) to help children access their singing voices. The songs she has chosen for movement activities and rhythm concepts are primarily within the initial range (D3-A3) for the next month or so, to help those who are still singing in a limited range. Two of the songs she has selected have descants that the students who scored 4.5 or 5 could sing. Once a week, during music centers time, one of the centers will be “voice lessons” at the piano with Mrs. Smith. Based on SVDM scores, Mrs. Smith has created heterogeneous groups for centers time so that students can learn from singing with one another during voice lessons. She plans to use this small-group instruction time to give individualized feedback and help to each student, guided in part by scores on the SVDM. At the end of the year, Mrs. Smith will administer the SVDM again as a measure of individual growth in singing achievement.

This article was intended to provide researchers and practicing music teachers with research-based information about the measurement of singing voice achievement. The preceding fictional vignette described only one of many possible approaches to implementing assessment of singing voice achievement in order to improve instruction for individual students in the elementary music classroom. Regardless of what measure is used, it is important that music teachers consistently use some valid measurement of singing voice achievement on individual students, not only as a form of summative assessment but also to inform their instruction. Although new studies are always helpful, the current research literature provided excellent information on this important topic. In addition to adding to this already rich store of research into how we can assess singing voice achievement, new research concerning practicing teachers’ design, implementation, and applications of singing voice measurements would be most helpful.

Note

1. See Table 5.

Declaration of Conflicting Interests

The author(s) declared no conflicts of interest with respect to the authorship and/or publication of this article.

Funding

The author(s) received no financial support for the research and/or authorship of this article.

References

Brophy, T. S. (1997). Authentic assessment of vocal pitch accuracy in first through third grade children. Contributions to Music Education, 24(1), 57-70.

Consortium of National Arts Education Associations. (1994). The National Standards for Arts Education. Reston, VA: MENC. Retrieved from http://artsedge.kennedy-center.org/teach/standards/standards_k4.cfm#02

Cooper, N. A. (1995). Children’s singing accuracy as a function of grade level, gender, and individual versus unison singing. Journal of Research in Music Education, 43, 222-231.

Gault, B. M. (2000). The effects of pedagogical approach, presence/absence of text, and developmental music aptitude on the song performance accuracy of kindergarten and first-grade students (Unpublished doctoral dissertation). The Hartt School, University of Hartford, West Hartford, CT.

Green, G. A. (1994). Unison versus individual singing and elementary students’ vocal pitch accuracy. Journal of Research in Music Education, 42, 105-114.

Guerrini, S. C. (2002). The acquisition and assessment of the developing singing voice among elementary students (Unpublished doctoral dissertation). Temple University, Philadelphia, PA.

Guilbaut, D. M. (2004). The effect of harmonic accompaniment on the tonal achievement and tonal improvisations of children in kindergarten and first grade. Journal of Research in Music Education, 52, 64-77.

Hepworth-Osiowy, K. (2004). Assessment in elementary music education: Perspectives and practices of teachers in Winnipeg public schools (Unpublished master’s thesis). University of Manitoba, Winnipeg, Canada.

Hornbach, C., & Taggart, C. (2005). The relationship of developmental tonal aptitude and singing achievement among kindergarten, first-, second-, and third-grade students. Journal of Research in Music Education, 53, 322-331.

Jaffurs, S. E. (2000). The relationship between singing achievement and tonal music aptitude (Unpublished master’s thesis). Michigan State University, East Lansing.

Lange, D. M. (1999). The effect of the use of text in music instruction on the tonal aptitude, tonal accuracy, and tonal understanding of kindergarten students (Unpublished doctoral dissertation). Michigan State University, East Lansing.

Levinowitz, L. M., Barnes, P., Guerrini, S., Clement, M., D’April, P., & Morey, M. J. (1998). Measuring singing voice development in the elementary general music classroom. Journal of Research in Music Education, 46, 35-47.

Mark, M. L., & Gary, C. L. (1992). A history of American music education. New York, NY: Schirmer Books.

Marshall, H. D., III. (2002). Effects of song presentation method on pitch accuracy of third grade children (Unpublished doctoral dissertation). Temple University, Philadelphia, PA.

Mathias, S. L. (1997). A teaching technique to aid the development of vocal accuracy in elementary students (Unpublished doctoral dissertation). The Ohio State University, Columbus.

Martinez-Castillo, P., & Sotillo, M. (2008). Singing abilities in Williams syndrome. Music Perception, 25, 449-469.


McGraw, A. G. B. (1996). An assessment of the effectiveness of vocalizes in training elementary school children to sing using head voice (Unpublished doctoral dissertation). University of Georgia, Athens, GA.

Moore, R. S. (1994). Effects of age, sex, and melodic/harmonic patterns on vocal pitch-matching skills of talented 8-11 year olds. Journal of Research in Music Education, 42, 5-13.

Moore, R. S., Fyk, J., Frega, A. L., & Brotons, M. (1995). Influences of culture, age, gender, and two-tone melodies on interval matching skills of children from Argentina, Poland, Spain, and the USA. Bulletin of the Council for Research in Music Education, 127, 127-135.

Muse, M. B. (1994). A comparison of two methods of teaching singing to primary children: An attempt to determine which of two approaches to teaching singing is more effective (Unpublished master’s thesis). University of Louisville, Louisville, KY.

Performance Standards for Music Grades PreK–12: Strategies and Benchmarks for Assessing Progress Toward the National Standards. (1996). Reston, VA: MENC. Retrieved from http://www.menc.org/resources/view/performance-standards-formusic-assessment-strategies-for-music

Newlin, G. A. (2004). The effects of part-work instruction on first grade part-singing acquisition and achievement (Unpublished doctoral dissertation). The Hartt School, University of Hartford, West Hartford, CT.

Pfordresher, P. Q., & Brown, S. (2007). Poor-pitch singing in the absence of “tone deafness.” Music Perception, 25, 95-115.

Phillips, K. H., & Aitchison, R. E. (1997a). Effects of psychomotor instruction on elementary general music students’ singing performance. Journal of Research in Music Education, 45, 185-196.

Phillips, K. H., & Aitchison, R. E. (1997b). The relationship of singing accuracy to pitch discrimination and tonal aptitude among third-grade students. Contributions to Music Education, 24, 7-22.

Price, H. E., Yarbrough, C., Jones, M., & Moore, R. (1994). Effects of male timbre, falsetto, and sine-wave models on interval matching by inaccurate singers. Journal of Research in Music Education, 44, 353-368.

Rutkowski, J. (1990). The measurement and evaluation of children’s singing voice development. The Quarterly, 1(1-2), 81-95.

Rutkowski, J. (1996). The effectiveness of individual/small group singing activities on kindergarteners’ use of singing voice and developmental music aptitude. Journal of Research in Music Education, 44, 353-368.

Rutkowski, J., & Miller, M. S. (1994). The longitudinal effectiveness of individual/small-group singing activities on children’s use of singing voice and developmental music aptitude. Bulletin of Research in Music Education, 20, 31-43.

Shih, T.-T. (1997). Curriculum alignment of general music in central Texas: An investigation of the relationship between the essential elements, classroom instruction, and student assessment (Unpublished doctoral dissertation). The University of Texas at Austin.

Talley, K. E. (2005). An investigation of the frequency, methods, objectives, and applications of assessment in Michigan elementary general music classrooms (Unpublished master’s thesis). Michigan State University, East Lansing, MI.

Welch, G. F. (1994). The assessment of singing. Psychology of Music, 22, 3-19.

Welch, G. F., Sergeant, D. C., & White, P. J. (1995/1996). The singing competencies of five-year-old developing singers. Bulletin of the Council for Research in Music Education, 127, 155-162.

Welch, G. F., Sergeant, D. C., & White, P. J. (1997). Age, sex, and vocal task as factors in singing “in tune” during the first years of schooling. Bulletin of the Council for Research in Music Education, 133, 153-160.

Western, B. A. G. (2002). Fundamental frequency and pitch-matching accuracy characteristics of first grade general music students (Unpublished doctoral dissertation). University of Iowa, Ames.

Wurgler, P. S. (1990). A perceptual study of vocal registers in the singing voices of children (Unpublished doctoral dissertation). The Ohio State University, Columbus.

