Notes on Karlheinz Stockhausen's Gesang der Jünglinge
by Bruce Christian Bennett

Karlheinz Stockhausen imagined a "sound-word continuum" in which there is "a continuous transition from listening to comprehension."1 Within this continuum, "speech can approach music and music can approach speech up to the point of dissolution of the boundaries between sound and meaning."2 A spoken word can be heard as having linguistic meaning at one end of the sound-word continuum and as a sound in-and-of-itself at the other. Stockhausen defines five tone characteristics, and their scientific measures, that occur in occidental music:3

1. pitch (harmony/melody) - cps
2. duration (metre/rhythm) - seconds
3. timbre (phonetics) - a formant area in cps
4. volume (dynamics) - phons
5. location (topography) - degrees and meters
Stockhausen's association of timbre and phonetics is particularly interesting when considering Gesang der Jünglinge. Electronic means of analysis, tone generation, and treatment of recorded material allow seemingly dissimilar acoustic phenomena, such as sounds of electronic origin and sounds of vocal origin, to have a hidden relationship that becomes apparent only through the course of a composition. The simplest relation between electronic and vocal timbres, common throughout Stockhausen's work, is the association of white noise with consonants and sinus tones with vowels. This allows for an extension of the notion of a continuum of apprehensibility to a
1 Stockhausen. "Music and Speech", Die Reihe 6 (1964), 59.
2 ibid.
3 Stockhausen. "Music and Space", Die Reihe 5 (1961), 73.
timbral continuum running from white noise to pure tone1 or from consonant to vowel sounds.2 The composition of Gesang der Jünglinge proceeded in part from the idea of blending sung tones "with electronically produced ones to form a mutual sound-continuum: the will of the selected musical arrangement determines how fast, how long, how loud, how soft, how dense, how intricate the tones must be, how great and small the proportions of pitch and timbre must be in which the tones are audible."3 It is significant that the blending of "sung tones" and electronic sounds was attempted, for, as Stockhausen notes, "It would not have been possible to achieve the desired fusion with discontinuous sound sources ... with instruments (especially with respect to timbre)."4 He argued that "sung phones are, in part of their structure, more differentiated than any sound..."5 and consequently Gesang der Jünglinge required a richer palette of electronic sounds than previous electronic music if any continuity between the two sound sources was to be attained.6 The precompositional necessity then became "to arrange everything separate [i.e., various electronically produced sounds and sounds originating with the voice] into as smooth a continuum as possible, and then to extricate the diversities from the continuum and compose them."7 Stockhausen executed this process according to serial procedures. He maintained that only by 'objectifying' the sung tones by submitting them to serial processes could he bring them into the
1 Stockhausen. "Electronic and Instrumental Music", Die Reihe 5 (1961), 63-64.
2 Stockhausen. "Actualia", Die Reihe 1 (1965), 15.
3 Stockhausen. "Music and Speech", Die Reihe 6 (1964), 58.
4 ibid.
5 Stockhausen. "Actualia", Die Reihe 1 (1965), 45.
6 Heikinheimo (1972), 63.
7 Stockhausen. "Music and Speech", Die Reihe 6 (1964), 64.
"sphere of electronic sound."1 Thus, the basic elements of electronic sounds and phones had to be similarly scaled. "Only then can a continuum of timbre be perceived."2 The timbral continuum was divided into three scales for both the vocal materials and the parallel electronic materials:3

   vocal                                            electronic
1. dark vowels (u) <-> light vowels (i)             dark timbre <-> bright timbre
2. vowels <-> consonants                            purely harmonic <-> aleatoric spectra noise bands
3. dark consonants (ch) <-> light consonants (s)    darkest noise <-> brightest noise
Thus, the vowel is a single element in the series of formants and the consonant is a single element in the series of noises.4 Stockhausen then distinguishes eleven basic elements of electronic sounds whose levels of pitch, duration, and dynamic are all controllable. These eleven elements are non-identical in their basis and are basic in that they "cannot be reduced to further varied spectral components."5

1. Sinus tones
2. Sinus tones with periodic frequency modulation
3. Sinus tones with statistical frequency modulation
4. Sinus tones with periodic amplitude modulation
5. Sinus tones with statistical amplitude modulation
6. Periodic combinations of both frequency and amplitude modulation
7. Statistical combinations of both frequency and amplitude modulation
8. Coloured noise with constant density
9. Coloured noise with statistically varied density
10. Periodic sequences of filtered 'beats' (Knacke - clicks)
11. Statistical sequences of filtered 'beats'
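The difference between the 'periodic' and 'statistical' variants in this list can be made concrete in a short sketch. The Python below is an illustration only, not a reconstruction of the studio procedure: the sample rate, the modulation depths and rates, and the helper names (sine_tone, periodic, statistical) are all assumptions of mine, and 'statistical' is rendered simply as a random value held for a short, fixed interval.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second; an arbitrary value for illustration


def periodic(depth, mod_hz):
    """Hypothetical modulator that swings regularly between -depth and +depth."""
    return lambda t: depth * math.sin(2.0 * math.pi * mod_hz * t)


def statistical(depth, step_s, seed=0):
    """Hypothetical modulator that holds a random value in [-depth, depth] for step_s seconds."""
    rng = random.Random(seed)
    held = {}

    def mod(t):
        k = int(t / step_s)  # index of the current hold interval
        if k not in held:
            held[k] = rng.uniform(-depth, depth)
        return held[k]

    return mod


def sine_tone(freq_hz, dur_s, fm=None, am=None, rate=SAMPLE_RATE):
    """Sine tone with optional frequency (fm, offset in Hz) and amplitude (am) modulators."""
    samples = []
    phase = 0.0
    for n in range(int(dur_s * rate)):
        t = n / rate
        freq = freq_hz + (fm(t) if fm else 0.0)
        phase += 2.0 * math.pi * freq / rate  # accumulate phase from the current frequency
        gain = 0.5 + 0.5 * am(t) if am else 1.0  # map am in [-1, 1] to a gain in [0, 1]
        samples.append(gain * math.sin(phase))
    return samples


plain = sine_tone(440.0, 0.5)                                        # element 1
periodic_fm = sine_tone(440.0, 0.5, fm=periodic(30.0, 6.0))          # element 2
statistical_fm = sine_tone(440.0, 0.5, fm=statistical(30.0, 0.02))   # element 3
periodic_am = sine_tone(440.0, 0.5, am=periodic(1.0, 6.0))           # element 4
statistical_am = sine_tone(440.0, 0.5, am=statistical(1.0, 0.02))    # element 5
```

Passing both fm and am to sine_tone gives a rough analogue of elements 6 and 7; the noise and impulse elements (8 to 11) are not covered by this sketch.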
These distinctions in the "micro-time-structure" of elemental tones are significant to Stockhausen's way of thinking, for these
1 Stockhausen. "Actualia", Die Reihe 1 (1965), 45.
2 ibid., 46.
3 Stockhausen. "Music and Speech", Die Reihe 6 (1964), 58.
4 Stockhausen. "Actualia", Die Reihe 1 (1965), 46.
5 ibid.
distinctions manifest themselves on a larger scale: "the structure of the work and its material are one and the same thing."1 This will be made evident when I examine the first and last formal divisions of the composition. Stockhausen thus explores a "colour"-continuum2 between words and music on the level of timbre, but perhaps more interesting is the parallel "speech-continuum."3 This is constituted by varying degrees of comprehension of the word as meaning and of the word as sound, wherein "comprehensibility" is "both semantically and sound-phonetically differentiated,"4 or, to employ Berio's definition, this continuum is graded by various modes of apprehension. At this point it is necessary to discuss the choice of a boy soprano as the source of the vocal material, as well as the text itself.
Gesang der Jünglinge is undeniably a religious work. The text is part of a litany glorifying God, taken from the Benedicite in the apocryphal additions to the Book of Daniel (3:57-73). The Jünglinge ('Holy Children') are the Judeans Hananiah, Mishael, and Azariah, taken into the Babylonian court of Nebuchadnezzar (605-562 BC) and renamed Shadrach, Meshach, and Abednego. King Nebuchadnezzar issued a decree that all should prostrate themselves before a golden statue upon hearing the sound of "horn, pipe, zither, ...and every other kind of instrument" or they would "immediately be thrown into the burning fiery furnace."5 The Jünglinge, rather than bow down before brazen images in violation of Jewish Law, do not heed the King's decree and are consequently cast into the "burning fiery furnace." Their song is the litany of praise to God, and in answer to their praise God provides an angel to protect them.
1 ibid., 51.
2 Stockhausen. "Music and Speech", Die Reihe 6 (1964), 58.
3 ibid.
4 ibid., 59.
5 Daniel 3:5-6.
Stockhausen's choice of a boy soprano is interesting for several reasons. The obvious reason is that in the biblical context the text is attributed to three youths. As well, the very sound of a boy soprano has a religious connotation. The least apparent benefit of this decision, though perhaps the most significant, is that the spectral profile of vowels sung by a boy soprano is more similar to simple sine waves than that of any other voice type. Thus, the possibility of truly blending the voice and the electronics is greatly improved.

Stockhausen's choice of text is well suited both to the electronic medium and to his compositional techniques for several reasons. First, the fact that it is a litany allows the text to be "integrated into purely musical structural arrangements (especially permutational-serial ones) without ...[greatly altering] the literary form, its message or other aspects."1 Furthermore, by choosing a commonly known text Stockhausen gives himself the freedom to play with the listener's expectations as regards the text. If the word “preiset” occurs at one moment and the word “Herrn” at another, the informed listener (i.e., one who understands German and is familiar with the Biblical reference) is reminded of a known relation between the words. Thus, the elements of memory and expectation become paramount to the coherence of the text within the structure of the work. Certainly the subordination of language to serial manipulations similarly subordinates linguistic semantics to musical logic, and as such the semantics of language is destroyed. However, by using a known text Stockhausen allows himself the liberty of only implying the text while still having its import apprehended.
At certain structurally significant points in the work, spoken or sung words or groups of words are clearly heard as comprehensible linguistic symbols; however, throughout much of the work the text is disassembled and subjected to various permutational processes wherein it is heard purely as sound in-and-of-itself. There exist
1 Stockhausen. "Music and Speech", Die Reihe 6 (1964), 58.
between these polarities various degrees of comprehensibility of the word as language or as sound. Furthermore, within this continuum exists the possibility of uncovering new meanings and new combinations of words that are not apparent in the original text; for instance 'preis,' 'preist,' 'jubilt,' 'Schneewind,' 'Eisglut,' or 'Feuerreif.'1 The need for a German-speaking audience in order to appreciate the 'speech-continuum' is lessened by the greater sound-word continuum. Stockhausen notes that even if "a listener does not understand the German words, he can still appreciate the structural function of the 'series of comprehensibility', because it is coupled with a series of gradual transformation of sequences of sung sounds; the degrees of transformation become increasingly similar to the sound-connections of non-vocal character and finally become interchangeable with electronic sounds. ...something unequivocally perceived as 'sung' is at the same time 'comprehensible to the highest degree.'"2
Gesang der Jünglinge may be divided into six sections or 'textures' (Texturen). The formal structure is further articulated by the words "preiset," "jubelt,"3 or "den Herrn" being both audible and apprehensible.4 Distinct structural borders between these Texturen are obscured by the very nature of the Textur itself. György Ligeti distinguishes what is a Textur from what is a Struktur: "Eine Struktur kann gemäss ihren Komponenten analysiert werden; eine Textur ist besser durch globale, statistische Merkmale zu beschreiben" [A structure can be analysed according to its components; a texture is better described by global, statistical features].5 I have
1 ibid., 59.
2 ibid.
3 Stockhausen explains that "According to the context I employed 'Jubelt' instead of 'Preiset.'" ibid.
4 >>Der 'Gesang der Jünglinge' ist in 6 ganz zusammenhängenden Texturen komponiert. Diese Formgliederung spürt man nun daran, daß in jeder dieser längeren Texturen 'preist' oder 'jubelt' in Beziehung zu den Worten 'den Herrn' ganz verständlich zu hören ist. Das schafft Zusammenhang und überbrückt große Zeiträume.<< [The 'Gesang der Jünglinge' is composed in 6 fully connected textures. One senses this formal division from the fact that in each of these longer textures 'preist' or 'jubelt' can be heard quite comprehensibly in relation to the words 'den Herrn'. This creates coherence and bridges large spans of time.] Stockhausen (1964), p. 59.
5 Ligeti. "Wandlungen der musikalischen Form", Die Reihe 7 (1961), 13.
tried to distinguish some of the various characteristics of these Texturen as they may relate to the sound-word continuum in its parallel aspects of timbre and semantics. As a means of framing a description of the Texturen in Gesang der Jünglinge, I refer to Stockhausen himself.1

Gesang der Jünglinge:

I. 0:00 - 1:02 = 1:02
"In der ersten Textur hört man in der Ferne (nach 10,5") noch undeutlich 'jubelt'."2 [In the first texture one hears, in the distance (after 10.5"), 'jubelt', still indistinct.]
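The first Textur is a model for the form of the whole work, and it is clearly set apart from the rest of the structure by a silence from 1:02 to ca. 1:05. Its general form is symmetrical. The piece opens with a "shower" of electronic tones. At 0:10.5 the boy's voice is heard singing "jubelt" on a high, sustained tone. From 0:16 to 0:30 and again from 0:35 to 0:40 the texture is dense and active. Sustained sung tones recur at 0:31 and at 0:40. The section then ends as it begins (as does the work as a whole), with a "shower" of electronic tones at 0:55 that evaporates into silence by 1:02. This first section clearly establishes the sung voice and establishes a graded scale of semantic perception (comprehensibility <-> incomprehensibility). Stockhausen defines 7 degrees of understanding of the compositional elements (see discussion of sixth Textur) with occurrences in this section:3

1) Nicht genau verständlich, weit entfernt im geschlossenen Raum (10,5"): Beispiel: Das Wort hieß >jubelt<
   [Not precisely comprehensible, far away in an enclosed space (10.5"). Example: the word was 'jubelt'.]
2) Geringe Lautstärke, hohe Dichte, scharenweise Stimmen, Permutation der Silben, große Entfernung im offenen Raum, relativ kurz (ca. 3 sec) (nach 15,4"): Beispiel: Qualifiziert als >Nicht verständlich<; der Text enthält >jubelt dem Herrn<
   [Low volume, high density, swarms of voices, permutation of the syllables, great distance in open space, relatively short (ca. 3 sec) (after 15.4"). Example: classed as 'not comprehensible'; the text contains 'jubelt dem Herrn'.]
3) Etwas weniger dicht, dafür aber ziemlich lang (ca. 6 sec); durch Zu- und wieder Abnahme der Lautstärke sowie räumliche Näherung in der Mitte des folgenden Komplexes werden dort einige Silben verständlicher (nach 20,2"): Beispiel: Qualifiziert als >Kaum verständlich<; es heißt: >Preiset den Herrn ihr Werke alle des Herrn<
   [Somewhat less dense, but rather long (ca. 6 sec); through a rise and subsequent fall in volume, together with spatial approach, in the middle of the following complex, some syllables become more comprehensible there (after 20.2"). Example: classed as 'barely comprehensible'; the text is 'Preiset den Herrn ihr Werke alle des Herrn'.]
4) Ein sehr kurzer Sprachkomplex folgt (ca. 1 1/2 sec), der jedoch räumlich mit zunehmender Lautstärke sehr nahe kommt bei geringer Dichte (nach 27,4"): Beispiel: Das war >Verständlicher< und wird durch
   [A very short speech complex follows (ca. 1 1/2 sec), which nevertheless comes spatially very close, with increasing volume and at low density (after 27.4"). Example: that was 'more comprehensible' and is]
5) die sofort anschließende Solostimme rückwirkend unterstützt, die langsam und deutlich singt (nach 28,4"): Beispiel: Qualifiziert als >Verständlich< und heißt >lobet ihn - lobet ihn<
   [retroactively supported by the solo voice that follows immediately, singing slowly and clearly (after 28.4"). Example: classed as 'comprehensible'; the text is 'lobet ihn - lobet ihn'.]
6) Vielfältige Silben- und Wortpermutationen gleichzeitig in der längsten und dichtesten Gruppe, große dynamische Unterschiede (nach 34,5"): Beispiel: Qualifiziert als >Ganz wenig verständlich< und heißt >über alles ihn<
   [Manifold permutations of syllables and words simultaneously in the longest and densest group, large dynamic differences (after 34.5"). Example: classed as 'very slightly comprehensible'; the text is 'über alles ihn'.]
7) Großer halliger Raum, sehr langsam und eigentlich aus einem separaten Lautsprecher von einer Einzelstimme gesungen (nach 42,3"): Beispiel: >Fast verständlich< - es wird oft falsch verstanden - und heißt >in Ewigkeit<
   [Large reverberant space, very slow, and in fact sung by a single voice from a separate loudspeaker (after 42.3"). Example: 'almost comprehensible' (it is often misheard); the text is 'in Ewigkeit'.]

1 Stockhausen (1964), 59.
2 ibid.
3 ibid., 61-62.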
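II. 1:02 - 2:52 = 1:50
"In der zweiten Textur (ab 1'02") hört man, zunächst chorisch, 'dem Herrn jubelt', und bald darauf (nach 1'08,5" und nach 1'58,5") ganz nah, mit Solostimme, 'preiset den Herren'."1 [In the second texture (from 1'02") one hears, at first chorally, 'dem Herrn jubelt', and soon afterwards (after 1'08.5" and after 1'58.5"), very close, with solo voice, 'preiset den Herren'.]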
The second Textur is fairly dense and alternates between pitched recitations (spoken, not sung) of "preiset den Herren", choral textures, and electronic interludes.

III. 2:52 - 5:15.5 = 2:23.5
"In der dritten Textur (ab 2'52") mit Solostimme 'preiset den Herren'."2 [In the third texture (from 2'52"), with solo voice, 'preiset den Herren'.]
The third Textur begins more sparsely with a low electronic “bassline,” which I hear as a counterpart (if not a counterpoint, though occasionally temporally displaced) to the spoken word. This section clearly articulates a process of separating out the words as distinct sound objects. At 4:20 "den ...Herrn" is heard, then "preis(e)t" at 4:23. Mirroring that articulation on the electronic plane, an electronic-sound-complex is articulated at 4:28, followed by a choral-sound-complex at 4:30 and again a consequent electronic-sound-complex at 4:42. Thus, an effort is made to clearly articulate words as abstracted, timbrally complex sound objects and vice versa.

1 ibid., 59.
2 ibid.

IV. 5:15.5 - 6:22 = 1:06.5
"In der vierten Textur (ab 5'15,5") singen viele Stimmen in Akkorden 'den Herren preiset' (noch einmal bei 5'46,5")."1 [In the fourth texture (from 5'15.5") many voices sing 'den Herren preiset' in chords (once more at 5'46.5").]
The fourth Textur articulates the process of disassembling the text into its constituent phonemes and syllables. From 6:01 to 6:14 the text "und starrer Winter" is spoken intelligibly yet distinctly separated into syllables. As well, the Textur audibly employs serial treatment of the text by electronic means: the most obvious example is at 5:31 to 5:36, where the retrograde of the 'main theme' is heard (the tape recording of "preiset den Herren" sung to a five-note series is played backwards).

V. 6:22 - 8:40 = 2:18
"In der fünften Textur (ab 6'22"), klingt mehrstimmig aus großer Entfernung (bei 6'52,5") 'Herrn preiset', dann 'preiset den Herren' (bei 7'20,5" und bei 7'51")."2 [In the fifth texture (from 6'22"), 'Herrn preiset' sounds polyphonically from a great distance (at 6'52.5"), then 'preiset den Herren' (at 7'20.5" and at 7'51").]
The fifth Textur begins with sparse electronic sounds with statistical rhythms and modulations. The text is sung with longer, sustained tones in stretto. The text is comprehensible in one sense, for it is sung; yet it is less intelligible because of the intricate counterpoint it has with itself and with statistically modulated bands of noise.

VI. 8:40 - 13:00 = 4:20
"In der sechsten Textur (ab 8'40"), singt, in großem Melodiebogen, eine Solostimme (bei 8'42") 'jubelt dem Herrn' und bei (8'51") 'preiset', dann (bei 10'50") 'ju-belt'."3 [In the sixth texture (from 8'40") a solo voice sings, in a broad melodic arc, 'jubelt dem Herrn' (at 8'42") and 'preiset' (at 8'51"), then 'ju-belt' (at 10'50").]
The sixth Textur is by far the longest and perhaps the most complex in its permutational structure. Stockhausen's control over the listener's apprehension of voice and electronics within a
1 ibid.
2 ibid.
3 ibid.
continuum of timbre is at its zenith here. At about 8:33 the 'ju-' of "jubelt" emerges from a periodically amplitude-modulated tone. From 8:47 to 9:05 and again from 9:12 to 10:05 Stockhausen achieves a continuous blending of vocal and electronic sounds in a dense polyphonic texture by moving rapidly through series of differentiated timbres, either phonetic or electronic in origin. Particularly intriguing is the 'electronic chatter' that occurs at ca. 8:59. Another effective technique is heard from 11:20 to 11:45: a long, gradual crescendo, in a low dynamic range, of a sustained tone-complex of vocal and electronic sound functions as a background texture, yet whether that sound is vocal or electronic in origin remains ambiguous. Stockhausen offers a key to understanding the complex formal structure of this final Textur in a table for the distribution of recurrences and proportion of elements, as sinus tones are transformed to sung chords; "the element-groups A - W stand for a corresponding number of time-sections of various length; each element occurs - statistically viewed1 - equally often,"2 in this final section:

SK   = sinus complexes (showers of sinus tones with defined frequency, duration, and intensity in very complex rhythmic micro-structure)
IK   = impulse complexes (showers of impulses as SK)
LS   = sounds and syllables
R    = noise filtered to about 2% wide (in cps)
I    = single impulses
SV   = synthetic vowel sounds (spectra rich in overtones in various formant combinations)
RO   = noise filtered 1-6 octaves wide
IO   = showers of impulses of statistically fixed density, filtered 16 octaves wide
IA   = single impulses in chords (in each case, pitches of used scales)
RA   = chords from 2% (in cps) wide noise bands (middle pitches according to the scale)
S(A) = sinus tone chords (or mixtures in unharmonic types of scales, sounds as boundary case in harmonic scales)
GA   = sung chords (combined sung chords)

1 The various elements of this section would have to be considered statistically (as Ligeti's definition of a Textur intimates).
2 Stockhausen. "Music and Speech", Die Reihe 6 (1964), 59.
Methods of analytic phonetics (vowels-sinus sounds; consonants-bands of noise; plosives-impulses; various hybrid forms) were made use of for the system of the scale of sound-elements (arrangement of the sounds in the synthetic sound-family).

[Table: element-groups A to W (rows) against the twelve elements SK, IK, LS, R, I, SV, RO, IO, IA, RA, SA, GA (columns); each element label is followed by eleven ditto marks.]

This continuous change of elements was divided into four rows of related tendency in the piece:

A E I M Q U ->
-> B F J P T (X) ->x
-> C G K O S W ->
-> D H L N R V

->x receives a structure with special definition resulting from the total plan of the work [such as 'jubelt' sung]1
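The surviving columns of the distribution table (each of the twelve element labels followed by eleven ditto marks, set against the 23 groups A to W) are consistent with a staircase layout, in which each element enters one group later than the one before it and then persists for twelve consecutive groups. That reading is my assumption and is not confirmed by the source. The Python sketch below only checks that such a layout would agree with the statements that sinus tones are transformed to sung chords and that each element occurs equally often; the function name staircase is hypothetical.

```python
# Element labels in the order given in the legend ("SA" stands for S(A)).
ELEMENTS = ["SK", "IK", "LS", "R", "I", "SV", "RO", "IO", "IA", "RA", "SA", "GA"]
GROUPS = [chr(ord("A") + i) for i in range(23)]  # element-groups 'A' .. 'W'


def staircase(elements, groups):
    """Assumed layout: element j enters at group j and stays for len(elements) groups."""
    span = len(elements)
    table = {}
    for g_index, group in enumerate(groups):
        table[group] = [
            element
            for e_index, element in enumerate(elements)
            if e_index <= g_index < e_index + span
        ]
    return table


table = staircase(ELEMENTS, GROUPS)

# How many groups each element appears in under this assumed layout.
counts = {e: sum(e in members for members in table.values()) for e in ELEMENTS}

for group in GROUPS:
    print(f"{group}: {' '.join(table[group])}")
```

Under this assumption group A contains only the sinus complexes, group L contains all twelve elements, and group W contains only the sung chords, with every element present in the same number of groups.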
1 ibid., 59-60.

Thus, in Gesang der Jünglinge, Stockhausen achieves, at least in part, the sound-word continuum he envisioned. The timbral continuum is realized mostly in the realm of the vocal becoming electronic, rather than the other way around, due in large part to the limitations of the technology available. Considering the technology at his disposal, Stockhausen's achievement in electronically creating phonetic-like timbres is monumental. Likewise, the scale of Verständlichkeitsgrade (degrees of comprehensibility) in the sound-word continuum is heard proceeding from linguistic meaning to sound object rather than moving from a perceptibly electronically generated sound object to a linguistic symbol. Stockhausen does attempt this latter transformation with limited success (for instance, in section III, 4:28-4:42); however, only the possibility of linguistic symbolism can be asserted, while no real comprehensible meaning can be established.