
Speech intelligibility

There are very few topics in fire protection that cause as much ‘love–hate’ debate as speech intelligibility for emergency warning systems (EWS). Questions range from “Do we really need it at all?” to “If I’m an EWS designer, how much black or white magic do I need to learn to predict the SI level in my design?”

VYACHESLAV (SLAVA) SHARGORODSKY

Newsound Fire Services

On 1 March 1991, two Boeing 737 aircraft were flying in opposite directions between Darwin and Brisbane on a route passing over Mount Isa, outside radar coverage. Ansett’s VH-CZG was operating from Darwin to Brisbane, and Australian Airlines’ VH-TJD was flying in the opposite direction.

Once beyond radar coverage, pilots were required to provide position reports and their cruising level at specific points along the route.

The flight level of VH-CZG was reported as “three five zero” by the Darwin controller, but the Brisbane controller thought that the level had been given as ‘three nine zero’, so he read the level back as “three niner zero”, using normal acknowledgment terminology.

When the word ‘niner’ was received in Darwin, a temporary loss of signal clarity occurred, so the Darwin controller interpreted the sound as a ‘five’.

Luckily, the crew of one aircraft became aware that both aircraft were flying at the same level and manually initiated avoidance action—each crew saw the other aircraft pass less than one minute later.

It became one of the well-known ‘near-miss’ incidents in Australian aviation history due to speech ‘unintelligibility’.

A brief history

As with many other ‘dual-use’ technologies, speech intelligibility (SI) started (unsurprisingly) with the military.

Unassuming sentences such as “these days a chicken leg is a rare dish” or “a large size in stockings is hard to sell” are a few examples of so-called phonetically balanced sentences, or Harvard sentences, that were developed during World War II.

They were created in a boiler room, under Harvard University Memorial Hall, that was transformed into a secret wartime research laboratory (Psycho-Acoustic Laboratory) in 1940, in an attempt to reduce communication problems experienced by bomber pilots.

In this facility, volunteers were exposed to noise and speech, as scientists tested military communications systems, which ultimately led to the Harvard sentences—a set of standardised phrases still widely used to test everything using speech, from mobile phones to Voice over Internet Protocol systems.

Over the years, multiple methods were developed to evaluate the SI of various critical speech systems, mostly involving expert human ‘speakers’ and ‘listeners’. But the need for an objective and consistent SI measurement became obvious.

It was not until the 1970s that a major research project at the TNO Research Laboratory in the Netherlands, funded by NATO, delivered a result: the speech transmission index (STI) method of SI measurement that is still in use today.

Audibility versus intelligibility

There is a difference between whether something can be heard and whether it can be understood.

The NFPA 72 standard provides a clear distinction between these two parameters: ‘audibility’ is equivalent to “can you hear me now?”; ‘intelligibility’ corresponds with “do you really understand me?”.

In practice, various factors affect speech intelligibility and, while the language and pronunciation skills of the human speaker are extremely important, most of the factors have to do with the audio equipment or the acoustic features of the space.

Major technical factors affecting intelligibility

Signal-to-noise ratio

Signal-to-noise ratio (SNR) compares the sound pressure level (SPL) produced by a loudspeaker with the ambient or background noise level in the room; expressed in decibels, it is the difference between the two levels. To achieve an acceptable SI level, AS 1670.1:2018 and AS 1670.4:2018 recommend that the speech SPL be 10 dB above the ambient noise level (with other conditions in place, as shown in the summary table below).

It should be noted that AS 1670.4:2015 did not provide enough details for speech SNR measurement methods, but they were clarified in Clause H.4 of AS 1670.4:2018.
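As a rough illustration of the arithmetic only, the sketch below treats SNR as the difference between two SPL readings in decibels. The function names and the example readings are hypothetical, and it assumes both levels were measured at the same position with the same weighting; it does not reproduce the measurement method of Clause H.4.

```python
def snr_db(speech_spl_db: float, ambient_spl_db: float) -> float:
    """SNR in dB, taken as speech level minus ambient noise level."""
    return speech_spl_db - ambient_spl_db


def required_speech_spl(ambient_spl_db: float, margin_db: float = 10.0) -> float:
    """Speech level needed to sit margin_db above the ambient noise."""
    return ambient_spl_db + margin_db


# Hypothetical readings: 65 dB(A) ambient noise, 78 dB(A) live speech.
ambient = 65.0
speech = 78.0
print(f"SNR: {snr_db(speech, ambient):.1f} dB")
print(f"Speech level for a 10 dB margin: {required_speech_spl(ambient):.1f} dB(A)")
```

Because decibels are logarithmic, the ratio of the two sound levels becomes a simple subtraction.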

Frequency response

For intelligible live speech, the whole audio path (including emergency microphone, audio processors, audio amplifiers, loudspeaker cabling, and loudspeakers) should preferably have a frequency response (FR) between 150 Hz and 11 kHz, as this is an average adult voice frequency range.

In practice, however, the FR of some equipment is specified over a much narrower range, with a lower limit between 400 and 450 Hz and an upper limit between 3.2 and 4.0 kHz, which is closer to the range in which human hearing is most sensitive.

For the most intelligible sound, it is important that the FR is as flat as possible. Yet some caution must be exercised when comparing different FR specifications. Conventionally, FR boundaries are specified at the points where the signal level drops 3 dB below the mid-frequency level. This typically means that only half of the mid-frequency power is reproduced at the FR boundaries.

These threshold levels may vary dramatically between FR specifications within the standards. For example, AS 4428.16-2020 allows up to 15 dB of FR variation between 500 Hz and 3.2 kHz for emergency warning system (EWS) panel audio equipment (called emergency warning control and indicating equipment, or EWCIE, in AS 4428.16 terminology), while AS 7240.24-2018 allows up to 20 dB of FR variation between 447 Hz and 7.08 kHz for fire alarm loudspeakers.

Such margins permit variations in signal level that are far too large for the FR to be considered flat, unless the audio amplification equipment is designed specifically to achieve a flat response or audio equalisers are used.
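To put the decibel figures in this section into perspective, the following minimal sketch converts a level difference in dB into a power ratio using the standard relation ratio = 10^(dB/10). The function name is illustrative only.

```python
def db_to_power_ratio(level_db: float) -> float:
    """Convert a level difference in dB to a ratio of acoustic or electrical power."""
    return 10 ** (level_db / 10)


# -3 dB is the conventional FR boundary; 15 dB and 20 dB are the
# variations quoted above for AS 4428.16-2020 and AS 7240.24-2018.
for level in (-3.0, 15.0, 20.0):
    print(f"{level:+.0f} dB -> power ratio {db_to_power_ratio(level):.3f}")
```

A drop of 3 dB corresponds to a power ratio of about 0.5, while variations of 15 dB and 20 dB correspond to power ratios of roughly 32 and 100 respectively.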

Total harmonic distortion

The average person can detect as little as 2% distortion in reproduced sound, but once distortion reaches 15%, reproduced speech is considered unintelligible.

Some EWS have reduced SI because of audio amplifier clipping and, occasionally, overdriven loudspeakers. AS 4428.16-2020 limits distortion to a maximum of 1% for audio equipment, while AS 7240.24-2018 is silent on any such requirement for fire alarm loudspeakers.
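For readers unfamiliar with how a distortion percentage is derived, the sketch below uses the common definition of total harmonic distortion relative to the fundamental: the root-sum-square of the harmonic amplitudes divided by the amplitude of the fundamental. The function name and the example amplitudes are hypothetical, and the standards cited above may prescribe their own test signal and measurement conditions.

```python
import math
from typing import Sequence


def thd_percent(fundamental_rms: float, harmonics_rms: Sequence[float]) -> float:
    """THD in percent: RSS of harmonic RMS amplitudes over the fundamental."""
    if fundamental_rms <= 0:
        raise ValueError("fundamental amplitude must be positive")
    harmonic_rss = math.sqrt(sum(h * h for h in harmonics_rms))
    return 100.0 * harmonic_rss / fundamental_rms


# Hypothetical amplifier output: 1.0 V fundamental with 2nd and 3rd
# harmonics of 8 mV and 6 mV.
print(f"THD: {thd_percent(1.0, [0.008, 0.006]):.2f}%")
```

With these example values the result sits at the 1% limit mentioned above.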

Reverberation

Reverberation is the persistence of sound through echoes and reflections after the initial sound source is removed. This is the main reason why it is often difficult, and sometimes impossible, to achieve the required SI level in large areas with reflective surfaces, such as car parks and atriums. One often-used approach is to install a larger number of more closely spaced loudspeakers, set to their lower power taps.

Recognising this problem in high-reverberation areas, the latest set of AS 1670 standards provides another pathway to compliance. This is based on a principle that can be described in layman’s terms as: “I can hear emergency tones at my location, but I can’t fully understand what they’re saying. The closer I get to the exit, the clearer the voice message is.”

Speech intelligibility requirements of AS 1670.4:2018 and AS 1670.1:2018 in different conditions

| Condition | Reverberation less than 1.5 s | Reverberation 1.5 s or higher |
| --- | --- | --- |
| If ambient (noise) SPL is 85 dB(A) or above | Install visual alarm devices (VADs) and/or visual warning devices (VWDs). Do not measure SI, except within a 6 m radius of the approach to all required exits, where measured Common Intelligibility Scale (CIS) shall be not less than 0.7. | Install VADs and/or VWDs. Do not measure SI, except within a 6 m radius of the approach to all required exits, where measured CIS shall be not less than 0.7. |
| If ambient (noise) SPL is less than 85 dB(A) and live speech SNR is 10 dB or less | Option 1: Measure SI in all areas where these conditions are met. Measured CIS shall be not less than 0.7. Option 2: Install VADs and/or VWDs. Do not measure SI, except within a 6 m radius of the approach to all required exits, where measured CIS shall be not less than 0.7. | Install VADs and/or VWDs. Do not measure SI, except within a 6 m radius of the approach to all required exits, where measured CIS shall be not less than 0.7. |
| If ambient (noise) SPL is less than 85 dB(A) and live speech SNR is more than 10 dB and loudspeakers are spaced not further apart than twice the mounting height from the floor | Do not measure SI. It is deemed to be compliant under these conditions. | Install VADs and/or VWDs. Do not measure SI, except within a 6 m radius of the approach to all required exits, where measured CIS shall be not less than 0.7. |

Source: Slava Shargorodsky
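The CIS threshold of 0.7 can be related to the STI method mentioned earlier. The sketch below assumes the commonly published conversion CIS = 1 + log10(STI); the function names are illustrative, and the applicable standard should be checked for the conversion it actually specifies.

```python
import math


def sti_to_cis(sti: float) -> float:
    """Convert STI (0 < STI <= 1) to CIS, assuming CIS = 1 + log10(STI)."""
    if not 0.0 < sti <= 1.0:
        raise ValueError("STI must be greater than 0 and at most 1")
    return 1.0 + math.log10(sti)


def cis_to_sti(cis: float) -> float:
    """Inverse of sti_to_cis under the same assumed relation."""
    return 10 ** (cis - 1.0)


print(f"CIS 0.7 corresponds to STI of about {cis_to_sti(0.7):.3f}")
print(f"STI 0.5 corresponds to CIS of about {sti_to_cis(0.5):.3f}")
```

Under this assumed relation, a CIS of 0.7 corresponds to an STI of approximately 0.5.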

Current standard requirements

Unfortunately, SI requirements are fairly convoluted and are not easy to read in all revisions of the AS 1670 suite of standards, although they were somewhat simplified in the 2018 editions.

The table above might help to clarify SI requirements under the current set of AS 1670.1:2018 and AS 1670.4:2018 standards.
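The logic of the summary table can also be sketched as a simple lookup. The function, parameter names and return strings below are hypothetical and paraphrase the table only, on the assumption that the 85 dB(A) row applies to both reverberation ranges; the sketch is not a substitute for reading the standards.

```python
from typing import Optional

EXIT_RULE = ("Install VADs and/or VWDs; measure SI only within a 6 m radius "
             "of the approach to required exits (CIS not less than 0.7).")
OPTIONS = ("Option 1: measure SI in all such areas (CIS not less than 0.7). "
           "Option 2: " + EXIT_RULE)
DEEMED = "Do not measure SI; deemed compliant under these conditions."
NOT_COVERED = "Not covered by the summary table; refer to the standards."


def si_requirement(ambient_spl_dba: float,
                   snr_db: float,
                   reverberation_s: float,
                   spacing_m: Optional[float] = None,
                   mounting_height_m: Optional[float] = None) -> str:
    """Return the summary-table requirement for one area (paraphrased)."""
    if ambient_spl_dba >= 85.0:
        return EXIT_RULE
    high_reverb = reverberation_s >= 1.5
    if snr_db <= 10.0:
        return EXIT_RULE if high_reverb else OPTIONS
    # SNR above 10 dB: the table row also requires loudspeaker spacing of
    # no more than twice the mounting height above the floor.
    spacing_ok = (spacing_m is not None and mounting_height_m is not None
                  and spacing_m <= 2.0 * mounting_height_m)
    if spacing_ok:
        return EXIT_RULE if high_reverb else DEEMED
    return NOT_COVERED


# Hypothetical office area: 60 dB(A) ambient, 14 dB SNR, 0.8 s reverberation,
# loudspeakers 5 m apart and mounted 2.7 m above the floor.
print(si_requirement(60.0, 14.0, 0.8, spacing_m=5.0, mounting_height_m=2.7))
```

The spacing check is kept separate because the table gives no outcome for an SNR above 10 dB when the spacing condition is not met, so the sketch reports that case as not covered.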

There are many other aspects of SI that could not be covered in this short article. For example, the currently specified STI method works well for most linear analog audio equipment, but fails when non-linear audio processing is involved, such as digital audio compression algorithms. But this is for another day and another article.

In our next article, we plan to cover some basic acoustic rules and considerations for achieving required SPL for EWS.
