Audio Bandwidth Extension Based on Ensemble Echo State Networks with Temporal Evolution
Abstract: The bandwidth limitation of wideband audio systems degrades the subjective quality and naturalness of audio signals. This paper proposes a new method for blind bandwidth extension of wideband audio signals based on an ensemble of echo state networks with temporal evolution. The high-frequency components in the 7 to 14 kHz band are artificially restored using only the information in the wideband audio. For each region of the wideband feature space, a dedicated echo state network with a recurrent structure dynamically models the local mapping between the wideband audio features and the high-frequency spectral envelope. The transitions among regions are modeled by a hidden Markov model, and a network ensemble technique based on temporal evolution fuses the multiple echo state networks to estimate the high-frequency spectral envelope. Combined with the high-frequency fine spectrum obtained by spectral translation, the proposed method effectively extends wideband audio to super-wideband. The proposed method is also applied to the ITU-T G.729.1 wideband audio codec and evaluated against the ITU-T G.729.1 Annex E super-wideband audio codec and the hidden Markov model-based reference bandwidth extension method. Objective quality evaluation shows that the proposed method outperforms the hidden Markov model-based reference method in terms of log spectral distortion, cosh measure, and differential log spectral distortion. Furthermore, the proposed method improves the auditory quality of the wideband audio and performs well in subjective listening tests.
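For readers unfamiliar with the first objective measure named in the abstract, the following is a minimal sketch of log spectral distortion (LSD) in its common textbook form: the root-mean-square difference, in dB, between the reference and estimated power spectra of each frame, averaged over frames. The function name, the array layout (frames by frequency bins), and the `eps` floor are assumptions made for illustration; the exact frequency range and envelope representation used in the evaluation are not specified in the abstract.

```python
import numpy as np


def log_spectral_distortion(ref_power, est_power, eps=1e-12):
    """Mean per-frame log spectral distortion in dB.

    ref_power, est_power: arrays of shape (frames, bins) holding power
    spectra (or spectral envelopes) of the reference and extended signals.
    eps: small floor (an assumption) that avoids taking log10 of zero.
    """
    ref_power = np.asarray(ref_power, dtype=float)
    est_power = np.asarray(est_power, dtype=float)
    if ref_power.shape != est_power.shape:
        raise ValueError("reference and estimate must have the same shape")

    # Convert both spectra to dB.
    ref_db = 10.0 * np.log10(np.maximum(ref_power, eps))
    est_db = 10.0 * np.log10(np.maximum(est_power, eps))

    # RMS dB error over frequency bins for each frame, then frame average.
    per_frame = np.sqrt(np.mean((ref_db - est_db) ** 2, axis=1))
    return float(np.mean(per_frame))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.uniform(0.1, 1.0, size=(5, 64))  # synthetic spectra
    print(log_spectral_distortion(reference, reference))
    print(log_spectral_distortion(reference, 10.0 * reference))
```

Identical spectra give a distortion of zero, and an estimate whose power is uniformly ten times the reference gives 10 dB up to floating-point rounding, so lower values indicate a closer match to the original high-frequency spectrum.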