Paper id 28201448

International Journal of Research in Advent Technology, Vol.2, No.8, August 2014 E-ISSN: 2321-9637

Speech Enhancement of Punjabi Language at Phoneme Level using Digital Signal Processing Techniques

Jaismine Jassal1, Manjot Kaur Gill2
M.Tech. student, Dept. of Computer Science and Engineering1, Guru Nanak Dev Engg. College, Ludhiana1
Assistant Professor, Dept. of Information Technology2, Guru Nanak Dev Engg. College, Ludhiana2
Email: jassal.priya@yahoo.com1, gill.manjot@gmail.com2

Abstract- This paper presents an overview of several of the most commonly used methods for enhancing degraded speech. Common methods such as Spectral Subtraction, the Wiener Filter, the Kalman Filter and the RASTA Filter are explained, together with the Proposed Method, which combines features from all of these methods. Each method uses certain Digital Signal Processing (DSP) techniques. Framing, windowing, the DFT (Discrete Fourier Transform), the FFT (Fast Fourier Transform), noise detection and SNR are the common parameters used in each method. These methods are applied to the phonemes of the Punjabi language extracted from recorded words.

Keywords- Noise, speech enhancement, phonemes, SNR (Signal to Noise Ratio).

1. INTRODUCTION

Speech signals in real-world scenarios are often corrupted by various types of degradation. The most common degradations include background noise, reverberation and speech from competing speaker(s). Degraded speech is poor in terms of both quality and intelligibility. Therefore, there is a need to process the degraded speech to enhance its perceptual quality and intelligibility. Several methods have been proposed in the literature for this purpose. Degraded speech is processed in the frequency domain to achieve enhancement. Different types of environmental noise were added, and the results were computed and compared.

This paper provides an overview of some of the commonly used methods, a comparison between them, and the proposed method. The rest of the paper is organised as follows: Section 2 presents a review of the methods for processing speech degraded by background noise. Section 3 describes the Punjabi language and its phonemes. Section 4 covers the methodology followed. Section 5 describes the comparative results and discussion between the methods applied on the phonemes. The conclusion is discussed in Section 5.

2. ENHANCEMENT OF NOISY SPEECH

Background noise is the most common factor that causes degradation of the quality and intelligibility of speech. The term background noise refers to any unwanted signal that is added to the desired signal. Background noise can be stationary or non-stationary and is assumed to be uncorrelated with and additive to the speech signal. Mathematically, speech degraded by background noise can be expressed as the sum of clean speech and background noise (Krishnamoorthy and Prasanna, 2010), given as

s(n) = x(n) + p(n)    (1)

where s(n), x(n) and p(n) denote the noisy speech, the clean speech and the background noise respectively. In the frequency domain it can be represented as

S(f) = X(f) + P(f)    (2)

where f is the index of the frequency bin.

The problem of enhancing noisy speech has received considerable attention in the literature, and a variety of methods have been proposed to overcome it. An overview of each of them is given below.

2.1. Spectral Subtraction

Spectral Subtraction is a very popular method for enhancing the quality of speech that has been degraded by additive noise. It is a form of spectral amplitude estimation that restores signals degraded by additive noise, where the phase distortion can be ignored because the human ear is assumed to be insensitive to phase (Saeed, 2005). The method restores the signal by subtracting an estimate of the noise spectrum from the noisy signal spectrum (Saeed, 2005).

In Spectral Subtraction, the noise in the degraded speech is estimated from the 'pauses' or 'quiet' periods in the speech signal, when nothing is being said and only noise is present. The noise spectrum is then usually updated as more frames of noise or silent periods appear in the speech signal. However, since the noise is random by nature, the resultant spectrum can become negative when Spectral Subtraction is applied. The negative values therefore need to be set to a positive value. This can itself distort the signal, but it reduces the distortion caused when the spectrum turns negative.

Spectral Subtraction takes place in the frequency domain rather than in the time domain, where the signal is given. The transformation to the frequency domain is usually done using a Discrete Fourier Transform (DFT). In this work, the Fast Fourier Transform (FFT) is used instead. The FFT computes the same result as the DFT but more efficiently, so it is quicker and uses fewer resources, making the system more efficient (Paul, 2009).

2.2. Wiener Filtering Method
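Before turning to the Wiener filter, the spectral subtraction steps of Section 2.1 can be sketched in code for reference. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`fft`, `estimate_noise`, `spectral_subtract`, `snr_db`), the frame length of 256 samples, the Hamming window, the spectral floor of 0.01, the synthetic sine "speech" and the Gaussian noise are all hypothetical choices made for the example. It uses only the Python standard library and processes a single frame; a complete system would also need overlap-add across frames.

```python
import cmath
import math
import random

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(spec):
    """Inverse FFT via conjugation: ifft(X) = conj(fft(conj(X))) / n."""
    n = len(spec)
    return [v.conjugate() / n for v in fft([s.conjugate() for s in spec])]

def hamming(n):
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def windowed(frame, window):
    return [s * w for s, w in zip(frame, window)]

def estimate_noise(noise_frames, window):
    """Average magnitude spectrum over frames assumed to hold noise only."""
    mags = [[abs(v) for v in fft(windowed(f, window))] for f in noise_frames]
    return [sum(col) / len(mags) for col in zip(*mags)]

def spectral_subtract(frame, noise_mag, window, floor=0.01):
    """Subtract the noise magnitude per bin and keep the noisy phase."""
    out = []
    for v, nm in zip(fft(windowed(frame, window)), noise_mag):
        mag = abs(v)
        # A negative difference is replaced by a small positive floor (assumed value).
        clean_mag = max(mag - nm, floor * mag)
        out.append(cmath.rect(clean_mag, cmath.phase(v)))
    return [v.real for v in ifft(out)]

def snr_db(reference, test):
    """SNR in dB of `test` measured against the clean `reference`."""
    signal = sum(r * r for r in reference)
    error = sum((t - r) ** 2 for r, t in zip(reference, test))
    return 10 * math.log10(signal / error)

# Synthetic demonstration: a sine tone stands in for speech (assumption).
random.seed(0)
N = 256
window = hamming(N)
noise_frames = [[random.gauss(0, 0.3) for _ in range(N)] for _ in range(8)]
noise_mag = estimate_noise(noise_frames, window)

clean = [math.sin(2 * math.pi * 8 * i / N) for i in range(N)]
noisy = [c + random.gauss(0, 0.3) for c in clean]
enhanced = spectral_subtract(noisy, noise_mag, window)

reference = windowed(clean, window)  # the output is a windowed frame
snr_before = snr_db(reference, windowed(noisy, window))
snr_after = snr_db(reference, enhanced)
print(f"SNR before: {snr_before:.2f} dB, after: {snr_after:.2f} dB")
```

The noise estimate here comes from separate noise-only frames, which plays the role of the 'pauses' in the speech signal; in practice a voice activity detector would be needed to find those frames. The phase of the noisy spectrum is reused unchanged, in line with the assumption that the ear is insensitive to phase.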
