IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 08 (August. 2014), ||V1|| PP 21-25
www.iosrjen.org
An Approach to Extract Feature using MFCC

Parwinder Pal Singh, Pushpa Rani
Department of Computer Science Engineering, Chandigarh University
Assistant Professor1, Research Scholar2

Abstract: - Speech is the most natural and efficient means of communication between humans. Considerable effort has gone into developing human-computer interfaces that let users interact and communicate easily, without special skills. Speech recognition systems find applications in daily life and are of great benefit to people with disabilities. This paper presents an approach to extracting features from the speech signal of spoken words using Mel-Scale Frequency Cepstral Coefficients (MFCC). It is a nonparametric frequency-domain approach based on the human auditory perception system. First, voice samples of isolated words are taken as input and denoised using the Praat tool. Coefficients are then extracted using MFCC, as these coefficients collectively represent the short-term power spectrum of a sound. The implementation is built in Matlab.

Keywords: - Speech Recognition, Mel frequency cepstral coefficients (MFCC), cepstrum
I. INTRODUCTION
Speech signals are naturally occurring signals and hence are random signals. These information-carrying signals are functions of an independent variable, time. Speech recognition is the process of automatically recognizing a word spoken by a particular speaker, based on information contained in the voice sample. A speech signal conveys information about words, expression, style of speech, accent, emotion, speaker identity, gender, age, the speaker's state of health, and so on. Speech recognition technology has advanced considerably, but substantial scope for improvement remains. Speech-based devices find applications in daily life and are especially beneficial for people with disabilities [3] [4], who are otherwise often restricted in expressing their talent and creativity. Speech-based devices can also be used as a security measure to reduce cases of fraud and theft [7].

[Fig. 1: Speech recognition system. Blocks: Speech Sample, Denoising, Feature Extraction, Pattern matching, Output]

Speech recognition mainly focuses on training the system to recognize an individual’s unique voice characteristics. The most popular feature extraction technique is Mel Frequency Cepstral Coefficients (MFCC), as it is relatively simple to implement and remains effective and robust under various conditions [2]. MFCC is designed using knowledge of the human auditory system and is a standard method for feature extraction in speech recognition. The steps involved in MFCC are pre-emphasis, framing, windowing, the fast Fourier transform (FFT), Mel filter bank processing, and computing the discrete cosine transform (DCT).
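The auditory basis of MFCC mentioned above is the mel scale, which spaces frequencies approximately the way listeners perceive pitch. As a minimal sketch, the Python functions below use one widely used analytic approximation, m = 2595 log10(1 + f/700). This section does not state which mel formula the authors used, so the formula, the function names hz_to_mel and mel_to_hz, and the use of Python rather than the paper's Matlab are all assumptions made for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to mels (assumed formula, not taken from the paper)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel: map a mel value back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


if __name__ == "__main__":
    # Equal steps in Hz shrink on the mel scale as frequency rises.
    for f in (500.0, 1000.0, 2000.0, 4000.0):
        print(f, round(hz_to_mel(f), 1))
```

Under this approximation 1000 Hz maps to roughly 1000 mel, and the scale is compressed at higher frequencies, which is why mel filter banks use wider filters there.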
II. FEATURE EXTRACTION

[Fig. 2: MFCC block diagram. Blocks: Speech input, Pre-Emphasis, Framing, Windowing, Fast Fourier Transform, Mel Filter bank, DCT, Delta Energy, Output]

The most commonly used acoustic features are mel-scale frequency cepstral coefficients. The step-by-step computation of MFCC is explained below:

1. Pre-Emphasis - In this step, the isolated word sample is passed through a filter that emphasizes higher frequencies, increasing the energy of the signal at high frequencies.
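Pre-emphasis is commonly realized as a first-order high-pass filter, y[n] = x[n] - a*x[n-1]. The sketch below illustrates that form in plain Python. This section does not give the filter equation or coefficient, so the filter form, the value a = 0.97 (a typical choice close to 1), and the function name pre_emphasis are assumptions for illustration, not the authors' Matlab implementation.

```python
def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1] to a sequence of samples.

    alpha = 0.97 is an assumed, commonly used value; the paper does not
    specify one in this section.
    """
    if not signal:
        return []
    # The first sample has no predecessor, so it is passed through unchanged.
    out = [float(signal[0])]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


if __name__ == "__main__":
    slow = [1.0, 1.0, 1.0, 1.0]    # constant (low-frequency) signal
    fast = [1.0, -1.0, 1.0, -1.0]  # alternating (high-frequency) signal
    print(pre_emphasis(slow))
    print(pre_emphasis(fast))
```

After the first sample, the constant signal is attenuated to about 0.03 per sample while the alternating signal grows to about 1.97 in magnitude, which is the high-frequency boost the step describes.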
International organization of Scientific Research