Feature enhancement and selection methods for isolated Malay speech recognition
Abstract
Automatic speech recognition (ASR) is a technique for automatically translating an incoming speech signal into its contextual information. Research on ASR by machines has been carried out for more than five decades. In the past few decades, various acoustic feature extraction and classification algorithms have been developed for native English speech recognition and for other languages spoken around the world, and findings for many different languages have been reported in recent years. However, every language has
its own unique word structure. For example, English words are formed through changes of phonemes in the base word itself according to its word group, whereas Malay words allow affixes to be added to the base word to form new words. In this research,
signal processing techniques are applied to the acoustic signals in order to recognize
Malay speech. To reduce misclassification, the recorded speech signals were
segmented to remove the unvoiced speech (noise). In this work, parametric
Linear Prediction Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC),
Weighted Linear Prediction Cepstral Coefficients (WLPCC), Mel-Frequency Cepstral
Coefficients (MFCC) and non-parametric Wavelet Packet Transform based Energy and
Entropy (WPT-EE) feature representations were extracted. The extracted features
were enhanced to increase their discriminative ability using artificial bee colony based
clustering. The enhanced feature sets were then reduced in dimension using two
feature selection techniques: binary particle swarm optimization (BPSO) and
discrete artificial bee colony (DABC). Finally, two classifiers, the
probabilistic neural network (PNN) and the extreme learning machine (ELM), were
used to evaluate the performance of the extracted and enhanced features from the recorded
Malay speech signals. The proposed artificial bee colony based feature enhancement
(ABC-FE) features showed promising average results of 99.61% (Speaker Dependent) and
96.21% (Speaker Independent). Experimental results showed that, with the ELM classifier,
the average accuracies obtained using the hybrid features of LPC, LPCC, WLPCC, MFCC and
WPT-EE for the Speaker Dependent and Speaker Independent cases were 97.89% (PSO, 98
features) and 99.33% (ABC, 67 features) for Malay speech recognition.
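
To make the feature selection step concrete, the following is a minimal sketch of binary particle swarm optimization over a feature mask. It assumes the common sigmoid-transfer form of BPSO, which may differ from the variant used in this work. The function name bpso_select, its parameter defaults, and the toy fitness function are hypothetical and for illustration only. In practice the fitness would presumably be the recognition accuracy of a classifier such as ELM trained on the selected features.

```python
import math
import random


def bpso_select(fitness, n_features, n_particles=20, n_iters=50,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Sigmoid-transfer binary PSO. Returns (best_mask, best_score).

    fitness: callable taking a list of 0/1 flags and returning a score
             to maximise (hypothetical stand-in for classifier accuracy).
    """
    rng = random.Random(seed)
    # Random initial bit masks (positions) and real-valued velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][d] = v
                # The sigmoid turns the velocity into P(bit == 1).
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            score = fitness(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score


# Toy fitness (assumption): reward three "informative" feature indices
# and penalise every selected feature to favour compact subsets.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


mask, score = bpso_select(toy_fitness, n_features=12)
print(mask, score)
```

The per-feature penalty in the toy fitness mirrors the usual trade-off in wrapper-style selection between accuracy and the number of retained features. The weight 0.1 is arbitrary.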