
dc.contributor.author Yusnita, Mohd Ali
dc.contributor.author Pandiyan, Paulraj Murugesa, Prof. Dr.
dc.contributor.author Sazali, Yaacob, Prof. Dr.
dc.contributor.author Shahriman, Abu Bakar, Dr.
dc.contributor.author Nataraj, Sathees Kumar
dc.date.accessioned 2014-06-12T16:40:25Z
dc.date.available 2014-06-12T16:40:25Z
dc.date.issued 2012-10
dc.identifier.citation p. 262-267 en_US
dc.identifier.isbn 978-1-4673-1649-1 (Print)
dc.identifier.isbn 978-1-4673-1704-7 (Online)
dc.identifier.issn 1985-5753
dc.identifier.uri http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6408416
dc.identifier.uri http://dspace.unimap.edu.my:80/dspace/handle/123456789/35449
dc.description Proceeding of the 3rd IEEE Conference on Sustainable Utilization and Development in Engineering and Technology (STUDENT) 2012 at Kuala Lumpur, Malaysia on 6 October 2012 through 9 October 2012. Link to publisher's homepage at http://ezproxy.unimap.edu.my:2080/Xplore/dynhome.jsp en_US
dc.description.abstract Accent recognition has been one of the most important topics in automatic speaker and speaker-independent speech recognition (SI-ASR) systems in recent years. Voice-controlled technologies have become part of our daily life; nevertheless, variability in speech makes these spoken language technologies relatively difficult. One of the most profound sources of variability is accent. By classifying accent types, different models could be developed to handle SI-ASR. In this paper, we classified three accents of English recorded from the three main ethnicities in Malaysia, namely Malay, Chinese and Indian, using an artificial neural network model. All experiments were performed in speaker-independent and word-independent modes on the three most accent-sensitive words. Mel-bands spectral energy was extracted from eighteen bands, and statistical values of each speech sample (mean, standard deviation, kurtosis and the ratio of standard deviation to kurtosis) were taken to characterize the spectral energy distribution. The system was evaluated using an independent test dataset, a partially independent test dataset and the training dataset. The best three-class accuracy rate, 99.01%, was obtained with the independent test dataset. Averaged over several trials, the overall accuracy rate was 96.79%, with an average learning time of 49 epochs. en_US
dc.language.iso en en_US
dc.publisher IEEE Conference Publications en_US
dc.relation.ispartofseries Proceeding of The 3rd IEEE Conference on Sustainable Utilization and Development in Engineering and Technology (STUDENT 2012);
dc.subject Accent recognition en_US
dc.subject Mel-bands en_US
dc.subject Neural network en_US
dc.subject Spectral energy en_US
dc.subject Statistical analysis en_US
dc.title Speaker accent recognition through statistical descriptors of Mel-bands spectral energy and neural network model en_US
dc.type Working Paper en_US
dc.identifier.url http://dx.doi.org/10.1109/STUDENT.2012.6408416
dc.contributor.url yusnita082@ppinang.uitm.edu.my en_US
dc.contributor.url paul@unimap.edu.my en_US
dc.contributor.url s.yaacob@unimap.edu.my en_US
dc.contributor.url shahriman@unimap.edu.my en_US
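
Illustrative note on the abstract: the four statistical descriptors it names (mean, standard deviation, kurtosis, and the ratio of standard deviation to kurtosis) can be sketched as below. This is a minimal sketch, not the authors' implementation. Several points are assumptions that the record does not state: that the statistics are taken per band across the frames of one speech sample, that kurtosis is the non-excess (Pearson) form computed from population moments, and that the descriptors are concatenated band by band. The names band_descriptors and feature_vector are hypothetical, and the Mel filter bank that would produce the band energies is not shown.

```python
import math


def band_descriptors(values):
    """Return (mean, std, kurtosis, std/kurtosis) for one band's energies.

    Assumption: population moments and non-excess (Pearson) kurtosis.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("constant band energy: kurtosis is undefined")
    std = math.sqrt(m2)
    kurt = m4 / (m2 ** 2)
    return mean, std, kurt, std / kurt


def feature_vector(frames):
    """Flatten per-band descriptors into one feature list.

    frames: list of frames, each a list of band energies
    (eighteen bands per frame in the setting the abstract describes).
    """
    n_bands = len(frames[0])
    features = []
    for b in range(n_bands):
        # Collect this band's energy across all frames of the sample.
        features.extend(band_descriptors([frame[b] for frame in frames]))
    return features


if __name__ == "__main__":
    # Synthetic energies, for illustration only: 5 frames x 18 bands.
    frames = [[float(i + b) for b in range(18)] for i in range(1, 6)]
    print(len(feature_vector(frames)))
```

Under these assumptions, eighteen bands with four descriptors each give one fixed-length vector per speech sample, which is the kind of input a neural network classifier could take.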

