Investigation of nonlinear feature extraction techniques for facial emotion recognition
Abstract
Over the last decades, facial emotion recognition has received significant interest
among researchers in computer vision, pattern recognition and related fields.
The increasing applications of facial emotion recognition have shown a sizeable impact
in many areas ranging from psychology to human-computer interaction (HCI).
Although facial emotion recognition has achieved a certain level of success, its
performance is still far from that of human perception. Many approaches have been
proposed in the literature. Nevertheless, fully automated facial emotion recognition
with high accuracy remains challenging due to various problems such
as intra-class variations, inter-class similarities and subtle changes of facial features.
The problem is further compounded by differences in facial physiognomy with respect to age,
ethnicity and gender, which increase the difficulty of recognizing facial emotion. In
order to address this problem, this thesis aims to develop nonlinear feature extraction
techniques that use Higher Order Spectra (HOS) and Empirical Mode Decomposition
(EMD) separately to recognize seven facial emotions (anger, disgust, fear,
happiness, neutral, sadness and surprise) from static images. A pre-processing step of
isolating the face region from varying backgrounds was first applied by means of face
detection. The 2-D facial image was then projected into a 1-D facial signal by successive
projections via the Radon transform. The Radon transform is translation and rotation invariant
and hence preserves the variations in pixel intensities. The facial signal that describes the
expression was then processed using HOS and EMD to obtain a set of significant features. In the
HOS framework, the third-order statistic, or bispectrum, which captures contour (shape)
and texture information, was applied to the facial signal. In this work, a new set of
bispectral features was used to characterize the distinctive features of the seven classes of
emotion. In the EMD framework, the facial signal was decomposed using EMD to
produce a small set of intrinsic mode functions (IMFs) via a sifting process. The IMF
features, which exhibit unique patterns, were used to differentiate the facial emotions.
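
As a rough illustration of the processing chain summarized above, the following
Python sketch projects an image into a 1-D signal, computes a bispectrum and
performs one EMD sifting pass. It is not the implementation used in this thesis.
The function names, the nearest-neighbour Radon projection, the single-record
bispectrum estimate, the concatenation of projections into one signal and the
linear-interpolation envelopes (standard EMD uses cubic splines) are all
assumptions made to keep the example short and dependent on NumPy only.

```python
import numpy as np


def radon_projections(image, angles_deg):
    """Project a 2-D image onto 1-D lines, one projection per angle.

    Assumption: nearest-neighbour sampling on a rotated grid, used here as a
    simple stand-in for a library Radon transform.
    """
    image = np.asarray(image, dtype=float)
    n = max(image.shape)
    # Zero-pad to a square so every rotation uses the same grid.
    padded = np.zeros((n, n))
    r0 = (n - image.shape[0]) // 2
    c0 = (n - image.shape[1]) // 2
    padded[r0:r0 + image.shape[0], c0:c0 + image.shape[1]] = image

    centre = (n - 1) / 2.0
    ys, xs = np.mgrid[0:n, 0:n] - centre  # coordinates relative to the centre
    sinogram = []
    for theta in np.deg2rad(angles_deg):
        # Rotate the sampling grid, then look up the nearest source pixel.
        xi = np.rint(np.cos(theta) * xs + np.sin(theta) * ys + centre).astype(int)
        yi = np.rint(-np.sin(theta) * xs + np.cos(theta) * ys + centre).astype(int)
        inside = (xi >= 0) & (xi < n) & (yi >= 0) & (yi < n)
        rotated = np.zeros((n, n))
        rotated[inside] = padded[yi[inside], xi[inside]]
        sinogram.append(rotated.sum(axis=0))  # sum down each column
    return np.array(sinogram)


def bispectrum(signal):
    """Single-record estimate B(f1, f2) = X(f1) X(f2) conj(X(f1 + f2))."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC component
    X = np.fft.fft(x)
    idx = np.arange(len(x))
    f_sum = (idx[:, None] + idx[None, :]) % len(x)  # f1 + f2, wrapped
    return X[:, None] * X[None, :] * np.conj(X[f_sum])


def sift_once(signal):
    """One sifting pass: subtract the mean of the upper and lower envelopes.

    Assumption: envelopes are linearly interpolated between extrema.
    Returns None when there are too few extrema to build envelopes.
    """
    x = np.asarray(signal, dtype=float)
    t = np.arange(len(x))
    interior = np.arange(1, len(x) - 1)
    maxima = interior[(x[1:-1] > x[:-2]) & (x[1:-1] >= x[2:])]
    minima = interior[(x[1:-1] < x[:-2]) & (x[1:-1] <= x[2:])]
    if len(maxima) < 2 or len(minima) < 2:
        return None
    upper = np.interp(t, maxima, x[maxima])
    lower = np.interp(t, minima, x[minima])
    return x - (upper + lower) / 2.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.random((32, 32))  # placeholder for a cropped face image
    # Assumption: the 1-D facial signal is the concatenation of projections.
    facial_signal = radon_projections(face, np.arange(0, 180, 10)).ravel()
    B = bispectrum(facial_signal)
    imf_candidate = sift_once(facial_signal)
    print(facial_signal.shape, B.shape)
```

In a full EMD, the sifting pass would be repeated until the candidate satisfies
the IMF conditions, and the resulting IMF would be subtracted from the signal
before the next one is extracted. A practical system would likely also replace
the projection and envelope steps here with library routines.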