 Research Article
 Open Access
Gabor Directional Binary Pattern: An Image Descriptor for Gaze Estimation
 Hongzhi Ge
https://doi.org/10.1155/2010/807612
© Hongzhi Ge. 2010
 Received: 27 April 2010
 Accepted: 24 August 2010
 Published: 29 August 2010
Abstract
This paper proposes an image descriptor, Gabor Directional Binary Pattern (GDBP), for robust gaze estimation. In GDBP, Gabor magnitude information is first extracted from a cropped subimage. The local directional derivatives are then used to encode binary patterns along the given orientations. As an image descriptor, GDBP suppresses noise and is robust to illumination variations; its encoding pattern also emphasizes boundaries. We use the GDBP features of eye regions and adopt Support Vector Regression (SVR) to approximate the gaze mapping function, which is then used to predict the gaze direction with respect to the camera coordinate system. In the person-independent experiments, our dataset includes 4089 samples of 11 persons. Experimental results show that gaze estimation can achieve an accuracy of less than by using the proposed GDBP and SVR.
Keywords
 Local Binary Pattern
 Gabor Filter
 Lighting Variation
 Gabor Wavelet
 Camera Coordinate System
1. Introduction
In the HCI (Human-Computer Interaction) scenario, eye gaze denotes the direction from the viewer's two eyes to an object, and gaze is a very useful natural input modality. Combined with sign language recognition or speech recognition, eye gaze tracking can greatly improve usability for disabled persons, and it can also be applied in special fields such as ophthalmology, neurology, and psychology. Many researchers in the computer vision and pattern recognition community have focused on this topic, and several methods for gaze estimation can be found in the related literature. These methods, by how they represent the position of the pupil's center in the eye socket, are divided into two categories [1]: model-based methods and appearance-based methods. Model-based solutions, such as Purkinje image [2, 3] and limbus tracking [4], use an explicit geometric eye model and geometric features to estimate the gaze direction. Appearance-based solutions treat an eye image as a high-dimensional feature, instead of using explicit geometric characteristics [5]. These appearance-based approaches are usually more robust in experiments because they better exploit the statistical properties of the data. Sugano et al. [1] take the cropped eye region as a point in a local manifold model and estimate gaze by clustering learning samples with similar head poses and constructing their local manifold model. In the approach proposed by Lu et al. [6], Local Binary Pattern (LBP) [7] represents the "pupil-glint" vector information related to gaze direction by capturing the texture changes of eye images. In [8], an appearance-based method, Local Pattern Model (LPM), is presented; this model combines the improved Pixel-Pattern-Based Texture Feature (PPBTF) and the LBP texture feature. Although the existing appearance-based methods have made significant progress in gaze estimation, their accuracy and robustness need to be further improved.
In this paper, we present an appearance-based gaze estimation method based on a novel image operator, Gabor Directional Binary Pattern (GDBP), and Support Vector Regression (SVR) [9]. In GDBP, multiscale and multiorientation Gabor wavelets are used to decompose an eye image, followed by the Directional Binary Pattern (DBP) operator. We use the GDBP operator to represent the texture changes of eye images caused by the pupil centers moving in the eye sockets as a person at a fixed head pose gazes in different directions. With the advantages of Gabor filters [10] and local directional differentiation information, GDBP is not only robust to illumination variations but also highly discriminative. In applications, these patterns are useful for representing horizontal and vertical pupil movements. As appearance-based features, GDBP is fed into SVR to approximate the gaze mapping function. The output gaze direction is represented in terms of Euler angles with respect to the camera coordinate system. Our experimental results show the validity of the proposed operator, and additionally we have achieved an accuracy of less than .
The rest of the paper is organized as follows. In Section 2, we describe the computation of the proposed GDBP operator in detail, together with an analysis of its robustness to lighting variations and its varying discriminating power across orientations. Gaze estimation with a fixed frontal head pose based on GDBP is presented in Section 3, followed by experimental results and comparisons with other approaches. In the last section, brief conclusions are drawn, with a discussion of further work.
2. GDBP Operator
In this section, we first define the Directional Binary Pattern (DBP), and then extend it to GDBP, using multiscale and multiorientation Gabor filters. Finally, we analyze the robustness and discriminating power of the GDBP. The details are given as follows.
2.1. Directional Binary Pattern (DBP)
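The detailed definition of DBP is not reproduced here. Following the abstract's description (binary patterns encoded from local directional derivatives in given orientations), a minimal sketch of one plausible DBP encoding is given below; the unit-step neighborhood, the shifted-difference derivative approximation, and the thresholding at zero are all assumptions, not the paper's exact formulation.

```python
import numpy as np

def dbp(image, orientation, radius=1):
    """Sketch of a Directional Binary Pattern (hypothetical formulation).

    Encodes, at each pixel, the sign of a first-order derivative
    along `orientation` (in radians), approximated by a difference
    with a neighbor `radius` pixels away in that direction.
    """
    img = image.astype(np.float64)
    dy = int(round(np.sin(orientation))) * radius
    dx = int(round(np.cos(orientation))) * radius
    # directional derivative approximated by a shifted difference
    shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    deriv = img - shifted
    # binary pattern: 1 where the directional derivative is non-negative
    return (deriv >= 0).astype(np.uint8)
```

Applying the operator at several orientations (e.g., 0, π/4, π/2, 3π/4) yields one binary map per direction, which is how horizontal and vertical pupil movements can be emphasized separately.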
2.2. Extending DBP with Gabor Filters
As stated above, a cropped eye image is encoded into GDBP by the following procedure. The image is first normalized; multiscale and multiorientation Gabor filters are then applied in the frequency domain to obtain multiple Gabor magnitude maps. Finally, the Directional Binary Pattern is extracted from these maps.
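The pipeline above can be sketched as follows. The Gabor kernel form, the particular wavelengths and orientations, and the horizontal-difference DBP step are illustrative assumptions; the paper's exact filter bank parameters are not reproduced here.

```python
import numpy as np

def gabor_kernel(ksize, sigma, theta, lam):
    """Complex Gabor kernel in the standard form (parameters illustrative)."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.exp(2j * np.pi * xr / lam)

def gdbp(image, wavelengths=(4.0, 8.0),
         orientations=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Sketch of the GDBP pipeline: normalize, compute Gabor magnitude
    maps by filtering in the frequency domain, then apply a directional
    binary encoding to each map."""
    img = image.astype(np.float64)
    img = (img - img.mean()) / (img.std() + 1e-8)      # simple normalization
    spectrum = np.fft.fft2(img)
    maps = []
    for lam in wavelengths:
        for theta in orientations:
            k = gabor_kernel(9, lam / 2.0, theta, lam)
            # magnitude map via frequency-domain filtering
            mag = np.abs(np.fft.ifft2(spectrum * np.fft.fft2(k, s=img.shape)))
            # binary pattern: 1 where the horizontal difference is non-negative
            maps.append((mag >= np.roll(mag, 1, axis=1)).astype(np.uint8))
    return np.stack(maps)
```

With two wavelengths and four orientations, a single eye image yields eight binary maps, which can then be divided into regions and histogrammed to form the final feature vector.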
2.3. Robustness Analysis of the GDBP
Errors of three gaze directions under three lighting variations (degree), comparing the LBP, DBP, and GDBP features.
3. Experiments
3.1. Experimental Data Collection
The FASTRAK has a transmitter, which defines the world coordinate system with its origin at the centre of the transmitter, and four receivers; only three receivers are used in our data collection procedure (each receiver has its own local coordinate system). The three receivers are mounted on the viewer's head, on the camera, and at the top-left corner of the screen, respectively. The data from a receiver are its position and orientation relative to the transmitter, given as six values: the position coordinates in cm, and Azimuth, Elevation, and Roll in degrees. From these, our system calculates the receiver's translation and rotation matrices relative to the transmitter's coordinate system. In Figure 7, one receiver is mounted on top of the camera, and its translation and rotation matrices relative to the transmitter's coordinate system are computed; we assume the translation from this receiver's coordinate system to the camera's coordinate system is fixed and the rotation matrix is the identity matrix I. The second receiver is mounted at the top-left corner of the screen, and its translation and rotation matrices relative to the transmitter's coordinate system are computed likewise. For each generated cursor serving as a gazed point, we assume the rotation matrix from this receiver's coordinate system to the screen's coordinate system is also the identity matrix I. The third receiver is mounted on top of the viewer's head, with its translation and rotation matrices relative to the transmitter's coordinate system computed in the same way. We assume that the centre of the two eyes has a fixed translation (in cm) relative to the third receiver; this translation is a statistical average, and the experiments show that the error caused by different eye centres can be ignored. We keep one axis of the head receiver parallel to the line between the two eyes, a second axis upright, and the third axis parallel to the direction of the head pose.
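The receiver-to-world transforms described above can be sketched as follows. The Z-Y-X composition order for the Azimuth/Elevation/Roll angles is an assumption about the device convention, and the function names are illustrative.

```python
import numpy as np

def rotation_from_euler(azimuth, elevation, roll):
    """Rotation matrix from FASTRAK-style Euler angles in degrees.
    The Z-Y-X composition order is an assumed convention."""
    a, e, r = np.deg2rad([azimuth, elevation, roll])
    rz = np.array([[np.cos(a), -np.sin(a), 0],
                   [np.sin(a),  np.cos(a), 0],
                   [0,          0,         1]])
    ry = np.array([[ np.cos(e), 0, np.sin(e)],
                   [ 0,         1, 0],
                   [-np.sin(e), 0, np.cos(e)]])
    rx = np.array([[1, 0,          0],
                   [0, np.cos(r), -np.sin(r)],
                   [0, np.sin(r),  np.cos(r)]])
    return rz @ ry @ rx

def to_world(p_local, azimuth, elevation, roll, t):
    """Map a point from a receiver's local frame into the transmitter
    (world) frame, given the receiver's orientation angles and its
    translation t (in cm) reported by the tracker."""
    R = rotation_from_euler(azimuth, elevation, roll)
    return R @ np.asarray(p_local, dtype=float) + np.asarray(t, dtype=float)
```

For example, a cursor's screen coordinate can be mapped to world coordinates with the screen receiver's angles and translation, since the receiver-to-screen rotation is assumed to be the identity.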
Our system synchronizes image capturing with the computation of gaze direction. During data collection, the distance between the subject's head and the screen is around 600 mm. In this paper, to simplify the experiments, we keep the head pose frontal by holding the receiver fixed on top of the head. During data collection, at each time only one of 16 predefined points appears on the monitor, and its world coordinate is calculated from its screen coordinate by the translation and rotation matrices. We provide this database for further research.
3.2. Gaze Estimation Based on GDBP and Experimental Results
Errors of different regions (degree).

Gaze    3×2    4×2    3×3    4×3
        2.4    2.1    2.3    2.7
Errors of three gaze directions (degree), comparing the GDBP, DBP, and LBP features.
In our experiments, the SVR uses a Gaussian (RBF) kernel. The sixteen regions of the double-eye images are used, and the average error is around . It is important to note that our eye gaze method is non-invasive, fast, and stable; it is stable because our novel features are robust to lighting variations.
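As a concrete illustration of this regression setup, the sketch below trains an RBF-kernel SVR to map feature vectors to two gaze angles. The synthetic data, the feature dimensionality, and the hyperparameters are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.multioutput import MultiOutputRegressor

# Synthetic stand-ins for GDBP feature vectors and gaze angles
# (e.g., yaw and pitch in degrees).
rng = np.random.default_rng(0)
X = rng.random((200, 64))                        # 200 samples, 64-dim features
w = rng.standard_normal((64, 2))
Y = X @ w + 0.1 * rng.standard_normal((200, 2))  # two gaze angles per sample

# One Gaussian-kernel SVR per output angle.
model = MultiOutputRegressor(SVR(kernel='rbf', C=10.0, gamma='scale'))
model.fit(X[:150], Y[:150])

pred = model.predict(X[150:])                    # predicted (yaw, pitch) pairs
mae = np.abs(pred - Y[150:]).mean(axis=0)        # per-angle mean absolute error
```

In the paper's setting, X would hold the concatenated regional GDBP histograms of both eyes, and Y the ground-truth gaze angles measured by the tracker.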
4. Conclusions
In this paper, a robust image descriptor, GDBP, is proposed for gaze estimation. GDBP captures not only the local binary pattern but also the texture change information along the given directions. Other advantages of GDBP include noise suppression and robustness to lighting variations. GDBP features are finally fed into SVR to estimate the gaze direction with respect to the camera coordinate system. In the future, we will investigate how to match two GDBPs and how to apply the discriminative capacity of the GDBP operator to other tasks.
References
 Sugano Y, Matsushita Y, Sato Y, Koike H: An incremental learning method for unconstrained gaze estimation. Proceedings of the European Conference on Computer Vision, 2008, Lecture Notes in Computer Science 5304: 656-667.
 Cornsweet TN, Crane HD: Accurate two-dimensional eye tracker using first and fourth Purkinje images. Journal of the Optical Society of America 1973, 63(8):921-928. 10.1364/JOSA.63.000921
 Ohno T, Mukawa N, Yoshikawa A: FreeGaze: a gaze tracking system for everyday gaze interaction. Proceedings of the Eye Tracking Research and Applications Symposium (ETRA '02), March 2002, 125-132.
 Matsumoto Y, Zelinsky A: An algorithm for real time stereo vision implementation of head pose and gaze direction measurement. Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition (AFGR '00), 2000, 499-504.
 Hansen D, Ji Q: In the eye of the beholder: a survey of models for eyes and gaze. IEEE Transactions on Pattern Analysis and Machine Intelligence 2010, 32(3):478-500.
 Lu HC, Wang C, Chen YW: Gaze tracking by binocular vision and LBP features. Proceedings of the 19th International Conference on Pattern Recognition (ICPR '08), August 2008, 1-4.
 Ojala T, Pietikäinen M, Mäenpää T: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 2002, 24(7):971-987. 10.1109/TPAMI.2002.1017623
 Lu H, Fang G, Wang C, Chen Y: A novel method for gaze tracking by local pattern model and support vector regressor. EURASIP Signal Processing 2010, 90(4):1290-1299.
 Smola A, Scholkopf B, et al.: A tutorial on support vector regression. Royal Holloway College, University of London, London, UK; 1998.
 Shan S, Gao W, Chang Y, Cao B, Yang P: Review the strength of Gabor features for face recognition from the angle of its robustness to misalignment. Proceedings of the 17th International Conference on Pattern Recognition (ICPR '04), August 2004, 338-341.
 Zhang B, Wang Z, Zhong B: Kernel learning of histogram of local Gabor phase patterns for face recognition. EURASIP Journal on Advances in Signal Processing 2008, 2008:8.
 Polhemus FASTRAK, http://www.polhemus.com/?page=Motion_Fastrak
 Niu Z, Shan S, Chen X, Ma B, Gao W: Enhance ASMs based on AdaBoost-based salient landmarks localization and confidence-constraint shape modeling. Proceedings of the International Workshop on Biometric Recognition Systems (IWBRS '05), 2005, Lecture Notes in Computer Science 3781: 9-14.
 EyeLink II, http://www.srresearch.com/EL_II.html
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.