Driver Drowsiness Warning System Using Visual Information for Both Diurnal and Nocturnal Illumination Conditions
© Marco Javier Flores et al. 2010
Received: 23 November 2009
Accepted: 21 June 2010
Published: 11 July 2010
Every year, traffic accidents due to human error cause an increasing number of deaths and injuries globally. To help reduce the number of fatalities, this paper presents a new module for Advanced Driver Assistance Systems (ADAS) that performs automatic driver drowsiness detection based on visual information and Artificial Intelligence. The aim of the system is to locate, track, and analyze both the driver's face and eyes to compute a drowsiness index; it works in real time under varying light conditions (diurnal and nocturnal driving). Examples of images of drivers taken in a real vehicle are shown to validate the algorithms used.
ADAS is part of the active safety systems that interact to a larger extent with drivers to help them avoid traffic accidents. The goal of such systems is to contribute to the reduction of traffic accidents by means of new technologies, that is, by incorporating new systems that increase vehicle safety and, at the same time, decrease the dangerous situations that may arise during driving due to human error. In this scenario, vehicle safety research is focused on driver analysis. In this particular research, a more in-depth analysis of drowsiness and distraction is presented.
Drowsiness appears in situations of stress and fatigue in an unexpected and inopportune way and may be produced by sleep disorders, certain types of medication, and even boredom, for example, when driving for long periods of time. The sensation of sleepiness reduces the level of vigilance, producing dangerous situations and increasing the probability of an accident.
It has been estimated that drowsiness causes between 10% and 20% of traffic accidents, resulting in both fatalities and injuries, whereas within the trucking industry 57% of fatal truck accidents are caused by this problem [4, 5]. Fletcher et al. have stated that 30% of all traffic accidents are caused by drowsiness, and Brandt et al. have presented statistics showing that 20% of all accidents are caused by fatigue and lack of attention. In the USA, drowsiness is responsible for 100,000 traffic accidents yearly, producing costs of close to 12,000 million dollars. In Germany, one out of four traffic accidents originates from drowsiness, while in England 20% of all traffic accidents are produced by drowsiness, and in Australia 1,500 million dollars has been spent on fatalities resulting from this problem.
By taking advantage of these visual characteristics, computer vision is the most feasible and appropriate technology available to deal with this problem. This paper presents the drowsiness detection system of the IVVI (Intelligent Vehicle based on Visual Information) vehicle. The goal of this system is to automatically estimate the driver's drowsiness and to prevent drivers from falling asleep at the wheel.
This paper is laid out as follows. Section 2 presents an extensive review of the state of the art, considering different lighting conditions. A general framework of the proposed method is presented in Section 3. There are two systems, one for diurnal and another for nocturnal driving. Both have a first step for face and eye detection, followed by a second step for face and eye tracking. The output of both systems is a drowsiness index based on a support vector machine. A deeper explanation of both systems is presented in Sections 4 and 5, where the similarities and differences of the two approaches are highlighted and the results are shown. Finally, in Section 6, the conclusions are presented.
2. Related Work
To increase traffic safety and to reduce the number of traffic accidents, numerous universities, research centers, automotive companies (Toyota, Daimler Chrysler, Mitsubishi, etc.), and governments (European Union, etc.) are contributing to the development of ADAS for driver analysis using different technologies. In this sense, the use of visual information to obtain the state of driver drowsiness and to understand his/her behavior is an active research field.
This problem requires the recognition of human behavior in a state of sleepiness by means of an eye and facial (head) analysis. This is a difficult task, even for humans, because there are many factors involved, for instance, changing illumination conditions and a variety of possible facial postures. Considering the illumination, the state of the art has been divided into two parts: the first provides details on systems that work with natural daylight, whereas the second deals with systems that operate with the help of near-infrared (NIR) illumination.
2.1. Systems for Daylight Illumination
To analyze driver drowsiness, several systems have been built in recent years. They usually require the problem to be simplified so that they work partially or only in specific environments. For example, D'Orazio et al. have proposed an eye detection algorithm that searches for the eyes within the complete image and assumes that the iris is always darker than the sclera. The eye candidates are located using the Hough transform for circles and geometrical constraints; next, they are passed to a neural network that classifies them as eyes or non-eyes. This system is capable of classifying eyes as being open or closed. The main limitations of this algorithm are that it is applicable only when the eyes are visible in the image and that it is not robust to changes in illumination. Horng et al. have presented a system that uses a skin color model over an HSI space for face detection, edge information for eye localization, and dynamical template matching for eye tracking. The state of the eye is defined using color information from the eyeball, and from it the driver's state, asleep or alert, is computed: if the eyes are closed for five consecutive frames, the driver is assumed to be dozing. Brandt et al. have shown a system that monitors driver fatigue and lack of attention. For this task, the Viola Jones (VJ) method has been used to detect the driver's face. By applying the optical flow algorithm to the eyes and the head, this system is able to compute the driver's state. Tian and Qin have built a system that verifies the state of the driver's eye. Their system uses the Cb and Cr components of the YCbCr color space; it locates the face region with a vertical projection function and the eye region with a horizontal projection function. Once the eyes are located, the system computes the eye state using a complexity function.
Dong and Wu have presented a system for driver fatigue detection based on a skin color model that uses a bivariate Normal distribution and the Cb and Cr components of the YCbCr color space. After locating the eyes, it computes the fatigue index using the distance of the eyelid to classify whether the eyes are open or closed; if the eyes are closed for five consecutive frames, the driver is considered to be dozing, as in Horng's work. Branzan et al. have also presented a system for drowsiness monitoring that uses template matching to analyze the state of the eye.
2.2. Systems Using Infrared Illumination
To deal with nocturnal lighting conditions, Ji et al. in [4, 15] have presented a drowsiness detection system based on NIR illumination and stereo vision. This system locates the position of the eye using image differences based on the bright pupil effect. It then computes the eyelid blink frequency and eye gaze to build two drowsiness indices: PERCLOS (percentage of eye closure over time) and AECS (average eye closure speed). Bergasa et al. have also developed a nonintrusive system using infrared illumination. This system computes the driver's vigilance level using a finite state machine (FSM) with six different eye states from which several indices are computed, among them PERCLOS; it is also capable of detecting inattention through a facial posture analysis. Other research work based on this type of illumination has been presented by Grace, who measures slow eyelid closure. Systems using NIR illumination work well under stable lighting conditions [5, 18]; however, they present drawbacks for applications in real vehicles, where the light changes continually. In this scenario, if the bright pupil effect disappears, the eye detection process becomes more complex.
3. System Design for Drowsiness Detection
This paper presents a system that detects driver drowsiness under both daytime and night-time conditions, following the classification presented in the state of the art.
This composition has allowed two systems to be obtained, one for day and a second for night time conditions. The first works with natural daylight illumination and the second with artificial infrared illumination. It is interesting to note that both systems operate using grayscale images taken within a real vehicle.
Each one of these parts will be explained in the following sections.
4. Day System Design
In this section, the daytime system based on the algorithm schematic shown in Figure 1(a) will be described, where the visual information is acquired using a digital camera.
4.1. Face Detection
To locate the face, this system uses the VJ object detector, which is a machine learning approach for visual object detection. It combines three elements to build an efficient object detector: the integral image, the AdaBoost technique, and the cascade classifier. Each of these elements is important for processing the images efficiently and in near real-time, with correct detection rates as high as 90%. A further important aspect of this method is its robustness to changing light conditions. However, in spite of the above-mentioned features, its principal disadvantage is that it cannot extrapolate and does not work appropriately when the face is not in front of the camera axis. This particular case occurs when the driver moves his/her head. This shortcoming will be analyzed later on in this paper.
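As an illustration of the first of these elements, the following minimal Python sketch builds an integral image and uses it to evaluate a two-rectangle Haar-like feature with a constant number of lookups. It is not the detector used in this work: the function names and the choice of feature are assumptions made for the example, and the AdaBoost and cascade stages are omitted.

```python
def integral_image(img):
    """Return the summed-area table of a 2-D list of gray levels.

    The table has one extra row and column of zeros, so ii[r][c] is the
    sum of all pixels above row r and to the left of column c.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature (hypothetical example): left half minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum
```

Because every rectangle sum costs four lookups regardless of its size, a large number of such features can be evaluated per window, which is what makes the cascade fast enough for near real-time use.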
4.2. Eye Detection
Locating the position of the eye is a difficult task as different features define the same eye depending, for example, on the area of the image where it appears and on the color of the iris, but the main problem that occurs when driving is the changes in the ambient lighting conditions.
The main reason for using pixel information from a random sample is that head movements, illumination changes, and so forth do not allow complete eye pixel information to be obtained; that is, only partial information about the eye is available in images B, G, and L, where the elliptical shape prevails. This random information makes it feasible to use an algorithm that computes the parameters of a function which approximates the elliptical shape of the eye. EM computes the mean, variance, and correlation of the x and y coordinates that belong to the eye. The initial parameters required to run EM are obtained from a regression model fitted using the least squares method. The number of iterations of the EM algorithm is set to 10, and the sample size is taken to be at least 1/3 of the rectangle's area. These parameters will be used in the eye state analysis presented below.
There are a number of reasons for using a tracking module. The first is due to problems that were encountered using VJ during this research. Another is related to the need to track the face and eyes continuously from frame to frame. A third reason is to reduce the search space, thus satisfying the real-time requirement. The tracking process has been developed using the Condensation Algorithm (CA) in conjunction with Neural Networks (NNs) for face tracking and with template matching for eye tracking.
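The quantities involved can be illustrated with a short Python sketch. It is a simplification and not the EM procedure itself: it draws a random sample of at least one third of the candidate pixels and computes the closed-form mean, variance, and correlation of their coordinates, which is the estimate for a single bivariate Gaussian with no missing data. The function names and the fixed random seed are assumptions made for the example.

```python
import math
import random


def sample_eye_pixels(pixels, seed=0):
    """Draw a random sample holding at least 1/3 of the candidate pixels."""
    rng = random.Random(seed)
    k = max(1, (len(pixels) + 2) // 3)  # integer ceiling of len / 3
    return rng.sample(pixels, k)


def ellipse_parameters(points):
    """Mean, variance, and correlation of the (x, y) coordinates."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # The correlation is undefined if all points share one row or column.
    rho = cov_xy / math.sqrt(var_x * var_y)
    return mean_x, mean_y, var_x, var_y, rho
```

The five returned values determine the centre, the spread along each axis, and the tilt of the ellipse that approximates the eye region.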
4.3.1. The Condensation Algorithm
In this model, the measurement system provides the observation at each time step, and a nonlinear state equation links the present state to the previous state plus a white noise term. The process noise and the measurement noise are both white and are independent of each other. In general, these processes are also non-Gaussian and multimodal. It must be pointed out that the state is an unobservable underlying stochastic process.
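The select, predict, and measure cycle of a Condensation-style tracker can be sketched in a few lines of Python. This is a one-dimensional toy under stated assumptions, not the tracker of this paper: the random-walk dynamics, the Gaussian observation density, the noise levels, and the function names are all chosen only to keep the example short.

```python
import math
import random


def condensation_step(particles, weights, measurement, rng,
                      process_std=1.0, measurement_std=2.0):
    """One select-predict-measure iteration over a 1-D state."""
    n = len(particles)
    # 1. Select: resample with replacement in proportion to the old weights.
    selected = rng.choices(particles, weights=weights, k=n)
    # 2. Predict: random-walk dynamics plus process noise (assumed model).
    predicted = [x + rng.gauss(0.0, process_std) for x in selected]
    # 3. Measure: weight each sample by an assumed Gaussian observation density.
    raw = [math.exp(-0.5 * ((measurement - x) / measurement_std) ** 2)
           for x in predicted]
    total = sum(raw)
    if total == 0.0:
        new_weights = [1.0 / n] * n  # all samples far away: fall back to uniform
    else:
        new_weights = [w / total for w in raw]
    return predicted, new_weights


def estimate(particles, weights):
    """Weighted mean of the sample set, used as the state estimate."""
    return sum(x * w for x, w in zip(particles, weights))
```

Because the posterior is represented by weighted samples rather than by a single Gaussian, the same loop can follow multimodal distributions, which is the property that motivates the use of CA here.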
4.3.2. Neural Networks
Next, the characteristic vector which consists of the pixel gray-level values from the face image is extracted. The rate of classification following training is greater than 93%.
4.3.3. Face Tracking
The main problem of the VJ method is that it is only able to locate the human face when it is positioned in front of the camera. This drawback leads to an unreliable system for driver analysis throughout the driving process which is highly dynamic, for example, when looking at the rearview or wing mirrors. Much effort has gone into correcting this problem resulting in an efficient tracker which has been implemented using CA combined with a backpropagation neural network.
Here the state is propagated with a transition matrix taken from the literature, together with a term that represents the system perturbation at each time step. The most difficult part of the CA is to evaluate the observation density function. In this contribution, the weight of each sample at each time step is computed from a bounded neural network output value; this provides an approximation of the face and nonface classes, in conjunction with the distance with respect to the face being tracked. This is similar to the work performed by Satake and Shakunaga, who have used sparse template matching to compute the weight of each sample. In this contribution, the neural network value is used as an approximate value for the weights.
Result of face tracking.
4.3.4. Eye Tracking
Result of eye tracking.
4.4. Eye State Detection
4.4.1. Support Vector Machine
SVM classification [28–30] is rooted in statistical learning theory and pattern classifiers. It uses a training set of pairs, each consisting of a characteristic vector and the class it belongs to, in this case 1 for open eyes and 2 for closed eyes. From the training set, a hyperplane is built that permits classification between the two classes and minimizes the empirical risk function.
4.4.2. Eye Characteristic Extraction Using a Gabor Filter
Prior to SVM training, it is crucial to preprocess each image. This procedure involves histogram equalization and filtering with a median filter, followed by a sharpen filter. The median filter is used to reduce image noise, and the sharpen filter enhances the borders.
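The three preprocessing steps can be sketched in plain Python as follows. The 3x3 window, the replicated borders, and the particular sharpening kernel are assumptions made for illustration, since the filter sizes and coefficients are not specified here; images are small 2-D lists of 8-bit gray levels.

```python
def equalize_histogram(img, levels=256):
    """Spread the gray levels using the cumulative histogram."""
    flat = [p for row in img for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:  # constant image: nothing to equalize
        return [row[:] for row in img]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in img]


def _window3(img, r, c):
    """3x3 neighbourhood of (r, c) with replicated borders."""
    rows, cols = len(img), len(img[0])
    return [img[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def median_filter3(img):
    """Replace each pixel by the median of its 3x3 neighbourhood."""
    return [[sorted(_window3(img, r, c))[4] for c in range(len(img[0]))]
            for r in range(len(img))]


SHARPEN_KERNEL = [0, -1, 0, -1, 5, -1, 0, -1, 0]  # assumed coefficients


def sharpen3(img, levels=256):
    """Convolve with the sharpening kernel and clip to the valid range."""
    return [[min(max(sum(k * v for k, v in zip(SHARPEN_KERNEL, _window3(img, r, c))), 0),
                 levels - 1)
             for c in range(len(img[0]))] for r in range(len(img))]


def preprocess(img):
    """Histogram equalization, then median filter, then sharpen filter."""
    return sharpen3(median_filter3(equalize_histogram(img)))
```

The median filter is applied before sharpening so that isolated noisy pixels are removed rather than amplified by the border enhancement.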
The main objective of the SVM training is to obtain the best parameters and the best kernel that minimize (5). After several SVM training experiments, it was decided to use the RBF kernel, that is, $K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^2)$, with parameters selected so that a high training classification rate of close to 93% is achieved.
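For reference, the RBF kernel and the resulting decision value of a trained SVM can be written as the short Python sketch below. The support vectors, coefficients, bias, and the value of gamma are hypothetical inputs; in practice they would come from a training library such as LIBSVM, and none of the values used in this work are reproduced here.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, coefficients, bias, gamma):
    """Signed score of a sample; its sign selects one of the two classes."""
    score = sum(c * rbf_kernel(sv, x, gamma)
                for sv, c in zip(support_vectors, coefficients))
    return score + bias
```

A larger gamma makes the kernel more local, so each support vector influences a smaller neighbourhood of the feature space; this is why gamma has to be tuned together with the regularization parameter.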
Result of eye state analysis.
4.5. Drowsiness Index
The eye-blinking frequency is an indicator that allows the level of driver drowsiness (fatigue) to be measured. As in the works of Horng et al. and Dong and Wu, if the eye is identified as being closed for five consecutive frames or for 0.25 seconds, the system issues an alarm cue. PERCLOS is also implemented in this system.
It is estimated that 20% of traffic accidents are caused by driver distraction. To detect this characteristic, the driver's face should be studied, because the pose of the face contains information about one's attention, gaze, and level of fatigue. To verify driver distraction, the following procedure has been implemented.
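Both indicators can be computed from the per-frame eye state, as in the following Python sketch. The input is assumed to be a list of booleans, one per frame, with True meaning that the eye was classified as closed; PERCLOS is simplified here to the fraction of closed frames in the window, and the function names are illustrative.

```python
def closed_alarm(eye_closed, min_frames=5):
    """True if the eye stays closed for min_frames consecutive frames."""
    run = 0
    for closed in eye_closed:
        run = run + 1 if closed else 0
        if run >= min_frames:
            return True
    return False


def perclos(eye_closed):
    """Fraction of frames in the window in which the eye is closed."""
    if not eye_closed:
        return 0.0
    return sum(1 for closed in eye_closed if closed) / len(eye_closed)
```

The consecutive-frame rule reacts quickly to a single long closure, whereas PERCLOS accumulates evidence over a longer window, so the two indicators complement each other.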
4.6.1. Face Orientation
4.6.2. Head Tilt
The method described above presents problems when a monocular camera is used, and so, to overcome this drawback, this contribution has implemented a head-tilt detector based on neural networks. Keeping in mind that the driver face database is composed of face examples for five different orientations, the face is passed to the neural network to determine its orientation, specifically for the up and down cases. If the system detects that the face is not looking straight ahead, an alarm cue is issued to alert the driver of a dangerous situation.
5. Night System Design
In this part of the work, the night system is described; it is based on the algorithm scheme shown in Figure 1(b). Note that it is composed of both software and hardware platforms. The main difference between this and the previous system is in the perception system.
5.1. Perception System
Each frame is deinterlaced into odd and even fields, which contain the dark and bright pupil images separately. Hence, the height of the odd and even image fields is half that of the original image; this procedure can be seen in Figure 27(c): the top photograph is the even image field, and the bottom is the odd image field. The even and odd images will be used later on for eye detection.
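The deinterlacing step amounts to separating alternate rows of the frame, as in this minimal Python sketch. The frame is assumed to be a list of pixel rows with row 0 counted as even; which of the two fields holds the bright pupil image depends on the illumination hardware and is not assumed here.

```python
def deinterlace(frame):
    """Split an interlaced frame into its even and odd row fields."""
    even_field = frame[0::2]  # rows 0, 2, 4, ...
    odd_field = frame[1::2]   # rows 1, 3, 5, ...
    return even_field, odd_field
```

For a frame with an even number of rows, each field has exactly half the height of the original image.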
5.2. Eye Detection
The bright pupil effect is the main principle behind locating the position of the eye. To this end, three images have been generated from the initial driver image: the difference image, the edge image, and the bright part of the fast radial symmetry transform (FRST) image.
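The first of these images can be illustrated with a short Python sketch that subtracts the dark pupil field from the bright pupil field. Clipping negative values to zero is an assumption made for the example, as are the function and argument names; the edge and FRST images are not covered.

```python
def difference_image(bright_field, dark_field):
    """Pixel-wise bright minus dark, with negative values clipped to zero."""
    return [[max(b - d, 0) for b, d in zip(bright_row, dark_row)]
            for bright_row, dark_row in zip(bright_field, dark_field)]
```

Pixels that are bright in one field and dark in the other, as the pupils are, stand out in the result, while the rest of the scene largely cancels.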
Most researchers only make use of the difference image for pupil detection; however, in real driving conditions, this image deteriorates due to external illumination, vibrations, and so forth and is also very sensitive to lighting conditions. In such circumstances, it is necessary to incorporate more robust information to improve the detection step. Therefore, in this paper, the edge and FRST images have been implemented to obtain enhanced results considering the aforementioned drawbacks.
Once all the images used to detect the eyes have been specified, the next step is to compute a binary threshold for the difference, edge, and FRST images. In the first of these, the threshold is obtained from a systematic analysis of its histogram, where two groups are formed. In the second case, the histogram is modelled using a Gamma distribution function where the 90% cumulative interval provides the threshold. Finally, in the third image, the maximum histogram value produces the required threshold level. This yields three binary images consisting of binary blobs that may contain a pupil.
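A common way of splitting a histogram into two groups is Otsu's method, which selects the threshold that maximizes the between-class variance. The Python sketch below uses it as a stand-in for the analysis of the difference image; whether this is exactly the procedure applied here is an assumption, and the Gamma-based and maximum-based thresholds of the other two images are not shown.

```python
def otsu_threshold(pixels, levels=256):
    """Threshold that maximizes the between-class variance of two groups."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue  # one group is empty: not a valid split
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold


def binarize(img, threshold):
    """Binary image: 1 where the pixel exceeds the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in img]
```

The connected regions of ones in the binarized image are the blobs that are then examined as pupil candidates.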
5.3. Face Detection
Here the model parameters are the centre of the face and the two axes of the face ellipse. Figure 30 depicts this model and its result.
The tracking process has been developed using the Condensation algorithm for face and eye tracking.
5.4.1. Face Tracking
Results of face and eye tracking and eye state analysis.
5.4.2. Eye Tracking
To evaluate the observation probability density, a triangular density function based on a value from the difference image has been used. CA is initialized when the eyes are detected using the method described in the previous section, with added white noise. Table 4 shows the eye tracking results which have been obtained from several image sequences.
5.5. Eye State Detection and Drowsiness Index
To identify drowsiness from an eye analysis, the state of the eye, open or closed, must be known at each instant, and the analysis must extend over long periods of time so that the time spent in each state can be measured. Classification of the open and closed states is complex due to changes in the shape of the eye, its changing position, and face rotations, as well as variations in blinking and illumination, and so forth. All of these factors make it difficult to analyze the eyes reliably. However, when using the edge and FRST images, the eye state may be computed satisfactorily.
This method is similar to the previous case: once the face is continuously located in time, a neural network is used to determine its orientation and to verify the driver's level of distraction. If the system detects that the face is not facing forward, an alarm cue is issued to alert the driver of a dangerous situation.
In this paper, a research project to develop a nonintrusive and autonomous driver drowsiness system based on Computer Vision and Artificial Intelligence has been presented. This system uses advanced technologies which analyze and monitor the state of the driver's eye in real time and under real driving conditions, that is, in both daytime and nocturnal situations.
In the first case, based on the results presented in Tables 1, 2, and 3, the algorithm proposed for eye detection, face tracking, and eye tracking is shown to be robust and accurate under varying light, external illumination interference, vibrations, changing backgrounds, and facial orientations. In the second case, as presented in the results of Table 4, the system is also observed to provide satisfactory results.
To acquire the data required to develop and test the algorithms presented in this paper, several drivers were recruited and exposed to a wide variety of difficult situations commonly encountered on roadways, for both daytime and nocturnal conditions. This supports the robustness and efficiency of the algorithms presented here for real traffic scenarios. The images were taken using two cameras within the IVVI (Intelligent Vehicle based on Visual Information) vehicle (Figure 27(a)): a pin-hole analog camera connected to a frame-grabber for nocturnal illumination and a fire-wire camera for diurnal use. The hardware processes 4-5 frames per second using an Intel Pentium D at 3.2 GHz with 2 GB of RAM, running MS Windows XP.
For future work, the objective will be to reduce the percentage error, that is, to reduce the number of false alarms; to achieve this, additional experiments will be developed, using additional drivers and incorporating new analysis modules, for example, facial expressions.
This paper was supported in part by the Spanish Government through the CICYT projects VISVIA (Grant TRA2007-67786-C02-02) and POCIMA (Grant TRA2007-67374-C02-01).
- Brandt T, Stemmer R, Mertsching B, Rakotonirainy A: Affordable visual driver monitoring system for fatigue and monotony. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '04), October 2004 7: 6451-6456.
- Tian Z, Qin H: Real-time driver's eye state detection. Proceedings of the IEEE International Conference on Vehicular Electronics and Safety, October 2005 285-289.
- Dong W, Wu X: Driver fatigue detection based on the distance of eyelid. Proceedings of the IEEE International Workshop on VLSI Design and Video Technology (IWVDVT '05), May 2005, Suzhou-China 397-400.
- Ji Q, Yang X: Real-time eye, gaze, and face pose tracking for monitoring driver vigilance. Real-Time Imaging 2002, 8(5):357-377. 10.1006/rtim.2002.0279
- Bergasa LM, Nuevo J, Sotelo MA, Vázquez M: Real-time system for monitoring driver vigilance. Proceedings of the IEEE Intelligent Vehicles Symposium, June 2004 78-83.
- Fletcher L, Petersson L, Zelinsky A: Driver assistance systems based on vision in and out of vehicles. Proceedings of the IEEE Symposium on Intelligent Vehicles, 2003 322-327.
- NHTSA: Evaluation of techniques for ocular measurement as an index of fatigue and the basis for alertness management. DOT HS 808762 National Highway Traffic Safety Administration, Washington, DC, USA; 1998.
- Hagenmeyer L: Development of a multimodal, universal human-machine-interface for hypovigilance-management-systems, Ph.D. thesis. University of Stuttgart, Stuttgart, Germany; 2007.
- Longhurst G: Understanding Driver Visual Behaviour. Seeing Machine, Canberra, Australia.
- Armingol JM, de la Escalera A, Hilario C, Collado JM, Carrasco JP, Flores MJ, Pastor JM, Rodríguez FJ: IVVI: intelligent vehicle based on visual information. Robotics and Autonomous Systems 2007, 55(12):904-916. 10.1016/j.robot.2007.09.004
- D'Orazio T, Leo M, Distante A: Eye detection in face images for a driver vigilance system. In Proceedings of the IEEE Intelligent Vehicles Symposium, June 2004, Parma, Italy. University of Parma; 14-17.
- Horng W-B, Chen C-Y, Chang Y, Fan C-H: Driver fatigue detection based on eye tracking and dynamic template matching. Proceedings of the IEEE International Conference on Networking, Sensing and Control, March 2004 7-12.
- Viola P, Jones M: Rapid object detection using a boosted cascade of simple features. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, December 2001 511-518.
- Branzan A, Widsten B, Wang T, Lan J, Mah J: A computer vision-based system for real-time detection of sleep onset in fatigued drivers. Proceedings of the IEEE Intelligent Vehicles Symposium (IV '08), June 2008 25-30.
- Ji Q, Zhu Z, Lan P: Real-time nonintrusive monitoring and prediction of driver fatigue. IEEE Transactions on Vehicular Technology 2004, 53(4):1052-1068. 10.1109/TVT.2004.830974
- Brookshear JG: Theory of Computation: Formal Languages, Automata and Complexity. Addison Wesley Iberoamericana; 1993.
- Grace R: Drowsy driver monitor and warning system. Proceedings of Driving Assessment International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design, 2001.
- Daugman JG: Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America 1985, 2(7):1160-1169. 10.1364/JOSAA.2.001160
- Gejgus P, Sparka M: Face Tracking in Color Video Sequences. The Association for Computing Machinery; 2003.
- Otsu N: A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man and Cybernetics 1979, 9: 62-66.
- Jafar I, Ying H: A new method for image contrast enhancement based on automatic specification of local histograms. International Journal of Computer Science and Network Security 2007, 7(7).
- Wu Y, Liu H, Zha H: A new method of detecting human eyelids based on deformable templates. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '04), October 2004 604-609.
- McLachlan GJ: The EM Algorithm and Extensions. John Wiley & Sons, New York, NY, USA; 1997.
- Isard M, Blake A: Condensation: conditional density propagation for visual tracking. International Journal of Computer Vision 1998, 29(1):5-28. 10.1023/A:1008078328650
- Isard MA: Visual motion analysis by probabilistic propagation of conditional density, Ph.D. thesis. Oxford University, Oxford, UK; 1998.
- Parker JR: Practical Computer Vision Using C. John Wiley & Sons, New York, NY, USA; 1994.
- Satake J, Shakunaga T: Multiple target tracking by appearance-based condensation tracker using structure information. Proceedings of the 17th International Conference on Pattern Recognition (ICPR '04), August 2004 3: 294-297.
- Cristianini N, Shawe-Taylor J: An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge, UK; 2000.
- Chang C, Lin C: LIBSVM: a library for support vector machine. 2001, http://www.csie.ntu.edu.tw/~cjlin/libsvm
- Guyon I, Gunn S, Nikravesh M, Zadeh LA: Feature Extraction: Foundations and Applications. Springer, Berlin, Germany; 2006.
- Chen Y-W, Kubo K: A robust eye detection and tracking technique using gabor filters. Proceedings of the 3rd International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIHMSP '07), November 2007 1: 109-112.
- Loy G, Zelinsky A: Fast radial symmetry for detecting points of interest. IEEE Transactions on Pattern Analysis and Machine Intelligence 2003, 25(8):959-973. 10.1109/TPAMI.2003.1217601
- Looney CG: Pattern Recognition Using Neural Networks, Theory and Algorithms for Engineers and Scientists. Oxford University Press, Oxford, UK; 1997.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.