Shadow detection is of broad interest in computer vision. This article proposes a new shadow detection method for single color images of outdoor scenes. Shadows attenuate pixel intensity, and the degree of attenuation differs among the three RGB color channels. We previously proposed the Tricolor Attenuation Model (TAM), which describes the attenuation relationship between shadows and their non-shadow backgrounds in the three color channels. TAM provides strong cues for shadow detection; however, our previous method needs a rough segmentation as a pre-processing step and requires four thresholds. These shortcomings can be overcome by adding intensity information. This article addresses how to combine TAM with intensity information and, at the same time, obtain a single threshold for shadow segmentation. The proposed method is tested on both simple and complicated shadow images, and the experimental results and comparisons validate its effectiveness.
1 Introduction
Shadow detection is highly desirable for a wide range of applications in computer vision, pattern recognition, and image processing. As shown in Figures 1 and 2, shadows can be divided into two types: cast shadows and attached shadows (also called self-shadows). An attached shadow is the part of an object that is not illuminated by direct light; a cast shadow is the dark area projected by an object onto the background. A cast shadow can be further divided into umbra and penumbra regions. The umbra is the part of a cast shadow where direct light is completely blocked by the object; the penumbra is the part where direct light is only partially blocked.
As shown in Figure 2, the non-shadow region is illuminated by daylight (direct sunlight plus diffuse skylight), the penumbra by skylight and part of the sunlight, and the umbra by skylight only. Since skylight is only one component of daylight, the pixel intensity in a shadow is lower than that in its non-shadow background, i.e., the intensity is attenuated. The light sources and intensities of the shadow and non-shadow regions are listed in Table 1.
Denoting as a shadow pixel value vector and as the pixel value vector of the corresponding non-shadow background, the relationship between and is
where denotes the tricolor attenuation vector. The relationship among is called Tricolor Attenuation Model (TAM)  which can be represented by:
where m=1.31 and n=1.19.
TAM describes the attenuation relationship between shadows and their non-shadow backgrounds in the three color channels, and this relationship can be used for shadow detection. The TAM-based subtraction image (hereafter, TAM image) is obtained by subtracting the minimum-attenuation channel from the maximum-attenuation one. Based on the TAM image, a multi-step shadow detection algorithm was proposed in our previous study. Its main steps are:
1. Segment the original image and calculate TAM in each segmented sub-region.
2. Binarize the TAM image using the mean value over each sub-region as the threshold, which yields the initial shadows.
3. Verify and refine the initial shadows using the mean values of the three color channels in each sub-region as thresholds, which yields more detailed and accurate results.
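The mean-value binarization used to obtain the initial shadows can be sketched as follows. This is a minimal illustration, not the original implementation: the function name `initial_shadow_mask` and the integer label map `labels` (one label per segmented sub-region) are hypothetical, and the sketch assumes that shadow pixels are the ones falling below the regional mean, since shadows appear dark in TAM images. The segmentation step and the three-channel refinement step are not covered.

```python
import numpy as np

def initial_shadow_mask(tam, labels):
    """Binarize a TAM image region by region, using each region's mean as threshold.

    tam    : 2-D array, the TAM image.
    labels : 2-D integer array of the same shape (hypothetical segmentation output).
    """
    tam = np.asarray(tam, dtype=float)
    labels = np.asarray(labels)
    mask = np.zeros(tam.shape, dtype=bool)
    for region in np.unique(labels):
        inside = labels == region
        # Assumption: shadows are dark in the TAM image, so below-mean pixels
        # of each sub-region are marked as initial shadow.
        mask[inside] = tam[inside] < tam[inside].mean()
    return mask
```

Because the threshold is recomputed per sub-region, the quality of the result depends directly on the quality of `labels`.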
In general, the previous method is automatic and works on single still images, even those of complex scenes. However, it leaves two problems unsolved.
1. It needs segmentation. Although the method is not sensitive to small segmentation errors, it is not easy to obtain a satisfactory segmentation, i.e., one in which each shadow and its non-shadow background fall into the same region. For some images, serious segmentation errors may lead to poor shadow detection results.
2. It uses four simple mean values as thresholds in its two key steps (steps 2 and 3). One threshold is used for the initial shadow segmentation, and three thresholds are used to obtain accurate boundaries and details. The thresholds sometimes have a noticeable influence on the final results, i.e., simple thresholds are insufficient for some images.
In this article, we address both problems: we combine TAM and intensity information to avoid the segmentation step, and we derive a single threshold to replace the previous four. The proposed method is simpler and achieves similar or better results.
2 Previous studies
Shadows, a common phenomenon in most outdoor scenes, have far-reaching effects in computer vision and pattern recognition. They cause difficulties for applications such as segmentation, tracking, retrieval, and recognition. On the other hand, shadows in an image also provide useful information about the scene: they provide cues about the location of the sun as well as the shape and geometry of the occluder. Overall, dealing with shadows is an important and challenging task in computer vision and pattern recognition.
The most straightforward feature of a shadow is that it darkens the surface it is cast on, and this feature is adopted by some methods directly [2, 3] or indirectly [4, 5]. Many methods assume that shadows mainly change luminance and change chrominance much less. For example, one method assumes that the hue and saturation components change only within a certain limit in HSV space, and another applies multiple cues, including color, luminance, and texture, to detect moving shadows. Another commonly used feature for shadow detection is the intrinsic feature. Intrinsic-feature methods locate shadows by comparing the intrinsic image with the original one. Salvador et al. employed the c1c2c3 features to derive intrinsic images. Finlayson et al. developed a method to generate a 1D illumination-invariant image by finding a special direction in a 2D chromaticity feature space. Tian and Tang proposed a method to generate an illumination-invariant image by using the linearity between paired shadow and non-shadow regions. The intrinsic image is useful for shadow detection. However, it cannot completely eliminate the effect of illumination and is therefore mostly used in simple scenes.
Most shadow detection methods focus on detecting moving shadows. Moving shadow detection methods can employ the frame difference technique to locate moving objects and their moving shadows. The problem of shadow detection then becomes one of differentiating the moving objects from the moving shadows. Prati et al. provided a good review of shadow detection methods for video sequences. To adapt to background changes, learning approaches have proven useful. Huang and Chen employed a Gaussian mixture model to learn the color features and to model the background appearance variations under cast shadows. Brisson and Zaccarin presented an unsupervised kernel-based approach to estimate the cast shadow direction. Siala et al. described a moving shadow detection algorithm trained on manually segmented shadow regions. Joshi and Papanikolopoulos used an SVM and a co-training technique to detect shadows. Compared with static shadow detection methods, moving shadow detection methods can employ powerful background subtraction techniques. Because such techniques require an image sequence, the majority of moving shadow detection methods cannot be directly used to detect static shadows in single images.
Although moving shadow detection has made great progress, detecting shadows from a single image remains a difficult problem. Wu and Tang used a Bayesian approach to extract shadows from a single image, but it requires user intervention as input. Panagopoulos et al. used the Fisher distribution to model shadows, but this approach needs 3D geometry information. As a special application of single-image shadow detection, the studies in [3, 18, 19] focus on detecting shadows in remote sensing images. Lalonde et al. proposed a learning approach that trains a decision tree classifier on a set of shadow-sensitive features to detect ground shadows in consumer-grade photographs. Guo et al. proposed a learning-based shadow detection method for single images that uses paired regions (shadow and non-shadow). Learning methods can achieve good performance if the parameters are trained well. However, they fail when the test image is vastly different from the images in the training set. In our previous study, we proposed the TAM-based shadow detection algorithm. The algorithm is automatic and simple, but it depends to some extent on a prior segmentation and on four simply chosen thresholds. The improved algorithm described in Section 3 addresses these two problems.
3 Method description
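To obtain the TAM image, we first calculate the mean value of each color channel of the original image F; for the R channel,
$$\bar{F}_R = \frac{1}{M}\sum_{k=1}^{M} F_R(k),$$
and likewise for the G and B channels, where $F_R(k)$ denotes the k-th pixel of image F in the R channel, and M is the number of pixels.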
In Figure 3, tricolor attenuation order for the first original image is and for the second one is , therefore the corresponding TAM images are formed by and , respectively. Shadows are dark in TAM images, which provide strong information for shadow detection. However, the TAM-based channel subtraction may darken not only shadows but also some other objects. Take the second TAM image of Figure 3 as an example: it is formed by subtracting the blue channel from the red channel, so not only the shadows but also some blue objects (e.g., the flowerpot) become dark. The flowerpot may be falsely classified as a shadow after binarization. TAM assumes that a shadow and its non-shadow background share identical reflectance, which is why our previous study requires a prior segmentation to ensure that shadows are detected on regions of uniform reflectance. Additionally, the subtraction smooths pixel values because of the high correlation among the R, G, and B components. The smoothing may cause details to be lost in the detection results. The first image of Figure 4 demonstrates the false detections and missing details that occur if only TAM (without segmentation) is used to detect shadows.
As mentioned above, although TAM provides information for shadow detection, it may suffer from false detections and missing details. These problems arise because luminance information is lost during the channel subtraction. Fortunately, the lost information can be compensated for by the intensity (grayscale) image. The problem then becomes how to combine the intensity image with the TAM image. In the following, we present a method that addresses this problem and, at the same time, derives a threshold for shadow segmentation.
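The channel subtraction that produces a TAM image can be sketched as below. This is a hedged illustration rather than the authors' code. In particular, the form of the attenuation vector used in `attenuation_vector`, namely [m·mean_R/mean_B, n·mean_G/mean_B, 1] with m = 1.31 and n = 1.19, is an assumption about how TAM is evaluated from the channel means; all function names are hypothetical. An attenuation vector obtained in any other way can be passed to `tam_image` directly.

```python
import numpy as np

M_COEF, N_COEF = 1.31, 1.19  # m and n as given in the text

def channel_means(img):
    """Mean of each color channel over all pixels; img has shape (H, W, 3), RGB order."""
    return np.asarray(img, dtype=float).reshape(-1, 3).mean(axis=0)

def attenuation_vector(img):
    """ASSUMED TAM form: [m*mean_R/mean_B, n*mean_G/mean_B, 1].

    Requires a non-zero blue-channel mean.
    """
    mean_r, mean_g, mean_b = channel_means(img)
    return np.array([M_COEF * mean_r / mean_b, N_COEF * mean_g / mean_b, 1.0])

def tam_image(img, attenuation=None):
    """Subtract the minimum-attenuation channel from the maximum-attenuation one."""
    img = np.asarray(img, dtype=float)
    if attenuation is None:
        attenuation = attenuation_vector(img)
    hi = int(np.argmax(attenuation))  # channel attenuated most in shadow
    lo = int(np.argmin(attenuation))  # channel attenuated least in shadow
    return img[..., hi] - img[..., lo]
```

With an attenuation order in which red is attenuated most and blue least, `tam_image` returns the red channel minus the blue channel, so blue objects come out dark along with the shadows, as with the flowerpot.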
The combined image Z is obtained by combining the TAM image X with the intensity image Y as follows:
where α is the weight coefficient. We define the objective function as:
where S(T) denotes the shadow determined by a threshold T.
denotes the mean value of shadow regions in Z; denotes the mean value of the non-shadow regions in Z. The subtraction of them can measure the difference between the shadow regions and the non-shadow regions (the subtraction is always positive, which will be proved in Appendix). The difference between them is weighted by a quadratic function G(T), defined as follows, to avoid too high or too low T.
in which u is the mean value of image Z. The best T maximizes the weighted difference between the mean value of the shadow regions and that of the non-shadow regions.
Given T, S can be determined by using Equation (6).
Denoting and , the weight α is defined as:
κ and η measure the contributions of X and Y to the threshold. The exponent heightens the difference between the contributions and ensures α > 1, for the following two reasons.
1. The range of variation of X is smaller than that of Y (as stated above, the TAM-based subtraction smooths pixel values).
2. Shadow detection relies mainly on X; Y is mainly used to obtain a precise result (see Figure 4).
α is initialized with . Repeating (4)-(9) to update T and α until .
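The search for a threshold that maximizes a weighted difference of class means can be sketched as follows. Several pieces are assumptions made purely for illustration: the linear combination Z = αX + Y in `combine`, the quadratic `stand_in_weight` used in place of G(T), the rule that pixels below T are shadow, and all function names. The update of α is not sketched.

```python
import numpy as np

def combine(x, y, alpha):
    """ASSUMED linear combination Z = alpha * X + Y of TAM image X and intensity Y."""
    return alpha * np.asarray(x, dtype=float) + np.asarray(y, dtype=float)

def stand_in_weight(t, u):
    """HYPOTHETICAL quadratic weight: 1 at t == u, 0 at t == 0 and t == 2u.

    It only mimics the stated purpose of G(T), i.e. penalizing thresholds
    that are too high or too low relative to the image mean u.
    """
    if u <= 0:
        return 1.0
    return max(0.0, 1.0 - ((t - u) / u) ** 2)

def best_threshold(z, weight=stand_in_weight):
    """Return the T maximizing weight(T, u) * (mean(non-shadow) - mean(shadow)).

    Pixels with z < T are treated as shadow (assumption). Returns None for
    a constant image, where no threshold separates two non-empty classes.
    """
    z = np.asarray(z, dtype=float)
    u = z.mean()
    best_t, best_score = None, -np.inf
    for t in np.unique(z)[1:]:  # skip the minimum so both classes are non-empty
        shadow = z < t
        score = weight(t, u) * (z[~shadow].mean() - z[shadow].mean())
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

The exhaustive search over distinct pixel values is chosen for clarity; for 8-bit images it evaluates at most 255 candidate thresholds.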
4 Experimental results
Figure 5 compares the results of the algorithm proposed in this article with those of the state-of-the-art methods in [1, 20, 21]. The original image in the first row is a simple image of a person's shadow, half on the grass and half on the road. The method presented in this article achieves a result quite similar to that by  and better than both results by [20, 21].
The original image in the second row of Figure 5 is an aerial image with complex content. Most attached and cast shadows are detected by the proposed method. Its weakness is that some trees and some solar panels in the bottom left of the image are incorrectly classified as shadows compared with the result given in . The result by the study  misses some shadows of the house and the tree in the left part of the image, while the result by the study  misses most shadows.
The original image in the third row of Figure 5 is a forest image with complicated texture, taken from a height of 100 m, with some small, sparse cast shadows. They are detected by the algorithm proposed in this article. In particular, the black characters marking the date and time at the top of the image are not falsely classified as shadows, an error that intensity-based shadow detection methods can hardly avoid. The result by the study  is over-detected and has false alarms; the result by the study  misses many shadows, and that of  misses some at the bottom of the image.
The original image in the fourth row of Figure 5 contains two cast shadows on the ground and one attached shadow on the leg. All of them are detected by the algorithm proposed in this article. The result by the study  misses some details; the result by the study  misclassifies the brighter region at the upper-right corner as a non-shadow one; the result by the study  misses most of the shadow of the tree.
Compared with method , the method proposed here does not need segmentation and requires only one threshold. Compared with [20, 21], the proposed method does not need training. These advantages may make the proposed method easier to use.
More results of the method are listed in Figure 6. These images contain various shadows: attached shadows and cast shadows on ground, road, grass, etc. The results show that shadows can be detected correctly.
Because shadow detection is usually a preprocessing step in practical applications, fast computation is important. The running times of the four methods are listed in Table 2. The comparison shows that our method is faster than the other three methods. The experiment was conducted on a computer with an Intel (R) Core™ 2 Q8400 2.66 GHz CPU and 2 GB of RAM. The programs were compiled with Matlab R2010b.
5 Conclusions
In this article, we propose a shadow detection method based on combining the TAM image and the intensity image. In our previous study, TAM information and intensity information were used separately: shadow detection relied only on TAM information and needed a rough segmentation as a preprocessing step, while intensity information was used simply to improve the boundary accuracy and details of the detected shadows. Their effective combination in this article frees the new method from segmentation. Furthermore, the new method requires only one threshold to detect shadows and handle the details simultaneously. These advantages make the proposed method easier to use and more robust in applications.
Appendix
Given an image g∈R2, denoting as the mean value of the pixels whose values are smaller than T, as the mean value of the pixels whose values are larger than T, and as the mean value of the whole image, we have
Denoting ni as the number of pixels at level i, we have
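The argument can be sketched as follows. The symbols $\mu_1(T)$, $\mu_2(T)$, $\mu$, $N_1$, $N_2$ and the gray-level range $0,\dots,L-1$ are assumptions introduced here for illustration, as is the convention that pixels equal to T are assigned to the upper class; both classes are assumed to be non-empty.

```latex
\begin{align*}
N_1 &= \sum_{i<T} n_i, \qquad N_2 = \sum_{i\ge T} n_i, \qquad N = N_1 + N_2,\\
\mu_1(T) &= \frac{1}{N_1}\sum_{i<T} i\,n_i < T, \qquad
\mu_2(T) = \frac{1}{N_2}\sum_{i\ge T} i\,n_i \ge T,\\
\mu &= \frac{1}{N}\sum_{i=0}^{L-1} i\,n_i
     = \frac{N_1}{N}\,\mu_1(T) + \frac{N_2}{N}\,\mu_2(T).
\end{align*}
```

Hence $\mu_1(T) < \mu_2(T)$, so the difference between the two class means is positive; and because $\mu$ is a convex combination of the two class means with positive weights, $\mu_1(T) < \mu < \mu_2(T)$.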
References
Barnard K, Finlayson G: Shadow identification using colour ratios. In Proceedings of the IS&T/SID Eighth Color Imaging Conference: Color Science, Systems and Applications. Volume 8. Scottsdale, Arizona, USA; 2000:97-101.
Tian J, Tang Y: Linearity of each channel pixel values from a surface in and out of shadows and its applications. In IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, Colorado, USA; 2011:985-992.
Prati A, Cucchiara R, Mikic I, Trivedi MM: Analysis and detection of shadows in video streams: a comparative evaluation. In IEEE Conference on Computer Vision and Pattern Recognition. Volume 2. Kauai, Hawaii, USA; 2001:571-576.
Siala K, Chakchouk M, Besbes O, Chaieb F: Moving shadow detection with support vector domain description in the color ratios space. In International Conference on Pattern Recognition. Volume 4. Cambridge, UK; 2004:384-387.
Panagopoulos A, Samaras D, Paragios N: Robust shadow and illumination estimation using a mixture model. In IEEE Conference on Computer Vision and Pattern Recognition. Miami, Florida, USA; 2009:651-658.
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.