Target detection in complex scene of SAR image based on existence probability
EURASIP Journal on Advances in Signal Processing volume 2016, Article number: 114 (2016)
This study proposes a target detection approach based on the target existence probability in complex scenes of a synthetic aperture radar image. Superpixels are the basic unit throughout the approach and are assigned to classified scenes using a texture feature. The original and predicted saliency depth values for each scene are derived from the self-information of all the labelled superpixels in that scene. Thereafter, the target existence probability is estimated by comparing the two saliency depth values. Lastly, an improved visual attention algorithm, in which the scenes of the saliency map are given different weights related to their existence probabilities, derives the target detection result. This algorithm enhances the attention paid to the scene that contains the target. Hence, the proposed approach is self-adapting for complex scenes and is also suitable for different detection missions (e.g. vehicle, ship or aircraft detection in the related scenes of road, harbour or airport, respectively). Experimental results on various data show the effectiveness of the proposed method.
Target detection in complex scenes, such as urban areas, airports or harbours, is a challenge in synthetic aperture radar (SAR) image interpretation. Compared with a single scene, such as grassland, farmland or sea, conventional methods perform worse in complex scenes. In these scenes, the clutter produced by the background may resemble the targets and be detected as false alarms. For example, the strong reflections of urban buildings considerably affect vehicle detection. Moreover, the echoes from various backgrounds overlap and induce strong coherent speckle.
To date, many algorithms detect a specific target in a complex scene by first extracting the region of interest (ROI). For these specific target detections (e.g. ship, vehicle and aircraft detections) in a complex scene, the algorithms obtain the ROIs (e.g. ocean, road and airport) by using a preprocessing module [2–10]. In [2–5], ROIs are obtained by combining image data with a region mask derived from geographic information system (GIS) data. However, the performance of the GIS data-based algorithms is profoundly influenced by the accuracy of the GIS information. These data often suffer from systematic or random positional errors; hence, the region axes have to be realigned to the image. Wang et al. used the Markov random field (MRF) algorithm to extract the ocean scene and detect ships. Another method eliminates land areas by filtering the divided sub-images based on the rate of high-intensity pixels. In a further approach, the SAR image is segmented by the k-means algorithm into N sub-images or regions that comprise high, median or low backscatter, and the target pixels are detected using separate thresholds for the different regions. For these segmentation- or classification-based algorithms, the ROI and its characteristics are defined by human experience. That is, the type of target and the possible scenes where the targets exist are predefined before detection. Consequently, each algorithm suits only one type of target; hence, the range of application and the efficiency of the algorithm are unsatisfactory.
This study proposes an existence probability-based approach for SAR image target detection in a complex scene. The existence probability uses the saliency depth (SD) value to represent the probability that targets exist in a scene. Prior to the estimation of these probabilities, a preprocessing module obtains the scenes and assigns scene labels to the superpixels. Subsequently, an improved visual attention detection algorithm produces the detection result. The proposed algorithm is self-adapting for complex scenes and for different types of target in SAR image target detection. The results of the simulated and real SAR data experiments verify the performance of the proposed algorithm.
Figure 1 shows the flowchart of the proposed algorithm. The main framework of this approach comprises three modules. The first module is preprocessing, which consists of superpixel generation and texture-based classification. In this module, a complex scene SAR image is represented by several scenes comprising superpixels; the superpixels in each scene are assigned the same label. In the second module (i.e. the estimation module), the target existence probability is measured by the SD value of each scene. The last module is the detection module, which is an improved visual attention detection algorithm. The saliency map of the visual attention model is given different weights based on the existence probability, which improves the detection.
2.1 Preprocessing module
The estimation and detection modules operate on superpixels. Therefore, the superpixels are generated in the preprocessing module. Compared with a single pixel, a superpixel provides richer statistical characteristics, which are the basis for calculating the saliency depth value. Furthermore, the observed object in the visual attention model is changed from the pixel to the superpixel; hence, isolated salient pixels are absorbed into non-salient superpixels. In the current study, the superpixels are generated by the simple linear iterative clustering (SLIC) method because it requires limited computational effort and its superpixels adhere well to boundaries.
In the preprocessing module, each scene that has an approximately uniform background is extracted using the classification algorithm. A dense texture feature extracted through morphological operations, which has been proven suitable for remote-sensing image classification, is adopted. Six morphological operations are used in the extraction of features: opening, closing, opening and closing by reconstruction, and opening and closing by top-hat. Meanwhile, the structural elements in the morphological operations comprise square or diamond shapes, and the scales are set at 3 or 7, respectively.
To match the boundaries of the scenes to the superpixels in the superpixel classification result, each superpixel is assigned the majority scene label of the pixels it contains. Figure 2 shows the results of the superpixel generation and superpixel classification.
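The majority-label assignment described above can be sketched in a few lines. This is a minimal illustration, assuming the image has been flattened so that each pixel carries a superpixel index and a pixel-level scene label; the function name `label_superpixels` and the toy data are hypothetical and not part of the original method.

```python
from collections import Counter

def label_superpixels(superpixel_ids, pixel_labels):
    """Assign each superpixel the most frequent scene label among its pixels.

    Both arguments are flat sequences of equal length (one entry per pixel).
    Ties are resolved in favour of the label encountered first.
    """
    votes = {}
    for sp_id, label in zip(superpixel_ids, pixel_labels):
        votes.setdefault(sp_id, Counter())[label] += 1
    return {sp_id: counts.most_common(1)[0][0] for sp_id, counts in votes.items()}

# Toy example: two superpixels whose pixels were classified individually.
superpixel_ids = [0, 0, 0, 1, 1, 1, 1]
pixel_labels = ["grass", "grass", "road", "road", "road", "tree", "road"]
superpixel_scene = label_superpixels(superpixel_ids, pixel_labels)
```

After this step, every pixel of a superpixel inherits a single scene label, so scene boundaries coincide with superpixel boundaries.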
2.2 Estimation module
The existence probability of the target is estimated through the SD value of each classified scene, which expresses the possibility that a single scene contains targets. After the outlier information in the scene is extracted, the estimation measures the original and predicted saliencies. The original SD value describes the scene with its potential targets, whereas the predicted SD value, derived after excluding the outliers, describes the scene without targets. Thereafter, a comparison between the two SD values yields the existence probability of the targets.
2.2.1 Self-information of the superpixel
In a relatively homogeneous scene, the intensity distribution of the entire scene is similar to that of most of the superpixels in the scene. In addition, for an SAR image that needs detection, the number of target pixels is limited and the distribution of the targets differs from that of the scene. Therefore, the target superpixels are discriminated by measuring the similarity between the distributions of the scene and the superpixels.
Self-information, which represents the information value of a random event in information theory, is used to measure the similarity. For a random event x with probability p(x), the self-information function is as follows:
$$I(x) = -\log p(x) \qquad (1)$$
The more probable a random event is, the smaller its information value. For a scene in a SAR image, superpixels contain pixels of different grey levels, which can be regarded as different random events. Therefore, the information value is small when a superpixel is a background superpixel, because the probability that a superpixel contains background pixels is large. Conversely, the information value is large when a superpixel is a target superpixel.
Variable sp is the superpixel, variable s is the scene and the “|” symbol denotes conditioning. Thus, the similarity S(sp,s) between the superpixel and the scene is inversely proportional to the self-information I(sp|s).
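A small numerical sketch may help to fix the idea that rare events carry more information. It assumes the standard definition I(x) = −log p(x); the logarithm base and the two example probabilities are illustrative choices, not values from the experiments.

```python
import math

def self_information(p, base=2.0):
    """Self-information I(x) = -log(p(x)); rarer events carry more information."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log(p) / math.log(base)

# A grey level that is common in the scene (background-like) versus a rare one
# (target-like). The probabilities are made up for illustration.
background_info = self_information(0.5)
target_info = self_information(0.001)
```

A superpixel whose content is unlikely under the scene distribution therefore receives a large I(sp|s), which corresponds to a low similarity S(sp,s).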
The speckle noise of SAR images is described by a multiplicative model as in (3):
The first and second factors indicate the texture and Gaussian distribution components, respectively. The texture in one superpixel is assumed to be homogeneous because of the SLIC algorithm. Therefore, the texture component in (3) can be regarded as constant, and the pixels in the superpixel can be considered independent and identically distributed (iid).
In a complex scene SAR image I that contains several types of scene $s_i$, the conditional probability of each superpixel for each scene is calculated from the probabilities of the pixels it contains. In addition, the conditional probability of each pixel can be represented by the histogram of the scene, that is, $P(SP_{ij}(r) \mid s_i) = P(l \mid s_i;\ SP_{ij}(r) = l)$. The variable $SP_{ij}$ is the j-th superpixel in scene $s_i$, and $SP_{ij}(r)$ is the r-th pixel in superpixel $SP_{ij}$. Variable l is the intensity value. Therefore, pixels with the same intensity have equal probability, the number of pixels with intensity l equals $P(l \mid SP_{ij}) \cdot R$, and R is the number of pixels in $SP_{ij}$. After the equal probabilities are accumulated, the total conditional probability of the superpixel is transformed as in (4).
Variable $P(SP_{ij})$ is the probability of superpixel $SP_{ij}$. This probability is equal to the accumulation of the conditional probabilities of all pixels $SP_{ij}(r)$ in the superpixel.
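The accumulation over grey levels can be illustrated with a short sketch. It assumes that the pixels of a superpixel are independent, so that the conditional probability of the superpixel is the product of the scene-histogram probabilities of its pixels and its self-information is a sum of −log P(l | s_i) terms weighted by the pixel count at each grey level. The function names, the smoothing constant `eps` and the toy grey levels are assumptions made for illustration only.

```python
import math
from collections import Counter

def scene_histogram(scene_pixels, levels=256, eps=1e-6):
    """Normalised grey-level histogram P(l | s) of a scene.

    A small constant eps is added to every bin (an assumption of this sketch)
    so that grey levels unseen in the scene do not yield log(0).
    """
    counts = Counter(scene_pixels)
    total = len(scene_pixels) + eps * levels
    return [(counts.get(level, 0) + eps) / total for level in range(levels)]

def superpixel_self_information(sp_pixels, scene_hist):
    """I(SP | s) = -sum over grey levels of n_l * log P(l | s).

    n_l is the number of superpixel pixels with grey level l, which equals
    P(l | SP) * R in the notation of the text.
    """
    counts = Counter(sp_pixels)
    return -sum(n * math.log(scene_hist[level]) for level, n in counts.items())

# Toy scene: mostly dark background pixels plus a few bright target pixels.
scene_pixels = [10] * 50 + [11] * 30 + [12] * 16 + [200] * 2 + [205] + [210]
scene_hist = scene_histogram(scene_pixels)

background_info = superpixel_self_information([10, 11, 10, 12], scene_hist)
target_info = superpixel_self_information([200, 210, 205, 200], scene_hist)
```

In this toy scene the bright superpixel obtains a much larger self-information than the background superpixel, which is the property the outlier detection relies on. The value is not normalised by the superpixel size R here; whether such a normalisation is applied is left open.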
An outlier detection method is used to filter out the potential target superpixels, which have large self-information values. Outlier detection is commonly used in data mining to eliminate noise or to detect meaningful knowledge. In this study, the target superpixels are the meaningful knowledge, and they are identified as follows:
Variable t is a constant that is experimentally set within [2, 3], so that the number of outliers is sufficient to represent the salient level of $s_i$; variables $\mu_i$ and $\delta_i$ are the mean and variance, respectively, calculated over the self-information values of all the superpixels in $s_i$.
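The thresholding step can be sketched as follows, assuming that a superpixel is declared an outlier when its self-information exceeds t · δ + μ. Following the text, δ is computed here as the variance; using the standard deviation instead is a common alternative and would change the threshold. The function names and the toy values are hypothetical.

```python
import statistics

def outlier_threshold(info_values, t=2.5):
    """Threshold Th = t * delta + mu over the self-information values of a scene.

    delta is the (population) variance, as stated in the text; t = 2.5 is an
    arbitrary choice inside the reported range [2, 3].
    """
    mu = statistics.fmean(info_values)
    delta = statistics.pvariance(info_values)
    return t * delta + mu

def find_outliers(info_values, t=2.5):
    """Return the indices of superpixels above the threshold, and the threshold."""
    th = outlier_threshold(info_values, t)
    indices = [k for k, value in enumerate(info_values) if value > th]
    return indices, th

# Toy scene: 19 background superpixels and one superpixel with high self-information.
info_values = [1.0] * 19 + [3.0]
outliers, th = find_outliers(info_values)
```

For this toy scene the mean is 1.1 and the variance is 0.19, so the threshold is 1.575 and only the last superpixel is flagged.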
2.2.2 SD-value-based existence probability
The SD value of the scene expresses the difference between the potential targets and the entire scene. It is calculated as the third-order moment of the distances between the derived outliers and the threshold, because these distances come from the long tail of the distribution, which often includes the target superpixels. Superpixels far beyond the threshold are strongly enhanced, whereas superpixels near the threshold yield low values.
Variable q is the number of detected outliers, variable i indexes the scene type and the threshold is $Th_i = t \cdot \delta_i + \mu_i$. The SAR image shown in Fig. 2a is tested, and the scene types are i = 1, 2, 3, 4, which correspond to grass, road, tree and shadow, respectively. Each scene is a collection of superpixels that have the same classification label. The SD value of a scene is based on the self-information of these superpixels. The distributions in each scene are shown in Fig. 3. The SD values are calculated as follows: $Dep_1 \approx 105.1536$, $Dep_2 \approx 26.2686$, $Dep_3 \approx 12.9632$ and $Dep_4 \approx 11.0291$. Two targets are located in the grass scene; hence, the SD value of grass is the largest amongst the four scenes. The other scenes have close values because of different thresholds.
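One possible reading of the SD value can be sketched as below. It assumes that the value is the mean cubed distance of the q outliers to the threshold; the exact normalisation is an assumption of this sketch, as are the function name and the toy numbers.

```python
def saliency_depth(info_values, threshold):
    """Mean cubed excess of the outliers over the threshold (assumed form).

    Superpixels at or below the threshold are not outliers and do not
    contribute. A scene without outliers gets a depth of 0.
    """
    excess = [value - threshold for value in info_values if value > threshold]
    if not excess:
        return 0.0
    return sum(e ** 3 for e in excess) / len(excess)

# Toy scene with two outliers at distances 3.0 and 1.0 above the threshold.
depth = saliency_depth([1.0, 1.2, 0.9, 5.0, 3.0], threshold=2.0)
```

The cubic power makes a superpixel far beyond the threshold dominate the result (27 against 1 in this example), which matches the behaviour described for the long tail of the distribution.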
The original SD value is obtained by (7). The targets mostly belong to outliers; thus, the SD of the scene that contains the targets is generally decreased after excluding the outliers because the existence of targets enhances the SD value of the scene. Therefore, a predicted SD value is recalculated after excluding the outliers.
Variable $D_i$ is the new SD value after excluding the outliers in scene $s_i$, and α is a constant predictive coefficient. The difference between the original and predicted SD values expresses the existence probability of the targets in a scene. When the original value clearly exceeds the predicted value, the scene is considered to contain targets. Lastly, the scenes that lack targets are censored, whereas the ratios between the differences of the scenes with targets are assigned as the weights of the saliency map in the subsequent detection module.
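The comparison and weighting step can be sketched as follows. The sketch assumes that the predicted SD value is α · D_i and that the weights are the positive differences normalised to sum to one; both the form of the prediction and the value α = 1.5 are assumptions for illustration, and the SD values are toy numbers.

```python
def existence_weights(original_sd, recomputed_sd, alpha=1.5):
    """Censor scenes without targets and weight the remaining scenes.

    original_sd:   scene -> SD value computed with the outliers included
    recomputed_sd: scene -> SD value D_i recomputed after excluding the outliers
    A scene is kept only if its original SD exceeds the prediction alpha * D_i.
    """
    diffs = {}
    for scene, dep in original_sd.items():
        predicted = alpha * recomputed_sd[scene]
        diffs[scene] = max(dep - predicted, 0.0)  # censored scenes get 0
    total = sum(diffs.values())
    if total == 0.0:
        return {scene: 0.0 for scene in diffs}
    return {scene: diff / total for scene, diff in diffs.items()}

# Toy values: targets are assumed in the bush and road scenes only.
original = {"bush": 60.0, "road": 40.0, "tree": 10.0}
recomputed = {"bush": 20.0, "road": 20.0, "tree": 10.0}
weights = existence_weights(original, recomputed)
```

Here the tree scene is censored because its original value (10) lies below the prediction (15), and the remaining weight is shared between bush and road in the ratio 3:1.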
2.3 Detection module
Following the idea of saliency in the estimation module, an improved superpixel-based visual attention model is proposed for SAR image target detection. The proposed model is based on the Itti visual attention model, which is commonly used for salient object detection in optical images. The structure of the proposed model is shown in the red box of Fig. 1.
Compared with the structure of the Itti model, the proposed model includes several improvements for SAR image detection. The first improvement is a singular value decomposition (SVD)-based pyramid instead of the traditional Gaussian pyramid. The Gaussian pyramid is used to simulate a multi-scale structure that often comprises nine images with different spatial resolutions. In the multi-scale space, pixels whose characteristics vary significantly with resolution are identified as salient pixels. However, this pyramid is unsuitable for SAR image target detection because the targets in a SAR image are weaker and smaller than the salient objects in an optical image; they become blurred and disappear rapidly when the resolution decreases. Therefore, an SVD-based pyramid is used to extract the principal components of the image. The SAR image I with a size of m × n can be decomposed as follows:
$$I = U \Sigma V^{T}$$
where U and V are orthogonal matrices of sizes m × m and n × n, respectively.
The variable Σ is a diagonal matrix that comprises the singular values and zeros. By retaining different numbers of singular values, the low-rank approximate image I(p), formed from the corresponding matrix Σ(p), is computed as follows:
The variable $k_p$ is the number of singular values retained in the matrix Σ(p). The initial values of the variables are set as follows: I(1) = I, Σ(1) = Σ and $k_1 = \mathrm{Rank}(I)$. The variable f is the degree of the low-rank approximation, which is experimentally set within [0.3, 0.7].
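The construction of such a pyramid can be sketched with NumPy. The sketch assumes that each layer keeps a fraction f of the singular values of the previous layer, that is, $k_{p+1} = \lfloor f \cdot k_p \rfloor$; this rank schedule, the number of layers and the random test image are assumptions for illustration.

```python
import numpy as np

def svd_pyramid(image, f=0.5, layers=4):
    """Build low-rank approximations I(1), I(2), ... of an image.

    Layer 1 keeps k_1 = rank(I) singular values and therefore reproduces the
    image; each further layer keeps floor(f * k_p) singular values (assumed).
    Returns the list of layers and the list of retained ranks.
    """
    u, s, vt = np.linalg.svd(image, full_matrices=False)
    k = int(np.linalg.matrix_rank(image))
    pyramid, ranks = [], []
    for _ in range(layers):
        # Scale the first k left singular vectors by their singular values.
        approx = (u[:, :k] * s[:k]) @ vt[:k, :]
        pyramid.append(approx)
        ranks.append(k)
        k = max(1, int(np.floor(f * k)))
    return pyramid, ranks

# Synthetic speckle-like background with one small bright block as a target.
rng = np.random.default_rng(0)
image = rng.random((32, 24))
image[10:14, 8:12] += 5.0
pyramid, ranks = svd_pyramid(image, f=0.5, layers=4)
```

Unlike a Gaussian pyramid, every layer keeps the full m × n size, so a small bright target is not blurred away by downsampling; only the less dominant image components are discarded from layer to layer.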
The second improvement is the important-information-enhancing component. In the Itti model, after the Gaussian pyramid, a centre-surround difference component based on the subtraction between images of different resolutions extracts the difference, which gives the value in the saliency map. However, after the image subtraction, the pixels around the targets are too weak to be detected. In the SVD-based pyramid, the image information decreases as the pyramid layer increases, whereas the principal component (e.g. the target region) is retained. The important information is therefore enhanced by adding different layers of the SVD pyramid.
The third improvement is that the proposed model is based on the superpixel. In the pixel-level Itti model, the detection results are several salient areas, each centred on a salient pixel. This causes several defects for SAR images. For example, isolated salient pixels are determined to be targets because the heavy fluctuation in the SAR image causes outstandingly bright pixels in local regions. Moreover, the extent of the salient area may not match the targets. To eliminate the isolated pixels, the superpixel is defined as the basic unit of the proposed model. The saliency of the superpixel $S_{ij}$ is computed by averaging the saliency of each pixel $S_{ij}(r)$ in $SP_{ij}$.
In (13), the saliency of an isolated pixel is averaged into its surroundings, whereas the target superpixels and scene superpixels are homogeneous and are only slightly affected.
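The averaging step can be illustrated with a minimal sketch, assuming a flattened pixel-level saliency map and one superpixel index per pixel. The function name and the saliency values are hypothetical.

```python
from collections import defaultdict

def superpixel_saliency(pixel_saliency, superpixel_ids):
    """Average the pixel saliency inside each superpixel."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, sp_id in zip(pixel_saliency, superpixel_ids):
        sums[sp_id] += value
        counts[sp_id] += 1
    return {sp_id: sums[sp_id] / counts[sp_id] for sp_id in sums}

# Superpixel 0: background with a single isolated bright pixel.
# Superpixel 1: a homogeneous target region.
pixel_saliency = [0.1] * 7 + [0.9] + [0.9, 0.8, 1.0, 0.9]
superpixel_ids = [0] * 8 + [1] * 4
saliency = superpixel_saliency(pixel_saliency, superpixel_ids)
```

The isolated bright pixel raises the background superpixel only from 0.1 to 0.2, whereas the homogeneous target superpixel keeps a saliency of about 0.9.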
Finally, the saliency map of the proposed model is weighted by the existence probability. The saliency of the scenes with a high existence probability is enhanced whilst that of the others is suppressed; thus, the weighted map reorders the priority of the visual attention detection. With the improved visual attention model, the detection module is suited to SAR image target detection.
3 Experimental results
To verify the performance of the proposed approach in complex scenes, several simulated SAR images are tested in the first experiment. The image size is 1170 × 1378; Fig. 4a shows the original amplitude image. In this image, five types of scenes are identified: grass, tree, shadow, bush and road scenes. Figure 4d, g, j, m have the same scenes as Fig. 4a. However, groups of tank targets are added in the grass, bush and road scenes and are marked by yellow dotted circles. The target groups include different types of tanks, namely BMP2, BTR70 and T72, which are taken from the MSTAR database.
Figure 4b shows the distributions of the self-information values of the five scenes without targets. Figure 4e, h, k, n show the distributions when targets are placed in the grass scene, the bush scene, the road scene and both the bush and road scenes, respectively; the self-information values of the targets are marked by red dotted circles in the four figures. By comparing Fig. 4e, h, k, n with Fig. 4b, we find that the self-information values of the targets are all larger than those of the backgrounds. Figure 4c shows the original and predicted SD values of all scenes. The colours of the predicted values are displayed in the legend alongside the figures, and the original values are all coloured cyan. In Fig. 4f, only the original SD value of the grass scene exceeds the predicted value, which agrees with the fact that the targets in Fig. 4d are added in the grass scene. A similar situation is shown in Fig. 4i, l, o. In particular, no target exists in the case of Fig. 4c; thus, all original values are lower than the predicted values. By contrast, Fig. 4o shows that two groups of targets appear in the bush and road scenes and that the corresponding original values are larger than the predictions. Hence, the existence of targets increases the SD value, and the module gives correct results regardless of whether targets are present and whether they are located in more than one scene.
Figure 5 shows the four situations with targets from Fig. 4 and the detection results of three algorithms: the AC–CFAR, visual attention and proposed approaches. The CFAR and visual attention approaches detect all targets, with detection rates higher than 0.95, although numerous false alarms occur in their results. In the CFAR results, false alarms generally appear in the border area between the tree and shadow scenes, and the false alarm rates are about $2.4 \times 10^{-3}$, $1.5 \times 10^{-3}$, $3.2 \times 10^{-3}$ and $3.9 \times 10^{-3}$ for Fig. 5b from top to bottom, respectively. This result is due to the large intensity contrast and the fuzzy edge between the two scenes, as well as to the sensitivity of the CFAR method to bright pixels on a dark background. In the visual attention results, false alarms occur in the crown area of the tree scene. This result is due to the high intensity of the crowns, which produces high saliency, and the false alarm rates are about $1.5 \times 10^{-2}$, $1.8 \times 10^{-2}$, $1.6 \times 10^{-2}$ and $1.9 \times 10^{-2}$ for Fig. 5c from top to bottom, respectively. However, the false alarms caused by the edges or strong intensity areas are filtered out by the proposed method, and the false alarm rates are reduced to less than $1 \times 10^{-3}$ (Fig. 5d). The edges or strong intensity areas have high saliency, whereas the scenes to which they belong have a lower SD value than the predicted one. Therefore, the existence probability of targets in these scenes is low; hence, these areas are rarely considered targets. In summary, the experiments show that the two reference detection methods have difficulty maintaining an acceptable false alarm rate in complex scenes, whereas the proposed approach is suitable for target detection in complex scenes of SAR images.
To evaluate the robustness of the proposed approach to noise, ten groups of simulated SAR images are used. Every group contains 1 original image and 24 images with different degrees of speckle noise. The noise intensity of the images is measured by the equivalent number of looks $M_{\mathrm{ENL}}$. Figure 6 shows an original image, a noised image and their detection results for three algorithms: the proposed algorithm, the CA–CFAR detector based on the Rayleigh distribution and the AC–CFAR detector based on the G0 distribution. In Fig. 6a, $M_{\mathrm{ENL}} \approx 2.9$, and in Fig. 6f, $M_{\mathrm{ENL}} \approx 1.5$. Figure 6b, g show the saliency depth values. With the increased noise, the original SD values of the grass scene that contains the targets are reduced but remain higher than the predicted SD values. The performance of the proposed approach remains similar, as shown in Fig. 6c, h, although some edge pixels of the targets are missed. In the results of the CA–CFAR method in Fig. 6d, i, the number of target pixels decreases significantly under the constant false alarm ratio condition. For the AC–CFAR approach, a comparison of Fig. 6e, j shows that numerous false alarm pixels emerge and several target pixels disappear.
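The equivalent number of looks can be estimated with a short sketch. It assumes the common estimator for intensity data, ENL = mean² / variance, evaluated over a homogeneous region; for amplitude images the estimator differs by a constant factor, and the pixel values below are made up for illustration.

```python
import statistics

def equivalent_number_of_looks(region_pixels):
    """Estimate the ENL of a homogeneous region as mean^2 / variance.

    A lower ENL indicates stronger speckle. The region is assumed to contain
    intensity values from a single, texture-free background.
    """
    mean = statistics.fmean(region_pixels)
    variance = statistics.pvariance(region_pixels)
    if variance == 0.0:
        raise ValueError("ENL is undefined for a constant region")
    return mean * mean / variance

# Two toy regions with the same mean but different fluctuation.
clean_enl = equivalent_number_of_looks([4.0, 6.0, 5.0, 5.0, 3.0, 7.0])
noisy_enl = equivalent_number_of_looks([1.0, 9.0, 5.0, 5.0, 2.0, 8.0])
```

The first region yields an ENL of 15 and the second an ENL of 3, so the second region would correspond to the more heavily speckled image.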
Figure 7 shows the detection rate (DR) and the false alarm rate (FAR) of these approaches. Because the noise becomes severe and most of the target information is corrupted, the performance of all algorithms in Fig. 7a degrades significantly when the ENL value is lower than 1.5. In detail, both the targets and the false alarms disappear in the results of the CA–CFAR method, whereas many false alarms appear in the results of the AC–CFAR method and the proposed approach. On the other hand, where the ENL is greater than 1.5 in Fig. 7, the DR of the CA–CFAR and AC–CFAR methods still degrades fastest in the noised images. Meanwhile, the DR of the proposed method is always the highest, and its FAR is consistently low. As a result, the proposed method can maintain a relatively high DR and a low FAR simultaneously, provided that the speckle noise does not become too strong.
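The two measures can be computed from binary masks as in the sketch below. It assumes pixel-level definitions, with DR as the fraction of target pixels that are detected and FAR as the fraction of background pixels that are falsely detected; other definitions are in use, and the masks are toy data.

```python
def detection_and_false_alarm_rates(detected, truth):
    """Return (DR, FAR) for two flat binary masks of equal length.

    DR  = detected target pixels / all target pixels
    FAR = falsely detected background pixels / all background pixels
    """
    true_positives = sum(1 for d, t in zip(detected, truth) if d and t)
    false_positives = sum(1 for d, t in zip(detected, truth) if d and not t)
    n_target = sum(1 for t in truth if t)
    n_background = len(truth) - n_target
    dr = true_positives / n_target if n_target else 0.0
    far = false_positives / n_background if n_background else 0.0
    return dr, far

# Toy masks: 4 target pixels and 8 background pixels.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
detected = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
dr, far = detection_and_false_alarm_rates(detected, truth)
```

In this example three of the four target pixels are found and one of the eight background pixels is a false alarm.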
To validate the broad application range of the proposed approach, several SAR images that contain an airport or harbour are tested. Figure 8 shows the detection results of the different approaches. The performance of aeroplane or vessel detection in the airport or harbour scene is consistent with that of vehicle detection in the land scene. All the targets are detected, and most of the false alarms produced by the other approaches, which are caused by edges or strong intensity areas, are suppressed by the proposed approach. Consequently, the proposed method deals effectively with target detection in complex scenes, and its application range is broad, covering, for example, vehicle detection on roads and aircraft detection in airports.
In this study, target detection in complex scene SAR images was addressed by estimating the existence probability of the target in each scene. First, a texture-based classification is used to obtain the scenes. Thereafter, the SD value based on information theory is used to estimate the existence probability. Lastly, an improved visual attention detection module is used to derive the detection result. The proposed method is a superpixel-based approach that exploits the rich statistical features provided by the superpixels in the estimation module and decreases the false alarm rate in the detection module. Through the target existence probability, the focus of visual attention is shifted to the scenes that are highly likely to contain target superpixels. With these benefits, the proposed approach is suitable for target detection in complex scene SAR images and is applicable to different target detection missions.
F Argenti, A Lapini, T Bianchi, L Alparone, A tutorial on speckle reduction in synthetic aperture radar images. Geosci Remote Sens Magazine IEEE 1(3), 6–35 (2013)
L Eikvil, L Aurdal, H Koren, Classification-based vehicle detection in high-resolution satellite images. ISPRS J Photogramm Remote Sens 64(1), 65–72 (2009)
L. Xie, L. Wei, Research on vehicle detection in high resolution satellite images, in Proc. IEEE Int. Conf. on Intelligent Systems, 2013, pp. 279–283.
J. Leitloff, S. Hinz, U. Stilla, Vehicle queue detection in satellite images of urban areas, in Proc. URS’ 05, pp. 14–16, Mar. 2005
D. Pastina, F. Fico and P. Lombardo, Detection of ship targets in COSMO-SkyMed SAR images, in Proc. IEEE Radar conf., 2011, pp. 928–933.
Q Wang et al., Inshore ship detection using high-resolution synthetic aperture radar images based on maximally stable extremal region. J Appl Remote Sens 9(1), 1931–3195 (2015)
M Amoon, A Bozorgi, G Rezai-rad, New method for ship detection in synthetic aperture radar imagery based on the human visual attention system. J Appl Remote Sens 7, 071599–071599 (2013)
M Liao, C Wang, Using SAR images to detect ships from sea clutter. IEEE Geosci Remote Sens Lett 5(2), 194–198 (2008)
X Wen, L Shao, X Yu, W Fang, A rapid learning algorithm for vehicle classification. Inform Sci 295(1), 395–406 (2015)
Zhili Zhou, Yunlong Wang, Q.M. Jonathan Wu, Ching-Nung Yang, Xingming Sun, Effective and efficient global context verification for image copy detection, IEEE Transactions on Information Forensics and Security, 2016.
J Leitloff, S Hinz, U Stilla, Vehicle detection in very high resolution satellite images of city areas. IEEE Trans Geosci Remote Sens 48(7), 2795–2806 (2010)
R Achanta, A Shaji, K Smith, SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans Pattern Ana Mach Intell 34(11), 2274–2282 (2012)
J. Feng, Z. Cao, Y. Pi, Amplitude and texture feature based SAR image classification with a two-stage approach, in Proc. IEEE Int. Conf. on RADAR, pp. 0360–0364, 2014.
L Itti, C Koch, E Niebur, A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Analysis Machine Intell 20(11), 1254–1259 (1998)
G Gao et al., An adaptive and fast CFAR algorithm based on automatic censoring for target detection in high-resolution SAR images. IEEE Trans Geosci Remote Sens 47(6), 1685–1697 (2009)
LM Novak, The automatic target-recognition system in SAIP. Lincoln Lab J 10(2), 187–202 (1997)
The authors declare that they have no competing interests.
Cite this article
Liu, S., Cao, Z., Wu, H. et al. Target detection in complex scene of SAR image based on existence probability. EURASIP J. Adv. Signal Process. 2016, 114 (2016). https://doi.org/10.1186/s13634-016-0413-4