
Optimal edge detection using multiple operators for image understanding


Extraction of features, such as edges for the understanding of aerial images, has been an important objective since the early days of remote sensing. This work aims at describing a new framework which allows for the quantitative combination of a preselected set of edge detectors based on the correspondence between their outcomes. This is inspired from the problem that despite the enormous amount of literature on edge detection techniques, there is no single technique that performs well in every possible image context. Two approaches are proposed for this purpose. The first approach is the well-known receiver operating characteristics analysis which is introduced for a sound quality evaluation of the edge maps estimated by combining different edge detectors. In the second approach, the so-called kappa statistics are employed in a novel fashion to amalgamate the above-mentioned selected edge maps to form an improved final edge image. This method is unique in the sense that the balance between the false detections (false positives and false negatives) is explicitly determined in advance and incorporated in the proposed method in a mathematical fashion. For the performance evaluation of the proposed techniques, a sample set of the RADIUS/DARPA-IU Fort Hood aerial image database with known ground truth has been used.

1 Introduction

Automatic detection of both geographical (natural) and man-made structures, such as vegetation, buildings, roads and vehicles, in aerial or satellite images has been an active research topic over the last decade [1, 2]. Aerial images, with their highly detailed contents, are an important source of information for applications including GIS [3], traffic surveillance [4] and military applications [5]. When processing aerial images, the extraction of high-level features for object detection is an important task. Features of interest can be extracted using a variety of image-processing techniques, which analyze the image to detect characteristics, such as edges, texture and shape.

Edge-driven approaches have been extensively used in understanding remote sensing images and detecting man-made objects in them. Noronha and Nevatia [6] extract edge points to build a system that detects and constructs 3D models of buildings using multiple aerial images. Tupin et al. [7] applied the ratio-of-averages (RoA) edge detector, first presented by Touzi et al. [8], to identify linear structures, such as main axes in road networks, in synthetic aperture radar (SAR) images. In [9], a framework for automatic change detection of linear features (e.g. roads and buildings) in aerial images is built, based on edge maps which indicate pixels that separate areas with significantly different brightness values. Gamba et al. [10] proposed an approach to extract the map of urban areas by exploiting edge information in very high-resolution (VHR) images.

Edge detection in aerial images is a challenging task for many reasons. Aerial images differ in resolution, sensor type, orientation, quality, dynamic range, light conditions, different weather and seasons, factors that increase the complexity of the edge detection process. In several cases, some of the detected edges do not correspond to meaningful objects, while some edges that belong to objects are distorted, broken or missed. Furthermore, edges of different objects or edges of different layers of a structure are likely to stick to each other. Edge detection must be efficient and reliable because it is crucial in determining how successful subsequent processing stages will be. To fulfill the reliability requirement of edge detection, a great diversity of operators have been devised with differences in their mathematical and algorithmic properties.

Some of the earliest methods, such as the Sobel [11] and Roberts [12] operators, are based on the so-called "Enhancement and Thresholding" approach [13]. According to that method, the image is convolved with small kernels which represent low-order high-pass filters and the result is thresholded to identify the edge points. Since then, more sophisticated operators have been developed. Marr and Hildreth [14] were the first to introduce Gaussian smoothing as a pre-processing step in edge feature extraction. Their method detects edges by locating the zero-crossings of the Laplacian (second derivative) of the output of a Gaussian filtered image. Canny [15] developed an alternative Gaussian edge detector based on optimizing three criteria. He employed Gaussian smoothing to reduce noise and the first derivative of the Gaussian to detect edges. Deriche [16] extended Canny's work to derive a recursively implemented edge detector. Rothwell [17] designed a spatially adaptive operator which is able to recover reliable topological information.

Further to the above, an alternative approach to edge detection is the multiresolution one. In such a detection framework, the image is convolved with Gaussian filters of different sizes to produce a set of images at different resolutions. These images are integrated to produce a complete final edge map. Typical algorithms which follow this approach have been produced by Bergholm [18], Lacroix [19] and Schunck [20]. Another interesting category of edge detectors is the logical/linear operators [21] which combine aspects of linear operators' theory and Boolean algebra. Furthermore, the idea of mimicking the human vision function using mathematical models gave space to the development of feature detection algorithms based on the human visual system. A representative example is the edge detector developed by Peli [22].

Recent approaches have used supervised learning to detect edges and object boundaries. In [23], a data-driven statistical edge detection approach has been proposed, where the probability distributions of edge filter responses on and off edges are learnt from pre-segmented data sets, while edges are detected using the log-likelihood ratio test. In a similar spirit, Martin et al. [24] combine multiple local cues to detect local boundaries. Based on the brightness, color and texture features, a classifier is trained using pre-segmented data to model the true posterior probability of a boundary at every image location and orientation. Another supervised learning algorithm for edge detection is the boosted edge learning (BEL) [25]. In this approach, a large number of features across different scales are combined to learn a discriminative model using an extended version of the probabilistic Boosting tree classification algorithm.

Intuitively, the question that arises is which edge detector and detector parameter settings can produce optimal results. Despite the aforementioned volume of existing work, an ideal scheme able to detect and localize edges with precision in many different contexts, has not yet been produced. Depending on the application, pre-segmented data might not be available for supervised training. This is getting even more difficult because of the absence of an evident "correct" edge map (ground truth), on which the performance of an edge detector could be evaluated. Although an edge detector may be robust to noise, it may fail to mark corners and junctions properly. Another common issue with edge detection is the incomplete contour representation. Problems, such as the above, strongly motivate the development of a general method for combining different edge detection schemes to take advantage of their strengths, while overcoming their weaknesses.

Let us assume n original detectors, where a detector refers to a mathematical method that attempts to identify the presence (or absence) of an event. In our work, we are interested in edge detectors that investigate the presence of edges in a digital image signal. These original detectors are transformed to a new set of detectors, where each new detector is a function of all of the original detectors. This function is solely controlled by a parameter named correspondence threshold (CT) which will be explained in the main body of the paper. Each one of the new detectors is associated with a specific value of the CT parameter; this value identifies uniquely the detector. The new detectors vary with respect to their richness, starting from weak detectors that highlight only the strong edges and are basically noise free, to strong detectors that also highlight weak edges and fine details, but exhibit significant amount of noise. In this work, we are interested in selecting one of the new edge detectors as the final detection result. The method assumes that the best of the new edge maps is the one which is most consistent with the verity of the detections produced by the set of original edge detectors. We present two novel contributions.

The first contribution is based on the use of the so-called receiver operating characteristic (ROC) curve. The only related work was presented in [26]. However, in [26] the original edge maps are generated from different combinations of the parameter values of a single edge detector, specifically the Canny edge detector. In this paper, the original edge maps are produced by different popular edge detectors which, although they follow similar mathematical techniques, still produce different results.

The second contribution is based on the employment of a normalized and corrected statistical metric of edge detection performance known as the kappa statistic. The kappa statistic has been used solely in medicine [27]. We seek to optimize the kappa statistic which, in this framework, is a function of the available edge detectors and of a scalar parameter that controls the strength of the final detector and consequently the balance between false alarms and misdetections.

The latter is the main novelty of this paper. It is an important research contribution to the edge detection problem, since it allows for the blind combination of multiple detectors and, more importantly, the pre-specified control of the type of preferred misclassifications.

The work presented in this paper is a significant extension of the preliminary work presented in [28, 29]. Two different contributions for optimal edge detection are studied in detail. Exhaustive experimental results are provided for assessing the relative strength of intrinsic technical merit of the proposed techniques for detecting edges in natural scenes. The proposed framework is compared against existing methods and their respective performance is evaluated on aerial images.

The paper is organized as follows. Section 2 concerns the brief analysis of a set of popular edge detectors that will be used in this work. Section 3 presents two novel approaches for the quantitative combination of multiple edge detectors. Section 4 contains experimental results yielded using our implementation of the automatic edge detection algorithms together with a comparative study of the methods' performance. Conclusions are given in Section 5.

2 Operators implemented for this work

Several approaches to edge detection focus their analysis on the identification of the best differential operator necessary to localize sharp changes of the image intensity. These approaches recognize the necessity of a preliminary filtering step, as a smoothing stage, since differentiation amplifies all high-frequency components of the signal, including those of the textured areas and noise. The most widely used smoothing filter is the Gaussian one which has been shown to play an important role in detecting edges.

Canny's approach [15] is a standard technique in edge detection. In essence, this scheme identifies edges in the image as the local maxima of the convolution of the image with an "optimal" operator. The operator's optimality is subject to three performance criteria defined by Canny, and the operator is very closely approximated by the first derivative of the two-dimensional Gaussian function G(x, y). For example, the partial derivative with respect to x is defined as:

$$\frac{\partial G(x, y)}{\partial x} \propto x \, e^{-\frac{x^{2} + y^{2}}{2\sigma^{2}}}$$

where σ² denotes the variance of the Gaussian filter and controls the degree of smoothing. After this process, candidate edge pixels are identified as the pixels that survive an additional thinning process known as nonmaximal suppression [15]. Then, the candidate edges are thresholded to keep only the significant ones. Moreover, Canny suggests hysteresis thresholding to eliminate streaking of edge contours.
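The filtering step can be illustrated with a minimal one-dimensional sketch. It samples the first derivative of a Gaussian (up to a normalization constant), convolves it with a step edge and locates the strongest response. The function names, the truncation of the kernel at 3σ and the replicated-border handling are assumptions made for illustration only; they are not part of Canny's formulation, and nonmaximal suppression and hysteresis are omitted.

```python
import math

def dog_kernel(sigma, radius=None):
    """Sample d/dx of a 1D Gaussian: -x / sigma^2 * exp(-x^2 / (2 sigma^2))."""
    if radius is None:
        radius = int(math.ceil(3 * sigma))  # assumed truncation at 3 sigma
    return [-x / sigma**2 * math.exp(-x * x / (2 * sigma**2))
            for x in range(-radius, radius + 1)]

def convolve_1d(signal, kernel):
    """Same-size convolution; border samples are replicated (an assumption)."""
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - (k - r)              # convolution flips the kernel
            j = min(max(j, 0), n - 1)    # clamp to the valid index range
            acc += w * signal[j]
        out.append(acc)
    return out

# Ideal step edge located between samples 9 and 10.
step = [0.0] * 10 + [1.0] * 10
response = convolve_1d(step, dog_kernel(sigma=1.0))

# The candidate edge is the sample with the largest absolute response.
edge_index = max(range(len(response)), key=lambda i: abs(response[i]))
```

In this toy signal the strongest response falls on the samples adjacent to the step, which is the behavior the local-maximum criterion relies on.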

Using an approach similar to Canny's, Deriche [16] derived an alternative optimal operator. Contrary to Canny, whose operator is based on a finite antisymmetric filter, Deriche deals with an antisymmetric filter which has an infinite support region, defined as:

$$f(x) = -c \, e^{-a|x|} \sin(\omega x)$$

where a, c and ω are positive real numbers. This filter is sharper than the derivative of the Gaussian and is efficiently implemented in a recursive fashion. The procedure that follows in Deriche's method is the same as the one used in Canny's edge detection; nonmaximal suppression and hysteresis thresholding are applied as described previously.

Although Canny's detector performs well in localizing edges and suppressing noise, in several cases it fails to provide a complete boundary for objects. Rothwell's [17] operator is an improvement to earlier edge detectors, capable of recovering sound topological descriptions. It follows a line of work similar to Canny's. The uniqueness of this algorithm originates in the use of a dynamic threshold which varies across the image [17].

In general, it is very difficult to find a single scale of smoothing which is optimal for all the edges in an image. One smoothing scale may keep good localization while giving detections sensitive to noise. Thus, multiscale edge detection is introduced as an alternative. In this approach, edge detectors with different filter sizes are applied to the image to extract edge maps at different smoothing scales. This information is then combined to result in a more complete final edge image.

Bergholm [18] introduced the coarse-to-fine tracking as an approach to multiscale edge detection. The initial steps of this method are based on Canny's approach. This algorithm relies on the fact that edge detection at a coarse resolution yields significant edges, while their accurate location is detected at a finer resolution. Therefore, the main idea is to initially detect the edges applying a strong Gaussian smoothing and then focus on these edges by tracking them over decreasing smoothing scale.

Lacroix [19] introduces another algorithm for multiscale detection based on Canny's method. Contrary to Bergholm [18], who proposed the tracking of edges from coarse to fine resolution, in Lacroix's method the edge information is combined moving from fine to coarse resolution, aiming at avoiding the problem of splitting edges. Schunck's work [20] is another study that advocates the use of derivatives of Gaussian filters with different variances to detect intensity changes at different resolution scales. The gradient magnitudes over the selected range of scales are multiplied to amplify significant edges, while suppressing the weak ones. Hence, a composite edge image is formed.

In this work, we use the six edge detectors mentioned in this section. The use of convolution-based methods is justified by the fact that they are simple to implement, while producing accurate detection results.

3 Automatic edge detection

In this paper, we intend to shed light on the uncertainty associated with parametric edge detection performance. The statistical approaches described here attempt to automatically form an optimum edge map by combining edge images produced by different detectors.

We begin with the assumption that N different edge detectors will be combined. The first step of the algorithm comprises the correspondence test of the edge images, E i for i = 1, ..., N. A correspondence value is assigned to each pixel and is then stored in a separate array, V, of the same size as the initial image. The correspondence value is the frequency of identifying a pixel as an edge by the set of detectors. Intuitively, the higher the correspondence associated with a pixel, the greater the possibility for that pixel to be a true edge. Hence, the above correspondence value can be used as a reliable measure to distinguish between true and false edges [26].
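The correspondence test can be sketched as a per-pixel count over the binary edge maps. The function name and the toy 2 × 3 edge maps below are illustrative assumptions; edge maps are taken to be nested lists with 1 for edge and 0 for non-edge.

```python
def correspondence_map(edge_maps):
    """Count, per pixel, how many binary edge maps mark it as an edge."""
    rows, cols = len(edge_maps[0]), len(edge_maps[0][0])
    V = [[0] * cols for _ in range(rows)]
    for E in edge_maps:
        for k in range(rows):
            for l in range(cols):
                V[k][l] += 1 if E[k][l] else 0
    return V

# Three toy edge maps (made-up values), i.e. N = 3 detectors.
E1 = [[1, 0, 0], [1, 1, 0]]
E2 = [[1, 0, 1], [0, 1, 0]]
E3 = [[1, 0, 0], [1, 1, 0]]

V = correspondence_map([E1, E2, E3])
# V == [[3, 0, 1], [2, 3, 0]]
```

A pixel with value 3 was marked by all three detectors, whereas a value of 1 indicates a detection that only one operator supports.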

However, these data require specific statistical methods to assess the accuracy of the resulting edge images, accuracy here being the extent to which detected edges agree with true edges. Correspondence values ranging from 0 to N produce N + 1 thresholds, which correspond to edge detections with different combinations of true-positive and false-positive rates. The threshold that corresponds to correspondence value 0 is ignored. Hence, the main goal of the method is to estimate the correspondence threshold CT (from the set $CT_i$, where i = 1, ..., N) which results in an accurate edge map that gives the finest fit to all edge images $E_i$. In this section, we describe two different approaches for this purpose.
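As a small sketch of how the N candidate edge maps arise from the correspondence values, the snippet below thresholds a toy matrix V at every CT from 1 to N. It assumes that a pixel is kept when its correspondence value is greater than or equal to CT, which is one reading of the thresholding rule; the function name and the data are hypothetical.

```python
def candidate_maps(V, n_detectors):
    """Return {CT: binary map} for CT = 1..N, keeping pixels with V >= CT.

    The '>=' convention is an assumption: it makes CT = 1 the union of all
    detections and CT = N their intersection.
    """
    return {ct: [[1 if v >= ct else 0 for v in row] for row in V]
            for ct in range(1, n_detectors + 1)}

V = [[3, 0, 1], [2, 3, 0]]   # toy correspondence values for N = 3 detectors
M = candidate_maps(V, 3)
# M[1] == [[1, 0, 1], [1, 1, 0]]   (richest map)
# M[3] == [[1, 0, 0], [0, 1, 0]]   (only unanimous detections)
```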

3.1 ROC analysis

In our case, the classification task is a binary one including the actual classes {e, ne}, which stand for the edge and non-edge event, respectively and the predictive classes, predicted edge and predicted non-edge, denoted by {E, NE}. Traditionally, the possible outcomes obtained by an edge detector are displayed graphically in a 2 × 2 matrix, the confusion matrix.

To mathematically define the conditional probabilities in the confusion matrix, we begin by considering an image of size K × L. The probability of a pixel being a true edge will be denoted as $p_{k,l}$, where k = 1, ..., K and l = 1, ..., L. In a similar way, $q_{k,l}$ will represent the probability of a pixel being detected as an edge. The probability of a true-positive outcome over all the pixels (k, l) of an image is defined as:

$$TP = \frac{1}{K \cdot L} \sum_{k=1}^{K} \sum_{l=1}^{L} p_{k,l} \, q_{k,l}$$

This leads to the following equation:

$$TP = P \cdot Q + \rho \, \sigma_p \, \sigma_q \tag{1}$$

where $\sigma_p$ and $\sigma_q$ stand for the standard deviations of the distributions of $p_{k,l}$ and $q_{k,l}$, respectively. The parameter P represents the prevalence, which refers to the occurrence of true edge pixels in the image, whereas the level Q of the detection corresponds to the occurrence of pixels detected as edges [30]. The parameter ρ denotes the correlation coefficient between $p_{k,l}$ and $q_{k,l}$, and its role is explained in detail in the Appendix. In this work, we assume that for a legitimate edge detection the correlation coefficient between true and detected edges is positive. This is a realistic assumption since edge detection relies on mathematical methods that exploit the local edge intensity information. In the case of random edge detection, where the edges are identified purely by chance, the correlation coefficient is equal to ρ = 0. All the probabilities computed for legitimate and random edge detection are presented in the Appendix. Clearly, the optimum edge detector is the one that identifies as edges all the true edge pixels and therefore satisfies the equality:

$$P = Q \tag{2}$$
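The four confusion-matrix probabilities follow from the true-positive probability together with the definitions of prevalence (TP + FN = P) and level (TP + FP = Q). The sketch below assumes the true-positive probability has the form TP = P·Q + ρσpσq and treats the product ρσpσq as a single input named `cov`; the function name and the numerical values are illustrative.

```python
def joint_probabilities(P, Q, cov):
    """Confusion-matrix probabilities for prevalence P and level Q.

    cov stands for rho * sigma_p * sigma_q (assumed form of the TP relation);
    cov = 0 corresponds to random edge detection.
    """
    TP = P * Q + cov
    FN = P - TP               # true edges that are missed
    FP = Q - TP               # detections that are not true edges
    TN = 1.0 - TP - FN - FP   # the four probabilities sum to one
    return TP, FP, FN, TN

TP, FP, FN, TN = joint_probabilities(P=0.2, Q=0.3, cov=0.05)
# Legitimate detection: TP = 0.11, FP = 0.19, FN = 0.09, TN = 0.61
# (up to floating-point rounding)
```

With `cov = 0` the same function returns the chance-level products P·Q, P'·Q, P·Q' and P'·Q'.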


In this work, the concept of accuracy refers to the quality of information provided by an edge map. Thus, the assessment of the edge map accuracy is necessary for the estimation of the optimum correspondence threshold. The accuracy is characterized using the metrics of sensitivity (SE) and specificity (SP) [31]. Both measures describe the edge detector's ability to correctly identify true edges while rejecting false alarms. Sensitivity (SE) corresponds to the probability of identifying a true edge as an edge pixel. It is also referred to as the true-positive rate and is defined as follows:

$$SE = TP_{rate} = \frac{TP}{TP + FN} \tag{3}$$


The term specificity (SP) expresses the probability of identifying an actual non-edge as a non-edge pixel. The measure 1 - SP is known as the false-positive rate. These measures are given by the equations:

$$SP = \frac{TN}{TN + FP}$$

$$FP_{rate} = 1 - SP = \frac{FP}{FP + TN} \tag{4}$$



Relying on the value of only one of the above metrics for edge detection accuracy estimation would be an oversimplification and could lead to misleading inferences. Based on this idea, the ROC analysis [32, 33] can be introduced to quantify detection accuracy. In fact, a ROC curve provides a view of all the true-positive/false-positive rate pairs obtained by varying the correspondence threshold over the range of the observed data. In this work, the ROC curve is used to select the correspondence threshold CT that provides an optimum trade-off between the TPrate and the FPrate of edge detectors.
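For concreteness, sensitivity, specificity and the false-positive rate can be computed from the four confusion-matrix entries as sketched below. The function name and the pixel counts are made up for illustration.

```python
def rates(TP, FP, FN, TN):
    """Return (sensitivity, specificity, false-positive rate)."""
    SE = TP / (TP + FN)      # fraction of true edges that are detected
    SP = TN / (TN + FP)      # fraction of non-edges that are rejected
    return SE, SP, 1.0 - SP

# Hypothetical pixel counts for a 200-pixel image.
SE, SP, FPR = rates(TP=40, FP=10, FN=20, TN=130)
# The pair (FPR, SE) is one point in the ROC plane.
```

Either counts or probabilities can be passed in, since only ratios are used.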

To calculate the points on the ROC curve, we apply each correspondence threshold $CT_i$ to the outcome of the correspondence test, i.e., the matrix V mentioned above. This means that the pixels are classified as edges and non-edges according to whether their correspondence value exceeds $CT_i$ or not. Thus, we end up with a set of possible best edge maps $M_j$, for j = 1, ..., N, one for each $CT_i$. Each point on the ROC curve corresponds to the average match between all the initial edge maps and a possible best edge map. For that purpose, every $M_j$ is compared to the set of the initial edge images, $E_i$, to calculate the true-positive, $TP_{rate}^{M_j}$, and the false-positive, $FP_{rate}^{M_j}$, rates associated with each of them. Hence, according to Equations 3 and 4, for the $M_j$ map these rates are defined as:

$$TP_{rate}^{M_j} = \frac{\overline{TP}_{M_j}}{\overline{TP}_{M_j} + \overline{FN}_{M_j}} \tag{5}$$

$$FP_{rate}^{M_j} = \frac{\overline{FP}_{M_j}}{\overline{FP}_{M_j} + \overline{TN}_{M_j}} \tag{6}$$

where $\overline{TP}_{M_j}$ represents the average number of true edges detected in $M_j$.

Averaging in Equations 5 and 6 refers to the joint use of multiple edge detectors, as shown in the following equations:

$$\overline{TP}_{M_j} = \frac{1}{N} \sum_{i=1}^{N} \Pr\left(E_{M_j} \cap E_{E_i}\right) \tag{7}$$

$$\overline{FP}_{M_j} = \frac{1}{N} \sum_{i=1}^{N} \Pr\left(E_{M_j} \cap NE_{E_i}\right) \tag{8}$$

where $E_{M_j}$ and $NE_{M_j}$ represent the pixels detected as edges and non-edges in the edge map $M_j$, respectively. The same notation is used in the case of the edge maps $E_i$. The probabilities $\overline{TN}_{M_j}$ and $\overline{FN}_{M_j}$ are defined in a similar way. For instance, the probability measurement in Equation 7 indicates the average number of pixels that are detected as edges in $M_j$ and match edge pixels in the detections $E_i$. Each edge map $M_j$ generates a point ($FP_{rate}^{M_j}$, $TP_{rate}^{M_j}$) in the ROC plane, and together these points form the ROC curve. The position of these points provides qualitative information about the detection accuracy of each edge map. As we mentioned in Equation 2, the optimum CT should correspond to a detection whose prevalence value P is equal to its level Q. By definition of the true-positive and false-positive rates, P' · FPrate + P · TPrate = Q, where P' = 1 - P is the complement of P. This definition in conjunction with Equation 2 leads to the following mathematical expression that the optimum edge detection should satisfy:

$$P' \cdot FP_{rate} + P \cdot TP_{rate} = P \tag{9}$$


Equation 9 defines a line that connects the points (0, 1) and (P, P) in the ROC plane, known as diagnosis line. Therefore, the optimum CT occurs at the intersection (or close to that) of the ROC curve and the diagnosis line. The value of the selected CT determines how detailed the final edge image, EGT, will be. In the case of a noisy environment, the selection of CT should give a trade-off between the increase in information provided by the final edge image and the decrease in noise.
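The selection rule of this section can be sketched end to end: build the correspondence values, threshold them at every CT, average the match of each candidate map against all initial edge maps, and keep the CT whose ROC point lies closest to the diagnosis line P' · FPrate + P · TPrate = P. The function names, the "value >= CT" thresholding convention, the use of the average edge fraction of the initial maps as the prevalence P, and the toy data are assumptions made for illustration.

```python
import math

def flatten(edge_map):
    return [v for row in edge_map for v in row]

def roc_select(edge_maps):
    """Return (best CT, {CT: (FPrate, TPrate, distance to diagnosis line)})."""
    N = len(edge_maps)
    flat = [flatten(E) for E in edge_maps]
    n_pix = len(flat[0])
    V = [sum(col) for col in zip(*flat)]        # correspondence values
    P = sum(map(sum, flat)) / (N * n_pix)       # assumed: average prevalence
    results = {}
    for ct in range(1, N + 1):
        M = [1 if v >= ct else 0 for v in V]    # assumed '>=' convention
        tp = fp = fn = tn = 0
        for E in flat:                          # pool counts over all E_i
            for m, e in zip(M, E):
                if m and e:
                    tp += 1
                elif m:
                    fp += 1
                elif e:
                    fn += 1
                else:
                    tn += 1
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        # Point-to-line distance for the line P' x + P y - P = 0.
        dist = abs((1 - P) * fpr + P * tpr - P) / math.hypot(1 - P, P)
        results[ct] = (fpr, tpr, dist)
    best = min(results, key=lambda ct: results[ct][2])
    return best, results

E1 = [[1, 0, 0], [1, 1, 0]]
E2 = [[1, 0, 1], [0, 1, 0]]
E3 = [[1, 0, 0], [1, 1, 0]]
best_ct, roc_points = roc_select([E1, E2, E3])   # best_ct == 2 for this toy data
```

Pooling the counts over all initial maps is equivalent to averaging them, because the common factor cancels in the rates. Since P' · FPrate + P · TPrate equals the level Q of the candidate map, the rule amounts to choosing the map whose level is closest to the prevalence.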

3.2 Weighted kappa coefficient

In edge detection, it is prudent to consider the relative seriousness of each possible disagreement between true and detected edges when performing accuracy evaluation. This section is confined to the examination of an accuracy measure which is based on the acknowledgement that in detecting edges, depending on the specific application, the consequences of a false positive may be quite different from the consequences of a false negative. For this purpose, the weighted kappa coefficient[34, 35] is introduced for the estimation of the correspondence threshold that results in an optimum final edge map.

Consider a mathematical measure $A_0$ of agreement between the outcomes of two algorithms that both attempt to detect the presence or absence of a condition. Let $A_c$ be the value expected on the basis of agreement by chance alone and $A_a$ the value expected on the basis of complete agreement, i.e., $A_a = \max\{A_0\}$. Based on the above definitions, the kappa coefficient defined below is introduced as a corrected and normalized measure of agreement [35]:

$$k = \frac{A_0 - A_c}{A_a - A_c} \tag{10}$$


In edge detection, A0 may be defined as a measure of agreement between true and detected edges. The definition of A c and A a is obvious.
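A minimal numerical sketch of this chance correction is given below. It assumes the simplest (unweighted) agreement measure, A0 = TP + TN, with chance agreement Ac = P·Q + P'·Q' and complete agreement Aa = 1; these choices and the function name are illustrative, not the weighted form developed next.

```python
def kappa(a_obs, a_chance, a_max=1.0):
    """Chance-corrected, normalized agreement: (A0 - Ac) / (Aa - Ac)."""
    return (a_obs - a_chance) / (a_max - a_chance)

# Example with P = 0.2, Q = 0.3, TP = 0.11 and TN = 0.61.
A0 = 0.11 + 0.61              # observed agreement
Ac = 0.2 * 0.3 + 0.8 * 0.7    # agreement expected by chance
k = kappa(A0, Ac)             # roughly 0.26
```

A value of 0 means the agreement is no better than chance, while a value of 1 means complete agreement.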

A generalization of the above coefficient can be made to incorporate the relative cost of false positives and false negatives into the accuracy measure. We assume that weights $w_{u,v}$, for u = 1, 2 and v = 1, 2, are assigned to the four possible outcomes of the edge detection process displayed in the confusion matrix. The weights indicate gain or cost and they lie in the interval $0 \le |w_{u,v}| \le 1$. The observed weighted proportion of agreement is given as

$$A_0 = \sum_{u=1}^{2} \sum_{v=1}^{2} w_{u,v} \, d_{u,v} \tag{11}$$


where $d_{u,v}$ indicates the probabilities in the confusion matrix, calculated as shown in Table 1. Similarly, the chance-expected weighted proportion of agreement has the form

$$A_c = \sum_{u=1}^{2} \sum_{v=1}^{2} w_{u,v} \, c_{u,v} \tag{12}$$

where $c_{u,v}$ refers to the above four probabilities, but in the case of random edge detection, i.e. when the edges are identified purely by chance. Their analytic expression is given in Table 1. Based on the definition of the kappa coefficient described previously, the weighted kappa coefficient is then given by

$$k_w = \frac{A_0 - A_c}{\max(A_0 - A_c)} \tag{13}$$

Table 1 Probabilities for legitimate and random edge detection


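Substituting Equations 11 and 12 in Equation 13 gives

$$k_w = \frac{(w_{1,1} - w_{2,1}) \, P \, Q' \, k(1,0) + (w_{2,2} - w_{1,2}) \, P' \, Q \, k(0,0)}{\max\left[(w_{1,1} - w_{2,1}) \, P \, Q' \, k(1,0) + (w_{2,2} - w_{1,2}) \, P' \, Q \, k(0,0)\right]} \tag{14}$$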



where P' and Q' are the complements of P and Q, respectively, and k(1, 0) and k(0, 0) are the quality indices of sensitivity and specificity, respectively, defined as

$$k(1,0) = \frac{SE - Q}{Q'}, \qquad k(0,0) = \frac{SP - Q'}{Q}$$

The major source of confusion in statistical methods related to the weighted kappa coefficient is the assignment of weights. From Equation 14 it can be deduced that the total cost $W_1$ for true edges being properly identified as edges or not is equal to $W_1 = |w_{1,1}| + |w_{2,1}|$. Similarly, the total cost $W_2$ for the non-edge pixels is defined as $W_2 = |w_{1,2}| + |w_{2,2}|$. We propose that true detections should be assigned positive weights, representing gain, whereas the weights for false detections should be negative, representing loss. It can be proven that no matter how the split of these total costs is made between true and false outcomes, the result of the method is not affected [30]. Hence, for the sake of convenience, the total costs are split evenly. As a result, we end up with two different weights instead of four:

$$w_{1,1} = -w_{2,1} = \frac{W_1}{2}, \qquad w_{2,2} = -w_{1,2} = \frac{W_2}{2}$$

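A further simplification leads to

$$k_w = \frac{W_1 \, P \, Q' \, k(1,0) + W_2 \, P' \, Q \, k(0,0)}{\max\left[W_1 \, P \, Q' \, k(1,0) + W_2 \, P' \, Q \, k(0,0)\right]} \tag{15}$$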


Considering the fact that the maximum value of the quality indices k(1, 0) and k(0, 0) is equal to 1, the denominator in Equation 15 takes the form $W_1 \cdot P \cdot Q' + W_2 \cdot P' \cdot Q$. Dividing both numerator and denominator by $W_1 + W_2$, the final expression of the weighted kappa coefficient, in terms of the quality indices of sensitivity and specificity, becomes

$$k(r,0) = \frac{r \, P \, Q' \, k(1,0) + r' \, P' \, Q \, k(0,0)}{r \, P \, Q' + r' \, P' \, Q} \tag{16}$$

where

$$r = \frac{W_1}{W_1 + W_2} \tag{17}$$

and r' is the complement of r. The weighted kappa coefficient k(r, 0) indicates the quality of the detection as a function of r. It is unique in the sense that the balance between the false detections is determined in advance and then is incorporated in the measure.

From Equation 17 it is deduced that the index r is indicative of the relative importance between false negatives and false positives. Its value is dictated by which type of error carries the greatest importance for a particular application and ranges from 0 to 1. If we focus on the elimination of false positives in edge detection, W2 will predominate in Equation 15 and consequently r will be close to 0 as it can be seen from Equation 17. On the other hand, a choice of r close to 1 signifies our interest in avoiding false negatives since W1 will predominate in Equation 15. A value of r = 1/2 reflects the idea that both false positives and false negatives are equally unwanted. No standard choice of r can be regarded as optimum because the balance between the two errors shifts according to the application.

Thus, for a selected value of r, the weighted kappa coefficient k j (r, 0) is calculated for each edge map as in Equation 16. The optimum CT is the one that maximizes the weighted kappa coefficient.
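The selection of the optimum CT by maximizing k(r, 0) can be sketched as follows. The code assumes that the quality indices take the form k(1, 0) = (SE - Q)/Q' and k(0, 0) = (SP - Q')/Q, and that k(r, 0) is their weighted combination with weights r·P·Q' and r'·P'·Q. The function names and the per-threshold statistics are hypothetical.

```python
def weighted_kappa(P, Q, SE, SP, r):
    """k(r, 0) from the (assumed) quality indices k(1, 0) and k(0, 0)."""
    k1 = (SE - Q) / (1 - Q)          # quality index of sensitivity
    k0 = (SP - (1 - Q)) / Q          # quality index of specificity
    w1 = r * P * (1 - Q)             # weight r * P * Q'
    w0 = (1 - r) * (1 - P) * Q       # weight r' * P' * Q
    return (w1 * k1 + w0 * k0) / (w1 + w0)

def best_ct(stats, r):
    """stats maps CT -> (P, Q, SE, SP); return the CT maximizing k(r, 0)."""
    return max(stats, key=lambda ct: weighted_kappa(*stats[ct], r))

# Hypothetical statistics of three candidate maps (prevalence P = 0.5).
stats = {
    1: (0.5, 4 / 6, 1.0, 2 / 3),     # liberal map: no misses, some false alarms
    2: (0.5, 0.5, 8 / 9, 8 / 9),     # balanced map
    3: (0.5, 2 / 6, 2 / 3, 1.0),     # strict map: no false alarms, some misses
}

choice_balanced = best_ct(stats, r=0.5)   # 2
choice_no_miss = best_ct(stats, r=1.0)    # 1
choice_no_alarm = best_ct(stats, r=0.0)   # 3
```

The three choices illustrate the role of r: values near 1 favor the map with the fewest false negatives, values near 0 favor the map with the fewest false positives.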

3.3 Geometric approach for the weighted kappa coefficient

The estimation of the weighted kappa coefficient $k_j(r, 0)$ can also be done geometrically. Every edge map $M_j$, for j = 1, ..., N, can be represented as a point ($k_j(0, 0)$, $k_j(1, 0)$) on a two-dimensional graph with coordinates (k(0, 0), k(1, 0)). The set of points ($k_j(0, 0)$, $k_j(1, 0)$), j = 1, ..., N, constitutes the so-called quality ROC (QROC) curve. A great deal of information is available from visual examination of such a geometric representation. Equation 16 for the j th edge map can be rewritten in the form:

$$\frac{k_j(r,0) - k_j(1,0)}{k_j(r,0) - k_j(0,0)} = -\frac{P' \, Q \, r'}{P \, Q' \, r} \tag{18}$$


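Therefore, if we consider the straight line on the QROC plane described by the equation:

$$k(1,0) - k_j(r,0) = -\frac{P' \, Q \, r'}{P \, Q' \, r} \left( k(0,0) - k_j(r,0) \right) \tag{19}$$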


it is obvious from Equations 18 and 19 that the point ($k_j(0, 0)$, $k_j(1, 0)$) lies on this line. This is called the r-projection line and its slope is

$$s = -\frac{P' \, Q \, r'}{P \, Q' \, r} \tag{20}$$


It is obvious that the point (k j (r, 0),k j (r, 0)) lies also on the r-projection line and also on the main diagonal described by equation k(0, 0) = k(1, 0).

This means that the weighted kappa coefficient k j (r, 0) can be calculated graphically by drawing a line, for any value r of interest, through the point (k j (0, 0), k j (1, 0)) with slope given by Equation 20. The intersection point, (k j (r, 0), k j (r, 0)), of this line with the major diagonal in the QROC plane is clearly indicative of the k j (r, 0) value. Figure 1 presents an example for the calculation of the weighted kappa coefficient for a test point for r = 0.5. The procedure is repeated for every CT i to generate N different intersection points. The closer the intersection point to the upper right corner (ideal point), the higher the value of the weighted kappa coefficient. Hence, the optimum correspondence threshold is the one that produces an intersection point closer to the point (1, 1) in the QROC plane.
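The graphical construction can be checked numerically: intersecting the r-projection line through (k(0, 0), k(1, 0)) with the main diagonal should return the same value as the algebraic expression of k(r, 0). The sketch below assumes the slope has the form -(P'·Q·r')/(P·Q'·r) and requires r > 0; the function name and the test point are illustrative.

```python
def r_projection_intersection(k0, k1, P, Q, r):
    """Intersect the r-projection line through (k0, k1) with the diagonal.

    Assumes slope = -(P' Q r') / (P Q' r); r must be strictly positive.
    """
    slope = -((1 - P) * Q * (1 - r)) / (P * (1 - Q) * r)
    # Line: y - k1 = slope * (x - k0); diagonal: y = x.
    return (k1 - slope * k0) / (1 - slope)

# Test point with k(0, 0) = 0.5 and k(1, 0) = 1.0, for P = 0.5, Q = 2/3.
k_half = r_projection_intersection(k0=0.5, k1=1.0, P=0.5, Q=2 / 3, r=0.5)

# Algebraic value of k(0.5, 0) for the same point, for comparison.
w1 = 0.5 * 0.5 * (1 - 2 / 3)
w0 = 0.5 * 0.5 * (2 / 3)
k_algebraic = (w1 * 1.0 + w0 * 0.5) / (w1 + w0)
```

Both routes give the same value (2/3 for this point), so the intersection with the diagonal can be read directly as the weighted kappa coefficient.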

Figure 1 Calculation of k(0.5, 0) using a graphical approach on the QROC plane.

3.4 An alternative to the selection of the r parameter value

In the previous section, parameter r is evaluated according to Equation 17. By assigning more weight to the type of false detection that we want to eliminate, the ratio in Equation 17 yields the appropriate value of r. However, a more thorough analysis is necessary. An alternative analysis that justifies the previously described selection of r is presented in this section.

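Our main concern is to examine the behavior of the quality measure k(r, 0) as a function of the level, Q, and the parameter r. Substituting in Equation 16 the probabilities given in the Appendix, the weighted kappa coefficient can be expressed as

$$k(r,0) = \frac{r \, \rho \, \sigma_p \, \sigma_q + r' \, \rho \, \sigma_p \, \sigma_q}{r \, P \, Q' + r' \, P' \, Q}$$

Thus, the quality measure, k(r, 0), takes the form

$$k(r,0) = \frac{\rho \, \sigma_p \, \sigma_q}{r \, P \, Q' + r' \, P' \, Q}$$

The derivative of the weighted kappa coefficient with respect to r is given by

$$\frac{\partial k(r,0)}{\partial r} = \frac{\rho \, \sigma_p \, \sigma_q \, (Q - P)}{\left(r \, P \, Q' + r' \, P' \, Q\right)^{2}} \tag{21}$$

The measures $\sigma_p$ and $\sigma_q$ are positive as they express standard deviations. The correlation coefficient, ρ, is positive as well. Thus, it becomes obvious that the sign of the derivative, $\partial k(r,0) / \partial r$, is determined by the value of Q relative to P.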

A level, Q, greater than the prevalence, P, corresponds to an edge detection that eliminates the misdetections by favoring the false positives. In this case, according to Equation 21, the derivative of the weighted kappa coefficient is positive for any value of r and the quality measure k(r, 0) is an increasing function of r. This means that in applications where we are more interested in the elimination of false negatives, a higher value of r in the interval [0, 1] will result in the selection of a more accurate edge map.

Equivalent conclusions are derived for the elimination of false positives, i.e. detections where the level is smaller than the prevalence. According to Equation 21, the derivative of k(r, 0) with respect to r will be negative and the weighted kappa coefficient will be a decreasing function of r. Therefore, small values of r in the interval [0, 1] will yield CTs that correspond to more accurate edge maps.
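The monotonic behavior described above can be illustrated numerically. The sketch assumes the closed form k(r, 0) = ρσpσq / (r·P·Q' + r'·P'·Q), with the product ρσpσq passed as a single positive number `cov`; the function name and the values of P, Q and `cov` are made up.

```python
def kappa_r(P, Q, cov, r):
    """k(r, 0) = cov / (r P Q' + r' P' Q), with cov = rho * sigma_p * sigma_q."""
    return cov / (r * P * (1 - Q) + (1 - r) * (1 - P) * Q)

rs = [i / 10 for i in range(11)]                      # r = 0.0, 0.1, ..., 1.0

# Level above prevalence (Q > P): quality should increase with r.
liberal = [kappa_r(0.2, 0.3, 0.05, r) for r in rs]

# Level below prevalence (Q < P): quality should decrease with r.
strict = [kappa_r(0.2, 0.1, 0.05, r) for r in rs]

liberal_is_increasing = all(a < b for a, b in zip(liberal, liberal[1:]))
strict_is_decreasing = all(a > b for a, b in zip(strict, strict[1:]))
```

For these values both flags evaluate to true, in line with the sign analysis of the derivative.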

The above conclusions are also verified experimentally. Figure 2 illustrates the values of the weighted kappa coefficient as a function of r for four edge maps at different CT s. These plots are yielded from applying the weighted kappa coefficient method on the "School" image (from the RADIUS/DARPA-IU Fort Hood aerial image database [36]) to combine six edge detectors. Figure 2a,b corresponds to CT s equal to 1 and 2, respectively, where the level values are greater than the prevalence. Observing these curves, it is clear that the weighted kappa coefficient is an increasing function of r and high quality is achieved for values of r close to 1. In contrast, as shown in Figure 2c,d, the quality measure is a decreasing function of r for CT = 3 and CT = 4, where the level is smaller than the prevalence. In this case, values of r close to 0 give edge maps with better quality.

Figure 2 Weighted kappa coefficient plots for edge maps that correspond to (a) CT = 1, (b) CT = 2, (c) CT = 3, (d) CT = 4.

4 Experimental results and discussion

Using the above framework, six edge detectors, proposed by Canny [15], Deriche [16], Bergholm [18], Lacroix [19], Schunck [20] and Rothwell [17] were combined to produce optimum edge maps. The selection of the above edge detectors relies on the fact that they basically follow the same mathematical approach.

The performance evaluation study of the proposed approach is carried out on a set of 10 images, selected from the RADIUS/DARPA-IU Fort Hood aerial image set [36]. Some of the images contain mainly vertical views and others contain more oblique views as well. All the images are 8-bit/pixel and their size ranges from 476 × 477 to 645 × 667 pixels. The ground truth for the aerial images is provided in the data set [36], in the form of specified image points that should be identified as edges and specified image regions where no edges should be detected.

The performance of the edge detectors is evaluated with respect to two different measures, namely the detection error and the similarity between the ground truth and the extracted edge maps. The detection error is defined as the distance of the (FPrate, TPrate) point of the detector in the ROC plane from the ideal point (0, 1) and is mathematically expressed as:

$$\text{Detection error} = \sqrt{FP_{rate}^{2} + (1 - TP_{rate})^{2}}$$

where the probabilities for the estimation of the true positive rate and false positive rate of an edge map are defined based on the ground truth edge map that corresponds to the original image, in a similar way as in Equations 7 and 8. The similarity between the ground truth edge map and the result of an edge detector is estimated as:


where N_GT is the number of edge points on the ground truth and d_i is the minimum distance between the i-th edge point on the ground truth and the estimated edge map. Intuitively, the lower the detection error and the higher the similarity to the ground truth, the better the performance of the operator.
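As a minimal sketch of the first measure, the detection error can be computed from sets of pixel coordinates. It assumes that the distance in the ROC plane is Euclidean and that the rates are estimated as simple fractions of the ground-truth edge points and of the ground-truth non-edge pixels. The function and variable names are hypothetical, and the toy data are not from the Fort Hood set.

```python
import math


def detection_error(detected, gt_edges, gt_non_edges):
    """Distance of (FP rate, TP rate) from the ideal ROC point (0, 1).

    All three arguments are sets of (row, col) pixel coordinates:
    detected      pixels marked as edges by the detector,
    gt_edges      ground-truth points that should be detected as edges,
    gt_non_edges  ground-truth pixels where no edge should be detected.
    """
    tp_rate = len(detected & gt_edges) / len(gt_edges)
    fp_rate = len(detected & gt_non_edges) / len(gt_non_edges)
    # Euclidean distance in the ROC plane (assumed metric).
    return math.hypot(fp_rate, 1.0 - tp_rate)


# Toy example: 4 ground-truth edge points and 10 ground-truth non-edge pixels.
gt_edges = {(0, 0), (0, 1), (0, 2), (0, 3)}
gt_non_edges = {(2, c) for c in range(10)}
detected = {(0, 0), (0, 1), (0, 2), (2, 5)}   # 3 hits and 1 false alarm

error = detection_error(detected, gt_edges, gt_non_edges)
```

In this toy case the true positive rate is 3/4 and the false positive rate is 1/10, so a detector that found all four edge points with no false alarms would score an error of zero.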

Specifying the values of the edge detection operators' input parameters was a crucial step. In fact, the parameter selection depends on the implementation and aims to maximize the quality of the detection. In our work, we were consistent with the parameters proposed by the authors of the selected detectors. The Bergholm algorithm was slightly modified by adding hysteresis thresholding to allow a more detailed result. In the Lacroix technique, we applied non-maximal suppression by keeping the size k × 1 of the relative window fixed at 3 × 1 [19]. For simplicity, in the case of Schunck edge detection, the non-maximal suppression method we used is the one proposed by Canny [15], and hysteresis thresholding was applied for a more efficient thresholding.

For all the images in the data set, the standard deviation (sigma) of the Gaussian filter in Canny's algorithm [15] was set to sigma = 1, whereas the low and high thresholds were automatically calculated from the image histogram. In Deriche's technique [16], the parameter values were set to a = 2 and w = 1.5. The Bergholm [18] parameter set was a combination of starting sigma, ending sigma and low and high thresholds: starting sigma = 3.5 and ending sigma = 0.7, with the thresholds automatically determined as previously. For the Primary Raster in Lacroix's method [19], the coarsest resolution was set to σ2 = 2 and the finest one to σ0 = 0.7. The intermediate scale σ1 was computed according to the expression proposed in [19]. The gradient and homogeneity thresholds were estimated from the histograms of the gradient and homogeneity images, respectively. For the Schunck edge detector [20], the number of resolution scales was arbitrarily set to three: σ1 = 0.7, σ2 = 1.2, σ3 = 1.7. The difference between two consecutive scales was selected not to be greater than 0.5 to avoid edge pixel displacement in the resulting edge maps. The values for the low and high thresholds were calculated from the histogram of the gradient magnitude image. In the case of the Rothwell method [17], the alpha parameter was set to 0.9, the low threshold was again estimated from the image histogram and the value of the smoothing parameter, sigma, was equal to 1. It is important to stress that the selected values for all of the above parameters fall within the ranges proposed in the literature by the authors of the individual detectors.
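For reference, the settings listed above can be collected in a single structure. The sketch below is only one possible layout: the dictionary, its key names and the helper function are hypothetical and do not correspond to any particular implementation of the detectors. Thresholds that the text derives from histograms are omitted because no numerical values are given.

```python
# Parameter values as listed in the text; layout and key names are illustrative.
DETECTOR_PARAMS = {
    "canny": {"sigma": 1.0},
    "deriche": {"a": 2.0, "w": 1.5},
    "bergholm": {"start_sigma": 3.5, "end_sigma": 0.7},
    "lacroix": {"sigma_2": 2.0, "sigma_0": 0.7},   # coarsest and finest resolution
    "schunck": {"sigmas": [0.7, 1.2, 1.7]},
    "rothwell": {"alpha": 0.9, "sigma": 1.0},
}


def max_scale_step(sigmas):
    """Largest gap between consecutive scales of a multi-scale detector."""
    return max(b - a for a, b in zip(sigmas, sigmas[1:]))


# The text requires consecutive Schunck scales to differ by at most 0.5;
# a small tolerance absorbs floating-point rounding.
schunck_step_ok = max_scale_step(DETECTOR_PARAMS["schunck"]["sigmas"]) <= 0.5 + 1e-9
```

Keeping the scale constraint as an explicit check makes it harder to violate by accident when the scales are changed.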

When the optimum correspondence threshold is estimated using the maximization of the 'Weighted Kappa Coefficient' approach, the value of the r parameter needs to be set. The cost, r, is initially determined according to the particular quality of the detection (FP or FN) that is chosen to be optimized. For example, as far as target object detection in military applications is concerned, missing existing targets in the image (misdetections) is less desirable than falsely detecting non-existing ones (false alarms). This is also the scenario we assume in this work; namely, we are primarily concerned with the elimination of FN at the expense of increasing the number of FP. Therefore, according to the analysis in the previous section, the cost value r should range from 0.5 to 1.

The graphs in Figure 3 show the detection error and the percentage of false positives for values of r greater than 0.5, for the "Buildings" and the "Baseball" aerial images. As expected, as the parameter r increases, the detection error decreases, while the number of points falsely detected as edges increases. From the above graphs, it can be observed that a value of r between 0.6 and 0.8 gives a good trade-off between the increase in edge information and the decrease in noise in the final edge image. In our experimental work, the parameter r has been set equal to 0.7.

Figure 3 Evaluation of the r parameter: (a) detection error and (b) false positives as a function of r for the "Buildings" and "Baseball" aerial images.

The results of applying the 'ROC analysis' and the 'weighted kappa coefficient' approach on the "Large Building" aerial image shown in Figure 4a are presented in Figures 4, 5, and 6. The sample space, E_i (i = 1, ..., 6), consisting of the edge detection outcomes produced by the six selected operators, is depicted in Figure 5. The probabilities given by Equations 5 and 6 were calculated for the statistical correspondence test of the edge detections E_i. The ROC curve implementation is illustrated in Figure 6a, where it can be observed that the intersection of the diagnosis line with the ROC curve occurs at a correspondence level close to 3. Thus, the optimum threshold for this approach is CT = 3. The graphical estimation of the weighted kappa coefficient, k_j(r, 0), for each CT is illustrated in Figure 6b. The graph shows that the weighted kappa coefficient takes its maximum value at k_2(r, 0) and therefore the optimum CT is equal to 2. The final edge maps, E_GT, for both approaches are presented in Figure 4c,d.
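Once an optimum CT has been chosen, forming the final edge map reduces to counting detector votes per pixel. The sketch below assumes that a pixel is kept for a given CT when at least CT of the detectors mark it as an edge; this rule, the function names and the toy edge maps are assumptions made for illustration.

```python
from collections import Counter


def correspondence(edge_maps):
    """Count, for every pixel, how many detectors mark it as an edge.

    edge_maps is a list of sets of (row, col) coordinates, one per detector.
    """
    votes = Counter()
    for edge_map in edge_maps:
        votes.update(edge_map)   # each pixel counts once per detector
    return votes


def combine(edge_maps, ct):
    """Edge map for correspondence threshold ct (assumed rule: votes >= ct)."""
    votes = correspondence(edge_maps)
    return {pixel for pixel, n in votes.items() if n >= ct}


# Toy example with three hypothetical detector outputs.
maps = [
    {(0, 0), (0, 1), (1, 1)},
    {(0, 0), (0, 1)},
    {(0, 0), (2, 2)},
]
loose = combine(maps, 1)    # union of all detections
strict = combine(maps, 3)   # pixels on which every detector agrees
```

Lower thresholds keep more pixels and therefore favor false positives, which is consistent with the weighted kappa approach at r = 0.7 selecting a smaller CT than the ROC analysis.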

Figure 4 Optimal edge detection on the "Large Building" image: (a) original aerial image, (b) ground truth image. Final edge map when applying (c) the "ROC analysis", (d) the "Weighted Kappa Coefficient" approach (r = 0.70).

Figure 5 Sample set of edge maps for the "Large Building" aerial image: (a) Canny detection, (b) Deriche detection, (c) Bergholm detection, (d) Lacroix detection, (e) Schunck detection, (f) Rothwell detection.

Figure 6 Graphical estimation of the optimum CT for the "Large Building" aerial image when applying (a) the "ROC analysis", (b) the "Weighted Kappa Coefficient" approach with r = 0.70.

Another set of experimental results using the "Woods" aerial image is presented in Figures 7, 8 and 9. The original "Woods" image is shown in Figure 7a and the six edge maps extracted by the selected detectors are shown in Figure 8. The ROC curve implementation and the graphical estimation of the weighted kappa coefficient are presented in Figure 9, giving an estimate of the optimum CT for each approach equal to 3 and 2, respectively. The final edge maps for the 'ROC analysis' and the 'weighted kappa coefficient' approach are presented in Figure 7c,d. The results of applying the proposed approaches to optimal edge detection on the "Main Building" and the "School" images are presented in Figure 10.

Figure 7 Optimal edge detection on the "Woods" image: (a) original aerial image, (b) ground truth image. Final edge map when applying (c) the "ROC analysis", (d) the "Weighted Kappa Coefficient" approach (r = 0.70).

Figure 8 Sample set of edge maps for the "Woods" aerial image: (a) Canny detection, (b) Deriche detection, (c) Bergholm detection, (d) Lacroix detection, (e) Schunck detection, (f) Rothwell detection.

Figure 9 Graphical estimation of the optimum CT for the "Woods" aerial image when applying (a) the "ROC analysis", (b) the "Weighted Kappa Coefficient" approach with r = 0.70.

Figure 10 Optimal edge detection on the "Main Building" and "School" images: (a) original "Main Building" aerial image, (b) ground truth image. Final edge map when applying (c) the "ROC analysis", (d) the "Weighted Kappa Coefficient" approach (r = 0.70). (e) Original "School" aerial image, (f) ground truth image. Final edge map when applying (g) the "ROC analysis", (h) the "Weighted Kappa Coefficient" approach (r = 0.70).

The above examples emphasize the ability of the approaches to combine high accuracy with good noise reduction in the final edge detection result. Insignificant information is cleared, while the information that is preserved allows for easy, fast and accurate object recognition. Furthermore, it is interesting to note that objects missed by one of the selected edge detectors are included in the final edge image. This is particularly obvious in the "Large Building" image set when comparing the final edge maps in Figure 4c,d with the Bergholm detection in Figure 5c. Finally, edges due to texture are suppressed in the final edge maps.

Comparing the edge maps produced by applying the above two approaches, it is observed that the edge maps for the "weighted kappa coefficient" approach have better quality than those for the "ROC analysis". The objects detected by the "weighted kappa coefficient" approach are better defined regarding their shape and contour, and the number of detected edges is greater. This is expected since the selected value of r is 0.7. The performance of the "weighted kappa coefficient" approach, for the particular choice of r, seems to be superior to that of the "ROC analysis", since it is more sensitive to minor details.

To investigate the effect of the cost r on the detection error, the latter measurement was evaluated for both of the above aerial images, for different values of r greater than 0.4. The results for the "Large Building" and the "Woods" images are graphically presented in Figure 11. The curves that correspond to the six edge detectors are straight lines parallel to the horizontal axis, as the detection result for these operators is not affected by the parameter r. From the graphs, it is clear that the detection error for the 'weighted kappa coefficient' approach is lower than that of the six selected detectors for any value of r in the examined range. This observation shows that our experimental results are not biased by the selection of the parameter r, since the 'weighted kappa coefficient' approach is superior to the selected detectors for any value of r that eliminates the FN (r > 0.5).

Figure 11 Detection error with respect to the value of the r parameter for (a) the "Large Building" image, (b) the "Woods" image.

For a more thorough performance evaluation and comparison of the initial set of edge operators with the proposed approaches, the detection error and the similarity to the ground truth of each approach have been computed for all the aerial images in the database and are presented in Tables 2 and 3, respectively. The above results show that the performance of the 'weighted kappa coefficient' approach, for a value of the cost r equal to 0.7, is always better than that of all the other approaches. The performance of the 'ROC analysis' is lower than that of the 'weighted kappa coefficient' approach but better than that of the six selected operators for the majority of the examined aerial images. The above results verify the superiority of the proposed approaches over the initial set of edge detectors.

Table 2 Detection error
Table 3 Edge map similarity

Without code optimization, the MATLAB implementation of each of the proposed approaches ("ROC analysis" and the "weighted kappa coefficient") runs in around 2.8 s on a Pentium 2.8 GHz desktop for a 476 × 477 image.

5 Conclusion

The selection of an edge detection operator is not a trivial problem, since different edge detectors often produce substantially different edge maps, even if they follow similar mathematical approaches. In this paper, we propose two techniques for the automatic statistical analysis of the correspondence of edge images that have emerged from different operators: the ROC analysis and the weighted kappa coefficient method. Both techniques efficiently integrate the pre-selected set of edge detectors in terms of both the quality of the highlighted features and the elimination of noise and texture. However, the weighted kappa coefficient approach can be considered superior in the sense that the trade-off between detection of minor edges and noise reduction can be quantified in advance as part of the problem specifications.

In future work, we intend to incorporate information from the surrounding pixels in the identification of edges. That is, the probability of a pixel being an edge will be affected by the state (edge/non-edge) of its neighbors. Furthermore, the possibility of using soft values for the final edge maps instead of hard edge extractors will be explored.


Correlation coefficient between the probability of a pixel being a true edge and being detected as edge

The parameter ρ in Equation 1 denotes the correlation coefficient between the probability of a pixel being a true edge and being detected as edge. A positive correlation coefficient between two random variables indicates that these variables follow the same trend. In our case, the random variables of interest are the true edge image and the detected edge image. Therefore, a positive correlation coefficient indicates that if the probability of a pixel f(x1, y1) being a true edge is higher as compared to the same probability for the pixel f(x2, y2), then the probability of the pixel f(x1, y1) detected as edge pixel is also higher compared with the same probability for the pixel f(x2, y2).

All the probabilities, computed for legitimate and random edge detection, according to Equation 1 are presented in Table 4 where the ' symbol denotes the complement operator.

Table 4 Analytic form of probabilities for legitimate and random edge detection


  1. Baltsavias EP: Object extraction and revision by image analysis using existing geodata and knowledge: current status and steps towards operational systems. ISPRS J Photogram Remote Sens 2004, 58:129-151. 10.1016/j.isprsjprs.2003.09.002

  2. Gruen A, Kuebler O, Agouris P: Automatic Extraction of Manmade Objects from Aerial and Space Images I. Birkhaeuser, Basel; 1995.

  3. Zhang K, Yan J, Chen SC: Automatic construction of building footprints from airborne lidar data. IEEE Trans Geosci Remote Sens 2006, 44(9):2523-2533.

  4. Munoz-Ferreras JM, Perez-Martinez F, Calvo-Gallego J, Asensio-Lopez A, Dorta-Naranjo BP, Blanco-del-Campo A: Traffic surveillance system based on a high-resolution radar. IEEE Trans Geosci Remote Sens 2008, 46(6):1624-1633.

  5. Robertson P, Brady JM: Adaptive image analysis for aerial surveillance. IEEE Intell Syst Appl 1999, 14(3):30-36. 10.1109/5254.769882

  6. Noronha S, Nevatia R: Detection and modeling of buildings from multiple aerial images. IEEE Trans Pattern Anal Mach Intell 2001, 23(5):501-518. 10.1109/34.922708

  7. Tupin F, Maitre H, Mangin J-F, Nicolas J-M, Pechersky E: Detection of linear features in SAR images: application to road network extraction. IEEE Trans Geosci Remote Sens 1998, 36(2):434-453. 10.1109/36.662728

  8. Touzi R, Lopes A, Bousquet P: A statistical and geometrical edge detector for SAR images. IEEE Trans Geosci Remote Sens 1988, 26(6):764-773. 10.1109/36.7708

  9. Rowe NC, Grewe LL: Change detection for linear features in aerial photographs using edge-finding. IEEE Trans Geosci Remote Sens 2001, 39(7):1608-1612. 10.1109/36.934092

  10. Gamba P, Dell'Acqua F, Lisini G, Trianni G: Improved VHR urban area mapping exploiting object boundaries. IEEE Trans Geosci Remote Sens 2007, 45(8):2676-2682.

  11. Matthews J: An introduction to edge detection: the Sobel edge detector. 2002. []

  12. Roberts LG: Machine perception of 3-D solids. Optical and Electro-Optical Information Processing. MIT Press, Cambridge; 1965.

  13. Abdou IE, Pratt WK: Quantitative design and evaluation of enhancement/thresholding edge detectors. Proc IEEE 1979, 67:753-763.

  14. Marr D, Hildreth EC: Theory of edge detection. Proc R Soc Lond 1980, B207:187-217.

  15. Canny JF: A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell 1986, 8(6):679-698.

  16. Deriche R: Using Canny's criteria to derive a recursive implemented optimal edge detector. Int J Comput Vis 1987, 1(2):167-187. 10.1007/BF00123164

  17. Rothwell CA, Mundy JL, Hoffman W, Nguyen VD: Driving vision by topology. International Symposium on Computer Vision (Coral Gables, FL) 1995, 395-400.

  18. Bergholm F: Edge focusing. IEEE Trans Pattern Anal Mach Intell 1987, 9(6):726-741.

  19. Lacroix V: The primary raster: a multiresolution image description. Proceedings 10th International Conference on Pattern Recognition 1990, 1:903-907.

  20. Schunck BG: Edge detection with Gaussian filters at multiple scales. Proceedings IEEE Computer Society Workshop on Computer Vision 1987, 208-210.

  21. Iverson LA, Zucker SW: Logical/linear operators for image curves. IEEE Trans Pattern Anal Mach Intell 1995, 17(10):982-996. 10.1109/34.464562

  22. Peli E: Feature detection algorithm based on visual system models. Proc IEEE 2002, 90:78-93. 10.1109/5.982407

  23. Konishi S, Yuille AL, Coughlan JM, Zhu SC: Statistical edge detection: learning and evaluating edge cues. IEEE Trans Pattern Anal Mach Intell 2003, 25(1):57-74. 10.1109/TPAMI.2003.1159946

  24. Martin DR, Fowlkes CC, Malik J: Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Trans Pattern Anal Mach Intell 2004, 26(5):530-549. 10.1109/TPAMI.2004.1273918

  25. Dollar P, Zhuowen T, Belongie S: Supervised learning of edges and object boundaries. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) 2006, 2:1964-1971.

  26. Yitzhaky Y, Peli E: A method for objective edge detection evaluation and detector parameter selection. IEEE Trans Image Process 2003, 25(8):1027-1033.

  27. Kraemer HC, Periyakoil VS, Noda A: Tutorial in biostatistics: kappa coefficients in medical research. Stat Med 2002, 21(14):2109-2129. 10.1002/sim.1180

  28. Giannarou S, Stathaki T: Edge detection using quantitative combination of multiple operators. IEEE Workshop on Signal Processing Systems Design and Implementation 2005, 359-364.

  29. Giannarou S, Stathaki T: Novel statistical approaches to the quantitative combination of multiple edge detectors. International Conference on Image Analysis and Recognition (ICIAR) 2006, 184-195.

  30. Kraemer HC: Evaluating Medical Tests: Objective and Quantitative Guidelines. Sage Publications, Newbury Park; 1992.

  31. Kirkwood BR, Sterne JAC: Essential Medical Statistics. Blackwell Science, Oxford; 2003.

  32. Fawcett T: ROC graphs: notes and practical considerations for data mining researchers. HP Labs Tech Report HPL-2003-4 2003.

  33. Zweig MH, Campbell G: Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Am Assoc Clin Chem 1993, 39(4):561-577.

  34. Fleiss JL: Statistical Methods for Rates and Proportions. Wiley, New York; 1981.

  35. Sim J, Wright CC: The kappa statistic in reliability studies: use, interpretation, and sample size requirements. J Am Phys Therapy 2005, 85(3):257-268.

  36. []


The investigations which are the subject of this paper were initiated by Dstl under the auspices of the United Kingdom Ministry of Defence Systems Engineering for Autonomous Systems Defence Technology Centre.

Author information


Corresponding author

Correspondence to Stamatia Giannarou.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Giannarou, S., Stathaki, T. Optimal edge detection using multiple operators for image understanding. EURASIP J. Adv. Signal Process. 2011, 28 (2011).
