A novel shape descriptor, the ratio histogram (R-histogram), is proposed to represent the relative attitude relationship between two independent shapes. For a pair of shapes, each shape is treated as a set of longitudinal segments parallel to the line connecting the centroids of the two shapes, and the R-histogram is composed of the length ratios of collinear longitudinal segments. The R-histogram is theoretically affine invariant because an affine transformation preserves the ratio of collinear distances. In addition, as computing a length ratio weakens the contribution of noise, the R-histogram is robust to noise. Based on the R-histogram, the shape-matching algorithm includes two major phases: preprocessing and matching. The first phase, which can be run off-line, computes the R-histograms of all original shape pairs. In the second phase, the R-histogram of each transformed shape pair is computed and the candidate matched shape pair with the minimal R-histogram matching error is found. Subsequently, a voting strategy is applied to the candidate corresponding shape pairs, which further improves the accuracy of shape matching. Experimental results demonstrate that the proposed R-histogram is robust and efficient.
Introduction
Shape matching plays an important role in image processing and computer vision applications. Since images taken from different viewpoints usually suffer from perspective distortions, a matching algorithm should be able to cope with them. Numerous methods, such as spectral transforms[1, 2], moment invariants[3, 4], iso-area normalization, time series, B-splines, curvature scale space (CSS)[8, 9], shape contexts, shape signatures, diagonals of orthogonal projection matrices (DOPM), and multiscale oriented corner correlation, have been proposed for shape matching under affine transformations. However, most of these methods assume that a sparse set of boundary points or interest points has been extracted beforehand. Besides, many traditional matching methods analyze only the properties of a single shape and ignore the spatial relationships between different shapes.
The relative position between shapes often helps in image understanding and shape matching. In some recent related studies, descriptors extracted from the spatial relationships between shapes are used for shape matching. Krishnapuram et al. first proposed a descriptor named the histogram of angles, in which a histogram of all possible angles between point pairs in the regions describes the directional spatial relations between two shapes. This histogram is computationally expensive and can only deal with raster data. To overcome these limitations, the F-histogram[16, 17], a generalization of the histogram of angles, was proposed. Unlike the histogram of angles, which treats a 2D object as a set of points, the F-histogram handles shapes as longitudinal segments. Besides the spatial relationship between two shapes, the F-histogram also captures the orientation and the size of the shapes. Furthermore, it can process both raster and vector data. As a result, the F-histogram can be used effectively for shape matching. However, the descriptor is not affine invariant: the F-histogram of a shape pair varies with the viewing orientation. Thus, to compare two F-histograms, their difference must be normalized by searching for several parameters, which are equivalent to the parameters of the geometric transformation between the images. This parameter search is computationally expensive.
Motivated by the F-histogram, in this article a novel affine invariant histogram, the ratio histogram (R-histogram), is proposed to describe the relative attitude relationship between two shapes. For a pair of independent shapes, the longitudinal segments, which are the intersections of the shapes with the lines parallel to the line connecting the centroids of the two shapes, are treated as primitives. The R-histogram is then composed of the length ratios of collinear longitudinal segments from the two shapes. This descriptor has a clear physical interpretation and can be applied to shape matching without searching for affine transformation parameters. In contrast to the F-histogram, which treats longitudinal segments in a number of directions over the range [0, 2π], the R-histogram treats the longitudinal segments of the shapes in only one fixed direction; the computational complexity is thereby significantly reduced. Moreover, an efficient shape-matching algorithm based on the R-histogram is developed in this article. The algorithm is divided into two phases. The R-histograms of the original shape pairs are first obtained in the off-line preprocessing phase. Then, in the matching phase, the algorithm first seeks the candidate correspondences between shape pairs based on R-histogram matching, and a novel scheme verifies the matched shapes by voting among the candidate corresponding shape pairs. Accordingly, the accuracy of shape matching is improved.
This study is based on the assumption that there are at least two independent shapes in the image, and that the topology is preserved when the image is transformed. The assumption is valid because objects are composed of several shapes whose relative positions are preserved under the mapping.
The outline of this article is as follows. The R-histogram and its affine invariance are introduced in the “R-histogram and its fundamental properties” section. In the “Shape matching” section, the matching algorithm based on the R-histogram is described. Experimental results are presented in the “Experiments” section. Finally, the “Conclusions” section concludes the article and suggests future work.
R-histogram and its fundamental properties
In this section, R-histogram, which represents the relative attitude relationship between two shapes, is first defined, and then its symmetry and affine invariance are explored.
Definition of R-histogram
As shown in Figure 1, two shapes A and B are located in an original positively oriented orthonormal frame. To measure the attitude of A relative to B, we first define a relative positively oriented orthonormal frame (Figure 1a) whose x-axis is parallel to the line connecting cA and cB, the centroids of A and B, respectively. l(v) is the line parallel to this x-axis with intercept v on the y-axis of the relative frame (the yAB-axis). For any real number v, the intersection A ∩ l(v) is a longitudinal segment of A. In this article, the term “shape” denotes a 2D region that may have holes or may consist of several connected components, so a longitudinal segment such as the one in Figure 1a may be the union of a finite number of disjoint segments. Similarly, B ∩ l(v) is a longitudinal segment of shape B.
The R-histogram of A with respect to B, denoted RAB, collects the length ratios of the collinear longitudinal segments A ∩ l(v) and B ∩ l(v) (Equation (1)).
In Equation (1), L(ξ) denotes the length of the longitudinal segment ξ, and N is the number of values of v sampled equidistantly in the range [vmin, vmax], where vmin and vmax are the minimum and maximum intercepts of the valid lines l(v), i.e., the lines that have nonempty intersections with both A and B:
$v_{\min} = \max(v^{A}_{\min}, v^{B}_{\min}), \quad v_{\max} = \min(v^{A}_{\max}, v^{B}_{\max})$ (2)
Here $v^{A}_{\min}$ and $v^{A}_{\max}$ are the minimum and maximum intercepts of the lines l(v) that have nonempty intersections with A. Similarly, $v^{B}_{\min}$ and $v^{B}_{\max}$ are the minimum and maximum intercepts of the lines l(v) that have nonempty intersections with B.
Similarly, the R-histogram of B relative to A is denoted RBA.
Note that RBA is calculated in its own relative positively oriented orthonormal frame (Figure 1b), whose x-axis points in the direction opposite to that of the frame used for RAB.
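The definition above can be sketched for raster data as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions that the text does not fix: the shapes are given as binary masks, the x-axis of the relative frame points from cA to cB, the ratio is taken as L(A ∩ l(v)) / L(B ∩ l(v)), and the N equidistant lines are replaced by N equal-width strips, so that pixel counts per strip stand in for segment lengths (the strip width cancels in the ratio). The function name `r_histogram` is hypothetical.

```python
import numpy as np

def r_histogram(mask_a, mask_b, n_bins=16):
    """Approximate R-histogram of A with respect to B from two binary masks."""
    # Pixel coordinates as (x, y) = (column, row).
    pts_a = np.argwhere(mask_a)[:, ::-1].astype(float)
    pts_b = np.argwhere(mask_b)[:, ::-1].astype(float)
    c_a, c_b = pts_a.mean(axis=0), pts_b.mean(axis=0)

    # Relative frame: x-axis along cA -> cB (assumed direction), y-axis normal to it.
    u = (c_b - c_a) / np.linalg.norm(c_b - c_a)
    n = np.array([-u[1], u[0]])

    # Intercept v of every pixel on the y-axis of the relative frame.
    v_a, v_b = pts_a @ n, pts_b @ n

    # Valid range: lines that meet both A and B.
    v_min = max(v_a.min(), v_b.min())
    v_max = min(v_a.max(), v_b.max())
    if v_max <= v_min:
        raise ValueError("no line parallel to the centroid line meets both shapes")

    # Pixel count per strip is proportional to the segment length in that strip.
    h_a, _ = np.histogram(v_a, bins=n_bins, range=(v_min, v_max))
    h_b, _ = np.histogram(v_b, bins=n_bins, range=(v_min, v_max))
    return np.divide(h_a, h_b, out=np.zeros(n_bins), where=h_b > 0)

# Two axis-aligned rectangles on the same rows: A is 20 pixels wide, B is 10.
a = np.zeros((40, 80), dtype=bool); a[10:30, 5:25] = True
b = np.zeros((40, 80), dtype=bool); b[10:30, 50:60] = True
print(r_histogram(a, b, n_bins=5))
```

For the two rectangles every strip contains twice as many pixels of A as of B, so all five ratios equal 2.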
Symmetry and affine invariance
Here, desirable properties (i.e., symmetry and affine invariance) of R-histogram for shape matching are proved.
As shown in Figure 1, the x-axis of the relative frame used for RAB (Figure 1a) points in the direction opposite to the x-axis of the relative frame used for RBA (Figure 1b). Consequently, the lines l(vmin) and l(vmax) in the frame for RAB become the lines l(vmax) and l(vmin), respectively, in the frame for RBA. The symmetry of the R-histogram, which is the mathematical link between RAB and RBA, follows from this interchange.
The general 2D affine transformation maps a point p in the original image to its corresponding point q in the transformed image by q = AT p + t, where t is the translation vector and AT is the affine transformation matrix between the two images. Rotation, scaling and shearing are special cases of the affine transform, each represented by its own 2 × 2 matrix.
If two shapes A and B are transformed into A′ and B′ by an affine transformation, we have
RA′B′ = RAB (6)
Equation (6) states that a shape pair preserves its R-histogram under an affine mapping.
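The invariance claim can be checked numerically with a short sketch. Everything here is illustrative: the matrix forms for rotation, scaling and shearing are the textbook ones (the shear is assumed to act along x with factor k), the shapes are two rectangles sampled with random points at equal density, and `ratio_histogram` is a hypothetical strip-count approximation of the R-histogram with the ratio taken as A over B.

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def scaling(sx, sy):
    return np.array([[sx, 0.0], [0.0, sy]])

def shearing(k):
    return np.array([[1.0, k], [0.0, 1.0]])

def ratio_histogram(pts_a, pts_b, n_bins=8):
    """Strip-count ratios of two point-sampled shapes (assumed convention A / B)."""
    c_a, c_b = pts_a.mean(axis=0), pts_b.mean(axis=0)
    u = (c_b - c_a) / np.linalg.norm(c_b - c_a)   # x-axis of the relative frame
    n = np.array([-u[1], u[0]])                   # y-axis of the relative frame
    v_a, v_b = pts_a @ n, pts_b @ n
    lo, hi = max(v_a.min(), v_b.min()), min(v_a.max(), v_b.max())
    h_a, _ = np.histogram(v_a, bins=n_bins, range=(lo, hi))
    h_b, _ = np.histogram(v_b, bins=n_bins, range=(lo, hi))
    return np.divide(h_a, h_b, out=np.zeros(n_bins), where=h_b > 0)

rng = np.random.default_rng(0)
# Equal sampling density (1000 points per unit area) so counts track lengths.
pts_a = rng.uniform([0, 0], [4, 6], size=(24000, 2))    # 4 x 6 rectangle
pts_b = rng.uniform([8, 1], [10, 5], size=(8000, 2))    # 2 x 4 rectangle

A_T = rotation(np.deg2rad(30)) @ shearing(0.5) @ scaling(2.0, 0.75)  # det > 0
t = np.array([3.0, -1.0])

r_before = ratio_histogram(pts_a, pts_b)
r_after = ratio_histogram(pts_a @ A_T.T + t, pts_b @ A_T.T + t)
print(np.round(r_before, 3))
print(np.round(r_after, 3))
```

Because the same sample points are mapped exactly, the two printed histograms should agree, and each entry should be close to the true length ratio 4 / 2 = 2 up to sampling noise.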
Obviously, the R-histogram is invariant to translation, so we only explain its invariance to the affine transformation matrix. As shown in Figure 2, in an original positively oriented orthonormal frame, two shapes A and B (Figure 2a) are transformed into A′ and B′ (Figure 2b) by an affine transformation matrix AT. The shape pairs (A, B) and (A′, B′) each have their own relative frame, used for RAB and RA′B′, respectively. cA, cB, cA′ and cB′ are the centroids of A, B, A′ and B′, respectively. It can be deduced that, if the projection of a point on the yAB-axis is known, the projection of the mapped point on the yA′B′-axis is obtained by
$v' = \det(A_T)\,\dfrac{|c_A c_B|}{|c_{A'} c_{B'}|}\,v$ (7)
In Equation (7), det(AT), the determinant of the transformation matrix, is positive; |·| is the distance between two points; v is the projection of a point on the yAB-axis; and v′ is the projection of the mapped point on the yA′B′-axis.
Equation (7) shows that the projection of a transformed point on the yA′B′-axis is directly proportional to the projection of the original point on the yAB-axis. Moreover, the longitudinal segment A ∩ l(v) can be treated as the set of points of shape A whose projection on the yAB-axis equals v. Therefore, we have A ∩ l(v) ⇔ A′ ∩ l(v′) and B ∩ l(v) ⇔ B′ ∩ l(v′), with vmin ⇔ v′min and vmax ⇔ v′max,
where “⇔” denotes the correspondence relationship. As introduced in the “Definition of R-histogram” section, vmin, vmax, v′min and v′max are the minimum and maximum intercepts of the valid lines for the original and transformed shape pairs. Since corresponding longitudinal segments of the two shapes are collinear, and an affine transformation preserves the ratio of lengths measured along the same line, the affine invariance of the R-histogram is proved.
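The proportionality between v and v′ can be verified numerically. The sketch below assumes that the origin of each relative frame is placed at the first centroid (cA or cA′) and that the x-axis points from the first centroid to the second; the scale factor det(AT) · |cA cB| / |cA′ cB′| is the one implied by the description of Equation (7). The helper `intercepts` and all numeric values are hypothetical.

```python
import numpy as np

def intercepts(points, c_a, c_b):
    """Projection of `points` on the y-axis of the relative frame.

    Assumed frame: origin at c_a, x-axis along c_a -> c_b, positively oriented.
    """
    u = (c_b - c_a) / np.linalg.norm(c_b - c_a)
    n = np.array([-u[1], u[0]])
    return (points - c_a) @ n

rng = np.random.default_rng(1)
pts = rng.normal(size=(100, 2))                    # arbitrary test points
c_a, c_b = np.array([0.0, 0.0]), np.array([5.0, 1.0])

A_T = np.array([[1.5, 0.4], [-0.3, 0.9]])          # det = 1.47 > 0
t = np.array([2.0, -3.0])

def warp(p):
    """Affine map q = A_T p + t for a single point or an array of points."""
    return p @ A_T.T + t

v = intercepts(pts, c_a, c_b)
v_prime = intercepts(warp(pts), warp(c_a), warp(c_b))

# Scale factor: det(A_T) * |cA cB| / |cA' cB'|.
scale = (np.linalg.det(A_T) * np.linalg.norm(c_b - c_a)
         / np.linalg.norm(warp(c_b) - warp(c_a)))
print(np.allclose(v_prime, scale * v))
```

Since the factor is positive whenever det(AT) is positive, the ordering of the intercepts is preserved, which is why equidistant samples in [vmin, vmax] map to equidistant samples in the transformed range.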
Shape matching
For convenience, we call a shape in an original image a template shape, and a shape in an input image an input shape. The shape-matching algorithm has a preprocessing phase and a matching phase. In the preprocessing phase, R-histograms are obtained from all shape pairs in the original image. As the preprocessing phase can be executed off-line, the on-line cost of the algorithm is much lower than that of the straightforward approach. Subsequently, in the matching phase, the candidate corresponding shape pairs are first found by R-histogram matching, and the correspondences between the template and input shapes are then obtained by voting among the candidate corresponding shape pairs.
In the preprocessing phase, the aim is to obtain the R-histograms of the template shape pairs in the original image. Assume we are given an original image from which m template shapes have been extracted. For each ordered template shape pair (Mi, Mj), its R-histogram is computed through the following steps:
(1) Extract the centroids ci and cj of the two template shapes.
(2) Build a relative positively oriented orthonormal frame whose x-axis is parallel to the line connecting ci and cj. Then obtain [vmin, vmax], the range of intercepts of the valid lines. The valid lines are parallel to the x-axis and have nonempty intersections with both Mi and Mj.
(3) For each sampled intercept v in [vmin, vmax], compute the length ratio of the longitudinal segments of Mi and Mj on the line l(v).
(4) Owing to the symmetry of the R-histogram, the R-histogram of the ordered shape pair (Mj, Mi) can be obtained directly from that of (Mi, Mj).
The complexity of the preprocessing step is.
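The preprocessing steps above can be sketched as a loop over unordered template pairs, with the reverse-ordered pair filled in by symmetry. The exact symmetry relation is not reproduced in this text, so the sketch assumes the ratio convention L(Mi) / L(Mj) with the sample order reversed when the frame is flipped, which gives a reversed, element-wise reciprocal histogram. `preprocess`, `swap_pair` and the `descriptor` callback are hypothetical names.

```python
import numpy as np

def swap_pair(r_ij):
    """R-histogram of (Mj, Mi) from that of (Mi, Mj).

    ASSUMPTION: flipping the frame reverses the sample order and inverts
    each ratio; empty bins (ratio 0) are kept at 0.
    """
    flipped = np.asarray(r_ij, dtype=float)[::-1]
    return np.divide(1.0, flipped, out=np.zeros_like(flipped), where=flipped > 0)

def preprocess(templates, descriptor):
    """Return {(i, j): R-histogram} for every ordered pair of template shapes."""
    table = {}
    for i in range(len(templates)):
        for j in range(i + 1, len(templates)):
            r_ij = np.asarray(descriptor(templates[i], templates[j]), dtype=float)
            table[(i, j)] = r_ij
            table[(j, i)] = swap_pair(r_ij)   # no second descriptor call needed
    return table

# Toy stand-in for a real descriptor: "shapes" are numbers, ratios are a/b, 2a/b.
toy_descriptor = lambda a, b: np.array([a / b, 2 * a / b])
table = preprocess([1.0, 2.0, 4.0], toy_descriptor)
print(len(table), table[(0, 1)], table[(1, 0)])
```

With m template shapes the table holds m(m − 1) entries, but the descriptor is evaluated only for half of them.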
In the matching phase, we are given a transformed image from which t input shapes are extracted. In this stage, the corresponding shape pairs with the minimal descriptor matching error are first found. The matching error err between two descriptors is the normalized L1 distance between RMiMj and RTkTs,
where RMiMj and RTkTs are the R-histograms of a template shape pair (Mi, Mj) and an input shape pair (Tk, Ts), respectively. The smaller err is, the better the two shape pairs correspond.
Theoretically, the template shape pair that minimizes err is the one that corresponds to the input shape pair (Tk, Ts).
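A minimal sketch of descriptor matching follows. The normalization of the L1 distance is not recoverable from this text, so the sketch assumes one common choice, dividing the summed absolute difference by the summed magnitudes of both histograms, which keeps the error in [0, 1]. `matching_error` and `best_template_pair` are hypothetical names, and the table values are made up for illustration.

```python
import numpy as np

def matching_error(r_template, r_input):
    """Normalized L1 distance between two R-histograms (assumed normalization)."""
    r_template = np.asarray(r_template, dtype=float)
    r_input = np.asarray(r_input, dtype=float)
    denom = np.abs(r_template).sum() + np.abs(r_input).sum()
    if denom == 0:
        return 0.0
    return float(np.abs(r_template - r_input).sum() / denom)

def best_template_pair(template_table, r_input):
    """Key (i, j) of the template pair whose R-histogram is closest to r_input."""
    return min(template_table,
               key=lambda pair: matching_error(template_table[pair], r_input))

template_table = {
    (0, 1): [2.0, 2.0, 2.0],
    (1, 0): [0.5, 0.5, 0.5],
    (0, 2): [1.0, 3.0, 1.0],
}
noisy_input = [2.1, 1.9, 2.0]
print(best_template_pair(template_table, noisy_input))
```

Here the slightly perturbed input histogram is closest to the entry stored under (0, 1).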
Unfortunately, a match that depends on a single corresponding descriptor pair is sometimes unreliable because of noise and/or inaccurate shape extraction. To improve the accuracy of shape matching, the template shape corresponding to each input shape is established by voting. The template shape corresponding to the input shape Tk is determined as follows:
(1) For each input shape pair (Tk, Ts) with s ≠ k, find its candidate corresponding template shape pair (Mi′, Mj′), and cast a vote for the template shape Mi′.
(2) The template shape that receives the largest number of votes is taken as the one corresponding to Tk.
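The voting step can be sketched as follows, assuming both images have already been reduced to tables of R-histograms keyed by ordered index pairs. The error function here is a plain mean absolute difference used as a stand-in for the paper's normalized L1 distance, and `vote_for_shape` and the toy tables are hypothetical.

```python
from collections import Counter
import numpy as np

def l1_error(r1, r2):
    """Mean absolute difference, a stand-in for the normalized L1 distance."""
    return float(np.abs(np.asarray(r1, float) - np.asarray(r2, float)).mean())

def vote_for_shape(k, input_table, template_table):
    """Index of the template shape matched to input shape k, or None if no votes."""
    votes = Counter()
    for (first, _second), r_in in input_table.items():
        if first != k:
            continue
        # Candidate template pair (i', j') with minimal descriptor error.
        i_best, _j_best = min(template_table,
                              key=lambda pair: l1_error(template_table[pair], r_in))
        votes[i_best] += 1          # vote for the first shape of the pair
    return votes.most_common(1)[0][0] if votes else None

template_table = {
    (0, 1): [2.0, 2.0],  (1, 0): [0.5, 0.5],
    (0, 2): [3.0, 1.0],  (2, 0): [1.0, 1 / 3],
    (1, 2): [1.0, 4.0],  (2, 1): [0.25, 1.0],
}
# Input shape 0 is a noisy copy of template 1; input 1 of template 2; input 2 of template 0.
input_table = {
    (0, 1): [1.05, 3.9],   # close to template pair (1, 2)
    (0, 2): [0.52, 0.49],  # close to template pair (1, 0)
}
print(vote_for_shape(0, input_table, template_table))
```

Both input pairs that start with input shape 0 vote for template shape 1, so a single unreliable descriptor match would be outvoted when more pairs are available.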
Experiments
In this section, the performance of the R-histogram is tested and compared with three state-of-the-art methods, namely the zero-order F-histogram[16, 17], global DOPM and CSS[8, 9]. The sensitivity of the descriptors to affine transformations and noise is compared first, and their performance on shape matching and object recognition is then tested.
Sensitivity to affine transformation and noise
The affine invariance of the R-histogram has been proven theoretically in the “Symmetry and affine invariance” section; here, examples of this property are provided. The 46 test shapes in 5 groups (see Figure 3) are chosen from the MPEG-7 CE-shape-1 database, and the shapes in each group are placed randomly to create 10 different original images. Consequently, the following statistics are computed over 50 original images and their transformed versions. The transformed images are obtained with different affine transformations and various noise levels. Examples of the transformed shapes are shown in Figure 4.
To estimate the sensitivity and performance of the descriptors with respect to each transformation parameter and to noise separately, only one parameter is changed in each group of experiments while the others are fixed.
The sensitivity of R-histogram to affine transformations is evaluated by the matching error between descriptors of the corresponding template and input shape pairs. Simultaneously, the global DOPM and the normalized zero order F-histogram are taken for comparison.
First, the behavior of the descriptors under rotation is reported. Figure 5a depicts the matching error for rotation angles θ ranging from 10° to 180° in steps of 10°. The matching error fluctuates only slightly as θ changes, and the R-histogram outperforms global DOPM and the F-histogram.
Then the behavior of the descriptors under scaling is evaluated. The original images are transformed by nonuniform scaling, with the scaling value ranging from 0.25 to 4 in steps of 0.25. Figure 5b shows that the matching errors of the descriptors increase as the degree of shape distortion increases, and that the R-histogram is less sensitive to nonuniform scaling than global DOPM.
The sensitivity of the descriptors to shearing is shown in Figure 5c. The larger the absolute value of the shearing factor k, the higher the matching error. The R-histogram is more robust to shearing than the compared descriptors.
Finally, the robustness of the descriptors to noise is observed. To obtain noisy shapes, each point on the contour is shifted by an amount in the range [–r, r] along the direction perpendicular to the tangent at that point. The signal-to-noise ratio (SNR) of a noisy shape is defined with respect to d,
where d is the average distance between the points on the contour and the centroid of the shape. Noise with SNR values from 20 to 45 dB in steps of 5 dB is added to the shapes. The matching errors obtained in the noisy cases are summarized in Figure 5d. The errors of all descriptors decrease as the SNR increases, and the R-histogram performs best against noise.
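The noise model can be sketched as below. The SNR formula itself is not reproduced in this text, so the sketch assumes SNR = 20 log10(d / r) in dB, which is only one plausible reading; the contour is assumed to be a closed, ordered list of points, and the normal is estimated from central differences. `add_contour_noise` is a hypothetical name.

```python
import numpy as np

def add_contour_noise(contour, snr_db, rng):
    """Shift each contour point along its normal by a uniform amount in [-r, r]."""
    contour = np.asarray(contour, dtype=float)
    # d: average distance between the contour points and their centroid.
    d = np.linalg.norm(contour - contour.mean(axis=0), axis=1).mean()
    r = d / 10 ** (snr_db / 20.0)   # ASSUMED relation: SNR = 20 * log10(d / r)

    # Tangent from central differences on the closed contour, then rotate by 90 degrees.
    tangent = np.roll(contour, -1, axis=0) - np.roll(contour, 1, axis=0)
    tangent /= np.linalg.norm(tangent, axis=1, keepdims=True)
    normal = np.stack([-tangent[:, 1], tangent[:, 0]], axis=1)

    shift = rng.uniform(-r, r, size=len(contour))
    return contour + shift[:, None] * normal

angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = 10.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
noisy = add_contour_noise(circle, snr_db=20.0, rng=np.random.default_rng(0))
print(np.abs(np.linalg.norm(noisy, axis=1) - 10.0).max())
```

For a circle of radius 10 at 20 dB the assumed relation gives r = 1, so every noisy point stays within one unit of the original radius.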
R-histogram for shape matching
The original and transformed images from the “Sensitivity to affine transformation and noise” section are used again in these experiments. For each shape in the original images, the matched shape in the transformed images is sought. The correct matching rate (CMR) is calculated as
CMR = OM / AM × 100%
where OM is the number of correct correspondences that are observed, and AM is the actual number of correspondences between original and input shapes.
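As a small worked example of the CMR, the sketch below counts OM as the number of shapes whose predicted match equals the ground-truth match and AM as the number of ground-truth correspondences. The dictionary layout and the function name are assumptions made for illustration.

```python
def correct_matching_rate(predicted, truth):
    """CMR in percent: observed correct matches (OM) over actual matches (AM)."""
    observed = sum(1 for shape, match in truth.items()
                   if predicted.get(shape) == match)
    return 100.0 * observed / len(truth)

truth = {0: "a", 1: "b", 2: "c", 3: "d"}        # AM = 4 actual correspondences
predicted = {0: "a", 1: "b", 2: "x"}            # shape 3 unmatched, shape 2 wrong
print(correct_matching_rate(predicted, truth))  # OM = 2
```

Two of the four correspondences are recovered, so the rate is 50%.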
The performance of the R-histogram on shape matching is compared with that of global DOPM, the zero-order F-histogram, and affine-length CSS. Moreover, for the R-histogram and F-histogram algorithms, the results of descriptor matching are reported in addition to the results of shape matching.
First, the performance of the descriptors is compared as the rotation angle changes from 0° to 180° in steps of 10°. Figure 6a indicates that the R-histogram is more robust to rotation. Furthermore, the CMRs of shape matching using the R-histogram and the F-histogram, which are both 100%, are higher than the CMRs of their descriptor matching. This supports the validity of the voting-based shape-matching algorithm.
Then, the effects of scaling (Figure 6b) and shearing (Figure 6c) on the descriptors are observed. The scaling value changes from 0.25 to 4 in steps of 0.25, and the shearing factor k takes values from –3 to 3 in steps of 1. The R-histogram is the most robust to nonuniform scaling and shearing, and the zero-order F-histogram performs worst as it is not invariant to nonuniform scaling and shearing.
Finally, Figure 6d shows the performance of the descriptors under varying noise levels. The CMRs of all descriptors increase as the SNR increases, and the R-histogram is the most robust to noise.
In addition, semi-log plots of the running times of the descriptors with respect to affine transformations and noise are shown in Figure 7. The following observations can be made from Figure 7: (1) the computational costs of the R-histogram and the F-histogram are nearly independent of rotation, shearing, and noise, but they increase as the scale of the shape increases; (2) the computational cost of the R-histogram is much lower than that of the F-histogram; (3) the computational costs of global DOPM and CSS, which are determined by the sampled curvature, are nearly invariant when the shape is transformed; (4) DOPM has the lowest computational cost, while the F-histogram has the highest.
As a summary of this section, we found that (1) R-histogram has the best performance in robustness to affine transformations and noise; (2) for R-histogram and F-histogram, the CMR of shape matching is higher than that of descriptor matching because of the voting strategy; (3) the computational complexity of R-histogram is much lower than that of F-histogram.
R-histogram for object recognition
As indicated in the “Introduction” section, the R-histogram is based on the premise that there are at least two shapes in an image, and that the shapes preserve their relative positions when the image is transformed. Under this premise, the R-histogram can be used for object recognition, since objects are usually composed of several shapes with a fixed topology. Therefore, in this section, we test the performance of the R-histogram on object recognition. Unlike the experiments in the “R-histogram for shape matching” section, which find the correspondences between shapes, the task here is to find the matched object with the maximum number of corresponding shape pairs.
License plates (see Figure 8a) collected from the Internet are taken as template objects. The input objects in Figure 8b are obtained by applying arbitrary affine transformations to the template objects. The independent shapes of the letters and numbers, whose contours are marked in red in Figure 8, are obtained by simple image segmentation, as they are uniform. The results of the license recognition are given in Table 1, where “Y” and “N” denote correct and wrong recognition results, respectively. In addition, the shape-matching results of the R-histogram, F-histogram, DOPM, and CSS are 100, 10, 57.5, and 60%, respectively. The experiments show that the R-histogram is superior to the F-histogram, DOPM, and CSS for object recognition under affine transformations, whereas the F-histogram performs worst as it is not affine invariant.
Conclusions
In this article, a novel affine invariant descriptor, the R-histogram, is proposed to describe the relative attitude relationship between shapes. The shapes are handled as longitudinal segments parallel to the line connecting the centroids of two shapes, and the R-histogram is constructed from the length ratios of collinear longitudinal segments of the two shapes. In the shape-matching algorithm, the R-histograms of the original shape pairs are first computed in the off-line preprocessing phase. Then, in the matching phase, a voting strategy is applied to the candidate corresponding shape pairs found by R-histogram matching in order to improve the shape-matching accuracy. The proposed algorithm has four advantages. First, the contours of the shapes do not need to be extracted; second, the new descriptor is robust to affine transformations and noise; third, it is simple and has low computational complexity; finally, it achieves high shape-matching accuracy by voting among all candidate correspondences with minimal R-histogram matching error. The R-histogram of a shape pair is insensitive to the distance along the line connecting the centroids of the shapes, so shape pairs with the same attitudes but different distances generate indistinguishable R-histograms. One solution to this limitation is to add distance information to the R-histogram, which will be investigated in our future research.
References
Arbter K, Snyder WE, Burkhardt H, Hirzinger G: Application of affine-invariant Fourier descriptors to recognition of 3-D objects. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12(7):640-647. 10.1109/34.56206
Mei Y, Androutsos D: Affine invariant shape descriptors: the ICA-Fourier descriptor and the PCA-Fourier descriptor. 19th International Conference on Pattern Recognition, Tampa, Florida, USA; 2008:1-4.
Matsakis P, Keller JM, Sjahputera O, Marjamaa J: The use of force histograms for affine-invariant relative position description. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26(1):1-22. 10.1109/TPAMI.2004.1261075
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Wang, W., Xiong, B., Sun, H. et al. An affine invariant relative attitude relationship descriptor for shape matching based on ratio histograms. EURASIP J. Adv. Signal Process. 2012, 209 (2012). https://doi.org/10.1186/1687-6180-2012-209