- Research
- Open Access
An affine invariant relative attitude relationship descriptor for shape matching based on ratio histograms
- Wei Wang^{1},
- Boli Xiong^{1},
- Hao Sun^{1},
- Hongping Cai^{1},
- Yongmei Jiang^{1} and
- Gangyao Kuang^{1}
https://doi.org/10.1186/1687-6180-2012-209
© Wang et al.; licensee Springer. 2012
- Received: 12 July 2012
- Accepted: 17 September 2012
- Published: 3 October 2012
Abstract
A novel shape descriptor, named the ratio histogram (R-histogram), is proposed to represent the relative attitude relationship between two independent shapes. For a pair of shapes, each shape is treated as a set of longitudinal segments parallel to the line connecting the centroids of the two shapes, and the R-histogram is composed of the length ratios of collinear longitudinal segments. The R-histogram is theoretically affine invariant because an affine transformation preserves the ratio of collinear distances. In addition, as the computation of the length ratio weakens the noise contribution, the R-histogram is robust to noise. Based on the R-histogram, the shape-matching algorithm includes two major phases: preprocessing and matching. The first phase, which can be processed off-line, computes the R-histograms of all original shape pairs. In the second phase, for each transformed shape pair, its R-histogram is computed and the candidate matched shape pair with minimal R-histogram matching error is found. Subsequently, a voting strategy, which further improves the accuracy of shape matching, is applied to the candidate corresponding shape pairs. Experimental results demonstrate that the proposed R-histogram is robust and efficient.
Keywords
- Affine invariant
- Attitude relationship
- R-histogram
- Shape matching
Introduction
Shape matching plays an important role in image processing and computer vision applications. Since images taken from different viewpoints usually suffer from perspective distortions, a matching algorithm should be capable of dealing with them. Numerous methods, such as spectral transforms [1, 2], moment invariants [3, 4], iso-area normalization [5], time series [6], B-splines [7], curvature scale space (CSS) [8, 9], shape contexts [10], shape signatures [11], diagonals of orthogonal projection matrices (DOPM) [12], and multiscale oriented corner correlation [13], have been proposed for shape matching under affine transformations. However, most of these methods assume that a sparse set of boundary points or interest points has been extracted beforehand. Besides, many traditional matching methods analyze only the properties of a single shape and ignore the spatial relationships between different shapes.
The relative position between shapes often helps in image understanding and shape matching. In some related studies, such as [14], descriptors extracted from the spatial relationship between shapes are used for shape matching. Krishnapuram et al. [15] first proposed a descriptor named the histogram of angles, in which a histogram of all possible angles between point pairs in the regions describes the directional spatial relations between two shapes. This histogram is computationally expensive and can only deal with raster data. To overcome these limitations, the F-histogram [16, 17], a generalization of the histogram of angles, was proposed. Whereas the histogram of angles treats a 2D object as a set of points, the F-histogram method handles shapes as longitudinal segments. Besides the spatial relationship between two shapes, the F-histogram also captures the orientation and the size of the shapes. Furthermore, the F-histogram can process both raster and vector data. As a result, the F-histogram can be used effectively for shape matching. However, the descriptor is not affine invariant: the F-histogram of a shape pair varies with the viewing orientation. Thus, to compare two F-histograms, the difference must be normalized by searching for several parameters, which are equivalent to the parameters of the geometric transformation between the images. This parameter search is computationally expensive.
Motivated by the F-histogram, this article proposes a novel affine invariant histogram, named the ratio histogram (R-histogram), to describe the relative attitude relationship between two shapes. For a pair of independent shapes, the longitudinal segments, which are the intersections of the shapes with the lines parallel to the line connecting the centroids of the two shapes, are treated as primitives. The R-histogram is then composed of the length ratios of collinear longitudinal segments from the two shapes. This descriptor has a clear physical interpretation and can be applied to shape matching without searching for affine transformation parameters. In contrast to the F-histogram, which treats longitudinal segments in a number of directions over the range [0, 2π], the R-histogram treats the longitudinal segments of the shapes in only one fixed direction, so the computational complexity is significantly reduced. Moreover, an efficient shape-matching algorithm based on the R-histogram is developed in this article. The algorithm is divided into two phases. The R-histograms of the original shape pairs are first obtained in the off-line preprocessing phase. Then, in the matching phase, the algorithm first seeks the candidate correspondences between shape pairs by R-histogram matching, and a novel scheme verifies the matched shapes by voting over the candidate corresponding shape pairs. Accordingly, the accuracy of shape matching is improved.
This study is based on the assumption that there are at least two independent shapes in the image and that the topology is preserved when the image is transformed. The assumption is valid because objects are composed of several shapes whose relative positions are preserved under the mapping.
The outline of this article is as follows. The R-histogram and its affine invariance are introduced in the “R-histogram and its fundamental properties” section. In the “Shape matching” section, the matching algorithm based on the R-histogram is described. Experimental results are presented in the “Experiments” section. Finally, the “Conclusions” section concludes the article and suggests future work.
R-histogram and its fundamental properties
In this section, R-histogram, which represents the relative attitude relationship between two shapes, is first defined, and then its symmetry and affine invariance are explored.
Definition of R-histogram
${v}_{min}^{A}$ and ${v}_{max}^{A}$ are the minimum and maximum intercepts of the lines l(v) that have nonempty intersections with A. Similarly, ${v}_{min}^{B}$ and ${v}_{max}^{B}$ are the minimum and maximum intercepts of the lines l(v) that have nonempty intersections with B.
Note that R_{ BA } is calculated in the relative positively oriented orthonormal frame $\left(O,{\overrightarrow{x}}_{\mathit{BA}},{\overrightarrow{y}}_{\mathit{BA}}\right)$ (Figure 1b), where the ${x}_{\mathit{BA}}$-axis has the same direction as ${\overrightarrow{v}}_{{c}_{B}{c}_{A}}$.
Symmetry and affine invariance
Here, desirable properties (i.e., symmetry and affine invariance) of R-histogram for shape matching are proved.
Symmetry
Affine invariance
Equation (6) states that a shape pair preserves its R-histogram under an affine mapping.
In Equation (7), det(A_{ T }), the determinant of the transformation matrix, is positive. $dis\left(\cdot ,\cdot \right)$ is the distance between two points, v is the projection of a point on the y_{ AB }-axis, and v′ is the projection of the mapped point on the y_{ A’B’ }-axis.
where “⇔” denotes the correspondence relationship. As introduced in the “Definition of R-histogram” section, ${v}_{min}$, ${v}_{max}$, ${{v}^{\prime}}_{min}$ and ${{v}^{\prime}}_{max}$ are the minimum and maximum intercepts of the valid lines for the original and transformed shape pairs. Since ${E}^{A}\left({v}_{n}\right)$ and ${E}^{B}\left({v}_{n}\right)$ are collinear, the affine invariance of the R-histogram follows from the invariance of the ratio of collinear distances [18] under affine transformations.
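The invariance argument can be checked numerically. The following is a minimal sketch, not part of the original derivation: the points, the matrix `A_T`, and the helper names are illustrative assumptions. It maps two collinear segments through an affine transformation with positive determinant and compares their length ratio before and after.

```python
import numpy as np

def collinear_ratio(p0, p1, q0, q1):
    """Length ratio |p0 p1| / |q0 q1| of two segments assumed to lie on one line."""
    return np.linalg.norm(p1 - p0) / np.linalg.norm(q1 - q0)

def apply_affine(points, matrix, offset):
    """Map each row p of `points` to matrix @ p + offset."""
    return points @ matrix.T + offset

# Four collinear points on the line through `base` with direction `direction`.
# Segment A spans parameters [0, 1.5]; segment B spans [4, 4.5].
base = np.array([0.0, 1.0])
direction = np.array([2.0, 1.0])
params = np.array([0.0, 1.5, 4.0, 4.5])
pts = base + params[:, None] * direction

# Hypothetical affine map with positive determinant (1.3*0.9 + 0.7*0.4 = 1.45).
A_T = np.array([[1.3, 0.7],
                [-0.4, 0.9]])
t = np.array([5.0, -2.0])
mapped = apply_affine(pts, A_T, t)

ratio_before = collinear_ratio(pts[0], pts[1], pts[2], pts[3])
ratio_after = collinear_ratio(mapped[0], mapped[1], mapped[2], mapped[3])
print(ratio_before, ratio_after)
```

The individual segment lengths change under the map, but both are scaled by the same factor because they share a direction, so their ratio is unchanged.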
Shape matching
For convenience, we call a shape in an original image a template shape, and a shape in an input image an input shape. The shape-matching algorithm has a preprocessing phase and a matching phase. In the preprocessing phase, R-histograms are obtained from all shape pairs in the original image. As the preprocessing phase can be executed off-line, the on-line cost of our algorithm is considerably lower than that of the straightforward approach. Subsequently, in the matching phase, the candidate corresponding shape pairs are first discovered by R-histogram matching, and the correspondences between the template and input shapes are then obtained by voting over the candidate corresponding shape pairs.
Preprocessing
- (1) Extract the centroids c_{ i } and c_{ j } of the two template shapes M_{ i } and M_{ j }.
- (2) Build a relative positively oriented orthonormal frame $\left(O,{\overrightarrow{x}}_{{c}_{i}{c}_{j}},{\overrightarrow{y}}_{{c}_{i}{c}_{j}}\right)$, whose ${x}_{{c}_{i}{c}_{j}}$-axis has the same direction as ${\overrightarrow{v}}_{{c}_{i}{c}_{j}}$. Then obtain $\left[{v}_{min},{v}_{max}\right]$, the range of intercepts of the valid lines. The valid lines are parallel to the x-axis and have nonempty intersections with both M_{ i } and M_{ j }.
- (3) For each number $n\in \left[0,N\right]$, compute ${R}_{{M}_{i}{M}_{j}}\left(n\right)=L\left({E}^{{M}_{i}}\left({v}_{n}\right)\right)/L\left({E}^{{M}_{j}}\left({v}_{n}\right)\right)$, where ${v}_{n}={v}_{\mathit{min}}+n\left({v}_{\mathit{max}}-{v}_{\mathit{min}}\right)/N$.
- (4) Due to the symmetry of the R-histogram, obtain the R-histogram of the ordered shape pair $\left({M}_{j},{M}_{i}\right)$ as ${R}_{{M}_{j}{M}_{i}}\left(n\right)=1/{R}_{{M}_{i}{M}_{j}}\left(N-n\right)$.
The complexity of the preprocessing step is $O\left({m}^{2}\right)$, where m is the number of template shapes.
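The preprocessing steps can be sketched in Python for shapes given as boolean pixel masks. This is an illustrative approximation under stated assumptions, not the authors' implementation: the function name `r_histogram`, the mask input, and the use of the image origin as O are assumptions, and the segment length L(E(v_n)) is approximated by the pixel count inside a thin strip of intercepts rather than measured along a single line.

```python
import numpy as np

def r_histogram(mask_a, mask_b, n_bins=32):
    """Approximate R-histogram of the ordered pair (A, B) from two boolean masks.

    Assumption: the length of the longitudinal segments at intercept v is
    approximated by the number of pixels whose intercept falls in a strip
    around v. Bins in which either shape has no pixels are set to NaN.
    """
    ya, xa = np.nonzero(mask_a)
    yb, xb = np.nonzero(mask_b)
    pa = np.column_stack([xa, ya]).astype(float)
    pb = np.column_stack([xb, yb]).astype(float)

    # Step 1: centroids of the two shapes.
    ca, cb = pa.mean(axis=0), pb.mean(axis=0)
    dist = np.linalg.norm(cb - ca)
    if dist == 0:
        raise ValueError("centroids coincide; the frame is undefined")

    # Step 2: x-axis along the centroid line, y-axis rotated by +90 degrees.
    x_hat = (cb - ca) / dist
    y_hat = np.array([-x_hat[1], x_hat[0]])
    va, vb = pa @ y_hat, pb @ y_hat          # intercept of each pixel
    v_min = max(va.min(), vb.min())          # lines must cross both shapes
    v_max = min(va.max(), vb.max())
    if v_min >= v_max:
        raise ValueError("no line parallel to the centroid line crosses both shapes")

    # Step 3: ratio of (approximate) segment lengths per intercept strip.
    edges = np.linspace(v_min, v_max, n_bins + 1)
    count_a, _ = np.histogram(va, bins=edges)
    count_b, _ = np.histogram(vb, bins=edges)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = count_a / count_b
    ratio[(count_a == 0) | (count_b == 0)] = np.nan
    return ratio

# Example: two rectangles on the same rows, 10 and 30 pixels wide.
mask_a = np.zeros((40, 80), dtype=bool)
mask_b = np.zeros((40, 80), dtype=bool)
mask_a[10:30, 5:15] = True
mask_b[10:30, 40:70] = True
r_ab = r_histogram(mask_a, mask_b, n_bins=5)  # ordered pair (A, B)
r_ba = r_histogram(mask_b, mask_a, n_bins=5)  # ordered pair (B, A)
print(r_ab, r_ba)
```

In this example every strip crosses 10 pixels of A and 30 pixels of B per row, so the two histograms are constant and satisfy the symmetry relation of step (4). In practice the reversed histogram would be derived from the first one instead of being recomputed.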
Matching
where ${R}_{{M}_{i}{M}_{j}}\left(n\right)$ and ${R}_{{T}_{k}{T}_{s}}\left(n\right)$ are the R-histograms of a template shape pair (M_{ i }, M_{ j }) and an input shape pair (T_{ k }, T_{ s }), respectively. The smaller the err, the better the match between the two shape pairs.
- (1) For each input shape pair $\left({T}_{k},{T}_{s}\right),s\in \left[1,t\right],s\ne k$, its corresponding template shape pair is ${\left({M}_{{i}^{\prime}},{M}_{{j}^{\prime}}\right)}_{\mathit{opt}}=\underset{i,j\in \left[1,m\right],i\ne j}{\arg\min}\,err\left({R}_{{T}_{k}{T}_{s}},{R}_{{M}_{i}{M}_{j}}\right)$. The template shape M_{ i′ } then receives one vote.
- (2) The template shape that receives the largest number of votes is taken as the one corresponding to T_{ k }.
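The two matching steps can be sketched as follows. This is a minimal illustration under assumptions: the error measure `hist_error` is a placeholder (mean absolute difference), since the article's own err definition is not reproduced here, and the dictionary layout and function names are hypothetical.

```python
from collections import Counter

import numpy as np

def hist_error(r1, r2):
    """Placeholder error between two R-histograms (assumed: mean absolute difference)."""
    return float(np.mean(np.abs(np.asarray(r1) - np.asarray(r2))))

def match_shapes(template_hists, input_hists):
    """Vote-based matching.

    template_hists: {(i, j): histogram} for ordered template pairs (M_i, M_j).
    input_hists:    {(k, s): histogram} for ordered input pairs (T_k, T_s).
    Returns {k: i}, the template shape with the most votes for each input shape.
    """
    votes = {}
    for (k, _s), r_in in input_hists.items():
        # Step 1: the best template pair for (T_k, T_s) votes for its first member.
        best_pair = min(template_hists,
                        key=lambda ij: hist_error(r_in, template_hists[ij]))
        votes.setdefault(k, Counter())[best_pair[0]] += 1
    # Step 2: the template shape with the largest number of votes wins.
    return {k: counter.most_common(1)[0][0] for k, counter in votes.items()}

# Toy data: constant histograms for three template shapes (0, 1, 2).
templates = {
    (0, 1): np.full(3, 0.5), (1, 0): np.full(3, 2.0),
    (0, 2): np.full(3, 0.25), (2, 0): np.full(3, 4.0),
    (1, 2): np.full(3, 0.8), (2, 1): np.full(3, 1.25),
}
# Input shapes "a", "b", "c" are slightly perturbed copies of 2, 0, 1.
inputs = {
    ("a", "b"): np.full(3, 4.05), ("b", "a"): np.full(3, 0.26),
    ("a", "c"): np.full(3, 1.2), ("c", "a"): np.full(3, 0.82),
    ("b", "c"): np.full(3, 0.52), ("c", "b"): np.full(3, 1.9),
}
result = match_shapes(templates, inputs)
print(result)
```

Because each input shape collects one vote per partner shape, a single wrong pair-level match can be outvoted by the remaining pairs.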
Experiments
In this section, the performance of the R-histogram is tested and compared with three state-of-the-art methods: the zero-order F-histogram [16, 17], global DOPM [12], and CSS [8, 9]. The sensitivity of the descriptors to affine transformations and noise is compared first, and then their performance on shape matching and object recognition is tested.
Sensitivity to affine transformation and noise
To estimate the sensitivity of the descriptors to each transformation parameter and to noise separately, only one parameter is changed in each group of experiments while the others are fixed.
The sensitivity of the R-histogram to affine transformations is evaluated by the matching error between the descriptors of corresponding template and input shape pairs. The global DOPM and the normalized zero-order F-histogram are taken for comparison.
Then the behavior of the descriptors under scaling is evaluated. The original images are transformed by nonuniform scaling, with ${s}_{x}/{s}_{y}$ varying from 0.25 to 4 in steps of 0.25. Figure 5b shows that the matching errors of the descriptors increase with the degree of shape distortion, and that the R-histogram is less sensitive to nonuniform scaling than global DOPM.
The sensitivity of the descriptors with respect to shearing is shown in Figure 5c. The larger the absolute value of the shearing factor k, the higher the matching error. R-histogram is more robust to shearing.
where d is the average distance between the points on the contour and its centroid. Noise with different SNR values, from 20 to 45 dB in steps of 5 dB, is added to the shapes. The matching errors obtained in the noisy cases are summarized in Figure 5d. It shows that the errors of all descriptors decrease as the SNR increases, and that the R-histogram has the best performance against noise.
R-histogram for shape matching
where OM is the number of correct correspondences that are observed, and AM is the actual number of correspondences between original and input shapes.
The performance of the R-histogram on shape matching is compared with that of global DOPM, the zero-order F-histogram, and affine-length CSS. Moreover, for the R-histogram and F-histogram algorithms, the results of descriptor matching are reported in addition to the results of shape matching.
Then, the effects of scaling (Figure 6b) and shearing (Figure 6c) on the descriptors are observed. The scaling values change from 0.25 to 4 in steps of 0.25, and the shearing factor k takes values from –3 to 3 in steps of 1. The R-histogram is the most robust to nonuniform scaling and shearing, and the zero-order F-histogram performs worst because it is not invariant to nonuniform scaling and shearing.
Finally, Figure 6d shows the performance of the descriptors under varying noise levels: the CMRs of all descriptors increase as the SNR increases, and the R-histogram is the most robust to noise.
As a summary of this section, we found that (1) R-histogram has the best performance in robustness to affine transformations and noise; (2) for R-histogram and F-histogram, the CMR of shape matching is higher than that of descriptor matching because of the voting strategy; (3) the computational complexity of R-histogram is much lower than that of F-histogram.
R-histogram for object recognition
As indicated in the “Introduction” section, the R-histogram is based on the premise that there are at least two shapes in an image and that the shapes preserve their relative positions when the image is transformed. This premise suits object recognition, since objects are usually composed of several shapes with fixed topology. Therefore, in this section, we test the performance of the R-histogram on object recognition. Unlike the experiments in the “R-histogram for shape matching” section, which find the correspondences between shapes, the task here is to find the matched object, that is, the one with the maximum number of corresponding shape pairs.
The results of object recognition
Descriptor / Input object | I1 | I2 | I3 | I4 | I5 | I6
---|---|---|---|---|---|---
R-histogram | Y | Y | Y | Y | Y | Y
F-histogram | Y | N | N | N | N | N
DOPM | Y | Y | N | Y | Y | Y
CSS | N | Y | Y | Y | Y | Y
Conclusions
In this article, a novel affine invariant descriptor, the R-histogram, is proposed to describe the relative attitude relationship between shapes. The shapes are handled as longitudinal segments parallel to the line connecting the centroids of two shapes, and the R-histogram is constructed from the length ratios of collinear longitudinal segments from the two shapes. In the shape-matching algorithm, the R-histograms of the original shape pairs are first computed in the off-line preprocessing phase. Then, in the matching phase, to improve the shape-matching accuracy, a voting strategy is applied to the candidate corresponding shape pairs, which are discovered by R-histogram matching. The proposed algorithm has four advantages. First, the contours of the shapes do not need to be extracted; second, the new descriptor is robust to affine transformation and noise; third, it is simple and has low computational complexity; finally, it improves shape-matching accuracy by voting over all candidate correspondences with minimal R-histogram matching error. One limitation is that the R-histogram of a shape pair is insensitive to the distance along the line connecting the centroids of the shapes, so shape pairs with the same attitudes but different distances generate indistinguishable R-histograms. One solution to this limitation is to add distance information to the R-histogram, which will be investigated in our future research.
Declarations
Acknowledgment
This work was supported by the National Natural Science Foundation of China under Grant 61201338.
Authors’ Affiliations
References
- Arbter K, Snyder WE, Burkhardt H, Hirzinger G: Application of affine-invariant Fourier descriptors to recognition of 3-D objects. IEEE Trans. Pattern Anal. Mach. Intell 1990, 12(7):640-647. 10.1109/34.56206
- Mei Y, Androutsos D: Affine invariant shape descriptors: the ICA-Fourier descriptor and the PCA-Fourier descriptor. 19th International Conference on Pattern Recognition, Tampa, Florida, USA; 2008:1-4.
- Yang Z, Cohen FS: Cross-weighted moments and affine invariants for image registration and matching. IEEE Trans. Pattern Anal. Mach. Intell 1999, 21(8):804-814. 10.1109/34.784312
- Hosny KM: On the computational aspects of affine moment invariants for gray scale images. Appl. Math. Comput 2008, 195(2):762-771. 10.1016/j.amc.2007.05.025
- Yang M, Kpalma K, Ronsin J: Affine invariance contour descriptor based on iso-area normalisation. Electron. Lett 2007, 43(7):379-380. 10.1049/el:20070076
- Kartikeyan B, Sarkar A: Shape description by time series. IEEE Trans. Pattern Anal. Mach. Intell 1989, 11(9):977-984. 10.1109/34.35501
- Yu-Ping Wang SL, Lee KT: Multiscale curvature based shape representation using B-spline wavelets. IEEE Trans. Image Process 1999, 8(11):1586-1592. 10.1109/83.799886
- Mokhtarian F, Abbasi S: Shape similarity retrieval under affine transforms. Pattern Recognit 2002, 35(1):31-41. 10.1016/S0031-3203(01)00040-1
- Mai F, Chang CQ, Hung YS: Affine invariant shape matching and recognition under partial occlusion. Proceedings of the 2010 IEEE 17th International Conference on Image Processing, Hong Kong; 2010:4605-4608.
- Belongie S, Malik J, Puzicha J: Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell 2002, 24(4):509-522.
- El-ghazal A, Basir O, Belkasim S: Farthest point distance: a new shape signature for Fourier descriptors. Signal Process. Image Commun 2009, 24(7):572-586.
- Wang Z, Liang M: Locally affine invariant descriptors for shape matching and retrieval. IEEE Signal Process. Lett 2010, 17(9):803-806.
- Zhao F, Huang Q, Wang H, Gao W: MOCC: a fast and robust correlation-based method for interest point matching under large scale changes. EURASIP J. Adv. Signal Process 2010. 10.1155/2010/410628
- Lee SY, Hsu FJ: Spatial reasoning and similarity retrieval of images using 2D C-string knowledge representation. Pattern Recognit 1992, 25(3):305-318. 10.1016/0031-3203(92)90112-V
- Krishnapuram R, Keller JM, Ma Y: Quantitative analysis of properties and spatial relations of fuzzy image regions. IEEE Trans. Fuzzy Syst 1993, 1(3):222-233. 10.1109/91.236554
- Matsakis P, Wendling L: A new way to represent the relative position between area objects. IEEE Trans. Pattern Anal. Mach. Intell 1999, 21(7):634-643. 10.1109/34.777374
- Matsakis P, Keller JM, Sjahputera O, Marjamaa J: The use of force histograms for affine-invariant relative position description. IEEE Trans. Pattern Anal. Mach. Intell 2004, 26(1):1-22. 10.1109/TPAMI.2004.1261075
- Modenov PS, Parkhomenko AS: Geometric Transformations. 1st edition. Academic, New York; 1965.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.