Fourier descriptors for broken shapes
EURASIP Journal on Advances in Signal Processing volume 2013, Article number: 161 (2013)
Abstract
Fourier descriptors are powerful features for the recognition of two-dimensional connected shapes. In this article, we propose a method to define Fourier descriptors even for broken shapes, i.e. shapes that can have more than one contour. The method is based on the convex hull of the shape and the distance to the closest actual contour point along the convex hull. We define different invariant Fourier descriptors for this three-dimensional representation of a two-dimensional shape and compare them on different data sets. The recognition rates are comparable to normal Fourier descriptors while the new scheme at the same time offers the option to also deal with broken contours. We also discuss and evaluate different normalisation schemes that make the descriptors invariant under scale and rotation.
1 Introduction
Shape descriptors are numbers that are computed from a two-dimensional shape. In some cases, the set of numbers is complete in the sense that the original shape can be reconstructed from the shape descriptors [1, 2], but even in these situations, only a subset of the shape descriptors is typically used in practical applications. The shape descriptors can thus be considered an approximate description of the shape, such that shape similarity corresponds to similarity of the shape descriptors. Consequently, they can be used for object recognition and object similarity detection.
There are two main categories of shape descriptors: volume-based and contour-based. Volume-based descriptors use all pixels of the object and include descriptors like geometric moments [3] or Zernike moments [4]. Contour-based descriptors are computed only from the shape boundary and include descriptors like curvature scale space [5] or Fourier descriptors [6]. Which approach is better depends on the application, especially on whether the internal content of the shape or the boundary is more important.
The term Fourier descriptors covers a wide variety of shape descriptors, which have in common that they compute the discrete Fourier transform of some representation of a closed contour. They vary in the representation (or 'signature' [6]) of the contour and in the additional manipulations to achieve invariance properties under certain geometric transformations. Typically, invariance under translation, scaling, and rotation is achieved, but there are also Fourier descriptors that are invariant to shearing [7]. As Fourier descriptors require the extraction of a closed contour from the shape, they are restricted to connected shapes and cannot be applied to possibly broken objects.
To overcome this restriction, we define a new three-dimensional contour signature function, based upon the convex hull of the shape and the distance of the convex hull to the closest point of the shape. As the convex hull is defined for arbitrary shapes, including broken shapes, the new shape signature no longer requires connectivity of the shape. Based upon this shape signature, we derive different invariant Fourier descriptors and compare their performance on different data sets.
This paper is organised as follows: In Section 2 we give an overview of the different Fourier descriptors described in the literature for closed contours of unbroken shapes. In Section 3 we describe our new method for a contour representation of broken shapes and define different methods to obtain invariant Fourier descriptors from this representation. Sections 4 and 5 contain the results of a comparative evaluation of the different Fourier descriptors and the conclusions drawn from it.
2 Fourier descriptors
Every connected object has a closed contour that can be represented as a sequence of pixel coordinates $(x(t), y(t))$, where $t = 0, \ldots, N-1$. A popular algorithm for contour extraction can be found in [8]. The coordinates can be considered to be sampling values
$$(x(t), y(t)) = f\left(\frac{2\pi t}{N}\right) \tag{1}$$
of a continuous, closed curve $f: [0, 2\pi] \to \mathbb{R}^2$ such that $f$ can be extended continuously to a $2\pi$-periodic function. When the function $f$ (or, more generally, any signature function $g(x(t), y(t))$ derived from the coordinates) is expanded into a Fourier series, a fixed number of discrete Fourier coefficients approximately represents the contour shape. This allows for data reduction.
This is the basic idea underlying all Fourier descriptors suggested in the literature. They differ, however, in the contour representation $g(x(t), y(t))$ used as the starting point for the Fourier expansion. The representations typically fall into one of the following categories:

Complex representation: $x(t) + j \cdot y(t) \in \mathbb{C}$

Multidimensional representation: $(x(t), y(t)) \in \mathbb{R}^2$

Scalar representation: $(x(t), y(t)) \mapsto g(t) \in \mathbb{R}$.
Depending on the contour representation, the resulting Fourier coefficients will behave differently under the geometric transformations scaling, rotation, and start point shift. When the contour points are transformed by any of these operations, the Fourier coefficients change according to simple rules, which can be used to define invariant descriptors.
In the following subsections, we give an overview of the three categories and the normalisation approaches proposed in the literature to achieve invariance under these geometric transformations.
2.1 Complex contour representation
When the two-dimensional plane is interpreted as a complex plane, the contour is represented by a sequence of complex numbers $z(t) = x(t) + j \cdot y(t)$, which has the discrete Fourier expansion ($t = 0, \ldots, N-1$)
$$z(t) = \sum_{k=0}^{N-1} c_k \exp\left(\frac{2\pi j k t}{N}\right) \tag{2}$$
with discrete Fourier coefficients
$$c_k = \frac{1}{N} \sum_{t=0}^{N-1} z(t) \exp\left(-\frac{2\pi j k t}{N}\right). \tag{3}$$
When the coefficients $c_k$ are interpreted as numerical approximations of the Fourier coefficients $\hat{f}(k)$ of the continuous curve $f(\tau) = x(\tau N/2\pi) + j\, y(\tau N/2\pi)$ in (1),
$$\hat{f}(k) = \frac{1}{2\pi} \int_0^{2\pi} f(\tau) \exp(-j k \tau)\, d\tau,$$
then the connection between $\hat{f}(k)$ and the discrete Fourier coefficients (3) is given by
$$\hat{f}(k) \approx \begin{cases} c_k & \text{for } 0 \le k < N/2, \\ c_{N+k} & \text{for } -N/2 < k < 0. \end{cases} \tag{4}$$
According to the Riemann-Lebesgue lemma (p. 45 in [9]),
$$\lim_{|k| \to \infty} \hat{f}(k) = 0,$$
so that coefficients for large values of $|k|$ are small and only describe less important details. Cutting off higher frequencies $|k|$ in (4) is thus equivalent to omitting coefficients in the middle of the vector $(c_0, \ldots, c_{N-1})$. An example can be seen in Figure 1.
The zeroth coefficient $c_0$ is the centre of gravity of the contour. As the smallest period of the contour curve $f$ is the length $2\pi$ of the parameter interval, we can assume that at least $c_1 \neq 0$ or $c_{N-1} \neq 0$. Which of these two coefficients is actually guaranteed to be non-zero depends on the orientation of the contour path: for Pavlidis' algorithm [8], for example, it is $c_1 \neq 0$. In general, however, it is not guaranteed that both coefficients are non-zero, as can be seen from the unit circle $f(t) := (\cos(t), \sin(t))$, which has coefficients $c_1 = 1$, $c_{N-1} = 0$ for $N \ge 3$.
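The two statements about $c_0$ and the unit circle can be checked numerically. The following is a minimal sketch in plain Python; the helper name `dft`, the $1/N$ normalisation (inferred from $c_0$ being the centre of gravity) and the example ellipse are assumptions made for illustration only.

```python
import cmath
import math

def dft(samples):
    """Normalised DFT: c_k = (1/N) * sum_t z(t) * exp(-2*pi*j*k*t/N)."""
    n = len(samples)
    return [
        sum(z * cmath.exp(-2j * math.pi * k * t / n) for t, z in enumerate(samples)) / n
        for k in range(n)
    ]

N = 16
angles = [2 * math.pi * t / N for t in range(N)]

# Unit circle traversed counter-clockwise: c_1 carries the whole contour.
circle = [complex(math.cos(a), math.sin(a)) for a in angles]
c_circle = dft(circle)

# Hypothetical ellipse centred at (3, -1): c_0 is the mean of the contour points.
ellipse = [complex(3 + 2 * math.cos(a), -1 + math.sin(a)) for a in angles]
c_ellipse = dft(ellipse)

print("circle:  |c_1| =", abs(c_circle[1]), " |c_{N-1}| =", abs(c_circle[N - 1]))
print("ellipse: c_0 =", c_ellipse[0])
```

For larger contours, a library FFT would normally replace the quadratic-time helper used here.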
Based on elementary properties of the discrete Fourier transform, simple rules for the change of the coefficients $c_k$ under translation, scale, and rotation immediately follow. Let $c_k$ be a coefficient calculated before any of the following operations. Then the geometric operations have the following effects:

Translation. Adding the same complex number $u$ to all points $z(t)$ leads to new coefficients $(c_0 + u, c_1, \ldots, c_{N-1})$

Scale. Multiplying all points $z(t)$ with the same real factor $d > 0$ leads to new coefficients $d \cdot c_k$

Rotation. Rotation in the complex plane is the same as multiplying all points $z(t)$ with a factor $\exp(j\varphi)$, where $\varphi$ is the angle of rotation. This leads to new coefficients $\exp(j\varphi)\, c_k$

Start point shift. Starting the contour at a different point results in a cyclic shift of the vector $(z(t))_{t=0}^{N-1}$. If the index shift is $m$, then the new coefficients are $\exp\left(jkm\frac{2\pi}{N}\right) c_k$
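The four rules above can be verified on a small example. This is a sketch under stated assumptions: the test contour, the parameter values `u`, `d`, `phi`, `m` and the helper names `dft` and `close` are hypothetical, and the DFT uses the $1/N$ normalisation.

```python
import cmath
import math

def dft(samples):
    """Normalised DFT: c_k = (1/N) * sum_t z(t) * exp(-2*pi*j*k*t/N)."""
    n = len(samples)
    return [
        sum(z * cmath.exp(-2j * math.pi * k * t / n) for t, z in enumerate(samples)) / n
        for k in range(n)
    ]

def close(a, b, tol=1e-9):
    """True if two coefficient lists agree element-wise within tol."""
    return all(abs(p - q) < tol for p, q in zip(a, b))

N = 12
angles = [2 * math.pi * t / N for t in range(N)]
# Hypothetical closed test contour (off-centre, not circular).
z = [complex(2 + 3 * math.cos(a) + 0.5 * math.cos(3 * a), 1 + 2 * math.sin(a)) for a in angles]
c = dft(z)

u, d, phi, m = complex(4, -2), 2.5, 0.7, 3
rot = cmath.exp(1j * phi)

# Translation only changes c_0.
translation_ok = close(dft([p + u for p in z]), [c[0] + u] + c[1:])
# Scaling multiplies every coefficient by d.
scale_ok = close(dft([d * p for p in z]), [d * ck for ck in c])
# Rotation multiplies every coefficient by exp(j*phi).
rotation_ok = close(dft([rot * p for p in z]), [rot * ck for ck in c])
# Starting m samples later multiplies c_k by exp(j*k*m*2*pi/N).
shift_ok = close(
    dft(z[m:] + z[:m]),
    [cmath.exp(2j * math.pi * k * m / N) * ck for k, ck in enumerate(c)],
)
print(translation_ok, scale_ok, rotation_ok, shift_ok)
```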
To achieve translation invariance, the first coefficient $c_0$ can be discarded because it is the only one that depends on translation. For scale invariance, all coefficients can be divided by the absolute value of a non-zero coefficient $c_r$, $r > 0$. Usually, a fixed coefficient $r$ is chosen, e.g. $r = 1$, but our experiments in Section 4.3 have shown that it is better to always choose the coefficient $c_r$ with the largest absolute value.
Since $|\exp(j\varphi)| = |\exp(jkm2\pi/N)| = 1$, a simple approach to obtain a rotation and shift invariant descriptor is to completely drop the phase information and to only use the absolute values of the Fourier coefficients. This approach was used e.g. by Zhang and Lu in their comparative study [6]. The resulting absolute value descriptors are
$$\frac{|c_k|}{|c_r|}, \qquad k \ge 1.$$
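As an illustration of the absolute value normalisation, the sketch below drops $c_0$, divides by the largest remaining magnitude, and returns the values in the order $k = 1, N-1, 2, N-2, \ldots$ so that low frequencies come first. The function name `absolute_descriptors`, the test contour and the transformation parameters are assumptions for this example.

```python
import cmath
import math

def dft(samples):
    """Normalised DFT: c_k = (1/N) * sum_t z(t) * exp(-2*pi*j*k*t/N)."""
    n = len(samples)
    return [
        sum(z * cmath.exp(-2j * math.pi * k * t / n) for t, z in enumerate(samples)) / n
        for k in range(n)
    ]

def absolute_descriptors(z, count):
    """|c_k| / max_r |c_r| (r > 0), ordered k = 1, N-1, 2, N-2, ..."""
    c = dft(z)
    n = len(c)
    norm = max(abs(ck) for ck in c[1:])  # c_0 is discarded (translation)
    order = []
    for k in range(1, n // 2 + 1):
        order.append(k)
        if n - k != k:
            order.append(n - k)
    return [abs(c[k]) / norm for k in order[:count]]

N = 12
angles = [2 * math.pi * t / N for t in range(N)]
z = [complex(2 + 3 * math.cos(a) + 0.5 * math.cos(3 * a), 1 + 2 * math.sin(a)) for a in angles]

# Same contour after start point shift, scaling, rotation and translation.
moved = [2.5 * cmath.exp(0.7j) * p + complex(4, -2) for p in z[3:] + z[:3]]

desc = absolute_descriptors(z, 6)
desc_moved = absolute_descriptors(moved, 6)
print(desc)
print(desc_moved)
```

Because the normalising coefficient is the largest one, all returned values lie in the interval $[0, 1]$.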
It is also possible to define invariant Fourier descriptors that still keep the phase information, as already observed by Dimov and Laskov [10]. Let $c_r \neq 0$ and $c_s \neq 0$, $r \neq s$, be two non-zero coefficients with polar angles $\alpha_r = \arg(c_r)$ and $\alpha_s = \arg(c_s)$. Rotation invariance is achieved by multiplying each coefficient with $\exp(-j\alpha_r)$, and shift invariance is established by replacing the polar angle $\alpha_k = \arg(c_k)$ with $s\alpha_k - k\alpha_s$. Combining these phase normalisations yields the invariant descriptors
$$l_k = \frac{|c_k|}{|c_r|} \exp\Bigl(j\bigl[(s-r)\,\alpha_k + (r-k)\,\alpha_s + (k-s)\,\alpha_r\bigr]\Bigr) \tag{6}$$
for $k = 1, N-1, 2, N-2, \ldots$. The two normalisation coefficients $c_r$ and $c_s$ should be chosen as the coefficients with the largest and second largest absolute values, respectively^{a}.
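A phase-preserving normalisation can be sketched as follows. The code combines the three polar angles with the integer weights $(s-r)$, $(r-k)$ and $(k-s)$; this specific combination is an assumption, chosen because the rotation terms and the start point shift terms then cancel exactly. The function name `phase_normalised` and the test contour are likewise hypothetical.

```python
import cmath
import math

def dft(samples):
    """Normalised DFT: c_k = (1/N) * sum_t z(t) * exp(-2*pi*j*k*t/N)."""
    n = len(samples)
    return [
        sum(z * cmath.exp(-2j * math.pi * k * t / n) for t, z in enumerate(samples)) / n
        for k in range(n)
    ]

def phase_normalised(z):
    """Descriptors that keep phase but are invariant to scale, rotation and shift."""
    c = dft(z)
    n = len(c)
    # r and s: indices (> 0) of the largest and second largest magnitudes.
    ranked = sorted(range(1, n), key=lambda k: abs(c[k]), reverse=True)
    r, s = ranked[0], ranked[1]
    alpha = [cmath.phase(ck) for ck in c]
    result = {}
    for k in range(1, n):
        angle = (s - r) * alpha[k] + (r - k) * alpha[s] + (k - s) * alpha[r]
        result[k] = abs(c[k]) / abs(c[r]) * cmath.exp(1j * angle)
    return result

N = 16
angles = [2 * math.pi * t / N for t in range(N)]
# Hypothetical contour with three clearly separated harmonics.
z = [
    complex(1, 2)
    + 3 * cmath.exp(1j * a)
    + complex(1, 0.5) * cmath.exp(-2j * a)
    + 0.4j * cmath.exp(3j * a)
    for a in angles
]
moved = [1.7 * cmath.exp(1.1j) * p + complex(-3, 5) for p in z[5:] + z[:5]]

l_orig = phase_normalised(z)
l_moved = phase_normalised(moved)
max_deviation = max(abs(l_orig[k] - l_moved[k]) for k in l_orig)
print(max_deviation)
```

Note that each angle is multiplied by an integer that can be as large as $N$, which is relevant for the robustness discussion in Section 4.3.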
Granlund [11] proposed different descriptors as
$$d_k = \frac{c_{1+k}\, c_{N+1-k}}{c_1^2}. \tag{7}$$
These descriptors include the phase information, but because of $d_k = d_{N-k}$, there is a considerable loss of information compared to (6). Granlund also defined the $(N-1)(N-2)$ (sic!) descriptors $c_{1+k}^{i}\, c_{N+1-i}^{k} / c_1^{i+k}$, but these have a lot of redundancy and it is not clear how to select a small subset from them.
2.2 Multidimensional contour representation
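Instead of interpreting the contour coordinates as complex numbers, the $x$ and $y$ coordinates can alternatively be transformed separately:
$$c_k^{(x)} = \frac{1}{N} \sum_{t=0}^{N-1} x(t) \exp\left(-\frac{2\pi j k t}{N}\right), \qquad c_k^{(y)} = \frac{1}{N} \sum_{t=0}^{N-1} y(t) \exp\left(-\frac{2\pi j k t}{N}\right) \tag{8}$$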
or, split into real and imaginary part:
Half of these components are redundant because $b_0^{(x)} = b_0^{(y)} = 0$ and, for $0 < k \le N/2$, it is
$$a_{N-k}^{(x/y)} = a_k^{(x/y)}, \qquad b_{N-k}^{(x/y)} = -b_k^{(x/y)}. \tag{10}$$
The inverse formula then reads
where $\lceil \frac{N-1}{2} \rceil$ is the smallest integer $i$ with $i \ge \frac{N-1}{2}$. For the sake of simplicity, let $N$ be an odd number throughout the rest of this section.
The real coefficients (9) of the multidimensional representation are connected to the complex coefficients (3) by
In contrast to $c_k$, higher frequencies directly correspond to higher indices $k$ of the real coefficients $a_k^{(x/y)}$, $b_k^{(x/y)}$ because of the symmetry relations (10).
Kuhl and Giardina [12] interpreted each summand
in (11) as a parameterisation with parameter $t \in [0, 2\pi]$ of an ellipse that visualises the $k$th Fourier coefficients. Therefore they called the resulting Fourier descriptors elliptic features. Based upon this idea, Lin and Hwang [13] proposed the following translation, rotation, and shift invariant (but not scale invariant) Fourier descriptors:
where $i \in \{1, \ldots, (N-1)/2\}$ is a fixed index and sgn denotes the signum function. The properties of the real Fourier coefficients imply
To make these features also scale invariant, they additionally need to be divided by a normalisation factor, e.g. $I_1$, which leads to the invariant descriptors $I_k/I_1$, $J_k/I_1$, and $K_k/I_1^2$.
A nice point about the features (12) is that they have a geometric interpretation: $I_k$ is the sum of two semi-axis lengths of the $k$th ellipse, $J_k$ is proportional to the area of the $k$th ellipse, and $K_k$ contains the phase difference between ellipses $i$ and $k$ for the fixed $i$. Compared to the $N-2$ complex features (6), the $\frac{3}{2}(N-1)$ real elliptic Fourier features contain, however, considerably less information about the shape. Lin and Hwang tried to compensate for this with additional features, which were, however, not rotation invariant.
An interesting generalisation of Fourier descriptors from two-dimensional curves to $n$-dimensional closed curves was made by Badreldin et al. [14]. They transformed each component separately and then built a vector containing the $l_2$-norms of all Fourier coefficients of a given index. In two dimensions this is equivalent to the descriptors by Shridhar and Badreldin [15] ($k = 0, \ldots, (N-1)/2$):
$$\sqrt{\bigl|c_k^{(x)}\bigr|^2 + \bigl|c_k^{(y)}\bigr|^2} \tag{13}$$
Rotation and start point shift invariance is obtained because absolute values are used, and translation invariance results from discarding $k = 0$. For scale invariance, they additionally need to be divided by a normalisation factor, e.g. $\sqrt{|c_1^{(x)}|^2 + |c_1^{(y)}|^2}$. However, the Fourier descriptors (13) discard a considerable portion of the shape information: not only is the phase information lost, but the $x$- and $y$-components are also not coupled. Therefore, for example, a shift of the $x$-coordinates without a change to the $y$-coordinates cannot be detected.
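The multidimensional descriptors can be sketched in a few lines: each coordinate sequence is transformed separately and the $l_2$-norm over the components is taken per index. The function name `sb_descriptors`, the test contour and the choice of the fixed normalisation index $k = 1$ are assumptions for this illustration.

```python
import cmath
import math

def dft(samples):
    """Normalised DFT: c_k = (1/N) * sum_t z(t) * exp(-2*pi*j*k*t/N)."""
    n = len(samples)
    return [
        sum(z * cmath.exp(-2j * math.pi * k * t / n) for t, z in enumerate(samples)) / n
        for k in range(n)
    ]

def sb_descriptors(xs, ys):
    """l2-norm of (c_k^(x), c_k^(y)) for k = 1..(N-1)/2, divided by the k = 1 norm."""
    cx, cy = dft(xs), dft(ys)
    n = len(xs)
    norms = [math.hypot(abs(cx[k]), abs(cy[k])) for k in range(1, (n - 1) // 2 + 1)]
    return [v / norms[0] for v in norms]

N = 15  # odd, as assumed in this section
angles = [2 * math.pi * t / N for t in range(N)]
xs = [3 * math.cos(a) + 0.5 * math.cos(3 * a) for a in angles]
ys = [2 * math.sin(a) + 0.3 * math.sin(2 * a) for a in angles]
desc = sb_descriptors(xs, ys)

# Same contour after scaling, rotation, translation and a start point shift.
scale, phi, m = 1.8, 0.9, 4
xr = [scale * (x * math.cos(phi) - y * math.sin(phi)) + 7 for x, y in zip(xs, ys)]
yr = [scale * (x * math.sin(phi) + y * math.cos(phi)) - 2 for x, y in zip(xs, ys)]
desc_moved = sb_descriptors(xr[m:] + xr[:m], yr[m:] + yr[:m])
print(desc)
print(desc_moved)
```

Rotation invariance holds here because a rotation mixes the two coefficient sequences by an orthogonal matrix, which leaves the per-index $l_2$-norm unchanged.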
2.3 Scalar contour representation
A two-dimensional contour can also be represented in one dimension by mapping it to a one-dimensional signature function: $(x(t), y(t)) \mapsto f(t)$. The signature function $f$ can already be invariant under translation, scaling and rotation, like Zahn and Roskies' cumulative angular function [16], or the invariance normalisation can be applied after the Fourier transform. Mapping two dimensions onto one generally leads to some loss of shape information (see Figure 2), but the hope is that the essential features are still captured for most shapes. For an overview of possible signature functions, see [6].
In the comparative study [6], the centroid distance performed best. Let $(x_0, y_0) := \frac{1}{N} \sum_{k=0}^{N-1} (x_k, y_k)$ be the centre of gravity of the contour. Then the centroid distance is defined as
$$r(t) = \sqrt{(x(t) - x_0)^2 + (y(t) - y_0)^2}. \tag{14}$$
These values are already rotation and translation invariant. Let $(c_k)_{k=0}^{N-1}$ be the (complex) Fourier transform of $(r(t))_{t=0}^{N-1}$, i.e.
$$c_k = \frac{1}{N} \sum_{t=0}^{N-1} r(t) \exp\left(-\frac{2\pi j k t}{N}\right). \tag{15}$$
Descriptors $R_k$ that are also scale and start point shift invariant can be obtained with the phase normalisation
$$R_k = \frac{|c_k|}{|c_0|} \exp\bigl(j\,(s\,\alpha_k - k\,\alpha_s)\bigr), \tag{16}$$
where $\alpha_k = \arg(c_k)$ denotes the polar angle of coefficient $c_k$, and $\alpha_s$ is the polar angle of the coefficient $c_s$, $s \ge 1$, with the second largest absolute value. Note that $c_0 = |c_0|$ is always the largest absolute value because
$$|c_k| = \left| \frac{1}{N} \sum_{t=0}^{N-1} r(t) \exp\left(-\frac{2\pi j k t}{N}\right) \right| \le \frac{1}{N} \sum_{t=0}^{N-1} r(t) = c_0.$$
An alternative, simpler normalisation is to discard the phase information and use the absolute value $|R_k|$. This normalisation was used by Zhang and Lu.
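The absolute value variant of the centroid distance descriptor can be sketched as follows. The function name `centroid_distance_descriptors`, the test contour and the transformation parameters are assumptions; the trivial value $|c_0|/|c_0| = 1$ is skipped.

```python
import cmath
import math

def dft(samples):
    """Normalised DFT: c_k = (1/N) * sum_t z(t) * exp(-2*pi*j*k*t/N)."""
    n = len(samples)
    return [
        sum(z * cmath.exp(-2j * math.pi * k * t / n) for t, z in enumerate(samples)) / n
        for k in range(n)
    ]

def centroid_distance_descriptors(points, count):
    """|c_k| / |c_0| for k = 1..count, computed from the centroid distance r(t)."""
    n = len(points)
    x0 = sum(x for x, _ in points) / n
    y0 = sum(y for _, y in points) / n
    r = [math.hypot(x - x0, y - y0) for x, y in points]
    c = dft(r)
    return [abs(c[k]) / abs(c[0]) for k in range(1, count + 1)]

N = 16
angles = [2 * math.pi * t / N for t in range(N)]
points = [(4 * math.cos(a) + 0.5 * math.cos(3 * a), 2 * math.sin(a)) for a in angles]
desc = centroid_distance_descriptors(points, 5)

# Same contour after scaling, rotation, translation and a start point shift.
scale, phi, m = 0.6, 2.0, 7
moved = [
    (scale * (x * math.cos(phi) - y * math.sin(phi)) + 10,
     scale * (x * math.sin(phi) + y * math.cos(phi)) - 4)
    for x, y in points
]
desc_moved = centroid_distance_descriptors(moved[m:] + moved[:m], 5)
print(desc)
print(desc_moved)
```

Since $r(t)$ is real, the coefficients at $k$ and $N-k$ have the same magnitude, so only indices up to about $N/2$ carry independent information.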
3 Application to broken shapes
All Fourier descriptors described in Section 2 start from a closed contour description of the shape and are therefore not applicable when the shape is broken, i.e. consists of more than one connected component. In this section we first present a method to describe the contour of an arbitrary (broken or unbroken) shape by a periodic three-dimensional curve and then derive different Fourier descriptors for this curve which are invariant under translation, scale, rotation, and start point shift.
3.1 Contour representation of broken shapes
A simple solution to circumvent the problem of broken shapes would be to replace the shape parts with a single closed curve that contains all parts and to compute the Fourier descriptors from this curve instead. An obvious candidate for such a curve is the convex hull, i.e. the smallest convex polygon that contains all points of the shape. There are efficient algorithms for computing the convex hull from a set of points [17]. As can be seen in Figure 3, replacing a contour with its convex hull loses a considerable amount of information because very different shapes can have the same convex hull. To encode more shape information, we therefore compute for each point $(x, y)$ on the convex hull its closest Euclidean distance $d$ to the shape $S$:
$$d(x, y) = \min_{(u, v) \in S} \sqrt{(x - u)^2 + (y - v)^2}. \tag{17}$$
Instead of a two-dimensional contour $(x(t), y(t))$, we then obtain a three-dimensional parametric curve $(x(t), y(t), d(t))$ representing the shape, as shown in Figure 4.
When implementing an algorithm for computing the contour representation $(x(t), y(t), d(t))$, two questions arise: how the convex hull should be sampled and how the distances $d(t)$ can be computed efficiently. The vertices of the convex hull polygon can be obtained e.g. with Graham's scan algorithm [17]. These vertices are obvious sampling points, but their distance can be arbitrary, so that the edges need to be sampled. As the image sampling distance is one pixel, it is natural to compute the edge length $l$ and to add $\lfloor l - 1 \rfloor$ equidistant sampling points on each edge.
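The edge sampling rule can be sketched as follows, assuming the hull vertices have already been computed (for instance by Graham's scan) and are given in traversal order. The function name `sample_hull` and the square used as input are hypothetical.

```python
import math

def sample_hull(vertices):
    """Return the hull vertices plus floor(l - 1) equidistant points per edge."""
    points = []
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        length = math.hypot(x1 - x0, y1 - y0)
        extra = max(0, math.floor(length - 1))  # no extra points on very short edges
        points.append((float(x0), float(y0)))
        for step in range(1, extra + 1):
            f = step / (extra + 1)
            points.append((x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
    return points

# Hypothetical hull: an axis-parallel square with side length 4.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
samples = sample_hull(square)
print(len(samples))
print(samples[:5])
```

For edges with integer length, the resulting spacing is exactly one pixel; for other lengths it lies between one and two pixels.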
To compute the distance d(t) for each sampling point x(t),y(t), two efficient approaches are possible:

Compute the distance transform image [18] of the original shape and approximate $d(t)$ by linear interpolation of the distance image at the real point $(x(t), y(t))$.

Store all shape contour points in a kd-tree [19] and compute (17) for each sampling point $(x(t), y(t))$ with a nearest neighbour search in the kd-tree.
To estimate the runtime complexity of both algorithms, let us first observe that a shape with an $n \times n$ bounding box has $O(n^2)$ volume pixels, but only $O(n)$ contour points. As the fastest algorithms for computing the distance transform require two runs over all image pixels [18], the first algorithm requires $O(n^2)$ operations to compute all contour distances. The second algorithm requires $O(n \log n)$ operations for building the kd-tree and $O(n \log n)$ operations for querying all nearest neighbours, resulting in a total runtime of $O(n \log n)$. The second approach is thus faster, and we have implemented it with the kd-tree library shipped with the Gamera framework [19].
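The distance computation itself is a nearest neighbour query per hull sample. The sketch below replaces the kd-tree with a brute-force search, which takes $O(n^2)$ time instead of $O(n \log n)$ but returns the same distances; it is meant only to show what is computed. The function name `hull_distances` and the small broken shape are assumptions.

```python
import math

def hull_distances(samples, shape_points):
    """Distance from each hull sample to its nearest shape point (brute force)."""
    return [
        min(math.hypot(x - u, y - v) for u, v in shape_points)
        for x, y in samples
    ]

# Hypothetical broken shape: two vertical strokes that are not connected.
shape = [(0, 0), (0, 1), (0, 2), (4, 0), (4, 1), (4, 2)]

# Samples along the bottom hull edge from (0, 0) to (4, 0).
bottom_edge = [(0, 0), (1, 0), (2, 0), (3, 0)]
d = hull_distances(bottom_edge, shape)
print(d)
```

The distance grows towards the middle of the gap between the two fragments, which is the information that the convex hull alone would lose.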
3.2 Broken shape Fourier descriptors
For the derivation of invariant Fourier descriptors for the three-dimensional point sequence $(x(t), y(t), d(t))_{t=0}^{N-1}$, we propose three different approaches. Our first Fourier descriptor is built upon the techniques in Section 2.1. It builds a complex number by taking the centroid distance $r(t) := \|(x(t), y(t)) - (x_0, y_0)\|$ (see Equation (14)) as real part and the distance $d(t)$ as imaginary part. The sequence $(r(t) + j\,d(t))_{t=0}^{N-1}$ is already invariant under translation and rotation. Scale and start point shift invariance of the Fourier coefficients
$$c_k = \frac{1}{N} \sum_{t=0}^{N-1} \bigl(r(t) + j\,d(t)\bigr) \exp\left(-\frac{2\pi j k t}{N}\right) \tag{18}$$
is either achieved with the phase normalisation (compare Equation (16))
or, simply, by using the absolute values $|A_k|$, where $\alpha_k = \arg(c_k)$ denotes the polar angle of coefficient $c_k$, and $c_r$ and $c_s$ are the two coefficients with the largest and second largest absolute values ($r \ge 0$, $0 < s < N/2$), and $\alpha_s = \arg(c_s)$ is the polar angle of $c_s$. For a fixed number of $n$ descriptors, the first $n$ values in the sequence $A_0, A_{N-1}, A_1, A_{N-2}, \ldots$ should be selected.
The second Fourier descriptor under investigation follows the multidimensional approach by Badreldin et al. as described in Section 2.2. Let $c_k^{(x)}$, $c_k^{(y)}$, and $c_k^{(d)}$ be the complex Fourier coefficients of the three dimensions $x(t)$, $y(t)$, $d(t)$ according to (8). Invariant Fourier descriptors are then obtained as ($k = 1, 2, \ldots, \lceil \frac{N-1}{2} \rceil$)
For a fixed number of $n$ descriptors, the first $n$ values $B_1, B_2, \ldots, B_n$ should be selected.
The third descriptor uses the scalar representation $r(t) - d(t)$ that is already invariant under translation and rotation. It is an approximation to the local radius of the shape and leads to the Fourier coefficients
$$c_k = \frac{1}{N} \sum_{t=0}^{N-1} \bigl(r(t) - d(t)\bigr) \exp\left(-\frac{2\pi j k t}{N}\right). \tag{21}$$
As $r(t) - d(t)$ are real values, the Fourier coefficients at $k$ and $N-k$ are complex conjugates: $c_k = c_{N-k}^{*}$, $1 \le k < N$. Therefore, only values for $0 \le k \le \lceil \frac{N-1}{2} \rceil$ are relevant. Again, the coefficients (21) can be made scale and start point shift invariant either with the phase normalisation
or, simply, by using the absolute values $|C_k|$, where $\alpha_k = \arg(c_k)$ denotes the polar angle of coefficient $c_k$, and $c_r$ and $c_s$ are the two coefficients with the largest and second largest absolute values ($r \ge 0$, $0 < s < N/2$). For a fixed number of $n$ descriptors, the first $n$ values $C_0, C_1, C_2, \ldots$ should be selected.
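The absolute value variants of the first and third descriptor can be sketched as follows, starting from an already sampled sequence of triples $(x(t), y(t), d(t))$. Several points are assumptions for illustration: the function name `broken_descriptors`, taking $(x_0, y_0)$ as the mean of the hull samples, normalising by the largest coefficient magnitude, and the synthetic input curve.

```python
import cmath
import math

def dft(samples):
    """Normalised DFT: c_k = (1/N) * sum_t z(t) * exp(-2*pi*j*k*t/N)."""
    n = len(samples)
    return [
        sum(z * cmath.exp(-2j * math.pi * k * t / n) for t, z in enumerate(samples)) / n
        for k in range(n)
    ]

def broken_descriptors(samples, count):
    """Absolute value descriptors from r + j*d ('A') and from r - d ('C')."""
    n = len(samples)
    x0 = sum(p[0] for p in samples) / n
    y0 = sum(p[1] for p in samples) / n
    r = [math.hypot(x - x0, y - y0) for x, y, _ in samples]
    d = [p[2] for p in samples]
    c_a = dft([complex(ri, di) for ri, di in zip(r, d)])
    c_c = dft([ri - di for ri, di in zip(r, d)])
    norm_a = max(abs(c) for c in c_a)
    norm_c = max(abs(c) for c in c_c)
    # Index order 0, N-1, 1, N-2, ... for the complex signature.
    order_a = [i // 2 if i % 2 == 0 else n - 1 - i // 2 for i in range(count)]
    desc_a = [abs(c_a[k]) / norm_a for k in order_a]
    # Real signature: indices 0, 1, 2, ... are sufficient.
    desc_c = [abs(c_c[k]) / norm_c for k in range(count)]
    return desc_a, desc_c

N = 20
angles = [2 * math.pi * t / N for t in range(N)]
# Synthetic hull curve with a varying distance d(t) to the shape.
curve = [(5 * math.cos(a), 3 * math.sin(a), 1 + 0.5 * math.cos(2 * a)) for a in angles]
desc_a, desc_c = broken_descriptors(curve, 6)

# Same curve after scaling, rotation, translation and a start point shift.
scale, phi, m = 2.0, 0.4, 6
moved = [
    (scale * (x * math.cos(phi) - y * math.sin(phi)) + 3,
     scale * (x * math.sin(phi) + y * math.cos(phi)) + 8,
     scale * d)
    for x, y, d in curve
]
moved_a, moved_c = broken_descriptors(moved[m:] + moved[:m], 6)
print(desc_a)
print(desc_c)
```

Note that the distance $d(t)$ scales together with the coordinates, which is why dividing by a coefficient magnitude is sufficient for scale invariance.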
4 Evaluation
We have evaluated our new Fourier descriptors on two different data sets, the MPEG-7 database of unbroken shapes and a new real-world data set with broken shapes from scans of 19th century chant books in the Eastern neumatic notation [20]. Both data sets are described in detail in Section 4.2. Apart from a performance comparison of the broken shape descriptors (Section 4.5), we have also investigated the effect of different normalisation schemes (Section 4.3) and the number of descriptors needed for similarity-based retrieval (Section 4.4).
We have implemented all Fourier descriptors as a toolkit for the Gamera framework for document analysis and recognition^{b} [21]. The toolkit is published under a free license together with the new data set of broken neumes in the 'Addons' section of the Gamera website.^{c} For convenience, a brief summary of all Fourier descriptors under investigation is given in Table 1.
4.1 Performance measures
As evaluation criteria for shape-based image retrieval, we have used two different performance measures, the precision/recall curve and the leave-one-out error rate of a $k$-nearest neighbour (kNN) classification. For a single query image belonging to class $\omega$, precision and recall are defined as follows: let $n_\omega$ be the number of all images of class $\omega$, and let $k_\omega$ be the number of images of class $\omega$ among the $k$-nearest neighbours of the query image; then $k_\omega/k$ is the precision and $k_\omega/n_\omega$ is the recall for this query. The precision of all test images is averaged to yield a single precision value. Typically, $k$ is not fixed, but the precision is measured for a given recall rate. When the recall is increased, the precision will generally decrease, but less so for better similarity measures. The decrease of the precision/recall curve can thus serve as a performance measure for similarity-based retrieval.
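For a single query, the two definitions translate directly into code. In this sketch, the function name `precision_recall` and the example ranking are hypothetical; `ranked_labels` is assumed to hold the class labels of the retrieved images, sorted by increasing descriptor distance to the query.

```python
def precision_recall(query_label, ranked_labels, n_class, k):
    """Precision k_w / k and recall k_w / n_w for the k nearest neighbours."""
    hits = sum(1 for label in ranked_labels[:k] if label == query_label)
    return hits / k, hits / n_class

# Hypothetical ranking for a query of class 'a' with n_a = 5 images in total.
ranking = ["a", "b", "a", "a", "c", "a", "b", "a"]
precision, recall = precision_recall("a", ranking, n_class=5, k=4)
print(precision, recall)
```

Increasing `k` until a target recall is reached, and averaging the resulting precision over all queries, gives one point of the precision/recall curve.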
To evaluate the classification performance of the different Fourier descriptors, a natural criterion is the cross-validation or leave-one-out error rate of a kNN classifier because it is an unbiased estimator of the expected error rate [22]. A kNN classifier assigns a test sample to the majority class among its $k$-nearest training samples. The leave-one-out error rate is the average error rate when each sample is classified with a kNN classifier that has been trained with the remaining $n - 1$ samples, thereby yielding a single performance measure.
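The leave-one-out error rate can be sketched as follows. The function name `loo_knn_error`, the Euclidean distance on the descriptor vectors and the toy feature vectors are assumptions for this illustration; ties in the majority vote are resolved arbitrarily by `Counter.most_common`.

```python
import math
from collections import Counter

def loo_knn_error(features, labels, k=1):
    """Fraction of samples misclassified by kNN trained on all other samples."""
    errors = 0
    for i, (feature, label) in enumerate(zip(features, labels)):
        neighbours = sorted(
            (math.dist(feature, other), labels[j])
            for j, other in enumerate(features)
            if j != i
        )
        votes = Counter(lab for _, lab in neighbours[:k])
        predicted = votes.most_common(1)[0][0]
        if predicted != label:
            errors += 1
    return errors / len(features)

# Hypothetical two-class descriptor vectors forming two well separated clusters.
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
error_clean = loo_knn_error(features, labels, k=1)

# One sample of class 'b' that lies inside the 'a' cluster.
error_outlier = loo_knn_error(features + [(0.5, 0.5)], labels + ["b"], k=3)
print(error_clean, error_outlier)
```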
4.2 Data sets
A data set that has already been used for the evaluation of different Fourier descriptors in the study [6] is part B of the MPEG-7 CE-Shape-1 database^{d} [23]. It consists of 1,400 shapes that have been classified into 70 classes with 20 similar items in each class. Figure 5 shows sample shapes from this data set. As pointed out by the authors of the data set, a 100% retrieval rate is impossible because some shapes are more similar to the shapes from different classes than to their own class so that 'it is not possible to group them into the same class using only shape knowledge' [23]. In some images, there are noise pixels which form additional small random shapes. In order to ignore this noise, we have computed only the contour of the largest connected component for each image.
The MPEG-7 data set does not contain any broken shapes and thus allows for a performance comparison of the new descriptors with the ordinary Fourier descriptors described in Section 2. To also test the new descriptors from Section 3.2 on actual broken shapes, we have created a data set of broken glyphs from the four 19th century music prints in Byzantine neume notation that have also been used in [20] (sources HA1825, HS1825, AM1847, and MP11850). This 'NEUMES' data set consists of 640 images out of 40 different classes with 16 items in each class. Due to varying print quality, some glyphs are connected while others are randomly broken into up to eight fragments. As can be seen in Figure 6, some neumes are mirrored or elongated versions of different neumes. It is thus important that the shape descriptors used for discrimination are not invariant to axial mirroring or arbitrary affine transformations. The sample images in Figure 6 are not rotated; to make rotational invariance of the shape descriptors mandatory, we have rotated the 16 items in each class in steps of 22.5°.
4.3 Normalisation schemes
The Fourier descriptors from Section 2 can be normalised (i.e. made invariant) in different ways. There are generally two degrees of freedom:

Phase normalisation versus absolute values

The index choices $s$ and $r$ for the normalisation coefficients $c_r$ and $c_s$
Figure 7 shows the effect of the different normalisation schemes on the leave-one-out recognition rate on the MPEG-7 data set for the complex position Fourier descriptor. Both for the absolute values and the phase-normalised descriptors, it is better to normalise not with a fixed coefficient, but with the coefficient with the largest absolute value (which may vary from shape to shape). This normalisation limits the numeric range of the descriptors to a fixed interval, a feature normalisation scheme that is known to improve the recognition rate in many cases [24].
The observation that the phase normalisation performs worse than the absolute values is surprising because the phase-normalised descriptors carry information that is lost in the absolute values. It turned out, however, that the phase angles of the descriptors are much less robust with respect to small changes in the contour coordinates. To demonstrate this phenomenon, we did a small Monte Carlo experiment. We added normally distributed random noise independently to the $x$ and $y$ coordinates of a sample contour and measured the deviation $\Delta$ of the resulting descriptors $\tilde{l}_k$ and $|\tilde{l}_k|$ as
where $l_k$ are the undisturbed descriptors and $L := \sum_{k=1}^{m} |l_k| + |l_{N-k}|$. Figure 8 shows these deviations, averaged over 10,000 random experiments, as a function of the variance $\sigma^2$ of the random noise. The phase normalisation is clearly much less robust. The same phenomenon already occurs for the Fourier coefficients $c_k$ due to $\bigl|\,|c_k| - |\tilde{c}_k|\,\bigr| \le |c_k - \tilde{c}_k|$, and this is amplified by the phase normalisation (6) because the phase angles are multiplied with large integer values ($s-r$, $k-s$, and $r-k$).
Further evidence for the instability of the phase angles can be derived from Figure 9, which shows the recognition rates for different normalisations of the Fourier descriptor broken A on the NEUMES data set. When the phase normalisation coefficient $c_s$ is chosen as the largest coefficient for $0 < s < N$, this often results in a high value $s \approx N - 1$. This amplifies the phase angle error of $\alpha_k = \arg c_k$ because $\alpha_k$ is multiplied with $s$ in the phase normalisation (19), thereby even resulting in a negative effect on the recognition rate compared to a fixed normalisation with $s = 1$. When the maximum coefficient $c_s$ is only searched for small $s$ (in most cases this led to $s = 2$), the recognition rate is considerably better. In every case, however, the absolute values performed better still.
We therefore conclude that it is generally better to use the absolute values instead of the phase-normalised coefficients and that the scale invariance normalisation should be done with the coefficient $c_r$ with the largest absolute value rather than with a fixed $r$.
4.4 Number of descriptors
Figures 7 and 9 show that a small number of Fourier descriptors is sufficient for shape retrieval. In our experiments, this behaviour was universal, as can be seen in Figure 10: using more than 20 descriptors generally does not increase the recognition rate any further. In our experiments in the following subsection, we have therefore limited the number of descriptors to 60, to be on the safe side. It is interesting to note that these numbers, which are derived from the leave-one-out recognition rates, are much lower than the numbers derived by Zhang and Lu from the absolute magnitude of the descriptors [6]. The reason for this difference is that a criterion based on the magnitude does not take the discriminative power of the coefficients into account.
4.5 Descriptor comparison
Figure 11 shows the precision/recall curves for all investigated Fourier descriptors on the MPEG-7 data set. The recommendations from the preceding subsections have been applied to all descriptors, i.e. they have been normalised with the largest coefficient and by taking the absolute value, and the first 60 descriptors have been used.
The best performing Fourier descriptor was the complex position, which seems to contradict the experiments by Zhang and Lu [6], who found the centroid distance to be best performing. This discrepancy can, however, be explained by the different choice of the normalisation coefficient $c_r$, as shown in Figure 12: when the complex position Fourier descriptor is normalised with a fixed coefficient, e.g. $r = 1$, it performs worse than the centroid distance, as was the case in the study by Zhang and Lu.
On the MPEG-7 data set, the broken shape Fourier descriptors did not perform as well as the best single shape Fourier descriptor (complex position), but the precision/recall curve of our broken C descriptor is almost identical to that of the centroid distance Fourier descriptor, with broken A and broken B performing only slightly worse. That the broken C and centroid distance descriptors behave very similarly is hardly surprising because for single closed shapes the signature $r(t) - d(t)$ is simply an approximation of the signature (14).
On the NEUMES data set, only our new Fourier descriptors are applicable, and the resulting precision/recall curves are shown in Figure 13. On this data set, there is a more distinct difference between the three broken Fourier descriptors. Actually, the ranking is in ascending order of the information loss of the descriptor: mapping the complex number $r + j\,d$ (broken A) onto the real number $r - d$ (broken C) loses some information, and the broken B descriptor loses even more shape information because it decouples the $x$, $y$, and $d$ coordinates. On the MPEG-7 data set, this has a smaller impact on the recognition performance because the shapes within a single class vary considerably. On the NEUMES data set, in contrast, this is of importance because detailed shape information is required for class discrimination; see e.g. the two bottom rows in Figure 6.
5 Conclusions
The new Fourier descriptors for broken shapes have shown retrieval performances that were comparable to common closed contour shape descriptors like the 'centroid distance' Fourier descriptor. As the new descriptors have the benefit of being applicable to arbitrary shapes (connected or broken), they can serve as a general replacement for other Fourier descriptors lacking this flexibility.
Our experiments have shown that it is generally better to use the absolute values rather than the phase-normalised descriptors and that scale invariance normalisation should be done with the largest coefficient, rather than with a fixed coefficient. For practical applications on real-world data, we would recommend using the 'broken A' Fourier descriptor.
Endnotes
^{a} When only one coefficient is non-zero, this can be chosen as $c_r$ and no phase normalisation is necessary.
^{b} http://gamera.sf.net/.
References
Crimmins T: A complete set of Fourier descriptors for two-dimensional shapes. Syst. Man Cybern. IEEE Trans 1982, 12: 848–855.
Ghorbel F, Derrode S, Mezhoud R, Bannour T, Dhahbi S: Image reconstruction from a complete set of similarity invariants extracted from complex moments. Pattern Recognit. Lett 2006, 27: 1361–1369. 10.1016/j.patrec.2006.01.001
Gonzalez R, Woods R: Digital Image Processing, 2nd edn. New Jersey: Prentice-Hall; 2002.
Khotanzad A, Hong YH: Invariant image recognition by Zernike moments. Pattern Anal. Mach. Intell. IEEE Trans 1990, 12: 489–497. 10.1109/34.55109
Mokhtarian F, Abbasi S, Kittler J: Robust and efficient shape indexing through curvature scale space. In British Machine Vision Conference. Edinburgh; September 1996:53–62.
Zhang D, Lu G: Study and evaluation of different Fourier methods for image retrieval. Image Vis. Comput 2005, 23: 33–49. 10.1016/j.imavis.2004.09.001
Arbter K, Snyder W, Burkhardt H, Hirzinger G: Application of affine-invariant Fourier descriptors to recognition of 3-D objects. Pattern Anal. Mach. Intell. IEEE Trans 1990, 12: 640–647. 10.1109/34.56206
Pavlidis T: Algorithms for Graphics and Image Processing. New York: Springer; 1982.
Zygmund A: Trigonometric Series, 3rd edn. Cambridge University Press; 2002.
Dimov D, Laskov L: Invariant Fourier descriptors representation of medieval Byzantine neume notation. In Joint COST 2101 and 2102 International Conference on Biometric ID Management and Multimodal Communication, Madrid, 2009. Lecture Notes in Computer Science, vol. 5707. Berlin: Springer-Verlag; 2009:192–199.
Granlund G: Fourier preprocessing for hand print character recognition. IEEE Trans. Comput 1972, 21: 195–201.
Kuhl F, Giardina C: Elliptic Fourier features of a closed contour. Comput. Graph. Image Process 1982, 18: 236–258. 10.1016/0146-664X(82)90034-X
Lin CS, Hwang CL: New forms of shape invariants from elliptic Fourier descriptors. Pattern Recognit 1987, 20: 535–545. 10.1016/0031-3203(87)90080-X
Badreldin A, Wong A, Prasad T, Ismail M: Shape descriptors for n-dimensional curves and trajectories. In International Conference on Cybernetics and Society. Cambridge, 8–10 October; 1980:713–717.
Shridhar M, Badreldin A: High accuracy character recognition using Fourier and topological descriptors. Pattern Recognit 1984, 17: 515–524. 10.1016/0031-3203(84)90049-9
Zahn C, Roskies R: Fourier descriptors for plane closed curves. IEEE Trans. Comput 1972, C-21: 269–281.
Cormen T, Stein C, Leiserson C, Rivest R: Introduction to Algorithms, 2nd edn. Cambridge: MIT Press; 2001.
Bailey D: An efficient euclidean distance transform. In 10th International Conference on Combinatorial Image Analysis (IWCIA). Auckland, 1–3 December; 2004:394–408.
Dalitz C: Kd-trees for document layout analysis. In Document Image Analysis with the Gamera Framework, Schriftenreihe des Fachbereichs Elektrotechnik und Informatik, Hochschule Niederrhein, vol. 8. Aachen: Shaker-Verlag; 2009:39–52.
Dalitz C, Michalakis G, Pranzas C: Optical recognition of psaltic byzantine chant notation. Int. J. Doc. Anal. Recognit 2008, 11: 143–158. 10.1007/s10032-008-0074-4
Droettboom M, MacMillan K, Fujinaga I: The gamera framework for building custom recognition systems. In Symposium on Document Image Understanding Technologies. Greenbelt, Maryland, April 9–11; 2003:275–286.
Raudys S, Jain A: Small sample size effects in statistical pattern recognition: recommendations for practitioners. Pattern Anal. Mach. Intell. IEEE Trans 1991, 13: 252–264. 10.1109/34.75512
Latecki L, Lakamper R, Eckhardt T: Shape descriptors for non-rigid shapes with a single closed contour. In IEEE Conference on Computer Vision and Pattern Recognition. Hilton Head, 13–15 June; 2000:424–429.
Haiying G, Antani S, Long L, Thoma G: Comparative study of shape retrieval using feature fusion approaches. In IEEE 23rd International Symposium on Computer-Based Medical Systems (CBMS), 2010. Perth, 12–15 October; 2010:226–231.
Acknowledgments
We are grateful to Georgios K. Michalakis for providing us with scans of Byzantine chant books that we have used for extracting the NEUMES test data set.
Additional information
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Dalitz, C., Brandt, C., Goebbels, S. et al. Fourier descriptors for broken shapes. EURASIP J. Adv. Signal Process. 2013, 161 (2013). https://doi.org/10.1186/1687-6180-2013-161
DOI: https://doi.org/10.1186/1687-6180-2013-161