3D point cloud registration based on a purpose-designed similarity measure

Torre-Ferrero, Carlos; Llata, José R; Alonso, Luciano; Robla, Sandra; Sarabia, Esther G

doi:10.1186/1687-6180-2012-57

Research
Open access
Published: 06 March 2012


EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 57 (2012)


Abstract

This article introduces a novel approach for finding a rigid transformation that coarsely aligns two 3D point clouds. The algorithm performs an iterative comparison between 2D descriptors by using a purpose-designed similarity measure in order to find correspondences between two 3D point clouds sensed from different positions of a free-form object. The descriptors (named with the acronym CIRCON) represent an ordered set of radial contours that are extracted around an interest-point within the point cloud. The search for correspondences is done iteratively, following a cell distribution that allows the algorithm to converge toward a candidate point. Using a single correspondence an initial estimation of the Euclidean transformation is computed and later refined by means of a multiresolution approach. This coarse alignment algorithm can be used for 3D modeling and object manipulation tasks such as "Bin Picking" when free-form objects are partially occluded or present symmetries.

1. Introduction

The alignment of two point clouds is quite a frequent task, both in 3D modeling and in object recognition. Similarly, the need for automating certain applications, such as computer-aided manufacturing or bin-picking, has necessitated the use of 3D information about the parts being manipulated. This information can be sensed by 3D acquisition methods [1], such as laser scanners or time-of-flight cameras, which provide a range image for every different pose of the object.

Finding the rigid transformation producing a suitable alignment of the resulting point clouds, without having a previous estimate, is a problem that has been approached using different strategies [2, 3]. Although no solution has prevailed as the most accepted, algorithms based on intrinsic properties have been more widely applied due to their generality. These algorithms extract shape descriptors [4–15], curves [16, 17], structures [18, 19], or graphs [20] from both point clouds (sometimes meshes are used instead) in order to compare them. If several correspondences are found, then a coarse transformation that aligns them in a suitable way can be calculated.

On the other hand, the algorithms that use extrinsic properties will be subject to one important restriction: as they match properties that are relative to a coordinate system, the surfaces must be roughly aligned in order to establish point correspondences. Therefore, these algorithms (such as the ICP algorithm [21, 22] and its variants [23]) are used to refine that initial transformation and obtain a more precise one. Since this refinement process has been successfully achieved, the most challenging part of the 3D point cloud alignment problem is to determine the rough initial transformation.

2. CIRCON descriptor

2.1. Introduction

After reviewing the state of the art, the following problems were found in the most significant alignment algorithms [2, 3]: lack of generality (they work well only with objects of a given topology); excessive computation time; problems with symmetries; poor behavior when the point clouds have low density and when they overlap each other only in a small region; and the need for a method (usually based on robust statistical techniques) that discards false correspondences and obtains a valid group of correspondences from which to compute the Euclidean transformation. All the methods reviewed show, to a greater or lesser extent, at least two of these drawbacks.

The three characteristics to which we have given most importance when designing our alignment algorithm are: no restrictions regarding the type of objects that can be used, good performance in the presence of symmetries (which are quite common in industrial components), and good behavior when the overlap and the density of the point clouds are low. However, we have also taken into account the other problems that may occur.

One of the main drawbacks observed after analysis of the commonly used descriptors in the state-of-the-art is that although many of them are based on geometric properties of the environment of the point-of-interest, the evaluation of their similarity does not have a direct relationship with the distance between the point clouds [4, 8–15]. Moreover, since a good alignment is characterized by a small distance between corresponding points, it would be more convenient to use a descriptor that represents the geometry of the environment better. Furthermore, the descriptors analyzed need to find at least three good correspondences to determine an approximate Euclidean transformation.

Another drawback associated with some descriptors is that at the end of the local matching stage, a considerable percentage of false matches can be found. This is usually caused by a descriptor with low discriminating capacity and a choice of similarity measure that is not sufficiently appropriate.

In our opinion, for the correspondence search to be effective, the descriptor used by the coarse alignment algorithm should:

Be based on the geometry in the environment of the points-of-interest.
Be highly descriptive, so that the correspondences can be adequately discriminated and no false matches appear.
Enable the use of a similarity measure based on distances between points of the cloud.
Enable the calculation of a Euclidean transformation based on a single correspondence.
Be useful for 3D modeling (alignment of two scans from two different views of the object) and for 3D object recognition (alignment of the point clouds in the scene and the model).

2.2. Descriptor construction

In order to obtain the descriptor associated with a particular point-of-interest in the cloud (let ^WP_q be this point), it is necessary to express the cloud points in a local coordinate system centered on ^WP_q and whose Z_q-axis is its normal vector. The X_q-axis is chosen so that it is perpendicular to both the Y_W-axis of the reference coordinate system and the normal vector at the point-of-interest. Thus the Y_q-axis is determined by the cross product of the unit vectors along the Z_q and X_q axes. This criterion establishes a unique reference for the angles of rotation about the Z_q-axis (i.e., about the normal ${\bar{n}}_{q}$), which will subsequently facilitate the calculation of the Euclidean transformation associated with that correspondence.

Once the cloud points are transformed to the local frame, the environment of the point-of-interest is considered to be divided into n_s sectors (whose angle is ρ_θ radians), which are further divided radially into cells with length ρ_r mm (excluding the cell closest to the centre, "cell 0", which will be a sector with a radius 0.5 ρ_r mm). The sectors are numbered clockwise starting with the sector that is centred on the X_q-axis (θ₁ = 0). Figure 1 shows, around the Z_q-axis, this sense of numbering and the nature of the cells for the i-th sector.

Taking into account this division of the point cloud into sectors and cells, a transformation based on cylindrical coordinates is applied in order to obtain, for any point p_d with coordinates (x_d, y_d, z_d) in the coordinate system with origin at p_q, the index i of the sector to which it belongs, the index j of the cell within that sector, and the height value associated with its coordinate z_d.

i = (⌊ n_{s} - \frac{\tan^{-1} (\frac{y_{d}}{x_{d}})}{ρ_{θ}} ⌉ \bmod n_{s}) + 1

(1)

j = ⌊ \frac{\sqrt{x_{d}^{2} + y_{d}^{2}}}{ρ_{r}} ⌉

(2)

c_{i, j} = ⌊ \frac{z_{d}}{ρ_{z}} ⌉

(3)

where ρ_θ is the angular resolution, ρ_r the radial resolution, and ρ_z the height resolution. Note also that ⌊x⌉ is the nearest integer to x. Applying these three equations to all points of the cloud and maintaining the maximum values c_{i, j} for each cell (which is equivalent to saying that its z coordinate is maximum), we obtain a set of triples (i, j, c_{i, j}) that uniquely correspond to each of the cells into which the environment of the point-of-interest has been divided. Therefore, the descriptor consists of a set of coded values that represent the contours described by the points with the maximum z coordinate of each of the cells into which each sector is divided. Figure 1 shows, for sector i, the contour described by points with the largest 'z' in the cells. As each pair of indexes (i, j) has a single value of c_{i, j}, we can build a matrix C whose row index is the sector number, i, and whose column index is the number of cell, j, within that sector (see Figure 2). Note that the cells closest to the center (the origin of the local coordinate system) are not represented in this matrix since their value would always be close to zero.
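As an illustration of Equations (1) to (3), the following is a minimal sketch of how the matrix C could be filled, not the authors' implementation (which was written in Matlab). It assumes that the points are already expressed in the local frame, that tan⁻¹ is evaluated as a four-quadrant arctangent, and that ties in the nearest-integer operation are resolved as NumPy's `rint` does; the function name `circon_descriptor` and the parameter `n_cols` (number of stored columns) are hypothetical.

```python
import numpy as np

def circon_descriptor(points, n_s, rho_r, rho_z, n_cols):
    """Fill a CIRCON-like matrix from an (N, 3) array of local-frame points."""
    rho_theta = 2.0 * np.pi / n_s            # angular resolution in radians
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    theta = np.arctan2(y, x)                 # assumption: four-quadrant arctangent
    i = np.mod(np.rint(n_s - theta / rho_theta).astype(int), n_s) + 1  # Eq. (1)
    j = np.rint(np.hypot(x, y) / rho_r).astype(int)                    # Eq. (2)
    c = np.rint(z / rho_z)                                             # Eq. (3)
    C = np.full((n_s, n_cols), np.nan)       # NaN marks cells without points
    for ii, jj, cc in zip(i, j, c):
        # "cell 0" is not stored; points beyond the last column are ignored
        if 1 <= jj <= n_cols:
            current = C[ii - 1, jj - 1]
            if np.isnan(current) or cc > current:
                C[ii - 1, jj - 1] = cc       # keep the maximum coded height
    return C
```

Empty cells are left as NaN here, which matches the convention used later when the sets of pixels are defined.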

Because the descriptor covers the whole environment of the point-of-interest, the sequence of sectors is closed: the first and last rows must be considered adjacent, since their elements with the same column index correspond to adjacent cells. In other words, this descriptor has the property of being cyclical.

This matrix can also be viewed as an image (see Figure 3) whose pixels represent the values c_{i, j}, so that each of them has an associated colour (or a gray level). Hence, this descriptor can be considered as a cyclical image of the environment of a point-of-interest. Since each row represents a radial encoding of the contour described by the maximum heights of each of the cells in a sector, we will call this descriptor "CIRCON", which is an acronym of "Cyclical Image of Radial CONtours".

3. The proposed similarity measure

3.1. Introduction

Unlike the similarity measures based on correlation coefficient (CC), mutual information (MI) [24, 25], joint entropy [26, 27], or others [28, 29] that have been used by the most popular coarse alignment algorithms, such as spin images, this similarity measure is based on distance between pixels and takes into account the problems of occlusion that can appear in real situations that need 3D registration or object recognition.

This similarity measure gives weighting to both the overlap and the proximity of two point clouds. This enables the simultaneous evaluation of the geometric consistency of the correspondences. Although computational cost increases with the number of correspondences evaluated, the Euclidean transformation associated with each correspondence can be directly calculated, given that the rotation around the normal necessary to align the two point clouds is determined. Therefore, it is possible to determine which correspondences give rise to the best-fitting Euclidean transformation and to base the stopping criterion on their validation. In this way, the algorithm terminates when the coarse transformation satisfies the end conditions imposed (detailed in the algorithm in Section 4.3), without needing to evaluate all the points-of-interest selected in the two point clouds.

3.2. Sets of pixels

Consider two CIRCON descriptors A and B corresponding to two matched points. If both a pixel from A, a_ij, and the pixel from B with the same indexes, b_ij, represent a part of the point cloud, this pair of pixels, with indexes (i, j), is considered to be overlapped.

Taking into account that the matrix elements not belonging to the point cloud will be considered computationally as 'not-a-number' (i.e., NaN), the following sets of pixels have been defined:

Intersection Set (overlapped pixels):

I_{AB} = \{(i, j) \mid (a_{ij} \neq \mathrm{NaN}) \ \text{and} \ (b_{ij} \neq \mathrm{NaN})\}

(4)

Exclusive-OR Set (non-overlapped pixels):

X_{AB} = \{(i, j) \mid (a_{ij} \neq \mathrm{NaN}) \ \text{xor} \ (b_{ij} \neq \mathrm{NaN})\}

(5)

Union Set (overlapped pixels and non-overlapped pixels):

U_{AB} = \{(i, j) \mid (a_{ij} \neq \mathrm{NaN}) \ \text{or} \ (b_{ij} \neq \mathrm{NaN})\}

(6)

These sets of pixels will be taken into account in order to define the similarity measure.
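The three sets of Equations (4) to (6) can be represented as boolean masks over the image matrices. The following is a minimal sketch under the NaN convention stated above; the function name `pixel_sets` is hypothetical.

```python
import numpy as np

def pixel_sets(A, B):
    """Return boolean masks for the intersection, exclusive-OR and union sets."""
    valid_a = ~np.isnan(A)        # pixels of A that belong to the point cloud
    valid_b = ~np.isnan(B)        # pixels of B that belong to the point cloud
    inter = valid_a & valid_b     # Eq. (4): overlapped pixels
    xor = valid_a ^ valid_b       # Eq. (5): non-overlapped pixels
    union = valid_a | valid_b     # Eq. (6): overlapped and non-overlapped pixels
    return inter, xor, union
```

By construction the union mask equals the element-wise OR of the other two masks, which is the relation stated in the definition of the Union Set.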

3.3. Area represented by each cell

The top-view area represented by each cell (see Figure 4) increases with the distance from the central point (the point-of-interest where the descriptor has been generated). Therefore, the size of this area will determine which points in the cloud correspond to a cell of the image matrix.

Accordingly, the pixels corresponding to a specific column j in the CIRCON image represent the points that have a radius between ρ_r (j-0.5) and ρ_r (j+0.5). Therefore, the theoretical area corresponding to each one of these pixels will be the same, A_pj, and it will be given by the following expression:

A_{p j} = \frac{π}{n_{s}} \cdot ({(ρ_{r} \cdot (j + 0.5))}^{2} - {(ρ_{r} \cdot (j - 0.5))}^{2})

(7)

n_s being the number of angular divisions.

Simplifying,

A_{p j} = \frac{2 π}{n_{s}} \cdot ρ_{r}^{2} \cdot j

(8)
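The step from Equation (7) to Equation (8) is a difference of squares. A short worked expansion, using only the symbols already defined, is:

```latex
\begin{aligned}
A_{pj} &= \frac{\pi}{n_s}\,\rho_r^{2}\left[(j+0.5)^{2} - (j-0.5)^{2}\right] \\
       &= \frac{\pi}{n_s}\,\rho_r^{2}\left[(j^{2} + j + 0.25) - (j^{2} - j + 0.25)\right] \\
       &= \frac{\pi}{n_s}\,\rho_r^{2}\cdot 2j
        = \frac{2\pi}{n_s}\,\rho_r^{2}\, j
\end{aligned}
```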

3.4. Weight of the pixels

In the same way, the theoretical area corresponding to a pixel of the first column (equivalent to a cell of the first ring of Figure 4) will be

A_{p 1} = \frac{2 π}{n_{s}} \cdot ρ_{r}^{2}

(9)

Therefore, the relationship of areas between a pixel in the j-th column and the one in the first column will be as follows

\frac{A_{p j}}{A_{p 1}} = j

(10)

Using this expression, the area represented by the pixels in different columns will be taken into account to correctly weight the contribution of each pixel to the average distance in the overlapped area.

3.5. Similarity measure expression

As was explained previously, the similarity measure selected will depend on the distance and the overlap among the CIRCON images.

To calculate the average distance between the pixels of the two images, those of the overlapped area and those of the non-overlapped area will be considered separately.

Therefore, the average distance in the overlapped area will be defined using the following expression:

D_{ov} = \frac{\sum_{(i, j) \in I_{AB}} A_{pj} \cdot | a_{ij} - b_{ij} |}{\sum_{(i, j) \in I_{AB}} A_{pj}} = \frac{\sum_{(i, j) \in I_{AB}} j \cdot | a_{ij} - b_{ij} |}{\sum_{(i, j) \in I_{AB}} j}

(11)

And the overlap ratio is defined as

σ_{ov} = \frac{\sum_{(i, j) \in I_{AB}} j}{\sum_{(i, j) \in U_{AB}} j}

(12)

which expresses the relationship between the weighted number of overlapping pixels and the weighted total number of pixels pertaining to the object (overlapping and non-overlapping).

The similarity measure proposed considers both terms and it provides values between 0 and 1,

M_{S} = \frac{σ_{ov}}{(ρ \cdot D_{ov} + λ') + σ_{ov} \cdot (1 - λ')}

(13)

where λ' is defined as λ' = ρ·λ, λ being a parameter whose value represents the additional distance with which non-overlapping pixels are penalized. In contrast, the parameter ρ modifies the relationship between the expected similarity value and the distance D_ov that produces it when the overlap is 100%.

The values for these parameters used in our experiments are ρ = 1 and λ = 1, which give a similarity value of 0.5 both when σ_ov = 0.5 and D_ov = 0 and when D_ov = 1 and σ_ov = 1.
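Equations (11) to (13) can be evaluated directly on two image matrices. The sketch below is an illustration rather than the authors' code: it assumes that column j of the matrix is stored at array index j − 1, that empty pixels are NaN, and that a pair of images with no overlapped pixels receives a similarity of 0 (the value Equation (13) gives for σ_ov = 0 when λ' > 0). The function name `similarity` is hypothetical.

```python
import numpy as np

def similarity(A, B, rho=1.0, lam=1.0):
    """Similarity M_S between two CIRCON-like images of equal shape."""
    valid_a, valid_b = ~np.isnan(A), ~np.isnan(B)
    inter = valid_a & valid_b                    # intersection set, Eq. (4)
    union = valid_a | valid_b                    # union set, Eq. (6)
    # Weight of each pixel is its column number j = 1..n_cols, Eq. (10)
    w = np.broadcast_to(np.arange(1, A.shape[1] + 1), A.shape)
    w_inter = w[inter].sum()
    if w_inter == 0:
        return 0.0                               # assumption: no overlap, zero similarity
    d_ov = (w[inter] * np.abs(A[inter] - B[inter])).sum() / w_inter   # Eq. (11)
    sigma_ov = w_inter / w[union].sum()                               # Eq. (12)
    lam_p = rho * lam                                                 # lambda'
    return sigma_ov / ((rho * d_ov + lam_p) + sigma_ov * (1.0 - lam_p))  # Eq. (13)
```

With ρ = 1 and λ = 1, two identical fully overlapped images give a value of 1, and the two cases mentioned above (σ_ov = 0.5 with D_ov = 0, and σ_ov = 1 with D_ov = 1) both give 0.5.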

Since, given a matching pair, the point coordinates and their corresponding normal vectors are known for both points, only one free parameter remains to be determined in order to compute the rigid transformation: the rotation around the normal. This can be easily calculated using CIRCON images since these are cyclical. By shifting the last row to the top for the first CIRCON image (from point cloud 1) and leaving the second one fixed (from point cloud 2), the similarity measure for a rotation ρ_θ can be calculated. If the last two rows are shifted to the top, the equivalent rotation will be 2ρ_θ, and so on. This can be practically implemented by means of matrix blocks so that the similarity measures for all the shifts can be computed at the same time. Subsequently, the similarity value for a matching pair will be the maximum over all the possible rotations and it will be associated with an angle k·ρ_θ (k being the number of row shifts). A preliminary analysis of this similarity measure can be found in [30].
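The search over row shifts can be sketched as a simple loop. This is an unoptimized illustration, not the block-matrix implementation mentioned above; the name `best_row_shift` is hypothetical, and `measure` stands for any function of two matrices that returns a similarity value (for instance an implementation of Equation (13)).

```python
import numpy as np

def best_row_shift(A, B, measure):
    """Return (k, score): the cyclic row shift of A that best matches B."""
    n_s = A.shape[0]
    # np.roll(A, k, axis=0) moves the last k rows of A to the top
    scores = [measure(np.roll(A, k, axis=0), B) for k in range(n_s)]
    k_best = int(np.argmax(scores))
    return k_best, scores[k_best]
```

The rotation around the normal associated with the returned shift is then k·ρ_θ.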

4. Coarse alignment algorithm

4.1. Point-of-interest selection

First, it will be necessary to select a set of points-of-interest in both clouds, {p_1i} and {p_2i}. An important novelty is that the alignment algorithm does not need a thorough selection of the points-of-interest or a large number of them to obtain a proper alignment. Instead we have designed a simple algorithm that selects the points-of-interest by taking into account the particular characteristics of the descriptor proposed. First, normal vectors at each point are calculated by interpolation using the range image data. Applying basic morphological operations on the resulting image and the Laplace operator on the normal components (see Figure 5), points are extracted whose normal vectors are stable and that are also close to areas where the normal varies abruptly. In this way, the selected points-of-interest have a stable CIRCON descriptor and, in addition, lie close to areas with relevant features.

The number of points extracted depends on the topology of the object, although a minimum distance between them is considered so that this number is not very high.

4.2. Correspondence search algorithm

This search algorithm is the core of the coarse alignment algorithm. It is based on an iterative search for the greatest value of similarity measure using the array of cells, C₁, into which the environment of a point-of-interest in point cloud 1 is divided. The chosen stopping criterion ensures that this search is convergent, since the environment where the correspondences are searched for is progressively reduced.

The algorithm will evaluate correspondences between different points of cloud 1 and a point-of-interest selected from cloud 2. The degree of validity of two matching points is determined based on the similarity between their CIRCON images: I_1x (the image of a point P_1x in cloud 1) and I₂ (the target image). Since a single point, P_1x, is extracted from each cell of the array C₁, in each iteration the algorithm performs as many similarity measure evaluations as there are 'valid' cells in the distribution C₁ around the point of the previous iteration. Note that the points whose CIRCON images obtain a low similarity value are stored in a list of non-valid indexes, ind_nv. Thus, a cell will be considered 'valid' when its ratio of non-valid points, r_nv, is less than a prefixed threshold τ_nv. This allows the number of cells to be checked to be progressively reduced.
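The validity test for cells can be sketched as follows. The text does not give the exact definition of r_nv, so this sketch assumes it is the fraction of a cell's points whose indexes appear in ind_nv, and it skips empty cells; the function name `valid_cells` and the dictionary layout of `cell_points` are hypothetical.

```python
def valid_cells(cell_points, ind_nv, tau_nv):
    """Return the cells whose ratio of non-valid points is below tau_nv.

    cell_points maps a cell index (i, j) to the list of point indexes in it.
    """
    non_valid = set(ind_nv)
    valid = []
    for cell, indexes in cell_points.items():
        if not indexes:
            continue                      # assumption: empty cells are never searched
        r_nv = sum(1 for p in indexes if p in non_valid) / len(indexes)
        if r_nv < tau_nv:
            valid.append(cell)
    return valid
```

As more points are added to ind_nv over the iterations, fewer cells pass this test, which is what reduces the number of similarity evaluations.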

Therefore, the similarity value returned by the algorithm, M_Sc, is the highest obtained in all the iterations until the stopping condition is met. The algorithm ends when an iteration uses a starting point whose distance to one of the previously used points is less than a preset δ (in our implementation δ = ρ_r/16).

As will be explained in Section 4.3, the size of the descriptors used by this search algorithm depends on the resolution level in the main algorithm. Moreover, since one of its goals must be to avoid an incorrect alignment in the presence of occlusions and symmetries of the objects, the CIRCON images will represent the entire point clouds in order to increase their descriptiveness. However, depending on the application (e.g., mixed objects), the environment size of the descriptor can be varied.

Figure 6 shows the steps that must be followed in order to perform this iterative search for correspondences.

4.3. Main algorithm: selection of the most suitable transformation

Figure 7 shows all the steps that must be followed in order to obtain a coarse alignment between two point clouds. This algorithm uses the Correspondence Search Algorithm introduced in Section 4.2 in a multiresolution approach which refines the Euclidean transformation matrix.

For each interest-point chosen in cloud 2, P_2y, the search for correspondences by cells is established for n_v levels of resolution. The number of levels and the lowest resolution must be determined through a compromise between computation time and accuracy of the point cloud alignment, which will depend on the application.

The starting points for the first level are the interest-points chosen in cloud 1, {p_1i}. This first level enables the discarding of those zones (cells) of the surroundings of the chosen point, P_1x, where, due to their low similarity, it is unlikely to find the desired correspondence. Once the Correspondence Search Algorithm has found an approximate correspondence, P_1c, for the first level, the resolution is increased and a new search around this new starting point is performed but with smaller cells. In this way, the search zone is reduced, which will be the object of progressive refinement for the next resolution levels (since only the first n_c columns of the array of cells C_1c are used for the correspondence search and this number is halved when the resolution is increased). When the convergence of the search associated with the last resolution level is achieved and the similarity value M_Sc of the resulting correspondence is greater than a value $τ_{M_{S}} (n_{v})$, its corresponding Euclidean transformation, T_c, is calculated using Equation (22).

We propose a stopping criterion for the alignment algorithm that also takes into account the characteristics of the descriptor. It uses the correspondence found and two additional ones, (P_1m, P_2n) and (P_1r, P_2s), to create a fictitious correspondence (see Figure 8) with which another transformation matrix, T_f, is computed. Those additional points are chosen from corresponding cells in which both point clouds are distributed. The condition that must be satisfied for each of the two selected points in point cloud 1 is that the angle formed by their normal vectors and the normal at the point being evaluated should be very similar to the angle formed by their corresponding normal vectors in point cloud 2.

Then the matrix associated with the fictitious correspondence, T_f, is compared with that obtained by the algorithm, T_c. The execution is stopped if a distance measure for the rotation, d_R, and another for the translation, d_t, do not exceed their respective thresholds τ_R and τ_t.

Thus, the rotation distance d_R is obtained through the following expression:

d_{R} = \sqrt{\frac{1}{3} (α_{R}^{2} + β_{R}^{2} + γ_{R}^{2})}

(14)

where (α_R, β_R, γ_R) are the ZYX Euler angles from the rotation matrix $R_{f}^{T} \cdot R_{c}$ ; R_c being the rotation matrix obtained by the algorithm, and R_f the rotation matrix associated with the fictitious correspondence.

The translation distance d_t will be calculated as the RMS distance between the translation vectors, t_c and t_f, associated with both correspondences.
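The two distances used by the stopping criterion can be sketched as below. This is an illustration under stated assumptions: the ZYX Euler angles are extracted with the usual convention R = R_z(α)·R_y(β)·R_x(γ), d_R is returned in degrees, and the RMS in d_t is taken over the three components of the difference between the translation vectors. The function names are hypothetical.

```python
import numpy as np

def rotation_distance(R_c, R_f):
    """d_R of Eq. (14) from the ZYX Euler angles of R_f^T * R_c, in degrees."""
    R = R_f.T @ R_c
    alpha = np.arctan2(R[1, 0], R[0, 0])                      # rotation about Z
    beta = np.arctan2(-R[2, 0], np.hypot(R[0, 0], R[1, 0]))   # rotation about Y
    gamma = np.arctan2(R[2, 1], R[2, 2])                      # rotation about X
    return np.degrees(np.sqrt((alpha**2 + beta**2 + gamma**2) / 3.0))

def translation_distance(t_c, t_f):
    """d_t as the RMS of the component-wise difference between t_c and t_f."""
    diff = np.asarray(t_c, dtype=float) - np.asarray(t_f, dtype=float)
    return np.sqrt(np.mean(diff**2))
```

The execution would then stop when both values are at or below their thresholds τ_R and τ_t.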

As will be shown in the results, the solution obtained by the algorithm can be sufficiently accurate for object manipulation tasks; however, the Euclidean transformation could be refined using the ICP algorithm by taking advantage of the data provided by our algorithm about the correspondences between the points in the two clouds.

4.4. Calculation of the Euclidean transformation using a single correspondence

Once the correspondence with the highest similarity measure within the surroundings of the point-of-interest chosen for an iteration of the algorithm is found, the Euclidean transformation that coarsely aligns the point clouds can be calculated.

In the first place, it is necessary to express both point clouds within a frame of coordinates whose origin is each of the two points that have been matched and whose z-axis is aligned with the normal vectors at these points.

By convention, the x-axis of the new frame is perpendicular to the normal (new z-axis) and to the y-axis of the original frame W, since the CIRCON images were generated in this way.

Therefore, given an interest-point ^WP_q with normal vector ${\bar{n}}_{q}$ expressed in the frame W, the data will be rotated by using the following matrix:

_{W}^{q} R = {[\begin{matrix} {\bar{x}}_{q} & {\bar{n}}_{q} \times {\bar{x}}_{q} & {\bar{n}}_{q} \end{matrix}]}^{T}

(15)

where

{\bar{x}}_{q} = \frac{ŷ_{W} \times {\bar{n}}_{q}}{\| ŷ_{W} \times {\bar{n}}_{q} \|}

(16)

The translation vector is given by the coordinates of the point in the original frame. Therefore, the total Euclidean transformation will be given by the following expression:

_{W}^{q} T = {[\begin{matrix} _{W}^{q} R & -_{W}^{q} R \cdot^{W} P_{q} \\ 0_{1 \times 3} & 1 \end{matrix}]}^{- 1}

(17)

Given that the transformation matrix is 4 × 4, all the points in the following equations are expressed in homogeneous coordinates.
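The construction of the local frame can be sketched numerically. The code below builds the rotation of Equations (15) and (16) and assembles a 4 × 4 matrix in the direction that Equation (18) uses, that is, the matrix that takes coordinates expressed in W to the frame of the interest-point, so that the interest-point maps to the origin and its normal to the z-axis. The function name `local_frame_transform` is hypothetical, and raising an error when the normal is parallel to the y-axis of W is an assumption, since that case is not discussed in the text.

```python
import numpy as np

def local_frame_transform(p_q, n_q):
    """4x4 matrix taking W coordinates to the frame centered on p_q."""
    n = np.asarray(n_q, dtype=float)
    n = n / np.linalg.norm(n)
    x = np.cross([0.0, 1.0, 0.0], n)         # numerator of Eq. (16)
    norm_x = np.linalg.norm(x)
    if norm_x < 1e-12:
        raise ValueError("normal is parallel to the Y axis of W")
    x = x / norm_x
    R = np.vstack([x, np.cross(n, x), n])    # rows are the local axes, Eq. (15)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = -R @ np.asarray(p_q, dtype=float)   # sends p_q to the origin
    return T
```

Applying the returned matrix to the homogeneous coordinates of the interest-point gives (0, 0, 0, 1), and applying it to the point displaced one unit along the normal gives (0, 0, 1, 1).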

Suppose a correspondence between a point ^{W_1}P_α of cloud 1 and a point ^{W_2}P_β of cloud 2. Let $_{W_{1}}^{α} T$ be the transformation matrix that enables the expression within the interest-point frame of the coordinates of a point in cloud 1 expressed in the original frame W₁. In the same way, $_{W_{2}}^{β} T$ permits a similar transformation for point cloud 2.

A point ^{W_1}P_i is expressed within the frame of point ^{W_1}P_α in the following way:

^{α} P_{i} =_{W_{1}}^{α} T \cdot^{W_{1}} P_{i}

(18)

In the same way, a point ^{W_2}P_j in the frame centered on ^{W_2}P_β will be:

^{β} P_{j} =_{W_{2}}^{β} T \cdot^{W_{2}} P_{j}

(19)

As the points ^{W_1}P_α and ^{W_2}P_β form a correspondence, the origins of coordinates and the z-axes of the new frames must be coincident. To align a point ^αP_i in point cloud 1 with its corresponding point ^βP_j in cloud 2 it is necessary to rotate cloud 1 about the z-axis of the new frame by an angle of k·ρ_θ radians, where k is the number of rows the CIRCON image 1 was rotated to achieve the similarity value associated with this correspondence.

Therefore, to coarsely align the two points, the following must be fulfilled:

^{β} P_{j} \approx R_{z} (k) \cdot^{α} P_{i}

(20)

Substituting (18) and (19) into (20)

_{W_{2}}^{β} T \cdot^{W_{2}} P_{j} \approx R_{z} (k) \cdot_{W_{1}}^{α} T \cdot^{W_{1}} P_{i}

(21)

Consequently, the final Euclidean transformation T_c that coarsely aligns the two point clouds has the following expression:

T_{c} =_{W_{2}}^{β} T^{- 1} \cdot R_{z} (k) \cdot_{W_{1}}^{α} T

(22)

where

R_{z} (k) = \begin{pmatrix} \cos (ρ_{θ} \cdot k) & \sin (ρ_{θ} \cdot k) & 0 & 0 \\ - \sin (ρ_{θ} \cdot k) & \cos (ρ_{θ} \cdot k) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}

(23)
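Equations (22) and (23) amount to a product of three 4 × 4 matrices. A minimal sketch follows, assuming that `T_alpha` and `T_beta` are the matrices that take each cloud from its original frame to the frame of its matched point, as in Equations (18) and (19); the function names are hypothetical.

```python
import numpy as np

def rot_z(k, rho_theta):
    """Homogeneous matrix R_z(k) of Eq. (23) for k row shifts."""
    a = rho_theta * k
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0, 0.0],
                     [-s, c, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def coarse_transform(T_alpha, T_beta, k, rho_theta):
    """T_c of Eq. (22): maps points of cloud 1 (frame W1) into frame W2."""
    return np.linalg.inv(T_beta) @ rot_z(k, rho_theta) @ T_alpha
```

With k = 0 the matched point of cloud 1 is carried exactly onto the matched point of cloud 2, since both local frames share their origin.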

5. Results

In order to enable the comparison of the proposed alignment algorithm with some of the existing ones we use some objects that were employed in different comparative studies [2, 23, 31].

As in the comparative study by Salvi et al. [2], an analysis of the efficiency of the algorithm will not be carried out since, as discussed in that study, this is very implementation dependent (in our case Matlab® was used). For this reason, we assess the performance of the algorithm in terms of effectiveness by measuring the alignment error for different free-form objects.

However, as a reference, when the point clouds have an overlap of more than 70%, the time spent by the alignment algorithm is, in most cases, less than 5 s, while for very low overlap percentages that value can be exceeded. In the latter case the increase in time is due, first, to the need to use more starting points for the algorithm in order to avoid false matches and, second, to a choice of these points that is not sufficiently suitable (given the simplicity of the points-of-interest selection algorithm); both imply additional iterations.

Figure 9 shows an example of the results obtained for the alignment of two range images using the algorithm presented in the previous sections. To accelerate the evaluation of the correspondences, reduced point clouds have been used (see Figure 9b). To do this, a simple algorithm has been implemented which creates a grid of x-y coordinates (with a predetermined spacing) and calculates the z coordinates through interpolation of the initial values of the surroundings of each new point. Obviously, this reduces the precision of the original point clouds; however, the results obtained for different objects (Figure 10) from the Stuttgart University Database [32] demonstrate the algorithm is robust to these alterations of the data.

The CIRCON images shown in Figure 9c correspond to the points in which the greatest similarity measure was obtained for the three resolution levels. The color map used to show these images was chosen with the aim of visualizing the similarity better.

The number of resolution levels used for the experiments was three, with 12, 24, and 48 angular divisions. The number of search columns, n_c, was, respectively, 8, 4, and 2 so that the number of search cells was always 96. These values were chosen for the implementation of our algorithm by testing different combinations to align synthetic point clouds. It was noted that with fewer than ten angular divisions the algorithm was faster at the first level, but it favored the emergence of false correspondences, which increases the computation time of the next levels and can lead to incorrect final alignment. On the other hand, we observed that 48 angular divisions for the highest resolution level are sufficient to obtain an acceptable approximate alignment. Although the maximum error on the rotation around the normal vector that could be committed is 3.75°, in practice the algorithm evaluates the similarity of so many correspondences with different orientations of the normal vector that usually the rotation error is under that value (as shown in the results).
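The figures quoted in this paragraph can be checked with a short calculation. The second line assumes that the maximum rotation error equals half of the angular step at the highest resolution level, because the rotation around the normal is only evaluated at multiples of ρ_θ:

```latex
\begin{aligned}
n_s \cdot n_c &: \quad 12 \cdot 8 = 24 \cdot 4 = 48 \cdot 2 = 96 \\
\rho_\theta &= \frac{360^\circ}{48} = 7.5^\circ, \qquad
\frac{\rho_\theta}{2} = 3.75^\circ
\end{aligned}
```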

Figure 9d shows the coarse alignment obtained for the reduced point cloud and a 3D rendering when the transformation obtained by the algorithm is also applied to the original data.

The rotation and translation errors were computed using similar expressions to those introduced in Section 4.3. In this case the transformation matrix chosen for comparison was obtained by refinement using a variant of the ICP algorithm [24].

As can be observed in the caption of Figure 9 and in Table 1, the rotation errors are less than 5° and the translation errors are, in all cases, less than the resolution of the two reduced point clouds that are used by the coarse alignment algorithm.

Table 1 Errors obtained by the coarse alignment algorithm for ten different objects


This demonstrates that the algorithm is able to achieve a good performance despite using point clouds with few points (less than 6% of the original quantity) and different resolutions, which make it suitable for aligning point clouds with low density acquired by different devices.

The alignment algorithm was also evaluated using three different similarity measures for comparing the descriptors: CC, MI, and the measure proposed (SM). This experiment aimed to demonstrate that the combination of descriptor and similarity measure proposed has a better performance under conditions of low density and overlap than the others. Using one single range image and removing different parts two new range images were created for different degrees of overlap. Previously, the resolution of the original range images was reduced by interpolation. Thus, the algorithm was evaluated starting from low resolution with the aim of obtaining an alignment with rotation and translation errors of less than 5° and 5 mm, respectively. If the algorithm were not capable of finding a good alignment, the resolution is increased successively having the original resolution as a limit. The number of points of the reduced range image (before removing parts) necessary for a good alignment is shown in Figure 11 for 12 range images corresponding to four different objects. These results show that the proposed algorithm and similarity measure have a good performance when both density and overlap are low, which is not possible with algorithms that use 2D or 3D descriptors [33], since they need a sufficient quantity of points in order to construct the descriptors.

Moreover, as shown in Figure 12, the alignment algorithm can also be used to determine the pose of an object in a cluttered scene (a problem that arises in bin-picking tasks). In this case, point cloud 1 represents the scene and point cloud 2 represents the object model. Thus, the radius of the point-of-interest environment must be limited to the maximum radius that can be obtained for the object (in Figure 12 this limit was fixed at 120 mm).

6. Conclusions

We have introduced a novel descriptor (CIRCON) which represents, through a cyclical image, the geometry of the environment of a point-of-interest in the cloud. In order to construct the image matrix we distribute the points in sectors which, in turn, are subdivided into cells that have the same radial length. The values of the matrix elements represent the maximum z coordinate of the points contained in their corresponding cells. This is an important difference with respect to methods that use 2D histograms, such as spin images [8], which are more vulnerable to changes in point cloud density (especially when the densities of the two clouds are significantly different).
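The construction summarized above can be illustrated with a minimal sketch. It assumes that the points are already expressed in a local frame centred at the point-of-interest with the z axis along its normal, that rows correspond to angular sectors and columns to radial cells, and that empty cells are marked with `None`; these choices, together with the function name `circon_like_descriptor` and its parameters, are assumptions made for the example and not the full construction of Section 2.2.

```python
import math

def circon_like_descriptor(points, n_sectors=8, n_rings=4, cell_length=1.0):
    """Build a sector-by-ring matrix holding the maximum z of the points in each cell.

    points: iterable of (x, y, z) in the local frame of the point-of-interest.
    Cells that receive no point stay None.
    """
    image = [[None] * n_rings for _ in range(n_sectors)]
    sector_width = 2.0 * math.pi / n_sectors
    for x, y, z in points:
        radius = math.hypot(x, y)
        ring = int(radius // cell_length)   # all cells share the same radial length
        if ring >= n_rings:
            continue                        # outside the described environment
        theta = math.atan2(y, x) % (2.0 * math.pi)
        # Clamp guards against theta rounding up to exactly 2*pi.
        sector = min(int(theta // sector_width), n_sectors - 1)
        current = image[sector][ring]
        if current is None or z > current:
            image[sector][ring] = z         # keep the maximum z, not a point count
    return image
```

Because each cell stores a height rather than a count, adding more points to an already populated cell can only change its value if a higher point appears, which is the intuition behind the lower sensitivity to density changes mentioned above.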

We have also designed a novel similarity measure that takes into account both the distances between the pixels of the descriptors and their degree of overlap, which are not considered by other methods due to the particular characteristics of the descriptors. Furthermore, this similarity measure takes advantage of the cyclical nature of the descriptor to obtain, along with the similarity value, an index that represents the rotation around the normal at the point-of-interest. When the similarity of two descriptors is evaluated, this rotation index, the matched points and their normal vectors can be used to calculate a Euclidean transformation matrix; that is, the two point clouds can be aligned by determining one single correspondence.
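The way a cyclical descriptor yields a rotation index can be sketched as a search over cyclic row shifts. In this illustration a descriptor is a list of rows, one per angular sector, whose entries are heights or `None` for empty cells. The score is deliberately a simple placeholder (mean absolute difference over the cells populated in both descriptors) and is not the similarity measure proposed in this article, which also weights the degree of overlap. The names `best_cyclic_shift` and `shift_to_angle_deg` are hypothetical, and the sign convention of the resulting angle depends on how the sectors are ordered.

```python
import math

def best_cyclic_shift(desc_a, desc_b):
    """Return (shift, score) for the row shift of desc_b that best matches desc_a.

    Lower scores are better. Returns (None, inf) if no shift produces
    any cell populated in both descriptors.
    """
    n_sectors = len(desc_a)
    best_shift, best_score = None, math.inf
    for shift in range(n_sectors):
        total, overlap = 0.0, 0
        for s in range(n_sectors):
            row_a = desc_a[s]
            row_b = desc_b[(s + shift) % n_sectors]   # cyclic indexing over sectors
            for a, b in zip(row_a, row_b):
                if a is not None and b is not None:
                    total += abs(a - b)
                    overlap += 1
        if overlap == 0:
            continue
        score = total / overlap                        # placeholder dissimilarity
        if score < best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score

def shift_to_angle_deg(shift, n_sectors):
    """Rotation about the normal, in degrees, that corresponds to a row shift."""
    return 360.0 * shift / n_sectors
```

For two descriptors with four sectors, a best shift of one row would correspond to a rotation of 90° around the normal, which, combined with the matched points and their normals, is the information needed to propose a transformation from a single correspondence.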

Using this similarity measure, the descriptors can be compared without having to restrict the neighborhood of the point-of-interest, so the discriminating power can be increased in order to avoid misalignment when the objects have symmetries or repeated regions (problems that are not well solved by other methods, such as spin images, as explained in [2]).

Based on this combination of descriptor and similarity measure we have designed a coarse alignment algorithm that eliminates the need to find a group of valid correspondences (which is necessary in most algorithms, including spin images [8]). One of the main advantages of this algorithm is that the stopping criterion is evaluated every time it finds a correspondence whose similarity exceeds the maximum value found so far. Thus, if certain conditions are met, the algorithm ends without having to find additional correspondences.

The results show that the proposed algorithm is able to find a proper alignment despite using simple criteria for selecting the points-of-interest. However, in some cases these starting points are not the most appropriate and the algorithm has to perform more iterations than necessary. Since the algorithm can end as soon as it finds a correspondence that has high similarity and meets the stopping criterion, it is very likely that, with appropriately selected points-of-interest, it would end after the first few iterations in most cases. Furthermore, if these keypoints are obtained by new multiscale methods [34], the support sizes can be calculated for the descriptors in both point clouds and the alignment could be carried out, as in [35], using point clouds with different scales.

References

1. Sansoni G, Trebeschi M, Docchio F: State-of-the-art and applications of 3D imaging sensors in industry, cultural heritage, medicine, and criminal investigation. Sensors 2009, 9: 568-601. 10.3390/s90100568
2. Salvi J, Matabosch C, Fofi D, Forest J: A review of recent range image registration methods with accuracy evaluation. Image Vis Comput 2007, 25: 578-596. 10.1016/j.imavis.2006.05.012
3. Planitz BM, Maeder AJ, Williams JA: The correspondence framework for 3D surface matching algorithms. Comput Vis Image Understand 2005, 97: 347-383. 10.1016/j.cviu.2004.08.001
4. Chua CS, Jarvis R: Point signatures: A new representation for 3D object recognition. Int J Comput Vis 1997, 25: 63-85. 10.1023/A:1007981719186
5. Gelfand N, Mitra NJ, Guibas LJ, Pottmann H: Robust global registration. In Symposium on Geometry Processing. Vienna, Austria; 2005:197-206.
6. Feldmar J, Ayache N: Rigid, affine and locally affine registration of free-form surfaces. Int J Comput Vis 1996, 18: 99-119. 10.1007/BF00054998
7. Barequet G, Sharir M: Partial surface matching by using directed footprints. In Proc 12th Annual Symp Computational Geometry. Philadelphia, USA; 1996:409-410.
8. Johnson AE, Hebert M: Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Trans Pattern Anal Mach Intell 1999, 21: 433-449. 10.1109/34.765655
9. Ashbrook AP, Fisher RB, Robertson C, Werghi N: Aligning arbitrary surfaces using pairwise geometric histograms. In Proc NMBIA98. Glasgow, UK; 1998:103-108.
10. Ashbrook AP, Fisher RB, Robertson C, Werghi N: Finding surface correspondence for object recognition and registration using pairwise geometric histograms. In Computer Vision-ECCV'98. Freiburg, Germany; 1998:674-686.
11. Yamany SM, Farag AA: Surface signatures: an orientation independent free-form surface representation scheme for the purpose of objects registration and matching. IEEE Trans Pattern Anal Mach Intell 2002, 24: 1105-1120. 10.1109/TPAMI.2002.1023806
12. Masuda T: Automatic registration of multiple range images by the local log-polar range images. In Proc Third International Symposium on 3D Data Processing, Visualization, and Transmission. Chapel Hill, USA; 2006:216-223.
13. Körtgen M, Novotni M, Klein R: 3D shape matching with 3D shape contexts. In The 7th Central European Seminar on Computer Graphics. Budmerice, Slovakia; 2003:1-12.
14. Zhang D: Harmonic shape images: a 3-D free-form surface representation and its application in surface matching. Ph.D. dissertation, Carnegie Mellon University; 1999.
15. Stein F, Medioni G: Structural indexing: efficient 2D object recognition. IEEE Trans Pattern Anal Mach Intell 1992, 14: 1198-1204. 10.1109/34.177385
16. Wyngaerd JV, Koch R, Proesmans M, Gool LV: Invariant-based registration of surface patches. In IEEE International Conference on Computer Vision. Volume 1. Kerkyra, Greece; 1999:301-306.
17. Krsek P, Pajdla T, Hlavác V, Martin R: Range image registration driven by a hierarchy of surfaces. In 22nd Workshop of the Austrian Association for Pattern Recognition. Illmitz, Austria; 1998:175-183.
18. Song Chen C, Ping Hung Y, Bo Cheng J: RANSAC-based DARCES: a new approach to fast automatic registration of partially overlapping range images. IEEE Trans Pattern Anal Mach Intell 1999, 21: 1229-1234. 10.1109/34.809117
19. Chua CS, Jarvis R: 3D free-form surface registration and object recognition. Int J Comput Vis 1996, 17: 77-99. 10.1007/BF00127819
20. Cheng J, Don H: A graph matching approach to 3-D point correspondences. Pattern Recogn Artif Intell 1991, 5: 399-412. 10.1142/S0218001491000223
21. Besl PJ, McKay HD: A method for registration of 3-D shapes. IEEE Trans Pattern Anal Mach Intell 1992, 14: 239-256. 10.1109/34.121791
22. Chen Y, Medioni G: Object modelling by registration of multiple range images. Image Vis Comput 1992, 10: 145-155. 10.1016/0262-8856(92)90066-C
23. Rusinkiewicz S, Levoy M: Efficient variants of the ICP algorithm. In Proceedings of the Third Intl Conf on 3D Digital Imaging and Modeling. Quebec City, Canada; 2001:145-152.
24. Wells WM III, Viola P, Atsumi H, Nakajima S, Kikinis R: Multi-modal volume registration by maximization of mutual information. Med Image Anal 1996, 1: 35-51. 10.1016/S1361-8415(01)80004-9
25. Studholme C, Hill D, Hawkes D: An overlap invariant entropy measure of 3D medical image alignment. Pattern Recogn 1999, 32: 71-86. 10.1016/S0031-3203(98)00091-0
26. Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G: Automated multi-modality image registration based on information theory. In Proc of International Conference on Information Processing in Medical Imaging. Ile de Berder, France; 1995:263-274.
27. Studholme C, Hill DL, Hawkes DJ: Multiresolution voxel similarity measures for MR-PET registration. In Proc of International Conference on Information Processing in Medical Imaging. Ile de Berder, France; 1995:287-298.
28. Skerl D, Likar B, Pernus F: A protocol for evaluation of similarity measures for rigid registration. IEEE Trans Med Imag 2006, 25: 779-791.
29. Penney G, Weese J, Little J, Desmedt P, Hill D, Hawkes D: A comparison of similarity measures for use in 2D-3D medical image registration. IEEE Trans Med Imag 1998, 17: 586-595. 10.1109/42.730403
30. Torre Ferrero C, Llata J, Robla S, Sarabia E: A similarity measure for 3D rigid registration of point clouds using image-based descriptors with low overlap. S3DV09. In IEEE 12th International Conference on Computer Vision, ICCV Workshops 2009. Kyoto, Japan; 2009:71-78.
31. Zinsser T, Schmidt J, Niemann H: A refined ICP algorithm for robust 3-D correspondence estimation. In Proceedings of the International Conference on Image Processing. Barcelona, Spain; 2003:695-698.
32. Eisele K, Hetzel G: Range image database, University of Stuttgart. [http://range.informatik.uni-stuttgart.de/htdocs/html/]
33. Mian A, Bennamoun M, Owens RA: A novel representation and feature matching algorithm for automatic pairwise registration of range images. Int J Comput Vis 2006, 66: 19-40. 10.1007/s11263-005-3221-0
34. Mian A, Bennamoun M, Owens RA: On the repeatability and quality of keypoints for local feature-based 3D object retrieval from cluttered scenes. Int J Comput Vis 2010, 89: 348-361. 10.1007/s11263-009-0296-z
35. Novatnack J, Nishino K: Scale-dependent/invariant local 3D shape descriptors for fully automatic registration of multiple sets of range images. In Proceedings of the 10th European Conference on Computer Vision: Part III. Marseille, France; 2008:440-453.

Acknowledgements

This study was carried out with the support of the Spanish CICYT project DPI2006-15313.

Author information

Authors and Affiliations

Electronics Technology, Systems and Automation Engineering Department, University of Cantabria, Santander, Av. Los Castros s/n, 39005, Cantabria, Spain
Carlos Torre-Ferrero, José R Llata, Luciano Alonso, Sandra Robla & Esther G Sarabia

Corresponding author

Correspondence to Carlos Torre-Ferrero.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Torre-Ferrero, C., Llata, J.R., Alonso, L. et al. 3D point cloud registration based on a purpose-designed similarity measure. EURASIP J. Adv. Signal Process. 2012, 57 (2012). https://doi.org/10.1186/1687-6180-2012-57

Received: 09 March 2011
Accepted: 06 March 2012
Published: 06 March 2012
DOI: https://doi.org/10.1186/1687-6180-2012-57
