

Open Access

3D point cloud registration based on a purpose-designed similarity measure

  • Carlos Torre-Ferrero1 (corresponding author),
  • José R Llata1,
  • Luciano Alonso1,
  • Sandra Robla1 and
  • Esther G Sarabia1
EURASIP Journal on Advances in Signal Processing 2012, 2012:57

Received: 9 March 2011

Accepted: 6 March 2012

Published: 6 March 2012


This article introduces a novel approach for finding a rigid transformation that coarsely aligns two 3D point clouds. The algorithm performs an iterative comparison between 2D descriptors by using a purpose-designed similarity measure in order to find correspondences between two 3D point clouds sensed from different positions of a free-form object. The descriptors (named with the acronym CIRCON) represent an ordered set of radial contours that are extracted around an interest-point within the point cloud. The search for correspondences is done iteratively, following a cell distribution that allows the algorithm to converge toward a candidate point. Using a single correspondence an initial estimation of the Euclidean transformation is computed and later refined by means of a multiresolution approach. This coarse alignment algorithm can be used for 3D modeling and object manipulation tasks such as "Bin Picking" when free-form objects are partially occluded or present symmetries.


Keywords: laser scanner; 3D point cloud; descriptor; similarity measure; coarse alignment; 3D registration

1. Introduction

The alignment of two point clouds is quite a frequent task, both in 3D modeling and in object recognition. Similarly, the need for automating certain applications, such as computer-aided manufacturing or bin-picking, has necessitated the use of 3D information about the parts being manipulated. This information can be sensed by 3D acquisition methods [1], such as laser scanners or time-of-flight cameras, which provide a range image for every different pose of the object.

Finding the rigid transformation producing a suitable alignment of the resulting point clouds, without having a previous estimate, is a problem that has been approached using different strategies [2, 3]. Although no solution has prevailed as the most accepted, algorithms based on intrinsic properties have been more widely applied due to their generality. These algorithms extract shape descriptors [4–15], curves [16, 17], structures [18, 19], or graphs [20] from both point clouds (sometimes meshes are used instead) in order to compare them. If several correspondences are found, then a coarse transformation that aligns them in a suitable way can be calculated.

On the other hand, the algorithms that use extrinsic properties will be subject to one important restriction: as they match properties that are relative to a coordinate system, the surfaces must be roughly aligned in order to establish point correspondences. Therefore, these algorithms (such as the ICP algorithm [21, 22] and its variants [23]) are used to refine that initial transformation and obtain a more precise one. Since this refinement process has been successfully achieved, the most challenging part of the 3D point cloud alignment problem is to determine the rough initial transformation.

2. CIRCON descriptor

2.1. Introduction

After reviewing the state-of-the-art, the following problems were found in the most significant alignment algorithms [2, 3]: lack of generality (they work well only with objects of a given topology); excessive computation time; problems with symmetries; poor behavior when the point clouds have low density or overlap each other only in a small region; and the need for a method (usually based on robust statistical techniques) that discards false correspondences and selects a valid group of correspondences from which to obtain the Euclidean transformation. All the methods reviewed show, to a greater or lesser extent, at least two of these drawbacks.

The three characteristics to which we have given most importance when designing our alignment algorithm are: no restrictions regarding the type of objects that can be used, good performance in the presence of symmetries (which are quite common in industrial components), and good behavior when the overlap and the density of the point clouds are low. However, we have also taken into account the other problems that may occur.

One of the main drawbacks observed after analysis of the commonly used descriptors in the state-of-the-art is that although many of them are based on geometric properties of the environment of the point-of-interest, the evaluation of their similarity does not have a direct relationship with the distance between the point clouds [4, 8–15]. Moreover, since a good alignment is characterized by a small distance between corresponding points, it would be more convenient to use a descriptor that represents the geometry of the environment better. Furthermore, the descriptors analyzed need to find at least three good correspondences to determine an approximate Euclidean transformation.

Another drawback associated with some descriptors is that at the end of the local matching stage, a considerable percentage of false matches can be found. This is usually caused by a descriptor with low discriminating capacity and a choice of similarity measure that is not sufficiently appropriate.

In our opinion, for the correspondence search to be effective, the descriptor used by the coarse alignment algorithm should:

  • Be based on the geometry in the environment of the points-of-interest.

  • Be highly descriptive, so that the correspondences can be adequately discriminated and no false matches appear.

  • Enable the use of a similarity measure based on distances between points of the cloud.

  • Enable the calculation of a Euclidean transformation based on a single correspondence.

  • Be useful for 3D modeling (alignment of two scans from two different views of the object) and for 3D object recognition (alignment of the point clouds in the scene and the model).

2.2. Descriptor construction

In order to obtain the descriptor associated with a particular point-of-interest in the cloud (let ${}^{W}p_q$ be this point), it is necessary to express the cloud points in a local coordinate system centered on ${}^{W}p_q$ and whose $Z_q$-axis is its normal vector. The $X_q$-axis is chosen so that it is perpendicular to both the $Y_W$-axis of the reference coordinate system and the normal vector at the point-of-interest. Thus the $Y_q$-axis is determined by the cross product of the unit vectors along the $Z_q$ and $X_q$ axes ($\bar{n}_q \times \bar{x}_q$). This criterion establishes a unique reference for the angles of rotation about the $Z_q$-axis (i.e., about the normal $\bar{n}_q$), which will subsequently facilitate the calculation of the Euclidean transformation associated with that correspondence.

Once the cloud points are transformed to the local frame, the environment of the point-of-interest is considered to be divided into $n_s$ sectors (whose angle is $\rho_\theta$ radians), which are further divided radially into cells with length $\rho_r$ mm (excluding the cell closest to the centre, "cell 0", which will be a sector with a radius of $0.5\,\rho_r$ mm). The sectors are numbered clockwise starting with the sector that is centred on the $X_q$-axis ($\theta_1 = 0$). Figure 1 shows, around the $Z_q$-axis, this sense of numbering and the nature of the cells for the $i$th sector.
Figure 1

Construction of a CIRCON descriptor. Green shows cell division in sector i and red indicates the contour formed by the points of the cells with the greatest z coordinate.

Taking into account this division of the point cloud into sectors and cells, a transformation based on cylindrical coordinates is applied in order to obtain, for any point $p_d$ with coordinates $(x_d, y_d, z_d)$ in the coordinate system with origin at $p_q$, the index $i$ corresponding to the number of the sector to which it belongs, the index $j$ indicating the cell within that sector, and the height value associated with its coordinate $z_d$:
$$i = \left( \left( n_s - \left\lfloor \frac{\tan^{-1}(y_d / x_d)}{\rho_\theta} \right\rceil \right) \bmod n_s \right) + 1 \tag{1}$$
$$j = \left\lfloor \frac{\sqrt{x_d^2 + y_d^2}}{\rho_r} \right\rceil \tag{2}$$
$$c_{i,j} = \left\lfloor \frac{z_d}{\rho_z} \right\rceil \tag{3}$$
where $\rho_\theta$ is the angular resolution, $\rho_r$ the radial resolution, and $\rho_z$ the height resolution. Note also that $\lfloor x \rceil$ is the nearest integer to $x$. Applying these three equations to all points of the cloud and keeping the maximum value $c_{i,j}$ for each cell (which is equivalent to keeping the point whose $z$ coordinate is maximum), we obtain a set of triples $(i, j, c_{i,j})$ that uniquely correspond to each of the cells into which the environment of the point-of-interest has been divided. Therefore, the descriptor consists of a set of coded values that represent the contours described by the points with the maximum $z$ coordinate of each of the cells into which each sector is divided. Figure 1 shows, for sector $i$, the contour described by the points with the largest $z$ in the cells. As each pair of indexes $(i, j)$ has a single value of $c_{i,j}$, we can build a matrix $C$ whose row index is the sector number, $i$, and whose column index is the number of the cell, $j$, within that sector (see Figure 2). Note that the cells closest to the center (the origin of the local coordinate system) are not represented in this matrix since their value would always be close to zero.
Figure 2

CIRCON matrix. The arrows show the decreasing direction for angle and the increasing direction for radius.
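The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `circon_descriptor` and the parameter `n_cols` (number of radial columns kept) are hypothetical, `atan2` is used in place of $\tan^{-1}(y_d/x_d)$ so that all four quadrants are handled, and Python's built-in `round` (which rounds exact ties to the even integer) stands in for the nearest-integer operator.

```python
import math

def circon_descriptor(points, n_s, rho_r, rho_z, n_cols):
    """Build a CIRCON-style matrix from points given in the local frame.

    points : iterable of (x, y, z) already expressed in the frame of the
             point-of-interest (z along its normal).
    Returns n_s rows by n_cols columns; cells without points hold NaN.
    """
    rho_theta = 2.0 * math.pi / n_s            # angular resolution (rad)
    nan = float("nan")
    C = [[nan] * n_cols for _ in range(n_s)]
    for x, y, z in points:
        j = round(math.hypot(x, y) / rho_r)    # radial cell index
        if j < 1 or j > n_cols:                # skip "cell 0" and far points
            continue
        # Assumption: atan2 replaces tan^-1(y/x) to cover all quadrants.
        k = round(math.atan2(y, x) / rho_theta)
        i = (n_s - k) % n_s + 1                # sector index 1..n_s, clockwise
        c = round(z / rho_z)                   # quantised height
        current = C[i - 1][j - 1]
        if math.isnan(current) or c > current:  # keep the maximum height
            C[i - 1][j - 1] = c
    return C
```

For instance, with four sectors and unit resolutions, a point at (0, -2, 5) lies a quarter turn clockwise from the X-axis at radius 2, so it is stored in row 2, column 2 of the matrix.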

Because the descriptor describes the whole environment of the point-of-interest, the sequence of rows is closed: the first and last rows must be considered adjacent, since their elements with the same column index correspond to adjacent cells. In other words, this descriptor has the property of being cyclical.

This matrix can also be viewed as an image (see Figure 3) whose pixels represent the values $c_{i,j}$, so that each of them has an associated colour (or a gray level). Hence, this descriptor can be considered as a cyclical image of the environment of a point-of-interest. Since each row represents a radial encoding of the contour described by the maximum heights of each of the cells in a sector, we will call this descriptor "CIRCON", which is an acronym of "Cyclical Image of Radial CONtours".
Figure 3

Example of a CIRCON image. Point cloud with the normal vector of a point in magenta (left). CIRCON image corresponding to that point (right).

3. The proposed similarity measure

3.1. Introduction

Unlike the similarity measures based on correlation coefficient (CC), mutual information (MI) [24, 25], joint entropy [26, 27], or others [28, 29] that have been used by the most popular coarse alignment algorithms, such as spin images, this similarity measure is based on distance between pixels and takes into account the problems of occlusion that can appear in real situations that need 3D registration or object recognition.

This similarity measure gives weighting to both the overlap and the proximity of two point clouds. This enables the simultaneous evaluation of the geometric consistency of the correspondences. Although computational cost increases with the number of correspondences evaluated, the Euclidean transformation associated with a correspondence can be directly calculated, given that the rotation around the normal necessary to align the two point clouds is determined. Therefore, it will be possible to determine which correspondences give rise to the Euclidean transformation which fits best and to base the stopping criterion on their validation. In this way, the algorithm terminates when the coarse transformation satisfies the end conditions imposed (detailed in the algorithm in Section 4.3), without needing to evaluate all the points-of-interest selected in the two point clouds.

3.2. Sets of pixels

Consider two CIRCON descriptors A and B corresponding to two matched points. If both a pixel from A, $a_{ij}$, and the pixel from B with the same indexes, $b_{ij}$, represent a part of the point cloud, this pair of pixels, with indexes $(i, j)$, is considered to be overlapped.

Taking into account that the matrix elements not belonging to the point cloud will be considered computationally as 'not-a-number' (i.e., NaN), the following sets of pixels have been defined:

Intersection Set (overlapped pixels):
$$I_{AB} = \{ (i, j) \mid (a_{ij} \neq \mathrm{NaN}) \ \text{and} \ (b_{ij} \neq \mathrm{NaN}) \} \tag{4}$$
Exclusive-OR Set (non-overlapped pixels):
$$X_{AB} = \{ (i, j) \mid (a_{ij} \neq \mathrm{NaN}) \ \text{xor} \ (b_{ij} \neq \mathrm{NaN}) \} \tag{5}$$
Union Set (overlapped pixels and non-overlapped pixels):
$$U_{AB} = \{ (i, j) \mid (a_{ij} \neq \mathrm{NaN}) \ \text{or} \ (b_{ij} \neq \mathrm{NaN}) \} \tag{6}$$

These sets of pixels will be taken into account in order to define the similarity measure.
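As a minimal sketch of how these three sets could be obtained in practice, assuming the descriptors are stored as lists of rows with NaN marking empty cells (the function name `pixel_sets` is hypothetical):

```python
import math

def pixel_sets(A, B):
    """Return (I, X, U): sets of (i, j) index pairs, 1-based as in the text."""
    I, X, U = set(), set(), set()
    for i, (row_a, row_b) in enumerate(zip(A, B), start=1):
        for j, (a, b) in enumerate(zip(row_a, row_b), start=1):
            in_a = not math.isnan(a)   # pixel of A belongs to the cloud
            in_b = not math.isnan(b)   # pixel of B belongs to the cloud
            if in_a and in_b:
                I.add((i, j))          # overlapped pixel
            if in_a != in_b:
                X.add((i, j))          # present in exactly one image
            if in_a or in_b:
                U.add((i, j))          # present in at least one image
    return I, X, U
```

By construction the union set is the disjoint union of the intersection and exclusive-OR sets, which is a convenient sanity check on any implementation.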

3.3. Area represented by each cell

The top-view area represented by each cell (see Figure 4) increases with the distance from the central point (the point-of-interest where the descriptor has been generated). Therefore, the size of this area will determine which points in the cloud correspond to a cell of the image matrix.
Figure 4

Area of the cells into which the environment of a point-of-interest is divided. These cells are represented by the pixels in the CIRCON image.

Accordingly, the pixels corresponding to a specific column $j$ in the CIRCON image represent the points that have a radius between $\rho_r (j - 0.5)$ and $\rho_r (j + 0.5)$. Therefore, the theoretical area corresponding to each one of these pixels will be the same, $A_{pj}$, and it will be given by the following expression:
$$A_{pj} = \frac{\pi}{n_s} \left( (\rho_r (j + 0.5))^2 - (\rho_r (j - 0.5))^2 \right) \tag{7}$$

$n_s$ being the number of angular divisions. This expression simplifies to

$$A_{pj} = \frac{2\pi}{n_s} \rho_r^2 \, j \tag{8}$$
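The step between the two expressions for the pixel area is a difference of squares; written out in full:

```latex
\begin{aligned}
A_{pj} &= \frac{\pi}{n_s}\,\rho_r^{2}\left[(j+0.5)^{2}-(j-0.5)^{2}\right] \\
       &= \frac{\pi}{n_s}\,\rho_r^{2}\left[(j^{2}+j+0.25)-(j^{2}-j+0.25)\right] \\
       &= \frac{2\pi}{n_s}\,\rho_r^{2}\,j
\end{aligned}
```

The area therefore grows linearly with the column index, which is what allows the column index itself to be used as a weight.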

3.4. Weight of the pixels

In the same way, the theoretical area corresponding to a pixel of the first column (equivalent to a cell of the first ring in Figure 4) will be
$$A_{p1} = \frac{2\pi}{n_s} \rho_r^2 \tag{9}$$
Therefore, the ratio between the area of a pixel in the $j$th column and that of a pixel in the first column will be
$$\frac{A_{pj}}{A_{p1}} = j \tag{10}$$

Using this expression, the area represented by the pixels in different columns will be taken into account to correctly weight the contribution of each pixel to the average distance in the overlapped area.

3.5. Similarity measure expression

As was explained previously, the similarity measure selected will depend on the distance and the overlap among the CIRCON images.

To calculate the average distance between the pixels of the two images, those of the overlapped area and those of the non-overlapped area will be considered separately.

Therefore, the average distance in the overlapped area will be defined using the following expression:
$$D_{ov} = \frac{\sum_{(i,j) \in I_{AB}} A_{pj} \, |a_{ij} - b_{ij}|}{\sum_{(i,j) \in I_{AB}} A_{pj}} = \frac{\sum_{(i,j) \in I_{AB}} j \, |a_{ij} - b_{ij}|}{\sum_{(i,j) \in I_{AB}} j} \tag{11}$$
And the overlap ratio is defined as
$$\sigma_{ov} = \frac{\sum_{(i,j) \in I_{AB}} j}{\sum_{(i,j) \in U_{AB}} j} \tag{12}$$

which expresses the relationship between the weighted number of overlapping pixels and the weighted total number of pixels pertaining to the object (overlapping and non-overlapping).

The similarity measure proposed considers both terms and it provides values between 0 and 1,
$$MS = \frac{\sigma_{ov}}{(\rho D_{ov} + \lambda') + \sigma_{ov} (1 - \lambda')} \tag{13}$$

where $\lambda'$ is defined as $\lambda' = \rho \lambda$, $\lambda$ being a parameter whose value represents the additional distance with which non-overlapping pixels are penalized. The parameter $\rho$, in contrast, modifies the relationship between the expected similarity value and the distance $D_{ov}$ that produces it when the overlap is 100%.

The values for these parameters used in our experiments are $\rho = 1$ and $\lambda = 1$, which give a similarity value of 0.5 both when $\sigma_{ov} = 0.5$ and $D_{ov} = 0$ and when $D_{ov} = 1$ and $\sigma_{ov} = 1$.
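To make the two reference values concrete, the following sketch computes the weighted average distance, the overlap ratio, and the similarity for two small descriptors. It assumes the measure has the form $MS = \sigma_{ov} / ((\rho D_{ov} + \lambda') + \sigma_{ov}(1 - \lambda'))$ with $\lambda' = \rho\lambda$, which is consistent with the two cases quoted above; the function name `similarity` and the convention of returning 0 when there is no overlap are assumptions made for illustration.

```python
import math

def similarity(A, B, rho=1.0, lam=1.0):
    """Similarity between two CIRCON matrices (lists of rows, NaN = empty)."""
    w_inter = 0.0   # sum of weights j over the intersection set
    w_union = 0.0   # sum of weights j over the union set
    w_dist = 0.0    # weighted sum of |a - b| over the intersection set
    for row_a, row_b in zip(A, B):
        for j, (a, b) in enumerate(zip(row_a, row_b), start=1):
            in_a, in_b = not math.isnan(a), not math.isnan(b)
            if in_a or in_b:
                w_union += j
            if in_a and in_b:
                w_inter += j
                w_dist += j * abs(a - b)
    if w_inter == 0.0:
        return 0.0                    # assumed convention: no overlap
    d_ov = w_dist / w_inter           # average distance in the overlap
    s_ov = w_inter / w_union          # overlap ratio
    lam_p = rho * lam                 # lambda' = rho * lambda
    return s_ov / ((rho * d_ov + lam_p) + s_ov * (1.0 - lam_p))

nan = float("nan")
full_overlap_far = similarity([[0.0, 0.0]], [[1.0, 1.0]])   # D_ov = 1, sigma_ov = 1
half_overlap_near = similarity([[5.0], [5.0]], [[5.0], [nan]])  # D_ov = 0, sigma_ov = 0.5
```

Both `full_overlap_far` and `half_overlap_near` evaluate to 0.5 with the default parameters, while two identical, fully overlapping descriptors give 1.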

Since, given a matching pair, the point coordinates and their corresponding normal vectors are known for both points, only one free parameter is needed to compute the rigid transformation: the rotation around the normal. This can be easily calculated using CIRCON images since these are cyclical. By shifting the last row to the top for the first CIRCON image (from point cloud 1) and leaving the second one fixed (from point cloud 2), the similarity measure for a rotation $\rho_\theta$ can be calculated. If the last two rows are shifted to the top, the equivalent rotation will be $2\rho_\theta$, and so on. This can be practically implemented by means of matrix blocks so that the similarity measures for all the shifts can be computed at the same time. Subsequently, the similarity value for a matching pair will be the maximum over all the possible rotations and it will be associated with an angle $k \cdot \rho_\theta$ ($k$ being the number of row shifts). A preliminary analysis of this similarity measure can be found in [30].
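The search over row shifts can be illustrated with a simple loop. This sketch evaluates the shifts one at a time rather than with the block-matrix formulation mentioned above, and it reuses an assumed form of the similarity measure with $\rho = \lambda = 1$ (so that $MS = \sigma_{ov} / (D_{ov} + 1)$); the names `_similarity` and `best_rotation` are hypothetical.

```python
import math

def _similarity(A, B):
    """Assumed similarity with rho = lambda = 1: sigma_ov / (D_ov + 1)."""
    w_inter = w_union = w_dist = 0.0
    for row_a, row_b in zip(A, B):
        for j, (a, b) in enumerate(zip(row_a, row_b), start=1):
            in_a, in_b = not math.isnan(a), not math.isnan(b)
            if in_a or in_b:
                w_union += j
            if in_a and in_b:
                w_inter += j
                w_dist += j * abs(a - b)
    if w_inter == 0.0:
        return 0.0
    return (w_inter / w_union) / (w_dist / w_inter + 1.0)

def best_rotation(A, B, rho_theta):
    """Try every cyclic row shift of A against B.

    Returns (similarity, k, angle) for the best shift, where k is the
    number of rows moved from the bottom of A to its top.
    """
    best_ms, best_k = -1.0, 0
    for k in range(len(A)):
        shifted = A[-k:] + A[:-k] if k else A   # last k rows moved to the top
        ms = _similarity(shifted, B)
        if ms > best_ms:                         # first maximum wins on ties
            best_ms, best_k = ms, k
    return best_ms, best_k, best_k * rho_theta
```

If B is a copy of A with its last row moved to the top, `best_rotation` returns a similarity of 1 with `k = 1`, i.e. a rotation of one angular step.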

4. Coarse alignment algorithm

4.1. Point-of-interest selection

First, it will be necessary to select a set of points-of-interest in both clouds, {p1i} and {p2i}. An important novelty is that the alignment algorithm does not need a thorough selection of the points-of-interest or a large number of them to obtain a proper alignment. Instead we have designed a simple algorithm that selects the points-of-interest by taking into account the particular characteristics of the descriptor proposed. First, normal vectors at each point are calculated by interpolation using the range image data. Applying basic morphological operations on the resulting image and the Laplace operator on the normal components (see Figure 5), points are extracted whose normal vectors are stable and also close to areas where the normal varies abruptly. This enables points-of-interest with a stable CIRCON descriptor to be selected that, in addition, represent close areas with relevant features.
Figure 5

Point-of-interest selection. (a) Range image. (b) Edge detection image. (c) Resulting image after applying the Laplace operator to the normal vector components. (d) Points-of-interest selected: Non-edge pixels whose value in image (c) is lower than the median value.

The number of points extracted depends on the topology of the object, although a minimum distance between them is considered so that this number is not very high.

4.2. Correspondence search algorithm

This search algorithm is the core of the coarse alignment algorithm. It is based on an iterative search for the greatest value of similarity measure using the array of cells, C1, into which the environment of a point-of-interest in point cloud 1 is divided. The chosen stopping criterion ensures that this search is convergent, since the environment where the correspondences are searched for is progressively reduced.

The algorithm will evaluate correspondences between different points of cloud 1 and a point-of-interest selected from cloud 2. The degree of validity of two matching points is determined from the similarity between their CIRCON images: I1x (image of a point P1x in cloud 1) and I2 (target image). Since a single point, P1x, is extracted from each cell of the array C1, in each iteration the algorithm performs as many similarity measure evaluations as there are 'valid' cells in the distribution C1 around the point of the previous iteration. Note that the points whose CIRCON images obtain a low similarity value are stored in a list of non-valid indexes, $ind_{nv}$. Thus, a cell will be considered 'valid' when its ratio of non-valid points, $r_{nv}$, is less than a preset threshold $\tau_{nv}$. This progressively reduces the number of cells to be checked.

Therefore, the similarity value returned by the algorithm, MSc, is the highest obtained in all the iterations until the stopping condition is met. The algorithm ends when an iteration uses a starting point whose distance to one of the previously used points is less than a preset $\delta$ (in our implementation $\delta = \rho_r / 16$).
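The two tests that drive this search, cell validity and convergence, can be sketched as follows. The data layout is an assumption made for illustration: each cell is represented by the list of indexes of the points it contains, the non-valid list is a set of point indexes, and an empty cell is treated as not valid. The function names are hypothetical.

```python
import math

def cell_is_valid(point_indices, ind_nv, tau_nv):
    """A cell is 'valid' when its ratio of non-valid points is below tau_nv."""
    if not point_indices:
        return False                      # assumed: empty cells are skipped
    n_bad = sum(1 for idx in point_indices if idx in ind_nv)
    r_nv = n_bad / len(point_indices)     # ratio of non-valid points
    return r_nv < tau_nv

def search_converged(start_point, previous_starts, delta):
    """Stop when the new starting point is within delta of a previous one."""
    return any(math.dist(start_point, p) < delta for p in previous_starts)
```

With $\rho_r = 8$ mm, for example, the quoted setting $\delta = \rho_r / 16$ would stop the search once a new starting point falls within 0.5 mm of one already used.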

As will be explained in Section 4.3, the size of the descriptors used by this search algorithm depends on the resolution level in the main algorithm. Moreover, since one of its goals must be to avoid an incorrect alignment in the presence of occlusions and symmetries of the objects, the CIRCON images will represent the entire point clouds in order to increase their descriptiveness. However, depending on the application (e.g., mixed objects), the environment size of the descriptor can be varied.

Figure 6 shows the steps that must be followed in order to perform this iterative search for correspondences.
Figure 6

Correspondence search algorithm.

4.3. Main algorithm: selection of the most suitable transformation

Figure 7 shows all the steps that must be followed in order to obtain a coarse alignment between two point clouds. This algorithm uses the Correspondence Search Algorithm introduced in Section 4.2 in a multiresolution approach which refines the Euclidean transformation matrix.
Figure 7

Coarse alignment algorithm.

For each interest-point chosen in cloud 2, P2y, the search for correspondences by cells is established for $n_v$ levels of resolution. The number of levels and the lowest resolution must be determined through a compromise between computation time and accuracy of the point cloud alignment, which will depend on the application.

The starting points for the first level are the interest-points chosen in cloud 1, {p1i}. This first level enables the discarding of those zones (cells) of the surroundings of the chosen point, P1x, where, due to their low similarity, it is unlikely to find the desired correspondence. Once the Correspondence Search Algorithm has found an approximate correspondence, P1c, for the first level, the resolution is increased and a new search around this new starting point is performed but with smaller cells. In this way, the search zone is reduced, which will be the object of progressive refinement for the next resolution levels (since only the first $n_c$ columns of the array of cells C1c are used for the correspondence search and this number is halved when the resolution is increased). When the convergence of the search associated with the last resolution level is achieved and the similarity value MSc of the resulting correspondence is greater than a value $\tau_{MS}(n_v)$, its corresponding Euclidean transformation, Tc, is calculated using Equation (22).

We propose a stopping criterion for the alignment algorithm that also takes into account the characteristics of the descriptor. It uses the correspondence found and two additional ones, (P1m, P2n) and (P1r, P2s), to create a fictitious correspondence (see Figure 8) with which another transformation matrix, $T_f$, is computed. Those additional points are chosen from corresponding cells in which both point clouds are distributed. The condition that must be satisfied for each of the two selected points in point cloud 1 is that the angle formed by their normal vectors and the normal at the point being evaluated should be very similar to the angle formed by their corresponding normal vectors in point cloud 2.
Figure 8

Evaluation of the stopping criterion. The points $p_{t1}$ and $p_{t2}$ and their normal vectors are obtained from the correspondence (p1a, p2b) found by the alignment algorithm and two additional correspondences (p1m, p2n) and (p1r, p2s). $p_{t1}$ and $p_{t2}$ are the centroids of the triangles.

Then the matrix associated with the fictitious correspondence, $T_f$, is compared with that obtained by the algorithm, $T_c$. The execution is stopped if a distance measure for the rotation, $d_R$, and another for the translation, $d_t$, do not exceed their respective thresholds $\tau_R$ and $\tau_t$.

Thus, the rotation distance $d_R$ is obtained through the following expression:
$$d_R = \sqrt{\tfrac{1}{3} \left( \alpha_R^2 + \beta_R^2 + \gamma_R^2 \right)} \tag{14}$$

where $(\alpha_R, \beta_R, \gamma_R)$ are the ZYX Euler angles from the rotation matrix $R_f^T R_c$; $R_c$ being the rotation matrix obtained by the algorithm, and $R_f$ the rotation matrix associated with the fictitious correspondence.

The translation distance $d_t$ will be calculated as the RMS distance between the translation vectors, $t_c$ and $t_f$, associated with both correspondences.
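A possible way to evaluate these two distances is sketched below. Several points are assumptions rather than details given in the text: the rotation distance is taken as the root mean square of the three Euler angles, the ZYX angles are extracted with the usual `atan2` formulas for $R = R_z(\alpha) R_y(\beta) R_x(\gamma)$, and the RMS translation distance averages the squared differences over the three components. All function names are hypothetical.

```python
import math

def rotation_distance(R_f, R_c):
    """RMS of the ZYX Euler angles (radians) of R_f^T * R_c (3x3 lists)."""
    R = [[sum(R_f[k][i] * R_c[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]                                      # R_f^T * R_c
    alpha = math.atan2(R[1][0], R[0][0])                         # about Z
    beta = math.atan2(-R[2][0], math.hypot(R[0][0], R[1][0]))    # about Y
    gamma = math.atan2(R[2][1], R[2][2])                         # about X
    return math.sqrt((alpha ** 2 + beta ** 2 + gamma ** 2) / 3.0)

def translation_distance(t_f, t_c):
    """RMS difference between two translation vectors (assumed 1/3 mean)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_f, t_c)) / 3.0)

def alignment_accepted(R_f, t_f, R_c, t_c, tau_R, tau_t):
    """Stop when neither distance exceeds its threshold."""
    return (rotation_distance(R_f, R_c) <= tau_R
            and translation_distance(t_f, t_c) <= tau_t)
```

When the two transformations coincide, both distances are zero and the alignment is accepted for any non-negative thresholds.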

As will be shown in the results, the solution obtained by the algorithm can be sufficiently accurate for object manipulation tasks; however, the Euclidean transformation could be refined using the ICP algorithm by taking advantage of the data provided by our algorithm about the correspondences between the points in the two clouds.

4.4. Calculation of the Euclidean transformation using a single correspondence

Once the correspondence with the highest similarity measure within the surroundings of the point-of-interest chosen for an iteration of the algorithm is found, the Euclidean transformation that coarsely aligns the point clouds can be calculated.

In the first place, it is necessary to express both point clouds within a frame of coordinates whose origin is each of the two points that have been matched and whose z-axis is aligned with the normal vectors at these points.

By convention, the x-axis of the new frame is perpendicular to the normal (new z-axis) and to the y-axis of the original frame W, since the CIRCON images were generated in this way.

Therefore, given an interest-point ${}^{W}P_q$ with normal vector $\bar{n}_q$ expressed in the frame $W$, the data will be rotated by using the following matrix:
$${}^{q}_{W}R = \begin{bmatrix} \bar{x}_q & \bar{n}_q \times \bar{x}_q & \bar{n}_q \end{bmatrix}^T \tag{15}$$
$$\bar{x}_q = \frac{\hat{y}_W \times \bar{n}_q}{\| \hat{y}_W \times \bar{n}_q \|} \tag{16}$$
The translation vector is given by the coordinates of the point in the original frame. Therefore, the total Euclidean transformation will be given by the following expression:
$${}^{q}_{W}T = \begin{bmatrix} {}^{q}_{W}R & -{}^{q}_{W}R \, {}^{W}P_q \\ 0_{1 \times 3} & 1 \end{bmatrix} \tag{17}$$

Given that the transformation matrix is 4 × 4, all the points in the following equations are expressed in homogeneous coordinates.
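A minimal sketch of this local-frame construction is given below, assuming the transformation has the block form $[R, -R\,P_q;\ 0, 1]$ with the rows of $R$ being the new x-, y- and z-axes. The function names are hypothetical, and the degenerate case in which the normal is parallel to the y-axis of the original frame (where the cross product vanishes) is simply rejected, since the text does not say how it is handled.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("zero-length vector: normal parallel to the y-axis?")
    return [c / norm for c in v]

def local_frame_transform(p_q, n_q, y_w=(0.0, 1.0, 0.0)):
    """4x4 matrix mapping coordinates in frame W into the frame of p_q."""
    z = normalize(n_q)                  # new z-axis: the normal
    x = normalize(cross(y_w, z))        # x_q = (y_W x n_q) / ||y_W x n_q||
    y = cross(z, x)                     # y_q = n_q x x_q
    R = [x, y, z]                       # rows are the new axes
    t = [-sum(R[r][c] * p_q[c] for c in range(3)) for r in range(3)]  # -R p_q
    return [R[0] + [t[0]], R[1] + [t[1]], R[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]
```

Applying the resulting matrix to the interest-point itself gives the origin, and applying it to the interest-point displaced by one unit along its normal gives (0, 0, 1), which is a quick way to check the construction.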

Suppose a correspondence between a point ${}^{W1}P_\alpha$ of cloud 1 and a point ${}^{W2}P_\beta$ of cloud 2. Let ${}^{\alpha}_{W1}T$ be the transformation matrix that expresses, within the interest-point frame, the coordinates of a point of cloud 1 given in the original frame $W1$. In the same way, ${}^{\beta}_{W2}T$ performs the equivalent transformation for point cloud 2.

A point ${}^{W1}P_i$ is expressed within the frame of point ${}^{W1}P_\alpha$ in the following way:
$${}^{\alpha}P_i = {}^{\alpha}_{W1}T \; {}^{W1}P_i \tag{18}$$
In the same way, a point ${}^{W2}P_j$ in the frame centered on ${}^{W2}P_\beta$ will be:
$${}^{\beta}P_j = {}^{\beta}_{W2}T \; {}^{W2}P_j \tag{19}$$

As the points ${}^{W1}P_\alpha$ and ${}^{W2}P_\beta$ form a correspondence, the origins of coordinates and the z-axes of the new frames must be coincident. To align a point ${}^{\alpha}P_i$ in point cloud 1 with its corresponding point ${}^{\beta}P_j$ in cloud 2 it is necessary to rotate cloud 1 about the z-axis of the new frame by an angle of $k \cdot \rho_\theta$ radians, where $k$ is the number of rows by which CIRCON image 1 was shifted to achieve the similarity value associated with this correspondence.

Therefore, to coarsely align the two points, the following must be fulfilled:
$${}^{\beta}P_j \approx R_z(k) \; {}^{\alpha}P_i \tag{20}$$
Substituting (18) and (19) into (20):
$${}^{\beta}_{W2}T \; {}^{W2}P_j \approx R_z(k) \; {}^{\alpha}_{W1}T \; {}^{W1}P_i \tag{21}$$
Consequently, the final Euclidean transformation $T_c$ that coarsely aligns the two point clouds has the following expression:
$$T_c = \left( {}^{\beta}_{W2}T \right)^{-1} R_z(k) \; {}^{\alpha}_{W1}T \tag{22}$$
where
$$R_z(k) = \begin{bmatrix} \cos(\rho_\theta k) & \sin(\rho_\theta k) & 0 & 0 \\ -\sin(\rho_\theta k) & \cos(\rho_\theta k) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{23}$$
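The composition of the final transformation can be sketched with plain 4 × 4 lists. The sketch assumes that both local-frame matrices are rigid (rotation plus translation), so that the inverse can be written in closed form as $[R^T, -R^T t;\ 0, 1]$; the function and argument names are hypothetical.

```python
import math

def matmul4(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(T):
    """Inverse of a rigid transform [R t; 0 1], i.e. [R^T, -R^T t; 0 1]."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3)]
    return [Rt[0] + [t_inv[0]], Rt[1] + [t_inv[1]], Rt[2] + [t_inv[2]],
            [0.0, 0.0, 0.0, 1.0]]

def rot_z(k, rho_theta):
    """Rotation about z by k angular steps, with the sign convention of R_z(k)."""
    c, s = math.cos(k * rho_theta), math.sin(k * rho_theta)
    return [[c, s, 0.0, 0.0], [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def coarse_transform(T_w1_alpha, T_w2_beta, k, rho_theta):
    """T_c = inv(T_w2_beta) * R_z(k) * T_w1_alpha."""
    return matmul4(invert_rigid(T_w2_beta),
                   matmul4(rot_z(k, rho_theta), T_w1_alpha))
```

With 48 angular divisions, `k = 12` corresponds to a quarter turn; if both local frames coincide with their original frames, the resulting matrix maps the point (1, 0, 0) to (0, -1, 0), which reflects the clockwise sense of the row shifts.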

5. Results

In order to enable the comparison of the proposed alignment algorithm with some of the existing ones we use some objects that were employed in different comparative studies [2, 23, 31].

As in the comparative study by Salvi et al. [2], an analysis of efficiency of the algorithm will not be carried out since, as discussed here, this is very implementation dependent (in our case Matlab® was used). For this reason, we assess the performance of the algorithm in terms of effectiveness by measuring the alignment error for different free-form objects.

However, as a reference, it can be said that when the point clouds have an overlap of more than 70%, the time spent by the alignment algorithm is, in most cases, less than 5 s, while for very low overlap percentages, that value can be exceeded. In this case the time increment is due, first, to the need to use more starting points for the algorithm in order to avoid false matches and secondly because the choice of these points is not sufficiently suitable (given the simplicity of the points-of-interest selection algorithm), which implies, in both cases, an additional number of iterations.

Figure 9 shows an example of the results obtained for the alignment of two range images using the algorithm presented in the previous sections. To accelerate the evaluation of the correspondences, reduced point clouds have been used (see Figure 9b). To do this, a simple algorithm has been implemented which creates a grid of x-y coordinates (with a predetermined spacing) and calculates the z coordinates through interpolation of the initial values of the surroundings of each new point. Obviously, this reduces the precision of the original point clouds; however, the results obtained for different objects (Figure 10) from the Stuttgart University Database [32] demonstrate the algorithm is robust to these alterations of the data.
Figure 9

Alignment of two range images for the object 'hip' from the Stuttgart University Database. (a) Range images. (b) Corresponding points (normal vector in magenta) which obtained the maximum similarity value for the top resolution level. (c) CIRCON images found for the three resolution levels. (d) Alignment: reduced cloud (left) and 3D rendering using the original point clouds (right). Rotation error: 1.1909°. Translation error: 0.7436 mm.

Figure 10

Ten pairs of range images for different objects from the Stuttgart University database: (a) Ducky, (b) Femur, (c) Igea, (d) Fighter, (e) Dino, (f) Mole, (g) Isis, (h) Liberty, (i) Pitbull, (j) Female.

The CIRCON images shown in Figure 9c correspond to the points in which the greatest similarity measure was obtained for the three resolution levels. The color map used to show these images was chosen with the aim of visualizing the similarity better.

The number of resolution levels used for the experiments was three with 12, 24, and 48 angular divisions. The number of search columns, n c , was, respectively, 8, 4, and 2 so that the number of search cells was always 96. These values were chosen for the implementation of our algorithm by testing different combinations to align synthetic point clouds. It was noted that with less than ten angular divisions the algorithm was faster at the first level, but it favored the emergence of false correspondences, which increases the computation time of the next levels and can lead to incorrect final alignment. On the other hand, we observed that 48 angular divisions for the highest resolution level are sufficient to obtain an acceptable approximate alignment. Although the maximum error on the rotation around the normal vector that could be committed is 3.75°, in practice the algorithm evaluates the similarity of so many correspondences with different orientations of the normal vector that usually the rotation error is under that value (as shown in the results).
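The two figures quoted in this paragraph follow directly from the chosen settings. The number of search cells is the product of the angular divisions and the search columns at each level, and the maximum rotation error is half of the finest angular step:

```latex
\begin{aligned}
n_s \cdot n_c &= 12 \cdot 8 = 24 \cdot 4 = 48 \cdot 2 = 96 \\
\rho_\theta &= \frac{360^\circ}{48} = 7.5^\circ,
\qquad \frac{\rho_\theta}{2} = 3.75^\circ
\end{aligned}
```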

Figure 9d shows the coarse alignment obtained for the reduced point cloud and a 3D rendering when the transformation obtained by the algorithm is also applied to the original data.

The rotation and translation errors were computed using similar expressions to those introduced in Section 4.3. In this case the transformation matrix chosen for comparison was obtained by refinement using a variant of the ICP algorithm [24].

As can be observed in the caption of Figure 9 and in Table 1, the rotation errors are less than 5° and the translation errors are, in all cases, less than the resolution of the two reduced point clouds that are used by the coarse alignment algorithm.
Table 1

Errors obtained by the coarse alignment algorithm for ten different objects

|  | Reduced point cloud 1: number of points (% of total) | Reduced point cloud 1: resolution (mm) | Reduced point cloud 2: number of points (% of total) | Reduced point cloud 2: resolution (mm) | Rot. error (degrees) | Translation error (mm) |
|  | 859 (1.61%) |  | 1142 (1.90%) |  |  |  |
|  | 453 (1.54%) |  | 497 (1.25%) |  |  |  |
|  | 544 (0.93%) |  | 419 (0.73%) |  |  |  |
|  | 635 (2.13%) |  | 401 (2.67%) |  |  |  |
|  | 528 (4.92%) |  | 314 (2.45%) |  |  |  |
|  | 673 (1.45%) |  | 652 (1.65%) |  |  |  |
|  | 543 (3.50%) |  | 381 (1.47%) |  |  |  |
|  | 715 (5.27%) |  | 428 (2.47%) |  |  |  |
|  | 153 (0.83%) |  | 108 (0.58%) |  |  |  |
|  | 549 (5.58%) |  | 244 (1.67%) |  |  |  |


This demonstrates that the algorithm is able to achieve a good performance despite using point clouds with few points (less than 6% of the original quantity) and different resolutions, which make it suitable for aligning point clouds with low density acquired by different devices.

The alignment algorithm was also evaluated using three different similarity measures for comparing the descriptors: the correlation coefficient (CC), mutual information (MI), and the proposed measure (SM). This experiment aimed to demonstrate that the proposed combination of descriptor and similarity measure performs better than the others under conditions of low density and low overlap. First, the resolution of the original range images was reduced by interpolation. Then, from a single range image, two new range images were created by removing different parts, giving different degrees of overlap. The algorithm was evaluated starting from a low resolution, with the aim of obtaining an alignment with rotation and translation errors of less than 5° and 5 mm, respectively. If the algorithm was not able to find a good alignment, the resolution was increased successively, with the original resolution as the limit. The number of points of the reduced range image (before removing parts) necessary for a good alignment is shown in Figure 11 for 12 range images corresponding to four different objects. These results show that the proposed algorithm and similarity measure perform well when both density and overlap are low, which is not possible with algorithms that use 2D or 3D descriptors [33], since they need a sufficient quantity of points in order to construct the descriptors.
Figure 11

Points of the reduced cloud (before removing parts) that are necessary for a correct alignment using three different similarity measures and under low overlap conditions. CC, correlation coefficient; MI, mutual information; SM, proposed similarity measure.
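
The evaluation procedure behind Figure 11 can be outlined as a simple search over resolutions. The sketch below is only a schematic reading of the text: `reduce_cloud` and `align` are hypothetical callables standing in for the interpolation step and for the coarse alignment (returning rotation and translation errors), and the thresholds default to the 5° and 5 mm limits quoted above.

```python
def min_points_for_alignment(resolutions, reduce_cloud, align,
                             max_rot_deg=5.0, max_trans_mm=5.0):
    """Return the point count of the first reduced cloud that aligns well.

    `resolutions` must be ordered from coarse to fine, ending at the
    original resolution. Returns None if no resolution succeeds.
    """
    for res in resolutions:
        cloud = reduce_cloud(res)            # hypothetical reduction step
        rot_err, trans_err = align(cloud)    # hypothetical alignment step
        if rot_err < max_rot_deg and trans_err < max_trans_mm:
            return len(cloud)
    return None


# Usage with stand-in functions (purely illustrative, not measured data).
def fake_reduce(res):
    return list(range(int(1000 / res)))


def fake_align(cloud):
    return (10.0, 8.0) if len(cloud) < 300 else (2.0, 1.5)


needed = min_points_for_alignment([8.0, 4.0, 2.0, 1.0], fake_reduce, fake_align)
```

With these stand-ins the first two resolutions fail and the third succeeds, so `needed` is the size of the third reduced cloud.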

Moreover, as shown in Figure 12, the alignment algorithm can also be used to determine the pose of an object in a cluttered scene (a problem that arises in bin-picking tasks). In this case, point cloud 1 represents the scene and point cloud 2 represents the object model. Thus, the radius of the point-of-interest environment must be limited to the maximum radius that can be obtained for the object (in Figure 12 this limit was fixed at 120 mm).
Figure 12

3D alignment in cluttered scenes (point-of-interest environment radius equal to 120 mm). (a) Range images: scene (left) and object model (right). (b) Corresponding points (normal vector in magenta) which obtained the maximum similarity value. (c) CIRCON images associated with those points. (d) Alignment: reduced cloud (left) and 3D rendering using the original point clouds (right).

6. Conclusions

We have introduced a novel descriptor (CIRCON) which represents, through a cyclical image, the geometry of the environment of a point-of-interest in the cloud. In order to construct the image matrix we distribute the points in sectors which, in turn, are subdivided into cells that have the same radial length. The values of the matrix elements represent the maximum z coordinate of the points contained in their corresponding cells. This is an important difference with respect to methods that use 2D histograms, such as spin images [8], which are more vulnerable to changes in point cloud density (especially when the densities of the two clouds differ significantly).
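
The construction summarized above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes that the points are already expressed in a local frame centred on the point-of-interest with the z axis along its normal, that sector zero starts at the local x axis, and that empty cells are marked with `None`. The name `circon_like_descriptor` and its parameters are hypothetical.

```python
import math


def circon_like_descriptor(points, n_sectors, n_cells, cell_length):
    """Build an n_sectors x n_cells matrix holding the maximum z per cell.

    `points` is an iterable of (x, y, z) tuples in the assumed local frame.
    Rows index angular sectors, columns index radial cells of equal length.
    """
    image = [[None] * n_cells for _ in range(n_sectors)]
    sector_width = 2.0 * math.pi / n_sectors
    for x, y, z in points:
        col = int(math.hypot(x, y) // cell_length)
        if col >= n_cells:
            continue  # point lies outside the chosen neighbourhood radius
        angle = math.atan2(y, x) % (2.0 * math.pi)
        row = int(angle // sector_width) % n_sectors
        # Keep the highest z value seen in this cell.
        if image[row][col] is None or z > image[row][col]:
            image[row][col] = z
    return image


sample = [(1.0, 0.1, 0.5), (1.2, 0.1, 0.9), (-1.0, 1.0, -0.3), (10.0, 0.0, 1.0)]
image = circon_like_descriptor(sample, n_sectors=4, n_cells=3, cell_length=1.0)
```

Because each cell stores a height rather than a count, adding more points to a cell changes its value only if one of them is higher, which is the property contrasted with 2D histograms in the paragraph above.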

We have also designed a novel similarity measure that takes into account both the distances between the pixels of the descriptors and their degree of overlap, which are not considered by other methods due to the particular characteristics of the descriptors. Furthermore, this similarity measure takes advantage of the cyclical nature of the descriptor to obtain, along with the similarity value, an index that represents the rotation around the normal at the point-of-interest. When the similarity of two descriptors is evaluated, this rotation index, the matched points and their normal vectors can be used to calculate a Euclidean transformation matrix; that is, the two point clouds can be aligned by determining one single correspondence.
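
The way a cyclical descriptor yields a rotation index can be illustrated by comparing one descriptor against every cyclic row shift of the other and keeping the best score. In the sketch below the scoring function is a simple placeholder (overlap ratio penalized by the mean absolute difference), not the similarity measure proposed in this article; descriptors are assumed to be lists of rows with `None` marking empty cells, and all names are hypothetical.

```python
def stand_in_similarity(a, b):
    """Placeholder score: overlap ratio divided by (1 + mean abs difference)."""
    diffs = [abs(x - y)
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)
             if x is not None and y is not None]
    if not diffs:
        return 0.0
    total_cells = sum(len(row) for row in a)
    overlap = len(diffs) / total_cells
    return overlap / (1.0 + sum(diffs) / len(diffs))


def best_rotation_index(desc_a, desc_b, similarity=stand_in_similarity):
    """Return (shift, score) for the cyclic row shift of desc_b that best matches desc_a."""
    best_shift, best_score = 0, float("-inf")
    for shift in range(len(desc_b)):
        shifted = desc_b[shift:] + desc_b[:shift]  # rotate rows cyclically
        score = similarity(desc_a, shifted)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score


desc_a = [[0.0, 1.0], [2.0, 3.0], [4.0, None], [6.0, 7.0]]
desc_b = desc_a[3:] + desc_a[:3]  # same descriptor, rows rotated by one sector
shift, score = best_rotation_index(desc_a, desc_b)
```

Under these assumptions the rotation about the normal would be the shift multiplied by the angular width of a sector, with a sign that depends on the row-ordering convention.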

Using this similarity measure, the descriptors can be compared without having to restrict the neighborhood of the point-of-interest, so the discriminating power could be increased in order to avoid problems of misalignment when the objects have symmetries or repeated regions (problems that are not well solved by other methods, such as spin images, as is explained in [2]).

Based on this combination of descriptor and similarity measure we have designed a coarse alignment algorithm that eliminates the need to find a group of valid correspondences (which is necessary in most algorithms, including spin images [8]). One of the main advantages of this algorithm is that the stopping criterion is evaluated whenever it finds a correspondence that exceeds the maximum similarity value found up to that moment. Thus, if certain conditions are met, the algorithm ends without having to find additional correspondences.

The results show that the proposed algorithm is able to find a proper alignment despite using simple criteria for selecting the points-of-interest. However, in some cases these starting points are not the most appropriate and the algorithm has to perform more iterations than necessary. Because the algorithm can end as soon as it finds a correspondence with high similarity that meets the stopping criterion, an appropriate selection of the points-of-interest would very likely allow it to end after the first few iterations in most cases. Furthermore, if these keypoints are obtained by new multiscale methods [34], the support sizes can be calculated for the descriptors in both point clouds and the alignment could be carried out, as in [35], using point clouds with different scales.



This study was carried out with the support of the Spanish CICYT project DPI2006-15313.

Authors’ Affiliations

Electronics Technology, Systems and Automation Engineering Department, University of Cantabria, Santander, Cantabria, Spain


  1. Sansoni G, Trebeschi M, Docchio F: State-of-the-art and applications of 3D imaging sensors in industry, cultural heritage, medicine, and criminal investigation. Sensors 2009, 9: 568-601. 10.3390/s90100568
  2. Salvi J, Matabosch C, Fofi D, Forest J: A review of recent range image registration methods with accuracy evaluation. Image Vis Comput 2007, 25: 578-596. 10.1016/j.imavis.2006.05.012
  3. Planitz BM, Maeder AJ, Williams JA: The correspondence framework for 3D surface matching algorithms. Comput Vis Image Understand 2005, 97: 347-383. 10.1016/j.cviu.2004.08.001
  4. Chua CS, Jarvis R: Point signatures: A new representation for 3D object recognition. Int J Comput Vis 1997, 25: 63-85. 10.1023/A:1007981719186
  5. Gelfand N, Mitra NJ, Guibas LJ, Pottmann H: Robust global registration. In Symposium on Geometry Processing. Vienna, Austria; 2005:197-206.
  6. Feldmar J, Ayache N: Rigid, affine and locally affine registration of free-form surfaces. Int J Comput Vis 1996, 18: 99-119. 10.1007/BF00054998
  7. Barequet G, Sharir M: Partial surface matching by using directed footprints. In Proc 12th Annual Symp Computational Geometry. Philadelphia, USA; 1996:409-410.
  8. Johnson AE, Hebert M: Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Trans Pattern Anal Mach Intell 1999, 21: 433-449. 10.1109/34.765655
  9. Ashbrook AP, Fisher RB, Robertson C, Werghi N: Aligning arbitrary surfaces using pairwise geometric histograms. In Proc NMBIA98. Glasgow, UK; 1998:103-108.
  10. Ashbrook AP, Fisher RB, Robertson C, Werghi N: Finding surface correspondence for object recognition and registration using pairwise geometric histograms. In Computer Vision-ECCV'98. Freiburg, Germany; 1998:674-686.
  11. Yamany SM, Farag AA: Surface signatures: an orientation independent free-form surface representation scheme for the purpose of objects registration and matching. IEEE Trans Pattern Anal Mach Intell 2002, 24: 1105-1120. 10.1109/TPAMI.2002.1023806
  12. Masuda T: Automatic registration of multiple range images by the local log-polar range images. In Proc Third International Symposium on 3D Data Processing, Visualization, and Transmission. Chapel Hill, USA; 2006:216-223.
  13. Körtgen M, Novotni M, Klein R: 3D shape matching with 3D shape contexts. In The 7th Central European Seminar on Computer Graphics. Budmerice, Slovakia; 2003:1-12.
  14. Zhang D: Harmonic shape images: a 3-D free-form surface representation and its application in surface matching. Ph.D. dissertation, Carnegie Mellon University; 1999.
  15. Stein F, Medioni G: Structural indexing: efficient 2D object recognition. IEEE Trans Pattern Anal Mach Intell 1992, 14: 1198-1204. 10.1109/34.177385
  16. Wyngaerd JV, Koch R, Proesmans M, Gool LV: Invariant-based registration of surface patches. In IEEE International Conference on Computer Vision. Volume 1. Kerkyra, Greece; 1999:301-306.
  17. Krsek P, Pajdla T, Hlavác V, Martin R: Range image registration driven by a hierarchy of surfaces. In 22nd Workshop of the Austrian Association for Pattern Recognition. Illmitz, Austria; 1998:175-183.
  18. Song Chen C, Ping Hung Y, Bo Cheng J: RANSAC-based DARCES: a new approach to fast automatic registration of partially overlapping range images. IEEE Trans Pattern Anal Mach Intell 1999, 21: 1229-1234. 10.1109/34.809117
  19. Chua CS, Jarvis R: 3D free-form surface registration and object recognition. Int J Comput Vis 1996, 17: 77-99. 10.1007/BF00127819
  20. Cheng J, Don H: A graph matching approach to 3-D point correspondences. Pattern Recogn Artif Intell 1991, 5: 399-412. 10.1142/S0218001491000223
  21. Besl PJ, McKay HD: A method for registration of 3-D shapes. IEEE Trans Pattern Anal Mach Intell 1992, 14: 239-256. 10.1109/34.121791
  22. Chen Y, Medioni G: Object modelling by registration of multiple range images. Image Vis Comput 1992, 10: 145-155. 10.1016/0262-8856(92)90066-C
  23. Rusinkiewicz S, Levoy M: Efficient variants of the ICP algorithm. In Proceedings of the Third Intl Conf on 3D Digital Imaging and Modeling. Quebec City, Canada; 2001:145-152.
  24. Wells WM III, Viola P, Atsumi H, Nakajima S, Kikinis R: Multi-modal volume registration by maximization of mutual information. Med Image Anal 1996, 1: 35-51. 10.1016/S1361-8415(01)80004-9
  25. Studholme C, Hill D, Hawkes D: An overlap invariant entropy measure of 3D medical image alignment. Pattern Recogn 1999, 32: 71-86. 10.1016/S0031-3203(98)00091-0
  26. Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G: Automated multi-modality image registration based on information theory. In Proc of International Conference on Information Processing in Medical Imaging. Ile de Berder, France; 1995:263-274.
  27. Studholme C, Hill DL, Hawkes DJ: Multiresolution voxel similarity measures for MR-PET registration. In Proc of International Conference on Information Processing in Medical Imaging. Ile de Berder, France; 1995:287-298.
  28. Skerl D, Likar B, Pernus F: A protocol for evaluation of similarity measures for rigid registration. IEEE Trans Med Imag 2006, 25: 779-791.
  29. Penney G, Weese J, Little J, Desmedt P, Hill D, Hawkes D: A comparison of similarity measures for use in 2D-3D medical image registration. IEEE Trans Med Imag 1998, 17: 586-595. 10.1109/42.730403
  30. Torre Ferrero C, Llata J, Robla S, Sarabia E: A similarity measure for 3D rigid registration of point clouds using image-based descriptors with low overlap. S3DV09. In IEEE 12th International Conference on Computer Vision, ICCV Workshops 2009. Kyoto, Japan; 2009:71-78.
  31. Zinsser T, Schmidt J, Niemann H: A refined ICP algorithm for robust 3-D correspondence estimation. In Proceedings of the International Conference on Image Processing. Barcelona, Spain; 2003:695-698.
  32. Eisele K, Hetzel G: Range image database, University of Stuttgart.
  33. Mian A, Bennamoun M, Owens RA: A novel representation and feature matching algorithm for automatic pairwise registration of range images. Int J Comput Vis 2006, 66: 19-40. 10.1007/s11263-005-3221-0
  34. Mian A, Bennamoun M, Owens RA: On the repeatability and quality of keypoints for local feature-based 3D object retrieval from cluttered scenes. Int J Comput Vis 2010, 89: 348-361. 10.1007/s11263-009-0296-z
  35. Novatnack J, Nishino K: Scale-dependent/invariant local 3D shape descriptors for fully automatic registration of multiple sets of range images. In Proceedings of the 10th European Conference on Computer Vision: Part III. Marseille, France; 2008:440-453.


© Torre-Ferrero et al; licensee Springer. 2012

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.