# A novel point cloud registration using 2D image features

*EURASIP Journal on Advances in Signal Processing*
**volume 2017**, Article number: 5 (2017)

## Abstract

Since a 3D scanner captures only one view of a 3D object at a time, registration of multiple views is a key issue in 3D modeling. This paper presents a novel and efficient 3D registration method based on 2D local feature matching. The proposed method transforms the point clouds into 2D bearing angle images and then uses a 2D feature-based matching method, SURF, to find matching pixel pairs between the two images. The corresponding points of the 3D point clouds are obtained from those pixel pairs. Since the corresponding pairs are sorted by the distance between their matching features, only the top half of the pairs is used to find the optimal rotation matrix by least squares approximation. In this paper, the optimal rotation matrix is derived by the orthogonal Procrustes method (an SVD-based approach). The 3D model of an object can then be reconstructed by aligning the point clouds with the optimal transformation matrix. Experimental results show that the accuracy of the proposed method is close to that of ICP, while the computation cost is reduced significantly. The performance is six times faster than the generalized-ICP algorithm. Furthermore, while ICP requires the two scenes to be closely aligned initially, the proposed method is robust to a larger difference in viewing angle.

## 1 Introduction

During the last few decades, 3D scanner technology has developed rapidly, and many different approaches have been proposed to build 3D scanning devices that collect the shape data of objects and possibly their colors. Since the information obtained by 3D scanners is more accurate for 3D reconstruction than 2D images, 3D scanning is widely used in computer vision, industrial design, reverse engineering, robot navigation, topography measurement, filmmaking, game creation, etc. The most common data type obtained by 3D scanners is the point cloud. Point cloud registration plays an important role in 3D reconstruction because the point clouds of a given object are captured from multiple views and are expressed in different coordinate systems. The goal of registration is to find a transformation that optimally positions two given shapes, the reference and the source, in a common coordinate system [1, 2].

In [3], registration algorithms are classified as voting methods [4, 5] and corresponding feature pair-based methods [6–11]. In voting methods, the transformation space is first discretized into a multi-dimensional table. Then, for each point pair of the two given point clouds, the transformation between the reference and the source is computed and a vote is recorded in the corresponding cell of the table. The transformation with the most votes is selected as optimal. The latter methods find corresponding points, curves, planes, or other features between the reference and the source, and a rigid transformation is then derived from the set of corresponding features. In [12] and [13], the iterative closest point (ICP) algorithm, a popular method for aligning two point clouds, was proposed. In ICP, the reference is fixed in position and orientation and the source is iteratively transformed to find the rotation and translation that minimize a distance-based error metric. The basic steps of ICP are as follows. First, the closest point in the reference point cloud is found for each point in the source point cloud. Then, the rotation and translation are estimated by minimizing a mean squared error cost function. Finally, the source points are transformed by the estimated transformation, and these steps are iterated until the change in the transformation is negligible. In most cases, ICP gives a good alignment, but the computation cost is high. The execution time of ICP depends heavily on the choice of corresponding point pairs and the distance function. Recently, many ICP-based algorithms have been proposed [3, 14–20]. In [21], the reliability of ICP, which corresponds to the second-order coefficients of the ICP objective function, was proposed to improve the registration; the variance of the ICP registration error is inversely proportional to the reliability. In [18], ICPIF was proposed, which takes into account Euclidean invariant features, including moment invariants and spherical harmonics invariants. An absolute minimization was also proposed to decrease the probability of being trapped in a local minimum. In [22] and [23], expectation–maximization principles were combined with ICP to improve robustness. A coarse-to-fine approach based on an annealing scheme was also proposed in [22]. In [24], a Lie group-based affine registration was proposed, in which the affine registration problem is simplified to a quadratic programming problem by expressing the affine transformation matrix as exponential mappings of Lie group elements and their Taylor approximations.
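The basic ICP loop described above (closest-point search, least squares estimate of rotation and translation, transformation of the source, repetition) can be sketched as follows. This is only an illustrative point-to-point variant, not the implementation used in [12] or [13]: the brute-force nearest-neighbor search, the SVD-based estimate with a reflection guard, the row-vector convention, and the names `best_rigid_transform`, `icp`, `max_iter`, and `tol` are all assumptions made for the example.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least squares R, t such that dst ≈ src @ R + t (points stored as rows)."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)        # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Assumed safeguard (not from the text): force det(R) = +1 to exclude reflections.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    t = dst_mean - src_mean @ R
    return R, t


def icp(source, reference, max_iter=50, tol=1e-10):
    """Point-to-point ICP; returns the aligned copy of `source` and the final error."""
    current = source.copy()
    prev_err = np.inf
    err = np.inf
    for _ in range(max_iter):
        # Step 1: closest reference point for every source point (brute force).
        d2 = ((current[:, None, :] - reference[None, :, :]) ** 2).sum(axis=2)
        nearest = reference[d2.argmin(axis=1)]
        # Step 2: least squares rotation and translation for these pairs.
        R, t = best_rigid_transform(current, nearest)
        # Step 3: move the source, then stop once the error no longer changes.
        current = current @ R + t
        err = np.mean(np.linalg.norm(current - nearest, axis=1))
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return current, err


# Toy example: a slightly rotated and shifted copy of a random cloud.
rng = np.random.default_rng(0)
reference = rng.uniform(-0.5, 0.5, size=(40, 3))
angle = np.deg2rad(2.0)
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle),  np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
source = reference @ Rz + np.array([0.005, -0.005, 0.005])
aligned, final_err = icp(source, reference)
```

The closest-point search in step 1 costs one distance evaluation per source-reference pair in every iteration, which illustrates why the running time of ICP grows quickly with the size of the point clouds.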

However, since most existing 3D registration methods are based on 3D geometrical surface features of objects, the computation complexity is high. While 3D feature based approaches take more time in finding the corresponding point pairs between the reference and the source, registrations with 2D features can reduce the computation complexity by using 2D image matching. Furthermore, many efficient matching approaches and feature descriptors have been proposed in recent decades.

In order to find the corresponding points between the two shapes of an object efficiently, some approaches first transform the 3D shapes into 2D images. The depth image is usually adopted for transforming a 3D point cloud into a 2D image. The transformation is simple and fast because the gray level of a pixel of the depth image represents the distance from the viewpoint to a point on the object surface, but the depth image discards some important geometric information of the object, e.g., the relation between a point and its neighbors. Depth images are therefore too simple to be used in 3D registration, which requires high precision. In [25], a novel approach for transforming a 3D point cloud into a 2D image was proposed, namely the bearing angle image. A bearing angle image is a gray-level image composed from the angle between each point and its neighboring point, which highlights the edges formed by that angle. This paper presents a novel 3D alignment method based on 2D local feature matching. The proposed method transforms the point clouds into 2D bearing angle images and then uses a 2D image feature matching method, SURF, to find matching point pairs between the two images.

The rest of this paper is organized as follows. The proposed 3D registration algorithm based on 2D image features is introduced in Sections 2 and 3. The 2D bearing angle image will also be reviewed in Section 2. Section 3.2 provides some experimental results and discussions. The conclusion of this paper is found in Section 4.

## 2 Transforming 3D shapes to 2D images

The main contribution of the proposed algorithm is a significant reduction in the computation needed to find the corresponding points of two 3D shapes. The key idea is to find corresponding pixels in 2D images and then trace them back to the corresponding points of the two 3D shapes of an object. The transformation between the two shapes can then be derived from these corresponding points. As shown in Fig. 1, the proposed method is divided into several steps: (1) converting the point clouds into bearing angle images; (2) extracting the features of the two 2D images by SURF and matching the two 2D images; (3) obtaining the 3D corresponding points of the two point clouds from the 2D matching points; (4) using the top half of the corresponding pairs to find the optimal rotation matrix by least squares approximation; (5) aligning the two point clouds according to the transformation matrix. The details of the steps are described next.

### 2.1 Bearing angle image

The depth image is usually adopted for transforming a 3D point cloud into a 2D image. However, because a pixel of a depth image is the value of the Z-coordinate of a point, the relation between a point and its neighboring points is not represented in a depth image. Unlike the depth image, a bearing angle image (BA image), proposed in [25], can highlight the edges formed by angles. A BA image is a gray-level image composed from the angle between a point and its neighboring point, and thus more features can be extracted from the BA image. In [25], the BA image was proposed for the extrinsic calibration of a 3D LIDAR and a camera.

Consider a 3D shape of an object. The gray level of a pixel of its BA image is defined as the angle between the laser beam and the vector from the point to a consecutive point. This angle is calculated for each point of the shape along one of the four defined directions, which are the horizontal, vertical, and diagonal directions. In this paper, the diagonal direction is adopted. In Fig. 2, the blue points, \( P{C}_{i,j} \) and \( P{C}_{i-1,j-1} \), are the measurement points and the black point, *O*, is the origin of the point cloud, which is also the source of the laser. The angle value \( B{A}_{i,j} \) of a point \( P{C}_{i,j} \) can be defined as

\[ B{A}_{i,j}=\arccos \left(\frac{\rho_{i,j}-{\rho}_{i-1,j-1}\cos d\varphi }{\sqrt{\rho_{i,j}^2+{\rho}_{i-1,j-1}^2-2{\rho}_{i,j}{\rho}_{i-1,j-1}\cos d\varphi }}\right) \tag{1} \]

where \( {\rho}_{i,j} \) is the measured value of the *j*th scanned point of the *i*th scanning layer and \( {\rho}_{i-1,j-1} \) is the measured value of the (*j*-1)th scanned point of the (*i*-1)th scanning layer. The *dφ* is the corresponding angle increment (the laser beam angular step in the direction of the trace). Each 3D point is thus converted to a gray-level pixel, and a 2D image is obtained by converting all points of a captured 3D image. Figure 3a is a raw point cloud and Fig. 3b is the bearing angle image of the same point cloud.
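A minimal sketch of this conversion is given below. It assumes that the bearing angle at each point is obtained from the law of cosines in the triangle formed by the origin *O*, the point, and its diagonal neighbor, that the range measurements are stored as a 2D array indexed by layer and point, and that angles in [0, π] are mapped linearly to 8-bit gray levels. The function name `bearing_angle_image`, the zero-filled border, and the gray-level mapping are illustrative choices, not details taken from the paper.

```python
import numpy as np


def bearing_angle_image(rho, dphi):
    """Convert a range image `rho` (layers x points) into a bearing angle image.

    rho[i, j] is the measured range of the j-th point of the i-th layer and
    `dphi` is the angular step (in radians) between diagonal neighbours.
    Returns the angles in radians and an 8-bit gray-level image.
    """
    rho = np.asarray(rho, dtype=float)
    cur = rho[1:, 1:]       # rho_{i,j}
    prev = rho[:-1, :-1]    # rho_{i-1,j-1}, the diagonal neighbour
    num = cur - prev * np.cos(dphi)
    den = np.sqrt(cur**2 + prev**2 - 2.0 * cur * prev * np.cos(dphi))
    # Guard against a zero denominator and rounding slightly outside [-1, 1].
    cos_ba = np.clip(num / np.maximum(den, 1e-12), -1.0, 1.0)
    angle = np.zeros_like(rho)
    angle[1:, 1:] = np.arccos(cos_ba)   # first row and column have no neighbour
    # Assumed gray-level mapping: [0, pi] -> [0, 255].
    gray = np.round(angle / np.pi * 255.0).astype(np.uint8)
    return angle, gray


# Toy input: a constant range in every direction.
rho = np.full((4, 5), 2.0)
angle, gray = bearing_angle_image(rho, dphi=0.1)
```

For a constant range the triangle is isosceles, so every interior angle equals π/2 − *dφ*/2, which gives a quick sanity check of the formula.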

### 2.2 The feature extraction and matching of BA images

After the previous step, the two point clouds of the 3D shapes of the reference and the source have been converted into 2D images. Feature-based matching methods for ordinary 2D images can then be used to find the corresponding points of these two BA images. In recent years, many efficient feature-based matching methods have been proposed, e.g., the scale-invariant feature transform (SIFT), speeded-up robust features (SURF), and the spin-image method. SURF, reviewed below, is adopted for feature extraction and matching of the 2D images in this paper.

Speeded-up robust features (SURF), proposed by H. Bay et al. [26, 27], is a detector and descriptor of feature points in images. The method, inspired by SIFT, uses integral images and approximated Haar wavelet responses. SURF retains most of the properties of SIFT and reduces the computing time by decreasing the number of descriptor dimensions. There are four main steps in the SURF algorithm: (1) obtaining the integral image, (2) using the Hessian matrix to detect candidate feature points, (3) deciding the feature points and their directions, and (4) extracting the feature description vectors.
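As an illustration of step (1), the sketch below builds an integral image with NumPy and uses it to sum an arbitrary rectangle with four lookups, which is what makes the box filters used by SURF cheap to evaluate. The zero-padded layout and the names `integral_image` and `box_sum` are assumptions made for this example rather than details from [26, 27].

```python
import numpy as np


def integral_image(img):
    """Return ii with ii[r, c] = sum of img[:r, :c] (one extra zero row and column)."""
    img = np.asarray(img, dtype=np.float64)
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii


def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] using four lookups in the integral image."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]


img = np.arange(20).reshape(4, 5)
ii = integral_image(img)
total = box_sum(ii, 1, 1, 3, 4)   # same value as img[1:3, 1:4].sum()
```

The cost of `box_sum` does not depend on the size of the rectangle, so filters of any scale can be evaluated in constant time per pixel.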

Each feature point extracted by SURF is assigned a dominant orientation and a 64-dimensional descriptor vector computed from its neighborhood. Two features are matched by computing the Euclidean distance between their descriptor vectors. In addition, the quality of a match is judged by the ratio between the minimum distance and the second minimum distance. If the ratio is smaller than a threshold, the nearest candidate is clearly better than the runner-up, and the pair is accepted as a good corresponding pair of the two images.
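The matching rule can be sketched on plain descriptor arrays as follows. The example uses the common form of the ratio test, in which a match is accepted when the nearest distance is below a fraction of the second-nearest distance. The threshold value 0.7, the function name `match_descriptors`, and the random 64-dimensional test descriptors are assumptions for illustration; the paper does not state its threshold.

```python
import numpy as np


def match_descriptors(desc_a, desc_b, max_ratio=0.7):
    """Return (index_a, index_b, distance) tuples sorted by ascending distance.

    desc_a and desc_b are (n, d) arrays of descriptors; desc_b needs at least
    two rows so that a second-nearest neighbour exists.
    """
    # Pairwise Euclidean distances between all descriptors.
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    matches = []
    for i in range(desc_a.shape[0]):
        best, second = order[i, 0], order[i, 1]
        # Ratio test: keep the match only if it is clearly better than the runner-up.
        if dists[i, best] < max_ratio * dists[i, second]:
            matches.append((i, int(best), float(dists[i, best])))
    return sorted(matches, key=lambda m: m[2])


rng = np.random.default_rng(1)
desc_b = rng.normal(size=(8, 64))
desc_a = desc_b[[3, 0, 5]] + 0.01 * rng.normal(size=(3, 64))   # noisy copies
matches = match_descriptors(desc_a, desc_b)
```

Returning the matches sorted by descriptor distance is convenient because the later selection step only needs to take the leading part of the list.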

After the matching step, a number of corresponding points are found. Figure 4a shows two bearing angle images and Fig. 4b shows the corresponding pixels found by SURF. The matching result shows that the correspondences mostly fall on the edges of the chair and of the sofa next to the wall. These matching pixels are used to find the corresponding point pairs of the two point clouds (Fig. 5a and b).

## 3 Derive the transformation matrix by orthogonal Procrustes method with corresponding point sets

In the previous section, the corresponding pixel pairs of the two BA images were obtained by SURF. For a pixel of the 2D BA image, \( B{A}_{ij} \), its original 3D point of the point cloud is the *j*th scanned point of the *i*th scanning layer. The projection is a one-to-one mapping between 3D points and BA pixels. Therefore, the original 3D point of a 2D image pixel \( B{A}_{ij} \) is the 3D point \( P{C}_{ij} \). Figure 5a and b shows the matching 3D points of the two point clouds, which are derived from the matching pixels of the 2D BA images. A matched point pair is marked with the same color.
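Because of this one-to-one mapping, tracing a matched pixel back to its 3D point is a plain array lookup. The sketch below assumes that each point cloud is stored as an organized array of shape (layers, points per layer, 3) and that the matched keypoints have already been rounded to integer (*i*, *j*) pixel indices; the name `pixels_to_points` and the storage layout are assumptions for illustration.

```python
import numpy as np


def pixels_to_points(cloud, pixels):
    """Look up the 3D points behind BA pixels.

    cloud:  (layers, points_per_layer, 3) organized point cloud
    pixels: (n, 2) integer array of (i, j) = (layer, point) indices
    """
    pixels = np.asarray(pixels, dtype=int)
    return cloud[pixels[:, 0], pixels[:, 1]]


# Toy organized cloud with 2 layers and 3 points per layer.
cloud = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
points = pixels_to_points(cloud, [(0, 1), (1, 2)])
```

Applying the same lookup to the matched pixels of the reference and of the source yields the two corresponding 3D point sets used in the rest of this section.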

The SURF-based approach is a good matching method and most of the corresponding pairs are correct, but there are still a few mismatched pairs. In this paper, the corresponding pairs are used to derive the transformation matrix by the least squares approach. However, with outliers in the data set, the least squares approximation is not optimal, so the mismatched pairs have to be discarded in advance to improve the accuracy of the transformation. Since the corresponding pairs are sorted by the distance between their matching features, only the top half of the corresponding pairs, i.e., those with the smallest feature distances, is used to find the optimal rotation matrix by the least squares approximation. The problem is now reduced to finding the optimal rigid transformation between two corresponding point sets.
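The selection step can be sketched as follows, assuming the two corresponding point sets are stored as row-aligned arrays together with the feature distance of each pair. Rounding the half down and never keeping fewer than three pairs when that many are available are assumptions of this example, as is the name `keep_top_half`.

```python
import numpy as np


def keep_top_half(points_p, points_q, feature_dist):
    """Keep the half of the pairs with the smallest feature distance."""
    order = np.argsort(feature_dist)             # best matches first
    keep = order[: max(3, len(order) // 2)]      # assumed floor of three pairs
    return points_p[keep], points_q[keep]


points_p = np.arange(18, dtype=float).reshape(6, 3)
points_q = points_p + 1.0
feature_dist = np.array([0.9, 0.1, 0.5, 0.3, 0.7, 0.2])
top_p, top_q = keep_top_half(points_p, points_q, feature_dist)
```

In this toy input the pairs with feature distances 0.1, 0.2, and 0.3 are retained, in that order.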

### 3.1 The transformation between two point clouds without translation

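Over the past few decades, several algorithms have been proposed for finding the optimal rigid transformation between two corresponding point sets, ***P*** and ***Q*** [28–31]. These approaches can be divided into two groups, the SVD-based methods and the quaternion-based methods. For its accuracy and stability, the SVD-based method is adopted in this paper. Since the two corresponding point sets, ***P*** and ***Q***, usually contain more than three pairs, finding the rotation matrix between the two sets can be regarded as an orthogonal Procrustes problem. First, the translation is assumed to take the centroid of ***P*** to the centroid of ***Q***. Thus, restating the problem without translation, the points of the two sets are rewritten as

\[ {p}_{ci}={p}_i-\overline{p},\kern1em {q}_{ci}={q}_i-\overline{q} \tag{2} \]

where \( \overline{p} \) and \( \overline{q} \), the centroids of the two sets, are expressed by

\[ \overline{p}=\frac{1}{n}\sum_{i=1}^n{p}_i,\kern1em \overline{q}=\frac{1}{n}\sum_{i=1}^n{q}_i \tag{3} \]

with *n* the number of corresponding pairs. The new corresponding point sets are \( {P}_c \) and \( {Q}_c \), whose rows are the centered points. Since the optimal rotation ***R*** implies the minimal transformation error, the relation can be expressed as

\[ R=\underset{R}{\arg \min }{\left\Vert {P}_cR-{Q}_c\right\Vert}_F,\kern1em \mathrm{subject}\ \mathrm{to}\kern0.5em R{R}^T=I \tag{4} \]

where \( {\left\Vert \cdot \right\Vert}_F \) is the Frobenius norm defined as

\[ {\left\Vert A\right\Vert}_F=\sqrt{tr\left({A}^TA\right)} \tag{5} \]

The optimization can be rewritten in trace form.

\[ {\left\Vert {P}_cR-{Q}_c\right\Vert}_F^2= tr\left({\left({P}_cR-{Q}_c\right)}^T\left({P}_cR-{Q}_c\right)\right)={\left\Vert {P}_c\right\Vert}_F^2+{\left\Vert {Q}_c\right\Vert}_F^2-2 tr\left({P}_c^T{Q}_c{R}^T\right) \tag{6} \]

Since \( {\left\Vert {Q}_c\right\Vert}_F^2 \) and \( {\left\Vert {P}_c\right\Vert}_F^2 \) are constant, the minimization of Eq. (6) is equivalent to maximizing the term \( tr\left({P}_c^T{Q}_c{R}^T\right) \). It can be represented as

\[ R=\underset{R}{\arg \max } tr\left({P}_c^T{Q}_c{R}^T\right) \tag{7} \]

Since *R* is an orthogonal matrix, the constrained optimization problem can be solved by the method of Lagrange multipliers. The Lagrangian is defined as

\[ L\left(R,\varLambda \right)= tr\left({P}_c^T{Q}_c{R}^T\right)- tr\left(\varLambda \left(R{R}^T-I\right)\right) \tag{8} \]

Setting \( \frac{\partial L}{\partial R}=0 \), and noting that the constraint term \( tr\left(\varLambda \left(R{R}^T-I\right)\right) \) vanishes when \( R{R}^T=I \), the first derivative of (8) gives

\[ \frac{\partial L}{\partial R}={P}_c^T{Q}_c-\left(\varLambda +{\varLambda}^T\right)R=0 \tag{9} \]

Multiplying on the right by \( {R}^T \), Eq. (9) can be rewritten as

\[ {P}_c^T{Q}_c{R}^T=\varLambda +{\varLambda}^T \tag{10} \]

Since \( \left(\varLambda +{\varLambda}^T\right) \) is symmetric, \( {P}_c^T{Q}_c{R}^T \) is also symmetric. Considering that the SVD of \( {P}_c^T{Q}_c \) is \( U\varSigma {V}^T \), we have

\[ U\varSigma {V}^T{R}^T= RV\varSigma {U}^T \tag{11} \]

where *U* and *V* are orthogonal matrices and *Σ* is the diagonal matrix of singular values,

\[ \varSigma =\mathrm{diag}\left({\sigma}_1,{\sigma}_2,{\sigma}_3\right),\kern1em {\sigma}_1\ge {\sigma}_2\ge {\sigma}_3\ge 0 \tag{12} \]

From Eq. (11), the optimal rotation matrix can be obtained by

\[ R=U{V}^T \tag{13} \]

since this choice makes both sides of Eq. (11) equal to \( U\varSigma {U}^T \). Thus, the optimal transformation is represented as

\[ {q}_i=\left({p}_i-\overline{p}\right)R+\overline{p}+t \tag{14} \]

where \( t=\overline{q}-\overline{p} \) and \( R=U{V}^T \). As a result, the two point clouds can be aligned with the obtained rotation matrix and translation vector in (14).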
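The SVD-based solution can be sketched in a few lines of NumPy. The sketch assumes that the points are stored as rows, so that the rotation acts on the right of the centered points (the convention under which the SVD of \( {P}_c^T{Q}_c \) yields \( R=U{V}^T \)), and it rotates about the centroid of ***P*** before moving that centroid onto the centroid of ***Q***. The function name `procrustes_align` and the test data are illustrative and not taken from the paper.

```python
import numpy as np


def procrustes_align(P, Q):
    """Return R, p_bar, q_bar such that Q ≈ (P - p_bar) @ R + q_bar (points as rows)."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - p_bar, Q - q_bar            # remove the translation
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)      # Pc^T Qc = U Sigma V^T
    R = U @ Vt                               # optimal orthogonal matrix
    return R, p_bar, q_bar


# Toy check: recover a known rotation about the z axis and a known shift.
rng = np.random.default_rng(2)
P = rng.normal(size=(10, 3))
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
Q = P @ R_true + np.array([0.5, -1.0, 2.0])

R, p_bar, q_bar = procrustes_align(P, Q)
aligned = (P - p_bar) @ R + q_bar
```

With noisy or nearly planar correspondences, \( U{V}^T \) can turn out to be a reflection (determinant −1). Implementations commonly add a determinant check for that case; it is not part of the derivation above and is omitted here.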

### 3.2 Simulation result

The simulation is performed in an indoor environment, as shown in Fig. 6, and the results show that the proposed algorithm performs well in both accuracy and running time. In a 10 m × 10 m room, several frames are captured by a LIDAR, and the viewing angle differs by 30° between two consecutive frames. Each captured frame is a point cloud with 180 layers and 200 points per layer.

The matching result of the two point clouds of Fig. 4 shows that most of the matched pairs are correct, since the two frames are highly similar. However, there are a few mismatched pairs. Since the corresponding pairs are sorted by the Euclidean distance between their matching features, only the top half of the corresponding pairs is used to find the optimal rotation matrix by least squares approximation.

Using the top half of the corresponding pairs, the optimal transformation matrix can be obtained. In Fig. 8a and b, the registration results of Fig. 7 are shown in top view and side view, respectively. As shown in Fig. 8, the proposed algorithm works well for aligning two point clouds with a large difference in viewing angle. In Fig. 9, a comparison of the proposed algorithm and generalized-ICP (G-ICP) is given. G-ICP [32] combines the iterative closest point and point-to-plane ICP algorithms into a single probabilistic framework. In this experiment, the maximum number of iterations of G-ICP is 100 and the maximum correspondence distance threshold of G-ICP is set to 8 cm. The proposed algorithm is as good as G-ICP except in the area marked with a red circle.

The zoomed-in views of the red circle areas of Fig. 9a and b, obtained by the proposed algorithm and G-ICP, are shown in Fig. 10a and b, respectively. The maximum dimensions of the captured point clouds are about 4 m × 4 m × 4 m. The average matching error of the proposed algorithm is 4.1 cm and the average matching error of G-ICP is 1.5 cm. The precision of the proposed algorithm is 99%.
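The paper does not spell out how the average matching error is computed. One plausible reading, sketched below, is the mean Euclidean distance between corresponding points after alignment, with the quoted precision obtained by relating that error to the extent of the scene. Both function names and this interpretation are assumptions made for illustration.

```python
import numpy as np


def mean_matching_error(aligned_source, reference):
    """Mean Euclidean distance between corresponding points (same unit as the input)."""
    return float(np.mean(np.linalg.norm(aligned_source - reference, axis=1)))


def relative_error(mean_error, scene_extent):
    """Error expressed as a fraction of the scene extent."""
    return mean_error / scene_extent


# Two corresponding points, one of them off by 1 unit along z.
toy_err = mean_matching_error(np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
                              np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]))

# Figures quoted in the text: 4.1 cm average error in a scene about 4 m across.
ratio = relative_error(0.041, 4.0)
```

Under this reading, 4.1 cm over 4 m is an error of roughly 1%, which is consistent with the quoted precision of 99%.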

Registration results for four different types of objects are also given in Fig. 11. These point clouds are smaller than the previous one, and the algorithm also works well on them. The computation times of the two algorithms are shown in Table 1. The proposed algorithm reduces the computation time significantly and is 10 times faster than the G-ICP algorithm. In the fourth column of Table 1, the proposed algorithm is used as a coarse alignment algorithm and G-ICP is used as a fine alignment algorithm. The total running time of this combination is shorter than that of pure G-ICP, with the running time reduced by 63%.

## 4 Conclusions

This paper presents a novel 3D alignment method based on 2D local feature matching. The proposed method converts the point clouds into 2D bearing angle images and then uses a 2D image feature-based matching method, SURF, to find matching pixel pairs between the two images. Those corresponding pixels are used to obtain the original corresponding pairs of the two 3D point clouds. Since the two corresponding point sets usually contain more than three pairs, and some of them are mismatched, only the best 50% of the corresponding pairs are used to find the optimal rigid transformation matrix by the least squares approximation of the orthogonal Procrustes problem.

The main contribution of this paper is a fast and robust approach to 3D point cloud registration. Since the proposed algorithm finds the corresponding points in 2D images and requires no iterative process, its running time is lower than that of ICP-based algorithms. In our simulation, the proposed algorithm achieves high-precision registration, with a displacement of 1%. Furthermore, the proposed algorithm is 10 times faster than the G-ICP algorithm. Some ICP-based approaches use an initial guess or a coarse alignment algorithm to speed up ICP, and the proposed algorithm can serve as such a coarse alignment algorithm. In the simulation, this reduces the execution time by 63%.

Some potential research topics remain for future work. First, if mismatched pairs are selected for the least squares approximation of the orthogonal Procrustes problem, the solution is not optimal, so outlier removal techniques should be applied. For example, the corresponding pairs can be selected by a RANSAC approach to remove outliers. Second, with the obtained transformation matrix as the initial guess of ICP, the performance of ICP may be improved significantly.

## References

1. O Cordón, S Damas, J Santamaría, A fast and accurate approach for 3D image registration using the scatter search evolutionary algorithm. Pattern Recogn. Lett. **27**(11), 1191–1200 (2006)
2. L Silva, ORP Bellon, KL Boyer, Precision range image registration using a robust surface interpenetration measure and enhanced genetic algorithms. IEEE Trans. Pattern Anal. Mach. Intell. **27**(5), 762–776 (2005)
3. N Gelfand et al., Robust global registration. Symposium on geometry processing **2**(3), 5 (2005)
4. YC Hecker, RM Bolle, On geometric hashing and the generalized Hough transform. IEEE Transactions on Systems, Man and Cybernetics **24**(9), 1328–1338 (1994)
5. HJ Wolfson, I Rigoutsos, *Geometric hashing: an overview, IEEE Computational Science & Engineering*, 1997, pp. 10–21
6. BKP Horn, Extended Gaussian images. Proc. IEEE **72**(12), 1671–1686 (1984)
7. RJ Campbell, PJ Flynn, *Eigenshapes for 3D object recognition in range data, 1999. IEEE Computer Society Conference on Computer Vision and Pattern Recognition*, vol. 2, 1999
8. R Osada et al., Shape distributions. ACM Transactions on Graphics (TOG) **21**(4), 807–832 (2002)
9. RB Rusu et al., *Aligning point cloud views using persistent feature histograms. 2008. IROS 2008. IEEE/RSJ International Conference on Intelligent Robots and Systems*, 2008, pp. 3384–3391
10. RB Rusu, N Blodow, M Beetz, *Fast point feature histograms (FPFH) for 3D registration. 2009. ICRA’09. IEEE International Conference on Robotics and Automation*, 2009, pp. 3212–3217
11. RB Rusu et al., *Fast 3d recognition and pose using the viewpoint feature histogram. 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)*, 2010, pp. 2155–2162
12. PJ Besl, ND McKay, A method for registration of 3-D shapes. IEEE Trans. Pattern Anal. Mach. Intell. **14**(2), 239–256 (1992)
13. Z Zhang, Iterative point matching for registration of free-form curves and surfaces. Int. J. Comput. Vis. **13**(2), 119–152 (1994)
14. A Gruen, D Akca, Least squares 3D surface and curve matching. ISPRS J. Photogramm. Remote Sens. **59**(3), 151–174 (2005)
15. AW Fitzgibbon, Robust registration of 2D and 3D point sets. Image Vis. Comput. **21**(13), 1145–1153 (2003)
16. A Makadia, A Patterson, K Daniilidis, Fully automatic registration of 3D point clouds. IEEE Conference on Computer Vision and Pattern Recognition **1**, 1297–1304 (2006)
17. AE Johnson, M Hebert, Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Trans. Pattern Anal. Mach. Intell. **21**(5), 433–449 (1999)
18. GC Sharp, SW Lee, DK Wehe, ICP registration using invariant features. IEEE Trans. Pattern Anal. Mach. Intell. **24**(1), 90–102 (2002)
19. S Rusinkiewicz, M Levoy, *Efficient variants of the ICP algorithm. 2001. Proceedings. 2001 IEEE Third International Conference on 3-D Digital Imaging and Modeling*, 2001, pp. 145–152
20. A Nüchter, K Lingemann, J Hertzberg, *Cached kd tree search for ICP algorithms. 2007 IEEE Sixth International Conference on 3-D Digital Imaging and Modeling*, 2007, pp. 419–426
21. B-U Lee, C-M Kim, R-H Park, An orientation reliability matrix for the iterative closest point algorithm. IEEE Trans. Pattern Anal. Mach. Intell. **22**(10), 1205–1208 (2000)
22. S Granger, X Pennec, *Multi-scale EM-ICP: A fast and robust approach for surface registration, European Conference on Computer Vision*, 2002, pp. 418–432
23. Y Liu, Automatic registration of overlapping 3D point clouds using closest points. Image Vis. Comput. **24**(7), 762–781 (2006)
24. S Du et al., Affine iterative closest point algorithm for point set registration. Pattern Recogn. Lett. **31**(9), 791–799 (2010)
25. D Scaramuzza, A Harati, R Siegwart, *Extrinsic self calibration of a camera and a 3D laser range finder from natural scenes. 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems*, 2007, pp. 4164–4169
26. H Bay et al., Speeded-up robust features (SURF). Comput. Vis. Image Underst. **110**(3), 346–359 (2008)
27. H Bay, T Tuytelaars, L Van Gool, *Surf: Speeded up robust features, in Computer vision--ECCV 2006* (Springer, 2006), pp. 404–417
28. KS Arun, TS Huang, SD Blostein, Least-squares fitting of two 3-D point sets. IEEE Trans. Pattern Anal. Mach. Intell. **5**, 698–700 (1987)
29. BKP Horn, Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Am. **4**(4), 629–642 (1987)
30. BKP Horn, HM Hilden, S Negahdaripour, Closed-form solution of absolute orientation using orthonormal matrices. J. Opt. Soc. Am. **5**(7), 1127–1135 (1988)
31. MW Walker, L Shao, RA Volz, Estimating 3-D location parameters using dual number quaternions. CVGIP: Image Understanding **54**(3), 358–367 (1991)
32. A Segal, D Haehnel, S Thrun, Generalized-ICP, in *Robotics: Science and Systems*. 4, 2009

### Acknowledgement

This work was supported in part by the Ministry of Science and Technology of Taiwan under grant no. MOST 104-2221-E-224-038.

### Authors’ contributions

CCL and YCT conceived and designed the experiments. JJL performed the experiments. CCL and YSC analyzed the data. CCL contributed reagents/materials/analysis tools. CCL and YCT wrote the paper. All authors read and approved the final manuscript.

### Competing interests

The authors declare that they have no competing interests.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## About this article

### Cite this article

Lin, CC., Tai, YC., Lee, JJ. *et al.* A novel point cloud registration using 2D image features. *EURASIP J. Adv. Signal Process.* **2017**, 5 (2017). https://doi.org/10.1186/s13634-016-0435-y

DOI: https://doi.org/10.1186/s13634-016-0435-y