Radar SLAM using visual features

Callmer, Jonas; Törnqvist, David; Gustafsson, Fredrik; Svensson, Henrik; Carlbom, Pelle

doi:10.1186/1687-6180-2011-71

Research
Open access
Published: 23 September 2011


EURASIP Journal on Advances in Signal Processing volume 2011, Article number: 71 (2011)


Abstract

A vessel navigating in a critical environment such as an archipelago requires very accurate movement estimates. Intentional or unintentional jamming makes GPS unreliable as the only source of information and an additional independent supporting navigation system should be used. In this paper, we suggest estimating the vessel movements using a sequence of radar images from the preexisting body-fixed radar. Island landmarks in the radar scans are tracked between multiple scans using visual features. This provides information not only about the position of the vessel but also of its course and velocity. We present here a navigation framework that requires no additional hardware than the already existing naval radar sensor. Experiments show that visual radar features can be used to accurately estimate the vessel trajectory over an extensive data set.

I. Introduction

In autonomous robotics, there is a need to accurately estimate the movements of a vehicle. A simple movement sensor like a wheel encoder on a ground robot or a pit log on a vessel will under ideal circumstances provide quite accurate movement measurements. Unfortunately, these sensors are sensitive to disturbances. For example, wheel slip due to a wet surface will be interpreted incorrectly by a wheel encoder, and strong currents will not be correctly registered by the pit log, so a position estimate based solely on these sensors will drift. In applications like autonomous robotics, the movement accuracy needs to be high, which is why other, redundant movement measurement methods are required.

A common approach is to study the surroundings and see how they change over time. By relating the measurements of the environment k seconds ago to the present ones, a measurement of the vehicle translation and rotation during this time interval can be obtained. A system like this complements the movement sensor and enhances the positioning accuracy.

Most outdoor navigation systems, such as those on surface vessels, use global navigation satellite systems (GNSS) such as the Global Positioning System (GPS) to measure their position. These signals are weak, making them very vulnerable to intentional or unintentional jamming [1–3]. A supporting positioning system that is independent of the satellite signals is therefore necessary. By estimating the vessel movements using the surroundings, a means of measuring the reliability of the GPS system is provided. The movement estimates can also be used during a GPS outage, providing accurate position and movement estimates over a limited period of time. This support system could aid the crew in critical situations during a GPS outage, avoiding costly PR disasters such as running aground.

For land-based vehicles or surface vessels, three main sensor types exist that can measure the environment: cameras, laser range sensors and radar sensors. Cameras are very rich in information and have a long reach but are sensitive to light and weather conditions. Laser range sensors provide robust and accurate range measurements but are also very sensitive to weather conditions. The radar signal is usually the least informative signal of the three and is also quite sensitive to what the signals reflect against. On the other hand, the radar sensor works almost equally well in all weather conditions.

In this paper, radar scan matching to estimate relative movements is studied. The idea is to use the radar as an imagery sensor and apply computer vision algorithms to detect landmarks of opportunity. Landmarks that occur in consecutive radar scans are then used for visual odometry, which gives speed, relative position and relative course. The main motivation for using visual features to match radar scans instead of trying to align the radar scans is that visual features are easily matched despite large translational and rotational differences, which is more difficult using other scan matching techniques. The landmarks can optionally be saved in a map format that can be used to recognize areas that have been visited before. That is, a by-product of the robust navigation solution is a mapping and exploration system.

Our application example is based on a military patrol boat, Figure 1, that often maneuvers close to the shore at high speeds, at night, without visual aid, in situations where GPS jamming or spoofing cannot be excluded. As the results will show, we are able to navigate in a complex archipelago using only the radar and get a map that is very close to ground truth.

To provide a complete backup system for GPS, global reference measurements are necessary to eliminate the long-term drift. The surface navigation system in [4, 5] assumed that an accurate sea chart is available. The idea was to apply map matching between the radar image and the sea chart, and the particle filter was used for this mapping. Unfortunately, commercial sea charts still contain rather large absolute errors of the shore, see [1, 2], which makes them less useful in blind navigation with critical maneuvers without visual feedback.

The radar used in these experiments measures the distances to land areas using 1,024 samples in each direction, and a full revolution is comprised of roughly 2,000 directions. Each scan has a radius of about 5 km giving a range resolution of roughly 5 m. These measurements are used to create a radar image by translating the range and bearing measurements into Cartesian coordinates. An example of the resulting image is shown in Figure 2.
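As an illustration of this polar-to-Cartesian step, the following minimal Python sketch resamples one scan onto a square image using nearest-neighbour lookup. The function name `polar_scan_to_image`, the array layout (bearings by range samples), the image size and the axis conventions are assumptions made for this example; they are not taken from the experimental system.

```python
import numpy as np


def polar_scan_to_image(scan, max_range_m=5000.0, image_size=512):
    """Resample a polar radar scan onto a Cartesian grid (nearest neighbour).

    scan: array of shape (n_bearings, n_ranges); row b holds the echo
          samples for bearing 2*pi*b/n_bearings (assumed layout).
    The vessel sits at the image centre; bearing 0 points along +x and
    bearings increase counter-clockwise (assumed convention).
    """
    n_bearings, n_ranges = scan.shape

    # Metric coordinates of every pixel relative to the vessel.
    coords = np.linspace(-max_range_m, max_range_m, image_size)
    x, y = np.meshgrid(coords, coords)

    # Polar coordinates of every pixel.
    r = np.hypot(x, y)
    theta = np.mod(np.arctan2(y, x), 2.0 * np.pi)

    # Index of the range bin and bearing bin that each pixel falls into.
    range_idx = np.floor(r / max_range_m * n_ranges).astype(int)
    bearing_idx = np.floor(theta / (2.0 * np.pi) * n_bearings).astype(int)
    bearing_idx %= n_bearings

    # Pixels beyond the maximum range stay empty.
    inside = range_idx < n_ranges
    image = np.zeros((image_size, image_size), dtype=scan.dtype)
    image[inside] = scan[bearing_idx[inside], range_idx[inside]]
    return image
```

With 1,024 samples over about 5 km, each range bin in this sketch spans 5000/1024 ≈ 4.9 m, which agrees with the range resolution of roughly 5 m quoted above.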

The radar image gives a bird's-eye view of the surrounding islands, and by tracking these islands, information about how the vessel is moving is obtained. We use the Scale-Invariant Feature Transform (SIFT) [6] to extract trackable features from the radar image, which are subsequently matched with features from later scans. These features are shown to be distinct and stable enough to be used for island tracking. Other feature detectors like Speeded Up Robust Features (SURF) [7] could equally well have been used. When these features are tracked using a filter, estimates of the vessel movements are obtained that over time give an accurate trajectory estimate.

The outline is as follows. Section II gives an overview of the related work, followed by a theoretical filtering framework in Section III. In Section IV, the performance of SIFT is evaluated on radar images, and the trajectory estimation performance on experimental data is given in Section V. The paper then ends in Section VI with conclusions and suggested future work.

II. Background and relation to SLAM

The approach in this contribution is known as the Simultaneous Localization And Mapping (SLAM) problem. Today, SLAM is a fairly well-studied problem with solutions that are reaching some level of maturity [8, 9]. SLAM has been performed in a wide variety of environments such as indoors [10], in urban [11–14] and rural areas [13, 15], underwater [16, 17] and in the air [18] and the platform is usually equipped with a multitude of sensors such as lasers, cameras, inertial measurement units, wheel encoders, etc. In this work, we will use only the radar sensor of a naval vessel to perform SLAM in a maritime environment. The data used were recorded in the Stockholm archipelago by Saab Bofors Dynamics [19].

Radars have been used for a long time to estimate movements, for example in the early experiments by Clark and Durrant-Whyte [20]. Radar reflecting beacons in known positions were tracked using a millimeter radar, and this was shown to improve the movement estimates. Shortly after, Clark and Dissanayake [21] extended the work by tracking natural features instead of beacons.

Thereafter, laser range sensors became more popular since they are more reliable, giving a range measurement in all directions. The problem of estimating the vehicle movements became a problem of range scan alignment. This was studied among others in [22–25].

The advantages of the radar, such as its ability to function in all weather conditions, have nevertheless led to its comeback. Lately, microwave radars have been used in SLAM experiments, but now using a landmark-free approach. In [13], SLAM was performed in both urban and rural areas by aligning the latest radar scan with the radar map using 3D correlations to estimate the relative movements of the vehicle. The radar map was constructed by consecutively adding the latest aligned radar scan to the previous scans. Checchin et al. [14] performed SLAM in an urban scenario by estimating the rotation and translation of the robot over a sequence of scans using the Fourier-Mellin Transform. This transform can match images that are translated, rotated and scaled and can therefore be used to align radar scans [26]. Chandran and Newman [27] jointly estimated the radar map and the vehicle trajectory by maximizing the quality of the map as a function of a motion parametrization.

Millimeter wave radars have also become more commonplace in some segments of the automotive industry, and the number of applications for them is growing. For example, road curvature has been estimated from radar reflections for use in future collision warning and collision avoidance systems [28, 29].

The problem of radar alignment is also present in meteorology, where space radar and ground radar observations are aligned to get a more complete picture of the weather [30]. The scans are aligned by dividing them into smaller volumes that are matched by their respective precipitation intensities.

Visual features like SIFT or SURF have been used in camera-based SLAM many times before. Sometimes, the features were used to estimate relative movements [16, 31, 32], and other times, they were used to detect loop closures [10, 17, 33, 34].

The combination of radar and SIFT has previously been explored by Li et al. in [35], where Synthetic Aperture Radar measurements were coregistered using matched SIFT features. Radar scan matching using SIFT was also suggested in the short papers [36, 37]. A system with parallel stationary ground radars is discussed there, and SIFT feature matching is suggested as a way to estimate the constant overlaps between the scans. No radar scans appear to have actually been matched in those papers, though. To the best of the authors' knowledge, this is the first time visual features have been used to estimate the rotational and translational differences between radar images.

III. Theoretical framework

All vessel movements are estimated relative to a global position. The positions of the tracked landmarks are not measured globally but relative to the vessel. Therefore, two coordinate systems are used: one global, for positioning the vessel and all the landmarks, and one local, relating the measured feature positions to the vessel. Figure 3 shows the local and global coordinate systems, the vessel and a landmark m.

The variables needed for visual odometry are summarized in Table 1.

Table 1 Summary of notation


A. Detection model

Each radar scan has a radius of about 5 km with a range resolution of 5 meters, and the antenna revolution takes about 1.5 s.

If a landmark is detected at time t, the radar provides a range, r_t, and a bearing, θ_t, measurement to the island landmark i as

y_{t}^{i} = (\begin{gathered} r_{t}^{i} \\ θ_{t}^{i} \end{gathered}) + e_{t}^{i}

(1)

where $e_{t}^{i}$ is independent Gaussian noise. These echoes are transformed into a radar image using a polar-to-rectangular coordinate conversion, and the result is shown in Figure 2. Figures 1 and 2 also show that the forward and sideways facing parts of the scans are the most useful ones for feature tracking. This is due to the significant backwash created by the jet propulsion system of the vessel, which is observed along the vessel trajectory in Figure 1. This backwash disturbs the radar measurements by reflecting the radar pulse, resulting in the stripe-shaped disturbances behind the vessel in Figure 2.

SIFT is today a well-established standard method to extract and match features from one image to features extracted from a different image covering the same scene. It is a rotation and affine invariant Harris point extractor that uses a difference-of-Gaussian function to determine scale. Harris points are in turn regions in the image where the gradients of the image are large, making them prone to stand out also in other images of the same area. For region description, SIFT uses gradient histograms in 16 subspaces around the point of interest.

In this work, SIFT is used to extract and match features from radar images. By tracking the SIFT features over a sequence of radar images, information about how the vessel is moving is obtained.

B. Measurement model

Once a feature has been matched between two scans, the position of the feature is used as a measurement to update the filter.

Since the features are matched in Cartesian image coordinates, the straightforward way would be to use the pixel coordinates themselves as a measurement. After having first converted the pixel coordinates of the landmark to coordinates in the local coordinate system, the Cartesian feature coordinates are now related to the vessel states as

{\bar{y}}_{t}^{i} = (\begin{gathered} y_{x, t}^{i} \\ y_{y, t}^{i} \end{gathered}) + ē_{t}^{i} = R (ψ_{t}) (\begin{gathered} m_{X, t}^{i} - X_{t} \\ m_{Y, t}^{i} - Y_{t} \end{gathered}) + (\begin{matrix} ē_{X, t}^{i} \\ ē_{Y, t}^{i} \end{matrix})

(2)

where $y_{x, t}^{i}$ is the measured x-coordinate of feature i in the local coordinate frame at time t and R(ψ_t ) is the rotation matrix between the ship orientation and the global coordinate system. (X, Y ) and $(m_{X}^{i}, m_{Y}^{i})$ are global vessel position and global position of landmark i, respectively.

The problem with this approach is that $ē_{X, t}$ and $ē_{Y, t}$ in (2) are dependent since they are both mixtures of the range and bearing uncertainties of the radar sensor. These dependencies are also time dependent since the mixtures depend on the bearing of the radar sensor. Simply assuming them to be independent will introduce estimation errors.
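The dependence can be made explicit with a first-order sketch. Assuming independent range and bearing errors with variances $σ_{r}^{2}$ and $σ_{θ}^{2}$, and local coordinates x = r cos θ, y = r sin θ, linear error propagation through the Jacobian J gives the following approximate Cartesian noise covariance. This is a standard linearization added for illustration, not a derivation taken from the experiments.

```latex
\begin{align}
J &= \begin{pmatrix} \cos θ & - r \sin θ \\ \sin θ & r \cos θ \end{pmatrix} \\
\operatorname{Cov} \begin{pmatrix} ē_{X} \\ ē_{Y} \end{pmatrix}
  &\approx J \begin{pmatrix} σ_{r}^{2} & 0 \\ 0 & σ_{θ}^{2} \end{pmatrix} J^{T} \\
  &= \begin{pmatrix}
       σ_{r}^{2} \cos^{2} θ + r^{2} σ_{θ}^{2} \sin^{2} θ &
       (σ_{r}^{2} - r^{2} σ_{θ}^{2}) \sin θ \cos θ \\
       (σ_{r}^{2} - r^{2} σ_{θ}^{2}) \sin θ \cos θ &
       σ_{r}^{2} \sin^{2} θ + r^{2} σ_{θ}^{2} \cos^{2} θ
     \end{pmatrix}
\end{align}
```

The off-diagonal term vanishes only for bearings that are multiples of 90° or when σ_r = r σ_θ, so in general the two Cartesian noise components are correlated, and the correlation changes with both range and bearing.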

A better approach is to convert the Cartesian landmark coordinates back to polar coordinates and use these as a measurement

y_{t}^{i} = (\begin{gathered} r_{t}^{i} \\ θ_{t}^{i} \end{gathered}) = (\begin{matrix} \sqrt{{(m_{X, t}^{i} - X_{t})}^{2} + {(m_{Y, t}^{i} - Y_{t})}^{2}} \\ arctan (\frac{m_{Y, t}^{i} - Y_{t}}{m_{X, t}^{i} - X_{t}}) - ψ_{t} \end{matrix}) + (\begin{matrix} e_{r, t}^{i} \\ e_{θ, t}^{i} \end{matrix}) .

(3)

This approach results in independent noise parameters $e_{r} ~ N (0, σ_{r}^{2})$ and $e_{θ} ~ N (0, σ_{θ}^{2})$ , which better reflect the true range and bearing uncertainties of the range sensor.
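A minimal Python sketch of the noise-free part of (3) is given below. The function names are hypothetical, and `atan2` is used in place of the plain arctangent so that the quadrant of the bearing is resolved; the wrap to (-π, π] is likewise an implementation choice assumed here.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def range_bearing(X, Y, psi, m_X, m_Y):
    """Predicted range and bearing to a landmark at (m_X, m_Y), as in (3).

    (X, Y) is the global vessel position and psi the course in radians.
    The bearing is measured relative to the vessel course.
    """
    dx, dy = m_X - X, m_Y - Y
    r = math.hypot(dx, dy)
    theta = wrap_angle(math.atan2(dy, dx) - psi)
    return r, theta


def local_cartesian(r, theta):
    """Convert a range-bearing pair back to local Cartesian coordinates."""
    return r * math.cos(theta), r * math.sin(theta)
```

For example, a landmark 10 m due north of a vessel that is itself heading north (psi = π/2 in this convention) is predicted at range 10 m and bearing 0.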

C. Motion model

The system states describing the vessel movements at time instant t are

z_{t} = {(\begin{matrix} X_{t} & Y_{t} & v_{t} & ψ_{t} & ω_{t} & ϕ_{t} \end{matrix})}^{T}

(4)

where v is the velocity, ψ is the course, ω is the angular velocity and ϕ_t is the crab angle, i.e. the wind- and stream-induced difference between the course and the velocity vector (normally small). Due to the size and the speed of the vessel, Figure 1, the crab angle is assumed to be very small throughout the experiments. The system states are more extensively described in Table 1 and are also shown in Figure 3. We will be using a coordinated turn model, though there are many possible motion models available.

When landmarks at unknown positions are tracked to estimate the movements of the vessel, these should be kept in the state vector. If the same landmarks are tracked over a sequence of radar scans, a better estimate of the vessel movement is acquired than if they are tracked between just two.

The system states are therefore expanded to also include all landmarks within the field of view to create a visual odometry framework. The new state vector becomes

z_{t} = {(\begin{matrix} X_{t} & Y_{t} & v_{t} & ψ_{t} & ω_{t} & ϕ_{t} & m_{X, t}^{k} & m_{Y, t}^{k} & \dots & m_{Y, t}^{l} \end{matrix})}^{T} .

(5)

Only the l - k + 1 latest landmarks are within the field of view, so only these are kept in the state vector. As the vessel travels on, the landmarks will one by one leave the field of view, at which point they are removed from the state vector and subsequently replaced by new ones.

When all old landmarks are kept in the state vector even after they have left the field of view, it is a SLAM framework. If an old landmark that left the field of view long ago was rediscovered, this would allow for the whole vessel trajectory to be updated. This is called a loop closure and is one of the key features in SLAM. The SLAM state vector is therefore

z_{t} = {(\begin{matrix} X_{t} & Y_{t} & v_{t} & ψ_{t} & ω_{t} & ϕ_{t} & m_{X, t}^{1} & m_{Y, t}^{1} & \dots \end{matrix})}^{T} .

(6)

A discretized linearization of the coordinated turn model using the SLAM landmark augmentation gives

(\begin{matrix} X_{t + Δ t} \\ Y_{t + Δ t} \\ v_{t + Δ t} \\ ψ_{t + Δ t} \\ ω_{t + Δ t} \\ ϕ_{t + Δ t} \\ m_{X, t + Δ t}^{1} \\ m_{Y, t + Δ t}^{1} \\ ⋮ \end{matrix}) = (\begin{matrix} X_{t} + \frac{2 v_{t}}{ω_{t}} sin (\frac{ω_{t} Δ t}{2}) cos (ψ_{t} + ϕ_{t} + \frac{ω_{t} Δ t}{2}) \\ Y_{t} + \frac{2 v_{t}}{ω_{t}} sin (\frac{ω_{t} Δ t}{2}) sin (ψ_{t} + ϕ_{t} + \frac{ω_{t} Δ t}{2}) \\ v_{t} + ν_{v, t} \\ ψ_{t} + ω_{t} Δ t \\ ω_{t} + ν_{ω, t} \\ ϕ_{t} + ν_{ϕ, t} \\ m_{X, t}^{1} \\ m_{Y, t}^{1} \\ ⋮ \end{matrix})

(7)

where Δt is the difference in acquisition time between the two latest matched features. ν_v , ν_ω and ν_ϕ are independent Gaussian process noises reflecting the movement uncertainties of the vessel.
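The vessel part of (7) can be sketched in Python as follows. This is a mean prediction only: the process noises are left out, the landmark states are omitted since they are static, and the function name and the small-ω threshold are assumptions of this example. As ω approaches zero, the factor (2v/ω) sin(ωΔt/2) tends to vΔt, which the sketch handles explicitly to avoid a division by zero.

```python
import math


def predict_vessel_state(state, dt):
    """Propagate (X, Y, v, psi, omega, phi) one step with the coordinated
    turn model of (7), without process noise."""
    X, Y, v, psi, omega, phi = state
    half_turn = 0.5 * omega * dt

    # Chord length travelled during dt; limit value v*dt for straight motion.
    if abs(omega) < 1e-9:
        chord = v * dt
    else:
        chord = 2.0 * v / omega * math.sin(half_turn)

    # Direction of the chord: course plus crab angle plus half the turn.
    heading = psi + phi + half_turn
    return (
        X + chord * math.cos(heading),
        Y + chord * math.sin(heading),
        v,
        psi + omega * dt,
        omega,
        phi,
    )
```

As a check, a vessel at 10 m/s turning at 0.1 rad/s moves on a circle of radius 100 m, so a quarter turn starting at the origin with course 0 ends at (100, 100) with course π/2.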

D. Multi-rate issues

Having defined a motion model and a measurement model, state estimation is usually straightforwardly implemented using standard algorithms such as the extended Kalman filter (EKF), the unscented Kalman filter (UKF) or the particle filter (PF), see [38]. These filters iterate between a prediction step based on the motion model and a correction step based on the measurement model and the current measurement. The most natural approach is to stack all landmarks from one radar revolution into a large measurement vector and then run the filter with Δt = T = 1.5 s, where T is the radar revolution time (1.5 s in our application). There are, however, two nonstandard problems here.

The first problem is that all matched features should not be used to update the filter at the same time. Since the vessel is moving while the radar is revolving, a shift in the radar image is introduced. A vessel speed of 10 m/s will result in a difference in landmark position of about 15 m from the beginning to the end of the scan. If the vessel is traveling with a constant velocity and course, then all relative changes in feature positions between two scans would be equal, but if the vessel is turning or accelerating, a shift in the relative position change is introduced. This results in different features having different relative movements that will introduce estimation errors if all measurements are used at the same time. We have found the error caused by this batch approach to be quite large. Therefore, the filter should be updated by each matched feature independently.

If independent course and velocity measurements were available, they could be used to correct the skewness in the radar image. The filter estimates of velocity and course should not, however, be used for scan correction, since this would create a feedback loop from the estimates to the measurements that can cause filter instability.

Second, if one landmark were detected in each scan direction, the filter could be updated at the rate $Δ t = T ∕ N = \frac{1.5}{2000}$ s, where N is the number of measurements per rotation (2,000 in our application). This is not the case, though, and we are facing a multi-rate problem with irregularly sampled measurements. The measurement model can now conveniently be written as

y_{t} = \begin{cases} y_{t}^{i} & \text{if landmark } i \text{ is detected at time } t \\ \mathrm{NaN} & \text{otherwise.} \end{cases}

(8)

Now, any of the filters (EKF, UKF, PF) can be applied at rate Δt = T/N using (8), with the understanding that a measurement being NaN simply means that the measurement update is skipped.
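The control flow implied by (8) can be sketched with a deliberately simple filter. The example below uses a scalar random-walk Kalman filter rather than the vessel model, purely to show that the prediction runs at every tick of the fast rate while the measurement update is skipped whenever the measurement is NaN. The function name and all parameters are assumptions of this illustration.

```python
import math


def run_multirate_filter(measurements, x0, p0, q, r):
    """Scalar random-walk Kalman filter with NaN-skipped updates.

    measurements: one value per fast-rate tick; NaN means no landmark
                  was detected at that tick.
    x0, p0: initial state estimate and variance.
    q, r:   process and measurement noise variances.
    Returns the final estimate, its variance and the number of updates.
    """
    x, p = x0, p0
    updates = 0
    for y in measurements:
        # Prediction step: always executed, uncertainty grows.
        p = p + q

        # Measurement update: only when a measurement is available.
        if not math.isnan(y):
            gain = p / (p + r)
            x = x + gain * (y - x)
            p = (1.0 - gain) * p
            updates += 1
    return x, p, updates
```

With only NaN inputs, the estimate is unchanged and the variance grows by q per tick, which is the expected behaviour of a filter running open loop between landmark detections.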

E. Alternative landmark free odometric framework

The landmark-based framework derived above suffers from one apparent shortcoming: the number of features will grow very fast. After only a short time period, thousands of potential landmarks will have been found, causing large overhead computations in the implementation. Either a lot of restrictions must be made on which of the new landmarks to track, or a different approach is needed.

If the map is not central, an approach based on differential landmark processing could be taken. Instead of tracking the same feature over a sequence of scans, features are only matched between two scans to compute the relative movement between the sweeps.

1) Relative movement estimation: As described in Section III-D, all features from the entire scan should not be used to estimate the relative movement at the same time. If the vessel is accelerating or turning, the scan will be skewed, causing estimation errors. The idea is therefore to use subsets of features, measured over a short time interval, to compute the relative changes in course Δψ_t and position ΔX_t and ΔY_t between two scans. The scans are simply divided into multiple segments, where each segment covers a time interval τ_t. The relative movement and course estimates are therefore calculated multiple times per scan pair.

The relative change in position and course can be described as a relationship between the landmark positions measured in the local coordinate frame at time t and t - T. These landmark positions are related as

(\begin{matrix} y_{x, t - T}^{i} \\ y_{y, t - T}^{i} \end{matrix}) = (\begin{matrix} \cos (Δ ψ_{t}) & - \sin (Δ ψ_{t}) \\ \sin (Δ ψ_{t}) & \cos (Δ ψ_{t}) \end{matrix}) (\begin{matrix} y_{x, t}^{i} \\ y_{y, t}^{i} \end{matrix}) + (\begin{matrix} Δ X_{t} \\ Δ Y_{t} \end{matrix})

(9)

where $y_{x, t}^{i}$ is the measured x - coordinate of landmark i at time instant t in the local coordinate system. $y_{x, t - T}^{i}$ is the measured x - coordinate in the previous scan.

If (9) were used to estimate the changes in course and position between two scans using each segment independently, quite large course changes could be experienced. Since each scan pair is used multiple times because it is divided into segments, practically the same course and position change would be calculated over and over again. For example, the change in course registered between the scans will be similar for two adjacent segments. The only truly new information in the next segment is the change experienced over that specific segment, not the changes experienced over the rest of the full scan, because those have already been studied. To avoid calculating the same course change multiple times, the changes in course and position can be calculated recursively as

Δ ψ_{t} = Δ ψ_{t - τ} + δ ψ_{t}

(10a)

Δ X_{t} = Δ X_{t - τ} + δ X_{t}

(10b)

Δ Y_{t} = Δ Y_{t - τ} + δ Y_{t} .

(10c)

The change in course is thus divided into two parts: the estimated change in course Δψ_{t-τ} from the previous segment, which is known, and a small change in course δψ_t experienced during the segment, which is unknown. Even though the vessel used for data acquisition is very maneuverable, δψ_t can be assumed small.

The sine and cosine terms of (9) can now be rewritten using (10a) and simplified using the small angle approximations cos(δψ_t) ≈ 1 and sin(δψ_t) ≈ δψ_t

\begin{align} \cos (Δ ψ_{t}) & = \cos (Δ ψ_{t - τ} + δ ψ_{t}) \\ & = \underbrace{\cos (Δ ψ_{t - τ})}_{c_{Δ}} \underbrace{\cos (δ ψ_{t})}_{\approx 1} - \underbrace{\sin (Δ ψ_{t - τ})}_{s_{Δ}} \underbrace{\sin (δ ψ_{t})}_{\approx δ ψ_{t}} \\ & \approx c_{Δ} - s_{Δ} δ ψ_{t} \end{align}

(11)

\begin{align} \sin (Δ ψ_{t}) & = \sin (Δ ψ_{t - τ} + δ ψ_{t}) \\ & = \underbrace{\sin (Δ ψ_{t - τ})}_{s_{Δ}} \underbrace{\cos (δ ψ_{t})}_{\approx 1} + \underbrace{\cos (Δ ψ_{t - τ})}_{c_{Δ}} \underbrace{\sin (δ ψ_{t})}_{\approx δ ψ_{t}} \\ & \approx s_{Δ} + c_{Δ} δ ψ_{t} \end{align}

(12)

where c_Δ and s_Δ are known.

Dividing the change in position into two parts as in (10b) and (10c) does not change the equation system (9) in practice, so ΔX_t and ΔY_t are left as before.

The equation system becomes

\begin{align} (\begin{matrix} y_{x, t - T}^{i} \\ y_{y, t - T}^{i} \end{matrix}) & = (\begin{matrix} c_{Δ} - s_{Δ} δ ψ_{t} & - s_{Δ} - c_{Δ} δ ψ_{t} \\ s_{Δ} + c_{Δ} δ ψ_{t} & c_{Δ} - s_{Δ} δ ψ_{t} \end{matrix}) (\begin{matrix} y_{x, t}^{i} \\ y_{y, t}^{i} \end{matrix}) + (\begin{matrix} Δ X_{t} \\ Δ Y_{t} \end{matrix}) \\ \Leftrightarrow (\begin{matrix} y_{x, t - T}^{i} \\ y_{y, t - T}^{i} \end{matrix}) & = ((\begin{matrix} c_{Δ} & - s_{Δ} \\ s_{Δ} & c_{Δ} \end{matrix}) + δ ψ_{t} (\begin{matrix} - s_{Δ} & - c_{Δ} \\ c_{Δ} & - s_{Δ} \end{matrix})) (\begin{matrix} y_{x, t}^{i} \\ y_{y, t}^{i} \end{matrix}) + (\begin{matrix} Δ X_{t} \\ Δ Y_{t} \end{matrix}) \end{align}

(13)

which can be rewritten as

(\begin{matrix} y_{x, t - T}^{i} - c_{Δ} y_{x, t}^{i} + s_{Δ} y_{y, t}^{i} \\ y_{y, t - T}^{i} - s_{Δ} y_{x, t}^{i} - c_{Δ} y_{y, t}^{i} \end{matrix}) = (\begin{matrix} - s_{Δ} y_{x, t}^{i} - c_{Δ} y_{y, t}^{i} & 1 & 0 \\ c_{Δ} y_{x, t}^{i} - s_{Δ} y_{y, t}^{i} & 0 & 1 \end{matrix}) (\begin{matrix} δ ψ_{t} \\ Δ X_{t} \\ Δ Y_{t} \end{matrix})

(14)

The equation system is now approximately linear and by stacking multiple landmarks in one equation system

(\begin{matrix} y_{x, t - T}^{i} - c_{Δ} y_{x, t}^{i} + s_{Δ} y_{y, t}^{i} \\ y_{y, t - T}^{i} - s_{Δ} y_{x, t}^{i} - c_{Δ} y_{y, t}^{i} \\ y_{x, t - T}^{j} - c_{Δ} y_{x, t}^{j} + s_{Δ} y_{y, t}^{j} \\ y_{y, t - T}^{j} - s_{Δ} y_{x, t}^{j} - c_{Δ} y_{y, t}^{j} \\ ⋮ \end{matrix}) = (\begin{matrix} - s_{Δ} y_{x, t}^{i} - c_{Δ} y_{y, t}^{i} & 1 & 0 \\ c_{Δ} y_{x, t}^{i} - s_{Δ} y_{y, t}^{i} & 0 & 1 \\ - s_{Δ} y_{x, t}^{j} - c_{Δ} y_{y, t}^{j} & 1 & 0 \\ c_{Δ} y_{x, t}^{j} - s_{Δ} y_{y, t}^{j} & 0 & 1 \\ ⋮ \end{matrix}) (\begin{matrix} δ ψ_{t} \\ Δ X_{t} \\ Δ Y_{t} \end{matrix})

(15)

an overdetermined system is acquired and δψ_t , ΔX_t and ΔY_t can be determined using a least squares solver.
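The stacked system (15) can be assembled and solved as in the following Python sketch. It uses NumPy's `numpy.linalg.lstsq` as the least squares solver. The function name, the argument layout (one row per matched landmark) and the absence of any outlier handling are assumptions of this example.

```python
import numpy as np


def estimate_segment_motion(prev_xy, curr_xy, prev_dpsi):
    """Solve the stacked system (15) for (delta_psi, Delta_X, Delta_Y).

    prev_xy:   (n, 2) local landmark coordinates at time t - T.
    curr_xy:   (n, 2) local coordinates of the same landmarks at time t.
    prev_dpsi: course change estimated from the previous segment, so that
               c = cos(prev_dpsi) and s = sin(prev_dpsi) are known.
    """
    prev_xy = np.asarray(prev_xy, dtype=float)
    curr_xy = np.asarray(curr_xy, dtype=float)
    c, s = np.cos(prev_dpsi), np.sin(prev_dpsi)
    x, y = curr_xy[:, 0], curr_xy[:, 1]
    n = len(x)

    # Two rows per landmark: one for the x equation, one for the y equation.
    A = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    A[0::2, 0] = -s * x - c * y
    A[0::2, 1] = 1.0
    A[1::2, 0] = c * x - s * y
    A[1::2, 2] = 1.0
    b[0::2] = prev_xy[:, 0] - c * x + s * y
    b[1::2] = prev_xy[:, 1] - s * x - c * y

    solution, *_ = np.linalg.lstsq(A, b, rcond=None)
    return solution  # array [delta_psi, Delta_X, Delta_Y]
```

At least two non-coincident landmarks are needed for the three unknowns to be determined. In practice, false matches would have to be removed before the solve, since ordinary least squares is sensitive to outliers.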

This estimated change in position and course can in turn be used to calculate a velocity and an angular velocity measurement as

{\bar{y}}_{t} = (\begin{matrix} {\bar{v}}_{t} \\ {\bar{ω}}_{t} \end{matrix}) = (\begin{matrix} \frac{\sqrt{{(Δ X_{t})}^{2} + {(Δ Y_{t})}^{2}}}{T} \\ \frac{δ ψ_{t}}{τ_{t}} \end{matrix}) + (\begin{matrix} ē_{v, t} \\ ē_{ω, t} \end{matrix}) .

(16)

The measurement noises $ē_{v}$ and $ē_{ω}$ are assumed to be independent Gaussian noises. This transformation that provides direct measurements of speed and course change gives what is usually referred to as odometry.
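As a small worked example of (16), the sketch below converts one least squares solution into the two odometry measurements. The function name and the numbers in the usage note are illustrative assumptions only.

```python
import math


def odometry_measurement(dX, dY, dpsi_small, T, tau):
    """Speed and angular velocity measurements as in (16), without noise.

    dX, dY:     estimated change in position between the two scans (m).
    dpsi_small: estimated course change over the current segment (rad).
    T:          time between the two scans (s).
    tau:        time covered by the current segment (s).
    """
    speed = math.hypot(dX, dY) / T
    angular_velocity = dpsi_small / tau
    return speed, angular_velocity
```

For instance, a displacement of (12 m, 9 m) over T = 1.5 s gives a speed of 15/1.5 = 10 m/s, and a course change of 0.015 rad over a 0.15 s segment gives an angular velocity of 0.1 rad/s.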

Although this approach simplifies the implementation a lot, it comes with certain drawbacks. First, the landmarks correctly associated between two images are used only pairwise, and this is sub-optimal since one loses both the averaging effects that occur when the same landmark is detected many times and also the correlation structure between landmarks. Second, assuming no cross-correlation between $ē_{v}$ and $ē_{ω}$ is a simplification since ${\bar{v}}_{t}$ and ${\bar{ω}}_{t}$ are based on ΔX_t, ΔY_t and δψ_t, which are not calculated independently. Therefore, the measurements ${\bar{v}}_{t}$ and ${\bar{ω}}_{t}$ are actually dependent, making the noise independence assumption incorrect. Third, in order to estimate the relative movements, the time interval used to detect the landmarks must be long enough to be informative for calculating δψ_t, ΔX_t and ΔY_t, but not so long that significant scan skew appears. This trade-off is vessel specific and must be balanced. By ensuring that the vessel cannot be expected to turn more than, for example, 10 degrees during each time interval, the small angle approximation holds.

2) ESDF: The simplified odometric model above can still be used for mapping if a trajectory-based filtering algorithm is used. One such framework is known in the SLAM literature as the Exactly Sparse Delayed-state Filter (ESDF) [39]. It has a state vector that consists of augmented vehicle states as

z_{1 : t} = (\begin{matrix} z_{1 : t - 1} \\ z_{t} \end{matrix}),

(17)

where the state z_t is given in (4) and z_{1:t-1} are all previous poses. If no loop closures are detected, then the ESDF is simply stacking all pose estimates, but once a loop closure is detected and the relative pose between the two time instances is calculated, the ESDF allows for the whole trajectory to be updated using this new information.

Once the trajectory has been estimated, all radar scans can be mapped to world coordinates. By overlaying the scans on the estimated trajectory, a radar map is obtained. Each pixel now describes how many radar detections have occurred at that coordinate.
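The overlay step can be sketched as a hit-count grid, shown below in Python. The function name, the grid layout (rows along the global Y axis, columns along X), the cell size and the assumption that detections are given as local Cartesian points in metres are all choices made for this illustration. The local-to-global rotation follows the bearing convention of (3), where the local bearing is the global bearing minus the course.

```python
import numpy as np


def accumulate_radar_map(grid, detections_local, pose, cell_size_m, origin_xy):
    """Add the detections of one scan to a hit-count map, in place.

    grid:             2-D integer array; grid[row, col] counts detections.
    detections_local: (n, 2) detection coordinates in the vessel frame (m).
    pose:             (X, Y, psi) estimated global position and course.
    cell_size_m:      side length of one map cell in metres.
    origin_xy:        global coordinates of the corner of cell (0, 0).
    """
    X, Y, psi = pose
    c, s = np.cos(psi), np.sin(psi)
    pts = np.asarray(detections_local, dtype=float)

    # Local to global: rotate by the course, then translate by the position.
    gx = X + c * pts[:, 0] - s * pts[:, 1]
    gy = Y + s * pts[:, 0] + c * pts[:, 1]

    col = np.floor((gx - origin_xy[0]) / cell_size_m).astype(int)
    row = np.floor((gy - origin_xy[1]) / cell_size_m).astype(int)

    # Ignore detections that fall outside the map.
    ok = (row >= 0) & (row < grid.shape[0]) & (col >= 0) & (col < grid.shape[1])

    # np.add.at accumulates correctly when several hits share one cell.
    np.add.at(grid, (row[ok], col[ok]), 1)
    return grid
```

Calling the function once per scan with the corresponding pose from the estimated trajectory builds up the map; cells hit in many scans end up with high counts.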

IV. SIFT performance on radar images

SIFT is used to extract visual island features from the radar images. Figure 4 shows the features that are extracted from the upper right quadrant of a radar scan example. Two types of features are detected; island features and vessel-related features. The latter originate from radar disturbances caused by the vessel and the waves and are visible in the bottom left corner of Figure 4. Unfortunately, this section of the image cannot just be removed since the vessel commonly travels very close to land making island features in the area sometimes crucial for navigation.

The total number of features detected depends of course on the number of islands in the area, but also on where these islands are situated. A large island close to the vessel will block a large section of the radar scan, resulting in few features. In these experiments, an average of 650 features was extracted per full radar scan.

A. Matching for movement estimation

The SIFT features are matched to estimate the relative movement of the vessel between multiple consecutive scans. Figure 5a,b shows examples of how well these features actually match. In Figure 5a, a dense island distribution results in a lot of matches that provide a good movement estimation. In Figure 5b, there are very few islands making it difficult to estimate the movements accurately.

There are two situations that can cause few matches. One is when there are few islands, and the other is when a large island is very close to the vessel, blocking the view of all the other islands. When the vessel passes close to an island at high speed, the radar scans can differ quite significantly between two revolutions. This results not only in few features to match but also in features that can be significantly more difficult to match causing the relative movement estimates to degrade. On average though, about 100 features are matched in each full scan.

B. Loop closure matching

Radar features can also be used to detect loop closures that would enable the filter to update the entire vessel trajectory. The rotation invariance of the SIFT features makes radar scans acquired from different headings straightforward to match. Quite a large difference in position is also manageable due to the range of the radar sensor. This requires of course that no island is blocking or disturbing the view. Figure 6a shows example locations a, b and c that were used to investigate the matching performance of the visual features.

In areas a (Figure 6b) and b (Figure 6c), the features are easy to match despite the rather long translational difference over open water in b. In both cases, a 180° difference in course is easily overcome by the visual features. This shows the strength of both the radar sensor and the visual features. The long range of the sensor makes loop closures over a wide passage of open water possible. These scans would be used in a full-scale SLAM experiment to update the trajectory.
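To make concrete how a set of matched features constrains both the change in course and the change in position, the following sketch computes the least-squares 2D rotation and translation between two sets of matched feature positions. This is a generic closed-form alignment, not the filter update used in the paper; the function name `estimate_rigid_2d` and the landmark coordinates are hypothetical.

```python
import math

def estimate_rigid_2d(points_a, points_b):
    """Least-squares rotation angle and translation mapping points_a onto points_b."""
    n = len(points_a)
    ax = sum(p[0] for p in points_a) / n
    ay = sum(p[1] for p in points_a) / n
    bx = sum(q[0] for q in points_b) / n
    by = sum(q[1] for q in points_b) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(points_a, points_b):
        # Accumulate dot and cross products of the centred point pairs.
        dot += (px - ax) * (qx - bx) + (py - ay) * (qy - by)
        cross += (px - ax) * (qy - by) - (py - ay) * (qx - bx)
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation aligns the rotated centroid of A with the centroid of B.
    return theta, (bx - (c * ax - s * ay), by - (s * ax + c * ay))

# Hypothetical matched feature positions [m] in the first scan's frame.
first = [(100.0, 0.0), (0.0, 200.0), (-150.0, 50.0), (300.0, 300.0)]
# The same features after a 180 degree change in course and a (30, -40) m offset.
true_theta, true_t = math.pi, (30.0, -40.0)
c, s = math.cos(true_theta), math.sin(true_theta)
second = [(c * x - s * y + true_t[0], s * x + c * y + true_t[1]) for x, y in first]

theta, (tx, ty) = estimate_rigid_2d(first, second)
print(abs(math.degrees(theta)), tx, ty)
```

Because the alignment uses only the geometry of the matched positions, a half-turn in course is no harder to recover than a small rotation, provided the descriptors themselves can be matched.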

In area c, Figure 6d, only two features are matched correctly, and there are also two false positives. If the scans are compared visually, it is quite challenging to find islands that clearly match, mostly due to blurring and to blocking islands. It is also noticeable that the radar reflections from the islands differ due to the different radar positions, which of course alters the SIFT features. The poor matching result is therefore natural in this case.

C. Feature preprocessing

Two problems remain that have not been addressed. The first is the problem of false feature matches. In Figure 6b, a false feature match is clearly visible, and it would introduce estimation errors if not handled properly. An initial approach would be to use an algorithm like RANSAC [40] to remove matches that are inconsistent with all other matches. One could also use the filtering framework to produce an estimate of the probable feature position and perform a significance test on the features based on, for example, the Mahalanobis distance. Only the features that pass this test would be allowed to update the filter estimates.
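A minimal sketch of the RANSAC idea follows, assuming that correct matches between two scans are related by a single 2D rotation and translation. The paper only suggests "an algorithm like RANSAC", so the rigid model, the 10 m inlier threshold, the iteration count and all names below are assumptions chosen for illustration.

```python
import math
import random

def fit_rigid_2d(pairs):
    """Least-squares rotation and translation mapping p onto q for (p, q) pairs."""
    n = len(pairs)
    ax = sum(p[0] for p, _ in pairs) / n
    ay = sum(p[1] for p, _ in pairs) / n
    bx = sum(q[0] for _, q in pairs) / n
    by = sum(q[1] for _, q in pairs) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in pairs:
        dot += (px - ax) * (qx - bx) + (py - ay) * (qy - by)
        cross += (px - ax) * (qy - by) - (py - ay) * (qx - bx)
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return c, s, bx - (c * ax - s * ay), by - (s * ax + c * ay)

def residual(model, pair):
    """Distance [m] between the transformed first point and the second point."""
    c, s, tx, ty = model
    (px, py), (qx, qy) = pair
    return math.hypot(c * px - s * py + tx - qx, s * px + c * py + ty - qy)

def ransac_rigid_2d(pairs, threshold=10.0, iterations=100, seed=0):
    """Return indices of the largest set of matches consistent with one rigid motion."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        # Two matched pairs are the minimal sample for a 2D rigid transform.
        model = fit_rigid_2d(rng.sample(pairs, 2))
        inliers = [i for i, pair in enumerate(pairs) if residual(model, pair) < threshold]
        if len(inliers) > len(best):
            best = inliers
    return best

# Hypothetical matches: a 10 degree rotation plus a (5, -3) m translation.
c, s = math.cos(math.radians(10.0)), math.sin(math.radians(10.0))
points = [(100.0, 0.0), (0.0, 200.0), (-150.0, 50.0),
          (300.0, 300.0), (-200.0, -100.0), (50.0, -250.0)]
pairs = [(p, (c * p[0] - s * p[1] + 5.0, s * p[0] + c * p[1] - 3.0)) for p in points]
pairs[2] = (points[2], (400.0, 400.0))  # one deliberately false match

inliers = ransac_rigid_2d(pairs)
print(inliers)
```

Only the matches in `inliers` would then be passed on to the movement estimation; the false match at index 2 is inconsistent with the motion supported by the other five and is dropped.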

The other problem is accidental vessel matching. In heavily trafficked areas like ports, other vessels will be detected in the radar scans. If a moving vessel is deemed stationary and is used to update the filter, errors will be introduced. Two approaches could be taken to handle this problem. Again, a significance test could be used to rule out features from fast-moving vessels. Alternatively, a target tracking approach could be used. By tracking the features over multiple scans, the vessels can be detected and ruled out based on their position inconsistency compared to the stationary features. Describing such a system is, however, beyond the scope of this paper. The joint SLAM and target tracking problem has previously been studied in [41].
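The significance test mentioned above can be sketched as a gate on the squared Mahalanobis distance between a measured feature position and the position predicted by the filter. The 2D position measurement, the 99 % chi-square threshold and the covariance values below are assumptions for illustration; the paper does not specify them.

```python
def mahalanobis_sq(innovation, cov):
    """Squared Mahalanobis distance for a 2D innovation and a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    vx, vy = innovation
    # Apply the inverse of the 2x2 covariance analytically.
    ix = (d * vx - b * vy) / det
    iy = (-c * vx + a * vy) / det
    return vx * ix + vy * iy

# 99 % quantile of the chi-square distribution with 2 degrees of freedom.
CHI2_99_2DOF = 9.21

def passes_gate(measured, predicted, cov, gate=CHI2_99_2DOF):
    """True if the measurement is statistically consistent with the prediction."""
    innovation = (measured[0] - predicted[0], measured[1] - predicted[1])
    return mahalanobis_sq(innovation, cov) <= gate

# Hypothetical innovation covariance: 5 m standard deviation in each direction.
cov = ((25.0, 0.0), (0.0, 25.0))
predicted = (1000.0, 500.0)

island_ok = passes_gate((1004.0, 497.0), predicted, cov)    # small deviation
vessel_ok = passes_gate((1030.0, 520.0), predicted, cov)    # moved between scans
print(island_ok, vessel_ok)
```

A feature on a fast-moving vessel produces a large innovation relative to the predicted uncertainty and fails the gate, while a slowly moving vessel may still pass, which is one reason the text also suggests a target tracking approach.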

V. Experimental results

The experimental results in this section come from the master's thesis by Henrik Svensson [42]. The implemented framework is the one described in Sections III-E1 and III-E2.

A. Results

The trajectory used in the SLAM experiment is shown in bold in Figure 7. The track is about 3,000 s long (50 min) and covers roughly 32 km. Unfortunately, the entire round trip was never used in a single experiment, since it consists of multiple data sets.

The estimated trajectory with covariance is compared to the GPS data in Figure 8. The first two quarters of the trajectory consist of an island-rich environment, see Figure 6a, resulting in a very good estimate. The third quarter covers an area with fewer islands, causing the performance to degrade. This results in an initial misalignment of the final quarter that makes the estimated trajectory of this segment seem worse than it actually is.

Both velocity and course estimates, Figures 9 and 10, are quite good when compared to GPS data. There is, however, a positive bias on the velocity estimate, probably due to the simplifications mentioned in Section III-E. The course error grows over time since the estimate is the sum of a long sequence of estimated changes in course, see (16), and there are no course measurements available. The final course error is about 30 degrees.
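The growth of the course error can be illustrated with a short worked derivation. The notation here is an assumption introduced for illustration and is not necessarily that of (16): \psi_k is the true course after scan k, \Delta\hat{\psi}_k is the estimated change in course between scans k-1 and k, and e_k is the error in that estimate.

```latex
\hat{\psi}_N = \psi_0 + \sum_{k=1}^{N} \Delta\hat{\psi}_k ,
\qquad
\Delta\hat{\psi}_k = \Delta\psi_k + e_k
\quad\Longrightarrow\quad
\hat{\psi}_N - \psi_N = \sum_{k=1}^{N} e_k .

% If the e_k are independent with mean b and variance \sigma^2:
\mathrm{E}\!\left[\hat{\psi}_N - \psi_N\right] = N b ,
\qquad
\mathrm{Var}\!\left[\hat{\psi}_N - \psi_N\right] = N \sigma^2 .
```

Under these assumptions, any small bias accumulates linearly with the number of scans and the standard deviation grows as the square root of N, so without an absolute course measurement the error is never corrected.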

B. Map estimate

A radar map of the area was generated by overlaying the radar scans on the estimated trajectory. Figure 11a shows the estimated radar map that should be compared to the map created using the GPS trajectory, Figure 11b. They are very similar although small errors in the estimated trajectory are visible in Figure 11a as blurriness. Some islands appear a bit larger in the estimated map because of this. Overall, the map estimate is good.
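The overlay step can be sketched as follows, assuming each radar scan has been reduced to a list of detections given as range and bearing relative to the vessel, and that heading is measured counterclockwise from the world x-axis. These conventions, the 10 m grid cell and all function names are assumptions for illustration, not details taken from the paper.

```python
import math
from collections import Counter

def scan_to_world(pose, detections):
    """Transform body-fixed (range [m], bearing [rad]) detections to world x, y."""
    x, y, heading = pose
    return [(x + r * math.cos(heading + b), y + r * math.sin(heading + b))
            for r, b in detections]

def build_map(poses, scans, cell=10.0):
    """Count radar returns per grid cell after placing each scan at its pose."""
    grid = Counter()
    for pose, detections in zip(poses, scans):
        for wx, wy in scan_to_world(pose, detections):
            grid[(math.floor(wx / cell), math.floor(wy / cell))] += 1
    return grid

def observe(pose, landmark):
    """Helper for the example: ideal range and bearing to a landmark from a pose."""
    x, y, heading = pose
    dx, dy = landmark[0] - x, landmark[1] - y
    return math.hypot(dx, dy), math.atan2(dy, dx) - heading

# One hypothetical island observed from two poses (x [m], y [m], heading [rad]).
island = (505.0, 205.0)
poses = [(0.0, 0.0, 0.0), (100.0, 0.0, math.pi / 2)]
scans = [[observe(p, island)] for p in poses]

radar_map = build_map(poses, scans)
print(radar_map)
```

With correct poses, both returns fall in the same cell. If the poses contain errors, returns from the same island spread over neighbouring cells, which corresponds to the blurriness and the slightly enlarged islands seen in the estimated map.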

The estimated map should also be compared to the satellite photo of the area with the true trajectory marked in white as shown in Figure 11c. When compared, many islands in the estimated map are easily identified. This shows that the rather simple approach of using visual features on radar images can provide good mapping results.

VI. Conclusions

We have presented a new approach to robust navigation for surface vessels based on radar measurements only. No infrastructure or maps are needed. The basic idea is to treat the radar scans as images and apply the SIFT algorithm for tracking landmarks of opportunity. We presented two related frameworks, one based on the SLAM idea where the trajectory is estimated jointly with a map, and the other one based on odometry. We have evaluated the SIFT performance and the odometric framework on data from a high-speed patrol boat and obtained a very accurate trajectory and map.

An interesting application of this work would be to apply this method to underwater vessels equipped with synthetic aperture sonar as the imagery sensor, since there are very few low-cost solutions for underwater navigation.

References

[1] Volpe J: Vulnerability assessment of the transportation infrastructure relying on global positioning system final report. National Transportation Systems Center, Tech Rep 2001.
[2] GNSS vulnerability and mitigation measures, (rev. 6) European Maritime Radionavigation Forum, Tech Rep 2001.
[3] Grant A, Williams P, Ward N, Basker S: GPS jamming and the impact on maritime navigation. J Navig 2009,62(2):173-187. 10.1017/S0373463308005213
[4] Dalin M, Mahl S: Radar Map Matching. Master's thesis, Linköping University, Sweden; 2007.
[5] Karlsson R, Gustafsson F: Bayesian surface and underwater navigation. IEEE Trans Signal Process 2006,54(11):4204-4213.
[6] Lowe D: Object recognition from local scale-invariant features. ICCV '99: Proceedings of the International Conference on Computer Vision 1999, 2.
[7] Bay H, Tuytelaars T, Van Gool L: SURF: speeded up robust features. 9th European Conference on Computer Vision, Graz, Austria 2006.
[8] Durrant-Whyte H, Bailey T: Simultaneous localization and mapping (SLAM): part I. IEEE Robot Autom Mag 2006,13(2):99-110.
[9] Bailey T, Durrant-Whyte H: Simultaneous localization and mapping (SLAM): part II. IEEE Robot Autom Mag 2006,13(3):108-117.
[10] Newman P, Ho K: SLAM-loop closing with visually salient features. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2005.
[11] Bosse MC, Zlot R: Map matching and data association for large-scale two-dimensional laser scan-based SLAM. Int J Robot Res 2008,27(6):667-691. 10.1177/0278364908091366
[12] Granström K, Callmer J, Ramos F, Nieto J: Learning to detect loop closure from range data. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2009.
[13] Rouveure R, Monod M, Faure P: High resolution mapping of the environment with a ground-based radar imager. Proceedings of the International Radar Conference 2009.
[14] Checchin P, Grossier F, Blanc C, Chapuis R, Trassoudaine L: Radar scan matching SLAM using the Fourier-Mellin transform. IEEE International Conference on Field and Service Robotics 2009.
[15] Ramos F, Nieto J, Durrant-Whyte H: Recognising and modelling landmarks to close loops in outdoor SLAM. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2007.
[16] Eustice R, Singh H, Leonard J, Walter M, Ballard R: Visually navigating the RMS Titanic with SLAM information filters. Proceedings of Robotics: Science and Systems 2005.
[17] Mahon I, Williams S, Pizarro O, Johnson-Roberson M: Efficient view-based SLAM using visual loop closures. IEEE Trans Robot 2008,24(5):1002-1014.
[18] Bryson M, Sukkarieh S: Bearing-only SLAM for an airborne vehicle. Proceedings of the Australasian Conference on Robotics and Automation (ACRA) 2005.
[19] Carlbom P: Radar map matching. Technical Report 2005.
[20] Clark S, Durrant-Whyte H: Autonomous land vehicle navigation using millimeter wave radar. Proceedings of the IEEE International Conference on Robotics and Automation 1998.
[21] Clark S, Dissanayake G: Simultaneous localisation and map building using millimetre wave radar to extract natural features. Proceedings of the IEEE International Conference on Robotics and Automation 1999.
[22] Feng L, Milios E: Robot pose estimation in unknown environments by matching 2D range scans. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '94) 1994.
[23] Lu F, Milios E: Globally consistent range scan alignment for environment mapping. Auton Robot 1997,4(4):333-349. 10.1023/A:1008854305733
[24] Chen Y, Medioni G: Object modeling by registration of multiple range images. Image Vis Comput 1992,10(3):145-155. 10.1016/0262-8856(92)90066-C
[25] Ramos F, Fox D, Durrant-Whyte H: CRF-matching: Conditional random fields for feature based scan matching. Proceedings of Robotics: Science and Systems 2007.
[26] Chen Q, Defrise M, Deconinck F: Symmetric phase-only matched filtering of Fourier-Mellin transforms for image registration and recognition. IEEE Trans Pattern Analysis Mach Intell 1994,16(12):1156-1168. 10.1109/34.387491
[27] Chandran M, Newman P: Motion estimation from map quality with millimeter wave radar. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems 2006.
[28] Tsang SH, Hall PS, Hoare EG, Clarke NJ: Advance path measurement for automotive radar applications. IEEE Trans Intell Transp Syst 2006,7(3):273-281. 10.1109/TITS.2006.880614
[29] Lundquist C: Automotive Sensor Fusion for Situation Awareness. Licentiate Thesis No 1422, Linköping University, Sweden; 2009.
[30] Bolen SM, Chandrasekar V: Methodology for aligning and comparing spaceborne radar and ground-based radar observations. J Atmos Ocean Technol 2003, 20: 647-659. 10.1175/1520-0426(2003)20<647:MFAACS>2.0.CO;2
[31] Se S, Lowe D, Little J: Mobile robot localization and mapping with uncertainty using scale-invariant visual landmarks. Int J Robot Res 2002, 21: 735-758. 10.1177/027836402761412467
[32] Jensfelt P, Kragic D, Folkesson J, Bjorkman M: A framework for vision based bearing only 3-D SLAM. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2006.
[33] Cummins M, Newman P: Probabilistic appearance based navigation and loop closing. IEEE International Conference on Robotics and Automation 2007.
[34] Callmer J, Granström K, Nieto J, Ramos F: Tree of words for visual loop closure detection in urban SLAM. Proceedings of the 2008 Australasian Conference on Robotics and Automation (ACRA) 2008.
[35] Li F, Zhang G, Yan J: Coregistration based on SIFT algorithm for synthetic aperture radar interferometry. Proceedings of ISPRS Congress 2008.
[36] Schikora M, Romba B: A framework for multiple radar and multiple 2D/3D camera fusion. 4th German Workshop Sensor Data Fusion: Trends, Solutions, Applications (SDF) 2009.
[37] Essen H, Luedtke G, Warok P, Koch W, Wild K, Schikora M: Millimeter wave radar network for foreign object detection. 2nd International Workshop on Cognitive Information Processing (CIP) 2010.
[38] Gustafsson F: Statistical Sensor Fusion. Studentlitteratur, Lund, Sweden; 2010.
[39] Eustice R, Singh H, Leonard J: Exactly sparse delayed-state filters for view-based SLAM. IEEE Trans Robot 2006,22(6):1100-1114.
[40] Fischler MA, Bolles RC: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 1981, 24: 381-395. 10.1145/358669.358692
[41] Wang CC, Thorpe C, Thrun S, Hebert M, Durrant-Whyte H: Simultaneous localization, mapping and moving object tracking. Int J Robot Res 2007,26(9):889-916. 10.1177/0278364907081229
[42] Svensson H: Simultaneous Localization and Mapping in a Marine Environment using Radar Images. Master's thesis, Linköping University, Sweden; 2009.

Acknowledgements

This work was supported by the Strategic Research Center MOVIII, funded by the Swedish Foundation for Strategic Research, SSF and CADICS, a Linnaeus center funded by the Swedish Research Council. The authors declare that they have no competing interests.

Author information

Authors and Affiliations

Division of Automatic Control, Linköping University, Linköping, Sweden
Jonas Callmer, David Törnqvist & Fredrik Gustafsson
Nira Dynamics, Linköping, Sweden
Henrik Svensson
Saab Dynamics, Linköping, Sweden
Pelle Carlbom

Corresponding author

Correspondence to Jonas Callmer.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Callmer, J., Törnqvist, D., Gustafsson, F. et al. Radar SLAM using visual features. EURASIP J. Adv. Signal Process. 2011, 71 (2011). https://doi.org/10.1186/1687-6180-2011-71

Received: 10 December 2010
Accepted: 23 September 2011
Published: 23 September 2011
DOI: https://doi.org/10.1186/1687-6180-2011-71
