# Motion detection using binocular image flow in dynamic scenes

Qi Min^{1} and Yingping Huang^{1} (Email author)

*EURASIP Journal on Advances in Signal Processing* **2016**:49

https://doi.org/10.1186/s13634-016-0349-8

© Min and Huang. 2016

**Received: **7 September 2015

**Accepted: **12 April 2016

**Published: **19 April 2016

## Abstract

Motion detection is a hard task for intelligent vehicles since target motion is mixed with the ego-motion caused by the moving cameras. This paper proposes a stereo-motion fusion method for detecting moving objects from a moving platform. A 3-dimensional motion model integrating stereo and optical flow has been established to estimate the ego-motion flow. The mixed flow is calculated from an edge-indexed correspondence matching algorithm. The difference between the mixed flow and the ego-motion flow yields a residual target motion flow, from which the intact target is segmented. To estimate the ego-motion flow, a visual odometer has been implemented. We first extract feature points in the ground plane that are identified as static points using the height constraint and the Harris algorithm. Then, the 6-DOF motion parameters of the moving camera are calculated by fitting the feature points with the linear least square algorithm. The approach presented here is tested on substantial traffic videos, and the results prove the efficiency of the method.


## 1 Introduction

Detection of moving obstacles such as pedestrians and vehicles is of critical importance for autonomous vehicles. Vision-based sensing systems have been used for object detection in many applications including autonomous vehicles, robotics, and surveillance. Compared with static systems such as traffic and crowd surveillance, motion detection from a moving platform (vehicle) is more challenging since target motion is mixed with the camera’s ego-motion. This paper addresses this issue and presents a binocular stereovision-based in-vehicle motion detection approach which integrates stereo with optical flow. The approach makes full use of the two pairs of image sequences captured by a stereovision rig, i.e., disparity from the left and right image pairs and motion fields from consecutive images.

Vision-based motion detection methods can be categorized into three major classes, i.e., temporal difference, background modeling and subtraction, and optical flow. Temporal difference methods [1] readily adapt to sudden changes in the environment, but the resulting shapes of moving objects are often incomplete. Background modeling and subtraction is mainly used in video surveillance where the background is relatively fixed and static. Its basic idea is to subtract or differentiate the current image from a reference background model [2]. However, the generated background model may not be applicable in some scenes, such as under gradual or sudden illumination changes or with a dynamic background (waving trees). To address these issues, a hierarchical background modeling and subtraction method [3] and a self-adaptive background matching method [4] have been proposed. Adaptive background models have also been used in autonomous vehicles in an effort to adapt surveillance methods to the dynamic on-road environment. In [5], an adaptive background model was constructed, with vehicles detected based on motion that differentiated them from the background. Dynamic modeling of the scene background in the area of the image where vehicles typically overtake was implemented in [6].

Optical flow, a fundamental machine vision tool, has the advantage of directly reflecting the motion of image points, representing the change in position of a moving point. It has been used for motion detection and tracking in defense [7] and for abnormal crowd behavior detection in video surveillance [8]. In autonomous vehicles, monocular optical flow has been used to detect head-on vehicles [9], overtaking vehicles in the blind spot [10], and crossing obstacles [11]. In [12], interest points that persisted over long periods of time were detected and tracked using a hidden Markov model as vehicles traveling parallel to the ego vehicle. In [13], optical flow was used to form a spatiotemporal descriptor able to classify the scene as either intersection or non-intersection. Optical flow has also been used heavily in stereovision-based motion detection, i.e., the stereo-motion fusion method, which benefits from motion cues as well as depth information. There are many different fusion schemes. In [14], Pantilie et al. fuse motion information derived from optical flow into a depth-adaptive occupancy grid (bird’s-eye map) generated from stereovision 3D reconstruction. As an improvement of the stereovision-based approach, the method helps distinguish between static and moving obstacles and supports reasoning about motion speed and direction. Franke and Heinrich [15] propose a depth/flow quotient constraint: independently moving regions of the image do not fulfill the constraint and are thereby detected. Since the fusion algorithm compares the flow/depth quotient against a threshold function at distinct points only, it is computationally efficient. However, the approach reduces the possibility of carrying out geometrical reasoning and lacks a precise measurement of the detected movements. In addition, the approach is limited with respect to robustness since only two consecutive frames are considered.
To get more reliable results, a Kalman filter can be employed to integrate the observations over time. In [16], Rabe et al. employ a Kalman filter to track image points and to fuse the spatial and temporal information so that static and moving pixels can be distinguished before any segmentation is performed. The result is an improved accuracy of the 3D position and an estimation of the 3D motion of the detected moving objects. In [17], Kitt et al. use a sparse set of static image features (e.g., corners) with measured optical flow and disparity and apply the *Longuet-Higgins equations* with an implicit extended Kalman filter to recover the ego-motion. Feature points whose optical flow and disparity flow are not consistent with the estimated ego-motion indicate the existence of independently moving objects. In [18], Bota and Nedevschi focus on fusing stereo and optical flow for multi-class object tracking by designing Kalman filters fitted with static and dynamic cuboidal object models. In [19], interest moving points are first detected and projected onto the 3D reconstructed ground plane using optical flow and stereo disparity. The scene flow is computed via finite differences for a track of up to five 3D positions, and points with a similar scene flow are grouped together as rigid objects in the scene. A graph-like structure connecting all detected interest points is generated, and edges are removed where the scene flow difference exceeds a certain threshold. The remaining connected components describe moving objects.

A precise recovery of the ego-motion is essential in order to distinguish between static and moving objects in dynamic scenes. One method of ego-motion estimation is to use an in-vehicle inertial navigation system (INS) [15]. However, ego-motion from the in-car sensor is not sufficient for a variety of reasons such as navigation loss, wheel slip, INS saturation, and calibration errors. Thus, it is ideal to estimate the camera ego-motion directly from the imagery. Ego-motion estimation using monocular optical flow with integrated detection of vehicles was implemented in [20]. Several groups have reported stereo-based ego-motion estimation based on tracking point features. In [18], the concept of 6D vision, i.e., the tracking of interest points in 3D using Kalman filtering, along with ego-motion compensation, was used to identify moving objects in the scene. In [21], the vehicle’s ego-motion was estimated from computationally expensive dense stereo and dense optical flow by iteratively learning from all points in the image.

Stereo-motion fusion has been studied in a theoretical manner by Waxman and Duncan [22]. The important result was the relationship between camera’s 3D motion and corresponding image velocities with stereo constraints. Our work builds on the basic principles presented in [22] and extends it to dynamic scene analysis. In this work, a mathematical model, integrating optical flow, depth, and camera ego-motion parameters, is firstly derived from Waxman and Duncan’s theoretical analysis. Camera’s ego-motion is then estimated from the model by using ground feature points, and accordingly ego-motion flow of the image is calculated from the model. A moving target is detected from the difference of the mixed flow and the ego-motion flow.

The main contributions of this work can be summarized as follows: (1) The relationship between optical flow, stereo depth, and camera ego-motion parameters has been established based on Waxman and Duncan’s theoretical model. Accordingly, a novel motion detection approach fusing stereo with optical flow has been proposed for in-vehicle environment sensing systems. A visual odometer able to estimate the camera’s ego-motion has also been proposed. Motion detection using stereo-motion fusion normally identifies image points [16, 19] or features [17] as static or moving and then segments moving objects accordingly. Our method works on the image level, i.e., on the difference between the mixed flow image and the ego-motion flow image. (2) Existing motion detection approaches often make assumptions on object/vehicle motion or scene structure. Our approach can detect moving objects without any constraints on object/vehicle motion or scene structure since the proposed visual odometer can estimate all six motion parameters. (3) When fusing stereo with optical flow, the computational load, accuracy, and comparability (or consistency) between stereo and optical flow calculations are practical issues. Our method uses the edge-indexed method for all calculations and therefore greatly reduces the computational load without impact on detection performance, improves calculation accuracy especially for the mixed flow, and provides pixel-wise consistency for all calculations so that the stereo depth, the mixed flow, and the ego-motion flow can be compared pixel by pixel.

## 2 Approaches

### 2.1 Overview of the approach

The mixed flow of the scene is caused by both camera motion and target motion and is obtained from correspondence matching between consecutive images. The ego-motion flow is caused only by camera motion and is calculated from a mathematical model derived from Waxman and Duncan’s theoretical analysis [22], which indicates the relation between optical flow, the depth map, and the camera ego-motion parameters. To calculate the ego-motion flow, we first need to know the ego-motion parameters of six degrees of freedom. A visual odometer has been implemented for this purpose, in which the six motion parameters are estimated by solving a set of equations fitted with a fixed number of feature points using the linear least square method. The feature points are selected as corner points lying on the road surface and are determined by using the height constraint and the Harris corner detection algorithm [23]. In both stages, the depth of the image points is provided by the stereovision disparity map. The difference between the mixed flow and the ego-motion flow yields an independent flow which is purely caused by the target motion. The moving target is extracted according to the continuity of the similar independent flow.

To reduce the computational workload and considering that object contour is the most effective cue for object segmentation, all calculations are edge-indexed, i.e., we only conduct calculations on edge points for stereo matching, the mixed flow, and the ego-motion flow calculations. This tactic greatly increases the real-time performance and has no impact on object detection performance.

### 2.2 The mixed flow

- Step 1.
Generate an edge image using the Canny operator and use the edge points as seed points to find the correspondence points in the next frame.

- Step 2.
Define the searching range as a square area centered at the seed point and define a rectangular matching window.

- Step 3.
Use the normalized cross correlation coefficients as a measure of greyscale similarity of two matching windows. The correspondence points are regarded as those with the maximum cross correlation coefficient that must be greater than a predefined threshold.

- Step 4.
Achieve the sub-pixel estimation of the calculated optical flow along the vertical and horizontal directions by introducing a quadratic interpolation. This is to improve the optical flow resolution so that a higher optical flow accuracy can be achieved.
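Steps 1–4 can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the window size, search range, and correlation threshold are hypothetical placeholders:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation coefficient of two equal-size windows."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def match_flow(prev, curr, px, py, win=5, search=10, thresh=0.8):
    """Displacement (u, v) of edge point (px, py) from `prev` to `curr`.

    Searches a square region around the seed point, keeps the maximum NCC
    if it exceeds `thresh`, and refines to sub-pixel accuracy by quadratic
    interpolation of the correlation surface. Returns None on failure.
    """
    # Skip points too close to the border to extract full windows.
    if (px < search + win or py < search + win or
            px >= prev.shape[1] - search - win or
            py >= prev.shape[0] - search - win):
        return None
    ref = prev[py - win:py + win + 1, px - win:px + win + 1]
    scores = np.full((2 * search + 1, 2 * search + 1), -1.0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = curr[py + dy - win:py + dy + win + 1,
                        px + dx - win:px + dx + win + 1]
            scores[dy + search, dx + search] = ncc(ref, cand)
    iy, ix = np.unravel_index(np.argmax(scores), scores.shape)
    if scores[iy, ix] < thresh:
        return None
    # Quadratic interpolation around the peak for sub-pixel flow.
    def subpix(m1, c0, p1):
        d = m1 - 2 * c0 + p1
        return 0.5 * (m1 - p1) / d if d != 0 else 0.0
    u = ix - search
    v = iy - search
    if 0 < ix < 2 * search:
        u += subpix(scores[iy, ix - 1], scores[iy, ix], scores[iy, ix + 1])
    if 0 < iy < 2 * search:
        v += subpix(scores[iy - 1, ix], scores[iy, ix], scores[iy + 1, ix])
    return u, v
```

In practice, this search is run only at Canny edge points, which is what keeps the mixed-flow computation tractable.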

### 2.3 3-dimensional motion and ego-motion flow

The world coordinate system (*X*, *Y*, *Z*) is located at the center of the image coordinate system (*x*, *y*), and the *Z*-axis is directed along the optical axis of the camera. The translational velocity of the camera is \( \overline{V}=\left({V}_x,{V}_y,{V}_z\right) \), and the rotational velocity is \( \overline{W}=\left({W}_x,{W}_y,{W}_z\right) \).

When a point *P*(*X*, *Y*, *Z*) in space moves to point *P*′(*X*′, *Y*′, *Z*′), the relation between the point motion and the camera motion is as below [22]:

$$ \dot{P}=\overline{V}+\overline{W}\times P $$

The position vector of *P*(*X*, *Y*, *Z*) and the camera’s rotational velocity vector can be represented by their components along the *X*-, *Y*-, and *Z*-axes, where × refers to the cross-product. Thus, Eq. (2) can be rewritten as

$$ \dot{X}={V}_x+{W}_yZ-{W}_zY,\kern1em \dot{Y}={V}_y+{W}_zX-{W}_xZ,\kern1em \dot{Z}={V}_z+{W}_xY-{W}_yX $$

The image point *p*(*x*, *y*) of the world point *P*(*X*, *Y*, *Z*) projected onto the image plane can be expressed as

$$ x=f\frac{X}{Z},\kern2em y=f\frac{Y}{Z} $$

where *f* denotes the focal length of the stereo camera. The optical flow (*u*, *v*) of *P*(*X*, *Y*, *Z*) is obtained by taking the time derivatives of the image coordinates along the *x*-axis and *y*-axis:

$$ {\left(u,v\right)}^T=A{\left(\overline{V},\overline{W}\right)}^T $$ (7)

where \( A=\left[\begin{array}{ccc}\hfill \frac{f}{Z}\hfill & \hfill 0\hfill & \hfill -\frac{x}{Z}\hfill \\ {}\hfill 0\hfill & \hfill \frac{f}{Z}\hfill & \hfill -\frac{y}{Z}\hfill \end{array}\ \begin{array}{ccc}\hfill -\frac{xy}{f}\hfill & \hfill \frac{f^2+{x}^2}{f}\hfill & \hfill -y\hfill \\ {}\hfill -\frac{f^2+{y}^2}{f}\hfill & \hfill \frac{xy}{f}\hfill & \hfill x\hfill \end{array}\right] \).

Equation (7) indicates the relationship between the ego-motion flow, the depth, and the six parameters of the camera motion. It is evident that the ego-motion flow can be calculated from Eq. (7) if the depth and the six motion parameters are known. The depth can be obtained from stereovision as reported in our previous work [26]. Two methods can be used to obtain the motion parameters: one is to use an in-vehicle INS or gyroscope to measure them; the other is to use a visual odometer. However, subject to problems like navigation loss, wheel slip, INS saturation, and calibration errors between the IMU and the cameras, an in-vehicle INS may give inaccurate motion estimates in some cases. Thus, it is ideal to estimate the camera motion directly from the imagery. Ultimately, it could be fused with other state sensors to produce a more accurate and reliable joint estimate of camera/vehicle motion.
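The relational model of Eq. (7) can be written out explicitly. The sketch below (an illustrative expansion, not the authors' code) evaluates the ego-motion flow at one image point, with coordinates measured from the image center:

```python
import numpy as np

def ego_motion_flow(x, y, Z, f, V, W):
    """Ego-motion flow (u, v) at image point (x, y) with depth Z, per Eq. (7).

    V = (Vx, Vy, Vz) and W = (Wx, Wy, Wz) are the camera's translational and
    rotational velocities, f is the focal length; this is simply the matrix
    product (u, v)^T = A (V, W)^T with the matrix A spelled out row by row.
    """
    Vx, Vy, Vz = V
    Wx, Wy, Wz = W
    u = (f * Vx - x * Vz) / Z - x * y * Wx / f + (f**2 + x**2) * Wy / f - y * Wz
    v = (f * Vy - y * Vz) / Z - (f**2 + y**2) * Wx / f + x * y * Wy / f + x * Wz
    return u, v
```

Applied to every edge pixel (with its stereo depth) this yields the ego-motion flow image that is later subtracted from the mixed flow.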

### 2.4 Visual odometry

It can be seen from Eq. (7) that if the ego-motion flow and the depth of six or more points in the scene are known, we can set up a set of equations in six unknown variables, i.e., the six camera motion parameters, and estimate these variables by solving the equation set using the least square fitting method. The points used for the least square fitting must have accurately calculated optical flow and must not be moving points.

In this work, the corner points lying on the road surface are selected for this purpose: the ground points are static, and corner points offer good stability and insensitivity to changes in light intensity, therefore yielding relatively accurate optical flow.

#### 2.4.1 Extraction of ground corner points using stereovision and Harris method

The height *Y*_{ g } of the ground points, namely their *Y*-axis coordinate, depends on the camera installation height *H*_{ c }, the tilt angle *θ* towards the road plane, and the distance *Z*_{ g }, as indicated in Eq. (8) and Fig. 3. Those points with a *Y*-axis coordinate less than *Y*_{ g } are regarded as ground points.

When a window block on image *I*(*x*, *y*) moves in any direction by small displacements (Δ*x*, Δ*y*), the autocorrelation function is defined as below:

$$ c\left(x,y\right)={\displaystyle \sum_{W\left(x,y\right)}}\varphi \left(x,y\right){\left[I\left(x+\Delta x,y+\Delta y\right)-I\left(x,y\right)\right]}^2\approx \left(\Delta x,\Delta y\right)M\left(x,y\right){\left(\Delta x,\Delta y\right)}^T $$

with

$$ M\left(x,y\right)={\displaystyle \sum_{W\left(x,y\right)}}\varphi \left(x,y\right)\left[\begin{array}{cc}\hfill {I}_x^2\hfill & \hfill {I}_x{I}_y\hfill \\ {}\hfill {I}_x{I}_y\hfill & \hfill {I}_y^2\hfill \end{array}\right] $$

where *φ*(*x*, *y*) is a Gaussian weighting function used here to reduce the impact of noise; *W*(*x*, *y*) denotes the window block centered at the point; *I*_{ x } is the gradient in the *x* direction; and *I*_{ y } is the gradient in the *y* direction. The Sobel convolution kernel *ω*_{ x } and its transposed form *ω*_{ y } are used to obtain *I*_{ x } = *I*(*x*, *y*) ⊗ *ω*_{ x } and *I*_{ y } = *I*(*x*, *y*) ⊗ *ω*_{ y }.

*M*(*x*, *y*) is called the autocorrelation matrix. The corner response function (CRF) is defined as

$$ \mathrm{CRF}= \det (M)-\alpha \cdot {\left(\mathrm{trace}(M)\right)}^2 $$

where det(*M*) = *λ*_{1} × *λ*_{2} and trace(*M*) = *λ*_{1} + *λ*_{2}, with *λ*_{1} and *λ*_{2} denoting the eigenvalues of the matrix *M*; we set *α* = 0.04. A point with a CRF bigger than a certain threshold is regarded as a corner point.
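A minimal sketch of the corner detector described above (Sobel gradients, Gaussian weighting, corner response with α = 0.04); the convolution helper and the 5 × 5 window size are illustrative choices, not the paper's exact settings:

```python
import numpy as np

def convolve2d(img, k):
    """Minimal 'same'-size 2D convolution with zero padding (small kernels)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    kf = np.flipud(np.fliplr(k))  # flip kernel: convolution, not correlation
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (padded[i:i + kh, j:j + kw] * kf).sum()
    return out

def harris_crf(img, alpha=0.04, sigma=1.0):
    """Corner response CRF = det(M) - alpha * trace(M)^2 at every pixel."""
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    Ix = convolve2d(img, sobel_x)        # gradient in x (kernel w_x)
    Iy = convolve2d(img, sobel_x.T)      # gradient in y (transposed kernel w_y)
    # Gaussian weighting of the gradient products over a 5x5 window.
    r = np.arange(-2, 3)
    g1 = np.exp(-r**2 / (2.0 * sigma**2))
    g = np.outer(g1, g1)
    g /= g.sum()
    Sxx = convolve2d(Ix * Ix, g)
    Syy = convolve2d(Iy * Iy, g)
    Sxy = convolve2d(Ix * Iy, g)
    det = Sxx * Syy - Sxy**2
    trace = Sxx + Syy
    return det - alpha * trace**2
```

On a synthetic bright square, the response is positive at the square's corners and negative along its straight edges, which is the behavior the thresholding step relies on.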

#### 2.4.2 Ego-motion parameter estimation using the linear least square algorithm

For each selected ground corner point, the measured optical flow (*u*, *v*) is calculated with the method introduced in Section 2.2, and the estimated optical flow \( \left(\widehat{u},\widehat{v}\right)=A{\left(\overline{V},\overline{W}\right)}^T \) is obtained from Eq. (7). The objective function is the sum of the squared differences between the measured and the estimated optical flows over all selected points:

$$ E\left(\overline{V},\overline{W}\right)={\displaystyle \sum_i}\left[{\left({u}_i-{\widehat{u}}_i\right)}^2+{\left({v}_i-{\widehat{v}}_i\right)}^2\right] $$

The minimum value of the objective function is found by setting its gradient to zero, and the optimal parameter values follow from the resulting linear equations. The coefficient matrix *A* depends only on the focal length *f* of the stereo camera, the depth *Z*, and the image coordinates, as shown in Eq. (7).
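The least square fit can be sketched by stacking the two rows of *A* for each ground corner point and solving the resulting overdetermined linear system; the function names and the use of `numpy.linalg.lstsq` are illustrative:

```python
import numpy as np

def a_matrix(x, y, Z, f):
    """The 2x6 coefficient matrix A of Eq. (7) for one image point."""
    return np.array([
        [f / Z, 0.0, -x / Z, -x * y / f, (f**2 + x**2) / f, -y],
        [0.0, f / Z, -y / Z, -(f**2 + y**2) / f, x * y / f, x],
    ])

def estimate_ego_motion(points, flows, depths, f):
    """Linear least square estimate of (Vx, Vy, Vz, Wx, Wy, Wz).

    points: (x, y) ground corner points in image-center coordinates,
    flows: their measured optical flows (u, v), depths: their stereo depths.
    Each point contributes two equations, so >= 3 well-spread points are
    needed; the paper uses a larger fixed number for robustness.
    """
    A = np.vstack([a_matrix(x, y, Z, f) for (x, y), Z in zip(points, depths)])
    b = np.concatenate([[u, v] for u, v in flows])
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params  # (Vx, Vy, Vz, Wx, Wy, Wz)
```

With noise-free synthetic flows generated from known motion parameters, the solver recovers those parameters exactly, which is a useful sanity check on the sign conventions of *A*.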

### 2.5 Independent flow and target segmentation

The independent flow is obtained by subtracting the ego-motion flow from the mixed flow:

$$ \left[{u}_r\kern0.5em {v}_r\right]=\left[{u}_m\kern0.5em {v}_m\right]-\left[{u}_e\kern0.5em {v}_e\right] $$

where [*u*_{ r } *v*_{ r }] denotes the independent flow in the horizontal and vertical directions, [*u*_{ m } *v*_{ m }] the mixed flow, and [*u*_{ e } *v*_{ e }] the ego-motion flow. The magnitude of the two components of the independent flow is calculated as \( s=\sqrt{{u}_r^2+{v}_r^2} \). Target segmentation is based on this magnitude.

- 1.
For a threshold *t*, *s*_{min} < *t* < *s*_{max}, define the variance *ε*(*t*) between the moving target’s independent flow and the background’s independent flow as

$$ \varepsilon (t)={p}_o{\left({s}_o-t\right)}^2+{p}_g{\left({s}_g-t\right)}^2 $$ (15)

where *s*_{ o } denotes the mean of the independent flows of the moving points, \( {s}_o=\frac{{\displaystyle \sum }{s}_i{p}_i}{p_o}\ \left({s}_i>t\right) \); *s*_{ g } denotes the mean of the independent flows of the background points, \( {s}_g=\frac{{\displaystyle \sum }{s}_i{p}_i}{p_g}\ \left({s}_i<t\right) \); *p*_{ o } denotes the proportion of the points with *s* > *t*; *p*_{ g } the proportion of the points with *s* < *t*; and *p*_{ i } the proportion of the points with *s* = *s*_{ i }.

- 2.
Search for the *t* from *s*_{min} to *s*_{max} that makes *ε*(*t*) maximum and use it as the threshold to segment the moving objects from the background. This process ensures a maximum between-class distance.

We discard the pixels whose independent flow is below the threshold determined above. For the pixels with independent flow above the threshold, we use the region-growing method to cluster similar potentials together to form the eventual segmentation. In this work, the independent flow is also combined with the disparity (depth) for object clustering. This tactic is especially useful for separating objects that are close to each other or partially occluded.
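The threshold search can be illustrated with the classical Otsu between-class variance, a standard and closely related form of the maximization described above (the sketch below uses the criterion \( {p}_o{p}_g{\left({s}_o-{s}_g\right)}^2 \) rather than the exact wording of Eq. (15), so it is a sketch of the idea, not the paper's precise formula):

```python
import numpy as np

def flow_threshold(s, num_steps=256):
    """Pick the flow-magnitude threshold separating moving points from the
    background by maximizing the between-class variance (Otsu's criterion).

    s: 1D array of independent-flow magnitudes, one per edge pixel.
    """
    s = np.asarray(s, dtype=float)
    best_t, best_var = s.min(), -1.0
    for t in np.linspace(s.min(), s.max(), num_steps)[1:-1]:
        obj = s[s > t]    # candidate moving points
        bg = s[s <= t]    # background points
        if obj.size == 0 or bg.size == 0:
            continue
        p_o = obj.size / s.size
        p_g = bg.size / s.size
        var = p_o * p_g * (obj.mean() - bg.mean()) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t
```

On a bimodal magnitude distribution (small background flow, large target flow) the maximizing threshold falls in the valley between the two modes, which is exactly the separation the segmentation step needs.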

## 3 Experiments and results

### 3.1 Disparity of stereovision

Figure 4b, e shows the edge maps obtained from a Canny detector. The edge points in the left image are used as seed points to search for the correspondence points in the right image, using greyscale similarity as the measure. The resulting disparity maps are displayed in Fig. 4c, f; a color scheme is used to visualize the disparity. The depth information of the image points can be derived from the disparity map. Note that some points, such as the trees out of the detection range, are not present in the disparity maps. It is also worth noting that contour occlusion can arise from the different viewpoints of the two cameras and can hinder stereo correspondence matching, especially at short distances with a wide baseline. In our application, we use a relatively short stereo baseline of 218.95 mm and a detection range of 4 to 50 m, so the occlusion effect is not significant. In addition, stereo matching depends on the selection of the matching windows and the threshold set on the correlation coefficient. The detailed edge-indexed stereo matching procedure can be found in our previous work [26]. All experiments show that edge-indexed stereo matching successfully generates an edge-indexed disparity map.
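For a rectified stereo rig, depth follows from disparity through the standard relation *Z* = *fB*/*d*. A one-line sketch using the 218.95-mm baseline quoted above (the focal length in pixels is an assumed input, not a value given in the paper):

```python
def depth_from_disparity(d_pixels, f_pixels, baseline_m=0.21895):
    """Depth (meters) from disparity for a rectified stereo pair: Z = f*B/d.

    d_pixels: disparity in pixels; f_pixels: focal length in pixels (an
    assumed input); baseline_m defaults to the 218.95 mm baseline used here.
    """
    if d_pixels <= 0:
        raise ValueError("disparity must be positive")
    return f_pixels * baseline_m / d_pixels
```

Smaller disparities map to larger depths, which is why distant trees beyond the detection range drop out of the edge-indexed disparity maps.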

### 3.2 Mixed flow results

### 3.3 Visual odometer results

The estimated ego-motion parameters for the two test scenarios are listed in the table below. For both scenarios, *V*_{ z } is significant and *V*_{ y }, *W*_{ x }, *W*_{ y }, and *W*_{ z } are tiny. This is reasonable since the vehicle was moving with a certain speed on a relatively flat road. For scenario 2, *V*_{ x } is also significant because the vehicle was turning left in a bend. For scenario 1, *V*_{ x } is equal to 0.17 m/frame, indicating that the vehicle was not moving strictly in the longitudinal direction and had a small lateral movement at the moment.

Results of ego-motion estimation

Ego-motion parameters | Scenario 1 | Scenario 2 |
---|---|---|
\( {V}_x\kern0.5em {V}_y\kern0.5em {V}_z \) (m/frame) | −0.22, −0.04, 227.04 | 48.61, −0.03, 214.25 |
\( {W}_x\kern0.5em {W}_y\kern0.5em {W}_z \) (rad/frame) | 0.12, −0.15, −0.08 | 0.14, −0.02, 0.07 |

During the video acquisition, a Spatial NAV 982 Inertial Navigation System was fitted in the car to measure the ego-motion parameters. Although the INS may lose detection in some cases, the comparison between the effective data of the two systems shows that the difference between the results is within 4 %, indicating that our visual odometer is reasonably accurate.

### 3.4 Ego-motion flow results

### 3.5 Independent flow and motion extraction

### 3.6 Evaluation of the system

*Recall* and *Precision* are usually used to assess the accuracy of object detection. *Recall* is defined as follows:

$$ \mathrm{Recall}=\frac{tp}{tp+fn} $$

where *tp* is the total number of true-positively detected objects, *fn* is the total number of false-negatively detected objects, and (*tp* + *fn*) indicates the total number of objects in the ground truth.

*Precision* is defined as follows:

$$ \mathrm{Precision}=\frac{tp}{tp+fp} $$

where *fp* is the total number of false-positively detected objects, and (*tp* + *fp*) indicates the total number of the detected objects.
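The two definitions translate directly into code; the counts used in the check are illustrative, not the paper's raw tallies:

```python
def recall(tp, fn):
    """Recall = tp / (tp + fn): the fraction of ground-truth objects that
    were detected (tp true positives, fn false negatives)."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Precision = tp / (tp + fp): the fraction of reported detections that
    are correct (fp false positives)."""
    return tp / (tp + fp)
```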

Accuracy rate of our method

Object type under detection | Precision | Recall |
---|---|---|
Pedestrian | 94.0 % | 92.2 % |
Vehicle | 94.5 % | 93.1 % |

The system is implemented in C++ on an industrial computer equipped with a 2.40-GHz Intel Dual Core i5 processor and 4 GB of RAM. In general, we achieve a processing rate of 10–15 frames per second (FPS), depending on the complexity of the images; this rate includes the stereo pre-processing time. Ideally, a real-time system should run at 25 FPS or more, but we believe this can be achieved with bespoke image-processing hardware in the future.

### 3.7 Comparison with other methods

Comparison with other research work

Category | Approach | Object type under detection | Precision (%) | Recall (%) | Note |
---|---|---|---|---|---|
Self-adaptive background matching | BBM-based Cauchy distribution [4] | Pedestrian | 98.8 | 88.1 | Video surveillance with static camera |
 | | Vehicle | 91.3 | 72.0 | |
Optical flow | Hidden Markov model (HMM) [12] | Vehicle only | – | 86.6 | |
Stereo-motion fusion | Longuet-Higgins equations combined with extended Kalman filter [17] | Pedestrian or car | – | 96 | Result for feature point detection; the recall definition is slightly different from ours |
 | Cuboidal object model with extended Kalman filter [18] | Pedestrian or car | – | 71.3 | Result for object tracking |
 | Our approach | Pedestrian | 94.0 | 92.2 | |
 | | Vehicle | 94.5 | 93.1 | |

## 4 Conclusions

This paper presents a novel motion detection approach using a stereovision sensor for in-vehicle environment sensing systems. The relationship between optical flow, stereo depth, and camera ego-motion parameters has been established. Accordingly, a visual odometer has been implemented for estimating the six ego-motion parameters by solving a set of equations fitted with a number of feature points using the linear least square method. The feature points are selected as corner points lying on the road surface and are determined by using the height constraint and the Harris corner detection algorithm. The ego-motion flow evoked by the moving camera/vehicle is calculated from the relational model by using the estimated ego-motion parameters. The mixed flow caused by both camera motion and target motion is obtained from the correspondence matching between consecutive images. The difference between the mixed flow and the ego-motion flow yields the independent flow, which is attributable purely to the target motion. The moving targets are extracted according to the continuity of the similar independent flow. The approach presented here was tested on substantial complex urban traffic videos. The experimental results demonstrate that the approach can detect moving objects with a correct detection rate of 93 %. The accuracy of the ego-motion estimation is within 4 % compared to an in-vehicle INS sensor. The processing rate reaches 10–15 FPS on an industrial computer equipped with a 2.40-GHz Intel Dual Core i5 processor and 4 GB of RAM.

## Declarations

### Acknowledgements

This work was sponsored by Specialized Research Fund for the Doctoral Program of Higher Education (Project No. 20133120110006), the National Natural Science Foundation of China (Project No. 61374197), and the Science and Technology Commission of Shanghai Municipality (Project No. 13510502600).

**Open Access**This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## Authors’ Affiliations

## References

1. JE Ha, WH Lee, Foreground objects detection using multiple difference images. Opt. Eng. **49**(4), 047201 (2010)
2. M Cristani, M Farenzena, D Bloisi, V Murino, Background subtraction for automated multisensor surveillance: a comprehensive review. EURASIP J. Adv. Signal Process. **2010**, 343057 (2010). doi:10.1155/2010/343057
3. L Wei, H Yu, H Yuan, H Zhao, X Xu, Effective background modelling and subtraction approach for moving object detection. IET Comput. Vis. **9**(1), 13–24 (2015)
4. FC Cheng, SJ Ruan, Accurate motion detection using a self-adaptive background matching framework. IEEE Trans. Intell. Transp. Syst. **13**(2), 671–679 (2012)
5. A Broggi, A Cappalunga, S Cattani, P Zani, Lateral vehicles detection using monocular high resolution cameras on TerraMax, in Proc. IEEE Intell. Veh. Symp. (2008), pp. 1143–1148
6. Y Zhu, D Comaniciu, M Pellkofer, T Koehler, Reliable detection of overtaking vehicles using robust information fusion. IEEE Trans. Intell. Transp. Syst. **7**(4), 401–414 (2006)
7. K Liu, Q Du, H Yang, B Ma, Optical flow and principal component analysis-based motion detection in outdoor videos. EURASIP J. Adv. Signal Process. **2010**, 680623 (2010). doi:10.1155/2010/680623
8. L Yang, XF Li, L Jia, Abnormal crowd behavior detection based on optical flow and dynamic threshold, in Proc. 11th World Congress on Intelligent Control and Automation (WCICA) (2014), pp. 2902–2906
9. E Martinez, M Diaz, J Melenchon, J Montero, I Iriondo, J Socoro, Driving assistance system based on the detection of head-on collisions, in Proc. IEEE Intell. Veh. Symp. (2008), pp. 913–918
10. J Diaz Alonso, E Ros Vidal, A Rotter, M Muhlenberg, Lane-change decision aid system based on motion-driven vehicle tracking. IEEE Trans. Veh. Technol. **57**, 2736–2746 (2008)
11. I Sato, C Yamano, H Yanagawa, Crossing obstacle detection with a vehicle-mounted camera, in Proc. IEEE Intell. Veh. Symp. (2011), pp. 60–65
12. A Jazayeri, H Cai, JY Zheng, M Tuceryan, Vehicle detection and tracking in car video based on motion model. IEEE Trans. Intell. Transp. Syst. **12**(2), 583–595 (2011)
13. A Geiger, B Kitt, Object flow: a descriptor for classifying traffic motion, in Proc. IEEE Intell. Veh. Symp. (San Diego, USA, 2010), pp. 287–293
14. D Pantilie, S Nedevschi, Real-time obstacle detection in complex scenarios using dense stereo vision and optical flow, in Proc. IEEE Conf. on Intelligent Transportation Systems (Funchal, 2010), pp. 439–444
15. U Franke, S Heinrich, Fast obstacle detection for urban traffic situations. IEEE Trans. Intell. Transp. Syst. **3**(3), 173–181 (2002)
16. C Rabe, U Franke, S Gehrig, Fast detection of moving objects in complex scenarios, in Proc. IEEE Intell. Veh. Symp. (2007), pp. 398–403
17. B Kitt, B Ranft, H Lategahn, Detection and tracking of independently moving objects in urban environments, in Proc. 13th Int. IEEE Conf. on Intelligent Transportation Systems (2010), pp. 1396–1401
18. S Bota, S Nedevschi, Tracking multiple objects in urban traffic environments using dense stereo and optical flow, in Proc. 14th Int. IEEE Conf. on Intelligent Transportation Systems (2011), pp. 791–796
19. P Lenz, J Ziegler, A Geiger, M Roser, Sparse scene flow segmentation for moving object detection in urban environments, in Proc. IEEE Intell. Veh. Symp. (Baden-Baden, Germany, 2011), pp. 926–932
20. T Yamaguchi, H Kato, Y Ninomiya, Moving obstacle detection using monocular vision, in Proc. IEEE Intell. Veh. Symp. (2006), pp. 288–293
21. A Talukder, L Matthies, Real-time detection of moving objects from moving vehicles using dense stereo and optical flow, in Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (2004), pp. 3718–3725
22. AM Waxman, JH Duncan, Binocular image flows: steps towards stereo-motion fusion. IEEE Trans. Pattern Anal. Mach. Intell. **8**(6), 715–729 (1986)
23. C Harris, M Stephens, A combined corner and edge detector, in Proc. 4th Alvey Vision Conference (1988), pp. 147–151
24. B McCane, K Novins, D Crannitch, B Galvin, On benchmarking optical flow. Comput. Vis. Image Underst. **84**, 126–143 (2001)
25. Y Huang, K Young, Binocular image sequence analysis: integration of stereo disparity and optic flow for improved obstacle detection and tracking. EURASIP J. Adv. Signal Process. **2008**, 843232 (2008). doi:10.1155/2008/843232
26. Y Huang, S Fu, C Thompson, Stereovision-based object segmentation for automotive applications. EURASIP J. Appl. Signal Process. **2005**(14), 2322–2329 (2005)
27. KITTI Vision Benchmark Suite. http://www.cvlibs.net/datasets/kitti/. Accessed 18 Jul 2015