5.1 Path planning realization
The path planning algorithm used in this paper is RRT*, which was introduced in detail in Chapter 2. The extend function of RRT* depends on the design of a cost function and a collision detection function. The specific choices for these two functions are described below.
5.1.1 The choice of cost function
The cost function represents the cost of moving from one point to another and is denoted by \(c\) in the extend algorithm. The choice of cost function directly affects the quality of the paths planned by RRT* and also has a great impact on the convergence rate of the algorithm. However, designing a cost function that suits the actual problem is difficult, no less so than designing the path planning algorithm itself. Common cost functions include the Euclidean distance between two points, the weighted sum of path lengths over different terrains, and energy consumption.
The simulation environment of this project uses a fully constrained car model on uniform terrain, so energy consumption does not need to be considered. Euclidean distance is therefore chosen as the cost function: it is easy to understand and can accelerate the convergence of the algorithm, as shown in Eq. 10. The total cost from the starting point to a given point follows as Eq. 11.
$$c(Line(x_{1} ,x_{2} )) = \left\| {x_{1} - x_{2} } \right\|$$
(10)
$${\text{Cost}}(x) = \left\{ {\begin{array}{*{20}l} {{\text{Cost}}(xParent) + c(Line(xParent,x)),} \hfill & {x \ne root} \hfill \\ {0,} \hfill & {x = root} \hfill \\ \end{array} } \right.$$
(11)
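As a minimal sketch of Eqs. 10 and 11, the snippet below computes the edge cost and the accumulated cost by walking from a node back to the root. The tree representation (a `parent` dictionary keyed by 2D points, with the root mapped to `None`) and the function names are assumptions made for illustration, not the data structure used in this project.

```python
import math


def line_cost(x1, x2):
    """Eq. 10: Euclidean distance between two 2D points."""
    return math.hypot(x2[0] - x1[0], x2[1] - x1[1])


def cost(node, parent):
    """Eq. 11: accumulated cost from the root to `node`.

    `parent` is a hypothetical dict mapping each node to its parent;
    the root maps to None, which gives Cost(root) = 0.
    """
    total = 0.0
    while parent[node] is not None:
        total += line_cost(parent[node], node)
        node = parent[node]
    return total


# Illustrative tree: root -> (3, 4) -> (3, 10)
parent = {(0.0, 0.0): None, (3.0, 4.0): (0.0, 0.0), (3.0, 10.0): (3.0, 4.0)}
total_cost = cost((3.0, 10.0), parent)
```

The loop is the iterative form of the recursion in Eq. 11: each step adds one edge cost \(c(Line(xParent,x))\) and moves to the parent until the root is reached.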
5.1.2 Vehicle and obstacle collision detection
In the RRT* algorithm, collision detection between the car and obstacles is required, and a reasonable collision detection algorithm is very important for driving safety. Common collision detection methods include the AABB (axis-aligned bounding box) method and the enclosing circle method [9]. This project uses the enclosing circle method because its calculation is relatively simple. The specific implementation is as follows:
Each irregular object is replaced by a circle of slightly larger area that encloses it, so collision detection between two objects reduces to collision detection between two circles. If the distance between the centers of the two circles is less than or equal to the sum of their radii, the two circles collide. Otherwise, there is no collision.
On this basis, an additional safety distance \(d\) is maintained between the car and the obstacle, so the boundary of the obstacle needs to be expanded: the safety distance is added to the original radius of the obstacle to give its new radius. Furthermore, the area covered by the car can also be expressed as a circle. If the radius of the car is also added to the radius of the obstacle, the car can be treated as a point, as shown in Fig. 9.
The RRT* algorithm needs to detect whether the vehicle trajectory collides with the surrounding obstacles. When the enclosing circle method is used for collision detection, the question becomes whether the trajectory of the car over the time step \(\Delta t\) intersects the circular area covered by an obstacle, as shown in Fig. 10.
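The radius expansion described above can be sketched as follows. The function names and the numeric radii in the usage lines are hypothetical; only the safety distance of 3 m is taken from this paper.

```python
import math


def inflated_radius(r_obstacle, r_car, d_safe):
    """Equivalent obstacle radius once the car is shrunk to a point."""
    return r_obstacle + r_car + d_safe


def point_collides(car_xy, obstacle_xy, r_equiv):
    """The point-car collides if it lies inside or on the inflated circle."""
    return math.dist(car_xy, obstacle_xy) <= r_equiv


# Illustrative values: 1.0 m obstacle, 0.5 m car, 3 m safety distance
r = inflated_radius(1.0, 0.5, 3.0)
hit = point_collides((0.0, 0.0), (4.0, 0.0), r)
clear = point_collides((0.0, 0.0), (5.0, 0.0), r)
```

Folding the car radius and the safety distance into the obstacle radius leaves a single distance comparison per obstacle, which is equivalent to the two-circle test with the safety margin included.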
To determine whether a line segment of length \(\Delta S\) intersects a circle, the following method is used.
Let \(\vec{e}\) be the unit direction vector of the line segment and \(\vec{n}\) its unit normal vector. The vector from the center of an obstacle to the left end of the line segment is \(\vec{p}_{1}\), and the vector to the right end is \(\vec{p}_{2}\). The equivalent radius of the obstacle is \(r\), as shown in Fig. 11.
If \(\left| {\vec{p}_{1} \cdot \vec{n}} \right| > r\), the line containing the segment does not meet the circle, so there is no intersection. If \(\left| {\vec{p}_{1} \cdot \vec{n}} \right| \le r\), the line containing the segment intersects the circle, but the segment itself may not. In that case, if either end point of the segment is inside the circle, the segment must intersect the circle. If both end points are outside the circle and \((\vec{p}_{1} \cdot \vec{e}) \times (\vec{p}_{2} \cdot \vec{e}) > 0\), both end points lie on the same side of the point on the line closest to the center, so the segment does not intersect the circle. Otherwise, the segment intersects the circle.
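The segment and circle test can be sketched in a few lines. This is an illustrative implementation of the three checks described above, with hypothetical function and argument names; the handling of a zero-length segment is an added assumption that the text does not cover.

```python
import math


def segment_hits_circle(a, b, center, r):
    """Return True if segment a-b touches or crosses the circle (center, r)."""
    # Vectors from the obstacle center to the two end points (p1, p2 in the text).
    p1 = (a[0] - center[0], a[1] - center[1])
    p2 = (b[0] - center[0], b[1] - center[1])
    length = math.hypot(b[0] - a[0], b[1] - a[1])
    if length == 0.0:
        # Assumption: a degenerate segment is treated as a single point.
        return math.hypot(*p1) <= r
    # Unit direction e and unit normal n of the segment.
    e = ((b[0] - a[0]) / length, (b[1] - a[1]) / length)
    n = (-e[1], e[0])
    # |p1 . n| is the distance from the center to the line through a and b.
    if abs(p1[0] * n[0] + p1[1] * n[1]) > r:
        return False
    # An end point inside or on the circle guarantees an intersection.
    if math.hypot(*p1) <= r or math.hypot(*p2) <= r:
        return True
    # Both end points outside: projections of equal sign mean the closest
    # point of the line to the center lies outside the segment.
    proj1 = p1[0] * e[0] + p1[1] * e[1]
    proj2 = p2[0] * e[0] + p2[1] * e[1]
    return proj1 * proj2 <= 0


crossing = segment_hits_circle((-2.0, 0.0), (2.0, 0.0), (0.0, 0.0), 1.0)
beside = segment_hits_circle((2.0, 0.0), (4.0, 0.0), (0.0, 0.0), 1.0)
```

Here `crossing` covers a segment that passes through the circle, while `beside` covers a segment whose supporting line meets the circle but whose end points both lie on the same side of it.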
5.1.3 Setting of safety distance \(d\) and \(\eta\)
The setting of the safety distance \(d\): The safety distance is the additional clearance that must be maintained between the vehicle and an obstacle, beyond simply not colliding. Its value needs to take into account the ability of the reinforcement learning algorithm to avoid static and moving obstacles. If the reinforcement learning algorithm is more sensitive to approaching obstacles, the safety distance should be as large as possible. If the reinforcement learning algorithm can reliably avoid obstacles at close range, the safety distance can be reduced appropriately. The safety distance also strongly influences the path chosen for the vehicle: if the value is too large, the path planning algorithm will choose a "safer" path rather than the shortest one. Based on the experiments, \(d = 3\;{\text{m}}\) is chosen here.
The setting of \(\eta\): \(\eta\) is the maximum distance between each newly generated point and its closest point on the random search tree. The larger \(\eta\) is, the faster the tree grows. However, if \(\eta\) is too large, the search ends quickly with sparse sampling points, and it is difficult to obtain the optimal path. Since this project uses reinforcement learning for path following, \(\eta = {1}\;{\text{m}}\) is selected by weighing the growth speed of the random search tree against the optimality of the path.
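One common way to enforce the bound \(\eta\) is a steering step that clips the random sample to a distance of at most \(\eta\) from its nearest tree node. The sketch below assumes 2D points as tuples; the function name `steer` and its signature are illustrative and are not taken from this project's code.

```python
import math


def steer(x_nearest, x_rand, eta=1.0):
    """Return a new point at most `eta` away from `x_nearest` toward `x_rand`."""
    dx = x_rand[0] - x_nearest[0]
    dy = x_rand[1] - x_nearest[1]
    dist = math.hypot(dx, dy)
    if dist <= eta:
        # The sample is already within reach, so it is used directly.
        return x_rand
    scale = eta / dist
    return (x_nearest[0] + scale * dx, x_nearest[1] + scale * dy)


x_new = steer((0.0, 0.0), (3.0, 4.0), eta=1.0)
```

With \(\eta = 1\;{\text{m}}\), consecutive nodes on the planned path are at most 1 m apart, which is what keeps the sampling points dense.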
5.2 Automatic driving obstacle avoidance scheme based on dynamic path following
5.2.1 Program overview
The automatic driving obstacle avoidance scheme based on dynamic path following consists of two parts: global path planning and a reinforcement learning obstacle avoidance algorithm. Global path planning has been introduced in 4.1 and 2.1.4. The reinforcement learning obstacle avoidance algorithm does not need to be retrained for the dynamic environment, so the network trained in Chapter 3 can be used directly for vehicle obstacle avoidance control.
The program first plans a path based on the global static obstacles and then uses the policy network trained by reinforcement learning to dynamically follow the global path while avoiding any static and dynamic obstacles that are encountered.
5.2.2 Dynamic following algorithm
The dynamic following algorithm dynamically updates the target point that the vehicle is currently following, so that the vehicle moves roughly along the globally planned path and avoids falling into a local optimum. There are many traditional path following algorithms, including the pure tracking (pure pursuit) method, the Stanley method, dynamic model tracking, and optimal predictive control. This project borrows the way the preview point is determined in the pure tracking method. First, the path point closest to the vehicle is found. A circle is then drawn with this path point as its center and the preview distance \(l_{d}\) as its radius, and the first point ahead of the path point that is not inside the circle is taken as the next target point to follow. In this experiment, the value \(\eta\) of the RRT* algorithm is 1 m. If \(l_{d} = \eta = 1\;{\text{m}}\) is set, the target is simply the next point on the path. This preview distance is greater than the wheelbase of the car, so it is a reasonable choice. Moreover, this project does not require the vehicle trajectory to conform strictly to the global path, so the setting here can be loose. The specific algorithm steps are as follows:

1. Select the point \(p\) closest to the current vehicle position on the global path.

2. If \(p\) is not the end point, set the node after \(p\) as the target point to be followed by the vehicle. If \(p\) is the end point, set \(p\) itself as the target point.
After executing this algorithm, the target point the vehicle follows is either the end point or the point on the path immediately after the closest one, as shown in Fig. 12.
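The two steps of the following algorithm can be sketched as below, assuming the global path is an ordered list of 2D points from start to end. The function name `select_target` and the sample path are hypothetical.

```python
import math


def select_target(path, vehicle_xy):
    """Pick the target point for the vehicle on an ordered global path."""
    # Step 1: index of the path point closest to the vehicle.
    nearest = min(range(len(path)), key=lambda i: math.dist(path[i], vehicle_xy))
    # Step 2: follow the next node, or the end point itself if already there.
    return path[min(nearest + 1, len(path) - 1)]


path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
target_mid = select_target(path, (1.2, 0.3))
target_end = select_target(path, (3.1, 0.0))
```

Because the target is recomputed from the current vehicle position at every call, a vehicle that deviates to avoid an obstacle picks up the path again from wherever it is closest, rather than returning to a fixed waypoint.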
5.3 Simulation environment and performance analysis
The simulation environment of the final test is a dynamic environment, including static and dynamic obstacles. The static obstacles are the same as those in the simulation environment in Chapter 3. Dynamic obstacles are newly added here.
5.3.1 Dynamic obstacle model
The dynamic obstacle adopts the bicycle model [22]. The kinematic bicycle model assumes that the described object is shaped like a bicycle, and its control input can be simplified to \((acc,\delta )\). Here, \(acc\) is the acceleration: pressing the accelerator pedal gives positive acceleration, and pressing the brake pedal gives negative acceleration. \(\delta\) is the steering wheel angle. Because the front wheel angle is approximately proportional to the steering wheel angle, the steering wheel angle can be taken as the current angle of the front tires. In this experiment, the speed of the dynamic obstacle is constant at 2 m/s, so the only control input for the dynamic obstacle is \(\delta\). The trajectory of the dynamic obstacle over each time step is approximately an arc, as shown in Fig. 13. Here, \(R\) is the radius of curvature of the trajectory at the rear wheel, and \(L\) is the wheelbase, which is set to 0.8 m in this project.
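A minimal sketch of one arc-shaped update is given below. It uses the standard kinematic bicycle relation \(R = L/\tan \delta\) for the rear wheel, with the speed of 2 m/s and wheelbase of 0.8 m from this section. The time step `dt`, the state layout `(x, y, theta)`, and the function name are assumptions made for illustration.

```python
import math


def bicycle_step(x, y, theta, delta, v=2.0, L=0.8, dt=0.1):
    """Advance the rear-wheel pose by one time step at constant speed.

    dt is an assumed value; the paper does not state the time step here.
    """
    if abs(math.tan(delta)) < 1e-9:
        # Zero steering angle: the arc degenerates to a straight line.
        return x + v * dt * math.cos(theta), y + v * dt * math.sin(theta), theta
    R = L / math.tan(delta)   # signed turning radius of the rear wheel
    dtheta = v * dt / R       # heading change over the arc of length v * dt
    x_new = x + R * (math.sin(theta + dtheta) - math.sin(theta))
    y_new = y - R * (math.cos(theta + dtheta) - math.cos(theta))
    return x_new, y_new, theta + dtheta


# Quarter circle of radius 1 m: tan(delta) = 0.8 gives R = 1 m
pose = bicycle_step(0.0, 0.0, 0.0, math.atan(0.8), dt=math.pi / 4)
```

The closed-form arc is exact for a constant steering angle over the step, so the position error does not grow with `dt` the way a simple Euler update would.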
During its movement, a dynamic obstacle may collide with another moving obstacle, a stationary obstacle, or the boundary. In this project, all such collisions are assumed to be fully elastic collisions without rotation.
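For the simplest case, a dynamic obstacle hitting a stationary obstacle or the boundary, a fully elastic collision without rotation reverses the velocity component along the contact normal and keeps the tangential component: \(\vec{v}' = \vec{v} - 2(\vec{v} \cdot \vec{n})\vec{n}\). The sketch below illustrates only this case and assumes the contact normal is known; collisions between two moving obstacles additionally depend on their masses, which are not specified here.

```python
import math


def reflect(velocity, normal):
    """Velocity after an elastic, rotation-free bounce off a fixed surface."""
    vx, vy = velocity
    nx, ny = normal
    norm = math.hypot(nx, ny)
    nx, ny = nx / norm, ny / norm   # normalise the contact normal
    dot = vx * nx + vy * ny         # velocity component along the normal
    return (vx - 2.0 * dot * nx, vy - 2.0 * dot * ny)


# Obstacle moving at 2 m/s in +x hits a wall whose normal points in -x
v_after = reflect((2.0, 0.0), (-1.0, 0.0))
```

The speed is unchanged by the reflection, which is consistent with the constant 2 m/s assumed for the dynamic obstacles.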