In this section, we introduce a novel premise parameter identification method for the TS fuzzy model in detail. First, the FCM algorithm is used to initialize the input-output space, decompose the input space into c fuzzy subspaces, and determine the clustering center of each fuzzy subspace. Second, the centers of the fuzzy subspaces obtained in the first step are substituted into the Gaussian membership functions. Third, the PSO algorithm is used to optimize the width of each Gaussian function, and thus determine the membership function, while the center of the Gaussian function is kept unchanged. The center and width of the Gaussian function are not easy to determine, which is why they are tuned in these separate steps. Finally, the RLS method is used to identify the consequent parameters. The identification model is then obtained; the flow diagram of the method is shown in Fig. 1.
The key problem of the new modeling method proposed in this paper lies in the application of Gaussian membership function and how to quickly optimize its two parameters, center and width. These are discussed in detail in this section.
4.1 A novel premise parameter identification method based on FCM and PSO
In this part, we elaborate on how the traditional FCM clustering algorithm and the PSO algorithm are used to determine the parameters of the Gaussian function. The FCM algorithm provides a rough tuning result, and the PSO algorithm then performs fine-tuning. The methods used to form the fuzzy sets are all conventional algorithms with a simple structure, which helps make the identification of the premise parameters more concise and effective.
4.1.1 Determination of center of Gaussian membership function by FCM
The FCM algorithm [10] can be expressed as minimizing the following objective function:
$$ J_{m}(U,v)=\sum\limits_{j=1}^{n}\sum\limits_{i=1}^{c}\left(\mu_{ij}\right)^{m}\left(d_{ij}\right)^{2} $$
(4)
satisfying
$$ \sum\limits_{i=1}^{c}\mu_{ij}=1, 1\leq j\leq n, \mu_{ij}\geq 0, 1\leq i\leq c $$
(5)
where n is the input variable dimension and c is the number of cluster centers. m>1 is the weighting exponent of the membership function. If m is too small, the memberships of the input variables are all close to 1, which degrades the identification accuracy; if m is too large, the membership functions overlap too much, which also degrades the identification accuracy. In practice, m=2 is often used. U is the fuzzy partition matrix containing the membership of each feature vector in each cluster. z is the set of cluster centers, z={z_{1},z_{2},…,z_{c}}, z_{i}∈R^{n}. The cluster centers can be calculated according to formula (6):
$$ z_{i}=\sum\limits_{j=1}^{n}\left(\mu_{ij}\right)^{m}x_{j}/\sum\limits_{j=1}^{n}\left(\mu_{ij}\right)^{m}, \forall i $$
(6)
The fuzzy membership function matrix U can be obtained by the following formulas:
$$ \mu_{ij}=1/\sum\limits_{k=1}^{c}\left(\frac{d_{ij}}{d_{kj}}\right)^{2/(m-1)} $$
(7)
$$ d_{ij}=\|x_{j}-z_{i}\|>0, \forall i,j $$
(8)
If d_{ij}=0, then μ_{ij}=1 and μ_{kj}=0 for all k≠i.
The initial value of the FCM center matrix z is chosen at random; the fuzzy partition matrix U is then calculated with formula (7) for all feature vectors. z is initialized by randomly selecting the feature values of each cluster center (z_{ij}) from within the range of the given feature data. The stopping condition is controlled by the threshold ε, which is set according to the user's needs.
The offline calculation procedure is as follows:

(1)
A random number generator is used to assign initial values to the cluster center matrix z; the cluster centers are recorded, and k is set to 0;

(2)
The initial value of the fuzzy partition matrix U^{(k=0)} is calculated by using Eqs. (7) and (8);

(3)
Increase k so that k=k+1, and use Eq. (6) to update cluster center z;

(4)
Equations (7) and (8) are used to renew the fuzzy partition matrix U^{(k)};

(5)
If ∥U^{(k)}−U^{(k−1)}∥<ε is satisfied, the calculation stops; otherwise, repeat steps (3)–(5).
The center of Gaussian function (the clustering center) can be obtained from the above steps.
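The offline procedure in steps (1)–(5) can be sketched in Python with NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function name `fcm` and the parameters `z0`, `max_iter`, and `seed` are assumptions introduced here, the data are stored as one sample per row, and a small floor on d_{ij} stands in for the special case d_{ij}=0.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=100, z0=None, seed=0):
    """Fuzzy c-means sketch. X: (N, dim) samples. Returns centers z and U."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step (1): initial centers, drawn at random from the data unless
    # the caller supplies z0 (z0 is a convenience assumed for this sketch).
    if z0 is None:
        z = X[rng.choice(len(X), size=c, replace=False)]
    else:
        z = np.asarray(z0, dtype=float)

    def partition(z):
        # Eq. (8): d[i, j] = ||x_j - z_i||
        d = np.linalg.norm(z[:, None, :] - X[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # floor replaces the d_ij = 0 special case
        # Eq. (7): mu_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1))
        ratio = (d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0))
        return 1.0 / ratio.sum(axis=1)

    U = partition(z)  # Step (2)
    for _ in range(max_iter):
        Um = U ** m
        z = (Um @ X) / Um.sum(axis=1, keepdims=True)  # Step (3), Eq. (6)
        U_new = partition(z)                          # Step (4)
        converged = np.linalg.norm(U_new - U) < eps   # Step (5)
        U = U_new
        if converged:
            break
    return z, U
```

The rows of the returned `z` would then serve as the centers of the Gaussian membership functions, and each column of `U` sums to 1, as required by constraint (5).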
4.1.2 Optimization of the width of Gaussian membership function by PSO
In 1995, Kennedy et al. proposed the PSO algorithm [24], a heuristic global optimization algorithm that combines the advantages of evolutionary computation and swarm intelligence. In this paper, PSO is used to optimize the width of the Gaussian function and thereby fine-tune the fuzzy partition of the premise parameters to obtain higher modeling accuracy. When the width parameter is optimized, the mean square error (MSE) (formula (19)) is used as the objective function that the PSO algorithm minimizes in its global search for the best particle position.
The PSO algorithm is briefly described as follows. Let N particles search a D-dimensional space. The position of the kth particle is B_{k}=(b_{k1},b_{k2},…,b_{kD}), and its velocity is V_{k}=(v_{k1},v_{k2},…,v_{kD}). Each particle is a candidate solution to the optimization problem, and it finds new solutions by repeatedly updating its position and velocity. The best solution found so far by the kth particle is P_{k}=(p_{k1},p_{k2},…,p_{kD}), and the best position found by the whole swarm is P_{g}=(p_{g1},p_{g2},…,p_{gD}). The velocity and position of each particle are updated according to Eqs. (9) and (10):
$$ \begin{aligned} v_{kd}(t+1)=&\omega v_{kd}(t)+c_{1}r_{1}\left(p_{kd}(t)-b_{kd}(t)\right)\\ &+c_{2}r_{2}\left(p_{gd}(t)-b_{kd}(t)\right) \end{aligned} $$
(9)
$$ b_{kd}(t+1)=b_{kd}(t)+v_{kd}(t+1) $$
(10)
where r_{1} and r_{2} are random numbers in [0,1]; c_{1} and c_{2} are positive constants, called acceleration coefficients (learning factors); and ω is the inertia weight. In dimension d, the velocity and position of each particle are limited to [−v_{d,max},v_{d,max}] and [−x_{d,max},x_{d,max}], respectively. If the maximum velocity v_{d,max} is too high, the particle may fly past the best solution; if it is too low, the search becomes slow and may become trapped in a local optimum. The inertia weight ω controls the search range of the particles: when ω is large, particles explore a wide range; when ω is small, they search within a small range. When the PSO algorithm is used to optimize the width of the Gaussian function, the learning factors c_{1} and c_{2} are both set to 2, and the inertia weight ω is updated by the following formula:
$$ \omega=\omega_{\text{min}}+DT\cdot \frac{\omega_{\text{max}}-\omega_{\text{min}}}{\text{max}DT} $$
(11)
where DT is the current iteration number, maxDT=100 is the maximum number of iterations, and ω_{min}=0.4, ω_{max}=0.9.
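The update rules (9)–(11) can be sketched as a small NumPy routine. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `pso`, the search box `[lo, hi]`, the swarm size, and the velocity limit of 20% of the box width are all choices made here for illustration. In the paper the objective would be the model MSE as a function of the Gaussian widths; any callable that maps a position vector to a scalar cost can be passed in.

```python
import numpy as np

def pso(objective, dim, lo, hi, n_particles=20, max_dt=100,
        c1=2.0, c2=2.0, w_min=0.4, w_max=0.9, seed=0):
    """Minimize objective over [lo, hi]^dim. Returns (best position, cost)."""
    rng = np.random.default_rng(seed)
    v_max = 0.2 * (hi - lo)  # assumed velocity limit, not given in the text
    b = rng.uniform(lo, hi, size=(n_particles, dim))          # positions B_k
    v = rng.uniform(-v_max, v_max, size=(n_particles, dim))   # velocities V_k
    p = b.copy()                                              # bests P_k
    p_cost = np.array([objective(x) for x in b])
    g = p[np.argmin(p_cost)].copy()                           # swarm best P_g
    g_cost = p_cost.min()
    for dt in range(1, max_dt + 1):
        # Eq. (11), as printed: weight moves from w_min toward w_max
        w = w_min + dt * (w_max - w_min) / max_dt
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Eq. (9): velocity update, then clamp to [-v_max, v_max]
        v = w * v + c1 * r1 * (p - b) + c2 * r2 * (g - b)
        v = np.clip(v, -v_max, v_max)
        # Eq. (10): position update, kept inside the search box
        b = np.clip(b + v, lo, hi)
        cost = np.array([objective(x) for x in b])
        better = cost < p_cost
        p[better] = b[better]
        p_cost[better] = cost[better]
        if p_cost.min() < g_cost:
            g_cost = p_cost.min()
            g = p[np.argmin(p_cost)].copy()
    return g, g_cost
```

For width optimization, `dim` would equal the number of Gaussian widths being tuned, and `objective` would rebuild the membership functions from a candidate width vector, identify the consequent parameters, and return the resulting MSE.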
According to the above methods, the optimal widths of Gaussian membership function are obtained. The new premise parameter identification method can be specifically described as:

1)
Determine the number of input variables r, and make a fuzzy division of each input space (determine c). Initialize the center and width of the Gaussian.

2)
The FCM algorithm is used to determine the centers of the Gaussian functions. First, the FCM algorithm automatically obtains the initial cluster centers of the dataset; these are then optimized step by step, and the final cluster centers are taken as the centers of the Gaussian functions. This algorithm is not sensitive to the initial value. On this basis, the width of the Gaussian membership function is determined by the PSO algorithm: the initial value is set to 0.4 based on experience and is then optimized gradually.

3)
Under the condition that the center is determined and unchanged, PSO intelligent optimization algorithm is used to optimize the width of Gaussian function, and a relatively ideal membership function is finally obtained.
4.2 Consequent parameter identification
Once the premise parameters have been identified, the consequent parameters are identified.
The output of the system can be expressed as:
$$ y=\sum\limits_{i=1}^{c}\omega_{i}y_{i}/\sum\limits_{i=1}^{c}\omega_{i} $$
(12)
$$ \begin{aligned} &\omega_{i}=\prod\limits_{k\in I}{\mu_{A_{kj}}(x_{k})}\\ &I=\{1,2,\ldots,n\}, i=1,2,\ldots,c \end{aligned} $$
(13)
where x_{k} is the kth input variable of the fuzzy model; \(\mu _{A_{kj}}\) is the membership function of the jth fuzzy subset of variable x_{k}, which is obtained from the preceding fuzzy partition; y_{i} is the output of rule i; and \(\prod \) is a fuzzy operator, usually the minimum operation.
Define
$$ \overline{\omega_{i}}=\omega_{i}/\sum\limits_{m=1}^{c}\omega_{m} $$
(14)
so the output of the fuzzy system is:
$$ \begin{aligned} y&=\sum\limits_{i=1}^{c}\overline{\omega_{i}}y_{i}\\ &=\sum\limits_{i=1}^{c}\overline{\omega_{i}}\left(p_{0}^{i}+p_{1}^{i}x_{1}+p_{2}^{i}x_{2}+\ldots+p_{n}^{i}x_{n}\right)\\ &=\left[\begin{array}{ccccccccc}\overline{\omega_{1}} & \overline{\omega_{1}}x_{1} & \ldots & \overline{\omega_{1}}x_{n} & \ldots & \overline{\omega_{c}} & \overline{\omega_{c}}x_{1} & \ldots & \overline{\omega_{c}}x_{n} \end{array}\right]\\ &\quad\times\left[\begin{array}{ccccccccc}p_{0}^{1} & p_{1}^{1} & \ldots & p_{n}^{1} & \ldots & p_{0}^{c} & p_{1}^{c} & \ldots & p_{n}^{c}\end{array}\right]^{T} \end{aligned} $$
(15)
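As an illustration of Eqs. (13)–(15), the row vector that multiplies the consequent parameters can be assembled for a single input sample as sketched below. The function name `regressor_row` and the array layout are assumptions made for this sketch. The Gaussian form exp(−(x−c)²/(2σ²)) is also an assumption, since the membership function is defined elsewhere in the paper and is not reproduced in this section.

```python
import numpy as np

def regressor_row(x, centers, widths, use_min=False):
    """Row vector of Eq. (15) for one input sample x of shape (n,).

    centers, widths: arrays of shape (c, n), one Gaussian membership
    function per rule and per input variable.
    """
    x = np.asarray(x, dtype=float)
    # Assumed Gaussian membership form (not taken from this section).
    mu = np.exp(-((x - centers) ** 2) / (2.0 * widths ** 2))  # shape (c, n)
    # Eq. (13): firing strength of each rule, by product or by minimum.
    w = mu.min(axis=1) if use_min else mu.prod(axis=1)
    # Eq. (14): normalized firing strengths.
    w_bar = w / w.sum()
    # Eq. (15): [w1, w1*x1, ..., w1*xn, ..., wc, wc*x1, ..., wc*xn]
    x_ext = np.concatenate(([1.0], x))
    return np.outer(w_bar, x_ext).ravel()
```

Stacking one such row per training sample gives a matrix with N rows and (n+1)c columns, in the same column order as the parameter vector in Eq. (15).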
Substituting N pairs of input and output data into (15) gives the matrix equation
$$ Y=XP $$
(16)
where P is the L=(r+1)c-dimensional consequent parameter vector, and Y and X are matrices of size N×1 and N×L, respectively. r is the number of input variables, and c is the number of fuzzy rules. P^{∗}=(X^{T}X)^{−1}X^{T}Y is the least squares estimate of P. To optimize the consequent parameter vector P iteratively and avoid matrix inversion, the recursive least squares algorithm is adopted here. If the ith row vector of X is X_{i} and the ith component of Y is y_{i}, then the recursive algorithm is:
$$ P_{i+1}=P_{i}+\frac{S_{i}\cdot X_{i+1}^{T}\cdot\left(y_{i+1}-X_{i+1}\cdot P_{i}\right)}{1+X_{i+1}\cdot S_{i}\cdot X_{i+1}^{T}} $$
(17)
$$ \begin{aligned} &S_{i+1}=S_{i}-\frac{S_{i}\cdot X_{i+1}^{T}\cdot X_{i+1}\cdot S_{i}}{1+X_{i+1}\cdot S_{i}\cdot X_{i+1}^{T}}\\ &i=0,1,\ldots,N-1 \end{aligned} $$
(18)
The initial conditions are P_{0}=0 and S_{0}=αI, where α is a large positive number, generally greater than 10,000, and I is the L×L identity matrix. Formulas (17) and (18) are used to calculate the consequent parameters that are optimal in the least-squares sense; after the recursion terminates, the consequent parameters and the mean square error (MSE) are output.
$$ MSE=\sum\limits_{i=1}^{N}\left(y_{i}-\widehat{y_{i}}\right)^{2}/N $$
(19)
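The recursion and the MSE criterion can be sketched in NumPy as follows. This is a minimal sketch of the standard recursive least squares update with the initial conditions P_{0}=0 and S_{0}=αI. The function names `rls` and `mse` and the default `alpha=1e5` are assumptions chosen for illustration; the default only needs to satisfy the "greater than 10,000" guideline.

```python
import numpy as np

def rls(X, Y, alpha=1e5):
    """Recursive least squares estimate of P for Y = X P.

    X: (N, L) regressor matrix, Y: (N,) outputs.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    L = X.shape[1]
    P = np.zeros(L)        # P_0 = 0
    S = alpha * np.eye(L)  # S_0 = alpha * I
    for x, y in zip(X, Y):
        Sx = S @ x                 # S_i x^T (S stays symmetric)
        denom = 1.0 + x @ Sx       # 1 + x S_i x^T
        # Parameter update driven by the prediction error y - x P_i
        P = P + Sx * (y - x @ P) / denom
        # Covariance update; uses x S_i = (S_i x^T)^T by symmetry
        S = S - np.outer(Sx, Sx) / denom
    return P

def mse(X, Y, P):
    """Mean square error between the outputs Y and the predictions X P."""
    return float(np.mean((np.asarray(Y) - np.asarray(X) @ P) ** 2))
```

Because each sample is processed once and no L×L matrix is inverted, the cost per sample is O(L²). With a large α the result closely approximates the batch estimate P^{∗}=(X^{T}X)^{−1}X^{T}Y.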
The complete fuzzy identification algorithm proposed in this paper is as follows:

(1)
Determine the number of input variables r, and conduct fuzzy division of each input space (determine c);

(2)
Calculate the premise parameters \(\mu _{A_{ij}}(x_j)\) according to Eq. (2) of this paper;

(3)
Obtain X from Eq. (15);

(4)
Obtain P by using Eqs. (17) and (18);

(5)
Calculate the performance indicator MSE. If its value is less than the threshold or remains unchanged over two consecutive iterations, go to step 6. Otherwise, go to step 4;

(6)
If the MSE satisfies the required identification accuracy, the identification is terminated; if not, increase c and go to step 2.