Efficient and accurate image alignment using TSK-type neuro-fuzzy network with data-mining-based evolutionary learning algorithm
EURASIP Journal on Advances in Signal Processing volume 2011, Article number: 96 (2011)
Abstract
Image alignment is a key problem in visual inspection applications. The main concern in such tasks is fast image alignment with subpixel accuracy. In this regard, neural-network-based approaches are very popular in visual inspection because they align images accurately and efficiently. However, it is difficult to identify the structure and parameters of the neural network in such methods. In this study, a Takagi-Sugeno-Kang-type neuro-fuzzy network (NFN) with a data-mining-based evolutionary learning algorithm (DMELA) is proposed. Compared with traditional learning algorithms, DMELA combines the self-organization algorithm (SOA), the data-mining selection method (DMSM), and the regularized least square (RLS) method not only to determine a suitable number of fuzzy rules, but also to tune the parameters of the NFN automatically. Experimental results demonstrate the superior performance of the DMELA-constructed image alignment system over other typical learning algorithms and existing alignment systems. Such a system is useful for developing accurate and efficient image alignment systems.
1. Introduction
Accurate and efficient image alignment is widely used in industrial applications such as automatic visual inspection, factory automation, and robotic machine vision. Among them, visual inspection usually requires finding a geometric transformation to align images. More specifically, the geometric transformation is commonly an affine transformation, which consists of scaling, rotation, and translation. An affine transformation is therefore of great importance in designing image alignment systems, and estimating it efficiently is a challenge. Neural-network-based methods have become widespread in addressing this challenge because they feed global features of the inspected images into a trained neural network to estimate the affine transformation parameters [1–4]. Thus, there is a need to develop a neural-network-based image alignment system that demonstrates high performance [5]. To this end, the aim of this study is to design a learning algorithm to train a neural network that can estimate affine parameters precisely.
To achieve this aim, this study adopts weighted gradient orientation histograms (WGOH) [6] as an image descriptor, which extracts features from the inspected images to serve as the input of the neural network. This representation has been shown to be a good descriptor in several studies [7, 8]. We then propose a novel learning algorithm to improve the robustness of neural networks. More specifically, the proposed learning algorithm combines the self-organization algorithm (SOA), the data-mining selection method (DMSM), and the regularized least square (RLS) method to identify the structure and parameters of the network automatically. Once our learning method is applied, the structure of the network is variable rather than fixed. Moreover, automatically tuning the parameters of the network explores a more dynamic search space than a heuristic approach does. In other words, the structure and parameters of the neural network become more robust. The major contribution of this study is that the proposed learning method helps to develop efficient image alignment systems by automatically tuning the systems' structure and parameters.
The rest of this article is organized as follows. Section 2 reviews related studies. Section 3 introduces the proposed methodology for automatically aligning industrial images. The experimental results are presented in Section 4. Section 5 describes a conceptual framework for developing image alignment systems. The conclusion is given in the last section.
2. Related studies
The problem of precisely aligning images has been well studied in several fields, and the related literature has been reviewed on several occasions [9, 10]. In brief, prior alignment methods can be classified as feature-based and area-based methods [9, 11]. Zitova and Flusser [10] pointed out that area-based methods are preferable for images that do not contain many details. Moreover, Amintoosi et al. [11] indicated that when the signal-to-noise ratio (SNR) is low, area-based methods produce better results than feature-based methods. In this study, we assume that the proposed image alignment system is developed for industrial inspection tasks, in which the captured images usually have little detail. Thus, area-based methods that adopt global descriptors are recommended in this article.
Recently, neural-network-based image alignment using global features has become a relatively new research subject. Such methods achieve high alignment speed because they only need to feed the extracted feature vectors into the trained neural network to estimate the transformation parameters. For example, Ethanany et al. [1] presented a feedforward neural network (FNN) that aligns images using 144 discrete cosine transform (DCT) coefficients as the feature vector. Their study showed that the FNN is highly tolerant of deformed and noisy images. Building on this FNN research, Wu and Xie [2] replaced the DCT with low-order Zernike moments to improve on Ethanany's study, which needed a feature vector of larger dimension to represent an image sufficiently because the DCT-based space is not orthogonal. Their results showed that the method reduces the dimension of the feature vector, but the alignment results are not satisfactory. More recently, Xu and Guo [3] adopted isometric mapping (ISOMAP) to reduce the dimension of the feature vector. Their study demonstrated that ISOMAP can drastically reduce this dimension and thereby improve computational efficiency. Nevertheless, overfitting can occur when an FNN is overtrained on the training set. An overtrained FNN may then handle unseen patterns poorly because it does not generalize well. To address this problem, Xu and Guo [12] used a Bayesian regularization method to improve the generalization capability of the FNN. Their comparative experiments showed that the FNN with regularization indeed performed better than the FNN without it.
The aforementioned studies indicated that the FNN helps to improve alignment efficiency. However, these methods used the steepest descent technique to minimize the error function, so they may become trapped in a local minimum. In addition, a large number of iterations is needed to minimize the error function, and several training attempts are needed to obtain a robust FNN. In this respect, evolutionary algorithms appear to be better candidates than the steepest descent method [13–15]. Because such learning methods perform a global and parallel search, they have a better chance of converging toward the global solution. Therefore, training neural networks with evolutionary algorithms has become an important field.
Several evolutionary algorithms have been proposed for this purpose [16–18]. Gomez and Schmidhuber [16] proposed enforced subpopulations, which use subpopulations of neurons for fitness evaluation and overall control. Subpopulations that evaluate the solution locally can obtain better performance than systems that use only one population to evaluate the solution. Moriarty and Miikkulainen [17] used a symbiotic evolution method to train a neural network and indicated that symbiotic evolution performed better than traditional genetic algorithms. Recently, Hsu et al. [18] proposed a multi-groups cooperation-based symbiotic evolution (MGCSE) to train a Takagi-Sugeno-Kang (TSK)-type neuro-fuzzy network (TNFN). Their results showed that MGCSE obtains better performance and convergence than symbiotic evolution. Although MGCSE is a good approach for training a TNFN, it is not well suited to image alignment tasks. The reason is that the input dimension of the network is high and the number of hidden nodes is not small, so a large number of parameters must be trained. For instance, in the experiments described in this article, the input and output dimensions of the network are 33 and 4, respectively, and the number of fuzzy rules is 25. Thus, in MGCSE's model, the total number of parameters is 5050 (r × (2n + m × (n + 1)), with r = 25, n = 33, and m = 4). Such a large number makes it difficult for the algorithm to converge to the optimal solution and leads to poor image alignment results. In addition, MGCSE constructs a network by combining groups at random. Although this sustains diversity, there is no systematic way to identify suitable groups for selecting chromosomes, which can result in a slow rate of convergence.
To this end, this study proposes a TNFN with a data-mining-based evolutionary learning algorithm (DMELA) to solve the above-mentioned problems. First, DMELA encodes only the antecedent part of a TSK-type fuzzy rule into a chromosome and uses RLS to estimate the consequent part. This combination not only reduces the number of parameters that must be trained, but also increases the convergence speed. Second, DMSM is used to mine association rules that identify suitable and unsuitable groups for chromosome selection, which solves the random group combination problem of MGCSE. Finally, the SOA is used to decide the suitability of different numbers of fuzzy rules and thus to construct the structure of neuro-fuzzy networks (NFNs) automatically. In short, DMELA benefits both the structure and the parameter learning of a TNFN, and together with the WGOH descriptor it provides a framework for developing accurate and efficient image alignment systems.
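The effect of encoding only the antecedent part can be checked with a short calculation. The sketch below reproduces the parameter count quoted for MGCSE and, assuming each antecedent holds one mean and one width per input (2n values per rule), counts how many parameters would remain for the evolutionary search. The function names are illustrative and do not come from the original work.

```python
def mgcse_param_count(r: int, n: int, m: int) -> int:
    """Parameters evolved when a whole TSK rule is encoded per chromosome.

    Each rule holds 2*n antecedent values (mean and width per input)
    and m*(n + 1) consequent weights.
    """
    return r * (2 * n + m * (n + 1))


def antecedent_param_count(r: int, n: int) -> int:
    """Parameters evolved when only the antecedent part is encoded
    (assumption: 2*n values per rule; consequents are solved separately)."""
    return r * 2 * n


r, n, m = 25, 33, 4
print(mgcse_param_count(r, n, m))    # 5050
print(antecedent_param_count(r, n))  # 1650
```

Under this assumption, the remaining 3400 consequent weights are obtained from a least-squares solve rather than from the evolutionary search.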
3. Methodology
The flow chart of the proposed image alignment algorithm, which consists of a learning phase and an executing phase, is illustrated in Figure 1. During the learning phase, synthesized training images are created by applying affine transformations with randomly selected parameters to the reference image, and the WGOH descriptor is then used to represent these training images as feature vectors. Finally, the feature vectors and desired targets are employed to train a TNFN using DMELA. During the executing phase, the sensed image is sent to the WGOH descriptor to extract a feature vector, which is then fed into the DMELA-trained TNFN to estimate the affine transformation parameters.
3.1. Synthesized training images
Image alignment can be viewed as a mapping between two images by means of a geometric transformation. Typically, the affine transformation, which is composed of translation, rotation, and scaling, is the most commonly used type. This article adopts the affine transformation as the transformation model, which can be described by the following matrix equation:
where (x_{1}, y_{1}) is the original image coordinate, (x_{2}, y_{2}) is the transformed image coordinate, s is a scaling factor, (Δx, Δy) is a translation vector, θ is a rotation angle, and (x_{ c }, y_{ c }) is the center of rotation. Thus, the synthesized training images can be generated by applying various combinations of translation, rotation, and scaling within a predefined range.
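A minimal sketch of such a transformation is given below. It assumes one common form that is consistent with the symbols defined above (rotate about the center, scale, then translate); the sign convention and the order of operations in the original Equation 1 may differ. The helper `random_affine_params` and its range arguments are hypothetical and only illustrate how training parameters could be drawn within a predefined range.

```python
import math
import random


def affine_transform(x1, y1, s, theta_deg, dx, dy, xc, yc):
    """Map (x1, y1) to (x2, y2).

    Assumed form:
      [x2]     [cos t  -sin t] [x1 - xc]   [xc + dx]
      [y2] = s [sin t   cos t] [y1 - yc] + [yc + dy]
    """
    t = math.radians(theta_deg)
    ux, uy = x1 - xc, y1 - yc
    x2 = s * (math.cos(t) * ux - math.sin(t) * uy) + xc + dx
    y2 = s * (math.sin(t) * ux + math.cos(t) * uy) + yc + dy
    return x2, y2


def random_affine_params(rng, s_range, theta_range, shift_range):
    """Draw one parameter set uniformly within the given (hypothetical) ranges."""
    return {
        "s": rng.uniform(*s_range),
        "theta_deg": rng.uniform(*theta_range),
        "dx": rng.uniform(*shift_range),
        "dy": rng.uniform(*shift_range),
    }


rng = random.Random(0)
params = random_affine_params(rng, (0.9, 1.1), (-10.0, 10.0), (-5.0, 5.0))
print(affine_transform(12.0, 7.0, xc=10.0, yc=10.0, **params))
```

With this form, the center of rotation is only shifted by the translation vector, which makes the four parameters easy to interpret as regression targets.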
3.2. WGOH descriptor
The WGOH has been shown to be a good descriptor by the global feature selection approach (GFSA) presented in our previous research [7]. It was compared with five other global descriptors, and the results showed that WGOH performed best. Therefore, this article adopts WGOH as the descriptor to represent inspected images.
The WGOH descriptor was inspired by the scale-invariant feature transform (SIFT) descriptor [19] and was presented by Bradley et al. [20], who showed its high speed. The main idea of WGOH is to calculate the orientation histograms within a region and to weight them by the gradient magnitude at each pixel and a 2D Gaussian function [6]. Representing an image with the WGOH descriptor involves four steps:

1. For each image, a template window located at the center of the image is captured as the region from which features are extracted. The length and width of the window are each divided into four equal parts to form a 4 × 4 grid. Each grid cell is considered a subimage, so the template window is split into 4 × 4 subimages.

2. At each pixel of the subimage I(x, y), the gradient magnitude m(x, y) and the orientation θ(x, y) are computed from pixel differences as
$$m\left(x,y\right)=\sqrt{{\left(I\left(x+1,y\right)-I\left(x-1,y\right)\right)}^{2}+{\left(I\left(x,y+1\right)-I\left(x,y-1\right)\right)}^{2}},$$ (2)
$$\theta \left(x,y\right)={\tan}^{-1}\left(\frac{I\left(x,y+1\right)-I\left(x,y-1\right)}{I\left(x+1,y\right)-I\left(x-1,y\right)}\right).$$ (3)

3. The 8-bin orientation histogram (each bin covers 45°) of each subimage is calculated, weighted by the gradient magnitude and the Gaussian function.

4. The 8-bin histograms of the 16 subimages are concatenated into a 128-element feature vector, which is normalized to unit length. To reduce the influence of strong gradient magnitudes, the elements of the feature vector are limited to 0.2, and the vector is normalized again.
Consequently, each image is represented by a 128-element feature vector. Figure 2 illustrates an example of the WGOH computation steps. Because the dimension of the 128-element feature vector is still too high for training a TNFN, a dimensionality reduction method is required. According to [21], a genetic algorithm outperformed principal component analysis and linear discriminant analysis in a speaker recognition case. Thus, in our image alignment case, we adopted the genetic algorithm method described in [22] to reduce the 128-element feature vector to a 33-element feature vector in the experimental section.
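The four WGOH steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the window size (`win`), the Gaussian width (`sigma = win / 2`), and centering the Gaussian on the window are all assumed, and `atan2` is used so that the eight 45° bins cover the full circle. The image is a list of rows of gray levels, and the window needs a one-pixel margin for the central differences.

```python
import math


def _normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else list(v)


def wgoh(image, win=16, bins=8, grid=4, clip=0.2):
    """Sketch of a WGOH descriptor; returns grid*grid*bins values (128 by default)."""
    h, w = len(image), len(image[0])
    top, left = (h - win) // 2, (w - win) // 2   # Step 1: centered template window
    cell = win // grid                           # side length of one subimage
    sigma = win / 2.0                            # assumed Gaussian width
    c = (win - 1) / 2.0                          # window center
    hist = [0.0] * (grid * grid * bins)
    for y in range(win):
        for x in range(win):
            iy, ix = top + y, left + x
            # Step 2: central pixel differences, magnitude and orientation.
            gx = image[iy][ix + 1] - image[iy][ix - 1]
            gy = image[iy + 1][ix] - image[iy - 1][ix]
            mag = math.hypot(gx, gy)
            ang = math.atan2(gy, gx) % (2 * math.pi)
            b = min(int(ang / (2 * math.pi / bins)), bins - 1)
            # Step 3: weight the vote by magnitude and a 2D Gaussian.
            gauss = math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
            cell_idx = (y // cell) * grid + (x // cell)
            hist[cell_idx * bins + b] += mag * gauss
    # Step 4: normalize, limit each element to `clip`, normalize again.
    return _normalize([min(v, clip) for v in _normalize(hist)])


# Usage: a horizontal intensity ramp votes only into the first bin of every cell.
ramp = [[float(x) for x in range(18)] for _ in range(18)]
descriptor = wgoh(ramp)
print(len(descriptor))
```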
3.3. Structure of TNFN
In general, the three typical types of NFN are the TSK type, the Mamdani type, and the singleton type. The authors of [23, 24] have shown that a TNFN offers a better network size and learning accuracy than a Mamdani-type NFN. Thus, for our image alignment task, we compare the TNFN only with the singleton-type NFN in the experimental section to show that the TNFN outperforms it.
A TNFN [25] employs a linear combination of the crisp inputs as the consequent part of a fuzzy rule. The fuzzy rule of the TSK-type neuro-fuzzy system is shown in Equation 4, where n is the dimension of the input and j indexes the fuzzy rules.
The structure of a TNFN, a five-layer network, is shown in Figure 3, where n represents the dimension of the input. In the TNFN, the firing strength of a fuzzy rule is calculated by performing the following "AND" operation on the truth values of each input variable with respect to its corresponding fuzzy sets:
where $u_{i}^{(1)}={x}_{i}$ and $u_{ij}^{(3)}$ are the outputs of the first and third layers, and m_{ ij } and σ_{ ij } are the center and the width of the Gaussian membership function of the j th term of the i th input variable x_{ i }, respectively. In this article, the Gaussian membership function is adopted because it can be a universal approximator of any nonlinear function on a compact set [23].
The output of the fuzzy system is computed by:
where u^{(5)} is the output of the fifth layer, w_{ ij } is the weight associated with the i th dimension and the j th rule node, and M is the number of fuzzy rules. Here, the dimension of the output is set to 4, and the outputs represent the scaling factor (s), the rotation angle (θ), and the translation parameters (Δx, Δy), respectively.
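The forward computation of a TSK-type network described above can be sketched as follows. The sketch assumes the Gaussian membership form exp(-(x - m)^2 / σ^2), a product for the "AND" operation, and a firing-strength-normalized weighted sum of linear consequents; the exact expressions of the original Equations 5 and 6 may differ in detail. The argument layout (`means`, `sigmas`, `weights`) is illustrative.

```python
import math


def tnfn_forward(x, means, sigmas, weights):
    """Forward pass of a TSK-type neuro-fuzzy network (sketch).

    x       : list of n inputs
    means   : M rules x n Gaussian centers m_ij
    sigmas  : M rules x n Gaussian widths sigma_ij
    weights : M rules x n_out outputs x (n + 1) consequent weights,
              with the bias term first
    Assumes at least one rule has a nonzero firing strength.
    """
    firing = []
    for m_j, s_j in zip(means, sigmas):
        u = 1.0
        for xi, mij, sij in zip(x, m_j, s_j):
            # Product "AND" over the Gaussian membership values.
            u *= math.exp(-((xi - mij) ** 2) / (sij ** 2))
        firing.append(u)
    total = sum(firing)
    n_out = len(weights[0])
    out = [0.0] * n_out
    for u, w_j in zip(firing, weights):
        for k in range(n_out):
            # Linear consequent: w_0 + w_1 * x_1 + ... + w_n * x_n.
            linear = w_j[k][0] + sum(w * xi for w, xi in zip(w_j[k][1:], x))
            out[k] += (u / total) * linear
    return out


# One rule centered on the input: the output equals its linear consequent.
print(tnfn_forward([1.0, 2.0], [[1.0, 2.0]], [[1.0, 1.0]], [[[0.5, 1.0, 2.0]]]))  # [5.5]
```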
3.4. Data-mining-based evolutionary learning algorithm
The proposed DMELA aims to improve MGCSE [18]. Unlike MGCSE, which encodes a whole fuzzy rule into a chromosome, DMELA encodes only the antecedent part of a fuzzy rule into a chromosome. The consequent part of a fuzzy rule in DMELA is estimated by an RLS approach. These two operations not only reduce the number of parameters that must be trained, but also increase the convergence speed. The coding step and the RLS approach are described as follows:
(1) Coding step
The coding structure of chromosomes in the proposed DMELA is shown in Figure 4. The figure describes the antecedent part of a fuzzy rule of the form in Equation 4, where m_{ ij } and σ_{ ij } are the mean and deviation of the Gaussian membership function of the i th dimension and the j th rule node, respectively.
(2) RLS approach
Since the coding step decides only the antecedent part of a fuzzy rule, the consequent part remains undetermined. In this article, RLS is adopted to estimate the consequent part. For simplicity, we use two inputs (x_{1}, x_{2}) and one output (y) to represent a two-rule TSK-type neuro-fuzzy system, which is described as follows:
Rule 1
Rule 2
where A_{ ij } and B_{ ij } are the linguistic parts with respect to the input i and Rule j. From Equation 6, the output can be written as:
where u_{1} and u_{2} are the firing strengths of Rules 1 and 2, respectively, ${\hat{u}}_{1}={u}_{1}/\left({u}_{1}+{u}_{2}\right)$, and ${\hat{u}}_{2}={u}_{2}/\left({u}_{1}+{u}_{2}\right)$. Combining Equations 7–9, we obtain the following equation:
Since ${\hat{u}}_{1}$, ${\hat{u}}_{2}$, x_{1}, and x_{2} are known values, the only unknown is the consequent part w_{ ij }. Suppose a given set of training inputs and desired outputs is ${\left\{x\left(t\right),{y}_{d}\left(t\right)\right\}}_{t=1}^{M}$. Equation 10 can be rewritten as:
where
and
In general, there is no exact solution for W. Instead, a least square method is used to obtain an approximate solution. Moreover, regularization is adopted to obtain a smooth estimate. This method is therefore called the RLS approach. Using RLS, the approximate solution is as follows:
where λ is a regularization parameter that adjusts the smoothness. Equation 14 thus completes the estimation of the consequent part of the fuzzy rules. This operation can easily be extended to a TNFN with n inputs, m outputs, and r fuzzy rules. In contrast to MGCSE, the consequent part used in this article is computed by an RLS approach rather than tuned by an evolutionary procedure. This increases the convergence speed because the RLS approach calculates the consequent part directly in a single step to minimize the errors between the actual and desired outputs, whereas an evolutionary method tunes the consequent part many times to minimize the errors gradually.
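A minimal sketch of this estimation for a single output is given below. It assumes the standard regularized least-squares (ridge) solution W = (AᵀA + λI)⁻¹Aᵀy, where each row of A stacks the normalized firing strengths and their products with the inputs; the exact matrix layout of the original Equations 11–14 may differ. The function names and the small Gaussian-elimination solver are illustrative and written in plain Python to keep the example self-contained.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def design_row(x, firing):
    """Row of A: [u_hat_j, u_hat_j * x_1, ..., u_hat_j * x_n] for every rule j."""
    total = sum(firing)
    row = []
    for u in firing:
        u_hat = u / total
        row.extend([u_hat] + [u_hat * xi for xi in x])
    return row


def rls_consequents(rows, targets, lam=1e-3):
    """Regularized least squares: W = (A^T A + lam * I)^-1 A^T y."""
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    return solve(ata, aty)


# Usage: with a single always-firing rule the problem reduces to a line fit.
rows = [design_row([x], [1.0]) for x in (0.0, 1.0, 2.0)]
print(rls_consequents(rows, [1.0, 3.0, 5.0], lam=0.0))
```

For m outputs, the same solve would be repeated once per output column, since the matrix A depends only on the antecedent parts.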
In addition to the above two processes, DMELA considers the structure of the TNFN by using a variable-length combination of chromosomes together with the RLS method to construct a TNFN. To this end, multi-groups symbiotic evolution (MGSE) is used in this article. Unlike traditional symbiotic evolution (TSE) [17], each population in MGSE is divided into several groups, and each group represents a set of chromosomes that belong to the antecedent part of one fuzzy rule. The structure of the chromosomes used to construct TNFNs in DMELA is shown in Figure 5. As shown in the figure, each antecedent part of a fuzzy rule is represented by a chromosome selected from a group, P_{size} denotes the number of groups in a population, and M_{ k } denotes the number of rules used in TNFN construction. This construction allows a variable number of rules in a TNFN.
The learning process of DMELA in each group involves seven major operators: initialization, SOA, DMSM, fitness assignment, reproduction strategy, crossover strategy, and mutation strategy. This process stops when the number of generations or the fitness value reaches a predetermined condition. The whole learning process is described below:
3.4.1. Initialization
Before DMELA starts, the initial groups of individuals must be generated. The initial groups of DMELA are generated randomly within a fixed range. The following formulations show how the initial chromosomes in each group are generated:
Deviation: Chr_{ g, c } [p] = random[σ_{min}, σ_{max}],
Mean: Chr_{ g, c } [p] = random[m_{min}, m_{max}],
where Chr_{ g, c } represents the c th chromosome in the g th group, N_{C} is the total number of chromosomes in each group, p represents the p th gene in Chr_{ g, c }, and [σ_{min}, σ_{max}] and [m_{min}, m_{max}] represent the predefined ranges used to generate the chromosomes.
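The initialization can be sketched as follows. The gene layout inside a chromosome (mean and deviation interleaved per input dimension) is an assumption, since the coding structure of Figure 4 is not reproduced here, and the function and argument names are illustrative.

```python
import random


def init_groups(p_size, n_c, n_inputs, m_range, sigma_range, seed=None):
    """Create p_size groups of n_c chromosomes with uniformly random genes.

    Each chromosome encodes one antecedent part as
    [m_1, sigma_1, m_2, sigma_2, ..., m_n, sigma_n] (assumed layout).
    """
    rng = random.Random(seed)
    groups = []
    for _ in range(p_size):
        group = []
        for _ in range(n_c):
            chrom = []
            for _ in range(n_inputs):
                chrom.append(rng.uniform(*m_range))      # mean gene
                chrom.append(rng.uniform(*sigma_range))  # deviation gene
            group.append(chrom)
        groups.append(group)
    return groups


groups = init_groups(p_size=3, n_c=4, n_inputs=2,
                     m_range=(0.0, 1.0), sigma_range=(0.1, 0.5), seed=1)
print(len(groups), len(groups[0]), len(groups[0][0]))  # 3 4 4
```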
3.4.2. Self-organization algorithm
To select fuzzy rules automatically, the SOA uses building blocks (BBs) to represent the suitability of TNFNs with different numbers of fuzzy rules. As shown in Figure 6, the SOA encodes the probability vector $V_{M_k}$, which stands for the suitability of a TNFN with M_{ k } rules, into the BBs. In addition, the minimum and maximum numbers of rules must be predefined in the SOA to bound the number of fuzzy rules, i.e., [M_{min}, M_{max}].
After the BBs are defined, we use the SOA to determine suitable selection times for each number of fuzzy rules. The "selection times" indicate how many TNFNs should be produced in one generation. In other words, the SOA determines the number of TNFNs with M_{ k } rules in every generation. After the SOA is carried out, the selection times of suitable numbers of fuzzy rules in a TNFN increase, whereas the selection times of unsuitable ones decrease. The processing steps of the SOA are described as follows:
Step 0. Initialize the probability vectors of the BBs:
and
Step 1. Update the probability vectors of the BBs according to the following equations:
where $V_{M_k}$ is the probability vector in the BBs, λ is a predefined threshold value, Avg represents the average fitness value in the whole population, $Best\_Fitness_{M_k}$ represents the best fitness value of the TNFN with M_{ k } rules, and $fit_{M_k}$ is the sum of the fitness values of the TNFNs with M_{ k } rules. In Equation 19, the conditions "$fit_{M_k}$ ≥ Avg" and "$fit_{M_k}$ < Avg" determine whether the suitability of TNFNs with M_{ k } rules is increased or decreased.
Step 2. Determine the selection times of TNFNs with different rules according to the probability vectors of the BBs as follows:
where Selection_Times represents the total selection times in each generation and $Rp_{M_k}$ represents the selection times of TNFNs with M_{ k } rules in one generation.
Step 3. To prevent the selection times from being trapped in a local optimum, the SOA uses two different actions to update $V_{M_k}$. These actions are defined in the following equations:
where SOATimes is a predefined value, Best_Fitness_{ g } represents the best fitness value of the best combination of chromosomes in the g th generation, and Best_Fitness represents the best fitness value of the best combination of chromosomes up to the current generation. If Equation 27 is satisfied, the selection times may have been trapped in a local optimum. In that case, the SOA returns to Step 0 to reinitialize the BBs.
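To make the notion of selection times concrete, the sketch below shows one plausible reading of Step 2 only: the total selection times are shared among the candidate rule numbers in proportion to their probability vectors. This proportional rule is an assumption, the update rules of Steps 1 and 3 are not reproduced, and the function name is illustrative.

```python
def selection_times(prob, total_times):
    """Allocate selection times in proportion to the probability vectors.

    prob        : dict mapping a rule number M_k to its value V_Mk
    total_times : total selection times in one generation
    Returns a dict mapping M_k to Rp_Mk (assumed proportional allocation,
    rounded down to an integer).
    """
    total_v = sum(prob.values())
    return {mk: int(total_times * v / total_v) for mk, v in prob.items()}


print(selection_times({3: 0.5, 4: 0.25, 5: 0.25}, 40))  # {3: 20, 4: 10, 5: 10}
```

Under this reading, rule numbers whose networks perform above average gain probability mass and are therefore instantiated more often in the next generation.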
3.4.3. The DMSM
After the selection times are determined, DMELA performs the selection step, which includes the selection of groups and of chromosomes. For the selection of groups, this article proposes DMSM to determine suitable groups for chromosome selection. To prevent the selected groups from being trapped in a local optimum, DMSM uses normal and explore actions to select well-performing groups. The details of DMSM are discussed below:
Step 0. The transactions are built, as in the following equations:
where i = 1, 2,..., M_{ k }, M_{ k } = M_{min}, M_{min+1},..., M_{max}, j = 1, 2,..., TransactionNum, $Fitness_{M_k}$ represents the fitness value of a TNFN with M_{ k } rules, ThreadFitnessvalue is a predefined value, TransactionNum is the total number of transactions, Transaction_{ j } [i] represents the i th item in the j th transaction, $TFCRuleSet_{M_k}\left[i\right]$ represents the i th group of the M_{ k } groups used for chromosome selection, and Performance Index = g and Performance Index = b represent good and bad performance, respectively. Hence, transactions have the form shown in Table 1. In Table 1, the first transaction means that the three-rule TNFN formed by the first, fourth, and eighth groups has "good" performance. In contrast, the second transaction indicates that the four-rule TNFN formed by the second, fourth, seventh, and tenth groups has "bad" performance.
Step 1. Normal action:
This action has two aims: to accumulate the transaction set and to select groups. For the first, if the groups satisfy Equations 28 and 29, they are stored in a transaction. For the second, DMSM selects groups using the following equation:
where i = 1, 2,..., M_{ k }, M_{ k } = M_{min}, M_{min+1},..., M_{max}, Accumulator, defined in Equation 30, is used to determine which action should be adopted, GroupIndex[i] represents the selected i th group of the M_{ k } groups, and P_{ size } is the number of groups in a population in DMELA. If the best fitness value does not improve for a sufficient number of generations (NormalTimes), DMSM selects groups according to the explore action.
Step 2. Explore action:
If Accumulator exceeds NormalTimes, the current action switches to the explore action. The objective of this action is to use data mining to explore suitable groups in the transactions. The major operations of DMSM are performing FP-growth, generating association rules, and selecting suitable groups. The details of these three operations are presented below.
i. FP-growth performing
In this operation, FP-growth is performed only on good groups, whose performance index is "g" in Table 1; bad groups are skipped. Thus, frequently occurring groups can be found according to the predefined Minimum_Support, which stands for the minimum fraction of transactions containing the item set. After Minimum_Support is defined, data mining using FP-growth is performed. The FP-growth algorithm can be divided into two parts: FP-tree construction and FP-growth. The sample transactions are shown in Table 2. In this example, Minimum_Support = 3.
(1) FP-tree construction
To construct an FP-tree, we first scan the transactions and retrieve the frequent 1-groupsets, i.e., the groups whose support counts in the transactions are bigger than Minimum_Support. The infrequently occurring groups are then discarded, and the remaining frequent groups in each transaction are arranged in descending order of support. The result is shown in Table 3. The ordered transactions in this table are used to construct the FP-tree.
The steps in FP-tree construction are illustrated in Figure 7a. In this figure, the rightmost chart (formed by scanning the last transaction in Table 3) is called a prefix-tree of the frequent 1-groupset. Each node of the prefix-tree is composed of one group, a count of the frequent 1-groupset, and a frequently occurring group node link. Afterward, the completed FP-tree shown in Figure 7b is created by combining the prefix-tree of the 1-groupset and the header table.
(2) FP-growth
The FP-growth operation proceeds as follows. First, we choose each frequent 1-groupset as a suffix group and find the corresponding set of paths connecting it to the root of the FP-tree. This set of prefix paths is called the conditional group base. Second, we accumulate the count of each group in the base to construct the conditional FP-tree of the corresponding suffix group. Third, after exploring the frequently occurring groups in the conditional FP-tree, FP-growth data mining is completed by concatenating the suffix group with the generated frequently occurring groups. The frequent groups generated by FP-growth are shown in Table 4.
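The first scan and the final result of this mining step can be illustrated with a short sketch. Note that the code below does not build an FP-tree: it orders the transactions as in the first scan and then enumerates frequent groupsets by brute force, which yields the same frequent sets as FP-growth on small inputs but without its efficiency. It assumes that Minimum_Support is a count and that a groupset is frequent when its count is at least that value (the usual FP-growth convention). The sample transactions are invented for illustration and are not those of Table 2.

```python
from collections import Counter
from itertools import combinations


def support_counts(transactions):
    """Count in how many transactions each group index occurs."""
    return Counter(g for t in transactions for g in set(t))


def order_transactions(transactions, min_support):
    """Drop infrequent groups and sort the rest by descending support."""
    counts = support_counts(transactions)
    return [sorted((g for g in set(t) if counts[g] >= min_support),
                   key=lambda g: (-counts[g], g)) for t in transactions]


def frequent_groupsets(transactions, min_support):
    """Brute-force stand-in for FP-growth: map each frequent groupset to its support."""
    counts = support_counts(transactions)
    items = sorted(g for g, c in counts.items() if c >= min_support)
    tsets = [set(t) for t in transactions]
    result = {}
    for size in range(1, len(items) + 1):
        level = {combo: sum(1 for t in tsets if t.issuperset(combo))
                 for combo in combinations(items, size)}
        level = {c: s for c, s in level.items() if s >= min_support}
        if not level:
            break  # no frequent set of this size, so none of a larger size
        result.update(level)
    return result


good = [[1, 4, 8], [4, 8], [4, 2], [8, 4, 1], [2]]  # hypothetical "good" transactions
print(order_transactions(good, 3))   # [[4, 8], [4, 8], [4], [4, 8], []]
print(frequent_groupsets(good, 3))   # {(4,): 4, (8,): 3, (4, 8): 3}
```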
ii. Association rules generating
Once the frequently occurring groups are found, association rules can be produced from them. To identify the association rules with good performance, the frequent groups must be combined with the groups having bad performance shown in Table 1 to compute the confidence degree. The confidence degree can be computed by the following formula:
where P(good | frequent groups) is the conditional probability, frequent groups ∪ good (or bad) denotes the union of the frequent groups and good (or bad) performance, and supp(frequent groups ∪ good or bad) is the number of transactions in which the frequent groups occur with good or bad performance. The rule is valid if
where minconf represents the minimal confidence given by the user or an expert. Hence, if a rule satisfies Equation 32, the frequent groups can be viewed as suitable groups; otherwise, they are unsuitable groups. For instance, if the confidence of {1, 3, 6} ⇒ {g} is bigger than the minimum confidence, we construct this association rule. The rule indicates that the combination of the first, third, and sixth groups results in "good" performance. The frequent groups are thus converted into association rules, which yield the AssociatedGoodPool containing all frequent groups that satisfy Equation 32.
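The confidence test can be sketched as follows, assuming the confidence of a rule "frequent groups ⇒ good" is the share of transactions containing the frequent groups that are labeled good, i.e., supp(frequent ∪ good) / (supp(frequent ∪ good) + supp(frequent ∪ bad)). The exact form of the original Equation 31 may differ, and the function names and sample transactions are illustrative.

```python
def confidence(groupset, transactions):
    """Confidence of the rule groupset => 'g'.

    transactions: list of (groups, label) pairs with label 'g' or 'b'.
    """
    gs = set(groupset)
    good = sum(1 for groups, label in transactions
               if gs <= set(groups) and label == "g")
    bad = sum(1 for groups, label in transactions
              if gs <= set(groups) and label == "b")
    return good / (good + bad) if good + bad else 0.0


def associated_good_pool(frequent, transactions, minconf):
    """Keep the frequent groupsets whose confidence reaches minconf."""
    return [fs for fs in frequent if confidence(fs, transactions) >= minconf]


# Hypothetical labeled transactions in the style of Table 1.
labeled = [([1, 3, 6], "g"), ([1, 3, 6, 7], "g"), ([1, 3, 6], "b"), ([2, 4], "b")]
print(associated_good_pool([(1, 3, 6), (2, 4)], labeled, minconf=0.6))  # [(1, 3, 6)]
```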
iii. Suitable groups selecting
After the association rules are identified, DMSM selects groups according to the association rules. The group indexes are selected from the associated good groups according to the following equations:
where q = 1, 2,..., AssociatedGoodPoolNum, i = 1, 2,..., M_{ k }, M_{ k } = M_{min}, M_{min+1},..., M_{max}, ExploreTimes is the predefined value that determines whether the explore action is performed, AssociatedGoodPool represents the good item sets obtained from the association rules, AssociatedGoodPoolNum is the total number of sets in AssociatedGoodPool, and GoodItemSet[i] is a good item set selected randomly from AssociatedGoodPool. In Equation 33, if M_{ k } is greater than the size of GoodItemSet, the remaining groups are selected by Equation 30.
Step 3. If the best fitness value does not improve for a sufficient number of generations (ExploreTimes), then DMSM selects groups based on the normal action.
Step 4. After the M_{ k } groups are selected, M_{ k } chromosomes are selected from M_{ k } groups as follows:
where q = Random[1, N_{ c }], i = 1, 2,..., k, N_{ c } represents the total number of chromosomes in each group, and ChromosomeIndex[i] represents the index of the chromosome that is selected from the i th group.
3.4.4. Fitness assignment
In this step, the fitness value of an antecedent part of a fuzzy rule (an individual) is calculated by summing the fitness values of all combinations that contain it, where each combination is formed from chromosomes selected from the M_{ k } groups chosen by DMSM. The steps of the fitness value assignment are described below:
Step 1. Choose M_{ k } antecedent parts of fuzzy rules from the M_{ k } groups of size N_{C} and, together with the RLS method, construct a TNFN; repeat this $Rp_{M_k}$ times. The M_{ k } groups are obtained from the DMSM.
Step 2. Evaluate every TNFN that is generated in Step 1 to obtain a fitness value.
Step 3. Divide the fitness value by M_{ k } and add the result to the fitness records of the selected antecedent parts of fuzzy rules; these records are initially set to zero.
Step 4. Divide the accumulated fitness value of each chromosome from the M_{ k } groups by the number of times that it has been selected. The resulting average fitness value represents the performance of an antecedent part of a fuzzy rule. In this article, the fitness value is defined by the following formulation:
where $y_i$ and $\bar{y}_i$ represent the desired and predicted values of the i th output, respectively, $E(y, \bar{y})$ is an error function, and N represents the number of training data in each generation.
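The four steps above can be sketched as follows. This is an illustration under stated assumptions rather than the article's implementation: the error function is taken to be the root mean square error, the mapping `1 / (1 + error)` from error to fitness is an assumption, and `evaluate` is a hypothetical callback standing in for building a TNFN with the RLS method and measuring it.

```python
import math
import random


def rmse(desired, predicted):
    """Root mean square error between desired and predicted outputs."""
    n = len(desired)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(desired, predicted)) / n)


def fitness_from_error(error):
    # Assumed mapping, not taken from the article: zero error gives fitness 1,
    # and fitness decreases towards 0 as the error grows.
    return 1.0 / (1.0 + error)


def assign_fitness(groups, n_c, trials, evaluate, rng):
    """Average fitness per (group, chromosome) pair over `trials` random networks."""
    total, count = {}, {}
    m_k = len(groups)
    for _ in range(trials):
        # Step 1: pick one chromosome (0-based index here) from every selected group.
        combo = [(g, rng.randrange(n_c)) for g in groups]
        # Step 2: evaluate the network built from this combination.
        fitness = evaluate(combo)
        # Step 3: share the fitness equally among the participating chromosomes.
        for key in combo:
            total[key] = total.get(key, 0.0) + fitness / m_k
            count[key] = count.get(key, 0) + 1
    # Step 4: divide by the number of times each chromosome was selected.
    return {key: total[key] / count[key] for key in total}


# Toy usage: a stand-in evaluator that returns the same fitness for every network.
scores = assign_fitness([0, 1], n_c=3, trials=30,
                        evaluate=lambda combo: fitness_from_error(0.25),
                        rng=random.Random(0))
```

Because credit is averaged over the number of selections, a chromosome that happens to be drawn more often is not favoured over one that is drawn rarely.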
3.4.5. Reproduction strategy
Reproduction is the procedure of copying individuals according to their fitness values. This study adopts the elite-based reproduction strategy (ERS) from our previous research [18]. In ERS, every chromosome in the best combination of M_{ k } groups is always kept during the reproduction step. For the remaining chromosomes in each group, the roulette-wheel selection method [26] is used. The well-performing chromosomes in the top half of each group [27] proceed to the next generation, and the other half is created by applying crossover and mutation operations to chromosomes in the top half of the parent individuals.
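One possible reading of this strategy for a single group is sketched below: the elite chromosome is copied unconditionally, and the rest of the surviving half is filled by roulette-wheel draws. The function names and the exact way the half is filled are assumptions made for illustration, not details confirmed by the article.

```python
import random


def roulette_pick(fitness, rng):
    """Return an index with probability proportional to its non-negative fitness."""
    total = sum(fitness)
    spin = rng.uniform(0.0, total)
    running = 0.0
    for i, f in enumerate(fitness):
        running += f
        if spin <= running:
            return i
    return len(fitness) - 1  # guard against floating-point round-off


def reproduce_group(fitness, elite_index, rng):
    """Return the indices of the chromosomes copied into the surviving half."""
    half = len(fitness) // 2
    survivors = [elite_index]          # the elite is always kept
    while len(survivors) < half:
        survivors.append(roulette_pick(fitness, rng))
    return survivors


group_fitness = [0.2, 0.9, 0.4, 0.7, 0.1, 0.6]
survivors = reproduce_group(group_fitness, elite_index=1, rng=random.Random(1))
```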
3.4.6. Crossover strategy
Although the reproduction operation preserves the best existing individuals, it does not create any new individuals. In nature, an offspring inherits genes from two parents. The crossover operator is the main mechanism for this inheritance; it is applied to a selected pair of parents with a given crossover rate. In this article, a two-point crossover strategy [26] is adopted, as illustrated in Figure 8. As the figure shows, new individuals are created by exchanging the values between the two selected sites of the parents. The benefit of two-point crossover is its ability to introduce a higher degree of randomness into the selection of genetic material [28].
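A minimal sketch of two-point crossover on list-encoded chromosomes is given below. The list encoding and the function name are assumptions; the article's chromosome layout may differ.

```python
import random


def two_point_crossover(parent_a, parent_b, rng):
    """Swap the segment between two random cut sites of two equal-length parents."""
    assert len(parent_a) == len(parent_b) and len(parent_a) >= 2
    # Two distinct cut sites, sorted so that i < j.
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b


a = [0, 0, 0, 0, 0, 0]
b = [1, 1, 1, 1, 1, 1]
child_a, child_b = two_point_crossover(a, b, random.Random(0))
```

At every gene position the two children together still hold exactly the two parental values, so no genetic material is created or lost by the operator.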
3.4.7. Mutation strategy
Although the crossover strategy produces many new strings, these strings do not provide new information to every group at the site of an individual. Mutation can randomly alter the allele of a gene. In this article, uniform mutation [26] is adopted, in which the mutated gene is drawn randomly from the domain of the corresponding variable. Uniform mutation not only provides new information to a population but also preserves diversity [29].
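Uniform mutation can be sketched as below, assuming real-valued genes with a known (low, high) domain per gene. The per-gene mutation rate and the function name are hypothetical.

```python
import random


def uniform_mutation(chromosome, bounds, rate, rng):
    """Replace each gene, with probability `rate`, by a uniform draw from its domain."""
    mutated = list(chromosome)
    for k, (low, high) in enumerate(bounds):
        if rng.random() < rate:
            mutated[k] = rng.uniform(low, high)
    return mutated


genes = [0.5, -1.0, 3.0]
domains = [(0.0, 1.0), (-2.0, 2.0), (0.0, 10.0)]
unchanged = uniform_mutation(genes, domains, rate=0.0, rng=random.Random(0))
all_new = uniform_mutation(genes, domains, rate=1.0, rng=random.Random(0))
```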
3.5. Termination criterion
If the learning process meets one of the following conditions, DMELA terminates and outputs the final results.
(1) The number of generations reaches a predefined maximum.
(2) The fitness value is greater than a fitness threshold.
Consequently, the whole learning process of DMELA is summarized in Figure 9.
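The two stopping conditions can be expressed as a simple loop. In this sketch, `step` is a hypothetical callback that runs one generation and returns the best fitness found in it; it stands in for the whole evolutionary cycle.

```python
def run_until_done(step, max_generations, fitness_threshold):
    """Run generations until the fitness threshold or the generation limit is hit."""
    best = float("-inf")
    for generation in range(1, max_generations + 1):
        best = max(best, step(generation))
        if best > fitness_threshold:          # condition (2)
            return generation, best, "threshold"
    return max_generations, best, "max_generations"   # condition (1)


# Toy usage: the fitness grows by 0.1 per generation.
result = run_until_done(lambda g: g / 10, max_generations=100, fitness_threshold=0.45)
```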
3.6. Time complexity analysis
In this section, to analyze the complexity of the proposed algorithm, the method is divided into six stages (the initialization stage is skipped), and the complexity of each stage is discussed individually. Suppose the population size is P_{size}, the subpopulation size is N_{ c }, the number of fuzzy rules is M, the number of fuzzy systems constructed in one generation is S (i.e., the Selection_Times defined in Equation 23), the number of training data is N, and the input dimension of the NFN is n. The complexity of each stage is as follows:

(1) SOA: in this stage, the only computation is updating the probability vectors (Equation 19) S times in one generation. Therefore, the complexity of SOA is O(S).

(2) DMSM: the DMSM operation includes the normal and explore actions. Since the normal action is performed NormalTimes times (see Equation 30) in the overall learning process, its complexity is O(NormalTimes). In the explore action, FP-growth and association rule mining are performed only at the beginning of the action or when the system falls into a local optimum. Their effect on the overall learning efficiency is therefore not crucial, and their complexity can be neglected. Moreover, since the explore action is performed (ExploreTimes − NormalTimes) times in the overall learning process, its complexity is O(ExploreTimes − NormalTimes), where ExploreTimes appears in Equation 33.

(3) Fitness assignment: according to Equations 6 and 36, one fitness evaluation requires NMn computations. Furthermore, there are S evaluations in one generation. Thus, the complexity of fitness assignment is O(SNMn).

(4) Reproduction: in this stage, the roulette-wheel selection method is used to perform reproduction. Since each selection requires N_{ c } steps and N_{ c } spins are needed to fill each subpopulation [30], the total computation for a whole population in a generation is $N_c^2 P_{size}$. Therefore, the complexity of the reproduction stage is $O(N_c^2 P_{size})$.

(5) Crossover: tournament selection is adopted to select the parents. Since tournament selection can be performed in constant time and N_{ c } P_{size} competitions are required to fill one generation [30], the complexity of the tournament selection is O(N_{ c } P_{size}). Moreover, the computation of two-point crossover is constant in one generation. Thus, the complexity of the crossover stage is O(N_{ c } P_{size}).

(6) Mutation: because uniform mutation is adopted and the mutated gene is picked randomly from the chromosome, the mutation operator needs N_{ c } P_{size} steps to cover the whole population. Hence, the complexity of the mutation step is O(N_{ c } P_{size}).
In summary, the dominant complexity of the proposed algorithm is that of the fitness assignment stage, O(SNMn). This indicates that the fitness assignment step occupies most of the learning time.
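To make the comparison concrete, the per-generation operation counts of the stages can be tabulated for one set of values. In the sketch below, N = 420, M = 24, and n = 33 are borrowed from figures quoted in Section 4, while S = 50, N_{ c } = 20, and P_{size} = 480 are purely hypothetical placeholders; the DMSM terms are left out because they are stated over the whole learning process rather than per generation.

```python
def stage_costs(S, N, M, n, N_c, P_size):
    """Leading-order operation counts per generation for each stage."""
    return {
        "SOA": S,
        "fitness assignment": S * N * M * n,
        "reproduction": N_c ** 2 * P_size,
        "crossover": N_c * P_size,
        "mutation": N_c * P_size,
    }


# S, N_c and P_size are hypothetical values chosen only for illustration.
costs = stage_costs(S=50, N=420, M=24, n=33, N_c=20, P_size=480)
dominant = max(costs, key=costs.get)
for stage, cost in costs.items():
    print(f"{stage:>20}: {cost}")
```

With these numbers the fitness assignment term exceeds the reproduction term by roughly two orders of magnitude; the ordering could change for a very large N_{ c } or a very small training set.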
3.7. Executing procedure
After training a TNFN, the executing phase of the proposed image alignment system consists merely of computing the WGOH descriptor and feeding it into the DMELA-trained TNFN to obtain a scaling factor s, a rotation angle θ, and translation parameters (Δx, Δy). In this respect, the proposed system is simple and efficient.
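Once s, θ, and (Δx, Δy) have been estimated, they can be assembled into a homogeneous transformation matrix. The sketch below assumes scaling and counter-clockwise rotation about the origin followed by translation, with θ in degrees; the article's sign and ordering conventions may differ, and the function names are hypothetical.

```python
import math


def affine_matrix(s, theta_deg, dx, dy):
    """3x3 homogeneous matrix: scale by s, rotate by theta, then translate."""
    t = math.radians(theta_deg)
    c, si = s * math.cos(t), s * math.sin(t)
    return [[c, -si, dx],
            [si, c, dy],
            [0.0, 0.0, 1.0]]


def apply_affine(matrix, x, y):
    """Map the point (x, y) through the affine matrix."""
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


m = affine_matrix(2.0, 90.0, 1.0, 0.0)
x_new, y_new = apply_affine(m, 1.0, 0.0)
```

In this example the point (1, 0) is scaled to (2, 0), rotated to (0, 2), and shifted to approximately (1, 2).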
4. Experimental results
In the following experiments, visual inspection images of 640 × 480 pixels are used to examine the utility of the proposed image alignment method. Figure 10 depicts an example of such images, where the left side is a reference image and the right side is an image transformed by scaling, rotation, and translation. In this figure, the dashed window represents a template window (its size is 200 × 200, and feature vectors are extracted within this window), and the cross sign denotes the reference location of the template.
As listed in Table 5, four types of experimental images are prepared for simulation. The first three types are synthesized images generated randomly within the range given in Table 6. The last type consists of real images captured by a camera. Moreover, Table 6 indicates the searching range for image alignment. If the affine transformation exceeds this range, the image alignment system may not achieve high accuracy. Thus, the range of image alignment considered in this article is restricted to that in Table 6.
All the experiments are performed on an Intel Core i7 860 CPU at 2.8 GHz with 3 GB of memory, using the Matlab 7.5 simulation software.
The experimental results are organized into four parts. Section 4.1 compares different types of NFNs. A comparison with existing learning methods is presented in Section 4.2. In Section 4.3, synthesized images are used to compare the proposed image alignment system with other systems. Section 4.4 uses real images to validate the alignment accuracy of the proposed system.
4.1. Comparison with different types of NFN
In this section, we compare a TSK-type and a singleton-type NFN. To set up the experiment, both types of NFN are run with the same number of fuzzy rules and the same population size, as described in Table 7, for 100 generations of learning. The experiment is repeated 15 times using different initial conditions, and the final results are shown in Table 8. According to this table, the TSK-type NFN exhibits lower image alignment error than the singleton-type NFN. Thus, we conclude that the TNFN performs better than the singleton-type NFN in our image alignment case.
4.2. Comparison with existing learning methods
Two typical evolutionary learning methods, TSE [17] and MGCSE [18], are implemented carefully for comparison with the proposed DMELA. To determine the number of fuzzy rules for TSE and MGCSE, the number of rules is varied over the range 20 to 100 in increments of 5. The results show that 85 and 80 rules are suitable for TSE and MGCSE, respectively.
In this simulation, training and testing images are randomly generated in the way specified in Table 5. Then 33-element feature vectors are obtained by applying WGOH with the genetic-algorithm-based dimensionality reduction described in [22] to the generated images. Moreover, the initial parameters of DMELA used before training are given in Table 7.
To examine SOA in DMELA, Figure 11 shows the average probability vectors over 15 runs with different training and testing images. In this figure, the optimal number of fuzzy rules is 24. This indicates that in most cases a 24-rule TNFN performs better than TNFNs with other numbers of rules within [M_{min}, M_{max}] = [18, 25].
Figure 12a,b depicts the learning curves and root mean square error (RMSE) of the three methods. As the figure shows, DMELA demonstrates faster convergence and lower RMSE than TSE and MGCSE. In addition, because the RLS method is utilized, a high initial fitness value occurs in Figure 12a.
Furthermore, to discuss the learning time, the running time of the proposed algorithm is measured and compared with those of MGCSE and TSE. The running time in this article is defined as the time taken for the fitness of the algorithm to reach the predefined value. The results of the three algorithms over 15 runs with different initial conditions are reported in Table 9. As shown in this table, the proposed algorithm (DMELA) is much faster than MGCSE and TSE.
4.3. Comparison with existing image alignment systems
To evaluate the proposed system against other existing systems [3, 12, 19], these systems are implemented carefully according to their original articles. The comparison in this section covers alignment accuracy, speed, and robustness, which are discussed in the following parts.
4.3.1. Alignment accuracy
To compare the alignment accuracy of different systems, the training images, which are used to train neural networks, and the testing images, which are used to check the alignment accuracy, are generated by the way described in Table 5.
Figure 13 depicts an alignment example for a testing image on three different systems. The cross sign in this figure denotes the estimated results. As the figure shows, the proposed system estimates the position and orientation of the cross sign more accurately than the other systems.
In addition, 15 runs using different training and testing images are performed to further examine the alignment accuracy of the proposed system. The simulation results are shown in Table 10, which presents the average and standard deviation of the error of three image alignment systems. According to this table, the proposed system exhibits lower alignment error than the other systems. Moreover, the simulated data indicate that the alignment accuracy reaches the subpixel level; thus, the proposed system provides a useful way to align images very accurately.
4.3.2. Alignment speed
To demonstrate the alignment speed, the execution time required to perform one image alignment task is discussed. In this article, the steps of one image alignment task consist of capturing the template window from the input image, computing the feature within the window, and feeding the calculated feature into the trained network to obtain the affine parameters.
In this experiment, 240 testing images are used to perform image alignment tasks. The average execution times of image alignment for the proposed system, Isomap, KICA, and SIFT are about 30, 330, 65, and 57 ms, respectively. From this result, we infer that the proposed system is efficient and can be applied to real-time tasks.
4.3.3. Alignment robustness
Next, the robustness of the proposed image alignment system under different levels of random additive Gaussian noise is discussed. In this experiment, Gaussian noise is first added to the 420 training images, and noise of the same strength is then added to the remaining 180 testing images. Figure 14 illustrates an example of aligning a testing image with the reference image under a 10 dB SNR condition. As shown in this figure, the proposed system estimates the rotation and translation of the cross sign more accurately than the other methods.
The simulation results of the absolute estimation errors of the affine parameters under eight levels of SNR are presented in Figure 15a-d. As the figure shows, the proposed system demonstrates lower affine parameter errors than the other systems, especially when the SNR is larger than 15 dB. This indicates that the proposed system is highly robust against noise.
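For readers who wish to reproduce a noise condition such as 10 dB SNR, the following sketch adds zero-mean Gaussian noise to a flat list of pixel intensities. It assumes that SNR is defined as 10 log10 of the ratio between the mean squared pixel value and the noise variance, and it does not clip the result to a valid intensity range; the article does not state which definition it uses.

```python
import math
import random


def add_gaussian_noise(pixels, snr_db, rng):
    """Add zero-mean Gaussian noise so that the expected SNR equals snr_db."""
    signal_power = sum(p * p for p in pixels) / len(pixels)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [p + rng.gauss(0.0, sigma) for p in pixels]


# Synthetic intensities standing in for an image, followed by a 10 dB condition.
pixels = [100.0 + (i % 50) for i in range(20000)]
noisy = add_gaussian_noise(pixels, 10.0, random.Random(0))
```

An alternative convention measures the signal power after removing the image mean, which yields weaker noise for the same nominal SNR.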
4.4. Real image alignment case
In this section, real images are utilized to verify the effectiveness of the proposed system. Figure 16a-d presents the results of aligning the same real image using the proposed system, ISOMAP, KICA, and SIFT, respectively. As shown in this figure, the proposed system estimates the rotation and position of the cross sign more accurately than the other alignment systems. Thus, applying the proposed image alignment system to real images is feasible.
5. A conceptual framework for aligning visual inspection images
To sum up the findings, this study proposes a conceptual framework to assist users in designing image alignment systems (see Figure 17). As shown in Figure 17, the framework has three stages. In the first stage, a feature extraction approach, the WGOH descriptor, is adopted to generate feature vectors. Subsequently, three training procedures are used to automatically determine the structure and tune the parameters of the TNFN. Finally, the estimated affine transformation parameters are used to align the inspected image with the reference image.
6. Conclusion
In this article, DMELA is proposed for training a TNFN to perform image alignment tasks. The study pursues two aims: developing an evolutionary learning algorithm, and designing an efficient and accurate image alignment system.
Regarding the first aim, the proposed DMELA combines chromosome encoding and the RLS method to determine the antecedent and consequent parts of fuzzy rules. This combination offers faster convergence and lower RMSE than other evolutionary algorithms. Moreover, this article utilizes a DMSM to select suitable groups and identify unsuitable groups for chromosome selection, which addresses the random group selection problem of the TSE algorithm. Finally, an SOA is adopted to evaluate the suitability of different numbers of fuzzy rules, so that the automatic structure construction of an NFN is feasible.
Regarding the second aim, integrating a WGOH descriptor with a DMELA-trained TNFN forms an image alignment system that estimates affine parameters accurately. The evidence can be found in the experimental results on both synthesized and real images. The results show that the proposed alignment system reaches subpixel accuracy, real-time speed, and a high level of noise robustness. Consequently, this finding is helpful for developing efficient and accurate image alignment systems.
Although the proposed system demonstrates good performance, it still has some limitations. More specifically, the searching range of image alignment is not large enough, which limits the alignment performance. Thus, future work should consider coarse-to-fine image alignment to enlarge the searching range. Moreover, the image alignment accuracy at low SNR is not high enough, so the WGOH descriptor needs to be improved to suppress noise.
Abbreviations
BBs: building blocks
DCT: discrete cosine transform
DMELA: data-mining-based evolutionary learning algorithm
DMSM: data-mining selection method
ERS: elite-based reproduction strategy
FNN: feedforward neural network
GFSA: global feature selection approach
ISOMAP: isometric mapping
MGCSE: multi-groups cooperation based symbiotic evolution
MGSE: groups symbiotic evolution
NFN: neuro-fuzzy network
RLS: regularized least square
SIFT: scale-invariant feature transform
SNR: signal-to-noise ratio
SOA: self-organization algorithm
TNFN: TSK-type neuro-fuzzy network
TSE: traditional symbiotic evolution
TSK: Takagi-Sugeno-Kang
WGOH: weighted gradient orientation histograms.
References
1. Elhanany I, Sheinfeld M, Beckl A, Kadmon Y, Tal N, Tirosh D: Robust image registration based on feedforward neural networks. In Proceedings of IEEE International Conference on System, Man and Cybernetics. Volume 2. Nashville, USA; 2000:1507-1511.
2. Wu J, Xie J: Zernike moment-based image registration scheme utilizing feedforward neural networks. In Proceedings of the 5th World Congress on Intelligent Control and Automation. Volume 5. Hangzhou, P.R. China; 2004:4046-4048.
3. Xu AB, Guo P: Isomap and neural networks based image registration scheme. Lecture Notes in Computer Science 2006, 3972: 486-491. 10.1007/11760023_71
4. Abche AB, Yaacoub F, Maalouf A, Karam E: Image registration based on neural network and Fourier transform. In Proceedings of the 28th IEEE EMBS annual international conference. New York, USA; 2006:8034806.
5. Sarnel H, Senol Y, Sagirlibas D: Accurate and robust image registration based on radial basis neural networks. In IEEE International Symposium on Computer and Information Sciences. Istanbul, Turkey; 2008:1-5.
6. Hofmeister M, Liebsch M, Zell A: Visual self-localization for small mobile robots with weighted gradient orientation histograms. In 40th International Symposium on Robotics (ISR). Barcelona, Spain; 2009:87-91.
7. Hsu CY, Hsu YC, Lin SF: A hybrid learning neural network based image alignment system using global feature selection approach. Adv Comput Sci Eng 2011,6(2):129-157.
8. Hofmeister M, Vorst P, Zell A: A comparison of efficient global image features for localizing small mobile robots. In Proceedings of ISR/ROBOTIK. Munich, Germany; 2010:143-150.
9. Brown LG: A survey of image registration techniques. ACM Comput Surv 1992,24(4):325-376. 10.1145/146370.146374
10. Zitova B, Flusser J: Image registration methods: a survey. Image Vis Comput 2003,21(11):977-1000. 10.1016/S0262-8856(03)00137-9
11. Amintoosi M, Fathy M, Mozayani N: Precise image registration with structural similarity error measurement applied to super-resolution. EURASIP J Adv Signal Process 2009, 7. Article ID 305479
12. Xu AB, Guo P: Image registration with regularized neural network. Lecture Notes in Computer Science 2006, 4233: 286-293. 10.1007/11893257_32
13. Juang CF: A TSK-type recurrent fuzzy network for dynamic systems processing by neural network and genetic algorithms. IEEE Trans Fuzzy Syst 2002,10(2):155-170. 10.1109/91.995118
14. Hsu YC, Lin SF: Reinforcement group cooperation based symbiotic evolution for recurrent wavelet-based neuro-fuzzy systems. Neurocomputing 2009, 72: 2418-2432. 10.1016/j.neucom.2008.12.027
15. Li M, Wang Z: A hybrid coevolutionary algorithm for designing fuzzy classifiers. Inf Sci 2009,179(12):1970-1983. 10.1016/j.ins.2009.01.045
16. Gomez F, Schmidhuber J: Co-evolving recurrent neurons learn deep memory POMDPs. In Proceeding of Conference on Genetic and Evolutionary Computation. Washington, DC, USA; 2005:491-498.
17. Moriarty DE, Miikkulainen R: Efficient reinforcement learning through symbiotic evolution. Mach Learn 1996, 22: 11-32.
18. Hsu YC, Lin SF, Cheng YC: Multi groups cooperation based symbiotic evolution for TSK-type neuro-fuzzy systems design. Expert Syst Appl 2010,37(7):5320-5330. 10.1016/j.eswa.2010.01.003
19. Lowe D: Distinctive image features from scale-invariant keypoints. Int J Comput Vis 2004,60(2):91-110.
20. Bradley DM, Patel R, Vandapel N, Thayer SM: Real-time image-based topological localization in large outdoor environments. In Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Edmonton, Canada; 2005:3670-3677.
21. Zamalloa M, Rodrigues-Fuentes LJ, Penagarikano M, Bordel G, Uribe JP: Feature dimensionality reduction through genetic algorithms for faster speaker recognition. In 16th European Signal Processing Conference. Lausanne, Switzerland; 2008.
22. Neshatian K, Zhang M: Dimensionality reduction in face detection: a genetic programming approach. In 24th International Conference Image and Vision Computing. Wellington, New Zealand; 2009:391-396.
23. Juang CF, Lin CT: An on-line self-constructing neural fuzzy inference network and its applications. IEEE Trans Fuzzy Syst 1998,6(1):12-32. 10.1109/91.660805
24. Sugeno M, Tanaka K: Successive identification of a fuzzy model and its applications to prediction of a complex system. Fuzzy Sets Syst 1991,42(3):315-334. 10.1016/0165-0114(91)90110-C
25. Takagi T, Sugeno M: Fuzzy identification of systems and its applications to modeling and control. IEEE Trans Syst Man Cybern 1985,15(1):116-132.
26. Cordon O, Herrera F, Hoffmann F, Magdalena L: Genetic Fuzzy Systems Evolutionary Tuning and Learning of Fuzzy Knowledge Bases, Advances in Fuzzy Systems-Applications and Theory. Volume 19. World Scientific Publishing, NJ; 2001.
27. Juang CF, Lin JY, Lin CT: Genetic reinforcement learning through symbiotic evolution for fuzzy controller design. IEEE Trans Syst Man Cybern B 2000,30(2):290-302. 10.1109/3477.836377
28. Cox E: Fuzzy Modeling and Genetic Algorithms for Data Mining and Exploration. 1st edition. Morgan Kaufman Publications, San Francisco; 2005.
29. Dempsey I: Constant generation for the financial domain using grammatical evolution. In Genetic and Evolutionary Computation Conference Workshop Program. Washington, DC, USA; 2005:350-353.
30. Goldberg DE, Deb K: A comparative analysis of selection schemes used in genetic algorithms. In Foundations of Genetic Algorithms. Volume 1. San Mateo, CA, USA; 1991:69-93.
Acknowledgements
The authors gratefully acknowledge the reviewers for their valuable comments and suggestions.
Additional information
Competing interests
The authors declare that they have no competing interests.
Cite this article
Hsu, C., Cheng, Y. & Lin, S. Efficient and accurate image alignment using TSKtype neurofuzzy network with dataminingbased evolutionary learning algorithm. EURASIP J. Adv. Signal Process. 2011, 96 (2011). https://doi.org/10.1186/16876180201196
Keywords
 subpixel accuracy
 TSK-type neuro-fuzzy network
 data-mining-based evolutionary learning algorithm
 regularized least square