
Nowadays, ensuring high quality can be considered the main strength for a company's success. Especially in a period of economic recession, quality control is crucial from the operational and strategic point of view. There are different quality control methods, and it has been proven that, on the whole, companies using a continuous improvement approach, eliminating waste and maximizing productive flow, are more efficient and produce more with lower costs. This paper presents a method to optimize the quality control stage for a wood manufacturing firm. The method is based on the employment of principal component analysis in order to reduce the number of critical variables to be given as input to an artificial neural network (ANN) that identifies wood veneer defects. The proposed method allows the ANN classifier to identify defects in real time and increases the response speed during the quality control stage, so that veneers with defects do not pass through the whole production cycle but are rejected at the beginning.


Introduction
In recent years, the concept of quality has become crucial not just for the products themselves, but also as a competitiveness factor for companies. Nowadays, quality is synonymous with efficiency: companies that reach high levels of quality are more efficient because they produce better products at lower cost. Companies whose organization or processes do not exhibit acceptable levels of quality are characterized by a number of internal inefficiencies, such as dead times, poor coordination among numerous overlapping and disjointed activities, waste of resources, and a lack of tools and procedures for collecting the feedback needed for a process of continuous improvement.
Quality control may generally be defined as a system that maintains a desired level of quality through feedback on product/service characteristics and the implementation of remedial actions when such characteristics deviate from a specified standard (Mitra, 2012). Several tools stemming from statistics, computer science, and similar fields are used to perform and improve the process of quality control. For instance, artificial neural networks (ANNs) (McCulloch & Pitts, 1943) have been used in many real-world applications related to quality control, and in particular an ANN classifier has been reported to give the best results in recognizing wood veneer defects (D'Addona & Teti, 2013; Pham & Liu, 1995). The ANN takes as input 17 statistical features extracted from wood veneer images captured using a charge-coupled device (CCD) matrix camera. These images were converted into gray-level histograms after applying segmentation and image-processing algorithms in order to extract the features. Using these features, an ANN could be trained to distinguish between 12 veneer defects together with clear wood, giving 13 classes. During training, the ANN takes the 17 features of a sample as input and learns to assign the sample to one of the 13 classes; during recall, it receives the 17 features and indicates the class to which the sample belongs.
In order to improve the quality control and reduce the processing time, it is necessary to identify the critical variables and eliminate the redundant and noisy features. A useful tool for this is principal component analysis (PCA) (Pearson, 1901). This multivariate statistical technique, employed in many fields, uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components (PCs). In this way, it is possible to reduce the original space $\mathbb{R}^p$ by representing the variables in a new space $\mathbb{R}^k$ with k < p.
The aim of this paper is to improve quality control for the identification of wood veneer defects through the integration of PCA and ANN. The procedure consists of identifying the PCs of the 17 features in order to reduce the number of inputs given to the ANN, so that defects can be detected in real time with minimal error, thereby reducing the overall quality control time.
This paper is organized as follows: Section 2 surveys the literature on the identification of wood veneer defects; the experimental procedure is given in Section 3; the PCA pre-processing is given in Section 4; neural network design is given in Section 5; the results are given in Section 6; and finally conclusions and future research directions are given in Section 7.

Literature review
This section gives a survey of the papers regarding the identification of wood veneer defects, highlighting the pros and cons of the methods used.
In order to detect wood veneer defects, an automatic visual inspection system was developed by Alcock (1996, 1999a) to segment the images of birch wood boards. Monochrome images of the veneer are pre-processed by automated algorithms that locate defect areas (Pham & Alcock, 1999b), from which a set of numerical descriptors is extracted for further analysis. Seventeen statistical attributes of the local gray-level distribution were identified as relevant for defect identification (Lappalainen, Alcock, & Wani, 1994; Pham & Alcock, 1999c). Twelve possible veneer defects were distinguished in contrast to clear wood, giving 13 classes. For each data sample, a classifier takes the 17-dimensional vector of image features and decides to which of the 13 classes the pattern belongs.
Among several algorithms, the ANN has given the best results in recognizing wood veneer defects. A three-layered multi-layer perceptron (MLP) gave 85% identification rates (Packianather, 1997; Packianather, Drake, & Rowlands, 2000; Pham & Liu, 1995). Learning vector quantization networks have been applied to the classification of wood defects with high accuracy (Pham & Sagiroglu, 2000). In order to improve the classification accuracy of a single network, a decision tree of smaller and more specialized modular neural networks was introduced to achieve classification by successive refinements.
In Packianather and Drake (2004), response surface methodology was used to design an MLP network for classifying surface defects on wood veneer. The results showed that, although the performance of the neural network could be improved by this method, extrapolation outside the tested parameter range should be avoided.
A comparison between the minimum distance classifier (MDC) and neural networks for identifying wood veneer defects showed that the MDC does not perform as well as a neural network (Packianather & Drake, 2005).
Further, the Bees Algorithm (BA) was employed in place of the Back Propagation (BP) algorithm to optimize the weights of a neural network for the identification of wood defects (Pham, Ghanbarzadeh, Koc, Otri, & Packianather, 2006). Both algorithms showed the same accuracy, and in addition the BA proved to be considerably faster.
Lastly, the evolutionary ANN Generation and Training algorithm was used in the design and training of an MLP classifier for the identification of wood veneer defects (Castellani & Rowlands, 2009). The algorithm enabled the neural network topology and the weights to evolve over time. Compared to the approach based on the Taguchi method for the manual optimization of the MLP structure and of the control parameters of the BP rule for tuning the connection weights, this algorithm performed as well as the manually optimized ANN solutions while using considerably smaller network architectures.
This paper expands previous works on the classification of wood veneer defects using statistical features extracted from wood veneer defect images; in particular, it proposes a method to optimize the identification process through the integration of ANN and PCA. The PCA method has been widely used as a means of reducing the dimensionality of the input space (Ahmadzadeh & Lundberg, 2013; Charytoniuk & Chen, 2000; Mohamed-Saleh & Hoyle, 2008; Rajput, Das, Mishra, Singh, & Dwivedi, 2010; Sratthaphut, Jamrus Woothianusorn, & Toyama, 2013; Tabe, Simons, Savery, West, & Williams, 1999). In this paper, PCA is used to reduce the number of input features for the wood veneer defect identification problem.

Case study: wood veneer defects identification problem
Plywood is made of thin layers of wood, called veneers, joined together using an adhesive. The quality of a board is determined by the types and number of defects in the constituent sheets. High-quality boards should be made up of sheets containing as few defects as possible, or preferably none. To ensure this, careful inspection of the sheets is required as part of the quality control process. Defects of the veneer are identified by human inspectors as the sheets are transported to assembly on a conveyor. This task is extremely stressful and demanding, and a short disturbance or loss of attention results in misclassifications. A study conducted on human inspectors in wood mills reported that they could only obtain up to 55% accuracy in wood sheet inspection (Pölzleitner & Schwingshakl, 1992). Hence, an automatic visual inspection system was developed in order to increase the accuracy of wood sheet inspection (Drake & Packianather, 1998; Pham & Alcock, 1996, 1999c). Wood sheets were presented to a CCD camera which captured their images. These were segmented to separate clear wood and defective areas. Features were then extracted from the segmented images. The feature vectors obtained were finally presented to the defect classification module, which performed the task of grouping them into one of 12 types. Seventeen statistical features (Table 1) of the local gray-level distribution were identified as relevant for defect identification (Lappalainen et al., 1994; Pham & Alcock, 1999c).
Table 1. Seventeen statistical features of the local grey-level distribution.
1 Mean grey level μ
2 Median grey level
3 Mode grey level
4 Standard deviation of the grey levels σ
5 Skewness
6 Kurtosis
7 Number of pixels with a grey level of less than or equal to 80
8 Number of pixels with a grey level of greater than or equal to 220
9 Grey level p for which there are 20 pixels below
10 Grey level s for which there are 20 pixels above
11 Histogram tail length on the dark side (q − p)
12 Histogram tail length on the bright side (s − r)
13 Number of edge pixels after thresholding a segmented window at mean value
14 Number of pixels after thresholding at μ − 2σ
15 Number of edge pixels for feature 14
16 Number of pixels after thresholding at μ + 2σ
17 Number of edge pixels for feature 16

Twelve possible defects of the veneer can be distinguished in contrast to clear wood, giving 13 possible classes, shown in Figure 1. For each data sample, a classifier takes the 17-dimensional vector of image features and decides to which of the 13 classes the pattern belongs. The experimental procedure for training the ANN classifier is described in the following section.
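As an illustration of how the first eight features of Table 1 can be obtained from a window of pixels, the following minimal sketch computes them from a flat list of grey levels. The function name, the sample window, and the use of population moments for skewness and kurtosis are assumptions made for this example; the source does not specify the exact estimators used.

```python
import statistics


def grey_level_features(pixels, dark=80, bright=220):
    """Features 1-8 of Table 1 for a flat list of grey levels (0-255).

    Skewness and kurtosis use population moments (an assumption).
    """
    n = len(pixels)
    mean = statistics.fmean(pixels)
    sigma = statistics.pstdev(pixels)
    m3 = sum((g - mean) ** 3 for g in pixels) / n  # third central moment
    m4 = sum((g - mean) ** 4 for g in pixels) / n  # fourth central moment
    return {
        "mean": mean,
        "median": statistics.median(pixels),
        "mode": statistics.mode(pixels),
        "std": sigma,
        "skewness": m3 / sigma ** 3 if sigma else 0.0,
        "kurtosis": m4 / sigma ** 4 if sigma else 0.0,
        "n_dark": sum(g <= dark for g in pixels),      # feature 7
        "n_bright": sum(g >= bright for g in pixels),  # feature 8
    }


# Hypothetical 8-pixel window, used only to exercise the function.
window = [60, 80, 120, 120, 130, 140, 220, 250]
features = grey_level_features(window)
```

A window dominated by clear wood would be expected to give small dark and bright counts, which is consistent with the later observation that features 7 and 8 are often zero.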

Experimental procedure
For each defect shown in Figure 1, a 20 × 17 matrix was available, except for the defects of curly grain, holes, and worm holes, whose matrices were smaller. In order to avoid any imbalance in the training data, these three defects were excluded from this study. The number of rows of each matrix indicates the number of exemplars available for the class, and the number of columns indicates the total number of features extracted, as given in Table 1. An initial examination of features 7 and 8 showed that a considerable number of their values were zero, and for this reason they were excluded from this study. The PCA-based preprocessing was then applied to the remaining 15 features in order to find the principal variables to be given as input to the ANN. A feed-forward ANN with a Back-Propagation learning algorithm was chosen, using 75% of the data-set for training and 25% for testing. The results were compared with those obtained using all the features as input to the ANN.
Subsequently, several experiments were conducted in order to find the best ANN configuration for the wood veneer defect identification problem. The experiments were carried out using different numbers of hidden layers, different numbers of neurons in the hidden layers, different values of the Pearson coefficient, and different numbers of inputs to the network. Finally, Taguchi analysis (Roy, 1990) was performed in order to analyze the results and identify the best configuration of the ANN.
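The 75%/25% partition can be sketched as follows. Splitting within each class, so that every class contributes 15 training and 5 test exemplars, is an assumption of this example (the source only states the overall proportions); the function name and placeholder data are hypothetical.

```python
import random


def split_per_class(samples_by_class, train_fraction=0.75, seed=0):
    """Shuffle each class separately and split it into train and test parts."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))
        train += [(x, label) for x in shuffled[:cut]]
        test += [(x, label) for x in shuffled[cut:]]
    return train, test


# Placeholder data: 10 classes x 20 exemplars x 15 features.
data = {c: [[float(c)] * 15 for _ in range(20)] for c in range(10)}
train, test = split_per_class(data)
```

With 10 classes of 20 exemplars each, this yields 150 training and 50 test vectors.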

The PCA based feature selection
PCA is a multivariate statistical tool that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called PCs. The procedure starts by calculating the covariance matrix of the variables, provided they are expressed in the same unit. In our case the original variables have different units and orders of magnitude. For this reason, the original variables were expressed as standardized deviations, using the correlation matrix instead of the covariance matrix.
Each PC is expressed as a linear combination of the standardized deviations $z_1, \dots, z_p$ of the p variables. The first PC is:
$$y_1 = a_{11} z_1 + a_{12} z_2 + \dots + a_{1p} z_p$$
The score for the i-th statistical unit is:
$$y_{i1} = \sum_{s=1}^{p} a_{1s} z_{is}$$
where $a_{1s}$ is the coefficient of the first PC and s-th variable, and $z_{is}$ is the standardized deviation of the s-th variable for the i-th unit. The sign of this coefficient reveals the relationship between the first PC and the s-th variable, and its value shows how much this variable contributes to the scores of the first PC. In general, considering the first k PCs, the n × k score matrix is:
$$Y = Z A$$
where $Z$ is the n × p matrix of standardized deviations and $A$ is the p × k matrix whose ν-th column holds the coefficients $a_{\nu 1}, \dots, a_{\nu p}$ of the ν-th PC.

Production & Manufacturing Research: An Open Access Journal
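Let $y_\nu$ denote the ν-th PC, $z_s$ the s-th standardized variable, and $\lambda_\nu$ the ν-th eigenvalue of the correlation matrix. Using the normalization condition
$$\sum_{s=1}^{p} a_{1s}^{2} = 1,$$
the scores have mean equal to zero and variance equal to the first eigenvalue. In fact, considering the first PC:
$$M(y_1) = 0, \qquad \mathrm{Var}(y_1) = \lambda_1.$$
Alternatively, it is possible to obtain the scores of the PCs with mean equal to zero but variance equal to one by dividing each score by the square root of the respective eigenvalue. Another way is to consider coefficients $a_{\nu s}$ satisfying the following expression:
$$\sum_{s=1}^{p} a_{\nu s}^{2} = \lambda_\nu.$$
With this normalization, the $a_{\nu s}$ coefficients are the correlation coefficients between the components and each of the p variables:
$$a_{\nu s} = r(y_\nu, z_s), \qquad s = 1, \dots, p.$$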
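The properties stated above can be checked numerically. The sketch below runs a PCA on the correlation matrix with NumPy; the function name and the synthetic data are assumptions made for illustration and do not reproduce the veneer data set.

```python
import numpy as np


def pca_correlation(X, k):
    """PCA on the correlation matrix of X (n units x p variables)."""
    X = np.asarray(X, dtype=float)
    # Standardized deviations: zero mean and unit variance per column.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)        # p x p correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)    # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]       # sort eigenvalues descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    A = eigvecs[:, :k]                      # unit-norm coefficients
    scores = Z @ A                          # n x k score matrix
    loadings = A * np.sqrt(eigvals[:k])     # correlations PC / variable
    return eigvals, A, scores, loadings


# Synthetic data with very different scales and some correlated columns.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.3 * base[:, 1],
    100.0 * base[:, 2],
    base[:, 1] - base[:, 2],
])
eigvals, A, scores, loadings = pca_correlation(X, k=2)

# Correlation between the first PC and the first original variable.
r_first = np.corrcoef(scores[:, 0], X[:, 0])[0, 1]
```

With unit-norm coefficients the score variances equal the eigenvalues, while the rescaled coefficients (`loadings`) have squared sums equal to the eigenvalues and coincide with the PC/variable correlations.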

Number of PCs
There are several criteria for choosing the right number k of PCs. Considering a PCA based on the correlation matrix, the first criterion suggests choosing a number of PCs that accounts for a high percentage (at least 80%) of the total variance. This criterion can be modified by making the percentage threshold a function of the initial number of variables p.
The second criterion suggests retaining all the PCs with an eigenvalue higher than one. In this way, each retained PC explains a percentage of the total variance higher than that of a single variable.
The third criterion suggests drawing a graph, called a scree plot, of the eigenvalues $\lambda_\nu$ as a function of the number ν of PCs (ν = 1, 2, …, p). Since the eigenvalues are obtained in decreasing order, this plot is descending; the criterion suggests choosing k as the number of PCs if there is a marked variation between the k-th and (k + 1)-th eigenvalues.
Usually, all three criteria are used together in order to choose the optimum number k of PCs.
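The first two criteria are easy to automate, as in the rough sketch below. The eigenvalues are invented for illustration (they sum to p = 15, as for a correlation matrix) and are not those of Table 5. The scree criterion relies on visual judgement, so the function only returns the successive drops for inspection.

```python
def choose_k(eigenvalues, variance_share=0.80):
    """Return (k by cumulative variance, k by eigenvalue > 1, successive drops).

    `eigenvalues` must be sorted in decreasing order.
    """
    total = sum(eigenvalues)
    cumulative = 0.0
    k_variance = len(eigenvalues)
    for k, lam in enumerate(eigenvalues, start=1):
        cumulative += lam
        if cumulative / total >= variance_share:
            k_variance = k
            break
    k_kaiser = sum(1 for lam in eigenvalues if lam > 1.0)
    # Differences between consecutive eigenvalues, for scree inspection.
    drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
    return k_variance, k_kaiser, drops


# Hypothetical eigenvalues of a 15-variable correlation matrix.
example_eigenvalues = [5.2, 3.4, 2.1, 1.6, 0.8, 0.6, 0.4, 0.3,
                       0.2, 0.15, 0.1, 0.08, 0.04, 0.02, 0.01]
k_variance, k_kaiser, drops = choose_k(example_eigenvalues)
```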

PCA application
The PCA has been applied to the ten 20 × 15 matrices, one for each of the wood defects and clear wood considered, which together compose the n × p data matrix with n = 200 exemplars and p = 15 features. Table 2 reports the mean, the standard deviation, and the number of measurements available for each variable, while Table 3 reports the correlation matrix between the variables.
Bartlett's test of sphericity has been applied to the correlation matrix, and the rejection of the null hypothesis of absence of correlation between the variables indicates that PCA can safely be used on these data. Table 4 shows, for each variable, the percentage of its variance explained by the extracted PCs. All the variables except the 11th are well explained by the extracted PCs, with percentages of variance higher than 50% (between 58.8 and 98.4%). Table 5 contains the amount of total variance contributed by each component and the percentage of cumulative variance. Considering just the first four components, the percentage of cumulative variance was equal to 81.876%.
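For reference, the usual form of Bartlett's sphericity statistic is given below. The source does not state the formula it used, so this textbook form is an assumption; $R$ denotes the p × p sample correlation matrix and n the number of exemplars.

```latex
\chi^2 = -\left[(n - 1) - \frac{2p + 5}{6}\right] \ln \lvert R \rvert ,
\qquad
\text{df} = \frac{p(p - 1)}{2}
```

For n = 200 and p = 15 this gives 105 degrees of freedom. Under the null hypothesis $R$ is the identity matrix, so $\ln\lvert R\rvert = 0$; strongly correlated variables push $\lvert R\rvert$ towards zero and make the statistic large.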
The scree plot in Figure 2 shows a change of slope between the fourth and fifth components. Therefore, according to the scree plot there are four PCs.
Since the first four components are responsible for 81.876% of the total variance, which is clearly higher than the 46.329% suggested for p = 15, have eigenvalues higher than 1, and the scree plot shows a pronounced bend between 4 and 5, the three criteria agree on 4 as the optimum number of PCs.
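The 46.329% threshold quoted for p = 15 coincides numerically with $0.95^{p}$. The source does not state the rule behind the figure, so the snippet below is only an observation under that assumption, with a hypothetical function name.

```python
def variance_threshold(p, base=0.95):
    """Variance share threshold as a function of the number of variables p.

    The rule base**p is an assumption; it reproduces the quoted 46.329%.
    """
    return base ** p


threshold_15 = variance_threshold(15)
threshold_percent = round(100 * threshold_15, 3)
```

Under this rule the required share falls as p grows, which fits the idea that a fixed 80% target becomes harder to justify when many variables are involved.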
Table 6 reports the correlation coefficients between each PC and each variable. The sign of a coefficient indicates whether the relation between the component and the variable is direct or inverse, while its absolute value indicates the strength of the correlation between them.
The first PC shows a strong direct relation with variables 4, 5, and 6; the second PC shows a strong direct relation with variables 1, 2, 3, and 9; the third PC has an inverse relation with variable 17; finally, the fourth PC shows a strong relation with variable 12. For all these cases of strong relation, a Pearson coefficient α ≥ .7 has been considered. The threshold chosen for the Pearson coefficient determines the number of variables sufficient to represent a particular PC. Table 7 shows that when α ≥ .6, indicating a moderate relation between the variables, 13 variables instead of the initial 15 should be considered, while when α ≥ .7, indicating a strong relation between the variables, just 9 variables should be considered.
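The thresholding step can be sketched as below. The loadings are invented for illustration and are not those of Table 6. A variable is kept when the absolute value of its correlation with at least one retained PC reaches the threshold, so inverse relations count as well.

```python
def select_variables(loadings, alpha):
    """Keep variables whose strongest |correlation| with any PC is >= alpha.

    `loadings` maps a variable number to its correlations with PC1..PCk.
    """
    return sorted(
        variable
        for variable, row in loadings.items()
        if max(abs(a) for a in row) >= alpha
    )


# Hypothetical correlations of five variables with four PCs.
example_loadings = {
    1: [0.10, 0.92, 0.05, 0.02],
    4: [0.88, 0.12, -0.03, 0.10],
    11: [0.20, 0.15, 0.62, 0.05],
    13: [0.35, 0.41, 0.22, 0.30],
    17: [0.05, -0.10, -0.81, 0.12],
}
strong = select_variables(example_loadings, alpha=0.7)
moderate = select_variables(example_loadings, alpha=0.6)
```

Lowering the threshold from .7 to .6 admits more variables, mirroring the move from 9 to 13 variables reported in the text.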
The above analysis identified four PCs. Based on the results in Table 7 and α ≥ .6, features 4, 5, 6, 15, and 16 are merged under one component, which means they have something in common. These features are the standard deviation, skewness, kurtosis, number of edge pixels after thresholding at μ − 2σ, and number of pixels after thresholding at μ + 2σ, respectively. They are statistical features that describe the shape of the image data distribution relative to the Gaussian distribution. Therefore, these features merged under one PC represent the Gaussian characteristics of the image. Features 1, 2, 3, 9, and 10 are the mean gray level, median gray level, mode gray level, gray level p for which there are 20 pixels below, and gray level s for which there are 20 pixels above, respectively. What these features have in common is that they are related to the average gray level.
The third PC contains features 11 and 17, where feature 11 is the histogram tail length on the dark side and feature 17 is the number of edge pixels after thresholding at μ + 2σ. There is a link between the edge pixels and the histogram on both the dark and bright sides: the image is expected to be brighter at the edges after thresholding, which changes the histogram of both bright and dark pixels. However, the PCA has only found the direct relationship between the dark side and the thresholded edges. Therefore, this PC is sensitive to histogram changes at the edges.

ANN design
In this section, the ANN used to identify wood veneer defects is designed on the basis of the PCA results.
A feed-forward ANN with a Back-Propagation learning algorithm has been chosen, using 75% of the data-set for training and 25% for testing. The hidden layer has a tan-sigmoid transfer function because the data are normalized between −1 and +1, while the output layer has a log-sigmoid transfer function because the output of the network should be 0 (for defect free) or 1 (for defect). Figure 3 shows a network with an input layer of nine neurons representing the critical features considered, two hidden layers with 10 neurons each, and finally the output layer with 10 neurons, one for each defect and clear wood. Figure 4 shows the classification results in terms of training curves, both with the reduced number of features selected by the PCA and with all the features. The performance of the network clearly improves when the PCA is applied. On average, the training time improved by 56.6%.
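The forward pass of the 9-10-10-10 network in Figure 3 can be sketched in plain Python as follows. The weights are random placeholders and Back-Propagation training is not shown; all function names and the initialization range are assumptions made for this example.

```python
import math
import random


def tansig(x):
    """Tan-sigmoid transfer function, output in (-1, 1)."""
    return math.tanh(x)


def logsig(x):
    """Log-sigmoid transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def scale_to_unit_range(column):
    """Linearly map a feature column onto [-1, 1]."""
    lo, hi = min(column), max(column)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in column]


def make_layer(n_in, n_out, rng):
    """One weight row per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(x, layers):
    """Propagate x through the hidden layers (tansig) and output layer (logsig)."""
    activations = x
    for index, layer in enumerate(layers):
        transfer = logsig if index == len(layers) - 1 else tansig
        activations = [
            transfer(sum(w * a for w, a in zip(neuron, activations)) + neuron[-1])
            for neuron in layer
        ]
    return activations


rng = random.Random(0)
layers = [make_layer(9, 10, rng), make_layer(10, 10, rng), make_layer(10, 10, rng)]
x = [rng.uniform(-1.0, 1.0) for _ in range(9)]   # nine normalized features
outputs = forward(x, layers)
predicted_class = max(range(len(outputs)), key=outputs.__getitem__)
```

Taking the index of the largest output as the predicted class is one common decision rule; the source does not say which rule was applied.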

Results and Taguchi analysis
Several network configurations have been tested in order to improve the performance. Moreover, the Taguchi method has been applied to the training and test results in order to evaluate the effects of three factors on the performance of the network. The Taguchi orthogonal array (design set) used is given in Table 8. The signal-to-noise ratio according to the 'larger is better' criterion is given by the following expression:
$$S/N = -10 \log_{10}\left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{y_i^{2}}\right)$$
where $y_i$ is the response observed in the i-th repetition and n is the number of repetitions. Each experiment has been repeated five times and the average results have been taken into account. The main goal of these experiments is to investigate the outputs of the ANN in order to produce near-optimal expected results.
The first factor is the number of hidden layers, for which two levels were considered, low and high. The low level denotes one hidden layer and the high level denotes more than one hidden layer (in this case two hidden layers). In order to use an L9 orthogonal array, a dummy level was used for level 3, which was set to level 1. The second factor was the number of neurons in the hidden layer, for which three levels were chosen: low, medium, and high. For a single hidden layer, these levels varied from 5 to 15 neurons, whereas in the case of two hidden layers the levels were lower than 10, 10, and higher than 10 neurons. Finally, the third factor considered was the Pearson coefficient, for which three levels were chosen: complete correlation, i.e. the complete set of features (level 1), moderately correlated features (α ≥ .6 or α ≤ −.6) (level 2), and highly correlated features (α ≥ .7 or α ≤ −.7) (level 3).
Table 9 shows the results using one hidden layer, changing the number of neurons in the hidden layer from 5 to 15 and using the three levels of the Pearson coefficient. Table 10 shows the results when two hidden layers are used, changing the number of neurons in such a way that their values always lie between the number of input and output neurons.
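The 'larger is better' signal-to-noise ratio and the dummy-level treatment can be sketched as follows. The array shown is the first three columns of the standard L9 orthogonal array and is not necessarily identical to Table 8; the five response values are hypothetical.

```python
import math


def sn_larger_is_better(responses):
    """Taguchi S/N ratio (dB) for a 'larger is better' response."""
    n = len(responses)
    return -10.0 * math.log10(sum(1.0 / y ** 2 for y in responses) / n)


# First three columns of the standard L9 array (levels 1-3 per factor):
# (hidden layers, neurons per layer, Pearson coefficient level).
L9 = [
    (1, 1, 1), (1, 2, 2), (1, 3, 3),
    (2, 1, 2), (2, 2, 3), (2, 3, 1),
    (3, 1, 3), (3, 2, 1), (3, 3, 2),
]


def apply_dummy_level(array, column=0, dummy=3, replacement=1):
    """Map the unused third level of a two-level factor back to level 1."""
    return [
        tuple(replacement if (i == column and level == dummy) else level
              for i, level in enumerate(run))
        for run in array
    ]


design = apply_dummy_level(L9)

# Hypothetical accuracies (%) from five repetitions of one run.
sn_example = sn_larger_is_better([90.0, 92.0, 91.0, 93.0, 89.0])
```

After the dummy substitution, level 1 of the first factor appears in six runs and level 2 in three. The S/N ratio rewards both a high mean and a low spread across repetitions.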
Training the ANN with one hidden layer, from 5 to 15 neurons per layer, and the complete set of features maximizes the mean performance and minimizes the standard deviation (i.e. it gives the larger S/N ratio) (Figure 5(a)). The training of the ANN is also influenced by the interaction between variables. Reducing the features to the highly correlated variables (Pearson coefficient α at level 3) has the best effect on the mean training performance if the number of hidden layers is limited to one. Increasing the number of neurons from 10 to 15 (level 3 in Figure 5(b)) has no interaction with layer complexity. When ANNs are trained with all the features available (more inputs to the network), the number of neurons does not influence the mean performance. A moderate number of neurons per layer (10, i.e. level 2) reduces both the training variance and the training time while keeping the training success rate around 92%. Because the mean and S/N values of the training performance change little, it is better to use fewer inputs to the network in order to reduce the computational time.
Figure 6(a) shows the signal-to-noise ratios and the results according to the 'larger is better' criterion. As the number of neurons increases, the variance decreases. Moreover, the minimum number of hidden layers and more inputs to the network maximize the mean and minimize the variance. The optimal testing performance is achieved with 15 neurons. Figure 6(b) shows the interaction between the correlated variables and the number of hidden layers for the mean performance of the ANN in the testing stage. The performance of the ANN with the minimum number of hidden layers is found to be 73% with the reduced number of features. With moderately correlated variables, the testing performance is 80% and the testing time is reduced by around 40%. This performance has been found with more than one hidden layer and 10 neurons in each layer.
The worst performance is found with fewer inputs to the network and fewer neurons. The best performance is found with highly correlated features, the minimum number of hidden layers, and more than 10 neurons per layer.
In summary, the results of the Taguchi analysis show that the best performance of the ANN is reached using one hidden layer, a higher number of neurons in the hidden layer, and a lower value of the Pearson coefficient.
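A main-effects analysis of this kind averages the response over all runs that share a factor level and then picks the best level per factor. The sketch below uses a hypothetical two-factor, four-run design with invented responses, not the results of Tables 9 and 10.

```python
from collections import defaultdict


def main_effects(design, responses):
    """Mean response at each level of each factor."""
    effects = []
    for factor in range(len(design[0])):
        by_level = defaultdict(list)
        for run, y in zip(design, responses):
            by_level[run[factor]].append(y)
        effects.append({level: sum(values) / len(values)
                        for level, values in sorted(by_level.items())})
    return effects


def best_levels(effects):
    """Level with the largest mean response for each factor."""
    return tuple(max(levels, key=levels.get) for levels in effects)


# Hypothetical full-factorial design with two factors at two levels.
toy_design = [(1, 1), (1, 2), (2, 1), (2, 2)]
toy_responses = [80.0, 90.0, 70.0, 76.0]
effects = main_effects(toy_design, toy_responses)
best = best_levels(effects)
```

The same averaging can be applied to S/N ratios instead of raw responses when both the mean and the spread matter.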

Conclusion
This paper has introduced ANN-based intelligent quality control for the detection of wood veneer defects with a reduced set of inspection criteria. To reduce the training time and increase the testing performance, a dimension reduction stage based on principal component analysis (PCA) has been proposed. The proposed PCA method is based on the determination of the critical features for the inspection process. The method has been applied to a case study on identifying defects in wood veneer. The reduced feature set has been used as input to train the ANN classifier, which successfully identified the defects and clear wood. The best performance with one hidden layer was obtained with more than 10 neurons. Reducing the number of features reduces the number of epochs. For the best-performing configuration, a reduction of 61 epochs increased the quality of the outputs in the testing stage by 18%. Different ANN designs have been studied by running experiments with three control factors and carrying out a Taguchi analysis on the results to determine the best-performing ANN topology.