Radar detection with the Neyman–Pearson criterion using supervised learning machines trained with the cross-entropy error
EURASIP Journal on Advances in Signal Processing volume 2013, Article number: 44 (2013)
Abstract
The application of supervised learning machines trained to minimize the Cross-Entropy error to radar detection is explored in this article. The detector is implemented with a learning machine that realizes a discriminant function, whose output is compared to a threshold selected to fix a desired probability of false alarm. The study is based on the calculation of the function that the learning machine approximates during training, and on the application of a sufficient condition for a discriminant function to be used to approximate the optimum Neyman–Pearson (NP) detector. In this article, the function that a supervised learning machine approximates after being trained to minimize the Cross-Entropy error is obtained. This discriminant function can be used to implement the NP detector, which maximizes the probability of detection while keeping the probability of false alarm below or equal to a predefined value. Some experiments on signal detection using neural networks are also presented to test the validity of the study.
Introduction
In this article, an extension of the theoretical study presented in [1] on the capability of supervised learning machines to approximate the Neyman–Pearson (NP) detector is presented. This detector can be implemented by comparing the likelihood ratio, $\Lambda(\mathbf{z})$, to a detection threshold fixed according to the Probability of False Alarm ($P_{FA}$) requirements, as stated in expression (1) [2, 3], where $f(\mathbf{z}|H_i)$, $i\in\{0,1\}$, are the likelihood functions under the null ($H_0$) and the alternative ($H_1$) hypotheses.
$$\Lambda(\mathbf{z})=\frac{f(\mathbf{z}|H_1)}{f(\mathbf{z}|H_0)}\underset{H_0}{\overset{H_1}{\gtrless}}\eta_{lr}(P_{FA}) \qquad (1)$$
The NP criterion has been widely used in radar applications. The robustness of the likelihood ratio detector for moderately fluctuating targets was studied in [4]. In recent years, still within radar applications, the NP criterion has been applied in MIMO radars [5, 6], distributed radar sensor networks [7], and the detection of ships in marine environments [8]. The NP criterion has also been applied in other fields: watermarking [9, 10], detection of fault-induced dips [11], detection in sensor networks [12], disease diagnosis [13, 14], biometrics [15], and gravitational wave detection [16]. Great efforts are currently being made to solve a number of problems related to signal detection in noise [17].
For the NP detector to be implemented, both likelihood functions must be known. Usually, statistical models of interference and targets are assumed and their parameters are estimated using the available data. Detection losses are therefore expected when the statistical properties of the interference or the target differ from those assumed in the design. In addition, when some of the parameters are random and composite hypothesis tests must be used, the average likelihood ratio can lead to intractable integrals that must be solved by numerical approximation. Detectors based on learning machines allow us to approximate the NP detector using only training data obtained experimentally, without knowledge of the likelihood functions. The main advantage of this approach is that no statistical models have to be assumed during the design and, if a suitable error function is used during training, a good approximation to the optimal NP detector is obtained [1]. The main drawbacks are the difficulty of obtaining representative training samples and of defining the most suitable learning machine architecture.
The application of supervised learning machines to approximate the NP detector has already been studied. The easiest way is to use a learning machine with only one output, which is compared to a threshold in order to decide in favor of the null or the alternative hypothesis. The threshold is used to fix the desired $P_{FA}$. This scheme has been used previously in several works [1, 18–20]. An equivalent implementation consists in varying the bias of the output neuron [21, 22]. A different approach is used in [23]: a two-output NN with outputs in (0,1) was used, and the difference between both outputs was compared to a threshold. This approach is equivalent to using an NN with only one output and desired outputs {−1,1} [24]. More recently, Radial Basis Function Neural Networks (RBFNNs) have also been applied to approximate the NP detector [25–27]. Support Vector Machines (SVMs) have been applied to signal detection in background noise [28]. Finally, detectors based on committee machines have also been proposed [29–32].
The possibility of approximating the NP detector using adaptive systems trained in a supervised manner was studied in [33]. A sufficient condition for a discriminant function to be suitable for implementing the NP detector was obtained. In [1], those results were used to carry out a more general study of the capability of learning machines to approximate the NP detector when they are trained in a supervised manner to minimize an error function, demonstrating that the Sum-of-Squares error is suitable to approximate the NP detector, and that the Minkowski error with R = 1 is suitable to approximate the minimum probability of error classifier. With R = 1, the Minkowski error reduces to the mean absolute deviation.
The Sum-of-Squares error is optimal for training supervised learning machines to detect or classify Gaussian signals. If non-Gaussian interference is assumed in the radar, other error functions may well give better results, which motivates studying whether they fulfil the sufficient condition established in [1, 33]. In this article, one more error function is considered, the Cross-Entropy error. The study demonstrates that the Cross-Entropy error is also suitable for training supervised learning machines to approximate the NP detector, and that it can even improve on the performance of learning machines trained with the Sum-of-Squares error.
The article is structured as follows. In Section 2, the problem this article deals with is presented. The function that a learning classifier with one output approximates when it has been trained to minimize the Cross-Entropy error is calculated in Section 3. The condition stated in [1] is applied to demonstrate that the approximated function is useful to approximate the NP detector. In Section 4, some experiments are presented to illustrate the previous theoretical studies. Finally, in Section 5 the main contributions of this article are summarized and conclusions are drawn.
Problem statement
The usefulness of supervised learning machines trained to minimize the Sum-of-Squares error to approximate the NP detector, and the usefulness of the Minkowski error with R = 1 to approximate the minimum probability of error classifier, have been demonstrated in [1]. In this article, we extend the study to the Cross-Entropy error. The discriminant function that the learning machine approximates after training is obtained, and the sufficient condition stated in [33] is applied. The detector is implemented by comparing the output of the discriminant function to a threshold. The final approximation error will depend on the selected error function, the selected training and validation sets, the system structure, and the training algorithm [34]. In order to obtain a good approximation, the training set must be a representative subset of the input space, the function implemented by the learning machine must be sufficiently general that there is a choice of adaptive parameters which makes the error function sufficiently small, and the learning algorithm must be able to find the appropriate minimum of the error function [35].
In our study, a learning machine with one output is considered, which is used to classify input vectors $\mathbf{z}={[{z}_{1},{z}_{2},\dots ,{z}_{L}]}^{T}$ into two hypotheses or classes, $H_0$ and $H_1$, which stand for the absence of a target and for its presence, respectively, in radar detection problems. The basic detector scheme is represented in Figure 1.
Given a decision rule, let ${\mathcal{Z}}_{i}$ be the set of all possible input vectors that will be assigned to hypothesis $H_i$, and $\mathcal{Z}$ the ensemble of all possible input vectors (${\mathcal{Z}}_{0}\cup {\mathcal{Z}}_{1}=\mathcal{Z}$). The output of the learning machine is represented by $F(\mathbf{z})$, and the desired output by ${t}_{{H}_{i}}$. A training set $\mathcal{X}={\mathcal{X}}_{0}\cup {\mathcal{X}}_{1}$ is available, where ${\mathcal{X}}_{1}$ is composed of $N_1$ training patterns from hypothesis $H_1$, and ${\mathcal{X}}_{0}$ is composed of $N_0$ training patterns from hypothesis $H_0$ ($N=N_1+N_0$).
In order to study the suitability of the error function for approximating the NP detector, the same strategy applied in [1] is used. The function that the learning machine approximates after training is obtained as a function of the likelihood functions and the prior probabilities. The implemented detector compares the learning machine output to a threshold $\eta_0$, which is varied to fix the $P_{FA}$. The NP detector is usually implemented by comparing the likelihood ratio to a threshold $\eta_{lr}$, fixed according to the required $P_{FA}$. A sufficient condition has been established in [1, 33], which states that, for a learning machine to approximate the NP detector, the relation between $\eta_{lr}$ and $\eta_0$ must not depend on the input vector.
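As an illustration of how the threshold $\eta_0$ can be varied to fix the $P_{FA}$, the following minimal sketch picks $\eta_0$ empirically from learning machine outputs observed under $H_0$. The function names and the synthetic Beta-distributed outputs are assumptions made for this example; they are not taken from the article.

```python
import numpy as np

def threshold_for_pfa(outputs_h0, pfa):
    """Pick eta_0 so that the fraction of H0 outputs strictly above it is <= pfa."""
    s = np.sort(np.asarray(outputs_h0, dtype=float))
    n = s.size
    # Number of false alarms that can be tolerated among the n outputs.
    k = min(int(np.floor(pfa * n)), n - 1)
    # At most k sorted outputs lie strictly above s[n - 1 - k].
    return s[n - 1 - k]

def decide(outputs, eta_0):
    """Decide H1 (True) when the learning machine output exceeds eta_0."""
    return np.asarray(outputs) > eta_0

# Hypothetical stand-in for the outputs of a trained machine under H0.
rng = np.random.default_rng(0)
outputs_h0 = rng.beta(2.0, 8.0, size=100_000)

eta_0 = threshold_for_pfa(outputs_h0, pfa=1e-3)
estimated_pfa = decide(outputs_h0, eta_0).mean()
print(eta_0, estimated_pfa)
```

In practice the threshold would be selected on one set of $H_0$ patterns and the resulting $P_{FA}$ estimated on an independent test set.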
In the following sections, in order to obtain the function that the learning machine approximates after training, the strong law of large numbers is applied [36]. It asserts that if $\langle X_i\rangle$ is a sequence of independent and identically distributed random variables with expectation $\mu$, then:
$$P\left(\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}X_i=\mu\right)=1 \qquad (2)$$
Discriminant function approximated by a learning machine trained to minimize the cross-entropy error
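The error function to be studied is the Cross-Entropy error [34, 37], defined in the following expression when a one-output learning machine is considered, the desired outputs are one and zero, and $F\left(\mathbf{z}\right):\mathcal{Z}\to (0,1)$ (the function implemented by the system maps $\mathcal{Z}$ into the interval (0,1)):
$$E=-\frac{1}{N}\left[\sum_{\mathbf{z}\in\mathcal{X}_1}\ln\left(F(\mathbf{z})\right)+\sum_{\mathbf{z}\in\mathcal{X}_0}\ln\left(1-F(\mathbf{z})\right)\right] \qquad (3)$$
Here $\mathcal{X}_1$ and $\mathcal{X}_0$ denote the subsets of training patterns from $H_1$ and $H_0$, respectively.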
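If the number of patterns tends to infinity ($N\to\infty$), the error can be expressed as follows, where the sums run over the $N_1$ and $N_0$ training patterns from $H_1$ and $H_0$ (the subsets $\mathcal{X}_1$ and $\mathcal{X}_0$), and the fractions $N_1/N$ and $N_0/N$ tend to the prior probabilities $P(H_1)$ and $P(H_0)$:
$$E_m=\lim_{N\to\infty}E=-P(H_1)\lim_{N_1\to\infty}\frac{1}{N_1}\sum_{\mathbf{z}\in\mathcal{X}_1}\ln\left(F(\mathbf{z})\right)-P(H_0)\lim_{N_0\to\infty}\frac{1}{N_0}\sum_{\mathbf{z}\in\mathcal{X}_0}\ln\left(1-F(\mathbf{z})\right) \qquad (4)$$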
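Applying the strong law of large numbers, expression (5) is obtained:
$$E_m=-P(H_1)\int_{\mathcal{Z}}\ln\left(F(\mathbf{z})\right)f(\mathbf{z}|H_1)\,d\mathbf{z}-P(H_0)\int_{\mathcal{Z}}\ln\left(1-F(\mathbf{z})\right)f(\mathbf{z}|H_0)\,d\mathbf{z} \qquad (5)$$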
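The function $F(\mathbf{z})$ that minimizes $E_m$, which will be denoted by $F_0(\mathbf{z})$, is obtained using the calculus of variations, and particularly the Euler–Lagrange differential equation [38, 39]. The calculus of variations can be used to find the function $F(\mathbf{z})$ that minimizes the functional $J(F)$ defined as follows:
$$J(F)=\int_{\mathcal{Z}}I\left(\mathbf{z},F(\mathbf{z}),F_1^{\prime},\dots,F_L^{\prime}\right)d\mathbf{z} \qquad (6)$$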
where $\mathbf{z}={[{z}_{1},{z}_{2},\dots ,{z}_{L}]}^{T}$, $I$ is twice differentiable with respect to the indicated arguments, and $F$ is a function in ${C}^{2}\left(\mathcal{Z}\right)$ that assumes prescribed values at all points of the boundary $\delta \mathcal{Z}$ of the domain $\mathcal{Z}$. The function $F$ that minimizes $J(F)$ can be obtained by solving the Euler–Lagrange equation (7), where ${F}_{k}^{\prime}=\frac{\partial F}{\partial {z}_{k}}$.
$$\frac{\partial I}{\partial F}-\sum_{k=1}^{L}\frac{\partial}{\partial z_k}\left(\frac{\partial I}{\partial F_k^{\prime}}\right)=0 \qquad (7)$$
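In our problem, $J(F)=E_m(F)$ and $I(\mathbf{z},F(\mathbf{z}))=-P(H_1)f(\mathbf{z}|H_1)\ln(F(\mathbf{z}))-P(H_0)f(\mathbf{z}|H_0)\ln(1-F(\mathbf{z}))$, which does not depend on the first derivatives of $F$. Therefore, $F$ only needs to be defined in ${C}^{0}\left(\mathcal{Z}\right)$ and the Euler–Lagrange equation reduces to:
$$\frac{\partial I}{\partial F}=-\frac{P(H_1)f(\mathbf{z}|H_1)}{F(\mathbf{z})}+\frac{P(H_0)f(\mathbf{z}|H_0)}{1-F(\mathbf{z})}=0 \qquad (8)$$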
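The function $F_0(\mathbf{z})$ that minimizes $E_m$ is given in (9), and the detection rule is obtained by comparing $F_0(\mathbf{z})$ to $\eta_0$ (10).
$$F_0(\mathbf{z})=\frac{P(H_1)f(\mathbf{z}|H_1)}{P(H_1)f(\mathbf{z}|H_1)+P(H_0)f(\mathbf{z}|H_0)} \qquad (9)$$
$$\frac{P(H_1)f(\mathbf{z}|H_1)}{P(H_1)f(\mathbf{z}|H_1)+P(H_0)f(\mathbf{z}|H_0)}\underset{H_0}{\overset{H_1}{\gtrless}}\eta_0 \qquad (10)$$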
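Dividing the numerator and denominator of the left side of rule (10) by $f(\mathbf{z}|H_0)$, an equivalent rule can be obtained, which is a function of the likelihood ratio $\Lambda(\mathbf{z})=f(\mathbf{z}|H_1)/f(\mathbf{z}|H_0)$:
$$\frac{P(H_1)\Lambda(\mathbf{z})}{P(H_1)\Lambda(\mathbf{z})+P(H_0)}\underset{H_0}{\overset{H_1}{\gtrless}}\eta_0 \qquad (11)$$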
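Solving for the likelihood ratio, a new equivalent rule (12) can be derived, which compares the likelihood ratio to a new threshold, $\eta_{lr}$. The expression that relates $\eta_{lr}$ and $\eta_0$ is presented in (13):
$$\Lambda(\mathbf{z})\underset{H_0}{\overset{H_1}{\gtrless}}\eta_{lr} \qquad (12)$$
$$\eta_{lr}=\frac{P(H_0)\,\eta_0}{P(H_1)\,(1-\eta_0)} \qquad (13)$$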
The relation between $\eta_{lr}$ and $\eta_0$ does not depend on the input vector, $\mathbf{z}$. Thus, according to the sufficient condition proposed in [1, 33], the detection rule (10) is an implementation of the NP detector.
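The result of this section can be checked numerically on a toy problem. The sketch below assumes a scalar observation with unit-variance Gaussian likelihoods (means 0 under $H_0$ and 1 under $H_1$) and arbitrary values of $P(H_1)$ and $\eta_0$; none of these values come from the article. It verifies (i) that the pointwise minimizer of $-P(H_1)f(z|H_1)\ln F-P(H_0)f(z|H_0)\ln(1-F)$ is $P(H_1)f(z|H_1)/(P(H_1)f(z|H_1)+P(H_0)f(z|H_0))$, and (ii) that thresholding this function at $\eta_0$ gives the same decisions as thresholding the likelihood ratio at $P(H_0)\eta_0/(P(H_1)(1-\eta_0))$.

```python
import numpy as np

def gaussian_pdf(z, mean, std=1.0):
    """Scalar Gaussian density, used here as a toy likelihood function."""
    return np.exp(-0.5 * ((z - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def posterior_h1(z, p1, mean0=0.0, mean1=1.0):
    """F_0(z) = P(H1) f(z|H1) / (P(H1) f(z|H1) + P(H0) f(z|H0))."""
    num = p1 * gaussian_pdf(z, mean1)
    return num / (num + (1.0 - p1) * gaussian_pdf(z, mean0))

def likelihood_ratio(z, mean0=0.0, mean1=1.0):
    return gaussian_pdf(z, mean1) / gaussian_pdf(z, mean0)

def eta_lr_from_eta_0(eta_0, p1):
    """Likelihood ratio threshold equivalent to eta_0 applied to F_0."""
    return (1.0 - p1) * eta_0 / (p1 * (1.0 - eta_0))

p1, eta_0 = 0.3, 0.6          # assumed prior P(H1) and output threshold

# (i) Pointwise minimiser of the integrand over a grid of candidate F values.
grid = np.linspace(0.001, 0.999, 999)
z_check = 0.8
integrand = (-p1 * gaussian_pdf(z_check, 1.0) * np.log(grid)
             - (1.0 - p1) * gaussian_pdf(z_check, 0.0) * np.log(1.0 - grid))
f_grid_min = grid[np.argmin(integrand)]
f_closed_form = posterior_h1(z_check, p1)

# (ii) Both rules yield the same decisions on random observations.
rng = np.random.default_rng(1)
z = rng.normal(0.5, 2.0, size=10_000)
decisions_f0 = posterior_h1(z, p1) > eta_0
decisions_lr = likelihood_ratio(z) > eta_lr_from_eta_0(eta_0, p1)
mismatches = int(np.sum(decisions_f0 != decisions_lr))
print(f_grid_min, f_closed_form, mismatches)
```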
Experiments
In this section, detectors based on Multilayer Perceptrons (MLPs) are designed for three case studies: detection of colored Gaussian signals in white Gaussian interference, detection of colored Gaussian signals in correlated Gaussian clutter plus white Gaussian noise, and detection of non-fluctuating targets in K-distributed interference [40, 41]. In practical situations, the statistical properties of the interference can be estimated and tracked to some degree, but the target parameters are very difficult to estimate. When the target parameters are unknown, the NP detector is built with the Average-Likelihood ratio, which is compared to a threshold. The three case studies have been selected because the optimum NP detector can be easily approximated using a Maximum-Likelihood estimator of the Average-Likelihood ratio.
Two strategies are followed to check the performance of the proposed detectors:

First, the detectors based on supervised learning machines are compared with an approximation of the NP detector. The Average-Likelihood ratio is approximated by a Maximum-Likelihood estimator, based on the Constrained Generalized Likelihood Ratio (CGLR) [42]. The CGLR is built with a number of filters equal to the dimension of the input vector (L). Increasing the number of filters does not produce a significant improvement in the performance of the CGLR.

Second, the detectors obtained after training the MLPs with the Cross-Entropy error are compared with their equivalents obtained after training with the Sum-of-Squares error and the Minkowski error with R = 1. These comparisons are only performed for the first case study, due to space limitations, but similar comparative results are obtained in the other two case studies. The comparison is completed with a representation of $P_D$ versus SNR for the best detectors obtained with the different error functions, for the first case study.
The Receiver Operating Characteristic (ROC) curves of all the considered detectors are represented, to show the validity of our approach.
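A ROC curve of the kind referred to above can be estimated by counting, for each threshold, the fraction of detector outputs that exceed it under each hypothesis. The following is a minimal sketch; the function name and the synthetic Beta-distributed outputs are placeholders assumed for illustration, not the detectors of the article.

```python
import numpy as np

def roc_points(outputs_h0, outputs_h1, thresholds):
    """Estimate (P_FA, P_D) by counting outputs strictly above each threshold."""
    h0 = np.sort(np.asarray(outputs_h0, dtype=float))
    h1 = np.sort(np.asarray(outputs_h1, dtype=float))
    thresholds = np.asarray(thresholds, dtype=float)
    # searchsorted(..., side="right") counts the outputs <= threshold.
    p_fa = 1.0 - np.searchsorted(h0, thresholds, side="right") / h0.size
    p_d = 1.0 - np.searchsorted(h1, thresholds, side="right") / h1.size
    return p_fa, p_d

# Hypothetical detector outputs in (0, 1) under each hypothesis.
rng = np.random.default_rng(2)
outputs_h0 = rng.beta(2.0, 6.0, size=200_000)
outputs_h1 = rng.beta(6.0, 2.0, size=50_000)

thresholds = np.linspace(0.0, 1.0, 201)
p_fa, p_d = roc_points(outputs_h0, outputs_h1, thresholds)
```

Plotting `p_d` against `p_fa` (usually with a logarithmic $P_{FA}$ axis in radar applications) gives the ROC curve.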
Considered detection problems
The following detection problems are studied in this article, to assess the capability of learning machines trained to minimize the Cross-Entropy error to implement good approximations of the optimum NP detector:

Case study 1: Detection of Gaussian fluctuating targets in the presence of Additive White Gaussian Noise (AWGN). This case corresponds to target detection in clear conditions. This case study has been subdivided into two:

Detection of Gaussian targets with unknown correlation coefficient.

Detection of Swerling I (SWI) targets with unknown Doppler shift.


Case study 2: Detection of Gaussian fluctuating targets in the presence of correlated Gaussian clutter and AWGN. This model can be used for target detection in AWGN and sea/land clutter with low-resolution radar systems, or with high-resolution radar systems with incidence angles higher than 10 degrees. Again, this case study has been subdivided into two:

Detection of Gaussian targets with unknown correlation coefficient.

Detection of SWI targets with unknown Doppler shift.


Case study 3: Detection of non-fluctuating targets in the presence of spiky K-distributed clutter (ν = 0.5, where ν is the shape parameter of the K-distribution). This model is suitable for target detection in sea/land clutter with high-resolution radar systems and low grazing angles. In this case, the problem of detecting Swerling V (SWV) targets with unknown Doppler shift is considered.
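To make the first case study more concrete, the sketch below generates synthetic patterns for one of its sub-cases: a Swerling I target with unknown Doppler shift in AWGN. The signal model is an assumption made for this example (unit noise power, a single zero-mean complex Gaussian amplitude per pattern that stays constant over the pulses, a Doppler phase step drawn uniformly from [0, 2π), and SNR defined as mean target power over noise power); the article does not spell out these details, and all function names are hypothetical.

```python
import numpy as np

def complex_awgn(rng, n_patterns, n_pulses, power=1.0):
    """Complex white Gaussian samples with the given mean power."""
    scale = np.sqrt(power / 2.0)
    shape = (n_patterns, n_pulses)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def swerling1_target(rng, n_patterns, n_pulses, snr_db):
    """Assumed SWI model: one complex Gaussian amplitude per pattern,
    rotated from pulse to pulse by a random Doppler phase step."""
    power = 10.0 ** (snr_db / 10.0)            # relative to unit noise power
    amplitude = complex_awgn(rng, n_patterns, 1, power)
    omega = rng.uniform(0.0, 2.0 * np.pi, size=(n_patterns, 1))
    pulse_index = np.arange(n_pulses)
    return amplitude * np.exp(1j * omega * pulse_index)

def to_real_inputs(echoes):
    """Stack in-phase and quadrature components (ordering is an assumption)."""
    return np.concatenate([echoes.real, echoes.imag], axis=1)

rng = np.random.default_rng(3)
n_patterns, n_pulses = 20_000, 8
z_h0 = complex_awgn(rng, n_patterns, n_pulses)
z_h1 = (swerling1_target(rng, n_patterns, n_pulses, snr_db=10.0)
        + complex_awgn(rng, n_patterns, n_pulses))
x_h0, x_h1 = to_real_inputs(z_h0), to_real_inputs(z_h1)
```

The other case studies would additionally require correlated Gaussian or K-distributed clutter generators, which are not sketched here.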
Detectors architecture, training and test parameters
Multilayer Perceptrons with real arithmetic are designed. Each MLP has an input layer, one hidden layer, and one output. In these examples, a pulsed radar that provides eight complex-valued echoes in each exploration, due to antenna rotation and beamwidth, is considered (this is the usual case in air traffic control radars). Each complex-valued echo consists of the in-phase and quadrature components. Since the input vector is composed of eight complex-valued echoes, L = 16 real inputs are required. The dependence of performance on the number of neurons in the hidden layer (M) is studied. The output is compared to a hard threshold, selected according to $P_{FA}$ (Figure 1). Accordingly, the architectures of the MLPs are labeled MLP L/M/1. The activation function of the processing units is the sigmoid.
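The MLP L/M/1 structure described above can be sketched as a forward pass in NumPy. This is only an illustration: the weights are random and untrained, the ordering of in-phase and quadrature components in the input vector is an assumption, and the class name, the choice M = 17, and the threshold 0.5 are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class MLP:
    """MLP L/M/1 with sigmoid units; weights are random here (untrained)."""

    def __init__(self, n_inputs, n_hidden, rng):
        self.w1 = rng.normal(0.0, 1.0 / np.sqrt(n_inputs), size=(n_inputs, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 1.0 / np.sqrt(n_hidden), size=(n_hidden, 1))
        self.b2 = np.zeros(1)

    def forward(self, x):
        """Map a batch of real input vectors to outputs in (0, 1)."""
        hidden = sigmoid(x @ self.w1 + self.b1)
        return sigmoid(hidden @ self.w2 + self.b2)[:, 0]

rng = np.random.default_rng(4)
# Five example patterns of eight complex-valued echoes each.
echoes = rng.standard_normal((5, 8)) + 1j * rng.standard_normal((5, 8))
# L = 16 real inputs: in-phase components followed by quadrature components.
x = np.concatenate([echoes.real, echoes.imag], axis=1)

mlp = MLP(n_inputs=16, n_hidden=17, rng=rng)   # MLP 16/17/1
outputs = mlp.forward(x)
decisions = outputs > 0.5                      # illustrative hard threshold
```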
For training the different MLPs, a training set composed of 50,000 patterns has been used. The training set consists of patterns belonging to both hypotheses, which are considered with equal prior probabilities (the same number of patterns for $H_0$ and $H_1$). A cross-validation strategy has been used during training to avoid overfitting, following the k-fold approach with k = 5. The validation set is composed of 10,000 patterns, 5,000 from each hypothesis.
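One possible reading of this setup is that the 50,000 patterns are split into five class-balanced folds, each of which then contains 10,000 patterns (5,000 per hypothesis) and serves as validation set in turn. The sketch below implements such a stratified split; this interpretation and the function name are assumptions, since the exact partitioning procedure is not detailed in the text.

```python
import numpy as np

def stratified_folds(labels, k, rng):
    """Split pattern indices into k folds with equal class counts per fold."""
    labels = np.asarray(labels)
    folds = [[] for _ in range(k)]
    for cls in np.unique(labels):
        # Shuffle the indices of this class and deal them out to the folds.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        for i, part in enumerate(np.array_split(idx, k)):
            folds[i].append(part)
    return [np.concatenate(parts) for parts in folds]

# 25,000 patterns per hypothesis: label 0 for H0, label 1 for H1.
labels = np.repeat([0, 1], 25_000)
rng = np.random.default_rng(5)
folds = stratified_folds(labels, k=5, rng=rng)

val_idx = folds[0]                       # validation fold for the first round
train_idx = np.concatenate(folds[1:])    # remaining four folds for training
```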
For testing, a different set of patterns has been used. The number of patterns under hypothesis $H_0$ is $2\cdot 10^{7}$, to estimate $P_{FA}$ values higher than $10^{-6}$ using conventional Monte Carlo simulation with a relative error lower than 10%, while the number of patterns under hypothesis $H_1$ is $5\cdot 10^{4}$, to estimate $P_D$.
The algorithm used for training with the Cross-Entropy error is the one described in [43], while the algorithm used for training with the Sum-of-Squares error and the Minkowski error (R = 1) is the Conjugate Gradient method [44].
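For reference, the three error functions compared in the experiments can be written compactly for a one-output machine with desired outputs one and zero. This is a minimal sketch: averaging over the patterns and clipping the outputs away from 0 and 1 are choices made for this example, and the training algorithms of [43] and [44] are not reproduced.

```python
import numpy as np

def cross_entropy_error(outputs, targets, eps=1e-12):
    """Mean cross-entropy between outputs in (0, 1) and targets in {0, 1}."""
    f = np.clip(np.asarray(outputs, dtype=float), eps, 1.0 - eps)
    t = np.asarray(targets, dtype=float)
    return float(-np.mean(t * np.log(f) + (1.0 - t) * np.log(1.0 - f)))

def sum_of_squares_error(outputs, targets):
    """Mean squared difference between outputs and targets."""
    f = np.asarray(outputs, dtype=float)
    t = np.asarray(targets, dtype=float)
    return float(np.mean((f - t) ** 2))

def minkowski_error(outputs, targets, r=1.0):
    """Mean of |output - target|^r; r = 1 gives the mean absolute deviation."""
    f = np.asarray(outputs, dtype=float)
    t = np.asarray(targets, dtype=float)
    return float(np.mean(np.abs(f - t) ** r))

# Two H1 patterns followed by two H0 patterns (made-up outputs).
targets = np.array([1.0, 1.0, 0.0, 0.0])
outputs = np.array([0.9, 0.6, 0.2, 0.01])
errors = (cross_entropy_error(outputs, targets),
          sum_of_squares_error(outputs, targets),
          minkowski_error(outputs, targets, r=1.0))
```

The logarithms make the Cross-Entropy error grow without bound when a pattern is classified confidently and wrongly, whereas the other two errors stay bounded for outputs in (0,1).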
Results
In this subsection, the results obtained for the above-mentioned detection problems are presented. Different Signal-to-Interference Ratios (SIR) are considered in the experiments to obtain $P_D>0.8$ for $P_{FA}$ values of interest in radar applications. The SIR becomes the Signal-to-Noise Ratio (SNR) when the interference is only noise, and the Signal-to-Clutter Ratio (SCR) when the interference is only clutter. For the MLPs, the influence of the number of hidden neurons, and therefore of the learning machine architecture, is studied too. In some of the cases, the results are better when the number of hidden neurons is higher, but for detecting targets in spiky K-distributed clutter, very good results are obtained even with a low number of neurons in the hidden layer. This is because the surface that separates the acceptance regions of both hypotheses in the optimum detector can be approximated with a simpler architecture. In all the cases, there exists a neural network architecture that provides a very good approximation to the optimum detector when the neural network is trained with the Cross-Entropy error.
Detection of Gaussian fluctuating targets in AWGN
First, the detection of Gaussian fluctuating targets with unknown correlation coefficient in AWGN is studied. In this case, the SNR is 7 dB. MLPs with a number of hidden neurons varying from 14 to 23, in steps of 3, have been trained and tested. The ROC curves of the detectors based on MLPs trained with the Cross-Entropy error are presented in Figure 2. The higher the number of neurons, the better the approximation to the CGLR detector, which shows that the optimum detector can be approximated with an MLP if the number of free parameters of the architecture to be fitted during training is high enough for the considered detection problem.
For comparison purposes, MLPs have been trained to minimize the Sum-of-Squares error and the Minkowski error (R = 1). The ROC curves are presented in Figures 3 and 4, respectively. The results obtained with the Sum-of-Squares error show high variability, and the results obtained with the Minkowski error (R = 1) are clearly worse, as expected from the theoretical study presented in [1], and also show high variability. To show the dependence on SNR, the best MLP-based detectors have been selected and tested with different SNR values. The results are presented in Figure 5.
In a second experiment, the detection of SWI targets with unknown Doppler shift in AWGN is considered. The Doppler shift is modeled as a uniform random variable in the interval [0, 2π). In this case, the SNR is 10 dB. Again, MLPs with different numbers of hidden neurons have been trained and tested with the Cross-Entropy error, the Sum-of-Squares error, and the Minkowski error (R = 1), to study the dependence of performance on the network architecture. The number of hidden neurons considered varies from 14 to 23, in steps of 3. The ROC curves are presented in Figures 6, 7 and 8.
The results obtained training with the Cross-Entropy error are clearly the best. The results obtained training with the Sum-of-Squares error show higher variability and are slightly worse. The results obtained training with the Minkowski error (R = 1) are clearly the worst, as expected.
Detection of Gaussian fluctuating targets in the presence of correlated Gaussian clutter and AWGN
In this section, we focus on the study of the Cross-Entropy error function. The results obtained with the other two above-mentioned error functions are not presented, due to space limitations, but similar conclusions can be drawn in this case. Now, the number of hidden neurons varies from 14 to 20, in steps of 3, because a good approximation to the NP detector can be obtained with a simpler architecture.
In this case, the detection of Gaussian fluctuating targets in correlated Gaussian clutter and AWGN is considered. The level of interference is expressed with the SIR, but the clutter-to-noise ratio (CNR) should also be known. Again, two different kinds of targets are considered: those with unknown correlation coefficient ($\rho_t$), and Swerling I targets with unknown Doppler shift (the Doppler shift is modeled as a uniform random variable in the interval [0, 2π)).
The results for the detection of correlated Gaussian targets in correlated Gaussian clutter and AWGN are presented in Figures 9 and 10. In both figures, CNR = 20 dB, and $\rho_t$ is unknown, modeled as a uniform random variable in the interval [0,1]. In Figure 9, the clutter correlation coefficient is $\rho_c=0.7$ and SIR = 0 dB. In Figure 10, $\rho_c=0.995$ and SIR = −10 dB.
The results for the SWI targets with unknown Doppler shift in correlated Gaussian clutter and AWGN are presented in Figures 11 and 12. In both figures, CNR = 20 dB, and the Doppler shift is unknown, modeled as a uniform random variable in the interval [0, 2π). In Figure 11, $\rho_c=0.7$ and SIR = 13 dB. In Figure 12, $\rho_c=0.995$ and SIR = 1 dB. Again, when training with the Cross-Entropy error, the higher the number of hidden neurons, the better the approximation to the CGLR detector taken as reference.
Detection of non-fluctuating targets in the presence of spiky K-distributed clutter
In this case, the results obtained with MLPs with different numbers of hidden neurons for detecting non-fluctuating targets in spiky K-distributed clutter are presented. These experiments have been included to show the utility of our approach for detection with high-resolution radars and low grazing angles. In this case, the considered interference is only clutter ($\rho_c=0$ and SCR = 9 dB in Figure 13, and $\rho_c=0.9$ and SCR = −3 dB in Figure 14). The good approximation to the reference detector can be observed in all cases, even for very low $P_{FA}$ values, and with a reduced number of neurons in the hidden layer.
Conclusions
In this article, the possibility of approximating the NP detector using learning machines trained in a supervised manner to minimize the Cross-Entropy error has been studied.
Conventional coherent radar detectors usually apply Doppler processors (MTD, Moving Target Detectors, or MTI, Moving Target Indicators) to reduce the clutter in the received signal. Most of these approaches assume Gaussian statistics and are implemented with linear filters. The modulus of the filtered observation vector is obtained and finally compared to a detection threshold. Due to clutter residuals, Constant False Alarm Rate (CFAR) techniques are applied to fulfil $P_{FA}$ requirements. Many of the proposed solutions assume a Gaussian distributed background. The detection of radar targets in non-Gaussian clutter has also been addressed in the literature, but most of the approaches are based on the design of CFAR detectors that assume a specific probability density function of the clutter and try to estimate the detection threshold that maintains the desired $P_{FA}$. In this article, the learning capabilities of supervised learning machines are exploited to approximate the NP detector in cases where not only the clutter parameters but also the target parameters are unknown. This is the general case in a radar problem, where detection is formulated as a composite hypothesis test. Instead of the Sum-of-Squares error, the Cross-Entropy error is considered for training, in order to exploit its better properties with respect to sensitivity to outliers in the training set.
The function approximated by the learning machine after training has been calculated using the calculus of variations, with the objective of finding the function that minimizes the formulated Cross-Entropy error.
Once the function that the supervised learning machine approximates after training has been obtained, the method proposed in [33] has been applied to demonstrate that this function can be used to implement the NP detector by comparing the trained learning machine output to a threshold, selected according to $P_{FA}$ requirements.
This theoretical result has been assessed with some experiments. Detectors based on neural networks have been considered for detecting different types of signals in different types of interference. The results show that an MLP trained to minimize the Cross-Entropy error can implement a very good approximation to the NP detector for the considered detection problems, even for low $P_{FA}$ values.
Different experiments have been performed for detecting fluctuating radar targets (Gaussian targets with unknown correlation coefficient and Swerling I targets with unknown Doppler shift) in AWGN, and in correlated Gaussian clutter plus AWGN. The detection of non-fluctuating targets in spiky K-distributed clutter has also been studied. In all the cases, the capability of learning machines (particularly MLPs) trained to minimize the Cross-Entropy error to approximate the optimum detector in the NP sense has been demonstrated. To obtain a good approximation, the number of hidden neurons must be high enough. This number is related to the minimum number of hyperplanes necessary to enclose the acceptance regions of the detection problem, but this theoretical study is beyond the scope of this article.
For comparison purposes, some experiments have been carried out with MLPs trained to minimize the Sum-of-Squares error and the Minkowski error with R = 1. The results obtained training with the Cross-Entropy error are better than those obtained training with the Sum-of-Squares error, but both can be used to approximate the NP detector. The detectors trained with the Minkowski error with R = 1 are the worst, and they are very far from the optimum NP detector.
Compared with conventional radar detectors, this approach has the following advantages:

Good approximations to the optimal NP detector can be obtained if a suitable error function is selected, if a representative training set is available, if the learning machine architecture has a high enough number of free parameters, and finally, if a good training algorithm is used.

No statistical models have to be assumed during the design. On the contrary, most of the CFAR detectors that can be found in the literature assume some statistical model for the interference that is used to adjust the detection threshold to maintain the desired P _{FA}.

The implementation of the NP detector based on the Average-Likelihood ratio, when some of the parameters of the statistical models assumed in the design are random, can lead to intractable integrals. When supervised learning machines are used to approximate the NP detector, only training data obtained experimentally are necessary, without knowledge of the statistical distributions, and without solving those integrals.
Obviously, supervised learning machines are not an ideal solution in radar detection problems. The main drawback is the difficulty of obtaining representative training samples, and the definition of the most suitable learning machine architecture.
References
 1.
JaraboAmores M, RosaZurera M, GilPita R, LópezFerreras F: Study of two error functions to approximate the Neyman–Pearson detector using supervised learning machines. IEEE Trans. Signal Proces 2009, 57(11):41754181.
 2.
Neyman J, Pearson K: On the problem of the most efficient test of statistical hypotheses. Philosph. Trans. Roy. Soc. Lond. A 231 1933, 492510.
 3.
Trees HV: Detection, estimation, and modulation theory. New York: Wiley; 1968.
 4.
di Vito A, Naldi M: Robustness of the likelihood ratio detector for moderately fluctuating radar targets. IEE Proc. Radar Sonar Navig 1999, 146(2):107122. 10.1049/iprsn:19990261
 5.
He Q: MIMO radar diversity with Neyman–Pearson signal detection in nonGaussian circumstance with nonorthogonal waveforms. In Proceedings of IEEEICASSP. Prague, Czech Republic; 2011:27642767.
 6.
He Q, Blum R: Diversity gain for MIMO Neyman–Pearson signal detection. IEEE Trans. Signal Proces 2011, 59(3):869881.
 7.
Yang Y, Blum R, Sadler B: A distributed and energyefficient framework for NeymanPearson detection of fluctuating signals in largescale sensor networks. IEEE J. Sel. Areas Commun 2010, 28(7):11491158.
 8.
VicenBueno R, CarrascoAlvarez R, JaraboAmores M, NietoBorge J, AlexandreCortizo E: Detection of ships in marine environments by square integration mode and multilayer perceptrons. IEEE Trans. Instrum. Meas 2011, 60(3):712724.
 9.
Furon T, Josse J, le Squin S: Some theoretical aspects of watermarking detection. Proc. SPIE 2006, 6072: 553564.
 10.
Ng T, Garg H: Maximumlikelihood detection in DWT domain image watermarking using Laplacian modeling. IEEE Signal Process. Lett 2005, 12: 285288.
 11.
Gu I, Ernberg N, Styvaktakis E, Bollen M: A statisticalbased sequential method for fast online detection of faultinduced voltage dips. IEEE Trans. Power Del 2004, 19: 497504. 10.1109/TPWRD.2003.823199
 12.
Chamberland J, Veeravalli V: Decentralized detection in sensor networks. IEEE Trans. Signal Proces 2003, 139: 407416.
 13.
Sebastiani P, Gussoni E, Kohane I, Ramoni M: Statistical challenges in functional genomics. Stat. Sci 2003, 18: 3360. 10.1214/ss/1056397486
 14.
Dudoit S, Fridlyand J, Speed T: Comparison of discrimination methods for the classification of tumors usign gene expression data. J. Am. Stat. Assoc 2002, 97: 7787. 10.1198/016214502753479248
 15.
Nandakumar K, Chen Y, Dass S, Jain A: Likelihood ratiobased biometric score fusion. IEEE Trans. Pattern Anal. Mach. Intell 2008, 30(2):342347.
 16.
Prix R, Krishnan B: Targeted search for continuous gravitational waves: Bayesian versus maximumlikelihood statistics. Class. Quant. Grav 2009., 26(20): 10.1088/02649381/26/20/204013
 17.
Sung Y, Tong L, Poor H: Neyman–Pearson detection of Gauss–Markov signals in noise: closedform error exponent and properties. IEEE Trans. Inf. Theory 2006, 52: 13541365.
 18.
Andina D, SanzGonzález J, JiménezPajares J: A comparison of criterion functions for a neural network applied to binary detection. In Proc. IEEE Int. Conf. Neural Netw. Perth, WA; 1995:329333.
 19.
Andina D, SanzGonzález J, RodríguezMartín O: Performance improvements for a neural network detector. In Proc. IEEE Int. Conf. Neural Netw. Perth, WA, Australia; 1995:492495.
 20.
Burian A, Kuosmanen P, Saarinen J: Neural detectors with variable threshold. In Proc. ISCAS. Orlando, FL, USA; 1999:599602.
 21.
Ramamurti V, Rao SS, Gandhi PP: Neural detectors for signals in nongaussian noise. In Proc. ICASSP. Minneapolis, MN, USA; 1993:481484.
 22.
Gandhi P, Ramamurti V: Neural networks for signal detection in non-Gaussian noise. IEEE Trans. Signal Process 1997, 45: 2846-2851. 10.1109/78.650111
 23.
Munro D, Ersoy O, Bell M, Sadowsky J: Neural network learning of low-probability events. IEEE Trans. Aerosp. Electron. Syst 1996, 32: 898-910.
 24.
Ruck D, Rogers S, Kabrisky M, Oxley M, Suter B: The multilayer perceptron as an approximation to a Bayes optimal discriminant function. IEEE Trans. Neural Netw 1990, 1: 296-298. 10.1109/72.80266
 25.
Casasent D, Chen X: New training strategies for RBF neural networks for X-ray agricultural product inspection. Pattern Recogn 2003, 36: 535-547. 10.1016/S0031-3203(02)00058-4
 26.
Casasent D, Chen X: Radial basis function neural networks for nonlinear Fisher discrimination and Neyman–Pearson classification. Neural Netw 2003, 16: 529-535. 10.1016/S0893-6080(03)00086-8
 27.
Jarabo-Amores P, Gil-Pita R, Rosa-Zurera M, López-Ferreras F: MLP and RBFN for detecting white Gaussian signals in white Gaussian interference. Lecture Notes in Computer Science, vol. 2687. Maó, Spain; 2003:790-797.
 28.
Davenport M, Baraniuk R, Scott C: Controlling false alarms with support vector machines. In Proc. of ICASSP. Toulouse, France; 2006:589-592.
 29.
Haykin S, Thomson D: Signal detection in a nonstationary environment reformulated as an adaptive pattern classification problem. IEEE Trans. Signal Proces 1998, 85(11):2325-2344.
 30.
Fernandes AM, Utkin AB, Lavrov AV, Vilar RM: Development of neural network committee machines for automatic forest fire detection using lidar. Pattern Recogn 2004, 37: 2039-2047. 10.1016/j.patcog.2004.04.002
 31.
Mata-Moya D, Jarabo-Amores P, Vicen-Bueno R, Rosa-Zurera M, López-Ferreras F: Neural network detectors for composite hypothesis tests. Lecture Notes in Computer Science, vol. 4224. Burgos, Spain; 2006:298-305.
 32.
Mata-Moya D, Jarabo-Amores P, Rosa-Zurera M, Nieto-Borge J, López-Ferreras F: Combining MLPs and RBFNNs to detect signals with unknown parameters. IEEE Trans. Instrum. Meas 2009, 58(9):2989-2995.
 33.
Jarabo-Amores P, Rosa-Zurera M, Gil-Pita R, López-Ferreras F: Sufficient condition for an adaptive system to approximate the Neyman–Pearson detector. In 2005 IEEE/SP 13th Workshop on Statistical Signal Processing. Bordeaux, France; 2005:295230.
 34.
Bishop CM: Neural Networks for Pattern Recognition. Oxford, UK: Oxford University Press; 1995.
 35.
Saerens M, Latinne P, Decaestecker C: Any reasonable cost function can be used for a posteriori probability approximation. IEEE Trans. Neural Netw 2002, 13: 1204-1210. 10.1109/TNN.2002.1031952
 36.
Slivka J, Severo N: On the strong law of large numbers. Proc. Am. Math. Soc 1970, 24: 729-734. 10.1090/S0002-9939-1970-0259993-9
 37.
Hampshire J, Pearlmutter B: Equivalence proofs for multilayer perceptron classifiers and the Bayesian discriminant function. In Proc. of 1990 Connectionist Models Summer School. Edited by: Touretzky DS, Elman JL, Sejnowski TJ, Hinton GE. San Diego, CA, USA; 1990:159-172.
 38.
Gelfand I, Fomin S: Calculus of Variations. Mineola, New York: Courier Dover Publications; 2000.
 39.
Weinstock R: Calculus of Variations with Applications to Physics and Engineering. New York: McGraw Hill; 1952.
 40.
Conte E, Longo M, Lops M, Ullo S: Radar detection of signals with unknown parameters in K-distributed clutter. IEE Proc. F Radar Signal Process 1991, 138(2):131-138. 10.1049/ip-f-2.1991.0019
 41.
Conte E, Lops M, Ricci G: Radar detection in K-distributed clutter. IEE Proc. Radar Sonar Navig 1994, 141(2):116-118. 10.1049/ip-rsn:19949882
 42.
Nayebi MM, Aref MR, Bastani MH: Detection of coherent radar signals with unknown Doppler shift. IEE Proc. Radar Sonar Navig 1996, 143(2):79-86. 10.1049/ip-rsn:19960158
 43.
El-Jaroudi A, Makhoul J: A new error criterion for posterior probability estimation with neural nets. In Proc. of Int. Conf. on Neural Networks IJCNN. San Diego, CA, USA; 1990:185-192.
 44.
Charalambous C: Conjugate gradient algorithm for efficient training of artificial neural networks. IEEE Proc. Circ. Dev. Syst 1992, 51(3):301-310.
Acknowledgements
This work has been partially funded by the Spanish Ministry of Economy and Competitiveness under project TEC2012-38701.
Additional information
Competing interests
The authors declare that they have no competing interests.
About this article
Cite this article
Jarabo-Amores, M., la Mata-Moya, D.d., Gil-Pita, R. et al. Radar detection with the Neyman–Pearson criterion using supervised learning machines trained with the cross-entropy error. EURASIP J. Adv. Signal Process. 2013, 44 (2013) doi:10.1186/1687-6180-2013-44
Keywords
 Radar detection
 Supervised learning machines
 Error function
 Cross-entropy