Scan statistics with local vote for target detection in distributed system
 Junhai Luo^{1, 2} and
 Qi Wu^{1}
https://doi.org/10.1186/s13634-017-0467-y
© The Author(s) 2017
Received: 7 December 2016
Accepted: 21 April 2017
Published: 3 May 2017
Abstract
Target detection occupies a pivotal position in distributed systems. Scan statistics, one of the most efficient detection methods, has been applied to a variety of anomaly detection problems and significantly improves the probability of detection. However, scan statistics cannot achieve the expected performance when the noise is strong or the signal emitted by the target is weak. The local vote algorithm can also achieve a higher target detection rate. After the local vote, the counting rule is usually adopted for decision fusion. The counting rule takes the data of all sensors into consideration but does not use information about the contiguity of sensors, which degrades the result. In this paper, we propose a scan statistics with local vote (SSLV) method, which combines scan statistics with the local vote decision. Before the scan, each sensor makes a local vote decision based on its own data and the data of its neighbors. By combining the advantages of both techniques, our method obtains a higher detection rate than scan statistics alone in low signal-to-noise ratio environments. After the local vote decision, the sensors that have detected the target become more densely clustered. To make full use of the local vote decision, we introduce a variable-step-parameter for the SSLV. It significantly shortens the scan period, especially when the target is absent. Analysis and simulations are presented to demonstrate the performance of our method.
1 Introduction
Target detection in distributed systems is of considerable importance in military and civil applications such as intrusion detection and fire detection. The reliability of a target detection result suffers from local false alarms, and data fusion can improve the precision of target detection. In multiple-sensor systems, sensors send their sensed data to a fusion center, and the fusion center makes the final decision to improve the global probability of detection. Distributed detection using multiple sensors and optimal fusion rules has been extensively investigated.
Chair and Varshney [1] present an optimum fusion structure for the classical Bayesian detection problem in distributed sensor networks. To obtain the global decision, the fusion center weights each sensor's decision by its reliability and compares the weighted sum with a threshold. The reliability of a sensor is determined by its probability of detection and its false alarm rate. Although this method achieves optimum performance, it requires the probability of detection and the false alarm rate of every sensor in advance. Since the target location is unknown before detection, these probabilities are not available, and the method cannot be applied directly in practice.
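As a minimal sketch of the reliability-weighted fusion described above, the following Python snippet computes the log-likelihood-ratio statistic commonly associated with the rule in [1]. The function names, the example probabilities, and the zero threshold are illustrative assumptions; in practice the threshold depends on the prior probabilities and decision costs.

```python
import math

def chair_varshney_statistic(decisions, p_d, p_fa):
    """Sum of per-sensor log-likelihood-ratio weights for binary decisions.

    decisions: list of 0/1 local decisions
    p_d, p_fa: per-sensor detection and false alarm probabilities
               (assumed known here, which is the practical difficulty)
    """
    total = 0.0
    for u, pd, pf in zip(decisions, p_d, p_fa):
        if u == 1:
            total += math.log(pd / pf)                  # weight of a "target" report
        else:
            total += math.log((1.0 - pd) / (1.0 - pf))  # weight of a "no target" report
    return total

def fuse(decisions, p_d, p_fa, threshold=0.0):
    """Global decision: 1 if the weighted sum exceeds the (assumed) threshold."""
    return 1 if chair_varshney_statistic(decisions, p_d, p_fa) > threshold else 0

# Hypothetical example: one reliable sensor and two unreliable ones.
p_d = [0.9, 0.6, 0.6]
p_fa = [0.1, 0.4, 0.4]
reliable_only = fuse([1, 0, 0], p_d, p_fa)    # the reliable sensor outweighs the others
unreliable_only = fuse([0, 1, 1], p_d, p_fa)  # two weak reports do not
```

In this example a single report from the reliable sensor is enough to declare a target, whereas two reports from the unreliable sensors are not, which is exactly the behavior that requires the local probabilities to be known beforehand.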
Niu and Varshney [2] put forward the counting rule, in which the fusion center uses the total number of detections reported by local sensors for hypothesis testing, and analyze its performance for a large number of sensors. In [3], the authors give a performance analysis for sensors deployed in a random sensor field. The counting rule does not need the local probabilities of detection and false alarm in advance, which makes it more suitable for practical environments. However, the counting rule weights the results of all sensors equally, whereas sensors close to the target are more accurate than sensors far away, and the distant sensors degrade the global probability of detection. In [4, 5], the authors propose methods that give a different weight to each sensor depending on its estimated distance to the target or its signal-to-noise ratio (SNR). This makes the final decision more accurate at the cost of sending more data to the fusion center.
Scan statistics has been applied to epidemic and computer intrusion detection in [8–11]. Guerriero et al. [12] first introduced scan statistics to the signal processing community. Detection is carried out by a mobile fusion center, a mobile agent (MA), which successively counts the number of binary decisions reported by local sensors lying inside its moving field of view. The MA, playing the role of the fusion center, makes the final decision about the presence of a target. The authors also demonstrate the existence of an optimal size for the field of view and introduce the disjoint-window test. In the disjoint-window scan test, the MA travels across the sensor network and scans it using non-overlapping windows. In [13–15], the authors introduce variable window scan statistics and investigate the performance of those methods. The disjoint-window scan statistics can shorten the scan period. However, it performs poorly compared with the scan statistics (SS).

The main contributions of this paper are as follows.

A model of the SSLV is proposed. We analyze the difference between the SSLV and the traditional SS, and derive the global false alarm rate for the SSLV.

We apply the SSLV to a grid sensor network and compare its performance with the SS. The simulations show that the SSLV outperforms the SS at low SNR. We also verify that an optimal \(M_{x}\) exists for a given situation.

We introduce a variable-step-parameter for the SSLV and analyze its influence on our method. The simulations show that the variable-step-parameter has little negative effect on the detection performance of the SSLV, while it significantly shortens the scan period, especially when a target is absent.
The remainder of the paper is organized as follows. Section 2 reviews two-dimensional scan statistics as a foundation for the SSLV. Section 3 describes the system model of scan statistics with local vote and introduces a variable-step-parameter into the SSLV. Section 4 applies the SSLV to a grid sensor network. Section 5 provides simulations and analysis. Finally, Section 6 concludes our research.
2 Scan statistics
In this section, we introduce the classical two-dimensional scan statistics algorithm. Scan statistics is a distributed detection method. Each sensor tests its hypotheses using its own sensed data and sends the result to the fusion center. The traditional counting rule collects data from all sensors in the field of interest and makes the global judgment from these data. Unlike the counting rule, the SS has an MA sequentially collect the data from the area within its field of view, and the MA makes the final decision for the whole network. When a target is present, sensors near the target are more likely to make the right judgment. The SS exploits this spatial correlation, which makes it more accurate than the counting rule.
Full details can be found in [12, 16]. In [12], the authors also give the expression for the case in which \(X^{\prime }_{i, j}\) follows a Poisson distribution.
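To make the contrast between the counting rule and the SS concrete, the sketch below computes both statistics on a small grid of binary local decisions. It assumes the decisions are stored as a rectangular list of lists and that the window has m1 × m2 cells; the function names and the two example grids are illustrative and are not taken from [12].

```python
def counting_statistic(grid):
    """Counting rule statistic: total number of local detections."""
    return sum(sum(row) for row in grid)

def scan_statistic(grid, m1, m2):
    """Two-dimensional scan statistic: largest number of detections
    found in any m1 x m2 window of the grid."""
    n1, n2 = len(grid), len(grid[0])
    # 2-D prefix sums so that each window count costs O(1).
    prefix = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(n1):
        for j in range(n2):
            prefix[i + 1][j + 1] = (grid[i][j] + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])
    best = 0
    for i in range(n1 - m1 + 1):
        for j in range(n2 - m2 + 1):
            window = (prefix[i + m1][j + m2] - prefix[i][j + m2]
                      - prefix[i + m1][j] + prefix[i][j])
            best = max(best, window)
    return best

# Four detections clustered together (target-like pattern).
clustered = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
# Four detections spread over the corners (noise-like pattern).
scattered = [
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
]
```

Both grids contain four detections, so the counting statistic cannot tell them apart, while a 2 × 2 scan window yields 4 for the clustered grid and 1 for the scattered one.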
3 Two-dimensional scan statistics with local vote decision
In this section, we give a detailed description of the proposed SSLV and of the problems introduced by the local vote. Section 3.1 presents the scan statistics with local vote decision algorithm. The correlation of local sensors is analyzed in Section 3.2. Our analysis shows that the decisions provided by sensors are no longer independent and identically distributed (i.i.d.) after the local vote. Consequently, expression (4) cannot be used in the SSLV, and a new expression for the SSLV is derived in Section 3.3. Section 3.4 introduces a variable-step-parameter into the SSLV in order to take full advantage of the local vote.
3.1 Scan statistics with local vote decision
For simplicity, we abbreviate \(S_{m_{1} \times m_{2};\, N_{1} \times N_{2}}\) to S. The next step is to obtain an expression for \(P\left(S_{m_{1} \times m_{2};\, N \times N} \geq k\right)\) to make the SSLV useful.
3.2 Correlation of sensors
where \(M_{i}\) is the number of neighbors, which depends on the local vote decision algorithm, \(M_{x}\) is a variable that has a significant influence on the performance, and \(\sigma_{i}^{2} = \mu_{i}(1 - \mu_{i})\).
According to the derivation above, the decisions \(X_{i,j}\) are no longer i.i.d. after the local vote.
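The loss of independence can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not the derivation of this section: it assumes a 3 × 3 voting neighborhood (a sensor plus its eight neighbors), i.i.d. Bernoulli(p) raw decisions, and the rule "vote 1 if at least m_x of the nine raw decisions are 1". It enumerates all raw patterns on a 3 × 4 patch exactly and returns the means of two horizontally adjacent voted decisions and their covariance. The names `vote_moments`, `p`, and `m_x` are hypothetical.

```python
from itertools import product

def vote_moments(p, m_x):
    """Exact means and covariance of two adjacent local-vote decisions.

    The patch has 3 rows and 4 columns of i.i.d. Bernoulli(p) raw decisions.
    Sensor A votes over columns 0..2, sensor B over columns 1..3,
    so the two neighborhoods share six raw decisions.
    """
    e_a = e_b = e_ab = 0.0
    for bits in product((0, 1), repeat=12):
        ones = sum(bits)
        weight = p ** ones * (1.0 - p) ** (12 - ones)  # probability of this pattern
        rows = (bits[0:4], bits[4:8], bits[8:12])
        count_a = sum(sum(r[0:3]) for r in rows)
        count_b = sum(sum(r[1:4]) for r in rows)
        x_a = 1 if count_a >= m_x else 0
        x_b = 1 if count_b >= m_x else 0
        e_a += weight * x_a
        e_b += weight * x_b
        e_ab += weight * x_a * x_b
    return e_a, e_b, e_ab - e_a * e_b

mu_a, mu_b, cov = vote_moments(0.2, 3)
var_a = mu_a * (1.0 - mu_a)  # Bernoulli variance, the same form as sigma_i^2 above
```

Under these assumptions the covariance is strictly positive, because the two votes are computed from overlapping sets of raw decisions; independent decisions would give a covariance of zero.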
3.3 Approximation for P(S ≥ k)
In [16], the authors prove the approximation for i.i.d. \(X_{i,j}\) using Markov chain imbeddable systems [17]. It is not applicable here because the \(X_{i,j}\) are no longer i.i.d. after the local vote. However, there are other ways to obtain an accurate approximation for P(S ≥ k), and one of them is to use the Haiman theorem [18–20].
Theorem 1
with an error of about \(3.3(L - 1)(1 - Q_{32})^{2}\). Assuming that L ≤ K and substituting (17) and (16) into (15), we obtain the final expression we need.
where P(S ≤ k − 1) can be approximated by (15).
3.4 The SSLV with variablestepparameter
The scan region is a rectangular region given by \(R(i_{1}, i_{2}) = [i_{1} h_{1}, (i_{1} + m) h_{1}] \times [i_{2} h_{2}, (i_{2} + m) h_{2}]\). If \(R(i_{1}, i_{2})\) is the scan region at the current time, then the next scan region is \(R(i_{1} + \mathit{step}, i_{2}) = [(i_{1} + \mathit{step}) h_{1}, (i_{1} + \mathit{step} + m) h_{1}] \times [i_{2} h_{2}, (i_{2} + m) h_{2}]\), with \(i_{1} + \mathit{step} \leq N - m + 1\). We introduce the step in only one dimension for better performance. The global false alarm rate can still be evaluated by (15).
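The window movement described above can be sketched as follows. This is a simplified illustration, not the procedure of this paper: it uses 0-based indices, an m × m window on an N × N grid of voted decisions, a step that is held constant along the first dimension, and a unit step along the second. How the step value should be chosen is not specified here, and the names `scan_origins` and `scan_with_step` are hypothetical.

```python
def scan_origins(n, m, step):
    """Top-left corners visited by an m x m window on an n x n grid.

    The window advances by `step` cells along the first index and by one
    cell along the second index, and never leaves the grid.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return [(i1, i2)
            for i1 in range(0, n - m + 1, step)
            for i2 in range(0, n - m + 1)]

def scan_with_step(grid, m, step):
    """Return (largest window count, number of windows visited)."""
    n = len(grid)
    best = 0
    visited = 0
    for i1, i2 in scan_origins(n, m, step):
        count = sum(grid[r][c]
                    for r in range(i1, i1 + m)
                    for c in range(i2, i2 + m))
        best = max(best, count)
        visited += 1
    return best, visited

# Hypothetical 10 x 10 grid with a 3 x 3 cluster of detections.
grid = [[0] * 10 for _ in range(10)]
for r in range(2, 5):
    for c in range(3, 6):
        grid[r][c] = 1

full_scan = scan_with_step(grid, 3, 1)    # every window position
half_scan = scan_with_step(grid, 3, 2)    # every second row of positions
coarse_scan = scan_with_step(grid, 3, 3)  # every third row of positions
```

With a step of 2 the number of windows is halved and the cluster is still fully covered by one window, while a step of 3 visits even fewer windows but no longer aligns with the cluster in this particular grid. The example only illustrates the trade-off between scan period and coverage for a fixed step.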
4 Application of the SSLV in distributed system
In this section, we apply the SSLV to a particular scenario and give a detailed description of the observation model, the local vote decision model, the false alarm probability at the MA, and the optimal \(M_{x}\).
4.1 Observation model
where \(Q(x) = \int_{x}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\xi^{2}/2}\, d\xi\) is the unit Gaussian exceedance function.
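For numerical work, the exceedance function Q(x) can be evaluated with the complementary error function through the standard identity Q(x) = erfc(x/√2)/2. The sketch below uses only the Python standard library; the function name `q_function` is an illustrative choice.

```python
import math

def q_function(x):
    """Tail probability of the standard Gaussian: P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

q_zero = q_function(0.0)   # half of the probability mass lies above the mean
q_tail = q_function(1.96)  # roughly 2.5 % of the mass lies above 1.96
```

Because erfc is evaluated directly rather than as 1 − erf, the result stays accurate for large x, where the tail probability is very small.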
4.2 Local vote decision model
We divide the ROI into M (the total number of sensors) small subsquares. The location of the sensor inside each subsquare is known. Let \(h = b/N^{\prime}\), where \(N^{\prime}\) satisfies \(N^{\prime 2} = M\), so that the square of area \(b^{2}\) is divided into M cells and each cell of area \(h^{2}\) contains exactly one sensor. Let us denote the cell \([ih, (i + 1)h] \times [jh, (j + 1)h]\) by c(i,j). We define \(X^{\prime}_{i,j}\) as the binary data from the local sensor s inside c(i,j), with \(0 \leq i \leq N^{\prime} - 1\) and \(0 \leq j \leq N^{\prime} - 1\).
where \(M_{i}\) is the number of neighbors and \(M_{x}\) is the preset threshold. We define \(X_{i,j}\) as the binary data from the local sensor s inside c(i,j) after the local vote, with \(1 \leq i \leq N\) and \(1 \leq j \leq N\), where \(N = N^{\prime} - 2\). We observe that for each \(1 \leq i \leq N\), the sequence \((X_{i,j})_{1 \leq j \leq N}\) is c-dependent, and for each \(1 \leq j \leq N\), the sequence \((X_{i,j})_{1 \leq i \leq N}\) is also c-dependent, where c = 2 in our simulation. The value of c depends on the choice of local vote algorithm. In (15), we construct a 1-dependent sequence to evaluate the global false alarm probability. The sequence is guaranteed to be 1-dependent only when c ≤ m.
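A local vote step of the kind described above can be sketched as follows. The sketch rests on assumptions that are not spelled out in the text: each interior sensor votes over its 3 × 3 neighborhood (itself plus eight neighbors, which is consistent with N = N′ − 2 and c = 2), and the voted decision is 1 when at least m_x of those nine raw decisions are 1. The function name `local_vote` and the example grid are hypothetical.

```python
def local_vote(raw, m_x):
    """Apply a 3 x 3 local vote to an N' x N' grid of raw binary decisions.

    Border sensors lack a full neighborhood and are dropped, so the
    result is an (N' - 2) x (N' - 2) grid of voted decisions.
    """
    n_prime = len(raw)
    voted = []
    for i in range(1, n_prime - 1):
        row = []
        for j in range(1, n_prime - 1):
            # Count positive raw decisions of the sensor and its 8 neighbors.
            count = sum(raw[r][c]
                        for r in (i - 1, i, i + 1)
                        for c in (j - 1, j, j + 1))
            row.append(1 if count >= m_x else 0)
        voted.append(row)
    return voted

# Hypothetical raw decisions: a small group of detections plus two isolated ones.
raw = [
    [0, 0, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
voted = local_vote(raw, 3)
```

In this example the isolated detections in the corners have no effect on the voted grid, and the sensor in the middle of the raw grid, which reported 0, is turned into a 1 by its neighbors, so the positive decisions form a compact 2 × 2 block after the vote.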
4.3 False alarm probability at the MA
We note that (30) can be evaluated using the approximation in (15) after substituting \(f_{ov}\) with m and \(p_{fa}\) with α.
4.4 Optimal \(M_{x}\)
We determine the parameter k by choosing whichever of \(\alpha_{1}\) and \(\alpha_{2}\) lies closer to α.
In theory, increasing \(M_{x}\) decreases the value of k for a fixed α, and a smaller k increases PD. Meanwhile, by (27), increasing \(M_{x}\) decreases \(p^{\prime }_{ds}\), which constrains the increase of PD, since PD mainly relies on \(p^{\prime }_{ds}\). At first, the decrease of k compensates for the decrease of \(p^{\prime }_{ds}\); as \(M_{x}\) grows further, the decrease of \(p^{\prime }_{ds}\) dominates. Hence, PD is unimodal in \(M_{x}\).
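The effect of \(M_{x}\) on a single voted decision can be illustrated with a binomial tail. This is a simplified stand-in and not expression (27): it assumes that the nine raw decisions in a 3 × 3 neighborhood are independent and share the same probability p of being 1, which is only approximately true near a target. The function name `prob_vote_positive`, the group size, and the two example probabilities are assumptions made for this sketch.

```python
from math import comb

def prob_vote_positive(p, m_x, group_size=9):
    """P(at least m_x of group_size i.i.d. Bernoulli(p) decisions equal 1)."""
    return sum(comb(group_size, t) * p ** t * (1.0 - p) ** (group_size - t)
               for t in range(m_x, group_size + 1))

# Hypothetical raw probabilities: 0.1 under noise only, 0.6 near a target.
noise_after_vote = [prob_vote_positive(0.1, m_x) for m_x in range(1, 10)]
target_after_vote = [prob_vote_positive(0.6, m_x) for m_x in range(1, 10)]
```

Both lists decrease as the threshold grows: a larger threshold suppresses false alarms after the vote, which permits a smaller k at a fixed α, but it also lowers the post-vote detection probability, which is the tension that produces a single best value of \(M_{x}\).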
5 Performance analysis
Following the analysis above, we compare the SSLV with the scan statistics to identify the differences in performance between them. Simulations and analysis are given in this section.
6 Conclusions
This paper has introduced the SSLV algorithm, which is specially designed for target detection under low SNR conditions. The correlation between sensors and the expression for the global false alarm rate after the local vote have been described in detail. Moreover, based on the SSLV, the SSLV with variable-step-parameter has been proposed. The two algorithms have been examined in simulation studies, which revealed that they produce similar detection accuracies, but the SSLV with variable step is substantially faster within a single scan cycle. Nevertheless, some topics remain for further research. First, obtaining the optimal \(M_{x}\) from simulation is clearly not ideal, and an analytical expression for the optimal \(M_{x}\) should be derived. Furthermore, the variable-step-parameter for the SSLV could be extended to two dimensions without losing detection performance.
Declarations
Acknowledgements
This work was supported in part by the Overseas Academic Training Funds, University of Electronic Science and Technology of China (OATF, UESTC) (Grant No.201506075013), and the Program for Science and Technology Support in Sichuan Province (Grant Nos. 2014GZ0100 and 2016GZ0088).
Authors’ contributions
JHL and QW proposed the algorithm and carried out the simulations. JHL and QW analyzed the experimental results. JHL gave the critical revision and final approval. Both authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
[1] Z Chair, PK Varshney, Optimal data fusion in multiple sensor detection systems. IEEE Trans. Aerosp. Electron. Syst. 22(1), 98–101 (1986).
[2] R Niu, PK Varshney, MH Moore, D Klamer, Decision fusion in a wireless sensor network with a large number of sensors. Presented at the 7th IEEE Int. Conf. Information Fusion (ICIF) (Stockholm, Sweden, 2004).
[3] R Niu, P Varshney, Performance analysis of distributed detection in a random sensor field. IEEE Trans. Signal Process. 56(1), 339–349 (2008).
[4] D Marco, YH Hu, Distance-based decision fusion in a distributed wireless sensor network. Telecommun. Syst. 26(2–4), 339–350 (2004).
[5] SH Javadi, A Peiravi, Fusion of weighted decisions in wireless sensor networks. IET Wirel. Sens. Syst. 5(2), 97–105 (2015).
[6] N Katenka, E Levina, G Michailidis, Local vote decision fusion for target detection in wireless sensor networks. IEEE Trans. Signal Process. 56(1), 329–338 (2008).
[7] MS Ridout, An improved threshold approximation for local vote decision fusion. IEEE Trans. Signal Process. 61(5), 1104–1106 (2013).
[8] T Tango, K Takahashi, A flexible spatial scan statistic with a restricted likelihood ratio for detecting disease clusters. Stat. Med. 31(30), 4207 (2012).
[9] J Fu, W Lou, Distribution theory of runs and patterns and its applications (World Scientific, Singapore, 2003).
[10] DU Pfeiffer, KB Stevens, Spatial and temporal epidemiological analysis in the Big Data era. Prev. Vet. Med. 122(1–2), 213–220 (2015).
[11] C Teljeur, A Kelly, M Loane, et al, Using scan statistics for congenital anomalies surveillance: the EUROCAT methodology, vol. 30, (2015).
[12] M Guerriero, P Willett, J Glaz, Distributed target detection in sensor networks using scan statistics. IEEE Trans. Signal Process. 57(7), 2629–2639 (2009).
[13] TL Wu, J Glaz, A new adaptive procedure for multiple window scan statistics. Comput. Stat. Data Anal. 82(82), 164–172 (2015).
[14] BJ Reich, Multiple window discrete scan statistic for higher-order Markovian sequences. J. Appl. Stat. 42(8), 1–16 (2015).
[15] X Wang, J Glaz, Variable window scan statistics for normal data. Commun. Statistics-Theory Methods. 43, 2489–2504 (2014).
[16] MV Boutsikas, M Koutras, Reliability approximations for Markov chain imbeddable systems. Methodol. Comput. Appl. Probab. 2, 393–412 (2000).
[17] WC Lee, Power of discrete scan statistics: a finite Markov chain imbedding approach. Methodol. Comput. Appl. Probab. 17(3), 833–841 (2015).
[18] A Amarioarei, C Preda, Approximations for two-dimensional discrete scan statistics in some block-factor type dependent models. J. Stat. Plann. Infer. 151(3), 107–120 (2014).
[19] G Haiman, C Preda, Estimation for the distribution of two-dimensional discrete scan statistics. Methodol. Comput. Appl. Probab. 8, 373–382 (2006).
[20] G Haiman, 1-dependent stationary sequences for some given joint distributions of two consecutive random variables. Methodol. Comput. Appl. Probab. 14, 445–458 (2012).
[21] HV Poor, An introduction to signal detection and estimation. Springer Texts Electr. Eng. 333(1), 127–139(13) (1998).
[22] S Kotz, DL Banks, CB Read, Encyclopedia of statistical sciences (Wiley-Interscience, New York, 1997).