
Scan statistics with local vote for target detection in distributed system

Abstract

Target detection occupies a pivotal position in distributed systems. Scan statistics, one of the most efficient detection methods, has been applied to a variety of anomaly detection problems and significantly improves the probability of detection. However, scan statistics cannot achieve the expected performance when the noise intensity is strong or the signal emitted by the target is weak. The local vote algorithm can also achieve a high target detection rate, but after the local vote, the counting rule is usually adopted for decision fusion. The counting rule ignores the spatial contiguity of sensors and treats all sensors' data equally, which makes the result undesirable. In this paper, we propose a scan statistics with local vote (SSLV) method, which combines scan statistics with local vote decision. Before scan statistics, each sensor executes a local vote decision based on its neighbors' data and its own. By combining the advantages of both, our method obtains a higher detection rate in low signal-to-noise ratio environments than scan statistics alone. After the local vote decision, the distribution of sensors that have detected the target becomes more concentrated. To make full use of the local vote decision, we introduce a variable-step-parameter for the SSLV. It significantly shortens the scan period, especially when the target is absent. Analysis and simulations are presented to demonstrate the performance of our method.

1 Introduction

Target detection is of significant research interest for military and civil applications of distributed systems, such as intrusion detection and fire detection. The reliability of the target detection result suffers from local false alarms, while data fusion can improve the precision of the detection. In multiple-sensor systems, sensors send their sensed data to a fusion center, which makes the final decision to improve the global probability of detection. Distributed detection using multiple sensors and optimal fusion rules has been extensively investigated.

Chair and Varshney [1] present an optimal fusion structure for the classical Bayesian detection problem in distributed sensor networks. To obtain the global decision, the fusion center weighs the reliability of each sensor and compares the weighted sum with a threshold. The reliability of a sensor is characterized by its probability of detection and its false alarm rate. Although this method achieves optimal performance, it requires the probabilities of detection and false alarm to be known in advance. Since the target location is unknown before detection, this method cannot be applied in practice.

Niu and Varshney [2] put forward the counting rule, where the fusion center employs the total number of detections reported by local sensors for hypothesis testing, and analyze its performance with a large number of sensors. In [3], the authors analyze the performance when sensors are deployed in a random sensor field. The counting rule does not require the local detection and false alarm probabilities in advance, which makes it more suitable for practical environments. However, the counting rule treats the reports of all sensors equally, although sensors close to the target are more accurate than those far away; distant sensors therefore degrade the global probability of detection. In [4, 5], the authors propose methods that weight each sensor according to its estimated distance to the target or its signal-to-noise ratio (SNR). This makes the final decision more accurate at the cost of sending more data to the fusion center.

The authors in [6] propose the local vote algorithm, in which each sensor corrects its decision using those of its neighbors before a collective decision is made for the network. They examine both distance-based and nearest-neighbor-based versions of the local vote algorithm for grid and random deployments and show that, for a fixed system false alarm rate, the local vote correction achieves a significantly higher detection rate than decision fusion based on uncorrected decisions in many situations (see Fig. 1). The authors in [7] propose an improved threshold approximation for local vote decision fusion and demonstrate that it yields more accurate results.

Fig. 1

Ordinary versus local vote decision fusion under different deployments. After the local vote, the number of sensors (black dots) that detected the target (red dot) has increased. a, b Grid deployment; c, d random deployment

Scan statistics has been applied to epidemic and computer intrusion detection in [8-11]. Guerriero et al. [12] first introduced scan statistics to the signal processing community. Detection is carried out by a mobile agent (MA) acting as a mobile fusion center, which successively counts the number of binary decisions reported by local sensors lying inside its moving field of view. The MA, playing the role of the fusion center, makes the final decision about the presence of a target. The authors also demonstrate the existence of an optimal size for the field of view and analyze the disjoint-window test, in which the MA travels across the sensor network and scans it using non-overlapping windows. In [13-15], the authors introduce variable window scan statistics and investigate their performance. The disjoint-window scan statistics shortens the scan period, but it performs worse than the scan statistics (SS).

How to improve the probability of detection while reducing the false alarm rate is a perennial topic. To handle complex network environments, improving global performance at low SNR is our primary goal. The research mentioned above improves the detection probability and decreases the false alarm rate, but those algorithms cannot meet the expected performance at low SNR. The local vote algorithm can significantly improve global performance, especially at low SNR. After the local vote, however, the counting rule is adopted for decision fusion. The counting rule does not exploit the spatial correlation whereby sensors near the target have a higher probability of reporting detections, which weakens the advantage brought by the local vote. In this paper, we combine two previous ideas: local vote decision and scan statistics. Sensors make a local vote, and the MA performs scan statistics. As Fig. 1 shows, the local vote makes the distribution of sensors that have detected the target more concentrated, which inspires us to introduce a variable-step-parameter for the scan statistics with local vote (SSLV). Our contributions in this paper are as follows.

  • A model of the SSLV is proposed. We analyze the difference between the SSLV and the traditional SS and derive the global false alarm rate for the SSLV.

  • We apply the SSLV to a grid sensor network and compare its performance with the SS. The simulations show that the SSLV outperforms the SS at low SNR. We also verify that an optimal M_x exists for a given scenario.

  • We introduce a variable-step-parameter for the SSLV and analyze its influence on our method. The simulations show that the variable-step-parameter has little negative effect on the detection performance of the SSLV, while it significantly shortens the scan period, especially when the target is absent.

The remainder of the paper is organized as follows. Section 2 reviews two-dimensional scan statistics as a foundation for the SSLV. Section 3 describes the system model of scan statistics with local vote and introduces a variable-step-parameter into the SSLV. Section 4 applies the SSLV to a grid sensor network, Section 5 presents performance analysis and simulations, and Section 6 concludes the paper.

2 Scan statistics

In this section, we introduce the classical two-dimensional scan statistics algorithm. Scan statistics is a distributed detection method: each sensor makes its own hypothesis test on its sensed data and sends the result to the fusion center. The traditional counting rule collects data from all sensors in the field of interest and makes a global judgment from these data. Unlike the counting rule, the SS lets an MA sequentially collect data from its agent area, and the MA makes the final decision for the global network. When a target is present, sensors near the target are more likely to make correct judgments. The SS exploits this spatial correlation, which makes it more accurate than the counting rule.

We assume that all sensors test the same hypotheses: either H_0 (target absent) or H_1 (target present) holds. R denotes the region of interest (ROI). We deploy sensors in region R, defined by [0,T_1] × [0,T_2]. More specifically, let h_i = T_i/N_i > 0, where N_i are positive integers and i = 1,2. For 1 ≤ i ≤ N_1 and 1 ≤ j ≤ N_2, let \(X'_{i,j}\) be the number of detections reported in the rectangular basic region [(i − 1)h_1, i h_1] × [(j − 1)h_2, j h_2]. During the scan, the MA records the sum of counts in its agent region, whose size is m_1 by m_2.

$$ v'_{i_{1},i_{2}}=\sum_{j=i_{2}}^{i_{2}+m_{2}-1}\sum_{i=i_{1}}^{i_{1}+m_{1}-1}X'_{i,j} $$
(1)

where m_1 and m_2 are the width and height of the MA's field of view, respectively; we also call m_1 and m_2 the window sizes of the MA. The MA collects the data and compares the maximum value with a pre-set threshold k.

$$ S_{m_{1}\times m_{2};\,N_{1}\times N_{2}}=\max \left\{ v'_{i_{1},i_{2}};\ 1\leq i_{1}\leq N_{1}-m_{1}+1,\ 1\leq i_{2}\leq N_{2}-m_{2}+1 \right\} $$
(2)

If the maximum value is at least k, we say that k events are clustered within the inspected region. Therefore, the global probability of detection PD and the global probability of false alarm PF can be expressed, respectively, as

$$ \begin{aligned} PD&=P\left(S_{m_{1}\times m_{2};\,N_{1}\times N_{2}}\geq k\,|\,H_{1}\right) \\ PF&=P\left(S_{m_{1}\times m_{2};\,N_{1}\times N_{2}}\geq k\,|\,H_{0}\right) \end{aligned} $$
(3)

It is important to obtain an expression for \(P\left(S_{m_{1}\times m_{2};\,N_{1}\times N_{2}} \geq k\right)\). Although there is no exact expression, an accurate approximation exists. When the \(X'_{i,j}\) are Bernoulli random variables with parameter α, 0 < α < 1, the approximation for the square case (m_1 = m_2 = m, N_1 = N_2 = N) can be expressed as

$$ \begin{aligned} P(S_{m\times m}\geq k) \approx 1-&\left[\frac{\left[P\left\{S_{m\times m}(m,m)\leq k-1\right\}\right]^{(N-m-1)^{2}}}{\left[P\left\{S_{m\times m}(m,m+1)\leq k-1\right\}\right]^{2(N-m-1)(N-m)}}\right]\\ &\times \left[P\left\{S_{m\times m}(m+1,m+1)\leq k-1\right\}\right]^{(N-m)^{2}} \end{aligned} $$
(4)

The full derivation can be found in [12, 16]. In [12], the authors also give the expression when \(X'_{i,j}\) follows a Poisson distribution.
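As a concrete illustration of Eqs. (1)-(2), the following sketch computes the two-dimensional scan statistic by a brute-force sliding window. It assumes i.i.d. Bernoulli(α) cell counts under H_0, and the parameter values are hypothetical; it is not the authors' implementation.

```python
import numpy as np

def scan_statistic(X, m1, m2):
    """Maximum window sum of Eq. (2): slide an m1-by-m2 window over the grid X."""
    N1, N2 = X.shape
    best = 0
    for i1 in range(N1 - m1 + 1):
        for i2 in range(N2 - m2 + 1):
            # window sum v'_{i1,i2} of Eq. (1)
            best = max(best, int(X[i1:i1 + m1, i2:i2 + m2].sum()))
    return best

# Toy run under H0: each cell reports a detection with probability alpha
rng = np.random.default_rng(0)
alpha, N, m, k = 0.05, 27, 5, 6
X = rng.binomial(1, alpha, size=(N, N))
S = scan_statistic(X, m, m)
print("S =", S, "-> decide H1" if S >= k else "-> decide H0")
```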

3 Two-dimensional scan statistics with local vote decision

In this section, we give a detailed description of the proposed SSLV and of the problems introduced by the local vote. Section 3.1 presents the scan statistics with local vote decision algorithm. The correlation of local sensors is analyzed in Section 3.2; it turns out that the sensor decisions are no longer independent and identically distributed (i.i.d.) after the local vote, so expression (4) cannot be used for the SSLV. A new expression for the SSLV is derived in Section 3.3. Section 3.4 introduces a variable-step-parameter into the SSLV in order to take full advantage of the local vote.

3.1 Scan statistics with local vote decision

In this section, we introduce the SSLV in a two-dimensional region. The assumptions of the previous section still hold here; the difference is that sensors make a local vote decision before scan statistics. According to [6], various neighborhood definitions can be used, such as a fixed distance r or a fixed neighborhood size. Once one of them is selected, the corresponding parameters can be fixed. For a clearer description, we redefine some variables. Let X_{i,j} be the post-vote binary decision observed in the rectangular sub-region [i h_1,(i + 1)h_1] × [j h_2,(j + 1)h_2], where 1 ≤ i ≤ N_1 and 1 ≤ j ≤ N_2; here each N_i is two less than in Section 2 because, for simplicity, we exclude the sub-regions on the edge of the field. Let m_1 and m_2 be positive integers with 1 ≤ m_1 ≤ N_1 and 1 ≤ m_2 ≤ N_2.

$$ v_{i_{1},i_{2}}=\sum_{j=i_{2}}^{i_{2}+m_{2}-1}\sum_{i=i_{1}}^{i_{1}+m_{1}-1}X_{i,j} $$
(5)

Similarly, if \(v_{i_{1},i_{2}}\) exceeds a pre-set threshold k, the MA makes the final decision that a target is present. The largest number of events in an agent region can be expressed as

$$ S_{m_{1}\times m_{2};\,N_{1}\times N_{2}}=\max \left\{ v_{i_{1},i_{2}};\ 1\le i_{1}\le N_{1}-m_{1}+1,\ 1\le i_{2}\le N_{2}-m_{2}+1 \right\} $$
(6)

For simplicity, we abbreviate \(S_{m_{1}\times m_{2};\,N_{1}\times N_{2}}\) as S. The next step is to obtain an expression for P(S ≥ k) so that the SSLV can be used.
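To make the SSLV pipeline of Eqs. (5)-(6) concrete, here is a small sketch that applies a 3×3 local vote to a binary decision grid and then takes the scan maximum, reusing the scan_statistic helper from the Section 2 sketch. The 3×3 neighborhood and the threshold M_x passed as arguments are assumptions for a grid deployment, not part of the paper's notation.

```python
import numpy as np

def local_vote(X, M_x):
    """3x3 local vote: each inner cell becomes 1 if at least M_x of its
    9 neighbours (itself included) reported a detection; edge cells are
    dropped, as in the text."""
    N1, N2 = X.shape
    Z = np.zeros((N1 - 2, N2 - 2), dtype=int)
    for i in range(1, N1 - 1):
        for j in range(1, N2 - 1):
            votes = X[i - 1:i + 2, j - 1:j + 2].sum()
            Z[i - 1, j - 1] = int(votes >= M_x)
    return Z

def sslv_statistic(X, M_x, m1, m2):
    """Local vote followed by the m1-by-m2 scan maximum of Eqs. (5)-(6)."""
    Z = local_vote(X, M_x)
    return scan_statistic(Z, m1, m2)   # helper from the Section 2 sketch
```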

3.2 Correlation of sensors

Our algorithm introduces the local vote decision into traditional scan statistics. Therefore, we should identify what changes after combining the two algorithms, and the dependence among sensors should be examined first. For any sensor's post-vote decision Z_i, we start by calculating its expected value μ_i and variance σ_i².

$$ \mu_{i}=P\left(Z_{i}=1\right)=\sum_{n=M_{x}}^{M_{i}}\binom{M_{i}}{n}\alpha^{n}(1-\alpha)^{M_{i}-n} $$
(7)

where M_i is the number of neighbors, which depends on the local vote algorithm, and M_x is a threshold that has a significant influence on the performance. The variance is σ_i² = μ_i(1 − μ_i).

The dependence between Z_i and Z_j is determined by the intersection of their respective neighborhoods U(i) and U(j). The number of sensors in the intersection U(i)∩U(j) is denoted by n_{i,j}. Following the definition of covariance, we first compute E(Z_i Z_j) = P(Z_i = Z_j = 1) and then calculate the covariance between Z_i and Z_j. We divide the neighborhoods into three parts: A is the number of positive decisions in U(i)∩U(j), B is the number of positive decisions in U(i) but not in U(j), and C is the number of positive decisions in U(j) but not in U(i). Noting that A, B, and C are independent, we have

$$ E\left(Z_{i}Z_{j}\right)=\sum_{k=0}^{n_{i,j}}P(A=k)\,P\left(B>M_{x_{i}}-k\right)P\left(C>M_{x_{j}}-k\right) $$
(8)
$$ P\left(A=k\right)=\binom{n_{i,j}}{k}\alpha^{k}\left(1-\alpha\right)^{n_{i,j}-k} $$
(9)
$$ P\left(B>M_{x_{i}}-k\right)=\sum_{q=M_{x_{i}}-k+1}^{M_{i}-n_{i,j}}\binom{M_{i}-n_{i,j}}{q}\alpha^{q}(1-\alpha)^{M_{i}-n_{i,j}-q} $$
(10)
$$ P\left(C>M_{x_{j}}-k\right)=\sum_{q=M_{x_{j}}-k+1}^{M_{j}-n_{i,j}}\binom{M_{j}-n_{i,j}}{q}\alpha^{q}(1-\alpha)^{M_{j}-n_{i,j}-q} $$
(11)

The covariance is then given by

$$ \mathrm{Cov}(Z_{i},Z_{j}) = \left[E(Z_{i}Z_{j})-\mu_{i}\mu_{j}\right]I(n_{i,j}>0) $$
(12)

According to the derivations above, we find that the decisions X_{i,j} are no longer i.i.d. after the local vote.
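For reference, a small sketch of the moments derived above: it evaluates μ_i from Eq. (7) and Cov(Z_i, Z_j) from Eqs. (8)-(12), using a common vote threshold M_x for both sensors (the per-sensor thresholds of Eq. (8) collapse to M_x, matching the "at least M_x" convention of Eq. (7)). The function names are illustrative.

```python
from scipy.stats import binom

def mu_i(M_i, M_x, alpha):
    """P(Z_i = 1): at least M_x positives among the M_i neighbours (Eq. (7))."""
    return binom.sf(M_x - 1, M_i, alpha)      # sf(t) = P(X > t)

def cov_zizj(M_i, M_j, n_ij, M_x, alpha):
    """Cov(Z_i, Z_j) for neighbourhoods sharing n_ij sensors (Eqs. (8)-(12))."""
    if n_ij == 0:
        return 0.0                            # disjoint neighbourhoods: uncorrelated
    e_zizj = 0.0
    for k in range(n_ij + 1):
        p_a = binom.pmf(k, n_ij, alpha)                   # shared positives, Eq. (9)
        p_b = binom.sf(M_x - k - 1, M_i - n_ij, alpha)    # positives only in U(i), Eq. (10)
        p_c = binom.sf(M_x - k - 1, M_j - n_ij, alpha)    # positives only in U(j), Eq. (11)
        e_zizj += p_a * p_b * p_c
    return e_zizj - mu_i(M_i, M_x, alpha) * mu_i(M_j, M_x, alpha)
```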

3.3 Approximation for P(S ≥ k)

In [16], the authors prove the approximation for i.i.d. X_{i,j} using Markov chain imbeddable systems [17]. Obviously, that result is not applicable here. Fortunately, there are other ways to obtain an accurate approximation for P(S ≥ k), one of which is the Haiman theorem [18-20].

Theorem 1

Let {X_i} be a stationary 1-dependent sequence of random variables and, for x < w with w = sup{u; P(X_1 ≤ u) < 1}, let q_n = q_n(x) = P{max(X_1,…,X_n) ≤ x}. For any x such that P(X_1 > x) = 1 − q_1 ≤ 0.025 and any integer n > 3 such that 3.3n(1 − q_1)² ≤ 1, we have

$$ \frac{\left| q_{n}-\dfrac{2q_{1}-q_{2}}{\left(1+q_{1}-q_{2}+2(q_{1}-q_{2})^{2}\right)^{n}} \right|}{q_{n}}\le 3.3n(1-q_{1})^{2} $$
(13)

According to the Haiman theorem, we need to construct a stationary 1-dependent sequence. Supposing that N 1 = K m 1 and N 2 = L m 2, where K and L are positive integers, we have

$$ Z_{k}=\max_{\substack{(k-1)m_{1}<t\le km_{1} \\ 0<s\le (L-1)m_{2}}} v_{t,s},\qquad k=1,2,\ldots,K-1 $$
(14)

{Z_k}, k = 1,…,K−1, is a 1-dependent stationary sequence and \(P\left(S\le n \right)=P\left(\max_{k=1,\ldots,K-1}\{Z_{k}\}\le n\right)\). Let Q_2 = P(Z_1 ≤ n) and Q_3 = P(Z_1 ≤ n, Z_2 ≤ n). Then, if 1 − Q_2 ≤ 0.025, we can obtain the following approximation from the Haiman theorem

$$ P\left(S\le n\right)\approx (2Q_{2}-Q_{3})\left[1+Q_{2}-Q_{3}+2(Q_{2}-Q_{3})^{2}\right]^{-(K-1)} $$
(15)

with an error of about 3.3(K − 1)(1 − Q 2)2. To evaluate (15), one needs approximations for Q 2 and Q 3. Hence, the question is transformed into evaluating Q 2 and Q 3. We may apply Theorem 1 again considering the two sequences of random variables defined by

$$ Y_{l}=\max_{\substack{0<t\le m_{1} \\ (l-1)m_{2}<s\le lm_{2}}} v_{t,s} $$

and

$$ Z_{l}=\max_{\substack{0<t\le 2m_{1} \\ (l-1)m_{2}<s\le lm_{2}}} v_{t,s},\qquad l=1,2,\ldots,L-1 $$

which are also stationary and 1-dependent. Put Q 22=P(Y 1n), Q 23=P(Y 1n,Y 2n), Q 32=P(Z 1n) and Q 33 = P(Z 1n,Z 2n). We have

$$ Q_{22}=P\left(S(m_{1},m_{2},2m_{1},2m_{2})\le n\right), $$
$$ Q_{23}=P\left(S(m_{1},m_{2},2m_{1},3m_{2})\le n\right), $$
$$ Q_{32}=P\left(S(m_{1},m_{2},3m_{1},2m_{2})\le n\right), $$
$$ Q_{33}=P\left(S(m_{1},m_{2},3m_{1},3m_{2})\le n\right). $$

Then, if 1-Q 22 ≤ 0.025 and 1-Q 32 ≤ 0.025, we can still get the approximations from Theorem 1.

$$ Q_{2}\approx (2Q_{22}-Q_{23})\left[1+Q_{22}-Q_{23}+2(Q_{22}-Q_{23})^{2}\right]^{-(L-1)} $$
(16)

with an error of about 3.3(L−1)(1−Q 22)2 and

$$ Q_{3}\approx (2Q_{32}-Q_{33})\left[1+Q_{32}-Q_{33}+2(Q_{32}-Q_{33})^{2}\right]^{-(L-1)} $$
(17)

with an error of about 3.3(L − 1)(1 − Q_32)². Assuming L ≥ K and substituting (16) and (17) into (15), we obtain the final expression we need.

The total error on the resulting approximation of P(Sn) is bounded by about

$$ E_{app} = 3.3(L-1)(K-1)\left[(1-Q_{22})^{2}+(1-Q_{32})^{2}+(L-1)(Q_{22}-Q_{23})^{2}\right]. $$
(18)

Exact formulas for Q_{uv}, u,v ∈ {2,3}, are hard to obtain. Thus, we use Monte Carlo simulation to evaluate these quantities. The final expression is given by

$$ P\left(S\ge k\right)=1-P\left(S<k\right)=1-P\left(S\le k-1\right) $$
(19)

where P(Sk − 1) can be approximated by (15).
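The chain of approximations above can be organized as follows. The sketch below estimates the four Q_{uv} by Monte Carlo (here, for illustration, on i.i.d. Bernoulli cells; for the SSLV the voted grid would be simulated instead), applies the Haiman bound of Eq. (13) twice via Eqs. (16)-(17), and then once more via Eq. (15). It reuses the scan_statistic helper from the Section 2 sketch, and all parameter values are hypothetical.

```python
import numpy as np

def haiman(q1, q2, power):
    """Haiman approximation (Eq. (13)) for the maximum of a 1-dependent sequence."""
    d = q1 - q2
    return (2.0 * q1 - q2) / (1.0 + d + 2.0 * d * d) ** power

def q_uv(shape, m, n, alpha, runs=5000, rng=None):
    """Monte Carlo estimate of P(S(m, m, shape[0], shape[1]) <= n)."""
    rng = rng or np.random.default_rng()
    hits = sum(scan_statistic(rng.binomial(1, alpha, size=shape), m, m) <= n
               for _ in range(runs))
    return hits / runs

def p_s_le_n(m, K, L, n, alpha, runs=5000):
    """Approximate P(S <= n) on a (K*m) x (L*m) grid via Eqs. (15)-(17)."""
    Q22 = q_uv((2 * m, 2 * m), m, n, alpha, runs)
    Q23 = q_uv((2 * m, 3 * m), m, n, alpha, runs)
    Q32 = q_uv((3 * m, 2 * m), m, n, alpha, runs)
    Q33 = q_uv((3 * m, 3 * m), m, n, alpha, runs)
    Q2 = haiman(Q22, Q23, L - 1)   # Eq. (16)
    Q3 = haiman(Q32, Q33, L - 1)   # Eq. (17)
    return haiman(Q2, Q3, K - 1)   # Eq. (15)

# Global false alarm rate via Eq. (19): PF = 1 - P(S <= k - 1)
# PF = 1.0 - p_s_le_n(m=5, K=5, L=5, n=k - 1, alpha=0.05)
```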

3.4 The SSLV with variable-step-parameter

The traditional scan statistics performs a continuous scan. Disjoint-window scan statistics means that the MA travels across the ROI and scans the area using non-overlapping windows. In [12], the authors investigate the disjoint-window test and compare its performance with scan statistics: scan statistics clearly outperforms the disjoint-window test and its performance is more stable, but the disjoint window shortens the scan period. In this section, we introduce a variable-step-parameter for the SSLV. During the scan, the MA chooses the next start position according to the result of the current scan. Since the detection probability depends on the distance between the target and the sensors, sensors near the target have a higher probability of detecting it. If the current count is small, we can enlarge the step to avoid redundant scans, especially when the target is absent. The variable step is given by

$$ \mathit{step}=\max \left\{ \left\lfloor \left(1-\frac{v_{ts}}{k}\right)f_{ov} \right\rfloor,\ 1 \right\} $$
(20)

The scan region is a rectangular region R(i_1,i_2) = [i_1 h_1,(i_1 + m)h_1] × [i_2 h_2,(i_2 + m)h_2]. Assuming R(i_1,i_2) is the current scan region, the next scan region is [(i_1 + step)h_1,(i_1 + step + m)h_1] × [i_2 h_2,(i_2 + m)h_2], with i_1 + step ≤ N − m + 1. We only apply the step in one dimension for better performance. Nevertheless, the global false alarm rate can still be evaluated by (15).
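Below is a sketch of one possible reading of the variable-step scan: the MA sweeps the second coordinate with unit step and advances the first coordinate by the step of Eq. (20), jumping further when the current window count v_ts is small. The traversal order and the square f_ov window are assumptions made for illustration.

```python
import math

def next_step(v_ts, k, f_ov):
    """Variable step of Eq. (20): small window counts produce large jumps."""
    return max(math.floor((1.0 - v_ts / k) * f_ov), 1)

def variable_step_scan(Z, f_ov, k):
    """MA scan over the voted grid Z with the variable step along the first axis."""
    N1, N2 = Z.shape
    scans = 0
    for i2 in range(N2 - f_ov + 1):
        i1 = 0
        while i1 <= N1 - f_ov:
            scans += 1
            v = int(Z[i1:i1 + f_ov, i2:i2 + f_ov].sum())
            if v >= k:
                return True, scans          # decide H1, Eq. (29)
            i1 += next_step(v, k, f_ov)     # jump ahead when little is detected
    return False, scans                     # decide H0 after the full sweep
```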

4 Application of the SSLV in distributed system

In this section, we apply the SSLV to a particular scenario and provide a detailed description of the observation model, the local vote decision model, the false alarm probability at the MA, and the optimal M_x.

4.1 Observation model

In this section, we present the observation model depicted in Fig. 1. Sensors are placed on a regular grid in Fig. 1a and at random in Fig. 1c. We consider a two-dimensional square field of area b². The total number of sensors is M, and (x_s, y_s), s = 1,…,M, are the known coordinates of sensor s. Noises at the local sensors are i.i.d. Gaussian with zero mean and variance σ_w².

$$ w_{s}\sim \mathrm{N}\left(0,\sigma_{w}^{2}\right),\qquad s=1,\ldots,M $$
(21)

We design each sensor s to decide between the following hypotheses

$$ \begin{aligned} H_{0}&:\ r_{s}=w_{s} \\ H_{1}&:\ r_{s}=y_{s}+w_{s} \end{aligned} $$
(22)

where r_s is the received signal at sensor s. Sensors make their decisions according to the value of r_s. Here y_s = a_s/d_s, where a_s is i.i.d. Gaussian with zero mean and variance σ² (σ² represents the power of the signal emitted by the target at distance d_s = 1 m), and d_s is the Euclidean distance between the local sensor s and the target

$$ d_{s}=\sqrt{(x_{s}-x_{t})^{2}+(y_{s}-y_{t})^{2}} $$
(23)

where (x_t, y_t) are the unknown coordinates of the target. Sensors near the target receive more signal than those far away, and receiving more signal means a higher probability of detection. In our simulations, we assume that the location of the target follows a uniform distribution, and all local sensors use the same threshold τ. According to the Neyman-Pearson lemma [21], the local sensor-level false alarm rate and probability of detection are, respectively,

$$ p_{fa}=2Q\left(\sqrt{\frac{\tau}{\sigma_{w}^{2}}}\right) $$
(24)
$$ p_{ds}=2Q\left(\sqrt{\frac{\tau}{\sigma_{w}^{2}+\frac{\sigma^{2}}{d_{s}^{2}}}}\right) $$
(25)

where \(Q\left(x\right)=\int_{x}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-\xi^{2}/2}\,d\xi\) is the unit Gaussian exceedance function.
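The sensor-level probabilities of Eqs. (24)-(25) are straightforward to evaluate; the sketch below also inverts Eq. (24) to pick the threshold τ for a desired local false alarm rate. The parameter values are illustrative only.

```python
from math import sqrt
from scipy.stats import norm

def local_probs(tau, sigma_w2, sigma2, d_s):
    """Sensor-level false alarm and detection probabilities, Eqs. (24)-(25)."""
    q = norm.sf                                   # Gaussian exceedance function Q(x)
    p_fa = 2.0 * q(sqrt(tau / sigma_w2))
    p_ds = 2.0 * q(sqrt(tau / (sigma_w2 + sigma2 / d_s ** 2)))
    return p_fa, p_ds

# Choose tau so that p_fa = 0.05 (invert Eq. (24)), then evaluate p_ds at d_s = 0.5
sigma_w2, sigma2, d_s = 4.0, 1.0, 0.5
tau = sigma_w2 * norm.isf(0.05 / 2.0) ** 2
print(local_probs(tau, sigma_w2, sigma2, d_s))
```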

4.2 Local vote decision model

We divide the ROI into M (the total number of sensors) small sub-squares; the location of the sensor inside each sub-square is known. Let h = b/N, where N satisfies N² = M, and divide the square of area b² into M cells so that each cell of area h² contains exactly one sensor. Denote the cell [ih,(i + 1)h] × [jh,(j + 1)h] by c(i,j). We define X_{i,j} as the binary datum from the local sensor s inside c(i,j), with 0 ≤ i ≤ N − 1 and 0 ≤ j ≤ N − 1.

If sensors are deployed along a regular grid, the sensors at the vertices of the surrounding square can be selected as the neighbors for the local vote algorithm; the number of neighbors is then fixed, and each sensor has nine neighbors (including itself) in our simulation. When sensors are randomly deployed in the field, sensors within a fixed distance are selected as neighbors, and the number of neighbors is not fixed. Every sensor receives the decisions of its neighbors, ignoring the sensors on the edge of the field. If the sum of these decisions reaches the given threshold M_x, the sensor decides that a target is present. After the local vote decision, the local sensor-level false alarm rate and probability of detection are given by

$$ p'_{fa}=\sum_{n=M_{x}}^{M_{i}}\binom{M_{i}}{n}p_{fa}^{\,n}(1-p_{fa})^{M_{i}-n} $$
(26)
$$ p'_{ds}=\sum_{n=M_{x}}^{M_{i}}\binom{M_{i}}{n}p_{ds}^{\,n}(1-p_{ds})^{M_{i}-n} $$
(27)

where M_i is the number of neighbors and M_x is the pre-set threshold. We redefine X_{i,j} as the post-vote binary datum from the local sensor s inside c(i,j), with 1 ≤ i ≤ N and 1 ≤ j ≤ N, where N is now two less than above because the edge cells are excluded. We observe that for each 1 ≤ i ≤ N, the sequence (X_{i,j})_{1 ≤ j ≤ N} is c-dependent, and for each 1 ≤ j ≤ N, the sequence (X_{i,j})_{1 ≤ i ≤ N} is also c-dependent, where c = 2 in our simulation; c depends on the choice of local vote algorithm. In (15), we construct a 1-dependent sequence to evaluate the global false alarm probability, and only when c ≤ m can we guarantee that the sequence is 1-dependent.
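A short sketch of the post-vote probabilities of Eqs. (26)-(27); it treats all neighbors as sharing the same p_fa or p_ds, as those equations do, and the numerical values are illustrative.

```python
from scipy.stats import binom

def voted_probs(p_fa, p_ds, M_i, M_x):
    """Post-vote sensor-level probabilities, Eqs. (26)-(27): at least M_x of the
    M_i neighbours (sensor included) must report a detection."""
    p_fa_vote = binom.sf(M_x - 1, M_i, p_fa)
    p_ds_vote = binom.sf(M_x - 1, M_i, p_ds)
    return p_fa_vote, p_ds_vote

# Grid deployment: 9 neighbours, vote threshold M_x = 4
print(voted_probs(0.05, 0.6, 9, 4))
```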

4.3 False alarm probability at the MA

The binary data from the local sensors can be expressed as I_s ∈ {0,1}, s = 1,…,M′, where M′ is the number of sensors excluding those on the edge of the field. After the local vote, I_s takes the value 1 when the sensor reports a detection and 0 otherwise. It is easy to verify that

$$ \sum_{s=1}^{M'}I_{s}=\sum_{j=1}^{N}\sum_{i=1}^{N}X_{i,j} $$
(28)

The MA travels across the area and sequentially collects the local binary decisions from sensors located inside its agent region, which we take to be a square of side f_ov h, where f_ov is the window size. The sequential fusion rule at the MA, for 1 ≤ i_1 ≤ N − f_ov + 1 and 1 ≤ i_2 ≤ N − f_ov + 1, is given by

$$ \left\{ \begin{array}{ll} v_{i_{1},i_{2}}\ge k & \Rightarrow \ \text{decide } H_{1} \\ \text{otherwise} & \Rightarrow \ \text{the MA continues to scan} \end{array} \right. $$
(29)

where \(v_{i_{1},i_{2}}=\sum_{j=i_{2}}^{i_{2}+f_{ov}-1}\sum_{i=i_{1}}^{i_{1}+f_{ov}-1}X_{i,j}\). At the MA, the probability of global false alarm PF is

$$ PF=P\left(S_{f_{ov}\times f_{ov}}\ge k\,|\,H_{0}\right) $$
(30)

We note that (30) can be evaluated using the approximation in (15) after substituting m = f_ov and α = p_fa.

In Fig. 2, we plot the global probability of false alarm PF for the MA versus the local probability of false alarm p_fa of a sensor. Curves obtained using the SSLV approximation in (15) and from simulations (based on 5000 Monte Carlo runs) are plotted. Figure 2 shows that the approximation in (15) is very accurate.

Fig. 2

Probability of false alarm for MA PF versus probability of false alarm for the local sensor p fa . Here, we have N = 27, b = 5, σ 2 = 1, σ w 2 = 4, k = 6, M x = 4, and f ov = 5. Simulations are based on 5000 runs

4.4 Optimal M x

Obviously, the global probabilities of false alarm and detection depend on the value of M_x. In this section, we look for the optimal M_x and show that such an optimal value exists. The choice of M_x must maximize the probability of detection at the given global false alarm rate, so the problem can be written as

$$ \max_{M_{x}:\,PF=\alpha} PD $$
(31)

where PD is the probability of detection at the MA. For a given exact value of α, it is hard to determine the related parameter k because discrete distributions are involved. To solve the constrained optimization problem in (31), we use a randomized test [22]. Defining α_1 and α_2 as follows

$$ \begin{aligned} P\left(S_{f_{ov}\times f_{ov}}\ge k_{1}\right)&\approx \alpha_{1}<\alpha \\ P\left(S_{f_{ov}\times f_{ov}}\ge k_{1}-1\right)&\approx \alpha_{2}>\alpha, \end{aligned} $$
(32)

we use a “coin-flip” decision with probability

$$ k=\begin{cases} k_{1}, & \text{if } \dfrac{\alpha-\alpha_{1}}{\alpha_{2}-\alpha_{1}}>1/2 \\ k_{1}-1, & \text{otherwise.} \end{cases} $$
(33)

That is, we choose k according to whichever of α_1 and α_2 is closer to α.
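The threshold selection of Eqs. (32)-(33) can be written as a small helper; pf_of_k stands for any evaluator of the global false alarm probability P(S ≥ k | H_0), for example the approximation of Section 3.3, and is a hypothetical callable.

```python
def choose_k(pf_of_k, alpha, k_max=100):
    """Pick the scan threshold k for a target global false alarm rate alpha.

    pf_of_k(k) must return P(S >= k | H0); k1 is the smallest k with
    pf_of_k(k) < alpha, so alpha lies between alpha1 and alpha2 (Eq. (32))."""
    k1 = next(k for k in range(1, k_max + 1) if pf_of_k(k) < alpha)
    alpha1, alpha2 = pf_of_k(k1), pf_of_k(k1 - 1)
    # Eq. (33): keep whichever achievable rate is closer to alpha
    return k1 if (alpha - alpha1) / (alpha2 - alpha1) > 0.5 else k1 - 1
```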

Given an M_x and an exact global probability of false alarm, we can obtain the corresponding k, from which the global probability of detection follows. In Fig. 3, combining (3) and (30), we plot the global probability of detection versus M_x with α set to 0.1. As shown in Fig. 3, there does exist an optimal M_x that maximizes the probability of detection PD for the MA under the given conditions, and employing this optimal M_x yields a significant improvement in PD. In our simulation, PD increases with M_x until M_x = 4, where it reaches its maximum, and decreases thereafter. The optimal M_x depends on the other parameters and differs across environments; however, under the given conditions, an optimal value that maximizes PD does exist.

Fig. 3

Probability of detection for MA versus M x . Here, we have N = 27, b = 5, σ 2 = 1, σ w 2 = 4, k = 6, p fa = 0.05, and f ov = 5. Simulations are based on 5000 runs

From a theoretical perspective, increasing M_x decreases the value of k required for a fixed α. The decrease of k increases PD; meanwhile, the increase of M_x decreases \(p'_{ds}\) in (27). The decrease of \(p'_{ds}\) constrains the increase of PD, and PD mainly relies on \(p'_{ds}\). At the beginning, the decrease of k compensates for the influence of the decreasing \(p'_{ds}\); hence, PD exhibits a unimodal characteristic.

5 Performance analysis

Having completed the analysis above, we now compare the SSLV with scan statistics and examine the difference in performance between them. Numerous simulations and analyses are given in this section.

In Fig. 4, we plot the global probability of false alarm for the MA versus the threshold k under different deployments. In Fig. 4a, sensors are deployed on a regular grid and M_i = 9. In Fig. 4b, sensors are randomly deployed in the field and the neighborhood distance is set to 0.1. Figure 4 shows that the global probability of false alarm decreases significantly as M_x increases, and a lower false alarm rate is what we want. There is a crossing point in Fig. 4, at which scan statistics and the SSLV have the same global probability of false alarm. Before that critical point, the SSLV achieves a lower PF than scan statistics; after it, scan statistics outperforms the SSLV. This is because the local vote increases the count of sensors that have detected an event compared with scan statistics. The crossing point is not an integer and therefore does not occur in practice; however, the two nearest positive integers around it are the practical key points.

Fig. 4

Probability of false alarm for MA versus threshold k. Here, we have N = 27, b = 5, σ 2=1, σ w 2 = 4, p fa = 0.05, and f ov = 5. Simulations are based on 5000 runs. a Grid deployment. b Random deployment

In Fig. 5, we plot the global probability of detection for the MA versus the threshold k. Figure 5 shows that the PD of the SSLV does not always exceed that of scan statistics; however, for large values of the threshold k, the SSLV performs better. A larger threshold k corresponds to a smaller demanded global probability of false alarm. The behavior is related to the value of M_x: as M_x increases, the false alarm rate decreases, but the probability of detection decreases as well. Hence, it is important to select M_x according to the detection environment. Under a given condition, we can use the method in Section 4.4 to evaluate the optimal M_x. Overall, the SSLV substantially decreases the probability of false alarm and improves the global probability of detection compared with scan statistics.

Fig. 5

Probability of detection for MA versus the threshold k. Here, we have N = 27, b = 5, σ 2 = 1, σ w 2 = 4, p fa = 0.05, and f ov = 5. Simulations are based on 5000 runs. a Grid deployment. b Random deployment

In Fig. 6, we plot the global probability of detection for the MA versus σ² (the power of the signal emitted by the target at distance d_s = 1 m) at different local false alarm rates. Increasing σ² enables more sensors to sense the signal, so the local probability of detection increases, which in turn raises the global probability of detection. Simulations are based on 5000 runs. Figure 6 shows that the probability of detection increases with the signal strength.

Fig. 6

Probability of detection for MA versus σ 2. Here, we have N = 27, b = 5, M x = 3, σ w 2 = 6, and f ov = 5. Simulations are based on 5000 runs

In Fig. 7, we plot the global probability of detection versus σ² for the different methods. Figure 7 illustrates that the SSLV has a higher PD than scan statistics when the SNR is low, which means the SSLV is more suitable for harsh environments. When the SNR is high, the advantage of the SSLV is not evident.

Fig. 7

Probability of detection for MA versus σ 2. Here, we have N = 27, b = 5, σ w 2 = 4, k = 8, p fa = 0.05, and f ov = 5. Simulations are based on 5000 runs

After introducing the variable-step-parameter into the SSLV, we need to examine its influence. Figure 8 presents the receiver operating characteristic (ROC) curves of the SSLV and of the SSLV with the variable-step-parameter. Figure 8 shows that the variable-step-parameter has little negative effect on detection performance: the local vote concentrates the sensors that report the event so much that the parameter can be used without concern. Figure 9 compares the scan times of the different methods to show the advantage of the variable-step-parameter. When the target is absent, the SSLV with variable step significantly decreases the number of scans; when the target is present, it reduces the number of scans to some extent. In our simulation, the target is placed at the center of the field.

Fig. 8

ROC of SSLV and SSLV with variable step. Here, we have N = 27, b = 5, σ 2 = 1, σ w 2 = 4, p fa = 0.05, k = 6, M x = 3, and f ov = 5. Simulations are based on 5000 runs

Fig. 9

Scan times versus different hypotheses. Here, we have N = 27, b = 5, σ 2 = 1, σ w 2 = 4, k = 6, M x = 3, p fa = 0.05, and f ov = 5. The target is placed at the center of field under the H1 hypothesis

6 Conclusions

This paper has introduced the SSLV algorithm, which is specifically designed for target detection in low SNR conditions. The correlation between sensors and the expression for the global false alarm rate after the local vote have been described in detail. Moreover, based on the SSLV, the SSLV with a variable-step-parameter has been proposed. The two algorithms have been examined in simulation studies, which show that they produce similar detection accuracy, but the SSLV with variable step is substantially faster within one scan cycle. Nevertheless, some potential research topics remain. First, obtaining the optimal M_x from simulation is clearly not the best method, and an expression for the optimal M_x should be derived. Furthermore, the variable-step-parameter for the SSLV could be extended to two dimensions without losing detection performance.

References

  1. Z Chair, PK Varshney, Optimal data fusion in multiple sensor detection systems. IEEE Trans. Aerosp. Electron. Syst. 22(1), 98–101 (1986).

  2. R Niu, PK Varshney, MH Moore, D Klamer, Decision fusion in a wireless sensor network with a large number of sensors. Presented at the 7th IEEE Int. Conf. Information Fusion (ICIF) (Stockholm, Sweden, 2004).

  3. R Niu, P Varshney, Performance analysis of distributed detection in a random sensor field. IEEE Trans. Signal Process. 56(1), 339–349 (2008).

  4. D Marco, Y-H Hu, Distance-based decision fusion in a distributed wireless sensor network. Telecommun. Syst. 26(2–4), 339–350 (2004).

  5. SH Javadi, A Peiravi, Fusion of weighted decisions in wireless sensor networks. IET Wirel. Sens. Syst. 5(2), 97–105 (2015).

  6. N Katenka, E Levina, G Michailidis, Local vote decision fusion for target detection in wireless sensor networks. IEEE Trans. Signal Process. 56(1), 329–338 (2008).

  7. MS Ridout, An improved threshold approximation for local vote decision fusion. IEEE Trans. Signal Process. 61(5), 1104–1106 (2013).

  8. T Tango, K Takahashi, A flexible spatial scan statistic with a restricted likelihood ratio for detecting disease clusters. Stat. Med. 31(30), 4207 (2012).

  9. J Fu, W Lou, Distribution theory of runs and patterns and its applications (World Scientific, Singapore, 2003).

  10. DU Pfeiffer, KB Stevens, Spatial and temporal epidemiological analysis in the Big Data era. Prev. Vet. Med. 122(1–2), 213–220 (2015).

  11. C Teljeur, A Kelly, M Loane, et al, Using scan statistics for congenital anomalies surveillance: the EUROCAT methodology, vol. 30, (2015).

  12. M Guerriero, P Willett, J Glaz, Distributed target detection in sensor networks using scan statistics. IEEE Trans. Signal Process. 57(7), 2629–2639 (2009).

  13. TL Wu, J Glaz, A new adaptive procedure for multiple window scan statistics. Comput. Stat. Data Anal. 82, 164–172 (2015).

  14. BJ Reich, Multiple window discrete scan statistic for higher-order Markovian sequences. J. Appl. Stat. 42(8), 1–16 (2015).

  15. X Wang, J Glaz, Variable window scan statistics for normal data. Commun. Stat. Theory Methods 43, 2489–2504 (2014).

  16. MV Boutsikas, M Koutras, Reliability approximations for Markov chain imbeddable systems. Methodol. Comput. Appl. Probab. 2, 393–412 (2000).

  17. WC Lee, Power of discrete scan statistics: a finite Markov chain imbedding approach. Methodol. Comput. Appl. Probab. 17(3), 833–841 (2015).

  18. A Amarioarei, C Preda, Approximations for two-dimensional discrete scan statistics in some block-factor type dependent models. J. Stat. Plann. Infer. 151(3), 107–120 (2014).

  19. G Haiman, C Preda, Estimation for the distribution of two-dimensional discrete scan statistics. Methodol. Comput. Appl. Probab. 8, 373–382 (2006).

  20. G Haiman, 1-dependent stationary sequences for some given joint distributions of two consecutive random variables. Methodol. Comput. Appl. Probab. 14, 445–458 (2012).

  21. HV Poor, An introduction to signal detection and estimation. Springer Texts Electr. Eng. 333(1), 127–139 (1998).

  22. S Kotz, DL Banks, CB Read, Encyclopedia of statistical sciences (Wiley-Interscience, New York, 1997).


Acknowledgements

This work was supported in part by the Overseas Academic Training Funds, University of Electronic Science and Technology of China (OATF, UESTC) (Grant No.201506075013), and the Program for Science and Technology Support in Sichuan Province (Grant Nos. 2014GZ0100 and 2016GZ0088).

Authors’ contributions

JHL and QW proposed the algorithm and carried out the simulations. JHL and QW analyzed the experimental results. JHL gave the critical revision and final approval. Both authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author information


Corresponding author

Correspondence to Junhai Luo.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Luo, J., Wu, Q. Scan statistics with local vote for target detection in distributed system. EURASIP J. Adv. Signal Process. 2017, 32 (2017). https://doi.org/10.1186/s13634-017-0467-y

