Quickest Detection of a Random Signal in Background Noise Using a Sensor Array

The problem of detecting the onset of a signal impinging at an unknown angle on a sensor array is considered. An algorithm based on parallel CUSUM tests matched to each of a set of discrete beamforming angles is proposed. Analytical approximations are developed for the mean time between false alarms and for the detection delay of this algorithm. Simulations are included to verify the results of this analysis.


INTRODUCTION
In this paper, we consider the problem of detecting, as soon as possible, a target that appears abruptly, at an unknown angle, at a sensor array. This problem arises in a number of applications, including radar, sonar, and communications. For a fixed angle of incidence and known signal and noise distributions, this is a classical problem in statistical change detection, and can be solved, for example, by Page's CUSUM algorithm. However, here we consider the situation in which the angle of incidence and the signal and noise statistics are unknown. In this case, alternatives to the classic CUSUM must be considered, and a number of such methods have been developed [1,2,3,4,5].
Here, we use an approach motivated by Nikiforov [4] in which we discretize the set of incidence angles and run parallel change-detection algorithms, each one matched to a beamformer pointed at a particular angle. The presence of a signal is announced the first time the test statistic associated with any of these parallel algorithms crosses a threshold. The angle of incidence is then estimated as the pointing angle corresponding to the first test to detect. This test can be analyzed by adapting the methodology of Lorden [6], and we do so by deriving expressions for the mean time between false alarms and for the asymptotic mean detection delay of our test. We include a number of simulation results to verify these expressions and to illustrate further properties of the proposed algorithm, including the effect of increasing the number of array elements on performance. This paper is organized as follows. In Section 2, we describe a model for the problem of interest, including the relevant performance criteria. In Section 3, we briefly review the operation and properties of the classic Page's CUSUM test to provide a framework for our algorithm. Section 4 develops our parallel beamformer-based CUSUM algorithm, while Section 5 contains an analysis of the algorithm under the assumption of Gaussian noise. There, we also measure the performance of the proposed method against the optimal algorithm that has perfect knowledge of the signal and noise distributions together with the direction of arrival, for the case when both the signal and noise are Gaussian distributed. Section 6 discusses simulation results that illustrate the algorithm's properties. Finally, Section 7 contains some concluding remarks.

STATEMENT OF THE PROBLEM
We assume a uniform linear sensor array with L elements and consider the following signal model:

y_i = n_i, for i < ν,
y_i = a(φ)X_i + n_i, for i ≥ ν,    (1)

where ν is an unknown change point after which {X_i}, an independent identically distributed (i.i.d.) narrowband complex-valued random signal source, is incident on the array (see Figure 1) at an unknown angle φ ∈ [−π/2, π/2), and a(φ) is the L×1 array response vector (also called the steering vector) associated with it. The array response has the following form:

a(φ) = [1, e^{−j2π(d/λ) sin φ}, . . . , e^{−j2π(L−1)(d/λ) sin φ}]^T,    (2)

where λ is the wavelength and d is the sensor spacing, typically chosen as half the wavelength. Finally, {n_i} is the ambient noise, independent of the source signal and white in both space and time, with covariance matrix σ_n² I_L.

Figure 1: Linear sensor array with 5 elements, with a narrowband source in the far-field impinging on the array from a direction φ.

According to the above model, before an unknown time instant ν (the change point), there is only noise in the system, and after the change point a random signal appears at an angle φ in addition to the noise. We wish to detect the appearance of this random signal as soon as possible and also to estimate the angle of arrival of this source. In particular, we would like to design a detection algorithm that does not rely on knowledge of the distributions of the random process {X_i} and the noise process {n_i}, except that the noise variance is assumed to be known. In the following, we pose this problem formally as a quickest change detection problem [1,6], and define the criteria involved in designing an algorithm for this purpose.
Let P^(ν) denote the distribution of the sequence of observations y_1, y_2, . . . , y_{ν−1}, y_ν, y_{ν+1}, . . ., where ν is the change point, and let E^(ν) denote expectation under P^(ν). We assume that, under P^(ν), the random variables {y_i} are independent with a marginal probability density function (pdf) p_0 for i < ν and p_1 ≠ p_0 for i ≥ ν. Let P_0 correspond to the case where ν = ∞, that is, {y_i} ∼ p_0 for all i ≥ 1 (the no-change situation), and let expectation under P_0 be denoted by E_0. Also, we use P_1 and E_1 instead of P^(1) and E^(1) for the case when ν = 1, that is, {y_i} ∼ p_1 for all i ≥ 1. The goal is to minimize, over all possible stopping times N, the worst-case mean delay for detection,

τ̄(N) = sup_{ν≥1} ess sup E^(ν){(N − ν + 1)^+ | y_1, . . . , y_{ν−1}},    (3)

such that the mean time before a false alarm satisfies

E_0{N} ≥ γ    (4)

for a given γ > 0. So, the idea is to detect the presence of a change as soon as possible while keeping the false alarm rate below a desired level.

PAGE'S CUSUM TEST
Before moving on to the more complex case of composite hypotheses, we first review the basic case in which both the pre- and postchange hypotheses are simple. Later on, we will discretize the parameter space of the angle of arrival, reduce the composite alternative to a set of simple hypotheses in that parameter, and apply parallel simple-change detection tests. When the likelihood ratio of the observations under the two hypotheses can be written explicitly, the following algorithm, called Page's CUSUM test [7], is known to be optimal in the minimax sense described above [6,8]. Page's test declares the detection of a change point the first time the CUSUM statistic

Q_i = max_{1≤n≤i} G_n^i,    (5)

or the equivalent and computationally efficient recursive form

Q_i = (Q_{i−1} + g(y_i))^+, Q_0 = 0,    (6)

crosses a threshold h, where (x)^+ = max{x, 0}, G_n^i = Σ_{m=n}^{i} g(y_m), and g(y_i) = log p_1(y_i)/p_0(y_i) is the log-likelihood function (or score function), which should satisfy 0 < ρ ≜ E_1{g(y_i)} < ∞. Recall that ρ = E_1{g(y_i)} is the Kullback-Leibler distance between the two densities and is always positive, while E_0{g(y_i)} is always negative.
Thus, the stopping time N of the CUSUM algorithm is given by

N = min{i ≥ 1 : max_{1≤n≤i} G_n^i ≥ h},    (7)

or, equivalently,

N = min{i ≥ 1 : Q_i ≥ h},    (8)

where the equivalence holds under the condition that h > 0. For this algorithm, we have the following well-known result of Lorden [6]: the mean time to a false alarm satisfies E_0{N} ≥ e^h, while, as h → ∞, the worst-case mean detection delay satisfies

τ̄(N) ∼ h/ρ.    (9)

Note that for the CUSUM stopping rule, the worst-case mean detection delay corresponds to the change point ν = 1, since this is when the CUSUM statistic Q_i ≥ 0, for all i, is at its minimum (Q_0 = 0) and hence is the farthest from the threshold h > 0. The idea behind the CUSUM algorithm is that it stops at the first time instant i such that, for some n ≤ i, the log-likelihood ratio test to decide between the hypotheses H_0[i] : y_n, . . . , y_i ∼ i.i.d. p_0 and H_1^(n)[i] : y_n, . . . , y_i ∼ i.i.d. p_1 exceeds a certain threshold. The basic operating principle of the recursive form (6) is that, before the change, E_0{g(y_i)} < 0, so that Q_i remains close to zero; whereas after the change, Q_i starts drifting upward with a positive mean ρ = E_1{g(y_i)} until it ultimately crosses the threshold h.
In general, when the likelihood ratio is not known explicitly, as is the case in our situation, the score function g(·) can be replaced by any other function with negative mean before the change and positive mean after the change, that is, satisfying the conditions E_0{g(y_i)} < 0 and E_1{g(y_i)} > 0. In this case, the stopping time is no longer guaranteed to be optimal, but it remains a very good candidate once an appropriate function is chosen [1,9,10], and its worst-case mean detection delay satisfies

τ̄(N) ≤ E_1{N} ∼ h/E_1{g(y_i)}, as h → ∞.    (10)
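As a concrete illustration of the recursion (6), the following minimal Python sketch runs a CUSUM with a user-supplied score sequence. The Gaussian mean-shift example at the bottom, with score g(y) = y − 1/2 for a unit-variance shift from mean 0 to mean 1, is our own illustrative choice, not a model from this paper.

```python
import numpy as np

def cusum_stopping_time(scores, h):
    """Run the recursion Q_i = (Q_{i-1} + g(y_i))^+ and return the
    index at which Q_i first reaches the threshold h (None if it
    never does), together with the statistic's path."""
    q, path = 0.0, []
    for i, g in enumerate(scores):
        q = max(q + g, 0.0)
        path.append(q)
        if q >= h:
            return i, path
    return None, path

# Illustrative example (ours, not from this paper): detect a mean
# shift 0 -> 1 in unit-variance Gaussian noise, for which the
# log-likelihood score is g(y) = y - 1/2 (negative mean before the
# change, positive mean rho = 1/2 after it).
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(1.0, 1.0, 200)])
stop, _ = cusum_stopping_time(y - 0.5, h=15.0)
```

With h = 15 and ρ = 1/2, the delay after the change point (sample 200) is on the order of h/ρ = 30 samples, in line with (9).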

PARALLEL BEAMFORMER-BASED CUSUM ALGORITHM
Since the angle of arrival φ after the change point is unknown, we have a composite alternative hypothesis in that parameter. Motivated by Nikiforov's approach [4] for detecting a change under such a condition, we propose a simple scheme in which we simultaneously run K parallel CUSUM algorithms, each using a conventional beamformer. Since we assume no knowledge of the probability distributions of the target signal and noise, this suboptimal method acts as an energy detection scheme, with each CUSUM "tuned" for the detection of signal energy from a particular direction of arrival. The array weight vector, w(θ), for the conventional beamformer [11] (also called the fixed-phased array beamformer) is given by w(θ) = a(θ)/L, where a(θ), defined in (2), is the array response or steering vector associated with a source incident at angle θ. Hence, the output of the conventional beamformer is unity in the look direction: w(θ)^H a(θ) = (1/L)a(θ)^H a(θ) = 1, where w(θ)^H is the conjugate transpose of w(θ). Note that the beamformer response is maximal in the look direction θ, that is, |w(θ)^H a(φ)| ≤ 1, with equality if and only if θ = φ. In general, given an array weight vector w(θ), the function

z(θ, φ) = |w(θ)^H a(φ)|²,

for fixed θ and as a function of φ, is called the beampattern corresponding to the beamformer pointing in the direction θ; it is the collection of that beamformer's responses as the angle of incidence varies over φ. On the other hand, for fixed φ, z(θ, φ), as a function of θ, is called the steered response corresponding to the angle of incidence φ; it is the collection of beamformer responses as the look direction varies over θ.
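The steering vector (2) and the beampattern z(θ, φ) can be sketched as follows; the sign convention in the complex exponential is an assumption, and the function names are ours.

```python
import numpy as np

def steering_vector(phi, L, d_over_lambda=0.5):
    """Array response a(phi) of (2) for a uniform linear array;
    the phase-sign convention is an assumption."""
    return np.exp(-2j * np.pi * d_over_lambda * np.arange(L) * np.sin(phi))

def beampattern(theta, phi, L):
    """z(theta, phi) = |w(theta)^H a(phi)|^2 for the conventional
    beamformer w(theta) = a(theta)/L."""
    w = steering_vector(theta, L) / L
    return float(np.abs(w.conj() @ steering_vector(phi, L)) ** 2)
```

By construction the response is unity in the look direction and at most unity elsewhere; for example, beampattern(0.3, 0.3, 6) evaluates to 1.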
To devise a test for (1), we discretize the parameter space [−π/2, π/2) into K angles {θ_1, θ_2, . . . , θ_K} such that −π/2 ≤ θ_1 < · · · < θ_K < π/2. We can then design a CUSUM test for the detection of a target incident from each such angle, operate the tests in parallel to cover the whole space, and combine them into a single change-detection algorithm. For d = λ/2, the fixed-phased array beamformer pointing in the direction θ_k is defined as

w_k = w(θ_k) = a(θ_k)/L, k = 1, . . . , K.    (13)

The stopping time, N, of our parallel beamformer-based CUSUM test is then given by

N = min{i ≥ 1 : Q̄_i ≥ h},    (14)

with Q̄_i defined as follows:

Q̄_i = max_{1≤k≤K} Q_i^k,    (15)

and, for each 1 ≤ k ≤ K, the CUSUM statistic Q_i^k is given by

Q_i^k = (Q_{i−1}^k + g_i(k))^+, Q_0^k = 0,    (16)

where we use the following score function:

g_i(k) = |w_k^H y_i|² − σ_n²/L − c.    (17)

Defining the function g_i(k) in this fashion makes each CUSUM test act as an energy accumulator "tuned" to the direction θ_k, which will make the alarm go off in the presence of a target once the statistic starts drifting upward, collecting signal energy coming from the look direction. In order to see this, we first examine the behavior in the prechange (noise-only) case. Notice that, since y_i = n_i for i < ν, we have E^(ν){y_i y_i^H} = σ_n² I_L, and using the fact that w_k^H w_k = 1/L,

E^(ν){g_i(k)} = σ_n² w_k^H w_k − σ_n²/L − c = −c, for i < ν.

Here, the bias term c must be chosen to satisfy the condition c > 0, so that the expected value of g_i(k) is negative when there is no target present, keeping Q_i^k close to zero before the target appears. Next, looking at the postchange behavior (signal plus noise), it is easy to see that, since y_i = a(φ)X_i + n_i for i ≥ ν,

E^(ν){g_i(k) | φ} = |w_k^H a(φ)|² σ_s² − c, for i ≥ ν,

where σ_s² = E{|X_i|²} is the target signal energy and |w_k^H a(φ)| is the kth beamformer's response for the signal coming from direction φ. Now we can see that the bias term c > 0 plays an important role in the postchange situation and should be chosen, based on the designed set of beamformers and the minimum signal-to-noise-ratio (SNR) requirement, so that it satisfies the condition c < max_{1≤k≤K} |w_k^H a(φ)|² σ_s², for all φ.
Then, for any angle of arrival φ, we have max_{1≤k≤K} E^(ν){g_i(k) | φ} > 0 for i ≥ ν, which guarantees that at least one of the CUSUM statistics (16) will start drifting upward with a positive mean that is proportional to the target signal energy σ_s² and the beamformer response for that direction, |w_k^H a(φ)|², enabling the detection of the target as soon as the statistic exceeds a certain threshold value h. In particular, let the angle of arrival of the incoming signal, φ*, be such that θ_ℓ = φ* for some ℓ; namely, we assume that the incoming angle of arrival matches exactly one of the look directions in our set of beamformers. Then, the ℓth CUSUM test, based on that beamformer, is expected to be responsible for the detection of the target, since that beamformer will have the unity (and maximal) response compared to the others, that is, |w_ℓ^H a(φ*)| = 1 for φ* = θ_ℓ. Now, if K, the number of beamformers, is chosen large enough, then we can cover the whole interval [−π/2, π/2), and we do not need to restrict φ* to belong to the set {θ_1, . . . , θ_K}; that is, φ* can be any real number in the interval. In this situation, the response of the beamformer whose look direction is the closest to φ* will again be approximately unity. This is of course true if, for a given number of sensors L, the number of beamformers K is chosen such that the main lobe width of the beampatterns (see Figures 2a, 2b, 2c, 3a, 3b, and 3c) is not very small compared to the difference between the beamformer look directions. Namely, for large L, we have better resolution and, accordingly, K must also be chosen proportionately large so that we have close-to-unity response for all angles of arrival φ. Thus, each beamformer will have an interval of responsibility for detection, and the union of these intervals will cover the whole region.
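Putting the pieces together, a minimal sketch of the parallel test (14)-(17): K CUSUM statistics driven by the energy score g_i(k), with the alarm raised when any statistic crosses h and the angle estimated as the look direction of the statistic that crossed. All parameter values in the usage example are illustrative choices of ours, not settings from the paper.

```python
import numpy as np

def steering_vector(phi, L):
    # Half-wavelength spacing (d = lambda/2); phase-sign convention assumed.
    return np.exp(-1j * np.pi * np.arange(L) * np.sin(phi))

def parallel_cusum(ys, thetas, L, sigma_n2, c, h):
    """K parallel CUSUMs with score g_i(k) = |w_k^H y_i|^2 -
    sigma_n^2/L - c. Returns the stopping index and the look
    direction of the first statistic to cross h, or (None, None)."""
    W = np.stack([steering_vector(t, L) / L for t in thetas])  # K x L
    Q = np.zeros(len(thetas))
    for i, y in enumerate(ys):
        g = np.abs(W.conj() @ y) ** 2 - sigma_n2 / L - c
        Q = np.maximum(Q + g, 0.0)
        k = int(np.argmax(Q))
        if Q[k] >= h:
            return i, float(thetas[k])
    return None, None

# Illustrative run: Gaussian target at phi = 0.3 rad appearing at
# sample 100, L = 4 sensors, sigma_s^2/sigma_n^2 = 6 dB.
rng = np.random.default_rng(2)
L, sigma_n2, sigma_s2, phi = 4, 1.0, 4.0, 0.3
cplx = lambda n, v: np.sqrt(v / 2) * (rng.normal(size=n) + 1j * rng.normal(size=n))
noise = [cplx(L, sigma_n2) for _ in range(200)]
ys = [n if i < 100 else steering_vector(phi, L) * cplx(1, sigma_s2)[0] + n
      for i, n in enumerate(noise)]
thetas = np.deg2rad(np.arange(-90, 90, 2.0))
stop, phi_hat = parallel_cusum(ys, thetas, L, sigma_n2, c=1.0, h=50.0)
```

Here the prechange drift of every statistic is −c, while the beamformers nearest 0.3 rad acquire a positive drift of roughly σ_s² − c, so the alarm fires shortly after sample 100 with an angle estimate near the true direction.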
We can now proceed to find the mean time to a false alarm for the parallel CUSUM algorithm whose stopping time is given in (14). First, we define the following K stopping times corresponding to each CUSUM rule (16):

N_k = min{i ≥ 1 : Q_i^k ≥ h}, k = 1, . . . , K,    (18)

and the final stopping time, which is equivalent to (14), is then given by

N = min{N_1, . . . , N_K}.    (19)

Now, using the equivalent representation (7), the stopping time N_k in (18) can be expressed as

N_k = min{T_k(n) : n = 1, 2, . . .},    (20)

where

T_k(n) = min{i ≥ n : G_n^i(k) ≥ h},    (21)

with G_n^i(k) = Σ_{m=n}^{i} g_m(k). Hence, the stopping time (19) is given by

N = min{T_k(n) : n = 1, 2, . . . ; k = 1, . . . , K}.    (22)

For any finite n, T_k(n) is the stopping time of a sequential hypothesis test acting on observations y_n, y_{n+1}, . . ., for which we have [1,10]

P_0(T_k(n) < ∞) ≤ e^{−s_0(k) h},    (23)

which implies

P_0(min_{1≤k≤K} T_k(n) < ∞) ≤ K e^{−min_k s_0(k) h},    (24)

where s_0(k) is the nonzero root of the following equation:

E_0{e^{s g_i(k)}} = 1.    (25)

Now, using Lorden's theorem [6], which states that, for a stopping time T̃ with respect to a sequence of random variables y_i, i = 1, 2, . . ., such that P_0(T̃ < ∞) ≤ α, the extended stopping time Ñ ≜ min{T̃(n) | n = 1, 2, . . .}, where T̃(n) is obtained by applying T̃ to y_n, y_{n+1}, . . ., satisfies E_0{Ñ} ≥ 1/α, we get from (20)-(24)

E_0{N_k} ≥ e^{s_0(k) h},    (26)

and, most importantly,

E_0{N} ≥ (1/K) e^{min_k s_0(k) h}.    (27)

Next, based on the definition (19), the asymptotic detection delay can be derived using the upper bound E_1{N} ≤ min_{1≤k≤K} E_1{N_k}, together with (10), which leads to

τ̄(N) ≤ (h / max_{1≤k≤K} E_1{g_i(k)}) (1 + o(1)), as h → ∞.

Finally, at the alarm time N, as a byproduct of this algorithm, we can obtain an estimate of the angle of arrival, which follows from the idea that the CUSUM rule corresponding to the beamformer with the largest response (the one whose look direction is the closest to the angle of incidence) will have the sharpest increase and reach the threshold most quickly. Hence the estimate φ̂ is obtained via

φ̂ = θ_k̂, where k̂ = arg min_{1≤k≤K} N_k.    (30)

In the next section, we analyze the properties of our algorithm in the case where the noise in our model (1) is i.i.d. zero-mean Gaussian, n_i ∼ N(0, σ_n² I_L).

ANALYSIS UNDER GAUSSIAN NOISE
We first recall the parallel beamformer-based CUSUM test, presented in (14), with Q̄_i and Q_i^k defined as before, but now with the normalized score function

g_i(k) = (L/σ_n²) |w_k^H y_i|² − 1 − c,

where it is easy to see that the threshold h and bias c values above are related to the ones in (14), (15), (16), and (17) by a scale factor σ_n²/L. From now on, we will use this equivalent form of the CUSUM test, which makes the following analysis and the interpretation of the results simpler. Now, we look at the mean time to a false alarm for each CUSUM test in the parallel algorithm. Under H_0, that is, y_i = n_i for i ≥ 1, we have

E_0{g_i(k)} = (L/σ_n²) E{|w_k^H n_i|²} − 1 − c = −c,

and the solution to (25) is the nonzero root of the equation

E_0{e^{s g_i(k)}} = e^{−s(1+c)} E_0{exp(s L|w_k^H n_i|²/σ_n²)} = 1.

If we also assume that the real and imaginary components of the zero-mean complex Gaussian noise vector n_i = ñ_i + j n̆_i are independent and have equal covariance, that is, E{ñ_i ñ_i^T} = E{n̆_i n̆_i^T} = (σ_n²/2) I_L and E{ñ_i n̆_i^T} = 0, then Y = L|w_k^H n_i|²/σ_n² is an exponential random variable with mean E{Y} = (L/σ_n²) w_k^H E{n_i n_i^H} w_k = 1, where we used the fact that w_k^H w_k = 1/L. In order to see this, we write Y = Y_R² + Y_I², with

Y_R = (L/σ_n²)^{1/2} Re(w_k^H n_i), Y_I = (L/σ_n²)^{1/2} Im(w_k^H n_i).

Under the assumptions made above, Y_R and Y_I are zero-mean jointly Gaussian random variables with zero covariance, and hence are independent, that is,

Y_R, Y_I ∼ i.i.d. N(0, 1/2).    (35)

Thus, Y is (up to a scale factor) a chi-square random variable with two degrees of freedom, which is equivalent to an exponential random variable. The moment generating function of Y is then given by

E{e^{sY}} = 1/(1 − s), s < 1,    (36)

so that the root equation becomes

e^{−s(1+c)}/(1 − s) = 1,    (37)

which can be solved numerically. The solution is a function of c and does not depend on k, so we will denote the root by s_0(c), and from (26) and (27), we obtain

E_0{N} ≥ (1/K) e^{s_0(c) h}.    (38)

In Figure 4, we plot the nonzero root s_0 of the above equation as a function of the bias term c > 0. Notice that s_0 approaches 1 rapidly as the bias c increases from 0. Next, we will look at the mean detection delay performance.
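The nonzero root s_0(c) of e^{−s(1+c)}/(1 − s) = 1 has no closed form, but it is easily found by bisection on (0, 1), as the following sketch shows. Note that it reproduces the inverse relation c = (1/s_0) log(1/(1 − s_0)) − 1 used later in this section.

```python
import math

def s0_root(c):
    """Nonzero root on (0, 1) of exp(-s(1+c)) / (1-s) = 1, i.e. of
    f(s) = -log(1-s) - s(1+c) = 0, found by bisection; f < 0 just
    above s = 0 and f -> +infinity as s -> 1-."""
    f = lambda s: -math.log1p(-s) - s * (1.0 + c)
    lo, hi = 1e-9, 1.0 - 1e-15
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For c = 4 this gives s_0 ≈ 0.993, consistent with the operating points discussed below, and the root increases toward 1 as c grows.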
Under the assumption that the incoming signal always coincides with one of the look directions, that is, φ = θ_ℓ for some 1 ≤ ℓ ≤ K, the stopping time corresponding to the beamformer that is perfectly tuned to the incoming signal (the one with unit response) will be N_ℓ. Under H_1, that is, when the target is present, we have y_i = a(θ_ℓ)X_i + n_i for i ≥ 1. Then, using the independence of {X_i} and {n_i}, the fact that w_ℓ^H a(θ_ℓ) = 1, and w_ℓ^H w_ℓ = 1/L, we obtain

E_1{g_i(ℓ)} = L σ_s²/σ_n² − c,

where σ_s² = E{|X_i|²} and c is chosen so that it satisfies 0 < c < Lσ_s²/σ_n². Since the ℓth beamformer's look direction is the closest to the target, clearly we have max_{1≤k≤K} E_1{g_i(k)} = E_1{g_i(ℓ)}, so that

τ̄(N) ≤ (h / (Lσ_s²/σ_n² − c)) (1 + o(1)),    (41)

as h → ∞, which does not depend on ℓ, and hence holds for every look direction. Now, in the case when the incoming signal is not constrained to belong to the discrete set of look directions, we will get

τ̄(N) ≤ (h / (β L σ_s²/σ_n² − c)) (1 + o(1)), as h → ∞,

where

β = min_φ max_{1≤k≤K} |w_k^H a(φ)|²    (42)

will be close to unity, according to the preceding discussion in Section 4, provided that a sufficient number of beamformers K is used for the given array size L, so that for every angle of arrival φ the beamformer with the closest look direction has approximately unity response. Of course, this happens when the main lobes of our set of beamformers overlap considerably with each other. Now, assuming h is large enough that the asymptotically linear relationship applies, we can make the following observations. The mean detection delay is inversely proportional to the number of antenna elements L and increases linearly with the threshold value h, whereas the lower bound for the mean time to a false alarm is independent of L and increases exponentially with h. So, for a given SNR value, keeping the threshold h and bias value c fixed, we can reduce the mean detection delay by increasing L while satisfying the same false alarm requirement; or, by increasing L and keeping c fixed, we can achieve the same mean detection delay with a higher h, yielding an exponential reduction in the false alarm rate.
Next, for the case where both the signal and noise are Gaussian, we will compare the performance of the proposed method to that of the optimal detector that has perfect knowledge of the signal and noise distributions and the angle of incidence. As described in Section 3, the optimal CUSUM test takes the following form:

N* = min{i ≥ 1 : Q_i* ≥ h},

with Q_i* = (Q_{i−1}* + g*(y_i))^+ and g*(y_i) = log p_1(y_i)/p_0(y_i). So, we assume that under H_0, we have y_i = n_i and y_i ∼ N(0, Σ_0), where Σ_0 = σ_n² I_L, and under H_1, we have y_i = a(φ)X_i + n_i and y_i ∼ N(0, Σ_1(φ)), where Σ_1(φ) = σ_s² a(φ)a(φ)^H + σ_n² I_L. Then

g*(y_i) = log(|Σ_0|/|Σ_1(φ)|) + y_i^H (Σ_0^{−1} − Σ_1(φ)^{−1}) y_i.

It is easy to see that the determinants |Σ_1(φ)| and |Σ_0| are given by |Σ_0| = (σ_n²)^L and |Σ_1(φ)| = (σ_n²)^{L−1}(Lσ_s² + σ_n²), where we used the fact that a(φ)^H a(φ) = L. Then

ρ = E_1{g*(y_i)} = log(|Σ_0|/|Σ_1(φ)|) + tr((Σ_0^{−1} − Σ_1(φ)^{−1}) Σ_1(φ)),    (45)

where we have used the fact that E_1{y_i y_i^H | φ} = Σ_1(φ). Note that ρ is independent of the direction of arrival φ. Hence, from (9), the mean detection delay for the optimal detector is

E_1{N*} ∼ h/ρ, as h → ∞,    (46)

and the lower bound on the mean time to a false alarm is given by

E_0{N*} ≥ e^h.    (47)

For the proposed parallel CUSUM method, the corresponding expressions are (38) and (41). Now, we can investigate under what circumstances the proposed method's performance is close to that of the optimal one. First, we know from Figure 4 that, as the bias term c increases, the exponent s_0(c) gets closer to 1, and hence the false alarm performance of the parallel CUSUM method gets closer to that of the optimal algorithm, assuming the performance is measured with respect to the exponent term in the lower bound expression for the mean time to a false alarm.
On the other hand, the detection delay performance is also related to the choice of c: given the antenna array size L and the desired target SNR to be detected, we can determine the bias value c in our proposed method that is needed in order to obtain the same asymptotic detection delay as the optimal algorithm, assuming that the performance is measured with respect to the slope term in the asymptotic expression for the mean detection delay. So, from (41) and (46), with β = 1, we can write

c = Lσ_s²/σ_n² − ρ.    (50)

In Figure 5, we plot the values of c as a function of SNR for different numbers of antenna elements L. Now, we can make the following interpretation. Given L and SNR values, we can find the value of the bias term c that is needed to give us mean detection delay performance equivalent to that of the optimal algorithm, and from Figure 4, we can find the value of the exponent s_0 corresponding to that bias value c, which in turn specifies the achievable performance on the false alarm rate given by (38). Or, given L and a requirement on the false alarm rate determined by the value of the exponent s_0, we can obtain from (37) the value of the bias term, c = (1/s_0) log(1/(1 − s_0)) − 1, that is needed to satisfy this requirement; and, corresponding to that bias value, we can retrieve from Figure 5 the SNR value for which we can achieve mean detection delay performance equivalent to that of the optimal algorithm.
Given the number of sensor elements L, we can see that the parallel CUSUM method, with c chosen according to (50), can achieve optimal detection delay performance, while its false alarm performance gets closer to the optimal as the SNR increases. For instance, looking at point B in Figure 5, we see that with L = 2 and for values of SNR ≥ 4.81 dB, we can find a bias value c ≥ 4, for which we get optimal detection delay with false alarm performance measured in terms of the exponent s 0 ≥ 0.993, where s 0 = 1 is the optimal exponent. Also, looking at point A, we see that with a larger array size L = 8, we can use the same bias value c ≥ 4, and achieve the optimal detection delay performance with the same false alarm performance (s 0 ≥ 0.993) under lower SNR conditions (SNR ≥ −1.21 dB). So by increasing the number of antenna elements L, we can achieve the same performance levels under lower SNR conditions.
Next, we will present some simulation results in order to make the ideas developed in this paper more concrete.

SIMULATION RESULTS
For the simulations, we take both the noise and the incoming signal to be i.i.d. zero-mean complex Gaussian with noise covariance σ_n² I and signal covariance σ_s² I. We choose a value of SNR = σ_s²/σ_n² = 5 dB, and for different array sizes L, the bias term is chosen according to (50). Overall, we perform 10 000 Monte Carlo simulations for each data point in the detection delay measurements, and 1000 Monte Carlo simulations for each data point in the false alarm measurements. The reason we used fewer simulations for the false alarm measurements is that, even for threshold values h much smaller than those used for the detection delay measurements, the false alarm times were much longer, which made each Monte Carlo run take correspondingly longer to complete.
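To give a feel for these Monte Carlo experiments, the sketch below estimates the mean detection delay of a single CUSUM matched to the target direction, using the normalized score of Section 5. For Gaussian signal and noise, the normalized output energy at the matched beamformer is exponentially distributed with mean 1 + Lσ_s²/σ_n². The L = 2, SNR = 5 dB, c = 4.16 setting mirrors the text, but the replication count and the restriction to one beamformer are simplifications of ours.

```python
import numpy as np

def mc_mean_delay(h, L=2, snr=10**0.5, c=4.16, reps=2000, seed=1):
    """Monte Carlo mean detection delay of the matched CUSUM with
    normalized score g = L|w^H y|^2 / sigma_n^2 - 1 - c, target
    present from the first sample (worst case, nu = 1)."""
    rng = np.random.default_rng(seed)
    mean_energy = 1.0 + L * snr  # signal-plus-noise case, i >= nu
    delays = np.empty(reps)
    for r in range(reps):
        q, i = 0.0, 0
        while q < h:
            i += 1
            q = max(q + rng.exponential(mean_energy) - 1.0 - c, 0.0)
        delays[r] = i
    return float(delays.mean())
```

Consistent with (41), the estimated mean delay grows linearly in h with slope roughly 1/(Lσ_s²/σ_n² − c): raising h from 20 to 40 adds about 20/(6.32 − 4.16) ≈ 9 samples.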
In Figures 2a through 3c, some sample beampatterns for a conventional beamformer with two and six sensors are plotted. We see that, as the number of sensors increases, the width of the main lobe decreases and the resolution of our detector improves, which is expected to result in better angle-of-arrival estimation performance. For simplicity, we divide the interval [−π/2, π/2) equally into 180 points and employ 180 beamformers, such that each beamformer w_k points in the direction −π/2 + ((k − 1)/180)π, k = 1, . . . , 180. We also note that even for 10 sensors (L = 10), where the main lobe width is very small, a separation of one degree provides the essential overlap of the main lobes of the collection of beampatterns to cover the whole region. That is to say, according to this configuration with K = 180, as L ranges from 2 to 10, we get β = 0.9999 through β = 0.9969, where β is defined as in (42). Now, looking at Figures 2a through 3c, we notice the following. As the look direction moves away from 0° in either direction, another sidelobe starts to appear, the beampattern becomes asymmetric, and the peak of the second lobe increases in the opposite direction. For 90°, there is perfect symmetry, and the beamformer aimed at −90° will have unity response for a signal coming from 90°. For small L, this phenomenon is more dramatic; when we compare Figures 2b and 3b, we see that the second lobe for L = 2 is much larger than it is for L = 6. We estimate the angle of arrival via (30), using the idea that the beamformer whose look direction corresponds to (or is the closest to) the angle of arrival will be the one responsible for the detection, since it will have the largest response among all beamformers.
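The coverage factor β can be computed directly for such a 1°-spaced beamformer grid; the sketch below evaluates the worst-case squared response of the closest beamformer over a dense sweep of arrival angles. Whether β is defined through the magnitude or the squared magnitude of the response affects the exact figures, so the values produced here need not match the quoted ones exactly.

```python
import numpy as np

def coverage_beta(L, spacing_deg=1.0, phi_range_deg=(-50.0, 50.0)):
    """Worst-case squared response of the closest beamformer over a
    dense sweep of arrival angles, for look directions spaced
    spacing_deg apart (the K = 180 grid uses 1 degree). Assumes
    d = lambda/2."""
    look = np.deg2rad(np.arange(-90.0, 90.0, spacing_deg))
    phis = np.deg2rad(np.linspace(*phi_range_deg, 2001))
    l = np.arange(L)
    A_look = np.exp(-1j * np.pi * np.outer(np.sin(look), l))  # K x L
    A_phi = np.exp(-1j * np.pi * np.outer(np.sin(phis), l))   # M x L
    resp = np.abs(A_phi @ A_look.conj().T) ** 2 / L**2        # M x K
    return float(resp.max(axis=1).min())
```

As expected, the coverage degrades slowly with L: the narrower main lobes of a larger array leave slightly deeper gaps between adjacent one-degree look directions.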
Ideally, we would like to have a collection of beampatterns in the shape of a daisy in which each petal corresponds to a specific beamformer so that we have the large responses from those beamformers that are looking in the direction of arrival, and not from those beamformers looking in the opposite direction. Now, for small L, and for a signal coming from, say −70 • , the beamformer looking in the +70 • direction will have a significant response due to its large second lobe. Because of this, we may get large estimation errors coupled with a sign ambiguity. For this reason, in our simulations, we restrict the target's angle of arrival to the interval [−50 • , 50 • ] so that even for L = 2, the response of the second lobe remains at a comfortably low level and we can get reliable estimates.
In the following, we investigate the properties of the algorithm in the case of a minimal configuration with two sensors (L = 2), and we also vary the number of sensor elements to observe the performance improvement achieved by increasing the array size. Figure 6 shows the mean detection delay for L = 2, 4, and 8 as the threshold h varies from 20 to 400 in steps of 20. The solid lines show the simulation results, and the asterisks show the values obtained from the theoretical asymptotic expression (41). We observe that the mean detection delay follows the asymptotic expression very closely and increases linearly with h, indicating that the asymptotic result holds quite generally (even for finite values of h) and can be used as a design guideline. We also see that, for fixed h, the mean detection delay decreases as the array size L increases, as expected. In particular, we can observe this in Figure 7, where we plot the mean detection delay as a function of L, for threshold h = 400. In Figure 8, we show the histogram of detection delays for h = 400 and L = 2, based on which we can observe that the delay has a gamma-like density.
In Figure 9, we plot the mean time to a false alarm for L = 2, as h varies from 1 to 6 in steps of 0.5. We see that the mean time to a false alarm increases exponentially with h. For this value of L, we used c = 4.16, obtained from (50), and correspondingly s_0(c) = 0.99. Based on this, we can conclude that the theoretical lower bound (38) is rather loose, since we get e^6 ≈ 403, which is much smaller than the simulation results. In Figure 10, we show the histogram of false alarm times for h = 6 and L = 2, based on which we can observe that it has an exponential-like distribution. In Figures 11 and 12, we look at the mean detection delay and the mean false alarm time of the optimal detector defined in Section 5. Comparing Figure 6 with Figure 11, and Figure 9 with Figure 12, we see that the mean detection delays for the parallel CUSUM exactly match those of the optimal detector, as expected, whereas the mean time to a false alarm for the optimal detector is higher than that achieved by the parallel CUSUM method for the same threshold h. We note that the theoretical lower bound (47) is very loose for the optimal algorithm as well. We also compare, in Figure 13, the parallel CUSUM method to the optimal algorithm in terms of the standard deviation of the detection delay as a function of the threshold h for different array sizes L = 2, 4, and 8. We see that the standard deviation increases with h and that the gap between the optimal and the proposed method is smaller for larger L.
In Figure 14, we consider the mean-squared angle-of-arrival estimation error as a function of the threshold h, while keeping the number of sensors fixed (L = 2). As we increase h (causing more delay in detection), we initially see considerable improvement, which slowly diminishes as we increase the threshold further. In Figure 15, we again look at the mean-squared estimation error, this time with a larger array size (L = 6), and see that the estimation errors are significantly lower. In Figure 16, we vary the number of sensors in the array for a fixed detection threshold h = 20, and observe that the mean-squared error exhibits a sharp decrease initially as L increases from 2, and then tapers off. Note that, while comparing the estimation errors as a function of L for a fixed threshold h, we should keep in mind that with a larger array size we not only get a reduction in the mean-squared estimation error, but we also detect the target sooner, as evident from Figures 6 and 7. Finally, Figure 17 shows the histogram of the squared estimation error for h = 100 and L = 6.

CONCLUSION
We have examined the problem of detecting a target that appears abruptly in a noisy environment. For this purpose, we have applied a sensor array and devised a parallel CUSUM algorithm based on beamforming. The algorithm not only quickly detects the presence of the target, but also provides an estimate of the target's angular direction. We have developed analytical bounds on the algorithm's performance, verified these bounds through simulation, and demonstrated the algorithm's effectiveness by varying different parameters of the system. We have also compared, in the Gaussian signal and noise case, the proposed algorithm's performance against that of the optimal algorithm that has perfect knowledge of the signal and noise distributions, together with the direction of arrival.