Detection of point sources on two-dimensional images based on peaks

This article considers the detection of point sources in two dimensional astronomical images. The detection scheme we propose is based on peak statistics. We discuss the example of the detection of far galaxies in Cosmic Microwave Background experiments throughout the paper, although the method we present is totally general and can be used in many other fields of data analysis. We assume sources with a Gaussian profile --that is a fair approximation of the profile of a point source convolved with the detector beam in microwave experiments-- on a background modeled by a homogeneous and isotropic Gaussian random field characterized by a scale-free power spectrum. Point sources are enhanced with respect to the background by means of linear filters. After filtering, we identify local maxima and apply our detection scheme, a Neyman-Pearson detector that defines our region of acceptance based on the a priori pdf of the sources and the ratio of number densities. We study the different performances of some linear filters that have been used in this context in the literature: the Mexican Hat wavelet, the matched filter and the scale-adaptive filter. We consider as well an extension to two dimensions of the biparametric scale adaptive filter (BSAF). The BSAF depends on two parameters which are determined by maximizing the number density of real detections while fixing the number density of spurious detections. For our detection criterion the BSAF outperforms the other filters in the interesting case of white noise.


INTRODUCTION
A very challenging aspect of data analysis in astronomy is the detection of pointlike sources embedded in one and two dimensional images. Some common examples are the separation of individual stars in crowded optical images, the identification of emission and absorption lines in noisy one dimensional spectra and the detection of faint extragalactic objects at microwave frequencies. This latter case, for example, is one of the most critical issues for the new generation of experiments that observe the Cosmic Microwave Background (CMB).
The CMB is the remnant of the radiation that filled the Universe immediately after the Big Bang. This weak radiation can provide us with answers to one of the most important set of questions asked in modern science -how the Universe did begin, how it evolved to the state we observe today, and how it will continue to evolve in the future. Unfortunately, we do not measure the CMB alone but a mixture of it with instrumental noise and other astrophysical radiations that are usually referred to as foregrounds.
Some foregrounds are due to our own Galaxy, for example the thermal emission due to dust grains in the Galactic plane or the synchrotron emission by relativistic electrons moving along the Galactic magnetic field. These foregrounds appear as diffuse emission in the sky, and their spectral behaviours (the way the emission scales from one wavelength of observation to another) are reasonably well known. Another foreground with a well known spectral behaviour is the Sunyaev-Zel'dovich effect, which is due to the hot gas contained in galaxy clusters that distorts the energy distribution of CMB photons. Foreground emissions carry information about the Galaxy structure, composition and physical parameters as well as about the number, distribution and evolution of galaxy clusters that map the distribution of matter in the Universe. Therefore, the study of the different foregrounds has great scientific relevance by itself. In order to properly study the CMB and the different foregrounds it is mandatory to separate the signals (components) that are mixed in the observations. This can be done by observing the sky at a number of frequencies at least as big as the number of components and then applying some statistical component separation method in order to recover the different astrophysical signals. Several component separation techniques have been suggested, including blind (Baccigalupi et  Another important foreground is due to the emission of far galaxies. Since the typical angular size of the galaxies in the sky is a few arcseconds and the angular resolution of the microwave detectors is typically greater than a few arcminutes 1 , galaxies appear as points to the detector, which is unable to resolve their inner structure. Therefore, they are usually referred to as extragalactic point sources (EPS) in the CMB jargon. Note that, however, they do not appear as points in the images but as the convolution of a pointlike impulse with the angular response of the detector ("beam"). The instruments (radiometers and bolometers) that are used in CMB experiments have angular responses that are approximately Gaussian and therefore EPS appear as small Gaussian (or nearly Gaussian) spots in the images 2 .
The problem with EPS is that galaxies are a very heterogeneous bundle of objects, from the radio galaxies that emit most of their radiation in the low frequency part of the electromagnetic spectrum to the dusty galaxies that emit mainly in the infrared (Toffolatti et al. 1998, Guiderdoni et al. 1998, Tucci et al. 2004). This makes it impossible to consider all of them as a single foreground to be separated from the other by means of multi-wavelength observations and statistical component separation techniques. EPS constitute an important contaminant in CMB studies at small angular scales (Toffolatti et al. 1998), affecting the determination of the CMB angular power spectrum and hampering the statistical study (e.g. the study of Gaussianity) of CMB and other foregrounds at such scales. Moreover, while there are good galaxy surveys at radio and infra-red frequencies, the microwave window of the electromagnetic spectrum is a practically unknown zone for extragalactic astronomy. Therefore, it is important to have detection techniques that are able to detect EPS with fluxes as low as possible.
One possibility is to consider the EPS emission at each frequency as an additional noise to be considered in the equations of a statistical component separation method. Once the algorithm has separated the different components, the residual that is obtained by subtracting the output foregrounds from the original data should contain the EPS plus the instrumental noise and some amount of foreground residuals that remain due to a non-perfect separation. As an example, Figure 1 shows the residual at 30 GHz after applying a Maximum Entropy component separation algorithm (Hobson et al. 1999) to a 12.8 × 12.8 sqr deg simulated sky patch as would be observed by the Planck satellite. The brightest point sources can be clearly observed over the residual noise. However, fainter point sources are still masked by a residual noise that is approximately Gaussian and must be detected somehow. Besides, the situation is more complex because the presence of bright EPS in the data affects the performance of the component separation algorithms so the recovered components are contaminated by point sources in a way that is difficult to control. Therefore, any satisfactory method should detect and extract at least the bright sources before the component separation. Then, after separation some additional low intensity EPS could be detected from the residual maps such as the one in Figure 1.
Several techniques based on linear filters have been proposed in the literature for the detection of point sources in CMB data. Linear filtering techniques are suitable The goal of filtering is to enhance the contrast between the source to be detected and the background that masks it. For example, if we filter the image in Figure 1, assuming that the background can be characterised by a white noise, with the well known matched filter (see section 4.1) at the scale of the 30 GHz detector beam (FWHM=33 arcminutes) the signal to noise ratio of the sources increases by more than 25%. Therefore, a source whose signal to noise ratio was ∼ 3 before filtering becomes a source with signal to noise ratio ∼ 4 and will be easier to detect.
After filtering, a detection rule is applied to the data in order to decide whether the source is present or not. The usual detection approach in astronomy is thresholding: for any given candidate (for example a local peak in the data) a positive detection is considered if the candidate has a signal to noise ratio greater than a certain threshold (in many astronomical applications, a typical value of this threshold is 5σ). This naive approach works fine for bright sources, but weak sources can be easily missed.
More sophisticated detection schemes can use additional information in order to improve the detection. If the detection is performed by means of the study of the statistics of maxima in the images, such information includes not only the amplitude of the maxima but also spatial information related with the source profile, for example the derivatives of the intensity; in our approach we will consider the amplitude, the curvature and the shear of the sources (the last two quantities are given by the properties of the beam in the case of point sources) to discriminate between maxima of the background and real sources. Moreover, in some cases a priori information on the distribution of intensity of the sources is known. We will therefore use a Neyman-Pearson detector that uses the three above mentioned elements of information (amplitude, curvature and shear) of the maxima as well as the a priori probability distribution of the sources. This technique has been successfully tested in images of one-dimensional fields (López-Caniego et al. 2004a,b). In this work we will generalise it to two dimensions.
The overview of this work is as follows: in section 2 we describe the statistics of the peaks for a two-dimensional Gaussian background in the absence and presence of a source. In section 3 we introduce the detection problem, define the region of acceptance and derive our detector. In section 4 we briefly review some of the linear filters proposed in the literature. In section 5 we describe a probability distribution of sources that is of interest and compare the performance of the filters, regarding our choice of detector. Finally, in section 6 we summarise our results.

PEAK STATISTICS
In this section we will study the statistics of peaks for a two-dimensional Gaussian background in both the absence and presence of a source. We will focus on three quantities that define the properties of the peaks: the intensity of the field, the curvature and the shear at the position of the peak. The first quantity gives the amplitude of the peak. The curvature and the shear give information about the spatial structure of the peak and are related to its sharpness and eccentricity, respectively.

Background
Let us assume a two-dimensional (2D) background represented by a Gaussian random field ξ( x) with average value ξ( x) = 0 and power spectrum P (q), where ξ( Q) is the Fourier transform of ξ( x) 3 and δ 2 D is the Dirac distribution in 2D. We are interested in the distribution of maxima of the background with respect to the three variables already mentioned: intensity, curvature and shear. Let us define the normalized field intensity ν, the normalized curvature κ and the normalized shear ǫ as: where ν ∈ (−∞, ∞), κ ∈ [0, ∞), ǫ ∈ [0, κ/2), λ 1 and λ 2 are the eigenvalues of the negative Hessian matrix, and the σ n are defined as The moment σ 0 is equal to the dispersion of the field. The distribution of maxima of the background in one dimension (1D) with respect to the intensity and curvature (the shear is not defined in 1D) was studied by Rice (1954). If we generalize it to 2D, including the shear, the expected number density of maxima per intervals ( x, x + d x), (ν, ν + dν), (κ, κ + dκ) and (ǫ, ǫ + dǫ) is given by whereñ b is the the expected total number density of maxima (i.e. number of maxima per unit area d and ρ and θ m are defined as In the previous equations θ c and θ m are the coherence scale of the field and maxima, respectively. The formula in equation (4) can be derived from previous works (Bond and Efstathiou 1987, Barreiro et al. 1997).

Background plus point source
To the previous 2D background we add a source with a known spatial profile τ ( x) and an amplitude A, so that the intensity due to the source at a given position . For simplicity, we will consider a spherical Gaussian profile given by where R is the Gaussian width (in the case of point sources convolved with a Gaussian beam, R is the beam width). We could easily consider other functional profiles 4 without any loss of generality. The expected number density of maxima per intervals ( x, x + d x), (ν, ν + dν), (κ, κ + dκ) and (ǫ, ǫ + dǫ), given a source of amplitude A in such spatial interval, is where ν s = A/σ 0 is the normalized amplitude of the source, κ s = −Aτ ′′ /σ 2 is the normalized curvature of the source and τ ′′ is the second derivative of the source profile τ with respect to x at the position of the source. Note that in equation (8) we are taking into account that the shear of the source is zero since we are considering a spherical profile. It is useful to define a quantity y s that is related to the curvature of the source:

THE DETECTION PROBLEM
Equations (4) and (8) can be used to decide whether a source is present or not in a data set. The tool that allows us to decide whether a point source is present or not in the data is called a detector. In this section we will describe the Neyman Pearson detector (NPD). We will study its performance in terms of two quantities: the number of true detections and the number of false (spurious) detections that emerge from the detection process. Our approach fixes the number density of spurious detections and determines the number density of true detections in each case.

The region of acceptance
We consider a peak in the 2D data set characterized by the normalized amplitude, curvature and shear (ν, κ, ǫ). The number density of background maxima n b (ν, κ, ǫ) represents the null hypothesis H 0 that the peak is due to the background in the absence of source. Conversely, the local number density of maxima n(ν, κ, ǫ) represents the alternative hypothesis, that the peak is due to the source added to the background. The local number density of maxima n(ν, κ, ǫ) can be calculated as In the last equation we have used the a priori probability p(ν s ), that gives the amplitude distribution of the sources. We can associate to any region R * (ν, κ, ǫ) in the (ν, κ, ǫ) parameter space two number densities n * b and n * n * = R * dν dκ dǫ n(ν, κ, ǫ).
where n * b is the expected number density of spurious sources, i.e. due to the background, in the region R * (ν, κ, ǫ), whereas n * is the number density of maxima expected in the same region of the (ν, κ, ǫ) space in the presence of a local source. The region R * will be called the region of acceptance.
In order to define the region of acceptance R * that gives the highest number density of detections n * for a given number density of spurious detections n * b , we assume a Neyman-Pearson Detector (NPD) using number densities instead of probabilities where L * is a constant. The proof follows the same approach as for the standard Neyman-Pearson detector. If L ≥ L * we decide that the signal is present, whereas if L < L * we decide that the signal is absent. From this ratio L ≥ L * , we derive the region of acceptance, that is given by the sufficient linear detector ϕ (see Appendix) where ϕ * is a constant and ϕ(ν, κ) is given by We remark that the detector is independent of the shear ǫ. This is due to the fact that we are considering a source with a spherical profile with shear ǫ s = 0. If the profile is not spherical, the detector may depend on the shear.

Spurious sources and real detections
Given a region of acceptance R * , we can calculate the number density of spurious sources and the number density of detections as given by equations (11) and (12) n Our approach is to fix the number density of spurious detections and then to determine the region of acceptance that gives the maximum number of true detections. This can be done by inverting the equation (16) to obtain ϕ * = ϕ * (n * b /n b ; ρ, y s ). Once ϕ * is known, we can calculate the number density of detections using equation (17).

THE FILTERS
Detection can, in principle, be performed on the raw data, but in most cases it is convenient to transform first the data in order to enhance the contrast between the distributions n b (ν, κ, ǫ) and n(ν, κ, ǫ). Hopefully, such an enhancement will help the detector to give better results (namely, a higher number of true detections). In this paper we will focus in the use of linear filters as a means to transform the data in such a way. Filters are suitable for this task because background fluctuations that have variation scales different from the source scale can be easily filtered out while preserving the sources. Different filters will improve detection in different ways: this paper compares the performance of several filters. The filter that gives the highest number density of detections, for a fixed number density of spurious sources, will be the preferred filter among the considered filters.
Let us consider a filter Ψ( x; R, b), where R and b define a scaling and a translation respectively. Since the sources we are considering are spherically symmetric and we assume that the background is statistically homogeneous and isotropic, we will consider spherically symmetric filters, If we filter our background with Ψ( x; R, b), the filtered field is The filter is normalized such that the amplitude of the source is the same after filtering: For the filtered field equation (3) becomes The values of ρ, θ m , θ c and all the derived quantities change accordingly. The curvature of the filtered source κ s can be obtained through equation (9), taking into account that for the filtered source Note that the function ψ(q) will depend as well on the scaling R. As an application of the previous ideas, we study the detection of point sources characterised by a Gaussian profile τ (x) = exp(−x 2 /2R 2 ), x = | x|, and Fourier transform τ (q) = R 2 exp(−(qR) 2 /2). This is the case we find in CMB experiments, where the profile of the point source is given by the instrumental beam, that can be approximated by a Gaussian. This profile introduces in a natural way the scale of the source R, the scale at which we filter. However, previous works in 1D using the MHW, MF, SAF and the BSAF have shown that the use of a modified scale αR can significantly improve the number of detections (Cayón et al. 2000, Vielva et al. 2001a,b, López-Caniego et al. 2004a. Therefore, we generalise the functional form of these filters to 2D and allow for this additional degree of freedom α.

The matched filter (MF)
Let us introduce a circularly-symmetric filter Ψ( x; R, b). The filtered field is given by equation (20). Now, we express the conditions to obtain a filter for the detection of the source s(x) = Aτ (x) at the origin taking into account the fact that the source is characterized by a single scale R o . We assume the following conditions: 0) is an unbiased estimator of the amplitude of the source.
2. The variance of w(R, b) has a minimum at the scale R o , i.e. it is an efficient estimator.
Then, the 2D filter satisfying these conditions is the so-called matched filter. As mentioned before, we will allow the filter scale to be modified by a factor α. If α = 1 we have the well-known standard matched filter use in the literature. For a source with a Gaussian profile, a scale-free power spectrum P (q) ∝ q −γ and allowing the filter scale to vary through the α parameter, the modified matched filter is where and Γ is the standard Gamma function. The parameters of the filtered background and source are The corresponding threshold as compared to the standard matched filter (α = 1) is where We remark that for the standard matched filter the curvature does not affect the region of acceptance and the linear detector ϕ(ν, κ) is reduced to ϕ = ν.

The scale-adaptive filter (SAF)
The scale-adaptive filter (or optimal pseudo-filter) has been proposed by Sanz et al. (2001). The filter is obtained by imposing an additional condition to the conditions that define the MF: 3. w(R, 0) has a maximum at (R o , 0).
Assuming a scale-free power spectrum, P (q) ∝ q −γ , a modified scale αR and a Gaussian profile for the source, the functional form of the filter in 2D is where and where m and ∆ are defined as in equation (25), t is defined as in equation (28). The parameters of the filtered background and source are where c = 2t/m and The corresponding threshold as compared to the standard matched filter (α = 1) is

The Mexican Hat wavelet (MH)
The MH is defined to be proportional to the Laplacian of the Gaussian function in 2D real space Thus, in Fourier space we get the modified Mexican Hat wavelet introducing the α parameter as follows Thus, the filtered background and source parameters are where m and ∆ are defined as in equation (25) and t is defined as in equation (28). The corresponding threshold is where c is an arbitrary constant related to the curvature of the maximum. For the case of a scale-free spectrum and allowing for a modified scale αR, the filter is given by the parameterized equation where m and ∆ are defined as in equation (25). We remark that c = 0 leads to the MF and if c ≡ 2t/mγ, with t defined as in equation (28), the BSAF defaults to the SAF. The parameters of the filtered background and source are where The equivalent threshold is given by In this section we will compare the performance of the different filters previously introduced. We use as example the interesting case of white noise as background. This is a fair approximation to the case presented in Figure 1, where the sources are embedded in a background that is a combination of instrumental noise (approximately Gaussian) and a small contribution of residual foregrounds that have not been perfectly separated. For this example, we will consider sources with intensities distributed uniformly between zero and a upper cut-off. The comparison of the filters is performed as follows: we fix the number density of spurious detections, the same for all the filters. Then, for any given filter we calculate the quantities σ n , ρ and y s . Using equation (16) it is possible to calculate the value of ϕ * that defines the region of acceptance. Then we calculate the number density of real detections using equation (17). The filter that leads to the highest number density of detections will be the preferred one. We do this for different values of α in order to test how the variation of the filtering scale affects the number of detections.

A priori probability distribution
As mentioned before, we will test a pdf of source intensities that is uniform in the interval 0 ≤ A ≤ A c . In terms of normalized intensities, we have the pdf We will consider a cut-off in the amplitude of the sources such that ν c = 2 after filtering with he standard MF, that is, we will focus on the case of faint sources that would be very difficult to detect if no filtering was applied. Note that while the value ν c is different for each filter (because each filter leads to a different dispersion σ 0 of the filtered field), the distribution in source intensities A is the same for all the cases.

Results for white noise
We want to find the optimal filter in the sense of maximum number of detections.
For the sources, we use a uniform distribution with amplitudes in the interval A ∈ [0, 2]σ 0 , where σ 0 is the dispersion of the linearly-filtered map with the standard MF. We focus on the interesting case of white noise (γ = 0) and explore different values of n * b and R. We study the performance of the different filters as a function of α. This allows us to test how the variation of the natural scale of the filters helps the detection. In the case of the BSAF, which has an additional free parameter, c in equation (41) Table 1: Number density of detections n * for the BSAF and the standard MF(α = 1) with optimal values of c and α for different values of n * b and R. RD means relative difference in number densities in percentage: RD ≡ 100(−1 + n * BSAF /n * M F ).
number of detections. Then the BSAF with such c parameter (that is a function of α, n * b and R) is compared with the other filters.
In Figure 2, we plot the expected number density of detections n * for different values of α, R = 1.5 pixels and n * b = 0.01. Note that for the 2D case the MHW and SAF are the same filter for γ = 0 and we have only included the latter in our figures. In this case, the curve for the BSAF always goes above the other filters. The maximum number of detections is found for small values of α. In this region, the improvement of the BSAF with respect to the standard matched filter is of the order ≃ 15%.
In Figure 3, we show the results for R = 2. We have increased the beam width as compared with the previous example and leave unchanged the number density of false detections. The BSAF outperforms all the other filters, and for small values of α the improvement is of the order ≃ 40%. Note that in this figure the MF takes values α ∈ [0, 1]. For greater values of α, with R = 2 and n * b = 0.01, we can not solve for ϕ * in the implicit equation (16) and can not calculate n * .
We remark that filtering at scales much smaller than the scale of the pixel does not make sense. This is due to the fact that we are not including the effect of the pixel in our theoretical calculations and, thus, the results would not exactly follow what would be found in a real image. Therefore, we present the results only for those values of α such that αR is at least ∼ 1.

CONCLUSIONS
Several techniques have been introduced in the literature to detect point sources in two-dimensional images. Examples of point sources in astronomy are far galaxies as detected by CMB experiments. An approach that has been thoroughly used in the literature for this case consists in linear filtering the data and applying detectors based on thresholding. Such approach uses only information on the amplitude of the sources: the potentially useful information contained in the local spatial structure of the peaks is not used at all. In our work, we design a detector based on peak statistics that uses the information contained in the amplitude, curvature and shear of the maxima. These quantities describe the local properties of the maxima and are used to distinguish statistically between peaks due to background fluctuations and peaks due to the presence of a source.
We derive a Neyman-Pearson detector (NPD) that considers number densities of peaks which leads to a sufficient detector that, in the case of the spherically symmetric sources that we consider, is linear in the amplitude and curvature of the sources. For this particular case, then, the information of the shear of the peaks is not relevant. In other cases, however, it could be useful.
It is a common practice in astronomy to linear filter the images in order to enhance very faint point sources and help the detection. The best filter would be the one that makes easier to distinguish between peaks coming from the background alone and those due to the presence of a source, according to the information used by the detector. In the case of simple thresholding, that considers only the amplitude of the peaks, the answer to the question of which is the best filter (in the previous sense) is well known: the standard matched filter. But in the case of the Neyman-Pearson detector, that considers other things apart from mere amplitudes, this is not longer true.
We have compared three commonly used filters in the literature in order to assess which one of them performs better when detecting sources with our scheme. In addition, we have designed a filter such that optimizes the number of true detections for a fixed number of spurious sources. The optimization of the number of true detections is performed by using the a priori pdf of the amplitudes of the sources. This filter depends on two free parameters and it is therefore called biparametric scale adaptive filter (BSAF). By construction, the functional form of the BSAF includes the standard MF as a particular case and its performance in terms of number of true detections for a fixed number of spurious detections must be at least as good as the standard MF's one.
Following the work done in the 1D case, we generalize the functional form of the filters to 2D and introduce an extra degree of freedom α that will allow us to filter at different scales αR, where R is the scale of the source. This significantly improves the results.
We have considered an interesting case, a uniform distribution of weak sources with amplitudes A ∈ [0, 2]σ 0 , where σ 0 is the dispersion of the field filtered with the standard matched filter, embedded in white noise (γ = 0). We have tested different values of the source size R and of the number density of spurious detections n * b that we fix. We find that the BSAF improves the number density of detections up to ≃ 40% with respect to the standard MF (α = 1) for certain cases. Note that since the Neyman-Pearson detector for the standard MF (α = 1) defaults to the classic thresholding detector that is commonly used in astronomy, the results of this work imply that it is possible, under certain circumstances, to detect more point sources than in the classical approach.
The generalization of these ideas to other source profiles and non-Gaussian back-grounds is relevant and will be discussed in a future work.