Analysis of the Spatial Distribution of Galaxies by Multiscale Methods

Galaxies are arranged in interconnected walls and filaments forming a cosmic web encompassing huge, nearly empty, regions between the structures. Many statistical methods have been proposed in the past in order to describe the galaxy distribution and discriminate the different cosmological models. We present in this paper results relative to the use of new statistical tools using the 3D isotropic undecimated wavelet transform, the 3D ridgelet transform and the 3D beamlet transform. We show that such multiscale methods produce a new way to measure in a coherent and statistically reliable way the degree of clustering, filamentarity, sheetedness, and voidedness of a dataset


Introduction
Galaxies are not uniformly distributed throughout the universe. Voids, filaments, clusters, and walls of galaxies can be observed, and their distribution constraints our cosmological theories. Therefore we need reliable statistical methods to compare the observed galaxy distribution with theoretical models and cosmological simulations.
The standard approach for testing models is to define a point process which can be characterized by statistical descriptors. This could be the distribution of galaxies of a specific type in deep redshift surveys of galaxies (or of clusters of galaxies). In order to compare models of structure formation, the different distribution of dark matter particles in N-body simulations could be analyzed as well, with the same statistics.
The two-point correlation function ξ(r) has been the primary tool for quantifying large-scale cosmic structure [24]. Assuming that the galaxy distribution in the Universe is a realization of a stationary and isotropic random process, the two-point correlation function can be defined from the probability δP of finding an object within a volume element δV at distance r from a randomly chosen object or position inside the volume: δP = n(1 + ξ(r))δV , where n is the mean density of objects. The function ξ(r) measures the clustering properties of objects in a given volume. It is zero for a uniform random distribution, positive (respectively, negative) for a more (respectively, less) clustered distribution. For a hierarchical clustering or fractal process, 1 + ξ(r) follows a power-law behavior with exponent D 2 − 3. Since ξ(r) ∼ r −γ for the observed galaxy distribution, the correlation dimension for the range where ξ(r) ≫ 1 is D 2 ≃ 3 − γ. The Fourier transform of the correlation function is the power spectrum. The direct measurement of the power spectrum from redshift surveys is of major interest because model predictions are made in terms of the power spectral density. It seems clear that the real space power spectrum departs from a single power-law ruling out simple unbounded fractal models [36]. The twopoint correlation function can been generalized to the N-point correlation function [35,25], and all the hierarchy can be related with the physics responsible for the clustering of matter. Nevertheless they are difficult to measure, and therefore other related statistical measures have been introduced as a complement in the statistical description of the spatial distribution of galaxies [20], such as the void probability function [21], the multifractal approach [18], the minimal spanning tree [1,15,8], the Minkowski functionals [22,13] or the J function [17,14] which is defined as the ratio J(r) = 1−G(r) 1−F (r) , where F is the distribution function of the distance r of an arbitrary point in R 3 to the nearest object in the catalog, and G is the distribution function of the distance r of an object to the nearest object. Wavelets have also been used for analyzing the projected 2D or the 3D galaxy distribution [9,28,19,23,16].
The Genus statistic [10] measures the topology or the degree of connectedness of the underlying density field. Phase correlations of the density field can be detected by means of this topological analysis. The Genus is calculated by (i) convolving the data by a kernel, generally a Gaussian, (ii) setting to zero all values under a threshold ν in the obtained distribution, and (iii) taking the difference D between the number of holes and the number of isolated regions. The Genus curve G(ν) is obtained by varying the threshold level ν. The first step of the algorithm, the convolution by a Gaussian, may be dramatic for the description of filaments, which are spread out along all directions, as it will be shown in next section. The Genus is related with one of the four Minkowski functionals that describe well the overall morphology of the galaxy distribution [26]. Minkowski functionals have been used to elaborate sophisticated tools to measure the filamentarity and planarity of the distribution by means of shape finders quantities [27].
The Sloan Digital Sky Survey (Early Data Release) has recently be analyzed using a 3D Genus Statistics [11] and results were consistent with that predicted by simulations of a Λdominated spatially-flat cold dark matter model.
New multiscale methods have recently emerged, the beamlet transform [4,6] and the ridgelet transform [3], which allows us to better represent data containing respectively filaments and sheets, while wavelets represent well isotropic features (i.e. cluster in 3D). As each of these three transforms represents perfectly one kind of feature, all of them are useful and should be used to describe a given catalog. Section 2 describes the 3D wavelet transform, and how wavelets can be used for estimating the underlying density. Sections 3 and 4 describes respectively the 3D ridgelet transform and the 3D beamlet transform. It is shown in section 4 through a set of of experiments how these three 3D transforms can be combined in order to describe statistically the distribution of galaxies.
2 The 3D Wavelet Transform

The Undecimated Isotropic Wavelet Transform
The wavelet transform of a signal produces, at each scale j, a set of zero-mean coefficient values {w j }. Using an algorithm such as the undecimated isotropic wavelet decomposition [33], this set {w j } has the same number of pixels as the signal and thus this wavelet transform is a redundant one. Furthermore, using a wavelet defined as the difference between the scaling functions of two successive scales the original cube c = c 0 can be expressed as the sum of all the wavelet scales and the smoothed array c J The set w = {w 1 , w 2 , ..., w J , c J }, where c J is a last smooth array, represents the wavelet transform of the data. If we denote as W the wavelet transform operator and N the pixel number of c, the wavelet transform w (w = Wc) has (J + 1)N pixels (redundancy factor of J + 1). The scaling function φ is generally chosen as a spline of degree 3, and the 3D implementation is based on three 1D sets of (separable) convolutions. Like the scaling function φ, the wavelet function ψ is isotropic (point symmetric). More details can be found in [33,32].
Given a function f ∈ L 2 (R 3 ), we define its wavelet coefficients by: Figure 1 shows an example of 3D wavelet function.

3D Galaxy Distribution Filtering
For the noise model, given that this relates to point pattern clustering, we have to consider the Poisson noise case. The autoconvolution histogram method used for X-ray image [34] can also be used here. It consists to calculate numerically the probability distribution function (pdf) of a wavelet w j,x,y,z coefficient with the hypothesis that the galaxies used for obtaining w j,x,y,z are randomly distributed. The pdf is obtained by autoconvolving n times the histogram of the wavelet function, n being the number of galaxies which have been used for obtaining w j,x,y,z , i.e. the number of galaxies in a box around (x, y, x), the size of the box depending of the scale j. More details can be found in [34,32]. Once the pdf relative to the coefficient w j,x,y,z is known, we can detect the significant wavelet coefficients easily. We derive two threshold values T min j,x,y,z and T max j,x,y,z such that P rob(W < T min j,x,y,z ) = ǫ P rob(W > T max j,x,y,z ) = ǫ (3) ǫ corresponding to the confidence level, and the positive (respective negative) wavelet coefficient is significant if it is larger than T max j,x,y,z (resp. lower than T min j,x,y,z ). A simple filtering method would now consist to set to zero (i.e. thresholding) all unsignificant coefficients and reconstruct the filtered data cube by addition of the different scales. But when a redundant wavelet transform is used, the result after a simple hard thresholding can still be improved by iterating [30]. We want the wavelet transform of our solutions to reproduce the same significant wavelet coefficients (i.e., coefficients larger than T j ). This can be expressed in the following way: (Ws) j,k = w j,k if w j,k is significant (4) where w j,k are the wavelet coefficients of the input data s at scale j and at position k = (x, y, z). The relation is not necessarily verified in the case of non-orthogonal transforms, and the resulting effect is generally a loss of flux inside the objects. The residual signal (i.e. s −s) still contains some information at positions where the objects are. Denoting M the multiresolution support of s (i.e. M j,k = 1 if w j,k is significant, and 0 otherwise), we want:

M.Ws = M.Ws
The solution can be obtained by the following Van Cittert iteration [33]: where R n = s −s n .

Iterative Filtering with a Smoothness Constraint
A smoothness constraint can be imposed on the solution.
min Ws ℓ 1 , subject to s ∈ C, where C is the set of vectorss which obey the linear constraints Here, the second inequality constraint only concerns the set of significant coefficients, i.e. those indices which exceed (in absolute value) a detection threshold t j . Given a tolerance vector e = {e 1 , ..., e j }, we seek a solution whose coefficients (Ws) j,k , at scale and position where significant coefficients were detected, are within e j of the noisy coefficients (Ws) j,k . For example, we can choose e j = σ j /2. In short, our constraint guarantees that the reconstruction be smooth but will take into account any pattern which is detected as significant by the wavelet transform. We use an ℓ 1 penalty on the coefficient sequence because we are interested in low complexity reconstructions. There are other possible choices of complexity penalties; for instance, an alternative to (6) would be min s T V , subject to s ∈ C.
where · T V is the Total Variation norm, i.e. the discrete equivalent of the integral of the Euclidean norm of the gradient. Expression (6) can be solved using the method of hybrid steepest descent (HSD) [38]. HSD consists of building the sequencẽ here, P is the ℓ 2 projection operator onto the feasible set C, ∇ J is the gradient of equation (6), and (λ n ) n≥1 is a sequence obeying (λ n ) n≥1 ∈ [0, 1] and lim n→+∞ λ n = 0. Unfortunately, the projection operator P is not easily determined and in practice we use the following proxy; compute Ws and replace those coefficients which do not obey the constraints |(Ws − Ws) j,k | ≤ e j (those which fall outside of the prescribed interval) by those of s; apply the inverse transform.
The filtering algorithm is: 1. Initialize L max = 1, the number of iterations N i , and δ λ = Lmax N i .
2. Estimate the noise standard deviation σ s , and set e j = σ j 2 for all j. 3. Calculate the transform: w (s) = Ws.
-For all coefficients α j,k do * Calculate the residual r j,k = w In practice, a small number of iterations (<10) is enough. The (.) + operator means that negative values are set to zero ((a) + = MAX(0, a)). Figure 2: Simulated data. Fig. 2 shows a simulation of a 60h −1 Mpc box of universe, made by A. Klypin (simulated data at http://astro.nmsu.edu/∼aklypin/PM/pmcode). It represents the distribution of dark matter in the present-day universe and each point is a dark matter halo where visible galaxies are expected to be located, i.e. the distribution of dark matter halos can be compared with the distribution of galaxies in catalogs of galaxies. Genus ν Wavelet denoised Figure 4: The genus curve for the N-body model shown in Fig. 2 convolving the data with a Gaussian filter with different values of σ and the genus for the wavelet filtered data set. Fig. 3 shows, at the right panel, the same data set filtered by the 3D wavelet transform, using the algorithm described previously. The two left panels correspond to Gaussian smoothing

Experiments
with σ = 1 and σ = 3. We can see that when the bandwidth is too small the discreteness and the noise dominate the density reconstructed field, while large value of σ erase all the small scale features of the distribution. This is illustrated in Fig. 4, where we can see the strong dependence of the genus curve with the width of the Gaussian filter, being its interpretation completely dependent of the choice of the bandwidth. The wavelet reconstructed density field keeps information at all scales due to its multiscale nature. This is observed in the 3D image at the right panel of Fig. 3, where we see how large filaments, big clusters and walls coexist with small scale features such as the density enhancement around groups and small clusters. The genus curve of this adaptive reconstructed density field is much more informative because it does not depend of the particular choice of the filter radius.

The 2D Ridgelet Transform
The two-dimensional continuous ridgelet transform of a function f ∈ L 2 (R 2 ) is defined as follows [3]. We first select a smooth function ψ ∈ L 2 (R), we assume that ψ satisfies the admissibility condition which holds if ψ has a sufficient decay and a vanishing mean ψ(t)dt = 0 (ψ can be normalized so that it has unit energy 1/(2π) |ψ(ξ) Given a function f ∈ L 2 (R 2 ), we define its ridgelet coefficients by: Figure 5: Definition of angle1 θ 1 and θ 2 in R 2 and R 3 It has been shown [3] that the ridgelet transform is precisely the application of a 1-dimensional wavelet transform to the slices of the Radon transform (where the angular variable θ 1 is constant). This method is therefore optimal to detect lines of the size of the image (the integration increase as the length of the line). More details on the implementation of the digital ridgelet transform can be found in [31]. Figure 6 (left) shows an example ridgelet function. This function is constant along lines x 1 cos θ + x 2 sin θ = const. Transverse to these ridges it is a wavelet (see figure 6 (right).

From 2D to 3D
The three-dimensional continuous ridgelet transform of a function f ∈ L 2 (R 3 ) is given by: Figure 6: Example of 2D ridgelet function.

Local 3D Ridgelet Transform
The ridgelet transform is optimal to find sheets of the size of the cube. To detect smaller sheets, a partitioning must be introduced [2]. The cube c is decomposed into blocks of lower side-length b so that for a N * N * N cube, we count N/b blocks in each direction. After the block partitioning, The detection is therefore optimal for sheets of size b × b and of thickness a j , a j corresponding to the different dyadic scales used in the transformation.

Definition
The X-ray transform of a continuum function f (x, y, z) with (x, y, z) ∈ R 3 is defined by where L is a line in R 3 , and p is a variable indexing points in the line. The transformation contains all line integrals of f . The Beamlet Transform (BT) can be seen as a multiscale digital X-ray transform. It is multiscale transform because, in addition to the multiorientation and multilocation line integral calculation, it integrated also over line segments at different length. The 3D BT is an extension to the 2D BT, proposed by Donoho and Huo [4].

The system of 3D beams
The first choice to consider is the line segment set. We would like to have an expressive set of line segments in the sense that it includes line segments with various lengths, locations and orientations lying inside a 3D volume and at the same time has reasonable size. A seemingly natural candidate for the set of line segments is the family of all line segments between any voxel corner and any other voxel corner, the set of 3-D beams. The beams set is expressive but can be of huge cardinality for even moderate resolutions. For a 3D data set with n 3 voxels we get O(n 6 ) 3D beams -So that is clearly infeasible to use the collection of 3-D beams as a basic data structure since any algorithm based on this set will have a complexity with lower bound of n 6 and hence unworkable for typical sizes 3-D images.

The Beamlet System
A dyadic cube C(k 1 , k 2 , k 3 , j) ⊂ [0, 1] 3 is the collection of points where 0 ≤ k 1 , k 2 , k 3 < 2 j for an integer j ≥ 0. We will refer to j as the scale of the dyadic cube Such cubes can be viewed as descended from the unit cube C(0, 0, 0, 0) = [0, 1] 3 by recursive partitioning. Hence, the result of splitting C(0, 0, 0, 0) in half along each axis is the eight cubes C(k 1 , k 2 , k 3 , 1) where k i ∈ {0, 1}, splitting those in half along each axis we get the 64 subcubes C(k 1 , k 2 , k 3 , 2) where k i ∈ {0, 1, 2, 3}, and if we decompose the unit cube into n 3 voxels using a uniform n-by-n-by-n grid with n = 2 J dyadic, then the individual voxels are the n 3 cells C(k 1 , k 2 , k 3 , J), 0 ≤ k 1 , k 2 , k 3 < n. Associated to each dyadic cube we can build a system of line segments that have both of their end-points lying on the cube boundary. We call each such segment A beamlet. If we consider all pairs of boundary voxel corners we get O(n 4 ) beamlets for a dyadic cube with a side length of n voxels, we will work with a slightly different system in which each line is associated with a slope and an intercept instead of its end-points as will be explained below. However, we will still have O(n 4 ) cardinality. Assuming a voxel size of 1/n we get J + 1 scales of dyadic cubes where n = 2 J , for any scale 0 ≤ j ≤ J there are 2 3j dyadic cubes of scale j and since each dyadic cube at scale j has a side length of 2 J−j voxels we get O(2 4(J−j) ) beamlets associated with the dyadic cube and a total of O(2 4J−j ) = O(n 4 /2 j ) beamlets at scale j. If we sum the number of beamlets at all scales we get O(n 4 ) beamlets.
We have constructed above a multi-scale arrangement of line segments in 3D with controlled cardinality of O(n 4 ), the scale of a beamlet is defined as the scale of the dyadic cube it belongs to so lower scales correspond to longer line segments and finer scales correspond to shorter line segments. Figure 10 shows 2 beamlets at different scales. To construct the set of beamlets for a given dyadic cube we use the slope-intercept pairs system. For a data cube of n × n × n voxels consider a coordinate system with the cube center of mass at the origin and a unit length for a voxel. Hence, for (x, y, z) in the data cube we have |x|, |y|, |z| ≤ n/2. We can consider three kinds of lines: x-driven, y-driven, and z-driven, depending on which axis provides the shallowest slopes. An x-driven line takes the form z = s z x + t z , y = s y x + t y with slopes s z ,s y , and intercepts t z and t y . Here the slopes |s z |, |s y | ≤ 1. y-and z-driven lines are defined with an interchange of roles between x and y or z, as the case may be.
Using the above family of lines for a given data cube we can define the set of beamlets belong to the cube to be the set of line segment obtained by taking the intersection of each line with the cube.
With the choice of indices range above we get that all beamlets associated with a data cube of size n have lengths bigger than n/2, half of the cube length and √ 3n, the cube main diagonal length.

Computational aspects
The Beamlet coefficients are the line integrals over the set of beamlets. A digital 3-D image can be regarded as a 3-D piece-wise constant function and each line integral is just a weighted sum of the voxel intensities along the corresponding line segment. Donoho and Levi [6] discuss in detail different approaches for computing line integrals in a 3-D digital image. Computing the beamlet coefficients for real applications data sets can be a challenging computational task since for a data cube with n × n × n voxels we have to compute O(n 4 ) coefficients. By developing efficient cache aware algorithms we are able to handle 3-D data sets of size up to n = 256 on a single fast machine in less than a day running time. We will mention that in many cases there is no interest in the coarsest scales coefficient that consumes most of the computation time and in that case the over all running time can be significantly faster. The algorithms can also be easily implemented on a parallel machine of a computer cluster using a system such as MPI in order to solve bigger problems.

The FFT-based transformation
Let ψ ∈ L 2 (R 2 ) a smooth function satisfying the admissibility condition (a 2D wavelet function), the three-dimensional continuous beamlet transform of a function f ∈ L 2 (R 3 )is given by:   The 3D beamlet transform can be built using the "Generalized projection-slice theorem" [39]. Let • f (x) an n dimensional function, • Rad m f its m-dimensional partial radon transform along the first m cardinal directions, m < n, Rad m f is a function of (p, µ m ; x m+1 , ..., x n ), µ m a unit directional vector in Rad m (note that for a given projection angle, the m dimensional partial radon transform of f (x) has (n − m) untransformated spatial dimension and a (n-m+1) dimensional projection profile), Let c(i 1 , i 2 , i 3 ) a cube of size (N, N, N ), the Beamlet algorithm consists in the following steps: 1. 3D-FFT. Computeĉ(k 1 , k 2 , k 3 ), the three-dimensional FFT of the cube c(i 1 , i 2 , i 3 ).
2. Cartesian to Spherical Conversion. Using an interpolation scheme, substitute the sampled values ofĉ obtained on the Cartesian coordinate system (k 1 , k 2 , k 3 ) with sampled values of on a spherical coordinate system (θ 1 , θ 2 , ρ).
3. Extract planes. Extract the 3N 2 planes (of size N × N ) passing through the origin (each line used in the 3D ridgelet transform defines set of orthogonal planes, we take the one which include the origin).

2D-WT.
Compute the two-dimensional wavelet transform on each plane. Figure 12 the 3D beamlet transform flowgraph. The 3D beamlet transform allows us to detect filament in a cube. The beamlet transform algorithm presented in this section differs from the one presented in [7], and relation between both algorithms is given in [6].

Experiment 1
We have simulated three data set containing respectively a cluster, a plane and a line. On each data set, Poisson noise have been added with eight different background levels. Then we have applied the three transforms on the 24 simulated data set. The coefficients distribution related to each transformation is normalized using twenty realizations of a 3D flat distribution with a Poisson noise and which have the same number of counts as in the data.  Figure 13 shows, from top to bottom, the maximum value of the normalized distribution versus the noise level for our three simulated data set. As expected, wavelets, ridgelets and beamlets are respectively the best for clusters, sheets and lines detection. It must also be underlined that a feature can be detected with a very high signal-to-noise ratio in a given basis, and and not detected in another basis. For example, the wall is detected at more than 60σ by the ridgelet transform, and less than 5σ by the wavelet transform. The line is detected almost at 10σ by the beamlet transform, and is under a 3σ detection level using wavelets. These results shows the importance of using several transforms for an optimal detection of all features contained in a data set.  We use here two simulated data sets to illustrate the discriminative power of the multiscale methods. The first one is a simulation from stochastic geometry. It is based on a Voronoi model. The second one is a mock catalog of the galaxy distribution drawn from a Λ-CDM Nbody cosmological model [12]. Both processes have very similar two-point correlation functions at small scales, although they look quite different and have been generated following completely different algorithms. closer set to the real galaxy distribution [12]. There are 15445 galaxies within a box with side 141.3 h −1 Mpc. Galaxies in this catalog have stellar masses exceeding 2 × 10 10 M ⊙ . Figure 14 shows the two simulated data set, and Figure 15 shows the two-point correlation function curve for the two point processes. The two point fields are different, but as it can be seen in Fig. 15, both have very similar two-point correlation functions in a huge range of scales (2 decades).

Experiment
We have applied the three transforms to each data set, and we have calculated the skewness vector S = (s j w , s j r , s j b ) and the kurtosis vector K = (k j w , k j r , k j b ) at each scale j. s j w , s j r , s j b are respectively the skewness at scale j of the wavelet coefficients, the ridgelet coefficients and the beamlet coefficients. k j w , k j r , k j b are respectively the kurtosis at scale j of the wavelet coefficients, the ridgelet coefficients and the beamlet coefficients. Figure 16 shows the kurtosis and the skewness vectors of the two data set at the two first scales. On the contrary to the two-point correlation function, this analysis shows strong differences between the two data set, particularly on the wavelet axis, which indicates that the second data contains more, or with a higher density, clusters than the first one.

Experiment 3
In this experiment, we have used a Λ-CDM simulation based on a N-body and hydrodynamical code, called RAMSES [37]. The code is based on Adaptive Mesh Refinement (AMR) technique, with a tree-based data structure allowing recursive grid refinements on a cell-by-cell basis. The simulated data have been obtained using 256 3 particles and 4.1 × 10 7 cells in the AMR grid, reaching a formal resolution of 8192 3 . The box size was set to 100h − 1 Mpc, with the following cosmological parameters: We used the results of this simulation at six different redshifts (z = 5, 3, 2, 1, 0.5, 0). Fig. 17 shows a projection of the simulated cubes along one axis. We have applied the 3D wavelet transform, the 3D beamlet transform and the 3D ridgelet transform on the six data set, and we calculate for each transform the standard deviation of the different scales. We will note σ 2 W,z,j , σ 2 R,z,j , σ 2 B,z,j the variance of the scale j relative to the transformation of the cube at redshift z by respectively the wavelet, the ridgelet and the beamlet transform. Figure 18 shows respectively from top to bottom the wavelet spectrum P w (z, j) = σ 2 W,z,j , the beamlet spectrum P b (z, j) = σ 2 B,z,j and the ridgelet spectrum P r (z, j) = σ 2 R,z,j . In order to see the evolution of matter distribution with the redshift and scale, we calculate the ratio M bw (j, z) = P b (z,j) Pw(z,j) and M rw (j, z) = Pr(z,j) Pw(z,j) . Figure 19 shows the M 1 and M 2 curves as a function of z and Figure 20 shows the M 1 and M 2 curves as a function of the scale number j.
The M 1 curve does not show too much evolution, while the M 2 curve presents a significant slope. This shows that the beamlet transform is much more sensible to the formation of clusters than the ridgelet transform. This is not surprising since the beamlet function support is much smaller than the ridgelet function support. M 2 increases with z showing clearly the cluster formation. This indicates that the combination of multiscale transformations allows us to get some information about the degree of clustering, filamentarity, and sheetedness.

Conclusion
We have introduced in this paper two new methods to analyze catalogs of galaxies. The first one consists in estimating the real underlying density though a wavelet denoising. Structures are first detected in the wavelet space, and an iterative reconstruction is performed. A smoothness constraint based on the l 1 norm of the wavelet coefficients is used, which reduce the amount of artifact in the reconstructed density, especially the ringing artifacts around strong features which are due to the wavelet function shape. We have shown that such an approach leads to much more reliable results than a Gaussian filtering when we want to derive a Genus curve from the catalog. This could also be true for other techniques which require to pre-process the data with a Gaussian filtering. The wavelet denoising preserves the resolution of the detected features whatever their sizes, and remove the noise in a non-linear way at all the scales.
The second approach does not require to detect anything. It is based on the analyzing of the distribution of coefficients obtained by several multiscale transforms. We have introduced two new multiscale decompositions, the 3D ridgelet transform and the 3D beamlet transform. We have described how to implement them using the FFT. Then we have shown that combining the information related to wavelet, ridgelet and beamlet coefficients leads to a new way to describe a data set. We have used in this paper the skewness and the kurtosis, but other recent statistic estimator such the Higher Criticism [5] could be used as well. Each multiscale transform is very sensible to one kind of feature, the wavelet to clusters, the beamlet to filaments, and the ridgelet to walls. A similar method has been proposed for analyzing CMB maps [29] where both the curvelet and the wavelet transform were used for the detection and the discrimination of non Gaussianities. This combined multiscale statistic is very powerful and we have shown that two data set that cannot be distinguished using a two point correlation function are clearly identified as different using our method. We believe that such an approach will permit to better constraint the cosmological models.