 Research Article
 Open Access
Improving Density Estimation by Incorporating Spatial Information
 Laura M. Smith^{1},
 Matthew S. Keegan^{1},
 Todd Wittman^{1},
 George O. Mohler^{1} and
 Andrea L. Bertozzi^{1}
https://doi.org/10.1155/2010/265631
© Laura M. Smith et al. 2010
 Received: 1 December 2009
 Accepted: 9 March 2010
 Published: 27 June 2010
Abstract
Given discrete event data, we wish to produce a probability density that can model the relative probability of events occurring in a spatial region. Common methods of density estimation, such as Kernel Density Estimation, do not incorporate geographical information. Using these methods could result in nonnegligible portions of the support of the density in unrealistic geographic locations. For example, crime density estimation models that do not take geographic information into account may predict events in unlikely places such as oceans, mountains, and so forth. We propose a set of Maximum Penalized Likelihood Estimation methods based on Total Variation and Sobolev norm regularizers in conjunction with a priori high resolution spatial data to obtain more geographically accurate density estimates. We apply this method to a residential burglary data set of the San Fernando Valley using geographic features obtained from satellite images of the region and housing density information.
Keywords
 Kernel Density Estimation
 Aerial Image
 Housing Density
 Valid Region
 Bregman Variable
1. Introduction
High resolution and hyperspectral satellite images, city and county boundary maps, census data, and other types of geographical data provide much information about a given region. It is desirable to integrate this knowledge into models defining geographically dependent data. Given spatial event data, we will be constructing a probability density that estimates the probability that an event will occur in a region. Often, it is unreasonable for events to occur in certain regions, and we would like our model to reflect this restriction. For example, residential burglaries and other types of crimes are unlikely to occur in oceans, mountains, and other regions. Such areas can be determined using aerial images or other external spatial data, and we denote these improbable locations as the invalid region. Ideally, the support of our density should be contained in the valid region.
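To make the support problem concrete, the following minimal sketch measures how much probability mass a fixed-bandwidth Gaussian kernel density estimate places in an invalid region. Everything here is an assumption chosen for illustration: the 40 by 40 grid, the half-plane valid region, the synthetic events, and the bandwidth `sigma`. None of it is taken from the experiments in this paper.

```python
import numpy as np

def gaussian_kde_grid(events, shape, sigma):
    """Fixed-bandwidth Gaussian KDE on pixel centers, normalized to sum to 1."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    density = np.zeros(shape, dtype=float)
    for r, c in events:
        density += np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
    return density / density.sum()

# Hypothetical domain: columns 0..19 are valid, columns 20..39 (e.g. ocean) are invalid.
shape = (40, 40)
valid = np.zeros(shape, dtype=bool)
valid[:, :20] = True

# Synthetic (row, col) events, all inside the valid region but near its boundary.
rng = np.random.default_rng(0)
events = np.column_stack([rng.uniform(0, 40, 200), rng.uniform(12, 20, 200)])

kde = gaussian_kde_grid(events, shape, sigma=3.0)
leaked_mass = kde[~valid].sum()
print(f"probability mass in the invalid region: {leaked_mass:.3f}")
```

Although every event lies in the valid region, the kernels centered near the boundary spread a noticeable share of the mass across it, which is the behavior the proposed models are designed to avoid.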
Geographic profiling, a related topic, is a technique used to create a probability density from a set of crimes by a single individual to predict where the individual is likely to live or work [1]. Some law enforcement agencies currently use software that makes predictions in unrealistic geographic locations. Methods that incorporate geographic information have recently been proposed and are an active area of research [2, 3].
In this paper we propose a novel set of models that restrict the support of the density estimate to the valid region and ensure realistic behavior. The models use Maximum Penalized Likelihood Estimation [11, 12], which is a variational approach. The density estimate is calculated as the minimizer of some predefined energy functional. The novelty of our approach is in the way we define the energy functional with explicit dependence on the valid region such that the density estimate obeys our assumptions of its support. The results from our methods for this simple example are illustrated in Figures 1(f), 1(g), and 1(h).
The paper is structured in the following way. In Section 2 Maximum Penalized Likelihood Methods are introduced. In Sections 3 and 4 we present our set of models which we name the Modified Total Variation MPLE model and the Weighted Sobolev MPLE model, respectively. In Section 5 we discuss the implementation and numerical schemes that we use to solve for the solutions of the models. We provide examples for validation of the models and an example with actual residential burglary data in Section 6. In this Section, we also compare our results to the Kernel Density Estimation model and other Total Variation MPLE methods. Finally, we discuss our conclusions and future work in Section 7.
2. Maximum Penalized Likelihood Estimation
Here, the penalty functional is generally designed to produce a smooth density map. A weighting parameter determines how strongly the maximum likelihood term is weighted compared to the penalty functional.
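For reference, a generic MPLE problem can be written as follows. The notation is an assumption introduced here for illustration: $u$ is the density on a domain $\Omega$, $x_1, \dots, x_n$ are the observed event locations, $P$ is the penalty functional, and $\mu$ is the weight on the log-likelihood term.

```latex
\hat{u} = \operatorname*{arg\,min}_{u \ge 0,\ \int_\Omega u \, dx = 1}
\left\{ P(u) - \mu \sum_{i=1}^{n} \log u(x_i) \right\}
```

A larger $\mu$ pulls the estimate toward the observed events, while a smaller $\mu$ lets the penalty dominate and yields a smoother density.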
A range of penalty functionals has been proposed [4, 11, 12]. More recently, variants of the Total Variation (TV) functional [13], $\int_\Omega |\nabla u| \, dx$ for a density $u$ on the domain $\Omega$, have been proposed for MPLE [8–10]. These methods do not explicitly incorporate the information that can be obtained from external spatial data, although some note the need to allow for various domains. Even though the TV functional maintains sharp gradients, the boundaries of the constant regions do not necessarily agree with the boundaries within the image. This method also performs poorly when the data are too sparse, as the density is smoothed to have nearly equal probability almost everywhere. Figure 1(e) demonstrates this, and also shows that this method assigns nonnegligible probability to the invalid region.
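The claim that a TV penalty tolerates sharp gradients can be checked on a small grid. The sketch below is an illustration only; the forward-difference discretization and the two test images are assumptions. It compares a sharp step with a gradual ramp of the same total rise under a discrete TV penalty and under a quadratic (Sobolev-type) penalty.

```python
import numpy as np

def total_variation(u):
    """Isotropic discrete TV: sum of forward-difference gradient magnitudes."""
    dx = np.diff(u, axis=1, append=u[:, -1:])
    dy = np.diff(u, axis=0, append=u[-1:, :])
    return np.sqrt(dx ** 2 + dy ** 2).sum()

def dirichlet_energy(u):
    """Quadratic penalty: half the sum of squared forward differences."""
    dx = np.diff(u, axis=1, append=u[:, -1:])
    dy = np.diff(u, axis=0, append=u[-1:, :])
    return 0.5 * (dx ** 2 + dy ** 2).sum()

n = 32
# Sharp step from 0 to 1 at the middle column.
step = np.zeros((n, n))
step[:, n // 2:] = 1.0
# Linear ramp from 0 to 1 spread over ten columns.
ramp = np.tile(np.clip((np.arange(n) - 10) / 10.0, 0.0, 1.0), (n, 1))

print("TV        step vs ramp:", total_variation(step), total_variation(ramp))
print("quadratic step vs ramp:", dirichlet_energy(step), dirichlet_energy(ramp))
```

The TV penalty charges the step and the ramp the same amount, so it has no preference for smearing an edge, whereas the quadratic penalty charges the step far more. Neither penalty, by itself, knows where the edge should be, which is the gap the valid-region information is meant to fill.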
The methods we propose use a penalty functional that depends on the valid region determined from the geographical images or other external spatial data. Figure 1 demonstrates how these models will improve on the current methods.
3. The Modified Total Variation MPLE Model
Once we have determined a valid region, we wish to align the level curves of the density function with the boundary of the valid region. The Total Variation functional is well known to allow discontinuities in its minimizing solution [13]. By aligning the level curves of the density function with the boundary, we encourage a discontinuity to occur there to keep the density from smoothing into the invalid region.
A weighting parameter allows us to vary the strength of the alignment term. Two pansharpening methods, P+XS and Variational Wavelet Pansharpening [14, 15], both include a similar term in their energy functionals to align the level curves of the optimal image with the level curves of the high resolution panchromatic image.
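As an illustration of how such an alignment term can be written, the following sketch follows the form used in the pansharpening literature. The symbols are assumptions introduced here: $u$ is the density, $\theta$ is a unit vector field normal to the boundary of the valid region, $\lambda$ is the alignment weight, and the boundary term arising from integration by parts is assumed to vanish.

```latex
\int_\Omega \big( |\nabla u| - \lambda\, \theta \cdot \nabla u \big)\, dx
= \int_\Omega \big( |\nabla u| + \lambda\, u \, \nabla \cdot \theta \big)\, dx
```

With $\lambda = 1$ the left-hand integrand is nonnegative and vanishes exactly where $\nabla u$ points along $\theta$, so minimizing it encourages the level curves of $u$ to follow the boundary of the valid region.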
4. The Weighted Sobolev MPLE Model
The weighting away from the edges is used to control the diffusion into the invalid region. This method of weighting away from the edges can also be used with the Total Variation functional in our first model, and we will refer to this as our Weighted TV MPLE model.
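A minimal sketch of the weighting idea on a pixel grid is given below. It is a simplification made for illustration: the weight is a binary indicator on neighboring pixel pairs (a small value `epsilon` across the valid-region boundary, 1 elsewhere) rather than a smooth edge function, and the names `pair_weights` and `weighted_h1_energy` are hypothetical.

```python
import numpy as np

def pair_weights(valid, epsilon=1e-3):
    """Weights on neighboring pixel pairs: epsilon across the boundary, 1 elsewhere."""
    v = valid.astype(int)
    wx = np.where(np.diff(v, axis=1) != 0, epsilon, 1.0)  # horizontal pairs
    wy = np.where(np.diff(v, axis=0) != 0, epsilon, 1.0)  # vertical pairs
    return wx, wy

def weighted_h1_energy(u, wx, wy):
    """Half the weighted sum of squared differences between neighboring pixels."""
    return 0.5 * ((wx * np.diff(u, axis=1) ** 2).sum()
                  + (wy * np.diff(u, axis=0) ** 2).sum())

# Hypothetical valid region: the left half of a 20 x 20 grid.
valid = np.zeros((20, 20), dtype=bool)
valid[:, :10] = True

# A density that is uniform on the valid region and zero outside it.
u = valid / valid.sum()

wx, wy = pair_weights(valid)
plain = weighted_h1_energy(u, np.ones_like(wx), np.ones_like(wy))
weighted = weighted_h1_energy(u, wx, wy)
print(f"unweighted penalty: {plain:.3e}, weighted penalty: {weighted:.3e}")
```

With the weights in place, the jump at the boundary is almost free, so the smoothing penalty no longer pushes mass into the invalid region.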
5. Implementation
5.1. The Constraints
In the implementation of the Modified Total Variation MPLE method and the Weighted $H^1$ MPLE method, we must enforce the constraints $u \ge 0$ and $\int_\Omega u \, dx = 1$ to ensure that the estimate $u$ is a probability density. The nonnegativity constraint is satisfied in our numerical solution by solving quadratic equations that have at least one nonnegative root.
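The following sketch shows one way a quadratic with a nonnegative root can arise. It is an illustration under stated assumptions, not the discretization used in this paper. Suppose a single pixel value $s$ minimizes $\tfrac{a}{2} s^2 - c\,s - m \log s$, where $a > 0$ collects the smoothness weights, $c$ collects the neighboring values, and $m \ge 0$ is the likelihood weight times the event count at that pixel. Setting the derivative to zero and multiplying by $s$ gives $a s^2 - c s - m = 0$. The product of the roots is $-m/a \le 0$, so at least one root is nonnegative. The names `a`, `c`, and `m` are hypothetical.

```python
import math

def nonnegative_root(a, c, m):
    """Nonnegative root of a*s**2 - c*s - m = 0, assuming a > 0 and m >= 0."""
    if a <= 0 or m < 0:
        raise ValueError("expected a > 0 and m >= 0")
    # The discriminant c**2 + 4*a*m is never negative under these assumptions.
    root = (c + math.sqrt(c * c + 4.0 * a * m)) / (2.0 * a)
    return max(root, 0.0)  # guard against rounding slightly below zero

# 2*s**2 - s - 3 = 0 has roots 1.5 and -1; the nonnegative one is returned.
print(nonnegative_root(2.0, 1.0, 3.0))
```

When the pixel contains no events ($m = 0$), the root reduces to $\max(c, 0)/a$, so the update can never return a negative density value.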
5.2. Weighted $H^1$ MPLE Implementation
Here, the data term uses the given number of sampled events that occurred at each grid location. We chose our two parameters so that the Gauss-Seidel solver will converge. In particular, the parameter values are set relative to the size of the image.
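A Gauss-Seidel iteration of this kind can be sketched as follows. This is a simplified illustration, not the scheme used for the results in this paper: it uses unit weights on all neighboring pixel pairs (spatially varying weights would multiply each neighbor term), a fixed multiplier `lam` standing in for the unit-mass constraint, and synthetic event counts. The function names and parameter values are hypothetical.

```python
import numpy as np

def energy(u, counts, mu, lam=0.0):
    """Discrete energy: smoothness + multiplier term - mu * log-likelihood."""
    smooth = 0.5 * ((np.diff(u, axis=0) ** 2).sum() + (np.diff(u, axis=1) ** 2).sum())
    seen = counts > 0  # pixels without events contribute no log term
    return smooth + lam * u.sum() - mu * (counts[seen] * np.log(u[seen])).sum()

def gauss_seidel_sweep(u, counts, mu, lam=0.0):
    """Update every pixel in place, always using the newest neighbor values."""
    rows, cols = u.shape
    for i in range(rows):
        for j in range(cols):
            neighbors = [u[p, q]
                         for p, q in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                         if 0 <= p < rows and 0 <= q < cols]
            a = float(len(neighbors))
            c = sum(neighbors) - lam
            # Nonnegative root of a*s**2 - c*s - mu*counts[i, j] = 0.
            u[i, j] = (c + np.sqrt(c * c + 4.0 * a * mu * counts[i, j])) / (2.0 * a)
    return u

rng = np.random.default_rng(1)
counts = rng.poisson(0.3, size=(16, 16)).astype(float)  # synthetic event counts
u = np.full(counts.shape, 1.0 / counts.size)             # uniform starting guess

energies = [energy(u, counts, mu=1e-4)]
for _ in range(30):
    gauss_seidel_sweep(u, counts, mu=1e-4)
    energies.append(energy(u, counts, mu=1e-4))
print("first and last energy:", energies[0], energies[-1])
```

Each pixel update exactly minimizes the energy with respect to that pixel while the others are held fixed, so the energy cannot increase from one sweep to the next. Enforcing the unit-mass constraint would additionally require choosing the multiplier, which is left out of this sketch.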
5.3. Modified TV MPLE Implementation
There are many approaches for handling the minimization of the Total Variation penalty functional. A fast and simple method for doing this is to use the Split Bregman technique (see [10, 19] for an in-depth discussion; see also [20]). In this approach, we substitute the variable $d$ for $\nabla u$ in the TV norm and then enforce the equality $d = \nabla u$ using Bregman iteration. To apply Bregman iteration, we introduce the variable $b$ as the Bregman vector of the constraint. This results in a minimization problem in which we minimize over both $u$ and $d$.
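The two auxiliary updates of a Split Bregman iteration can be sketched as below, following the standard form of the method for isotropic TV. This is an illustration only: the split penalty parameter `lam`, the forward-difference gradient, and the function names are assumptions, and the update of the density itself (which in the MPLE setting also involves the log-likelihood term) is omitted.

```python
import numpy as np

def gradient(u):
    """Forward differences, with a zero difference at the far edge."""
    gx = np.diff(u, axis=1, append=u[:, -1:])
    gy = np.diff(u, axis=0, append=u[-1:, :])
    return gx, gy

def shrink(vx, vy, threshold):
    """Isotropic shrinkage: shorten each vector by `threshold`, clamping at zero."""
    norm = np.sqrt(vx ** 2 + vy ** 2)
    scale = np.maximum(norm - threshold, 0.0) / np.maximum(norm, 1e-12)
    return scale * vx, scale * vy

def bregman_updates(u, bx, by, lam):
    """Given the current u, update the split variable d and the Bregman vector b."""
    gx, gy = gradient(u)
    dx, dy = shrink(gx + bx, gy + by, 1.0 / lam)  # d-subproblem has a closed form
    bx_new = bx + gx - dx                          # accumulate the constraint residual
    by_new = by + gy - dy
    return dx, dy, bx_new, by_new

rng = np.random.default_rng(0)
u = rng.random((8, 8))
zeros = np.zeros_like(u)
dx, dy, bx, by = bregman_updates(u, zeros, zeros, lam=2.0)
print("largest |b| after one update:", np.sqrt(bx ** 2 + by ** 2).max())
```

Because the shrinkage step has a closed form, the only expensive part of each iteration is the update of the density, which is where an iterative solver is needed.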
We solved for the density with a Gauss-Seidel solver. Heuristically, we found that tying the parameters to one another through fixed relationships was sufficient for the solver to converge and provide good results. We also restricted one parameter to a fixed range of values. This leaves a single free parameter, which can be chosen using V-fold cross validation or other techniques, such as the sparsity information criterion [8].
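Parameter selection by V-fold cross validation can be sketched as follows. To keep the example self-contained, a Gaussian kernel density estimate with a bandwidth parameter stands in for the MPLE solver; this substitution, the synthetic events, and the names `make_kde_fit` and `v_fold_score` are assumptions made for illustration. The same scoring loop applies to any estimator that can be fit on a subset of the events and evaluated at held-out locations.

```python
import numpy as np

def make_kde_fit(bandwidth):
    """Stand-in estimator: fit(train) returns density(points) for a 2D Gaussian KDE."""
    def fit(train):
        def density(points):
            sq_dist = ((points[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
            kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
            return kernel.sum(axis=1) / (len(train) * 2.0 * np.pi * bandwidth ** 2)
        return density
    return fit

def v_fold_score(events, fit, v=5, seed=0):
    """Mean held-out log-likelihood per event over v folds."""
    order = np.random.default_rng(seed).permutation(len(events))
    folds = np.array_split(order, v)
    total = 0.0
    for k, held_out in enumerate(folds):
        train = np.concatenate([f for m, f in enumerate(folds) if m != k])
        density = fit(events[train])          # fit without the held-out fold
        total += np.log(density(events[held_out])).sum()
    return total / len(events)

# Synthetic events and a small grid of candidate parameter values.
events = np.random.default_rng(2).normal(size=(200, 2))
scores = {h: v_fold_score(events, make_kde_fit(h)) for h in (0.2, 0.4, 0.8, 3.0)}
best = max(scores, key=scores.get)
print("held-out log-likelihood per candidate:", scores)
print("selected value:", best)
```

The candidate with the highest held-out log-likelihood is selected, which penalizes both oversmoothed estimates and estimates that concentrate too tightly on the training events.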
6. Results
In this Section, we demonstrate the strengths of our models by providing several examples. We first show how our methods compare to existing methods for a dense data set. We then show that our methods perform well for sparse data sets. Next, we explore an example with an aerial image and randomly selected events to show how these methods could be applied to geographic event data. Finally, we calculate probability density estimates for residential burglaries using our models.
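Experiments with randomly selected events require drawing samples from a known density. The sketch below shows one way to do this on a pixel grid. The reference density, the valid region, and the function name `sample_events` are hypothetical and are not the ones used in the examples that follow.

```python
import numpy as np

def sample_events(density, n, rng):
    """Draw n pixel locations with probability proportional to `density`."""
    p = density.ravel() / density.sum()
    flat = rng.choice(density.size, size=n, p=p)
    rows, cols = np.unravel_index(flat, density.shape)
    return np.column_stack([rows, cols])

rng = np.random.default_rng(3)

# Hypothetical valid region: everything left of column 40 on a 64 x 64 grid.
valid = np.ones((64, 64), dtype=bool)
valid[:, 40:] = False

# Hypothetical reference density: a Gaussian bump cut off at the boundary.
rows, cols = np.mgrid[0:64, 0:64]
true_density = np.exp(-((rows - 32) ** 2 + (cols - 30) ** 2) / (2 * 10.0 ** 2)) * valid
true_density /= true_density.sum()

dense_events = sample_events(true_density, 8000, rng)   # dense data set
sparse_events = sample_events(true_density, 40, rng)    # sparse data set
print(dense_events.shape, sparse_events.shape)
```

Because the reference density is zero on the invalid region, every sampled event falls in the valid region, and any mass an estimator places outside it is an error attributable to the method.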
6.1. Model Validation Example
This is the error comparison of the five methods shown in Figure 2. Our proposed methods performed better than both the Kernel Density Estimation method and the TV MPLE method.
Table columns: 8,000 Events. Table rows: Kernel density estimate; TV MPLE; Modified TV MPLE; Weighted $H^1$ MPLE; Weighted TV MPLE.
6.2. Sparse Data Example
Table columns: 40 Events; 4,000 Events. Table rows: Kernel density estimate; TV MPLE; Modified TV MPLE; Weighted $H^1$ MPLE; Weighted TV MPLE.
6.3. Orange County Coastline Example
6.3.1. Model Comparisons
6.4. Residential Burglary Example
7. Conclusions and Future Work
In this paper we have studied the problem of determining a more geographically accurate probability density estimate. We demonstrate the importance of this problem by showing how common density estimation techniques, such as Kernel Density Estimation, fail to restrict the support of the density in a set of realistic examples.
To handle this problem, we proposed a set of methods, based on Total Variation and $H^1$ regularized MPLE models, that demonstrate great improvements in accurately enforcing the support of the density estimate when the valid region has been provided a priori. Unlike the TV-regularized methods, our $H^1$ model has the advantage that it performs well for very sparse data sets.
The effectiveness of the methods is shown in a set of examples in which burglary probability densities are approximated from a set of crime events. Regions in which burglaries are impossible, such as oceans, mountains, and parks, are determined using aerial images or other external spatial data. These regions are then used to define an invalid region in which the density should be zero. Therefore, our methods are used to build geographically accurate probability maps.
It is interesting to note that the behavior of each model appears to depend on the ratio between the number of samples and the size of the grid, and each model has shown very different behavior in this respect. The TV-based methods appear to be very sensitive to large changes in this ratio, whereas the $H^1$ method seems to be robust to these same changes. We are uncertain why this phenomenon exists, and it would make an interesting topic for future research.
There are many directions in which we can build on the results of this paper. We would like to devise better methods for determining the valid region, possibly evolving the edge set of the valid region using Γ-convergence [17]. Since this technique can be used for many types of event data, including residential burglaries, we would also like to apply this method to Iraq Body Count data. Finally, we would like to handle possible errors in the data, such as incorrect positioning of events that places them in the invalid region, by considering a probabilistic model of their position.
Declarations
Acknowledgments
This work was supported by NSF Grant BCS-0527388, NSF Grant DMS-0914856, ARO MURI Grant 50363-MA-MUR, ARO MURI Grant W911NS-09-1-0559, ONR Grant N00014-08-1-0363, ONR Grant N00014-10-1-0221, and the Department of Defense. The authors would like to thank George Tita and the LAPD for the burglary data set. They would also like to thank Jeff Brantingham, Martin Short, and the IPAM RIPS program at UCLA for the housing density data, which was obtained using ArcGIS and the LA County tax assessor data. The aerial images were obtained from Google Earth.
Authors’ Affiliations
References
 1. Rossmo DK: Geographic Profiling. CRC Press; 2000.
 2. Mohler GO, Short MB: Geographic profiling from kinetic models of criminal behavior. In review.
 3. O'Leary M: The mathematics of geographic profiling. Journal of Investigative Psychology and Offender Profiling 2009, 6: 253–265. 10.1002/jip.111
 4. Silverman BW: Kernel density estimation using the fast Fourier transform. Applied Statistics, Royal Statistical Society 1982, 31: 93–97.
 5. Silverman BW: Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC; 1986.
 6. Davies PL, Kovac A: Densities, spectral densities and modality. Annals of Statistics 2004, 32(3):1093–1136. 10.1214/009053604000000364
 7. Kooperberg C, Stone CJ: A study of logspline density estimation. Computational Statistics and Data Analysis 1991, 12(3):327–347. 10.1016/0167-9473(91)90115-I
 8. Sardy S, Tseng P: Density estimation by total variation penalized likelihood driven by the sparsity information criterion. Scandinavian Journal of Statistics 2010, 37(2):321–337. 10.1111/j.1467-9469.2009.00672.x
 9. Koenker R, Mizera I: Density estimation by total variation regularization. In Advances in Statistical Modeling and Inference, Essays in Honor of Kjell A. Doksum. World Scientific; 2007:613–634.
 10. Mohler GO, Bertozzi AL, Goldstein TA, Osher SJ: Fast TV regularization for 2D maximum penalized likelihood estimation. To appear in Journal of Computational and Graphical Statistics.
 11. Eggermont PPB, LaRiccia VN: Maximum Penalized Likelihood Estimation. Springer, Berlin, Germany; 2001.
 12. Good IJ, Gaskins RA: Nonparametric roughness penalties for probability densities. Biometrika 1971, 58(2):255–277.
 13. Rudin LI, Osher S, Fatemi E: Nonlinear total variation based noise removal algorithms. Physica D 1992, 60(1–4):259–268.
 14. Moeller M, Wittman T, Bertozzi AL: Variational wavelet pansharpening. In CAM Report. UCLA; 2008.
 15. Ballester C, Caselles V, Igual L, Verdera J, Rougé B: A variational model for P+XS image fusion. International Journal of Computer Vision 2006, 69(1):43–58. 10.1007/s11263-006-6852-x
 16. Mumford D, Shah J: Optimal approximations by piecewise smooth functions and associated variational problems. Communications on Pure and Applied Mathematics 1989, 42(5):577–685. 10.1002/cpa.3160420503
 17. Ambrosio L, Tortorelli VM: Approximation of functional depending on jumps by elliptic functional via Γ-convergence. Communications on Pure and Applied Mathematics 1990, 43(8):999–1036. 10.1002/cpa.3160430805
 18. Osher S, Burger M, Goldfarb D, Xu J, Yin W: An iterative regularization method for total variation-based image restoration. Multiscale Modeling and Simulation 2005, 4(2):460–489. 10.1137/040605412
 19. Goldstein T, Osher S: Split Bregman method for L1 regularized problems. SIAM Journal on Imaging Sciences 2009, 2: 323–343. 10.1137/080725891
 20. Wang Y, Yang J, Yin W, Zhang Y: A new alternating minimization algorithm for total variation image reconstruction. SIAM Journal on Imaging Sciences 2008, 1(3):248–272. 10.1137/080724265
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.