Multi-sensor data merging with stacked neural networks for the creation of satellite long-term climate data records
© Loyola and Coldewey-Egbers; licensee Springer. 2012
Received: 15 July 2011
Accepted: 26 April 2012
Published: 26 April 2012
This article presents a novel artificial neural network technique for merging multi-sensor satellite data. Stacked neural networks (NNs) are used to learn the temporal and spatial drifts between data from different satellite sensors. The resulting NNs are then used to sequentially adjust the satellite data for the creation of a global homogeneous long-term climate data record. The proposed technique has successfully been applied to the merging of ozone data from three European satellite sensors covering together a time period of more than 16 years. The resulting long-term ozone data record has an excellent long-term stability of 0.2 ± 0.2% per decade and can therefore be used for ozone and climate studies.
Keywords: stacked neural networks, multi-sensor data merging, satellite ozone
Over the last decades, an increasingly large number of ground-based and satellite sensors have been measuring physical and biogeochemical parameters that provide a global view of the state of the Earth's system and its temporal evolution. Numerous satellite-based datasets are complementary to each other in either their type of measurements or their temporal and/or spatial coverage.
An outstanding task nowadays is to develop intelligent algorithms that combine or fuse these multi-year observations, derived from diverse sensors onboard different satellites, into a consistent and homogeneous global long-term data record that enables solid scientific investigations of climate processes reflecting the state of the Earth and its variability. The optimally merged climate data record can then be compared with numerical models, serve as input for model simulations, or be used for trend analyses.
However, the combination of data retrieved from multiple orbiting platforms is hampered by several factors such as differences in spatial and/or temporal sampling, differences in sensor characteristics (e.g. spectral coverage or viewing geometry), limited calibration stability, characteristic biases among instruments, record continuity, or differences in retrieval algorithms. These uncertainties must be properly characterized as they may carry over into the merged data set.
Several recent data merging efforts using different approaches have addressed a variety of environmental variables. Stratospheric ozone has become of particular interest since the discovery of the ozone hole in the 1980s. A number of ground-based and space-borne ozone data records are available today; see the ozone homogenization section below for more details. Another example is aerosol optical thickness, where spectra from the sensors Sea viewing Wide Field of View Scanner (SeaWiFS) and Moderate Resolution Imaging Spectroradiometer (MODIS) are merged into a single data product using least squares fitting or alternative methods. Global sea surface temperature datasets are produced by combining in situ and space-borne measurements as well as various satellite observations, which are then validated with buoys [4, 5]. For ocean colour, there are examples of merged products from SeaWiFS, MODIS and Medium Resolution Imaging Spectrometer (MERIS) radiances.
We present a novel computational intelligence technique for merging multi-sensor satellite data. Temporal and spatial drifts between different satellite data are corrected using stacked neural networks (NNs).
This article is organized as follows: the next section outlines the general methodology of using stacked NNs for merging multi-sensor datasets, then the successful application of this methodology to total ozone data is presented and finally the conclusions are given.
Multi-sensor data merging with stacked NNs
Artificial NNs are very effective mathematical tools for learning nonlinear relationships implicitly given by input/output datasets. Typical neural network applications in satellite remote sensing have traditionally focused on classification problems. More recently, a general framework for solving forward and inverse problems in remote sensing using NNs was presented. Statistical retrievals based on NNs, for example for obtaining tropospheric ozone, are common nowadays. Data fusion techniques based on NNs have been developed and applied to rainfall measurements from space and to ground-based precipitation data.
In this study, we develop a novel approach for data merging based on stacked NNs (SNNs). SNNs were introduced some time ago as an ensemble combination method with two levels of learning involved. On the first level, several models are trained on the dataset; on the second level, a high-level model combines the first-level models in an optimal way.
SNNs are commonly used in the literature for the ensemble combination of NNs organized in two levels. We extend the concept of SNNs to a modular combination of NNs organized in a hierarchy with an unlimited number of levels.
The training set for every single NN is generated by combining data measured simultaneously with the two sensors to be homogenized. Spatial and temporal information is the input and the corresponding drift between the two sensors is the output of the training set. After training, the NNs can model the adjustments needed to minimize the differences between the two datasets.
Spatial and temporal information of satellite data (e.g. longitude or month of year) usually has a circular structure that may introduce discontinuities in regression problems. Regression functions may exhibit jumps when evaluated at the extreme points of circular data, for example between longitude 0° and 360° or between the months January and December. To avoid this problem, Chen proposes to add new input variables for the topological representation of circular data. A drawback of this approach is that the dimensionality of the input space is increased. In this article, we use a different approach called circular resampling. Instead of adding new input variables, we add new patterns to correlate the samples at the extreme points of circular data.
Homogenization of long-term satellite ozone data record using SNNs
Stratospheric ozone has become of particular interest since the late 1970s as the release of ozone depleting substances (ODSs) by human activity led to a significant decrease in the total ozone abundance. Although the Montreal Protocol and its subsequent amendments have now regulated the production and release of ODSs, there are still open questions concerning the onset of ozone recovery, the timing of full recovery, and the role of climate change.
NASA started the satellite remote sensing of ozone in 1970 with the Backscatter Ultraviolet Spectrometer. The European contribution to satellite-based measurements of atmospheric composition started with the Global Ozone Monitoring Experiment (GOME) sensor onboard the ESA satellite ERS-2 launched in 1995. GOME measured ozone and a number of atmospheric composition gases like nitrogen dioxide, sulphur dioxide, bromine monoxide, water vapour, formaldehyde, chlorine dioxide, glyoxalin as well as clouds and aerosols. The GOME data record is continued with the Scanning Imaging Absorption Spectrometer for Atmospheric CHartographY (SCIAMACHY) sensor onboard the ESA satellite ENVISAT launched in 2002, with the Dutch sensor Ozone Monitoring Instrument (OMI) onboard the NASA satellite AURA launched in 2004, and with the GOME-2 sensor onboard the EUMETSAT satellite MetOp-A launched in 2006.
Several long-term, well-calibrated ozone datasets have been set up in order to address those climate-related questions, but the task is extremely difficult because an overall stability better than 1% per decade is required. Different merging algorithms for total as well as vertically resolved ozone rely on (a) inter-satellite calibration, i.e. using one dataset as reference standard [20–23], (b) ground-based measurements as reference data [24, 25], (c) data assimilation techniques [26, 27] and (d) optimum interpolation.
The total ozone measurements from GOME, SCIAMACHY and GOME-2 are used in this section; the merging follows an inter-satellite calibration approach. The operational products of these satellites, computed using the GOME Data Processor (GDP) 4.x algorithm [29–31], are systematically compared with ground-based measurements, and the differences are typically lower than 1% [30–32]. Nevertheless, satellite ozone data from different instruments may show spatial and temporal differences due to sensor-specific characteristics and drifts.
The geophysical validation results show that the GOME total ozone data are remarkably stable for the complete data period, while SCIAMACHY and GOME-2 present temporal and spatial drifts [30, 31]. For this reason, it was decided to use GOME as the transfer standard and to adjust the SCIAMACHY and GOME-2 measurements.
Following the SNN methodology presented in the previous section, a first network, NN_GS, is trained with the drifts between monthly mean total ozone measurements of GOME and SCIAMACHY on a regular grid of 1° × 1°. GOME data are available since July 1995, but global coverage was lost in June 2003 due to a failure of the satellite tape recorder. The SCIAMACHY data, on the other hand, are available since August 2002. Therefore, the training dataset is created using data from the overlapping period, using only grid points that contain measurements from both satellites. The NN has three input parameters:
latitude from 90°S to 90°N
season coded as the measurement month from 1 to 12
measurement time, taking the year 2002 as the base
The NN output is the inter-satellite drift computed as the ratio between the GOME and SCIAMACHY measurements.
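The construction of the training patterns described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `build_patterns`, the dictionary layout keyed by (year, month, latitude, longitude), and the encoding of the measurement time as fractional years since 2002 are all assumptions made for the example.

```python
def build_patterns(gome, scia, base_year=2002):
    """Build (latitude, month, time, drift) patterns from gridded monthly means.

    gome, scia: dicts mapping (year, month, lat, lon) -> monthly mean total ozone.
    Only grid cells with a valid value from BOTH sensors contribute a pattern.
    The time encoding (fractional years since base_year) is an assumption.
    """
    patterns = []
    for key in sorted(gome.keys() & scia.keys()):
        g, s = gome[key], scia[key]
        if g is None or s is None or s == 0:
            continue  # skip cells without a usable measurement pair
        year, month, lat, _lon = key
        time = (year - base_year) + (month - 1) / 12.0
        drift = g / s  # ratio between GOME and SCIAMACHY, as stated in the text
        patterns.append((lat, month, time, drift))
    return patterns
```

Longitude is deliberately dropped from the pattern, since only latitude, season and measurement time are listed as inputs.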
A total of 25,000 patterns are collected using overlapping measurements from the two satellites between 2002 and 2010. In order to avoid discontinuities in the season dimension (circular structure), we use the circular resampling technique presented in the previous section. A continuous behaviour across the extreme points of the season (months 1 and 12) is forced by creating new samples for the virtual months 0 and 13 containing the same patterns as December and January, respectively. In this way, the samples at the beginning and at the end of the year are highly correlated and possible jumps are avoided.
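A minimal sketch of the circular resampling step for the season dimension is given below. The pattern layout (latitude, month, time, drift) and the function name `circular_resample` are assumptions for illustration; in particular, the sketch leaves the time value of the copied patterns unchanged, which the text does not specify.

```python
def circular_resample(patterns, period=12):
    """Add virtual months 0 and period + 1 to a list of training patterns.

    Each pattern is a tuple (latitude, month, time, drift). December patterns
    are duplicated as month 0 and January patterns as month 13, so that the
    regression sees correlated samples on both sides of the year boundary.
    """
    extended = list(patterns)
    for lat, month, time, drift in patterns:
        if month == period:
            extended.append((lat, 0, time, drift))           # December -> virtual month 0
        elif month == 1:
            extended.append((lat, period + 1, time, drift))  # January -> virtual month 13
    return extended
```

The input dimensionality stays at three; only the number of patterns grows by the number of January and December samples.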
The samples are randomly divided into two sets: 90% are used for training and the remaining 10% for validation. Several NN topologies were tested; the best performance was reached with a feedforward NN with two hidden layers of 6 and 12 neurons, respectively, resulting in an NN_GS topology of 3-6-12-1. The training was sped up using a parallel learning algorithm running on a Linux server with 12 cores.
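To make the 3-6-12-1 topology and the 90/10 split concrete, the following sketch builds such a network and a random split in plain Python. It is an illustration under stated assumptions: the tanh hidden activation, the linear output, the weight initialization range and all function names are hypothetical, and no training loop is shown.

```python
import math
import random

LAYERS = [3, 6, 12, 1]  # three inputs, hidden layers of 6 and 12 neurons, one output

def init_network(layers, rng):
    """Return one (weights, biases) pair per layer with small random weights."""
    net = []
    for n_in, n_out in zip(layers[:-1], layers[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        net.append((weights, biases))
    return net

def forward(net, x):
    """Propagate one input vector; tanh on hidden layers (assumed), linear output."""
    for i, (weights, biases) in enumerate(net):
        z = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        x = z if i == len(net) - 1 else [math.tanh(v) for v in z]
    return x[0]

def count_parameters(net):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in net)

def split_patterns(patterns, train_fraction=0.9, seed=0):
    """Shuffle the patterns and split them into training and validation sets."""
    shuffled = list(patterns)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

With this layout the network has (3·6 + 6) + (6·12 + 12) + (12·1 + 1) = 121 trainable parameters, which is small compared with the 25,000 collected patterns; a 90/10 split of those patterns gives 22,500 training and 2,500 validation samples.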
The SCIAMACHY measurements in the tropical regions around 20°N to 20°S overestimate the total ozone by around 2% over the complete time period, with a noticeable reduction around mid-2007. This overestimation is related to the intra-cloud effect that was not considered during the processing of the SCIAMACHY data. The ozone at high latitudes in the southern hemisphere varies from small overestimations in 2002 to a seasonal underestimation of approximately 2% in the following years. A similar tendency occurs at high latitudes in the northern hemisphere, but with a stronger underestimation of up to 3% in the winter periods. The regular patterns during winter are mainly caused by differences in the GOME and SCIAMACHY retrievals for measurements under snow/ice conditions.
The training dataset does not contain samples from December 2002 because there are relatively few SCIAMACHY measurements for that period. Hence, the SNN has to interpolate in the time domain, which it does quite well (see Figure 2). In the same way, the SNN smoothly extrapolates the adjustments for the year 2011, as it is trained with data only until the end of 2010. Moreover, GOME data suffer from reduced spatial coverage in the southern hemisphere since July 2003 due to the ERS-2 tape recorder failure. Nonetheless, note the excellent interpolation capabilities also in the latitudinal domain, where the SNN computes smooth adjustments in the southern hemisphere after 2003.
The merged GOME+SCIAMACHY dataset is created using the original GOME measurements and the SCIAMACHY measurements adjusted with NN_GS. The mean differences between GOME and SCIAMACHY measurements are reduced from 0.94 ± 9.70% to -0.05 ± 8.86%.
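The adjustment and the difference statistics can be sketched as below. Because the drift is defined as the ratio between the GOME and SCIAMACHY measurements, a SCIAMACHY value is adjusted by multiplying it with the ratio predicted by the network. The sign convention of the relative difference, the use of the population standard deviation, and the function names are assumptions; the text does not state them.

```python
import statistics

def adjust(scia_value, predicted_ratio):
    """Scale a SCIAMACHY value by the modelled GOME/SCIAMACHY ratio."""
    return scia_value * predicted_ratio

def relative_differences(gome_values, scia_values):
    """Relative differences in percent, assumed here as (SCIAMACHY - GOME) / GOME."""
    return [100.0 * (s - g) / g for g, s in zip(gome_values, scia_values)]

def mean_and_spread(values):
    """Mean and standard deviation, the 'mean ± spread' form quoted in the text."""
    return statistics.mean(values), statistics.pstdev(values)

# Toy usage with made-up numbers; real ratios would come from the trained network.
gome = [300.0, 280.0]
scia = [303.0, 282.8]
ratios = [g / s for g, s in zip(gome, scia)]
adjusted = [adjust(s, r) for s, r in zip(scia, ratios)]
```

In this toy case the ratios are exact, so the adjusted differences vanish; with a trained network the mean difference shrinks while a residual spread remains, as in the figures reported above.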
Well-maintained ground-based instruments are used to evaluate the long-term stability of the space-borne total ozone observations. Those measurements are routinely deposited at the World Ozone and Ultraviolet Radiation Data Centre (WOUDC) in Canada (http://www.woudc.org). The WOUDC archive provides data from the early 1950s onward collected with different types of sensors covering a wide geographical range.
Monthly mean data from 43 ground stations located in the northern hemisphere and equipped with Dobson spectrophotometers were used for the comparison with the individual satellite instruments and the new SNN merged time series. The monthly means from each ground station were compared with the corresponding monthly means from the 1° × 1° gridded satellite data.
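One common way to turn such a satellite-minus-ground comparison into a stability figure in percent per decade is an ordinary least-squares line through the monthly relative differences. The sketch below assumes this approach; the text does not describe the fitting procedure, and the function name and the synthetic series are purely illustrative.

```python
def drift_per_decade(times_in_years, differences_percent):
    """Least-squares slope of the difference series, expressed in % per decade."""
    n = len(times_in_years)
    mean_t = sum(times_in_years) / n
    mean_d = sum(differences_percent) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(times_in_years, differences_percent))
    var = sum((t - mean_t) ** 2 for t in times_in_years)
    return 10.0 * cov / var  # slope per year times ten years

# Synthetic example (not data from this study): a constant 0.5% bias plus
# a linear drift of 0.02% per year over ten years of monthly means.
times = [i / 12.0 for i in range(120)]
diffs = [0.5 + 0.02 * t for t in times]
decadal_drift = drift_per_decade(times, diffs)
```

A constant bias between satellite and ground data does not affect the slope, which is why the drift and the mean difference are reported separately.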
In this article, we presented a novel multi-sensor data merging technique based on a generalized stacked neural network strategy. SNNs are used to adjust temporal and spatial drifts between data from different satellite sensors. An NN is added sequentially to the stack for adjusting every new dataset; the final result is a homogeneously merged data record.
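The sequential growth of the stack can be pictured with a small container that holds one correction model per sensor. This is a hypothetical sketch: the class `SensorStack`, its methods, and the assumption that each new model maps its sensor directly onto the record merged so far are not taken from the article, and the constant model below stands in for a trained NN.

```python
class SensorStack:
    """Hold one correction model per sensor, added in merging order."""

    def __init__(self, reference):
        # The reference sensor (transfer standard) is left unchanged.
        self.models = {reference: lambda lat, month, time: 1.0}
        self.order = [reference]

    def add_sensor(self, name, model):
        """Register a model returning the correction ratio for (lat, month, time)."""
        self.models[name] = model
        self.order.append(name)

    def adjust(self, name, value, lat, month, time):
        """Apply the sensor's correction ratio to one measurement."""
        return value * self.models[name](lat, month, time)

# Toy usage: a constant ratio replaces the trained network.
stack = SensorStack("GOME")
stack.add_sensor("SCIAMACHY", lambda lat, month, time: 0.99)
merged_value = stack.adjust("SCIAMACHY", 300.0, 45.0, 6, 1.5)
```

Adding a further sensor only requires training and registering one more model; models already in the stack are left untouched.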
The proposed technique has successfully been applied to the merging of ozone data from three satellite sensors: GOME, SCIAMACHY and GOME-2.
The SNNs reduce the mean differences between GOME/SCIAMACHY and SCIAMACHY/GOME-2 from 0.94 ± 9.70% to -0.05 ± 8.86% and from 3.37 ± 6.83% to 0.28 ± 6.15%, respectively. It is worth noting that the SNNs are able to compensate for missing data by means of their excellent interpolation and extrapolation capabilities in the time and space domains. The resulting merged long-term SNN ozone data record is well suited for ozone and climate studies due to its excellent long-term stability of 0.2 ± 0.2% per decade.
The GOME-type merged total ozone climate data record, covering a time period of more than 16 years, is freely available at http://atmos.caf.dlr.de/gome/gto-ecv.html.
ERS: European Remote Sensing satellite
ESA: European Space Agency
EUMETSAT: European Organisation for the Exploitation of Meteorological Satellites
GDP: GOME Data Processor
GOME: Global Ozone Monitoring Experiment
MERIS: Medium Resolution Imaging Spectrometer
MODIS: Moderate Resolution Imaging Spectroradiometer
NASA: National Aeronautics and Space Administration
ODSs: ozone depleting substances
OMI: Ozone Monitoring Instrument
SCIAMACHY: Scanning Imaging Absorption Spectrometer for Atmospheric CHartographY
SeaWiFS: Sea viewing Wide Field of View Scanner
SNNs: stacked neural networks
WOUDC: World Ozone and Ultraviolet Radiation Data Centre
Thanks to ESA/DLR for the provision of the GOME data (http://atmos.caf.dlr.de/gome), to ESA/BIRA for the provision of the SCIAMACHY data, and to O3M-SAF/EUMETSAT/DLR for the provision of the GOME-2 data (http://atmos.caf.dlr.de/gome2). We would like to thank colleagues from BIRA-IASB (Belgium), DLR (Germany), RT Solutions Inc. (USA) and AUTH (Greece) for their work on ozone retrieval algorithms from the GOME-type satellites and the corresponding geophysical validation. Ground-based data used in this study were taken from the World Ozone and Ultraviolet Radiation Data Centre (http://www.woudc.org). This study was partially supported by the ESA Climate Change Initiative project on Ozone (Ozone CCI).
- Melin F, Zibordi G, Djavidnia S: Development and validation of a technique for merging satellite derived aerosol optical depth from SeaWiFS and MODIS. Remote Sens Environ 2007, 108: 436-450. doi:10.1016/j.rse.2006.11.026
- Zubko V, Leptoukh GG, Gopalan A: Study of data-merging and interpolation methods for use in an interactive online analysis system: MODIS terra and aqua daily aerosol case. IEEE Trans Geosci Remote Sens 2010, 48(12):4219-4235. doi:10.1109/TGRS.2010.2050893
- Reynolds RW, Smith TM: Improved global sea surface temperature analyses using optimum interpolation. J Climate 1994, 7: 929-948. doi:10.1175/1520-0442(1994)007<0929:IGSSTA>2.0.CO;2
- Guan L, Kawamura H: Merging satellite infrared and microwave SSTs: methodology and evaluation of the new SST. J Oceanogr 2004, 60: 905. doi:10.1007/s10872-005-5782-5
- Sakaida F, Kawamura H, Takahashi S, Shimada T, Kawai Y, Hosoda K, Guan L: Research and development of the New Generation Sea Surface Temperature for Open Ocean (NGSST-O) product and its demonstration operation. J Oceanogr 2009, 65: 859-870. doi:10.1007/s10872-009-0071-3
- Maritorena S, d'Andon OHF, Mangin A, Siegel DA: Merged satellite ocean color data products using a bio-optical model: characteristics, benefits and issues. Remote Sens Environ 2010, 114: 1791-1804. doi:10.1016/j.rse.2010.04.002
- Atkinson PM, Tatnall ARL: Introduction neural networks in remote sensing. Int J Remote Sens 1997, 18(4):699-709. doi:10.1080/014311697218700
- Loyola D: Applications of neural network methods to the processing of earth observation satellite data. Neural Netw 2006, 19(2):168-177. doi:10.1016/j.neunet.2006.01.010
- Sellitto P, Bojkov BR, Liu X, Chance K, Del Frate F: Tropospheric ozone column retrieval from the Ozone Monitoring Instrument by means of a neural network algorithm. Atmos Meas Tech 2011, 4: 2375-2388. doi:10.5194/amt-4-2375-2011
- Tapiador FJ, Kidd C, Levizzani V, Marzano FS: A neural networks-based fusion technique to estimate half-hourly rainfall estimates at 0.1 resolution from satellite passive microwave and infrared data. J Appl Meteorol 2004, 43: 576-594. doi:10.1175/1520-0450(2004)043<0576:ANNFTT>2.0.CO;2
- Turlapaty AC, Anantharaj VG, Younan NH, Turk FJ: Precipitation data fusion using vector space transformation and artificial neural networks. Pattern Recognit Lett 2010, 31: 1184-1200. doi:10.1016/j.patrec.2009.12.033
- Ting KM, Witten IH: Issues in stacked generalization. J Artif Intell Res 1999, 10: 271-289.
- Sridhar DV, Bartlett EB, Seagrave RC: An information theoretic approach for combining neural network process models. Neural Netw 1999, 12: 915-926. doi:10.1016/S0893-6080(99)00030-1
- Chen FW: Neural network characterization of geophysical processes with circular dependencies. IEEE Trans Geosci Remote Sens 2007, 45(10):3037-3043. doi:10.1109/TGRS.2007.895409
- Burrows J, Weber M, Buchwitz M, Rozanov V, Ladstätter-Weißenmayer A, Richter A, DeBeek R, Hoogen R, Bramstedt K, Eichmann K, Eisinger M, Perner D: The Global Ozone Monitoring Experiment (GOME): mission concept and first scientific results. J Atmos Sci 1999, 56(2):151-175. doi:10.1175/1520-0469(1999)056<0151:TGOMEG>2.0.CO;2
- Bovensmann H, Burrows J, Buchwitz M, Frerick J, Noel S, Rozanov V, Chance K, Goede A: SCIAMACHY: mission objectives and measurement modes. J Atmos Sci 1999, 56: 127-150. doi:10.1175/1520-0469(1999)056<0127:SMOAMM>2.0.CO;2
- Levelt PF, van den Oord GHJ, Dobber MR, Mälkki A, Visser H, de Vries J, Stammes P, Lundell JOV, Saari H: The ozone monitoring instrument. IEEE Trans Geosci Remote Sens 2006, 44(5):1093-1101. doi:10.1109/TGRS.2006.872333
- Munro R, Eisinger M, Anderson C, Callies J, Corpaccioli E, Lang R, Lefebvre A, Livschitz Y, Perez Albinana A: GOME-2 on METOP: from in-orbit verification to routine operations. In Proceedings of EUMETSAT Meteorological Satellite Conference 2006. Helsinki, Finland; 2006.
- GCOS-107 (WMO-TD No.1338): Systematic observation requirements for satellite based products for Climate, composed by World Meteorological Organization and Intergovernmental Oceanographic Commission. 2006.
- Miller AJ, Nagatani RM, Flynn LE, Kondragunta S, Beach E, Stolarski R, McPeters RD, Bhartia PK, DeLand MT, Jackman CH, Wuebbles DJ, Patten KO, Cebula RP: A cohesive total ozone data set from the SBUV(/2) satellite system. J Geophys Res 2002, 107(D23):4701.
- Stolarski R, Frith SM: Search for evidence of trend slow-down in the long-term TOMS/SBUV total ozone data record: the importance of instrument drift uncertainty. Atmos Chem Phys 2006, 6: 4057-4065. doi:10.5194/acp-6-4057-2006
- Loyola D, Coldewey-Egbers M, Dameris M, Garny H, Stenke A, Van Roozendael M, Lerot C, Balis D, Koukouli M: Global long-term monitoring of the ozone layer-a prerequisite for predictions. Int J Remote Sens 2009, 30(15):4295-4318. doi:10.1080/01431160902825016
- McLinden C, Tegtmeier S, Fioletov V: Technical note: a SAGE-corrected SBUV zonal mean ozone data set. Atmos Chem Phys 2009, 9: 7963-7972. doi:10.5194/acp-9-7963-2009
- Bodeker GE, Scott JC, Kreher K, McKenzie RL: Global ozone trends in potential vorticity coordinates using TOMS and GOME intercompared against the Dobson network: 1978-1998. J Geophys Res 2001, 106(D19):23029-23042. doi:10.1029/2001JD900220
- Bodeker GE, Shiona H, Eskes H: Indicators of Antarctic ozone depletion. Atmos Chem Phys 2005, 5: 2603-2615. doi:10.5194/acp-5-2603-2005
- Kiesewetter G, Sinnhuber BM, Vountas M, Weber M, Burrows JP: A long-term stratospheric ozone data set from assimilation of satellite observations: high-latitude ozone anomalies. J Geophys Res 2010, 115: D10307.
- Van der A RJ, Allaart MAF, Eskes HJ: Multi sensor reanalysis of total ozone. Atmos Chem Phys 2010, 10: 11277-11294. doi:10.5194/acp-10-11277-2010
- Nirala M: Multi-sensor data fusion and comparison of total ozone. Int J Remote Sens 2008, 29(15):4553-4573. doi:10.1080/01431160801927202
- Van Roozendael M, Loyola D, Spurr R, Balis D, Lambert J-C, Livschitz Y, Valks P, Ruppert T, Kenter P, Fayt C, Zehner C: Ten years of GOME/ERS-2 total ozone data: the new GOME Data Processor (GDP) Version 4: I-algorithm description. J Geophys Res 2006, 111: D14311. doi:10.1029/2005JD006375
- Lerot C, Van Roozendael M, van Geffen J, van Gent J, Fayt C, Spurr R, Lichtenberg G, von Bargen A: Six years of total ozone column measurements from SCIAMACHY nadir observations. Atmos Meas Tech 2009, 2: 87-98. doi:10.5194/amt-2-87-2009
- Loyola D, Koukouli M, Valks P, Balis D, Hao N, Van Roozendael M, Spurr R, Zimmer W, Kiemle S, Lerot C, Lambert JC: The GOME-2 total column ozone product: retrieval algorithm and ground-based validation. J Geophys Res 2011, 116: D07302. doi:10.1029/2010JD014675
- Balis D, Lambert JC, van Roozendael M, Spurr R, Loyola D, Livschitz Y, Valks P, Amiridis V, Gerard P, Granville J, Zehner C: Ten years of GOME/ERS-2 total ozone data: the new GOME Data Processor (GDP) Version 4: II-ground-based validation and comparisons with TOMS V7/V8. J Geophys Res 2007, 112: D07307. doi:10.1029/2005JD006376
- Schuessler O, Loyola D: Parallel training of artificial neural networks using multithreaded and multicore CPUs. In Adaptive and Natural Computing Algorithms. Edited by: Dobnikar A, Lotric U, Šter B. Lecture Notes in Computer Science, 6593 (Springer Berlin, 2011); 70-79. doi:10.1007/978-3-642-20282-7_8
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.