
Multispectral imaging using a stereo camera: concept, design and assessment


This paper proposes a one-shot six-channel multispectral color image acquisition system using a stereo camera and a pair of optical filters. The two filters from the best pair, selected from among readily available filters such that they modify the sensitivities of the two cameras in such a way that they produce optimal estimation of spectral reflectance and/or color, are placed in front of the two lenses of the stereo camera. The two images acquired from the stereo camera are then registered for pixel-to-pixel correspondence. The spectral reflectance and/or color at each pixel on the scene are estimated from the corresponding camera outputs in the two images. Both simulations and experiments have shown that the proposed system performs well both spectrally and colorimetrically. Since it acquires the multispectral images in one shot, the proposed system can solve the limitations of slow and complex acquisition process, and costliness of the state of the art multispectral imaging systems, leading to its possible uses in widespread applications.


With the development and advancement of digital cameras, the acquisition and use of digital images have increased tremendously. Conventional image acquisition systems, which capture images in three color channels, usually red, green and blue, are by far the most commonly used imaging systems. However, these systems have several limitations: they provide only a color image, suffer from metamerism, are limited to the visual range, and produce images that are environment dependent. Spectral imaging addresses these problems. Spectral imaging systems capture image data at specific wavelengths across the electromagnetic spectrum. Based on the number of bands, spectral imaging systems can be divided into two major types: multispectral and hyperspectral. There is no fine line separating the two; however, spectral imaging systems with more than 10 bands are generally considered hyperspectral, whereas those with fewer than 10 are considered multispectral. Hyperspectral imaging deals with imaging narrow spectral bands over a contiguous spectral range and produces the spectra of all pixels in the scene. Hyperspectral imaging systems offer high measurement accuracy; however, their acquisition time, complexity and cost are generally quite high compared to multispectral systems. This paper focuses mainly on multispectral imaging. Multispectral imaging systems acquire images in a limited number of relatively wide spectral bands. They do not produce the spectrum of an object directly; rather, they use estimation algorithms to obtain spectral functions from the sensor responses. Multispectral imaging systems are still considerably less prone to metamerism [1] and have higher color accuracy, and unlike conventional digital cameras, they are not limited to the visual range; they can also be used in the near infrared, infrared and ultraviolet spectrum [2–5], depending on the sensor responsivity range.
These systems can significantly improve the color accuracy [6–10] and make color reproduction under different illumination environments possible with reasonably good accuracy [11]. Multispectral imaging has a wide range of application domains, such as remote sensing [12], astronomy [13], medical imaging [14], analysis of museological objects [15], cosmetics [16], medicine [17], high-accuracy color printing [18, 19], computer graphics [20] and multimedia [21].

Despite all these benefits and the applicability of multispectral imaging, it is still not widely used. This is because of the limitations of the current state-of-the-art multispectral imaging systems. There are different types of multispectral imaging systems; most of them are filter-based, using additional filters to expand the number of color channels, and our interest in this paper is also in this type. In a typical filter-based imaging system, a set of either traditional optical filters in a filter wheel or a tunable filter [22–24] capable of many different configurations is employed. These multispectral imaging systems acquire images in multiple shots. A sensor used in a multispectral system may be a linear array, as in CRISATEL [25], where the images are acquired by scanning line-by-line. With a matrix sensor (CCD or CMOS), as in a monochrome camera, a whole image scene can be captured at once without the need for scanning [23, 26], but this still requires multiple shots, one channel at a time. A high quality trichromatic digital camera in conjunction with a set of appropriate optical filters makes it possible to acquire unique spectral information [4, 27–32]. This method enables three channels of data to be captured per exposure as opposed to one. With a total of n colored filters, there are 3n + 3 camera responses for each pixel (including responses with no colored filters), correspondingly giving rise to a (3n + 3)-channel multispectral image. This greatly increases the speed of capture and allows the use of technology that is readily and cheaply available. Such systems can be easily used even without much specialized knowledge. Nonetheless, multiple shots are still necessary to acquire a multispectral color image. Several systems have been proposed aiming to circumvent the multi-shot requirement for multispectral image acquisition.

Hashimoto [33] proposed a two-shot 6-band still image capturing system using a commercial digital camera and a custom color filter. The system captures a multispectral image in two shots, one with and one without the filter, thus resulting in a 6-channel output. The filter is custom designed in such a way that it cuts off the left side (short wavelength domain) of the peak of original spectral sensitivity of blue and red, and also cuts off the right side (long-wavelength domain) of the green. The proposed 6-channel system claimed to produce high color accuracy and wider color range. The problem with this system is that it still needs two shots and is, therefore, incapable of capturing scenes in motion.

Ohsawa et al. [34] proposed a one-shot 6-band HDTV camera system. In their system, the light is divided into two optical paths by a half-mirror and is incident on two conventional CCD cameras after transmission through the specially designed interference filters inserted in each optical path. The two HDTV cameras capture three-band images in sync to compose each frame of the six band image. The total spectral sensitivities of the six band camera are the combination of spectral characteristics of the optical components: the objective lens, the half-mirror, the IR cutoff filter, the interference filters, the CCD sensors, etc. This system needs custom designed filters and complex optics making it still far from being practical.

Even though our focus is mainly on filter-based systems, some non-filter-based systems proposed for faster multispectral acquisition are worth mentioning here. Park et al. [35] proposed multispectral imaging using multiplexed LED illumination with computer-controlled switching, and they claimed to produce even multispectral videos of scenes at 30 fps. This is an alternative strategy for multispectral capture, more or less on the same level as using colored filters, although it is not useful in uncontrolled illumination environments. Three-CCD camera-based systems offering 5 or 7 channels from FluxData Inc. [36] are available on the market; however, their high price could be a concern for common use. Langfelder et al. [37] proposed a filter-less and demosaicking-less color sensitive device that uses transverse field detectors or tunable sensitivity sensors. However, this is still in the computational stage at the moment.

In this paper, we have proposed a fast and practical solution to multispectral imaging with the use of a digital stereo camera or a pair of commercial digital cameras joined in a stereoscopic configuration, and a pair of readily available optical filters. As the two cameras are in a stereoscopic configuration, the system allows us to capture 3D stereo images also. This makes the system capable of acquiring both the multispectral and 3D stereo data simultaneously.

The rest of the paper is organized as follows. We first present the proposed system along with its design, optimal filter selection, estimation methods and evaluation. The proposed system has been investigated through computational simulation, and an experimental study has been carried out by investigating the performance of the system constructed. The simulation and experimental works and results are discussed next. Finally, we present the conclusion of the paper.

Proposed multispectral imaging with a stereo camera

Design and model

The multispectral imaging system we propose here is constructed from a stereo camera or two modern digital (RGB) cameras in a stereoscopic configuration, and a pair of appropriate optical filters in front of each camera of the stereo pair. Depending upon the sensitivities of the two cameras, one or two appropriate optical filters are selected from among a set of readily available filters, so that they will modify the sensitivities of one or two cameras to produce six channels (three each contributed from the two cameras) in the visible spectrum so as to give optimal estimation of the scene spectral reflectance and/or the color. The two cameras need not be of same type, instead, any two cameras can be used in a stereoscopic configuration, provided the two are operated in the same resolution. One-shot acquisition can be made possible by using two cameras with a sync controller available in the market. The proposed multispectral system is a faster, cheaper and practical solution, as it is the one-shot acquisition which can be constructed from even commercial digital cameras and readily available filters. Since the two cameras are in a stereoscopic configuration, the system is also capable of acquiring 3D image that provides added value to the system. 3D imaging in itself is an interesting area of study, and could be a large part of the study. This paper, therefore, focuses mainly on multispectral imaging, and 3D imaging has not been considered within its scope. Figure 1 illustrates a multispectral-stereo system constructed from a modern digital stereo camera - Fujifilm FinePix REAL 3D W1 (Fujifilm 3D) and two optical filters in front of the two lenses. We have used this system in our experimental study.

Figure 1

Illustration of a multispectral-stereo system constructed from Fujifilm 3D camera and a pair of filters placed on top of the two lenses.

Selection of the filters can be done computationally using a filter selection method presented below in this section. The two images captured with the stereo camera are registered for the pixel-to-pixel correspondence through an image registration process. As an illustration, a simple registration method has been presented in this paper below. The subsequent combination of the images from the two cameras provides a six channel multispectral image of the acquired scene.

In order to model the proposed multispectral system, let $s_i$ denote the spectral sensitivity of the $i$th channel, $t$ the spectral transmittance of the selected filter, $L$ the spectral power distribution of the light source, and $R$ the spectral reflectance of the surface captured by the camera. As there is always acquisition noise introduced into the camera outputs, let $n$ denote the acquisition noise. The camera response corresponding to the $i$th channel, $C_i$, is then given by the multispectral camera model as

$$C_i = S_i^{T}\,\mathrm{Diag}(L)\,R + n_i ; \quad i = 1, 2, \ldots, K, \tag{1}$$

where $S_i = \mathrm{Diag}(t)\,s_i$, $n_i$ is the channel acquisition noise, and $K$ is the number of channels, which is 6 in our system. For natural and man-made surfaces whose reflectances are more or less smooth, it is recommended to use as few channels as possible [38], and here we study the proposed six-channel system.
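The camera model of Equation 1 can be illustrated with a minimal NumPy sketch. The Gaussian-shaped sensitivities and filter transmittances, the flat illuminant and the function name `camera_responses` are illustrative assumptions, not measured data or code from this work.

```python
import numpy as np

def camera_responses(sens, trans, illum, refl, noise=None):
    """Equation 1 for one RGB camera behind one filter.

    sens  : (N, 3) spectral sensitivities s_i, one column per channel
    trans : (N,)   filter transmittance t
    illum : (N,)   illuminant spectral power distribution L
    refl  : (N, P) reflectances R, one column per surface
    noise : optional (3, P) acquisition noise n_i
    """
    eff = trans[:, None] * sens              # columns are S_i = Diag(t) s_i
    resp = eff.T @ (illum[:, None] * refl)   # S_i^T Diag(L) R for every channel
    return resp if noise is None else resp + noise

# Synthetic stand-in data (assumption): 400-700 nm sampled every 10 nm.
wl = np.arange(400, 701, 10)

def bump(center, width):
    return np.exp(-0.5 * ((wl - center) / width) ** 2)

sens = np.stack([bump(600, 40), bump(540, 40), bump(460, 40)], axis=1)
t_left, t_right = bump(500, 80), bump(620, 80)
illum = np.ones(wl.size)
refl = np.linspace(0.1, 0.9, wl.size)[:, None]

# Stacking the two filtered cameras gives the K = 6 channel response.
c6 = np.vstack([camera_responses(sens, t_left, illum, refl),
                camera_responses(sens, t_right, illum, refl)])
print(c6.shape)
```

Each camera contributes three rows, so stacking the left and right responses yields one six-element response vector per surface.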

Optimal filters selection

The next task is to select an optimal filter pair for the construction of the proposed multispectral system. Several methods have been proposed for the selection of filters, particularly for multi-shot-based multispectral color imaging [26, 39–41]. In our study, as we have to choose just two filters from a set of filters, the exhaustive search method is feasible and a logical choice because of its guaranteed optimal results. For selecting $k$ (here $k = 2$) filters from the given set of $n$ filters, the search requires $P(n, k) = \frac{n!}{(n-k)!}$ permutations. When two cameras of the same type (assuming the same spectral sensitivities) are used, the problem reduces to combinations instead of permutations, i.e., $C(n, k) = \frac{n!}{k!\,(n-k)!}$ combinations. The feasibility of the exhaustive search method thus depends on the number of sample filters. However, in order to extend the usability of this method to a considerably large number of filters, we introduce a secondary criterion which excludes all infeasible filter pairs from the computations. This criterion states that the filter pairs that result in a maximum transmission factor of less than forty percent and less than ten percent of the maximum transmission factor in one or more channels are excluded.

For a given pair of cameras, a pair of optimal filters is selected using this filter selection algorithm and the secondary criterion through simulation, and the performance is then investigated experimentally.
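The exhaustive search with a feasibility pre-check can be sketched as follows. The helper `best_filter_pair`, the toy constant-transmittance filters, the toy cost and the simple peak-transmittance test are all hypothetical stand-ins: in practice the cost would be an estimation error from the simulation, and the feasibility test would implement the secondary criterion stated above.

```python
import itertools
import math
import numpy as np

def best_filter_pair(filters, cost, feasible=lambda a, b: True, same_cameras=False):
    """Return (best_pair, best_cost, n_evaluated) over all filter pairs.

    Ordered pairs (permutations) are used for two different cameras,
    unordered pairs (combinations) when both cameras are identical.
    """
    idx = range(len(filters))
    pairs = (itertools.combinations(idx, 2) if same_cameras
             else itertools.permutations(idx, 2))
    best, best_cost, n_eval = None, np.inf, 0
    for i, j in pairs:
        if not feasible(filters[i], filters[j]):
            continue                      # skip pairs rejected by the pre-check
        n_eval += 1
        c = cost(filters[i], filters[j])
        if c < best_cost:
            best, best_cost = (i, j), c
    return best, best_cost, n_eval

# Toy data (assumption): four flat filters with different transmittance levels.
filters = [np.full(5, v) for v in (0.2, 0.6, 0.05, 0.9)]
toy_feasible = lambda a, b: a.max() >= 0.4 and b.max() >= 0.4
toy_cost = lambda a, b: abs(a.mean() - 0.9) + abs(b.mean() - 0.6)

best, best_cost, n_eval = best_filter_pair(filters, toy_cost, toy_feasible)
print(best, n_eval)
print(math.perm(4, 2), math.comb(4, 2))   # sizes of the unpruned search spaces
```

The pre-check only prunes the search; every surviving pair is still evaluated, so the optimum among the feasible pairs is still found.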

Spectral reflectance estimation and evaluation

The estimated reflectance ($\tilde{R}$) is obtained for the corresponding original reflectance ($R$) from the camera responses for the training and test targets, $C_{(\mathrm{train})}$ and $C$ respectively, using different estimation methods. Training targets are the database of surface reflectance functions from which basis functions are generated, and test targets are used to validate the performance of the device. Many estimation algorithms have been proposed in the literature [28, 30, 42–46]. It is not our primary goal to make a comparative study of different algorithms. However, we have investigated the performance of the proposed system with methods based on three major types of models: linear, polynomial and neural network. These models are described briefly below:

  • Linear Model: A linear-model approach formulates the problem of estimating a spectral reflectance $\tilde{R}$ from the camera responses $C$ as finding a transformation matrix (or reconstruction matrix) $Q$ that reconstructs the spectrum from the $K$ measurements as follows:

    $$\tilde{R} = Q C. \tag{2}$$

    The matrix $Q$ that minimizes a given distance metric $d(R, \tilde{R})$ or that maximizes a given similarity metric $s(R, \tilde{R})$ is determined. The linear regression (LR) method determines $Q$ from the training data set using the pseudo-inverse:

    $$Q = R_{(\mathrm{train})}\, C^{+}. \tag{3}$$

    The pseudo-inverse $C^{+}$ may be difficult to compute, and when the problem is ill-posed, it may not even give any inverse, so it may need to be regularized (see "Regularization" later).

    Several approaches have been proposed [28, 42] that approximate $R$ by a linear combination of a small number of basis functions:

    $$R \approx B w, \tag{4}$$

    where $B$ is a matrix containing the basis functions obtained from the training data set, and $w$ is a weight matrix. Different approaches have been proposed for computing $w$. We present and use the method proposed by Imai and Berns (IB) [28], which was found to be relatively more robust to noise. This method assumes a linear relationship between camera responses and the weights that represent reflectance in a linear model:

    $$w = M C, \tag{5}$$

    where $M$ is the transformation matrix, which can be determined empirically via a least-squares fit as

    $$M = w\, C^{+}. \tag{6}$$

    $w$ is computed from Equation 4 as

    $$w = B^{-1} R_{(\mathrm{train})} = B^{T} R_{(\mathrm{train})}. \tag{7}$$

    The reflectance of the test target is then estimated using

    $$\tilde{R} = B w = B M C_{(\mathrm{test})} = B w\, C_{(\mathrm{train})}^{+}\, C_{(\mathrm{test})} = B B^{T} R_{(\mathrm{train})}\, C_{(\mathrm{train})}^{+}\, C_{(\mathrm{test})}. \tag{8}$$

  • Polynomial Model (PN): With this model, the reflectance $R$ of the characterization data set is directly mapped from the camera responses $C$ through a linear relationship with the $n$th-degree polynomials of the camera responses [45, 47]:

    $$\begin{aligned}
    R(\lambda_1) &= m_{11} C_1 + m_{12} C_2 + m_{13} C_3 + m_{14} C_1 C_2 + \cdots \\
    R(\lambda_2) &= m_{21} C_1 + m_{22} C_2 + m_{23} C_3 + m_{24} C_1 C_2 + \cdots \\
    &\;\;\vdots \\
    R(\lambda_N) &= m_{N1} C_1 + m_{N2} C_2 + m_{N3} C_3 + m_{N4} C_1 C_2 + \cdots
    \end{aligned} \tag{9}$$

    It can be written in matrix form as

    $$R = M C_p, \tag{10}$$

    where $M$ is the matrix formed from the coefficients, and $C_p$ is the polynomial vector/matrix formed from the $n$th-degree polynomials of the camera responses as $(C_1, C_2, C_3, C_1^2, C_1 C_2, C_1 C_3, C_2 C_3, \ldots)^T$. The polynomial degree $n$ is determined through optimization such that the estimation error is minimized. Complete or selected polynomial terms (for example, polynomials without cross terms) could be used depending on the application. The transformation matrix $M$ is determined from the training data set using

    $$M = R\, C_{p(\mathrm{train})}^{+}. \tag{11}$$

    Substituting the computed matrix $M$ in Equation 10, the reflectance of the test target is estimated as

    $$\tilde{R}_{(\mathrm{test})} = R\, C_{p(\mathrm{train})}^{+}\, C_{p(\mathrm{test})}. \tag{12}$$

    Since a non-linear method of mapping camera responses onto reflectance values may over-fit the characterization surfaces, regularization can be applied as described in the subsection below to solve this problem.

  • Neural Network Model (NN): Artificial neural networks simulate the behavior of many simple processing elements present in the human brain, called neurons. Neurons are linked to each other by connections called synapses. Each synapse has a coefficient that represents the strength or weight of the connection. An advantage of neural network models is that they are robust to noise. A robust spectral reconstruction algorithm based on hetero-associative memories linear neural networks proposed by Mansouri [46] has been used.

    The neural network is trained with the training data set using the Delta rule, also known as the Widrow-Hoff rule. The rule continuously modifies the weights $w$ to reduce the difference (the Delta) between the expected output value $e$ and the actual output $o$ of a neuron. This rule changes the connection weights in a way that minimizes the mean squared error between the observed response $o$ and the desired theoretical one:

    $$w_{ij}^{t+1} = w_{ij}^{t} + \eta\,(e_j - o_j)\,x_i = w_{ij}^{t} + \Delta w_{ij}, \tag{13}$$

    where $e$ is the expected response, $x_i$ is the $i$th input, $t$ is the iteration number, and $\eta$ is the learning rate. The weights $w$ thus computed are finally used to estimate the reflectance of the test target using

    $$\tilde{R} = w\, C_{(\mathrm{test})}. \tag{14}$$
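A minimal sketch of Delta-rule (Widrow-Hoff) training for a single linear layer is given below. It is not the hetero-associative network of [46]; the learning rate, epoch count, synthetic data and the name `train_delta_rule` are assumptions chosen so that the toy problem converges.

```python
import numpy as np

def train_delta_rule(C_train, R_train, eta=0.1, epochs=200, seed=0):
    """Train W so that W @ c approximates the reflectance of each sample.

    C_train : (K, P) camera responses, R_train : (N, P) reflectances.
    W[j, i] plays the role of w_ij in the update rule above.
    """
    rng = np.random.default_rng(seed)
    W = np.zeros((R_train.shape[0], C_train.shape[0]))
    for _ in range(epochs):
        for p in rng.permutation(C_train.shape[1]):   # one sample at a time
            x, e = C_train[:, p], R_train[:, p]
            o = W @ x                                  # actual output
            W += eta * np.outer(e - o, x)              # eta * (e_j - o_j) * x_i
    return W

# Noise-free synthetic data (assumption): reflectances are an exact linear
# function of six-channel responses, so the rule should recover W_true.
rng = np.random.default_rng(1)
W_true = rng.uniform(0.0, 1.0, size=(31, 6))
C_train = rng.uniform(0.0, 1.0, size=(6, 50))
R_train = W_true @ C_train
C_test = rng.uniform(0.0, 1.0, size=(6, 10))

W = train_delta_rule(C_train, R_train)
R_est = W @ C_test                                     # estimate for test responses
print(np.max(np.abs(R_est - W_true @ C_test)))
```

With responses scaled to [0, 1] and six channels, a learning rate of 0.1 keeps each update stable; larger inputs would call for a smaller rate.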

In addition to the methods described previously, we have also tested some other methods like Maloney and Wandell, and Least-Squares Wiener; however, they are not included as they are considerably less robust to noise.
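The LR, IB and PN estimators described in the list above can be sketched with NumPy pseudo-inverses. Several details here are assumptions rather than statements from this work: the basis $B$ is taken as the leading left singular vectors of the training reflectances (so that $B^T B = I$), six basis vectors are used, and the polynomial expansion is the second-degree form without cross terms.

```python
import numpy as np

def fit_lr(R_train, C_train):
    """LR: reconstruction matrix Q = R_(train) C^+."""
    return R_train @ np.linalg.pinv(C_train)

def fit_ib(R_train, C_train, n_basis=6):
    """IB: returns B M, so that the estimate is (B M) @ C_test."""
    # Orthonormal basis from the training reflectances (assumed construction).
    B = np.linalg.svd(R_train, full_matrices=False)[0][:, :n_basis]
    w = B.T @ R_train                  # weights of the training reflectances
    M = w @ np.linalg.pinv(C_train)    # least-squares map from responses to weights
    return B @ M

def poly_terms(C):
    """Second-degree expansion without cross terms: rows C_i, then C_i^2."""
    return np.vstack([C, C ** 2])

def fit_pn(R_train, C_train):
    """PN: M = R C_p(train)^+."""
    return R_train @ np.linalg.pinv(poly_terms(C_train))

# Synthetic data (assumption): 31-sample spectra that depend linearly on
# six-channel responses, with 63 training and 20 test samples.
rng = np.random.default_rng(0)
A = rng.uniform(size=(31, 6))
C_train, C_test = rng.uniform(size=(6, 63)), rng.uniform(size=(6, 20))
R_train, R_test = A @ C_train, A @ C_test

R_lr = fit_lr(R_train, C_train) @ C_test
R_ib = fit_ib(R_train, C_train) @ C_test
R_pn = fit_pn(R_train, C_train) @ poly_terms(C_test)
for name, est in (("LR", R_lr), ("IB", R_ib), ("PN", R_pn)):
    print(name, np.max(np.abs(est - R_test)))
```

On this noise-free linear toy problem all three estimators reproduce the test spectra almost exactly; with real, noisy responses they would differ.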

The estimated reflectances are evaluated using spectral as well as colorimetric metrics. Two different metrics, GFC (Goodness of Fit Coefficient) [48] and RMS (Root Mean Square) error, have been used as spectral metrics, and $\Delta E^*_{ab}$ (CIELAB Color Difference) as the colorimetric metric. These metrics are given by the equations:

$$\mathrm{GFC} = \frac{\sum_{i=1}^{n} R(\lambda_i)\,\tilde{R}(\lambda_i)}{\sqrt{\sum_{i=1}^{n} R(\lambda_i)^2}\;\sqrt{\sum_{i=1}^{n} \tilde{R}(\lambda_i)^2}} \tag{15}$$

$$\mathrm{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( R(\lambda_i) - \tilde{R}(\lambda_i) \right)^2} \tag{16}$$

$$\Delta E^*_{ab} = \sqrt{(\Delta L^*)^2 + (\Delta a^*)^2 + (\Delta b^*)^2} \tag{17}$$

The GFC ranges from 0 to 1, with 1 corresponding to perfect estimation. The RMS and $\Delta E^*_{ab}$ take values from 0 upwards, with 0 corresponding to perfect estimation.
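The three metrics translate directly into a few lines of NumPy. The function names are illustrative, and `delta_e_ab` assumes that CIELAB coordinates have already been computed from the spectra; that conversion is not shown.

```python
import numpy as np

def gfc(r, r_est):
    """Goodness of Fit Coefficient between a measured and an estimated spectrum."""
    r, r_est = np.asarray(r, float), np.asarray(r_est, float)
    return np.sum(r * r_est) / (np.sqrt(np.sum(r ** 2)) * np.sqrt(np.sum(r_est ** 2)))

def rms(r, r_est):
    """Root mean square error between two spectra."""
    r, r_est = np.asarray(r, float), np.asarray(r_est, float)
    return np.sqrt(np.mean((r - r_est) ** 2))

def delta_e_ab(lab1, lab2):
    """CIELAB color difference between two (L*, a*, b*) triplets."""
    d = np.asarray(lab1, float) - np.asarray(lab2, float)
    return np.sqrt(np.sum(d ** 2))

r = np.linspace(0.1, 0.9, 31)
print(gfc(r, 2 * r), rms(r, r), delta_e_ab((50, 0, 0), (53, 4, 0)))
```

Scaling an estimate by a constant leaves the GFC unchanged, so it measures spectral shape only, which is one reason to report the RMS error alongside it.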


Regularization

Regularization introduces additional information into an inverse problem in order to solve an ill-posed problem or to prevent over-fitting. Non-linear methods of mapping camera responses onto reflectance values have the potential to over-fit the characterization surfaces. Over-fitting occurs when the number of parameters in the model is greater than the number of dimensions of variation in the data. Among the many regularization methods, Tikhonov regularization is the most commonly used; it obtains a regularized solution to $Ax = b$ by choosing $x$ to fit the data $b$ in the least-squares sense while penalizing solutions of large norm [49, 50]. The solution is then given by the minimization problem:

$$x_\alpha = \arg\min_x \left\{ \| A x - b \|^2 + \alpha \| x \|^2 \right\} = (A^T A + \alpha I)^{-1} A^T b, \tag{18}$$

where $\alpha > 0$ is called the regularization parameter, whose optimal value is determined through optimization for minimum estimation errors.
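The closed-form Tikhonov solution can be sketched as below. The function `regularized_q` applies the same idea to a reconstruction matrix, using the standard ridge form $Q = R\,C^T (C C^T + \alpha I)^{-1}$ in place of the plain pseudo-inverse; that form, the function names and the random test data are assumptions for illustration.

```python
import numpy as np

def tikhonov(A, b, alpha):
    """Solve min ||A x - b||^2 + alpha ||x||^2 via (A^T A + alpha I)^-1 A^T b."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ b)

def regularized_q(R_train, C_train, alpha):
    """Ridge-regularized reconstruction matrix Q = R C^T (C C^T + alpha I)^-1."""
    k = C_train.shape[0]
    G = C_train @ C_train.T + alpha * np.eye(k)       # symmetric K x K matrix
    return np.linalg.solve(G, C_train @ R_train.T).T

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
x_small = tikhonov(A, b, 1e-12)    # almost the ordinary least-squares solution
x_large = tikhonov(A, b, 1.0)      # shrunk towards zero
print(np.linalg.norm(x_small), np.linalg.norm(x_large))
```

Increasing $\alpha$ trades data fit for a smaller solution norm, which is why its value is tuned against the estimation error.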


Registration

In order to have accurate estimation of spectral reflectance and/or color in each pixel of a scene, it is very important for the two images to have accurate pixel-to-pixel correspondence. In other words, the two images must be properly aligned. However, the stereo images captured from the stereo camera are not aligned. We, therefore, need to align the two images from the stereo pair, a process known as image registration. Different techniques could be used for the registration of the stereo images. One technique could be the use of a stereo-matching algorithm [51–54]. Here, we go for a simple manual approach [55]. In this method, we select some (at least 8) corresponding points in the two images as control points, considering the left image as the base/reference image and the right image as the unregistered image. Based on the selected control points, an appropriate transformation that properly aligns the unregistered image with the base image is determined. The unregistered image is then registered using this transformation. Irrespective of the registration method, the problem of occlusion might occur in the stereo images due to the geometrical separation of the two lenses of the stereo camera. As we use the central portion of the large patches, this simple registration method works well for our purpose. However, we should note that correct registration is very important for accurate reflectance estimation. If there is misregistration leading to incorrect correspondence in the two images, this may lead to wide deviation in the reflectance estimation, especially in and around the edges where the image difference could be significantly large.
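Control-point registration can be illustrated by fitting a transformation to the selected point pairs in the least-squares sense. The text above does not specify the transformation type; the sketch below assumes an affine model, and the names `fit_affine` and `apply_affine` as well as the synthetic control points are hypothetical. Resampling the right image with the fitted transform is not shown.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine transform mapping src points onto dst points.

    src : (n, 2) control points in the unregistered (right) image
    dst : (n, 2) corresponding points in the base (left) image
    """
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    X = np.hstack([src, np.ones((len(src), 1))])   # homogeneous coordinates
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)    # solves X @ A ~ dst
    return A.T

def apply_affine(T, pts):
    """Map (n, 2) points with a 2x3 affine transform."""
    pts = np.asarray(pts, float)
    return pts @ T[:, :2].T + T[:, 2]

# Synthetic example (assumption): 8 control points related by a known
# small rotation/scale plus a horizontal shift.
rng = np.random.default_rng(0)
T_true = np.array([[1.01, 0.02, -35.0],
                   [-0.015, 0.99, 4.0]])
right_pts = rng.uniform(0, 1000, size=(8, 2))
left_pts = apply_affine(T_true, right_pts)

T_est = fit_affine(right_pts, left_pts)
print(np.round(T_est, 3))
```

Using more control points than the minimum required by the model averages out small errors in manually selected points.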


The proposed multispectral system has been investigated first with simulation and then validated experimentally. This section presents the simulation and experimental setups and results obtained.

Simulation setup

Simulation has been carried out with different stereo camera pairs whose spectral sensitivities are known or measured. The simulation takes a pair of filters one at a time, computes the camera responses using Equation 1, obtains the estimated spectral reflectance using four different spectral estimation methods and evaluates the estimation errors (spectral and colorimetric) as discussed previously. Similarly, the spectral reflectances are also estimated with 3-channel systems, where one camera (left or right) from the stereo is used.

As there is always acquisition noise introduced into the camera outputs, simulated random shot noise and quantization noise are introduced in order to make the simulation more realistic. Recent measurements of noise levels in a trichromatic camera suggest that realistic levels of shot noise are between 1 and 2% [56]. Therefore, 2% Gaussian noise is introduced as random shot noise in the simulation, and 12-bit quantization noise is incorporated by directly quantizing the simulated responses after the application of the shot noise.
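The noise step can be sketched as follows. Two points are assumptions, since the text does not fix them: the 2% noise is modeled as zero-mean Gaussian noise with a standard deviation of 2% of each response, and the responses are assumed to be normalized to [0, 1] before 12-bit quantization. The function name is illustrative.

```python
import numpy as np

def add_noise_and_quantize(resp, shot=0.02, bits=12, rng=None):
    """Apply signal-proportional Gaussian noise, then uniform quantization."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = resp * (1.0 + shot * rng.standard_normal(resp.shape))
    noisy = np.clip(noisy, 0.0, 1.0)               # stay within the sensor range
    levels = 2 ** bits - 1                         # 4095 steps for 12 bits
    return np.round(noisy * levels) / levels

rng = np.random.default_rng(0)
resp = rng.uniform(0.05, 0.95, size=(6, 122))      # toy six-channel responses
out = add_noise_and_quantize(resp, rng=rng)
clean = add_noise_and_quantize(resp, shot=0.0, rng=rng)
print(np.max(np.abs(clean - resp)))                # quantization error only
```

Quantizing after the shot noise, as described above, means the quantization step acts on the noisy signal rather than on the ideal one.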

The simulation study has been conducted with a pair of Nikon D70 cameras, Nikon D70 and Canon 20D pair, and Fujifilm 3D stereo camera. Previously measured spectral sensitivities of the Nikon D70 and Canon 20D cameras are used, and those of the Fujifilm 3D camera are measured using Bentham TMc300 monochromator. Figure 2 shows these spectral sensitivities. Two hundred and sixty-five optical filters of three different types: exciter, dichroic, and emitter from Omega are used. Transmittances of the filters available in the company web site [57] have been used in the simulation. Rather than mixing filters from different vendors, one vendor has been chosen as a one-point solution for the filters, and the Omega has been chosen as they have a large selection of filters, and data are available online. Sixty-three patches of the Gretag Macbeth Color Checker DC have been used as the training target; and one hundred and twenty-two patches remained after omitting the outer surrounding achromatic patches, multiple white patches at the center, and the glossy patches in the S-column of the DC chart have been used as the test target. The training patches have been selected using linear distance minimization method (LDMM) proposed by Pellegri et al. [58]. A color whose associated system output vector has maximum norm among all the target colors is selected first. The method then chooses the colors of the training set iteratively based on their distances from those already chosen; the maximum absolute difference is used as the distance metric.

Figure 2

Normalized spectral sensitivities of the cameras: a Nikon D70 (solid) and Canon 20D (dotted). b Fujifilm 3D (Left - solid, Right - dotted).

The same spectral power distribution of the illuminant and the reflectances of the color checkers measured and used in the experiment later are used in the simulation. The spectral reflectances are estimated using the four estimation methods described previously: LR, IB, PN and NN. The type and the degree of the polynomials in the PN method are determined through optimization for minimum estimation errors, and we found that second-degree polynomials without cross-terms produce the best results. The estimated reflectances are evaluated using the three evaluation metrics described previously: GFC, RMS and $\Delta E^*_{ab}$. CIE 1964 10° color matching functions are used for color computation; this is the logical choice because each color checker patch subtends more than 2° from the lens position. The best pair of filters is exhaustively searched as discussed in the Optimal Filters Selection section, according to each of the evaluation metrics, from among all available filters with which the multispectral system can optimally estimate the reflectances of the 122 test target patches. The results corresponding to the minimum mean of the evaluation metrics are obtained. To speed up the process, the filter combinations not fulfilling the criterion described in the same section are skipped. The 265 filters lead to more than 70,000 possible permutations (for two different cameras). The criterion introduced reduces the processing down to fewer than 20,000 permutations.

Simulation results

The simulation selects optimal pairs of filters from among the 265 filters for the three camera setups depending on the estimation methods and the evaluation metrics. Table 1 shows these selected filters along with the statistics (maximum/minimum, mean and standard deviation) of estimation errors in all the cases for both the 6-channel and the 3-channel systems. These filters selected by the simulation are considered optimal and used as the basis for selecting the filters to be used in the construction of the proposed multispectral system in the experiments. The NikonD70, the Canon20D and the left camera of the Fujifilm 3D are used for the simulation of the 3-channel systems.

Table 1 Statistics of estimation errors produced by the simulated systems

In the simulation of the NikonD70-NikonD70 camera system, the IB and the LR methods selected the filter pair (XF2077-XF2021), the PN selected the filter pair (XF2021-XF2203), and the NN picked the filter pair (XF2009-XF2021) for the maximum GFC, with the average mean value of 0.998. For the minimum RMS, the IB, the LR and the NN selected the filter pair (XF2009-XF2021), while the PN selected the filter pair (XF2010-XF2021) with the average mean value of 0.013. All four methods selected the filter pair (XF2014-XF2030) for the minimum $\Delta E^*_{ab}$ with the average mean error value of 0.387. The average mean values of GFC, RMS and $\Delta E^*_{ab}$ from all four methods (IB, LR, PN and NN) for the 3-channel system (NikonD70) are 0.989, 0.033 and 2.374, respectively.

With the NikonD70-Canon20D camera system, the IB and the LR selected the filter pair (XF2010-XF2021), and the PN and the NN selected the filter pair (XF2009-XF2021) for the maximum GFC, with the average mean value of 0.998. For the minimum RMS, the IB, the LR and the NN picked the filter pair (XF2009-XF2021), while the PN selected the filter pair (XF2203-XF2021) with the average mean value of 0.013. Similarly, the IB and the NN selected the filter pair (XF2021-XF2012), and the LR and the PN picked the filter pair (XF2040-XF2012) for the minimum $\Delta E^*_{ab}$ with the average value of 0.403. The average values of GFC, RMS and $\Delta E^*_{ab}$ from all four methods for the 3-channel system (Canon20D) are 0.99, 0.031 and 3.944, respectively.

Similarly, with the Fujifilm 3D camera system, the IB, the LR and the PN selected the filter pair (XF2026-XF1026), and the NN selected the filter pair (XF2021-XF2203) for the maximum GFC, with the average mean value of 0.998. For the minimum RMS, the IB, the LR and the PN picked the filter pair (XF2058-XF2021), while the NN picked the filter pair (XF2203-XF2021) with the average mean value of 0.013. For the minimum $\Delta E^*_{ab}$, the IB and the LR selected the filter pair (XF2021-XF2012), and the PN and the NN selected the filter pair (XF2021-XF2030) with the average mean value of 0.448. The average values of GFC, RMS and $\Delta E^*_{ab}$ from all four methods for the 3-channel system (left camera) are 0.99, 0.031 and 3.522, respectively.

Now, we would like to illustrate the filters and the resulting 6-channel sensitivities of the simulated multispectral imaging systems. As we have seen, for a given camera system, different methods selected different filter pairs depending on the estimation method and the evaluation metric. However, the shapes of the filter pairs and the resulting effective channel sensitivities are very much similar. Therefore, in order to avoid excessive number of figures, instead of showing figures for all cases, we are giving the figures for the Fujifilm 3D camera system as illustrations, as our experiments have been performed with this system along with the filter pair (XF2021-XF2030) selected by the neural network method for minimum color error. Figure 3a shows the transmittances of this filter pair, and Figure 3b shows the resulting 6-channel normalized effective spectral sensitivities of the multispectral system. Figure 4 shows the estimated spectral reflectances with this system along with the measured reflectances of randomly picked 9 patches from among the 122 test patches selected as described previously in the Simulation Setup section. The patch numbers are given below the graphs. Figure 5 shows the estimated spectral reflectances obtained with the 3-channel system for the same 9 test patches, also along with the measured reflectance.

Figure 3

a An optimal pair of filters selected for the Fujifilm 3D camera system by the neural network method for the minimum $\Delta E^*_{ab}$, and b the resulting 6-channel normalized sensitivities.

Figure 4

Estimated and measured spectral reflectances of 9 randomly picked test patches obtained with the simulated 6-channel multispectral system.

Figure 5

Estimated and measured spectral reflectances of the 9 test patches obtained with the simulated 3-channel system.

Experimental setup

We have conducted experiments with the multispectral system constructed from the Fujifilm 3D stereo camera and the filter pair (XF2021-XF2030), selected as optimal in the simulation described previously by the neural network estimation method for the minimal $\Delta E^*_{ab}$. The optimal filters selected by the simulation have been considered as the basis for choosing the filters for the experiment. As we have already seen, different estimation algorithms pick different filter pairs, which also depend on the evaluation metrics. However, the shapes of the filter pairs selected and the resulting 6-channel sensitivities look very similar. The results from all four methods and the three metrics are also quite similar, as can be seen in Table 1. The results also show that minimizing $\Delta E^*_{ab}$ produces more or less similar mean GFC and RMS values with all four methods for all three camera setups. We, therefore, decided to go for the filter pair (XF2021-XF2030) that produced the minimum $\Delta E^*_{ab}$ by the neural network method. The multispectral camera system has been built by placing the XF2021 filter in front of the left lens and the XF2030 filter in front of the right lens of the camera. Throughout the whole experiment, the camera has been set to a fixed configuration (mode: manual, flash: off, ISO: 100, exposure time: 1/60 s, aperture: F3.7, white balance: fine, 3D file format: MPO, image size: 3648 × 2736). The left camera has been used for the 3-channel system.

The spectral sensitivities of the Fujifilm 3D camera were measured using the Bentham TMc300 monochromator, and the monochromatic light was measured with the calibrated photodiode provided with the monochromator. The spectral power distribution of the light source (Daylight D50 simulator, Gretag Macbeth SpectraLight III) under which the experiments were carried out was measured with the Minolta CS-1000 spectroradiometer. The transmittances of the filters were also measured with the spectroradiometer. Figure 6 shows the measured transmittances of the filter pair (XF2021-XF2030). The measured filter shapes differ somewhat from those used in the simulation, which were based on the transmittance data provided by the manufacturer (see Figure 3a).

Figure 6

Measured transmittances of the pair of filters used in the experiment.

To investigate the performance of the system, as in the simulation, the same 63 patches of the Gretag Macbeth Color Checker DC were used as the training target and 122 patches were used as the test target. The spectral reflectances of the color chart patches were measured with the X-Rite Eye One Pro spectrophotometer. Both the left and the right cameras were corrected for linearity, DC noise and non-uniformity.

The system then acquired images of the color chart. To reduce random error, each acquisition was repeated 10 times, and the averages of these 10 acquisitions are used in the analysis. The images from the left and the right cameras are registered using the method discussed earlier, and the 3-channel and the 6-channel responses for each patch are obtained by channel-wise averaging over a central area of a certain size within the patch. The camera responses thus obtained are then used for spectral estimation with the same four estimation methods, and the spectral and colorimetric estimation errors are evaluated as in the simulation.
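The averaging steps described above can be sketched as follows in Python with NumPy. This is only an illustration under stated assumptions: the function names, the square window, and the synthetic image data are hypothetical, and the left and right images are assumed to be already registered.

```python
import numpy as np

def average_acquisitions(images):
    """Average repeated captures of the same scene; images has shape (K, H, W, C)."""
    return np.mean(np.asarray(images, dtype=np.float64), axis=0)

def patch_response(image, center, half_size):
    """Channel-wise mean over a square window centred on a patch.

    image: (H, W, C) array; center: (row, col); half_size: half-width in pixels.
    Assumes the whole window lies inside the image.
    """
    r, c = center
    window = image[r - half_size:r + half_size + 1,
                   c - half_size:c + half_size + 1, :]
    return window.reshape(-1, image.shape[2]).mean(axis=0)

def six_channel_response(left_img, right_img, center, half_size):
    """Concatenate left and right RGB means (images assumed already registered)."""
    return np.concatenate([patch_response(left_img, center, half_size),
                           patch_response(right_img, center, half_size)])

# Synthetic stand-in data: 10 noisy captures of a uniform patch per camera.
rng = np.random.default_rng(0)
shots_left = 0.4 + 0.01 * rng.standard_normal((10, 20, 20, 3))
shots_right = 0.6 + 0.01 * rng.standard_normal((10, 20, 20, 3))

resp = six_channel_response(average_acquisitions(shots_left),
                            average_acquisitions(shots_right),
                            center=(10, 10), half_size=3)
print(resp.shape)  # (6,)
```

The first three entries of `resp` correspond to the 3-channel response of the left camera, and all six entries form the 6-channel response for that patch.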

Experimental results

The statistics of the estimation errors obtained from the experiment with both the 6-channel and the 3-channel systems, for all four estimation methods and all three evaluation metrics, are given in Table 2. We can see that all four methods produce very similar results. For instance, the NN method produces mean GFC, RMS and ΔE*ab values of 0.992, 0.036 and 4.854, respectively, with the 6-channel system. The corresponding mean values produced with the NN method for the 3-channel system are 0.988, 0.063 and 9.126, respectively.
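For reference, the three metrics can be computed as in the Python sketch below. It uses the commonly cited definitions (GFC as the normalized inner product of the two spectra, RMS as the root mean square difference, and ΔE*ab as the Euclidean distance in CIELAB), which are assumed here to match the ones used in this work. The function names and sample values are illustrative, and the conversion from reflectance to CIELAB is omitted.

```python
import numpy as np

def gfc(measured, estimated):
    """Goodness-of-fit coefficient: |<r, r_hat>| / (||r|| ||r_hat||); 1 is a perfect fit."""
    r = np.asarray(measured, dtype=float)
    r_hat = np.asarray(estimated, dtype=float)
    return abs(r @ r_hat) / (np.linalg.norm(r) * np.linalg.norm(r_hat))

def rms_error(measured, estimated):
    """Root mean square difference between two sampled spectra."""
    diff = np.asarray(measured, dtype=float) - np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def delta_e_ab(lab1, lab2):
    """CIE 1976 color difference between two (L*, a*, b*) triplets."""
    return float(np.linalg.norm(np.asarray(lab1, float) - np.asarray(lab2, float)))

# Illustrative values only: a linear ramp and a copy offset by 0.01.
r = np.linspace(0.2, 0.8, 31)
r_hat = r + 0.01

print(gfc(r, r_hat))                          # slightly below 1
print(rms_error(r, r_hat))                    # 0.01 up to rounding
print(delta_e_ab([50, 10, 10], [50, 13, 14])) # 5.0
```

Note that GFC is insensitive to a pure scaling of the estimate, which is one reason it is usually reported together with RMS.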

Table 2 Statistics of estimation errors produced by the experimental system

To illustrate the results graphically, the estimated spectral reflectances of the same 9 test patches used in the simulation above, along with the measured reflectances, are shown in Figure 7. Similarly, Figure 8 shows the estimated and measured reflectances of the same patches obtained with the 3-channel system.

Figure 7

Estimated and measured spectral reflectances of the 9 test patches obtained with the experimental 6-channel multispectral system.

Figure 8

Estimated and measured spectral reflectances of the 9 test patches obtained with the experimental 3-channel system.

Discussion on the results

We have investigated the proposed multispectral system with both simulation and real experiments. The simulation determines the optimal pair of filters from among 265 filters for a given camera setup. The results show that the selected optimal filter pair depends on the evaluation metric used (GFC, RMS or ΔE*ab). This is expected, since colorimetric optimization does not necessarily optimize spectrally and vice versa: more than one spectrum can produce the same color, a phenomenon known as metamerism. For a given camera setup and a selected metric, most of the estimation methods selected the same pair of filters. Even where some methods selected different pairs, we find that the filters are very similar in type and shape, and hence all four methods produce similar performance. The results also show similar performance in terms of both spectral metrics, GFC and RMS.
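Metamerism can be demonstrated numerically with a small sketch: any perturbation lying in the null space of a 3-channel sensitivity matrix changes the spectrum without changing the 3-channel response. The Python example below uses hypothetical Gaussian channels (not real color matching functions or camera sensitivities) and an arbitrary sinusoidal perturbation, and it ignores the illuminant for simplicity.

```python
import numpy as np

wl = np.arange(400, 701, 10)  # 31 samples, 400-700 nm (assumed sampling)

def gauss(mu, sigma):
    """Gaussian curve over the wavelength grid (illustrative only)."""
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)

# Hypothetical 3-channel observer, one channel per row.
A = np.stack([gauss(600, 40), gauss(540, 40), gauss(460, 40)])

r1 = np.full(wl.size, 0.5)                    # flat gray reflectance
v = np.sin(2 * np.pi * (wl - 400) / 100.0)    # arbitrary spectral perturbation

# Remove the part of v that the three channels can see; the remainder lies in
# the null space of A and produces zero response.
black = v - np.linalg.pinv(A) @ (A @ v)
r2 = r1 + 0.2 * black / np.abs(black).max()   # stays within [0.3, 0.7]

print(np.allclose(A @ r1, A @ r2))  # True: same 3-channel response
print(np.abs(r1 - r2).max())        # clearly different spectra
```

Adding more channels shrinks this null space, which is consistent with the observation that a 6-channel system is less prone to metamerism than a 3-channel one.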

The simulation results show that the proposed 6-channel multispectral system outperforms classical 3-channel camera systems, both spectrally and colorimetrically. The improvements are significant: for instance, in the case of the Fujifilm 3D with the neural network method, the mean GFC increases from 0.99 to 0.998, the RMS error decreases from 0.031 to 0.014, and ΔE*ab decreases from 3.499 to 0.4. The results are similar with the other camera systems and estimation methods. Note that the improvement depends strictly on the choice of filters; badly chosen filters may yield a system that fails to perform better. The spectral reflectances estimated with the 6-channel system, as can be seen in Figure 4, are significantly closer to the measured ones than those estimated with the 3-channel system shown in Figure 5. The simulation results are thus promising, and they clearly indicate that the proposed system, built with two RGB cameras or a stereo camera and a pair of appropriate filters, can function well as a multispectral system.

Encouraged by the promising simulation results, we performed real experiments for validation. As explained previously, the experiments were carried out with the multispectral system built from the Fujifilm 3D camera and the optimal filter pair (XF2021-XF2030) selected by the simulation for minimum ΔE*ab with the neural network method. The experimental results also show that the proposed 6-channel multispectral system consistently performs better than the 3-channel system, both spectrally and colorimetrically, in terms of mean metric values. As in the simulation, all four estimation methods produced better results for all three metrics with the 6-channel multispectral system than with the 3-channel system. For instance, with the neural network method, GFC increases from 0.988 to 0.992, RMS decreases from 0.063 to 0.036, and ΔE*ab decreases from 9.126 to 4.854. All other estimation methods produced similar results. The minimum ΔE*ab value of 4.733, obtained with the PN method, is still quite high and considerably higher than the simulation result. One reason could be the limited noise model in the simulation, which included only random shot noise and quantization noise, whereas many other noise sources come into play in real cameras. We investigated the influence of noise on performance in the simulation with the Fujifilm 3D camera and found that ΔE*ab increases almost linearly as the percentage of shot noise increases from 0 to 20%. In addition, as noted above, the measured filter transmittances differ somewhat from the ones used in the simulation. To see the resulting change in performance, we repeated the simulation with the measured filter transmittances, which gave a ΔE*ab of 1.428 with the same neural network method that produced the minimum value of 0.4 in the previous simulation. This also explains part of the higher values in the experimental results. We should note here that the performance of the system depends strongly on the filters and on accurate knowledge of their transmittances. Moreover, the Fujifilm 3D camera we used offers limited control: it has no manual focus and does not support raw data, and it applies its own white balancing and interpolation algorithms. Even though we used a fixed camera setting during the whole experiment, including the characterization and all image acquisitions, the acquired images are still subject to built-in preprocessing and optical changes. This might also have influenced the results, leading to higher estimation errors. We believe that the performance can be improved with a more controllable camera.
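The effect of noise on estimation error can be explored with a toy simulation such as the Python sketch below. Everything in it is an assumption made for illustration: the Gaussian 6-channel sensitivities, the synthetic reflectances, the modeling of shot noise as zero-mean Gaussian noise proportional to the signal, the 12-bit quantization, and the plain least-squares linear estimator. It also reports RMS error rather than ΔE*ab, so it is not a reproduction of the simulation discussed above.

```python
import numpy as np

rng = np.random.default_rng(0)
wl = np.arange(400, 701, 10)  # 31 samples, 400-700 nm (assumed sampling)

def gauss(mu, sigma):
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)

# Hypothetical 6-channel sensitivities (columns), scaled so responses stay <= 1.
S = np.stack([gauss(mu, 30) for mu in (440, 480, 520, 560, 600, 640)], axis=1)
S /= S.sum(axis=0).max()

def random_reflectances(n):
    """Smooth synthetic reflectances in [0, 1]: random mixtures of broad Gaussians."""
    basis = np.stack([gauss(mu, 60) for mu in np.linspace(400, 700, 8)])
    return np.clip(rng.uniform(0, 1, (n, 8)) @ basis / 3.0, 0, 1)

def camera(refl, noise_pct, bits=12):
    """Clean response, plus signal-proportional Gaussian noise, then quantization."""
    clean = refl @ S
    noisy = clean * (1 + noise_pct / 100.0 * rng.standard_normal(clean.shape))
    max_code = 2 ** bits - 1
    return np.round(np.clip(noisy, 0, 1) * max_code) / max_code

train, test = random_reflectances(63), random_reflectances(122)

def rms_at(noise_pct):
    """Train a linear estimator on noisy training responses and test it."""
    W = np.linalg.pinv(camera(train, noise_pct)) @ train   # (6, 31) least-squares map
    est = camera(test, noise_pct) @ W
    return float(np.sqrt(np.mean((est - test) ** 2)))

noise_levels = [0, 5, 10, 20]
errors = [rms_at(p) for p in noise_levels]
for p, e in zip(noise_levels, errors):
    print(f"{p:>2}% noise: RMS {e:.4f}")
```

Replacing the synthetic inputs with measured sensitivities, filter transmittances and chart reflectances, and the RMS with a colorimetric metric, would bring such a sketch closer to the analysis described in the text.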


In this paper, we have proposed a one-shot multispectral imaging system built with a stereo camera. The proposed system is simple to construct from commercial off-the-shelf digital cameras and a pair of filters selected from readily available filters on the market. The system could therefore be a fast, practical and cheaper solution to multispectral imaging, useful in a variety of applications. Both the simulation and the experimental results show that the proposed 6-channel multispectral system performs significantly better than traditional 3-channel cameras, both spectrally and colorimetrically. Moreover, the stereo configuration allows stereo 3D images to be acquired simultaneously with the multispectral image, which could be an interesting direction for further work.




The authors would like to thank Omega Optical, Inc. for providing the optical filters for this study.

Author information



Corresponding author

Correspondence to Raju Shrestha.

Additional information

Competing interests

The authors declare that they have no competing interests.


Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Shrestha, R., Mansouri, A. & Hardeberg, J.Y. Multispectral imaging using a stereo camera: concept, design and assessment. EURASIP J. Adv. Signal Process. 2011, 57 (2011).



  • Spectral Sensitivity
  • Spectral Reflectance
  • Multispectral Imaging
  • Stereo Camera
  • Neural Network Method