design and assessment

This paper proposes a one-shot six-channel multispectral color image acquisition system using a stereo camera and a pair of optical filters. The best pair of filters is selected from among readily available filters such that it modifies the sensitivities of the two cameras to give optimal estimation of spectral reflectance and/or color, and the two filters are placed in front of the two lenses of the stereo camera. The two images acquired from the stereo camera are then registered for pixel-to-pixel correspondence. The spectral reflectance and/or color at each pixel of the scene are estimated from the corresponding camera outputs in the two images. Both simulations and experiments have shown that the proposed system performs well both spectrally and colorimetrically. Since it acquires the multispectral images in one shot, the proposed system can overcome the slow and complex acquisition process and the high cost of state-of-the-art multispectral imaging systems, opening the way to widespread applications.


Introduction
With the development and advancement of digital cameras, the acquisition and use of digital images have increased tremendously. Conventional image acquisition systems, which capture images in three color channels, usually red, green and blue, are by far the most commonly used imaging systems. However, they suffer from several limitations: they provide only color images, suffer from metamerism, are limited to the visual range, and produce images that are environment dependent. Spectral imaging addresses these problems. Spectral imaging systems capture image data at specific wavelengths across the electromagnetic spectrum. Based on the number of bands, spectral imaging systems can be divided into two major types: multispectral and hyperspectral. There is no sharp line separating the two; however, spectral imaging systems with more than 10 bands are generally considered hyperspectral, whereas those with fewer than 10 are considered multispectral. Hyperspectral imaging deals with imaging narrow spectral bands over a contiguous spectral range and produces the spectra of all pixels in the scene. Hyperspectral imaging systems offer high measurement accuracy; however, their acquisition time, complexity and cost are generally quite high compared to multispectral systems. This paper is mainly focused on multispectral imaging. Multispectral imaging systems acquire images in a limited number of relatively wide spectral bands. They do not produce the spectrum of an object directly; rather, they use estimation algorithms to obtain spectral functions from the sensor responses. Multispectral imaging systems are still considerably less prone to metamerism [1] and have higher color accuracy, and unlike conventional digital cameras, they are not limited to the visual range: they can also be used in the near infrared, infrared and ultraviolet spectrum [2][3][4][5], depending on the sensor responsivity range.
These systems can significantly improve the color accuracy [6][7][8][9][10] and make color reproduction under different illumination environments possible with reasonably good accuracy [11]. Multispectral imaging has wider application domains, such as remote sensing [12], astronomy [13], medical imaging [14], analysis of museological objects [15], cosmetics [16], medicine [17], high-accuracy color printing [18,19], computer graphics [20] and multimedia [21].
Despite all these benefits and the applicability of multispectral imaging, its use is still not widespread. This is because of the limitations of current state-of-the-art multispectral imaging systems. There are different types of multispectral imaging systems; most of them are filter-based, using additional filters to expand the number of color channels, and our interest in this paper is also in this type. In a typical filter-based imaging system, either a set of traditional optical filters in a filter wheel or a tunable filter [22][23][24] capable of many different configurations is employed. These multispectral imaging systems acquire images in multiple shots. The sensor used in a multispectral system may be a linear array, as in CRISATEL [25], where the images are acquired by scanning line by line. With a matrix sensor (CCD or CMOS), as in a monochrome camera, a whole image scene can be captured at once without the need for scanning [23,26], but this still requires multiple shots, one channel at a time. A high-quality trichromatic digital camera in conjunction with a set of appropriate optical filters makes it possible to acquire unique spectral information [4][27][28][29][30][31][32]. This method enables three channels of data to be captured per exposure as opposed to one. With a total of n colored filters, there are 3n + 3 camera responses for each pixel (including the responses with no colored filter), giving rise to a (3n + 3)-channel multispectral image. This greatly increases the speed of capture and allows the use of technology that is readily and cheaply available. Such systems can be used easily even without much specialized knowledge. Nonetheless, multiple shots are still necessary to acquire a multispectral color image. Several systems have been proposed that aim to circumvent the multi-shot requirement for multispectral image acquisition.
Hashimoto [33] proposed a two-shot 6-band still image capturing system using a commercial digital camera and a custom color filter. The system captures a multispectral image in two shots, one with and one without the filter, thus resulting in a 6-channel output. The filter is custom designed in such a way that it cuts off the left side (short-wavelength domain) of the peak of the original spectral sensitivity of blue and red, and also cuts off the right side (long-wavelength domain) of the green. The proposed 6-channel system is claimed to produce high color accuracy and a wider color range. The problem with this system is that it still needs two shots and is, therefore, incapable of capturing scenes in motion.
Ohsawa et al. [34] proposed a one-shot 6-band HDTV camera system. In their system, the light is divided into two optical paths by a half-mirror and is incident on two conventional CCD cameras after transmission through specially designed interference filters inserted in each optical path. The two HDTV cameras capture three-band images in sync to compose each frame of the six-band image. The total spectral sensitivities of the six-band camera are the combination of the spectral characteristics of the optical components: the objective lens, the half-mirror, the IR cutoff filter, the interference filters, the CCD sensors, etc. This system needs custom-designed filters and complex optics, making it still far from practical.
Even though our focus is mainly on filter-based systems, some non-filter-based systems proposed for faster multispectral acquisition are worth mentioning here. Park et al. [35] proposed multispectral imaging using multiplexed LED illumination with computer-controlled switching, and they claimed to produce even multispectral videos of scenes at 30 fps. This is an alternative strategy for multispectral capture, more or less on the same level as using colored filters, although it is not useful in uncontrolled illumination environments. Three-CCD camera-based systems offering 5 or 7 channels from FluxData Inc. [36] are available on the market. However, their high price could be a concern for common use. Langfelder et al. [37] proposed a filter-less and demosaicking-less color-sensitive device that uses transverse field detectors or tunable sensitivity sensors. However, this is still at the computational stage at the moment.
In this paper, we propose a fast and practical solution to multispectral imaging with the use of a digital stereo camera, or a pair of commercial digital cameras joined in a stereoscopic configuration, and a pair of readily available optical filters. As the two cameras are in a stereoscopic configuration, the system also allows us to capture 3D stereo images. This makes the system capable of acquiring both the multispectral and the 3D stereo data simultaneously.
The rest of the paper is organized as follows. We first present the proposed system along with its design, optimal filter selection, estimation methods and evaluation. The proposed system has been investigated through computational simulation, and an experimental study has been carried out by investigating the performance of the system constructed. The simulation and experimental works and results are discussed next. Finally, we present the conclusion of the paper.

Design and model
The multispectral imaging system we propose here is constructed from a stereo camera, or two modern digital (RGB) cameras in a stereoscopic configuration, and a pair of appropriate optical filters placed in front of the two cameras of the stereo pair. Depending upon the sensitivities of the two cameras, one or two appropriate optical filters are selected from among a set of readily available filters, so that they modify the sensitivities of one or both cameras to produce six channels (three contributed by each camera) in the visible spectrum that give optimal estimation of the scene spectral reflectance and/or color. The two cameras need not be of the same type; any two cameras can be used in a stereoscopic configuration, provided the two are operated at the same resolution. One-shot acquisition can be made possible by using two cameras with a sync controller available on the market. The proposed multispectral system is a faster, cheaper and more practical solution, as it offers one-shot acquisition and can be constructed even from commercial digital cameras and readily available filters. Since the two cameras are in a stereoscopic configuration, the system is also capable of acquiring 3D images, which adds value to the system. 3D imaging is in itself an interesting area of study and could form a large study of its own. This paper, therefore, focuses mainly on multispectral imaging, and 3D imaging is not considered within its scope. Figure 1 illustrates a multispectral-stereo system constructed from a modern digital stereo camera, the Fujifilm FinePix REAL 3D W1 (Fujifilm 3D), and two optical filters in front of the two lenses. We have used this system in our experimental study.
Selection of the filters can be done computationally using the filter selection method presented later in this section. The two images captured with the stereo camera are registered for pixel-to-pixel correspondence through an image registration process. As an illustration, a simple registration method is presented later in this paper. The subsequent combination of the images from the two cameras provides a six-channel multispectral image of the acquired scene.
In order to model the proposed multispectral system, let $s_i$ denote the spectral sensitivity of the $i$th channel, $t$ the spectral transmittance of the selected filter, $L$ the spectral power distribution of the light source, and $R$ the spectral reflectance of the surface captured by the camera. As there is always acquisition noise introduced into the camera outputs, let $n$ denote the acquisition noise. The camera response corresponding to the $i$th channel, $C_i$, is then given by the multispectral camera model as
$$C_i = S_i^{T}\,\mathrm{Diag}(L)\,R + n_i, \quad i = 1, 2, \ldots, K \tag{1}$$
where $S_i = \mathrm{Diag}(t)\,s_i$, $n_i$ is the channel acquisition noise, and $K$ is the number of channels, which is 6 in our system. For natural and man-made surfaces whose reflectances are more or less smooth, it is recommended to use as few channels as possible [38], and we study here the proposed six-channel system.
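The camera model can be illustrated with a minimal Python sketch. It assumes that the model takes the form $C_i = S_i^T\,\mathrm{Diag}(L)\,R + n_i$ with $S_i = \mathrm{Diag}(t)\,s_i$, and that all spectra are sampled at the same wavelengths. The function names and the toy three-wavelength spectra are hypothetical and are not taken from the system described here.

```python
def channel_response(sensitivity, transmittance, illuminant, reflectance, noise=0.0):
    """One channel response: sum over wavelengths of s * t * L * R, plus noise."""
    spectra = (sensitivity, transmittance, illuminant, reflectance)
    if len({len(s) for s in spectra}) != 1:
        raise ValueError("all spectra must be sampled at the same wavelengths")
    signal = sum(s * t * l * r for s, t, l, r in zip(*spectra))
    return signal + noise


def six_channel_response(left_rgb, right_rgb, left_filter, right_filter,
                         illuminant, reflectance):
    """Stack three filtered channels from each camera into a 6-vector."""
    left = [channel_response(s, left_filter, illuminant, reflectance) for s in left_rgb]
    right = [channel_response(s, right_filter, illuminant, reflectance) for s in right_rgb]
    return left + right


# Toy data (assumed values): three wavelengths, idealized RGB sensitivities.
illuminant = [2.0, 2.0, 2.0]
reflectance = [0.25, 1.0, 1.0]
rgb = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
responses = six_channel_response(rgb, rgb, [0.5, 1.0, 1.0], [1.0, 1.0, 0.5],
                                 illuminant, reflectance)
```

In this toy case the two filters attenuate different wavelengths, so the left and right cameras return different triplets for the same surface, which is what yields six distinct channels.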

Optimal filters selection
The next task is to select an optimal filter pair for the construction of the proposed multispectral system. Several methods have been proposed for the selection of filters, particularly for multi-shot-based multispectral color imaging [26][39][40][41]. In our study, as we have to choose just two filters from a set of filters, the exhaustive search method is feasible and a logical choice because of its guaranteed optimal results. For selecting $k$ (here $k = 2$) filters from the given set of $n$ filters, the search requires $P(n, k) = \frac{n!}{(n-k)!}$ permutations. When two cameras of the same type (assuming the same spectral sensitivities) are used, the problem reduces to combinations instead of permutations, i.e., $C(n, k) = \frac{n!}{k!\,(n-k)!}$ combinations. The feasibility of the exhaustive search method thus depends on the number of sample filters. However, in order to extend the usability of this method to a considerably large number of filters, we introduce a secondary criterion which excludes all infeasible filter pairs from the computations. This criterion states that the filter pairs that result in a maximum transmission factor of less than forty percent and less than ten percent of the maximum transmission factor in one or more channels are excluded.
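As a quick arithmetic check of these counts, the following sketch evaluates P(n, 2) and C(n, 2) with the Python standard library. The value n = 265 is the number of filters used in the simulation described later; the variable names are only illustrative.

```python
import math

n_filters = 265  # number of candidate filters used later in the simulation

# Ordered pairs of distinct filters (two different cameras): P(n, 2) = n! / (n - 2)!
ordered_pairs = math.perm(n_filters, 2)

# Unordered pairs (two cameras with the same sensitivities): C(n, 2) = n! / (2! (n - 2)!)
unordered_pairs = math.comb(n_filters, 2)

# Ordered pairs if the same filter may also be placed on both cameras (an assumption).
ordered_with_repeats = n_filters ** 2
```

For n = 265 this gives 69,960 ordered pairs of distinct filters and 34,980 unordered pairs; allowing the same filter on both cameras would give 70,225 ordered pairs.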
For a given pair of cameras, a pair of optimal filters is selected through simulation using this filter selection algorithm and the secondary criterion, and the performance is then investigated experimentally.
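The exhaustive search with an exclusion step can be sketched as follows. This is a minimal illustration, not the authors' implementation: `evaluate` stands for a hypothetical callback that returns the mean estimation error of a filter pair, and `is_feasible` stands for whatever exclusion criterion is applied. The 40% peak-transmission test in the toy usage is a simplified stand-in for the secondary criterion, chosen only for illustration.

```python
from itertools import permutations


def select_filter_pair(filters, evaluate, is_feasible=lambda left, right: True):
    """Return (error, left_name, right_name) with the lowest error, or None.

    filters     : dict mapping filter name -> transmittance samples
    evaluate    : hypothetical callback (left, right) -> mean estimation error
    is_feasible : hypothetical callback (left, right) -> bool, used to skip pairs
    """
    best = None
    # Ordered pairs: the left and right cameras may have different sensitivities.
    for left, right in permutations(sorted(filters), 2):
        if not is_feasible(filters[left], filters[right]):
            continue  # skip pairs excluded by the secondary criterion
        error = evaluate(filters[left], filters[right])
        if best is None or error < best[0]:
            best = (error, left, right)
    return best


# Toy usage with made-up one-sample "transmittances" and a made-up error function.
toy_filters = {"A": [1.0], "B": [0.5], "C": [0.1]}
best_pair = select_filter_pair(
    toy_filters,
    evaluate=lambda left, right: abs(left[0] - 2 * right[0]),
    is_feasible=lambda left, right: max(left) >= 0.4 and max(right) >= 0.4,
)
```

For two cameras with identical sensitivities, `itertools.combinations` could replace `permutations`, halving the number of evaluations.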

Spectral reflectance estimation and evaluation
Figure 1 Illustration of a multispectral-stereo system constructed from Fujifilm 3D camera and a pair of filters placed on top of the two lenses.
The estimated reflectance ($\hat{R}$) is obtained for the corresponding original reflectance ($R$) from the camera responses for the training and test targets, $C_{(train)}$ and $C$ respectively, using different estimation methods. Training targets are the database of surface reflectance functions from which basis functions are generated, and test targets are used to validate the performance of the device. Many estimation algorithms have been proposed in the literature [28][30][42][43][44][45][46]. It is not our primary goal to make a comparative study of different algorithms. However, we have investigated the performance of the proposed system with methods based on three major types of models: linear, polynomial and neural network. These models are described briefly below:
• Linear Model: A linear-model approach formulates the problem of estimating a spectral reflectance $\hat{R}$ from the camera responses $C$ as finding a transformation matrix (or reconstruction matrix) $Q$ that reconstructs the spectrum from the $K$ measurements as follows:
$$\hat{R} = QC \tag{2}$$
The matrix $Q$ that minimizes a given distance metric $d(R, \hat{R})$ or that maximizes a given similarity metric $s(R, \hat{R})$ is determined. The linear regression (LR) method determines $Q$ from the training data set using the pseudo-inverse:
$$Q = R_{(train)}\,C_{(train)}^{+} \tag{3}$$
The pseudo-inverse $C_{(train)}^{+}$ may be difficult to compute, and when the problem is ill-posed it may not even give any inverse, so it may need to be regularized (see "Regularization" later).
There are several approaches proposed [28,42] which approximate $R$ by a linear combination of a small number of basis functions:
$$R \approx Bw \tag{4}$$
where $B$ is a matrix containing the basis functions obtained from the training data set, and $w$ is a weight matrix. Different approaches have been proposed for computing $w$. We present and use the method proposed by Imai and Berns (IB) [28], which was found to be relatively more robust to noise. This method assumes a linear relationship between the camera responses and the weights that represent reflectance in a linear model:
$$w = MC \tag{5}$$
where $M$ is the transformation matrix, which can be determined empirically via a least-squares fit as
$$M = w_{(train)}\,C_{(train)}^{+} \tag{6}$$
$w_{(train)}$ is computed from Equation 4 as
$$w_{(train)} = B^{+} R_{(train)} \tag{7}$$
The reflectance of the test target is then estimated using
$$\hat{R} = BMC \tag{8}$$
• Polynomial Model (PN): With this model, the reflectance $R$ of the characterization data set is directly mapped from the camera responses $C$ through a linear relationship with the $n$ degree polynomials of the camera responses [45,47]:
$$R(\lambda_j) = \sum_{k} m_{jk}\,p_k(C_1, C_2, \ldots, C_K) \tag{9}$$
where $p_k$ denotes the $k$th polynomial term of the camera responses. It can be written in matrix form as
$$R = M C_p \tag{10}$$
where $M$ is the matrix formed from the coefficients, and $C_p$ is the polynomial vector/matrix from the $n$ degree polynomials of the camera responses:
$$C_p = \left[\,1,\; C_1,\; C_2,\; \ldots,\; C_K,\; C_1^2,\; C_1 C_2,\; \ldots,\; C_K^{\,n}\,\right]^{T} \tag{11}$$
The polynomial degree $n$ is determined through optimization such that the estimation error is minimized. Complete or selected polynomial terms (for example, polynomials without cross-terms) could be used depending on the application. The transformation matrix $M$ is determined from the training data set using
$$M = R_{(train)}\,C_{p(train)}^{+} \tag{12}$$
Substituting the computed matrix $M$ in Equation 10, the reflectance of the test target is estimated as
$$\hat{R} = M C_p \tag{13}$$
Since a non-linear method of mapping camera responses onto reflectance values may over-fit the characterization surface, regularization can be applied as described in the subsection below to solve this problem.
• Neural Network Model (NN): Artificial neural networks simulate the behavior of many simple processing elements present in the human brain, called neurons. Neurons are linked to each other by connections called synapses. Each synapse has a coefficient that represents the strength or weight of the connection. An advantage of neural network models is that they are robust to noise. A robust spectral reconstruction algorithm based on hetero-associative memories linear neural networks, proposed by Mansouri [46], has been used.
The neural network is trained with the training data set using the Delta rule, also known as the Widrow-Hoff rule. The rule continuously modifies the weights $w$ to reduce the difference (the Delta) between the expected output value $e$ and the actual output $o$ of a neuron. This rule changes the connection weights in the way that minimizes the mean squared error of the neuron between an observed response $o$ and a desired theoretical one:
$$w_{t+1} = w_t + \eta\,(e - o)\,C^{T} \tag{14}$$
where $e$ is the expected response, $t$ is the iteration number, and $\eta$ is the learning rate. The weights $w$ thus computed are finally used to estimate the reflectance of the test target using
$$\hat{R} = wC \tag{15}$$
In addition to the methods described previously, we have also tested some other methods, such as Maloney and Wandell, and Least-Squares Wiener; however, they are not included as they are considerably less robust to noise.
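The Delta (Widrow-Hoff) rule described above can be sketched for a single-layer linear network as follows. This is a minimal illustration under stated assumptions: the update is applied sample by sample as `w += learning_rate * (e - o) * c`, the actual output is taken as `o = w c`, and the learning rate, epoch count and toy data are arbitrary choices. It does not reproduce the hetero-associative memory algorithm of [46].

```python
def predict(weights, response):
    """Linear network output: o = w c (one value per output neuron)."""
    return [sum(w * c for w, c in zip(row, response)) for row in weights]


def train_delta_rule(responses, targets, learning_rate=0.1, epochs=200):
    """Train a linear map from camera responses to reflectances with the Delta rule."""
    n_in, n_out = len(responses[0]), len(targets[0])
    weights = [[0.0] * n_in for _ in range(n_out)]
    for _ in range(epochs):
        for response, expected in zip(responses, targets):
            actual = predict(weights, response)
            for j, row in enumerate(weights):
                # Delta rule: move each weight along (expected - actual) * input.
                step = learning_rate * (expected[j] - actual[j])
                for i in range(n_in):
                    row[i] += step * response[i]
    return weights


# Toy training set (assumed): two "camera responses" and one-sample "reflectances".
weights = train_delta_rule([[1.0, 0.0], [0.0, 1.0]], [[2.0], [1.0]])
estimate = predict(weights, [1.0, 1.0])
```

With these toy inputs the learned weights approach [2, 1], so the estimate for the response [1, 1] approaches 3.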
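The estimated reflectances are evaluated using spectral as well as colorimetric metrics. Two different metrics, GFC (Goodness of Fit Coefficient) [48] and RMS (Root Mean Square) error, have been used as spectral metrics, and $\Delta E^*_{ab}$ (CIELAB Color Difference) as the colorimetric metric. These metrics are given by the equations:
$$GFC = \frac{\left|\sum_{i} R(\lambda_i)\,\hat{R}(\lambda_i)\right|}{\sqrt{\sum_{i} \left[R(\lambda_i)\right]^2}\,\sqrt{\sum_{i} \left[\hat{R}(\lambda_i)\right]^2}} \tag{16}$$
$$RMS = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[R(\lambda_i) - \hat{R}(\lambda_i)\right]^2} \tag{17}$$
$$\Delta E^*_{ab} = \sqrt{(\Delta L^*)^2 + (\Delta a^*)^2 + (\Delta b^*)^2} \tag{18}$$
where $N$ is the number of sampled wavelengths. The GFC ranges from 0 to 1, with 1 corresponding to perfect estimation. The RMS and $\Delta E^*_{ab}$ are non-negative values, with 0 corresponding to perfect estimation.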
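The three metrics can be computed with a few lines of Python. The sketch below assumes the usual definitions: GFC as the normalized absolute inner product of the measured and estimated spectra, RMS as the root of the mean squared spectral difference, and the CIELAB color difference as the Euclidean distance between two (L*, a*, b*) triplets. The conversion from reflectance to CIELAB is not shown, and the function names are illustrative.

```python
import math


def gfc(measured, estimated):
    """Goodness of Fit Coefficient: 1.0 means identical spectral shape."""
    numerator = abs(sum(m * e for m, e in zip(measured, estimated)))
    denominator = (math.sqrt(sum(m * m for m in measured))
                   * math.sqrt(sum(e * e for e in estimated)))
    return numerator / denominator


def rms_error(measured, estimated):
    """Root mean square difference between two sampled spectra."""
    squared = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    return math.sqrt(squared / len(measured))


def delta_e_ab(lab_1, lab_2):
    """CIELAB color difference between two (L*, a*, b*) triplets."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab_1, lab_2)))
```

Note that the GFC is insensitive to a pure scaling of the spectrum: `gfc([1, 2, 3], [2, 4, 6])` is 1 up to rounding, whereas the RMS error of the same pair is not zero. This is one reason for reporting both spectral metrics.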

Regularization
Regularization introduces additional information in an inverse problem in order to solve an ill-posed problem or to prevent over-fitting. Non-linear methods of mapping camera responses onto reflectance values have the potential to over-fit the characterization surfaces.
Over-fitting is caused when the number of parameters in the model is greater than the number of dimensions of variation in the data. Among the many regularization methods, Tikhonov regularization is the most commonly used. It tries to obtain a regularized solution to $Ax = b$ by choosing $x$ to fit the data $b$ in the least-squares sense while penalizing solutions of large norm [49,50]. The solution is then given by the minimization problem:
$$\min_{x}\left\{\|Ax - b\|_2^2 + \alpha^2\|x\|_2^2\right\} \tag{19}$$
where $\alpha > 0$ is called the regularization parameter, whose optimal value is determined through optimization for minimum estimation errors.
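As a short worked derivation, assume the penalized least-squares objective is written with $\alpha^2$ weighting the squared norm of $x$ (one common convention, not necessarily the exact form used by the authors). Setting the gradient to zero then gives a closed-form solution:

```latex
\begin{aligned}
J(x) &= \|Ax - b\|_2^2 + \alpha^2 \|x\|_2^2 \\
\nabla J(x) &= 2A^{T}(Ax - b) + 2\alpha^2 x = 0 \\
\Rightarrow\quad x_\alpha &= \left(A^{T}A + \alpha^2 I\right)^{-1} A^{T} b
\end{aligned}
```

For $\alpha > 0$ the matrix $A^{T}A + \alpha^2 I$ is positive definite and therefore invertible, so the regularized problem has a unique solution even when $A^{T}A$ itself is singular.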

Registration
In order to have accurate estimation of spectral reflectance and/or color at each pixel of a scene, it is very important for the two images to have accurate pixel-to-pixel correspondence. In other words, the two images must be properly aligned. However, the stereo images captured by the stereo camera are not aligned. We, therefore, need to align the two images from the stereo pair, a process known as image registration. Different techniques could be used for the registration of the stereo images. One technique could be the use of a stereo-matching algorithm [51][52][53][54]. Here, we use a simple manual approach [55]. In this method, we select some (at least 8) corresponding points in the two images as control points, considering the left image as the base/reference image and the right image as the unregistered image. Based on the selected control points, an appropriate transformation that properly aligns the unregistered image with the base image is determined. The unregistered image is then registered using this transformation. Irrespective of the registration method, the problem of occlusion might occur in the stereo images due to the geometrical separation of the two lenses of the stereo camera. As we use the central portion of large patches, this simple registration method works well for our purpose. However, we should note that correct registration is very important for accurate reflectance estimation. Misregistration leading to incorrect correspondence between the two images may cause wide deviations in the reflectance estimation, especially in and around edges, where the image difference can be significantly large.
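Determining a transformation from control points can be sketched as a least-squares fit. The text does not state which transformation type is used, so the example below assumes an affine transformation purely for illustration. The helper names and the toy control points are hypothetical, and warping the image itself is not shown.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate control points (e.g. all collinear)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_affine(unregistered_pts, base_pts):
    """Least-squares affine map from unregistered (x, y) to base (x', y')."""
    rows = [[x, y, 1.0] for x, y in unregistered_pts]
    # Normal equations (A^T A) p = A^T b, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atb = [sum(r[i] * p[k] for r, p in zip(rows, base_pts)) for i in range(3)]
        params.append(solve_linear_system(ata, atb))
    return params  # [[a, b, tx], [c, d, ty]]


def apply_affine(params, point):
    x, y = point
    return tuple(a * x + b * y + t for a, b, t in params)


# Toy control points (assumed): the right image is shifted by (+5, -3) pixels.
right_pts = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 2), (2, 7), (8, 3), (4, 9)]
left_pts = [(x + 5.0, y - 3.0) for x, y in right_pts]
affine = fit_affine(right_pts, left_pts)
mapped = apply_affine(affine, (6.0, 6.0))
```

Using more control points than the minimum makes the fit less sensitive to small errors in manually clicked positions. A projective or other transformation could be fitted in the same least-squares manner.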

Experiments
The proposed multispectral system has been investigated first with simulation and then validated experimentally. This section presents the simulation and experimental setups and results obtained.

Simulation setup
Simulation has been carried out with different stereo camera pairs whose spectral sensitivities are known or measured. The simulation takes one pair of filters at a time, computes the camera responses using Equation 1, obtains the estimated spectral reflectance using four different spectral estimation methods and evaluates the estimation errors (spectral and colorimetric) as discussed previously. Similarly, the spectral reflectances are also estimated with 3-channel systems, where one camera (left or right) from the stereo pair is used.
As there is always acquisition noise introduced into the camera outputs, simulated random shot noise and quantization noise are introduced in order to make the simulation more realistic. Recent measurements of noise levels in a trichromatic camera suggest that realistic levels of shot noise are between 1 and 2% [56]. Therefore, 2% Gaussian noise is introduced as random shot noise in the simulation. In addition, 12-bit quantization noise is incorporated by directly quantizing the simulated responses after the application of the shot noise.
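The noise model can be sketched as follows. Two assumptions are made for illustration: the 2% shot noise is taken as zero-mean Gaussian noise with a standard deviation of 2% of each response, and the responses are taken as normalized to the range [0, 1] before 12-bit quantization. The function names are hypothetical.

```python
import random


def add_shot_noise(responses, level=0.02, rng=random):
    """Multiply each response by (1 + g), with g ~ N(0, level^2)."""
    return [c * (1.0 + rng.gauss(0.0, level)) for c in responses]


def quantize(responses, bits=12):
    """Clip responses to [0, 1] and round to the nearest of 2^bits levels."""
    top = (1 << bits) - 1  # 4095 for 12 bits
    return [round(min(max(c, 0.0), 1.0) * top) / top for c in responses]


# Shot noise first, then quantization, as in the simulation description.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = [0.2, 0.4, 0.6, 0.3, 0.5, 0.7]
noisy = quantize(add_shot_noise(clean, level=0.02, rng=rng), bits=12)
```

Applying the quantization after the shot noise mirrors the order stated above, so the quantizer acts on the already perturbed responses.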
The simulation study has been conducted with a pair of Nikon D70 cameras, a Nikon D70 and Canon 20D pair, and the Fujifilm 3D stereo camera. Previously measured spectral sensitivities of the Nikon D70 and Canon 20D cameras are used, and those of the Fujifilm 3D camera are measured using a Bentham TMc300 monochromator. Figure 2 shows these spectral sensitivities. Two hundred and sixty-five optical filters of three different types (exciter, dichroic and emitter) from Omega are used.
The transmittances of the filters available on the company web site [57] have been used in the simulation. Rather than mixing filters from different vendors, a single vendor, Omega, has been chosen as a one-stop source for the filters, because it offers a large selection of filters and the data are available online. Sixty-three patches of the Gretag Macbeth Color Checker DC have been used as the training target, and the one hundred and twenty-two patches remaining after omitting the outer surrounding achromatic patches, the multiple white patches at the center, and the glossy patches in the S column of the DC chart have been used as the test target. The training patches have been selected using the linear distance minimization method (LDMM) proposed by Pellegri et al. [58]. A color whose associated system output vector has the maximum norm among all the target colors is selected first. The method then chooses the colors of the training set iteratively based on their distances from those already chosen; the maximum absolute difference is used as the distance metric.
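The iterative selection of training patches can be sketched as a greedy procedure. The first pick (maximum norm) and the distance metric (maximum absolute difference) follow the description above. The rule used for the later picks, taking the candidate that is farthest from its nearest already chosen color, is an assumption made for illustration and may differ from the exact rule in [58].

```python
def max_abs_difference(u, v):
    """Distance metric: largest absolute component-wise difference."""
    return max(abs(a - b) for a, b in zip(u, v))


def select_training_set(outputs, count):
    """Greedy selection of `count` indices from a list of system output vectors."""
    indices = range(len(outputs))
    # First pick: the color whose output vector has the maximum norm.
    chosen = [max(indices, key=lambda i: sum(x * x for x in outputs[i]))]
    while len(chosen) < min(count, len(outputs)):
        remaining = [i for i in indices if i not in chosen]
        # Assumed rule: pick the candidate farthest from its nearest chosen color.
        farthest = max(
            remaining,
            key=lambda i: min(max_abs_difference(outputs[i], outputs[j]) for j in chosen),
        )
        chosen.append(farthest)
    return chosen


# Toy output vectors (assumed values, two channels only).
toy_outputs = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.0], [0.25, 0.75]]
training_indices = select_training_set(toy_outputs, 3)
```

In the toy data, the vector [0.9, 1.0] is nearly a duplicate of [1.0, 1.0] and is therefore left out, which illustrates how a distance-based selection spreads the training set over the output space.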
The same spectral power distribution of the illuminant and the reflectances of the color checkers measured and used in the experiment later are used in the simulation. The spectral reflectances are estimated using the four estimation methods described previously: LR, IB, PN and NN. The type and the degree of polynomials in the PN method are determined through optimization for minimum estimation errors, and we found that second-degree polynomials without cross-terms produce the best results. The estimated reflectances are evaluated using the three evaluation metrics described previously: GFC, RMS and $\Delta E^*_{ab}$. The CIE 1964 10° color matching functions are used for color computation; they are the logical choice because each color checker patch subtends more than 2° from the lens position. The best pair of filters, with which the multispectral system can optimally estimate the reflectances of the 122 test target patches, is exhaustively searched from among all available filters as discussed in the Optimal Filters Selection section, according to each of the evaluation metrics. The results corresponding to the minimum mean of the evaluation metrics are obtained. To speed up the process, the filter combinations not fulfilling the criterion described in the same section are skipped. The 265 filters lead to more than 70,000 possible permutations (for two different cameras). The criterion introduced reduces the processing to fewer than 20,000 permutations.
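The second-degree polynomial expansion without cross-terms used by the PN method can be sketched as below. Whether a constant term is included is not stated in the text, so its presence here is an assumption, and the function name is illustrative.

```python
def polynomial_terms(responses, degree=2):
    """Expand camera responses into [1, C1..CK, C1^2..CK^2, ...] without cross-terms."""
    terms = [1.0]  # constant term (assumed)
    for power in range(1, degree + 1):
        terms.extend(c ** power for c in responses)
    return terms


# Six camera responses expand into 1 + 6 + 6 = 13 polynomial terms.
expanded = polynomial_terms([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
```

Leaving out the cross-terms keeps the number of coefficients per wavelength small, which limits the risk of over-fitting when only a few dozen training patches are available.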

Simulation results
The simulation selects optimal pairs of filters from among the 265 filters for the three camera setups, depending on the estimation methods and the evaluation metrics. Table 1 shows these selected filters along with the statistics (maximum/minimum, mean and standard deviation) of the estimation errors in all the cases for both the 6-channel and the 3-channel systems. The filters selected by the simulation are considered optimal and are used as the basis for selecting the filters used in the construction of the proposed multispectral system in the experiments. The Nikon D70, the Canon 20D and the left camera of the Fujifilm 3D are used for the simulation of the 3-channel systems.
In the simulation of the NikonD70-NikonD70 camera system, the IB and the LR methods selected the filter pair (XF2077-XF2021), the PN selected the filter pair (XF2021-XF2203), and the NN picked the filter pair (XF2009-XF2021) for the maximum GFC, with an average mean value of 0.998. For the minimum RMS, the IB, the LR and the NN selected the filter pair (XF2009-XF2021), while the PN selected the filter pair (XF2010-XF2021), with an average mean value of 0.013. All four methods selected the filter pair (XF2014-XF2030) for the minimum $\Delta E^*_{ab}$.
We now illustrate the filters and the resulting 6-channel sensitivities of the simulated multispectral imaging systems. As we have seen, for a given camera system, different filter pairs are selected depending on the estimation method and the evaluation metric. However, the shapes of the filter pairs and the resulting effective channel sensitivities are very similar. Therefore, in order to avoid an excessive number of figures, instead of showing figures for all cases, we give the figures for the Fujifilm 3D camera system as illustrations, as our experiments have been performed with this system along with the filter pair (XF2021-XF2030) selected by the neural network method for minimum color error. Figure 3a shows the transmittances of this filter pair, and Figure 3b shows the resulting 6-channel normalized effective spectral sensitivities of the multispectral system. Figure 4 shows the spectral reflectances estimated with this system, along with the measured reflectances, for 9 patches randomly picked from among the 122 test patches selected as described previously in the Simulation Setup section. The patch numbers are given below the graphs. Figure 5 shows the estimated spectral reflectances obtained with the 3-channel system for the same 9 test patches, also along with the measured reflectances.

Experimental setup
We have conducted experiments with the multispectral system constructed from the Fujifilm 3D stereo camera and the filter pair (XF2021-XF2030), selected as optimal in the simulation, as described previously, by the neural network estimation method for the minimal $\Delta E^*_{ab}$. The optimal filters selected by the simulation have been considered as the basis for choosing the filters for the experiment. As we have already seen, different estimation algorithms pick different filter pairs, which also depend on the evaluation metrics. However, the shapes of the selected filter pairs and the resulting 6-channel sensitivities look very similar. The results from all four methods and the three metrics are also quite similar, as can be seen in Table 1.
The results also show that minimizing $\Delta E^*_{ab}$ produces more or less similar mean GFC and RMS values with all four methods for all three camera setups. We, therefore, decided to use the filter pair (XF2021-XF2030) that produced the minimum $\Delta E^*_{ab}$ with the neural network method. The multispectral camera system has been built by placing the XF2021 filter in front of the left lens and the XF2030 filter in front of the right lens of the camera. Throughout the whole experiment, the camera has been set to a fixed configuration (mode: manual, flash: off, ISO: 100, exposure time: 1/60 s, aperture: F3.7, white balance: fine, 3D file format: MPO, image size: 3648 × 2736). The left camera has been used for the 3-channel system.
The spectral sensitivities of the Fujifilm 3D were measured using the Bentham TMc300 monochromator, and the monochromatic lights have been measured with the calibrated photodiode provided with the monochromator. The spectral power distribution of the light source (Daylight D50 simulator, Gretag Macbeth SpectraLight III) under which the experiments have been carried out has been measured with the Minolta CS-1000 spectroradiometer. The transmittances of the filters have also been measured with the spectroradiometer. Figure 6 shows the measured transmittances of the filter pair (XF2021-XF2030). We can see some differences in the shapes of the filters from those used in the simulation, which were based on the transmittance data provided by the manufacturer (see Figure 3a).
In order to investigate the performance of the system, as in the simulation, the same 63 patches of the Gretag Macbeth Color Checker DC have been used as the training target and 122 patches have been used as the test target. The spectral reflectances of the color chart patches have been measured with the X-Rite Eye One Pro spectrophotometer. Both the left and the right cameras have been corrected for linearity, DC noise and non-uniformity.
The system then acquired the images of the color chart. To minimize the statistical error, each acquisition has been made 10 times and the averages of these 10 acquisitions are used in the analysis. The images from the left and the right cameras are registered using the method discussed earlier, and the 3-channel and the 6-channel responses for each patch are obtained by channel-wise averaging of a central area of a certain size from the patch. The camera responses thus obtained are then used for spectral estimation using the same four estimation methods, and the spectral and the colorimetric estimation errors are evaluated as in the simulation.
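The two averaging steps, over repeated acquisitions and over the central area of each patch, can be sketched with nested lists. The image layout `image[row][col][channel]`, the function names and the window size are assumptions made for illustration; the size of the central area is not specified in the text.

```python
def average_acquisitions(images):
    """Pixel-wise mean of repeated acquisitions, each indexed as [row][col][channel]."""
    count = len(images)
    rows, cols, chans = len(images[0]), len(images[0][0]), len(images[0][0][0])
    return [[[sum(img[r][c][k] for img in images) / count for k in range(chans)]
             for c in range(cols)]
            for r in range(rows)]


def central_mean(patch, size):
    """Channel-wise mean over the central size x size window of a patch."""
    rows, cols = len(patch), len(patch[0])
    top, left = (rows - size) // 2, (cols - size) // 2
    window = [patch[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    return [sum(px[k] for px in window) / len(window) for k in range(len(window[0]))]


# Toy data (assumed): two 1x1 two-channel acquisitions, and a 4x4 one-channel patch
# whose border pixels are 0.0 and whose central 2x2 pixels are 1.0.
mean_image = average_acquisitions([[[[1.0, 3.0]]], [[[3.0, 5.0]]]])
toy_patch = [[[1.0] if 1 <= r <= 2 and 1 <= c <= 2 else [0.0] for c in range(4)]
             for r in range(4)]
patch_response = central_mean(toy_patch, 2)
```

Restricting the average to the central window keeps pixels near the patch borders, where residual misregistration has the largest effect, out of the response.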

Experimental results
The statistics of the estimation errors obtained from the experiment with both the 6-channel and the 3-channel systems for all four estimation methods and the three evaluation metrics are given in Table 2. We can see that all four methods produce almost the same results. To illustrate the results graphically, the estimated spectral reflectances of the same 9 test patches used in the simulation above are shown in Figure 7, along with the measured reflectances. Similarly, Figure 8 shows the estimated and measured reflectances of the same patches obtained with the 3-channel system.

Discussion on the results
We have investigated the proposed multispectral system through both simulation and real experiments. The simulation determines the optimal pair of filters from among 265 filters for a given camera setup. The results show that the selected optimal filter pair depends on the evaluation metric used (GFC, RMS or ΔE*ab). This is expected, since colorimetric optimization does not necessarily optimize spectral accuracy and vice versa: more than one spectrum can produce the same color, a phenomenon known as metamerism. For a given camera setup and a selected metric, most of the estimation methods selected the same pair of filters. Where other methods selected different pairs, the filters are very similar in type and shape, and hence all four methods produce similar performance. The results also show similar performance for the two spectral metrics, GFC and RMS.
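As a reminder of what the three metrics measure, the sketch below uses their commonly cited definitions: the goodness-of-fit coefficient (GFC) as the normalized inner product of two spectra, the RMS error as the root of the mean squared spectral difference, and ΔE*ab as the Euclidean distance in CIELAB. These are assumed to match the definitions used in this paper; the conversion from reflectance to CIELAB under the chosen illuminant is omitted.

```python
import numpy as np

def gfc(r_true, r_est):
    """Goodness-of-fit coefficient: 1.0 means identical spectral shape."""
    num = np.abs(np.sum(r_true * r_est))
    den = np.sqrt(np.sum(r_true ** 2)) * np.sqrt(np.sum(r_est ** 2))
    return num / den

def rms_error(r_true, r_est):
    """Root-mean-square difference between two sampled spectra."""
    return np.sqrt(np.mean((r_true - r_est) ** 2))

def delta_e_ab(lab1, lab2):
    """CIE 1976 color difference: Euclidean distance between CIELAB triplets."""
    return np.sqrt(np.sum((np.asarray(lab1) - np.asarray(lab2)) ** 2))
```

Because GFC is insensitive to a uniform scaling of the spectrum while RMS is not, and ΔE*ab depends only on the resulting color, the three metrics can rank filter pairs differently.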
The simulation results show that the proposed 6-channel multispectral system outperforms classical 3-channel camera systems, both spectrally and colorimetrically. The improvements are significant: for instance, in the case of the Fujifilm 3D with the neural network method, the mean GFC increases from 0.99 to 0.998, the RMS error decreases from 0.031 to 0.014, and ΔE*ab decreases from 3.499 to 0.4. The results are similar for the other camera systems and estimation methods. Note that the improvement depends strictly on the choice of filters; badly chosen filters may yield a system that fails to outperform the 3-channel camera. The spectral reflectances estimated with the 6-channel system, shown in Figure 4, are significantly closer to the original ones than those estimated with the 3-channel system, shown in Figure 5. The simulation thus clearly indicates that the proposed system, built with two RGB cameras or a stereo camera and a pair of appropriate filters, can function well as a multispectral system. Encouraged by these results, we performed real experiments for validation. As explained previously, the experiments were carried out with the multispectral system built with the Fujifilm 3D camera and the optimal filter pair (XF2021-XF2030) selected by the simulation for minimum ΔE*ab with the neural network method. The experimental results also show that the proposed 6-channel multispectral system consistently performs better than the 3-channel system, both spectrally and colorimetrically, in terms of mean metric values. As in the simulation, all four estimation methods produced better results for all three metrics with the 6-channel multispectral system than with the 3-channel system.
For instance, in the case of the Fujifilm 3D camera system with the neural network method, GFC increases from 0.988 to 0.992, RMS decreases from 0.063 to 0.036, and ΔE*ab decreases from 9.126 to 4.854. The other estimation methods produced similar results. The minimum ΔE*ab value of 4.733, obtained with the PN method, is still quite high and considerably higher than the simulation result. One reason could be the limited noise model in the simulation, which included only random shot noise and quantization noise, whereas many other noise sources may come into play in real cameras. We investigated the influence of noise on the performance in the simulation with the Fujifilm 3D camera and found that ΔE*ab increases almost linearly as the percentage of shot noise increases from 0 to 20%. In addition, as noted above, the measured filter transmittances differ from those used in the simulation. To quantify the resulting change in performance, we repeated the simulation with the measured transmittances of the filters; this yields a ΔE*ab of 1.428 with the same neural network method that produced the minimum value of 0.4 in the previous simulation. This also partly explains the higher values in the experimental results. The performance of the system thus depends strongly on the filters and on accurate knowledge of their transmittances. Moreover, the Fujifilm 3D camera we used offers limited control: there is no manual focus, and the camera does not support raw data. It applies its own white balancing and interpolation algorithms. Even though we used fixed camera settings throughout the experiment, including the characterization and all image acquisitions, the acquired images are still subject to built-in preprocessing and optical changes. This might also have influenced the results, leading to higher estimation errors.
We believe that the performance can be improved with a more controllable camera.
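The noise study mentioned above can be illustrated with a minimal sketch. The exact noise model of the simulation is not restated in this section, so the sketch makes explicit assumptions: shot noise is approximated as zero-mean Gaussian noise whose standard deviation is a given percentage of the signal, and quantization uses a hypothetical bit depth of 8 on responses normalized to [0, 1].

```python
import numpy as np

def add_noise(responses, shot_pct, bits=8, rng=None):
    """Add signal-dependent noise and quantization to normalized responses.

    responses : array of camera responses in [0, 1]
    shot_pct  : noise standard deviation as a percentage of the signal (assumed model)
    bits      : quantization bit depth (assumed value, not from the paper)
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = (shot_pct / 100.0) * responses                  # noise grows with the signal
    noisy = responses + rng.normal(0.0, 1.0, responses.shape) * sigma
    noisy = np.clip(noisy, 0.0, 1.0)                        # keep within sensor range
    levels = 2 ** bits - 1
    return np.round(noisy * levels) / levels                # uniform quantization
```

Sweeping `shot_pct` from 0 to 20, re-estimating the reflectances from the noisy responses and recording the mean ΔE*ab at each level would reproduce the kind of analysis described in the text.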

Conclusion
In this paper, we have proposed a one-shot multispectral imaging system built with a stereo camera. The proposed system is simple to construct from commercial off-the-shelf digital cameras and a pair of filters selected from those readily available on the market. The system could therefore be a fast, practical and cheaper solution to multispectral imaging, useful in a variety of applications. Both the simulation and the experimental results show that the proposed 6-channel multispectral system performs significantly better than traditional 3-channel cameras, both spectrally and colorimetrically. Moreover, the stereo configuration allows stereo 3D images to be acquired simultaneously with the multispectral image, which could be an interesting direction for further work.