HALS-based NMF with flexible constraints for hyperspectral unmixing
EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 54 (2012)
Abstract
In this article, the hyperspectral unmixing problem is solved with the nonnegative matrix factorization (NMF) algorithm. The regularized criterion is minimized with a hierarchical alternating least squares (HALS) scheme. Under the HALS framework, four constraints are introduced to improve the unmixing accuracy: the sum-to-unity constraint, the constraints for minimum spectral dispersion and maximum spatial dispersion, and the minimum volume constraint. The derived algorithm is called FNMF, for NMF with flexible constraints. We experimentally compare FNMF with different constraints and combined ones. We test the sensitivity and robustness of FNMF to many parameters, such as the purity level of the endmembers, the number of endmembers and pixels, the SNR, the sparsity level of the abundances, and the overestimation of the number of endmembers. The proposed algorithm improves the results estimated by vertex component analysis. A comparative analysis on real data is included. The unmixing results given by a geometrical method, the simplex identification via split augmented Lagrangian, and the FNMF algorithms with combined constraints are compared, which shows the relative stability of FNMF.
1. Introduction
Airborne hyperspectral sensors collect images in hundreds of narrow and contiguous spectral bands. Due to the limited spatial resolution of a hyperspectral image (HSI), each observed pixel generally contains more than one material spectral signature. Hence, hyperspectral unmixing, which decomposes a mixed pixel into a combination of pure material spectra known as endmembers, weighted by their corresponding abundance coefficients, is a challenging task.
Let R (L × I) be the unfolded HSI matrix, whose I columns are the spectral pixels and whose L rows are the vectorized spectral band images. With N the related noise matrix, the linear spectral mixing model (LSMM) can be written as
$$\mathbf{R}=\mathbf{A}\mathbf{S}+\mathbf{N}$$ (1)
The rows of S (J × I) are the abundance maps corresponding to the respective endmembers, whose spectra are located in the columns of A (L × J). J denotes the number of endmembers.
Basically, hyperspectral unmixing is a problem of blind source separation (BSS). However, compared with most BSS applications, the endmembers of HSI data are dependent and the elements of A and S are nonnegative, so hyperspectral unmixing is beyond the reach of many BSS algorithms (e.g., independent component analysis, ICA) [1]. To fulfill these constraints, numerous specialized algorithms have been proposed to solve the hyperspectral unmixing problem under the LSMM assumption, including approaches based on convex geometry, Bayesian source separation, and nonnegative matrix factorization (NMF). The geometrical approaches first determine the endmembers and estimate the abundances in a second step, while the BSS- and NMF-based approaches find the endmembers and the abundances simultaneously.
Geometrical approaches try to determine the vertices of the J-simplex enclosing the observed pixels; examples are the pixel purity index (PPI) [2], N-FINDR [3], and vertex component analysis (VCA) [4]. The PPI algorithm projects every spectral vector onto skewers (a large number of random vectors). The points corresponding to extremes, for each skewer direction, are stored and cumulated, and the pixels with the highest scores are the purest ones. N-FINDR finds the set of pixels defining the largest volume within the data. VCA iteratively projects the data onto a direction orthogonal to the subspace spanned by the endmembers already determined, and each new endmember corresponds to the extreme of the projection. The issue with these approaches is to find extreme points within the data under the assumption of one pure pixel per endmember, which is often not satisfied for real hyperspectral data. Recently, the state-of-the-art reference algorithms MVSA [5], MVES [6], and the simplex identification via split augmented Lagrangian (SISAL) [7] have proposed various ways to find a minimum volume simplex, showing very good performances in the estimation of endmembers. In particular, SISAL is able to unmix HSI data in the absence of pure pixels.
The geometrical approaches do not work well when the observed data are highly mixed, because there are not enough vectors in the simplex facets. In such cases, the separation problem can be addressed in a Bayesian framework. Several Bayesian positive source separation (BPSS) algorithms under positivity and sum-to-one constraints have recently been developed [8–10]. In [10], a discussion on the effectiveness of the sum-to-one constraint is given, showing that the fully constrained BPSS2 gives better results than BPSS for simulated data, while it is the contrary for the real OMEGA data, "due to nonlinearity in the radiative transfer and noise in the dataset in contradiction with the full additivity constraint". We expect the proposed NMF-based algorithm to behave differently: first, because full additivity is enforced as a soft rather than a hard constraint, and second, because the residual error (RQE) is able to absorb measurement or model noise, so the algorithm remains quite robust on real data, which can contain nonlinearly mixed terms. This can be seen by comparing the results of a geometrical algorithm such as SISAL, which performs very well on simulated data but drops dramatically on the real Cuprite data, while the NMF-based algorithms keep performing.
In the last decade, NMF has been a popular algorithm since Lee and Seung [11] investigated its properties and published some simple and useful algorithms for two types of factorizations. The NMF algorithm has been broadly used in text mining, image analysis, speech processing, and automatic control. The basic NMF problem consists of finding two nonnegative matrices whose product approximates the mixed data in a chosen measure sense (e.g., the reconstruction quadratic error, RQE). However, the solution to NMF is not unique, so various regularizations with prior knowledge should be taken into account to reduce the number of solutions. The sum-to-unity (STU) constraint is proposed in [12], which regularizes the RQE with a function of S that normalizes its columns. The authors of [13] propose constraints based on two inherent characteristics of hyperspectral data: spectral piecewise smoothness and spatial sparseness. In [14], a minimum volume constrained NMF (MVC-NMF) based on the projected gradient (PG) optimization method is proposed, whose regularization term minimizes the simplex volume spanned by the endmembers. Other authors [15] propose a minimum distance constrained NMF (MDC-NMF), which considers the endmember distances instead of the volume of the estimated simplex. MDC-NMF makes a slight modification of the optimization algorithm used for MVC-NMF. The MiniDisCo algorithm makes the assumption of minimum spectral dispersion for NMF regularization [16], and MDMD-NMF regularizes with minimum spectral dispersion and maximum spatial dispersion [17]. A new step-size estimation technique is proposed for these two algorithms to hasten the PG convergence.
The optimization algorithms and the constraints on A and S are two main techniques for NMF-based hyperspectral unmixing. The authors of [18] propose a flexible hierarchical alternating least squares (HALS) algorithm with a set of local cost functions called alpha and beta divergences; there, the word "flexible" refers to the variation of the optimization algorithm. In this article, we propose an improved NMF algorithm with four constraints derived from the characteristics of HSIs, called the flexible NMF (FNMF); here, the word "flexible" refers to the variation of the constraints on A and S. FNMF also uses the HALS update rules, which significantly outperform the PG update rules in convergence speed. The novelty is thus both the combination of the constraints and their development within a HALS-based algorithm.
The rest of the article is organized as follows: Section 2 presents the basic NMF algorithm and the HALS update rules. In Section 3, we introduce four constraint functions and integrate them into the FNMF algorithm. In Section 4, the comparison and analysis of FNMF with different constraints are given by processing various simulated HSIs. The algorithms are applied to real data in Section 5; the FNMF algorithms are compared with SISAL, since both algorithms are able to unmix hyperspectral data in which the pure pixel assumption is violated. Finally, Section 6 concludes the article.
2. NMF for hyperspectral unmixing
In this section, we first present the NMF problem and then the optimization algorithm used to solve it in this article.
2.1. NMF problem
The aim of basic NMF methods is to find two estimated matrices $\widehat{\mathbf{A}}$ and $\widehat{\mathbf{S}}$ such that
$$\mathbf{R}\approx \widehat{\mathbf{A}}\widehat{\mathbf{S}}$$ (2)
A commonly used theoretical solution is to find nonnegative matrices minimizing the RQE
$$f\left(\mathbf{A},\mathbf{S}\right)=\frac{1}{2}{\|\mathbf{R}-\mathbf{A}\mathbf{S}\|}_{F}^{2}$$ (3)
where ${\|\cdot\|}_{F}$ is the Frobenius (i.e., quadratic) norm.
2.2. HALS algorithms
In [19], the authors show that the HALS scheme works remarkably well in practice, outperforming, in most cases, the other optimization algorithms for NMF. In particular, it is proved to be locally more efficient [20] and shown to converge to a stationary point under some mild assumptions [21]. For these reasons, we choose HALS as the optimization technique.
The basic idea of HALS is to define residues as
$$\mathbf{R}_{k}=\mathbf{R}-\sum_{j\ne k}\mathbf{A}_{j}\mathbf{S}_{j}$$ (4)
for k = 1, 2,..., J, where A_{ k } (L × 1) is one endmember spectrum and S_{ k } (1 × I) corresponds to its abundance fractions.
By substituting Equation (4) into (3), the new RQE function is
$$f\left(\mathbf{A}_{k},\mathbf{S}_{k}\right)=\frac{1}{2}{\|\mathbf{R}_{k}-\mathbf{A}_{k}\mathbf{S}_{k}\|}_{F}^{2}$$ (5)
The gradients of the above function are expressed by
$$\frac{\partial f}{\partial \mathbf{A}_{k}}=-\left(\mathbf{R}_{k}-\mathbf{A}_{k}\mathbf{S}_{k}\right)\mathbf{S}_{k}^{T},\qquad \frac{\partial f}{\partial \mathbf{S}_{k}}=-\mathbf{A}_{k}^{T}\left(\mathbf{R}_{k}-\mathbf{A}_{k}\mathbf{S}_{k}\right)$$ (6)
By setting the above gradients to zero, the updating rules are obtained:
$$\mathbf{A}_{k}\leftarrow \frac{{\left[\mathbf{R}_{k}\mathbf{S}_{k}^{T}\right]}_{+}}{\mathbf{S}_{k}\mathbf{S}_{k}^{T}},\qquad \mathbf{S}_{k}\leftarrow {\left[\frac{\mathbf{A}_{k}^{T}\mathbf{R}_{k}}{\mathbf{A}_{k}^{T}\mathbf{A}_{k}}\right]}_{\left[0,1\right]}$$ (7)
for k = 1, 2,..., J, where ${\left[\cdot\right]}_{+}$ keeps the entries nonnegative and ${\left[\delta\right]}_{\left[0,1\right]}$ enforces every element δ_{ ij } to lie in [0,1].
Clearly, the HALS algorithm is boundconstrained. It is also shown that the optimal value of each entry of A (A_{ k }) does not depend on the other entries of the same column. By symmetry, the same property holds for each row of S (S_{ k }). Thus, the detailed HALS algorithm is summarized as follows:

(1)
Initialize A and S with the VCA algorithm;

(2)
for i = 1, 2,..., do
for k = 1, 2,..., J
Update A_{ k } and S_{ k } with the HALS update rules;
end
until the stopping criterion is reached
The simplest update rules are given in Equation (7), and the regularized f with all constraints will be proposed in Equation (18). The maximum number of iterations is always set high (e.g., 2000) to obtain accurate estimations. However, overestimating the iteration number wastes time. Indeed, from a certain iteration on, the RQE value slightly increases whereas the regularized f keeps decreasing. Thus, the algorithm is stopped at the iteration where the RQE reaches a minimum, even if the maximum iteration number is not reached. The stopping criterion is expressed as
$$\mathsf{\text{RQE}}^{\left(i+1\right)}>\mathsf{\text{RQE}}^{\left(i\right)}$$
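As an illustration, here is a minimal NumPy sketch of the plain HALS loop summarized above (the unconstrained case), assuming the standard closed-form updates with a nonnegativity projection for A_{ k }, a clipping of S_{ k } to [0,1], and the RQE-based early stop; the function and variable names are ours, not the authors'.

```python
import numpy as np

def hals_unmix(R, J, n_iter=200, eps=1e-12, seed=0):
    """Plain HALS-NMF sketch: R (L x I) is approximated by A (L x J) @ S (J x I).

    A is kept nonnegative, S is clipped to [0, 1], and the loop stops
    early as soon as the RQE starts to increase, as described in the text.
    """
    rng = np.random.default_rng(seed)
    L, I = R.shape
    A = rng.random((L, J))
    S = rng.random((J, I))
    prev_rqe = np.inf
    for _ in range(n_iter):
        for k in range(J):
            # Residue R_k: the data minus the contribution of all sources j != k
            Rk = R - A @ S + np.outer(A[:, k], S[k, :])
            # Closed-form least-squares updates, followed by projection
            A[:, k] = np.maximum(Rk @ S[k, :] / (S[k, :] @ S[k, :] + eps), 0.0)
            S[k, :] = np.clip(A[:, k] @ Rk / (A[:, k] @ A[:, k] + eps), 0.0, 1.0)
        rqe = np.linalg.norm(R - A @ S, 'fro') ** 2
        if rqe > prev_rqe:  # stopping criterion: RQE starts to increase
            break
        prev_rqe = rqe
    return A, S
```

Each sub-update is the exact minimizer of the quadratic cost over one column of A or one row of S, which is why the residue R_{ k } can be recomputed cheaply and the objective decreases monotonically.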
3. NMF with flexible constraints
The basic NMF cost function only ensures that the two matrices A and S are nonnegative. Since the NMF solution is not unique, some prior knowledge on HSIs can be introduced to regularize the problem. A generic expression for the optimized function is
where σ, α, and β are regularization parameters weighting the estimation error and the spectral and abundance constraints, respectively. D(A,S) measures the difference between R and AS with respect to some norm. By substituting Equation (4) into (8) and using the RQE norm, the new optimized function f is
In this section, we add four constraints for A and S to the function to improve the unmixing result. With all these constraints, the algorithm is called flexible NMF (FNMF), based on HALS update rules.
3.1. STU constraint
The STU constraint makes the sums of the columns of S equal to 1. The STU constraint is defined as follows:
where $\mathbf{1}_{1I}$ is a (1 × I) vector of ones. The gradient of D_{1} with respect to S_{ k } is
3.2. Maximum spatial dispersion constraint
In real situations, the abundance matrix is often very sparse because the materials are mostly grouped in separate regions. We note that reducing the volume of the data-enclosing simplex is equivalent to increasing the dispersion of the abundance fractions in the sum-to-one constrained subspace enclosing the abundances. Indeed, the least likely situation is uniformly mixed data. Therefore, as the mean value of the abundances is 1/J, we define the maximum spatial dispersion constraint as follows:
This constraint encourages null abundances and full pixels: in real scenes, not all the endmembers are present in every pixel and, in contrast, some pixels contain only one material. The gradient of D_{2} with respect to S_{ k } is
3.3. Minimum spectral dispersion constraint
This constraint function depends on A and encourages the variance of each endmember spectrum to be as low as possible. This dispersion constraint improves the shape estimation of flat endmember spectra. Consequently, if the estimation of some spectra is improved, the estimation of the other spectra involved in the mixture is also indirectly improved, due to the parameter interdependences. We define the minimum spectral dispersion constraint as
The gradient derivation of D_{1} with respect to A_{ k } is
3.4. Minimum distance constraint
In MVC-NMF [14], the volume of the simplex spanned by A is used as the constraint, which suffers from numerical instabilities [15]. Here, we choose the minimum distance constraint as a substitute in order to shrink the volume of the data-enclosing simplex. The distances from every endmember to their centroid are measured and summed up. This constraint is defined as
The gradient derivation of D_{2} with respect to A_{ k } is
The final FNMF update rules to minimize f with all these considerations are derived from (6), (9), (11), (13), (15), and (17). Thus
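To illustrate how a penalty reshapes the closed-form HALS updates, the sketch below works out the S_{ k } rule for the STU term alone, assuming the quadratic penalty (β/2)‖1_{1I} − 1_{1J}S‖²_F (an assumed form consistent with Section 3.1; the text only gives its gradient). Setting the penalized gradient to zero yields a numerator augmented by β(1 − Σ_{ j≠k } S_{ j }) and a denominator augmented by β.

```python
import numpy as np

def hals_update_stu(R, A, S, beta=1.0, eps=1e-12):
    """One sweep of HALS updates for S with a sum-to-unity penalty.

    Assumed regularizer: (beta/2) * ||1_{1I} - 1_{1J} S||_F^2.
    Zeroing the penalized gradient of the cost with respect to S_k gives
    (A_k^T A_k + beta) S_k = A_k^T R_k + beta * (1 - sum_{j != k} S_j),
    then the result is clipped to [0, 1] as in the text.
    """
    J, I = S.shape
    for k in range(J):
        # Residue without the contribution of source k
        Rk = R - A @ S + np.outer(A[:, k], S[k, :])
        others = S.sum(axis=0) - S[k, :]          # sum over j != k
        num = A[:, k] @ Rk + beta * (1.0 - others)
        den = A[:, k] @ A[:, k] + beta + eps
        S[k, :] = np.clip(num / den, 0.0, 1.0)
    return S
```

As β grows, the update increasingly pulls each column sum of S toward 1, without ever enforcing it exactly, which is the soft-constraint behavior discussed in the Introduction.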
4. Simulations on synthetic data
In this section, we present a batch of simulations to quantitatively compare the FNMF algorithms with different constraints. First, we present the used evaluation metrics. Then, we present the way we build simulated data. Finally, the experimental results of five FNMF algorithms are given.
4.1. Evaluation metrics

(1)
To evaluate the abundance estimation, we define the abundance mean squared error (AME) as
$$\mathsf{\text{AME}}\left(\widehat{\mathbf{S}},\mathbf{S}\right)=\frac{1}{JI}{\|\widehat{\mathbf{S}}-\mathbf{S}\|}_{F}^{2}$$ (19)
(2)
To evaluate the endmember spectra estimation, we define the spectral mean squared error (SME) as
$$\mathsf{\text{SME}}\left(\widehat{\mathbf{A}},\mathbf{A}\right)=\frac{1}{LJ}{\|\widehat{\mathbf{A}}-\mathbf{A}\|}_{F}^{2}$$ (20)
(3)
To consider the global shape of the spectra, the spectral angle distance (SAD) is defined as
$$\mathsf{\text{SAD}}\left(\widehat{\mathbf{a}},\mathbf{a}\right)={\cos}^{-1}\left(\frac{{\mathbf{a}}^{T}\widehat{\mathbf{a}}}{\sqrt{{\widehat{\mathbf{a}}}^{T}\widehat{\mathbf{a}}}\sqrt{{\mathbf{a}}^{T}\mathbf{a}}}\right)$$ (21)
where a is the true spectral vector and $\widehat{\mathbf{a}}$ is its estimate.
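The three metrics translate directly into code; a small NumPy version, with function names of our choosing, could read:

```python
import numpy as np

def ame(S_hat, S):
    """Abundance mean squared error, Eq. (19): averaged over the J x I entries."""
    J, I = S.shape
    return np.linalg.norm(S_hat - S, 'fro') ** 2 / (J * I)

def sme(A_hat, A):
    """Spectral mean squared error, Eq. (20): averaged over the L x J entries."""
    L, J = A.shape
    return np.linalg.norm(A_hat - A, 'fro') ** 2 / (L * J)

def sad(a_hat, a):
    """Spectral angle distance, Eq. (21), between one spectrum and its estimate."""
    cos_angle = (a @ a_hat) / (np.linalg.norm(a_hat) * np.linalg.norm(a))
    # Clip to guard against floating-point values slightly outside [-1, 1]
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))
```

Note that SAD is invariant to a positive scale factor on the spectra, whereas AME and SME are not, which is why the three metrics are complementary.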
4.2. Synthetic data
The HSI synthesis process is in three steps corresponding to the matrices A, S, and the noise matrix N.
First, the J endmember spectra are randomly selected among the U.S. Geological Survey (USGS) spectral library. The selected 224channel spectra constitute the columns of the matrix A.
Then, each J-element column vector of S is generated following a Dirichlet pdf with parameters equal to 1. The maximal element value of each column is controlled by a threshold ξ (0 < ξ ≤ 1). This operation allows one to control the mixing, or purity, level of the data; in particular, the image can contain "pure" pixels when ξ = 1. We also introduce a sparsity parameter ι (ι > 0), which controls the sparsity of S. If ι is set to 0.8, 20% of the J × I elements of S are randomly selected and set to zero first; the nonzero elements of each column vector of S are then generated following the Dirichlet pdf with the STU constraint and the maximal threshold ξ.
Finally, we add a noise matrix N, assumed to be zero-mean white Gaussian. The noise is characterized by the SNR
$$\mathsf{\text{SNR}}=10\,{\log}_{10}\frac{{\|\mathbf{A}\mathbf{S}\|}_{F}^{2}}{LI{\sigma}^{2}}$$
where σ^{2} is its variance.
Therefore, a synthetic HSI is characterized by J, the randomly selected endmember spectra, I, ξ, ι, and the SNR. The default configuration is given in Table 1.
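The synthesis recipe above can be sketched as follows. Random smooth spectra stand in for the USGS library (which is not bundled here), the sparsity mask is drawn per entry as an approximation of the global random selection described above, and the SNR definition 10 log10(‖AS‖²_F / ‖N‖²_F) is our assumption:

```python
import numpy as np

def make_hsi(J=5, L=224, I=1000, xi=0.8, iota=0.8, snr_db=30, seed=0):
    """Synthesize an HSI R = A S + N following the three-step recipe.

    Columns of S are Dirichlet(1) samples redrawn until their maximum is
    below the purity threshold xi; roughly a fraction (1 - iota) of the
    entries is zeroed beforehand to control sparsity. Noise is white
    Gaussian, scaled to the target SNR (assumed Frobenius-norm ratio).
    """
    rng = np.random.default_rng(seed)
    # Step 1: stand-in endmember spectra (nonnegative, smooth-ish)
    A = np.abs(np.cumsum(rng.standard_normal((L, J)), axis=0))
    # Step 2: sparse, sum-to-one abundance columns with purity control
    S = np.zeros((J, I))
    for i in range(I):
        mask = rng.random(J) < iota            # entries allowed to be nonzero
        if not mask.any():
            mask[rng.integers(J)] = True       # keep at least one material
        while True:
            s = rng.dirichlet(np.ones(mask.sum()))
            if s.max() <= xi or mask.sum() == 1:
                break
        S[mask, i] = s
    # Step 3: white Gaussian noise scaled to the target SNR
    X = A @ S
    noise = rng.standard_normal(X.shape)
    noise *= np.linalg.norm(X) / (np.linalg.norm(noise) * 10 ** (snr_db / 20))
    return X + noise, A, S
```

A call such as `make_hsi(J=5, I=6400, xi=0.8, iota=0.8, snr_db=30)` reproduces a configuration of the kind summarized in Table 1, up to the stand-in spectra.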
4.3. Compared algorithms
In our simulations, we compare the FNMF algorithms with different constraints and the typical geometrical and Bayesian algorithms. All the algorithms are used with the same initial and stop conditions.

(1)
F1NMF: the basic HALS-NMF with no extra constraint; only the nonnegativity constraints are guaranteed.

(2)
F2NMF: the HALS-NMF improved with the STU constraint.

(3)
F3NMF: the HALS-NMF with the STU and maximum spatial dispersion constraints.

(4)
F4NMF: the HALS-NMF with the STU and minimum spectral dispersion constraints.

(5)
F5NMF: the HALS-NMF with the STU and minimum distance constraints.

(6)
F35NMF: the HALS-NMF with the combined constraints of F3NMF and F5NMF.

(7)
VCA: a popular geometrical algorithm proposed in [4].

(8)
BPSS2: an improved Bayesian algorithm addressed in [9] under nonnegativity and full additivity constraints.

(9)
MiniDisCo: a novel NMFbased algorithm with spectral constraint given in [16].
Note that the initializations of A and S for all the algorithms are chosen from a uniform distribution on the interval [0,1].
4.4. Simulations
The first simulation shows the behavior of the objective function f along the iterations of the two optimization algorithms. Experiments 2-7 present statistical simulations comparing the average behaviors of the five FNMF algorithms while varying the parameters given in Table 1, as well as the robustness to an overestimation of the endmember number J.

(1)
The first experiment assesses the choice of the optimization algorithm. We compare the convergence efficiency of PG, which is widely used for NMF optimization, and of the HALS algorithm. Here, the PG and HALS algorithms are both regularized with the minimum spectral dispersion constraint. The PG-based algorithm is named MiniDisCo in [16], and the HALS-based algorithm in this experiment corresponds to F4NMF as defined above. The f value is calculated with the corresponding constraints and the performances of the two estimators are presented in Figure 1. Note that both curves result from the same HSI, with the same random initial conditions; thus, the only variability is the optimization method. We note that the final value of f is almost the same with both algorithms, whereas HALS converges faster.
The following experiments 2-7 present the behaviors of the FNMF algorithms with different constraints while varying the parameters summed up in Table 1. The unmixing results are evaluated by AME, SME, and SAD. The presented results are averages (bars) and standard deviations (error bars) over 20 experiments. Note that all the considered algorithms are compared on the same sets of 20 HSIs. We performed preliminary Monte Carlo simulations to find relevant values for the regularization parameters. The retained values are α_{1} = 1, α_{2} = 0.1, β_{1} = 0.1, and β_{2} = 0.1, chosen to minimize the average evaluation errors on synthetic data.

(2)
In the second experiment, the algorithms are compared when the number of endmembers J varies.
In this experiment, we first test the efficiency of the algorithms. The FNMF algorithms are compared with the PG-NMF with no constraint except positivity, as in F1NMF, and J is set from 3 to 10 as in the experiment of [16]. The SME performance metrics are shown in Figure 2. Note that the considered statistics do not necessarily include each of the results: here, SME values higher than 0.5 are not included. In particular, the PG-NMF results are never considered while the FNMF results are all included, because the SME values of PG-NMF are always greater than 0.5. With NMF algorithms, only a local minimum can be attained in general; in the case of random initializations and no constraints, HALS is able to reach a better solution than PG.
Then, we set J up to 20 to test the performance of FNMF. The performance metrics are shown in Figure 3. Note that F1NMF, without constraints, performs worse as the number of endmembers increases. In the case of the constrained FNMF (F2, F3, F4, F5, F35), the results are much better. Figure 3 puts forward the high robustness of the constrained FNMF algorithms, whereas the basic FNMF is sensitive to the number of endmembers. The combination of the constraints of F3 and F5, F35, gives good results.

(3)
The purity level ξ is the topic of the third experiment. None of the considered algorithms is based on the hypothesis of one pure pixel per endmember, but the unmixing performance may vary with the purity. The obtained performance metrics are presented in Figure 4. F3NMF is notably worse in terms of AME when ξ = 0.6, because the low purity level makes the maximum spatial dispersion constraint ineffective. F35NMF also performs worse in terms of AME due to the same constraint.
Figure 5 shows a comparison of the proposed F35NMF algorithm with a geometrical method (VCA), a BSS algorithm (BPSS2), and another NMF-based algorithm (MiniDisCo), as ξ varies. The two NMF-based algorithms and BPSS2 are each initialized with VCA. The parameter of the spectral constraint is 0.1 for MiniDisCo. VCA performs better at higher purity levels due to its pure pixel assumption. MiniDisCo and F35NMF both improve the unmixing results of VCA. Specifically, MiniDisCo and BPSS2 outperform F35NMF in the sense of AME, but the result is quite the reverse in the sense of SME, which is caused by the different constraints in MiniDisCo and F35NMF. In the sense of AME, F35NMF performs worse as the purity level decreases, because the algorithm is regularized by the maximum spatial dispersion constraint, which improves the AME values for data with a high purity level. This is confirmed by the results shown in Figure 4a: the algorithms with the maximum spatial dispersion constraint (F3NMF and F35NMF) give worse results than the other algorithms (F4NMF and F5NMF). We choose F35NMF for the comparison due to its better performances in SME and SAD. In the sense of SAD, MiniDisCo is better than F35NMF at higher purity levels, but F35NMF performs better when the purity level is lower than 0.7. The performance of BPSS2 is always worse. This may result from the minimum distance constraint in F35NMF, which plays an important role in the unmixing of highly mixed data.
As can be noted, the results vary across the metrics, i.e., some algorithms can be efficient for spectral estimation and not for abundances, and vice versa. We have chosen to keep the three metrics for the complementary information they bring. A small SAD indicates very similar spectral shapes and is not sensitive to a scale factor, while SME also depends on the values and is sensitive to a scale factor. For abundances, only the values are relevant, because the sum-to-one constraint sets the scale factor. In one sense, SAD is the most meaningful metric, as it is used to identify endmembers from spectral libraries.

(4)
The fourth experiment studies the robustness to noise of the considered algorithms. The metric values obtained for various SNRs are shown in Figure 6. The FNMF algorithms are all based on the RQE minimization, which is optimal for white Gaussian noise. Thus, the performances do not significantly depend on the noise. In accordance with experiment 3, the F3NMF and F35NMF results are not good in terms of AME, but better in terms of SME and SAD.

(5)
It is interesting to study the estimation quality in terms of the data spatial dimensions. Figure 7 presents the influence of the number of observed spectral pixels. The FNMF algorithms are robust both to a small number of spectral pixels and to a large amount of data. It is interesting to see that a small number of spectral pixels globally improves the performances of the regularized NMF. F4NMF and F5NMF outperform the other algorithms in AME, but the results of the F3NMF and F35NMF algorithms are better in terms of SME and SAD. In general, a large data set does not improve the results, so it is more efficient to use a small set of data (400 pixels).

(6)
This experiment tests the influence of the sparsity parameter ι. The results are presented in Figure 8. None of the algorithms is very sensitive to the sparsity parameter. F4NMF and F5NMF outperform the other algorithms in AME, and the maximum spatial dispersion constraint brings improvement in SME and SAD.

(7)
Estimating the endmember number J is the first task of HSI analysis. On real data, existing methods to estimate J generally overestimate it [22]. Thus, we study the robustness of the algorithms to an overestimation of J (Figure 9); here, we overestimate J by 1. The estimation errors show that the constrained FNMF algorithms are robust to an overestimation of J, while the basic FNMF is sensitive to the number of endmembers.
The following conclusions can be drawn from these experiments:

(1)
The HALS optimization algorithm outperforms PG in convergence speed and efficiency. In [16], poor estimations due to local minima affect the basic PG-NMF, so the estimated SME values higher than 0.5 are not included in the statistics. With FNMF, the estimation performance is much better, so all the results of the experiments are considered.

(2)
The performances of constrained FNMF are better than the basic NMF, according to all the parameters (J, ξ, SNR, I, and ι) and the different performance metrics.

(3)
The NMF algorithms with the minimum spectral dispersion constraint (F4NMF) and the minimum distance constraint (F5NMF) perform better in AME, while the algorithms with the maximum spatial dispersion constraint (F3NMF and F35NMF) outperform the other algorithms in terms of SME and SAD.

(4)
The NMF algorithm with combined constraints (F35NMF) performs better than the algorithm with one constraint (F3NMF).

(5)
FNMF algorithm can improve the unmixing results initialized by VCA.
5. Application on real hyperspectral data
We have applied the five FNMF algorithms to a hyperspectral scene captured by the AVIRIS sensor. This sensor has a 20-m spatial resolution and a 10-nm spectral resolution, and acquires 224 spectral bands between 0.4 and 2.5 μm. The analyzed reflectance image is a 99 × 99 pixel selection of the Cuprite geological data. An RGB representation of the scene is shown in Figure 10. Some spectral bands have been removed due to noise corruption and atmospheric absorption, and only the data of the remaining 188 bands have been used. In this section, we choose SISAL as the compared algorithm because, like the NMF algorithms, it is able to deal with the unmixing problem without the pure pixel assumption.
It is required to estimate the number of endmembers J before unmixing the image. In this article, the number of endmembers is determined from the final RQE obtained after convergence in many preliminary experiments, and is set to J = 11; however, this value is only an approximation.
To improve the algorithm performances, we run the five FNMF algorithms with VCA initializations [4] to obtain better local solutions. The estimated endmembers are associated with the closest ones contained in the USGS library in the SAD sense. To evaluate the stability of the algorithms and their ability to find a unique solution, we make 50 runs of each FNMF algorithm and keep the 11 estimated endmembers of each run. Ideally, we should obtain the same 11 identified references in each experiment; however, the results vary over the 50 experiments. In order to compare the results with a minimum volume-based algorithm, we choose SISAL for its good performances on simulated data and its high speed. Note that neither the FNMF algorithms nor SISAL require the endmembers, or at least some of them, to be present in the data set. The references identified by FNMF are presented in Tables 2, 3, 4, 5, 6, and 7 and the SISAL results in Table 8. The estimated endmembers are identified as the closest library spectra in the sense of SAD. It can be seen from the tables that F3NMF gives 77 names for a total of 550 possible different answers, whereas the other four FNMFs give many more references. The top 11 responses of F3NMF and F35NMF represent 66.7 and 68%, respectively, of all the answers. All these results show the stability of F3NMF, due to the maximum spatial dispersion constraint. From Table 8, we can see that SISAL identifies 146 names out of the 550 possible answers, which reveals a serious instability. Therefore, the FNMF algorithms are more stable than SISAL.
Moreover, the mean SAD between the estimated endmembers and the closest references in the library is significantly lower with FNMF than with SISAL, so we can conclude that the FNMF algorithms are more efficient at endmember identification in difficult real cases.
Figures 11 and 12 give the endmember spectra and abundance maps estimated by F35NMF in one experiment. The endmember spectra resulting from the F35NMF analysis are shown in Figure 11a. In this figure, the y-coordinate tick (from j = 1 to J) corresponds to zero reflectance of the j th endmember. The associated spectral endmembers are the closest library spectra (Figure 11b) in the sense of SAD. Note that spectra 3, 7, 8, and 10 are all identified as Kaolin/Smect KLF508 85%K, which, with a proportion of 26%, ranks first in Table 7. This explains the low identification dispersion of F35NMF over the 50 runs. The estimated abundance maps are given in Figure 12, where the maximum abundance value ξ_{ j } of each endmember j is high due to the maximum spatial dispersion constraint.
We compare the references identified by FNMF and SISAL with the available ground truth of the Cuprite scene from the website [23]. In Tables 2, 3, 4, 5, 6, 7, and 8, the identified results that appear in the ground-truth list are highlighted. Each of the considered algorithms can only identify two or three ground-truth minerals. In particular, the FNMF and SISAL algorithms all detect Kaolin, and Goethite is detected by F1NMF, F2NMF, F4NMF, and F5NMF. In addition, Nontronite is detected by F2NMF, F4NMF, and F5NMF, while F35NMF detects Montmorillonite and SISAL detects Hematite. These identification results illustrate the difficulty of the unmixing problem for real data. Several reasons can explain this:

(1)
The analyzed Cuprite data are only a selection of the whole scene, which holds 18 endmembers; thus, the unmixing results are also incomplete.

(2)
It is difficult to find the right spectra in the considered library among a huge number of references (500). Some prior knowledge should be used to reduce the number of references before the comparison.

(3)
We use a linear mixing model in this article, but the radiative transfer is always nonlinear in real scenes [10].

(4)
Identifying the endmembers with SAD alone is subjective. A more robust identification method should make the decision jointly from several criteria. Moreover, the variability of real spectra makes their identification from the library more difficult.
Finally, it is important to analyze the computation time of the FNMF algorithms. Under the Matlab environment with a 3 GHz CPU, the computation times for one iteration of the FNMF algorithms on the real data (99 × 99 pixels) are shown in Table 9. It is clear that the algorithms with spectral constraints (F4, F5, F35) are more time-consuming due to the computation of a matrix inversion. If the number of iterations exceeds a thousand, running any FNMF algorithm costs a few minutes. In terms of computation cost, geometrical methods (e.g., VCA and SISAL) are fast and efficient, while NMF-based methods are always slower.
6. Conclusion
In this article, we have proposed an NMF-based hyperspectral unmixing algorithm with flexible constraints, including the STU constraint, the maximum spatial dispersion constraint, the minimum spectral dispersion constraint, and the minimum distance constraint. The optimization scheme is based on HALS, whose convergence speed outperforms that of PG. The resulting algorithm, called FNMF, is experimentally tested with different constraints. The estimation accuracy shows that FNMF works stably in all experiments, overcoming the estimation instability of PG-NMF. In particular, the FNMF algorithms are robust to a high number of endmembers, a low SNR, a small number of observed pixels, and an overestimation of the number of endmembers.
The FNMF algorithms appear effective in the estimation of abundance maps, since they consider the STU and maximum spatial dispersion constraints. The references identified on real data by FNMF seem more stable and reliable than those of geometrical methods such as SISAL. However, the identification results on real data remain unsatisfactory, so the identification method needs further investigation.
References
 1.
Nascimento JMP, Dias JMB: Does independent component analysis play a role in unmixing hyperspectral data? IEEE Trans Geosci Remote Sens 2005, 43(1):175-187.
 2.
Theiler J, Lavenier D, Harvey N, Perkins S, Szymanski J: Using blocks of skewers for faster computation of pixel purity index. Volume 4132. Proc. of the SPIE International Conference on Optical Science and Technology, Bellingham, WA; 2000:61-71.
 3.
Winter ME: N-FINDR: an algorithm for fast autonomous spectral endmember determination in hyperspectral data. In Proc SPIE Conf Imaging Spectrometry V. Volume 3753. Denver, Colorado; 1999:266-275.
 4.
Nascimento J, Dias J: Vertex component analysis: a fast algorithm to unmix hyperspectral data. IEEE Trans Geosci Remote Sens 2005, 43(4):898-908.
 5.
Li J, Bioucas-Dias JM: Minimum volume simplex analysis: a fast algorithm to unmix hyperspectral data. Proc IEEE IGARSS, Boston 2008, 3: 250-253.
 6.
Chan TH, Chi CY, Huang YM, Ma WK: A convex analysis-based minimum-volume enclosing simplex algorithm for hyperspectral unmixing. IEEE Trans Signal Process 2009, 57(11):4418-4432.
 7.
Bioucas-Dias JM: A variable splitting augmented Lagrangian approach to linear spectral unmixing. In Proc WHISPERS. Volume 1. Grenoble, France; 2009:1-4.
 8.
Moussaoui S, Brie D, Mohammad-Djafari A, Carteret C: Separation of nonnegative mixture of nonnegative sources using a Bayesian approach and MCMC sampling. IEEE Trans Signal Process 2006, 54(11):4133-4145.
 9.
Dobigeon N, Moussaoui S, Tourneret JY, Carteret C: Bayesian separation of spectral sources under nonnegativity and full additivity constraints. Signal Process 2009, 89(12):2657-2669. 10.1016/j.sigpro.2009.05.005
 10.
Schmidt F, Schmidt A, Tréguier E, Guiheneuf M, Moussaoui S, Dobigeon N: Implementation strategies for hyperspectral unmixing using Bayesian source separation. IEEE Trans Geosci Remote Sens 2010, 48(11):4003-4013.
 11.
Lee DD, Seung HS: Learning the parts of objects by nonnegative matrix factorization. Nature 1999, 401(6755):788-791. 10.1038/44565
 12.
Heinz DC, Chang CI: Fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectral imagery. IEEE Trans Geosci Remote Sens 2001, 39(3):529-545. 10.1109/36.911111
 13.
Jia S, Qian Y: Constrained nonnegative matrix factorization for hyperspectral unmixing. IEEE Trans Geosci Remote Sens 2009, 47(1):161-173.
 14.
Miao L, Qi H: Endmember extraction from highly mixed data using minimum volume constrained nonnegative matrix factorization. IEEE Trans Geosci Remote Sens 2007, 45(3):765-777.
 15.
Yu Y, Guo S, Sun WD: Minimum distance constrained nonnegative matrix factorization for the endmember extraction of hyperspectral images. Proc SPIE, MIPPR, Wuhan, China 2007, 6790: 1-4.
 16.
Huck A, Guillaume M, Blanc-Talon J: Minimum dispersion constrained nonnegative matrix factorization to unmix hyperspectral data. IEEE Trans Geosci Remote Sens 2010, 48(6):2590-2602.
 17.
Huck A, Guillaume M: Robust hyperspectral data unmixing with spatial and spectral regularized NMF. Proc WHISPERS, Reykjavik, Iceland 2010, 2: 1-4.
 18.
Cichocki A, Phan AH, Caiafa C: Flexible HALS algorithms for sparse nonnegative matrix/tensor factorization. IEEE Workshop on Machine Learning for Signal Processing, Cancun, Mexico 2008, 4(4):73-78.
 19.
Gillis N, Glineur F: Using underapproximations for sparse nonnegative matrix factorization. Pattern Recogn 2010, 43(4):1676-1687. 10.1016/j.patcog.2009.11.013
 20.
Gillis N, Glineur F: Nonnegative factorization and the maximum edge biclique problem. 2008.
 21.
Ho ND: Nonnegative matrix factorization: algorithms and applications. PhD thesis. Université catholique de Louvain; 2008.
 22.
Chang CI, Du Q: Estimation of number of spectrally distinct signal sources in hyperspectral imagery. IEEE Trans Geosci Remote Sens 2004, 42(3):608-619. 10.1109/TGRS.2003.819189
 23.
[http://speclab.cr.usgs.gov/PAPERS/cuprite.gr.truth.1992/swayze.1992.html]
Acknowledgements
The authors would like to thank S. Moussaoui for letting us do the benchmarks with the code of his BPSS algorithms.
Additional information
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Chen, W., Guillaume, M. HALS-based NMF with flexible constraints for hyperspectral unmixing. EURASIP J. Adv. Signal Process. 2012, 54 (2012). https://doi.org/10.1186/1687-6180-2012-54
Keywords
 hyperspectral unmixing
 nonnegative matrix factorization (NMF)
 hierarchical alternating least squares (HALS)
 constraint