HALS-based NMF with flexible constraints for hyperspectral unmixing

In this article, the hyperspectral unmixing problem is solved with the nonnegative matrix factorization (NMF) algorithm. The regularized criterion is minimized with a hierarchical alternating least squares (HALS) scheme. Under the HALS framework, four constraints are introduced to improve the unmixing accuracy, including the sum-to-unity constraint, the constraints for minimum spectral dispersion and maximum spatial dispersion, and the minimum volume constraint. The derived algorithm is called F-NMF, for NMF with flexible constraints. We experimentally compare F-NMF with different constraints and combined ones. We test the sensitivity and robustness of F-NMF to many parameters such as the purity level of endmembers, the number of endmembers and pixels, the SNR, the sparsity level of abundances, and the overestimation of endmembers. The proposed algorithm improves the results estimated by vertex component analysis. A comparative analysis on real data is included. The unmixing results given by a geometrical method, the simplex identification via split augmented Lagrangian and the F-NMF algorithms with combined constraints are compared, which shows the relative stability of F-NMF.


Introduction
Airborne hyperspectral sensors collect images in hundreds of narrow and contiguous spectral bands. Due to the limited spatial resolution of hyperspectral image (HSI), each observed pixel generally contains more than one material spectral signature. Hence, the hyperspectral unmixing, which decomposes a mixed pixel into a combination of pure material spectra known as endmembers, weighted by their corresponding abundance coefficients, is a challenging task.
Let R (L × I) be the matrix unfolded HSI, whose I columns are the spectral pixels and the L rows are the vectorial spectral band images. As N is the related noise matrix, the linear spectral mixing model (LSMM) can be written as The rows of S (J × I) are the abundance maps corresponding to the respective endmembers, whose spectra are located in the columns of A (L × J). J denotes the number of endmembers.
Basically, hyperspectral unmixing is a problem of blind source separation (BSS). However, compared with most BSS applications, the endmembers of HSI data are dependent and the elements in A and S are nonnegative, so the hyperspectral unmixing is beyond the reach of many BSS algorithms (e.g., independent component analysis-ICA) [1]. To fulfill these constraints, numerous special algorithms have been proposed to solve the hyperspectral unmixing problem under the LSMM assumption, including the approaches of convex geometry, Bayesian source separation, and nonnegative matrix factorization (NMF). The geometrical approaches first determine the endmembers and estimate the abundances in a second step, while the BSS and NMF-based approaches find the endmembers and the abundances simultaneously.
Geometrical approaches try to determine the vertices of the J-simplex enclosing the observed pixels, such as pixel purity index (PPI) [2], N-FINDR [3], vertex component analysis (VCA) [4]. The PPI algorithm projects every spectral vector onto skewers (large number of random vectors). The points corresponding to extremes, for each skewer direction, are stored and cumulated. The pixels with the highest scores are the purest ones. N-FINDR finds the set of pixels defining the largest volume within the data. VCA iteratively projects data onto a direction orthogonal and the endmembers correspond to the extreme of the projections. The issue of these approaches is to find extreme points within the data with the assumption of pure pixel of each endmember, which is always unsatisfactory for real hyperspectral data. Recently, the state-of-art reference algorithms MVSA [5], MVES [6], and the simplex identification via split augmented Lagrangian (SISAL) [7] have proposed various ways to find a minimum volume simplex, showing very good performances in the estimation of endmembers. Particularly, SISAL is able to unmix HSI data in the case of no pure pixel.
The geometrical approaches do not work well when the observed data are highly mixed, because there are not enough vectors in simplex facets. In these cases, the separation problem can be addressed in a Bayesian framework. Several Bayesian Positive Source Separation (BPSS) algorithms under positivity and sum-to-one constraints have recently been developed [8][9][10]. In [10], a discussion on the effectiveness of the sum-to-one constraint is given, showing that full constrained BPSS2 gives better results than BPSS for simulated data, while it is the contrary for the real OMEGA data, "due to nonlinearity in the radiative transfer and noise in the dataset in contradiction with the full additivity constraint". We think that it would be not the same with the proposed NMFbased algorithm, firstly because the full additivity is not a hard but a soft constraint, and second because the residual error RQE is able to represent measurement noise or model noise, and then the algorithm is quite robust for real data, which can contain nonlinear mixed terms. This can be seen by comparing the results of a geometrical algorithm like SISAL, very performant on simulated data, with the results obtained on Cuprite real data, which drops dramatically, while the NMF-based algorithms keep performing.
In the last decade, NMF has been a popular algorithm since Lee and Seung [11] investigated the properties of the algorithm and published some simple and useful algorithms for two types of factorizations. The NMF algorithm has broadly been used in text mining, image analysis, speech processing, and automatic control. The basic NMF problem consists of finding two nonnegative data matrices whose product approximates the mixed data in a chosen measure sense (e.g., the reconstruction quadratic error-RQE). However, the solution to NMF is not unique so various regularizations with prior knowledge should be taken into account to reduce the number of solutions. The sum-to-unity (STU) constraint is proposed in [12], which regularizes the RQE with a function of S to normalize the columns of it. The authors of [13] propose constraints based on two inherent characteristics of hyperspectral data: the spectral piecewise smoothness and spatial sparseness. In [14], a minimum volume constrained NMF (MVC-NMF) based on projected gradient (PG) optimization method is proposed, whose regularization term minimizes the simplex volume spanned by the endmembers. Other authors [15] propose a minimum distance constrained NMF (MDC-NMF), which consider the endmember distance instead of the volume of the estimated simplex. MDC-NMF makes a slight modification of the optimized algorithm used for MVC-NMF. MiniDisCo algorithm makes the assumption of minimum spectral dispersion for NMF regularization [16], and MDMD-NMF regularizes with minimum spectral dispersion and maximum spatial dispersion [17]. A new step-size estimation technique is proposed for the two algorithms to hasten the PG convergence.
The optimization algorithms and constraints on A and S are two main techniques for NMF-based hyperspectral unmixing. The authors of [18] propose a flexible hierarchical alternating least squares (HALS) algorithm with a set of local cost functions called alpha and beta divergences. The word "flexible" means the variation of the optimization algorithm. In this article, we propose an improved NMF algorithm with four constraints due to the characteristics of HSI, called the flexible NMF (F-NMF). The word "flexible" means the variation of constraints on A and S. F-NMF also uses the HALS update rules, significantly outperfoming the PG update rules in convergence speed. Actually, the novelty is both the combination of the constraints and the development of these constraints under HALS-based algorithm.
The rest of the article is organized as follows: Section 2 presents the basic NMF algorithm and the HALS update rules. In Section 3, we introduce four constraint functions and integrate them into the F-NMF algorithm. In Section 4, the comparison and analysis of the F-NMF with different constraints are given by processing various simulated HSIs. The algorithms are applied to real data in Section 5. The F-NMF algorithms are compared with SISAL, for the two algorithms are both able to unmix hyperspectral data in which the pure pixel assumption is violated. Finally, some conclusion closes the article.

NMF for hyperspectral unmixing
In this section, we first present the NMF problem and then the optimization algorithm used to solve it in this article.

NMF problem
The aim of basis NMF methods is to find two estimated matricesÂ andŜ such that X ÂŜ

(2)
A commonly used theoretical solution is to find nonnegative matrices minimizing the RQE where ||·|| F is the Frobenius (e.g., quadratic) norm.

HALS algorithms
In [19], the authors show that the HALS scheme works remarkably well in practice, outperforming, in most cases, the other optimization algorithms for NMF. In particular, it is proved to be locally more efficient [20] and shown to converge to a stationary point under some mild assumptions [21]. For these reasons, we choose HALS as the optimization technique.
The basic idea of HALS is to define residues as for k = 1, 2,..., J. A k (L × 1) is one endmember spectrum and S k (1 × I) corresponds to its abundance fraction.
By substituting Equation (4) into (3), the new RQE function is The gradients of the above function are expressed by By setting the above equation to zero, the updating rules are obtained: For k = 1, 2,..., J, where [δ][ 0,1] is to enforce every element δ ij lies in [0,1], so Clearly, the HALS algorithm is bound-constrained. It is also shown that the optimal value of each entry of A (A k ) does not depend on the other entries of the same column. By symmetry, the same property holds for each row of S (S k ). Thus, the detailed HALS algorithm is summarized as follows: (1) Initialize A and S with the VCA algorithm; (2) for i = 1, 2,..., do for k = 1, 2,..., J Update A k and S k with the HALS update rules; end until the stop criteria is reached The simplest update rules are given in Equation (7), and the regularized f with all constraints will be proposed in Equation (18). The maximum number of iterations is always set high (e.g., 2000) to obtain accurate estimations. However, the overestimation of the iteration number induces time waste. Indeed, the RQE value slightly increases from certain iteration whereas the regularized f keeps decreasing. Thus, the algorithm is stopped at this iteration when the RQE value goes to a minimum although the highest iteration number is not reached. The stop criteria is expressed as

NMF with flexible constraints
The basic NMF optimized function ensures that the two constraints A and S are both nonnegatives. Since the NMF solution is not unique, some prior knowledge on HSIs can be introduced to regularize the problem. A generic expression for the optimized function is where σ, α, and β are regularized parameters for the estimation error and the spectral and abundance constraints. D(A,S) measures the difference between X and AS with respect to some norms. By substituting Equation (4) into (8) and using the RQE norm, the new optimized function f is In this section, we add four constraints for A and S to the function to improve the unmixing result. With all these constraints, the algorithm is called flexible NMF (F-NMF), based on HALS update rules.

STU constraint
The STU constraint makes the sums of the columns of S equal to 1. The STU constraint is defined as follows: where 1 1I is an (1 × I) vector of ones. The gradient derivation of D 1 with respect to S k is

Maximum spatial dispersion constraint
In real situations, abundance matrix is often very sparse because the materials are mostly grouped in separate regions even if the pure pixels. We note that reducing the data enclosing simplex volume is equivalent to increase the dispersion of the abundances fractions in the sum-to-one constrained subspace enclosing the abundances. Actually, the most impossible situation is the uniformly mixed data. Therefore, as the mean value of abundances is 1/J, we defined the maximum spatial dispersion constraint as follows: This constraint encourages null abundance pixels and full pixels, as in real scenes, not all the endmembers are present in all pixels, and in contrast some pixels contain only one material. The gradient derivation of D 2 with respect to S k is

Minimum spectral dispersion constraint
This constrained function depends on A, encouraging the variance of each endmember spectrum to be as low as possible. This dispersion constraint is to improve the shape estimation of flat endmember spectra. Consequently, if the estimation of some spectra is improved, the estimation of the other spectra involved in the mixture will also indirectly be improved due to the parameter interdependences. We define the minimum spectral dispersion constraint as The gradient derivation of D 1 with respect to A k is

Minimum distance constraint
In MVC-NMF [9], the volume of A is calculated as the constraint, which suffers from numerical instabilities [11]. Here, we choose the minimum distance constraint as a substitute in order to shrink the volume of the data enclosing the simplex. The distance is measured and summed up from every endmember to their centroid. This constraint is defined as The gradient derivation of D 2 with respect to A k is The final F-NMF update rules to minimize f with all these considerations are derived from (6), (9), (11), (13), (15), and (17). Thus

Simulations on synthetic data
In this section, we present a batch of simulations to quantitatively compare the F-NMF algorithms with different constraints. First, we present the used evaluation metrics. Then, we present the way we build simulated data. Finally, the experimental results of five F-NMF algorithms are given.

Evaluation metrics
(1) To evaluate the abundance estimation, we define the abundance mean squared error (AME) as (2) To evaluate the endmember spectra estimation, we define the spectral mean squared error (SME) as Chen and Guillaume EURASIP Journal on Advances in Signal Processing 2012, 2012:54 http://asp.eurasipjournals.com/content/2012/1/54 (3) To consider the global shape of the spectra, the spectral angle distance (SAD) is defined as where a is the true spectral vector andâis its estimate.

Synthetic data
The HSI synthesis process is in three steps corresponding to the matrices A, S, and the noise matrix N.
First, the J endmember spectra are randomly selected among the U.S. Geological Survey (USGS) spectral library. The selected 224-channel spectra constitute the columns of the matrix A.
Then, the J-element column vector in S is generated following a Dirichlet pdf, with parameters equal to 1. The element maximal value of each column is controlled by a threshold ξ (0 <ξ ≤ 1). This operation allows one to control the mixing or purity level of the data. In particular, the image can contain "pure" pixels when ξ = 1. We also introduce a sparsity parameter ι (ι > 0), which controls the sparsity of S. If ι is set at 0.8, 20% of the J × I elements in S are selected randomly and set to zeros at first, and then the nonzero elements in each column vector of S are generated following the Dirichlet pdf with the STU constraint and the maximal threshold ξ.
Finally, we add a noise matrix N, assumed to be zeromean white Gaussian. The noise is characterized by the SNR where s 2 is its variance. Therefore, a synthetic HSI is characterized by J, the randomly selected endmember spectra, I, ξ, ι, and the SNR. The default configuration is given in Table 1.

Compared algorithms
In our simulations, we compare the F-NMF algorithms with different constraints and the typical geometrical and Bayesian algorithms. All the algorithms are used with the same initial and stop conditions.  [9] under nonnegativity and full additivity constraints. (9) MiniDisCo: a novel NMF-based algorithm with spectral constraint given in [16].
Note that the initializations of A and S for all the algorithms are chosen from a uniform distribution on the interval [0,1].

Simulations
The first simulation shows the behavior of the objective f function along the interactions of two optimization algorithms. Experiments 2-7 present statistical simulations to compare the average behaviors of the five F-NMF algorithms while varying the parameters given in Table 1, and robustness to an overestimation of the endmember number J.
(1) The first experiment is to assess the choice of the optimization algorithm. We compare the convergence efficiency between the PG, which is widely used for NMF optimization, and the HALS algorithm. Here, the PG and HALS algorithms are regularized with the minimum spectral dispersion constraint. The PG-based algorithm is named Mini-DisCo in [11], and HALS-based algorithm in this experiment is also called F4-NMF as above. The f value is calculated with the corresponding constraints and the performances of the two estimators are presented in Figure 1. Note that both the curves result from the same HSI, with the same random initial conditions; thus, the only variability is the optimization method. We note that the final value of f is almost the same with both algorithms, whereas the convergence speed of HALS is faster.
The following experiments 2-7 present the behaviors of the F-NMF algorithms with different constraints (2) In the second experiment, the algorithms are compared when the number of endmembers J varies.
In this experiment, we first test the efficiency of the algorithms. The F-NMF algorithms are compared with the PG-NMF with no constraint except positivity as F1-NMF, and J is set from 3 to 10 as the experiment in [11]. The performance metrics of SME are shown in Figure 2. Note that the considered statistics do not necessarily include each of the results. Here, SME values higher than 0.5 are not included. In particular, the PG-NMF results are never considered while the F-NMF results are all included, because the SME values of PG-NMF are always greater than 0.5. With NMF algorithms, only a local minimum can be attained in general. In the case of random initializations and no constraints, HALS is able to obtain a better solution than PG.
Then, we set J higher to 20 to test the performance of F-NMF. The performance metrics are shown in Figure  3. Note that the F1-NMF without constraints performs worse as the number of endmembers increases. In the case of the constrained F-NMF (F2, F3, F4, F5, F35), the results are much better.   robustness of the constrained F-NMF algorithms, when the basic F-NMF is sensitive to the number of endmembers. The combination of constraints F3 and F5, F35, gives good results.
(3) The purity level ξ is the topic of the third experiment. None of the considered algorithms are based on the hypothesis of one pure pixel for each endmember, but the un-mixing performance may vary with the purity. The obtained performance metrics are presented in Figure 4. F3-NMF is particularly worse for AME when ξ = 0.6, because the low purity level make the maximum spatial dispersion constraint ineffective. F35-NMF also performs worse in term of AME due to the maximum spatial dispersion constraint. Figure 5 shows a comparison of the proposed F35-NMF algorithm with the geometrical method (e.g., VCA), a BSS algorithm (e.g., BPSS2) and another NMFbased algorithm (e.g., MiniDisCo), with the variation of ξ. The two NMF-based algorithms and BPSS2 are each initialized with VCA. The parameter of the spectral  constraint is 0.1 for MiniDisCo. VCA performs better with higher purity level due to its assumption of pure pixels. MiniDisCo and F35-NMF both improve the unmixing results of VCA. Specifically, MiniDisCo and BPSS2 outperform F35-NMF in the sense of AME but the result is quite the reverse in the sense of SME, which is caused by different constraints in MiniDisCo and F35-NMF. In the sense of AME, F35-NMF performs worse as the purity level decreases, because the algorithm is regularized by the maximum spatial dispersion constraint, which improves the values of AME for the mixing data with high purity level. This could be verified by the results shown in Figure 4a. The algorithms with the maximum spatial dispersion constraint (F3-NMF and F35-NMF) give worse results than the other algorithms (F4-NMF and F5-NMF). We choose F35-NMF for comparison due to its better performances in SME and SAD. In the sense of SAD, MiniDisCo is better than F35-NMF with lower purity level, but F35-NMF performs better with purity level lower than 0.7. The performance of BPSS2 is always worse. This may be resulted by the minimum distance constraint in F35-NMF, which plays an important role in the unmixing of highly-mixed data.
As it can be noted, the results can vary for the various metrics, i.e., some algorithms can be efficient for spectral estimation, and not for abundances, and vice versa. We have chosen to keep the three metrics for the complement of information they bring. A small SAD indicates very similar spectral shapes, and is not sensitive to a scale factor, while SME also depends on the values and is sensitive to a scale factor. For abundances, only the values are relevant, because the sum-to-one constraint sets the scale factor. In one sense, SAD is the more meaningful metric, used to identify endmembers from spectral libraries.
(4) The fourth experiment studies the robustness to noise of the considered algorithms. The metric values obtained for various SNR are shown in Figure 6. The F-NMF algorithms are all based on the RQE minimization, which is optimal for white Gaussian noise. Thus, the performances do not significantly depend on the noise. In accordance with the experiment 3, the F3-NMF and F35-NMF results are not good in AME, but better in the terms of SME and SAD. (5) It is interesting to study the estimation quality in terms of the data spatial dimensions. Figure 7 presents the influence of the number of observed spectral pixels. The F-NMF algorithms are both robust to a small number of spectral pixels and a large amount of data. It is interesting to see that a small number of spectral pixels globally improve the performances of the regularized NMF. The F4-NMF and F5-NMF outperform the other algorithms in AME, but the results of F3-NMF and F35-NMF algorithms are better in the terms of SME and SAD. In general, a large data set does not improve the results, so it is more efficient to use a small set of data (400 pixels). (6) This experiment tests the influence of the sparsity parameter ι. The results are presented in Figure 8. All the algorithms are not very sensitive to the sparsity parameter. The F4-NMF and F5-NMF outperform the other algorithms in AME, and the maximum spatial dispersion constraint brings improvement in SME and SAD. (7) Estimating the endmember number J is the first issue of the HSI analysis. On real data, existing methods to estimate J generally overestimate the number [22]. Thus, we study the robustness of the algorithms to an overestimation of J (Figure 9). Here, we overestimate J by 1. The estimation errors show that constrained F-NMF algorithms are robust to an overestimation of J, while the basic F-NMF is sensitive to the number of endmembers.
The following conclusions can be drawn from these experiments: (1) The optimized algorithm of HALS outperforms PG in convergence speed and efficiency. In [11], poor estimations due to local minimum affect the basic PG-NMF, so the estimated values of SME higher than 0.5 are not included in the statistics. In F-NMF, the estimation performance is much better so all the results of the experiments are considered.
(2) The performances of constrained F-NMF are better than the basic NMF, according to all the   parameters (J, ξ, SNR, I, and ι) and the different performance metrics.
(3) The NMF algorithms with minimum spectral dispersion constraint (F4-NMF) and minimum distance constraint (F5-NMF) performs better in AME, while the algorithms with maximum spatial dispersion constraint (F3-NMF and F35-NMF) outperform the other algorithms in the terms of SME and SAD. (4) The NMF algorithm with combined constraints (F35-NMF) performs better than the algorithm with one constraint (F3-NMF).
(5) F-NMF algorithm can improve the unmixing results initialized by VCA.

Application on real hyperspectral data
We have applied the five F-NMF algorithms on a hyperspectral scene captured by the AVIRIS sensor. This sensor has a 20-m spatial resolutions and a 10-nm spectral resolution and acquires 224 spectral bands between 0.4 and 2.5 μm. The analyzed reflectance image is a 99 × 99 pixel selection of the Cuprite geological data. A RGB representation of the scene is shown in Figure 10. Some spectral bands have been removed due to noise corruption and atmosphere absorption, and only the data of the remaining 188 bands have been used. In this section, we choose SISAL as the compared algorithm because it is able to deal with the unmixing problem without the pure pixel assumption as the NMF algorithms.
It is required to estimate the number of endmembers J before unmixing the image. In this article, the number of endmembers is determined from the final RQE obtained after convergence for many preliminary experiences, and is set to J = 11; however, this value is only an approximation.
To improve the algorithm performances, we run the five F-NMF algorithms with the VCA initializations [4] to obtain better local solutions. The estimated endmembers are associated with the closest ones contained in the USGS library in the SAD sense. To evaluate the stability of the algorithms and the ability to find a unique solution, we make 50 runs for each F-NMF algorithm and keep the 11 estimated endmembers at each run. In each run, a new HSI synthetic data is generated with the same parameters (J, I, ξ, ι, SNR), when the endmembers are selected randomly from the library. We should obtain the same 11 identified references in each experiment.  However, the results vary in the 50 experiments. In order to compare the results with a minimum volume-based algorithm, we choose SISAL for its good performances on simulated data and its high speed. Note that the F-NMF and SISAL algorithms are all based on the assumption that the endmembers, or at least some of them, are not in the data set. The references identified by F-NMF are presented in Tables 2, 3, 4, 5, 6, and 7 and the results by SISAL in Table 8. The estimated endmembers are identified as the closest library spectra in the sense of SAD. It can be seen from the tables that F3-NMF gives 77 names for a total of 550 possible different answers, whereas the other four F-NMFs give much more references. The top 11 responses of F3-NMF and F35-NMF represent 66.7 and 68%, respectively, of all the answers. All these results show the stability of F3-NMF, due to the maximum spatial dispersion constraint. From Table 8, we can see that the SISAL identifies 146 names from 550 possible answers, which shows its serious instability.     Therefore, the F-NMF algorithms are more stable than SISAL. Otherwise, the mean SAD between the estimated endmembers and the closest references in the library is significantly lower with F-NMF than SISAL, so we can conclude that the F-NMF algorithms are more efficient in endmember identification for difficult real cases. Figures 11 and 12 give the estimated endmember spectra and abundance maps by F-35 in one experiment. The endmember spectra resulting from the F35-NMF analysis in one experiment are shown in Figure 11a. In this figure, the y-coordinate tick (from j = 1 to J) corresponds to zero reflectance of the jth endmember. The associated spectral endmembers are the closest library spectra (Figure 11b) in the sense of SAD. Note that 3, 7, 8, and 10 spectra are all identified as Kaolin/Smect KLF508 85%K, whose proportion is 26% the first in Table 7. This is the reason of the low identification dispersion of F35-NMF in 50 runs. The estimated abundance maps are given in Figure 12, where the maximum abundance value ξ j of each endmember j is high due to the maximum spatial dispersion constraint.
We compare the references identified by F-NMF and SISAL with the available ground truth of the Cuprite scene from the website [23]. In Tables 2, 3, 4, 5, 6, 7, and 8, the identified results, which appear in the ground-truth list, are highlighted. Each of the considered algorithms only can identify two or three groundtruth minerals. In particular, F-NMF and SISAL algorithms all detect Kaolin, and Goethite is detected by F1-NMF, F2-NMF, F4-NMF, and F5-NMF. In addition, Nontronite is detected by F2-NMF, F4-NMF and F5-NMF, and F35-NMF detects Montmorillonite and SISAL detects Hematite. The identified results illustrate the difficulty of the unmixing problem for real data. Three reasons can be explained as follows: (1) The analyzed Cuprite data are only a selection of the whole scene, which holds 18 endmembers; thus, the unmixing results are also incomplete.
(2) It is difficult to find the right spectra in the considered library with a huge amount of references (500). Some prior knowledge should be used to   reduce the number of references before the comparison.
(3) We use a linear mixing model in this article, but the radiative transfer is always nonlinear in real scene [10]. (4) It is subjective to identify the endmembers with SAD. A more robust method for identification should make the decision jointly with several criteria. Moreover, the variability of real spectra has made their identification from library more difficult.
Finally, it is important to analyze the computation time of the F-NMF algorithms. Under the Matlab environment and 3 GHz CPU, the computation times for an iteration of the F-NMF algorithms with the real data (99 × 99 pixels) are shown in Table 9. It is clear that the algorithms with spectral constraints (F4, F5, F35) are more time-consuming due to the computation of matrix inversion. If the number of iteration is more than a thousand, the running of any F-NMF algorithm will cost a few minutes. In the case of computation cost, geometrical methods (e.g., VCA and SISAL) are fast and efficient, while the NMF-based methods are always slow.

Conclusion
In this article, we have proposed an NMF-based hyperspectral unmixing algorithm with flexible constraints, including the STU constraint, the maximum spatial dispersion constraint, the minimum spectral dispersion constraint, and the minimum distance constraint. The optimization scheme is based on the HALS, whose convergence speed outperforms that of PG. The resulting algorithm, called F-NMF, is experimentally tested with different constraints. The estimation accuracy shows that the F-NMF works stably in all experiments, overcoming the estimation instability of PG-NMF. In particular, the F-NMF algorithms are robust to high number of endmembers, low SNR, low number of observed pixels and overestimation of the number of endmembers.
The F-NMF algorithms seem to be effective in the estimation of abundance maps, since they consider the STU and maximum spatial dispersion constraints. The identified references of real data by F-NMF seem more stable and reliable than geometrical method like SISAL. However, the identified results of real data are unsatisfied so the identification method needs further investigation to improve the results.