HALS-based NMF with flexible constraints for hyperspectral unmixing

Chen, Weishi; Guillaume, Mireille

doi:10.1186/1687-6180-2012-54

Research
Open access
Published: 05 March 2012

HALS-based NMF with flexible constraints for hyperspectral unmixing

Weishi Chen^1,2 &
Mireille Guillaume¹

EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 54 (2012) Cite this article

4562 Accesses
24 Citations
Metrics details

Abstract

In this article, the hyperspectral unmixing problem is solved with the nonnegative matrix factorization (NMF) algorithm. The regularized criterion is minimized with a hierarchical alternating least squares (HALS) scheme. Under the HALS framework, four constraints are introduced to improve the unmixing accuracy, including the sum-to-unity constraint, the constraints for minimum spectral dispersion and maximum spatial dispersion, and the minimum volume constraint. The derived algorithm is called F-NMF, for NMF with flexible constraints. We experimentally compare F-NMF with different constraints and combined ones. We test the sensitivity and robustness of F-NMF to many parameters such as the purity level of endmembers, the number of endmembers and pixels, the SNR, the sparsity level of abundances, and the overestimation of endmembers. The proposed algorithm improves the results estimated by vertex component analysis. A comparative analysis on real data is included. The unmixing results given by a geometrical method, the simplex identification via split augmented Lagrangian and the F-NMF algorithms with combined constraints are compared, which shows the relative stability of F-NMF.

1. Introduction

Airborne hyperspectral sensors collect images in hundreds of narrow and contiguous spectral bands. Due to the limited spatial resolution of hyperspectral image (HSI), each observed pixel generally contains more than one material spectral signature. Hence, the hyperspectral unmixing, which decomposes a mixed pixel into a combination of pure material spectra known as endmembers, weighted by their corresponding abundance coefficients, is a challenging task.

Let R (L × I) be the matrix unfolded HSI, whose I columns are the spectral pixels and the L rows are the vectorial spectral band images. As N is the related noise matrix, the linear spectral mixing model (LSMM) can be written as

R = A S + N = X + N

(1)

The rows of S (J × I) are the abundance maps corresponding to the respective endmembers, whose spectra are located in the columns of A (L × J). J denotes the number of endmembers.

Basically, hyperspectral unmixing is a problem of blind source separation (BSS). However, compared with most BSS applications, the endmembers of HSI data are dependent and the elements in A and S are nonnegative, so the hyperspectral unmixing is beyond the reach of many BSS algorithms (e.g., independent component analysis-ICA) [1]. To fulfill these constraints, numerous special algorithms have been proposed to solve the hyperspectral unmixing problem under the LSMM assumption, including the approaches of convex geometry, Bayesian source separation, and nonnegative matrix factorization (NMF). The geometrical approaches first determine the endmembers and estimate the abundances in a second step, while the BSS and NMF-based approaches find the endmembers and the abundances simultaneously.

Geometrical approaches try to determine the vertices of the J-simplex enclosing the observed pixels, such as pixel purity index (PPI) [2], N-FINDR [3], vertex component analysis (VCA) [4]. The PPI algorithm projects every spectral vector onto skewers (large number of random vectors). The points corresponding to extremes, for each skewer direction, are stored and cumulated. The pixels with the highest scores are the purest ones. N-FINDR finds the set of pixels defining the largest volume within the data. VCA iteratively projects data onto a direction orthogonal and the endmembers correspond to the extreme of the projections. The issue of these approaches is to find extreme points within the data with the assumption of pure pixel of each endmember, which is always unsatisfactory for real hyperspectral data. Recently, the state-of-art reference algorithms MVSA [5], MVES [6], and the simplex identification via split augmented Lagrangian (SISAL) [7] have proposed various ways to find a minimum volume simplex, showing very good performances in the estimation of endmembers. Particularly, SISAL is able to unmix HSI data in the case of no pure pixel.

The geometrical approaches do not work well when the observed data are highly mixed, because there are not enough vectors in simplex facets. In these cases, the separation problem can be addressed in a Bayesian framework. Several Bayesian Positive Source Separation (BPSS) algorithms under positivity and sum-to-one constraints have recently been developed [8–10]. In [10], a discussion on the effectiveness of the sum-to-one constraint is given, showing that full constrained BPSS2 gives better results than BPSS for simulated data, while it is the contrary for the real OMEGA data, "due to nonlinearity in the radiative transfer and noise in the dataset in contradiction with the full additivity constraint". We think that it would be not the same with the proposed NMF-based algorithm, firstly because the full additivity is not a hard but a soft constraint, and second because the residual error RQE is able to represent measurement noise or model noise, and then the algorithm is quite robust for real data, which can contain nonlinear mixed terms. This can be seen by comparing the results of a geometrical algorithm like SISAL, very performant on simulated data, with the results obtained on Cuprite real data, which drops dramatically, while the NMF-based algorithms keep performing.

In the last decade, NMF has been a popular algorithm since Lee and Seung [11] investigated the properties of the algorithm and published some simple and useful algorithms for two types of factorizations. The NMF algorithm has broadly been used in text mining, image analysis, speech processing, and automatic control. The basic NMF problem consists of finding two nonnegative data matrices whose product approximates the mixed data in a chosen measure sense (e.g., the reconstruction quadratic error-RQE). However, the solution to NMF is not unique so various regularizations with prior knowledge should be taken into account to reduce the number of solutions. The sum-to-unity (STU) constraint is proposed in [12], which regularizes the RQE with a function of S to normalize the columns of it. The authors of [13] propose constraints based on two inherent characteristics of hyperspectral data: the spectral piecewise smoothness and spatial sparseness. In [14], a minimum volume constrained NMF (MVC-NMF) based on projected gradient (PG) optimization method is proposed, whose regularization term minimizes the simplex volume spanned by the endmembers. Other authors [15] propose a minimum distance constrained NMF (MDC-NMF), which consider the endmember distance instead of the volume of the estimated simplex. MDC-NMF makes a slight modification of the optimized algorithm used for MVC-NMF. MiniDisCo algorithm makes the assumption of minimum spectral dispersion for NMF regularization [16], and MDMD-NMF regularizes with minimum spectral dispersion and maximum spatial dispersion [17]. A new step-size estimation technique is proposed for the two algorithms to hasten the PG convergence.

The optimization algorithms and constraints on A and S are two main techniques for NMF-based hyperspectral unmixing. The authors of [18] propose a flexible hierarchical alternating least squares (HALS) algorithm with a set of local cost functions called alpha and beta divergences. The word "flexible" means the variation of the optimization algorithm. In this article, we propose an improved NMF algorithm with four constraints due to the characteristics of HSI, called the flexible NMF (F-NMF). The word "flexible" means the variation of constraints on A and S. F-NMF also uses the HALS update rules, significantly outperfoming the PG update rules in convergence speed. Actually, the novelty is both the combination of the constraints and the development of these constraints under HALS-based algorithm.

The rest of the article is organized as follows: Section 2 presents the basic NMF algorithm and the HALS update rules. In Section 3, we introduce four constraint functions and integrate them into the F-NMF algorithm. In Section 4, the comparison and analysis of the F-NMF with different constraints are given by processing various simulated HSIs. The algorithms are applied to real data in Section 5. The F-NMF algorithms are compared with SISAL, for the two algorithms are both able to unmix hyperspectral data in which the pure pixel assumption is violated. Finally, some conclusion closes the article.

2. NMF for hyperspectral unmixing

In this section, we first present the NMF problem and then the optimization algorithm used to solve it in this article.

2.1. NMF problem

The aim of basis NMF methods is to find two estimated matrices $\hat{A}$ and $\hat{S}$ such that

X ≃ \hat{A} \hat{S}

(2)

A commonly used theoretical solution is to find nonnegative matrices minimizing the RQE

RQE (A, S) = {∥X - A S∥}_{F}^{2}

(3)

where ||·||_F is the Frobenius (e.g., quadratic) norm.

2.2. HALS algorithms

In [19], the authors show that the HALS scheme works remarkably well in practice, outperforming, in most cases, the other optimization algorithms for NMF. In particular, it is proved to be locally more efficient [20] and shown to converge to a stationary point under some mild assumptions [21]. For these reasons, we choose HALS as the optimization technique.

The basic idea of HALS is to define residues as

X^{(k)} = X - \sum_{i \neq k} A_{i} S_{i} = X - A S + A_{k} S_{k}

(4)

for k = 1, 2,..., J. A_k (L × 1) is one endmember spectrum and S_k (1 × I) corresponds to its abundance fraction.

By substituting Equation (4) into (3), the new RQE function is

RQE (A_{k}, S_{k}) = {∥X^{(k)} - A_{k} S_{k}∥}_{F}^{2}

(5)

The gradients of the above function are expressed by

\begin{gathered} \frac{\partial RQE (A_{k}, S_{k})}{\partial A_{k}} = 2 (A_{k} {∥S_{k}∥}^{2} - X^{(k)} S_{k}^{T}) \\ \frac{\partial RQE (A_{k}, S_{k})}{\partial S_{k}} = 2 ({∥A_{k}∥}^{2} S_{k} - A_{k}^{T} X^{(k)}) \end{gathered}

(6)

By setting the above equation to zero, the updating rules are obtained:

A_{k} \leftarrow {[\frac{X^{(k)} S_{k}^{T}}{{∥S_{k}∥}^{2}}]}_{[0, 1]} \begin{matrix} , \end{matrix} S_{k} \leftarrow {[\frac{A_{k}^{T} X^{(k)}}{{∥A_{k}∥}^{2}}]}_{[0, 1]} .

(7)

For k = 1, 2,..., J, where [δ][_0,1] is to enforce every element δ_ij lies in [0,1], so

{({[δ]}_{[0, 1]})}_{i j} = \{\begin{matrix} 0, \\ δ_{i j}, \\ 1, \end{matrix} \begin{matrix} if δ_{i j} < 0 \\ if 0 < δ_{i j} < 1 \\ if δ_{i j} > 1 \end{matrix}

Clearly, the HALS algorithm is bound-constrained. It is also shown that the optimal value of each entry of A (A_k) does not depend on the other entries of the same column. By symmetry, the same property holds for each row of S (S_k). Thus, the detailed HALS algorithm is summarized as follows:

(1)
Initialize A and S with the VCA algorithm;
(2)
for i = 1, 2,..., do

for k = 1, 2,..., J

Update A_k and S_k with the HALS update rules;

end

until the stop criteria is reached

The simplest update rules are given in Equation (7), and the regularized f with all constraints will be proposed in Equation (18). The maximum number of iterations is always set high (e.g., 2000) to obtain accurate estimations. However, the overestimation of the iteration number induces time waste. Indeed, the RQE value slightly increases from certain iteration whereas the regularized f keeps decreasing. Thus, the algorithm is stopped at this iteration when the RQE value goes to a minimum although the highest iteration number is not reached. The stop criteria is expressed as

RQ E^{k - 50} < min_{k^{'} = 0, . . ., 50} RQ E^{k - k^{'}} .

3. NMF with flexible constraints

The basic NMF optimized function ensures that the two constraints A and S are both nonnegatives. Since the NMF solution is not unique, some prior knowledge on HSIs can be introduced to regularize the problem. A generic expression for the optimized function is

f (A, S) = \sum_{e} σ_{e} D_{e} (A, S) + \sum_{i} α_{i} D_{i} (S) + \sum_{j} β_{j} D_{j} (A)

(8)

where σ, α, and β are regularized parameters for the estimation error and the spectral and abundance constraints. D(A,S) measures the difference between X and AS with respect to some norms. By substituting Equation (4) into (8) and using the RQE norm, the new optimized function f is

f (A_{k}, S_{k}) = RQE (A_{k}, S_{k}) + \sum_{i} α_{i} D_{i} (S_{k}) + \sum_{j} β_{j} D_{j} (A_{k})

(9)

In this section, we add four constraints for A and S to the function to improve the unmixing result. With all these constraints, the algorithm is called flexible NMF (F-NMF), based on HALS update rules.

3.1. STU constraint

The STU constraint makes the sums of the columns of S equal to 1. The STU constraint is defined as follows:

D_{1} (S_{k}) = {∥S_{k} + \sum_{i \neq k} S_{i} - 1_{1 I}∥}_{F}^{2}

(10)

where 1_1Iis an (1 × I) vector of ones. The gradient derivation of D₁ with respect to S_k is

\frac{\partial D_{1} (S_{k})}{\partial S_{k}} = 2 (S_{k} + \sum_{i \neq k} S_{i} - 1_{1 I})

(11)

3.2. Maximum spatial dispersion constraint

In real situations, abundance matrix is often very sparse because the materials are mostly grouped in separate regions even if the pure pixels. We note that reducing the data enclosing simplex volume is equivalent to increase the dispersion of the abundances fractions in the sum-to-one constrained subspace enclosing the abundances. Actually, the most impossible situation is the uniformly mixed data. Therefore, as the mean value of abundances is 1/J, we defined the maximum spatial dispersion constraint as follows:

D_{2} (S_{k}) = - {∥S_{k} - \frac{1}{J} 1_{1 I}∥}_{F}^{2}

(12)

This constraint encourages null abundance pixels and full pixels, as in real scenes, not all the endmembers are present in all pixels, and in contrast some pixels contain only one material. The gradient derivation of D₂ with respect to S_k is

\frac{\partial D_{2} (S_{k})}{\partial S_{k}} = - 2 (S_{k} - \frac{1}{J} 1_{1 I})

(13)

3.3. Minimum spectral dispersion constraint

This constrained function depends on A, encouraging the variance of each endmember spectrum to be as low as possible. This dispersion constraint is to improve the shape estimation of flat endmember spectra. Consequently, if the estimation of some spectra is improved, the estimation of the other spectra involved in the mixture will also indirectly be improved due to the parameter interdependences. We define the minimum spectral dispersion constraint as

D_{1} (A_{k}) = {∥(I_{L L} - \frac{1}{L} 1_{L L}) A_{k}∥}_{F}^{2}

(14)

The gradient derivation of D₁ with respect to A_k is

\frac{\partial D_{1} (A_{k})}{\partial A_{k}} = 2 (I_{L L} - \frac{1}{L} 1_{L L}) A_{k}

(15)

3.4. Minimum distance constraint

In MVC-NMF [9], the volume of A is calculated as the constraint, which suffers from numerical instabilities [11]. Here, we choose the minimum distance constraint as a substitute in order to shrink the volume of the data enclosing the simplex. The distance is measured and summed up from every endmember to their centroid. This constraint is defined as

D_{2} (A_{k}) = {∥(I_{L L} - \frac{1}{L} 1_{L L}) (A_{k} - \frac{1}{J} (A_{k} + \sum_{i \neq k} A_{i}))∥}_{F}^{2}

(16)

The gradient derivation of D₂ with respect to A_k is

\frac{\partial D_{2} (A_{k})}{\partial A_{k}} = 2 (I_{L L} - \frac{1}{L} 1_{L L}) (1 - \frac{1}{J}) (A_{k} - \frac{1}{J} (A_{k} + \sum_{i \neq k} A_{i}))

(17)

The final F-NMF update rules to minimize f with all these considerations are derived from (6), (9), (11), (13), (15), and (17). Thus

S_{k} \leftarrow {[\frac{A_{k}^{T} X^{(k)} + α_{1} (1_{1 I} - \sum_{i \neq k} S_{i}) - \frac{α_{2}}{J} 1_{1 I}}{{∥A_{k}∥}^{2} + α_{1} - α_{2}}]}_{[0, 1]}

A_{k} \leftarrow {[\frac{X^{(k)} S_{k}^{T} + β_{2} \frac{1}{J} (1 - \frac{1}{J}) (I_{L L} - \frac{1}{L} 1_{L L}) \sum_{i \neq k} A_{i}}{{∥S_{k}∥}^{2} I_{L L} + (I_{L L} - \frac{1}{L} 1_{L L}) \cdot (β_{1} + β_{2} {(1 - \frac{1}{J})}^{2})}]}_{[0, 1]}

(18)

4. Simulations on synthetic data

In this section, we present a batch of simulations to quantitatively compare the F-NMF algorithms with different constraints. First, we present the used evaluation metrics. Then, we present the way we build simulated data. Finally, the experimental results of five F-NMF algorithms are given.

4.1. Evaluation metrics

(1)
To evaluate the abundance estimation, we define the abundance mean squared error (AME) as
$AME (\hat{S}, S) = \frac{1}{J I} {∥\hat{S} - S∥}_{F}^{2}$
(19)
(2)
To evaluate the endmember spectra estimation, we define the spectral mean squared error (SME) as
$SME (\hat{A}, A) = \frac{1}{L J} {∥\hat{A} - A∥}_{F}^{2}$
(20)
(3)
To consider the global shape of the spectra, the spectral angle distance (SAD) is defined as
$SAD (\hat{a}, a) = {cos}^{- 1} (\frac{a^{T} \hat{a}}{\sqrt{{\hat{a}}^{T} \hat{a}} \sqrt{a^{T} a}})$
(21)

where a is the true spectral vector and $\hat{a}$ is its estimate.

4.2. Synthetic data

The HSI synthesis process is in three steps corresponding to the matrices A, S, and the noise matrix N.

First, the J endmember spectra are randomly selected among the U.S. Geological Survey (USGS) spectral library. The selected 224-channel spectra constitute the columns of the matrix A.

Then, the J-element column vector in S is generated following a Dirichlet pdf, with parameters equal to 1. The element maximal value of each column is controlled by a threshold ξ (0 <ξ ≤ 1). This operation allows one to control the mixing or purity level of the data. In particular, the image can contain "pure" pixels when ξ = 1. We also introduce a sparsity parameter ι (ι > 0), which controls the sparsity of S. If ι is set at 0.8, 20% of the J × I elements in S are selected randomly and set to zeros at first, and then the nonzero elements in each column vector of S are generated following the Dirichlet pdf with the STU constraint and the maximal threshold ξ.

Finally, we add a noise matrix N, assumed to be zero-mean white Gaussian. The noise is characterized by the SNR

SNR = 10 \cdot log (\frac{{∥X∥}_{F}^{2}}{L I σ^{2}})

where σ² is its variance.

Therefore, a synthetic HSI is characterized by J, the randomly selected endmember spectra, I, ξ, ι, and the SNR. The default configuration is given in Table 1.

Table 1 Default configuration of the synthetic data

Full size table

4.3. Compared algorithms

In our simulations, we compare the F-NMF algorithms with different constraints and the typical geometrical and Bayesian algorithms. All the algorithms are used with the same initial and stop conditions.

(1)
F1-NMF: the basic HALS-NMF with no extra constraint based on HALS optimization algorithm. Only the nonnegative constraints are guaranteed.
(2)
F2-NMF: the HALS-NMF is improved with the STU constraint.
(3)
F3-NMF: the HALS-NMF with the STU and maximum spatial dispersion constraints.
(4)
F4-NMF: the HALS-NMF with the STU and minimum spectral dispersion constraints.
(5)
F5-NMF: the HALS-NMF with the STU and minimum distance constraints.
(6)
F35-NMF: the HALS-NMF with the combined constraints of F3 and F5.
(7)
VCA: a popular geometrical algorithm proposed in [4].
(8)
BPSS2: an improved Bayesian algorithm addressed in [9] under nonnegativity and full additivity constraints.
(9)
MiniDisCo: a novel NMF-based algorithm with spectral constraint given in [16].

Note that the initializations of A and S for all the algorithms are chosen from a uniform distribution on the interval [0,1].

4.4. Simulations

The first simulation shows the behavior of the objective f function along the interactions of two optimization algorithms. Experiments 2-7 present statistical simulations to compare the average behaviors of the five F-NMF algorithms while varying the parameters given in Table 1, and robustness to an overestimation of the endmember number J.

(1)
The first experiment is to assess the choice of the optimization algorithm. We compare the convergence efficiency between the PG, which is widely used for NMF optimization, and the HALS algorithm. Here, the PG and HALS algorithms are regularized with the minimum spectral dispersion constraint. The PG-based algorithm is named MiniDisCo in [11], and HALS-based algorithm in this experiment is also called F4-NMF as above. The f value is calculated with the corresponding constraints and the performances of the two estimators are presented in Figure 1. Note that both the curves result from the same HSI, with the same random initial conditions; thus, the only variability is the optimization method. We note that the final value of f is almost the same with both algorithms, whereas the convergence speed of HALS is faster.

The following experiments 2-7 present the behaviors of the F-NMF algorithms with different constraints while varying the parameters summed up in Table 1. The unmixing results are evaluated by AME, SME, and SAD. The presented results are averages (bars) and standard deviations (error bars) resulting from 20 experiments. Note that all the considered algorithms are compared on the same sets of 20 HSIs. We perform preliminary Monte Carlo simulations to find relevant values for the regularized parameters. The retained values are α₁ = 1, α₂ = 0.1, β₁ = 0.1, and β₂ = 0.1, which are chosen to minimize the average evaluation errors for synthetic data.

(2)
In the second experiment, the algorithms are compared when the number of endmembers J varies.

In this experiment, we first test the efficiency of the algorithms. The F-NMF algorithms are compared with the PG-NMF with no constraint except positivity as F1-NMF, and J is set from 3 to 10 as the experiment in [11]. The performance metrics of SME are shown in Figure 2. Note that the considered statistics do not necessarily include each of the results. Here, SME values higher than 0.5 are not included. In particular, the PG-NMF results are never considered while the F-NMF results are all included, because the SME values of PG-NMF are always greater than 0.5. With NMF algorithms, only a local minimum can be attained in general. In the case of random initializations and no constraints, HALS is able to obtain a better solution than PG.

Then, we set J higher to 20 to test the performance of F-NMF. The performance metrics are shown in Figure 3. Note that the F1-NMF without constraints performs worse as the number of endmembers increases. In the case of the constrained F-NMF (F2, F3, F4, F5, F35), the results are much better. Figure 3 puts forward the high robustness of the constrained F-NMF algorithms, when the basic F-NMF is sensitive to the number of endmembers. The combination of constraints F3 and F5, F35, gives good results.

(3)
The purity level ξ is the topic of the third experiment. None of the considered algorithms are based on the hypothesis of one pure pixel for each endmember, but the un-mixing performance may vary with the purity. The obtained performance metrics are presented in Figure 4. F3-NMF is particularly worse for AME when ξ = 0.6, because the low purity level make the maximum spatial dispersion constraint ineffective. F35-NMF also performs worse in term of AME due to the maximum spatial dispersion constraint.

Figure 5 shows a comparison of the proposed F35-NMF algorithm with the geometrical method (e.g., VCA), a BSS algorithm (e.g., BPSS2) and another NMF-based algorithm (e.g., MiniDisCo), with the variation of ξ. The two NMF-based algorithms and BPSS2 are each initialized with VCA. The parameter of the spectral constraint is 0.1 for MiniDisCo. VCA performs better with higher purity level due to its assumption of pure pixels. MiniDisCo and F35-NMF both improve the unmixing results of VCA. Specifically, MiniDisCo and BPSS2 outperform F35-NMF in the sense of AME but the result is quite the reverse in the sense of SME, which is caused by different constraints in MiniDisCo and F35-NMF. In the sense of AME, F35-NMF performs worse as the purity level decreases, because the algorithm is regularized by the maximum spatial dispersion constraint, which improves the values of AME for the mixing data with high purity level. This could be verified by the results shown in Figure 4a. The algorithms with the maximum spatial dispersion constraint (F3-NMF and F35-NMF) give worse results than the other algorithms (F4-NMF and F5-NMF). We choose F35-NMF for comparison due to its better performances in SME and SAD. In the sense of SAD, MiniDisCo is better than F35-NMF with lower purity level, but F35-NMF performs better with purity level lower than 0.7. The performance of BPSS2 is always worse. This may be resulted by the minimum distance constraint in F35-NMF, which plays an important role in the unmixing of highly-mixed data.

As it can be noted, the results can vary for the various metrics, i.e., some algorithms can be efficient for spectral estimation, and not for abundances, and vice versa. We have chosen to keep the three metrics for the complement of information they bring. A small SAD indicates very similar spectral shapes, and is not sensitive to a scale factor, while SME also depends on the values and is sensitive to a scale factor. For abundances, only the values are relevant, because the sum-to-one constraint sets the scale factor. In one sense, SAD is the more meaningful metric, used to identify endmembers from spectral libraries.

(4)
The fourth experiment studies the robustness to noise of the considered algorithms. The metric values obtained for various SNR are shown in Figure 6. The F-NMF algorithms are all based on the RQE minimization, which is optimal for white Gaussian noise. Thus, the performances do not significantly depend on the noise. In accordance with the experiment 3, the F3-NMF and F35-NMF results are not good in AME, but better in the terms of SME and SAD.

(5)
It is interesting to study the estimation quality in terms of the data spatial dimensions. Figure 7 presents the influence of the number of observed spectral pixels. The F-NMF algorithms are both robust to a small number of spectral pixels and a large amount of data. It is interesting to see that a small number of spectral pixels globally improve the performances of the regularized NMF. The F4-NMF and F5-NMF outperform the other algorithms in AME, but the results of F3-NMF and F35-NMF algorithms are better in the terms of SME and SAD. In general, a large data set does not improve the results, so it is more efficient to use a small set of data (400 pixels).

(6)
This experiment tests the influence of the sparsity parameter ι. The results are presented in Figure 8. All the algorithms are not very sensitive to the sparsity parameter. The F4-NMF and F5-NMF outperform the other algorithms in AME, and the maximum spatial dispersion constraint brings improvement in SME and SAD.

(7)
Estimating the endmember number J is the first issue of the HSI analysis. On real data, existing methods to estimate J generally overestimate the number [22]. Thus, we study the robustness of the algorithms to an overestimation of J (Figure 9). Here, we overestimate J by 1. The estimation errors show that constrained F-NMF algorithms are robust to an overestimation of J, while the basic F-NMF is sensitive to the number of endmembers.

The following conclusions can be drawn from these experiments:

(1)
The optimized algorithm of HALS outperforms PG in convergence speed and efficiency. In [11], poor estimations due to local minimum affect the basic PG-NMF, so the estimated values of SME higher than 0.5 are not included in the statistics. In F-NMF, the estimation performance is much better so all the results of the experiments are considered.
(2)
The performances of constrained F-NMF are better than the basic NMF, according to all the parameters (J, ξ, SNR, I, and ι) and the different performance metrics.
(3)
The NMF algorithms with minimum spectral dispersion constraint (F4-NMF) and minimum distance constraint (F5-NMF) performs better in AME, while the algorithms with maximum spatial dispersion constraint (F3-NMF and F35-NMF) outperform the other algorithms in the terms of SME and SAD.
(4)
The NMF algorithm with combined constraints (F35-NMF) performs better than the algorithm with one constraint (F3-NMF).
(5)
F-NMF algorithm can improve the unmixing results initialized by VCA.

5. Application on real hyperspectral data

We have applied the five F-NMF algorithms on a hyperspectral scene captured by the AVIRIS sensor. This sensor has a 20-m spatial resolutions and a 10-nm spectral resolution and acquires 224 spectral bands between 0.4 and 2.5 μm. The analyzed reflectance image is a 99 × 99 pixel selection of the Cuprite geological data. A RGB representation of the scene is shown in Figure 10. Some spectral bands have been removed due to noise corruption and atmosphere absorption, and only the data of the remaining 188 bands have been used. In this section, we choose SISAL as the compared algorithm because it is able to deal with the unmixing problem without the pure pixel assumption as the NMF algorithms.

It is required to estimate the number of endmembers J before unmixing the image. In this article, the number of endmembers is determined from the final RQE obtained after convergence for many preliminary experiences, and is set to J = 11; however, this value is only an approximation.

To improve the algorithm performances, we run the five F-NMF algorithms with the VCA initializations [4] to obtain better local solutions. The estimated endmembers are associated with the closest ones contained in the USGS library in the SAD sense. To evaluate the stability of the algorithms and the ability to find a unique solution, we make 50 runs for each F-NMF algorithm and keep the 11 estimated endmembers at each run. In each run, a new HSI synthetic data is generated with the same parameters (J, I, ξ, ι, SNR), when the endmembers are selected randomly from the library. We should obtain the same 11 identified references in each experiment. However, the results vary in the 50 experiments. In order to compare the results with a minimum volume-based algorithm, we choose SISAL for its good performances on simulated data and its high speed. Note that the F-NMF and SISAL algorithms are all based on the assumption that the endmembers, or at least some of them, are not in the data set. The references identified by F-NMF are presented in Tables 2, 3, 4, 5, 6, and 7 and the results by SISAL in Table 8. The estimated endmembers are identified as the closest library spectra in the sense of SAD. It can be seen from the tables that F3-NMF gives 77 names for a total of 550 possible different answers, whereas the other four F-NMFs give much more references. The top 11 responses of F3-NMF and F35-NMF represent 66.7 and 68%, respectively, of all the answers. All these results show the stability of F3-NMF, due to the maximum spatial dispersion constraint. From Table 8, we can see that the SISAL identifies 146 names from 550 possible answers, which shows its serious instability. Therefore, the F-NMF algorithms are more stable than SISAL. Otherwise, the mean SAD between the estimated endmembers and the closest references in the library is significantly lower with F-NMF than SISAL, so we can conclude that the F-NMF algorithms are more efficient in endmember identification for difficult real cases.

Table 2 References identified by f1-NMF

Full size table

Table 3 References identified by f2-NMF

Full size table

Table 4 References identified by f3-NMF

Full size table

Table 5 References identified by F4-NMF

Full size table

Table 6 References identified by F5-NMF

Full size table

Table 7 References identified by F35-NMF

Full size table

Table 8 References identified by SISAL

Full size table

Figures 11 and 12 give the estimated endmember spectra and abundance maps by F-35 in one experiment. The endmember spectra resulting from the F35-NMF analysis in one experiment are shown in Figure 11a. In this figure, the y-coordinate tick (from j = 1 to J) corresponds to zero reflectance of the j th endmember. The associated spectral endmembers are the closest library spectra (Figure 11b) in the sense of SAD. Note that 3, 7, 8, and 10 spectra are all identified as Kaolin/Smect KLF508 85%K, whose proportion is 26% the first in Table 7. This is the reason of the low identification dispersion of F35-NMF in 50 runs. The estimated abundance maps are given in Figure 12, where the maximum abundance value ξ_j of each endmember j is high due to the maximum spatial dispersion constraint.

We compare the references identified by F-NMF and SISAL with the available ground truth of the Cuprite scene from the website [23]. In Tables 2, 3, 4, 5, 6, 7, and 8, the identified results, which appear in the ground-truth list, are highlighted. Each of the considered algorithms only can identify two or three ground-truth minerals. In particular, F-NMF and SISAL algorithms all detect Kaolin, and Goethite is detected by F1-NMF, F2-NMF, F4-NMF, and F5-NMF. In addition, Nontronite is detected by F2-NMF, F4-NMF and F5-NMF, and F35-NMF detects Montmorillonite and SISAL detects Hematite. The identified results illustrate the difficulty of the unmixing problem for real data. Three reasons can be explained as follows:

(1)
The analyzed Cuprite data are only a selection of the whole scene, which holds 18 endmembers; thus, the unmixing results are also incomplete.
(2)
It is difficult to find the right spectra in the considered library with a huge amount of references (500). Some prior knowledge should be used to reduce the number of references before the comparison.
(3)
We use a linear mixing model in this article, but the radiative transfer is always nonlinear in real scene [10].
(4)
It is subjective to identify the endmembers with SAD. A more robust method for identification should make the decision jointly with several criteria. Moreover, the variability of real spectra has made their identification from library more difficult.

Finally, it is important to analyze the computation time of the F-NMF algorithms. Under the Matlab environment and 3 GHz CPU, the computation times for an iteration of the F-NMF algorithms with the real data (99 × 99 pixels) are shown in Table 9. It is clear that the algorithms with spectral constraints (F4, F5, F35) are more time-consuming due to the computation of matrix inversion. If the number of iteration is more than a thousand, the running of any F-NMF algorithm will cost a few minutes. In the case of computation cost, geometrical methods (e.g., VCA and SISAL) are fast and efficient, while the NMF-based methods are always slow.

Table 9 Computation time of F-NMF algorithms(s)

Full size table

6. Conclusion

In this article, we have proposed an NMF-based hyperspectral unmixing algorithm with flexible constraints, including the STU constraint, the maximum spatial dispersion constraint, the minimum spectral dispersion constraint, and the minimum distance constraint. The optimization scheme is based on the HALS, whose convergence speed outperforms that of PG. The resulting algorithm, called F-NMF, is experimentally tested with different constraints. The estimation accuracy shows that the F-NMF works stably in all experiments, overcoming the estimation instability of PG-NMF. In particular, the F-NMF algorithms are robust to high number of endmembers, low SNR, low number of observed pixels and overestimation of the number of endmembers.

The F-NMF algorithms seem to be effective in the estimation of abundance maps, since they consider the STU and maximum spatial dispersion constraints. The identified references of real data by F-NMF seem more stable and reliable than geometrical method like SISAL. However, the identified results of real data are unsatisfied so the identification method needs further investigation to improve the results.

References

Nascimento JMP, Dias JMB: Does independent component analysis play a role in unmixing hyperspectral data? IEEE Trans Geosci Remote Sens 2005, 43(1):175-187.
Article Google Scholar
Theiler J, Lavenier D, Harvey N, Perkins S, Szymanski J: Using blocks of skewers for faster computation of pixel purity index. Volume 4132. Proc. of the SPIE International Conference on Optical Science and Technology, Bellingham, WA; 2000:61-71.
Google Scholar
ME Winter: N-findr: an algorithm for fast autonomous spectral end-member determination in hyperspectral data. In Proc SPIE Conf Imaging Spectrometry V. Volume 3753. Denver, Colorado; 1999:266-275.
Chapter Google Scholar
Nascimento J, Dias J: Vertex component analysis: a fast algorithm to unmix hyperspectral data. IEEE Trans Geosci Remote Sens 2005, 43(4):898-908.
Article Google Scholar
Li J, Bioucas-Dias JM: Minimum volume simplex analysis: a fast algorithm to unmix hyperspectral data. Proc IEEE IGARSS, Boston 2008, 3: 250-253.
Google Scholar
Chan TH, Chi CY, Huang YM, Ma WK: A convex analysis-based minimum-volume enclosing simplex algorithm for hyperspectral unmixing. IEEE Trans Signal Process 2009, 57(11):4418-4432.
Article MathSciNet Google Scholar
Bioucas-Dias JM: A variable splitting augmented Lagrangian approach to linear spectral unmixing. In Proc WHISPERS. Volume 1. Grenoble, France; 2009:1-4.
Google Scholar
Moussaoui S, Brie D, Mohammad-Djafari A, Carteret C: Separation of non-negative mixture of non-negative sources using a Bayesian approach and MCMC sampling. IEEE Trans Signal Process 2006, 54(11):4133-4145.
Article Google Scholar
Dobigeon N, Moussaoui S, Tourneret J-Y, Carteret C: Bayesian separation of spectral sources under non-negativity and full additivity constraints. Signal Process 2009, 89(12):2657-2669. 10.1016/j.sigpro.2009.05.005
Article MATH Google Scholar
Schmidt F, Schmidt A, Tréguier E, Guiheneuf M, Moussaoui S, Dobigeon N: Implementation strategies for hyperspectral unmixing using Bayesian source separation. IEEE Trans Geosci Remote Sens 2010, 48(11):4003-4013.
Google Scholar
Lee DD, Seung HS: Learning the parts of objects by non-negative matrix factorization. Nature 1999, 401(6755):788-791. 10.1038/44565
Article Google Scholar
Heinz DC, Chang CI: Fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectralimagery. IEEE Trans Geosci Remote Sens 2001, 39(3):529-545. 10.1109/36.911111
Article Google Scholar
Jia S, Qian Y: Constrained nonnegative matrix factorization for hyperspectral unmixing. IEEE Trans Geosci Remote Sens 2009, 47(1):161-173.
Article MATH Google Scholar
Miao L, Qi H: Endmember extraction from highly mixed data using minimum volume constrained nonnegative matrix factorization. IEEE Trans Geosci Remote Sens 2007, 45(3):765-777.
Article Google Scholar
Yu Y, Guo S, Sun WD: Minimum distance constrained nonnegative matrix factorization for the endmember extraction of hyperspectral images. Proc SPIE, MIPPR, Wuhan, China 2007, 6790: 1-4.
Google Scholar
Huck A, Guillaume M, Blanc-Talon J: Minimum dispersion constrained nonnegative matrix factorization to unmix hyperspectral data. IEEE Trans Geosci Remote Sens 2010, 48(6):2590-2602.
Article Google Scholar
Huck A, Guillaume M: Robust hyperspectral data unmixing with spatial and spectral regularized NMF. Proc WHISPERS, Reykjavik, Iceland 2010, 2: 1-4.
Google Scholar
Cichocki A, Phan AH, Caiafa C: Flexible HALS algorithms for sparse non-negative matrix/tensor factorization. IEEE Workshop on Machine Learning for Signal Processing, Cancun, Mexico 2008, 4(4):73-78.
Google Scholar
Gillis N, Glineur F: Using underapproximations for sparse nonnegative matrix factorization. Pattern Recogn 2010, 43(4):1676-1687. 10.1016/j.patcog.2009.11.013
Article MATH Google Scholar
Gillis N, Glineur F: Nonnegative factorization and the maximum edge biclique problem. 2008.
Google Scholar
Ho N-D: Nonnegative matrix factorization-algorithms and applications. PhD thesis. Universite catholique de Louvain; 2008.
Google Scholar
Chang C-I, Du Q: Estimation of number of spectrally distinct signal sources in hyperspectral imagery. IEEE Trans Geosci Remote Sens 2004, 42(3):608-619. 10.1109/TGRS.2003.819189
Article Google Scholar
[http://speclab.cr.usgs.gov/PAPERS/cuprite.gr.truth.1992/swayze.1992.html]

Download references

Acknowledgements

The authors would like to thank S. Moussaoui for letting us do the benchmarks with the code of his BPSS algorithms.

Author information

Authors and Affiliations

Institut Fresnel, 13397, Marseille, France
Weishi Chen & Mireille Guillaume
School of Electronics and Information Engineering, Beihang University, 100191, Beijing, China
Weishi Chen

Authors

Weishi Chen
View author publications
You can also search for this author in PubMed Google Scholar
Mireille Guillaume
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Weishi Chen.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Authors’ original file for figure 5

Authors’ original file for figure 6

Authors’ original file for figure 7

Authors’ original file for figure 8

Authors’ original file for figure 9

Authors’ original file for figure 10

Authors’ original file for figure 11

Authors’ original file for figure 12

Authors’ original file for figure 13

Authors’ original file for figure 14

Authors’ original file for figure 15

Authors’ original file for figure 16

Authors’ original file for figure 17

Authors’ original file for figure 18

Authors’ original file for figure 19

Authors’ original file for figure 20

Authors’ original file for figure 21

Authors’ original file for figure 22

Authors’ original file for figure 23

Authors’ original file for figure 24

Authors’ original file for figure 25

Authors’ original file for figure 26

Authors’ original file for figure 27

Authors’ original file for figure 28

Authors’ original file for figure 29

Authors’ original file for figure 30

Authors’ original file for figure 31

Authors’ original file for figure 32

Authors’ original file for figure 33

Authors’ original file for figure 34

Authors’ original file for figure 35

Authors’ original file for figure 36

Authors’ original file for figure 37

Authors’ original file for figure 38

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Chen, W., Guillaume, M. HALS-based NMF with flexible constraints for hyperspectral unmixing. EURASIP J. Adv. Signal Process. 2012, 54 (2012). https://doi.org/10.1186/1687-6180-2012-54

Download citation

Received: 29 June 2011
Accepted: 05 March 2012
Published: 05 March 2012
DOI: https://doi.org/10.1186/1687-6180-2012-54

HALS-based NMF with flexible constraints for hyperspectral unmixing

Abstract

1. Introduction

2. NMF for hyperspectral unmixing

2.1. NMF problem

2.2. HALS algorithms

3. NMF with flexible constraints

3.1. STU constraint

3.2. Maximum spatial dispersion constraint

3.3. Minimum spectral dispersion constraint

3.4. Minimum distance constraint

4. Simulations on synthetic data

4.1. Evaluation metrics

4.2. Synthetic data

4.3. Compared algorithms

4.4. Simulations

5. Application on real hyperspectral data

6. Conclusion

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Competing interests

Authors’ original submitted files for images

Rights and permissions

About this article

Cite this article

Share this article

Keywords