# Endmember induction by lattice associative memories and multi-objective genetic algorithms

- Manuel Graña^{1} (corresponding author)
- Miguel A Veganzones^{1}

**2012**:64

https://doi.org/10.1186/1687-6180-2012-64

© Graña and Veganzones; licensee Springer. 2012

**Received: **29 June 2011

**Accepted: **13 March 2012

**Published: **13 March 2012

## Abstract

Endmembers are the spectral signatures of the constituent materials of a scene captured with a hyperspectral sensor. Endmember induction algorithms (EIAs) try to extract the endmembers of the scene from the corresponding hyperspectral image. In this article, we benefit from recent theoretical results showing that a set of affine independent vectors can be extracted from the rows and columns of lattice autoassociative memories (LAAM). In the linear mixing model (LMM), endmembers are defined as the vertices of a convex polytope covering the data. Affine independence is a sufficient condition for a set of vectors to be the vertices of a convex polytope, and thus to be considered as endmembers. Our basic procedure is the WM algorithm, which extracts the endmembers from the dual LAAMs built to store the spectra of the hyperspectral image pixels. The set of endmembers induced by this algorithm defines a convex polytope covering the hyperspectral image data. However, the number of endmembers induced by this procedure is too high for practical purposes, and they are highly correlated. We apply a Multi-Objective Genetic Algorithm (MOGA) to the optimal selection of the image endmembers. Two fitness functions are used: the residual error of the unmixing process and the size of the set of endmembers. From the MOGA's Pareto front we decide the final set of endmembers by examining the decrease in residual error obtained by increasing the number of endmembers. We also propose a faster MOGA where the error fitness function is replaced by a fitness function based on the correlation between endmembers. We compare our process with a state-of-the-art EIA on well-known benchmark images.

## 1 Introduction

The high spectral resolution provided by current hyperspectral imaging devices facilitates the identification of the fundamental materials that make up a remotely sensed scene [1, 2]. Specifically, the linear mixing model (LMM) [1] assumes that the spectral signature of one pixel of the hyperspectral image is a linear combination of the endmember spectra, corresponding to the aggregation of materials in the scene due to the reduced spatial resolution of the sensor. Therefore, sub-pixel resolution analysis aims at extracting the fractional abundances of such endmembers inside the pixel.

Endmember spectra define a convex polytope^{a} in the high-dimensional space defined by the image pixel spectra. The fractional abundances of the endmembers at each pixel correspond to the convex coordinates relative to the convex polytope vertices. The set of endmembers can be defined on the basis of *a priori* knowledge about the imaged scene. A library of known pure ground signatures or laboratory samples could be used. However, if such knowledge is not available or it is not useful due to noise conditions, then the set of endmembers must be induced from the hyperspectral image data by means of endmember induction algorithms (EIA) [3, 4]. The review in [3] uses the degree of automation to classify the algorithms, while [4] looks at the diverse computational foundations, assuming that user interaction must be minimal or null.
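As a point of reference, the LMM and the convex-coordinate reading of the abundances can be written compactly. The notation below is an assumption introduced for illustration only: the noise term **n** and the use of *p* for the number of endmembers in the sum are not defined at this point of the article, and the form shown is the usual statement of the model in the unmixing literature [1].

```latex
\mathbf{x} = \sum_{i=1}^{p} \alpha_i \, \mathbf{e}^{i} + \mathbf{n},
\qquad \alpha_i \ge 0, \qquad \sum_{i=1}^{p} \alpha_i = 1
```

With the noise term set to zero, the two constraints say that the pixel **x** lies inside the convex polytope spanned by the endmembers, and the abundances are its convex coordinates.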

Lattice based EIA (L-EIA) are based on lattice computing techniques [5]. For instance, the studies in [6–12] are based on the notion of Strong Lattice Independence (SLI), following the conjecture in [13] that SLI vectors are affine independent vectors and thus their convex hull defines a simplex. Using lattice auto-associative memories (LAAM) [14, 15] built from the hyperspectral image data, sets of SLI vectors were induced and used as endmembers. A recent study [10] has shown how to obtain sets of affine independent vectors from the rows and columns of the LAAM constructed using the hyperspectral data. Specifically, the WM algorithm introduced in [10] computes the erosive and dilative LAAMs and the hyperbox enclosing the data, defined by the minimum and maximum values of the data at each band, and transforms the columns or rows of the erosive and dilative LAAMs to become the vertices of a convex polytope covering the image data. This algorithm has several advantages: (1) it is very fast, (2) it performs only addition, subtraction and max/min operations, and (3) the induced endmembers are directly related to the actual data in the image. However, the WM algorithm always returns the same number of endmembers *p* = 2 * (*L* + 1), where *L* is the number of spectral bands in the image. This number is too high for the actual distinct constituent materials in hyperspectral images. Besides, these endmembers are highly correlated, and the identification of useful endmembers with some physical interpretation is difficult [16].

Genetic algorithms (GA) are random optimization algorithms inspired by natural evolution: a population of individuals evolves through mutation and crossover to maximize a fitness function. Multi-objective GA (MOGA) are specific GAs dealing with the optimization of several objective functions; that is, the MOGA's fitness function is vector valued [17–20]. A minimization multi-objective problem is stated as follows: given an *n*-dimensional variable vector **x** = {*x*_{1}, ..., *x*_{n}} in the solution space **X**, find a vector **x*** that minimizes the *K* objective functions $\mathbf{z}\left({\mathbf{x}}^{*}\right)=\underset{\mathbf{x}}{\text{min}}\left\{{z}_{1}\left(\mathbf{x}\right),...,{z}_{K}\left(\mathbf{x}\right)\right\}$. The region of feasible solutions in the solution space **X** is often specified by a collection of constraints, such as *g*_{j}(**x***) = *b*_{j} for *j* = 1, ..., *m*. An ideal solution that simultaneously optimizes each objective function is impossible to find in most cases because of the mutual conflicts between objectives. Multi-objective optimization algorithms search for balanced solutions, providing not a single solution but a collection of them, an approximation to the *Pareto set*. A solution **x** *dominates* another solution **x**' if it improves on it for all objective functions, i.e., *z*_{1}(**x**) < *z*_{1}(**x**'), ..., *z*_{K}(**x**) < *z*_{K}(**x**'). The Pareto set is the set of solutions that are not dominated by any other solution, **P** = {**x** | ¬∃**x**'; *z*_{1}(**x**') < *z*_{1}(**x**), ..., *z*_{K}(**x**') < *z*_{K}(**x**)}. In the function domain, the *Pareto front* is constituted by the function values of the solutions in the Pareto set. If a single solution is sought, then an additional selection must be performed on the Pareto set of non-dominated solutions. We propose the use of a specific MOGA for the selection of the endmembers from the large set of endmember candidates provided by the WM algorithm.
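The dominance relation and the Pareto set can be sketched in a few lines. The following is a minimal illustration, assuming minimization and the strict-improvement definition of dominance given above; the function names `dominates` and `pareto_set` and the sample objective vectors are hypothetical and are not taken from the article's code.

```python
from typing import List, Sequence


def dominates(za: Sequence[float], zb: Sequence[float]) -> bool:
    """True if objective vector za is strictly smaller than zb in every objective."""
    return all(a < b for a, b in zip(za, zb))


def pareto_set(objectives: List[Sequence[float]]) -> List[int]:
    """Indices of the solutions that are not dominated by any other solution."""
    return [
        i
        for i, zi in enumerate(objectives)
        if not any(dominates(zj, zi) for j, zj in enumerate(objectives) if j != i)
    ]


# Hypothetical (residual error, number of endmembers) pairs for five solutions.
candidates = [(0.9, 2), (0.5, 3), (0.6, 4), (0.3, 5), (0.95, 3)]
front = pareto_set(candidates)
```

In this toy example the third and fifth solutions are dropped because another solution is better in both objectives; the remaining ones trade residual error against the number of endmembers.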

In this study, we propose and test a three-step process for endmember induction which we call WM-MOGA. First, we compute the set of candidate endmembers applying the WM algorithm. Second, we apply a MOGA looking for an optimal set of endmembers minimizing both the residual error (RMSE) from the unmixing process and its cardinality, which amounts to minimizing the complexity of the solution. This WM-MOGA process returns a set of solutions that form a Pareto front on the solutions space formed by the two objective functions. Third, we apply an Occam razor threshold criterion [21] to select the optimal set of induced endmembers as in [22]. To speed up the MOGA phase we use an alternative definition of the objective functions, minimizing the maximum correlation between endmembers. This MOGA does not need to compute the unmixing process for each individual and, therefore, it is several orders of magnitude faster.

We test the WM-MOGA process on real hyperspectral scenes and compare it against the random search approach [22] based on the N-FINDR algorithm [23]. For validation, we calculate the correlation between the fractional abundances of the optimal endmembers induced by both the WM-MOGA and the N-FINDR approaches and the available ground truth class spatial distribution. As WM-MOGA is an unsupervised process, the assignment of an abundance image to a ground truth class implies computing all possible combinations and selecting the best match.

The article is organized as follows. In Section 2, we review the WM algorithm. Section 3 introduces the proposed WM-MOGA approach for endmember induction. In Section 4, we define the experimental research, and in Section 5, we analyze the results. Finally, we give some conclusions in Section 6. Appendix 1 describes the N-FINDR algorithm. Appendix 2 reviews the specific MOGA applied to the problem.

## 2 WM algorithm

Given a hyperspectral image *H*, it is reshaped to form a matrix *X* of dimension *N* × *L*, where *N* is the number of image pixels, and *L* is the number of spectral bands. The algorithm starts by computing the least hyperbox covering the data, $\mathcal{B}\left(\mathbf{v},\mathbf{u}\right)$, where **v** and **u** are the *minimal* and *maximal corners*, respectively, whose components are computed as follows:

${v}_{k}=\underset{\xi}{\text{min}}\,{x}_{k}^{\xi};\quad {u}_{k}=\underset{\xi}{\text{max}}\,{x}_{k}^{\xi}$

with *k* = 1, ..., *L* and *ξ* = 1, ..., *N*. Next, the algorithm computes the dual LAAMs, **W**_{XX} and **M**_{XX}:

${\mathbf{W}}_{XX}=\underset{\xi =1}{\overset{N}{\wedge}}\left[{\mathbf{x}}^{\xi}\times {\left(-{\mathbf{x}}^{\xi}\right)}^{\prime}\right];\quad {\mathbf{M}}_{XX}=\underset{\xi =1}{\overset{N}{\vee}}\left[{\mathbf{x}}^{\xi}\times {\left(-{\mathbf{x}}^{\xi}\right)}^{\prime}\right]$

The columns of **W**_{XX} and **M**_{XX} are scaled by **v** and **u**, forming the additive scaled sets $W={\left\{{\mathbf{w}}^{k}\right\}}_{k=1}^{L}$ and $M={\left\{{\mathbf{m}}^{k}\right\}}_{k=1}^{L}$:

${\mathbf{w}}^{k}={u}_{k}+{\mathbf{W}}^{k};\quad {\mathbf{m}}^{k}={v}_{k}+{\mathbf{M}}^{k};\quad k=1,...,L$

where **W**^{k} and **M**^{k} denote the *k*th column of **W**_{XX} and **M**_{XX}, respectively. Finally, the set *V* = *W* ∪ *M* ∪ {**v**, **u**} contains the vertices of the convex polytope $F\left(X\right)\cap \mathcal{B}\left(\mathbf{v},\mathbf{u}\right)$ which covers the convex hull of the data, *C*(*X*), as a subset:

$C\left(X\right)\subseteq F\left(X\right)\cap \mathcal{B}\left(\mathbf{v},\mathbf{u}\right)$

where *F*(*X*) denotes the set of fixed points for both LAAMs **W**_{XX} and **M**_{XX}. The WM algorithm returns the set *V* as the set of induced endmembers. The algorithm is simple and fast, but the number of induced endmembers, i.e., the number of column vectors in *V*, can be too large for practical purposes. Furthermore, some of the endmembers induced that way can show high correlation even if they are affine independent. To obtain a meaningful set of endmembers, we search for an optimal subset of *V* in the sense of minimizing the unmixing residual error and the number of endmembers.
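The WM steps described in this section can be sketched in plain Python. This is a minimal illustration rather than the authors' implementation: it assumes that the product inside the LAAM definitions is the lattice outer sum, so that entry (*i*, *j*) of each memory aggregates *x*_{i} − *x*_{j} over all pixels, and the name `wm_algorithm` and the toy data are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]


def wm_algorithm(X: List[Vector]) -> Tuple[List[Vector], Vector, Vector]:
    """Return (V, v, u): the candidate endmembers and the hyperbox corners."""
    L = len(X[0])
    # Minimal and maximal corners of the hyperbox enclosing the data.
    v = [min(x[k] for x in X) for k in range(L)]
    u = [max(x[k] for x in X) for k in range(L)]
    # Assumed reading of the LAAMs: entry (i, j) is the min (W) or max (M)
    # over all pixels of x_i - x_j.
    W = [[min(x[i] - x[j] for x in X) for j in range(L)] for i in range(L)]
    M = [[max(x[i] - x[j] for x in X) for j in range(L)] for i in range(L)]
    # Shift the k-th column of W by u_k and the k-th column of M by v_k.
    w_set = [[u[k] + W[i][k] for i in range(L)] for k in range(L)]
    m_set = [[v[k] + M[i][k] for i in range(L)] for k in range(L)]
    return w_set + m_set + [v, u], v, u


# Toy data: two pixels with two bands each.
X = [[1, 4], [3, 2]]
V, v, u = wm_algorithm(X)
```

For *L* = 2 bands the sketch returns 2 * (*L* + 1) = 6 candidate vectors, which matches the fixed number of endmembers noted in the introduction. The cost of building the memories this way grows as *N* · *L*², using only subtraction, addition and min/max operations.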

## 3 WM-MOGA

Given a hyperspectral image *H*, it is reshaped to a matrix *X* of size *N* × *L*, where *N* is the number of pixels in the image, and *L* is the number of spectral bands. Applying the WM algorithm to *X* yields a set of candidate endmembers, denoted *E*_{WM} = {**e**^{1}, ..., **e**^{p}}. The second step of WM-MOGA finds the optimal subset of endmembers in terms of unmixing residual error and complexity, by using a multi-objective genetic algorithm (MOGA) [17, 18, 20] to calculate the *Pareto front* of non-dominated solutions, as defined in the introduction section, $\mathbf{P}=\left\{{E}^{1},...,{E}^{q}\right\}\subseteq \mathcal{P}\left({E}_{\text{WM}}\right)$, where $\mathcal{P}\left({E}_{\text{WM}}\right)$ is the power set of *E*_{WM}.

We define two fitness functions. One is the unmixing residual error of Equation (7), denoted by *f*_{RMSE}(*E*), which measures the root mean squared error between each pixel and its reconstruction from the endmembers in *E*; here **x**_{i} is the *i*th pixel in the hyperspectral image *X*, and **α**_{i} is the vector of fractional abundances for the *i*th pixel calculated by fully constrained least squares unmixing. The other is the size of the set of endmembers of Equation (8), denoted by *f*_{|·|}(*E*):

${f}_{\left|\cdot \right|}\left(E\right)=\left|E\right|$ (8)

where |·| denotes the cardinality of a set. The MOGA requires the problem to be encoded so that each individual in the search population represents a solution. The *k*th individual *chromosome* is defined as a binary vector **b**_{k} = {*b*_{1}, ..., *b*_{p}}; *b*_{i} ∈ {0, 1}; ∀*i* = 1, ..., *p*; where *p* is the number of candidate endmembers returned by the WM algorithm. If *b*_{i} = 1, the *i*th candidate endmember **e**^{i} ∈ *E*_{WM} belongs to the set of induced endmembers, *E*_{k}, corresponding to **b**_{k}. Appendix 2 gives the details of the NSGA-II algorithm [19] that we have applied to this problem.

Ordering the solutions *E*^{i} ∈ **P** by their cardinality so that *i* = |*E*^{i}|, the Occam razor condition is specified in Equation (9) for a given selection threshold *ε* on the difference of relative errors between consecutive solutions according to the number of endmembers:

${E}^{*}\left(\epsilon \right)=\text{arg}\underset{\mathbf{P}}{\text{min}}\left\{\left|\frac{{f}_{\text{RMSE}}\left({E}^{i+1}\right)}{{f}_{\text{RMSE}}\left({E}^{i}\right)}-\frac{{f}_{\text{RMSE}}\left({E}^{i}\right)}{{f}_{\text{RMSE}}\left({E}^{i-1}\right)}\right|<\epsilon \right\}$ (9)

The algorithm returns *E**(*ε*) as the final set of induced endmembers from the hyperspectral image *X*.

1. *L* is the number of the spectral bands and *N* is the number of data samples.
2. Compute **v** = [*v*_{1}, ..., *v*_{L}] and **u** = [*u*_{1}, ..., *u*_{L}], ${v}_{k}=\underset{\xi}{\text{min}}{x}_{k}^{\xi};{u}_{k}=\underset{\xi}{\text{max}}{x}_{k}^{\xi}$, *k* = 1, ..., *L* and *ξ* = 1, ..., *N*.
3. Compute the LAAMs ${\mathbf{W}}_{XX}=\underset{\xi =1}{\overset{N}{\wedge}}\left[{\mathbf{x}}^{\xi}\times {\left(-{\mathbf{x}}^{\xi}\right)}^{\prime}\right];{\mathbf{M}}_{XX}=\underset{\xi =1}{\overset{N}{\vee}}\left[{\mathbf{x}}^{\xi}\times {\left(-{\mathbf{x}}^{\xi}\right)}^{\prime}\right]$
4. Build *W* = {**w**^{1}, ..., **w**^{L}} and *M* = {**m**^{1}, ..., **m**^{L}} such that ${\mathbf{w}}^{k}={u}_{k}+{\mathbf{W}}^{k};{\mathbf{m}}^{k}={v}_{k}+{\mathbf{M}}^{k};k=1,...,L.$
5. Return the set *V* = *W* ∪ *M* ∪ {**v**, **u**}.

Algorithm 1: **Pseudo-code specification of the WM algorithm**.

1. Apply WM (*X*) to obtain *E*_{WM} = {**e**^{1}, ..., **e**^{p}}
2. Apply MOGA (*E*_{WM}) to obtain the Pareto set of solutions **P** = {*E*^{i}, *i* = 1, ..., *q*}
3. Apply the Occam razor selecting ${E}^{*}\left(\epsilon \right)=\text{arg}\underset{\mathbf{P}}{\text{min}}\left\{\left|\frac{{f}_{\text{RMSE}}\left({E}^{i+1}\right)}{{f}_{\text{RMSE}}\left({E}^{i}\right)}-\frac{{f}_{\text{RMSE}}\left({E}^{i}\right)}{{f}_{\text{RMSE}}\left({E}^{i-1}\right)}\right|<\epsilon \right\}$
4. Return *E**(*ε*)

Algorithm 2: **Pseudo-code for the WM-MOGA process**.
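The Occam razor selection of Equation (9) can be sketched as a scan over the Pareto solutions sorted by number of endmembers. The sketch below is one possible reading, assuming that the solution sought is the smallest one for which the change in relative error falls below the threshold; the function name `occam_select` and the residual error values are hypothetical.

```python
from typing import Optional, Sequence


def occam_select(errors: Sequence[float], eps: float = 1e-2) -> Optional[int]:
    """Index of the first solution whose consecutive relative errors differ by less than eps.

    errors[i] is the residual error of the i-th Pareto solution, with the
    solutions sorted by increasing number of endmembers. Returns None when
    no interior index satisfies the condition.
    """
    for i in range(1, len(errors) - 1):
        ratio_next = errors[i + 1] / errors[i]
        ratio_prev = errors[i] / errors[i - 1]
        if abs(ratio_next - ratio_prev) < eps:
            return i
    return None


# Hypothetical residual errors for solutions with 1, 2, ..., 6 endmembers.
errors = [8.0, 4.0, 3.0, 2.7, 2.43, 2.4]
chosen = occam_select(errors, eps=1e-2)
```

Here the relative error stabilizes around 0.9 at the fourth solution, so adding further endmembers no longer changes the rate of improvement and the scan stops there. A smoother error curve makes the choice of *ε* less delicate, since the condition is then met at a single, well-defined point.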

### 3.1 WM-MOGA-CORR

To speed up the MOGA phase, the fitness function *f*_{RMSE} is replaced by the correlation based fitness function of Equation (10), denoted *f*_{CORR}(*E*), which is the maximum correlation between pairs of distinct endmembers in *E*, where *c*_{ij} is Pearson's correlation between the *i*th and *j*th endmembers. Thus, we try to minimize the maximum correlation between endmembers. This fitness function is far less computationally expensive than the fitness function *f*_{RMSE} of Equation (7). The solution complexity related fitness function of Equation (8) cannot be applied in conjunction with the correlation based fitness function of Equation (10) because it leads to the trivial result of a single endmember. Instead, we use its inverse:

${f}_{{\left|\cdot \right|}^{-1}}\left(E\right)={\left|E\right|}^{-1}$ (11)

We call this approximation to the WM-MOGA, using the fitness functions $\left\{{f}_{\text{CORR}},{f}_{{\left|\cdot \right|}^{-1}}\right\}$, WM-MOGA_{CORR}. Note that the optimality criteria for the set of endmembers sought are the minimization of the unmixing residual error and of the number of endmembers. The approximation WM-MOGA_{CORR} does not attempt to minimize these criteria directly, but nevertheless the quality of its achieved solution will be evaluated on the basis of the unmixing residual error.
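The two WM-MOGA_{CORR} fitness values can be computed directly from a binary chromosome, without any unmixing. The sketch below is illustrative only: it assumes that the maximum is taken over the raw (signed) Pearson correlation of distinct endmember pairs and that chromosomes selecting fewer than two endmembers are penalized with infinite fitness; both choices, as well as the names `pearson` and `corr_fitness`, are assumptions and not taken from the article.

```python
from itertools import combinations
from math import inf, sqrt
from typing import List, Sequence, Tuple


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def corr_fitness(
    chromosome: Sequence[int], candidates: List[Sequence[float]]
) -> Tuple[float, float]:
    """Return (maximum pairwise correlation, inverse cardinality) for one individual."""
    # Decode the chromosome: bit i set to 1 selects the i-th candidate endmember.
    selected = [e for bit, e in zip(chromosome, candidates) if bit == 1]
    if len(selected) < 2:
        return inf, inf  # assumed penalty for degenerate selections
    f_corr = max(pearson(a, b) for a, b in combinations(selected, 2))
    return f_corr, 1.0 / len(selected)


# Four hypothetical candidate endmembers with three bands each.
candidates = [[1, 2, 3], [2, 4, 6], [3, 2, 1], [1, 3, 2]]
f_corr, f_inv = corr_fitness([1, 0, 1, 1], candidates)
```

Each evaluation costs one correlation per pair of selected endmembers and is independent of the number of pixels, which is why this variant avoids the per-individual unmixing that dominates the cost of *f*_{RMSE}.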

## 4 Experimental design

We first present the hyperspectral data used in the experiments. Secondly, we describe the experimentation methodology. The experiments have been run using the MATLAB^{b} implementation of NSGA-II [19]. The number of individuals in the population was set to 100 for the WM-MOGA_{RMSE} and 1000 for the WM-MOGA_{CORR}. To perform the comparison, we calculate the correlation between the fractional abundance images corresponding to the optimal sets of endmembers induced by the different approaches and the ground truth classes from the different hyperspectral scenes. All the hyperspectral scenes and the code of the algorithms and methods implemented are freely accessible from the computational intelligence group website.^{c,d}

### 4.1 Hyperspectral data

#### 4.1.1 The Indian Pines scene

*μ*m. This scene contains two-thirds agriculture, and one-third forest or other natural perennial vegetation. There are two major dual lane highways, a railway line, as well as some low density housing, other built structures, and smaller roads. Since the scene was taken in June, some of the crops present (corn, soybeans) are in early stages of growth with less than 5% coverage. The available ground truth is divided into sixteen classes with a variable number of samples for each class (see Table 1). We have reduced the number of bands to 200 by removing bands covering the region of water absorption: [104-108], [150-163], and 220. Indian Pines data are available through Purdue University's MultiSpec site.^{e} Figure 1 shows a sample band and the ground truth of the Indian Pines dataset.

Table 1: Indian Pines ground truth classes and number of samples collected for each class

| # | Class | Samples |
|---|---|---|
| 1 | Alfalfa | 46 |
| 2 | Corn-notill | 1428 |
| 3 | Corn-mintill | 830 |
| 4 | Corn | 237 |
| 5 | Grass-pasture | 483 |
| 6 | Grass-trees | 730 |
| 7 | Grass-pasture-mowed | 28 |
| 8 | Hay-windrowed | 478 |
| 9 | Oats | 20 |
| 10 | Soybean-notill | 972 |
| 11 | Soybean-mintill | 2455 |
| 12 | Soybean-clean | 593 |
| 13 | Wheat | 205 |
| 14 | Woods | 1265 |
| 15 | Buildings-Grass-Trees-Drives | 386 |
| 16 | Stone-Steel-Towers | 93 |

#### 4.1.2 The Salinas scene

Table 2: Salinas ground truth classes and number of samples for each class

| # | Class | Number of samples |
|---|---|---|
| 1 | Brocoli_green_weeds_1 | 2009 |
| 2 | Brocoli_green_weeds_2 | 3726 |
| 3 | Fallow | 1976 |
| 4 | Fallow_rough_plow | 1394 |
| 5 | Fallow_smooth | 2678 |
| 6 | Stubble | 3959 |
| 7 | Celery | 3579 |
| 8 | Grapes_untrained | 11271 |
| 9 | Soil_vinyard_develop | 6203 |
| 10 | Corn_senesced_green_weeds | 3278 |
| 11 | Lettuce_romaine_4wk | 1068 |
| 12 | Lettuce_romaine_5wk | 1927 |
| 13 | Lettuce_romaine_6wk | 916 |
| 14 | Lettuce_romaine_7wk | 1070 |
| 15 | Vinyard_untrained | 7268 |
| 16 | Vinyard_vertical_trellis | 1807 |

### 4.2 Methodology

The research questions explored in this article are the following: (a) is it possible to obtain a reduced set of endmembers from the WM algorithm using a search process based on the quality and size of the set of endmembers? (b) is it possible to speed up the search process using indirect information such as the correlation between endmembers? (c) how do those endmember induction processes compare with a state-of-the-art algorithm? We have defined the WM-MOGA and WM-MOGA_{CORR} algorithms to answer the first two questions. The experimental results provide answers to the last question. We compare the WM-MOGA (denoted WM-MOGA_{RMSE} in the figures) and WM-MOGA_{CORR} processes with a recent random search approach based on N-FINDR [22], which runs the N-FINDR algorithm several times for increasing values of the number of desired endmembers, *p*, and then applies the Occam razor specified by Equation (9) to determine the optimal set of endmembers ${E}_{\text{N-FINDR}}^{*}\left(\epsilon \right)$.

It is important to note that the endmember induction processes are unsupervised, therefore the meaning of the endmembers found and their relation to the ground-truth classes is unknown. However, we want to support our study on the knowledge of a given ground-truth for the benchmark images. The evaluation process looks for the best match between the abundance images produced by the unmixing and the image regions identified with each class in the ground truth.^{f} We compute all the possible spatial correlations between them, obtaining a matrix of correlation indices. The examination of this matrix gives information about the discovery of the ground-truth classes, which endmembers are associated to them and the uncertainty or ambiguity of this association.
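The matching between abundance images and ground-truth classes can be sketched as a correlation matrix followed by a per-class maximum. This is a minimal illustration under stated assumptions: each abundance image and each class mask is flattened to a list of per-pixel values, the Pearson coefficient is used as the spatial correlation index, and the names `correlation_matrix` and `best_abundance_per_class` and the four-pixel toy data are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlation_matrix(
    abundances: List[Sequence[float]], class_masks: List[Sequence[float]]
) -> List[List[float]]:
    """Entry [r][c] is the correlation of abundance image r with class mask c."""
    return [[pearson(a, m) for m in class_masks] for a in abundances]


def best_abundance_per_class(corr: List[List[float]]) -> List[int]:
    """For each class (column), the index of the most correlated abundance image."""
    n_classes = len(corr[0])
    return [
        max(range(len(corr)), key=lambda r: corr[r][c]) for c in range(n_classes)
    ]


# Two flattened abundance images and two binary class masks over four pixels.
abundances = [[0.9, 0.8, 0.1, 0.2], [0.1, 0.2, 0.9, 0.8]]
class_masks = [[1, 1, 0, 0], [0, 0, 1, 1]]
corr = correlation_matrix(abundances, class_masks)
matches = best_abundance_per_class(corr)
```

Taking the maximum along the other axis of the same matrix gives, for each abundance image, the class it matches best, and the gap between the largest and second largest entries in a row or column gives a rough idea of how ambiguous the association is.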

## 5 Results

We plot the unmixing residual error against the number of endmembers for the solutions found by the WM-MOGA, the WM-MOGA_{CORR} and the N-FINDR based approach of [22]. Note that the Pareto front of WM-MOGA_{CORR} refers to the correlation between endmembers, not to the unmixing residual error. However, the selection of the solution is based on the same criteria for all algorithms. Figures 3 and 4 show these plots for the Indian Pines and Salinas images, respectively. The curve corresponding to the WM-MOGA is smooth because the MOGA searches for the Pareto front based on these criteria. The curves corresponding to WM-MOGA_{CORR} and N-FINDR are more irregular, with several local minima.

The Occam razor tries to determine the optimal solution based on the relative error decrease according to Equation (9). We plot in Figures 5 and 6 the evolution of the relative error *f*_{RMSE}(*E*^{i})/*f*_{RMSE}(*E*^{i-1}) for the algorithms. It can be appreciated that the WM-MOGA provides a smooth relative error curve that allows an easy setting of the threshold parameter of the Occam razor and gives sensible results in the selection applying *ε* = 10^{-2}. For the WM-MOGA_{CORR}, selection of the definitive threshold required inspection of the relative error curve: a threshold *ε* = 10^{-2} gives sensible results for the Indian Pines image, but *ε* = 10^{-1} is required for the Salinas image. The N-FINDR approach gives very irregular relative error curves; however, the standard threshold *ε* = 10^{-2} gives sensible results for both images.

The endmembers selected as a result of these decisions are plotted in Figures 7 and 8 for the Indian Pines and Salinas images, respectively. Endmembers found by the N-FINDR are also plotted for comparison. It can be appreciated that WM-MOGA_{CORR} provides more uncorrelated endmembers than the other approaches. The WM-MOGA_{CORR} is the most relaxed approach, finding the highest number of endmembers. However, a strong correlation can be appreciated in Figures 7 and 8 among the endmembers found by all three approaches, even though the WM-based endmembers are not pixel spectra from the image, while N-FINDR endmembers are pixel signatures selected from the image.

For a qualitative evaluation of the results, we provide in Figures 9 and 10 the thematic maps and the images containing the maximum abundance value per pixel for each of the tested approaches. The thematic maps are computed as follows. We compute the correlation coefficient between each abundance image and each binary image corresponding to a ground-truth class spatial distribution. We assign to each endmember the set of ground truth regions with positive correlation coefficients. For each pixel we select the endmember with the maximal abundance value and we assign to the pixel the linear combination of the colors of the ground-truth classes positively correlated with the endmember abundance, e.g., the orange color corresponds to the mixture of red and yellow. We remove the background class in these computations. Examining the thematic maps in Figures 9b,d,f and 10b,d,f, it can be appreciated that the WM-MOGA_{CORR} provides a cleaner recognition of the ground-truth class areas in both Indian Pines and Salinas, possibly due to its emphasis on uncorrelated endmembers. That said, there is little correspondence between the classes and the endmembers in all cases: most of the ground truth regions are not recognized in their original spatial localization. Looking at the abundance coefficients shown in Figures 9c,e,g and 10c,e,g, there are few pixels with a pure endmember matching; most abundance values are moderate, implying some degree of mixture of the real classes. We found that, on average, the WM-MOGA_{CORR} provides the greatest values of the abundance coefficients, improving over WM-MOGA and N-FINDR.

Finally, to assess the degree of ground-truth class discovery by the endmembers, we plot the maximum correlation coefficient *per* abundance and *per* ground truth class in Figures 11 and 12 for the Indian Pines and Salinas images, respectively. Examining the maximum correlation *per* ground truth class (Figures 11a and 12a), the results are not exactly equal in both images; however, the trends are similar. We find that the WM-MOGA provides the best identification for almost all of the classes. For some classes WM-MOGA_{CORR} performs better, and for a couple of classes N-FINDR is the best detector. We can say that WM-MOGA matches or improves on the ground-truth class detection of N-FINDR. Also, the correlation approximation of WM-MOGA_{CORR} gives surprisingly good detections of some ground truth classes, and in general is a good approximation to the detection obtained by WM-MOGA. Examining the maximum correlation per abundance image (Figures 11b and 12b), we find the same kind of results. For most of the induced abundances, the WM-MOGA provides the best correspondence to some ground-truth class. The approximation provided by the WM-MOGA_{CORR}, which is several orders of magnitude faster, does a good job of finding meaningful endmembers, but it finds many endmembers, so that there is a tail of irrelevant endmembers with little correlation to the ground-truth classes.

## 6 Conclusions

The WM algorithm proposed by Ritter et al. [10, 16] is a fast procedure to obtain a set of affine independent vectors which are the vertices of a convex polytope covering the sample data. Applied to hyperspectral images, the WM algorithm produces a large set of candidate endmembers. We propose the application of a specific MOGA minimizing the unmixing residual error and the number of endmembers, followed by an Occam razor selection on the Pareto front to obtain an appropriate set of endmembers tailored to the data. The WM-MOGA compares well to a recent state-of-the-art endmember induction heuristic [22] in terms of the correlation of the induced abundance images with the given ground truth class spatial distribution. Furthermore, we propose an approximation to the MOGA which does not need to compute the linear unmixing based on the individual chromosomes at each generation. The identification of the ground truth classes by this fast process compares well with the reference heuristic. However, it overestimates the set of endmembers, including some redundant or irrelevant endmembers. Future work will address improving the fast approach by introducing new regularization fitness functions to obtain smaller sets of endmembers of equivalent quality.

## Appendix 1: N-FINDR

Algorithm 3 presents the N-FINDR [23] pseudo-code. The N-FINDR algorithm works by growing a simplex inside the data, beginning with a random set of pixels. The vertices of the simplex with the highest volume are assumed to identify the endmembers. Beforehand, the data dimensionality has to be reduced to *p* − 1 dimensions, where *p* is the number of endmembers searched for.

Let *E* be the matrix of endmembers augmented with a row of ones:

$E=\left[\begin{array}{cccc}1 & 1 & \cdots & 1\\ {\mathbf{e}}_{1} & {\mathbf{e}}_{2} & \cdots & {\mathbf{e}}_{p}\end{array}\right]$ (12)

where **e**_{i} is a column vector containing the spectra of the *i*th endmember. The volume of the simplex defined by the endmembers is proportional to the determinant of *E*:

$V\left(E\right)=\frac{1}{\left(p-1\right)!}\left|\text{det}\left(E\right)\right|$ (13)

The N-FINDR starts by selecting an initial random set of pixels as endmembers. Then, for each pixel and each stored endmember, the endmember is replaced with the spectrum of the pixel and the volume is recalculated by Equation (13). If the volume of the new simplex increases, the endmember is replaced by the spectrum of the pixel. The procedure ends when no more replacements are done. The N-FINDR is a greedy algorithm, prone to falling into local maxima of the volume function.
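The replacement loop described above can be sketched in plain Python. This is an illustrative sketch, not the reference N-FINDR code: it assumes the data have already been reduced to *p* − 1 dimensions (the PCA step is omitted), it takes the initial endmember indices as an argument instead of drawing them at random so that the example is reproducible, and the helper names `det`, `simplex_volume` and `nfindr` are hypothetical.

```python
from math import factorial
from typing import List, Sequence, Tuple

Point = Sequence[float]


def det(matrix: List[List[float]]) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n, d = len(a), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d


def simplex_volume(endmembers: List[Point]) -> float:
    """Volume of the simplex whose p vertices live in p - 1 dimensions."""
    p = len(endmembers)
    # First row of ones, then one row per dimension; columns are endmembers.
    E = [[1.0] * p] + [[float(e[dim]) for e in endmembers] for dim in range(p - 1)]
    return abs(det(E)) / factorial(p - 1)


def nfindr(data: List[Point], init: List[int]) -> Tuple[List[Point], float]:
    """Greedy vertex replacement, repeated until no swap increases the volume."""
    E = [data[i] for i in init]
    best = simplex_volume(E)
    improved = True
    while improved:
        improved = False
        for k in range(len(E)):
            for x in data:
                trial = E[:k] + [x] + E[k + 1:]
                volume = simplex_volume(trial)
                if volume > best:
                    E, best, improved = trial, volume, True
    return E, best


# Six 2-D points (p = 3): three triangle corners and three interior points.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.2, 0.2), (0.3, 0.2), (0.2, 0.3)]
endmembers, volume = nfindr(data, init=[3, 4, 5])
```

Starting from the three interior points, the loop ends at the three corners of the triangle that encloses the data. With real data the result depends on the initialization, which is the local maxima issue mentioned above.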

## Appendix 2: NSGA-II

The non-dominated sorting genetic algorithm II (NSGA-II) [19] is a fast and elitist multi-objective genetic algorithm. The NSGA-II algorithm starts by creating a random initial parent population *P*_{0}, which is sorted based on non-domination such that a rank is assigned to each solution according to its level of non-domination (rank 1 corresponds to non-dominated solutions in the Pareto front). Conventional tournament selection, recombination and mutation operators for binary chromosomes are used to create an offspring population *Q*_{0} of size *N*.

At each generation *t*, the parent and offspring populations are combined and a population *R*_{t} = *P*_{t} ⋃ *Q*_{t} of size 2*N* is formed. Elitism is ensured because the best individuals from the parents and offspring are always retained. Then, *R*_{t} is sorted according to non-domination level using a fast sorting algorithm. The non-dominated solutions in ${\mathcal{F}}_{1}$ are included in the new population *P*_{t+1}. Once ${\mathcal{F}}_{1}$ is removed from *R*_{t}, the solutions in ${\mathcal{F}}_{2}$ are the new set of non-dominated solutions in ${R}_{t}-{\mathcal{F}}_{1}$, and are thus included in the new population *P*_{t+1}. This procedure is repeated, adding subsequent non-dominated fronts in the order of their ranking until reaching the required number of solutions *N*. Often, not all the solutions in the last considered front ${\mathcal{F}}_{l}$ are included. The solutions of the last front are sorted in descending order using the crowded-comparison operator ≼_{n}, which favors solutions with lower (better) non-domination rank and, if both solutions belong to the same front, favors the solution located in a less crowded region. The best solutions are chosen until *P*_{t+1} is filled, which is then used for crossover and mutation to create a new offspring population *Q*_{t+1} of size *N*. The overall complexity of the algorithm is *O*(*MN*^{2}), where *M* is the number of objectives, and is governed by the non-dominated sorting part of the algorithm. The diversity among non-dominated solutions is introduced by using the crowding-comparison procedure, which makes any niching parameter unnecessary.

1. Apply principal component analysis (PCA) to reduce the data dimensionality. Keep the first *p* − 1 principal components.
2. Randomly select *p* vectors from the data to initialize the set of induced endmembers *E*.
3. Calculate the volume of the simplex *v* = *V*(*E*) (13). *v*_{actual} = *v*.
4. For each endmember **e**_{k}, *k* = 1, ..., *p*:
   - (a) For each data vector **x**_{i}, *i* = 1, ..., *N*:
     - i. Form a new matrix *E*' by substituting the endmember **e**_{k} by the data vector **x**_{i}.
     - ii. Calculate the volume of the simplex *v*' = *V*(*E*').
     - iii. If *v*' > *v*_{actual} then *E*' becomes *E*. *v*_{actual} = *v*'.
5. If *v*_{actual} > *v* then *v* = *v*_{actual}. Go to step 4.

Algorithm 3: **N-FINDR algorithm**.

1. Combine parent and offspring population: *R*_{t} = *P*_{t} ⋃ *Q*_{t}
2. Calculate all the non-dominated fronts of ${R}_{t}:\mathcal{F}=\left({\mathcal{F}}_{1},{\mathcal{F}}_{2},...\right)$ = fast-non-dominated-sort (*R*_{t})
3. While the next front still fits in the parent population, $\left|{P}_{t+1}\right|+\left|{\mathcal{F}}_{i}\right|\le N$:
   - (a) Calculate crowding-distance in ${\mathcal{F}}_{i}$: crowding-distance-assignment $\left({\mathcal{F}}_{i}\right)$
   - (b) Include the *i*th non-dominated front in the parent population: ${P}_{t+1}={P}_{t+1}\cup {\mathcal{F}}_{i}$
   - (c) Check the next front for inclusion: *i* = *i* + 1
4. Sort in descending order using the crowding-comparison operator ≼_{n}: Sort $\left({\mathcal{F}}_{i},{\preccurlyeq}_{n}\right)$
5. Choose the first (*N* − |*P*_{t+1}|) elements of ${\mathcal{F}}_{i}:{P}_{t+1}={P}_{t+1}\cup {\mathcal{F}}_{i}\left[1:\left(N-\left|{P}_{t+1}\right|\right)\right]$
6. Use crossover and mutation to create a new offspring population *Q*_{t+1}
7. Increment the generation counter: *t* = *t* + 1

Algorithm 4: **NSGA-II algorithm iteration**.
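The crowding-distance assignment used to truncate the last front can be sketched as follows. The sketch follows the usual description of NSGA-II [19], in which boundary solutions of each objective receive an infinite distance and interior solutions accumulate the normalized gap between their neighbours; the function name `crowding_distance` and the sample front are hypothetical.

```python
from typing import List, Sequence


def crowding_distance(front: List[Sequence[float]]) -> List[float]:
    """Crowding distance of each objective vector in a single non-dominated front."""
    n = len(front)
    distance = [0.0] * n
    if n == 0:
        return distance
    for m in range(len(front[0])):
        # Sort the solution indices by the m-th objective.
        order = sorted(range(n), key=lambda i: front[i][m])
        low, high = front[order[0]][m], front[order[-1]][m]
        # Boundary solutions are always kept.
        distance[order[0]] = distance[order[-1]] = float("inf")
        if high == low:
            continue
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][m] - front[order[pos - 1]][m]
            distance[order[pos]] += gap / (high - low)
    return distance


# Hypothetical front with two objectives; keep the three least crowded solutions.
front = [(0.0, 4.0), (1.0, 3.0), (2.0, 1.0), (4.0, 0.0)]
distance = crowding_distance(front)
keep = sorted(range(len(front)), key=lambda i: -distance[i])[:3]
```

Sorting by decreasing distance and keeping the first *N* − |*P*_{t+1}| indices corresponds to steps 4 and 5 of the iteration listed above: the extremes of the front are always retained, and the interior solution with the closest neighbours is the first to be discarded.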

## Endnotes

^{a}The convex polytope is a simplex when it is defined by *d* + 1 vertices, where *d* is the dimension of the space.

^{b}http://www.mathworks.com/products/matlab/

^{c}http://www.ehu.es/ccwintco/index.php/GIC-source-code-free-libre

^{d}http://www.ehu.es/ccwintco/index.php/GIC-experimental-databases

^{e}https://engineering.purdue.edu/~biehl/MultiSpec/

^{f}We do not have knowledge of the ground truth endmembers.

## Declarations

### Acknowledgements

Miguel Angel Veganzones has a predoctoral grant from the Gobierno Vasco.

## References

- Keshava N, Mustard JF: Spectral unmixing. *IEEE Signal Process Mag* 2002, 19: 44-57. 10.1109/79.974727
- Plaza A, Benediktsson JA, Boardman JW, Brazile J, Bruzzone L, Camps-Valls G, Chanussot J, Fauvel M, Gamba P, Gualtieri A, Marconcini M, Tilton JC, Trianni G: Recent advances in techniques for hyperspectral image processing. *Remote Sens Environ* 2009, 113(Supplement 1):S110-S122.
- Plaza A, Martinez P, Perez R, Plaza J: A quantitative and comparative analysis of endmember extraction algorithms from hyperspectral data. *IEEE Trans Geosci Remote Sens* 2004, 42(3):650-663. 10.1109/TGRS.2003.820314
- Veganzones MA, Graña M: Endmember extraction methods: a short review. In *Knowledge-Based Intelligent Information and Engineering Systems, 12th International Conference, KES 2008 Proceedings, Part III*, Volume 5179 of Lecture Notes in Computer Science. Springer; 2008:400-407.
- Graña M: A brief review of lattice computing. In *IEEE International Conference on Fuzzy Systems, 2008. FUZZ-IEEE 2008. (IEEE World Congress on Computational Intelligence)*. *Volume 1*. Hong Kong, China; 2008:1777-1781.
- Graña M, Sussner P, Ritter G: Associative morphological memories for endmember determination in spectral unmixing. In *The 12th IEEE International Conference on Fuzzy Systems, 2003. FUZZ'03*. *Volume 2*. St. Louis, MO, USA; 2003:1285-1290.
- Graña M, Villaverde I, Maldonado JO, Hernandez C: Two lattice computing approaches for the unsupervised segmentation of hyperspectral images. *Neurocomputing* 2009, 72(10-12):2111-2120. 10.1016/j.neucom.2008.06.026
- Ritter GX, Urcid G, Schmalz MS: Autonomous single-pass endmember approximation using lattice auto-associative memories. *Neurocomputing* 2009, 72(10-12):2101-2110. 10.1016/j.neucom.2008.06.025
- Graña M, Chyzhyk D, García-Sebastián M, Hernández C: Lattice independent component analysis for functional magnetic resonance imaging. *Inf Sci* 2011, 181(10):1910-1928. 10.1016/j.ins.2010.09.023
- Ritter GX, Urcid G: A lattice matrix method for hyperspectral image unmixing. *Inf Sci* 2010, 181(10):1787-1803.
- Urcid S, Valdiviezo N: Generation of lattice independent vector sets for pattern recognition applications. In *Mathematics of Data/Image Pattern Recognition, Compression, Coding, and Encryption with Applications X*, Volume 67000C of Proceedings of SPIE. San Diego, CA, USA; 2007:1-12.
- Urcid G, Valdiviezo J, Ritter GX: Endmember search techniques based on lattice auto-associative memories: a case of vegetation discrimination. In *Image and Signal Processing for Remote Sensing XV*, Volume 7477, 74771D of Proceedings of SPIE. Barcelona, Spain; 2009:1-12.
- Ritter G, Gader P: Fixed points of lattice transforms and lattice associative memories. In *Advances in Imaging and Electron Physics*. *Volume 144*. Edited by: P Hawkes. Academic Press, Waltham, MA; 2006:165-242.
- Ritter GX, Sussner P, Diaz-de-Leon JL: Morphological associative memories. *IEEE Trans Neural Netw* 1998, 9(2):281-293. 10.1109/72.661123
- Ritter GX, Urcid G, Iancu L: Reconstruction of patterns from noisy inputs using morphological associative memories. *J Math Imag Vision* 2003, 19(2):95-111. 10.1023/A:1024773330134
- Ritter G, Urcid G: Lattice algebra approach to endmember determination in hyperspectral imagery. In *Advances in Imaging and Electron Physics*. *Volume 160*. Edited by: PW Hawkes. Academic Press, Burlington; 2010:113-169.
- Coello CAC, Lamont GB, Veldhuizen DAV: *Evolutionary Algorithms for Solving Multi-Objective Problems (Genetic and Evolutionary Computation)*. Springer-Verlag New York, Inc., New York; 2006.
- Deb K: *Multi-Objective Optimization Using Evolutionary Algorithms*. 1st edition. Wiley, NJ; 2001.
- Deb K, Pratap A, Agarwal S, Meyarivan T: A fast and elitist multiobjective genetic algorithm: NSGA-II. *IEEE Trans Evolution Comput* 2002, 6(2):182-197. 10.1109/4235.996017
- Konak A, Coit DW, Smith AE: Multi-objective optimization using genetic algorithms: a tutorial. *Reliab Eng Syst Safe* 2006, 91(9):992-1007. 10.1016/j.ress.2005.11.018
- Natarajan B, Konstantinides K, Herley C: Occam filters for stochastic sources with application to digital images. *IEEE Trans Signal Process* 1998, 46(5):1434-1438. 10.1109/78.668806
- Chang C, Wu C, Tsai C: Random N-Finder (N-FINDR) endmember extraction algorithms for hyperspectral imagery. *IEEE Trans Image Process* 2011, 20(3):641-656.
- Winter ME: N-FINDR: an algorithm for fast autonomous spectral end-member determination in hyperspectral data. In *Imaging Spectrometry V*, Volume 3753 of SPIE Proceedings. Edited by: Descour MR, Shen SS. Boston, MA, USA; 1999:266-275.
- Chang C, Ren H, Chang C, D'Amico F, Jensen JO: Estimation of subpixel target size for remotely sensed imagery. *IEEE Trans Geosci Remote Sens* 2004, 42(6):1309-1320.

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.