2.1. Image Database
A set of 33 high-resolution images was collected from storage tanks of a petroleum plant. The images were obtained under different acquisition conditions of illumination and magnification. Some images show a large number of corrosion defects, while others give a detailed view of a single defect. An expert selected 84 regions of interest (ROIs), each yielding a small image containing true corrosion, corrosion-like, or noncorrosion samples. A subset of 43 ROI images represents different types of corrosion damage. The remaining 41 ROI images contain noncorroded or corrosion-like surfaces. Figure 1 illustrates the image database.
2.2. Texture Attributes
Texture is formally defined as the set of local neighborhood properties of the gray levels of an image region [16]. It captures intuitive attributes such as roughness, granulation, and regularity. The literature describes four families of texture analysis methods: statistical, structural, model-based, and transform-based.
The statistical analysis of the gray level intensity distribution of an image is based on the assumption that texture information is contained in the spatial relationship between the intensity of a pixel and that of its neighbors [8]. This information is condensed in the gray level co-occurrence matrix (GLCM). The gray level intensity distribution can be specified by a matrix of relative frequencies with which two neighboring texture elements, separated by a distance $d$ in an orientation $\theta$, occur in the image, one with property $i$ and the other with property $j$.
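The construction of the co-occurrence matrix can be illustrated with a minimal NumPy sketch. The function name `glcm`, its parameters, and the choice of a symmetric, normalized matrix are assumptions made for illustration and are not taken from the original implementation.

```python
import numpy as np

def glcm(image, d=1, theta_deg=0, levels=8, symmetric=True):
    """Normalized co-occurrence matrix for a single (d, theta) offset.

    image: 2-D array of integer gray levels in the range [0, levels).
    """
    image = np.asarray(image, dtype=np.intp)
    theta = np.deg2rad(theta_deg)
    # Offset in (row, col). Rows grow downward, so a positive angle moves up.
    dr = -int(round(d * np.sin(theta)))
    dc = int(round(d * np.cos(theta)))

    rows, cols = image.shape
    # Keep only reference pixels whose neighbor stays inside the image.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    ref = image[r0:r1, c0:c1]
    nbr = image[r0 + dr:r1 + dr, c0 + dc:c1 + dc]

    # Count how often gray level i (reference) occurs next to gray level j.
    counts = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1)
    if symmetric:
        # Count each pair in both directions, as in the classical definition.
        counts = counts + counts.T
    total = counts.sum()
    return counts / total if total > 0 else counts
```

In practice the image is usually requantized to a small number of gray levels before this step, which keeps the matrix compact and the relative frequencies well populated.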
The GLCM yields at least 14 texture attributes [8]. For simplicity, however, we adopt an optimized subset of four attributes, namely contrast, correlation, energy, and homogeneity [10], given by (1)-(4), respectively.
The matrix $P$ represents the GLCM, and the upper limit of the summation in (1) is the GLCM size minus one. The parameters $\mu_i$, $\mu_j$, $\sigma_i$, and $\sigma_j$ in (2) represent, respectively, the mean values and standard deviations of row $i$ and column $j$ of the GLCM.
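For reference, the forms these four attributes commonly take in the literature are sketched below, in the order contrast, correlation, energy, and homogeneity. They are assumed, not confirmed, to correspond to (1)-(4). The notation $P(i,j)$ for a normalized GLCM entry and $N$ for the GLCM size is introduced here for illustration.

```latex
\begin{align*}
\text{Contrast}    &= \sum_{i,j=0}^{N-1} |i-j|^{2}\, P(i,j) \\
\text{Correlation} &= \sum_{i,j=0}^{N-1} \frac{(i-\mu_i)(j-\mu_j)\, P(i,j)}{\sigma_i\, \sigma_j} \\
\text{Energy}      &= \sum_{i,j=0}^{N-1} P(i,j)^{2} \\
\text{Homogeneity} &= \sum_{i,j=0}^{N-1} \frac{P(i,j)}{1+|i-j|}
\end{align*}
```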
Contrast measures the intensity dissimilarity between a pixel and its neighbor over the whole image. Correlation represents how strongly a pixel is related to its neighbor over the whole image. Energy is the sum of squared elements in the GLCM, also known as uniformity of energy. Homogeneity expresses the similarity between the gray level values of neighboring image pixels.
Homogeneity and contrast identify organized structures in the image. Energy and correlation characterize the complexity and nature of gray level transitions in the image. Even though these attributes contain information about image texture, it is difficult to identify which specific texture characteristic is represented by each attribute. Hence, texture attributes are stored in a feature database for further characterization by a classification process.
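As a minimal sketch, the four attributes can be computed from a co-occurrence matrix as follows. The function name `glcm_features` is hypothetical, and the formulas are the commonly used definitions, assumed here to match those adopted in this work.

```python
import numpy as np

def glcm_features(P):
    """Contrast, correlation, energy and homogeneity of a co-occurrence matrix.

    P: square, non-empty matrix of co-occurrence counts or frequencies.
    """
    P = np.asarray(P, dtype=np.float64)
    P = P / P.sum()                      # work with relative frequencies
    n = P.shape[0]
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")

    contrast = np.sum((i - j) ** 2 * P)

    # Means and standard deviations along rows (i) and columns (j).
    mu_i, mu_j = np.sum(i * P), np.sum(j * P)
    sigma_i = np.sqrt(np.sum((i - mu_i) ** 2 * P))
    sigma_j = np.sqrt(np.sum((j - mu_j) ** 2 * P))
    denom = sigma_i * sigma_j
    # Correlation is undefined for a constant image (zero deviation).
    correlation = (np.sum((i - mu_i) * (j - mu_j) * P) / denom
                   if denom > 0 else float("nan"))

    energy = np.sum(P ** 2)
    homogeneity = np.sum(P / (1.0 + np.abs(i - j)))

    return {"contrast": contrast, "correlation": correlation,
            "energy": energy, "homogeneity": homogeneity}
```

A perfectly diagonal matrix, where every pixel matches its neighbor, gives zero contrast and unit homogeneity, which is consistent with the interpretation of these attributes given above.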
2.3. Color Attributes
Color is the visual perception of the spectral distribution of light. Optical imaging uses three color channels, usually associated with red ($R$), green ($G$), and blue ($B$), which are sufficient for the visual interpretation of spectra [16]. When corrosion is detected in images by means of digital image processing and pattern recognition algorithms, it is important to identify the color model that best represents the color attributes.
The HSI system is a model that closely describes how humans naturally respond to color. The HSI color space is therefore appropriate for this purpose, since it allows chrominance to be described separately from brightness [11].
The hue, saturation, and intensity are obtained from RGB color space by using the following transformations:
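A commonly used form of the RGB-to-HSI conversion is sketched below. It is assumed, not confirmed, to match (5)-(7), with $R$, $G$, and $B$ normalized to $[0, 1]$ and the hue expressed in degrees.

```latex
\begin{align*}
H &= \begin{cases} \vartheta, & B \le G \\ 360^{\circ} - \vartheta, & B > G \end{cases},
\qquad
\vartheta = \arccos\!\left( \frac{\tfrac{1}{2}\left[(R-G)+(R-B)\right]}
{\sqrt{(R-G)^{2} + (R-B)(G-B)}} \right) \\
S &= 1 - \frac{3\,\min(R, G, B)}{R+G+B} \\
I &= \frac{R+G+B}{3}
\end{align*}
```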
Hue ($H$) is proportional to the color frequency, as (5) describes. For a corroded surface, $H$ lies between the yellow and red wavelengths.
Saturation ($S$) refers to the dominance of hue in the color and is given by (6). A corroded surface is normally more saturated than other areas because metallic surfaces are often painted in light colors such as gray and white.
Intensity ($I$) is given by (7) and describes the strength of the light. As explained above, the color of a noncorroded surface tends toward white (high intensity).
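The conversion from RGB to HSI can be sketched in NumPy as follows. The function name `rgb_to_hsi` and the small constant `eps` are illustrative assumptions, and the formulas are the widely used geometric definitions, which may differ in detail from the ones adopted here.

```python
import numpy as np

def rgb_to_hsi(rgb, eps=1e-12):
    """Convert RGB values in [0, 1] to (hue in degrees, saturation, intensity).

    rgb: array whose last axis holds the R, G and B components.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

    intensity = (r + g + b) / 3.0
    # Equivalent to 1 - 3 * min(R, G, B) / (R + G + B); eps avoids 0/0 on black.
    saturation = 1.0 - np.min(rgb[..., :3], axis=-1) / (intensity + eps)

    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt(np.maximum((r - g) ** 2 + (r - b) * (g - b), 0.0)) + eps
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    # Hue is not meaningful for gray pixels (R = G = B), where saturation is 0.
    hue = np.where(b <= g, theta, 360.0 - theta)

    return np.stack([hue, saturation, intensity], axis=-1)
```

With this convention, pure red maps to a hue near 0 degrees and yellow to 60 degrees, so the hue range associated with corroded surfaces falls in the lower part of the scale.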
Color attributes are obtained from statistical moments extracted from the histogram of each HSI channel. We define the histogram as the frequency $h(k)$ of each pixel value $k$, where the range of $k$ is determined by the image quantization.
Each statistical moment carries a different meaning. The first moment (8) indicates where the individual color generally lies in the HSI color space. The second moment (9) captures the spread or scale of the color distribution. Noncorroded surfaces are often homogeneous and therefore show low variance. The third moment (10) measures the asymmetry of the data around the sample mean and indicates whether the HSI values lie toward the maximum or the minimum of the scale. The fourth moment (11) measures the flatness or peakedness of the color distribution.
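The four moments can be sketched from a channel histogram as shown below. The function name `histogram_moments`, the bin count, and the value range are hypothetical. The third and fourth moments are standardized here (skewness and kurtosis), which is an assumption about the form of (10) and (11).

```python
import numpy as np

def histogram_moments(channel, bins=256, value_range=(0.0, 1.0)):
    """Mean, variance, skewness and kurtosis of one color channel.

    channel: non-empty array of H, S or I values lying inside value_range.
    """
    counts, edges = np.histogram(np.ravel(channel), bins=bins, range=value_range)
    centers = 0.5 * (edges[:-1] + edges[1:])   # representative value per bin
    p = counts / counts.sum()                  # relative frequency per bin

    mean = np.sum(centers * p)                       # first moment
    variance = np.sum((centers - mean) ** 2 * p)     # second central moment
    std = np.sqrt(variance)
    if std > 0:
        skewness = np.sum((centers - mean) ** 3 * p) / std ** 3
        kurtosis = np.sum((centers - mean) ** 4 * p) / std ** 4
    else:
        # A perfectly uniform region has no spread, so shape is undefined.
        skewness = kurtosis = float("nan")
    return mean, variance, skewness, kurtosis
```

Applying such a function to each of the three HSI channels would give twelve color attributes per ROI image, which can be stored alongside the texture attributes.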