Open Access

Fast marching over the 2D Gabor magnitude domain for tongue body segmentation

  • Zhenchao Cui1,
  • Hongzhi Zhang1,
  • David Zhang1, 2,
  • Naimin Li1 and
  • Wangmeng Zuo1
EURASIP Journal on Advances in Signal Processing 2013, 2013:190

https://doi.org/10.1186/1687-6180-2013-190

Received: 4 June 2013

Accepted: 12 December 2013

Published: 26 December 2013

Abstract

Tongue body segmentation is a prerequisite to tongue image analysis and has recently received considerable attention. The existing tongue body segmentation methods usually involve two key steps: edge detection and active contour model (ACM)-based segmentation. However, conventional edge detectors cannot faithfully detect the contour of the tongue body, and the initialization of ACM suffers from the edge discontinuity problem. To address these issues, we proposed a novel tongue body segmentation method, GaborFM, which initializes ACM by performing fast marching over the two-dimensional (2D) Gabor magnitude domain of the tongue images. For the enhancement of the contour of the tongue body, we used the 2D Gabor magnitude-based detector. To cope with the edge discontinuity problem, the fast marching method was utilized to connect the discontinuous contour segments, resulting in a closed and continuous tongue body contour for subsequent ACM-based segmentation. Qualitative and quantitative results showed that GaborFM is superior to the other methods for tongue body segmentation.

Keywords

Image segmentation Tongue diagnosis Fast marching 2D Gabor filter Active contour model

1. Introduction

Medical image segmentation, analysis, and diagnosis have received much attention in image analysis and computer vision. Several segmentation methods have been proposed for medical images such as magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound imagery. Yezzi et al. proposed the geometric active contour model for segmentation of medical imagery [1]. Tsai et al. proposed a shape-based method using level sets to segment medical images [2]. For prostate segmentation in transrectal ultrasound (TRUS), Yan et al. proposed a discrete deformable model [3]. Graph cuts [4], proposed by Boykov et al., have also been used for the segmentation of medical images. As one part of medical image analysis and diagnosis, tongue image diagnosis has received considerable research interest because of its convenience and non-invasive nature.

Tongue diagnosis has played an important role in traditional Chinese medicine (TCM) for thousands of years [5, 6]. The TCM practitioner can analyze the physiological and pathological conditions of the patient by inspecting the color, shape, and texture of the tongue, making tongue diagnosis very promising for convenient and non-invasive diagnosis. However, traditional tongue diagnosis is a subjective skill which requires years of experience and practice. Moreover, for different practitioners, the diagnosis results may be inconsistent.

To address these problems, computational tongue diagnosis was studied for the digital acquisition and quantitative analysis of tongue images [7–9]. Generally, the computational tongue diagnosis system involves three major modules: tongue body segmentation, feature extraction, and syndrome and disease analysis [10–12]. Several imaging systems have been developed for the acquisition of tongue images [13–15]. A number of feature extraction methods have been proposed for extracting features of tongue color [16, 17], coating, and texture [18–21], fissures [22–25], shape [26, 27], tooth marks [28], petechia spots [29], and the sublingual vein [30, 31]. Pattern recognition approaches such as the Bayesian network and support vector machine have been used for syndrome and disease analysis [9, 32–34].

Among the modules of computational tongue diagnosis systems, tongue body segmentation is a prerequisite for subsequent feature extraction and diagnostic analysis. Tongue body segmentation usually involves three steps: edge enhancement, contour initialization, and segmentation. For edge enhancement, conventional edge detectors, such as the Gaussian gradient [35], color gradient [36], and Canny detector [7], have been used. Considering the shape and gray-level characteristics of tongue contour, Zuo et al. [10, 11] proposed a polar edge detector to suppress adverse interference from the lip boundary and tongue fissures. For contour initialization, several heuristic [10], deformable model-based [35], and watershed-based [36] methods have been suggested. For segmentation, active contour models (ACM) [11, 13, 35, 37] and the gradient vector flow (GVF) snake [38] have been used for the final segmentation of the tongue body [14, 39, 40].

Despite the progress in tongue body segmentation, the existing methods suffer from several limitations and cannot meet the performance requirements of a practical computational tongue diagnosis system. For example, in the edge enhancement step, conventional edge detectors usually neglect the characteristics of gray-level variation along the tongue body contour. In the contour initialization step, parts of the tongue body contour might exhibit very weak edges, so the detected contour would be discontinuous, making it hard to obtain a continuous contour for initialization. Moreover, the interference from lips and tongue fissures makes tongue body segmentation more challenging.

In this study, to circumvent edge enhancement and contour discontinuity problems, we propose a novel tongue body segmentation method, GaborFM, which performs fast marching over the two-dimensional (2D) Gabor magnitude domain of tongue images. As shown in Figure 1, the proposed GaborFM method first uses the 2D Gabor magnitude-based edge detector for edge enhancement. Then, in the contour initialization stage, the continuous initial contour is obtained by using fast marching to connect stable segments of the tongue body contour. Finally, the GVF snake is used for tongue body segmentation.
Figure 1

Intermediate and final results of GaborFM for tongue body segmentation. (a) A tongue image, (b) results after edge enhancement, (c) result after removing non-tongue parts, (d) result after edge thresholding and the selection of the stable segments, points A, C, B, D are four points in the two segments; (e) the result of fast marching, and (f) the final segmentation results, where the red curve is the initial contour, and the green curve is the final contour of the tongue body.

Edge enhancement is fundamental for any edge-based active contour or level set segmentation method. Thus, we first studied the enhancement of the tongue body contour by the proposed 2D Gabor magnitude-based edge detection method. Several region-based models exist, such as color-texture segmentation [38, 41] and the C-V model and its extensions [42]. However, tongue texture is very complex and varies from image to image, so this study used the edge-based model and first investigated the edge enhancement problem.

In the contour initialization stage, stable segment selection and the fast marching method were used. This initialization strategy allowed the incorporation of various tongue priors to obtain a satisfactory initial contour. For example, short segments were discarded. Considering the symmetry of the tongue body contour, we selected pairs of segments with better symmetry. Moreover, to avoid interference from the lips and to remove shadow regions, two paths were found for any two points. If the difference in the lengths L F of the two paths was small, the inner curve was chosen; otherwise, the shorter path was chosen.

The contour initialization strategy allowed the employment of many types of tongue priors, such as color, shape, and symmetry, to obtain a satisfactory initial contour, and the GVF snake was able to obtain a locally optimal and smooth segmentation result. Geodesic active contours [43, 44] can be used to address the limited capture range problem of the conventional snake model. Most globally optimal active contours [45–48] obtain the global optimum of the energy function, which generally yields only a non-ideal separation of the tongue body from the other parts. Several globally optimal active contours [49, 50] are similar to the fast marching method used in GaborFM and obtain the globally optimal path connecting two points. Thus, we did not use methods for global minimization of the active contour model.

Compared with the existing methods, GaborFM has two outstanding advantages:
(1) By taking the characteristics of gray-level variation of the real tongue body contour into account, the 2D Gabor magnitude-based detector is more effective than the conventional edge detector for the enhancement of the tongue body contour and the suppression of interference from lip and tongue fissures. For this reason, the 2D Gabor magnitude-based edge detection method was used to address the edge enhancement problem.

(2) To circumvent the edge discontinuity problem, we used fast marching over the 2D Gabor magnitude domain to connect the discontinuous contour segments into a closed tongue body contour. First, several stable segments were selected based on the edge enhancement result. Then, the contour initialization problem was modeled as finding the minimal geodesic paths over the 2D Gabor magnitude domain. Finally, the fast marching algorithm was used to obtain a closed tongue body contour for initialization.

The remainder of the paper is organized as follows. Section 2 presents the 2D Gabor magnitude-based edge detector for the enhancement of the tongue body contour. Section 3 describes the scheme for contour initialization and tongue body segmentation. Section 4 provides the qualitative and quantitative results to evaluate the proposed GaborFM method. Finally, Section 5 offers the conclusion.

2. 2D Gabor magnitude-based edge detection

In this section, we propose a 2D Gabor magnitude-based edge detector for the enhancement of the tongue body contour. We then use the color characteristics of the tongue body to suppress interference from the tongue texture and fissures, and finally apply Otsu's method for edge thresholding.

2.1 2D Gabor magnitude-based detector

Figure 2 shows the typical profiles of the boundary pixels of the tongue body. For the four typical boundary pixels shown in Figure 2a, Figure 2b shows the intensity profiles along the white lines, and Figure 2c shows the profiles of the real and the imaginary parts of the Gabor filter. Figure 2 shows that the profiles of the boundary pixels resemble the real and the imaginary parts of the Gabor filter. Thus, the 2D Gabor filter is appropriate for enhancing the boundary of the tongue body.
Figure 2

Typical profiles of the boundary pixel of the tongue body. (a) A typical tongue image, (b) profiles of the intensities of the four boundary pixels along the white lines in (a), and (c) profiles of the real and the imaginary parts of the Gabor filter.

The 2D Gabor filter was first proposed by Daugman to model the receptive fields of the orientation-selective simple cells of the visual cortex [51]. Lee reformulated 2D Gabor wavelets based on physiological evidence and the wavelet frame criterion [52]. 2D Gabor filters have been extensively applied to many image processing and computer vision applications [53, 54]. We used the form of 2D Gabor functions derived by Lee [52],
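$$\psi(x, y, x_0, y_0, \omega, \theta, \kappa) = \frac{\omega}{\sqrt{2\pi}\,\kappa}\, e^{-\frac{\omega^2}{8\kappa^2}\left(4x'^2 + y'^2\right)} \left[ e^{i\omega x'} - e^{-\frac{\kappa^2}{2}} \right], \tag{1}$$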

where $x' = (x - x_0)\cos\theta + (y - y_0)\sin\theta$, $y' = -(x - x_0)\sin\theta + (y - y_0)\cos\theta$, $(x_0, y_0)$ is the center of the function, $\omega$ is the radial frequency in radians per unit length, and $\theta$ is the orientation of the Gabor function in radians. The parameter $\kappa$ is defined by $\kappa = \sqrt{2\ln 2}\,\frac{2^{\delta} + 1}{2^{\delta} - 1}$, where $\delta$ is the half-amplitude bandwidth of the frequency response, which is between 1 and 1.5 octaves according to neurophysiological findings [52]. When $\omega$ and $\delta$ are fixed, $\sigma$ can be derived from $\sigma = \kappa/\omega$.

In the 2D Gabor magnitude-based edge detector, ω = 4 × 2 and δ = 1.5 were chosen, and $G_k(x, y)$ was used to denote the Gabor filter with orientation $\theta = k\pi/8$ ($k = 0, 1, \ldots, 7$). Given a tongue image $I(x, y)$, the convolution of the image $I$ and $G_k$ is
$$FI_k = I * G_k, \tag{2}$$
where ‘*’ denotes the convolution operator. As Figure 2 suggests, the real part of $FI_k$ can be used to enhance the boundary pixels with profiles similar to those shown on the left of Figure 2b, and the imaginary part can be used to enhance the boundary pixels with profiles similar to those shown on the right of Figure 2b. The output of the 2D Gabor magnitude-based detector is then defined as
$$M_{\max}(x, y) = \max_k \sqrt{FI_k(x, y) \times \overline{FI_k(x, y)}}, \tag{3}$$

where ‘¯’ denotes the complex conjugate operator. Generally, any profiles in Figure 2b would result in local maxima in M max(x, y). Moreover, the promising performance of the Gabor filter on noise robustness and time-frequency tradeoff makes the proposed 2D Gabor magnitude-based detector suitable for the enhancement of the tongue boundary.
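The detector of Equations 1 to 3 can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the value `omega = 0.5` rad/pixel, the kernel support of six envelope scales, and the zero-padded FFT convolution are all assumptions made for the example, and the helper names are hypothetical.

```python
import numpy as np

def lee_gabor_kernel(omega, theta, delta=1.5):
    """Complex 2D Gabor kernel in the form of Eq. (1), centred at the origin."""
    kappa = np.sqrt(2.0 * np.log(2.0)) * (2.0**delta + 1.0) / (2.0**delta - 1.0)
    sigma = kappa / omega
    half = int(np.ceil(6.0 * sigma))  # assumed support; the envelope is wider along y'
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(omega**2) / (8.0 * kappa**2) * (4.0 * xr**2 + yr**2))
    carrier = np.exp(1j * omega * xr) - np.exp(-(kappa**2) / 2.0)  # DC-compensated
    return omega / (np.sqrt(2.0 * np.pi) * kappa) * envelope * carrier

def fft_convolve_same(image, kernel):
    """Linear convolution via zero-padded FFT, cropped to the image size."""
    h, w = image.shape
    kh, kw = kernel.shape
    shape = (h + kh - 1, w + kw - 1)
    full = np.fft.ifft2(np.fft.fft2(image, shape) * np.fft.fft2(kernel, shape))
    top, left = (kh - 1) // 2, (kw - 1) // 2
    return full[top:top + h, left:left + w]

def gabor_magnitude(image, omega=0.5, delta=1.5, n_orient=8):
    """Eq. (3): maximum Gabor magnitude over n_orient orientations."""
    responses = [
        np.abs(fft_convolve_same(image, lee_gabor_kernel(omega, k * np.pi / n_orient, delta)))
        for k in range(n_orient)
    ]
    return np.max(responses, axis=0)

# Toy input: a vertical step edge between columns 31 and 32.
img = np.zeros((64, 64))
img[:, 32:] = 1.0
M = gabor_magnitude(img)
```

On this toy image the response is concentrated around the step and is close to zero in the flat region on the left. Because of the zero padding, the image borders also act as edges, so a real pipeline would need some border handling.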

Figure 3 shows the edge enhancement results of a typical tongue image using different edge detectors. Compared with the Sobel operator and derivative of Gaussian (DoG) filters [55], the 2D Gabor magnitude-based detector is more effective for the enhancement of the tongue body contour.
Figure 3

Enhancement of the tongue body contour using different edge detectors. (a) The original image, (b) Sobel operator, (c) DoG filters ( σ = 2 ), and (d) 2D Gabor magnitude.

2.2 Edge thresholding

We used a two-step strategy for edge thresholding. First, based on the color characteristics of the tongue body and face, parts of the non-boundary pixels were identified. Then, the binarization of the edge image was obtained by defining the proper threshold.

As shown in Figure 4a, a typical tongue image usually involves three major components: background, the tongue body, and other facial parts. If parts of the tongue body and the background components are correctly identified, they can be simply marked as non-boundary pixels. All the tongue images were captured in a semi-enclosed environment under stable and uniform lighting conditions. Thus, the background is stable and can be easily segmented from the tongue image. The color tongue image was transformed into the YIQ color space, and Otsu's method [56] was used to determine a threshold T b for the I channel of the YIQ color space. The I channel was chosen because increasing I characterizes the change from blue, through purple, to red colors, and is more effective in separating background pixels from the others than Y and Q. Pixels with an I value smaller than T b were then set as background pixels, and morphological operators, such as dilation, filling, and erosion, were used to define the background region.
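The background step can be illustrated with a small NumPy sketch. It assumes RGB values in [0, 1], the standard NTSC coefficients for the I channel, and a histogram-based Otsu threshold. The function names and the synthetic two-color image are hypothetical, and the morphological clean-up is omitted.

```python
import numpy as np

def yiq_i_channel(rgb):
    """I channel of YIQ (NTSC coefficients) for an RGB array with values in [0, 1]."""
    return 0.596 * rgb[..., 0] - 0.274 * rgb[..., 1] - 0.322 * rgb[..., 2]

def otsu_threshold(values, n_bins=256):
    """Otsu's threshold: the bin boundary maximising the between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    p = hist / hist.sum()
    w0 = np.cumsum(p)                 # weight of the lower class
    m = np.cumsum(p * centers)        # cumulative first moment
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (m[-1] * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    return edges[np.argmax(between) + 1]

# Synthetic example: bluish "background" on the left, reddish region on the right.
rng = np.random.default_rng(0)
img = np.zeros((40, 60, 3))
img[:, :30] = [0.1, 0.2, 0.8]
img[:, 30:] = [0.8, 0.3, 0.3]
img += rng.normal(0.0, 0.02, img.shape)

I = yiq_i_channel(img)
T_b = otsu_threshold(I)
background = I < T_b   # pixels with I below T_b are marked as background
```

The same thresholding helper could be reused for the Q channel, with the doubled threshold described in the next paragraph.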
Figure 4

Thresholding of the edge image. (a) Three major components of a tongue image, (b) edge image after masking out parts of the non-boundary pixels, and (c) the final binarized edge image.

Unlike the background, due to physiological and pathological factors, the color of the tongue body varies with tongue images. Fortunately, it is relatively easy to distinguish the color of tongue body from that of other facial parts: the pixel values of the tongue body component usually have higher Q value. Let T 1 denote the threshold obtained by Otsu's method. We set the threshold T Q  = 2 T 1 and set the pixels with a Q value higher than T Q as tongue body pixels. Morphological operators, mentioned above, were then used to define the tongue body region. A subset of the background and tongue body components was then obtained. As shown in Figure 4b, by assigning these pixels as non-boundary pixels, a modified edge image can be derived for edge binarization.

Finally, a threshold T M for the binarization of the modified edge image is defined. Let T o be the threshold obtained by Otsu's method on M max(x, y), and VarM be the standard variance of M max(x, y). The threshold T M is then defined as
$$T_M = \begin{cases} T_o, & \text{if } T_o > 1.1\,\mathrm{Var}_M, \\ 1.1\,\mathrm{Var}_M, & \text{else.} \end{cases} \tag{4}$$

After edge thresholding, the morphological operators were employed to obtain single-pixel edge curves, resulting in the final binarized edge image shown in Figure 4c.
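The rule in Equation 4 amounts to taking the larger of the Otsu threshold and 1.1 times the spread of the edge map. The sketch below assumes that VarM is computed as the standard deviation of the edge map, which is one reading of "standard variance"; the function name is hypothetical.

```python
import numpy as np

def edge_binarization_threshold(t_otsu, m_max, factor=1.1):
    """Eq. (4): use the Otsu threshold unless it falls below factor * VarM."""
    spread = float(np.std(m_max))  # assumption: VarM read as the standard deviation
    return t_otsu if t_otsu > factor * spread else factor * spread

m_max = np.array([[0.0, 0.0], [0.0, 4.0]])            # toy edge map, std = sqrt(3)
t_high = edge_binarization_threshold(3.0, m_max)      # Otsu value kept
t_low = edge_binarization_threshold(1.0, m_max)       # floor of 1.1 * std applied
```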

3. Tongue body segmentation using fast marching and the GVF snake

In this section, first several stable segments from the binarized edge image are selected, and then the fast marching algorithm is used to obtain a closed contour for initialization. Finally, the GVF snake model is adopted for tongue body segmentation.

3.1 Selection of stable segments

Although the 2D Gabor magnitude-based detector is effective, false detection of the tongue body boundary cannot be completely avoided. Moreover, it is also difficult to detect the entire closed tongue body contour by only using the 2D Gabor magnitude-based detector. Fortunately, the location and shape of the tongue body are useful in distinguishing a true boundary from a false boundary. Thus, to suppress the adverse influence of false detection, we used the strategy of selecting stable segments of the tongue body contour.

By referring to the location and shape of the tongue body, the following approach for selecting several stable segments from the binarized edge image is suggested. First, the mean $(\bar{x}_1, \bar{y}_1)$ of the coordinates of all the edge pixels and the mean $(\bar{x}_2, \bar{y}_2)$ of the coordinates of all the tongue body pixels obtained in Section 2.2 were computed. Then, the center of the tongue body $(\bar{x}, \bar{y})$ was estimated as the average of $(\bar{x}_1, \bar{y}_1)$ and $(\bar{x}_2, \bar{y}_2)$.

As shown in Figure 5a, to avoid the interference of the lips, the left and the right sides of the tongue body boundary are considered to be more stable. So, we focused on the stable segments from the left and right boundaries of the tongue body. Based on the estimated center of the tongue body $(\bar{x}, \bar{y})$, the binarized edge image was divided into left and right parts. For each part, a segment with sufficient length was defined as a stable segment. To verify the stable segments, the approximate symmetry of the segments was also taken into account. Figure 5b shows an example of the two stable segments. Finally, on each stable segment, two stable points were found by backtracking from the end points of the stable segment for contour initialization, as shown in Figure 5b.
Figure 5

Selection of stable segments. (a) The binarized edge image, (b) stable segments with the corresponding stable points, and (c) the result of fast marching.
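The center estimate and the left/right split described in this subsection can be sketched as follows. The masks are assumed to be boolean arrays, the function names are hypothetical, and the length and symmetry checks on the segments are not shown.

```python
import numpy as np

def estimate_center(edge_mask, body_mask):
    """Average of the edge-pixel centroid and the tongue-body-pixel centroid, as (x, y)."""
    ey, ex = np.nonzero(edge_mask)
    by, bx = np.nonzero(body_mask)
    c1 = np.array([ex.mean(), ey.mean()])
    c2 = np.array([bx.mean(), by.mean()])
    return 0.5 * (c1 + c2)

def split_left_right(edge_mask, center_x):
    """Split a boolean edge mask into the parts left and right of center_x."""
    cols = np.arange(edge_mask.shape[1])
    left = edge_mask & (cols < center_x)[None, :]
    right = edge_mask & (cols >= center_x)[None, :]
    return left, right

# Toy masks: two edge pixels on either side of a small "tongue body" block.
edge = np.zeros((9, 9), dtype=bool)
edge[4, 1] = edge[4, 7] = True
body = np.zeros((9, 9), dtype=bool)
body[3:6, 3:6] = True

center = estimate_center(edge, body)
left, right = split_left_right(edge, center[0])
```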

3.2 Contour initialization using fast marching

For the initialization of the tongue body contour, the fast marching algorithm was utilized to address the discontinuity problem by connecting different stable segments. Given two points $A(x_A, y_A)$ and $C(x_C, y_C)$, we defined the length of a planar curve $\gamma : [0, 1] \to \mathbb{R}^2$ from $A$ to $C$,
$$L_F(\gamma) = \int_0^1 g(\gamma(t))\, \|\gamma'(t)\|_2 \, dt, \tag{5}$$
where $\|\gamma'(t)\|_2$ denotes the $\ell_2$-norm of the derivative of $\gamma(t)$, and $g(x, y)$ is defined on $\mathbb{R}^2$ [43],
$$g(x, y) = \frac{1}{1 + M_{\max}(x, y)}. \tag{6}$$
Then the required curve from $A$ to $C$ was defined as the shortest path with respect to $L_F(\gamma)$,
$$\gamma^*(t) = \arg\min_{\gamma(t) \in \mathcal{C}} L_F(\gamma(t)), \tag{7}$$

where $\mathcal{C}$ is the set of curves with $\gamma(0) = (x_A, y_A)$ and $\gamma(1) = (x_C, y_C)$.

According to [57–59], $\gamma^*$ can also be estimated by solving the following Eikonal equation,
$$\|\nabla T_A(x, y)\| = g(x, y) \tag{8}$$

with $T_A(x_A, y_A) = 0$, where $T_A(x, y)$ denotes the shortest distance $L_F$ between $(x_A, y_A)$ and $(x, y)$. The fast marching method was used to solve the Eikonal equation to obtain $T_A(x, y)$. The computational complexity of fast marching is $O(N \log N)$, where $N = mn$ is the size of the tongue image. Moreover, fast marching can be stopped as soon as the point $(x_C, y_C)$ is reached, so $T_A(x, y)$ can be obtained more efficiently.
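A compact first-order fast marching solver for the Eikonal equation above might look like the sketch below. It assumes a 4-connected grid with unit spacing and uses a binary heap, which gives the O(N log N) behaviour mentioned in the text. It is an illustrative textbook variant, not the implementation used by the authors, and the function name and arguments are hypothetical.

```python
import heapq
import numpy as np

def fast_marching(g, source, target=None):
    """Solve |grad T| = g with T(source) = 0; optionally stop once target is accepted."""
    rows, cols = g.shape
    T = np.full((rows, cols), np.inf)
    frozen = np.zeros((rows, cols), dtype=bool)
    T[source] = 0.0
    heap = [(0.0, source)]

    def known(r, c):
        # Arrival time of an accepted neighbour, or infinity otherwise.
        if 0 <= r < rows and 0 <= c < cols and frozen[r, c]:
            return T[r, c]
        return np.inf

    while heap:
        _, (r, c) = heapq.heappop(heap)
        if frozen[r, c]:
            continue  # stale heap entry
        frozen[r, c] = True
        if target is not None and (r, c) == target:
            break
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or frozen[nr, nc]:
                continue
            a = min(known(nr - 1, nc), known(nr + 1, nc))  # best vertical neighbour
            b = min(known(nr, nc - 1), known(nr, nc + 1))  # best horizontal neighbour
            f = g[nr, nc]
            if abs(a - b) < f:
                # Two-sided upwind update: (T - a)^2 + (T - b)^2 = f^2
                t_new = 0.5 * (a + b + np.sqrt(2.0 * f * f - (a - b) ** 2))
            else:
                t_new = min(a, b) + f  # one-sided update
            if t_new < T[nr, nc]:
                T[nr, nc] = t_new
                heapq.heappush(heap, (t_new, (nr, nc)))
    return T

# With g = 1 everywhere, T approximates the Euclidean distance to the source.
g = np.ones((41, 41))
T_full = fast_marching(g, (20, 20))
T_early = fast_marching(g, (20, 20), target=(20, 25))
```

In the setting of this section, `g` would be computed from the Gabor magnitude as in Equation 6, so that paths along strong edges are cheap. With early stopping, pixels far from the source are never visited and keep an infinite arrival time.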

After the computation of $T_A(x, y)$, given the point $(x_C, y_C)$, the shortest path can be constructed by solving the following ordinary differential equation (ODE),
$$\frac{dX}{dt} = -\nabla T_A \tag{9}$$

with the initial value $X(0) = (x_C, y_C)$. This ODE problem can be efficiently solved by using the second-order Heun's method [57].
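The backtracking step can be sketched with Heun's method as below. Two assumptions are made for the example: the descent direction is normalised to unit length, a common practical choice that traces the same curve with a fixed step size, and the arrival-time map is replaced by an analytic Euclidean distance map so that the snippet is self-contained. All function names are hypothetical.

```python
import numpy as np

def bilinear(field, p):
    """Bilinear interpolation of a 2D array at the fractional (row, col) position p."""
    r, c = p
    r0 = int(np.clip(np.floor(r), 0, field.shape[0] - 2))
    c0 = int(np.clip(np.floor(c), 0, field.shape[1] - 2))
    fr, fc = r - r0, c - c0
    return ((1 - fr) * (1 - fc) * field[r0, c0] + (1 - fr) * fc * field[r0, c0 + 1]
            + fr * (1 - fc) * field[r0 + 1, c0] + fr * fc * field[r0 + 1, c0 + 1])

def backtrack(T, start, source, step=0.5, max_steps=10000):
    """Follow -grad T from start towards source using Heun's second-order method."""
    grad_r, grad_c = np.gradient(T)

    def direction(p):
        v = -np.array([bilinear(grad_r, p), bilinear(grad_c, p)])
        n = np.linalg.norm(v)
        return v / n if n > 1e-12 else np.zeros(2)

    p = np.array(start, dtype=float)
    src = np.array(source, dtype=float)
    path = [p.copy()]
    for _ in range(max_steps):
        if np.linalg.norm(p - src) <= 1.0:
            break  # within one pixel of the source
        k1 = direction(p)
        k2 = direction(p + step * k1)              # predictor
        p = p + 0.5 * step * (k1 + k2)             # corrector
        p = np.clip(p, 0, np.array(T.shape) - 1)
        path.append(p.copy())
    path.append(src)
    return np.array(path)

# Stand-in arrival-time map: Euclidean distance to the source at (5, 8).
rr, cc = np.mgrid[0:41, 0:41]
T = np.hypot(rr - 5, cc - 8)
path = backtrack(T, start=(35, 30), source=(5, 8))
path_length = np.linalg.norm(np.diff(path, axis=0), axis=1).sum()
```

For this distance map the recovered path is close to the straight line between the two points, whose length is about 37.2 pixels.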

In practice, to avoid the interference of the lips, we found two paths for any two points. If the difference in the lengths L F of the two paths was small, the inner curve was chosen; otherwise, the shorter path was chosen. Figures 1e and 5c show two examples of the shortest paths constructed by the fast marching method. Satisfactory initialization of the tongue body contours can be obtained by using the fast marching method.

3.3 Gradient vector flow snake

An active contour model (or snake) [60] is a curve $\mathbf{x}(s) = [x(s), y(s)]$, $s \in [0, 1]$, that moves through the spatial domain of an image to minimize the following energy functional,
$$E = \int_0^1 \left[ \frac{1}{2}\left( \alpha\, |\mathbf{x}'(s)|^2 + \beta\, |\mathbf{x}''(s)|^2 \right) + E_{\text{ext}}(\mathbf{x}(s)) \right] ds, \tag{10}$$

where $\alpha$ and $\beta$ are weighting parameters that control the snake's tension and rigidity, respectively, $\mathbf{x}'(s)$ and $\mathbf{x}''(s)$ denote the first and second derivatives of $\mathbf{x}(s)$ with respect to $s$, and $E_{\text{ext}}$ is the external energy.

We used the gradient vector flow (GVF) snake model proposed by Xu and Prince [61] for final tongue body segmentation, which uses GVF as the external field. The GVF snake addresses limitations of Kass et al.'s snake model [60], such as the small capture range of the external field and difficulties in progressing into boundary concavities. Given an edge image $M(x, y)$, the GVF field $\mathbf{w}(x, y) = [u(x, y), v(x, y)]$ can be obtained by solving the minimization problem,
$$E_{\text{ext}}(\mathbf{w}(x, y)) = \iint \mu\, |\nabla \mathbf{w}|^2 + |\nabla M|^2\, |\mathbf{w} - \nabla M|^2 \, dx\, dy, \tag{11}$$

where $|\cdot|$ denotes the $\ell_2$ norm with $|\nabla \mathbf{w}|^2 = u_x^2 + u_y^2 + v_x^2 + v_y^2$. For fast GVF computation, the augmented Lagrangian method (ALM)-based algorithm developed by our group was used [62].
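For illustration only, the GVF field can also be approximated with the classical explicit gradient-descent iteration on the functional in Equation 11, rather than the ALM-based solver used in the paper. The sketch below assumes unit grid spacing and a unit time step, and it rescales the edge map to [0, 1] to keep the explicit update stable. The function names, the iteration count, and the toy edge map are hypothetical.

```python
import numpy as np

def laplacian(a):
    """5-point Laplacian with replicated (Neumann-like) borders."""
    p = np.pad(a, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * a

def gvf(edge_map, mu=0.1, n_iter=200):
    """Explicit iteration for the GVF field (u, v) of an edge map; not the ALM solver."""
    f = edge_map.astype(float)
    span = f.max() - f.min()
    if span > 0:
        f = (f - f.min()) / span          # rescale to [0, 1] for stability
    fy, fx = np.gradient(f)               # derivatives along rows (y) and columns (x)
    mag2 = fx**2 + fy**2
    u, v = fx.copy(), fy.copy()
    for _ in range(n_iter):
        u = u + mu * laplacian(u) - mag2 * (u - fx)
        v = v + mu * laplacian(v) - mag2 * (v - fy)
    return u, v

# Toy edge map: a single vertical line at column 20.
M = np.zeros((41, 41))
M[:, 20] = 1.0
u, v = gvf(M)
```

Away from the line the field points towards it, positive `u` on the left and negative `u` on the right, which illustrates how GVF extends the capture range of the edge force.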

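For the GVF snake, the external force $-\nabla E_{\text{ext}}$ is replaced by $\mathbf{w}(x, y)$. So, after the computation of the GVF field, the curve evolves dynamically until convergence, with the partial derivative of the curve $\mathbf{x}(s, t)$ with respect to time $t$ given by
$$\mathbf{x}_t(s, t) = \alpha\, \mathbf{x}''(s, t) - \beta\, \mathbf{x}''''(s, t) + \mathbf{w}. \tag{12}$$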

For tongue body segmentation, μ = 0.1, α = 0.05, and β = 0.01 were chosen.
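A minimal explicit discretisation of the evolution equation for a closed contour is sketched below. The time step `dt`, the periodic finite differences, and the callable `force`, which stands in for sampling the GVF field at the contour points, are assumptions made for this example. Practical snake implementations often use a semi-implicit scheme instead.

```python
import numpy as np

def snake_step(pts, force, alpha=0.05, beta=0.01, dt=0.1):
    """One explicit Euler step of x_t = alpha * x'' - beta * x'''' + w on a closed contour."""
    d2 = np.roll(pts, -1, axis=0) - 2.0 * pts + np.roll(pts, 1, axis=0)   # x''
    d4 = np.roll(d2, -1, axis=0) - 2.0 * d2 + np.roll(d2, 1, axis=0)      # x''''
    return pts + dt * (alpha * d2 - beta * d4 + force(pts))

def radial_force(p, r_target=5.0):
    """Hypothetical external force pulling every point towards a circle of radius r_target."""
    r = np.linalg.norm(p, axis=1, keepdims=True)
    return (r_target - r) * p / r

# Start from a circle of radius 10 sampled at 100 points.
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
circle = 10.0 * np.column_stack([np.cos(angles), np.sin(angles)])

pts = circle.copy()
for _ in range(500):
    pts = snake_step(pts, radial_force)
final_radius = np.linalg.norm(pts, axis=1).mean()

# With no external force, the tension term alone shrinks the contour slightly.
shrunk = snake_step(circle, lambda p: np.zeros_like(p))
```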

4. Experimental results

4.1 Database and evaluation criteria

We constructed a tongue image data set of 300 images to evaluate the proposed method. Manual segmentation results were used as the ground truth. All the images were acquired by our tongue image acquisition device. The image size was 768 × 576. Since the images were captured in a semi-enclosed environment under stable lighting conditions, it was not necessary to use any pre-processing method for illumination normalization. All the experiments were executed on a PC with a T2250 @1.73 GHz CPU and 2 GB of memory.

To evaluate the segmentation performance, qualitative and quantitative evaluations were used. For qualitative evaluation, the results on several typical tongue images using existing methods and GaborFM were presented. For quantitative evaluation of the segmentation results, we used both boundary- and area-based performance criteria. In area-based evaluation [63, 64], the false negative volume fraction (FNVF, %) and the false positive volume fraction (FPVF, %) were used to evaluate the segmentation methods. In the boundary-based evaluation [63], the Hausdorff distance (HD), the mean distance to the closest point (MD), and the standard deviation of the error distribution (SD) were used. Let $A = \{a_1, \ldots, a_m\}$ and $B = \{b_1, \ldots, b_n\}$ be two curves, where $a_i$ denotes the $i$th pixel of the curve $A$, and $b_j$ denotes the $j$th pixel of the curve $B$. The distance to the closest point (DCP) $d(a_i, B)$ is defined in [63]. The difference between $A$ and $B$ was evaluated based on the error distribution of $d(a_i, B)$ and $d(b_j, A)$. The standard deviation of the error distribution (SD) is computed as
$$\mathrm{SD}(A, B) = \sqrt{\frac{1}{m + n - 1}\left( \sum_{i=1}^{m} \left(d(a_i, B) - \mathrm{MD}\right)^2 + \sum_{j=1}^{n} \left(d(b_j, A) - \mathrm{MD}\right)^2 \right)}. \tag{13}$$
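The boundary-based criteria can be computed as in the sketch below. The definitions of DCP, HD, and MD are taken from [63] in the text and are not restated there, so the code assumes the usual forms: DCP is the Euclidean distance to the nearest point of the other curve, HD is the largest DCP in either direction, and MD is the mean of all DCP values. The function names are hypothetical.

```python
import numpy as np

def closest_point_distances(A, B):
    """d(a_i, B): Euclidean distance from each point of A to its nearest point in B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff**2).sum(axis=2)).min(axis=1)

def boundary_metrics(A, B):
    """Return (HD, MD, SD) for two curves given as (m, 2) and (n, 2) point arrays."""
    d = np.concatenate([closest_point_distances(A, B), closest_point_distances(B, A)])
    hd = d.max()
    md = d.mean()
    sd = np.sqrt(((d - md) ** 2).sum() / (len(d) - 1))   # Eq. (13)
    return hd, md, sd

A = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
B_shifted = A + np.array([0.0, 2.0])                      # A moved by 2 pixels
B_outlier = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 3.0]])

hd_shift, md_shift, sd_shift = boundary_metrics(A, B_shifted)
hd_out, md_out, sd_out = boundary_metrics(A, B_outlier)
```

For the shifted copy every closest-point distance equals 2, so HD and MD are 2 and SD is 0. In the second example a single outlying point raises HD to 3.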

4.2 Components of GaborFM

We compared the edge enhancement results obtained using the Sobel, DoG, and 2D Gabor magnitude-based edge detectors, as shown in Figure 3. It can be seen that the proposed 2D Gabor magnitude-based detector can faithfully enhance the tongue body contour and is more robust against interference from tongue texture. Thus, compared with Sobel and DoG, the 2D Gabor magnitude-based method is more effective for the enhancement of the tongue body contour.

Figure 6 shows the binarized edge images obtained using the proposed method and the Canny edge detector. The proposed method can obtain a binarized edge image with more continuous contour. Compared with the proposed method, the binarized edge image obtained using Canny is noisier and has many false positive edges.
Figure 6

The binarized edge images obtained using Canny and the proposed method. (a) The original image, (b) Canny, and (c) the proposed method.

To evaluate the role of each step, we compared the segmentation performances of the following three variants:
(1) Gabor + FM: we only used the first two steps (Gabor magnitude-based edge detection and fast marching) for tongue body segmentation and used the contour after fast marching as the contour of the tongue body.

(2) Canny + FM + GVF: we replaced the Gabor magnitude-based edge detection with the Canny detector.

(3) Gabor + FM + GVF: we used all three steps, i.e., GaborFM, for tongue body segmentation.

Table 1 lists the HD, MD, and SD values of these three methods. Gabor + FM + GVF is much better than Canny + FM + GVF, which indicates that the Gabor magnitude-based edge detection is more suitable for the enhancement of the tongue body contour. In addition, Gabor + FM + GVF performs slightly better than Gabor + FM, which indicates that the final GVF snake step is also useful in improving the segmentation accuracy.
Table 1

The HD, MD, and SD of Canny + FM + GVF, Gabor + FM, and GaborFM

| Method | HD | MD | SD |
|---|---|---|---|
| Canny + FM + GVF | 33.85 ± 39.44 | 6.83 ± 8.88 | 9.20 ± 13.69 |
| Gabor + FM | 19.15 ± 12.31 | 4.28 ± 2.18 | 4.20 ± 3.04 |
| Gabor + FM + GVF | 17.79 ± 11.87 | 4.26 ± 2.17 | 3.98 ± 2.99 |

HD, Hausdorff distance; MD, mean distance to the closest point; SD, standard deviation of the error distribution.

To validate the influence of the second-order regularization in the GVF snake, we compared the segmentation results obtained using GVF with β = 0 and GVF with β = 0.01. As shown in Figure 7, the result of GVF with β = 0.01 is smoother than the result of GVF with β = 0, which indicates that GVF with β = 0.01 can further improve the local smoothness of the contour. Thus, in this paper, we adopted the GVF snake with β = 0.01 as the final step to segment the tongue body from the image.
Figure 7

The segmentation results obtained using GVF snakes. (a) GVF with β = 0, (b) GVF with β = 0.01.

Figure 8 shows the final segmentation obtained using the GGVF snake [65] and the proposed GVF snake. The proposed GVF snake and the GGVF snake obtain almost the same segmentation results. However, the CPU running time of the proposed GVF snake is 0.76, which is much less than that of the GGVF snake.
Figure 8

The segmentation results of the GGVF snake and the proposed GVF snake. (a) The result of the GGVF snake, (b) the result of the proposed GVF snake.

4.3 Comparison with the other segmentation methods

GaborFM consists of several major steps: edge enhancement, thresholding, stable segment selection, GVF computation, and the snake. Given an m × n image, the computational complexity of edge enhancement and GVF computation is O(Kmn log(mn)), where K is the number of filters or the number of iterations. For the steps of thresholding and stable segment selection, the computational complexity is O(mn). The computational complexity of the snake is O(kd), where k is the number of iterations and d is the number of elements of the snake curve. Table 2 shows the average CPU times of two other automatic tongue body segmentation methods and the proposed method, together with the manual segmentation time. The average CPU time of GaborFM is 10.80 s, which is shorter than that of the other methods. This shows that the proposed method can be executed efficiently and segments one tongue image faster than the other methods.
Table 2

Average CPU time of methods

| Method | Manual segmentation | Watershed | PolarSnake | GaborFM |
|---|---|---|---|---|
| Average CPU time (s) | About 80 | 19.98 | 45.49 | 10.80 |

The segmentation results obtained using GaborFM were compared with those obtained using the method proposed by Cohen and Kimmel [48] and the method proposed by Bresson et al. [47].

Globally optimal active contours [47] obtain the global optimum of the energy function, which generally yields only a non-ideal separation of the tongue body from the other parts. Figure 9a,b,c shows the segmentation results using the fast global active contour model [47]. Bresson et al.'s method generally cannot segment the tongue body from the image, and considerable parts of the tongue body region are mis-segmented. Figure 9d,e,f shows the segmentation results obtained using Cohen and Kimmel's method [50], and Figure 9g,h,i shows the segmentation results obtained using GaborFM. It can be seen that the segmentation results of Cohen and Kimmel's method [50] tend to be less smooth than those of the GVF snake.
Figure 9

Segmentation results of three images obtained using different methods. (a,b,c) Bresson et al.'s method, (d,e,f) Cohen and Kimmel's method, and (g,h,i) the proposed GaborFM method.

We further compared the proposed GaborFM method with three other tongue body segmentation methods: BEDC [35], Watershed [36], and PolarSnake [32]. Figure 10 shows the segmentation results of three typical tongue images obtained using the four segmentation methods. In BEDC [35], the bi-elliptical deformable template was adopted for contour initialization. Since the shape of the tongue body contour varies from image to image, some contours could not be well approximated by bi-elliptical templates, and BEDC [35] obtained poor contour initialization results, as shown in Figure 10b,c. Watershed [36] adopted the region merging strategy for contour initialization, which usually led to a non-smooth contour and to over- and under-segmentation, as shown in Figure 10d,f. The PolarSnake [32] was designed for detecting dark lines, and location bias occurred in the detection of boundary pixels such as Points 3 and 4 in Figure 2. The location bias of the PolarSnake can be observed in Figure 10g,i. In summary, GaborFM is more robust against the interference from lips, chin, tongue texture, and fissures, and is superior to BEDC, Watershed, and the PolarSnake for tongue body segmentation.
Figure 10

The segmentation results of three typical tongue images by using four different segmentation methods. (a,b,c) BEDC, (d,e,f) Watershed, (g,h,i) PolarSnake, and (j,k,l) the proposed GaborFM method.

The tongue image data set of 300 images was used to evaluate the proposed method. A subset of the data, 20 tongue images, was used to learn the parameters of GaborFM, and the other 280 tongue images were used to test the proposed method. Table 3 lists the HD, MD, and SD of BEDC, Watershed, PolarSnake, and the proposed GaborFM method. From Table 3, it can be seen that GaborFM achieves much lower HD, MD, and SD, which indicates that the proposed method is superior to the other three methods for tongue body segmentation. The standard deviation of GaborFM is also much lower than those of BEDC, Watershed, and PolarSnake, which indicates that GaborFM is more robust than these methods.
Table 3

HD, MD, and SD values (in pixels, mean ± standard deviation) of BEDC, Watershed, PolarSnake, and GaborFM

| Method | HD | MD | SD |
| --- | --- | --- | --- |
| BEDC [35] | 49.59 ± 36.91 | 21.80 ± 10.66 | 16.38 ± 10.54 |
| Watershed [36] | 37.19 ± 26.26 | 15.36 ± 7.25 | 14.82 ± 9.72 |
| PolarSnake [32] | 30.35 ± 22.07 | 8.57 ± 4.34 | 7.30 ± 5.21 |
| GaborFM | 17.79 ± 11.87 | 4.26 ± 2.17 | 3.98 ± 2.99 |
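As an illustration of the boundary-based measures in Table 3, the following sketch computes HD and MD between a detected contour and a ground-truth contour, each given as an array of (x, y) points. It assumes the common definitions (symmetric Hausdorff distance, and the mean nearest-point distance averaged over both directions); the function names are hypothetical, and the exact formulas used in the experiments may differ in detail. SD is not covered here.

```python
import numpy as np

def pairwise_distances(a, b):
    """Euclidean distances between every point of a (N x 2) and b (M x 2)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance (assumed definition of HD)."""
    d = pairwise_distances(a, b)
    # Largest nearest-point distance, taken in both directions.
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def mean_distance(a, b):
    """Mean nearest-point distance averaged over both directions (assumed MD)."""
    d = pairwise_distances(a, b)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

# Toy contours: the second one has a single outlying point 3 pixels away.
detected = np.array([[0.0, 0.0], [1.0, 0.0]])
truth = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 3.0]])
print(hausdorff_distance(detected, truth))  # 3.0
print(mean_distance(detected, truth))       # 0.5
```

The toy example shows why the two measures are reported together: a single outlying boundary point dominates HD, while MD reflects the average agreement along the whole contour.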

Table 4 lists the FPVF and FNVF values of BEDC, Watershed, PolarSnake, and GaborFM. The proposed GaborFM method achieves lower FPVF and FNVF than the other three methods. Thus, in terms of area-based evaluation, GaborFM is superior to BEDC, Watershed, and PolarSnake. Table 4 also lists the standard deviations of FPVF and FNVF. The standard deviations of GaborFM are lower than those of the three competing methods, except that its FPVF standard deviation is comparable to those of the others, which indicates that GaborFM is generally more robust than BEDC, Watershed, and PolarSnake.
Table 4

FPVF and FNVF values (mean ± standard deviation) of BEDC, Watershed, PolarSnake, and GaborFM

| Method | FPVF (%) | FNVF (%) |
| --- | --- | --- |
| BEDC [35] | 13.89 ± 5.63 | 19.89 ± 15.51 |
| Watershed [36] | 5.78 ± 4.56 | 16.49 ± 13.15 |
| PolarSnake [32] | 1.93 ± 3.94 | 10.04 ± 11.56 |
| GaborFM | 0.96 ± 2.51 | 5.44 ± 10.17 |

FPVF, false positive volume fraction; FNVF, false negative volume fraction.
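The area-based measures can be sketched in the same way. The snippet below computes FPVF and FNVF from two binary masks, one for the segmented tongue body and one for the ground truth. It is a minimal sketch: the function name is hypothetical, and normalizing both fractions by the ground-truth area is an assumption (other normalizations, for example by the background area for FPVF, are also in use).

```python
import numpy as np

def volume_fractions(segmented, truth):
    """Return (FPVF, FNVF) in percent for two binary masks of equal shape.

    Assumption: both fractions are normalized by the ground-truth area.
    """
    segmented = np.asarray(segmented, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    truth_area = truth.sum()
    if truth_area == 0:
        raise ValueError("ground-truth mask is empty")
    # Pixels labeled tongue by the method but not by the ground truth.
    false_pos = np.logical_and(segmented, ~truth).sum()
    # Ground-truth tongue pixels that the method missed.
    false_neg = np.logical_and(~segmented, truth).sum()
    return 100.0 * false_pos / truth_area, 100.0 * false_neg / truth_area

# Toy masks: a 2 x 2 ground-truth block and a result shifted by one column.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
segmented = np.zeros((4, 4), dtype=bool)
segmented[1:3, 2:4] = True
print(volume_fractions(segmented, truth))  # (50.0, 50.0)
```

A low FPVF means that few background pixels (lips, chin) are included in the result, and a low FNVF means that little of the tongue body is cut away.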

Based on the evaluation results above, GaborFM is better than BEDC, Watershed, and PolarSnake for tongue body segmentation. From Table 3, the proposed GaborFM method achieves HD, MD, and SD values of 17.79, 4.26, and 3.98 pixels, respectively. From Table 4, it achieves an FPVF of 0.96% and an FNVF of 5.44%. Moreover, the qualitative and quantitative evaluation results show that GaborFM satisfies the practical performance requirements of tongue body segmentation and can be embedded into a real computational tongue diagnosis system.

5. Conclusion

In this paper, we proposed a novel method, GaborFM, for automated tongue body segmentation. First, a Gabor magnitude-based detector was developed for edge enhancement. Second, both the color characteristics and the edge enhancement result were taken into account when thresholding the edge image. Third, stable segments were selected from the binarized edge image, and the fast marching algorithm was applied over the 2D Gabor magnitude domain to obtain a continuous closed curve for contour initialization. Finally, the GVF snake was used for tongue body segmentation. Overall, the proposed method addresses both the edge enhancement and the contour discontinuity problems. Experimental results also showed that the proposed method is more effective than other tongue body segmentation methods, i.e., BEDC [35], Watershed [36], and PolarSnake [32].
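To make the first step concrete, the following sketch shows one way to compute a Gabor magnitude map for edge enhancement. It is only an illustration under stated assumptions, not the implementation used in the experiments: the isotropic kernel form, the DC removal, the parameter values, the number of orientations, and the use of the maximum over orientations are all assumptions, and all function names are hypothetical.

```python
import numpy as np

def gabor_kernel(sigma, wavelength, theta, half_size):
    """Complex 2D Gabor kernel with its DC component removed (assumed form)."""
    ys, xs = np.mgrid[-half_size:half_size + 1, -half_size:half_size + 1]
    xr = xs * np.cos(theta) + ys * np.sin(theta)   # coordinate along theta
    envelope = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.exp(2j * np.pi * xr / wavelength)
    # Subtract a scaled envelope so the kernel sums to zero and flat
    # image regions give no response.
    kernel -= kernel.sum() / envelope.sum() * envelope
    return kernel

def convolve_same(image, kernel):
    """Zero-padded linear convolution via FFT, cropped to the image size."""
    h, w = image.shape
    kh, kw = kernel.shape
    shape = (h + kh - 1, w + kw - 1)
    full = np.fft.ifft2(np.fft.fft2(image, s=shape) * np.fft.fft2(kernel, s=shape))
    top, left = (kh - 1) // 2, (kw - 1) // 2
    return full[top:top + h, left:left + w]

def gabor_magnitude(image, sigma=3.0, wavelength=8.0, n_orient=8, half_size=9):
    """Maximum Gabor response magnitude over n_orient orientations."""
    image = np.asarray(image, dtype=float)
    thetas = np.arange(n_orient) * np.pi / n_orient
    responses = [np.abs(convolve_same(image, gabor_kernel(sigma, wavelength, t, half_size)))
                 for t in thetas]
    return np.max(responses, axis=0)

# Synthetic grayscale image with a vertical step edge at column 32.
image = np.zeros((64, 64))
image[:, 32:] = 1.0
magnitude = gabor_magnitude(image)
print(magnitude[32, 32] > magnitude[32, 16])  # True: the edge responds, the flat area does not
```

In a complete pipeline, a map of this kind would then be thresholded, and a fast marching solver would connect the remaining stable segments into a closed curve; those steps are not shown here.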

Endnotes

a. To solve the Eikonal equation, we used the Matlab Toolbox Fast Marching provided by Dr. Gabriel Peyre, which is available to the public at http://www.mathworks.com/matlabcentral/fileexchange/6110-toolbox-fast-marching.

b. The code of GVF snake was provided by the authors of [61], which is available to the public at http://www.iacl.ece.jhu.edu/static/gvf/. Our Matlab code on ALM-based GVF computation has been released at https://sites.google.com/site/cswmzuo/IALM-GVF_GGVF.rar.

c. We will soon make the tongue image dataset together with the ground truth segmentation results available to the public.

d. We will soon make the tongue image dataset together with the ground truth segmentation results available to the public.

Declarations

Acknowledgements

The work is partially supported by the GRF fund from the HKSAR Government, the central fund from the Hong Kong Polytechnic University, the NSFC funds of China (grant nos. 61001037, 61071179, 61271093, and 61102037), and the Fundamental Research Funds for the Central Universities (grant no. HIT.NSRIF.2010051). The authors would like to thank the associate editor and the anonymous reviewers for their constructive suggestions. Thanks to Dr. Edward C. Mignot, Shandong University, for linguistic advice.

Authors’ Affiliations

(1)
School of Computer Science and Technology, Harbin Institute of Technology
(2)
Department of Computing, Biometrics Research Centre, Hong Kong Polytechnic University

References

  1. Yezzi A, Kichenassamy S, Kumar A, Olver P, Tannenbaum A: A geometric snake model for segmentation of medical imagery. IEEE Trans. Med. Imag. 1997, 16(2):199-209. 10.1109/42.563665
  2. Tsai A, Yezzi A Jr, Wells W, Tempany C, Tucker D, Fan A, Grimson WE, Willsky A: A shape-based approach to the segmentation of medical imagery using level sets. IEEE Trans. Med. Imag. 2003, 22(2):137-154. 10.1109/TMI.2002.808355
  3. Yan P, Xu S, Turkbey B, Kruecher J: Discrete deformable model guided by partial active shape model for TRUS image segmentation. IEEE Trans. Biomedical engineering 2010, 57(5):1158-1166.
  4. Boykov Y, Funka-Lea G: Graph cuts and efficient N-D image segmentation. Int. J. Comput. Vis. 2006, 70(2):109-131. 10.1007/s11263-006-7934-5
  5. Xu D: Mutual understanding between traditional Chinese medicine and systems biology: gaps, challenges and opportunities. Int. J. Funct. Inform. Personal. Med. 2009, 2(3):248-260. 10.1504/IJFIPM.2009.030826
  6. Lukman S, He YL, Hui SC: Computational methods for traditional Chinese medicine: a survey. Computer Methods and Program in Biomedicine 2007, 88(3):283-294. 10.1016/j.cmpb.2007.09.008
  7. Xu WT, Kanawong R, Xu D, Li S, Ma T, Zhang GX, Duan Y: An automatic tongue detection and segmentation framework for computer-aided tongue image analysis. Columbia, MO, 13–15 June 2011. Proceedings of the IEEE 13th Int. Conf. E-health Networking, Applications and Services 189-192.
  8. Zhang D: Automated biometrics: technologies and system. Boston: Kluwer; 2000.
  9. Pang B, Zhang D, Li NM, Wang KQ: Computerized tongue diagnosis based on Bayesian networks. IEEE Trans. Biomedical Engineering 2004, 51(10):1803-1810. 10.1109/TBME.2004.831534
  10. Zuo W, Wang K, Zhang D, Zhang H: Combination of polar edge detection and active contour model for automated tongue segmentation. Hong Kong, 18–20 December 2004. Proceedings of the 3rd Int. Conf. Image and Graphics 270-273.
  11. Zhang H, Zuo W, Wang K, Zhang D: A snake-based approach to automated segmentation of tongue image using polar edge detector. Int. J. Imaging Syst. Technol. 2006, 16(4):103-112. 10.1002/ima.20075
  12. Jung CJ, Jeon YJ, Kim JY, Kim KH: Review on the current trends in tongue diagnosis systems. Integrative Medicine Research 2012, 1(1):13-20. 10.1016/j.imr.2012.09.001
  13. Cai Y: A novel imaging system for tongue inspection. Anchorage, 21–23 May 2002. Proceedings of the IEEE conf. Instrumentation and Measurement Technology 159-163.
  14. Li QL, Liu J, Xiao G, Xue YQ: Hyperspectral tongue imaging system used in tongue diagnosis. Shanghai, 16–18 May 2006. Proceedings of the 2nd Int. Conf. Bioinformatics and Biomedical Engineering 2579-2581.
  15. Dong H, Guo Z, Zeng C, Zhong H, He Y, Wang RK, Liu S: Quantitative analysis on tongue inspection in traditional Chinese medicine using optical coherence tomography. J. Biomed. Opt. 2008, 13(1):011004. 10.1117/1.2870175
  16. Li C, Yuen P: Tongue image matching using color content. Pattern Recogn. 2002, 35(2):407-419. 10.1016/S0031-3203(01)00021-8
  17. Wang Y, Yang J, Zhou Y, Wang Y: Region partition and feature matching based color recognition of tongue image. Pattern Recogn. Lett. 2007, 28(1):11-19. 10.1016/j.patrec.2006.06.004
  18. Bai Y, Shi Y, Wu J, Zhang Y, Wong W, Wu Y, Bai J: Automatic extraction of tongue coating from digital images: a traditional Chinese medicine diagnostic tool. TsingHua Sci Technol 2009, 14(2):170-175. 10.1016/S1007-0214(09)70026-4
  19. Li W, Hu S, Yao J, Song H: The separation framework of tongue coating and proper in traditional Chinese medicine. Macau, 8–10 December 2009. Proceedings of the 7th Int. Conf. Information, Communications and Signal Processing 1-4.
  20. Huang W, Yan Z, Xu J, Zhang L: Analysis of the tongue Fur and tongue features by naive bayesian classifier. Taiyuan, 22–24 October 2010. Proceedings of the Int. Conf. Computer Application and System Modeling 304-308, vol. 4.
  21. Huang B, Zhang D, Li Y, Zhang H, Li N: Tongue coating image retrieval. Harbin, 18–20 January 2011. Proceedings of the 3rd Int. Conf. Advanced Computer Control 292-296.
  22. Liu L, Zhang D, You J: Detecting wide lines using isotropic nonlinear filter. IEEE Trans. Imaging Processing 2007, 16(6):1584-1595.
  23. Liu L, Zhang D, Ajay K, Wang KQ: Tongue line extraction. Tampa, FL, 8–11 December 2008. Proceedings of the 19th Int. Conf. Pattern Recognition 1-4.
  24. Liu L, Zhang D: Extracting tongue cracks using the wide line detector. Hong Kong, 4–5 January 2008. Proceedings 1st Int. Conf. Medical Biometrics 49-56.
  25. Yang Z, Zhang D, Li N: Kernel false-colour transformation and line extraction for fissured tongue image. Journal of Computer-Aided Design and Computer Graphics 2010, 22(5):771-776. 10.3724/SP.J.1089.2010.10754
  26. Huang B, Wu JS, Zhang D, Li NM: Tongue shape classification by geometric features. Inform. Sci. 2010, 180(2):312-324. 10.1016/j.ins.2009.09.016
  27. Zhang D, Liu Z, Yan J: Dynamic tongueprint: a novel biometric identifier. Pattern Recogn. 2010, 43(3):1071-1082. 10.1016/j.patcog.2009.09.002
  28. Li WS, Yao JF, Song H: The recognition of the teeth marks of tongue based on the improved level set in TCM. Yantai, 16–18 October 2010. Proceedings of the 3rd Int. Congress on Image and Signal Processing 2700-2704.
  29. Li JF, Zhang HZ, Wang KQ, Zuo WM: An automated feature extraction method for recognition of petechia spot in tongue diagnosis. Sanya, 13–14 December 2009. Proceedings of the Int. Conf. Future Biomedical Information Engineering 69-72.
  30. Hoover AD, Kouznetsova V, Goldbaum M: Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response. IEEE Trans. Medical Imaging 2000, 19(3):203-210. 10.1109/42.845178
  31. Yan Z, Wang K, Li N: Computerized feature quantification of sublingual veins from color sublingual images. Comput. Methods Programs Biomed. 2009, 93(2):192-205. 10.1016/j.cmpb.2008.09.006
  32. Zhang HZ, Wang KQ, Zhang D, Pang B, Huang B: Computer aided tongue diagnosis system. Shanghai, 1–4 September 2005. Proceedings of the IEEE 27th Annual Conf. Engineering in Medicine and Biology 6754-6757.
  33. Gao Z, Cui M, Lu GM: A novel computerized system for tongue diagnosis. Leicestershire, 20 November 2008. Proceedings of the Int. Seminar on Future Information Technology and Management Engineering 364-367.
  34. Lo LC, Hou MC, Chen YL, Chiang JY, Hsu CC: Automatic tongue diagnosis system. Tianjin, 17–19 October 2009. Proceedings of the 2nd Int. Conf. Biomedical Engineering and Informatics 1-5.
  35. Pang B, Zhang D, Wang KQ: The bi-elliptical deformable contour and its application to automated tongue segmentation in Chinese medicine. IEEE Trans. Medical Imaging 2005, 24(8):946-956.
  36. Ning J, Zhang D, Wu C, Yue F: Automatic tongue image segmentation based on gradient vector flow and region merging. Neural Comput. & Applic. 2012, 21(8):1819-1826. 10.1007/s00521-010-0484-3
  37. Yu SY, Yang J, Wang YG, Zhang Y: Color active contour models based tongue segmentation in traditional Chinese medicine. Wuhan, 6–8 July 2007. Proceedings of the 1st Int. C. Bioinformatics and Biomedical Engineering 1065-1068.
  38. Liapis S, Sifakis E, Tziritas G: Colour and texture segmentation using wavelet frame analysis, deterministic relaxation, and fast marching algorithms. J. Vis. Commun. Image R. 2004, 15: 1-26. 10.1016/S1047-3203(03)00025-7
  39. Li Q, Wang Y, Liu H, Sun Z: AOTF based hyperspectral tongue imaging system and its applications in computer-aided tongue disease diagnosis. Yantai, 16–18 October 2010. Proceedings of the 3rd Int. Conf. Biomedical Engineering and Informatics 1424-1427.
  40. Liu Z, Yan J, Zhang D, Li Q: Automated tongue segmentation in hyperspectral images for medicine. Appl. Optics 2007, 46(1):8328-8334.
  41. Chen J, Pappas TN, Mojsilović A, Rogowitz BE: Adaptive perceptual color-texture image segmentation. IEEE Trans. Image Processing 2005, 14(10):1-13.
  42. Chan TF, Vese LA: Active contours without edges. IEEE Trans. Image Processing 2001, 10(2):266-277. 10.1109/83.902291
  43. Caselles V, Kimmel R, Sariro G: Geodesic active contours. Int. J. Comput. Vis. 1997, 2(1):61-79.
  44. Xu C, Yezzi A Jr, Prince JL: On the relationship between parametric and geometric active contours. vol. 1, Pacific Grove, CA, October 2000. Proceedings of the 34th Asilomar Conference on Signals, Systems and Computers 483-489.
  45. Chan TF, Esedoglu S, Nikolova M: Algorithms for finding global minimizers of image segmentation and denoising models. SIAM J. Appl. Math. 2006, 66(5):1632-1648. 10.1137/040615286
  46. Boykov Y, Kolmogorov V, Cremers D, Delong A: An integral solution to surface evolution PDEs via geo-cuts. Berlin Heidelberg: Springer; 2006.
  47. Bresson X, Esedoḡlu S, Vandergheynst P, Thiran J-P, Osher S: Fast global minimization of the active contour/snake model. J. Math. Imaging Vis. 2007, 28(2):151-167. 10.1007/s10851-007-0002-0
  48. Márquez-Neila P, Baumela L, Alvarez L: A morphological approach to curvature-based evolution of curves and surfaces. IEEE Tran. Pattern Anal. Machine Intell. 2014, 36(1):2-17.
  49. Appleton B, Talbot H: Globally optimal geodesic active contours. J. Math. Imaging Vis. 2005, 23(1):67-86. 10.1007/s10851-005-4968-1
  50. Cohen LD, Kimmel R: Global minimum for active contour models: a minimal path approach. Int. J. Comput. Vis. 1997, 24(1):57-78. 10.1023/A:1007922224810
  51. Daugman JG: Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. J. Optical Soc. Amer. 1985, 2(7):1160-1169. 10.1364/JOSAA.2.001160
  52. Lee TS: Image representation using 2D Gabor wavelet. IEEE Trans. Pattern Analysis and Machine Intelligence 1996, 18(10):959-971. 10.1109/34.541406
  53. Serrano Á, Martín de Diego I, Conde C, Cabello E: Recent advances in face biometrics with Gabor wavelets: a review. Pattern Recogn. Lett. 2010, 31(5):372-381. 10.1016/j.patrec.2009.11.002
  54. Riaz F, Hassan A, Rehman S, Qamar U: Texture classification using rotation- and scale-invariant Gabor texture features. IEEE Signal. Process. Lett. 2013, 20(6):607-610.
  55. Sonka M, Hlavac V, Boyle R: Image processing, analysis, and machine vision. 3rd edition. Stamford: Cengage Learning; 2007.
  56. Otsu N: A threshold selection method from gray-level histograms. IEEE Trans. System, Man, and Cybernetics 1979, 9(1):62-66.
  57. Sethian JA: Level Set methods and fast marching methods: evolving interfaces in computational geometry, fluid mechanics, computer vision, and materials science. Cambridge: Cambridge University Press; 1999.
  58. Sethian JA: Fast marching methods. SIAM Rev. 1999, 41(2):199-235. 10.1137/S0036144598347059
  59. Pechaud M, Keriven R, Peyre G: Extraction of tubular structures over an orientation domain. Miami, 20–25 June 2009. Proceedings of the IEEE conference on computer vision and pattern recognition, 2009, CVPR 2009 336-342.
  60. Kass M, Witkin A, Terzopoulos D: Snakes: active contour models. Int. J. Comput. Vis. 1987, 1(4):321-331.
  61. Xu C, Prince JL: Snakes, shapes, and gradient vector flow. IEEE Trans. Image Processing 1998, 7(3):359-369. 10.1109/83.661186
  62. Ren D, Zuo W, Zhao X, Lin Z, Zhang D: Fast gradient vector flow computation based on augmented Lagrangian method. Pattern Recogn. Lett. 2013, 34(12):219-225.
  63. Chalana V, Kim Y: A methodology for evaluation of boundary detection algorithms on medical images. IEEE Trans. Med. Imag. 1997, 16(5):642-652. 10.1109/42.640755
  64. Udupa JK, LeBlanc VR, Schmidt H, Imielinska C, Saha PL, Grevera GJ, Zhuge Y, Currie LM, Moholt P, Jin Y: A methodology for evaluating image segmentation algorithms. Proc. SPIE 2002, 4684: 266-277. 10.1117/12.467166
  65. Xu C, Prince JL: Generalized gradient vector flow external forces for active contours. Signal Process 1998, 71(2):131-139. 10.1016/S0165-1684(98)00140-6

Copyright

© Cui et al.; licensee Springer. 2013

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.