Fast marching over the 2D Gabor magnitude domain for tongue body segmentation

Tongue body segmentation is a prerequisite to tongue image analysis and has recently received considerable attention. The existing tongue body segmentation methods usually involve two key steps: edge detection and active contour model (ACM)-based segmentation. However, conventional edge detectors cannot faithfully detect the contour of the tongue body, and the initialization of ACM suffers from the edge discontinuity problem. To address these issues, we proposed a novel tongue body segmentation method, GaborFM, which initializes ACM by performing fast marching over the two-dimensional (2D) Gabor magnitude domain of the tongue images. For the enhancement of the contour of the tongue body, we used the 2D Gabor magnitude-based detector. To cope with the edge discontinuity problem, the fast marching method was utilized to connect the discontinuous contour segments, resulting in a closed and continuous tongue body contour for subsequent ACM-based segmentation. Qualitative and quantitative results showed that GaborFM is superior to the other methods for tongue body segmentation.


Introduction
Medical image segmentation, analysis, and diagnosis have received much attention in image analysis and computer vision. Several methods of segmentation have been proposed for medical images such as magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound medical imagery. Yezzi et al. proposed the geometric active contour model for segmentation of medical imagery [1]. Tsai et al. proposed a shape-based method using level sets to segment the medical images [2]. For prostate segmentation in transrectal ultrasound (TRUS), Yan et al. proposed a discrete deformable model [3]. Graph cuts [4], proposed by Boykov et al., have also been used for the segmentation of medical images. As one part of medical image analysis and diagnosis, because of its convenience and non-invasion nature, tongue image diagnosis has received considerable research interest.
Tongue diagnosis has played an important role in traditional Chinese medicine (TCM) for thousands of years [5,6]. The TCM practitioner can analyze the physiological and pathological conditions of the patient by inspecting the color, shape, and texture of the tongue, making tongue diagnosis very promising for convenient and non-invasive diagnosis. However, traditional tongue diagnosis is a subjective skill which requires years of experience and practice. Moreover, for different practitioners, the diagnosis results may be inconsistent.
Among the modules of computational tongue diagnosis systems, tongue body segmentation is a prerequisite for subsequent feature extraction and diagnostic analysis. Tongue body segmentation usually involves three steps: edge enhancement, contour initialization, and segmentation. For edge enhancement, conventional edge detectors, such as the Gaussian gradient [35], color gradient [36], and Canny detector [7], have been used. Considering the shape and gray-level characteristics of tongue contour, Zuo et al. [10,11] proposed a polar edge detector to suppress adverse interference from the lip boundary and tongue fissures. For contour initialization, several heuristic [10], deformable model-based [35], and watershed-based [36] methods have been suggested. For segmentation, active contour models (ACM) [11,13,35,37] and the gradient vector flow (GVF) snake [38] have been used for the final segmentation of the tongue body [14,39,40].
Despite the progresses in tongue body segmentation, the existing methods usually suffer from some limitations and cannot meet the performance requirements of a practical computational tongue diagnosis system. For example, in the edge enhancement step, conventional edge detectors usually neglect the characteristics of gray-level variation along the tongue body contour. In the contour initialization step, parts of the tongue body contour might exhibit very weak edges, and the detected contour would be discontinuous, making it hard to obtain a continuous contour for initialization. Moreover, the interference from lips and tongue fissures makes tongue body segmentation more challenging.
In this study, to circumvent edge enhancement and contour discontinuity problems, we propose a novel tongue body segmentation method, GaborFM, which performs fast marching over the two-dimensional (2D) Gabor magnitude domain of tongue images. As shown in Figure 1, the proposed GaborFM method first uses the 2D Gabor magnitude-based edge detector for edge enhancement. Then, in the contour initialization stage, the continuous initial contour is obtained by using fast marching to connect stable segments of the tongue body contour. Finally, the GVF snake is used for tongue body segmentation.
Edge enhancement is fundamental for any edge-based active contour and level set based segmentation method. Thus, the enhancement of the tongue body contour by the proposed 2D Gabor magnitude-based edge detection method was first studied. There are several region-based models, such as color-texture segmentations [38,41] and the C-V model and its extensions [42], but, tongue texture is very complex and varies with images, so this study used the edge-based model and first investigated the edge enhancement problem.
In the contour initialization stage, the stable segment selection and the fast marching method for contour initialization were used. This initialization strategy allowed the incorporation the various tongue priors to obtain a satisfactory initial contour. For example, segments with less length were abandoned. Considering the symmetrical characteristics of the tongue body contour, we selected pairs of segments with better symmetry. Moreover, to avoid the interference of lips and to remove the shadow parts, two paths were found for any two points. If the difference in length L F of the two paths was low, the inner curve was chosen, otherwise the one with the lower length was chosen.
The contour initialization strategy allowed the employment of many types of tongue priors, such as color, shape, and symmetry, to obtain a satisfactory initial contour, and the GVF snake was able to obtain a local optimal and smooth segmentation results. Geodesic active contours [43,44] can be used to address the limited capture range problem of the conventional snake model. Most globally optimal active contours [45][46][47][48] can obtain the global optimal solution based on the energy function, which generally is only a non-ideal separation of the tongue body from the other parts. Several globally optimal active contours [49,50], are similar to the fast marching method used in GaborFM, and can obtain the global optimal path to connect the two points. Thus, we did not use the methods for global minimization of the active contour model.
Compared with the existing methods, GaborFM has two outstanding advantages: (1) By taking the characteristics of gray-level variation of the real tongue body contour into account, the 2D Gabor magnitude-based detector is more effective than the conventional edge detector for the enhancement of the tongue body contour and the suppression of interference from lip and tongue fissures. For this reason, the 2D Gabor magnitude-based edge detection method was used to address the edge enhancement problem. (2) To circumvent the edge discontinuity problem, we used fast marching over the 2D Gabor magnitude domain method to connect the discontinuous contour segments into a closed tongue body contour. First, several stable segments were selected based on the edge enhancement result. Then, the contour initialization problem was modeled as the minimal geodesic paths over the 2D Gabor magnitude domain. Finally, the fast marching algorithm was used to obtain a closed tongue body contour for initialization.
The remainder of the paper is organized as follows. Section 2 presents the 2D Gabor magnitude-based edge detector for the enhancement of the tongue body contour. Section 3 describes the scheme for contour initialization and tongue body segmentation. Section 4 provides the qualitative and quantitative results to evaluate the proposed GaborFM method. Finally, Section 5 offers the conclusion.

2D Gabor magnitude-based edge detection
In this section, a 2D Gabor magnitude-based edge detector for the enhancement of the tongue body contour is proposed. It then utilizes the color characteristics of the tongue body to suppress interference from the tongue texture and fissures, and finally uses Otsu's method for edge thresholding. Figure 2 shows the typical profiles of the boundary pixels of the tongue body. For the typical four boundary pixels shown in Figure 2a, Figure 2b shows the profiles of the intensities along the white lines, and Figure 2c shows the profiles of the real and the imaginary parts of the Gabor filter. From Figure 2, it can be seen that, the profiles of the boundary pixels are similar to both the real and the imaginary parts of the Gabor filter. Thus, it is proper to use the 2D Gabor filter for the enhancement of the boundary of tongue body.

2D Gabor magnitude-based detector
The 2D Gabor filter was first proposed by Daugman to model the receptive fields of the orientation-selective simple cells of the visual cortex [51]. Lee reformulated 2D Gabor wavelets based on physiological evidence and the wavelet frame criterion [52]. 2D Gabor filters have been extensively applied to many image processing and computer vision applications [53,54]. We used the form of 2D Gabor functions derived by Lee [52], is the center of the function, ω is the radial frequency in radians per unit length, and θ is the orientation of the Gabor functions in radians. The κ is defined by where δ is the half-amplitude bandwidth of the frequency response, which is between 1 and 1.5 octaves according to neurophysiological findings [52]. When ω and δ are fixed, σ can be derived from σ = κ/ω.
In the 2D Gabor magnitude-based edge detector, ω ¼ 4 Â ffiffi ffi 2 p and δ = 1.5 were chosen and G k (x, y) was used to denote the Gabor filters with an orientation of θ = k/8π . Given a tongue image I(x, y), the convolution of the image I and G k is where '*' denotes the convolution operator. From Figure 1, the real part of FI k can be used to enhance the boundary pixels with the profiles similar to those shown in the left of Figure 2b, and the imaginary part can be used to enhance the boundary pixels with the profiles similar to those shown in the right of Figure 2b. Then, the output of the 2D Gabor magnitude-based detector is defined as where '¯' denotes the complex conjugate operator. Generally, any profiles in Figure 2b would result in local maxima in M max (x, y). Moreover, the promising performance of the Gabor filter on noise robustness and time-frequency tradeoff makes the proposed 2D Gabor magnitude-based detector suitable for the enhancement of the tongue boundary. Figure 3 shows the edge enhancement results of a typical tongue image using different edge detectors. Compared with the Sobel operator and derivative of Gaussian (DoG) filters [55], the 2D Gabor magnitude-based detector is more effective for the enhancement of the tongue body contour.

Edge thresholding
We used a two-step strategy for edge thresholding. First, based on the color characteristics of the tongue body and face, parts of the non-boundary pixels were identified. Then, the binarization of the edge image was obtained by defining the proper threshold.
As shown in Figure 4a, a typical tongue image usually involves three major components: background, the tongue body, and other facial parts. If parts of the tongue body and the background components are correctly identified, they can be simply marked as nonboundary pixels. All the tongue images were captured in a semi-enclosed environment under stable and uniform lighting condition. Thus, the background is stable and can be easily segmented from the tongue image. The color tongue image was transformed into the YIQ color space, and Otsu's method [56] was used to determine a threshold T b for the I channel of the YIQ color space. The reason for choosing the I channel of YIQ is that increasing I characterizes the change from blue, through purple, to red colors, and is more effective in separating background pixels from the others than Y and Q. Pixels with the I value smaller than T b were then set as background pixels, and morphological operators, such as dilation, filling, and erosion, are used to define the background region.
Unlike the background, due to physiological and pathological factors, the color of the tongue body varies with tongue images. Fortunately, it is relatively easy to distinguish the color of tongue body from that of other facial parts: the pixel values of the tongue body component usually have higher Q value. Let T 1 denote the threshold obtained by Otsu's method. We set the threshold T Q = 2 T 1 and set the pixels with a Q value higher than T Q as tongue body pixels. Morphological operators, mentioned above, were then used to define the tongue body region. A subset of the background and tongue body components was then obtained. As shown in Figure 4b, by assigning these pixels as non-boundary pixels, a modified edge image can be derived for edge binarization. Finally, a threshold T M for the binarization of the modified edge image is defined. Let T o be the threshold obtained by Otsu's method on M max (x, y), and Var M be the standard variance of M max (x, y). The threshold T M is then defined as After edge thresholding, the morphological operators were employed to obtain single-pixel edge curves, resulting in the final binarized edge image shown in Figure 4c.

Tongue body segmentation using fast marching and the GVF snake
In this section, first several stable segments from the binarized edge image are selected, and then the fast marching  algorithm is used to obtain a closed contour for initialization. Finally, the GVF snake model is adopted for tongue body segmentation.

Selection of stable segments
Although the 2D Gabor magnitude-based detector is effective, false detection of the tongue body boundary cannot be completely avoided. Moreover, it is also difficult to detect the entire closed tongue body contour by only using the 2D Gabor magnitude-based detector. Fortunately, the location and shape of the tongue body are useful in distinguishing a true boundary from a false boundary. Thus, to suppress the adverse influence of false detection, we used the strategy of selecting stable segments of the tongue body contour. By referring to the location and shape of the tongue body, the following approach for selecting several stable segments from the binarized edge image is suggested. First, the mean x 1 ; y 1 ð Þ of the coordinate of all the edge pixels and the mean x 2 ; y 2 ð Þ of the coordinate of all the tongue body pixels obtained in Section 2.2 were computed. Then, the center of the tongue body x; y ð Þ was estimated as the average of x 1 ; y 1 ð Þand x 2 ; y 2 ð Þ. As shown in Figure 5a, to avoid the interference of the lips, the left and the right sides of the tongue body boundary are considered to be more stable. So, we focused on the stable segments from the left and right boundary of the tongue body. Based on the estimated center of the tongue body x; y ð Þ, the binarized edge image was divided into left and the right parts. For each part, the segment with sufficient length was defined as a stable segment. In order to verify the stable segments, the approximate symmetry of the segments was also taken into account. Figure 5b shows an example of the two stable segments. Finally, on each stable segment, two stable points were found by backtracking from the end points of the stable segment for contour initialization, as shown in Figure 5b.

Contour initialization using fast marching
For the initialization of the tongue body contour, the fast marching algorithm was utilized to address the discontinuity problem by connecting different stable segments. Given two points A (x A , y A ) and C (x C , y C ), we defined the length of a planar curve γ : [0, 1] ↦ ℝ 2 from A to C, where ‖γ′(t)‖ 2 denotes the l 2 -norm of the gradient of γ(t), and g(x, y) is defined in ℝ 2 [43], Then the required curve from A to C was defined as the shortest path defined by L F (γ), where C is the set of curves with γ(0) = (x A , y A ) and γ(1) = (x C , y C ). Accordingly [57][58][59], γ* can also be estimated by solving the following Eikonal equation, with T A (x A , y A ) = 0, where T A (x, y) denotes the shortest distance L F of (x A , y A ) and (x, y). The fast marching method a was used to solve the Eikonal equation to obtain T A (x, y). The computational complexity of fast marching is O(NlogN), where N = mn is the size of the tongue image. Moreover, fast marching stops when the point (x C , y C ) is reached, so T A (x, y) can be more efficiently obtained. After the computation of T A (x, y), given the point of (x C , y C ), the shortest path can be constructed by solving the following ordinary differential equation (ODE), with the initial value X t = 0 = (x C , y C ). This ODE problem can be efficiently solved by using second-order Heun's method [57].
In practice, to avoid the interference of the lips, we found two paths for any two points. If the difference in length L F of the two paths was low, the inner curve was chosen, otherwise the one with the lower length was chosen. Figures 1e and 5c show two examples of the shortest paths constructed by the fast marching method. Satisfactory initialization of the tongue body contours can be obtained by using the fast marching method.

Gradient vector flow snake
An active contour model (or snake) [60] is a curve x(s) = [x(s), y(s)], s ∈ [0, 1], that moves through the spatial domain of an image to minimize the following energy functional, where α and β are weighting parameters that control the snake's tension and rigidity, respectively, x ' (s) and x ' ' (s) denote the first and second derivatives of x(s) with respect to s, and E ext is the external energy. We used the gradient vector flow (GVF) snake model b proposed by Xu and Prince [61] for final tongue body segmentation, which uses GVF as the external field. The GVF snake is promising in addressing the limitations of Kass et al.'s snake model [60], such as small capture range of the external field and difficulties in progressing into boundary concavities. Given an edge image M(x, y), the GVF field w(x, y) = [u(x, y), v(x, y)] can be obtained by solving the minimization problem, where | ⋅ | denotes the l 2 norm with ∇w . For fast GVF computation, the augmented Lagrangian method (ALM)-based algorithm developed by our group was used [62].
For the GVF snake, ∇E ext = w(x, y). So, after the computation of the GVF field, the curve can dynamically evolve until convergence, and the partial derivative of the curve x(s, t) with respective to time t as

Database and evaluation criteria
We constructed a tongue image data set of 300 images c to evaluate the proposed method. Manual segmentation results were used as the ground truth. All the images were acquired by our tongue image acquisition device. The image size was 768 × 576. Since the images were captured in a semi-enclosed environment under stable lighting conditions, it was not necessary to use any preprocessing method for illumination normalization. All the experiments were executed on a PC with T2250 @1.73 GHz CPU and 2G memory.
To evaluate the segmentation performance, qualitative and quantitative evaluations were used. For qualitative evaluation, the results on several typical tongue image using existing methods and GaborFM were presented to test GaborFM. For quantitative evaluation of the segmentation results, we used both boundary-and areabased performance criteria. In area-based evaluation [63,64], the false negative volume fraction (FNVF, %) and the false positive volume fraction (FPVF, %) were used to evaluate the segmentation methods. In the boundarybased evaluation [63], the Hausdorff distance (HD), the mean distance to the closest point (MD) and the standard deviation of the error distribution (SD) were used. Let A = {a 1 , …, a m } and B = {b 1 , …, b n } be two curves, where a i denotes the ith pixel of the curve A, and b j denotes the jth pixel of the curve B. The distance to the closest point (DCP) d(a i , B) is defined in [63]. The difference between A and B based on the error distribution of d(a i , B) and d(b j , A) was evaluated. The standard deviation of the error distribution (SD) is computed as

Components of GaborFM
We compared the edge enhancement results obtained using the Sobel, DoG, and 2D Gabor magnitude-based edge detector, as shown in Figure 3. It can be seen that the proposed 2D Gabor magnitude-based detector can faithfully enhance the tongue body contour and is more robust against inference from tongue texture. Thus, compared with Sobel and DoG, the 2D Gabor magnitude-based method is more effective for the enhancement of the tongue body contour. Figure 6 shows the binarized edge images obtained using the proposed method and the Canny edge detector. The proposed method can obtain a binarized edge image with more continuous contour. Compared with the proposed method, the binarized edge image obtained using Canny is noisier and has many false positive edges.
To evaluate the role of each step, we compared the segmentation performances of the following three variants: (1) Gabor + FM: we only used the first two steps (Gabor magnitude-based edge detection and fast marching) for tongue body segmentation and used the contour after fast marching as the contour of the tongue body. (2) Canny + FM + GVF: we replaced the Gabor magnitude-based edge detection with the Canny detector. (3) Gabor + FM + GVF: we used all the three steps, GaborFM, for tongue body segmentation. Table 1 lists the HD, MD, and SD values of these three methods. Gabor + FM + GVF is much better than Canny + FM + GVF, which indicate that the Gabor magnitudebased edge detection is more suitable for the enhancement of tongue body contour. Besides, Gabor + FM + GVF performs a little better than Gabor + FM, which indicates that the final GVF snake step is also useful in improving the segmentation accuracy.
To validate the influence of the second-order regularization in GVF snake, we compared the segmentation results obtained using GVF with β = 0, and GVF with β = 0.01. As shown in Figure 7, the result of GVF with β = 0.01 is more smooth than the result of GVF with β = 0, which indicates that GVF with β = 0.01 could further improve the local smoothness of the contour. And thus, in this paper, we adopted GVF snake with β = 0.01 as the final step to segment tongue body from the image. Figure 8 shows the final segmentation obtained using the GGVF snake [65] and the proposed GVF snake. The proposed GVF snake and GGVF snake can obtain almost the same segmentation results. However, the CPU running time of the proposed GVF snake is 0.76, which is much less that of GGVF snake.

Comparison with the other segmentation methods
GaborFM consists of several major steps: edge enhancement, thresholding, stable segment selection, GVF computation, and the snake. Given an m × n image, the computational complexity of edge enhancement and GVF computation is O(Kmnlog(mn)), where K is the number of filters or the number of iterations. For the steps of thresholding and stable segment selection, the computational complexity is O(mn). The computational complexity of the snake is O(kd), where k is the number of iterations and d is the number of elements of the snake curve. Table 2 shows the average CPU times of two other automatic tongue body segmentation methods and proposed method, and also shows manual segmentation time. It can be seen that the average CPU time of GaborFM is 10.80 s and it is shorter than the other methods. This shows that the proposed method can be efficiently executed, and it is faster than the other methods in finishing the segmentation of one tongue image.
The segmentation results obtained using GaborFM were compared with those obtained using the method proposed by Cohen and Kimmel [48] and the method proposed by Bresson et al. [47].
Globally, optimal active contours [47] can obtain the global optimal solution based on the energy function, which is generally only a non-ideal separation of the tongue body from the other parts. Figure 9a,b,c shows the segmentation results using the fast global active contour model [47]. Bresson    considerable parts of tongue body region are missegmented. Figure 9d,e,f shows the segmentation results obtained using Cohen and Kimmel's method [50], and Figure 9g,h,i shows the segmentation results obtained using GaborFM. It can be seen that the segmentation result of Cohen and Kimmel's method [50] tend to be less smooth than those of the GVF snake.
We further compared the proposed GaborFM method with three other tongue body segmentation methods: BEDC [35], Watershed [36], and PolarSnake [32]. Figure 10 shows the segmentation results of three typical tongue images obtained using the four segmentation methods: BEDC [35], Watershed [36], PolarSnake [32], and GaborFM. In BEDC [35], the bi-elliptical deformable template was adopted for contour initialization. Since the shapes of the tongue body contour varies with tongue images, some contours could not be well approximated by bielliptical templates, and BEDC [35] obtained poor contour initialization results, as shown in Figure 10b,c. Watershed [36] adopted the region merging strategy for contour initialization, and this usually led to a nonsmooth contour, and over-and under-segmentation, as shown in Figure 10d,f. The PolarSnake [32] was designed for detecting dark lines, and location bias occurred for the detection of boundary pixels such as Points 3 and 4 in Figure 2. The location bias of the PolarSnake can be observed in Figure 10g,i. In summary, GaborFM is more robust against the interference from lips, chin, tongue texture, and fissures, and is superior to BEDC, Watershed, and the PolarSnake for tongue body segmentation.
The tongue image data set of 300 images d was used to evaluate the proposed method. A subset of the data, 20 tongue images, were used to learn the parameters of GaborFM, and the other 280 tongue images were used to test the proposed method. Table 3 lists the HD, MD distance and SD of BEDC, Watershed, PolarSnake, and the proposed GaborFM method. From Table 3, it can be seen that GaborFM can achieve much lower HD, MD distance, and SD, which indicates that the proposed method is superior to the other three methods for tongue body segmentation. It can be seen that the standard deviation of GaborFM is much lower than those of BEDC, Watershed, and PolarSnake, which indicates that GaborFM is more robust than BEDC, Watershed, and PolarSnake.    Table 4 also lists the standard deviation of FPVF and FNVF. The standard deviation of GaborFM is also lower than those of the other three competing methods, which indicates that GaborFM is more robust than BEDC, Watershed, and PolarSnake.
Based on the evaluation results above, it can be seen that GaborFM is better than BEDC, Watershed, and PolarSnake for tongue body segmentation. From Table 3, the proposed GaborFM method can achieve the HD, MD distance, and SD of 17.79, 4.26, and 3.98 pixels, respectively. From Table 4, the proposed method can achieve the FPVF and FNVF of 0.96% and 5.44%, respectively. Moreover, the qualitative and quantitative evaluation results also show that GaborFM satisfies the practical performance requirements of tongue body segmentation and can be embedded into a real computational tongue diagnosis system.

Conclusion
In this paper, we proposed a novel method, GaborFM, for automated tongue body segmentation. First, a Gabor magnitude-based detector was developed for edge enhancement. Second, both the color characteristics and the edge enhancement result for the thresholding of the edge image were taken into account. Third, stable segments were selected from the binarized edge image and use the fast marching algorithm over 2D Gabor magnitude domain to obtain a continuous closed curve for contour initialization. Finally, the GVF snake was used for tongue body segmentation. Generally, the proposed method can well address the edge enhancement and the contour discontinuity problems. Experimental results also showed that the proposed method is more effective than other tongue body segmentation methods, i.e., BEDC [35], Watershed [36], and PolarSnake [32].
Endnotes a To solve the Eikonal equation, we used the Matlab Toolbox Fast Marching provided by Dr. Gabriel Peyre, which is available to the public at http://www.mathworks. com/matlabcentral/fileexchange/6110-toolbox-fast-marching. b The code of GVF snake was provided by the authors of [61], which is available to the public at http://www. iacl.ece.jhu.edu/static/gvf/. Our Matlab code on ALMbased GVF computation has been released at https:// sites.google.com/site/cswmzuo/IALM-GVF_GGVF.rar. c We will soon make the tongue image dataset together with the ground truth segmentation results available to the public. d We will soon make the tongue image dataset together with the ground truth segmentation results available to the public.