The block diagram of the proposed CAD system for the automatic detection of breast lesions in DCE-MRI is presented in Fig. 1. First, motion artifacts are corrected and the breast region is segmented. Then, the potential lesion voxels are detected and used as the initial seed points for a region-growing algorithm. Subsequently, the region-growing method, based on the FCM clustering algorithm and a vesselness filter, segments the potential lesion regions. Finally, a discrimination step feeds morphological and kinetic features to a support vector machine (SVM) classifier to reduce false-positive detections. The processing steps are explained in order in the following subsections.

### 3.1 Motion correction

Respiration, cardiac motion, muscle relaxation, and involuntary patient movements cause motion artifacts, which are inevitable given the relatively long acquisition period of breast DCE-MRI. Motion artifacts can distort the kinetic characteristics of a lesion and increase false-positive findings by generating spurious enhancing voxels. In this research, motion correction is performed by registering all five post-contrast sequences to the pre-contrast sequence. The registration step consists of a rigid transformation followed by a non-rigid B-spline transformation [16] based on the mutual information similarity measure [17]. The rigid transformation, which comprises translation and rotation, provides global alignment of the two images, whereas the non-rigid transformation minimizes the remaining local differences between them. For rigid registration, 1000 iterations of the stochastic gradient descent optimizer are performed. For non-rigid B-spline registration, three resolutions and 64 histogram bins are used, and the gradient descent optimization algorithm is applied at each resolution for 200 iterations. The registration process is implemented using the medical image registration toolbox (MIRT) [18]. The effect of motion correction on a sample subtracted image at the fifth post-contrast time point is presented in Fig. 2.

### 3.2 Breast region segmentation

Breast region segmentation is performed automatically to decrease the computational burden and to avoid false-positive findings caused by enhancing tissues of the heart and vessels outside the breast. The segmentation procedure in [19] is applied to extract the breast region. The pipeline consists of four consecutive stages: local adaptive thresholding, connected component labeling to exclude extra regions in the binary image, horizontal projection to delineate the breast region, and hole-filling and morphological closing to eliminate discontinuities in the breast region. The segmentation is applied to nonfat-suppressed images because the high signal intensity of fat tissue produces high contrast between adjacent regions. Hence, the breast boundary on the subtracted images is obtained by first registering the images and then transferring the breast masks obtained on the nonfat-suppressed images to the subtracted images. Figure 3 illustrates the breast region segmentation method on sample images: the breast region is first detected on the nonfat-suppressed image, and the resulting breast mask is then transferred to the subtracted image by applying the registration method.

### 3.3 Detection of potential lesion voxels

The next step is to segment out the voxels that might belong to breast lesions. Following injection of the contrast agent, signal intensity is enhanced in all breast lesions except cysts. Therefore, the enhancing voxels in the post-contrast sequences are suspected to be part of breast lesions. To detect potential lesion voxels, the maximum enhancement ratio is computed using the following equation [8]:

$$ ME\left( x, y, z\right)= \max \left(\frac{I_t\left( x, y, z\right)-{I}_0\left( x, y, z\right)}{I_0\left( x, y, z\right)},\kern0.5em t=1,2,\cdots, 5\right), $$

(1)

where \( I_0 \) and \( I_t \) are the intensity values on the pre-contrast and \( t^{\mathrm{th}} \) post-contrast sequences, respectively, and \( (x, y, z) \) denotes a voxel location. Afterwards, the maximum enhancement ratio is convolved with a Gaussian smoothing filter at 10 exponentially distributed scales between 0 and 10 mm. Because breast lesions vary in size, the response is computed at each scale and the highest response is kept for each voxel. Finally, local maxima of the voxel values are found using a spherical kernel with a radius of 10 mm. The resulting points are the potential lesion voxels used to detect breast lesions. However, the signal intensity of voxels in blood vessels, noise, skin, and fibroglandular tissue can be similar to that of lesion voxels. Hence, some of the detected potential lesion voxels do not belong to lesions. To remove these false detections, a combination of FCM clustering and a vesselness filter is used.
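As an illustration of Eq. 1 only, the following minimal sketch computes the voxel-wise maximum enhancement ratio with NumPy. The function name `max_enhancement`, the array layout (time axis first), and the `eps` guard against division by zero in background voxels are assumptions made for this example and are not part of the described method. The multi-scale smoothing and local-maximum search are not shown.

```python
import numpy as np

def max_enhancement(pre, post, eps=1e-6):
    """Eq. (1): voxel-wise maximum relative enhancement over post-contrast frames.

    pre  : array of shape (X, Y, Z), pre-contrast intensities I_0
    post : array of shape (T, X, Y, Z), post-contrast intensities I_1..I_T
    eps  : assumed guard against division by zero (not from the source)
    """
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    ratio = (post - pre) / np.maximum(pre, eps)  # broadcasts over the time axis
    return ratio.max(axis=0)

# Toy 1x1x2 volume with five post-contrast frames
pre = np.array([[[100.0, 50.0]]])
post = np.stack([pre * f for f in (1.1, 1.5, 2.0, 1.8, 1.6)])
me = max_enhancement(pre, post)
```

In this toy case both voxels peak at twice their baseline intensity, so the ratio is 1.0 at each location.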

The FCM clustering technique is used to partition the voxels into two categories, lesion and non-lesion, based on the signal intensity variation over time (one pre-contrast and five post-contrast time points). Signal intensity variation over time is a tissue characteristic that is widely used for the segmentation and classification of breast lesions [20, 21]. Figure 4 presents the signal intensity variation over time for some randomly chosen voxels belonging to different lesion and non-lesion tissues. As shown in this figure, the signal intensity variations of voxels in lesion and non-lesion tissues have different characteristics. Each voxel is represented by its signal intensity variation over time as follows:

$$ X=\left\{{\mathbf{x}}_{\boldsymbol{i}}, i=1,2,\cdots, N\Big|{\mathbf{x}}_{\boldsymbol{i}}=\left({I}_{i0},{I}_{i1},\cdots, {I}_{i, T-1}\right)\right\}, $$

(2)

where \( {\mathbf{x}}_{\boldsymbol{i}} \) represents the data vector for the \( i^{\mathrm{th}} \) voxel, \( N \) is the number of voxels, \( {I}_{it}\ \left( t=0,1,\cdots, T-1\right) \) is the intensity value of the \( i^{\mathrm{th}} \) voxel at time point \( t \), and \( T \) is the number of time points (\( T=6 \)). The FCM clustering process minimizes an objective function by iteratively updating the membership functions and the cluster centers. The objective function, cluster centers, and membership functions are defined as follows [22]:

$$ F={\sum}_{i=1}^N\kern0.2em {\sum}_{k=1}^w\kern0.2em {\mu}_{k i}^m\kern0.5em \parallel {\mathbf{x}}_{\mathbf{i}}-{\mathbf{v}}_{\mathbf{k}}{\parallel}^2 $$

(3)

$$ {\mathbf{v}}_{\mathbf{k}}=\frac{{\displaystyle {\sum}_{i=1}^N}{\mu}_{ki}^m{\mathbf{x}}_{\mathbf{i}}}{{\displaystyle {\sum}_{i=1}^N}{\mu}_{ki}^m},\kern0.5em k=1,2,\cdots, w, $$

(4)

$$ {\mu}_{ki}=\frac{1}{{\displaystyle {\sum}_{j=1}^w}{\left(\frac{\left\Vert {\mathbf{x}}_{\mathbf{i}}-{\mathbf{v}}_{\mathbf{k}}\right\Vert }{\left\Vert {\mathbf{x}}_{\mathbf{i}}-{\mathbf{v}}_{\mathbf{j}}\right\Vert}\right)}^{\frac{2}{m-1}}},\kern0.5em k=1,2,\cdots, w;\kern0.5em i=1,2,\cdots, N, $$

(5)

where \( m\in \left[1,\infty \right) \) controls the fuzziness of the clustering results, \( w \) is the number of clusters, \( {\mathbf{v}}_{\mathbf{k}} \) is the center of the \( k^{\mathrm{th}} \) cluster, and \( {\mu}_{ki} \) is the membership value of the \( i^{\mathrm{th}} \) voxel in the \( k^{\mathrm{th}} \) cluster, which ranges continuously from 0 to 1. Using the class membership values of the voxels, two membership matrices are created (\( {\mu}_{lesion} \) and \( {\mu}_{nonlesion} \)). In Fig. 5, the lesion membership matrices are shown for two sample images. Each entry in this matrix represents the degree of similarity between the corresponding voxel and lesion tissue. By thresholding the lesion membership matrix, spurious candidate voxels that may belong to noise or normal breast tissue are eliminated. As can be observed in this figure, the vessels show contrast enhancement similar to that of the breast lesions. This means that FCM clustering places the voxels belonging to the mammary vessels and those belonging to breast lesions in the same cluster. Thus, the potential lesion voxels that belong to vessels should be detected and eliminated to reduce false-positive detections.
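The alternating updates of Eqs. 4 and 5 can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function name `fuzzy_c_means`, the random initialization, the stopping tolerance, the default \( m=2 \), and the rule for deciding which cluster is the "lesion" cluster are all assumptions. The membership update requires \( m>1 \).

```python
import numpy as np

def fuzzy_c_means(X, w=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Minimal fuzzy c-means following Eqs. (3)-(5); requires m > 1.

    X : (N, T) array, one time-intensity vector per voxel.
    Returns (centers of shape (w, T), memberships of shape (w, N)).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    mu = rng.random((w, X.shape[0]))
    mu /= mu.sum(axis=0, keepdims=True)       # memberships sum to 1 per voxel
    for _ in range(n_iter):
        um = mu ** m
        centers = (um @ X) / um.sum(axis=1, keepdims=True)        # Eq. (4)
        dist = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        dist = np.maximum(dist, 1e-12)        # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        new_mu = inv / inv.sum(axis=0, keepdims=True)             # Eq. (5)
        converged = np.abs(new_mu - mu).max() < tol
        mu = new_mu
        if converged:
            break
    return centers, mu

# Toy data: five weakly enhancing and five strongly enhancing curves (T = 6)
flat = np.array([[100, 102, 103, 104, 105, 105]] * 5, dtype=float)
enhancing = np.array([[100, 180, 200, 190, 180, 170]] * 5, dtype=float)
X = np.vstack([flat, enhancing])
centers, mu = fuzzy_c_means(X)

# Assumption: the cluster whose center enhances most is labeled "lesion"
lesion = int(np.argmax(centers[:, 1:].max(axis=1) - centers[:, 0]))
mu_lesion = mu[lesion]
```

Reshaping `mu_lesion` back to the volume grid would give a lesion membership map of the kind shown in Fig. 5.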

To identify mammary vessels, we apply the Hessian-based filter introduced by Frangi et al. [23], which is one of the best-known vesselness filters. In scale space, the matrix of second-order partial derivatives of an image \( {I}_0(p)={I}_0\left( x, y, z\right) \) smoothed at scale \( \sigma \) is called the Hessian matrix and is given by Eq. 6:

$$ {H}_{\sigma}(p)=\left[\begin{array}{ccc}\frac{\partial^2{I}_{\sigma}\left( x, y, z\right)}{\partial {x}^2} & \frac{\partial^2{I}_{\sigma}\left( x, y, z\right)}{\partial x\partial y} & \frac{\partial^2{I}_{\sigma}\left( x, y, z\right)}{\partial x\partial z}\\ \frac{\partial^2{I}_{\sigma}\left( x, y, z\right)}{\partial y\partial x} & \frac{\partial^2{I}_{\sigma}\left( x, y, z\right)}{\partial {y}^2} & \frac{\partial^2{I}_{\sigma}\left( x, y, z\right)}{\partial y\partial z}\\ \frac{\partial^2{I}_{\sigma}\left( x, y, z\right)}{\partial z\partial x} & \frac{\partial^2{I}_{\sigma}\left( x, y, z\right)}{\partial z\partial y} & \frac{\partial^2{I}_{\sigma}\left( x, y, z\right)}{\partial {z}^2}\end{array}\right], $$

(6)

where \( p=\left( x, y, z\right) \) is a voxel location and \( {I}_{\sigma} \) is a blurred image at a certain scale defined as:

$$ {I}_{\sigma}(p)={I}_{\sigma}\left( x, y, z\right)={I}_0\left( x, y, z\right)\otimes {G}_{\sigma}\left( x, y, z\right) $$

(7)

where \( \otimes \) is the convolution operator and \( {G}_{\sigma}\left( x, y, z\right) \) is the 3D Gaussian kernel defined as follows:

$$ {G}_{\sigma}(p)={G}_{\sigma}\left( x, y, z\right)=\frac{1}{\sqrt{{\left(2\pi {\sigma}^2\right)}^3}}\exp \left(-\frac{{x}^2+{y}^2+{z}^2}{2{\sigma}^2}\right), $$

(8)

where \( \sigma \) is the standard deviation, which has to be set according to the approximate width of the vessels. The eigenvalues of the Hessian matrix provide a geometric interpretation of the local image structure; hence, they are used to detect different structures. In our approach, the eigenvalues of the Hessian matrix are sorted as \( \left|{\lambda}_1\right|\le \left|{\lambda}_2\right|\le \left|{\lambda}_3\right| \), and according to Eqs. 6–8, they depend on the voxel location and the standard deviation. Frangi et al. [23] note that a voxel belonging to a bright vessel on a dark background is characterized by a small \( \left|{\lambda}_1\right| \) and large negative values of \( {\lambda}_2 \) and \( {\lambda}_3 \). The vesselness function is defined as follows [23]:

$$ {v}_o\left( p,\sigma \right)=\left\{\begin{array}{ll}0, & \mathrm{if}\ {\lambda}_2>0\ \mathrm{or}\ {\lambda}_3>0,\\ \left(1-\exp \left(-\frac{R_A^2}{2{\alpha}^2}\right)\right)\exp \left(-\frac{R_B^2}{2{\beta}^2}\right)\left(1-\exp \left(-\frac{s^2}{2{c}^2}\right)\right), & \mathrm{otherwise},\end{array}\right. $$

(9)

where \( \alpha \) and \( \beta \) are fixed to 0.5 and \( c \) is half of the maximum Hessian norm. Moreover, \( {R}_A \), \( {R}_B \), and \( s \) are given as follows:

$$ {R}_A=\frac{\left|{\lambda}_2\right|}{\left|{\lambda}_3\right|},\kern0.5em {R}_B=\frac{\left|{\lambda}_1\right|}{\sqrt{\left|{\lambda}_2{\lambda}_3\right|}},\kern0.5em s=\sqrt{\lambda_1^2+{\lambda}_2^2+{\lambda}_3^2}. $$

(10)

The term \( {R}_A \) distinguishes between plate-like and tube-like structures, \( {R}_B \) describes the deviation from a blob-like structure, and \( s \) represents the difference between vessel and background. Because mammary vessels have different diameters, the vesselness filter is applied at six exponentially distributed scales between the minimum and maximum scales, \( {\sigma}_{min}=0.5 \) and \( {\sigma}_{max}=1 \), and the highest response is chosen for each voxel:

$$ {v}_o(p)={ \max}_{\sigma_{\min}\le \sigma \le {\sigma}_{\max }}{v}_o\left( p,\sigma \right). $$

(11)
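Equations 9–11 can be illustrated for a single voxel with the short sketch below. It assumes the Hessian eigenvalues at each scale are already available, and it takes \( c \) as an input rather than deriving it from the Hessian norm. The function names, the geometric spacing used for "exponentially distributed" scales, and the extra guard for zero eigenvalues (which avoids division by zero in \( R_A \) and \( R_B \)) are assumptions made for this example.

```python
import math

def vesselness(l1, l2, l3, c, alpha=0.5, beta=0.5):
    """Eqs. (9)-(10) for one voxel at one scale (bright vessel, dark background)."""
    # Sort by magnitude so that |l1| <= |l2| <= |l3|
    l1, l2, l3 = sorted((l1, l2, l3), key=abs)
    if l2 > 0 or l3 > 0 or l2 == 0 or l3 == 0:
        return 0.0  # zero checks are an added guard for the ratios below
    r_a = abs(l2) / abs(l3)
    r_b = abs(l1) / math.sqrt(abs(l2 * l3))
    s = math.sqrt(l1 ** 2 + l2 ** 2 + l3 ** 2)
    return ((1.0 - math.exp(-r_a ** 2 / (2 * alpha ** 2)))
            * math.exp(-r_b ** 2 / (2 * beta ** 2))
            * (1.0 - math.exp(-s ** 2 / (2 * c ** 2))))

def exponential_scales(sigma_min=0.5, sigma_max=1.0, n=6):
    """n scales spaced geometrically between sigma_min and sigma_max (assumed)."""
    ratio = (sigma_max / sigma_min) ** (1.0 / (n - 1))
    return [sigma_min * ratio ** i for i in range(n)]

def multiscale_vesselness(eigs_per_scale, c):
    """Eq. (11): strongest response over all scales for one voxel.

    eigs_per_scale : list of (l1, l2, l3) tuples, one per scale.
    """
    return max(vesselness(*eigs, c=c) for eigs in eigs_per_scale)

scales = exponential_scales()
tube = vesselness(0.0, -1.0, -1.0, c=0.5)    # ideal bright tube
blob = vesselness(-1.0, -1.0, -1.0, c=0.5)   # isotropic bright blob
dark = vesselness(0.0, 1.0, -1.0, c=0.5)     # positive curvature: rejected
best = multiscale_vesselness([(0.0, -0.2, -0.3), (0.0, -1.0, -1.0)], c=0.5)
```

With these toy eigenvalues the tube-like voxel scores higher than the blob-like one, which is the behavior the \( R_B \) term is meant to produce.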

Vessel detection is performed on the subtracted images at the first post-contrast time point because vessels show their maximum contrast enhancement in the early frames. The result of the vesselness filter for a sample image is presented in Fig. 6. The value of each entry in the response matrix ranges from 0 to 1 and indicates the degree of similarity of the corresponding voxel to a vessel.
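The eigenvalues that enter the vesselness function can be obtained per voxel from a volume that has already been smoothed at scale \( \sigma \). The sketch below is one possible way to do this with NumPy; it approximates the second derivatives of Eq. 6 with finite differences (`numpy.gradient`) instead of Gaussian-derivative filters, which is an assumption of this example, as is the function name `hessian_eigenvalues`.

```python
import numpy as np

def hessian_eigenvalues(smoothed):
    """Eigenvalues of the 3D Hessian (Eq. 6) of an already smoothed volume.

    Second derivatives are approximated with finite differences (an
    assumption of this sketch). Returns an array of shape (X, Y, Z, 3)
    sorted so that |l1| <= |l2| <= |l3|.
    """
    vol = np.asarray(smoothed, dtype=float)
    first = np.gradient(vol)                    # [dI/dx, dI/dy, dI/dz]
    hessian = np.empty(vol.shape + (3, 3))
    for a in range(3):
        second = np.gradient(first[a])          # derivatives of dI/da
        for b in range(3):
            hessian[..., a, b] = second[b]
    # Enforce symmetry before the symmetric eigenvalue solver
    hessian = 0.5 * (hessian + np.swapaxes(hessian, -1, -2))
    eig = np.linalg.eigvalsh(hessian)
    order = np.argsort(np.abs(eig), axis=-1)
    return np.take_along_axis(eig, order, axis=-1)

# Toy volume I = -x^2: curvature only along the first axis
x = np.arange(7, dtype=float) - 3.0
vol = np.broadcast_to(-(x ** 2)[:, None, None], (7, 7, 7))
eig = hessian_eigenvalues(vol)
center = eig[3, 3, 3]
```

At the central voxel of this toy volume the sorted eigenvalues are approximately (0, 0, -2); values near the volume border are less accurate because one-sided differences are used there.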

The lesion membership matrix obtained by the FCM algorithm and the response of the vesselness filter are then thresholded to eliminate false-positive detections from the set of potential lesion voxels. The threshold level is set to 0.5 for both the lesion membership matrix and the response of the vesselness filter. The lesion membership matrix is thereby converted to a binary image that shows normal voxels in black and lesion voxels in white. Likewise, a binary image is generated from the response of the vesselness filter, showing non-vessel voxels in black and mammary vessel voxels in white. Consequently, the voxels with label one in the thresholded lesion membership matrix and label zero in the thresholded vesselness response are selected as the final potential lesion voxels.
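The selection rule above amounts to a voxel-wise logical combination of two thresholded maps. A minimal sketch follows; the function name `final_candidates` and the use of a strict inequality at the threshold are assumptions, since the text does not state how values exactly equal to 0.5 are treated.

```python
import numpy as np

def final_candidates(mu_lesion, vessel_response, candidates, threshold=0.5):
    """Keep candidates that look like lesion (FCM) and not like vessel.

    mu_lesion, vessel_response : float arrays in [0, 1], same shape
    candidates : boolean array marking the local maxima of the enhancement map
    """
    lesion_like = np.asarray(mu_lesion) > threshold        # label one
    vessel_like = np.asarray(vessel_response) > threshold  # label one
    return np.asarray(candidates, dtype=bool) & lesion_like & ~vessel_like

# Four toy voxels: lesion, vessel, normal tissue, and a non-candidate
mu_toy = np.array([0.9, 0.9, 0.2, 0.8])
vessel_toy = np.array([0.1, 0.7, 0.1, 0.2])
cand_toy = np.array([True, True, True, False])
keep = final_candidates(mu_toy, vessel_toy, cand_toy)
```

Only the first toy voxel survives: the second is rejected as a vessel, the third as normal tissue, and the fourth was never a candidate.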

### 3.4 Detection of potential lesion regions

In the previous processing step, the potential lesion voxels are segmented out using a combination of the maximum enhancement ratio, the vesselness filter, and FCM clustering. These voxels are used as the seed points of a seeded region-growing algorithm that segments the potential lesion regions. The seeded region-growing algorithm is used because it is simple and robust [24]. It starts from an initial seed voxel, compares the neighboring voxels with the seed according to a specific homogeneity criterion, and enlarges the region iteratively. If a neighboring voxel satisfies the homogeneity criterion, it is added to the segmented region. The 26 neighbors of each newly added voxel are then tested against the homogeneity criterion, and the process continues in the same way. The initial seed voxel and the homogeneity criterion are usually selected manually [25]. In this study, an automated version of the seeded region-growing algorithm is used, in which these parameters are chosen automatically. The detected potential lesion voxels are taken as the initial seed voxels, and the attributes used to select the potential lesion voxels serve as the growth criteria. Neighboring voxels with label one in the thresholded lesion membership matrix and label zero in the thresholded vesselness response are considered to be part of the potential lesion region.
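The growth procedure can be sketched as a breadth-first search over 26-connected neighbors. In this illustration the homogeneity criterion is assumed to be precomputed as a boolean mask `allowed` (lesion membership above threshold and vesselness below threshold); the function name `region_grow` and the queue-based traversal are choices made for the example, not details given in the text.

```python
from collections import deque
from itertools import product

import numpy as np

def region_grow(allowed, seeds):
    """26-connected seeded region growing on a boolean mask.

    allowed : boolean (X, Y, Z) array, True where the growth criterion holds
    seeds   : iterable of (x, y, z) seed voxels
    """
    allowed = np.asarray(allowed, dtype=bool)
    region = np.zeros_like(allowed)
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    queue = deque()
    for seed in seeds:
        seed = tuple(seed)
        if allowed[seed] and not region[seed]:
            region[seed] = True
            queue.append(seed)
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in offsets:
            n = (x + dx, y + dy, z + dz)
            inside = all(0 <= n[i] < allowed.shape[i] for i in range(3))
            if inside and allowed[n] and not region[n]:
                region[n] = True
                queue.append(n)
    return region

# Toy mask: two diagonally adjacent voxels and one isolated voxel
allowed = np.zeros((5, 5, 5), dtype=bool)
allowed[0, 0, 0] = allowed[1, 1, 1] = allowed[4, 4, 4] = True
region = region_grow(allowed, [(0, 0, 0)])
```

The diagonal neighbor is reached because 26-connectivity includes corner adjacency, while the isolated voxel is left out of the region.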

### 3.5 False positive reduction

Despite the elimination of spurious candidate voxels in Section 3.3, not all potential lesion regions are breast lesions, and some false-positive findings remain. These false detections can degrade the performance of the CAD system. To reduce them, a discrimination step determines whether a potential lesion region is a true lesion or a false-positive detection. This is achieved by classifying the potential lesion regions into two classes, lesion and normal breast tissue, using morphological and kinetic features as inputs to the SVM classifier. The SVM classifier is chosen mainly for its high generalization ability, robustness to outliers, and absence of local minima [26].

To classify the potential lesion regions, morphological and kinetic features are calculated after applying the 3D connected-component algorithm [27, 28] to the potential lesion regions. The morphological features are volume, compactness, radius, and spiculation. Lesion volume is used to decrease false-positive detections, since the majority of false-positive findings have smaller volumes than the lesions. Compactness describes the relationship between the surface and the volume of a segmented region. Radius and spiculation measure variations of the margins of the segmented regions. More details about the morphological features are given in [21, 29]. As kinetic features, maximum enhancement (ME), time to peak (TP), uptake rate (UR), washout rate (WR), and area under the curve are extracted from the time-intensity curve of each voxel. These kinetic features are computed from the relative signal enhancement [14]:

$$ {R}_t=\frac{I_t-{I}_0}{I_0} $$

(12)

For each voxel in the potential lesion region, ME, TP, UR, and WR are defined here [14, 21] as follows:

$$ \mathrm{ME}={ \max}_{t=1,\cdots, 5}\left({R}_t\right), $$

(13)

$$ \mathrm{TP}=\arg {\max}_{t=1,\cdots, 5}\left({R}_t\right), $$

(14)

$$ \mathrm{U}\mathrm{R}=\frac{\mathrm{ME}}{\mathrm{TP}}, $$

(15)

$$ \mathrm{WR}=\left\{\begin{array}{ll}\frac{R_{\mathrm{TP}}-{R}_5}{5-\mathrm{TP}}, & \mathrm{if}\ \mathrm{TP}\ne 5,\\ 0, & \mathrm{if}\ \mathrm{TP}=5.\end{array}\right. $$

(16)
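As a worked example of Eqs. 12–16, the sketch below computes the kinetic features for a single time-intensity curve in plain Python. The function name `kinetic_features` is hypothetical. The area under the curve is computed here with the trapezoidal rule over \( R_0=0, R_1, \cdots, R_5 \) with unit spacing, which is an assumption because no formula for it is given above. Ties in the maximum are resolved in favor of the earliest time point, which is also an assumption.

```python
def kinetic_features(curve):
    """ME, TP, UR, WR (Eqs. 12-16) and AUC for one voxel.

    curve : sequence [I_0, I_1, ..., I_5] of pre- and post-contrast
            intensities, with I_0 > 0.
    """
    i0 = float(curve[0])
    r = [(float(it) - i0) / i0 for it in curve[1:]]         # Eq. (12), t = 1..5
    last = len(r)                                            # final time point (5)
    me = max(r)                                              # Eq. (13)
    tp = r.index(me) + 1                                     # Eq. (14), 1-based
    ur = me / tp                                             # Eq. (15)
    wr = (me - r[-1]) / (last - tp) if tp != last else 0.0   # Eq. (16)
    # Assumption: trapezoidal AUC over R_0 = 0, R_1, ..., R_5, unit spacing
    pts = [0.0] + r
    auc = sum((pts[k] + pts[k + 1]) / 2.0 for k in range(len(pts) - 1))
    return {"ME": me, "TP": tp, "UR": ur, "WR": wr, "AUC": auc}

features = kinetic_features([100, 150, 200, 180, 160, 150])
```

For this curve the relative enhancements are 0.5, 1.0, 0.8, 0.6, and 0.5, so ME = 1.0 at TP = 2, UR = 0.5, and WR = (1.0 − 0.5)/(5 − 2) ≈ 0.167.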

For each potential lesion region, a total of 19 features are extracted: the five kinetic parameters of the seed point of the segmented region, the average and standard deviation of each kinetic parameter over all voxels in the segmented region, and the four morphological features. Each potential lesion region is classified by feeding its feature vector to the SVM classifier. More details about the SVM are available in [26].
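The composition of the 19-element feature vector (5 seed-point kinetic values, 5 means, 5 standard deviations, and 4 morphological values) can be sketched as follows. The function name `region_feature_vector`, the ordering of the entries, the dictionary keys, and the use of the population standard deviation are assumptions of this example. The morphological features are taken as given inputs, and the classifier itself is not shown.

```python
from statistics import mean, pstdev

KINETIC_KEYS = ("ME", "TP", "UR", "WR", "AUC")

def region_feature_vector(seed_kinetics, voxel_kinetics, morphology):
    """Assemble the 19-element feature vector of one potential lesion region.

    seed_kinetics  : dict with the five kinetic features of the seed voxel
    voxel_kinetics : list of such dicts, one per voxel in the region
    morphology     : (volume, compactness, radius, spiculation)
    """
    vector = [float(seed_kinetics[k]) for k in KINETIC_KEYS]   # 5 values
    for k in KINETIC_KEYS:                                     # 10 values
        values = [float(v[k]) for v in voxel_kinetics]
        vector.extend([mean(values), pstdev(values)])
    vector.extend(float(m) for m in morphology)                # 4 values
    return vector

# Toy region with two voxels; all numbers are illustrative only
seed = {"ME": 1.0, "TP": 2, "UR": 0.5, "WR": 0.1, "AUC": 3.0}
voxels = [seed, {"ME": 0.8, "TP": 3, "UR": 0.4, "WR": 0.1, "AUC": 2.0}]
vec = region_feature_vector(seed, voxels, (120.0, 0.7, 3.1, 0.2))
```

One such vector per potential lesion region would form a row of the input matrix passed to the SVM classifier.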