# Throat polyp detection based on compressed big data of voice with support vector machine algorithm

*EURASIP Journal on Advances in Signal Processing*
**volume 2014**, Article number: 1 (2014)

## Abstract

Classification of large-scale data is a key problem in the big data domain. The theory of compressive sensing enables the recovery of a sparse signal from a small set of linear, random projections, which makes it possible to classify big data directly in the compressed domain without reconstruction. In this paper, we collected compressed vowel /a:/ and /i:/ voice signals using compressive sensing for throat polyp detection. A throat polyp prediction procedure based on the wavelet packet transform and the support vector machine algorithm was derived, and experiments were carried out with the proposed classification algorithm. The results showed that the correct rate of prediction was stable under different numbers of samples and different random measurement matrices.

## 1 Introduction

Throat polyps are small fleshy growths which form on the vocal cords, usually as a result of overuse. They are mainly caused by straining or overusing the voice, for example, public speaking. Professional singers, sports/fitness coaches, or actors are all prone to developing throat polyps. The most common symptoms of throat polyps are a hoarse or deeper voice or a breathy sounding voice similar to laryngitis.

The traditional methods of throat polyp diagnosis are indirect laryngoscope, video-laryngoscope, and stroboscope light [1]. These methods need special instruments and depend on the experience of the pathologists, and the patients usually feel discomfort or pain. Because a change of voice is the most common symptom of throat polyps, it would be desirable to detect throat polyps from the patient's voice. In [1], Zhong et al. tried to detect throat polyps based on patient voices. Two fuzzy classifiers and a Bayesian classifier were designed for throat polyp detection based on the patient vowel voices /a:/ and /i:/. The experimental results showed that an interval type 2 fuzzy classifier performed best. In this paper, we use compressive sensing and the support vector machine (SVM) algorithm to detect throat polyps from the patient vowel voices /a:/ and /i:/ while reducing the burden of voice data collection and storage.

Compressive sensing (or compressed sampling, CS) theory, proposed by Donoho and Candès in 2006 [2, 3], demonstrated that a high-dimensional signal can be projected into a low-dimensional space with a random measurement matrix when the signal is sparse or compressible. The original signal can then be recovered from the low-dimensional measurements by solving an optimization problem. The provable success of CS for signal reconstruction suggests that the low-dimensional signal contains the main features of the original signal. Thus, the universality of CS theory can be leveraged in hypothesis testing problems to reduce the complexity of data computation [4].

Hypothesis testing in the compressed domain not only reduces the pressure on data storage and transmission but also avoids a large amount of computation. In [5, 6], Budhaditya et al. used compressed sensor network data for anomaly detection based on a spectrum theory method and obtained satisfactory detection results through residual analysis of the compressed data. In [7, 8], random projection in conjunction with principal component analysis (PCA) was implemented for anomaly detection in the compressed domain, and an application of this methodology to detecting IP-level volume anomalies in computer network traffic suggested a high relevance to practical problems. In [9], an anomaly detection criterion based on the wavelet packet transform and statistical process control theory in the compressed domain was used for through-wall human detection. The experimental results showed that the proposed algorithm could effectively detect the presence of a human being from compressed signals.

Because compressed classification of big data based on compressive sensing has advantages over classification of the original data [10–15], a throat polyp detection algorithm based on compressive sensing and the support vector machine is proposed in this paper. The remainder of this paper is organized as follows: Section 2 introduces compressive sensing theory. Section 3 derives the throat polyp detection procedure based on CS and SVM. Section 4 presents the experimental results of throat polyp detection. Section 5 gives the conclusion and discussion.

## 2 Background on compressive sensing

Compressive sensing builds on a core tenet of signal processing and information theory: signals often contain some type of structure that enables intelligent representation and processing [16].

Suppose that an observer makes measurements of a signal *x*. The signal can be expressed as follows:

$$x = \Phi\theta \tag{1}$$

where *θ* ∈ *R*^{N} is the expansion coefficient vector under the orthonormal basis Φ. If *θ* has only *K* ≤ *N* nonzero coefficients, we say that the signal *x* is *K*-sparse.

The surprising result of CS is that a length-*N* signal that is *K*-sparse in some basis can be recovered exactly or approximately from a nonadaptive linear projection of the signal onto a random basis. In matrix notation, it can be described as follows [17–21]:

$$y = \Psi x = \Psi\Phi\theta \tag{2}$$

where *y* is an *M* × 1 column vector and Ψ is an *M* × *N* random matrix. The appeal of CS is that we only need to collect *M* = *O*(*K* log(*N*/*K*)) random measurements to recover the signal *x* by solving the following *l*_{0}-norm-constrained optimization problem:

$$\hat{\theta} = \arg\min_{\theta} \|\theta\|_{0} \quad \text{subject to} \quad y = \Psi\Phi\theta \tag{3}$$

where the ||*θ*||_{0} norm counts the number of nonzero components of *θ*. However, solving Equation 3 is both numerically unstable and NP-hard. Instead of solving the *l*_{0} minimization problem, nonadaptive CS theory seeks to solve the 'closest possible' tractable minimization problem, i.e., the *l*_{1} minimization:

$$\hat{\theta} = \arg\min_{\theta} \|\theta\|_{1} \quad \text{subject to} \quad y = \Psi\Phi\theta \tag{4}$$

Although *M* < *N*, the recovery of the signal *x* from the measurements *y* becomes possible and practical under the additional assumption of signal sparsity or compressibility. The provable success of CS for signal reconstruction indicates that the collected low-dimensional measurements contain the main features of the original signal. Therefore, it provides a novel procedure for hypothesis testing of big data, which can be carried out directly in the compressed domain.
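The measurement step *y* = Ψ*x* can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' acquisition code: the signal length, the sparsity level, the identity sparsifying basis, the 1/*M* variance scaling, and the helper names `gaussian_measurement_matrix` and `compress` are all assumptions chosen for the example.

```python
import numpy as np


def gaussian_measurement_matrix(m, n, rng):
    """Draw an m-by-n matrix with i.i.d. N(0, 1/m) entries (assumed scaling)."""
    return rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))


def compress(x, psi):
    """Compute the compressed measurements y = psi @ x."""
    return psi @ x


rng = np.random.default_rng(0)
n, k = 1024, 10   # signal length and sparsity (illustrative values)
m = n // 2        # keep half as many measurements as samples

# Build a K-sparse coefficient vector; the sparsifying basis is taken
# to be the identity here (assumption), so x equals theta.
theta = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
theta[support] = rng.normal(size=k)
x = theta

psi = gaussian_measurement_matrix(m, n, rng)
y = compress(x, psi)

# With this scaling the measurements keep roughly the energy of the signal.
energy_ratio = np.linalg.norm(y) ** 2 / np.linalg.norm(x) ** 2
print(y.shape, round(energy_ratio, 3))
```

The printed energy ratio should be close to 1, which is one informal way to see that the low-dimensional measurements retain information about the original signal.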

## 3 Throat polyp detection procedure

The most common symptom of throat polyps, and typically the first to appear, is a general roughness or hoarseness of the voice, which may or may not be accompanied by a sore throat or a full feeling in the throat. In other words, the frequency components of the same sound, such as a vowel, change when a person suffers from throat polyps. Therefore, we can detect throat polyps by analyzing the frequency components of the voice signal.

### 3.1 Acquire the frequency component by WPT

The Fourier transform (FT) is the conventional tool for signal frequency spectrum analysis, but it is a global transform with limited resolution. Because the FT has difficulty recognizing small changes in the frequency spectrum, the wavelet packet transform (WPT) has become one of the most widely used tools in the field of signal frequency analysis.

WPT is an extension of the wavelet transform (WT) that provides a complete level-by-level decomposition. It enables the extraction of features from signals that combine stationary and nonstationary characteristics with an arbitrary time-frequency resolution [22].

In this paper, we extract the features of the vowel voices /a:/ and /i:/ to detect throat polyps. According to the principle of WPT, the vowel voice signal *x*(*t*) is decomposed into *j* levels, and the node signals are reconstructed as $x_{j}^{i}(t)$. The signal can then be expressed as follows:

$$x(t) = \sum_{i=1}^{2^{j}} x_{j}^{i}(t) \tag{5}$$

The node signal energy $E_{j}^{i}$ can be defined as

$$E_{j}^{i} = \int \left| x_{j}^{i}(t) \right|^{2} \, dt \tag{6}$$

On the basis of WPT, Equation 6 illustrates that the node signal energy $E_{j}^{i}$ stores the energy of a specific time-frequency window. In other words, $E_{j}^{i}$ indicates the proportion of the corresponding frequency component in the original signal. Thus, according to the principle described above, throat polyp detection can be achieved by investigating the changing trend of $E_{j}^{i}$.

In order to eliminate the influence of volume on throat polyp detection, we further define $\Delta E_{j}^{i}$ as the ratio of the node signal energy to the total signal energy:

$$\Delta E_{j}^{i} = \frac{E_{j}^{i}}{\sum_{k=1}^{m} E_{j}^{k}} \tag{7}$$

where *m* denotes the number of dominant nodes that contain the main energy of the signal. Restricting the total to these nodes removes the effect of noise on the low-energy nodes and improves the detection accuracy.
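The node energy and energy ratio described above can be illustrated with a small NumPy sketch. It is only a sketch under stated assumptions: it swaps in a Haar wavelet packet (not the 'db10' wavelet used in the experiments) so that no wavelet library is needed, it computes energies from the node coefficients rather than from reconstructed node signals (equivalent for an orthonormal transform), it keeps the nodes in natural rather than frequency order, and it takes the "dominant" nodes to be the first *m* nodes. The function names are hypothetical.

```python
import numpy as np


def haar_wavelet_packet(x, levels):
    """Return the 2**levels leaf-node coefficient arrays (natural order)."""
    x = np.asarray(x, dtype=float)
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = [x]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            next_nodes.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        nodes = next_nodes
    return nodes


def node_energy_ratios(x, levels, m):
    """Energy of each of the first m nodes divided by their summed energy."""
    energies = np.array([np.sum(c ** 2) for c in haar_wavelet_packet(x, levels)])
    kept = energies[:m]
    total = kept.sum()
    if total == 0.0:
        return np.zeros(m)
    return kept / total


# Illustrative signal: a strong slow oscillation plus a weaker fast one.
t = np.arange(256)
signal = np.sin(2 * np.pi * 4 * t / 256) + 0.5 * np.sin(2 * np.pi * 60 * t / 256)
print(np.round(node_energy_ratios(signal, levels=3, m=4), 3))
```

Because the ratios are normalized by the summed energy of the kept nodes, scaling the signal (speaking louder or softer) leaves them unchanged, which is the purpose of the normalization.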

### 3.2 Throat polyp detection procedure with SVM

Support vector machines (SVM) are a popular machine learning method for classification, regression, and other learning tasks. In this method, one maps the data into a higher dimensional feature space and constructs an optimal separating hyperplane in this space [23, 24].

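Given a training set of *N* data points (*y*_{i}, *x*_{i}), *i* = 1, 2, 3, …, *N*, where *x*_{i} ∈ *R*^{n} and *y*_{i} ∈ {1, -1}, the classifier is constructed as follows. One assumes that

$$\begin{cases} w^{T}\varphi(x_{i}) + b \ge 1, & \text{if } y_{i} = 1 \\ w^{T}\varphi(x_{i}) + b \le -1, & \text{if } y_{i} = -1 \end{cases} \tag{8}$$

which is equivalent to

$$y_{i}\left[ w^{T}\varphi(x_{i}) + b \right] \ge 1, \quad i = 1, 2, \ldots, N \tag{9}$$

where *φ* is a nonlinear function which maps the input space into a higher dimensional space, and *w* and *b* are the weight vector and bias of the hyperplane. However, the function *φ* in (9) is not explicitly constructed. In order to obtain the separating hyperplane in the higher dimensional space, slack variables *ξ*_{i} are introduced to solve the following primal optimization problem, in which *C* > 0 is the penalty parameter:

$$\min_{w, b, \xi} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{N} \xi_{i} \quad \text{subject to} \quad y_{i}\left[ w^{T}\varphi(x_{i}) + b \right] \ge 1 - \xi_{i}, \quad \xi_{i} \ge 0, \quad i = 1, 2, \ldots, N \tag{10}$$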

Through the training data, we can obtain the support vectors and kernel parameters in the model for prediction.
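To make the soft-margin idea concrete, the following sketch trains a classifier by subgradient descent on the hinge-loss form of the primal problem. It is a simplification, not the C-SVM solver used in the experiments: the mapping *φ* is taken to be the identity (a linear kernel), the optimizer, learning rate, and toy data are assumptions, and `train_linear_svm` and `predict` are hypothetical helper names.

```python
import numpy as np


def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Minimise 0.5*||w||^2 + C*sum(max(0, 1 - y_i*(w.x_i + b)))."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        violated = margins < 1  # points that would need a nonzero slack
        # Subgradient of the objective with respect to w and b.
        grad_w = w - C * (y[violated, None] * X[violated]).sum(axis=0)
        grad_b = -C * y[violated].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    """Assign +1 or -1 according to the side of the hyperplane."""
    return np.where(X @ w + b >= 0, 1, -1)


# Toy two-class data (illustrative only, not patient features).
X = np.array([[2.0, 2.0], [3.0, 3.0], [2.0, 3.0],
              [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

w, b = train_linear_svm(X, y)
print(predict(X, w, b))
```

A kernel SVM replaces the inner products with kernel evaluations and is usually solved in the dual, which is what dedicated libraries do; the sketch only shows how the penalty *C* trades margin width against margin violations.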

Continuous speech usually consists of vowels and consonants. The vocal cords do not vibrate when producing unvoiced consonants, which are formed at the lips and teeth. Collecting such signals adds no useful information and introduces interference into the subsequent steps. Meanwhile, multiple vowels in one speech sample result in aliasing in the spectrum of the sample. Therefore, in this paper, we only use the vowels /a:/ and /i:/ to detect the throat polyps of the patients.

In this paper, we acquire the vowel voice signals based on compressive sensing and extract features constructed from the frequency components of the compressed signals. Lastly, the SVM method is used to obtain the classification model for throat polyp detection. The procedure is depicted in Algorithm 1.

## 4 Experimental results and analysis

In the experiments, vowel /a:/ and /i:/ voice signals of 26 patients were collected; 13 of the patients had throat polyps and 13 did not. A Gaussian random measurement matrix was used to obtain the compressed data. The compressed voice signals were decomposed by an eight-level wavelet packet with the 'db10' wavelet, and the first 20 node signals were used to extract the features. The C-SVM program of the LIBSVM library was used to set up the classification model and predict throat polyps [23].

In the first experiment, we used 7,000 samples of the original vowel /a:/ and /i:/ signals of each of the 26 patients to construct the features. The compression ratio was 50%, and the features of eight normal patients without throat polyps and eight abnormal patients with throat polyps were used for training to establish a classification model. The features of the remaining ten patients were used to test the performance of the proposed algorithm (Algorithm 1).
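The training and test split described above can be sketched as follows. The text does not say how the eight patients per class were chosen, so the random selection, the seed, and the helper names `split_patients` and `correct_rate` are assumptions made for illustration.

```python
import random


def split_patients(labels, n_train_per_class, seed=0):
    """Pick n_train_per_class patients of each class for training; rest for test."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        indices = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(indices)
        train += indices[:n_train_per_class]
        test += indices[n_train_per_class:]
    return sorted(train), sorted(test)


def correct_rate(predicted, actual):
    """Fraction of test patients whose predicted label matches the true label."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# 13 patients with polyps (+1) and 13 without (-1), as in the experiments.
labels = [1] * 13 + [-1] * 13
train_idx, test_idx = split_patients(labels, n_train_per_class=8)
print(len(train_idx), len(test_idx))
```

With 13 patients per class and eight per class used for training, 16 patients form the training set and the remaining ten form the test set, so each test patient contributes ten percentage points to the correct rate.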

Figures 1 and 2 show the node energy ratios of the compressed vowel /a:/ and /i:/ signals of a normal patient without throat polyps and an abnormal patient with throat polyps. It can be seen that the frequency components of the two kinds of patients are different and that the low-frequency component of the vowel voice signals changes more obviously when the patient has throat polyps. In other words, the frequency components of a patient's voice vary when he or she suffers from throat polyps. Thus, the frequency component energy of vowel voice signals can be used as the feature for throat polyp detection.

Figure 3 shows the prediction results for throat polyp patients under different random measurement matrices based on the proposed algorithm (Algorithm 1). We can see that the correct rate of prediction is about 50% with small fluctuations. This indicates that the features used for testing and training were similar even though they were obtained under different measurement matrices. We attribute the low correct rate of prediction and the small fluctuations to the small number of training samples.

In the second experiment, we used different numbers of samples of the original vowel /a:/ and /i:/ signals of the 26 patients to construct the features with the same measurement matrix. The compression ratio, the training data, and the test data were the same as in the first experiment. The results are shown in Figure 4, where each correct rate of prediction is the mean of ten predictions for the corresponding number of samples. It can be seen that the correct rate of prediction was about 50% for the different numbers of samples; the fluctuations were again caused by the small amount of training data, which could not support a high-accuracy prediction model. Meanwhile, the results demonstrated that our classifier is able to detect throat polyp patients with a small number of samples.

## 5 Conclusions

Big data refers to large, diverse, complex, longitudinal, and distributed data sets. Core technologies such as classification are needed to solve the problems of big data. Compressive sensing theory provides a new approach for big data classification which overcomes the limitation of Nyquist sampling theory and can sample and compress data simultaneously.

In this paper, we used compressive sensing theory to acquire compressed vowel voice signals for throat polyp detection. The frequency component energy ratios of the compressed data, obtained by the wavelet packet transform, were used as features. Then, the support vector machine algorithm was used to detect the existence of throat polyps. The experimental results showed that the prediction performance was stable, but the correct rate of prediction was low because of the small number of patient cases.

## References

1. Zhong Z, Chen Z, Liang Q, Xiao S: Throat polyps detection based on patient voices. *Lecture Notes in Electrical Engineering* 2012, 202: 531-539. 10.1007/978-1-4614-5803-6_54
2. Donoho DL: Compressed sensing. *IEEE Trans. Inf. Theory* 2006, 52(4):1289-1306.
3. Candès E: Compressive sampling. In *Proceedings of International Congress of Mathematicians*. Madrid; 2006:1433-1452.
4. Haupt J, Castro R, Nowak R, Fudge G, Yeh A: Compressive sampling for signal classification. *Fortieth Asilomar Conference on Signals, Systems and Computer* 2006, 10: 1430-1434.
5. Budhaditya S, Pham D-S, Lazarescu M, Venkatesh S: Effective Anomaly Detection in Sensor Networks Data Streams. In *2009 Ninth IEEE International Conference on Data Mining*. Miami; 2009:722-727.
6. Pham D-S, Venkatesh S, Lazarescu M, Budhaditya S: Anomaly detection in large-scale data stream networks. *Data Mining and Knowledge Discovery* 2012. doi:10.1007/s10618-012-0297-3
7. Ding Q, Kolaczyk ED: A compressed PCA subspace method for anomaly detection in high-dimensional data. *IEEE Transactions on Information Theory* 2013, 59(11):7419-7433.
8. Pham DS, Saha B, Lazarescu M, Venkatesh S: *Scalable Network-Wide Anomaly Detection Using Compressed Data*. Curtin University of Technology, Perth; 2009.
9. Wang W, Lu D, Zhou X, Zhang B, Mu J: On-line anomaly detection in big data based on compressive sensing. *Lecture Notes in Electrical Engineering* 2013, 212.
10. Liang Q: Situation understanding based on heterogeneous sensor networks and human-inspired favor weak fuzzy logic system. *IEEE Systems Journal* 2011, 5(2):156-163.
11. Liang Q: Biologically-inspired target recognition in radar sensor networks. *EURASIP Journal on Wireless Communications and Networking* 2010, 2010: 523435.
12. Liang Q, Cheng X, Samn S: NEW: network-enabled electronic warfare for target recognition. *IEEE Trans on Aerospace and Electronic Systems* 2010, 46(2):558-568.
13. Liang Q, Cheng X: Underwater acoustic sensor networks: target size detection and performance analysis. *Ad Hoc Networks Journal (Elsevier)* 2009, 7(4):803-808. 10.1016/j.adhoc.2008.07.008
14. Liang Q, Cheng X: KUPS: knowledge-based ubiquitous and persistent sensor networks for threat assessment. *IEEE Transactions on Aerospace and Electronic Systems* 2008, 44(3):1060-1069.
15. Liang Q: Automatic target recognition using waveform diversity in radar sensor networks. *Pattern Recognition Letters (Elsevier)* 2008, 29(2):377-381.
16. Davenport MA, Duarte MF, Wakin MB, Laska JN, Takhar D, Kelly K, Baraniuk R: The smashed filter for compressive classification and target recognition. In *Proceedings of the SPIE Computational Imaging V*. San Jose; 2007:1.
17. Xu L, Liang Q, Cheng X, Chen D: Compressive sensing in distributed radar sensor networks using pulse compression waveforms. *EURASIP Journal of Wireless Communications and Networking* 2013. doi:1186/1687-1499-2013-36
18. Xu L, Liang Q: Zero correlation zone sequence pair sets for MIMO radar. *IEEE Trans on Aerospace and Electronic Systems* 2012, 48(3):2100-2113.
19. Xu L, Liang Q: Orthogonal pulse compression codes for MIMO radar system. In *IEEE Globecom*. Miami; 2010.
20. Xu L, Liang Q: Waveform design and optimization in radar sensor network. In *IEEE Globecom*. Miami; 2010.
21. Liang Q, Mendel JM: Design interval type-2 fuzzy logic systems using SVD-QR method: rule reduction. *Int. J. Intell. Syst.* 2000, 15(10):939-957. 10.1002/1098-111X(200010)15:10<939::AID-INT3>3.0.CO;2-G
22. Wang W, Zhou X, Zhang B, Mu J: Anomaly detection in big data from UWB radars, Wiley Security and Communication Networks. doi:10.1002/sec.745,2003
23. Chang CC, Lin CJ: LIBSVM: a library for support vector machines. *ACM Transactions on Intelligent Systems and Technology* 2011, 2(27):1-27.
24. Suykens JAK, Vandewalle J: Least squares support vector machine classifiers. *Neural Processing Letters* 1999, 9: 293-300. 10.1023/A:1018628609742

## Acknowledgements

The authors would like to thank Professor Qilian Liang from the University of Texas at Arlington for providing the experimental data. This research was supported by the Tianjin Younger Natural Science Foundation (12JCQNJC00400), National Natural Science Foundation of China (61271411), and Tianjin Natural Science Foundation (13JCYBJC15800).

## Additional information

### Competing interests

The authors declare that they have no competing interests.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

### Cite this article

Wang, W., Chen, Z., Mu, J. *et al.* Throat polyp detection based on compressed big data of voice with support vector machine algorithm.
*EURASIP J. Adv. Signal Process.* **2014**, 1 (2014). https://doi.org/10.1186/1687-6180-2014-1

DOI: https://doi.org/10.1186/1687-6180-2014-1