Throat polyp detection based on compressed big data of voice with support vector machine algorithm

Abstract

Classification of large-scale data is a key problem in the big data domain. Compressive sensing theory enables the recovery of a sparse signal from a small set of linear random projections, which makes it possible to classify big data directly in the compressed domain without reconstructing the original signal. In this paper, we collected compressed vowel /a:/ and /i:/ voice signals using compressive sensing for throat polyp detection. A throat polyp prediction procedure based on the wavelet packet transform and the support vector machine algorithm was derived, and experiments with the proposed classification algorithm were carried out. The results showed that the correct rate of prediction was stable across different numbers of samples and different random measurement matrices.

1 Introduction

Throat polyps are small fleshy growths that form on the vocal cords. They are mainly caused by straining or overusing the voice, for example, in public speaking. Professional singers, sports and fitness coaches, and actors are all prone to developing throat polyps. The most common symptoms of throat polyps are a hoarse or deeper voice or a breathy-sounding voice similar to that caused by laryngitis.

The traditional tools for throat polyp diagnosis are the indirect laryngoscope, the video laryngoscope, and the stroboscopic light [1]. These methods require special instruments and depend on the experience of the pathologist, and the patients usually feel discomfort or pain. Because a change in the patient's voice is the most common symptom of throat polyps, it would be desirable to detect throat polyps from the patient's voice. In [1], Zhong et al. attempted this: two fuzzy classifiers and a Bayesian classifier were designed for throat polyp detection based on the patient vowel voices /a:/ and /i:/, and the experimental results showed that an interval type-2 fuzzy classifier performed best. In this paper, we use compressive sensing and the support vector machine (SVM) algorithm to detect throat polyps from the patient vowel voices /a:/ and /i:/ while reducing the burden of voice data collection and storage.

Compressive sensing (CS), also called compressed sampling, was proposed by Donoho and Candès in 2006 [2, 3]. The theory shows that a high-dimensional signal can be projected into a low-dimensional space with a random measurement matrix when the signal is sparse or compressible. The original signal can then be recovered from the low-dimensional measurements by solving an optimization problem. The provable success of CS for signal reconstruction suggests that the low-dimensional signal retains the main features of the original signal. Thus, the universality of CS theory can be leveraged in hypothesis testing problems to reduce the complexity of data computation [4].

Hypothesis testing in the compressed domain not only reduces the pressure on data storage and transmission but also avoids a large amount of computation. In [5, 6], Budhaditya et al. used compressed sensor network data for anomaly detection based on a spectral method and obtained satisfactory detection results from a residual analysis of the compressed data. In [7, 8], random projection in conjunction with principal component analysis (PCA) was implemented for anomaly detection in the compressed domain, and an application of this methodology to the detection of IP-level volume anomalies in computer network traffic suggested a high relevance to practical problems. In [9], an anomaly detection criterion based on the wavelet packet transform and statistical process control theory in the compressed domain was used for through-wall human detection. The experimental results showed that the proposed algorithm could effectively detect the presence of a human being from compressed signals.

Because of the advantages of compressed classification of big data based on compressive sensing compared with classification on the original data [10–15], a throat polyp detection algorithm based on compressive sensing and the support vector machine is proposed in this paper. The remainder of this paper is organized as follows. In Section 2, compressive sensing theory is introduced. The throat polyp detection procedure based on CS and SVM is derived in Section 3. Experimental results of throat polyp detection are shown in Section 4. Section 5 gives the conclusions and discussion.

2 Background on compressive sensing

Compressive sensing builds on a core tenet of signal processing and information theory: signals often contain some type of structure that enables intelligent representation and processing [16].

Suppose that an observer measures a signal $x$ that can be expressed as follows:

$$x = \Phi\theta,$$
(1)

where $\theta \in \mathbb{R}^N$ is the expansion coefficient vector under the orthonormal basis $\Phi$. If $\theta$ has only $K \ll N$ nonzero coefficients, we say that the signal $x$ is K-sparse.

The surprising result of CS is that a length-N signal that is K-sparse in some basis can be recovered exactly or approximately from a nonadaptive linear projection of the signal onto a random basis. In matrix notation, it can be described as follows [17–21]:

$$y = \Psi x,$$
(2)

where $y$ is an $M \times 1$ column vector and $\Psi$ is an $M \times N$ random matrix. The appeal of CS is that we only need to collect $M = O(K \log(N/K))$ random measurements to recover the signal $x$ by solving the following $\ell_0$-norm-constrained optimization problem:

$$\hat{\theta} = \arg\min \|\theta\|_0 \quad \text{s.t.} \quad y = \Psi\Phi\theta,$$
(3)

where the $\|\theta\|_0$ norm counts the number of nonzero components of $\theta$. However, solving Equation 3 is both numerically unstable and NP-hard. Instead of solving the $\ell_0$ minimization problem, nonadaptive CS theory seeks to solve the 'closest possible' tractable minimization problem, i.e., the $\ell_1$ minimization:

$$\hat{\theta} = \arg\min \|\theta\|_1 \quad \text{s.t.} \quad y = \Psi\Phi\theta.$$
(4)

Although M < N, the recovery of the signal x from the measurements y becomes possible and practical under the additional assumption of signal sparsity or compressibility. The provable success of CS for signal reconstruction indicates that the collected low-dimensional measurements retain the main features of the original signal. It therefore provides a novel procedure for hypothesis testing on big data that can be carried out in the compressed domain.
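The measurement model of Equation 2 and the recovery of a sparse coefficient vector can be sketched in a few lines. This is only an illustration under stated assumptions: the sparsity basis is taken to be the identity, the sizes N, M, and K are arbitrary, and the recovery uses orthogonal matching pursuit (OMP), a greedy method swapped in here for the $\ell_1$ program of Equation 4 to keep the example short. The function name `omp` and all parameter values are hypothetical and do not come from the paper.

```python
import numpy as np

def omp(A, y, max_atoms, tol=1e-10):
    """Greedy recovery of a sparse theta such that y is close to A @ theta."""
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(max_atoms):
        if np.linalg.norm(residual) < tol:
            break
        # Column of A most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.T @ residual)))
        if idx in support:
            break  # no new column found, stop
        support.append(idx)
        # Least-squares fit restricted to the selected columns.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    theta = np.zeros(A.shape[1])
    theta[support] = coef
    return theta

rng = np.random.default_rng(0)
N, M, K = 256, 128, 5                    # illustrative sizes (assumption)

# K-sparse coefficient vector theta (Equation 1 with Phi = identity).
theta_true = np.zeros(N)
nonzero = rng.choice(N, size=K, replace=False)
theta_true[nonzero] = rng.uniform(1.0, 2.0, size=K)
Phi = np.eye(N)
x = Phi @ theta_true

# Gaussian random measurement matrix and compressed measurements (Equation 2).
Psi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Psi @ x

# Allow a few extra atoms so an occasional wrong pick does not prevent recovery.
theta_hat = omp(Psi @ Phi, y, max_atoms=3 * K)
recovery_error = np.linalg.norm(theta_hat - theta_true)
print(f"M = {M} measurements, recovery error = {recovery_error:.2e}")
```

The point of the sketch is that y has only M entries while x has N, yet the sparse coefficients can still be recovered from y.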

3 Throat polyp detection procedure

The most common symptom of throat polyps, and typically the first to appear, is a general roughness or hoarseness of the voice, which may or may not be accompanied by a sore throat or a feeling of fullness in the throat. In other words, the frequency components of the same sound, such as a vowel, change when a person suffers from throat polyps. Therefore, we can detect throat polyps by analyzing the frequency components of the voice signal.

3.1 Acquire the frequency component by WPT

The Fourier transform (FT) is the conventional tool for signal frequency spectrum analysis; it is a global transform with low frequency resolution. Because the FT is poor at capturing small changes in the frequency spectrum, the wavelet packet transform (WPT) has become a widely used tool in the field of signal frequency analysis.

WPT is one extension of the wavelet transform (WT) which provides a complete level-by-level decomposition. It can enable the extraction of features from signals which combine stationary and nonstationary characteristics with an arbitrary time-frequency resolution [22].

In this paper, we extract features of the vowel voices /a:/ and /i:/ to detect throat polyps. According to the principle of WPT, the vowel voice signal $x(t)$ is decomposed to level $j$, and the node signals are reconstructed as $x_j^i(t)$. The signal can then be expressed as follows:

$$x(t) = \sum_{i=1}^{2^j} x_j^i(t).$$
(5)

The node signal energy $E_j^i$ is defined as

$$E_j^i = \int_{-\infty}^{\infty} \left|x_j^i(t)\right|^2 \, dt = \sum_{t} \left|x_j^i(t)\right|^2.$$
(6)

On the basis of WPT, Equation 6 illustrates that the node signal energy $E_j^i$ stores the energy of a specific time-frequency window. In other words, $E_j^i$ indicates the proportion of the corresponding frequency component in the original signal. Thus, throat polyp detection can be achieved by investigating how $E_j^i$ changes.

In order to eliminate the influence of volume on throat polyp detection, we further define $\Delta E_j^i$ as the ratio of the node signal energy to the total signal energy:

$$\Delta E_j^i = \frac{E_j^i}{\sum_{k=1}^{m} E_j^k},$$
(7)

where $m$ denotes the number of leading dominant nodes that contain the main energy of the signal. Restricting the sum to these nodes suppresses the effect of noise on low-energy nodes and improves the detection accuracy.
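Equations 6 and 7 can be illustrated with a minimal sketch. It makes several assumptions that are not in the paper: a hand-written Haar wavelet packet tree stands in for the 'db10' wavelet, node energies are computed from the packet coefficients (for an orthonormal transform these sum to the signal energy), nodes are kept in natural tree order, and the first m nodes are used for the ratio. The function names and the synthetic test signal are hypothetical.

```python
import numpy as np

def haar_wpt_energies(x, level):
    """Node energies E_j^i of a full Haar wavelet packet tree at depth `level`."""
    x = np.asarray(x, dtype=float)
    if x.size % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    nodes = [x]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / np.sqrt(2))  # low-pass branch
            next_nodes.append((even - odd) / np.sqrt(2))  # high-pass branch
        nodes = next_nodes
    return np.array([np.sum(node ** 2) for node in nodes])

def energy_ratios(energies, m):
    """Energy ratio of Equation 7 over the first m nodes."""
    first = energies[:m]
    return first / first.sum()

# Synthetic stand-in for a vowel recording (not patient data).
rng = np.random.default_rng(0)
t = np.arange(1024)
x = (np.sin(2 * np.pi * 0.01 * t)
     + 0.3 * np.sin(2 * np.pi * 0.2 * t)
     + 0.01 * rng.standard_normal(t.size))

energies = haar_wpt_energies(x, level=4)      # 2**4 = 16 nodes
ratios = energy_ratios(energies, m=8)

# Tripling the amplitude (a louder voice) leaves the ratios unchanged.
ratios_loud = energy_ratios(haar_wpt_energies(3.0 * x, level=4), m=8)
print(np.round(ratios, 4))
```

Because every node energy scales by the same factor when the signal amplitude changes, the ratios are insensitive to volume, which is the motivation given for Equation 7.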

3.2 Throat polyp detection procedure with SVM

Support vector machines (SVMs) are a popular machine learning method for classification, regression, and other learning tasks. In this method, the data are mapped into a higher dimensional feature space, and an optimal separating hyperplane is constructed in this space [23, 24].

Consider a training set of N data points $(y_i, x_i)$, $i = 1, 2, \ldots, N$, where $x_i \in \mathbb{R}^n$ and $y_i \in \{1, -1\}$. The classifier is constructed as follows. One assumes that

$$\omega^T \varphi(x_i) + b \ge 1, \quad \text{if } y_i = 1,$$
$$\omega^T \varphi(x_i) + b \le -1, \quad \text{if } y_i = -1,$$
(8)

which is equivalent to

$$y_i\left[\omega^T \varphi(x_i) + b\right] \ge 1, \quad i = 1, 2, \ldots, N,$$
(9)

where $\varphi$ is a nonlinear function that maps the input space into a higher dimensional space. However, the function $\varphi$ is not explicitly constructed. In order to obtain the separating hyperplane in the higher dimensional space, slack variables $\xi_i$ are introduced, and the following primal optimization problem is solved:

$$\min_{\omega, b, \xi} \; \frac{1}{2}\omega^T\omega + C\sum_{i=1}^{N}\xi_i$$
$$\text{subject to} \quad y_i\left[\omega^T \varphi(x_i) + b\right] \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, N.$$
(10)

By training on the data, we obtain the support vectors and kernel parameters of the model used for prediction.
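To make the role of the slack variables in Equation 10 concrete, the sketch below evaluates the primal objective for a fixed hyperplane on a toy data set. It assumes a linear feature map, $\varphi(x) = x$, purely for illustration; the data, the hyperplane, and the function name `primal_objective` are made up, and no training is performed here.

```python
import numpy as np

def primal_objective(w, b, X, y, C):
    """Value of the C-SVM primal objective with the smallest feasible slacks."""
    margins = y * (X @ w + b)
    # Smallest xi_i satisfying margins_i >= 1 - xi_i and xi_i >= 0.
    slack = np.maximum(0.0, 1.0 - margins)
    return 0.5 * float(w @ w) + C * float(slack.sum()), slack

# Toy two-class data; the last point lies on the wrong side of the hyperplane.
X = np.array([[2.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1, 1, -1, -1, -1])
w = np.array([1.0, 0.0])
b = 0.0

objective, slack = primal_objective(w, b, X, y, C=1.0)
print(slack)       # only the misclassified point needs a nonzero slack
print(objective)
```

Points that satisfy Equation 9 get zero slack, while the misclassified point contributes a penalty weighted by C, so a larger C makes training errors more expensive relative to the margin term.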

Continuous speech usually consists of vowels and consonants. The vocal cords do not vibrate when producing unvoiced consonants, which are articulated with the lips and teeth. Collecting such signals is of no use here and would interfere with the subsequent steps. Meanwhile, multiple vowels in one speech sample result in aliasing in the spectrum of the sample. Therefore, in this paper, we only use the vowels /a:/ and /i:/ to detect throat polyps.

In this paper, we acquire the vowel voice signals by compressive sensing and extract features constructed from the frequency components of the compressed signals. The SVM method is then used to obtain the classification model for throat polyp detection. The procedure is depicted in Algorithm 1.

4 Experimental results and analysis

In the experiments, vowel /a:/ and /i:/ voice signals of 26 patients were collected, of whom 13 had throat polyps and 13 did not. A Gaussian random measurement matrix was used to obtain the compressed data. The compressed voice signals were decomposed by an eight-level wavelet packet transform with the 'db10' wavelet, and the first 20 node signals were used to extract the features. The C-SVM implementation in LIBSVM by Chang and Lin [23] was used to set up the classification model and to predict throat polyps.

In the first experiment, we used 7,000 samples of the original vowel /a:/ and /i:/ signals of each of the 26 patients to construct the features. The compression ratio was 50%. The features of eight normal patients without throat polyps and eight abnormal patients with throat polyps were used for training to establish a classification model. The features of the remaining ten patients were used to test the performance of the proposed algorithm (Algorithm 1).
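The compression step of this experiment can be sketched as follows. The sketch assumes that a compression ratio of 50% means M = N/2 measurements, scales the Gaussian matrix by $1/\sqrt{M}$, and builds the matrix in row blocks so that the full 3,500 × 7,000 matrix is never held in memory at once. The function name `compress`, the block size, the seed, and the random stand-in signal are hypothetical.

```python
import numpy as np

def compress(x, ratio, rng, block_rows=500):
    """Project x onto round(ratio * len(x)) Gaussian random measurements."""
    x = np.asarray(x, dtype=float)
    n = x.size
    m = int(round(ratio * n))
    y = np.empty(m)
    for start in range(0, m, block_rows):
        stop = min(start + block_rows, m)
        # One block of rows of the Gaussian measurement matrix.
        block = rng.standard_normal((stop - start, n))
        y[start:stop] = block @ x
    return y / np.sqrt(m)

# Random stand-in for 7,000 samples of one vowel recording (not patient data).
x = np.random.default_rng(0).standard_normal(7000)

# Re-seeding with the same value reproduces the same measurement matrix,
# so every subject can be compressed with an identical matrix.
y = compress(x, ratio=0.5, rng=np.random.default_rng(1))
y_again = compress(x, ratio=0.5, rng=np.random.default_rng(1))

energy_ratio = np.sum(y ** 2) / np.sum(x ** 2)
print(y.shape, round(float(energy_ratio), 3))
```

With this scaling the energy of the compressed signal stays close to that of the original, which is consistent with the idea that the measurements retain the main features of the signal.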

Figures 1 and 2 show the node energy ratios of the compressed vowel /a:/ and /i:/ signals of a normal patient without throat polyps and an abnormal patient with throat polyps. It can be seen that the frequency components of the two kinds of patients differ and that the low-frequency components of the vowel voice signals change more obviously when the patient has throat polyps. In other words, the frequency components of a patient's voice vary when he or she suffers from throat polyps. Thus, the frequency component energy of vowel voice signals can be used as features for throat polyp detection.

Figure 1. Node energy ratio of the compressed vowel /a:/ voice signal for a normal and an abnormal patient.

Figure 2. Node energy ratio of the compressed vowel /i:/ voice signal for a normal and an abnormal patient.

Figure 3 shows the prediction results for throat polyp patients under different random measurement matrices based on the proposed algorithm (Algorithm 1). The correct rate of prediction is about 50% with small fluctuations. This indicates that the features used for testing and training were similar although they were obtained under different measurement matrices. We attribute the low correct rate of prediction and the small fluctuations to the small number of training samples.

Figure 3. Correct rate of throat polyp prediction under different random measurement matrices.

In the second experiment, we used different numbers of samples of the original vowel /a:/ and /i:/ signals of the 26 patients to construct the features with the same measurement matrix. The compression ratio, the training data, and the test data were the same as in the first experiment. The results are shown in Figure 4, where each correct rate is the mean over ten predictions for the corresponding number of samples. It can be seen that the correct rate of prediction was about 50% for the different numbers of samples; the fluctuations were again caused by the small amount of training data, which was not sufficient to construct a high-accuracy prediction model. Meanwhile, the results demonstrated that our classifier is able to detect throat polyp patients with a small number of samples.
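The correct rate and its mean over ten predictions can be computed as in the sketch below. The prediction vectors are invented placeholders built by flipping a chosen number of labels; they are not the paper's results. The sketch assumes five normal and five abnormal test subjects, which follows from 13 subjects per class with eight per class used for training.

```python
import numpy as np

def correct_rate(y_true, y_pred):
    """Fraction of test subjects whose predicted label matches the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# Ten held-out subjects: five with polyps (+1) and five without (-1).
y_test = np.array([1] * 5 + [-1] * 5)

# Placeholder predictions for ten runs: flip the first k labels in each run.
errors_per_run = [3, 5, 4, 6, 5, 4, 5, 6, 4, 5]   # made-up values
rates = []
for k in errors_per_run:
    y_pred = y_test.copy()
    y_pred[:k] *= -1
    rates.append(correct_rate(y_test, y_pred))

mean_rate = float(np.mean(rates))
print(rates, mean_rate)
```

With only ten test subjects, each misclassified subject moves the correct rate of a single run by 10 percentage points, so averaging over repeated predictions smooths the reported curve.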

Figure 4. Correct rate of throat polyp prediction under different numbers of samples.

5 Conclusions

Big data refers to large, diverse, complex, longitudinal, and distributed data sets. Core technologies such as classification are needed to solve the problems that big data poses. Compressive sensing theory provides a new approach to big data classification that overcomes the limitation of Nyquist sampling theory and samples and compresses data simultaneously.

In this paper, we used compressive sensing theory to acquire compressed vowel voice signals for throat polyp detection. The frequency component energy ratios of the compressed data, obtained by the wavelet packet transform, were used as features. The support vector machine algorithm was then used to detect the existence of throat polyps. The experimental results showed that the prediction performance was stable, but the correct rate of prediction was low because of the small number of patient cases.

References

  1. Zhong Z, Chen Z, Liang Q, Xiao S: Throat polyps detection based on patient voices. Lecture Notes in Electrical Engineering 2012, 202: 531-539. 10.1007/978-1-4614-5803-6_54

  2. Donoho DL: Compressed sensing. IEEE Trans. Inf. Theory 2006, 52(4):1289-1306.

  3. Candès E: Compressive sampling. In Proceedings of International Congress of Mathematicians. Madrid; 2006:1433-1452.

  4. Haupt J, Castro R, Nowak R, Fudge G, Yeh A: Compressive sampling for signal classification. Fortieth Asilomar Conference on Signals, Systems and Computer 2006, 10: 1430-1434.

  5. Budhaditya S, Pham D-S, Lazarescu M, Venkatesh S: Effective Anomaly Detection in Sensor Networks Data Streams. In 2009 Ninth IEEE International Conference on Data Mining. Miami; 2009:722-727.

  6. Pham D-S, Venkatesh S, Lazarescu M, Budhaditya S: Anomaly detection in large-scale data stream networks. Data Mining and Knowledge Discovery 2012. doi:10.1007/s10618-012-0297-3

  7. Ding Q, Kolaczyk ED: A compressed PCA subspace method for anomaly detection in high-dimensional data. IEEE Transactions on Information Theory 2013, 59(11):7419-7433.

  8. Pham DS, Saha B, Lazarescu M, Venkatesh S: Scalable Network-Wide Anomaly Detection Using Compressed Data. Curtin University of Technology, Perth; 2009.

  9. Wang W, Lu D, Zhou X, Zhang B, Mu J: On-line anomaly detection in big data based on compressive sensing. Lecture Notes in Electrical Engineering 2013, 212.

  10. Liang Q: Situation understanding based on heterogeneous sensor networks and human-inspired favor weak fuzzy logic system. IEEE Systems Journal 2011, 5(2):156-163.

  11. Liang Q: Biologically-inspired target recognition in radar sensor networks. EURASIP Journal on Wireless Communications and Networking 2010, 2010: 523435.

  12. Liang Q, Cheng X, Samn S: NEW: network-enabled electronic warfare for target recognition. IEEE Trans on Aerospace and Electronic Systems 2010, 46(2):558-568.

  13. Liang Q, Cheng X: Underwater acoustic sensor networks: target size detection and performance analysis. Ad Hoc Networks Journal (Elsevier) 2009, 7(4):803-808. 10.1016/j.adhoc.2008.07.008

  14. Liang Q, Cheng X: KUPS: knowledge-based ubiquitous and persistent sensor networks for threat assessment. IEEE Transactions on Aerospace and Electronic Systems 2008, 44(3):1060-1069.

  15. Liang Q: Automatic target recognition using waveform diversity in radar sensor networks. Pattern Recognition Letters (Elsevier) 2008, 29(2):377-381.

  16. Davenport MA, Duarte MF, Wakin MB, Laska JN, Takhar D, Kelly K, Baraniuk R: The smashed filter for compressive classification and target recognition. In Proceedings of the SPIE Computational Imaging V. San Jose; 2007:1.

  17. Xu L, Liang Q, Cheng X, Chen D: Compressive sensing in distributed radar sensor networks using pulse compression waveforms. EURASIP Journal on Wireless Communications and Networking 2013. doi:10.1186/1687-1499-2013-36

  18. Xu L, Liang Q: Zero correlation zone sequence pair sets for MIMO radar. IEEE Trans on Aerospace and Electronic Systems 2012, 48(3):2100-2113.

  19. Xu L, Liang Q: Orthogonal pulse compression codes for MIMO radar system. In IEEE Globecom. Miami; 2010.

  20. Xu L, Liang Q: Waveform design and optimization in radar sensor network. In IEEE Globecom. Miami; 2010.

  21. Liang Q, Mendel JM: Design interval type-2 fuzzy logic systems using SVD-QR method: rule reduction. Int. J. Intell. Syst. 2000, 15(10):939-957. 10.1002/1098-111X(200010)15:10<939::AID-INT3>3.0.CO;2-G

  22. Wang W, Zhou X, Zhang B, Mu J: Anomaly detection in big data from UWB radars, Wiley Security and Communication Networks. doi:10.1002/sec.745,2003

  23. Chang CC, Lin CJ: LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2011, 2(3):27.

  24. Suykens JAK, Vandewalle J: Least squares support vector machine classifiers. Neural Processing Letters 1999, 9: 293-300. 10.1023/A:1018628609742

Acknowledgements

The authors would like to thank Professor Qilian Liang from the University of Texas at Arlington for providing the experimental data. This research was supported by the Tianjin Younger Natural Science Foundation (12JCQNJC00400), National Natural Science Foundation of China (61271411), and Tianjin Natural Science Foundation (13JCYBJC15800).

Author information

Corresponding author

Correspondence to Wei Wang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Wang, W., Chen, Z., Mu, J. et al. Throat polyp detection based on compressed big data of voice with support vector machine algorithm. EURASIP J. Adv. Signal Process. 2014, 1 (2014). https://doi.org/10.1186/1687-6180-2014-1

  • DOI: https://doi.org/10.1186/1687-6180-2014-1
