Time-Frequency Analysis and Hermite Projection Method Applied to Swallowing Accelerometry Signals
© Irena Orović et al. 2010
Received: 22 December 2009
Accepted: 22 March 2010
Published: 3 May 2010
Fast Hermite projections have often been used in image-processing procedures such as image database retrieval, projection filtering, and texture analysis. In this paper, we propose an innovative approach for the analysis of one-dimensional biomedical signals that combines the Hermite projection method with time-frequency analysis. In particular, we propose a two-step approach to characterize vibrations of various origins in swallowing accelerometry signals. First, by using time-frequency analysis, we obtain the energy distribution of the signal frequency content in time. Second, by using fast Hermite projections, we characterize whether the analyzed time-frequency regions are associated with swallowing or other phenomena (vocalization, noise, bursts, etc.). The numerical analysis of the proposed scheme clearly shows that, by using a few Hermite functions, vibrations of various origins are distinguishable. These results will be the basis for further analysis of swallowing accelerometry to detect swallowing difficulties.
1. Introduction
Time-frequency analysis is of great interest when time- or frequency-based techniques provide insufficient information for signal analysis [1–3]. Time-frequency representations (TFRs) depict variations of the spectral characteristics of signals as a function of time, which makes them ideally suited for nonstationary signals [1–10], especially nonstationary biomedical signals [4, 5].
Many biomedical signals (e.g., heart sounds, swallowing accelerometry signals) are multicomponent, one-dimensional signals. The time-frequency analysis of these signals provides a two-dimensional representation of the signals' components, which is appropriate for diagnostic analyses. The resolution, that is, the quality of a representation, depends on the specific time-frequency distribution. Due to the multicomponent nature of the considered biomedical signals, distributions with reduced cross-terms are recommended.
In this paper, we propose an approach for the analysis of two-dimensional time-frequency regions by using a fast Hermite projection method [11–13]. Based on this idea, the reconstruction of different time-frequency regions would require a different number of Hermite functions depending on the component complexity. Hence, we adapt the fast Hermite projection method to characterize time-frequency regions of different origins: by using a small fixed number of Hermite functions, each region is reconstructed with a certain error. A numerical analysis of these errors has shown that vibrations associated with swallowing have distinctive mean square errors compared to the errors obtained for the vibrations associated with other phenomena (e.g., vocalization or noise). Note that, in this application, the spectrogram-based reconstructions produce satisfactory results.
The paper is organized as follows. Section 2 describes swallowing accelerometry and outlines the advantages of this approach for detecting swallowing difficulties. In Section 3, we describe the theory behind the time-frequency analysis and depict a few illustrative examples on how various time-frequency distributions can be used for the representation of swallowing accelerometry signals. Section 4 describes the fast Hermite projections and their application for characterization of vibrations within swallowing accelerometry recordings. In Section 5, we describe the results of the application of the proposed approach to swallowing accelerometry signals.
2. Swallowing Accelerometry
Deglutition, or swallowing, is a well-defined, complex process of transporting food or liquid from the mouth to the stomach. Swallowing consists of four distinct phases: oral preparatory, oral, pharyngeal, and esophageal. Patients suffering from dysphagia (swallowing difficulty) usually deviate from the well-defined pattern of healthy swallowing. Dysphagia is a common problem encountered in the rehabilitation of stroke patients, head-injured patients, and other patients with paralyzing neurological diseases. In these patients, dysphagia can develop from lesions in certain areas of the cortex and brain stem, which control the swallowing mechanism, or from damage to the cranial nerves associated with swallowing function [16, 17]. Dysphagic patients are likely to aspirate. Aspiration is defined as the entry of material into the airway below the true vocal folds. Aspiration may have dire consequences including malnutrition and dehydration, degradation in psychosocial well-being, aspiration pneumonia, and even death [14, 18, 19].
Today's dysphagia management relies heavily on the videofluoroscopic swallowing study (VFSS). Even though VFSS is accepted as the gold standard, it requires expensive X-ray equipment as well as expertise from speech-language pathologists and radiologists. Hence, only a limited number of institutions can offer VFSS. As a result, VFSS has been associated with long waiting lists. Also, day-to-day monitoring of dysphagia is crucial because the severity of dysphagia can change over time. VFSS is clearly not suitable for such day-to-day monitoring.
Cervical auscultation is a promising noninvasive tool for the assessment of swallowing disorders. It is being adopted by dysphagia clinicians as a component of the clinical evaluation of swallowing. The cervical auscultation approach involves the examination of swallowing signals acquired via a stethoscope or other acoustic and/or vibration sensors during deglutition. One such approach is swallowing accelerometry [6, 23], which employs an accelerometer placed on the neck as the sensor during cervical auscultation in order to monitor vibrations associated with swallowing activities. Swallowing accelerometry has been used to detect aspiration in several studies, which have described a shared pattern among healthy swallow signals and verified that this pattern is either absent or delayed in dysphagic swallow signals (e.g., [16, 17, 24]).
Nevertheless, the presence of various vibrations not associated with swallowing can severely contaminate swallowing accelerometry signals (e.g., [6, 23]). In particular, vocalizations, whether voluntary or involuntary, can have an adverse effect on these signals. Their presence masks the observed swallowing signals and renders the recordings less useful for further analysis. For example, automatic demarcation of swallowing signals becomes impossible in the presence of vocalizations because of their strong amplitudes. In order to develop an approach for the removal of these confounding vibrations, a complete understanding of cervical accelerometry is needed.
The sample data considered in this paper were gathered over a three-month period from a public science centre in Toronto, Ontario, Canada. A dual-axis accelerometer (ADXL322, Analog Devices) was attached to the participant's neck (anterior to the cricoid cartilage) using double-sided tape in order to monitor vibrations associated with swallowing. The axes of acceleration were aligned to the anterior-posterior and superior-inferior directions.
All participants provided written consent. The study protocol was approved by the research ethics boards of the Toronto Rehabilitation Institute and Bloorview Kids Rehab, both located in Toronto, Ontario, Canada. Participants had no documented swallowing disorders and passed an oral mechanism exam prior to participation.
Data were band-pass filtered in hardware with a pass band of 0.1–3000 Hz and sampled at 10 kHz using a custom LabVIEW program running on a laptop computer. Data were saved for subsequent offline analysis.
During data collection, participants were cued to perform three types of swallows involving saliva and water swallows. The entire data collection session lasted 15 minutes per participant. The participants were instructed not to vocalize. Nonetheless, approximately one quarter of all recordings contained either voluntary or involuntary vocalizations.
3. Time-Frequency Analysis of Swallowing Accelerometry Signals
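The spectrogram is defined as the squared magnitude of the short-time Fourier transform (STFT):

$$SPEC(t,\omega) = \left|STFT(t,\omega)\right|^2 = \left|\int_{-\infty}^{\infty} x(t+\tau)\, w(\tau)\, e^{-j\omega\tau}\, d\tau\right|^2,$$

where $x(t)$ is a signal, and $w(\tau)$ is a window function. The spectrogram is the energetic version of the STFT (which is linear). The spectrogram is suitable for the representation of multicomponent signals, since it does not produce cross-terms.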
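As an illustration of the spectrogram discussed above (the squared magnitude of the STFT), the following minimal sketch computes it for a toy two-tone signal. It is written in plain Python for self-containment; the Hann window, the window length of 64 samples, the hop of 32 samples, and the 1 kHz toy sampling rate are assumptions chosen for the example and are not the settings used for the accelerometry recordings.

```python
import cmath
import math


def hann(n_points):
    """Symmetric Hann window (an assumed window choice for this sketch)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def spectrogram(x, window, hop):
    """Squared magnitude of the STFT.

    Returns a list of rows; each row is the energy spectrum of one windowed
    frame over the non-negative frequency bins (the input is real-valued).
    """
    n_win = len(window)
    n_bins = n_win // 2 + 1
    frames = []
    for start in range(0, len(x) - n_win + 1, hop):
        seg = [x[start + n] * window[n] for n in range(n_win)]
        row = []
        for k in range(n_bins):
            # Naive DFT of the windowed segment at frequency bin k.
            s = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / n_win)
                    for n in range(n_win))
            row.append(abs(s) ** 2)
        frames.append(row)
    return frames


# Toy nonstationary signal: a 50 Hz tone followed by a 200 Hz tone.
fs = 1000  # assumed sampling rate of the toy signal, in Hz
x = ([math.sin(2 * math.pi * 50 * n / fs) for n in range(256)]
     + [math.sin(2 * math.pi * 200 * n / fs) for n in range(256)])

S = spectrogram(x, hann(64), hop=32)
first_peak = max(range(len(S[0])), key=lambda k: S[0][k])
last_peak = max(range(len(S[-1])), key=lambda k: S[-1][k])
print("peak bin in first frame:", first_peak)
print("peak bin in last frame:", last_peak)
```

The dominant frequency bin moves from the first tone to the second as the frames advance, which is the time-varying spectral content that a time-frequency representation is meant to expose. In practice, an FFT-based routine would replace the naive DFT used here.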
4. Hermite Projection of Time-Frequency Regions
Two-dimensional regions in the time-frequency domain, as shown in Figure 1, can be considered to have specific "structures". The complexity of these "structures" depends on the nature of vibrations while their refinement is related to the resolution achieved by the time-frequency distribution. Thus, the goal is to analyze these vibrations in the time-frequency domain in order to gain further insights into their complexities. To attain this goal, the fast Hermite projection method is adapted and applied to time-frequency regions.
4.1. Fast Hermite Projection Method
The fast Hermite projection method has been used in various image-processing applications, such as image database retrieval, projection filtering, and texture analysis [11–13]. It has been shown that, in comparison with the trigonometric functions, the expansion into Hermite functions provides better computational localization in both the signal domain and the transform domain [11, 25]. In addition, the expansion into Hermite functions allows for simultaneous analysis of the signal and its Fourier transform, since the Hermite functions are the eigenfunctions of the Fourier transform.
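For readers unfamiliar with the Hermite functions referred to above, the sketch below generates the first few of them with the standard three-term recurrence and checks numerically that they form an orthonormal set, which is what makes a series expansion into them well defined. The function names, the integration interval, and the step size are choices made for this illustration only.

```python
import math


def hermite_functions(n_funcs, x):
    """Return [psi_0(x), ..., psi_{n_funcs-1}(x)] (orthonormal Hermite functions).

    Uses the recurrence
        psi_n(x) = sqrt(2/n) * x * psi_{n-1}(x) - sqrt((n-1)/n) * psi_{n-2}(x),
    starting from psi_0(x) = pi^(-1/4) * exp(-x^2 / 2).
    """
    psi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n_funcs > 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for n in range(2, n_funcs):
        psi.append(math.sqrt(2.0 / n) * x * psi[n - 1]
                   - math.sqrt((n - 1) / n) * psi[n - 2])
    return psi[:n_funcs]


def gram_matrix(n_funcs, half_width=10.0, step=0.01):
    """Approximate the inner products <psi_m, psi_n> with a Riemann sum."""
    n_pts = int(round(2 * half_width / step)) + 1
    gram = [[0.0] * n_funcs for _ in range(n_funcs)]
    for i in range(n_pts):
        x = -half_width + i * step
        psi = hermite_functions(n_funcs, x)
        for m in range(n_funcs):
            for n in range(n_funcs):
                gram[m][n] += psi[m] * psi[n] * step
    return gram


gram = gram_matrix(5)
for row in gram:
    print(["%.6f" % v for v in row])
```

The printed matrix should be numerically close to the identity matrix, reflecting the orthonormality of the Hermite functions.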
4.2. Hermite Projection Method Applied to Time-Frequency Regions Associated with Various Vibrations
To understand differences between swallowing vibrations and other vibrations (speech, laughter, cough, etc.) we observe the structures in the time-frequency regions.
The time-frequency regions for classification are extracted from the time-frequency representation. The length of the intervals containing vibrations is set to a constant value so that all regions have the same size. In the next step, the fast Hermite projection method is applied to these regions.
The reconstruction error of each region is quantified by the mean square error (MSE):

$$MSE = \frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N}\left(D(m,n) - D_r(m,n)\right)^2, \quad (16)$$

where $D$ denotes the original region from the time-frequency representation, $D_r$ is the reconstructed region, and $M$ and $N$ are the region dimensions. Note that the dimensions $M$ and $N$ in (16) should be the same for all regions. Their values are derived empirically so that any region includes most of the vibration components.
It has been experimentally shown that, in comparison to the regions containing swallowing vibrations, the vocalization regions (speech, cough, laughter, etc.) are more complex and thus have higher MSE values; the difference between the MSE values for vocalization and swallowing vibrations is significant. Likewise, the MSE for regions with swallowing vibrations is much higher than for regions containing noise. Consequently, the difference between the original and reconstructed regions, given in terms of MSE, can be used as a parameter to characterize a region (e.g., swallowing, noise, or vocalizations).
As a remark, it should be noted that even though it is labelled as a projection method, the Hermite projection method actually expands a signal into a finite series of Hermite functions (multiplied by Hermite coefficients). In this paper, we apply the Hermite projection method to time-frequency regions by expanding each row (i.e., the instantaneous spectrum) into a series of Hermite functions.
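The row-wise expansion and the resulting reconstruction error can be sketched as follows. This is a minimal illustration under stated assumptions, not the fast Hermite projection method of [11]: the coefficients are obtained with a plain Riemann-sum inner product, the frequency bins of each row are mapped onto the interval [-6, 6] (an arbitrary scaling chosen here), ten Hermite functions are used, and both test regions are synthetic.

```python
import math


def hermite_functions(n_funcs, x):
    """Orthonormal Hermite functions psi_0..psi_{n_funcs-1} evaluated at x."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n_funcs > 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for n in range(2, n_funcs):
        psi.append(math.sqrt(2.0 / n) * x * psi[n - 1]
                   - math.sqrt((n - 1) / n) * psi[n - 2])
    return psi[:n_funcs]


def reconstruct_row(row, n_funcs, half_width):
    """Expand one row into n_funcs Hermite functions and rebuild it."""
    n = len(row)
    step = 2.0 * half_width / (n - 1)
    grid = [-half_width + i * step for i in range(n)]
    basis = [hermite_functions(n_funcs, x) for x in grid]  # basis[i][k] = psi_k(x_i)
    # Hermite coefficients approximated by a Riemann sum of row * psi_k.
    coeffs = [sum(row[i] * basis[i][k] for i in range(n)) * step
              for k in range(n_funcs)]
    return [sum(coeffs[k] * basis[i][k] for k in range(n_funcs))
            for i in range(n)]


def region_mse(region, n_funcs=10, half_width=6.0):
    """Mean square error between a region and its row-wise reconstruction."""
    total, count = 0.0, 0
    for row in region:
        rec = reconstruct_row(row, n_funcs, half_width)
        total += sum((a - b) ** 2 for a, b in zip(row, rec))
        count += len(row)
    return total / count


# Two synthetic 8 x 64 regions (rows = time instants, columns = frequency bins).
grid = [-6.0 + 12.0 * i / 63 for i in range(64)]
smooth_region = [[(1.0 + 0.1 * t) * math.exp(-0.5 * x * x) for x in grid]
                 for t in range(8)]
peaky_region = [[sum(math.exp(-((x - c) / 0.3) ** 2)
                     for c in (-3.0, -1.0, 1.5, 3.5)) for x in grid]
                for t in range(8)]

mse_smooth = region_mse(smooth_region)
mse_peaky = region_mse(peaky_region)
print("MSE, smooth region:", mse_smooth)
print("MSE, region with several narrow peaks:", mse_peaky)
```

With the same small number of Hermite functions, the structurally simple region is reconstructed almost exactly, whereas the region with several narrow components leaves a larger residual. This is the behaviour exploited when the MSE is used to characterize a region.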
5. Experimental Results
MSEs for the selected regions.
The regions with noise and regions with bursts have the smallest MSE, since they do not contain any significant structures. On the other hand, the regions containing vibrations associated with speech are very complex for reconstruction, and thus we observed the highest MSEs in these regions. Finally, the regions with swallowing vibrations have MSEs which are significantly smaller than those of vocalization regions and significantly higher than the MSEs for noisy regions.
It has been empirically shown that MSEs significantly differ between the considered vibration structures due to their different complexities. In other words, swallowing vibrations can be clearly differentiated from the vibrations associated with vocalizations and noise.
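One way the MSE ordering described above (noise below swallowing, swallowing below vocalization) could be turned into a decision rule is sketched below. The midpoint-threshold rule, the function names, and all numeric values are hypothetical and made up for illustration; they are not the thresholds or the MSE values obtained in this study.

```python
def midpoint_thresholds(noise_mse, swallow_mse, vocal_mse):
    """Place one threshold halfway between each pair of adjacent classes.

    Assumes the classes are separated, i.e. every noise MSE is below every
    swallowing MSE, which in turn is below every vocalization MSE.
    """
    low = (max(noise_mse) + min(swallow_mse)) / 2.0
    high = (max(swallow_mse) + min(vocal_mse)) / 2.0
    return low, high


def label_region(mse, low, high):
    """Assign a label to a region from its reconstruction MSE."""
    if mse < low:
        return "noise"
    if mse < high:
        return "swallowing"
    return "vocalization"


# Made-up MSE values for labelled example regions (illustration only).
low, high = midpoint_thresholds(
    noise_mse=[0.1, 0.2],
    swallow_mse=[1.0, 1.5],
    vocal_mse=[4.0, 6.0],
)

for value in (0.15, 1.2, 5.0):
    print(value, "->", label_region(value, low, high))
```

Any such rule would need to be calibrated and validated on labelled recordings before being used in practice.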
One of the advantages of the proposed approach is its simplicity. Specifically, it does not require pattern extraction or matching, and it is simpler than other texture analysis-based algorithms. For example, texture analysis has been used for time-frequency-based audio classification, where the spectrogram of musical sounds was composed of specific repetitive patterns and was treated as a visual texture. The feature extraction was performed by using time-frequency blocks, where various block sizes were considered. Then, a block matching algorithm was used for each block to capture time-frequency structures at different orientations and scales. In comparison, our approach does not require preprocessing steps.
Additionally, even if we disregard a higher computational complexity, it is difficult to use texture analysis for the characterization of time-frequency regions associated with the considered vibrations (cough, laughter, speech, swallowing, etc.), since they do not follow typical image textures. Textures are usually defined as a regular repetition of an element or pattern on a surface [26, 27]. This is not the case with the considered time-frequency structures (e.g., Figure 3). Particularly, the swallowing sounds have a noise-like nature without edges and repetitive elements. Moreover, even for the regions involving coughing or laughing, it is difficult to define 2D patterns and rules that properly describe their directionality and periodicity.
6. Conclusion
In this paper, we proposed a procedure involving the Hermite projection of time-frequency regions as a way to characterize vibrations of different origins in swallowing accelerometry signals. In particular, the goal of this procedure was to detect regions associated with swallowing and vocalization vibrations (speech, cough, and laughter). The procedure involved two steps. First, we performed the time-frequency analysis of a signal yielding two-dimensional time-frequency regions containing different features depending on the nature of vibrations. Second, the fast Hermite projection method using a small number of Hermite functions was implemented for the time-frequency region reconstruction. Using the proposed approach, we successfully detected regions associated with swallowing and vocalization by observing the MSE between the original and reconstructed regions. More precisely, the regions containing swallowing vibrations were reconstructed with significantly lower MSE in comparison to the regions containing vocalization vibrations. Furthermore, the MSE for the swallowing vibrations was sufficiently high to differentiate them from noise or bursts resulting from the signal recording equipment.
Acknowledgments
I. Orović and S. Stanković's work is supported by the Ministry of Education and Science of Montenegro. T. Chau, C. M. Steele, and E. Sejdić's work is funded in part by the Ontario Centres of Excellence, the Toronto Rehabilitation Institute, the Health Technology Exchange, Bloorview Kids Rehab, and the Canada Research Chairs Program.
References
- Loughlin PJ: Scanning the special issue on time-frequency analysis. Proceedings of the IEEE 1996, 84(9):1195. 10.1109/JPROC.1996.535239
- Cohen L: Time-Frequency Analysis: Theory and Applications. Prentice Hall, Englewood Cliffs, NJ, USA; 1995.
- Boashash B (Ed): Time-Frequency Signal Analysis and Processing: A Comprehensive Reference. Elsevier, Amsterdam, The Netherlands; 1993.
- Sejdić E, Jiang J: Comparative study of three time-frequency representations with applications to a novel correlation method. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '04), May 2004, Montreal, Canada 2: 633-636.
- Sejdić E, Djurović I, Jiang J: Time-frequency feature representation using energy concentration: an overview of recent advances. Digital Signal Processing 2009, 19(1):153-183. 10.1016/j.dsp.2007.12.004
- Lee J, Steele CM, Chau T: Time and time-frequency characterization of dual-axis swallowing accelerometry signals. Physiological Measurement 2008, 29(9):1105-1120. 10.1088/0967-3334/29/9/008
- Choi H, Williams W: Improved time-frequency representation of multicomponent signals using exponential kernels. IEEE Transactions on Acoustics, Speech, and Signal Processing 1989, 37(6):862-871. 10.1109/ASSP.1989.28057
- Stanković LJ: Auto-term representation by the reduced interference distributions: a procedure for kernel design. IEEE Transactions on Signal Processing 1996, 44(6):1557-1563. 10.1109/78.506622
- Stanković LJ: A method for time-frequency analysis. IEEE Transactions on Signal Processing 1994, 42(1):225-229. 10.1109/78.258146
- Orović I, Stanković S, Stanković LJ, Thayaparan T: Multiwindow S-method for instantaneous frequency estimation and its application in radar signal analysis. IET Signal Processing, in press, 2010.
- Krylov A, Korchagin D: Fast Hermite projection method. Proceedings of the 3rd International Conference on Image Analysis and Recognition (ICIAR '06), September 2006, Povoa de Varzim, Portugal 1: 329-338.
- Kortchagine D, Krylov A: Image database retrieval by fast Hermite projection method. Proceedings of the 15th International Conference on Computer Graphics and Applications (GraphiCon '05), June 2005, Novosibirsk, Russia 308-311.
- Kortchagine D, Krylov A: Projection filtering in image processing. Proceedings of the 10th International Conference on Computer Graphics and Applications (GraphiCon '00), August-September 2000, Moscow, Russia 42-45.
- Logemann JA: Evaluation and Treatment of Swallowing Disorders. 2nd edition. PRO-ED, Austin, Texas, USA; 1998.
- Miller AJ: The Neuroscientific Principles of Swallowing and Dysphagia. Singular Publishing, San Diego, Calif, USA; 1999.
- Reddy NP, Canilang EP, Grotz RC, Rane MB, Casterline J, Costarella BR: Biomechanical quantification for assessment and diagnosis of dysphagia. IEEE Engineering in Medicine and Biology Magazine 1988, 7(3):16-20. 10.1109/51.7929
- Reddy NP, Costarella BR, Grotz RC, Canilang EP: Biomechanical measurements to characterize the oral phase of dysphagia. IEEE Transactions on Biomedical Engineering 1990, 37(4):392-397. 10.1109/10.52346
- Steele CM, Allen C, Barker J, et al.: Dysphagia service delivery by speech-language pathologists in Canada: results of a national survey. Canadian Journal of Speech-Language Pathology and Audiology 2007, 31(4):166-177.
- Ramsey DJC, Smithard DG, Kalra L: Can pulse oximetry or a bedside swallowing assessment be used to detect aspiration after stroke? Stroke 2006, 37(12):2984-2988. 10.1161/01.STR.0000248758.32627.3b
- Cichero JAY, Murdoch BE: The physiologic cause of swallowing sounds: answers from heart sounds and vocal tract acoustics. Dysphagia 1998, 13(1):39-52. 10.1007/PL00009548
- Hamlet S, Penney DG, Formolo J: Stethoscope acoustics and cervical auscultation of swallowing. Dysphagia 1994, 9(1):63-68.
- Youmans SR, Stierwalt JAG: An acoustic profile of normal swallowing. Dysphagia 2005, 20(3):195-209. 10.1007/s00455-005-0013-1
- Sejdić E, Steele CM, Chau T: Segmentation of dual-axis swallowing accelerometry signals in healthy subjects with analysis of anthropometric effects on duration of swallowing activities. IEEE Transactions on Biomedical Engineering 2009, 56(4):1090-1097.
- Lee J, Blain S, Casas M, Kenny D, Berall G, Chau T: A radial basis classifier for the automatic detection of aspiration in children with dysphagia. Journal of NeuroEngineering and Rehabilitation 2006, 3, article 14: 1-17.
- Eberlein WF: A new method for numerical evaluation of the Fourier transform. Journal of Mathematical Analysis and Applications 1978, 65(1):80-84. 10.1016/0022-247X(78)90203-2
- Yu G, Slotine J-J: Audio classification from time-frequency texture. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '09), 2009, Taipei, Taiwan 1677-1680.
- Tuceryan M, Jain AK: Texture analysis. In The Handbook of Pattern Recognition and Computer Vision. 2nd edition. Edited by: Chen CH, Pau LF, Wang PSP. World Scientific, River Edge, NJ, USA; 1998:207-248.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.