Efficient Algorithm and Architecture of Critical-Band Transform for Low-Power Speech Applications

Wang, Chao; Gan, Woon-Seng

doi:10.1155/2007/89264

Research Article
Open access
Published: 01 December 2007

Chao Wang¹,² & Woon-Seng Gan²

EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 089264 (2007)

Abstract

An efficient algorithm and its corresponding VLSI architecture for the critical-band transform (CBT) are developed to approximate the critical-band filtering of the human ear. The CBT consists of a constant-bandwidth transform in the lower frequency range and a Brown constant-Q transform (CQT) in the higher frequency range. The corresponding VLSI architecture is proposed to achieve significant power efficiency by reducing the computational complexity, using pipelined and parallel processing, and applying supply voltage scaling. A 21-band Bark-scale CBT processor with a sampling rate of 16 kHz is designed and simulated. Simulation results verify its suitability for short-time spectral analysis of speech. Compared with other methods, it fits the critical-band analysis of the human ear more closely, requires significantly fewer computations, and is therefore more energy-efficient. In a 0.35 µm CMOS technology, it processes a 160-point speech frame in 4.99 milliseconds at 234 kHz. The power dissipation is 15.6 µW at 1.1 V. This is an 82.1% power reduction compared with a benchmark 256-point FFT processor.
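
The two-regime structure described in the abstract can be illustrated with a short Python sketch. It is not the authors' efficient algorithm or their VLSI data path, which are not given in this excerpt; it is a direct, unoptimized evaluation intended only to show the idea. The band edges are Zwicker's published critical-band edges up to 7.7 kHz, which yield 21 bands below the 8 kHz Nyquist limit. The crossover frequency `CROSSOVER_HZ`, the low-range resolution `LOW_BANDWIDTH_HZ`, the quality factor `Q_HIGH`, the Hamming window, and all function names are assumptions made for illustration. The window-length rule N_k = Q·fs/f_k is the one used in Brown's constant-Q formulation.

```python
import cmath
import math

# Zwicker's critical-band edges in Hz, truncated at 7.7 kHz (21 bands).
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
                 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700]

FS_HZ = 16_000            # sampling rate stated in the abstract
CROSSOVER_HZ = 500.0      # hypothetical boundary between the two regimes
LOW_BANDWIDTH_HZ = 100.0  # hypothetical fixed resolution below the boundary
Q_HIGH = 6.0              # hypothetical constant Q above the boundary


def band_table(edges=BARK_EDGES_HZ):
    """Return (center_hz, bandwidth_hz, q) for each critical band."""
    table = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        center = 0.5 * (lo + hi)
        bandwidth = float(hi - lo)
        table.append((center, bandwidth, center / bandwidth))
    return table


def window_length(center_hz, fs=FS_HZ):
    """Window length in samples for a band centered at center_hz."""
    if center_hz < CROSSOVER_HZ:
        # Constant-bandwidth regime: same window for every low band.
        return round(fs / LOW_BANDWIDTH_HZ)
    # Constant-Q regime: window shrinks as frequency rises (N_k = Q*fs/f_k).
    return max(1, round(Q_HIGH * fs / center_hz))


def band_magnitude(x, center_hz, n_win, fs=FS_HZ):
    """Magnitude of one Hamming-windowed Fourier sum evaluated at center_hz."""
    n_win = min(n_win, len(x))
    acc = 0j
    for n in range(n_win):
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / max(n_win - 1, 1))
        acc += w * x[n] * cmath.exp(-2j * math.pi * center_hz * n / fs)
    return abs(acc) / n_win


def cbt_magnitudes(x, fs=FS_HZ):
    """Evaluate one magnitude per critical band for the frame x."""
    return [band_magnitude(x, center, window_length(center, fs), fs)
            for center, _bandwidth, _q in band_table()]
```

Calling `cbt_magnitudes` on a 160-sample frame returns 21 values, one per band. With the assumed parameters, the low bands all use a 160-sample window, while the highest band uses only about 14 samples, which is where a constant-Q evaluation saves work relative to a fixed-length FFT.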

References

Fletcher H: Auditory patterns. Reviews of Modern Physics 1940,12(1):47-65. 10.1103/RevModPhys.12.47
Zwicker E: Subdivision of the audible frequency range into critical bands (frequenzgruppen). The Journal of the Acoustical Society of America 1961,33(2):248. 10.1121/1.1908630
Picone JW: Signal modeling techniques in speech recognition. Proceedings of the IEEE 1993,81(9):1215-1247. 10.1109/5.237532
Dautrich BA, Rabiner LR, Martin TB: On the effects of varying filter bank parameters on isolated word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 1983,31(4):793-807. 10.1109/TASSP.1983.1164172
Noll P: Digital audio coding for visual communications. Proceedings of the IEEE 1995,83(6):925-943. 10.1109/5.387093
Davis SB, Mermelstein P: Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech, and Signal Processing 1980,28(4):357-366. 10.1109/TASSP.1980.1163420
Petersen TL, Boll SF: Critical band analysis-synthesis. IEEE Transactions on Acoustics, Speech, and Signal Processing 1983,31(3):656-663. 10.1109/TASSP.1983.1164127
Kates JM: An auditory spectral analysis model using the chirp z-transform. IEEE Transactions on Acoustics, Speech, and Signal Processing 1983,31(1):148-156. 10.1109/TASSP.1983.1164015
Carnero B, Drygajlo A: Perceptual speech coding and enhancement using frame-synchronized fast wavelet packet transform algorithms. IEEE Transactions on Signal Processing 1999,47(6):1622-1635. 10.1109/78.765133
Farooq O, Datta S: Mel filter-like admissible wavelet packet structure for speech recognition. IEEE Signal Processing Letters 2001,8(7):196-198. 10.1109/97.928676
Chandrakasan AP, Sheng S, Brodersen RW: Low power techniques for portable real-time DSP applications. Proceedings of the 5th International Conference on VLSI Design, January 1992, Bangalore, India 203–208.
Wang C, Tong Y-C: An improved critical-band transform processor for speech applications. Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS '04), May 2004, Vancouver, BC, Canada 3: 461–464.
Wang C, Tong Y-C, Shao Y: VLSI design and analysis of a critical-band transform processor for speech recognition. Proceedings of IEEE International SOC Conference, September 2004, Santa Clara, Calif, USA 365–368.
Brown JC: Calculation of a constant Q spectral transform. Journal of the Acoustical Society of America 1991,89(1):425-434. 10.1121/1.400476
Rabiner L, Juang B: Fundamentals of Speech Recognition. Prentice-Hall, Englewood Cliffs, NJ, USA; 1993.
Holmes JN, Holmes WJ: Speech Synthesis and Recognition. 2nd edition. Taylor & Francis, New York, NY, USA; 2001.
Chandrakasan AP, Sheng S, Brodersen RW: Low-power CMOS digital design. IEEE Journal of Solid-State Circuits 1992,27(4):473-484. 10.1109/4.126534
Baas BM: A low-power, high-performance, 1024-point FFT processor. IEEE Journal of Solid-State Circuits 1999,34(3):380-387. 10.1109/4.748190
Cetin E, Morling RCS, Kale I: An integrated 256-point complex FFT processor for real-time spectrum analysis and measurement. Proceedings of IEEE Instrumentation and Measurement Technology Conference, May 1997, Ottawa, ON, Canada 1: 96–101.
Ruetz PA, Cai MM: A real time FFT chip set: architectural issues. Proceedings of the 10th International Conference on Pattern Recognition, June 1990, Atlantic City, NJ, USA 2: 385–388.
Bidet E, Castelain D, Joanblanq C, Senn P: A fast single-chip implementation of 8192 complex point FFT. IEEE Journal of Solid-State Circuits 1995,30(3):300-305. 10.1109/4.364445
Liu Z, Song Y, Ikenaga T, Goto S: A VLSI array processing oriented fast Fourier transform algorithm and hardware implementation. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 2005,88(12):3523-3530.
Daubechies I, Sweldens W: Factoring wavelet transforms into lifting steps. Journal of Fourier Analysis and Applications 1998,4(3):247-269. 10.1007/BF02476026

Author information

Authors and Affiliations

1. Center for Signal Processing, School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, 639798, Singapore
Chao Wang
2. Digital Signal Processing Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, 639798, Singapore
Chao Wang & Woon-Seng Gan

Corresponding author

Correspondence to Chao Wang.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Wang, C., Gan, WS. Efficient Algorithm and Architecture of Critical-Band Transform for Low-Power Speech Applications. EURASIP J. Adv. Signal Process. 2007, 089264 (2007). https://doi.org/10.1155/2007/89264

Received: 15 December 2005
Revised: 08 December 2006
Accepted: 18 January 2007
Published: 01 December 2007
DOI: https://doi.org/10.1155/2007/89264
