- Research Article
- Open access
Efficient Algorithm and Architecture of Critical-Band Transform for Low-Power Speech Applications
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 089264 (2007)
Abstract
An efficient algorithm and its corresponding VLSI architecture for the critical-band transform (CBT) are developed to approximate the critical-band filtering of the human ear. The CBT consists of a constant-bandwidth transform in the lower frequency range and a Brown constant-Q transform (CQT) in the higher frequency range. The corresponding VLSI architecture achieves significant power efficiency by reducing the computational complexity, using pipelined and parallel processing, and applying supply voltage scaling. A 21-band Bark-scale CBT processor with a sampling rate of 16 kHz is designed and simulated. Simulation results verify its suitability for short-time spectral analysis of speech. It fits the critical-band analysis of the human ear more closely and requires significantly fewer computations than other methods, and is therefore more energy-efficient. In a 0.35 µm CMOS technology, it processes a 160-point speech frame in 4.99 milliseconds at 234 kHz. The power dissipation is 15.6 µW at 1.1 V, an 82.1% power reduction compared with a benchmark 256-point FFT processor.
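To make the hybrid structure described in the abstract concrete, the following is a minimal sketch in Python and not the authors' algorithm or hardware datapath. It assumes Zwicker's published critical-band edges as the 21-band layout (the abstract does not list the paper's band table), a Hamming window, and a window length of roughly `fs / bandwidth` capped at the frame length. The names `BARK_EDGES`, `band_plan`, and `cbt_magnitudes` are hypothetical. Under these assumptions the narrow low bands share the full 160-sample frame (constant bandwidth), while the wide high bands use progressively shorter windows (approximately constant Q).

```python
import cmath
import math

FS = 16_000   # sampling rate stated in the abstract (Hz)
FRAME = 160   # frame length stated in the abstract (samples)

# ASSUMPTION: Zwicker's critical-band edges in Hz, used here as the 21-band
# layout. The paper's exact band table is not given in the abstract.
BARK_EDGES = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
              1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700]


def band_plan(edges=BARK_EDGES, fs=FS, frame=FRAME):
    """Return a list of (center_hz, window_len) pairs, one per band.

    ASSUMPTION: window length is fs / bandwidth, capped at the frame length.
    100 Hz bands therefore use the whole 160-sample frame, and wider bands
    use shorter windows.
    """
    plan = []
    for lo, hi in zip(edges, edges[1:]):
        center = 0.5 * (lo + hi)
        n = min(frame, max(1, round(fs / (hi - lo))))
        plan.append((center, n))
    return plan


def hamming(i, n):
    """Hamming window sample i of an n-point window."""
    if n == 1:
        return 1.0
    return 0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))


def cbt_magnitudes(x, fs=FS):
    """Directly evaluate one windowed spectral coefficient per band.

    This is a brute-force reference evaluation for illustration; it does not
    reproduce the complexity reductions claimed in the abstract.
    """
    mags = []
    for center, n in band_plan(fs=fs, frame=len(x)):
        acc = 0j
        for i in range(n):
            acc += hamming(i, n) * x[i] * cmath.exp(-2j * math.pi * center * i / fs)
        mags.append(abs(acc) / n)
    return mags


if __name__ == "__main__":
    # A 1 kHz test tone should produce its largest response in the band
    # spanning 920 to 1080 Hz.
    tone = [math.cos(2 * math.pi * 1000 * i / FS) for i in range(FRAME)]
    mags = cbt_magnitudes(tone)
    peak = mags.index(max(mags))
    print(peak, band_plan()[peak])
```

As a side note on the reported figures, 160 samples at 16 kHz span 10 ms, so a 4.99 ms processing time is roughly half the duration of one non-overlapping frame, and 4.99 ms at 234 kHz corresponds to about 1168 clock cycles per frame.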
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Wang, C., Gan, WS. Efficient Algorithm and Architecture of Critical-Band Transform for Low-Power Speech Applications. EURASIP J. Adv. Signal Process. 2007, 089264 (2007). https://doi.org/10.1155/2007/89264
DOI: https://doi.org/10.1155/2007/89264