  • Research Article
  • Open access

Efficient Algorithm and Architecture of Critical-Band Transform for Low-Power Speech Applications

Abstract

An efficient algorithm and its corresponding VLSI architecture for the critical-band transform (CBT) are developed to approximate the critical-band filtering of the human ear. The CBT consists of a constant-bandwidth transform in the lower frequency range and a Brown constant-Q transform (CQT) in the higher frequency range. The corresponding VLSI architecture is proposed to achieve significant power efficiency by reducing the computational complexity, using pipelining and parallel processing, and applying supply voltage scaling. A 21-band Bark-scale CBT processor with a sampling rate of 16 kHz is designed and simulated. Simulation results verify its suitability for short-time spectral analysis of speech. Compared with other methods, it fits the critical-band analysis of the human ear more closely, requires significantly fewer computations, and is therefore more energy-efficient. In a 0.35 µm CMOS technology, it processes a 160-point speech frame in 4.99 milliseconds at 234 kHz. The power dissipation is 15.6 µW at 1.1 V. It achieves an 82.1% power reduction compared with a benchmark 256-point FFT processor.
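To make the two-regime structure concrete, the following is a minimal sketch of a direct (unoptimized) hybrid transform: constant-bandwidth analysis over the full frame in the low bands and constant-Q analysis with per-band window lengths in the high bands. It is not the authors' algorithm or architecture. The band table is assumed to be Zwicker's first 21 Bark bands, and the 500 Hz split point (`split_hz`), the Hamming window, and the function names are hypothetical choices made only for illustration. Only the 16 kHz sampling rate, the 160-point frame, and the 21-band count come from the abstract. Note that a 160-point frame at 16 kHz gives a 100 Hz frequency spacing, which equals the width of the lowest Bark bands in that table.

```python
import cmath
import math

FS = 16_000  # sampling rate in Hz (from the abstract)

# Assumed band table: Zwicker's first 21 Bark bands (centre, bandwidth) in Hz.
CENTERS = [50, 150, 250, 350, 450, 570, 700, 840, 1000, 1170, 1370,
           1600, 1850, 2150, 2500, 2900, 3400, 4000, 4800, 5800, 7000]
BANDWIDTHS = [100, 100, 100, 100, 110, 120, 140, 150, 160, 190, 210,
              240, 280, 320, 380, 450, 550, 700, 900, 1100, 1300]


def window_lengths(frame_len=160, split_hz=500.0):
    """Per-band analysis length in samples.

    Below split_hz (a hypothetical split point) every band uses the full
    frame, so the resolution is the constant FS / frame_len. Above it, the
    length is N_k = Q_k * FS / f_k with Q_k = f_k / bandwidth_k, which
    reduces to FS / bandwidth_k (the constant-Q rule).
    """
    lengths = []
    for f_k, bw_k in zip(CENTERS, BANDWIDTHS):
        n_k = frame_len if f_k < split_hz else round(FS / bw_k)
        lengths.append(min(n_k, frame_len))
    return lengths


def hamming(n_k):
    """Hamming window of n_k samples (assumed window choice)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_k - 1))
            for n in range(n_k)]


def critical_band_sketch(frame, split_hz=500.0):
    """Return one complex coefficient per band for a single frame."""
    spectrum = []
    for f_k, n_k in zip(CENTERS, window_lengths(len(frame), split_hz)):
        w = hamming(n_k)
        acc = sum(w[n] * frame[n] * cmath.exp(-2j * math.pi * f_k * n / FS)
                  for n in range(n_k))
        spectrum.append(acc / n_k)  # normalize by the band's own length
    return spectrum


# Usage: analyse one 160-sample frame of a 1 kHz tone.
tone = [math.cos(2 * math.pi * 1000 * n / FS) for n in range(160)]
magnitudes = [abs(c) for c in critical_band_sketch(tone)]
```

In this sketch the higher bands use progressively shorter windows, so they cost fewer multiply-accumulate operations per frame than a fixed-length transform would. That is one plausible source of computational savings, though the abstract does not detail how the actual reduction is obtained.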

Author information

Corresponding author

Correspondence to Chao Wang.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Wang, C., Gan, WS. Efficient Algorithm and Architecture of Critical-Band Transform for Low-Power Speech Applications. EURASIP J. Adv. Signal Process. 2007, 089264 (2007). https://doi.org/10.1155/2007/89264

  • DOI: https://doi.org/10.1155/2007/89264
