Accurate tempo estimation based on harmonic + noise decomposition

Abstract

We present an innovative tempo estimation system that processes acoustic audio signals and does not use any high-level musical knowledge. Our proposal relies on a harmonic + noise decomposition of the audio signal by means of a subspace analysis method. A technique to measure the degree of musical accentuation as a function of time is then developed and applied separately to the harmonic and noise parts of the input signal. This is followed by a periodicity estimation block that calculates the salience of musical accents for a large number of potential periods. Next, a multipath dynamic programming stage searches among all the potential periodicities for the candidates that are most consistent through time, and finally the most energetic candidate is selected as the tempo. Our proposal is validated using a manually annotated test database containing 961 music signals from various musical genres. In addition, the performance of the algorithm under different configurations is compared, and its robustness when processing signals of degraded quality is measured.
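
As a rough illustration of the periodicity estimation idea described in the abstract, the sketch below scores candidate beat periods of an accent signal and converts the strongest one to beats per minute. It is not the authors' estimator: plain autocorrelation is swapped in as a stand-in, the harmonic + noise decomposition and the dynamic programming stage are omitted, and the function names, the 100 frames/s rate, the 60 to 200 BPM search range, and the synthetic impulse-train input are all assumptions made for this example.

```python
def accent_autocorrelation(accent, min_lag, max_lag):
    """Score each candidate lag by the autocorrelation of the mean-removed accent signal."""
    n = len(accent)
    mean = sum(accent) / n
    centered = [a - mean for a in accent]
    salience = {}
    for lag in range(min_lag, max_lag + 1):
        # Large when accents tend to repeat every `lag` frames.
        salience[lag] = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return salience


def estimate_tempo(accent, frame_rate, min_bpm=60.0, max_bpm=200.0):
    """Return the tempo (BPM) whose beat period has the highest salience."""
    # A tempo of T BPM corresponds to a period of 60 * frame_rate / T frames.
    min_lag = max(1, int(round(60.0 * frame_rate / max_bpm)))
    max_lag = min(len(accent) - 1, int(round(60.0 * frame_rate / min_bpm)))
    salience = accent_autocorrelation(accent, min_lag, max_lag)
    best_lag = max(salience, key=salience.get)
    return 60.0 * frame_rate / best_lag


# Hypothetical input: one accent every 50 frames at 100 frames/s, i.e. 120 BPM.
frame_rate = 100.0
accent = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
tempo = estimate_tempo(accent, frame_rate)
print(tempo)
```

In this toy input the lag at twice the true period (60 BPM) scores almost as high as the true one, which hints at why a system like the one described tracks several candidate periodicities through time before committing to a single tempo.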

Corresponding author

Correspondence to Miguel Alonso.

About this article

Cite this article

Alonso, M., Richard, G. & David, B. Accurate tempo estimation based on harmonic + noise decomposition. EURASIP J. Adv. Signal Process. 2007, 082795 (2006). https://doi.org/10.1155/2007/82795

Keywords

  • Dynamic Programming
  • Tempo
  • Quantum Information
  • Audio Signal
  • Programming Search