A Supervised Classification Algorithm for Note Onset Detection

Abstract

This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as onsets or non-onsets. The classifier output is then passed to a simple peak-picking algorithm based on a moving average, which selects the final onset positions. We present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. We describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing the results of cross-validation experiments carried out on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
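The peak-picking and ensemble steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pick_onsets` and `combine_predictions`, the window length, the threshold offset, and the use of a plain mean to merge network outputs are all assumptions made for illustration, and the per-frame classifier scores are assumed to be available already as an array.

```python
import numpy as np


def pick_onsets(activation, window=5, offset=0.1):
    """Return frame indices that are local maxima of `activation` and
    exceed a centred moving average by `offset`.

    `window` and `offset` are illustrative values, not taken from the paper.
    """
    activation = np.asarray(activation, dtype=float)
    kernel = np.ones(window) / window
    # Centred moving average; mode="same" keeps the input length and
    # implicitly zero-pads at the edges.
    moving_avg = np.convolve(activation, kernel, mode="same")
    threshold = moving_avg + offset

    onsets = []
    for t in range(1, len(activation) - 1):
        is_peak = (activation[t] > activation[t - 1]
                   and activation[t] >= activation[t + 1])
        if is_peak and activation[t] > threshold[t]:
            onsets.append(t)
    return onsets


def combine_predictions(predictions):
    """Average per-frame scores from several classifiers (assumed rule:
    an unweighted mean over the first axis)."""
    return np.mean(np.asarray(predictions, dtype=float), axis=0)


# Hypothetical per-frame onset scores from a classifier.
scores = [0, 0, 0.1, 0.9, 0.1, 0, 0, 0.2, 0.8, 0.2, 0, 0]
print(pick_onsets(scores))
```

Subtracting a moving average makes the threshold adapt to the local level of the classifier output, so a peak must stand out from its neighbourhood rather than exceed a fixed global value.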

Author information

Corresponding author

Correspondence to Alexandre Lacoste.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Lacoste, A., Eck, D. A Supervised Classification Algorithm for Note Onset Detection. EURASIP J. Adv. Signal Process. 2007, 043745 (2006). https://doi.org/10.1155/2007/43745

  • DOI: https://doi.org/10.1155/2007/43745