Open Access

A Discriminative Model for Polyphonic Piano Transcription

EURASIP Journal on Advances in Signal Processing 2006, 2007:048317

https://doi.org/10.1155/2007/48317

Received: 6 December 2005

Accepted: 29 June 2006

Published: 29 October 2006

Abstract

We present a discriminative model for polyphonic piano transcription. Support vector machines trained on spectral features are used to classify frame-level note instances. The classifier outputs are temporally constrained via hidden Markov models, and the proposed system is used to transcribe both synthesized and real piano recordings. A frame-level transcription accuracy of 68% was achieved on a newly generated test set, and direct comparisons to previous approaches are provided.

[1–17]
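The temporal-constraint step described in the abstract can be illustrated with a minimal sketch. It assumes one two-state (off/on) hidden Markov model per note, decoded with the Viterbi algorithm, and treats the classifier outputs as per-frame probabilities that the note is active. The function name `viterbi_smooth`, the symmetric self-transition probability `p_stay`, and the uniform initial state prior are hypothetical choices made for illustration; they are not taken from the paper.

```python
import math


def viterbi_smooth(posteriors, p_stay=0.9):
    """Return the most likely off/on (0/1) state path for a single note.

    posteriors: per-frame probabilities that the note is active, e.g.
        calibrated classifier outputs (assumption, not the paper's setup).
    p_stay: assumed probability of staying in the same state from one
        frame to the next (hypothetical parameter).
    """
    if not posteriors:
        return []

    eps = 1e-12  # keeps log() finite when a probability is exactly 0 or 1
    log_trans = [
        [math.log(p_stay), math.log(1.0 - p_stay)],  # from off: stay, switch
        [math.log(1.0 - p_stay), math.log(p_stay)],  # from on: switch, stay
    ]

    def log_obs(p):
        p = min(max(p, eps), 1.0 - eps)
        return [math.log(1.0 - p), math.log(p)]  # [off, on]

    # Uniform initial prior: the first frame is scored by its observation only.
    score = log_obs(posteriors[0])
    backpointers = []

    for p in posteriors[1:]:
        obs = log_obs(p)
        new_score, pointers = [], []
        for state in (0, 1):
            # Best predecessor for this state at the current frame.
            prev = max((0, 1), key=lambda r: score[r] + log_trans[r][state])
            pointers.append(prev)
            new_score.append(score[prev] + log_trans[prev][state] + obs[state])
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


frames = [0.1, 0.2, 0.9, 0.4, 0.9, 0.8, 0.1, 0.1]
print(viterbi_smooth(frames))  # [0, 0, 1, 1, 1, 1, 0, 0]
```

Thresholding these example frames at 0.5 would split the note at the fourth frame. With the assumed `p_stay` of 0.9, the decoded path keeps the note active through that dip, which is the kind of smoothing a temporal model is meant to provide.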

Authors’ Affiliations

(1) Laboratory for Recognition and Organization of Speech and Audio, Department of Electrical Engineering, Columbia University

References

  1. Moorer JA: On the transcription of musical sound by computer. Computer Music Journal 1977,1(4):32-38.
  2. Rossi L, Girolami G, Leca M: Identification of polyphonic piano signals. Acustica 1997,83(6):1077-1084.
  3. Sterian AD: Model-based segmentation of time-frequency images for musical transcription, Ph.D. thesis. University of Michigan, Ann Arbor, Mich, USA; 1999.
  4. Dixon S: On the computer recognition of solo piano music. Proceedings of Australasian Computer Music Conference, July 2000, Brisbane, Australia 31-37.
  5. Bello JP, Daudet L, Sandler M: Time-domain polyphonic transcription using self-generating databases. Proceedings of the 112th Convention of the Audio Engineering Society, May 2002, Munich, Germany.
  6. Klapuri A: A perceptually motivated multiple-f0 estimation method. Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '05), October 2005, New Paltz, NY, USA.
  7. Ryynänen M, Klapuri A: Polyphonic music transcription using note event modeling. Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '05), October 2005, New Paltz, NY, USA.
  8. Marolt M: A connectionist approach to automatic transcription of polyphonic piano music. IEEE Transactions on Multimedia 2004,6(3):439-449. 10.1109/TMM.2004.827507
  9. Godsill S, Davy M: Bayesian harmonic models for musical pitch estimation and analysis. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '02), May 2002, Orlando, Fla, USA 2: 1769-1772.
  10. Cemgil AT, Kappen HJ, Barber D: A generative model for music transcription. IEEE Transactions on Speech and Audio Processing 2006,14(2):679-694.
  11. Kashino K, Godsill SJ: Bayesian estimation of simultaneous musical notes based on frequency domain modelling. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '04), May 2004, Montreal, Que, Canada 4: 305-308.
  12. Ellis DPW, Poliner GE: Classification-based melody transcription. To appear in Machine Learning, http://dx.doi.org/10.1007/s10994-006-8373-9
  13. Platt J: Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods - Support Vector Learning. Edited by: Schölkopf B, Burges CJC, Smola AJ. MIT Press, Cambridge, Mass, USA; 1999:185-208.
  14. Witten IH, Frank E: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, Calif, USA; 2000.
  15. National Institute of Standards and Technology Spring 2004 (RT-04S) rich transcription meeting recognition evaluation plan, 2004. http://nist.gov/speech/tests/rt/rt2004/spring/
  16. Taskar B, Guestrin C, Koller D: Max-margin Markov networks. Proceedings of Neural Information Processing Systems Conference (NIPS '03), December 2003, Vancouver, Canada.
  17. Schölkopf B, Platt JC, Shawe-Taylor J, Smola AJ, Williamson RC: Estimating the support of a high-dimensional distribution. Neural Computation 2001,13(7):1443-1471. 10.1162/089976601750264965

Copyright

© Poliner and Ellis 2007