A Discriminative Model for Polyphonic Piano Transcription

Abstract

We present a discriminative model for polyphonic piano transcription. Support vector machines trained on spectral features are used to classify frame-level note instances. The classifier outputs are temporally constrained via hidden Markov models, and the proposed system is used to transcribe both synthesized and real piano recordings. A frame-level transcription accuracy of 68% was achieved on a newly generated test set, and direct comparisons to previous approaches are provided.
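The temporal-constraint step can be illustrated with a small sketch. It assumes each note is handled independently by a two-state (off/on) hidden Markov model whose observations are the per-frame classifier outputs, decoded with the Viterbi algorithm. The function name `viterbi_smooth`, the self-transition probability `p_stay`, and the prior `prior_on` are hypothetical choices made for illustration; they are not values or identifiers taken from the paper, and the classifier output is simply treated as a stand-in for the state likelihood.

```python
import math


def viterbi_smooth(posteriors, p_stay=0.9, prior_on=0.5):
    """Decode the most likely off/on sequence for ONE note.

    posteriors: per-frame P(note on) values from some frame-level
                classifier (hypothetical inputs, each in [0, 1]).
    p_stay:     assumed probability of staying in the same state.
    prior_on:   assumed prior probability that the note is on.
    Returns a list of 0/1 ints, one per frame.
    """
    if not posteriors:
        return []

    def log(x):
        # Clamp to avoid log(0) for saturated classifier outputs.
        return math.log(max(x, 1e-12))

    # trans[i][j] = log P(state j now | state i before); 0 = off, 1 = on.
    trans = [[log(p_stay), log(1.0 - p_stay)],
             [log(1.0 - p_stay), log(p_stay)]]

    def emit(p):
        # Assumption: the classifier output acts as the state likelihood.
        return [log(1.0 - p), log(p)]

    first = emit(posteriors[0])
    score = [log(1.0 - prior_on) + first[0], log(prior_on) + first[1]]
    back = []  # back[t][j] = best previous state when in state j at t + 1

    for p in posteriors[1:]:
        e = emit(p)
        new_score, ptr = [], []
        for j in (0, 1):
            best = max((0, 1), key=lambda i: score[i] + trans[i][j])
            ptr.append(best)
            new_score.append(score[best] + trans[best][j] + e[j])
        score = new_score
        back.append(ptr)

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    frames = [0.1, 0.2, 0.9, 0.4, 0.9, 0.8, 0.1, 0.1]
    print(viterbi_smooth(frames))
```

With these assumed settings, a single low-confidence frame inside a sustained note is bridged, while an isolated high-confidence frame is suppressed. That is the kind of smoothing a temporal model adds on top of independent frame-level decisions. In practice the transition probabilities would be estimated from training data rather than fixed by hand.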


Author information

Correspondence to Graham E. Poliner.

About this article

Cite this article

Poliner, G.E., Ellis, D.P.W. A Discriminative Model for Polyphonic Piano Transcription. EURASIP J. Adv. Signal Process. 2007, 048317 (2006) doi:10.1155/2007/48317

Keywords

  • Support Vector Machine
  • Information Technology
  • Support Vector
  • Markov Model
  • Spectral Feature