  • Research Article
  • Open access

Instrument Identification in Polyphonic Music: Feature Weighting to Minimize Influence of Sound Overlaps

Abstract

We provide a new solution to the problem of feature variations caused by overlapping sounds in instrument identification in polyphonic music. When multiple instruments play simultaneously, the partials (harmonic components) of their sounds overlap and interfere, which makes the acoustic features differ from those of monophonic sounds. To cope with this, we weight features according to how strongly they are affected by overlapping. First, we quantitatively evaluate the influence of overlapping on each feature as the ratio of the within-class variance to the between-class variance in the distribution of training data obtained from polyphonic sounds. Then, we generate feature axes using a weighted mixture that minimizes this influence via linear discriminant analysis. In addition, we improve instrument identification using musical context. Experimental results showed that the recognition rates using both feature weighting and musical context were 84.1% for duo, 77.6% for trio, and 72.3% for quartet; those without using either were 53.4%, 49.6%, and 46.5%, respectively.
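The two quantities named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names `variance_ratio` and `lda_axes`, the two-feature synthetic data, and the use of a pseudo-inverse to solve the discriminant eigenproblem are all assumptions made for illustration. Feature 0 stands in for a feature that separates two instruments well, and feature 1 for a feature whose values are scattered (for example, by overlapping partials) without distinguishing the classes.

```python
import numpy as np


def variance_ratio(X, y):
    """Per-feature ratio of within-class to between-class variance.

    A large ratio means the feature varies more inside each class than
    between classes, so it is a weak cue for telling classes apart.
    """
    overall_mean = X.mean(axis=0)
    within = np.zeros(X.shape[1])
    between = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        within += ((Xc - class_mean) ** 2).sum(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
    return within / between


def lda_axes(X, y, n_axes):
    """Discriminant axes: leading eigenvectors of pinv(Sw) @ Sb."""
    d = X.shape[1]
    overall_mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter matrix
    Sb = np.zeros((d, d))  # between-class scatter matrix
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        Sw += centered.T @ centered
        diff = (class_mean - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_axes]
    return eigvecs.real[:, order]


# Synthetic stand-in data (hypothetical, not from the paper):
# feature 0 separates the two classes, feature 1 is uninformative noise.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n)
X = np.column_stack([
    rng.normal(loc=5.0 * y, scale=1.0),
    rng.normal(loc=0.0, scale=3.0, size=2 * n),
])

ratio = variance_ratio(X, y)
axes = lda_axes(X, y, n_axes=1)
print("within/between variance ratio per feature:", ratio)
print("first discriminant axis:", axes[:, 0])
```

In this toy setting the noisy feature receives a much larger variance ratio, and the discriminant axis places most of its weight on the separating feature, which is the sense in which a weighted mixture of features can reduce the influence of unreliable ones.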


Author information

Corresponding author

Correspondence to Tetsuro Kitahara.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Kitahara, T., Goto, M., Komatani, K. et al. Instrument Identification in Polyphonic Music: Feature Weighting to Minimize Influence of Sound Overlaps. EURASIP J. Adv. Signal Process. 2007, 051979 (2006). https://doi.org/10.1155/2007/51979


  • DOI: https://doi.org/10.1155/2007/51979
