Instrument Identification in Polyphonic Music: Feature Weighting to Minimize Influence of Sound Overlaps

Kitahara, Tetsuro; Goto, Masataka; Komatani, Kazunori; Ogata, Tetsuya; Okuno, Hiroshi G.

doi:10.1155/2007/51979

Research Article
Open access
Published: 01 December 2006

EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 051979 (2006)

Abstract

We provide a new solution to the problem of feature variations caused by the overlapping of sounds in instrument identification in polyphonic music. When multiple instruments play simultaneously, partials (harmonic components) of their sounds overlap and interfere, which makes the acoustic features different from those of monophonic sounds. To cope with this, we weight features based on how much they are affected by overlapping. First, we quantitatively evaluate the influence of overlapping on each feature as the ratio of the within-class variance to the between-class variance in the distribution of training data obtained from polyphonic sounds. Then, we generate feature axes using a weighted mixture that minimizes the influence via linear discriminant analysis. In addition, we improve instrument identification using musical context. Experimental results showed that the recognition rates using both feature weighting and musical context were 84.1% for duo, 77.6% for trio, and 72.3% for quartet; those without using either were 53.4%, 49.6%, and 46.5%, respectively.
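The two steps named in the abstract (scoring each feature by its within-class to between-class variance ratio, then deriving feature axes with linear discriminant analysis) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `variance_ratio` and `lda_axes`, the synthetic two-feature data, and the noise levels are all assumptions chosen for illustration. Feature 0 stands in for a feature that stays stable under overlap, and feature 1 for one that overlap disturbs heavily.

```python
import numpy as np


def variance_ratio(X, y):
    """Per-feature ratio of within-class to between-class variance.

    A larger ratio means the feature varies more inside a class than
    across classes, i.e. it is less reliable for discrimination.
    """
    grand_mean = X.mean(axis=0)
    within = np.zeros(X.shape[1])
    between = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        within += ((Xc - mu) ** 2).sum(axis=0)
        between += len(Xc) * (mu - grand_mean) ** 2
    return within / between


def lda_axes(X, y, n_axes):
    """Discriminant axes: leading eigenvectors of pinv(Sw) @ Sb."""
    d = X.shape[1]
    grand_mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter matrix
    Sb = np.zeros((d, d))  # between-class scatter matrix
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        Sw += (Xc - mu).T @ (Xc - mu)
        diff = (mu - grand_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_axes]
    return eigvecs.real[:, order]


# Synthetic stand-in for training data taken from polyphonic sounds
# (an assumption): three "instrument" classes, 200 samples each.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 200)
class_means = labels.astype(float)
features = np.column_stack([
    class_means + rng.normal(0.0, 0.1, labels.size),  # stable feature
    class_means + rng.normal(0.0, 2.0, labels.size),  # disturbed feature
])

ratios = variance_ratio(features, labels)
axes = lda_axes(features, labels, n_axes=1)
print("within/between variance ratio per feature:", ratios)
print("first discriminant axis:", axes[:, 0])
```

In this toy setting the disturbed feature receives the larger variance ratio, and the first discriminant axis puts most of its weight on the stable feature, which is the qualitative effect the abstract describes as weighting features by how much overlap affects them.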

Author information

Authors and Affiliations

Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Sakyo-Ku, Kyoto, 606-8501, Japan
Tetsuro Kitahara, Kazunori Komatani, Tetsuya Ogata & Hiroshi G. Okuno
National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Ibaraki, 305-8568, Japan
Masataka Goto

Corresponding author

Correspondence to Tetsuro Kitahara.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Kitahara, T., Goto, M., Komatani, K. et al. Instrument Identification in Polyphonic Music: Feature Weighting to Minimize Influence of Sound Overlaps. EURASIP J. Adv. Signal Process. 2007, 051979 (2006). https://doi.org/10.1155/2007/51979

Received: 07 December 2005
Revised: 27 July 2006
Accepted: 13 August 2006
Published: 01 December 2006
DOI: https://doi.org/10.1155/2007/51979
