  • Research Article
  • Open Access

Audio Key Finding: Considerations in System Design and Case Studies on Chopin's 24 Preludes

EURASIP Journal on Advances in Signal Processing 2006, 2007:056561

https://doi.org/10.1155/2007/56561

  • Received: 8 December 2005
  • Accepted: 22 June 2006
  • Published:

Abstract

We systematically analyze audio key finding to determine the factors important to system design and to the selection and evaluation of solutions. First, we present a basic system, the fuzzy analysis spiral array center of effect generator algorithm, with three key determination policies: nearest-neighbor (NN), relative distance (RD), and average distance (AD). AD achieved a 79% accuracy rate in an evaluation on 410 classical pieces, more than 8% higher than RD and NN. We show why audio key finding sometimes outperforms symbolic key finding. We next propose three extensions to the basic key finding system to improve performance: the modified spiral array (mSA), fundamental frequency identification (F0), and post-weight balancing (PWB). We evaluate these extensions using Chopin's Preludes, because Romantic repertoire was the most challenging. F0 provided the greatest improvement in the first 8 seconds, while mSA gave the best performance after 8 seconds. Case studies examine the pieces for which all systems were correct and those for which all were incorrect.
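
As an illustration of the nearest-neighbor (NN) policy named above, the following is a minimal sketch of key finding in a spiral array, not the authors' implementation. It assumes the commonly described spiral array construction (pitches placed along a helix at successive perfect fifths; chords and keys formed as weighted sums). The radius, height, weights, the even major/minor blend in the minor key, and the range of candidate keys are all illustrative assumptions. The fuzzy analysis front end and the RD and AD policies are not shown.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
R, H = 1.0, math.sqrt(2.0 / 15.0)   # helix radius and rise per perfect fifth
W = (0.6, 0.25, 0.15)               # weights reused for chords and keys
BLEND = (0.5, 0.5)                  # major/minor mix in minor-key dominant/subdominant

def pitch(k):
    """Position of the pitch k perfect fifths above C (k=0 is C, k=1 is G)."""
    return (R * math.sin(k * math.pi / 2), R * math.cos(k * math.pi / 2), k * H)

def combine(points, weights):
    """Weighted sum of 3-D points."""
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) for i in range(3))

def major_chord(k):
    # root, fifth, major third (four fifths up)
    return combine([pitch(k), pitch(k + 1), pitch(k + 4)], W)

def minor_chord(k):
    # root, fifth, minor third (three fifths down)
    return combine([pitch(k), pitch(k + 1), pitch(k - 3)], W)

def major_key(k):
    # tonic, dominant, subdominant triads
    return combine([major_chord(k), major_chord(k + 1), major_chord(k - 1)], W)

def minor_key(k):
    dominant = combine([major_chord(k + 1), minor_chord(k + 1)], BLEND)
    subdominant = combine([minor_chord(k - 1), major_chord(k - 1)], BLEND)
    return combine([minor_chord(k), dominant, subdominant], W)

def name(k):
    """Spell the pitch at fifth index k, e.g. 0 -> C, 6 -> F#, -2 -> Bb."""
    letter = "FCGDAEB"[(k + 1) % 7]
    accidentals = (k + 1) // 7
    return letter + ("#" * accidentals if accidentals >= 0 else "b" * -accidentals)

def center_of_effect(weights_by_fifth):
    """Weighted average position of the sounding pitches.

    weights_by_fifth maps a fifth index to a non-negative weight
    (for example, pitch-class strength or duration).
    """
    total = sum(weights_by_fifth.values())
    ks = list(weights_by_fifth)
    return combine([pitch(k) for k in ks], [weights_by_fifth[k] / total for k in ks])

def nearest_key(ce, candidates=range(-7, 8)):
    """NN policy: return (key name, distance) of the key closest to ce."""
    best = None
    for k in candidates:
        for mode, position in (("major", major_key(k)), ("minor", minor_key(k))):
            d = math.dist(ce, position)
            if best is None or d < best[1]:
                best = (f"{name(k)} {mode}", d)
    return best

# Example: C, G, E, and D with hypothetical weights.
ce = center_of_effect({0: 4.0, 1: 3.0, 4: 2.0, 2: 1.0})
print(nearest_key(ce))
```

In this sketch, the key is decided from a single center of effect. A policy that accumulates distances over time, such as AD, would need one center of effect per analysis window.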

Keywords

  • Information Technology
  • System Design
  • Average Distance
  • Effect Generator
  • Quantum Information

[1-23]

Authors’ Affiliations

(1)
Integrated Media Systems Center, Department of Computer Science, USC Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089-0781, USA
(2)
Integrated Media Systems Center, Epstein Department of Industrial and Systems Engineering, USC Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089-0193, USA

References

  1. Chew E: Towards a mathematical model of tonality, Doctoral dissertation.
  2. Chew E: Modeling tonality: applications to music cognition. Proceedings of the 23rd Annual Meeting of the Cognitive Science Society (CogSci '01), August 2001, Edinburgh, Scotland, UK 206-211.
  3. Chuan C-H, Chew E: Fuzzy analysis in pitch-class determination for polyphonic audio key finding. Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR '05), September 2005, London, UK 296-303.
  4. Longuet-Higgins HC, Steedman MJ: On interpreting Bach. In Machine Intelligence. Volume 6. Edinburgh University Press, Edinburgh, Scotland, UK; 1971:221-241.
  5. Krumhansl CL: Quantifying tonal hierarchies and key distances. In Cognitive Foundations of Musical Pitch. Oxford University Press, New York, NY, USA; 1990:16-49. Chapter 2.
  6. Temperley D: What's key for key? The Krumhansl-Schmuckler key-finding algorithm reconsidered. Music Perception 1999,17(1):65-100.
  7. Chuan C-H, Chew E: Polyphonic audio key finding using the spiral array CEG algorithm. Proceedings of IEEE International Conference on Multimedia and Expo (ICME '05), July 2005, Amsterdam, The Netherlands 21-24.
  8. Gómez E, Herrera P: Estimating the tonality of polyphonic audio files: cognitive versus machine learning modelling strategies. Proceedings of 5th International Conference on Music Information Retrieval (ISMIR '04), October 2004, Barcelona, Spain 92-95.
  9. Pauws S: Musical key extraction from audio. Proceedings of 5th International Conference on Music Information Retrieval (ISMIR '04), October 2004, Barcelona, Spain 96-99.
  10. 1st Annual Music Information Retrieval Evaluation eXchange, MIREX 2005, http://www.music-ir.org/mirex2005/index.php/Main_Page
  11. Chuan C-H, Chew E: Audio key finding using FACEG: fuzzy analysis with the CEG algorithm. Abstract of the 1st Annual Music Information Retrieval Evaluation eXchange (MIREX '05), September 2005, London, UK.
  12. Gómez E: Key estimation from polyphonic audio. Abstract of the 1st Annual Music Information Retrieval Evaluation eXchange (MIREX '05), September 2005, London, UK.
  13. İzmirli Ö: An algorithm for audio key finding. Abstract of the 1st Annual Music Information Retrieval Evaluation eXchange (MIREX '05), September 2005, London, UK.
  14. Pauws S: KEYEX: audio key extraction. Abstract of the 1st Annual Music Information Retrieval Evaluation eXchange (MIREX '05), September 2005, London, UK.
  15. Purwins H, Blankertz B: Key finding in audio. Abstract of the 1st Annual Music Information Retrieval Evaluation eXchange (MIREX '05), September 2005, London, UK.
  16. Zhu Y: An audio key finding algorithm. Abstract of the 1st Annual Music Information Retrieval Evaluation eXchange (MIREX '05), September 2005, London, UK.
  17. Chew E, François ARJ: Interactive multi-scale visualizations of tonal evolution in MuSA.RT Opus 2. Computers in Entertainment 2005,3(4):1-16. Special issue on Music Visualization.
  18. Chew E, Chen Y-C: Mapping MIDI to the spiral array: disambiguating pitch spellings. Proceedings of the 8th INFORMS Computing Society Conference (ICS '03), January 2003, Chandler, Ariz, USA 259-275.
  19. Chew E, Chen Y-C: Real-time pitch spelling using the spiral array. Computer Music Journal 2005,29(2):61-76. 10.1162/0148926054094378
  20. İzmirli Ö: Template based key finding from audio. Proceedings of the International Computer Music Conference (ICMC '05), September 2005, Barcelona, Spain.
  21. Electronic Music Studios in the University of Iowa, http://theremin.music.uiowa.edu/MIS.html
  22. Klapuri AP: Multiple fundamental frequency estimation based on harmonicity and spectral smoothness. IEEE Transactions on Speech and Audio Processing 2003,11(6):804-816. 10.1109/TSA.2003.815516
  23. Klapuri A: A perceptually motivated multiple-F0 estimation method. Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October 2005, New Paltz, NY, USA.

Copyright

© C.-H. Chuan and E. Chew. 2007
