- Research Article
- Open access
Towards Structural Analysis of Audio Recordings in the Presence of Musical Variations
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 089686 (2006)
Abstract
One major goal of structural analysis of an audio recording is to automatically extract the repetitive structure or, more generally, the musical form of the underlying piece of music. Recent approaches to this problem work well for music in which the repetitions largely agree with respect to instrumentation and tempo, as is typically the case for popular music. For other classes of music such as Western classical music, however, musically similar audio segments may exhibit significant variations in parameters such as dynamics, timbre, execution of note groups, modulation, articulation, and tempo progression. In this paper, we propose a robust and efficient algorithm for audio structure analysis that identifies musically similar segments even in the presence of large variations in these parameters. To account for such variations, our main idea is to incorporate invariance at various levels simultaneously: we design a new type of statistical feature to absorb microvariations, introduce an enhanced local distance measure to account for local variations, and describe a new strategy for structure extraction that can cope with global variations. Our experimental results with classical and popular music show that our algorithm performs successfully even in the presence of significant musical variations.
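To make the idea of absorbing microvariations at the feature level more concrete, the following is a minimal sketch and not the algorithm proposed in the article. It assumes a generic sequence of short-time feature vectors (for example, chroma-like energy distributions), averages them over a window, normalizes the result, and compares all pairs with a cosine distance. The function names `smooth_and_downsample` and `self_distance_matrix`, the window and hop sizes, and the toy data are all hypothetical choices made for illustration.

```python
import math

def smooth_and_downsample(frames, window, hop):
    """Average each run of `window` frames, advancing by `hop`, then L2-normalize.

    Averaging over a window is one simple (assumed) way to make features
    less sensitive to small frame-level differences.
    """
    dim = len(frames[0])
    out = []
    for start in range(0, len(frames) - window + 1, hop):
        block = frames[start:start + window]
        avg = [sum(f[k] for f in block) / window for k in range(dim)]
        norm = math.sqrt(sum(v * v for v in avg))
        out.append([v / norm for v in avg] if norm > 0 else avg)
    return out

def self_distance_matrix(feats):
    """Cosine distance 1 - <x, y> between all pairs of unit-norm vectors."""
    return [[1.0 - sum(a * b for a, b in zip(x, y)) for y in feats]
            for x in feats]

# Toy sequence: section A, section B, then A again with small perturbations.
A = [[1.0, 0.0, 0.0]] * 4
B = [[0.0, 1.0, 0.0]] * 4
A_var = [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1], [0.8, 0.0, 0.2], [1.0, 0.1, 0.0]]

feats = smooth_and_downsample(A + B + A_var, window=4, hop=4)
D = self_distance_matrix(feats)

for row in D:
    print(" ".join(f"{d:.3f}" for d in row))
```

In this toy example the first and third smoothed vectors end up close to each other despite the frame-level perturbations, while both remain far from the second. A low-distance entry away from the main diagonal is the kind of cue a structure extraction step would then look for.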
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Müller, M., Kurth, F. Towards Structural Analysis of Audio Recordings in the Presence of Musical Variations. EURASIP J. Adv. Signal Process. 2007, 089686 (2006). https://doi.org/10.1155/2007/89686
DOI: https://doi.org/10.1155/2007/89686