Multimodal Semantic Analysis and Annotation for Basketball Video

Liu, Song; Xu, Min; Yi, Haoran; Chia, Liang-Tien; Rajan, Deepu

doi:10.1155/ASP/2006/32135

Research Article
Open access
Published: 01 December 2006

EURASIP Journal on Advances in Signal Processing volume 2006, Article number: 032135 (2006)

Abstract

This paper presents a new multimodal method for extracting semantic information from basketball video. Visual, motion, and audio information is first extracted from the video to produce a low-level segmentation and classification. Domain knowledge is then exploited to detect interesting events. For video, both visual and motion prediction information is used for shot and scene boundary detection, followed by scene classification. For audio, keysounds, which are sets of specific sounds related to semantic events, are identified with a classification method based on hidden Markov models (HMMs). Subsequently, by analyzing the multimodal information together with additional domain knowledge, the positions of potential semantic events, such as "foul" and "shot at the basket," are located. Finally, a video annotation is generated according to the MPEG-7 multimedia description schemes (MDSs). Experimental results demonstrate the effectiveness of the proposed method.
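The HMM-based keysound identification mentioned in the abstract can be illustrated with a minimal sketch. It assumes one discrete HMM per keysound class and labels an observation sequence with the class whose model gives the highest likelihood under the scaled forward algorithm. The class names ("whistle", "crowd"), the two-symbol observation alphabet, and all model parameters are hypothetical placeholders chosen for illustration; the abstract does not specify the paper's features, model topology, or training procedure.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    n = len(start)
    # Initialise with the first observation symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

def classify_keysound(obs, models):
    """Pick the class whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Hypothetical models: (start probabilities, transition matrix, emission matrix).
# Observation symbols 0 and 1 stand in for quantised audio feature codes.
HYPOTHETICAL_MODELS = {
    "whistle": ([1.0, 0.0],
                [[0.7, 0.3], [0.0, 1.0]],
                [[0.9, 0.1], [0.2, 0.8]]),
    "crowd": ([0.5, 0.5],
              [[0.5, 0.5], [0.5, 0.5]],
              [[0.1, 0.9], [0.1, 0.9]]),
}

if __name__ == "__main__":
    print(classify_keysound([0, 0, 0, 0], HYPOTHETICAL_MODELS))
    print(classify_keysound([1, 1, 1, 1], HYPOTHETICAL_MODELS))
```

In a real system the observation symbols would come from audio features extracted per frame, and the model parameters would be trained on labelled keysound examples rather than set by hand.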

Author information

Authors and Affiliations

School of Computer Engineering, Nanyang Technological University, Block N4, 02A-32, Nanyang Avenue, Singapore, 639798
Song Liu, Min Xu, Haoran Yi, Liang-Tien Chia & Deepu Rajan

Corresponding author

Correspondence to Song Liu.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Liu, S., Xu, M., Yi, H. et al. Multimodal Semantic Analysis and Annotation for Basketball Video. EURASIP J. Adv. Signal Process. 2006, 032135 (2006). https://doi.org/10.1155/ASP/2006/32135

Received: 01 September 2004
Revised: 17 February 2005
Accepted: 14 March 2005
Published: 01 December 2006
DOI: https://doi.org/10.1155/ASP/2006/32135
