Open Access

Discovering Recurrent Image Semantics from Class Discrimination

  • Joo-Hwee Lim (1) and
  • Jesse S. Jin (2, 3)
EURASIP Journal on Advances in Signal Processing 2006, 2006:076093

https://doi.org/10.1155/ASP/2006/76093

Received: 17 August 2004

Accepted: 5 April 2005

Published: 26 January 2006

Abstract

Supervised statistical learning has become a critical means to design and learn visual concepts (e.g., faces, foliage, buildings) in content-based indexing systems. The drawback of this approach is the need for manual labeling of regions. While several recently proposed automatic image annotation methods are very promising, they usually rely on the availability and analysis of associated text descriptions. In this paper, we propose a hybrid learning framework to discover local semantic regions and generate their samples for training local detectors with minimal human intervention. A multiscale, segmentation-free framework is proposed to embed the soft presence of discovered semantic regions and local class patterns in an image independently for indexing and matching. Based on 2400 heterogeneous consumer images with 16 semantic queries, both similarity matching based on an individual index and integrated similarity matching have outperformed a feature fusion approach, by 26% and 37% in average precision, respectively.

Keywords

Image Annotation; Fusion Approach; Feature Fusion; Similarity Match; Visual Concept

[1–33]

Authors’ Affiliations

(1) Institute for Infocomm Research, Singapore
(2) School of Design, Communication and Information Technology, Faculty of Science and Information Technology, University of Newcastle, Callaghan, Australia
(3) Chair of Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg, Erlangen, Germany

References

  1. Hsu WH-M, Chang S-F: Generative, discriminative, and ensemble learning on multi-modal perceptual fusion toward news video story segmentation. Proceedings of IEEE International Conference on Multimedia and Expo (ICME '04), June 2004, Taipei, Taiwan 2: 1091-1094.
  2. Li B, Goh K, Chang EY: Confidence-based dynamic ensemble for image annotation and semantics discovery. Proceedings of 11th ACM International Conference on Multimedia (MM '03), November 2003, Berkeley, Calif, USA 195-206.
  3. Snoek CGM, Worring M, Hauptmann AG: Detection of TV news monologues by style analysis. Proceedings of IEEE International Conference on Multimedia and Expo (ICME '04), June 2004, Taipei, Taiwan 2: 1103-1106.
  4. Tseng BL, Lin C-Y, Naphade MR, Natsev A, Smith JR: Normalized classifier fusion for semantic visual concept detection. Proceedings of IEEE International Conference on Image Processing (ICIP '03), September 2003, Barcelona, Spain 2: 535-538.
  5. Amir A, Iyengar G, Lin C-Y, et al.: The IBM semantic concept detection framework. 2003, http://www-nlpir.nist.gov/projects/tvpubs/tv.pubs.org.html
  6. Lin C-Y, Tseng BL, Smith JR: VideoAnnEx: IBM MPEG-7 annotation tool for multimedia indexing and concept learning. Proceedings of IEEE International Conference on Multimedia and Expo (ICME '03), July 2003, Baltimore, Md, USA.
  7. Adams WH, Iyengar G, Lin C-Y, et al.: Semantic indexing of multimedia content using visual, audio, and text cues. EURASIP Journal on Applied Signal Processing 2003, 2003(2):170-185. 10.1155/S1110865703211173
  8. Wang L, Chan KL, Zhang Z: Bootstrapping SVM active learning by incorporating unlabelled images for image retrieval. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR '03), June 2003, Madison, Wis, USA 1: 629-634.
  9. Wu Y, Tian Q, Huang TS: Discriminant-EM algorithm with application to image retrieval. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), June 2000, Hilton Head Island, SC, USA 1: 222-227.
  10. Lu YL, Hu C, Zhu X, Zhang HJ, Yang Q: A unified framework for semantics and feature based relevance feedback in image retrieval systems. Proceedings of 8th ACM International Conference on Multimedia (MM '00), October–November 2000, Los Angeles, Calif, USA 31-37.
  11. Liu W, Sun Y, Zhang H: MiAlbum—a system for home photo management using the semi-automatic image annotation approach. Proceedings of 8th ACM International Conference on Multimedia (MM '00), October–November 2000, Los Angeles, Calif, USA 479-480.
  12. Benitez AB, Chang S-F: Automatic multimedia knowledge discovery, summarization and evaluation. to appear in IEEE Trans. Multimedia
  13. Benitez AB, Smith JR, Chang S-F: MediaNet: a multimedia information network for knowledge representation. Internet Multimedia Management Systems, November 2000, Boston, Mass, USA, Proceedings of SPIE 4210: 1-12.
  14. Benitez AB, Chang S-F: Image classification using multimedia knowledge networks. Proceedings of IEEE International Conference on Image Processing (ICIP '03), September 2003, Barcelona, Spain 3: 613-616.
  15. Duygulu P, Barnard K, de Freitas N, Forsyth D: Object recognition as machine translation: learning a lexicon for a fixed image vocabulary. Proceedings of 7th European Conference on Computer Vision (ECCV '02), May 2002, Copenhagen, Denmark 4: 97-112.
  16. Barnard K, Duygulu P, Forsyth D, de Freitas N, Blei DM, Jordan MI: Matching words and pictures. Journal of Machine Learning Research 2003, 3(6):1107-1135.
  17. Kutics A, Nakagawa A, Tanaka K, Yamada M, Sanbe Y, Ohtsuka S: Linking images and keywords for semantics-based image retrieval. Proceedings of IEEE International Conference on Multimedia and Expo (ICME '03), July 2003, Baltimore, Md, USA 1: 777-780.
  18. Li J, Wang JZ: Automatic linguistic indexing of pictures by a statistical modeling approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 2003, 25(9):1075-1088. 10.1109/TPAMI.2003.1227984
  19. Barnard K, Duygulu P, Guru R, Gabbur P, Forsyth D: The effects of segmentation and feature choice in a translation model of object recognition. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR '03), June 2003, Madison, Wis, USA 2: 675-682.
  20. Fergus R, Perona P, Zisserman A: Object class recognition by unsupervised scale-invariant learning. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR '03), June 2003, Madison, Wis, USA 2: 264-271.
  21. Selinger A, Nelson RC: Minimally supervised acquisition of 3D recognition models from cluttered images. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR '01), December 2001, Kauai, Hawaii, USA 1: 213-220.
  22. Weber M, Welling M, Perona P: Unsupervised learning of models for recognition. Proceedings of 6th European Conference on Computer Vision (ECCV '00), June–July 2000, Dublin, Ireland 1: 18-32.
  23. Schmid C: Constructing models for content-based image retrieval. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR '01), December 2001, Kauai, Hawaii, USA 2: 39-45.
  24. Vapnik VN: Statistical Learning Theory. John Wiley & Sons, New York, NY, USA; 1998.
  25. Bezdek JC: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York, NY, USA; 1981.
  26. Vailaya A, Figueiredo MAT, Jain AK, Zhang H-J: Image classification for content-based indexing. IEEE Transactions on Image Processing 2001, 10(1):117-130. 10.1109/83.892448
  27. Manjunath BS, Ma WY: Texture features for browsing and retrieval of image data. IEEE Transactions on Pattern Analysis and Machine Intelligence 1996, 18(8):837-842. 10.1109/34.531803
  28. Boughorbel S, Tarel J-P, Fleuret F: Non-mercer kernel for SVM object recognition. Proceedings of British Machine Vision Conference (BMVC '04), September 2004, London, UK 137-146.
  29. Joachims T: Making large-scale SVM learning practical. In Advances in Kernel Methods—Support Vector Learning. Edited by: Schölkopf B, Burges CJC, Smola A. MIT Press, Cambridge, Mass, USA; 1999:169-184.
  30. Bishop CM: Neural Networks for Pattern Recognition. Oxford University Press, Oxford, UK; 1995.
  31. Papageorgiou CP, Oren M, Poggio T: A general framework for object detection. Proceedings of IEEE 6th International Conference on Computer Vision (ICCV '98), January 1998, Bombay, India 555-562.
  32. Swain MJ, Ballard DH: Color indexing. International Journal of Computer Vision 1991, 7(1):11-32. 10.1007/BF00130487
  33. Szummer M, Picard RW: Indoor-outdoor image classification. Proceedings of IEEE International Workshop on Content-Based Access of Image and Video Databases, January 1998, Bombay, India 42-51.

Copyright

© Lim and Jin 2006