Discovering Recurrent Image Semantics from Class Discrimination

Lim, Joo-Hwee; Jin, Jesse S.

doi:10.1155/ASP/2006/76093

Research Article
Open access
Published: 01 December 2006

Joo-Hwee Lim¹ & Jesse S. Jin²,³

EURASIP Journal on Advances in Signal Processing volume 2006, Article number: 076093 (2006)

Abstract

Supervised statistical learning has become a critical means of designing and learning visual concepts (e.g., faces, foliage, buildings) in content-based indexing systems. The drawback of this approach is the need for manual labeling of regions. Although several recently proposed automatic image annotation methods are very promising, they usually rely on the availability and analysis of associated text descriptions. In this paper, we propose a hybrid learning framework that discovers local semantic regions and generates their samples for training local detectors with minimal human intervention. A multiscale, segmentation-free framework is proposed to embed the soft presence of discovered semantic regions and local class patterns in an image independently for indexing and matching. On 2400 heterogeneous consumer images with 16 semantic queries, similarity matching based on an individual index and integrated similarity matching outperformed a feature fusion approach by 26% and 37% in average precision, respectively.

Author information

Authors and Affiliations

Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Joo-Hwee Lim
School of Design, Communication and Information Technology, Faculty of Science and Information Technology, University of Newcastle, Callaghan, NSW, 2308, Australia
Jesse S. Jin
Chair of Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg, Erlangen, 91058, Germany
Jesse S. Jin

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Lim, JH., Jin, J.S. Discovering Recurrent Image Semantics from Class Discrimination. EURASIP J. Adv. Signal Process. 2006, 076093 (2006). https://doi.org/10.1155/ASP/2006/76093

Received: 17 August 2004
Revised: 01 March 2005
Accepted: 05 April 2005
Published: 01 December 2006
DOI: https://doi.org/10.1155/ASP/2006/76093
