- Research Article
- Open Access
An Attention-Driven Model for Grouping Similar Images with Image Retrieval Applications
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 043450 (2006)
Recent work in the computational modeling of visual attention has demonstrated that a purely bottom-up approach to identifying salient regions within an image can be successfully applied to diverse practical problems, from target recognition to the placement of advertisements. This paper proposes applying a combination of computational models of visual attention to the image retrieval problem. We demonstrate that certain shortcomings of existing content-based image retrieval solutions can be addressed by a biologically motivated, unsupervised way of grouping together images whose salient regions of interest (ROIs) are perceptually similar, regardless of the visual contents of other (less relevant) parts of the image. We propose a model in which only the salient regions of an image are encoded as ROIs; the features of each ROI are then compared against those of previously seen ROIs, and cluster membership is assigned accordingly. Experimental results show that the proposed approach works well for several combinations of feature extraction techniques and clustering algorithms, suggesting a promising avenue for future improvements, such as the addition of a top-down component and the inclusion of a relevance feedback mechanism.
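The grouping step described above (comparing the features of each new ROI against previously seen ROIs and assigning cluster membership) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a specific feature extractor or clustering algorithm, so the sketch assumes that each ROI has already been reduced to a numeric feature vector, uses Euclidean distance, and applies a simple threshold-based incremental clustering. The function name `assign_clusters` and the `threshold` parameter are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_clusters(roi_features, threshold):
    """Assign each ROI feature vector to a cluster, one ROI at a time.

    Assumption (not from the paper): an ROI joins the nearest existing
    cluster if its distance to that cluster's centroid is at most
    `threshold`; otherwise it starts a new cluster.
    Returns (labels, centroids).
    """
    centroids = []  # running mean feature vector per cluster
    counts = []     # number of ROIs assigned to each cluster
    labels = []     # cluster index for each input ROI, in input order

    for feat in roi_features:
        # Find the closest previously seen cluster, if any.
        best, best_dist = None, float("inf")
        for idx, centroid in enumerate(centroids):
            d = euclidean(feat, centroid)
            if d < best_dist:
                best, best_dist = idx, d

        if best is not None and best_dist <= threshold:
            # Update the running mean of the matched cluster.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n
                               for c, x in zip(centroids[best], feat)]
            labels.append(best)
        else:
            # No sufficiently similar cluster: start a new one.
            centroids.append(list(feat))
            counts.append(1)
            labels.append(len(centroids) - 1)

    return labels, centroids


# Toy two-dimensional "features" standing in for real ROI descriptors.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.05, 0.0)]
labels, centroids = assign_clusters(features, threshold=1.0)
print(labels)
```

In this toy run the first, second, and fifth vectors fall into one cluster and the third and fourth into another. The outcome of a threshold-based scheme like this depends on the order in which ROIs arrive and on the chosen threshold, which is one reason a real system might prefer a batch clustering algorithm instead.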
Cite this article
Marques, O., Mayron, L.M., Borba, G.B. et al. An Attention-Driven Model for Grouping Similar Images with Image Retrieval Applications. EURASIP J. Adv. Signal Process. 2007, 043450 (2006). https://doi.org/10.1155/2007/43450
- Feature Extraction
- Visual Attention
- Image Retrieval
- Similar Image
- Relevance Feedback