Considerable work has been devoted to content-based image retrieval from large image databases. Traditional approaches analyze the whole image content in terms of both low-level and semantic characteristics. In this paper, we investigate an approach based on attentional mechanisms and active vision. We describe a visual architecture that combines bottom-up and top-down approaches to identify regions of interest according to a given goal. We show that a coarse description of the search target, combined with a bottom-up saliency map, provides an efficient way to find specified targets in images. The proposed system is a first step towards the development of software agents able to search for image content in image databases.
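
As a rough illustration of the idea of combining a coarse target description with a bottom-up saliency map, the following minimal sketch may help. It is not the architecture described in this paper: the center-surround contrast measure, the single-intensity target description, the weighted-sum combination, and all function names (`bottom_up_saliency`, `top_down_match`, `attention_map`, `most_interesting`) are assumptions chosen only to keep the example small and self-contained.

```python
from typing import List, Tuple

Grid = List[List[float]]  # grayscale image, values assumed to lie in [0, 1]


def bottom_up_saliency(img: Grid) -> Grid:
    """Assumed saliency measure: |pixel - mean of its neighbours| (center-surround)."""
    h, w = len(img), len(img[0])
    sal = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neigh = [img[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))
                     if (j, i) != (y, x)]
            sal[y][x] = abs(img[y][x] - sum(neigh) / len(neigh))
    return sal


def top_down_match(img: Grid, target_intensity: float) -> Grid:
    """Assumed coarse target description: a single expected intensity."""
    return [[1.0 - abs(v - target_intensity) for v in row] for row in img]


def normalize(m: Grid) -> Grid:
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / span for v in row] for row in m]


def attention_map(img: Grid, target_intensity: float, weight: float = 0.5) -> Grid:
    """Weighted sum of the two normalized maps; weight=0 is purely bottom-up."""
    bu = normalize(bottom_up_saliency(img))
    td = normalize(top_down_match(img, target_intensity))
    return [[(1.0 - weight) * b + weight * t for b, t in zip(rb, rt)]
            for rb, rt in zip(bu, td)]


def most_interesting(m: Grid) -> Tuple[int, int]:
    """Return the (row, column) of the highest value in the map."""
    coords = [(y, x) for y in range(len(m)) for x in range(len(m[0]))]
    return max(coords, key=lambda p: m[p[0]][p[1]])


# Toy image: uniform background, a bright distractor, and a mid-grey target.
image = [[0.2] * 5 for _ in range(5)]
image[1][1] = 1.0  # distractor: highest local contrast
image[3][3] = 0.6  # target: matches the coarse description below

print(most_interesting(attention_map(image, 0.6, weight=0.0)))
print(most_interesting(attention_map(image, 0.6, weight=0.5)))
```

In this toy setting the purely bottom-up map is drawn to the high-contrast distractor, while adding the top-down term shifts the selected location to the pixel that matches the coarse target description. The weighted sum is only one possible way to combine the two maps.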