
Compact Visualisation of Video Summaries


This paper presents a system for compact and intuitive video summarisation aimed at both high-end professional production environments and small-screen portable devices. To represent large amounts of information as a video key-frame summary, the paper studies the narrative grammar of comics and uses its universal and intuitive rules to lay out visual summaries in an efficient, user-centred way. In addition, the system exploits visual attention modelling and rapid serial visual presentation to generate highly compact summaries on mobile devices. A robust real-time algorithm for key-frame extraction is presented. The system sets key-frame sizes in the final layout by ranking key-frame importance, balancing dominant visual representability against the discovery of unanticipated content by means of a specific cost function and an unsupervised robust spectral clustering technique. The final layout is created by an optimisation algorithm based on dynamic programming. The efficiency and robustness of the algorithms are demonstrated by comparing the results with a manually labelled ground truth and with optimal panelling solutions.
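As a rough illustration of how a dynamic-programming layout of this kind can work, the sketch below splits a temporally ordered list of key-frame widths into consecutive rows of a fixed page width, minimising the squared unused space per row. The abstract does not give the paper's actual cost function or panel geometry, so the function name `layout_rows`, the integer width units, and the squared-slack cost are all assumptions made for this example.

```python
from typing import List, Tuple


def layout_rows(widths: List[int], row_width: int) -> Tuple[int, List[List[int]]]:
    """Split frames (kept in temporal order) into rows of at most row_width.

    Hypothetical cost: sum over rows of (unused width) ** 2.
    This is an illustrative stand-in, not the cost function of the paper.
    """
    n = len(widths)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: minimal cost of laying out the first i frames
    best[0] = 0
    split = [0] * (n + 1)    # split[i]: index where the last row of that layout starts

    for i in range(1, n + 1):
        used = 0
        # Try every feasible last row widths[j-1:i], growing it backwards.
        for j in range(i, 0, -1):
            used += widths[j - 1]
            if used > row_width:
                break
            cost = best[j - 1] + (row_width - used) ** 2
            if cost < best[i]:
                best[i] = cost
                split[i] = j - 1

    if best[n] == inf:
        raise ValueError("a frame is wider than the row")

    # Recover the rows by walking the split points backwards.
    rows: List[List[int]] = []
    i = n
    while i > 0:
        j = split[i]
        rows.append(widths[j:i])
        i = j
    rows.reverse()
    return best[n], rows


if __name__ == "__main__":
    cost, rows = layout_rows([2, 1, 1, 3, 1, 2], row_width=4)
    print(cost, rows)
```

Because every prefix of the sequence is solved once and reused, the search considers all feasible row breaks without enumerating them explicitly, which is the property that makes a dynamic-programming formulation attractive for panelling.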



Author information



Corresponding author

Correspondence to Janko Ćalić.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Ćalić, J., Campbell, N.W. Compact Visualisation of Video Summaries. EURASIP J. Adv. Signal Process. 2007, 019496 (2007).



  • Spectral Cluster
  • Rapid Serial Visual Presentation
  • Video Summarisation
  • Label Ground Truth
  • Intuitive Rule