

Compact Visualisation of Video Summaries



This paper presents a system for compact and intuitive video summarisation aimed at both high-end professional production environments and small-screen portable devices. To represent large amounts of information as a video key-frame summary, the paper studies the narrative grammar of comics and uses its universal and intuitive rules to lay out visual summaries in an efficient and user-centred way. In addition, the system exploits visual attention modelling and rapid serial visual presentation to generate highly compact summaries on mobile devices. A robust real-time algorithm for key-frame extraction is presented. The system sets the size of each key-frame in the final layout according to its importance, which balances dominant visual representability against the discovery of unanticipated content by means of a specific cost function and an unsupervised, robust spectral clustering technique. The final layout is created by an optimisation algorithm based on dynamic programming. The efficiency and robustness of the algorithms are demonstrated by comparing the results with a manually labelled ground truth and with optimal panelling solutions.
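To make the idea of a dynamic-programming layout concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes, purely for illustration, that each key-frame has already been assigned an integer width in grid units, that frames must stay in temporal order, and that the cost of a row is the square of its unused width. The function name `layout_rows` and the squared-waste cost are hypothetical choices.

```python
from typing import List, Tuple


def layout_rows(widths: List[int], page_width: int) -> Tuple[float, List[List[int]]]:
    """Split frame widths (kept in temporal order) into rows of at most
    page_width units, minimising the summed squared unused width per row.

    Hypothetical illustration only: the cost function is an assumption.
    """
    n = len(widths)
    inf = float("inf")
    # best[i] = minimal cost of laying out the first i frames
    best = [inf] * (n + 1)
    best[0] = 0
    # cut[i] = index of the first frame in the last row of that best layout
    cut = [0] * (n + 1)

    for i in range(1, n + 1):
        row_width = 0
        # Grow the last row backwards from frame i-1 while it still fits.
        for j in range(i - 1, -1, -1):
            row_width += widths[j]
            if row_width > page_width:
                break
            cost = best[j] + (page_width - row_width) ** 2
            if cost < best[i]:
                best[i] = cost
                cut[i] = j

    if best[n] == inf:
        raise ValueError("at least one frame is wider than the page")

    # Walk the stored cut points backwards to recover the rows.
    rows: List[List[int]] = []
    i = n
    while i > 0:
        j = cut[i]
        rows.append(widths[j:i])
        i = j
    rows.reverse()
    return best[n], rows
```

For example, `layout_rows([2, 1, 1, 3, 1], 4)` fills two rows exactly, `[2, 1, 1]` and `[3, 1]`, at zero cost. A greedy strategy would find the same answer here, but the table-based search also handles cases where leaving a gap early gives a better overall page.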



Author information

Correspondence to Janko Ćalić.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article


  • Spectral Cluster
  • Rapid Serial Visual Presentation
  • Video Summarisation
  • Label Ground Truth
  • Intuitive Rule