
An Automated Video Object Extraction System Based on Spatiotemporal Independent Component Analysis and Multiscale Segmentation


Video content analysis is essential for the efficient and intelligent use of the vast multimedia databases available over the Internet. Object-based extraction techniques are important for content-based video processing in many applications. In this paper, a novel technique is developed to extract objects from video sequences based on spatiotemporal independent component analysis (stICA) and multiscale analysis. First, stICA is used to extract preliminary source images that contain the moving objects in a video sequence. These source images are then processed with wavelet-based multiscale image segmentation and region detection techniques to improve the accuracy of the extracted objects. An automated video object extraction system is developed based on these techniques. Preliminary results suggest that the stICA and multiscale-segmentation-based object extraction system is promising for content-based video processing applications.
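The first stage described above, decomposing a video sequence into source images, can be illustrated with a minimal sketch. It is not the paper's stICA: it substitutes plain spatial ICA (a symmetric FastICA iteration with a tanh nonlinearity) on a synthetic clip, and it omits the wavelet-based segmentation stage entirely. All function names, the toy video, and the choice of three components are assumptions made for illustration only.

```python
import numpy as np

def make_toy_video(n_frames=12, height=16, width=16, seed=0):
    """Synthetic clip (assumed data): noisy static background plus a bright moving square."""
    rng = np.random.default_rng(seed)
    background = rng.uniform(0.2, 0.4, size=(height, width))
    frames = np.empty((n_frames, height, width))
    for t in range(n_frames):
        frame = background + 0.01 * rng.standard_normal((height, width))
        col = t % (width - 4)
        frame[6:10, col:col + 4] += 1.0  # the "object" shifts one pixel per frame
        frames[t] = frame
    return frames

def whiten(X, n_components):
    """PCA-whiten X (frames x pixels): returned rows are uncorrelated with unit variance."""
    Xc = X - X.mean(axis=1, keepdims=True)  # remove each frame's mean over pixels
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_components] * np.sqrt(X.shape[1])

def sym_decorrelate(W):
    """Orthogonalize W via W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fastica_symmetric(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA on whitened data Z (components x pixels)."""
    k, n = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        Y = W @ Z
        G = np.tanh(Y)            # nonlinearity g
        G_prime = 1.0 - G ** 2    # its derivative g'
        # Fixed-point update: E[g(y) z^T] - diag(E[g'(y)]) W
        W_new = sym_decorrelate((G @ Z.T) / n - np.diag(G_prime.mean(axis=1)) @ W)
        change = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0))
        W = W_new
        if change < tol:
            break
    return W @ Z, W

frames = make_toy_video()
X = frames.reshape(frames.shape[0], -1)   # one row per frame, one column per pixel
Z = whiten(X, n_components=3)
S, W = fastica_symmetric(Z)
source_images = S.reshape(3, 16, 16)      # each component viewed as an image
```

Each row of `S` is a candidate source image. In a pipeline like the one the abstract outlines, such images would then be passed to a segmentation step to delineate the object region; here they are only reshaped for inspection.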


Author information

Correspondence to Xiao-Ping Zhang.

About this article


  • Image Segmentation
  • Video Sequence
  • Source Image
  • Processing Application
  • Video Content