Open Access

Mixed-State Models for Nonstationary Multiobject Activities

EURASIP Journal on Advances in Signal Processing 2006, 2007:065989

Received: 13 June 2006

Accepted: 30 October 2006

Published: 27 December 2006


We present a mixed-state space approach for modeling and segmenting human activities. The discrete-valued component of the mixed state represents higher-level behavior, while the continuous-valued component models the dynamics within behavioral segments. A basis of behaviors, derived from generic properties of motion trajectories, is chosen to characterize segments of activities. A Viterbi-based algorithm to detect boundaries between segments is described. The usefulness of the proposed approach for temporal segmentation and anomaly detection is illustrated using the TSA airport tarmac surveillance dataset, the bank monitoring dataset, and the UCF database of human actions.
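As a rough illustration of how Viterbi decoding can locate boundaries between behavioral segments, the sketch below labels each step of a one-dimensional trajectory with a discrete mode and reports where the label changes. It is a generic textbook Viterbi pass, not the algorithm of the paper. The function name `viterbi_segments`, the Gaussian model of per-step increments, the mode means, and the parameters `sigma` and `stay_prob` are all assumptions made for this example.

```python
import numpy as np


def viterbi_segments(increments, mode_means, sigma=0.5, stay_prob=0.95):
    """Return the most likely mode per step and the indices where the mode changes.

    Hypothetical model: each mode k predicts a per-step increment mode_means[k],
    observed with Gaussian noise of standard deviation sigma. Requires >= 2 modes.
    """
    increments = np.asarray(increments, dtype=float)
    mode_means = np.asarray(mode_means, dtype=float)
    n_steps, n_modes = len(increments), len(mode_means)

    # Gaussian log-likelihood of every increment under every mode (constant dropped).
    log_lik = -0.5 * ((increments[:, None] - mode_means[None, :]) / sigma) ** 2

    # Sticky transitions: keep the current mode with stay_prob, else switch uniformly.
    switch_prob = (1.0 - stay_prob) / (n_modes - 1)
    log_trans = np.full((n_modes, n_modes), np.log(switch_prob))
    np.fill_diagonal(log_trans, np.log(stay_prob))

    score = np.full((n_steps, n_modes), -np.inf)
    back = np.zeros((n_steps, n_modes), dtype=int)
    score[0] = -np.log(n_modes) + log_lik[0]  # uniform prior over the first mode

    for t in range(1, n_steps):
        cand = score[t - 1][:, None] + log_trans  # cand[i, j]: move from mode i to j
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand[back[t], np.arange(n_modes)] + log_lik[t]

    # Backtrack from the best final mode to recover the full label sequence.
    path = np.zeros(n_steps, dtype=int)
    path[-1] = int(np.argmax(score[-1]))
    for t in range(n_steps - 1, 0, -1):
        path[t - 1] = back[t, path[t]]

    boundaries = [t for t in range(1, n_steps) if path[t] != path[t - 1]]
    return path, boundaries


# Synthetic trajectory: ten stationary steps followed by ten unit-velocity steps.
increments = [0.0] * 10 + [1.0] * 10
path, boundaries = viterbi_segments(increments, mode_means=[0.0, 1.0])
print(boundaries)
```

The sticky transition matrix discourages frequent switching, so the decoded labels form contiguous segments. A full mixed-state model would replace the fixed increment means with per-segment continuous dynamics.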


Information Technology; Human Action; State Model; Quantum Information; Generic Property


Authors’ Affiliations

Department of Electrical and Computer Engineering, Center for Automation Research, University of Maryland, College Park, USA




© N.P. Cuntoor and R. Chellappa 2007

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.