Mixed-State Models for Nonstationary Multiobject Activities

Abstract

We present a mixed-state space approach for modeling and segmenting human activities. The discrete-valued component of the mixed state represents higher-level behavior while the continuous state models the dynamics within behavioral segments. A basis of behaviors based on generic properties of motion trajectories is chosen to characterize segments of activities. A Viterbi-based algorithm to detect boundaries between segments is described. The usefulness of the proposed approach for temporal segmentation and anomaly detection is illustrated using the TSA airport tarmac surveillance dataset, the bank monitoring dataset, and the UCF database of human actions.
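As a rough illustration of how a Viterbi pass can yield segment boundaries, the sketch below decodes the most likely sequence of discrete behavior labels and reports the frames where the label changes. It is a standard hidden Markov model Viterbi decoder, not the mixed-state algorithm of the article. The function names `viterbi` and `segment_boundaries` are hypothetical, and the per-frame log-likelihoods `log_B` are assumed to be supplied by some continuous dynamics model fitted for each behavior.

```python
import numpy as np


def viterbi(log_pi, log_A, log_B):
    """Return the most likely discrete-state path (standard HMM Viterbi).

    log_pi: (K,) initial state log-probabilities.
    log_A:  (K, K) transition log-probabilities, log_A[i, j] = log P(j | i).
    log_B:  (T, K) per-frame log-likelihood of the observation under each
            state (assumed to be given; hypothetical input).
    """
    T, K = log_B.shape
    delta = np.empty((T, K))            # best log-score ending in each state
    psi = np.zeros((T, K), dtype=int)   # back-pointers to the previous state
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        # scores[i, j]: best score at t-1 in state i, then moving to state j
        scores = delta[t - 1][:, None] + log_A
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(K)] + log_B[t]
    # Backtrack from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path


def segment_boundaries(path):
    """Frame indices at which the decoded discrete state changes."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]
```

With two states, strongly self-persistent transitions, and likelihoods that favor state 0 for the first half of a sequence and state 1 for the second half, `segment_boundaries(viterbi(...))` returns the single frame at which the preferred state switches.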

Author information

Corresponding author

Correspondence to Naresh P Cuntoor.

Rights and permissions

Open Access: This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Cuntoor, N.P., Chellappa, R. Mixed-State Models for Nonstationary Multiobject Activities. EURASIP J. Adv. Signal Process. 2007, 065989 (2006). https://doi.org/10.1155/2007/65989

  • DOI: https://doi.org/10.1155/2007/65989
