2015 | 5 | 145--153
Article title

Spatial Information in Classification of Activity Videos

Spatial information describes the relative spatial position of an object in a video. Such information may aid several video analysis tasks, such as object, scene, event and activity recognition. This paper studies the effect of spatial information on video activity recognition. The paper first performs activity recognition on the KTH and Weizmann videos using Hidden Markov Model and k-Nearest Neighbour classifiers trained on the Histogram of Oriented Optical Flows feature. This feature is based on optical flow vectors and ignores any spatial information present in a video. The paper then proposes a new feature set, referred to as Regional Motion Vectors. Like Histogram of Oriented Optical Flows, this feature is derived from optical flow vectors; unlike it, however, Regional Motion Vectors preserves the spatial information in a video. Activity recognition was performed again using the two classifiers, this time trained on the Regional Motion Vectors feature. Results show that when Regional Motion Vectors is used as the feature set on the KTH dataset, the performance of k-Nearest Neighbour improves significantly. When Regional Motion Vectors is used on the Weizmann dataset, the performance of k-Nearest Neighbour improves significantly in some cases and, in the other cases, is comparable to that obtained with Histogram of Oriented Optical Flows as the feature set. Hidden Markov Model achieves a slight improvement on both datasets. As Histogram of Oriented Optical Flows ignores spatial information and Regional Motion Vectors preserves it, the improvement in classifier performance when Regional Motion Vectors is used instead of Histogram of Oriented Optical Flows illustrates the importance of spatial information in video activity recognition. (original abstract)
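The contrast drawn in the abstract, between a feature that discards position and one that keeps it, can be illustrated with a small sketch. This is not the paper's implementation. `orientation_histogram` is a simplified stand-in for a histogram of oriented optical flow (plain magnitude-weighted direction bins), and `regional_mean_flow` is a hypothetical position-preserving feature that assumes a fixed grid with one mean flow vector per cell; the abstract does not define Regional Motion Vectors in this detail. All function names, the grid size and the toy flow fields are assumptions made for illustration.

```python
import math


def orientation_histogram(flow, bins=8):
    """Magnitude-weighted histogram of flow directions, normalised to sum to 1.

    Simplified stand-in for an orientation-histogram feature: it records
    which directions occur, not where in the frame they occur.
    """
    hist = [0.0] * bins
    for row in flow:
        for u, v in row:
            mag = math.hypot(u, v)
            if mag == 0.0:
                continue  # static pixels contribute nothing
            angle = math.atan2(v, u) % (2 * math.pi)
            idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
            hist[idx] += mag
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def regional_mean_flow(flow, grid_rows=2, grid_cols=2):
    """Mean (u, v) per grid cell, concatenated in row-major cell order.

    Hypothetical position-preserving feature (an assumption, not the
    paper's definition). Assumes the frame has at least grid_rows rows
    and grid_cols columns, so that no cell is empty.
    """
    height, width = len(flow), len(flow[0])
    feature = []
    for gr in range(grid_rows):
        r0, r1 = gr * height // grid_rows, (gr + 1) * height // grid_rows
        for gc in range(grid_cols):
            c0, c1 = gc * width // grid_cols, (gc + 1) * width // grid_cols
            cell = [flow[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            feature.append(sum(u for u, _ in cell) / len(cell))
            feature.append(sum(v for _, v in cell) / len(cell))
    return feature


def make_flow(height, width, moving_rows, vector):
    """Toy flow field: pixels in moving_rows carry `vector`, the rest are static."""
    return [
        [vector if r in moving_rows else (0.0, 0.0) for _ in range(width)]
        for r in range(height)
    ]


# Same rightward motion, once in the upper half of the frame, once in the lower.
upper = make_flow(4, 4, {0, 1}, (1.0, 0.0))
lower = make_flow(4, 4, {2, 3}, (1.0, 0.0))

same_histogram = orientation_histogram(upper) == orientation_histogram(lower)
same_regional = regional_mean_flow(upper) == regional_mean_flow(lower)
print(same_histogram, same_regional)
```

In this toy case the two frames produce identical orientation histograms but different regional features, which is the kind of distinction a distance-based classifier such as k-Nearest Neighbour can exploit only when the feature keeps spatial information.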
  • School of Computing and Mathematics, University of Ulster