Figure 3 From: Novel Kernel-Based Recognizers of Human Actions

The features we use for unsupervised action recognition are described in [24] and are computed for each single frame of the video (left), while also considering the preceding frame to compute optical flow. The resulting feature vector (right) consolidates data from two separate pathways computing form and flow, respectively. The former (top) computes Gabor filter responses at different orientations and scales; the latter, at different directions, scales, and velocities. In both pathways, data are max-pooled to improve shift invariance and summarized by matching against a set of templates.
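As a rough illustration of the form pathway described above (not the authors' actual feature extractor from [24]), the sketch below computes Gabor responses at a few orientations and max-pools them over spatial blocks; the kernel parameters, pool size, and function names are illustrative assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(theta, sigma=2.0, lam=4.0, size=9):
    # Real-valued Gabor kernel at orientation theta (radians);
    # sigma and lam (wavelength) are illustrative choices.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return (np.exp(-(x**2 + y**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * xr / lam))

def form_features(frame, n_orientations=4, pool=4):
    # Filter the frame at several orientations, then max-pool each
    # response over non-overlapping pool x pool blocks to gain
    # shift invariance, as the caption describes for the form pathway.
    feats = []
    for i in range(n_orientations):
        theta = i * np.pi / n_orientations
        resp = np.abs(convolve2d(frame, gabor_kernel(theta), mode='same'))
        h = resp.shape[0] // pool * pool
        w = resp.shape[1] // pool * pool
        pooled = (resp[:h, :w]
                  .reshape(h // pool, pool, w // pool, pool)
                  .max(axis=(1, 3)))
        feats.append(pooled.ravel())
    return np.concatenate(feats)

frame = np.random.rand(32, 32)  # stand-in for one video frame
vec = form_features(frame)
print(vec.shape)  # 4 orientations x (8 x 8) pooled maps -> (256,)
```

The full pipeline in the figure would additionally run an analogous flow pathway over optical-flow fields and match the pooled responses against a template set; those steps are omitted here.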