
Object tracking system using a VSW algorithm based on color and point features

Abstract

An object tracking system using a variable search window (VSW) algorithm based on color and feature points is proposed. The meanshift algorithm is an object tracking technique that works on color probability distributions. An advantage of this color-based algorithm is that it is robust for objects with a distinctive color; a disadvantage is that it is sensitive to objects without a distinctive color, owing to illumination changes and noise. To offset this weakness, this paper presents a VSW algorithm based on robust feature points for the accurate tracking of moving objects. The proposed method extracts the feature points of a detected object, which forms the region of interest (ROI), and generates a VSW from the positions of the extracted feature points. The goal of this paper is an efficient and effective object tracking system that tracks moving objects accurately. Experiments show that the implemented object tracking system performs more precisely than existing techniques.

1. Introduction

Object tracking means tracing the progress of objects as they move over a sequence of images. Visual object tracking in complex environments is an important topic in the intelligent surveillance field. A good tracking algorithm should work well in many real circumstances, such as background clutter, occlusions, and varying illumination [1, 2].

Object tracking methods can be divided into three groups according to the feature values of the object that is taken as foreground: color-based, boundary-based, and model-based methods. Color-based methods track the color probability distribution of the object; typical examples are the meanshift and continuously adaptive meanshift (Camshift) algorithms [3, 4]. Their simple implementation allows fast calculation, so color-based methods are widely used in object tracking: color is easy to extract and robust against partial occlusion. However, they are vulnerable to sudden illumination changes and to backgrounds with similar colors. Boundary-based methods use the contour information of the object, as in the condensation algorithm [5]. They are suitable for tracking rigid objects that seldom change their boundaries, such as the heads of people, but their complicated calculations make real-time processing difficult. Model-based methods, also known as motion templates, track the object after learning templates in advance [6]. Consequently, these methods are often combined to achieve more robust tracking results, such as algorithms combining the color-based and boundary-based methods [7, 8].

The tracking algorithm proposed in this article employs multiple point-based and color-based features, i.e., it is an effective improvement of meanshift with the scale invariant feature transform (SIFT) algorithm. The method extracts the feature points of a detected object and generates a variable search window (VSW) from the positions of the extracted feature points. This approach can solve the problem of similar color distributions and improve the performance of object tracking. The main contributions of this article are as follows: (1) an improvement of meanshift with the SIFT algorithm is proposed for object tracking, and (2) the performance of the proposed tracking algorithm is experimentally demonstrated against existing algorithms.

The rest of this article is organized as follows. In Section 2, Gaussian mixture modeling (GMM) and post-processing of detected objects are introduced. Typical tracking methods and the proposed tracking algorithm are described in Section 3. The experimental results and the tracking performances are presented in Section 4. Finally, conclusions are given in Section 5.

2. Object detection

Most methods for object detection are based on per-pixel background models [9–12]. A pixel-based method does not consider the global context of the frame, so shadows and noise must be handled afterwards. A flowchart of the object detection method is shown in Figure 1.

Figure 1
figure1

Flowchart for the moving object detection based on background modeling.

2.1. Gaussian mixture model

A GMM is a parametric probability density function represented as a weighted sum of Gaussian component densities [13, 14]. The method was suggested by Stauffer et al.; it models each pixel as a mixture of Gaussian distributions and uses an online approximation to update the model. The model assumes that each pixel in the frame is modeled by a mixture of K Gaussian distributions, where different Gaussians represent different colors. The probability of observing the current pixel value is

$$P(X_t) = \sum_{i=1}^{K} \omega_{i,t}\, \eta(X_t, \mu_{i,t}, \Sigma_{i,t}),$$
(1)

where K is the number of distributions, ω_{i,t} is an estimate of the weight (the portion of the data accounted for by this Gaussian) of the i-th Gaussian in the mixture at time t, μ_{i,t} is the mean of the i-th Gaussian at time t, Σ_{i,t} is the covariance matrix of the i-th Gaussian at time t, X_t is a random variable vector, and η(X_t, μ_{i,t}, Σ_{i,t}) is a Gaussian probability density function.

A pixel value X t that matches the Gaussian distribution can be defined as

$$\begin{aligned}
\text{matching:}\quad & |X_t - \mu_{i,t}| < \lambda\,\sigma_{i,t}\\
\text{unmatching:}\quad & |X_t - \mu_{i,t}| \ge \lambda\,\sigma_{i,t}
\end{aligned} \qquad i = 1, 2, \ldots, K,$$
(2)

where λ = 2.5 and σ_{i,t} is the standard deviation. A match is thus defined as a pixel value within 2.5 standard deviations of a distribution.

The prior weights of the K distributions at time t are adjusted as follows:

$$\omega_{i,t} = (1 - \alpha)\,\omega_{i,t-1} + \alpha\, M_{i,t},$$
(3)

where M_{i,t} is 1 for the matched model and 0 for the remaining models, and α is the learning rate.

The mean and the variance parameters for unmatched distributions remain the same. The parameters of the distribution which matches the new observation are updated as follows:

$$\mu_{i,t} = (1 - \rho)\,\mu_{i,t-1} + \rho\, X_t,$$
(4)
$$\sigma_{i,t}^2 = (1 - \rho)\,\sigma_{i,t-1}^2 + \rho\,(X_t - \mu_{i,t})^T (X_t - \mu_{i,t}),$$
(5)

where $\rho = \alpha\,\eta(X_t \mid \mu_i, \sigma_i)$.
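The per-pixel update in Equations (2)-(5) can be sketched as follows for a single grayscale pixel. This is an illustrative simplification, not the paper's implementation: the function name, the scalar (non-vector) pixel model, and the choice ρ = α are our assumptions.

```python
import numpy as np

def update_gmm_pixel(x, weights, means, variances, alpha=0.01, lam=2.5):
    """One Stauffer-Grimson style update for a single grayscale pixel.

    Sketch of Equations (2)-(5): find the first Gaussian within lam
    standard deviations of x, then update the weights, the matched mean,
    and the matched variance. Names and defaults are illustrative.
    """
    K = len(weights)
    matched = None
    for i in range(K):
        if abs(x - means[i]) < lam * np.sqrt(variances[i]):   # Eq. (2)
            matched = i
            break

    for i in range(K):
        M = 1.0 if i == matched else 0.0
        weights[i] = (1 - alpha) * weights[i] + alpha * M      # Eq. (3)

    if matched is not None:
        i = matched
        # rho = alpha * eta(x | mu_i, sigma_i); using rho = alpha is a
        # common simplification, assumed here for brevity.
        rho = alpha
        means[i] = (1 - rho) * means[i] + rho * x              # Eq. (4)
        variances[i] = (1 - rho) * variances[i] + rho * (x - means[i]) ** 2  # Eq. (5)
    else:
        # No match: replace the least probable distribution with a new one
        # centered at x, with high variance and low prior weight.
        i = int(np.argmin(weights))
        means[i], variances[i], weights[i] = x, 900.0, 0.05

    weights /= weights.sum()   # renormalize the prior weights
    return weights, means, variances
```

Running the update with a pixel close to the first component increases that component's weight and pulls its mean toward the observation.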

If X_t does not match any of the Gaussian distributions, the least probable distribution is replaced with a new distribution that has the current value as its mean, an initially high variance, and a low prior weight.

After the updates, all the components in the mixture are ordered by the value of ω/σ. Then the first B distributions whose cumulative weight exceeds a threshold Tbg are retained as the background distribution, where B is defined as

$$B = \arg\min_b \left( \sum_{k=1}^{b} \omega_k > T_{\mathrm{bg}} \right),$$
(6)

where Tbg is a measure of the minimum portion of the data that should be accounted for by the background.
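The background selection of Equation (6) amounts to sorting the components by ω/σ and taking the smallest prefix whose cumulative weight exceeds Tbg. A minimal sketch, with helper names of our choosing:

```python
import numpy as np

def background_components(weights, sigmas, T_bg=0.7):
    """Equation (6): sort components by omega/sigma (descending) and
    return the indices of the first B components whose cumulative
    weight exceeds T_bg. Names and the default T_bg are illustrative."""
    order = np.argsort(-(weights / sigmas))          # descending omega/sigma
    cum = np.cumsum(weights[order])
    B = int(np.searchsorted(cum, T_bg, side="right") + 1)  # smallest b with sum > T_bg
    return order[:B]
```

For weights (0.5, 0.3, 0.2) with equal sigmas and Tbg = 0.7, the first two components form the background model.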

2.2. Post-processing

The GMM is used for segmentation, i.e., extraction of the objects and the background area. However, detected objects can contain noise such as shadows and illumination artifacts, so the shadows need to be removed, together with morphological filtering. A deterministic non-model-based approach among shadow removal techniques is used because it suits general surroundings. It is based on the fact that a pixel can be considered a shadow if it has similar chromaticity but lower brightness than the corresponding pixel in the background image. Equation (7) gives the decision on whether a certain pixel is part of a shadow [15, 16]:

$$\text{Shadow} = \begin{cases} 1, & \text{if } BR_{\text{img}} < BR_{\text{bg}} \text{ and } |CH_{\text{img}} - CH_{\text{bg}}| \le T\\ 0, & \text{otherwise,} \end{cases}$$
(7)

where BRimg is the brightness of the input image, BRbg the brightness of the background image, CHimg the chromaticity of the input image, CHbg the chromaticity of the background image, and T a threshold value (= 0.5).
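Per pixel, the rule of Equation (7) is a two-part test, darker than the background but with nearly the same chromaticity. A minimal sketch (reading the "± T" band as an absolute-difference tolerance; names are ours):

```python
def is_shadow(br_img, br_bg, ch_img, ch_bg, T=0.5):
    """Equation (7): label a pixel as shadow when it is darker than the
    background (br_img < br_bg) but its chromaticity stays within the
    tolerance band ch_bg +/- T. A per-pixel illustrative sketch."""
    return br_img < br_bg and abs(ch_img - ch_bg) <= T
```

A darker pixel with matching chromaticity is classified as shadow; a brighter pixel never is.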

Figure 2 compares the resulting images with and without shadow removal. Shadow removal considerably reduces the noise, so the detected object becomes clearer.

Figure 2
figure2

A comparison of the resulting images of shadow removal and non-removal.

3. Object tracking

3.1. Meanshift algorithm

The meanshift algorithm, which iteratively shifts a data point to the average of the data points in its neighborhood, is a robust statistical method. It finds local maxima in any probability distribution and is used for tasks such as clustering, mode seeking, probability density estimation, and tracking [17, 18]. Table 1 shows how the maximum of a probability distribution is found using the meanshift algorithm [19].

Table 1 Meanshift algorithm for finding the maximization of probability distribution
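The iteration in Table 1 can be sketched over a 2-D probability map (e.g., a hue back-projection): the window center moves to the centroid of the probability mass inside the window until the shift is negligible. This is a generic sketch, not the paper's exact implementation; all names and defaults are ours.

```python
import numpy as np

def meanshift(prob, window, max_iter=20, eps=1.0):
    """Meanshift over a 2-D probability map.
    window = (x, y, w, h); each iteration moves the window center to
    the zeroth/first-moment centroid of the window contents, stopping
    when the shift falls below eps or max_iter is reached."""
    x, y, w, h = window
    H, W = prob.shape
    for _ in range(max_iter):
        roi = prob[y:y+h, x:x+w]
        m00 = roi.sum()                      # zeroth moment (total mass)
        if m00 == 0:
            break
        ys, xs = np.mgrid[0:roi.shape[0], 0:roi.shape[1]]
        cx = (xs * roi).sum() / m00          # centroid inside the window
        cy = (ys * roi).sum() / m00
        nx = int(round(x + cx - w / 2))
        ny = int(round(y + cy - h / 2))
        nx = min(max(nx, 0), W - w)          # keep the window inside the image
        ny = min(max(ny, 0), H - h)
        if abs(nx - x) < eps and abs(ny - y) < eps:
            break                            # converged to a local maximum
        x, y = nx, ny
    return (x, y, w, h)
```

Started on a window overlapping a bright blob, the window drifts until it is centered on the blob, which is exactly the local-maximum-seeking behavior described above.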

3.2. SIFT algorithm

The SIFT algorithm was introduced by David G. Lowe, a professor at the University of British Columbia (UBC). It is used in various applications, such as feature extraction and matching. Figure 3 shows the steps of the SIFT algorithm, which is broadly divided into a detector and a descriptor and generally has four steps [20, 21]. In this article, we use the feature points (keypoints) detected by the SIFT algorithm, i.e., the proposed method implements the SIFT pipeline only up to the keypoint extraction step.

Figure 3
figure3

Steps of the SIFT algorithm.

The first stage of computation searches over all scales and image locations. It is implemented efficiently with a difference-of-Gaussian (DOG) image to identify potential interest points that are invariant to scale and orientation. The scale space of an image is defined as follows:

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y),$$
(8)

where I(x,y) is an input image, G(x,y,σ) is a variable-scale Gaussian, and * is the convolution operation.

Stable keypoint locations in scale space can be computed from the DOG of two scales separated by a constant multiplicative factor k:

$$D(x, y, \sigma) = \bigl(G(x, y, k\sigma) - G(x, y, \sigma)\bigr) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma).$$
(9)

Figure 4 shows the Gaussian and DOG pyramids of the region of interest (ROI) in Video 1. For each octave of scale space, the initial image is repeatedly convolved with Gaussians to produce the set of scale-space images shown in Figure 4a. Adjacent Gaussian images are subtracted to produce the DOG images in Figure 4b. After each octave, the Gaussian image is down-sampled by a factor of 2, and the process is repeated. In Figure 4b, values greater than 0 in the DOG images are displayed as 255 so that the differences are visible; the real values of the DOG images are 0 or small.
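One octave of the construction above (Equations 8-9) can be sketched as repeated Gaussian blurring followed by subtraction of adjacent levels. The function name, parameter values (σ = 1.6, k = √2, four scales), and the use of `scipy.ndimage.gaussian_filter` are our assumptions, not the paper's.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_octave(img, sigma=1.6, k=2 ** 0.5, n_scales=4):
    """One octave of Gaussian and DOG images (Equations 8-9):
    blur the input at sigma, k*sigma, k^2*sigma, ... and subtract
    adjacent Gaussian levels to form the DOG images."""
    gauss = [gaussian_filter(img.astype(float), sigma * k ** i)
             for i in range(n_scales)]
    dog = [g2 - g1 for g1, g2 in zip(gauss, gauss[1:])]   # L(k*sigma) - L(sigma)
    return gauss, dog
```

For the next octave, the last Gaussian image would be down-sampled by a factor of 2 and the procedure repeated, as described above.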

Figure 4
figure4

Gaussian and DOG pyramids of ROI in Video 1 (160 frame). (a) Gaussian pyramid. (b) DOG pyramid.

To detect the local maxima and minima of D(x, y, σ), each sample point is compared to its eight neighbors in the current image and its nine neighbors in the scale above and below. At each candidate location, a detailed model is fit to determine the location and scale, and keypoints are selected based on measures of their stability. If this value is below a threshold, signifying that the structure has low contrast (and is thus sensitive to noise), the keypoint is removed. To reject poorly defined peaks in the scale-normalized Laplacian-of-Gaussian operator, the ratio of the principal curvatures of each candidate keypoint is evaluated; the keypoint is retained only if the ratio is below a threshold. Figure 5 shows the extraction of features using the SIFT algorithm in Video 1: Figure 5a has 184 features in the whole region of the frame, and Figure 5b has 32 features in the ROI. We therefore reduce the processing time by extracting feature points in the ROI only.
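The 26-neighbor extremum test described above can be sketched as follows; the function name and the strictness convention (the value must be the unique maximum or minimum of the 3×3×3 cube) are our choices.

```python
import numpy as np

def is_scale_space_extremum(dog, s, y, x):
    """Check whether the sample at (scale s, row y, col x) is larger
    (or smaller) than all 26 neighbors: its 8 neighbors in its own DOG
    image plus the 9 neighbors in the scales above and below.
    `dog` is a list of same-sized 2-D DOG arrays; 1 <= s <= len(dog)-2."""
    val = dog[s][y, x]
    cube = np.stack([d[y-1:y+2, x-1:x+2] for d in dog[s-1:s+2]])  # 3x3x3
    is_max = val == cube.max() and (cube == val).sum() == 1
    is_min = val == cube.min() and (cube == val).sum() == 1
    return is_max or is_min
```

A candidate that passes this test would then go through the contrast and principal-curvature filters described above.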

Figure 5
figure5

Extraction of feature points using the SIFT algorithm in Video 1 (140 frame). (a) 184 features in the whole region. (b) 32 features in the ROI.

3.3. VSW algorithm

The proposed tracking system has three steps. The first step is background modeling using the GMM. The second step is post-processing of the detected objects for noise removal. The last step is tracking of the moving object using the VSW algorithm, which finds the most accurate object location through a new search window. During meanshift tracking, a color histogram can easily be computed; however, meanshift does not update the size of the search window and easily converges to a local maximum. Camshift tracking can update the search window, but it is sensitive to objects without a distinctive color owing to illumination and noise, and it may include similarly colored background areas that distract the tracking process. Therefore, to overcome this weakness, we present the VSW algorithm, which generates a variable search window from robust feature points for accurate tracking of moving objects.

A flowchart of the whole system is shown in Figure 6. The object detected by background modeling is set as the ROI. The next step splits off the hue channel in this region and calculates a histogram of the hue values during the first frame of object detection. In the following frames, the location maximizing the probability distribution is found using meanshift together with the new search window location. The search window is enlarged by one pixel on each side of the rectangle, so that feature points on the object boundary are not lost. Gaussian pyramid images are created for this region, and DOG pyramid images are created from the Gaussian pyramid images. Candidate keypoints are found as maxima or minima in the DOG pyramid images. After filtering the keypoints, robust feature points are extracted by SIFT in the ROI only. The outermost feature point on each side of the rectangle is then found; these four feature points generate a VSW. Finally, we track moving objects with the previously calculated histogram and create a new variable window. In Figure 6, a blue dotted rectangle represents meanshift, a red dotted rectangle indicates SIFT, and an orange dotted rectangle denotes the VSW algorithm.

Figure 6
figure6

Flowchart for the whole system.

Table 2 shows the steps of the proposed method. The condition in step 5 determines whether object tracking stops: if the condition is true, the proposed algorithm stops tracking; if it is false, tracking continues.

Table 2 The steps of the proposed algorithm

VSW algorithm: The proposed VSW algorithm

Track_window = track_object_rect + 1 (per the each side of rectangle)

Img = Track_window(ROI)

1. Feature extraction : SIFT

2. Search window change with feature points

For all i such that 0 ≤ i < n do (n is the number of feature points)

If i = = 0 then (set min(x, y) with first feature point)

Min_x = Feature_x i , Min_y = Feature_y i

Else if i = = 1 then (set max(x, y) with second feature point)

Max_x = Feature_x i , Max_y = Feature_y i

If Min_x > Max_x then Swap Min_x with Max_x

End if

If Min_y > Max_y then Swap Min_y with Max_y

End if

Else (set min(x, y) and max(x, y) with over third feature point)

If Min_x > Feature_x i then Min_x = Feature_x i

End if

If Min_y > Feature_y i then Min_y = Feature_y i

End if

If Max_x < Feature_x i then Max_x = Feature_x i

End if

If Max_y < Feature_y i then Max_y = Feature_y i

End if

End if

End for

Track_object_rect = {(Min_x, Min_y), (Max_x, Max_y)}

The listing above shows the proposed VSW algorithm. Track_object_rect indicates the detected object location and track_window indicates the search window. So that feature points on the object boundary are not lost, the search window is enlarged by one pixel on each side of the rectangle. The SIFT algorithm is used to extract the feature points, and n is their number. The outermost feature point on each side of the rectangle is found to change the search window; the four extremes are stored in Min_x, Min_y, Max_x, and Max_y.
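The loop in the pseudocode computes the axis-aligned bounding box of the feature points. A compact Python rendering (the function name and tuple layout are our choices):

```python
def variable_search_window(points):
    """Python rendering of the VSW pseudocode: the new search window is
    the bounding box of the SIFT feature points, i.e., the outermost
    point on each side of the rectangle.
    `points` is a non-empty list of (x, y) tuples; returns
    ((Min_x, Min_y), (Max_x, Max_y))."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys)), (max(xs), max(ys))
```

For feature points (3, 7), (10, 2), (5, 5), the new window spans from (3, 2) to (10, 7).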

Figure 7 shows an example of the generation of a VSW. The left image in Figure 7 is the region enlarged by one pixel on each side of the rectangle of the detected object. The blue rectangle at the edge of the image is the search window; the existing meanshift algorithm uses this fixed search window. In this article, to compensate for this weakness, we extract feature points with the SIFT algorithm in the ROI only. In Figure 7, the right image shows the extracted feature points, marked with '+' signs. The outermost feature points among all feature points generate the red rectangle, which is the new search window. The search window thus changes from the blue rectangle to the red rectangle. Generating a VSW in each frame increases the accuracy of the object tracking.

Figure 7
figure7

An example of a generation of a VSW.

Figure 8 shows the resulting image of the generation of the VSW during the tracking of a moving object in an experiment. The left image is the resulting object tracking frame using the proposed method; the right image is an enlarged section of the ROI in the left image. Feature points are extracted by the SIFT algorithm in the region of object detection, which is the region within the dark blue dotted line. The red '+' signs denote the feature points. The outermost feature points generate a new search window, denoted here by the red solid rectangle, i.e., we can generate a VSW in each frame for more accurate object tracking.

Figure 8
figure8

The resulting image of the generation of the VSW while tracking a moving object.

Figure 9 shows the process of generating the VSW from an input frame. With the proposed method, the region of the detected object is more precise because the background region is removed.

Figure 9
figure9

The process of generating the VSW from an input frame.

4. Experimental results

The proposed algorithm is implemented in Microsoft Visual C++ and run on a PC with a 2.0 GHz Intel Core 2 processor and 2 GB of memory. Table 3 shows detailed information on each video sequence. In the experiments, four video sequences are used. In particular, the Intelligent room [22] and Pets 2006 [23] videos are mainly used to evaluate the performance of the object tracking system; the others were captured by the authors for the experiment.

Table 3 Detailed information of each video sequence (M, multiple objects; S, single object)

To compute the tracking error, we create ground-truth images, i.e., images of the actual object region, using Photoshop CS4. Figure 10 shows an example of the processing of the ground-truth images. We set a standard search window when creating the ground-truth image; in Figure 10, the center point is denoted by a black '+' sign and the search window by the white rectangle. We define a distance-error standard with the ground-truth images, marked visually every five frames after the initial detection of the object, so that the objects detected by the different algorithms can be compared against it.

Figure 10
figure10

An example of the processing of the ground-truth image.

Figure 11 shows a comparison of the resulting images of single-object tracking using each algorithm in Video 1 and Intelligent room. In Figure 11, a red rectangle denotes the proposed method, a green rectangle the meanshift algorithm, a blue rectangle the Camshift algorithm, and a yellow rectangle a meanshift + optical flow algorithm [24]. The proposed method tracks the object region more accurately than the other algorithms; moreover, its search window adapts well to the size of the detected object. The experimental results show that the color-based meanshift and Camshift algorithms miss the object because of illumination noise, whereas the red rectangle of the proposed method resizes itself with the object: when the object grows, the small red rectangle becomes a large one.

Figure 11
figure11

A comparison of the resulting images of single-object tracking using each algorithm. (a) Video 1 (frames 80, 120, 160 and 180). (b) Intelligent room (frames 97, 156, 267, and 293).

Figure 12 shows the error comparison of the search window region for each algorithm in Video 1 and Intelligent room. To estimate the accuracy of the detected object for each algorithm, we measure the error of the region as follows:

Figure 12
figure12

Error comparison of the search window's region using each algorithm. (a) Video 1. (b) Intelligent room.

ER = MGR + FSR,
(10)

where ER is the error of the region, MGR is the missed region within the ground-truth region, and FSR is the false region within the search window region.

The criterion for ER is based on the region of the detected object (the true region) in the ground-truth images, as in Figure 10. A high ER value means that the probability of false object tracking is high. In Figure 12, the blue dotted line has the most errors and the red dotted line the fewest; in both Video 1 and Intelligent room, the region error grows as the sequences proceed.
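On binary masks, Equation (10) can be computed as the pixels of the ground-truth region missed by the search window (MGR) plus the pixels of the search window falling outside the ground truth (FSR). The mask-based formulation and names below are our reading of the metric, not the paper's code.

```python
import numpy as np

def region_error(ground_truth, search_window):
    """Equation (10): ER = MGR + FSR on binary masks.
    MGR = ground-truth pixels not covered by the search window,
    FSR = search-window pixels outside the ground truth."""
    gt = ground_truth.astype(bool)
    sw = search_window.astype(bool)
    mgr = np.logical_and(gt, ~sw).sum()   # missed ground-truth pixels
    fsr = np.logical_and(sw, ~gt).sum()   # false search-window pixels
    return int(mgr + fsr)
```

Two partially overlapping 4×4 boxes, for example, miss 7 pixels each way, giving ER = 14; a perfectly aligned window gives ER = 0.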

Figures 13 and 14 show the comparison of the resulting images of multi-object tracking in Video 2 and Pets 2006 using each algorithm. With the color-based meanshift algorithm, several objects are missed in frame 661 of Video 2 and in frames 1002, 1040, 1136, and 1175 of Pets 2006. With the color-based Camshift algorithm, nearby objects are recognized not as multiple objects but as a single object in frame 619 of Video 2 and in frame 1040 of Pets 2006, and some objects are missed in frames 1002 and 1175 of Pets 2006. Tracking based on color and feature points, such as meanshift + optical flow and the proposed method, tracks the objects well; however, the tracking accuracy of the proposed method is higher than that of the meanshift + optical flow method, i.e., the proposed method matches the object region of the ground-truth image most closely.

Figure 13
figure13

A comparison of the resulting images of multi-object tracking in Video 2 using each algorithm (frames 540, 586, 619, and 661). (a) Meanshift. (b) Camshift. (c) Meanshift + optical. (d) Proposed method.

Figure 14
figure14

A comparison of the resulting images of multi-object tracking in Pets 2006 using each algorithm (frames 1002, 1040, 1136, and 1175). (a) Meanshift. (b) Camshift. (c) Meanshift + optical. (d) Proposed method.

Figure 15 shows the tracking result images of one object among multiple objects using the proposed method in Video 2 and Pets 2006. A red line represents the center of the search window, i.e., the object's route. The experimental results show that the proposed method tracks the object well without losing it among the other objects.

Figure 15
figure15

The tracking result images of one-object among multi-object using the proposed method in Video 2 and Pets 2006. (a) Video 2 (frame 199). (b) Video 2 (frame 655). (c) Pets 2006 (frame 460).

Table 4 shows the comparison of the tracking accuracy for each algorithm. The accuracy is defined as follows:

Table 4 The comparison of the accuracy for each algorithm (%)
$$\text{accuracy}(\%) = \frac{\text{number of total frames} - \text{number of false tracking frames}}{\text{number of total frames}} \times 100.$$
(11)

In general, the accuracy of the proposed method is higher than that of the other algorithms. The average accuracy of the proposed method is 97.17%; in the experiments, it increases the tracking accuracy by about 3.99 percentage points.
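Equation (11) is a simple frame-level ratio; a one-line sketch (function name ours):

```python
def tracking_accuracy(total_frames, false_frames):
    """Equation (11): percentage of frames in which the object is
    tracked correctly."""
    return (total_frames - false_frames) / total_frames * 100.0
```

For example, 6 false tracking frames out of 200 total frames give an accuracy of 97.0%.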

Table 5 compares the average processing time of each algorithm. In the case of Intelligent room, the proposed method takes 0.03058 s to process one frame. Owing to the extraction of feature points with the SIFT algorithm, it is slower than purely color-based algorithms; however, it is faster than meanshift + optical flow and fast enough to track objects in real time. The average processing time of the proposed method is 0.03642 s, versus 0.03709 s for meanshift + optical flow, i.e., the proposed method reduces the processing time by about 0.00067 s per frame.

Table 5 The comparison of the average processing time for each algorithm (s/frame)

5. Conclusions

A VSW algorithm based on color and feature points has been proposed for accurate tracking of moving objects. When the size of the object changes, or the tracked object has a color similar to the background, the color-based meanshift and Camshift algorithms easily miss the object. This article has demonstrated that the size of the search window in the meanshift algorithm can be changed using robust feature points, solving the problems encountered when tracking an object with a fixed search window size and a color similar to the background. In general, the accuracy of the proposed method is higher than that of the other algorithms: the average accuracy is 97.17%, an improvement of about 3.99 percentage points in the experiments on various videos. Combining multiple features thus makes object tracking more robust; according to the experimental results, the proposed method performs more precisely than the other algorithms.

Abbreviations

Camshift:

continuously adaptive meanshift

DOG:

difference of Gaussian

GMM:

Gaussian mixture model

ROI:

region of interest

SIFT:

scale invariant feature transform

VSW:

variable search window.

References

1. Yilmaz A, Javed O, Shah M: Object tracking: a survey. ACM J Comput Surv 2006, 38(4):45.

2. Hu J-S, Juan C-W, Wang J-J: A spatial-color mean-shift object tracking algorithm with scale and orientation estimation. Pattern Recogn Lett 2008, 29(16):2165-2173. doi:10.1016/j.patrec.2008.08.007

3. Cheng Y: Mean shift, mode seeking, and clustering. IEEE Trans Pattern Anal Mach Intell 1995, 17(8):790-799. doi:10.1109/34.400568

4. Comaniciu D, Ramesh V, Meer P: Kernel-based object tracking. IEEE Trans Pattern Anal Mach Intell 2003, 25(5):564-577. doi:10.1109/TPAMI.2003.1195991

5. Isard M, Blake A: Condensation--conditional density propagation for visual tracking. Int J Comput Vis 1998, 29(1):5-28. doi:10.1023/A:1008078328650

6. Papageorgiou C, Oren M, Poggio T: A general framework for object detection. International Conference on Computer Vision 1998, 555-562.

7. Dixit M, Venkatesh KS: Combining edge and color features for tracking partially occluded humans. ACCV 2009 2010, 140-149. Part II, LNCS 5995.

8. Akazawa Y, Okada Y, Niijima K: Robust tracking algorithm based on color and edge distribution for real-time video based motion capture systems. IAPR Workshop on Machine Vision Applications 2002, 60-63.

9. Friedman N, Russell S: Image segmentation in video sequences: a probabilistic approach. Proc 13th Conf Uncertainty in Artificial Intelligence (UAI) 1997, 175-181.

10. KaewTrakulPong P, Bowden R: An improved adaptive background mixture model for realtime tracking with shadow detection. Proc 2nd European Workshop on Advanced Video Based Surveillance Systems (AVBS '01) 2001, 1-5.

11. Lee DS: Effective Gaussian mixture learning for video background subtraction. IEEE Trans Pattern Anal Mach Intell 2005, 27(5):827-832.

12. Pnevmatikakis A, Polymenakos L: Kalman tracking with target feedback on adaptive background learning. Machine Learning for Multimodal Interaction (MLMI) 2006, 114-122. LNCS 4299.

13. Stauffer C, Grimson WEL: Learning patterns of activity using real-time tracking. IEEE Trans Pattern Anal Mach Intell 2000, 22(8):747-757. doi:10.1109/34.868677

14. Bouwmans T, Baf FE, Vachon B: Background modeling using mixture of Gaussians for foreground detection--a survey. Recent Patents Comput Sci 2008, 1(3):219-237.

15. Prati A, Mikic I, Trivedi MM, Cucchiara R: Detecting moving shadows: algorithms and evaluation. IEEE Trans Pattern Anal Mach Intell 2003, 25(7):918-923. doi:10.1109/TPAMI.2003.1206520

16. Cucchiara R, Grana C, Piccardi M, Prati A, Sirotti S: Improving shadow suppression in moving object detection with HSV color information. Proc IEEE Intelligent Transportation Systems Conf 2001, 334-339.

17. Bradski GR: Computer vision face tracking for use in a perceptual user interface. Intel Technol J 1998, 2(2):12-21.

18. Kailath T: The divergence and Bhattacharyya distance measures in signal selection. IEEE Trans Commun Technol 1967, 15(1):52-60. doi:10.1109/TCOM.1967.1089532

19. Comaniciu D, Ramesh V, Meer P: Real-time tracking of non-rigid objects using mean shift. Proc Conf Computer Vision and Pattern Recognition 2000, 2:142-149.

20. Lowe DG: Distinctive image features from scale-invariant keypoints. Int J Comput Vis 2004, 60(2):91-110.

21. Zhou H, Yuan Y, Shi C: Object tracking using SIFT features and mean shift. Comput Vis Image Understand 2009, 113(3):345-352. doi:10.1016/j.cviu.2008.08.006

22. ATON Dataset. http://cvrr.ucsd.edu/aton/shadow/

23. Pets 2006. http://www.cvg.rdg.ac.uk/PETS2006/data.html

24. Lim HY, Kang DS: Object tracking based on MCS tracker with motion information. The 2nd International Conf Information Technology Convergence and Services (ITCS) 2010, 2:12.


Acknowledgements

This study was supported by the Dong-A University research fund.

Author information

Corresponding author

Correspondence to Dae-Seong Kang.

Additional information

Competing interests

The authors declare that they have no competing interests.


Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Lim, HY., Kang, DS. Object tracking system using a VSW algorithm based on color and point features. EURASIP J. Adv. Signal Process. 2011, 60 (2011). https://doi.org/10.1186/1687-6180-2011-60


Keywords

  • background modeling
  • meanshift
  • object tracking
  • search window
  • SIFT