Table 1 Highlighted related works

From: Fast, parallel implementation of particle filtering on the GPU architecture

| References | GPU type | GPU computes | Model (a) | Number of SMs on GPU | Number of cores | GPU clock | Time included | Runtime data |
|---|---|---|---|---|---|---|---|---|
| [12] | GTX 280 | All; spreading-narrowing technique | BOT model, 25 time steps | 30 | 240 | 1.3 GHz | Sampling + weight normalization + resampling | 1,050 particles 79.4 ms, position error 0.06245; 2,000 particles 124.8 ms, position error 0.06226 |
| [13] | GTX 280 | All; sequential resampling | A factor stochastic volatility model | 30 | 240 | 1.3 GHz | SMC algorithm: no further details | 8,192 particles 82 ms; 16,382 particles 144 ms; 65,536 particles 465 ms |
| [14] | 8800 GTS | All; resampling uses offline-initiated texture | Skin detection + spreading region of interest | 12 | 96 | 1.2 GHz | Object tracking time; no further details | 1.44 - 13.55 speed-up in fps (b) compared to CPU. Best 90 fps for multiple- and 225 fps for single-object tracking. |
| [15] | 8800 GTX | Weight calculation | Face tracking model | 16 | 128 | 1.35 GHz | Face tracking algorithm time; no further details | No execution time measurements for particle filter |
| [16] | 9400M | All; random numbers from CPU | BOT model, 25 time steps | 2 | 16 | 450 MHz | Sampling + weighting + weight normalization + resampling | For 2,048 particles: best time 168.3 ms, position 0.078 - 0.083; for 4,096 particles: best time 168.0 ms, position 0.077 - 0.081 |
| [17] | GTX 580 | All; distributed resampling | Dynamic equations to model a robotic arm | 16 | 512 | 2 GHz | Sum of kernels: random number generation + sampling + local sort + global estimate + exchange + resampling | 64,000 particles 0.3 ms |
| Proposed | GTX 550 Ti | All; all parallel | BOT model, 24 time steps | 4 | 192 | 1.8 GHz | All operations (including memory transfers, PF steps and file I/O) | 2,048 particles 77 ms with 0.09 position error; 16,384 particles 77 ms with 0.07 position error |

Summary of related works, giving for each the model used, an outline of the technique, and the GPU on which measurements were made, where reported. Direct comparison is often not feasible because these parameters differ between works. (a) As given in the references; (b) frames per second.