From: Fast, parallel implementation of particle filtering on the GPU architecture
Reference | GPU type | GPU computes | Model | Number of SMs on GPU | Number of cores | GPU clock | Time included | Runtime data |
---|---|---|---|---|---|---|---|---|
[12] | GTX 280 | All; spreading-narrowing technique | BOT model, 25 time steps | 30 | 240 | 1.3 GHz | Sampling + weight normalization + resampling | 1,050 particles: 79.4 ms, position error 0.06245; 2,000 particles: 124.8 ms, position error 0.06226 |
[13] | GTX 280 | All; sequential resampling | A factor stochastic volatility model | 30 | 240 | 1.3 GHz | SMC algorithm: no further details | 8,192 particles: 82 ms; 16,382 particles: 144 ms; 65,536 particles: 465 ms |
[14] | 8800 GTS | All; resampling uses offline-initiated texture | Skin detection + spreading region of interest | 12 | 96 | 1.2 GHz | Object tracking time; no further details | Speed-up of 1.44 to 13.55 in fps compared to CPU. Best: 90 fps for multiple-object and 225 fps for single-object tracking. |
[15] | 8800 GTX | Weight calculation | Face tracking model | 16 | 128 | 1.35 GHz | Face tracking algorithm time; no further details | No execution time measurements for particle filter |
[16] | 9400M | All; random numbers from CPU | BOT model, 25 time steps | 2 | 16 | 450 MHz | Sampling + weighting + weight normalization + resampling | 2,048 particles: best time 168.3 ms, position error 0.078 to 0.083; 4,096 particles: best time 168.0 ms, position error 0.077 to 0.081 |
[17] | GTX 580 | All; distributed resampling | Dynamic equations to model a robotic arm | 16 | 512 | 2 GHz | Sum of kernels: random number generation + sampling + local sort + global estimate + exchange + resampling | 64,000 particles: 0.3 ms |
Proposed | GTX 550 Ti | All; all parallel | BOT model, 24 time steps | 4 | 192 | 1.8 GHz | All operations (including memory transfers, PF steps and file I/O) | 2,048 particles: 77 ms, position error 0.09; 16,384 particles: 77 ms, position error 0.07 |