
Table 7 Impact of memory layout and loop-fusion transform

From: Covariance tracking: architecture optimizations for embedded systems

 

| | Penryn-M | Haswell-M | Cortex-A9 | Cortex-A15 |
|---|---|---|---|---|
| cpp | | | | |
| SoA | 447 | 300 | 830 | 646 |
| AoS | 207 | 178 | 836 | 520 |
| AoS + T | 126 | 66 | 503 | 238 |
| AoS + SIMD | 165 | 69 | 476 | 325 |
| AoS + T SIMD | 92 | 38 | 201 | 169 |
| speedups | | | | |
| SoA/AoS | ×2.2 | ×1.7 | ×1.0 | ×1.2 |
| AoS/AoS + T | ×1.6 | ×2.7 | ×1.7 | ×2.2 |
| SIMD/SIMD + T | ×1.8 | ×1.8 | ×2.4 | ×1.9 |
| SoA/AoS+SIMD + T | ×4.9 | ×7.9 | ×4.1 | ×3.8 |
| Execution time (ms) for 512 × 512 images | | | | |
| AoS + T SIMD | 20.1 | 6.1 | 43.9 | 26.1 |
| Estimated energy consumption (mJ) for 512 × 512 images | | | | |
| AoS + T SIMD | 201 | 91.4 | 52.7 | 44.3 |

  1. cpp and speedups, execution time, and energy consumption for Penryn-M, Haswell-M, Cortex-A9 and Cortex-A15.
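The speedup rows appear to be plain ratios of the cpp rows (baseline cpp divided by optimized cpp, rounded to one decimal). The sketch below recomputes them from the cpp values in the table as a consistency check. The names `PLATFORMS`, `CPP`, and `speedup` are illustrative and not from the source, and the reading of "SIMD/SIMD + T" as "AoS + SIMD" over "AoS + T SIMD" is an assumption based on the row labels.

```python
# cpp values copied from the table; tuple order follows the platform columns.
PLATFORMS = ("Penryn-M", "Haswell-M", "Cortex-A9", "Cortex-A15")
CPP = {
    "SoA": (447, 300, 830, 646),
    "AoS": (207, 178, 836, 520),
    "AoS + T": (126, 66, 503, 238),
    "AoS + SIMD": (165, 69, 476, 325),
    "AoS + T SIMD": (92, 38, 201, 169),
}


def speedup(baseline, optimized):
    """Per-platform ratio cpp(baseline) / cpp(optimized), rounded to one decimal."""
    return tuple(round(b / o, 1) for b, o in zip(CPP[baseline], CPP[optimized]))


# Row label in the table -> (baseline version, optimized version).
# The mapping of the "SIMD/SIMD + T" label is an assumption (see lead-in).
ROWS = {
    "SoA/AoS": ("SoA", "AoS"),
    "AoS/AoS + T": ("AoS", "AoS + T"),
    "SIMD/SIMD + T": ("AoS + SIMD", "AoS + T SIMD"),
    "SoA/AoS+SIMD + T": ("SoA", "AoS + T SIMD"),
}

for label, (base, opt) in ROWS.items():
    ratios = speedup(base, opt)
    cells = ", ".join(f"{p}: x{r}" for p, r in zip(PLATFORMS, ratios))
    print(f"{label}: {cells}")
```

Under these assumptions, every recomputed ratio matches the corresponding entry in the speedups rows, which suggests the four ratio rows carry no information beyond the cpp rows.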