
Table 6 Complexity, memory accesses, and arithmetic intensity of scalar/SIMD versions with/without loop-fusion: numerical results for n_F = 7 and n_F = 8

From: Covariance tracking: architecture optimizations for embedded systems

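| Version (SIMD) | n_F | Arith (without loop-fusion) | Mem (without loop-fusion) | AI (without loop-fusion) | Arith (with loop-fusion) | Mem (with loop-fusion) | AI (with loop-fusion) |
|---|---|---|---|---|---|---|---|
| Scalar | 7 | 133 | 259 | 0.51 | 98 | 105 | 0.93 |
| Scalar | 8 | 168 | 328 | 0.51 | 124 | 132 | 0.94 |
| SSE | 7 | 49 | 66 | 0.74 | 40 | 27 | 1.48 |
| SSE | 8 | 59 | 82 | 0.72 | 48 | 33 | 1.45 |
| Neon | 7 | 65 | 66 | 0.98 | 56 | 27 | 2.07 |
| Neon | 8 | 77 | 82 | 0.94 | 66 | 33 | 2.00 |
| AVX | 7 | 27 | 37 | 0.73 | 22 | 15 | 1.47 |
| AVX | 8 | 33 | 45 | 0.73 | 27 | 18 | 1.50 |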

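  1. For a given version, loop-fusion divides the arithmetic complexity by about 1.2 (between 1.16 and 1.36 for the values listed above) and the number of memory accesses by about 2.5 (between 2.44 and 2.50).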
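The ratios in the footnote can be rechecked directly from the table. The sketch below is only an illustration: it assumes that AI is the ratio Arith / Mem, which is consistent with every row of the table but is inferred from the numbers rather than stated on this page, and the names `TABLE6`, `arithmetic_intensity`, and `fusion_gains` are hypothetical helpers introduced here.

```python
# Rows copied from Table 6:
# (version, n_F, (arith, mem) without loop-fusion, (arith, mem) with loop-fusion)
TABLE6 = [
    ("Scalar", 7, (133, 259), (98, 105)),
    ("Scalar", 8, (168, 328), (124, 132)),
    ("SSE",    7, (49, 66),   (40, 27)),
    ("SSE",    8, (59, 82),   (48, 33)),
    ("Neon",   7, (65, 66),   (56, 27)),
    ("Neon",   8, (77, 82),   (66, 33)),
    ("AVX",    7, (27, 37),   (22, 15)),
    ("AVX",    8, (33, 45),   (27, 18)),
]


def arithmetic_intensity(arith, mem):
    """Assumed definition: arithmetic operations per memory access."""
    return arith / mem


def fusion_gains(row):
    """Return (arith reduction factor, mem reduction factor) due to loop-fusion."""
    _version, _n_f, (arith_before, mem_before), (arith_after, mem_after) = row
    return arith_before / arith_after, mem_before / mem_after


for row in TABLE6:
    version, n_f, before, after = row
    arith_gain, mem_gain = fusion_gains(row)
    print(
        f"{version:6s} n_F={n_f}: "
        f"AI {arithmetic_intensity(*before):.2f} -> {arithmetic_intensity(*after):.2f}, "
        f"arith /{arith_gain:.2f}, mem /{mem_gain:.2f}"
    )
```

Because memory accesses shrink by a larger factor than arithmetic operations, the arithmetic intensity roughly doubles in every row, which matches the AI columns of the table.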