From: A multisource fusion framework driven by user-defined knowledge for egocentric activity recognition
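| Modality | Proposed method (%) | Method proposed in [15] (ConvNets+LSTM) + pooling fusion, average pooling (%) | Method proposed in [15] (ConvNets+LSTM) + pooling fusion, maximum pooling (%) | Method proposed in [16] (DT + temporal enhanced features) + Fisher vector (%) |
|---|---|---|---|---|
| Vision | 79.2 | 68.5 | 75.0 | 78.4 |
| Sensors | 43.1 | – | 49.5 | 69.0 |
| Fusion | 85.4 | 76.5 | 80.5 | 83.7 |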