A bottom-up summarization algorithm for videos in the wild

Pan, Gang; Zheng, Yaoxian; Zhang, Rufei; Han, Zhenjun; Sun, Di; Qu, Xingming

doi:10.1186/s13634-019-0611-y

EURASIP Journal on Advances in Signal Processing

Table 1 Quantitative results

Video name	Dataset [26]		Humans [26]			Computational methods [26, 37]
	Ran	Max	Worst	Mean	Best	Uniform	Cluster	Att.	Superframe	Ours
Base jumping	0.144	0.398	0.113	0.257	0.396	0.168	0.109	0.194	0.121	0.218
Bike Polo	0.134	0.503	0.190	0.322	0.436	0.058	0.130	0.076	0.356	0.296
Scuba	0.138	0.387	0.109	0.217	0.302	0.162	0.135	0.200	0.184	0.261
Valparaiso Downhill	0.142	0.427	0.148	0.272	0.400	0.154	0.154	0.231	0.242	0.306
Bearpark climbing	0.147	0.330	0.129	0.208	0.267	0.152	0.158	0.227	0.118	0.218
Bus in rock tunnel	0.135	0.359	0.126	0.198	0.270	0.124	0.102	0.112	0.135	0.205
Car railcrossing	0.140	0.515	0.245	0.357	0.454	0.146	0.146	0.064	0.362	0.132
Cockpit landing	0.136	0.443	0.110	0.279	0.366	0.129	0.156	0.116	0.172	0.298
Cooking	0.145	0.528	0.273	0.379	0.496	0.171	0.139	0.118	0.321	0.293
Eiffel Tower	0.130	0.467	0.233	0.312	0.426	0.166	0.179	0.136	0.295	0.205
Excavators river	0.144	0.411	0.108	0.303	0.397	0.131	0.163	0.041	0.189	0.123
Jumps	0.149	0.611	0.214	0.483	0.569	0.052	0.298	0.243	0.427	0.309
Kids playing in leaves	0.139	0.394	0.141	0.289	0.416	0.209	0.165	0.084	0.089	0.182
Playing on water slide	0.134	0.340	0.139	0.195	0.284	0.186	0.141	0.124	0.200	0.179
Saving dolphins	0.144	0.313	0.095	0.188	0.242	0.165	0.214	0.154	0.145	0.169
St Maarten Landing	0.143	0.624	0.365	0.496	0.606	0.092	0.096	0.419	0.313	0.513
Statue of Liberty	0.122	0.332	0.096	0.184	0.280	0.143	0.125	0.083	0.192	0.153
Uncut evening flight	0.131	0.506	0.206	0.350	0.421	0.122	0.098	0.299	0.271	0.346
Paluma jump	0.139	0.662	0.346	0.509	0.642	0.132	0.072	0.028	0.181	0.214
Playing ball	0.145	0.403	0.190	0.271	0.364	0.179	0.176	0.140	0.174	0.137
Notre Dame	0.137	0.360	0.179	0.231	0.287	0.124	0.141	0.138	0.235	0.205
Air Force One	0.144	0.490	0.185	0.332	0.457	0.161	0.143	0.215	0.318	0.407
Fire domino	0.145	0.514	0.170	0.394	0.517	0.233	0.349	0.252	0.130	0.311
Car over camera	0.134	0.490	0.214	0.346	0.418	0.099	0.296	0.201	0.372	0.366
Paintball	0.127	0.550	0.145	0.399	0.503	0.109	0.198	0.281	0.320	0.374
Mean	0.139	0.454	0.179	0.311	0.409	0.143	0.163	0.167	0.234	0.257
Relative to max	31%	100%	39%	68%	90%	31%	36%	37%	52%	57%
Relative to mean	45%	146%	58%	100%	131%	46%	53%	54%	75%	83%

We report f-measures at 15% summary length for our method, the baselines, and the human selections. Among the computational methods, the best result is highlighted in italics and the second best in bold. “Ran” denotes random sampling. “Uniform” and “Cluster” are computational methods from [26]. “Att.” is the visual attention method from [37]. “Superframe” is the method from [26].
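The two summary rows of Table 1 can be reproduced from the “Mean” row. The sketch below assumes that “Relative to max” divides each column mean by the mean of the “Max” column and that “Relative to mean” divides it by the mean of the human “Mean” column. The source does not state this, but it is consistent with the printed numbers. The names `MEANS`, `PUBLISHED_REL_MAX`, `PUBLISHED_REL_MEAN`, and `relative_row` are illustrative and do not come from the paper.

```python
# Column means copied from the "Mean" row of Table 1 (rounded to 3 decimals).
MEANS = {
    "Ran": 0.139, "Max": 0.454, "Worst": 0.179, "Mean": 0.311, "Best": 0.409,
    "Uniform": 0.143, "Cluster": 0.163, "Att.": 0.167, "Superframe": 0.234,
    "Ours": 0.257,
}

# Percentages copied from the "Relative to max" and "Relative to mean" rows.
PUBLISHED_REL_MAX = {
    "Ran": 31, "Max": 100, "Worst": 39, "Mean": 68, "Best": 90,
    "Uniform": 31, "Cluster": 36, "Att.": 37, "Superframe": 52, "Ours": 57,
}
PUBLISHED_REL_MEAN = {
    "Ran": 45, "Max": 146, "Worst": 58, "Mean": 100, "Best": 131,
    "Uniform": 46, "Cluster": 53, "Att.": 54, "Superframe": 75, "Ours": 83,
}


def relative_row(means, reference):
    """Express each column mean as a whole-number percentage of means[reference]."""
    ref = means[reference]
    return {name: round(100 * value / ref) for name, value in means.items()}


# Assumption: the reference columns are "Max" and the human "Mean".
rel_max = relative_row(MEANS, "Max")
rel_mean = relative_row(MEANS, "Mean")

# Print recomputed values next to the published ones for comparison.
print(f"{'column':<11}{'rel. max':>9}{'(table)':>9}{'rel. mean':>11}{'(table)':>9}")
for name in MEANS:
    print(
        f"{name:<11}{rel_max[name]:>8}%{PUBLISHED_REL_MAX[name]:>8}%"
        f"{rel_mean[name]:>10}%{PUBLISHED_REL_MEAN[name]:>8}%"
    )
```

Because the inputs here are the three-decimal means printed in the table, a few recomputed percentages can differ from the published ones by one percentage point. The published rows were presumably computed from unrounded values.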
