From: Joint modality fusion and temporal context exploitation for semantic video analysis
Domain | Content used | Formed set | Formed set | Formed set U te (test) |
---|---|---|---|---|
Tennis e1:rally, e2:serve, e3:replay, e4:break | 16 videos (352 × 288, 25 fps) of professional tennis games from various international tournaments | 437 shots e1:167 e2:44 e3:27 e4:199 | 754 shots e1:258 e2:85 e3:41 e4:370 | 424 shots e1:138 e2:52 e3:23 e4:211 |
News e1:anchor, e2:reporting, e3:reportage, e4:graphics | 32 videos (352 × 288, 25 fps) of news broadcasts from Deutsche Welle | 338 shots e1:70 e2:46 e3:174 e4:48 | 557 shots e1:80 e2:71 e3:337 e4:69 | 293 shots e1:59 e2:28 e3:174 e4:32 |
Volleyball-I e1:rally, e2:serve, e3:replay, e4:break | 20 videos (352 × 264, 25 fps) of volleyball broadcasts from the Beijing 2008 men's Olympic tournament | 262 shots e1:67 e2:42 e3:27 e4:126 | 562 shots e1:129 e2:94 e3:69 e4:270 | 532 shots e1:151 e2:74 e3:71 e4:236 |
Volleyball-II e1:rally, e2:ace, e3:serve, e4:serve preparation, e5:replay, e6:player celebration, e7:tracking single player, e8:face close-up, e9:tracking multiple players | Same videos as in Volleyball-I. The videos forming test set U te are the same as those used for evaluation in the Volleyball-I domain. The difference in the total number of shots considered is due to the more extended set of semantic classes used for manual video annotation. | 422 shots e1:96 e2:18 e3:50 e4:24 e5:41 e6:78 e7:49 e8:23 e9:43 | 452 shots e1:90 e2:20 e3:45 e4:32 e5:55 e6:99 e7:34 e8:17 e9:60 | 538 shots e1:122 e2:17 e3:60 e4:19 e5:71 e6:94 e7:57 e8:21 e9:77 |
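
Each formed-set cell in the table states a total shot count followed by per-class counts, so the arithmetic can be checked mechanically. The following is a minimal sketch of such a check, not part of the original work: the cell strings are copied from the table, while the names `CELLS`, `parse_cell`, `totals_consistent`, and `class_shares` are hypothetical helpers introduced only for illustration.

```python
import re

# Cell strings copied from the table above, in column order per domain.
CELLS = {
    "Tennis": [
        "437 shots e1:167 e2:44 e3:27 e4:199",
        "754 shots e1:258 e2:85 e3:41 e4:370",
        "424 shots e1:138 e2:52 e3:23 e4:211",
    ],
    "News": [
        "338 shots e1:70 e2:46 e3:174 e4:48",
        "557 shots e1:80 e2:71 e3:337 e4:69",
        "293 shots e1:59 e2:28 e3:174 e4:32",
    ],
    "Volleyball-I": [
        "262 shots e1:67 e2:42 e3:27 e4:126",
        "562 shots e1:129 e2:94 e3:69 e4:270",
        "532 shots e1:151 e2:74 e3:71 e4:236",
    ],
    "Volleyball-II": [
        "422 shots e1:96 e2:18 e3:50 e4:24 e5:41 e6:78 e7:49 e8:23 e9:43",
        "452 shots e1:90 e2:20 e3:45 e4:32 e5:55 e6:99 e7:34 e8:17 e9:60",
        "538 shots e1:122 e2:17 e3:60 e4:19 e5:71 e6:94 e7:57 e8:21 e9:77",
    ],
}


def parse_cell(cell):
    """Return (stated_total, {class_label: count}) for one table cell."""
    total = int(re.match(r"\s*(\d+) shots", cell).group(1))
    counts = {label: int(n) for label, n in re.findall(r"(e\d+):(\d+)", cell)}
    return total, counts


def totals_consistent(cells):
    """For each domain, report whether every cell's class counts sum to its total."""
    result = {}
    for domain, row in cells.items():
        checks = []
        for cell in row:
            total, counts = parse_cell(cell)
            checks.append(sum(counts.values()) == total)
        result[domain] = all(checks)
    return result


def class_shares(cell):
    """Fraction of the cell's shots that belongs to each class."""
    total, counts = parse_cell(cell)
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    print(totals_consistent(CELLS))
    print(class_shares(CELLS["Tennis"][2]))
```

Running `class_shares` on a cell also makes the class imbalance visible, for example the large share of e4 (break) shots in the tennis sets.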