
Table 1 Datasets used for experimentation

From: Joint modality fusion and temporal context exploitation for semantic video analysis

| Domain | Semantic classes | Content used | Formed set $U_{tr}^{1}$ | Formed set $U_{tr}^{2}$ | Formed set $U_{te}$ |
|---|---|---|---|---|---|
| Tennis | e1:rally, e2:serve, e3:replay, e4:break | 16 videos (352 × 288, 25 fps) of professional tennis games from various international tournaments | 437 shots (e1:167, e2:44, e3:27, e4:199) | 754 shots (e1:258, e2:85, e3:41, e4:370) | 424 shots (e1:138, e2:52, e3:23, e4:211) |
| News | e1:anchor, e2:reporting, e3:reportage, e4:graphics | 32 videos (352 × 288, 25 fps) of news broadcasts from Deutsche Welle¹ | 338 shots (e1:70, e2:46, e3:174, e4:48) | 557 shots (e1:80, e2:71, e3:337, e4:69) | 293 shots (e1:59, e2:28, e3:174, e4:32) |
| Volleyball-I | e1:rally, e2:serve, e3:replay, e4:break | 20 videos (352 × 264, 25 fps) of volleyball broadcasts from the Beijing 2008 men's Olympic tournament | 262 shots (e1:67, e2:42, e3:27, e4:126) | 562 shots (e1:129, e2:94, e3:69, e4:270) | 532 shots (e1:151, e2:74, e3:71, e4:236) |
| Volleyball-II | e1:rally, e2:ace, e3:serve, e4:serve preparation, e5:replay, e6:player celebration, e7:tracking single player, e8:face close-up, e9:tracking multiple players | Same as the Volleyball-I videos. The videos forming test set $U_{te}$ are the same as those used for evaluation in the Volleyball-I domain. The difference in the total number of considered shots is due to the more extended set of semantic classes used for manual video annotation. | 422 shots (e1:96, e2:18, e3:50, e4:24, e5:41, e6:78, e7:49, e8:23, e9:43) | 452 shots (e1:90, e2:20, e3:45, e4:32, e5:55, e6:99, e7:34, e8:17, e9:60) | 538 shots (e1:122, e2:17, e3:60, e4:19, e5:71, e6:94, e7:57, e8:21, e9:77) |
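
The per-class shot counts in Table 1 can be cross-checked against the reported set totals. The following is a minimal sketch, assuming the counts are transcribed by hand into a Python dictionary; the names `TABLE1`, `check_totals`, and `class_shares`, as well as the set keys `U_tr1`, `U_tr2`, and `U_te`, are illustrative choices and do not come from the paper.

```python
# Shot counts transcribed from Table 1.
# Layout (an assumption made for this sketch):
#   domain -> set name -> (reported total, per-class counts in order e1, e2, ...)
TABLE1 = {
    "Tennis": {
        "U_tr1": (437, [167, 44, 27, 199]),
        "U_tr2": (754, [258, 85, 41, 370]),
        "U_te": (424, [138, 52, 23, 211]),
    },
    "News": {
        "U_tr1": (338, [70, 46, 174, 48]),
        "U_tr2": (557, [80, 71, 337, 69]),
        "U_te": (293, [59, 28, 174, 32]),
    },
    "Volleyball-I": {
        "U_tr1": (262, [67, 42, 27, 126]),
        "U_tr2": (562, [129, 94, 69, 270]),
        "U_te": (532, [151, 74, 71, 236]),
    },
    "Volleyball-II": {
        "U_tr1": (422, [96, 18, 50, 24, 41, 78, 49, 23, 43]),
        "U_tr2": (452, [90, 20, 45, 32, 55, 99, 34, 17, 60]),
        "U_te": (538, [122, 17, 60, 19, 71, 94, 57, 21, 77]),
    },
}


def check_totals(table):
    """Return (domain, set) pairs whose class counts do not sum to the reported total."""
    mismatches = []
    for domain, sets in table.items():
        for set_name, (total, counts) in sets.items():
            if sum(counts) != total:
                mismatches.append((domain, set_name))
    return mismatches


def class_shares(counts):
    """Return each class's fraction of the shots in one set."""
    total = sum(counts)
    return [count / total for count in counts]


if __name__ == "__main__":
    # An empty list means every reported total matches its class counts.
    print(check_totals(TABLE1))
    # Class distribution of the tennis test set, as fractions of its shots.
    print(class_shares(TABLE1["Tennis"]["U_te"][1]))
```

With the counts as listed in the table, `check_totals(TABLE1)` returns an empty list, so every reported total equals the sum of its class counts. `class_shares` makes the class imbalance visible: in the tennis test set, for example, e4 (break) accounts for roughly half of the shots.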