Table 2 Backbone network parameters after modification

From: Water surface object detection using panoramic vision based on improved single-shot multibox detector

| Name of convolutional layer | Input size | Kernel | Stride | Output size |
| --- | --- | --- | --- | --- |
| Conv1_x | 300*300 | 7*7, 64 | 2 | 150*150*64 |
| Conv2_x | 150*150*64 | 3*3 max pool, then \(\begin{bmatrix} 1 \times 1 & 64 \\ 3 \times 3 & 64 \\ 1 \times 1 & 256 \end{bmatrix} \times 3\) | 2 (max pool), 1 (blocks) | 75*75*256 |
| Conv3_x | 75*75*256 | \(\begin{bmatrix} 1 \times 1 & 128 \\ 3 \times 3 & 128 \\ 1 \times 1 & 512 \end{bmatrix} \times 4\) | 2 | 38*38*512 |
| Conv4_x | 38*38*512 | \(\begin{bmatrix} 1 \times 1 & 256 \\ 3 \times 3 & 256 \\ 1 \times 1 & 1024 \end{bmatrix} \times 6\) | 1 | 38*38*1024 |
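
The output sizes in Table 2 can be checked with the usual convolution size formula, out = floor((in + 2p - k) / s) + 1. The following is a minimal sketch, not the authors' code. The table does not give padding, so the values used here (3 for the 7*7 convolution, 1 for every 3*3 operation) are assumptions borrowed from the common ResNet-style layout. The names `conv_out`, `STAGES`, and `trace_sizes` are hypothetical. Only the strided 3*3 operation of each stage is modelled, since the 1*1 convolutions and the remaining blocks are assumed to run at stride 1 and leave the spatial size unchanged.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


# (stage, kernel, stride, padding, output channels).
# Kernel, stride, and channel values come from Table 2;
# the padding values are ASSUMED, the table does not state them.
STAGES = [
    ("Conv1_x", 7, 2, 3, 64),
    ("Conv2_x max pool", 3, 2, 1, 64),
    ("Conv2_x", 3, 1, 1, 256),
    ("Conv3_x", 3, 2, 1, 512),
    ("Conv4_x", 3, 1, 1, 1024),
]


def trace_sizes(input_size=300):
    """Return {stage: (height, width, channels)} for a square input."""
    size = input_size
    shapes = {}
    for name, kernel, stride, padding, channels in STAGES:
        size = conv_out(size, kernel, stride, padding)
        shapes[name] = (size, size, channels)
    return shapes


if __name__ == "__main__":
    for name, shape in trace_sizes().items():
        print(name, shape)
```

Under these padding assumptions the traced sizes agree with the table: 300 to 150 to 75 to 38, with Conv4_x staying at 38*38 because its stride is listed as 1.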