From: An efficient pruning scheme of deep neural networks for Internet of Things applications
Notation | Definition |
---|---|
L | The number of convolutional layers |
p | The overall pruning rate for all channels |
\(C^{l}\) | The original total number of channels in each layer, \(1\le l\le L\) |
\(\mathbf {w}^{l}_{k}\) | The convolutional kernel, \(\mathbf {w}^{l}_{k}\in \mathbb {R}^{C^{l-1}\times 3\times 3}, 1\le l\le L, 1\le k \le C^{l}\) |
\(H^{l}\) | The height of channels, \(1\le l\le L\) |
\(W^{l}\) | The width of channels, \(1\le l\le L\) |
\(\mathbf {z}^{l}_{k}\) | The feature map or channel, \(\mathbf {z}^{l}_{k}\in \mathbb {R}^{H^{l}\times W^{l}}, 1\le l\le L, 1\le k \le C^{l}\) |
\(\mathbf {f}^{l}\) | The feature saliency, \(\mathbf {f}^{l}\in \mathbb {R}^{C^{l}}\), with \(f^{l}_{k}\in \mathbf {f}^{l}, 1\le k\le C^{l}\) |
\([a^{l}]_{i}\) | The remaining channels in each layer in the i-th training epoch, \(1\le l\le L\) |
\(\Theta ^{l}_{k}\) | The evaluation of channel significance w.r.t. a single mini-batch, \(1\le l\le L, 1\le k\le C^{l}\) |
\([\xi ^{l}]_{i}\) | The evaluation of layer significance in the i-th training epoch, \(1\le l\le L\) |
J | The loss function adopted to evaluate the difference between the observed values and the actual ones |
ε | The smoothing factor |
s | The proportion of redistributing channels |
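
The dimensions in the table can be made concrete with a small sketch. It only enumerates the shapes implied by the definitions of \(\mathbf {w}^{l}_{k}\), \(\mathbf {z}^{l}_{k}\) and \(\mathbf {f}^{l}\); the function name `layer_shapes` and the input channel count `in_channels` (standing in for \(C^{0}\), which the table does not define) are assumptions made for illustration.

```python
def layer_shapes(channels, heights, widths, in_channels=3):
    """List the tensor shapes implied by the notation table, layer by layer.

    channels[l-1] = C^l, heights[l-1] = H^l, widths[l-1] = W^l for 1 <= l <= L.
    in_channels plays the role of C^0 (an assumption, not given in the table).
    """
    shapes = []
    prev = in_channels
    for c, h, w in zip(channels, heights, widths):
        shapes.append({
            # C^l kernels w^l_k, each in R^{C^{l-1} x 3 x 3}
            "kernels": (c, prev, 3, 3),
            # C^l feature maps z^l_k, each in R^{H^l x W^l}
            "feature_maps": (c, h, w),
            # saliency vector f^l in R^{C^l}, one entry f^l_k per channel
            "saliency": (c,),
        })
        prev = c  # this layer's channel count is the next layer's C^{l-1}
    return shapes


# Example with L = 2 layers (the sizes are made up for illustration)
shapes = layer_shapes(channels=[64, 128], heights=[32, 16], widths=[32, 16])
for l, s in enumerate(shapes, start=1):
    print(l, s)
```

Note that removing a channel in layer l shrinks both the kernel count of that layer and the kernel depth \(C^{l}\) seen by layer l+1, which is why the two shapes are tied together through `prev`.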