From: Level-wise aligned dual networks for text–video retrieval
                 Model complexity
Model            Parameters (M)    FLOPs (G)
Dual Encoding    65.7              1.08
LADN             109.4             1.46