Fig. 1
From: Level-wise aligned dual networks for text–video retrieval
Illustration of text-to-video retrieval: given a text query, retrieve the corresponding video from the database