Fig. 3 | EURASIP Journal on Advances in Signal Processing

Fig. 3

From: Deep person re-identification in UAV images

The proposed framework. We feed the input image to a base network (ResNet-50) and extract the global feature vector after global average pooling (GAP). The global branch contains two fully connected layers, which transform the feature vector into a 128-d embedding. The embedding is fed to the triplet loss. The local branch divides the feature vector into Nc channel groups. Each channel group is fed to a shared 1×1 convolutional layer, and each resulting feature vector is then fed to its own fully connected layer. The output of each fully connected layer is fed to the L-GM loss function.
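
The data flow in this caption can be traced with a small shape-level sketch. This is only an illustration under stated assumptions, not the authors' implementation: it uses NumPy with random weights, a random array stands in for the ResNet-50 output, and the values of `N_C`, `HIDDEN`, `LOCAL_MID`, `LOCAL_OUT`, the spatial size and the ReLU activations are all hypothetical. Only the 128-d embedding size and the order of the layers come from the caption. Both loss functions are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

C = 2048         # channel count of the final ResNet-50 feature map
EMBED = 128      # global embedding size, as stated in the caption
N_C = 4          # number of channel groups Nc (hypothetical value)
HIDDEN = 512     # width of the first global FC layer (hypothetical)
LOCAL_MID = 256  # output width of the shared 1x1 conv (hypothetical)
LOCAL_OUT = 64   # output width of each local FC layer (hypothetical)


def linear(in_dim, out_dim):
    """Random (weight, bias) pair standing in for a trained layer."""
    return rng.standard_normal((in_dim, out_dim)) * 0.01, np.zeros(out_dim)


params = {
    "global_fc1": linear(C, HIDDEN),
    "global_fc2": linear(HIDDEN, EMBED),
    # On a pooled (1x1 spatial) vector, a 1x1 convolution reduces to a
    # linear map, so one weight matrix shared by all groups is enough.
    "shared_conv": linear(C // N_C, LOCAL_MID),
    "local_fcs": [linear(LOCAL_MID, LOCAL_OUT) for _ in range(N_C)],
}


def forward(feature_map, params):
    """feature_map: (C, H, W) array standing in for the backbone output."""
    # Global average pooling over the spatial axes -> vector of length C.
    g = feature_map.mean(axis=(1, 2))

    # Global branch: two FC layers -> 128-d embedding (triplet loss input).
    w1, b1 = params["global_fc1"]
    w2, b2 = params["global_fc2"]
    hidden = np.maximum(g @ w1 + b1, 0.0)  # ReLU is an assumption
    embedding = hidden @ w2 + b2

    # Local branch: split the pooled vector into N_C channel groups.
    groups = np.split(g, N_C)
    ws, bs = params["shared_conv"]
    local_outputs = []
    for group, (wk, bk) in zip(groups, params["local_fcs"]):
        z = np.maximum(group @ ws + bs, 0.0)  # shared 1x1 conv, assumed ReLU
        local_outputs.append(z @ wk + bk)     # group-specific FC layer
    return embedding, local_outputs


# Random stand-in for the ResNet-50 feature map (spatial size is assumed).
feature_map = rng.standard_normal((C, 8, 4))
embedding, local_outputs = forward(feature_map, params)
print(embedding.shape, len(local_outputs), local_outputs[0].shape)
```

In this sketch the shared layer has a single weight matrix reused for every channel group, while each group keeps its own final fully connected layer, matching the "shared" and "its own" wording of the caption.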
