1707.ICCV. Neural Person Search Machines (detection plus recognition): paper reading notes

Neural Person Search Machines (NPSM): a novel end-to-end (detection + re-ID) person search method.

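Contributions of the paper:
1. Proposes the NPSM framework (built on an LSTM recurrent network with memory and an attention mechanism) to mimic the human visual search mechanism. Guided by the memorized query/probe features, it recursively shrinks the search area from large to small, locating the person region in the image that matches the query in a coarse-to-fine manner.
2. Unlike the two-stage or combined strategies currently used on PRW and CUHK-SYSU, the method performs unconstrained detection and introduces a query-aware region shrinkage mechanism (which retains more context information), solving localization and query person matching at the same time. The memory of the query person can also effectively guide the neural search model to find the right person.
3. Runs extensive experiments and reports the best performance on the recent PRW and CUHK-SYSU datasets.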

Drawback: the ranking step is time-consuming, especially on the PRW dataset.

Summary of the main related papers:
The authors build their model mainly on the following papers:
T. Xiao, S. Li, B. Wang, L. Lin, and X. Wang. Joint detection and identification feature learning for person search. arXiv:1604.01850, 2017.
S. Xingjian, Z. Chen, H. Wang, D.-Y. Yeung, W.-k. Wong, and W.-c. Woo. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In NIPS, pages 802–810, 2015.
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.

Pedestrian detection papers:
1. Based on hand-crafted features:
DPM: P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. IEEE TPAMI, 32(9):1627–1645, 2010.
ACF: P. Dollár, R. Appel, S. Belongie, and P. Perona. Fast feature pyramids for object detection. IEEE TPAMI, 36(8):1532–1545, 2014.
LDCF: W. Nam, P. Dollár, and J. H. Han. Local decorrelation for improved pedestrian detection. NIPS, 1:424–432, 2014.

2. Based on CNNs:
DeepParts: Y. Tian, P. Luo, X. Wang, and X. Tang. Deep learning strong parts for pedestrian detection. In IEEE ICCV, pages 1904–1912, 2015.
CompACT boosting: Z. Cai, M. Saberian, and N. Vasconcelos. Learning complexity-aware cascades for deep pedestrian detection. In IEEE ICCV, pages 3361–3369, 2015.

CCF: B. Yang, J. Yan, Z. Lei, and S. Z. Li. Convolutional channel features. In IEEE ICCV, pages 82–90, 2015.

R-CNN: R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In IEEE CVPR, 2014.
Fast R-CNN: R. Girshick. Fast R-CNN. In IEEE ICCV, 2015. https://github.com/rbgirshick/fast-rcnn
Faster R-CNN: S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS, pages 91–99, 2015.

Other:
EdgeBoxes (proposal model): C. L. Zitnick and P. Dollár. Edge boxes: Locating object proposals from edges. In ECCV, pages 391–405. Springer, 2014.


Paper overview:

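Comparison of the paper's framework with conventional methods:
[Image: NPSM framework versus conventional person search pipelines]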

Framework structure:
[Image: NPSM framework structure]
1. Uses Xiao Tong's OIM loss; the FCN that extracts the feature maps is ResNet-50 followed by RoI pooling.
2. During training, a segmentation-like softmax loss is applied at every time step of the NSN as the "region shrinkage loss", so that the network produces attention maps that properly contain the target.
3. To make the learned features more discriminative, an identification loss is added during training, following the "Identification Net".
4. Region Shrinkage with Primitive Memory: starting from a large region includes more context information, and the number of candidate persons irrelevant to the target person is recursively reduced during the search.
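Item 2 can be read as a per-location two-class softmax cross-entropy over the attention map. The sketch below is only one plausible reading, not the authors' implementation: the function name `region_shrinkage_loss`, the (background, target) score layout, and the binary target mask are all assumptions made for illustration.

```python
import math


def region_shrinkage_loss(scores, mask):
    """Mean per-location 2-class softmax cross-entropy (illustrative only).

    scores[i][j] is an assumed (background_score, target_score) pair for
    spatial location (i, j); mask[i][j] is 1 inside the region that should
    contain the target person and 0 elsewhere.
    """
    total, count = 0.0, 0
    for row_scores, row_mask in zip(scores, mask):
        for (bg, fg), label in zip(row_scores, row_mask):
            # Numerically stable log-sum-exp over the two classes.
            m = max(bg, fg)
            log_z = m + math.log(math.exp(bg - m) + math.exp(fg - m))
            # Negative log-probability of the correct class.
            total += log_z - (fg if label == 1 else bg)
            count += 1
    return total / count
```

Under this reading, one such loss would be computed at each time step against that step's target region, which shrinks as the search proceeds.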


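Experiments
Evaluation metrics: mAP reflects the accuracy of detecting the query person from the gallery images. CMC is reported at top-1, and a match is counted only when the IoU between the predicted box and the ground truth exceeds 0.5.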
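The IoU > 0.5 matching rule can be sketched as follows. This is a minimal illustration, assuming boxes in `(x1, y1, x2, y2)` corner format; the helper names `iou` and `top1_hit` are hypothetical and do not come from the paper's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def top1_hit(ranked_boxes, gt_box, thresh=0.5):
    """True if the best-ranked predicted box overlaps the ground truth enough.

    ranked_boxes is assumed to be sorted by similarity to the query,
    best match first.
    """
    return bool(ranked_boxes) and iou(ranked_boxes[0], gt_box) > thresh
```

Averaging `top1_hit` over all queries would give a top-1 matching rate under these assumptions.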

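Baselines, grouped by dataset (PRW and CUHK-SYSU):
PRW: detectors (the respective R-CNN [7] detectors of DPM [6], CCF [36], ACF [4], LDCF [21]) combined with recognizers (LOMO, XQDA [17], IDEdet, CWS [41]). AlexNet is the base network of the R-CNN detector; DPM-AlexNet performs better than DPM combined with other networks (e.g. VGG, ResNet).
CUHK-SYSU: the combination of CNN (Faster R-CNN with ResNet-50) + IDNet, joint optimization built on these with OIM, and so on.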

Analytic experiments on the CUHK-SYSU benchmark investigate the contribution of each component of the proposed NPSM architecture.
[Image: component analysis results on CUHK-SYSU]

Comparison with State-of-the-art Methods
On the CUHK-SYSU dataset:
[Image: comparison results on CUHK-SYSU]

On the PRW dataset:
[Image: comparison results on PRW]

Visualization of the attention maps produced by NPSM:
[Image: attention maps produced by NPSM]