[ALGORITHM]
@inproceedings{wu2019sequence,
title={Sequence level semantics aggregation for video object detection},
author={Wu, Haiping and Chen, Yuntao and Wang, Naiyan and Zhang, Zhaoxiang},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
pages={9217--9225},
year={2019}
}
We observe performance fluctuations of around 1 mAP, and provide the best model.
Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP@50 | Config | Download |
---|---|---|---|---|---|---|---|
R-50-DC5 | pytorch | 7e | 3.49 | 7.5 | 78.4 | config | model, log |
R-101-DC5 | pytorch | 7e | 5.18 | 7.2 | 81.5 | config | model, log |