Optimizing Queries over Video via Lightweight Keypoint-based Object Detection

Jiansheng Dong, Jingling Yuan, Lin Li, Xian Zhong, Weiru Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)


Recent advances in object detection based on convolutional neural networks have made it possible to analyze the growing volume of video data with high accuracy. However, inference speed is a major drawback of these video analysis systems because of their heavy object detectors. To address the computational and practical challenges of video analysis, we propose FastQ, a system for efficient querying over video at scale. Given a target video, FastQ automatically labels the category and number of objects in each frame. We introduce a novel lightweight object detector named FDet to improve the efficiency of the query system. First, a difference detector filters out the frames whose difference is less than a threshold. Second, FDet is employed to efficiently label the remaining frames. To reduce inference time, FDet predicts bounding boxes by detecting a center keypoint and a pair of corners from the feature map generated by a lightweight backbone. FDet completely avoids the complicated computation related to anchor boxes. Compared with state-of-the-art real-time detectors, FDet achieves superior performance, with 29.1% AP on the COCO benchmark at 25.3 ms. Experiments show that FastQ achieves 150× to 300× speed-ups while maintaining more than 90% accuracy in video queries.
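The two-stage flow described in the abstract (a difference detector first, the object detector only on the frames that remain) can be illustrated with a minimal sketch. This is not the authors' implementation: the mean-absolute-difference metric, the reuse of the previous labels for filtered frames, and the names `mean_abs_diff`, `label_video`, and `toy_detector` are all assumptions made for illustration, and the toy detector merely stands in for FDet.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Frame = Sequence[Sequence[int]]   # grayscale frame as rows of pixel intensities
Labels = Dict[str, int]           # object category -> count in the frame


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two same-sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def label_video(
    frames: Sequence[Frame],
    detect: Callable[[Frame], Labels],
    threshold: float,
) -> Tuple[List[Labels], int]:
    """Label every frame, calling the detector only when the frame changed enough.

    Assumption: a filtered frame inherits the labels of the last frame that
    was actually sent to the detector.
    """
    labels: List[Labels] = []
    reference: Optional[Frame] = None   # last frame the detector saw
    detector_calls = 0
    for frame in frames:
        if reference is not None and mean_abs_diff(frame, reference) < threshold:
            labels.append(dict(labels[-1]))   # filtered: reuse previous labels
        else:
            labels.append(detect(frame))      # changed enough: run the detector
            reference = frame
            detector_calls += 1
    return labels, detector_calls


def toy_detector(frame: Frame) -> Labels:
    """Hypothetical stand-in for a real detector: counts bright pixels."""
    return {"bright_pixel": sum(1 for row in frame for p in row if p > 128)}


frames = [
    [[0, 0], [0, 0]],
    [[0, 1], [0, 0]],        # nearly identical to the first frame
    [[255, 255], [0, 0]],    # large change
    [[255, 250], [0, 0]],    # nearly identical to the third frame
]
labels, calls = label_video(frames, toy_detector, threshold=5.0)
print(calls, labels)
```

In this toy run the detector is invoked on only two of the four frames, which is the kind of saving the filtering step aims for; the speed-ups reported in the abstract also depend on the lightweight detector itself and are not reproduced by this sketch.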
Original language: English
Title of host publication: ACM International Conference on Multimedia Retrieval (ICMR)'2020
Publisher: Association for Computing Machinery (ACM)
Number of pages: 7
Publication status: Published - 8 Jun 2020
Event: ACM International Conference on Multimedia Retrieval (ICMR), 2020 - Dublin, Dublin, Ireland
Duration: 8 Jun 2020 to 11 Jun 2020




  • object detection
  • real-time
  • anchor-free
  • keypoint-based
  • high-resolution representation
