Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video (Paper)
Details
- Journal/Conference
- Journal of Computational Vision and Imaging Systems
- Publication date
- 2017-10-15
- Publication year
- 2017
Keywords
Abstract
Object detection is considered one of the most challenging problems in the field of computer vision, as it involves the combination of object classification and object localization within a scene. Recently, deep neural networks (DNNs) have been demonstrated to achieve superior object detection performance compared to other approaches, with YOLOv2 (an improved You Only Look Once model) being one of the state-of-the-art DNN-based object detection methods in terms of both speed and accuracy. Although YOLOv2 can achieve real-time performance on a powerful GPU, it remains very challenging to leverage this approach for real-time object detection in video on embedded computing devices with limited computational power and limited memory. In this paper, we propose a new framework called Fast YOLO, a fast You Only Look Once framework which accelerates YOLOv2 to be able to perform object detection in video on embedded devices in a real-time manner. First, we leverage the evolutionary deep intelligence framework to evolve the YOLOv2 network architecture and produce an optimized architecture (referred to as O-YOLOv2 here) that has 2.8X fewer parameters with just a 2% IOU drop. To further reduce power consumption on embedded devices while maintaining performance, a motion-adaptive inference method is introduced into the proposed Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv2 based on temporal motion characteristics. Experimental results show that the proposed Fast YOLO framework can reduce the number of deep inferences by an average of 38.13% and achieve an average speedup of 3.3X for object detection in video compared to the original YOLOv2, leading Fast YOLO to run at an average of 18 FPS on an Nvidia Jetson TX1 embedded system.
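The idea of motion-adaptive inference can be illustrated with a minimal sketch. The abstract does not say how temporal motion is measured, so the code below assumes a simple stand-in: the mean absolute pixel difference between the current frame and the last frame on which the detector actually ran. The names `mean_abs_diff`, `motion_adaptive_detect`, `detect`, and `threshold` are hypothetical and are not taken from the paper; this should be read as an illustration of skipping deep inference on low-motion frames, not as the authors' implementation.

```python
from typing import Any, Callable, Iterable, List, Optional, Sequence, Tuple

# Hypothetical representation: a frame is a flat sequence of grayscale values.
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def motion_adaptive_detect(
    frames: Iterable[Frame],
    detect: Callable[[Frame], Any],
    threshold: float = 5.0,
) -> Tuple[List[Any], int]:
    """Run `detect` only when a frame differs enough from the reference frame.

    Assumption (not from the paper): motion is scored by mean absolute
    difference against the last frame that triggered deep inference.
    Returns the per-frame results and the number of deep inferences run.
    """
    results: List[Any] = []
    reference: Optional[Frame] = None  # last frame passed to the detector
    cached: Any = None                 # detections reused on low-motion frames
    n_inferences = 0

    for frame in frames:
        if reference is None or mean_abs_diff(frame, reference) > threshold:
            cached = detect(frame)     # expensive deep inference
            reference = frame
            n_inferences += 1
        results.append(cached)         # otherwise reuse the previous detections
    return results, n_inferences


if __name__ == "__main__":
    # Toy video: two nearly static frames, a scene change, then a static frame.
    video = [[0, 0, 0, 0], [1, 1, 1, 1], [100, 100, 100, 100], [101, 101, 101, 101]]
    outputs, count = motion_adaptive_detect(video, detect=sum, threshold=5.0)
    print(outputs, count)
```

In this toy run the detector (here just `sum`, standing in for O-YOLOv2) is called on two of the four frames. The fraction of skipped frames corresponds to the kind of reduction in deep inferences that the abstract reports as 38.13% on average, although the actual figure depends on the video content and the motion criterion used.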