Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video (Paper)

2017, Journal of Computational Vision and Imaging Systems, cited by 365
Advanced Neural Network Applications; Visual Attention and Saliency Detection; CCD and CMOS Imaging Sensors

Details

Journal/Conference
Journal of Computational Vision and Imaging Systems
Publication date
2017-10-15
Publication year
2017

Keywords

Advanced Neural Network Applications; Visual Attention and Saliency Detection; CCD and CMOS Imaging Sensors

Abstract

Object detection is considered one of the most challenging problems in the field of computer vision, as it combines object classification and object localization within a scene. Recently, deep neural networks (DNNs) have been demonstrated to achieve superior object detection performance compared to other approaches, with YOLOv2 (an improved You Only Look Once model) being one of the state-of-the-art DNN-based object detection methods in terms of both speed and accuracy. Although YOLOv2 can achieve real-time performance on a powerful GPU, it remains very challenging to leverage this approach for real-time object detection in video on embedded computing devices with limited computational power and limited memory. In this paper, we propose a new framework called Fast YOLO, a fast You Only Look Once framework which accelerates YOLOv2 so that it can perform object detection in video on embedded devices in real time. First, we leverage the evolutionary deep intelligence framework to evolve the YOLOv2 network architecture and produce an optimized architecture (referred to as O-YOLOv2 here) that has 2.8X fewer parameters with just a 2% IOU drop. To further reduce power consumption on embedded devices while maintaining performance, a motion-adaptive inference method is introduced into the proposed Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv2 based on temporal motion characteristics. Experimental results show that the proposed Fast YOLO framework can reduce the number of deep inferences by an average of 38.13% and achieve an average speedup of 3.3X for object detection in video compared to the original YOLOv2, allowing Fast YOLO to run at an average of 18 FPS on an Nvidia Jetson TX1 embedded system.
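
The motion-adaptive inference idea can be illustrated with a minimal sketch. The abstract does not say how motion is measured, so the code below assumes a simple stand-in: the mean absolute pixel difference against the last frame that was actually inferred, compared with a fixed threshold. The function name `motion_adaptive_inference`, the `threshold` value, and the `detector` callable (a placeholder for O-YOLOv2) are hypothetical and are not taken from the paper.

```python
import numpy as np


def motion_adaptive_inference(frames, detector, threshold=10.0):
    """Run `detector` only when a frame differs enough from the reference frame.

    frames:    iterable of uint8 arrays that all share one shape.
    detector:  hypothetical callable (frame -> detections), standing in for O-YOLOv2.
    threshold: assumed cutoff on mean absolute pixel difference (not from the paper).
    Returns (detections for every frame, number of deep inferences performed).
    """
    results = []
    reference = None        # last frame on which deep inference was run
    last_detections = None  # detections reused while the scene is static
    inferences = 0

    for frame in frames:
        current = frame.astype(np.float32)
        # Assumed motion measure: mean absolute difference to the reference frame.
        if reference is None or np.mean(np.abs(current - reference)) > threshold:
            last_detections = detector(frame)
            reference = current
            inferences += 1
        results.append(last_detections)

    return results, inferences
```

For a clip of three static frames followed by two frames of a changed scene, this sketch would call the detector twice and reuse the earlier detections for the other three frames. The proportion of skipped calls on real video depends entirely on the motion measure and threshold chosen, so it should not be expected to reproduce the 38.13% figure reported above.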