R<sup>2</sup>CNN: Rotational Region CNN for Arbitrarily-Oriented Scene Text Detection (Paper)

2018, cited by 280

Details

Publication date
2018-08-01
Publication year
2018

Keywords

Handwritten Text Recognition Techniques; Vehicle License Plate Recognition; Advanced Image and Video Retrieval Techniques

Abstract

Scene text detection is challenging as the input may have different orientations, sizes, font styles, lighting conditions, perspective distortions and languages. This paper addresses the problem by designing a Rotational Region CNN (R<sup>2</sup>CNN). R<sup>2</sup>CNN includes a Text Region Proposal Network (Text-RPN) to estimate approximate text regions and a multi-task refinement network to get the precise inclined box. Our work has the following features. First, we use a novel multi-task regression method to support arbitrarily-oriented scene text detection. Second, we introduce multiple ROIPoolings to address the scene text detection problem for the first time. Third, we use an inclined Non-Maximum Suppression (NMS) to post-process the detection candidates. Experiments show that our method outperforms the state-of-the-art on standard benchmarks: ICDAR 2013, ICDAR 2015, COCO-Text and MSRA-TD500.
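The inclined NMS step named in the abstract can be illustrated with a minimal sketch. The abstract does not say how boxes are represented or which overlap threshold is used, so everything below is an assumption made for illustration: each inclined box is given as four corner points in order, overlap is computed with Sutherland-Hodgman polygon clipping, and the names `clip_convex`, `inclined_iou`, `inclined_nms` and the default `iou_threshold=0.3` are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def signed_area(poly: Sequence[Point]) -> float:
    """Shoelace formula; positive when vertices run counter-clockwise."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def clip_convex(subject: Sequence[Point], clip: Sequence[Point]) -> List[Point]:
    """Sutherland-Hodgman clipping of `subject` by a counter-clockwise convex `clip`."""
    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break
        ax, ay = clip[i]
        bx, by = clip[(i + 1) % len(clip)]

        def side(p: Point) -> float:
            # > 0 when p lies left of edge a->b, i.e. inside the clip polygon
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        previous, output = output, []
        for j in range(len(previous)):
            p, q = previous[j], previous[(j + 1) % len(previous)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if sp * sq < 0:  # edge p->q crosses the clip edge
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def inclined_iou(box_a: Sequence[Point], box_b: Sequence[Point]) -> float:
    """IoU of two convex quadrilaterals given as ordered corner points."""
    a = list(box_a) if signed_area(box_a) >= 0 else list(reversed(box_a))
    b = list(box_b) if signed_area(box_b) >= 0 else list(reversed(box_b))
    inter = abs(signed_area(clip_convex(a, b)))
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union if union > 0 else 0.0


def inclined_nms(boxes: Sequence[Sequence[Point]],
                 scores: Sequence[float],
                 iou_threshold: float = 0.3) -> List[int]:
    """Greedy NMS on inclined boxes; the threshold value is an assumed default."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any higher-scoring kept box too much
        if all(inclined_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
shifted = [(0.2, 0), (2.2, 0), (2.2, 2), (0.2, 2)]
far_diamond = [(10, 0), (11, 1), (10, 2), (9, 1)]
kept = inclined_nms([square, shifted, far_diamond], [0.9, 0.8, 0.7])
```

In this toy input the shifted square overlaps the first box heavily and is suppressed, while the distant rotated box survives, so `kept` holds indices 0 and 2. A production pipeline would more likely rely on an optimized rotated-box intersection routine than on pure-Python clipping.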