Open Access Proceedings Article (DOI)

Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection

TLDR
A new Convolutional Neural Network (CNN) based method, named Deep Matching Prior Network (DMPNet), detects text with tighter quadrangles and introduces an auxiliary smooth Ln loss that has better overall performance than L2 loss and smooth L1 loss in terms of robustness and stability.
Abstract
Detecting incidental scene text is a challenging task because of multi-orientation, perspective distortion, and variation in text size, color and scale. Previous research has focused only on rectangular bounding boxes or horizontal sliding windows to localize text, which may result in redundant background noise, unnecessary overlap or even information loss. To address these issues, we propose a new Convolutional Neural Network (CNN) based method, named Deep Matching Prior Network (DMPNet), to detect text with tighter quadrangles. First, we use quadrilateral sliding windows in several specific intermediate convolutional layers to roughly recall text with a higher overlapping area, and a shared Monte-Carlo method is then proposed for fast and accurate computation of the polygonal areas. After that, we design a sequential protocol for relative regression which can exactly predict text with a compact quadrangle. Moreover, an auxiliary smooth Ln loss is also proposed for further regressing the position of text, which has better overall performance than L2 loss and smooth L1 loss in terms of robustness and stability. The effectiveness of our approach is evaluated on a public word-level, multi-oriented scene text database, the ICDAR 2015 Robust Reading Competition Challenge 4 on incidental scene text localization. Measured by F-measure, our method achieves 70.64%, outperforming the existing state-of-the-art method at 63.76%.
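The abstract describes two mechanisms that lend themselves to a short illustration: Monte-Carlo estimation of quadrilateral overlap and the smooth Ln regression loss. Below is a minimal sketch, not the authors' released code: the function names are hypothetical, the exact loss formula is assumed to be (|d| + 1) ln(|d| + 1) − |d|, and the "shared" reuse of samples across candidate windows described in the paper is omitted for simplicity.

```python
# Illustrative sketch (NumPy) of Monte-Carlo quadrilateral overlap and a
# smooth Ln-style loss; names and exact formulas are assumptions, not the
# paper's implementation.
import numpy as np

def point_in_quad(points, quad):
    """Boolean mask of points inside a convex quadrilateral.

    points: (N, 2) array of xy samples; quad: (4, 2) vertices in order.
    A point is inside when it lies on the same side of all four edges
    (cross-product sign test; assumes a convex, non-self-intersecting quad).
    """
    signs = []
    for i in range(4):
        a, b = quad[i], quad[(i + 1) % 4]
        cross = ((b[0] - a[0]) * (points[:, 1] - a[1])
                 - (b[1] - a[1]) * (points[:, 0] - a[0]))
        signs.append(cross >= 0)
    signs = np.stack(signs)
    return np.all(signs, axis=0) | np.all(~signs, axis=0)

def mc_overlap_area(quad_a, quad_b, n_samples=10000, rng=None):
    """Monte-Carlo estimate of the intersection area of two quadrilaterals.

    Uniform samples are drawn in the joint bounding box; the fraction that
    falls inside both quadrilaterals, times the box area, approximates the
    overlap area. The paper's "shared" variant presumably reuses one sample
    set across many candidate windows; that sharing is not shown here.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    pts_bounds = np.vstack([quad_a, quad_b])
    lo, hi = pts_bounds.min(axis=0), pts_bounds.max(axis=0)
    samples = rng.uniform(lo, hi, size=(n_samples, 2))
    inside = point_in_quad(samples, quad_a) & point_in_quad(samples, quad_b)
    return inside.mean() * np.prod(hi - lo)

def smooth_ln_loss(pred, target):
    """Smooth Ln-style loss on the residual d = pred - target (assumed form).

    (|d| + 1) * ln(|d| + 1) - |d| grows more slowly than L2 for large
    residuals while remaining smooth near zero, the robustness/stability
    trade-off the abstract claims over L2 and smooth L1.
    """
    d = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    return np.mean((d + 1.0) * np.log(d + 1.0) - d)

if __name__ == "__main__":
    gt = np.array([[0, 0], [4, 0], [4, 2], [0, 2]], dtype=float)     # ground-truth quad
    prior = np.array([[1, 0], [5, 1], [4, 3], [0, 2]], dtype=float)  # tilted prior window
    print("estimated overlap area:", mc_overlap_area(gt, prior))
    print("smooth Ln loss:", smooth_ln_loss([0.2, -1.5], [0.0, 0.0]))
```

One attraction of point-sampling overlap is that it sidesteps exact polygon clipping for arbitrary quadrilaterals, which is consistent with the abstract's emphasis on fast and accurate area computation for quadrilateral sliding windows.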



Citations
Posted Content

Object Detection in 20 Years: A Survey

TL;DR: This paper extensively reviews more than 400 papers on object detection in light of its technical evolution, spanning over a quarter-century (from the 1990s to 2019), and provides an in-depth analysis of the field's challenges and recent technical improvements.
Proceedings Article (DOI)

Character Region Awareness for Text Detection

TL;DR: This paper proposes a new scene text detection method that effectively detects text areas by exploring each character and the affinity between characters, significantly outperforming state-of-the-art detectors.
Proceedings Article (DOI)

Rotation-Sensitive Regression for Oriented Scene Text Detection

TL;DR: The proposed method, named Rotation-Sensitive Regression Detector (RRD), achieves state-of-the-art performance on several oriented scene text benchmarks, including ICDAR 2015, MSRA-TD500, RCTW-17, and COCO-Text, and yields a significant improvement on a ship collection dataset, demonstrating its generality for oriented object detection.
Book Chapter (DOI)

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

TL;DR: A more flexible representation for scene text, termed TextSnake, is proposed; it can effectively represent text instances in horizontal, oriented and curved forms and outperforms the baseline on Total-Text by more than 40% in F-measure.
Journal Article (DOI)

Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection

TL;DR: An obliquity factor based on the area ratio between the object and its horizontal bounding box is introduced to guide the selection of horizontal or oriented detection for each object, and five extra target variables are added to the regression head of Faster R-CNN, requiring negligible extra computation time.
References
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, showing that a significant improvement over prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings Article (DOI)

You Only Look Once: Unified, Real-Time Object Detection

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Journal Article (DOI)

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Proceedings Article (DOI)

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training on an auxiliary task followed by domain-specific fine-tuning yields a significant performance boost.
Book Chapter (DOI)

SSD: Single Shot MultiBox Detector

TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.