Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection

doi:10.1109/CVPR.2017.368

Open Access Proceedings Article

Yuliang Liu, +1 more

- pp 3454-3461

TLDR

This paper proposes a new Convolutional Neural Network (CNN) based method, named Deep Matching Prior Network (DMPNet), to detect text with tighter quadrangles, together with an auxiliary smooth Ln loss that has better overall performance than L2 loss and smooth L1 loss in terms of robustness and stability.

Abstract:

Detecting incidental scene text is a challenging task because of multi-orientation, perspective distortion, and variation of text size, color and scale. Previous research has focused only on using rectangular bounding boxes or horizontal sliding windows to localize text, which may result in redundant background noise, unnecessary overlap or even information loss. To address these issues, we propose a new Convolutional Neural Network (CNN) based method, named Deep Matching Prior Network (DMPNet), to detect text with tighter quadrangles. First, we use quadrilateral sliding windows in several specific intermediate convolutional layers to roughly recall the text with a higher overlapping area, and we propose a shared Monte-Carlo method for fast and accurate computation of the polygonal areas. After that, we design a sequential protocol for relative regression which can precisely predict text with a compact quadrangle. Moreover, an auxiliary smooth Ln loss is also proposed for further regressing the position of text, which has better overall performance than L2 loss and smooth L1 loss in terms of robustness and stability. The effectiveness of our approach is evaluated on a public word-level, multi-oriented scene text database, ICDAR 2015 Robust Reading Competition Challenge 4 "Incidental scene text localization". Measured by F-measure, our method achieves 70.64%, outperforming the existing state-of-the-art method with an F-measure of 63.76%.
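The overlap computation mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it only shows the general idea of a Monte-Carlo area estimate in which one set of sample points, drawn once inside the ground-truth quadrangle, is reused ("shared") for every candidate window. All function names (`shoelace_area`, `inside_convex`, `sample_inside`, `mc_overlap`, `mc_iou`) are hypothetical, the quadrangles are assumed to be convex, and the sample count is an arbitrary choice.

```python
import random


def shoelace_area(poly):
    """Exact area of a simple polygon given as [(x, y), ...]."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def inside_convex(pt, poly):
    """True if pt lies inside (or on the border of) a convex polygon."""
    px, py = pt
    signs = []
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if cross != 0:
            signs.append(cross > 0)
    # Inside means the point is on the same side of every edge.
    return all(signs) or not any(signs)


def sample_inside(poly, n, rng):
    """Rejection-sample n points uniformly inside poly via its bounding box."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    pts = []
    while len(pts) < n:
        pt = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        if inside_convex(pt, poly):
            pts.append(pt)
    return pts


def mc_overlap(gt_points, gt_area, window):
    """Estimate the overlap area from the fraction of shared points in window."""
    hits = sum(inside_convex(p, window) for p in gt_points)
    return gt_area * hits / len(gt_points)


def mc_iou(gt_points, gt_area, window):
    """Estimated intersection-over-union of the ground truth and one window."""
    inter = mc_overlap(gt_points, gt_area, window)
    return inter / (gt_area + shoelace_area(window) - inter)


if __name__ == "__main__":
    rng = random.Random(0)
    ground_truth = [(0, 0), (4, 0), (4, 4), (0, 4)]
    shared_points = sample_inside(ground_truth, 20000, rng)  # sampled once
    gt_area = shoelace_area(ground_truth)
    windows = [
        [(2, 0), (6, 0), (6, 4), (2, 4)],    # shifted rectangle
        [(2, -1), (5, 2), (2, 5), (-1, 2)],  # rotated quadrangle
    ]
    for w in windows:
        print(round(mc_iou(shared_points, gt_area, w), 3))
```

Because the points are sampled only once per ground-truth box, scoring each additional window costs one point-in-polygon test per sample, which is the property that makes sharing attractive when many sliding windows are compared against the same text region.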

Citations

Posted Content

Object Detection in 20 Years: A Survey

Zhengxia Zou, +3 more

- 13 May 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper extensively reviews 400+ papers on object detection in the light of its technical evolution, spanning over a quarter-century (from the 1990s to 2019), and makes an in-depth analysis of their challenges as well as technical improvements in recent years.

Proceedings Article

Character Region Awareness for Text Detection

Young Min Baek, +4 more

TL;DR: This paper proposes a new scene text detection method that effectively detects text areas by exploring each character and the affinity between characters, and that significantly outperforms state-of-the-art detectors.

Proceedings Article

Rotation-Sensitive Regression for Oriented Scene Text Detection

Minghui Liao, +4 more

TL;DR: The proposed method named Rotation-sensitive Regression Detector (RRD) achieves state-of-the-art performance on several oriented scene text benchmark datasets, including ICDAR 2015, MSRA-TD500, RCTW-17, and COCO-Text, and achieves a significant improvement on a ship collection dataset, demonstrating its generality on oriented object detection.

Book Chapter

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

Shangbang Long, +5 more

TL;DR: A more flexible representation for scene text is proposed, termed as TextSnake, which is able to effectively represent text instances in horizontal, oriented and curved forms and outperforms the baseline on Total-Text by more than 40% in F-measure.

Journal Article

Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection

Yongchao Xu, +6 more

- 01 Apr 2021 -

IEEE Transactions on Pattern Analysis an...

TL;DR: An obliquity factor, based on the area ratio between the object and its horizontal bounding box, is introduced to guide the selection of horizontal or oriented detection for each object, and five extra target variables are added to the regression head of Faster R-CNN, which requires negligible extra computation time.

References

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Journal Article

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 01 Jun 2017 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.

Proceedings Article

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, +3 more

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.

Book Chapter

SSD: Single Shot MultiBox Detector

Wei Liu, +6 more

TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.

Related Papers (5)

EAST: An Efficient and Accurate Scene Text Detector

SSD: Single Shot MultiBox Detector

ICDAR 2015 competition on Robust Reading

Synthetic Data for Text Localisation in Natural Images

Deep Residual Learning for Image Recognition