MIDV-2019: challenges of the modern mobile-based document OCR

doi:10.1117/12.2558438

Open Access Proceedings Article

- Vol. 11433, pp 717-722

TLDR

This article presents MIDV-2019, a new dataset of video clips shot with modern high-resolution mobile cameras, with strong projective distortions and in low lighting conditions.

Abstract:

Recognition of identity documents using mobile devices has become the subject of a wide range of computer vision research. The portfolio of methods and algorithms for solving such tasks as face detection, document detection and rectification, text field recognition, and others is growing, and the scarcity of datasets has become an important issue. One of the openly accessible datasets for evaluating such methods is MIDV-500, containing video clips of 50 identity document types in various conditions. However, the variability of capturing conditions in MIDV-500 did not address some of the key issues, mainly significant projective distortions and different lighting conditions. In this paper we present the MIDV-2019 dataset, containing video clips shot with modern high-resolution mobile cameras, with strong projective distortions and in low lighting conditions. We describe the added data and present experimental baselines for text field recognition in different conditions.

Citations

Proceedings Article

Houghencoder: Neural Network Architecture for Document Image Semantic Segmentation

Alexander Sheshkus, +2 more

TL;DR: HoughEncoder outperforms UNet, which shows state-of-the-art results in many semantic image segmentation tasks, even though it has one hundred times fewer parameters.

Journal Article

Text recognition for Vietnamese identity card based on deep features network

Duc Phan Van Hoai, +2 more

- 17 Feb 2021 -

International Journal on Document Analys...

TL;DR: This paper develops a method for Vietnamese identity card recognition based on a deep features network that achieves accuracies of more than 96.7% and 89.8% at the character level and word level, respectively.

Posted Content

MIDV-2020: A Comprehensive Benchmark Dataset for Identity Document Analysis

Konstantin B. Bulatov, +10 more

- 01 Jul 2021 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: The MIDV-2020 dataset contains 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents, each with unique text field values and unique artificially generated faces, with rich annotation.

Book Chapter

Fast End-to-End Deep Learning Identity Document Detection, Classification and Cropping

Guillaume Chiron, +2 more

TL;DR: This work proposes a modular, multi-stage deep learning approach for detecting, classifying, and aligning captured documents onto their reference model, which makes it possible to accurately classify the document and estimate its quadrilateral (localization).

Journal Article

MIDV-2020: a comprehensive benchmark dataset for identity document analysis

- 01 Apr 2022 -

Computer Optics

TL;DR: The MIDV-2020 dataset contains 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents, each with unique text field values and unique artificially generated faces, with rich annotation.

Related Papers (4)

MIDV-500: a dataset for identity document analysis and recognition on mobile devices in video stream

Fast Method of ID Documents Location and Type Identification for Mobile and Server Application

Detection and pose estimation of human face with multiple model images

Hybrid Features Extraction for Adaptive Face Images Retrieval