
Showing papers on "Optical character recognition published in 2015"


Journal ArticleDOI
TL;DR: A novel system for the automatic detection and recognition of text in traffic signs, using maximally stable extremal regions (MSERs) and hue, saturation, and value (HSV) color thresholding to locate a large number of candidates, then interpreting the text contained within the detected candidate regions.
Abstract: We propose a novel system for the automatic detection and recognition of text in traffic signs. Scene structure is used to define search regions within the image, in which traffic sign candidates are then found. Maximally stable extremal regions (MSERs) and hue, saturation, and value color thresholding are used to locate a large number of candidates, which are then reduced by applying constraints based on temporal and structural information. A recognition stage interprets the text contained within detected candidate regions. Individual text characters are detected as MSERs and are grouped into lines, before being interpreted using optical character recognition (OCR). Recognition accuracy is vastly improved through the temporal fusion of text results across consecutive frames. The method is comparatively evaluated and achieves an overall $F_{\rm measure}$ of 0.87.

96 citations
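The temporal-fusion idea above can be sketched as a simple per-character majority vote over the OCR readings of the same sign across consecutive frames. This is an illustrative baseline, not the authors' exact fusion scheme:

```python
from collections import Counter

def fuse_readings(readings):
    """Fuse per-frame OCR readings of the same sign by majority vote.

    Readings whose length differs from the most common length are
    discarded, then each character position is voted on independently.
    """
    lengths = Counter(len(r) for r in readings)
    n = lengths.most_common(1)[0][0]
    aligned = [r for r in readings if len(r) == n]
    return "".join(
        Counter(chars).most_common(1)[0][0]
        for chars in zip(*aligned)
    )

print(fuse_readings(["SPEED", "SPEE0", "5PEED"]))  # SPEED
```

Even this crude vote recovers the correct string when each individual frame is misread in a different position, which is why the paper's temporal fusion "vastly" improves recognition accuracy.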


Proceedings ArticleDOI
22 Aug 2015
TL;DR: The proposed approach significantly outperforms standard binarization approaches in both F-measure and OCR accuracy when enough training samples are available.
Abstract: We propose to address the problem of Document Image Binarization (DIB) using Long Short-Term Memory (LSTM), which is specialized in processing very long sequences. The image is considered as a 2D sequence of pixels, and accordingly a 2D LSTM is employed to classify each pixel as text or background. The proposed approach processes the information using local context and then propagates it globally in order to achieve better visual coherence. The method is robust against most document artifacts. We show that, with a very simple network without any feature extraction and with a limited amount of data, the proposed approach works reasonably well on the DIBCO 2013 dataset. Furthermore, a synthetic dataset is used to measure the performance of the proposed approach against both binarization and OCR ground truth. The proposed approach significantly outperforms standard binarization approaches in both F-measure and OCR accuracy when enough training samples are available.

52 citations
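The paper's 2D LSTM learns a per-pixel text/background decision; a much simpler non-learned stand-in for that decision is local-mean adaptive thresholding, sketched below. The window size and bias factor are illustrative parameters, not values from the paper:

```python
def binarize(gray, win=3, bias=0.95):
    """Classify each pixel as text (1) or background (0) by comparing it
    to the mean of its (2*win+1) x (2*win+1) local window.  A simple
    non-learned baseline for per-pixel binarization."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            ys = range(max(0, r - win), min(h, r + win + 1))
            xs = range(max(0, c - win), min(w, c + win + 1))
            vals = [gray[y][x] for y in ys for x in xs]
            mean = sum(vals) / len(vals)
            out[r][c] = 1 if gray[r][c] < bias * mean else 0  # dark = text
    return out
```

The local window plays the role of the "local context" the paper mentions; what the LSTM adds is the global propagation of that evidence for better visual coherence.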


Journal ArticleDOI
TL;DR: A new optical character recognition (OCR) approach that allows real-time, automatic extraction and recognition of digits in images and videos, outperforming state-of-the-art methods.

50 citations


Journal ArticleDOI
TL;DR: A new system to detect and recognize Brazilian vehicle license plates at locations where registered users have permission to enter, achieving a 98.5% success rate on the tested cases.

45 citations


Proceedings ArticleDOI
08 Feb 2015
TL;DR: This paper shows that using a simple pre-processing step that normalizes the position and baseline of letters, an Arabic handwriting recognition method can make use of 1D LSTM, which is faster in learning and convergence, and yet achieve superior performance.
Abstract: In this paper, we present an Arabic handwriting recognition method based on recurrent neural networks. We use the Long Short-Term Memory (LSTM) architecture, which has proven successful in various printed and handwritten OCR tasks. Applications of LSTM to handwriting recognition typically employ a two-dimensional architecture to deal with variations along both the vertical and horizontal axes. However, we show that with a simple pre-processing step that normalizes the position and baseline of letters, we can make use of a 1D LSTM, which is faster in learning and convergence, and yet achieves superior performance. In a series of experiments on the IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained on manually crafted features to show that the automatically learned features in a globally trained 1D LSTM network with our normalization step can outperform even such systems.

44 citations


Proceedings ArticleDOI
01 Sep 2015
TL;DR: A new OCR correction strategy, customised for historical medical documents, which combines rule-based correction of regular errors with a medically-tuned spell-checking strategy, whose corrections are guided by information about subject-specific language usage from the publication period of the article to be corrected.
Abstract: Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, owing to large-scale digitisation efforts. Searchable access is typically provided by applying Optical Character Recognition (OCR) software to scanned page images. Often, however, the automatically recognised text contains a large number of errors, since OCR systems are typically optimised to deal with modern documents, and can struggle with historical document features, including variable print characteristics and archaic vocabulary usage. Low quality OCR text can reduce the efficiency of search systems over historical archives, particularly semantic systems that are based on the application of sophisticated text mining (TM) techniques. We report on a new OCR correction strategy, customised for historical medical documents. The method combines rule-based correction of regular errors with a medically-tuned spell-checking strategy, whose corrections are guided by information about subject-specific language usage from the publication period of the article to be corrected. The performance of our method compares favourably to other OCR post-correction strategies, in improving word-level accuracy of poor-quality documents by up to 16%.

37 citations
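The combination of rule-based correction of regular errors with lexicon-guided checking can be sketched as below. The confusion rules and the tiny lexicon are hypothetical stand-ins for the medically-tuned, period-specific resources the paper actually uses:

```python
# Hypothetical regular OCR confusions and a toy period lexicon.
RULES = [("rn", "m"), ("vv", "w"), ("1", "l")]
LEXICON = {"inflammation", "morbid", "fever", "the"}

def correct(token):
    """Apply confusion rules breadth-first, accepting the first rewrite
    found in the lexicon; otherwise leave the token unchanged."""
    if token in LEXICON:
        return token
    candidates = [token]
    for cand in candidates:          # list grows as rewrites are queued
        for src, dst in RULES:
            i = cand.find(src)
            while i != -1:
                new = cand[:i] + dst + cand[i + len(src):]
                if new in LEXICON:
                    return new
                candidates.append(new)
                i = cand.find(src, i + 1)
    return token                     # no confident correction
```

For example, `correct("inflarnrnation")` needs two applications of the `rn -> m` rule before the rewrite lands in the lexicon, which is the kind of regular error a pure spell-checker struggles with.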


Journal ArticleDOI
TL;DR: This work formulates the word segmentation problem as a binary quadratic assignment problem that considers pairwise correlations between gaps as well as the likelihoods of individual gaps, and estimates all parameters within the structured SVM framework so that the proposed method works well regardless of writing style and written language, without user-defined parameters.
Abstract: Segmentation of handwritten document images into text lines and words is an essential task for optical character recognition. However, since the features of handwritten documents are irregular and vary from writer to writer, it is considered a challenging problem. To address it, we formulate the word segmentation problem as a binary quadratic assignment problem that considers pairwise correlations between gaps as well as the likelihoods of individual gaps. Even though many parameters are involved in our formulation, we estimate them all within the structured SVM framework, so that the proposed method works well regardless of writing style and written language, without user-defined parameters. Experimental results on the ICDAR 2009/2013 handwriting segmentation databases show that the proposed method achieves state-of-the-art performance on Latin-based and Indian languages.

37 citations
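For contrast with the paper's joint formulation (binary quadratic assignment trained with a structured SVM), a naive gap-threshold baseline for grouping character boxes into words might look like this. The half-of-largest-gap cut is a crude heuristic that assumes at least one true inter-word gap exists:

```python
def split_words(boxes):
    """Group character bounding boxes (x0, x1) into words by cutting at
    horizontal gaps wider than half the largest gap.  Each gap is judged
    independently; the paper instead scores all gaps jointly."""
    boxes = sorted(boxes)
    gaps = [b[0] - a[1] for a, b in zip(boxes, boxes[1:])]
    if not gaps:
        return [boxes]
    cut = max(gaps) / 2
    words, cur = [], [boxes[0]]
    for gap, box in zip(gaps, boxes[1:]):
        if gap > cut:
            words.append(cur)
            cur = []
        cur.append(box)
    words.append(cur)
    return words
```

Such per-gap thresholds break down when inter-character spacing varies with writing style, which is exactly the failure mode the pairwise-correlation term in the paper's formulation is designed to fix.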


Book ChapterDOI
TL;DR: LeNet-5, a convolutional neural network trained with gradient-based learning and the backpropagation algorithm, is used for classification of Malayalam character images; results for the multi-class classifier show that CNN performance drops when the number of classes exceeds about 40.
Abstract: Optical Character Recognition (OCR) plays an important role in information retrieval, converting scanned documents into machine-editable and searchable text formats. This work focuses on the recognition part of OCR. LeNet-5, a Convolutional Neural Network (CNN) trained with gradient-based learning and the backpropagation algorithm, is used for classification of Malayalam character images. Results for the multi-class classifier show that CNN performance drops when the number of classes exceeds about 40. Accuracy is improved by grouping misclassified characters together. Without grouping, the CNN gives an average accuracy of 75%; after grouping, performance improves to 92%. Inner-level classification is done using a multi-class SVM, which gives an average accuracy in the range of 99-100%.

36 citations


Posted Content
TL;DR: The winning text image super-resolution framework largely improves OCR performance when low-resolution images are used as input, reaching an OCR accuracy score of 77.19%, comparable with that of the original high-resolution images.
Abstract: Text image super-resolution is a challenging yet open research problem in the computer vision community. In particular, low-resolution images hamper the performance of typical optical character recognition (OCR) systems. In this article, we summarize our entry to the ICDAR 2015 Competition on Text Image Super-Resolution. Experiments are based on the provided ICDAR 2015 TextSR dataset [3] and the released Tesseract-OCR 3.02 system [1]. We report that our winning text image super-resolution framework largely improves OCR performance when low-resolution images are used as input, reaching an OCR accuracy score of 77.19%, which is comparable with that of the original high-resolution images (78.80%). Index Terms: super-resolution; optical character recognition.

33 citations
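OCR accuracy scores like the ones above are typically derived from the edit distance between recognized text and ground truth. A minimal sketch of such a character-level accuracy measure (not necessarily the competition's exact scoring script):

```python
def levenshtein(a, b):
    """Edit distance by dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def ocr_accuracy(recognized, ground_truth):
    """Character accuracy: 1 - edit_distance / len(ground_truth)."""
    return 1.0 - levenshtein(recognized, ground_truth) / len(ground_truth)

print(ocr_accuracy("he1lo", "hello"))  # 0.8
```

Under this kind of measure, the reported gap between 77.19% (super-resolved input) and 78.80% (original high-resolution input) corresponds to fewer than two extra character errors per hundred ground-truth characters.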


Journal ArticleDOI
TL;DR: A segmentation-based method is proposed for developing Nastalique OCR, deriving principles and techniques for pre-processing and recognition; the work is extensible to other languages written in Nastalique.
Abstract: Much work on Arabic-script optical character recognition (OCR) has targeted the Naskh writing style. The Nastalique style, used for most languages written in the Arabic script across Southern Asia, is much more challenging to process due to its compactness, cursiveness, higher context sensitivity and diagonality. These traits make Nastalique writing more complex, with multiple letters horizontally overlapping each other. For these reasons, existing methods developed for Naskh do not work for Nastalique, and most work on Nastalique has therefore used non-segmentation methods. The current paper presents a new segmentation-based approach for the Nastalique style. It explains the complexity of Nastalique and why Naskh-based techniques cannot work for it, and proposes a segmentation-based method for developing Nastalique OCR, deriving principles and techniques for pre-processing and recognition. The OCR is developed for the Urdu language. The system is optimized using 79,093 instances of 5,249 main bodies derived from a corpus of 18 million words, giving a recognition accuracy of 97.11%. The system is then tested on document images of books, with 87.44% main-body recognition accuracy. The work is extensible to other languages written in Nastalique.

30 citations


Patent
26 May 2015
TL;DR: In this article, the authors present a multithreaded approach for processing an image to recognize and locate text in the image, and providing the recognized text to an application executing on the device for performing a function (e.g., calling a number, opening an internet browser, etc.) associated with the recognized text.
Abstract: Various embodiments enable a device to perform tasks such as processing an image to recognize and locate text in the image, and providing the recognized text to an application executing on the device for performing a function (e.g., calling a number, opening an internet browser, etc.) associated with the recognized text. In at least one embodiment, processing the image includes substantially simultaneously or concurrently processing the image with at least two recognition engines, such as at least two optical character recognition (OCR) engines, running in a multithreaded mode. In at least one embodiment, the recognition engines can be tuned so that their respective processing speeds are roughly the same. Utilizing multiple recognition engines enables processing latency to be close to that of using only one recognition engine.

Journal ArticleDOI
TL;DR: This paper introduces a novel text detection technique that uses blocks obtained by frame decomposition to extract text coordinates with the MapReduce programming model; the running speed can be more than twice that of the classic approach.

Proceedings ArticleDOI
23 Aug 2015
TL;DR: Evaluation results show that the character error rate obtained with LSTM varies from 5.51% to 14.68% and is better than two well-known OCR engines, namely, Tesseract and ABBYY FineReader.
Abstract: This paper reports on high-performance Optical Character Recognition (OCR) experiments using Long Short-Term Memory (LSTM) networks for Greek polytonic script. Even though there are many Greek polytonic manuscripts, the digitization of such documents has not been widely applied, and very limited work has been done on the recognition of these scripts. We have collected a large number of diverse document pages of Greek polytonic scripts in a novel database, called Polyton-DB, containing 15,689 text lines of synthetic and authentic printed scripts, and performed baseline experiments using LSTM networks. Evaluation results show that the character error rate obtained with LSTM varies from 5.51% to 14.68% (depending on the document) and is better than that of two well-known OCR engines, namely Tesseract and ABBYY FineReader.

Proceedings ArticleDOI
19 Mar 2015
TL;DR: This work proposes a combined horizontal and vertical projection feature extraction scheme for recognition of characters of Gurmukhi, an Indic script commonly used in the state of Punjab in India.
Abstract: Despite the advancements in Optical Character Recognition (OCR) technologies, the problem of Indic script character recognition remains challenging, and the challenges are even greater for handwritten characters. In this work, we focus on off-line recognition of handwritten characters of Gurmukhi, an Indic script commonly used in the state of Punjab in India. As part of this work, we collected a Gurmukhi character dataset of 3500 images from 10 writers. We propose a combined horizontal and vertical projection feature extraction scheme for recognition of Gurmukhi characters. We tested our method on the collected dataset and achieved a high character recognition accuracy of 98.06%.
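Combined horizontal and vertical projection features can be sketched for a binary character image as normalized row and column foreground counts. This is a generic illustration of projection profiles, not the authors' exact feature vector:

```python
def projection_features(img):
    """Horizontal and vertical projection profiles of a binary image:
    per-row and per-column foreground-pixel counts, normalized by image
    width/height so features are comparable across samples."""
    h, w = len(img), len(img[0])
    horiz = [sum(row) / w for row in img]                          # one value per row
    vert = [sum(img[r][c] for r in range(h)) / h for c in range(w)]  # one per column
    return horiz + vert
```

The concatenated profile captures where ink mass sits along each axis, which is often enough to separate characters with distinct stroke layouts.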

Proceedings ArticleDOI
19 Mar 2015
TL;DR: The objective of this paper is to recognize text from images, for the better understanding of the reader, by using a particular sequence of different processing modules.
Abstract: Text recognition in images is a research area which attempts to develop a computer system with the ability to automatically read text from images. These days there is a huge demand for storing the information available in paper documents in computer storage and later reusing this information through a searching process. One simple way to store information from these paper documents in a computer system is to first scan the documents and then store them as images. However, to reuse this information it is very difficult to read the individual contents and search the contents of these documents line-by-line and word-by-word. The challenges involved are the font characteristics of the characters in paper documents and the quality of the images. Due to these challenges, a computer is unable to recognize the characters while reading them. Thus there is a need for character recognition mechanisms to perform Document Image Analysis (DIA), which transforms documents in paper format to electronic format. In this paper we discuss a method for text recognition from images. The objective of this paper is to recognize text from images, for the better understanding of the reader, by using a particular sequence of different processing modules.

Proceedings ArticleDOI
01 Nov 2015
TL;DR: The general architecture of a modern OCR system, with details of each module, is discussed; Moore neighborhood tracing is applied to extract character boundaries, followed by chain codes for feature extraction.
Abstract: Artificial intelligence, pattern recognition and computer vision have significant importance in the field of electronics and image processing. Optical character recognition (OCR) is one of the main aspects of pattern recognition and has evolved greatly since its beginning. OCR is a system which recognizes readable characters from optical data and converts them into digital form. Various methodologies have been developed for this purpose using different approaches. In this paper, the general architecture of a modern OCR system, with details of each module, is discussed. We applied Moore neighborhood tracing to extract character boundaries and then chain codes for feature extraction. In the classification stage for character recognition, an SVM is trained and applied to suitable examples.
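Moore neighborhood boundary tracing followed by chain-code extraction, as described above, can be sketched as follows. This is a simplified version that traces only the first connected component found in raster order:

```python
# Clockwise Moore neighbourhood, starting due north.
DIRS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def moore_trace(img):
    """Trace the outer boundary of the first connected component found
    in raster order (img is a 2D list of 0/1)."""
    h, w = len(img), len(img[0])
    start = next((r, c) for r in range(h) for c in range(w) if img[r][c])
    boundary, cur = [start], start
    prev = (start[0], start[1] - 1)        # backtrack: the pixel just west
    while True:
        i = DIRS.index((prev[0] - cur[0], prev[1] - cur[1]))
        for k in range(1, 9):              # scan clockwise from the backtrack
            j = (i + k) % 8
            nr, nc = cur[0] + DIRS[j][0], cur[1] + DIRS[j][1]
            if 0 <= nr < h and 0 <= nc < w and img[nr][nc]:
                jp = (i + k - 1) % 8       # last background cell checked
                prev = (cur[0] + DIRS[jp][0], cur[1] + DIRS[jp][1])
                cur = (nr, nc)
                break
        else:
            break                          # isolated pixel, nothing to trace
        if cur == start:
            break
        boundary.append(cur)
    return boundary

def chain_code(boundary):
    """8-direction Freeman chain code of a closed boundary."""
    closed = boundary + [boundary[0]]
    return [DIRS.index((b[0] - a[0], b[1] - a[1]))
            for a, b in zip(closed, closed[1:])]
```

The chain code turns the traced boundary into a compact, translation-invariant direction sequence, which is the kind of feature the paper feeds to its SVM classifier.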

Proceedings ArticleDOI
10 Jan 2015
TL;DR: A method for the automatic detection and recognition of text and symbols painted on the road surface is presented and achieves F-measures of 0.85 for text characters and 0.91 for symbols.
Abstract: A method for the automatic detection and recognition of text and symbols painted on the road surface is presented. Candidate regions are detected as maximally stable extremal regions (MSER) in a frame which has been transformed into an inverse perspective mapping (IPM) image, showing the road surface with the effects of perspective distortion removed. Detected candidates are then sorted into words and symbols, before they are interpreted using separate recognition stages. Symbol-based road markings are recognised using histogram of oriented gradient (HOG) features and support vector machines (SVM). Text-based road signs are recognised using a third-party optical character recognition (OCR) package, after application of a perspective correction stage. Matching of regions between frames, and temporal fusion of results is used to improve performance. The proposed method is validated using a data-set of videos, and achieves F-measures of 0.85 for text characters and 0.91 for symbols.

Journal ArticleDOI
TL;DR: This paper attempts the problem of Optical Character Recognition for handwritten Gujarati alphabets using SVM and kNN, and compares the kNN results with those of SVM.
Abstract: The Gujarati language is used in the western Indian state of Gujarat. Because of its peculiarities, Optical Character Recognition for Gujarati is very difficult, and very little work has been done on it. In this paper, I have attempted the problem of Optical Character Recognition for handwritten Gujarati alphabets. For this work, forty handwritten alphabets were collected from about one hundred and ninety-nine writers. Aspect ratio, extent of the alphabet, and an image subdivision approach are used as the feature space, and a support vector machine (SVM) is used for classification, giving 86.66% accuracy. kNN is also used for classification, and its result is compared with that of SVM. The paper also describes the support vector machine.

Proceedings ArticleDOI
07 Apr 2015
TL;DR: A new and simple, but fast and efficient technique for automatic number plate recognition (ANPR) using SIFT (Scale Invariant Feature Transform) features, which is used to automatically locate and recognize, as a special case, the Jordanian license plates.
Abstract: This paper presents a new and simple, but fast and efficient, technique for automatic number plate recognition (ANPR) using SIFT (Scale-Invariant Feature Transform) features. The proposed system is used to automatically locate and recognize, as a special case, Jordanian license plates. At the core of our system, a SIFT-based template matching technique is used to locate special marks in the license plate. Upon successful detection of those marks, the license plate is segmented out from the original image and OCR (Optical Character Recognition) is used to recognize the characters and numbers on the plate. Due to the various invariance virtues of SIFT, our method can adaptively deal with various changes in the license plates, such as rotation, scaling, and illumination. Experimental results using real datasets are presented, which show that our system performs well.

Journal ArticleDOI
TL;DR: This paper presents an application of optical word recognition and fuzzy control to a smartphone automatic test system; the proposed control scheme allows the robot arm to perform the different assigned test functions successfully.
Abstract: This paper presents an application of optical word recognition and fuzzy control to a smartphone automatic test system. The system consists of a robot arm and two webcams. After the words from the control panel that represent commands are recognized by the robot system, the robot arm performs the corresponding actions to test the smartphone. One of the webcams is utilized to capture commands on the screen of the control panel, the other to recognize the words on the screen of the tested smartphone. The method of image processing is based on the Red-Green-Blue (RGB) and Hue-Saturation-Luminance (HSL) color spaces to reduce the influence of light. Fuzzy theory is used in the robot arm’s position control. The Optical Character Recognition (OCR) technique is applied to the word recognition, and the recognition results are then checked by a dictionary process to increase the recognition accuracy. The camera which is used to recognize the tested smartphone also provides object coordinates to the fuzzy controller, then the robot arm moves to the desired positions and presses the desired buttons. The proposed control scheme allows the robot arm to perform different assigned test functions successfully.
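The RGB-to-HSL conversion used above to reduce the influence of light can be illustrated with Python's standard `colorsys` module (note it uses the HLS argument order): hue stays largely stable under illumination changes, which mainly move lightness.

```python
import colorsys

def hue_of(rgb):
    """Hue (in degrees) of an 8-bit RGB pixel.  Varying illumination
    mostly shifts the L channel while leaving H largely stable, which is
    why hue is the robust channel for colour-based detection."""
    r, g, b = (v / 255.0 for v in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)   # note: HLS ordering, not HSL
    return h * 360.0

print(hue_of((255, 0, 0)))   # 0.0 (red)
print(hue_of((0, 255, 0)))   # approx. 120.0 (green)
```

Thresholding on hue (plus a minimum saturation to reject greys) is a common way to make colour segmentation tolerant of the lighting variation the paper mentions.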

Proceedings ArticleDOI
01 Dec 2015
TL;DR: The main aim of this paper is to propose efficient feature extraction and classification techniques for an OCR system for handwritten Kannada characters and numerals, involving several phases: preprocessing, feature extraction and classification.
Abstract: A frontier area of research in the field of pattern recognition and image processing is handwritten character recognition, which creates a great demand for OCR systems that handle handwritten documents. In order to recognize the text present in a document, an Optical Character Recognition (OCR) system is developed. In this paper, an OCR system for handwritten Kannada characters and numerals is developed, involving several phases: preprocessing, feature extraction and classification. Preprocessing includes techniques suitable for converting the input image into an acceptable form for feature extraction. The main aim of this paper is to propose efficient feature extraction and classification techniques. Suitable structural features are extracted, and the wavelet transform is employed for extracting global features. An artificial neural network classifier is used for recognizing the handwritten Kannada characters and numerals. The proposed method was evaluated on 4800 images of handwritten Kannada characters, obtaining an average accuracy of 91.00%, and on 1000 images of handwritten Kannada numerals, obtaining an average accuracy of 97.60%.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: The proposed approach covers the preprocessing, feature extraction and classification of Urdu-language text, using three feature extraction techniques, the Hu moments, Zernike moments and Principal Component Analysis (PCA), with a decision tree for classification.
Abstract: This article reports the development and experimental analysis of an Urdu Optical Character Recognition (OCR) system. The proposed approach covers the preprocessing, feature extraction and classification of Urdu-language text. Three different feature extraction techniques, the Hu moments, Zernike moments and Principal Component Analysis (PCA), are used. The decision tree algorithm J48 is used for classification. A medium-sized database of 441 characters is created, consisting of handwritten and machine-written Urdu characters. An overall best recognition accuracy of 92.06% is achieved using the Hu moments.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: In this study, an Android application is developed by integrating the Tesseract OCR engine, the Bing translator and the phone's built-in speech-output technology, helping travelers who visit a foreign country understand messages written in a different language.
Abstract: Smartphones are among the most commonly used electronic devices in daily life today. As the hardware embedded in smartphones can perform many more tasks than traditional phones, a smartphone is no longer just a communication device but also a powerful computing device able to capture images, record videos, surf the internet, and so on. With the advancement of technology, it is possible to apply techniques for text detection and translation. Therefore, an application that allows a smartphone to capture an image, extract the text from it, translate it into English and speak it out is no longer a dream. In this study, an Android application is developed by integrating the Tesseract OCR engine, the Bing translator and the phone's built-in speech-output technology. The final deliverable was tested by various types of target end users from different language backgrounds, and it was concluded that the application benefits many users. Using this app, travelers who visit a foreign country are able to understand messages written in a different language. Visually impaired users are also able to access important messages from printed text through the speech-output feature.

Proceedings ArticleDOI
23 Aug 2015
TL;DR: This paper proposes a word-level script identification technique for six handwritten Indic scripts, Bangla, Devanagari, Gurumukhi, Malayalam, Oriya and Telugu, plus the Roman script, using a combination of elliptical and polygonal approximation techniques.
Abstract: Automatic script identification from handwritten document images facilitates many important applications such as indexing, sorting and triage. A given Optical Character Recognition (OCR) system is typically trained on only a single script, but for documents or collections containing different scripts there must be some way to automatically identify the script prior to OCR. For Indic scripts, some results have been reported in the literature, but the task is far from solved. In this paper, we propose a word-level script identification technique for six handwritten Indic scripts, Bangla, Devanagari, Gurumukhi, Malayalam, Oriya and Telugu, plus the Roman script. A set of 82 features has been designed using a combination of elliptical and polygonal approximation techniques. Our approach has been evaluated on a dataset of 7000 handwritten text words using multiple classifiers. A Multi-Layer Perceptron (MLP) classifier was found to perform best, yielding 95.35% accuracy. The result is encouraging considering the complexities and shape variations of the Indic scripts.

Proceedings ArticleDOI
01 Oct 2015
TL;DR: The objective of this review paper is to summarize the well-known methods for text recognition from images, for the better understanding of the reader.
Abstract: Text recognition in images is an active research area which attempts to develop computer applications with the ability to automatically read text from images. Nowadays there is a huge demand for storing the information available in paper documents in a computer-readable form for later use. One simple way to store information from these paper documents in a computer system is to first scan the documents and then store them as images. However, to reuse this information it is very difficult to read the individual contents and search the contents of these documents line-by-line and word-by-word. The challenges involved are the font characteristics of the characters in paper documents and the quality of the images. Due to these challenges, a computer is unable to recognize the characters while reading them. Thus, there is a need for character recognition mechanisms to perform document image analysis, which transforms documents in paper format to electronic format. In this paper, we review and analyze different methods for text recognition from images. The objective of this review paper is to summarize the well-known methods for the better understanding of the reader.

Proceedings ArticleDOI
23 Aug 2015
TL;DR: This research evaluates the performance of sequence classifiers such as HMM and LSTM, compares them with a descriptor-based classifier using SIFT, and introduces a database of 480,000 images containing 1000 unique ligatures, or sub-words, of Pashto.
Abstract: Optical Character Recognition (OCR) of cursive scripts like Pashto and Urdu is difficult due to the presence of complex ligatures and connected writing styles. In this paper, we evaluate and compare different approaches for the recognition of such complex ligatures. The approaches include the Hidden Markov Model (HMM), the Long Short-Term Memory (LSTM) network and the Scale-Invariant Feature Transform (SIFT). The current state of the art in cursive script recognition assumes constant scale without any rotation, while real-world data contain rotation and scale variations. This research aims to evaluate the performance of sequence classifiers like HMM and LSTM and compare them with a descriptor-based classifier using SIFT. In addition, we assess the performance of these methods under the scale and rotation variations found in cursive-script ligatures. Moreover, we introduce a database of 480,000 images containing 1000 unique ligatures, or sub-words, of Pashto; each ligature has 40 scale and 12 rotation variations. The evaluation results show a significantly better performance of LSTM over HMM and over a traditional feature extraction technique such as SIFT.

Book ChapterDOI
24 Nov 2015
TL;DR: The proposed character segmentation technique can cope with high variation in writing style and skewed header lines, and can be used as part of an OCR system for the cursive handwritten Hindi language.
Abstract: Proper character-level segmentation of printed or handwritten text is an important preprocessing step for optical character recognition (OCR). Languages with a cursive writing style make the segmentation problem much more complicated, and Hindi is a well-known Indian language with such a cursive writing style. The main challenge in handwritten character segmentation is handling the inherent variability in the writing styles of different individuals. In this paper, we present an efficient character segmentation method for handwritten Hindi words. Segmentation is performed on the basis of structural patterns observed in the writing style of this language. The proposed method can cope with high variation in writing style and with skewed header lines as input. The method has been tested on our own database of both printed and handwritten words. The average success rate is 96.93%. The method yields fairly good results on this database compared with other existing methods. We foresee that the proposed character segmentation technique can be used as part of an OCR system for the cursive handwritten Hindi language.

Posted Content
TL;DR: An end-to-end framework that segments the text image, classifies the characters and extracts lines using a language model using a deep convolutional neural network to achieve acceptable error rates is presented.
Abstract: In this paper, we address the task of Optical Character Recognition(OCR) for the Telugu script. We present an end-to-end framework that segments the text image, classifies the characters and extracts lines using a language model. The segmentation is based on mathematical morphology. The classification module, which is the most challenging task of the three, is a deep convolutional neural network. The language is modelled as a third degree markov chain at the glyph level. Telugu script is a complex alphasyllabary and the language is agglutinative, making the problem hard. In this paper we apply the latest advances in neural networks to achieve state-of-the-art error rates. We also review convolutional neural networks in great detail and expound the statistical justification behind the many tricks needed to make Deep Learning work.
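The glyph-level third-order Markov chain described above can be sketched as next-glyph counts over length-3 glyph contexts. This is an illustrative toy trained on a character string, not the paper's Telugu glyph model:

```python
from collections import Counter, defaultdict

def train_lm(glyphs, order=3):
    """Count next-glyph frequencies for every length-`order` context."""
    counts = defaultdict(Counter)
    for i in range(len(glyphs) - order):
        counts[tuple(glyphs[i:i + order])][glyphs[i + order]] += 1
    return counts

def prob(counts, context, glyph):
    """Maximum-likelihood P(glyph | context); 0.0 for unseen contexts."""
    ctx = counts[tuple(context)]
    total = sum(ctx.values())
    return ctx[glyph] / total if total else 0.0
```

In an OCR pipeline, such context probabilities rescore the classifier's candidate glyphs so that visually ambiguous shapes are resolved toward sequences the language actually produces.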

Proceedings ArticleDOI
08 Feb 2015
TL;DR: This work proposes an NR-IQA method with OCR accuracy as the objective quality measure; it combines distortion-specific quality metrics, achieves results competitive with learning-based NR-IQA methods on standard datasets, and performs better on heterogeneous documents.
Abstract: No-reference image quality assessment (NR-IQA) aims at computing an image quality score that best correlates with either human-perceived image quality or an objective quality measure, without any prior knowledge of reference images. Although learning-based NR-IQA methods have achieved the best state-of-the-art results so far, those methods perform well only on the datasets on which they were trained. The datasets usually contain homogeneous documents, whereas in reality, document images come from different sources. It is unrealistic to collect training samples of images from every possible capturing device and every document type. Hence, we argue that a metric-based IQA method is more suitable for heterogeneous documents. We propose an NR-IQA method with OCR accuracy as the objective quality measure. The method combines distortion-specific quality metrics, and the final quality score is calculated taking into account the proportions of, and the dependency among, different distortions. Experimental results show that the method achieves results competitive with learning-based NR-IQA methods on standard datasets, and performs better on heterogeneous documents.

Journal ArticleDOI
TL;DR: Two original recognition methods are presented: the first is based on the application of mathematical fuzzy logic, and the second on representing an image by a fuzzy-valued function; both are compared with a simple neural network classifier and a few other common methods.