Open Access Dissertation

Texture Feature-based Document Image Retrieval

Fahimeh Alaei
TLDR
This dissertation proposes a fast, non-parametric texture feature extraction method that summarises the local grey-level structure of the image. The method provided promising results, with lower computing time and smaller memory consumption than other variations of local binary pattern-based methods.
Abstract
Emerging technology has tended towards storing and manipulating documents in digital form, contributing to a paperless society. There has been notable growth in the variety and quantity of digitised documents, which have often been scanned or photographed and archived as images without any labelling or sufficient index information. The growth of these kinds of document images will undoubtedly continue with new technology. To provide an effective way of retrieving and organizing these document images, many techniques have been proposed in the literature. However, designing automated systems that accurately retrieve document images from archives remains a challenging problem. Finding discriminative and effective features is the fundamental task in developing an efficient retrieval system. An overview of the literature reveals that document image retrieval using texture-based features has not yet been broadly investigated. Texture features are suitable for large volumes of data and are generally fast to compute.
In this study, the effectiveness of more than 50 different texture-based feature extraction methods from four categories of texture features (statistical, transform-based, model-based, and structural approaches) is investigated in order to propose a more accurate method for document image retrieval. Moreover, the influence of resolution and similarity metrics on document image retrieval is examined. The MTDB, ITESOFT, and CLEF_IP datasets, which are heterogeneous datasets providing a great variety of page layouts and contents, are considered for experimentation, and the results are computed in terms of retrieval precision, recall, and F-score. By considering the performance, time complexity, and memory usage of the different texture features on the three datasets, the category of texture features that yields the best retrieval results is discussed.
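The three evaluation measures named above can be illustrated with a minimal sketch. It assumes the common set-based definitions for a single query; the function name `retrieval_scores` and the document ids are hypothetical, and the exact evaluation protocol used in the dissertation (for example, how many results are retrieved per query) is not stated in this abstract.

```python
def retrieval_scores(retrieved, relevant):
    """Return (precision, recall, F-score) for one query.

    retrieved: ids returned by the retrieval system (hypothetical ids).
    relevant:  ids that are actually relevant to the query.
    Assumes set-based definitions; duplicates in either input are ignored.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)

    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0

    # F-score: harmonic mean of precision and recall (0 when both are 0).
    if precision + recall == 0:
        f_score = 0.0
    else:
        f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    p, r, f = retrieval_scores(["a", "b", "c", "d"], ["a", "b", "e"])
    print(p, r, f)
```

In the usage example, two of the four retrieved documents are relevant and two of the three relevant documents are found, so precision is 1/2, recall is 2/3, and the F-score is 4/7.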
The transform-based category is shown to give higher retrieval results than the other categories. Several new feature extraction and document image retrieval methods are proposed in this research. The number of extracted features and the time complexity play a significant role in attaining fast document image retrieval. Thus, a fast and non-parametric texture feature extraction method based on summarising the local grey-level structure of the image is further proposed in this research work. The proposed fast local binary pattern provided promising results, with lower computing time as well as smaller memory space consumption compared to other variations of local binary pattern-based methods.
A challenge arises in document image retrieval (DIR) systems when the document images in queries differ in resolution from the document images used for training the system. In addition, only a small number of document image samples with a particular resolution may be available for training a DIR system. To investigate these two issues, an under-sampling concept is considered to generate under-sampled images and to improve the retrieval results.
In order to use more than one characteristic of document images for document image retrieval, two different texture-based features are used for feature extraction. The fast local binary pattern method, as a statistical approach, and a wavelet analysis technique, as a transform-based approach, are used for feature extraction, and two feature vectors are obtained for every document image. A classifier fusion method using the weighted average of the distance measures obtained for each feature vector is then proposed to improve document image retrieval results.
To extract features similar to human visual system perception, an appearance-based feature extraction method for document images is also proposed. In the proposed method, the Gist operator is employed on the sub-images obtained from the wavelet transform. A set of global features is thereby extracted from the original image as well as from the sub-images. Wavelet-based features are also considered as the second feature set. The classifier fusion technique is finally employed to find similarity distances between the features extracted using the Gist operator and the wavelet transform from a given query and the knowledge-base. This proposed system obtained higher document image retrieval results than the other systems in the literature.
The other appearance-based document image retrieval system proposed in this research is based on the use of a saliency map obtained from human visual attention. The saliency map obtained from the input document image is used to form a weighted document image. Features are then extracted from the weighted document images using the Gist operator. The proposed retrieval system provided the best document image retrieval results compared to the results reported for other systems.
Further research could be undertaken to combine the properties of other approaches to improve retrieval results. Since a priori knowledge regarding document image layout and content was not considered in the conducted experiments, prior knowledge about the document classes could also be integrated into the feature set to further improve the retrieval performance.
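The weighted average fusion of distance measures mentioned in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the dissertation's implementation: the Euclidean distance, the min-max normalisation of each distance list, the default weight of 0.5, and all function names (`euclidean`, `min_max`, `fused_ranking`) are assumptions introduced here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def min_max(values):
    """Scale a list of distances to [0, 1] so the two feature types are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


def fused_ranking(query_a, query_b, base_a, base_b, weight_a=0.5):
    """Rank knowledge-base documents by a weighted average of two distances.

    query_a, query_b: the query's two feature vectors (e.g. one per feature type).
    base_a, base_b:   per-document feature vectors of the same two types.
    weight_a:         assumed weight of the first distance; the second gets 1 - weight_a.
    Returns (ranking, fused), where ranking lists document indices, nearest first.
    """
    dist_a = min_max([euclidean(query_a, f) for f in base_a])
    dist_b = min_max([euclidean(query_b, f) for f in base_b])
    fused = [weight_a * da + (1.0 - weight_a) * db
             for da, db in zip(dist_a, dist_b)]
    ranking = sorted(range(len(fused)), key=lambda i: fused[i])
    return ranking, fused


if __name__ == "__main__":
    # Toy feature vectors for three documents; values are made up for illustration.
    base_a = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
    base_b = [[5.0], [0.0], [1.0]]
    ranking, fused = fused_ranking([0.0, 0.0], [0.0], base_a, base_b)
    print(ranking, fused)
```

In the toy example, document 0 is nearest under the first feature type but farthest under the second, so the fused ranking places document 1 first. Normalising before averaging is a design choice made here so that neither distance dominates purely because of its scale.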

