Oriol Ramos Terrades
Other affiliations: Polytechnic University of Valencia
Bio: Oriol Ramos Terrades is an academic researcher at the Autonomous University of Barcelona. He has contributed to research on topics such as symbol recognition and visual words. He has an h-index of 14 and has co-authored 54 publications receiving 621 citations. Previous affiliations of Oriol Ramos Terrades include the Polytechnic University of Valencia.
TL;DR: This paper tackles the problem of the combination of classifiers using a non-Bayesian probabilistic framework, and results on real data show that the proposed methods outperform other common combination schemes.
Abstract: The combination of the output of classifiers has been one of the strategies used to improve classification rates in general purpose classification systems. Some of the most common approaches can be explained using the Bayes' formula. In this paper, we tackle the problem of the combination of classifiers using a non-Bayesian probabilistic framework. This approach permits us to derive two linear combination rules that minimize misclassification rates under some constraints on the distribution of classifiers. In order to show the validity of this approach we have compared it with other popular combination rules from a theoretical viewpoint using a synthetic data set, and experimentally using two standard databases: the MNIST handwritten digit database and the GREC symbol database. Results on the synthetic data set show the validity of the theoretical approach. Indeed, results on real data show that the proposed methods outperform other common combination schemes.
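A minimal sketch of the general idea of a linear combination rule, with illustrative posteriors and weights (the paper derives its own combination rules from constraints on the classifier distributions, which are not reproduced here):

```python
import numpy as np

# Hypothetical posterior estimates from three classifiers for one sample
# over four classes (rows: classifiers, columns: classes).
posteriors = np.array([
    [0.60, 0.20, 0.10, 0.10],
    [0.50, 0.30, 0.10, 0.10],
    [0.20, 0.50, 0.20, 0.10],
])

# Illustrative per-classifier weights (e.g., validation accuracies);
# placeholders, not the weights the paper's rules would produce.
weights = np.array([0.5, 0.3, 0.2])

combined = weights @ posteriors          # linear combination of outputs
combined /= combined.sum()               # renormalise to a distribution
predicted_class = int(np.argmax(combined))
```

Here the first two classifiers outvote the third, so the combined rule keeps class 0 even though one classifier preferred class 1.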
TL;DR: This paper presents a floor plan database, named CVC-FP, annotated with architectural objects and their structural relations, together with a groundtruthing tool, the SGT tool, that allows this kind of information to be specified in a natural manner.
Abstract: Recent results on structured learning methods have shown the impact of structural information in a wide range of pattern recognition tasks. In the field of document image analysis, there is long experience with structural methods for the analysis and information extraction of multiple types of documents. Yet, the lack of conveniently annotated, freely accessible databases has held back progress in some areas, such as technical drawing understanding. In this paper, we present a floor plan database, named CVC-FP, that is annotated with the architectural objects and their structural relations. To construct this database, we have implemented a groundtruthing tool, the SGT tool, that allows this kind of information to be specified in a natural manner. The tool has been built for general-purpose groundtruthing: it allows users to define their own object classes and properties, supports multiple labeling options, enables cooperative work, and provides user and version control. Finally, we have collected some of the recent work on floor plan interpretation and present a quantitative benchmark for this database. Both the CVC-FP database and the SGT tool are freely released to the research community to ease comparisons between methods and boost reproducible research.
26 Jul 2009
TL;DR: A new handwritten text database, GERMANA, is presented to facilitate empirical comparison of different approaches to text line extraction and off-line handwriting recognition; it is, to our knowledge, the first publicly available handwriting-research database mostly written in Spanish that is comparable in size to standard databases.
Abstract: A new handwritten text database, GERMANA, is presented to facilitate empirical comparison of different approaches to text line extraction and off-line handwriting recognition. GERMANA is the result of digitising and annotating a 764-page Spanish manuscript from 1891, in which most pages only contain nearly calligraphed text written on ruled sheets of well-separated lines. To our knowledge, it is the first publicly available database for handwriting research, mostly written in Spanish and comparable in size to standard databases. Due to its sequential book structure, it is also well-suited for realistic assessment of interactive handwriting recognition systems. To provide baseline results for reference in future studies, empirical results are also reported, using standard techniques and tools for preprocessing, feature extraction, HMM-based image modelling, and language modelling.
01 Dec 2008
TL;DR: A histogram of the Radon transform, called HRT, is proposed, which is invariant to common geometrical transformations and is compared with several well-known descriptors to show the robustness of the method.
Abstract: In this paper we present a new descriptor based on the Radon transform. We propose a histogram of the Radon transform, called HRT, which is invariant to common geometrical transformations. For black and white shapes, the HRT descriptor is a histogram of shape lengths at each orientation. The experimental results, defined on different databases and compared with several well-known descriptors, show the robustness of our method.
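As a rough illustration of the descriptor's idea, the sketch below histograms chord lengths of a binary shape at two orientations only (0° and 90°), where the Radon projection reduces to plain row and column sums; the actual HRT samples many orientations and uses the full Radon transform:

```python
import numpy as np

def hrt_two_orientations(shape_img, n_bins=8):
    """Sketch of a histogram-of-Radon-transform (HRT) style descriptor,
    restricted to 0 and 90 degrees where the Radon projection of a
    binary image is just its column/row sums (chord lengths)."""
    img = (np.asarray(shape_img) > 0).astype(float)
    max_len = max(img.shape)
    hists = []
    for proj in (img.sum(axis=0), img.sum(axis=1)):  # 0 deg and 90 deg
        lengths = proj[proj > 0]          # chord lengths crossing the shape
        h, _ = np.histogram(lengths, bins=n_bins, range=(0, max_len))
        h = h / max(h.sum(), 1)           # normalise per orientation
        hists.append(h)
    return np.concatenate(hists)

# A 4x4 filled square: every chord has length 4 at both orientations,
# so each per-orientation histogram puts all mass in a single bin.
square = np.zeros((8, 8))
square[2:6, 2:6] = 1
descriptor = hrt_two_orientations(square)
```

Per-orientation normalisation is what makes the real descriptor robust to scaling; invariance to rotation comes from how the full set of orientations is handled, which this two-angle toy does not capture.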
25 Aug 2013
TL;DR: This paper presents an automatic forgery detection method based on a document's intrinsic features at character level, relying on the one hand on outlier character detection in a discriminant feature space and on the other hand on the detection of strictly similar characters.
Abstract: Paper documents still account for a large share of the information supports used nowadays and may contain critical data. Even though official documents are secured with techniques such as printed patterns or artwork, ordinary paper documents suffer from a lack of security. Moreover, the wide availability of cheap scanning and printing hardware allows non-experts to easily create fake documents. As adding a watermarking system during the document production step is hardly possible, solutions have to be proposed to distinguish a genuine document from a forged one. In this paper, we present an automatic forgery detection method based on a document's intrinsic features at character level. The method relies, on the one hand, on outlier character detection in a discriminant feature space and, on the other hand, on the detection of strictly similar characters. To this end, a feature set is computed for all characters. Then, based on a distance between characters of the same class, each character is classified as genuine or fake.
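A toy sketch of the two cues just described, using hypothetical 2-D feature vectors and arbitrary placeholder thresholds (not the paper's discriminant features):

```python
import numpy as np
from itertools import combinations

def flag_characters(features, outlier_thresh=2.0, clone_eps=1e-6):
    """Flag suspicious characters of one class via two cues:
    (1) outliers in feature space, (2) pairs of characters that are
    suspiciously close to identical (possible copy-paste forgery)."""
    features = np.asarray(features, dtype=float)
    # Cue 1: z-score of each character's distance to the class mean;
    # a large distance suggests a character that does not fit the class.
    d = np.linalg.norm(features - features.mean(axis=0), axis=1)
    z = (d - d.mean()) / (d.std() + 1e-12)
    outliers = set(np.where(z > outlier_thresh)[0])
    # Cue 2: genuine printed-and-scanned characters are never
    # pixel-identical, so a near-zero pairwise distance is suspicious.
    clones = {pair for pair in combinations(range(len(features)), 2)
              if np.linalg.norm(features[pair[0]] - features[pair[1]]) < clone_eps}
    return outliers, clones

# Toy data: four coherent characters, one outlier (index 4),
# and an exact duplicate pair (indices 1 and 5).
feats = [[0, 0], [0.1, 0], [0, 0.1], [0.05, 0.05], [10, 10], [0.1, 0]]
outliers, clones = flag_characters(feats)
```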
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, sparse kernel machines, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and the combination of models.
Abstract: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.
TL;DR: Experimental results show that the proposed relationship-learning-based super-resolution method outperforms existing face super-resolution methods in terms of both image quality and recognition accuracy.
Abstract: This paper addresses the very low resolution (VLR) problem in face recognition in which the resolution of the face image to be recognized is lower than 16 × 16. With the increasing demand of surveillance camera-based applications, the VLR problem happens in many face application systems. Existing face recognition algorithms are not able to give satisfactory performance on the VLR face image. While face super-resolution (SR) methods can be employed to enhance the resolution of the images, the existing learning-based face SR methods do not perform well on such a VLR face image. To overcome this problem, this paper proposes a novel approach to learn the relationship between the high-resolution image space and the VLR image space for face SR. Based on this new approach, two constraints, namely, new data and discriminative constraints, are designed for good visuality and face recognition applications under the VLR problem, respectively. Experimental results show that the proposed SR algorithm based on relationship learning outperforms the existing algorithms in public face databases.
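As a simplified stand-in for learning a relationship between the two image spaces, the sketch below fits a ridge-regularised linear map from a low-resolution feature space to a high-resolution one on synthetic data; the paper's method imposes its data and discriminative constraints on top of a learned relationship of this general kind, so dimensions, noise level, and regularisation strength here are all illustrative:

```python
import numpy as np

# Synthetic stand-in for paired training data: low-resolution vectors
# X_lr and high-resolution vectors X_hr related by an unknown linear map.
rng = np.random.default_rng(0)
n, d_lr, d_hr = 200, 16, 64            # e.g. flattened 4x4 -> 8x8 patches
W_true = rng.normal(size=(d_hr, d_lr))
X_lr = rng.normal(size=(n, d_lr))
X_hr = X_lr @ W_true.T + 0.01 * rng.normal(size=(n, d_hr))

# Ridge regression: W = X_hr^T X_lr (X_lr^T X_lr + lam I)^{-1}
lam = 1e-3                              # regularisation strength (placeholder)
W = X_hr.T @ X_lr @ np.linalg.inv(X_lr.T @ X_lr + lam * np.eye(d_lr))

X_hr_pred = X_lr @ W.T
rel_err = np.linalg.norm(X_hr_pred - X_hr) / np.linalg.norm(X_hr)
```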
07 Jun 2015
TL;DR: Because one does not know in advance whether a feature is effective for a given query, it is of great importance for image search to identify feature effectiveness in a query-adaptive manner.
Abstract: Feature fusion has been proven effective [35, 36] in image search. Typically, it is assumed that the to-be-fused heterogeneous features work well by themselves for the query. However, in a more realistic situation, one does not know in advance whether a feature is effective or not for a given query. As a result, it is of great importance to identify feature effectiveness in a query-adaptive manner.
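One way to picture query-adaptive fusion is to weight each feature's similarity scores by a per-query effectiveness estimate; the score-gap heuristic below is a hypothetical illustration of that idea, not the method of the cited paper:

```python
import numpy as np

def query_adaptive_fuse(score_lists):
    """Fuse per-feature score vectors for one query, weighting each
    feature by a crude effectiveness estimate: an effective feature
    separates its top match from the rest (gap heuristic, assumption)."""
    fused = np.zeros_like(np.asarray(score_lists[0], dtype=float))
    total = 0.0
    for scores in score_lists:
        scores = np.asarray(scores, dtype=float)
        s = np.sort(scores)[::-1]
        effectiveness = max(s[0] - s[1:].mean(), 0.0)
        fused += effectiveness * scores
        total += effectiveness
    return fused / max(total, 1e-12)

good = np.array([0.9, 0.2, 0.1, 0.1])   # discriminative for this query
bad = np.array([0.5, 0.5, 0.5, 0.5])    # uninformative for this query
fused = query_adaptive_fuse([good, bad])
```

The uninformative feature gets zero weight under this heuristic, so the fused ranking follows the discriminative one.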
16 Jun 2012
TL;DR: To fuse the predicted confidence scores of multiple models, this paper converts each model's confidence score vector into a pairwise relationship matrix in which each entry characterizes the comparative relationship between the scores of two test samples.
Abstract: In this paper, we propose a rank minimization method to fuse the predicted confidence scores of multiple models, each of which is obtained based on a certain kind of feature. Specifically, we convert each confidence score vector obtained from one model into a pairwise relationship matrix, in which each entry characterizes the comparative relationship between the scores of two test samples. Our hypothesis is that the relative score relations are consistent among component models up to certain sparse deviations, despite the large variations that may exist in the absolute values of the raw scores. We then formulate the score fusion problem as seeking a shared rank-2 pairwise relationship matrix based on which each individual model's original score matrix can be decomposed into the common rank-2 matrix plus sparse deviation errors. A robust score vector is then extracted to fit the recovered low-rank score relation matrix. We formulate the problem as a nuclear norm and ℓ1 norm optimization objective and employ the Augmented Lagrange Multiplier (ALM) method for the optimization. Our method is isotonic (i.e., scale invariant) with respect to the numeric scales of the scores originating from different models. We experimentally show that the proposed method achieves significant performance gains on various tasks, including object categorization and video event detection.
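The conversion step can be sketched directly: the pairwise relationship matrix of a score vector s, with entries R[i, j] = s[i] - s[j], is skew-symmetric and has rank at most 2 (it equals the outer-product difference s·1ᵀ - 1·sᵀ), which is the structure the shared rank-2 matrix exploits:

```python
import numpy as np

# Hypothetical confidence scores of one model over four test samples.
s = np.array([0.9, 0.4, 0.7, 0.1])

# Pairwise relationship matrix: R[i, j] = s[i] - s[j].
R = s[:, None] - s[None, :]

rank = np.linalg.matrix_rank(R)   # at most 2 for any non-constant s
```

Because R depends only on score differences, it is unchanged when a constant is added to every score, which is one reason the fusion is insensitive to per-model score scales and offsets.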
01 Jul 2017
TL;DR: This paper presents an end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images, casting document semantic structure extraction as a pixel-wise segmentation task and proposing a unified model that classifies pixels based not only on their visual appearance but also on the content of the underlying text.
Abstract: We present an end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images. We consider document semantic structure extraction as a pixel-wise segmentation task, and propose a unified model that classifies pixels based not only on their visual appearance, as in the traditional page segmentation task, but also on the content of underlying text. Moreover, we propose an efficient synthetic document generation process that we use to generate pretraining data for our network. Once the network is trained on a large set of synthetic documents, we fine-tune the network on unlabeled real documents using a semi-supervised approach. We systematically study the optimum network architecture and show that both our multimodal approach and the synthetic data pretraining significantly boost the performance.