Author

Gonzalo A. Ruz

Bio: Gonzalo A. Ruz is an academic researcher at Adolfo Ibáñez University. The author has contributed to research on the topics of Boolean networks and computer science. The author has an h-index of 15 and has co-authored 64 publications that have received 766 citations. Previous affiliations of Gonzalo A. Ruz include Coordenadoria de Aperfeiçoamento de Pessoal de Nível Superior and Cardiff University.


Papers
Journal Article
TL;DR: This work considers Bayesian network classifiers to perform sentiment analysis on two datasets in Spanish: the 2010 Chilean earthquake and the 2017 Catalan independence referendum, and adopts a Bayes factor approach, yielding more realistic networks.

178 citations

01 Jan 2006
TL;DR: Applying data mining to a real industrial problem through an automatic fraud detection system changed the original non-standard medical claims checking process into a standardized one, helping to fight new, unusual, and known fraudulent/abusive behaviors.
Abstract: This paper describes an effective medical claim fraud/abuse detection system based on data mining used by a Chilean private health insurance company. Fraud and abuse in medical claims have become a major concern for health insurance companies in Chile in recent years because of increasing revenue losses. Processing medical claims is an exhausting manual task carried out by a few medical experts, who are responsible for approving, modifying, or rejecting the requested subsidies within a limited period after receiving them. The proposed detection system uses one committee of multilayer perceptron (MLP) neural networks for each of the entities involved in the fraud/abuse problem: medical claims, affiliates, medical professionals, and employers. Results of the fraud detection system show a detection rate of approximately 75 fraudulent and abusive cases per month, with detection occurring 6.6 months earlier than without the system. Applying data mining to a real industrial problem through an automatic fraud detection system changed the original non-standard medical claims checking process into a standardized one, helping to fight new, unusual, and known fraudulent/abusive behaviors.

97 citations

Journal Article
TL;DR: In this paper, a neurofuzzy color image segmentation method for wood surface defect detection is proposed, which grows boxes from a set of pixels called seeds, to find the minimum bounded rectangle (MBR) for each defect present in the wood board image.
Abstract: A crucial step in developing automated visual inspection systems for wood boards is image segmentation, which aims to achieve a high defect detection rate with a low false positive rate (clear wood areas identified as defect areas). In this study, a neurofuzzy color image segmentation method for wood surface defect detection is proposed. The method is called fuzzy min-max neural network for image segmentation (FMMIS). The FMMIS method grows boxes from a set of pixels called seeds to find the minimum bounded rectangle (MBR) for each defect present in the wood board image. An automatic method to select seeds from defective regions as starting points for FMMIS is also presented. The FMMIS method was applied to a set of 900 images of radiata pine boards, which included samples from the following 10 categories of defects: birdseye and freckle, bark and pitch pockets, wane, splits, blue stain, stain, pith, dead knots, live knots, and holes. FMMIS achieved a defect detection rate of 95 percent on the test set, with only 6 percent false positives. To measure the quality of segmentation, the area recognition rate (ARR) criterion was computed using the manually placed MBR for each defect as a reference. The ARR reached 94.4 percent on the test set. A relative index was also used to compare the quality of segmentation between FMMIS and the segmentation module of a previously developed system based on histogram thresholding. The results show that FMMIS yields significant improvements over previous work.
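As a rough illustration of the bounding-rectangle idea in the abstract above, the sketch below computes the smallest axis-aligned rectangle enclosing a set of defect pixels and measures how much of a manually placed reference rectangle it covers. It is not the FMMIS method, and `overlap_fraction` is a hypothetical measure chosen for illustration; the abstract does not give the definition of the ARR criterion. All names and the sample coordinates are assumptions.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int]          # (row, col)
Box = Tuple[int, int, int, int]  # (row_min, col_min, row_max, col_max), inclusive


def bounding_rectangle(pixels: Iterable[Pixel]) -> Box:
    """Smallest axis-aligned rectangle containing every given pixel."""
    pts = list(pixels)
    if not pts:
        raise ValueError("need at least one pixel")
    rows = [r for r, _ in pts]
    cols = [c for _, c in pts]
    return (min(rows), min(cols), max(rows), max(cols))


def box_area(box: Box) -> int:
    """Number of pixels inside an inclusive box."""
    r0, c0, r1, c1 = box
    return (r1 - r0 + 1) * (c1 - c0 + 1)


def overlap_fraction(found: Box, reference: Box) -> float:
    """Fraction of the reference box covered by the found box.

    Hypothetical measure for illustration only; not the paper's ARR.
    """
    r0 = max(found[0], reference[0])
    c0 = max(found[1], reference[1])
    r1 = min(found[2], reference[2])
    c1 = min(found[3], reference[3])
    if r1 < r0 or c1 < c0:
        return 0.0  # the boxes do not intersect
    return box_area((r0, c0, r1, c1)) / box_area(reference)


# Made-up defect pixels and a made-up manually placed reference box.
defect_pixels = [(2, 3), (4, 7), (3, 5)]
found_box = bounding_rectangle(defect_pixels)
reference_box = (2, 3, 4, 5)
coverage = overlap_fraction(found_box, reference_box)
```

Inclusive coordinates are used so that a single pixel has area 1, which keeps the area arithmetic consistent with pixel counts.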

61 citations

Journal Article
TL;DR: Improvements in the segmentation, feature extraction, and classification modules of a low-cost automated visual inspection (AVI) system for wood defect classification are presented, and the use of computational intelligence techniques significantly improved the overall performance of the proposed automated visual inspection system for wood boards.
Abstract: This article presents improvements in the segmentation module, feature extraction module, and classification module of a low-cost automated visual inspection (AVI) system for wood defect classification. One of the major drawbacks of the low-cost AVI system was the erroneous segmentation of clear wood regions as defects, which introduces confusion in the classification module. To reduce this problem, we use the fuzzy min-max neural network for image segmentation (FMMIS). The FMMIS method grows boxes from a set of seed pixels, ideally yielding the minimum bounded rectangle for each defect present in the wood board image. Additional features with texture information are considered for the feature extraction module, and multi-class support vector machines are compared with multilayer perceptron neural networks in the classification module. Results using FMMIS, the additional features, and a pairwise classification support vector machine on a test set of 550 wood images containing 11 defect categories show 91% correct classification, which is significantly better than the original 75% of the low-cost AVI system. The use of computational intelligence techniques significantly improved the overall performance of the proposed automated visual inspection system for wood boards.

52 citations

Journal Article
TL;DR: It is suggested that socio-demographic attributes have no predictive power on performance, while operational information about a sales agent's activities can predict the agent's future performance.
Abstract: This study presents an approach to predict the performance of sales agents of a call center dedicated exclusively to sales and telemarketing activities. This approach is based on a naive Bayesian classifier. The objective is to know what levels of the attributes are indicative of individuals who perform well. To build the naive Bayes network, a sample of 1037 sales agents was taken between March and September of 2009 from campaigns related to insurance sales and pre-paid phone services. Socio-demographic attributes were shown to be unsuitable for predicting performance. Instead, operational records were used to predict the production of sales agents, achieving satisfactory results. In this case, the classifier was trained and tested through stratified tenfold cross-validation. It classified 80.60% of the instances correctly, with a false positive proportion of 18.1% for the class no (does not achieve the minimum) and 20.8% for the class yes (achieves the minimum acceptable or above). These results suggest that socio-demographic attributes have no predictive power on performance, while operational information about a sales agent's activities can predict the agent's future performance.
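The evaluation scheme named in the abstract above, a naive Bayes classifier assessed with stratified k-fold cross-validation, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the attribute values are synthetic, add-one (Laplace) smoothing is an assumed choice, and every function name is hypothetical.

```python
import random
from collections import Counter, defaultdict
from math import log


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds that roughly keep the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    offset = 0  # continue round-robin across classes to balance fold sizes
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[(offset + j) % k].append(i)
        offset += len(idxs)
    return folds


def train_nb(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    feat_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)           # attribute index -> values seen in training
    for row, y in zip(rows, labels):
        for f, v in enumerate(row):
            feat_counts[(f, y)][v] += 1
            values[f].add(v)
    return class_counts, feat_counts, values


def predict_nb(model, row):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, feat_counts, values = model
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for y, n_y in class_counts.items():
        score = log(n_y / total)
        for f, v in enumerate(row):
            score += log((feat_counts[(f, y)][v] + 1) / (n_y + len(values[f])))
        if score > best_score:
            best, best_score = y, score
    return best


def cross_validate(rows, labels, k=10, seed=0):
    """Accuracy over k stratified folds; each instance is tested exactly once."""
    correct = 0
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_rows = [r for i, r in enumerate(rows) if i not in held_out]
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        model = train_nb(train_rows, train_labels)
        correct += sum(predict_nb(model, rows[i]) == labels[i] for i in test_idx)
    return correct / len(labels)


# Synthetic data: the first attribute determines the class, the second is noise.
rows = [("high", "a" if i % 2 else "b") for i in range(20)]
rows += [("low", "a" if i % 2 else "b") for i in range(20)]
labels = ["yes"] * 20 + ["no"] * 20
accuracy = cross_validate(rows, labels, k=10)
```

Stratifying the folds keeps the yes/no ratio in every held-out fold close to the ratio in the whole sample, so per-class error rates are estimated on comparable splits.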

50 citations


Cited by
Journal Article
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Jun 2012
TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler (specialized for single-cell data) and on the popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

01 Jan 2002

9,314 citations

01 Jan 1990
TL;DR: This article presents an overview of the self-organizing map algorithm, on which the papers in this issue are based.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations