Author

Enrique Alfonseca

Other affiliations: Autonomous University of Madrid
Bio: Enrique Alfonseca is an academic researcher at Google. He has contributed to research on topics including WordNet and ontology (information science), has an h-index of 24, and has co-authored 74 publications receiving 3,164 citations. His previous affiliations include the Autonomous University of Madrid.


Papers
Proceedings ArticleDOI
31 May 2009
TL;DR: This paper presents and compares WordNet-based and distributional similarity approaches, and pioneers cross-lingual similarity, showing that the methods are easily adapted for a cross-lingual task with minor losses.
Abstract: This paper presents and compares WordNet-based and distributional similarity approaches. The strengths and weaknesses of each approach regarding similarity and relatedness tasks are discussed, and a combination is presented. Each of our methods independently provide the best results in their class on the RG and WordSim353 datasets, and a supervised combination of them yields the best published results on all datasets. Finally, we pioneer cross-lingual similarity, showing that our methods are easily adapted for a cross-lingual task with minor losses.
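
As a rough illustration of the two families of measures compared here, the sketch below contrasts a WordNet path-based score with a distributional cosine over co-occurrence vectors. It is a minimal stand-in, not the authors' system; it assumes NLTK with the WordNet corpus downloaded, and the toy corpus is invented.

```python
# Minimal sketch contrasting the two similarity families; illustrative
# only, not the authors' system. Assumes NLTK with the WordNet corpus
# downloaded (pip install nltk; then nltk.download('wordnet')).
from collections import Counter
from math import sqrt

from nltk.corpus import wordnet as wn


def wordnet_similarity(w1: str, w2: str) -> float:
    """Knowledge-based: best path similarity over all synset pairs."""
    scores = [s1.path_similarity(s2) or 0.0
              for s1 in wn.synsets(w1)
              for s2 in wn.synsets(w2)]
    return max(scores, default=0.0)


def context_vector(word: str, corpus: list[list[str]], window: int = 2) -> Counter:
    """Distributional: co-occurrence counts within a small window."""
    vec = Counter()
    for sent in corpus:
        for i, tok in enumerate(sent):
            if tok == word:
                for ctx in sent[max(0, i - window):i + window + 1]:
                    if ctx != word:
                        vec[ctx] += 1
    return vec


def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[k] * v[k] for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


toy_corpus = [["the", "car", "drove", "down", "the", "road"],
              ["the", "automobile", "drove", "down", "the", "street"]]
print(wordnet_similarity("car", "automobile"))           # knowledge-based score
print(cosine(context_vector("car", toy_corpus),
             context_vector("automobile", toy_corpus)))  # distributional score
```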

936 citations

Proceedings ArticleDOI
01 Sep 2015
TL;DR: It is demonstrated that even the most basic version of the LSTM system, given no syntactic information or desired compression length, performs surprisingly well: around 30% of the compressions from a large test set could be regenerated.
Abstract: We present an LSTM approach to deletion-based sentence compression, where the task is to translate a sentence into a sequence of zeros and ones corresponding to token deletion decisions. We demonstrate that even the most basic version of the system, which is given no syntactic information (no PoS or NE tags, or dependencies) or desired compression length, performs surprisingly well: around 30% of the compressions from a large test set could be regenerated. We compare the LSTM system with a competitive baseline which is trained on the same amount of data but is additionally provided with all kinds of linguistic features. In an experiment with human raters, the LSTM-based model outperforms the baseline, achieving 4.5 in readability and 3.8 in informativeness.
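
A minimal sketch of the deletion-as-sequence-labelling formulation may help: each token receives a keep/delete decision from a recurrent tagger. The architecture below (PyTorch, a single LSTM layer, invented dimensions) is an illustrative assumption, not the paper's actual configuration.

```python
# Minimal sketch of deletion-based compression as sequence labelling:
# one keep/delete logit per token. Dimensions and architecture are
# illustrative assumptions, not the paper's actual configuration.
import torch
import torch.nn as nn


class DeletionCompressor(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)  # keep/delete logit per token

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        states, _ = self.lstm(self.embed(token_ids))
        return self.out(states).squeeze(-1)  # (batch, seq_len) logits


model = DeletionCompressor(vocab_size=10_000)
tokens = torch.randint(0, 10_000, (1, 8))   # one 8-token toy sentence
keep = torch.sigmoid(model(tokens)) > 0.5   # True = keep, False = delete
print(keep)
```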

293 citations

Journal ArticleDOI
TL;DR: This paper presents how the benefit of considering learning styles for adaptation purposes, as part of the user model, can be extended to the context of collaborative learning as a key feature for group formation.
Abstract: Learning style models constitute a valuable tool for improving individual learning through adaptation techniques based on them. In this paper, we present how the benefit of considering learning styles for adaptation purposes, as part of the user model, can be extended to the context of collaborative learning as a key feature for group formation. We explore the effects that combining students with different learning styles in specific groups may have on the final results of the tasks they accomplish collaboratively. With this aim, a case study with 166 computer science students has been carried out, from which conclusions are drawn. We also describe how an existing web-based system can take advantage of learning-style information in order to form more productive groups. Our ongoing work on the automatic extraction of grouping rules from data about previous interactions within the system is also outlined. Finally, we present our challenges related to the continuous improvement of collaboration through the use and dynamic modification of automatic grouping rules.

212 citations

Book ChapterDOI
06 Jun 2005
TL;DR: An approach for automatically associating entries from an on-line encyclopedia with concepts in an ontology or a lexical semantic network is described; it will be applied to enriching ontologies with encyclopedic knowledge.
Abstract: We describe an approach taken for automatically associating entries from an on-line encyclopedia with concepts in an ontology or a lexical semantic network. It has been tested with the Simple English Wikipedia and WordNet, although it can be used with other resources. The accuracy in disambiguating the sense of the encyclopedia entries reaches 91.11% (83.89% for polysemous words). It will be applied to enriching ontologies with encyclopedic knowledge.
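
A simplified, Lesk-style sketch of the core linking step: pick the WordNet synset whose gloss overlaps most with the entry text. The paper's actual disambiguation method is more involved; this assumes NLTK with WordNet data, and the example entry is invented.

```python
# Simplified, Lesk-style sketch: map an encyclopedia entry to the
# WordNet synset whose gloss overlaps most with the entry text.
# The paper's actual method differs; assumes NLTK with WordNet data.
from nltk.corpus import wordnet as wn


def link_entry(title: str, entry_text: str):
    """Return the synset of `title` best matching the entry text."""
    entry_words = set(entry_text.lower().split())
    best, best_overlap = None, -1
    for synset in wn.synsets(title):
        gloss_words = set(synset.definition().lower().split())
        overlap = len(entry_words & gloss_words)
        if overlap > best_overlap:
            best, best_overlap = synset, overlap
    return best


entry = "A bank is a financial institution that accepts deposits."
print(link_entry("bank", entry))  # expect the financial-institution sense
```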

170 citations

01 Jan 2004
TL;DR: This work describes a procedure to automatically extend an ontology with domain-specific knowledge; the procedure is completely unsupervised, so it can be applied to different languages and domains.
Abstract: Knowledge acquisition is still the bottleneck in building many kinds of applications, such as inference engines. We describe here a procedure to automatically extend an ontology with domain-specific knowledge. The main advantage of our approach is that it is completely unsupervised, so it can be applied to different languages and domains. Our initial results have been highly successful, and we believe that with some improvement in accuracy it can be applied to large ontologies.
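
As a stand-in for one step of unsupervised ontology extension, the sketch below harvests is-a pairs from raw text with a Hearst-style lexical pattern. This is a named substitute technique, not necessarily the procedure the paper uses; the pattern and example sentence are illustrative only.

```python
# Illustrative substitute for one unsupervised extraction step:
# harvesting (hyponym, hypernym) pairs with a Hearst-style pattern.
# The pattern and example sentence are invented for illustration.
import re

HEARST = re.compile(r"(\w+) such as ((?:\w+(?:, )?)+)")


def extract_isa(text: str) -> list[tuple[str, str]]:
    """Return (hyponym, hypernym) pairs matched by the pattern."""
    pairs = []
    for m in HEARST.finditer(text):
        hypernym = m.group(1)  # e.g. "diseases"; plural handling omitted
        for hyponym in m.group(2).split(", "):
            pairs.append((hyponym, hypernym))
    return pairs


text = "The corpus mentions diseases such as influenza, measles, rubella."
print(extract_isa(text))
# [('influenza', 'diseases'), ('measles', 'diseases'), ('rubella', 'diseases')]
```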

155 citations


Cited by
Proceedings ArticleDOI
08 May 2007
TL;DR: YAGO, as discussed by the authors, is a light-weight and extensible ontology with high coverage and quality, which includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as HASWONPRIZE).
Abstract: We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as HASWONPRIZE). The facts have been automatically extracted from Wikipedia and unified with WordNet, using a carefully designed combination of rule-based and heuristic methods described in this paper. The resulting knowledge base is a major step beyond WordNet: in quality, by adding knowledge about individuals such as persons, organizations, and products together with their semantic relationships, and in quantity, by increasing the number of facts by more than an order of magnitude. Our empirical evaluation of fact correctness shows an accuracy of about 95%. YAGO is based on a logically clean model, which is decidable, extensible, and compatible with RDFS. Finally, we show how YAGO can be further extended by state-of-the-art information extraction techniques.
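
The toy sketch below illustrates the kind of logically clean, typed fact model the abstract describes: facts are triples, relations carry type signatures, and a fact is accepted only if its arguments respect the is-a hierarchy. The entity names and the signature encoding are invented for illustration, not actual YAGO data.

```python
# Toy sketch of a typed, logically clean fact model: facts are triples,
# relations have type signatures, and arguments are checked against an
# is-a hierarchy. All names below are invented, not actual YAGO data.

ISA = {"ElvisPresley": "singer", "singer": "person", "GrammyAward": "prize"}


def is_a(entity: str, cls: str) -> bool:
    """Walk up the taxonomy to test whether entity reaches cls."""
    while entity in ISA:
        entity = ISA[entity]
        if entity == cls:
            return True
    return False


SIGNATURES = {"hasWonPrize": ("person", "prize")}  # relation -> (domain, range)
facts: set[tuple[str, str, str]] = set()


def add_fact(subj: str, rel: str, obj: str) -> bool:
    """Accept a fact only if it respects the relation's type signature."""
    domain, rng = SIGNATURES[rel]
    if is_a(subj, domain) and is_a(obj, rng):
        facts.add((subj, rel, obj))
        return True
    return False


print(add_fact("ElvisPresley", "hasWonPrize", "GrammyAward"))  # True
print(add_fact("GrammyAward", "hasWonPrize", "ElvisPresley"))  # False (type check fails)
```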

3,710 citations

Journal ArticleDOI
TL;DR: Observations about languages, named entity types, domains and textual genres studied in the literature, along with other critical aspects of NERC such as features and evaluation methods, are reported.
Abstract: This survey covers fifteen years of research in the Named Entity Recognition and Classification (NERC) field, from 1991 to 2006. We report observations about languages, named entity types, domains and textual genres studied in the literature. From the start, NERC systems have been developed using hand-made rules, but now machine learning techniques are widely used. These techniques are surveyed along with other critical aspects of NERC such as features and evaluation methods. Features are word-level, dictionary-level and corpus-level representations of words in a document. Evaluation techniques, ranging from intuitive exact match to very complex matching techniques with adjustable cost of errors, are an indisputable key to progress.
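
The "intuitive exact match" evaluation mentioned above can be sketched in a few lines: a predicted entity counts as correct only when both its span and its type match a gold annotation exactly, and precision, recall, and F1 follow from the matched set. The (start, end, type) span encoding below is an assumption for illustration.

```python
# Sketch of exact-match NER evaluation: an entity is correct only if
# its (start, end, type) triple matches the gold annotation exactly.
def exact_match_f1(gold: set[tuple[int, int, str]],
                   predicted: set[tuple[int, int, str]]) -> float:
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = {(0, 2, "PER"), (5, 6, "LOC")}
pred = {(0, 2, "PER"), (5, 7, "LOC")}  # second span is off by one token
print(exact_match_f1(gold, pred))      # 0.5
```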

2,537 citations

Proceedings ArticleDOI
02 Sep 2015
TL;DR: The authors propose a local attention-based model that generates each word of the summary conditioned on the input sentence, which shows significant performance gains on the DUC-2004 shared task compared with several strong baselines.
Abstract: Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines.
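
A minimal sketch of the attention step at the heart of such a model: the decoder state scores every input position, and the softmax-weighted encoder states form the context that conditions the next generated word. Shapes and dimensions below are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of the attention step: score each input position
# against the decoder state, softmax, and average the encoder states.
# Dimensions are illustrative assumptions.
import torch
import torch.nn.functional as F


def attend(decoder_state: torch.Tensor,    # shape (hidden,)
           encoder_states: torch.Tensor    # shape (src_len, hidden)
           ) -> torch.Tensor:
    scores = encoder_states @ decoder_state  # (src_len,) alignment scores
    weights = F.softmax(scores, dim=0)       # attention distribution
    return weights @ encoder_states          # context vector, (hidden,)


context = attend(torch.randn(32), torch.randn(10, 32))
print(context.shape)  # torch.Size([32])
```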

2,339 citations

01 Jan 2002
TL;DR: The interactions learners have with each other build interpersonal skills, such as listening, politely interrupting, expressing ideas, raising questions, disagreeing, paraphrasing, negotiating, and asking for help.
Abstract: 1. Interaction. The interactions learners have with each other build interpersonal skills, such as listening, politely interrupting, expressing ideas, raising questions, disagreeing, paraphrasing, negotiating, and asking for help. 2. Interdependence. Learners must depend on one another to accomplish a common objective. Each group member has specific tasks to complete, and successful completion of each member’s tasks results in attaining the overall group objective.

2,171 citations

Journal ArticleDOI
01 Nov 2010
TL;DR: The most relevant studies carried out in educational data mining to date are surveyed, and the different groups of users, types of educational environments, and the data they provide are described.
Abstract: Educational data mining (EDM) is an emerging interdisciplinary research area that deals with the development of methods to explore data originating in an educational context. EDM uses computational approaches to analyze educational data in order to study educational questions. This paper surveys the most relevant studies carried out in this field to date. First, it introduces EDM and describes the different groups of users, types of educational environments, and the data they provide. It then goes on to list the most typical/common tasks in the educational environment that have been resolved through data-mining techniques, and finally, some of the most promising future lines of research are discussed.

1,723 citations