
Showing papers by "Douwe Kiela published in 2013"


Proceedings Article
01 Oct 2013
TL;DR: A novel unsupervised approach to detecting the compositionality of multi-word expressions is presented: constituent words are substituted with their “neighbours” in a semantic vector space, and the distance between the original phrase and each substituted neighbour phrase is averaged.
Abstract: We present a novel unsupervised approach to detecting the compositionality of multi-word expressions. We compute the compositionality of a phrase through substituting the constituent words with their “neighbours” in a semantic vector space and averaging over the distance between the original phrase and the substituted neighbour phrases. Several methods of obtaining neighbours are presented. The results are compared to existing supervised results and achieve state-of-the-art performance on a verb-object dataset of human compositionality ratings.
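The substitution procedure described in the abstract can be sketched as follows. The toy vectors, the additive phrase composition, and all function names are illustrative assumptions, not the authors' actual implementation:

```python
# Sketch of neighbour-substitution compositionality scoring (assumed details).
import math

# Toy semantic space: word -> vector (real systems use corpus-derived vectors).
vectors = {
    "eat":    [0.9, 0.1, 0.0],
    "devour": [0.8, 0.2, 0.1],
    "meat":   [0.1, 0.9, 0.2],
    "food":   [0.2, 0.8, 0.3],
    "kick":   [0.0, 0.1, 0.9],
    "bucket": [0.1, 0.0, 0.8],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def phrase_vector(words):
    # Simple additive composition of the constituent word vectors.
    return [sum(dims) for dims in zip(*(vectors[w] for w in words))]

def neighbours(word, k=1):
    # k nearest neighbours of `word` in the space, excluding the word itself.
    ranked = sorted((w for w in vectors if w != word),
                    key=lambda w: cosine(vectors[word], vectors[w]),
                    reverse=True)
    return ranked[:k]

def compositionality(verb, obj, k=1):
    # Substitute each constituent with its neighbours and average the
    # similarity between the original phrase and each substituted phrase.
    # High similarity suggests the phrase is compositional.
    original = phrase_vector([verb, obj])
    sims = []
    for n in neighbours(verb, k):
        sims.append(cosine(original, phrase_vector([n, obj])))
    for n in neighbours(obj, k):
        sims.append(cosine(original, phrase_vector([verb, n])))
    return sum(sims) / len(sims)

score = compositionality("eat", "meat")
print(round(score, 3))
```

In a realistic space, an idiom like "kick the bucket" would change meaning drastically under substitution and so score low; the toy vectors here merely illustrate the procedure.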

39 citations


01 Jan 2013
TL;DR: This work implements both a vector space model and a Latent Dirichlet Allocation Model to explore the extent to which concreteness is reflected in the distributional patterns in corpora.
Abstract: An increasing body of empirical evidence suggests that concreteness is a fundamental dimension of semantic representation. By implementing both a vector space model and a Latent Dirichlet Allocation (LDA) model, we explore the extent to which concreteness is reflected in the distributional patterns in corpora. In one experiment, we show that vector space models can be tailored to better model semantic domains of particular degrees of concreteness. In a second experiment, we show that the quality of the representations of abstract words in LDA models can be improved by supplementing the training data with information on the physical properties of concrete concepts.

16 citations


01 Aug 2013
TL;DR: The extent to which concreteness is reflected in the distributional patterns in corpora is explored, and the quality of the representations of abstract words in LDA models is shown to improve when the training data is supplemented with information on the physical properties of concrete concepts.
Abstract: An increasing body of empirical evidence suggests that concreteness is a fundamental dimension of semantic representation. By implementing both a vector space model and a Latent Dirichlet Allocation (LDA) model, we explore the extent to which concreteness is reflected in the distributional patterns in corpora. In one experiment, we show that vector space models can be tailored to better model semantic domains of particular degrees of concreteness. In a second experiment, we show that the quality of the representations of abstract words in LDA models can be improved by supplementing the training data with information on the physical properties of concrete concepts. We conclude by discussing the implications for computational systems and also for how concrete and abstract concepts are represented in the mind.
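The supplementation step can be sketched as follows. The property norms and function names here are assumptions for illustration, not the authors' actual resource or pipeline:

```python
# Sketch of enriching LDA training documents with physical-property tokens.
# Hypothetical perceptual property norms (feature-norm style data).
property_norms = {
    "banana": ["yellow", "curved", "soft"],
    "hammer": ["heavy", "metal", "hard"],
}

def supplement(doc_tokens, norms):
    """Append physical-property tokens for every concrete word in the doc,
    so topic inference can associate abstract co-occurring words with
    perceptual information."""
    extra = []
    for token in doc_tokens:
        extra.extend(norms.get(token, []))
    return doc_tokens + extra

doc = ["the", "banana", "was", "ripe"]
print(supplement(doc, property_norms))
# -> ['the', 'banana', 'was', 'ripe', 'yellow', 'curved', 'soft']
```

The enriched documents would then be fed to an ordinary LDA trainer; only the preprocessing differs from a standard setup.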

6 citations


Proceedings Article
01 Jun 2013
TL;DR: This paper describes methods submitted to the *SEM shared task on Semantic Textual Similarity; the simplest combination showed the highest consistency across the different data sets.
Abstract: This paper describes methods that were submitted as part of the *SEM shared task on Semantic Textual Similarity. Multiple kernels provide different views of syntactic structure, from both tree and dependency parses. The kernels are then combined with simple lexical features using Gaussian process regression, which is trained on different subsets of the training data for each run. We found that the simplest combination has the highest consistency across the different data sets, while introducing more training data and models requires training and test data of matching quality.
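The combination step can be sketched as follows. The toy feature vectors, the particular kernels, and the hyper-parameters are assumptions for illustration, not the system's actual kernels or features:

```python
# Sketch of combining kernel "views" via summation and predicting similarity
# scores with a Gaussian process posterior mean (assumed details throughout).
import math

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def combined_kernel(x, y):
    # Summing kernels is the simplest way to merge views: each summand is a
    # valid kernel, so their sum is too.
    return linear_kernel(x, y) + rbf_kernel(x, y)

def solve(A, b):
    # Tiny Gauss-Jordan elimination with partial pivoting, enough for toys.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[col][col]:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b_ for a, b_ in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gp_predict(X_train, y_train, x_test, noise=0.1):
    # GP posterior mean: k_*^T (K + noise * I)^{-1} y
    K = [[combined_kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X_train)] for i, a in enumerate(X_train)]
    alpha = solve(K, y_train)
    k_star = [combined_kernel(x_test, a) for a in X_train]
    return sum(k * a for k, a in zip(k_star, alpha))

# Toy sentence-pair feature vectors (e.g. lexical overlap scores) with gold
# similarity ratings on a 0-5 scale.
X = [[0.9, 0.8], [0.1, 0.2], [0.5, 0.5]]
y = [4.5, 1.0, 2.8]
print(round(gp_predict(X, y, [0.85, 0.75]), 2))
```

A test pair whose features sit near a high-similarity training pair receives a prediction close to that pair's score, which is the behaviour the regression combination relies on.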

2 citations