Topic

Semantic similarity

About: Semantic similarity is a research topic. Over its lifetime, 14,605 publications have been published within this topic, receiving 364,659 citations. The topic is also known as semantic relatedness.


Papers
Journal Article
TL;DR: The R package LSAfun enables a variety of functions and computations based on Vector Semantic Models such as Latent Semantic Analysis (LSA), which are procedures to obtain a high-dimensional vector representation for words (and documents) from a text corpus.
Abstract: In this article, the R package LSAfun is presented. This package enables a variety of functions and computations based on Vector Semantic Models such as Latent Semantic Analysis (LSA; Landauer, Foltz and Laham, Discourse Processes 25:259–284, 1998), which are procedures to obtain a high-dimensional vector representation for words (and documents) from a text corpus. Such representations are thought to capture the semantic meaning of a word (or document) and allow for semantic similarity comparisons between words to be calculated as the cosine of the angle between their associated vectors. LSAfun uses pre-created LSA spaces and provides functions for (a) Similarity Computations between words, word lists, and documents; (b) Neighborhood Computations, such as obtaining a word’s or document’s most similar words; (c) plotting such a neighborhood, as well as similarity structures for any word lists, in a two- or three-dimensional approximation using Multidimensional Scaling; (d) Applied Functions, such as computing the coherence of a text, answering multiple choice questions and producing generic text summaries; and (e) Composition Methods for obtaining vector representations for two-word phrases. The purpose of this package is to allow convenient access to computations based on LSA.

108 citations
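
The LSAfun abstract above describes word similarity as the cosine of the angle between two word vectors, and neighborhood computations as finding a word's most similar words. The following is a minimal Python sketch of that idea, not the LSAfun API (LSAfun is an R package). The names `cosine_similarity`, `neighbours`, and `toy_space` are hypothetical, and the three-dimensional vectors are invented for illustration; a real LSA space has far more dimensions and is derived from a corpus.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def neighbours(word, space, n=2):
    """Return the n words in `space` most similar to `word`, best first."""
    target = space[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in space.items()
        if other != word
    ]
    # Sort by similarity, highest first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Hypothetical toy "semantic space": invented vectors, not real LSA output.
toy_space = {
    "dog": [1.0, 2.0, 0.0],
    "cat": [2.0, 1.0, 0.0],
    "car": [0.0, 0.0, 3.0],
}

if __name__ == "__main__":
    print(cosine_similarity(toy_space["dog"], toy_space["cat"]))
    print(neighbours("dog", toy_space, n=1))
```

In this toy space, "dog" and "cat" point in similar directions and so score higher than "dog" and "car", whose vectors are orthogonal. Because the cosine depends only on direction, rescaling a vector does not change its similarity to the others.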

Proceedings Article
22 Sep 2008
TL;DR: This work describes a graphical logical form as a semantic representation for text understanding, which has the TRIPS parser at the core, augmented with statistical preprocessing techniques and online lexical lookup.
Abstract: We describe a graphical logical form as a semantic representation for text understanding. This representation was designed to bridge the gap between highly expressive "deep" representations of logical forms and more shallow semantic encodings such as word senses and semantic relations. It preserves rich semantic content while allowing for compact ambiguity encoding and viable partial representations. We describe our system for semantic text processing, which has the TRIPS parser at the core, augmented with statistical preprocessing techniques and online lexical lookup. We also present an evaluation metric for the representation and use it to evaluate the performance of the TRIPS parser on the common task paragraphs.

108 citations

Proceedings Article
23 Jun 2013
TL;DR: This paper presents a nonparametric approach to semantic parsing using small patches and simple gradient, color and location features and examines the importance of the retrieval set used to compute the nearest neighbours using a novel semantic descriptor to retrieve better candidates.
Abstract: This paper presents a nonparametric approach to semantic parsing using small patches and simple gradient, color and location features. We learn the relevance of individual feature channels at test time using a locally adaptive distance metric. To further improve the accuracy of the nonparametric approach, we examine the importance of the retrieval set used to compute the nearest neighbours using a novel semantic descriptor to retrieve better candidates. The approach is validated by experiments on several datasets used for semantic parsing, demonstrating the superiority of the method compared to state-of-the-art approaches.

107 citations

Proceedings Article
29 Jan 2006
TL;DR: This paper builds a prototype named Swish that constantly monitors users' desktop activities using a stream of windows events, and implements two criteria of window "relatedness", namely the semantic similarity of their titles, and the temporal closeness in their access patterns.
Abstract: Information workers are often involved in multiple tasks and activities that they must perform in parallel or in rapid succession. In consequence, task management itself becomes yet another task that information workers need to perform in order to get the rest of their work done. Recognition of this problem has led to research on task management systems, which can help by allowing fast task switching, fast task resumption, and automatic task identification. In this paper we focus on the latter: we tackle the problem of automatically detecting the tasks that the user is involved in, by identifying which of the windows on the user's desktop are related to each other. The underlying assumption is that windows that belong to the same task share some common properties with one another that we can detect from data. We will refer to this problem as the task assignment problem. To address this problem, we have built a prototype named Swish that: (1) constantly monitors users' desktop activities using a stream of windows events; (2) logs and processes this raw event stream; and (3) implements two criteria of window "relatedness", namely the semantic similarity of their titles, and the temporal closeness in their access patterns. In addition to describing the Swish prototype in detail, we validate it with 4 hours of user data, obtaining task classification accuracies of about 70%. We also discuss our plans on including Swish in a number of intelligent user interfaces and future lines of research.

107 citations

Journal Article
TL;DR: These data represent the largest behavioral database on semantic priming and are available to researchers to aid in selecting stimuli, testing theories, and reducing potential confounds in their studies.
Abstract: Speeded naming and lexical decision data for 1,661 target words following related and unrelated primes were collected from 768 subjects across four different universities. These behavioral measures have been integrated with demographic information for each subject and descriptive characteristics for every item. Subjects also completed portions of the Woodcock–Johnson reading battery, three attentional control tasks, and a circadian rhythm measure. These data are available at a user-friendly Internet-based repository (http://spp.montana.edu). This Web site includes a search engine designed to generate lists of prime–target pairs with specific characteristics (e.g., length, frequency, associative strength, latent semantic similarity, priming effect in standardized and raw reaction times). We illustrate the types of questions that can be addressed via the Semantic Priming Project. These data represent the largest behavioral database on semantic priming and are available to researchers to aid in selecting stimuli, testing theories, and reducing potential confounds in their studies.

107 citations


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Unsupervised learning: 22.7K papers, 1M citations, 83% related
Feature vector: 48.8K papers, 954.4K citations, 83% related
Web service: 57.6K papers, 989K citations, 82% related
Performance
Metrics
Number of papers in the topic in previous years
Year  Papers
2023  202
2022  522
2021  641
2020  837
2019  866
2018  787