Topic

Semantic similarity

About: Semantic similarity is a research topic. Over its lifetime, 14,605 publications have been published within this topic, receiving 364,659 citations. The topic is also known as: semantic relatedness.


Papers
Journal Article
TL;DR: It is argued that the amount of perceptual and other semantic information that can be learned from purely distributional statistics has been underappreciated and that future focus should be on understanding the cognitive mechanisms humans use to integrate the two sources.
Abstract: Since their inception, distributional models of semantics have been criticized as inadequate cognitive theories of human semantic learning and representation. A principal challenge is that the representations derived by distributional models are purely symbolic and are not grounded in perception and action; this challenge has led many to favor feature-based models of semantic representation. We argue that the amount of perceptual and other semantic information that can be learned from purely distributional statistics has been underappreciated. We compare the representations of three feature-based and nine distributional models using a semantic clustering task. Several distributional models demonstrated semantic clustering comparable with clustering based on feature-based representations. Furthermore, when trained on child-directed speech, the same distributional models perform as well as sensorimotor-based feature representations of children's lexical semantic knowledge. These results suggest that, to a large extent, information relevant for extracting semantic categories is redundantly coded in perceptual and linguistic experience. Detailed analyses of the semantic clusters of the feature-based and distributional models also reveal that the models make use of complementary cues to semantic organization from the two data streams. Rather than conceptualizing feature-based and distributional models as competing theories, we argue that future focus should be on understanding the cognitive mechanisms humans use to integrate the two sources.

140 citations
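
To make the idea of "purely distributional statistics" in the abstract above concrete, here is a minimal sketch of one very simple distributional representation: sentence-level co-occurrence counts compared with cosine similarity. It is not one of the nine models evaluated in the paper; the toy corpus and the function names `cooccurrence_vectors` and `cosine` are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy corpus, not data from the paper.
TOY_CORPUS = [
    "the cat eats food",
    "the dog eats food",
    "the car needs fuel",
    "the truck needs fuel",
]


def cooccurrence_vectors(sentences):
    """Map each word to a Counter of the words it shares a sentence with."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j, context in enumerate(tokens):
                if i != j:
                    vectors[word][context] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(count * count for count in u.values()))
    norm_v = sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    vectors = cooccurrence_vectors(TOY_CORPUS)
    # Words used in the same contexts end up with similar vectors.
    print("cat vs dog:", cosine(vectors["cat"], vectors["dog"]))
    print("cat vs car:", cosine(vectors["cat"], vectors["car"]))
```

In this toy corpus "cat" and "dog" occur in identical contexts, so they score higher than "cat" and "car", which share only the word "the". Clustering such vectors is one way a semantic clustering task could be run, although the paper's actual models and procedure may differ.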

Proceedings Article
07 Apr 2008
TL;DR: The paper describes the inferencing requirements, challenges in supporting a sufficiently expressive set of RDFS/OWL constructs, and techniques adopted to build a scalable inference engine for Oracle semantic data store.
Abstract: Inference engines are an integral part of semantic data stores. In this paper, we describe our experience of implementing a scalable inference engine for Oracle semantic data store. This inference engine computes production rule based entailment of one or more RDFS/OWL encoded semantic data models. The inference engine capabilities include (i) inferencing based on semantics of RDFS/OWL constructs and user-defined rules, (ii) computing ancillary information (namely, semantic distance and proof) for inferred triples, and (iii) validation of semantic data model based on RDFS/OWL semantics. A unique aspect of our approach is that the inference engine is implemented entirely as a database application on top of Oracle database. The paper describes the inferencing requirements, challenges in supporting a sufficiently expressive set of RDFS/OWL constructs, and techniques adopted to build a scalable inference engine. A performance study conducted using both native and synthesized semantic datasets demonstrates the effectiveness of our approach.

140 citations
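
As a rough illustration of the production-rule entailment mentioned in the abstract above, the sketch below forward-chains two standard RDFS rules (transitivity of `rdfs:subClassOf` and type inheritance through `rdfs:subClassOf`) until no new triples appear. It is an in-memory toy, not the database-backed engine the paper describes; the function name `rdfs_closure` and the example triples are hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_closure(triples):
    """Return the input triples plus everything entailed by two RDFS rules.

    Rule A: (a subClassOf b), (b subClassOf c)  =>  (a subClassOf c)
    Rule B: (x type a),       (a subClassOf b)  =>  (x type b)
    The rules are applied repeatedly until a fixed point is reached.
    """
    inferred = set(triples)
    while True:
        new = set()
        subclass_pairs = [(s, o) for (s, p, o) in inferred if p == SUBCLASS]
        typed = [(s, o) for (s, p, o) in inferred if p == TYPE]
        for a, b in subclass_pairs:
            # Rule A: chain subclass links.
            for c, d in subclass_pairs:
                if b == c:
                    new.add((a, SUBCLASS, d))
            # Rule B: instances of a subclass are instances of the superclass.
            for instance, cls in typed:
                if cls == a:
                    new.add((instance, TYPE, b))
        if new <= inferred:
            return inferred
        inferred |= new


if __name__ == "__main__":
    # Hypothetical example data.
    facts = {
        ("ex:Dog", SUBCLASS, "ex:Mammal"),
        ("ex:Mammal", SUBCLASS, "ex:Animal"),
        ("ex:rex", TYPE, "ex:Dog"),
    }
    for triple in sorted(rdfs_closure(facts) - facts):
        print(triple)
```

A naive fixed-point loop like this rescans all triples on every pass, which is acceptable for a handful of facts; a scalable engine of the kind the paper reports would need a far more careful evaluation strategy.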

Journal Article
TL;DR: This paper proposes a semi-supervised loss to jointly minimize the empirical error on labeled data, as well as the embedding error on both labeled and unlabeled data, which can preserve the semantic similarity and capture the meaningful neighbors on the underlying data structures for effective hashing.
Abstract: Hashing methods have been widely used for efficient similarity retrieval on large-scale image databases. Traditional hashing methods learn hash functions to generate binary codes from hand-crafted features, which achieve limited accuracy since the hand-crafted features cannot optimally represent the image content and preserve the semantic similarity. Recently, several deep hashing methods have shown better performance because the deep architectures generate more discriminative feature representations. However, these deep hashing methods are mainly designed for supervised scenarios, which only exploit the semantic similarity information, but ignore the underlying data structures. In this paper, we propose the semi-supervised deep hashing approach, to perform more effective hash function learning by simultaneously preserving semantic similarity and underlying data structures. The main contributions are as follows: 1) We propose a semi-supervised loss to jointly minimize the empirical error on labeled data, as well as the embedding error on both labeled and unlabeled data, which can preserve the semantic similarity and capture the meaningful neighbors on the underlying data structures for effective hashing. 2) A semi-supervised deep hashing network is designed to extensively exploit both labeled and unlabeled data, in which we propose an online graph construction method to benefit from the evolving deep features during training to better capture semantic neighbors. To the best of our knowledge, the proposed deep network is the first deep hashing method that can perform hash code learning and feature learning simultaneously in a semi-supervised fashion. Experimental results on five widely-used data sets show that our proposed approach outperforms the state-of-the-art hashing methods.

140 citations
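
The abstract above takes for granted how binary hash codes are used at retrieval time. The sketch below shows only that generic step: real-valued features are turned into bits by sign thresholding, and database items are ranked by Hamming distance to the query code. It does not implement the paper's semi-supervised loss or network; the feature values and the names `binarize`, `hamming` and `rank_by_hamming` are assumptions for illustration.

```python
def binarize(features):
    """Pack a real-valued feature vector into an int, one bit per dimension.

    A dimension contributes bit 1 when its value is positive, else bit 0.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(code_a, code_b):
    """Number of bit positions in which two codes differ."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code, database):
    """Return database keys ordered from nearest to farthest code."""
    return sorted(database, key=lambda key: hamming(query_code, database[key]))


if __name__ == "__main__":
    # Hypothetical 4-dimensional features standing in for learned embeddings.
    database = {
        "img_a": binarize([0.9, -0.2, 0.4, -0.7]),
        "img_b": binarize([0.8, -0.1, -0.3, -0.5]),
        "img_c": binarize([-0.6, 0.7, -0.2, 0.9]),
    }
    query = binarize([0.7, -0.4, 0.5, -0.1])
    print(rank_by_hamming(query, database))
```

Comparing codes with XOR and a bit count is what makes hashing attractive for large collections: distance computation is a couple of machine operations per item regardless of how the codes were learned.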

Book Chapter
12 Jul 2001
TL;DR: A computational model is developed to determine the directional similarity between extended spatial objects, which forms a foundation for meaningful spatial similarity operators and confirms the cognitive plausibility of the similarity model.
Abstract: Like people who casually assess similarity between spatial scenes in their routine activities, users of pictorial databases are often interested in retrieving scenes that are similar to a given scene, and ranking them according to degrees of their match. For example, a town architect would like to query a database for the towns that have a landscape similar to the landscape of the site of a planned town. In this paper, we develop a computational model to determine the directional similarity between extended spatial objects, which forms a foundation for meaningful spatial similarity operators. The model is based on the direction-relation matrix. We derive how the similarity assessment of two direction-relation matrices corresponds to determining the least cost for transforming one direction-relation matrix into another. Using the transportation algorithm, the cost can be determined efficiently for pairs of arbitrary direction-relation matrices. The similarity values are evaluated empirically with several types of movements that create increasingly less similar direction relations. The tests confirm the cognitive plausibility of the similarity model.

139 citations
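
The abstract above frames similarity between two direction-relation matrices as the least cost of transforming one into the other, solved as a transportation problem. The sketch below illustrates that framing under stated assumptions: each matrix is a flattened 3x3 grid (NW, N, NE, W, same, E, SW, S, SE) of integer percentages summing to the same total, and the cost of moving one unit between tiles is their grid (Manhattan) distance. The cost function, the normalization by the largest possible cost, and the names `min_cost_transport` and `direction_dissimilarity` are illustrative choices, not details taken from the paper. The solver is a small successive-shortest-path min-cost flow rather than the specific transportation algorithm the authors used.

```python
INF = float("inf")


def min_cost_transport(supply, demand, cost):
    """Minimum total cost of shipping integer supply to integer demand."""
    n, m = len(supply), len(demand)
    source, sink = n + m, n + m + 1
    size = n + m + 2
    graph = [[] for _ in range(size)]  # edge: [to, capacity, cost, rev_index]

    def add_edge(u, v, capacity, c):
        graph[u].append([v, capacity, c, len(graph[v])])
        graph[v].append([u, 0, -c, len(graph[u]) - 1])

    total_supply = sum(supply)
    for i in range(n):
        add_edge(source, i, supply[i], 0)
        for j in range(m):
            add_edge(i, n + j, total_supply, cost[i][j])
    for j in range(m):
        add_edge(n + j, sink, demand[j], 0)

    total_cost = 0
    while True:
        # Bellman-Ford shortest path in the residual graph.
        dist = [INF] * size
        prev = [None] * size
        dist[source] = 0
        for _ in range(size - 1):
            updated = False
            for u in range(size):
                if dist[u] == INF:
                    continue
                for idx, (v, capacity, c, _rev) in enumerate(graph[u]):
                    if capacity > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, idx)
                        updated = True
            if not updated:
                break
        if dist[sink] == INF:
            return total_cost
        # Find the bottleneck capacity along the path, then push flow.
        flow, v = INF, sink
        while v != source:
            u, idx = prev[v]
            flow = min(flow, graph[u][idx][1])
            v = u
        v = sink
        while v != source:
            u, idx = prev[v]
            graph[u][idx][1] -= flow
            graph[v][graph[u][idx][3]][1] += flow
            v = u
        total_cost += flow * dist[sink]


def direction_dissimilarity(matrix_a, matrix_b):
    """Normalized transformation cost in [0, 1] between two flattened 3x3 matrices."""
    if sum(matrix_a) != sum(matrix_b):
        raise ValueError("both matrices must hold the same total mass")
    # Assumed cost: Manhattan distance between tiles of the 3x3 grid.
    cost = [[abs(i // 3 - j // 3) + abs(i % 3 - j % 3) for j in range(9)]
            for i in range(9)]
    max_cost = 4 * sum(matrix_a)  # all mass moved corner to opposite corner
    return min_cost_transport(matrix_a, matrix_b, cost) / max_cost


if __name__ == "__main__":
    north = [0, 100, 0, 0, 0, 0, 0, 0, 0]
    north_and_northeast = [0, 50, 50, 0, 0, 0, 0, 0, 0]
    print(direction_dissimilarity(north, north_and_northeast))
```

A similarity value could then be taken as one minus this dissimilarity, so that identical matrices score 1 and diagonally opposite directions score 0 under these assumptions.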

Journal Article
TL;DR: In both tasks, PD patients' performance was selectively impaired for action verbs (relative to controls), indicating that the motor system plays a more central role in the processing of action verbs than in the processing of abstract verbs, arguing for a causal role of sensory-motor systems in semantic processing.

139 citations


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Unsupervised learning: 22.7K papers, 1M citations, 83% related
Feature vector: 48.8K papers, 954.4K citations, 83% related
Web service: 57.6K papers, 989K citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    202
2022    522
2021    641
2020    837
2019    866
2018    787