scispace - formally typeset
Topic

Semantic similarity

About: Semantic similarity is a research topic. Over its lifetime, 14,605 publications have been published on this topic, receiving 364,659 citations. The topic is also known as semantic relatedness.


Papers
Journal Article
TL;DR: Computational investigations with an attractor dynamical model, a spreading activation model, and a decision model revealed that a combination of excitatory and inhibitory mechanisms is required to obtain peak timing, providing new constraints on models of semantic processing.
Abstract: Semantic similarity effects provide critical insight into the organization of semantic knowledge and the nature of semantic processing. In the present study, we examined the dynamics of semantic similarity effects by using the visual world eyetracking paradigm. Four objects were shown on a computer monitor, and participants were instructed to click on a named object, during which time their gaze position was recorded. The likelihood of fixating competitor objects was predicted by the degree of semantic similarity to the target concept. We found reliable, graded competition that depended on degree of target-competitor similarity, even for distantly related items for which priming has not been found in previous priming studies. Time course measures revealed a consistently earlier fixation peak for near semantic neighbors relative to targets. Computational investigations with an attractor dynamical model, a spreading activation model, and a decision model revealed that a combination of excitatory and inhibitory mechanisms is required to obtain such peak timing, providing new constraints on models of semantic processing.

89 citations

Journal Article
TL;DR: This paper proposes studying semantic information in abstract images created from collections of clip art, and creates 1,002 sets of 10 semantically similar abstract images with corresponding written descriptions to discover semantically important features, the relations of words to visual features and methods for measuring semantic similarity.
Abstract: Relating visual information to its linguistic semantic meaning remains an open and challenging area of research. The semantic meaning of images depends on the presence of objects, their attributes and their relations to other objects. But precisely characterizing this dependence requires extracting complex visual information from an image, which is in general a difficult and yet unsolved problem. In this paper, we propose studying semantic information in abstract images created from collections of clip art. Abstract images provide several advantages over real images. They allow for the direct study of how to infer high-level semantic information, since they remove the reliance on noisy low-level object, attribute and relation detectors, or the tedious hand-labeling of real images. Importantly, abstract images also allow the ability to generate sets of semantically similar scenes. Finding analogous sets of real images that are semantically similar would be nearly impossible. We create 1,002 sets of 10 semantically similar abstract images with corresponding written descriptions. We thoroughly analyze this dataset to discover semantically important features, the relations of words to visual features and methods for measuring semantic similarity. Finally, we study the relation between the saliency and memorability of objects and their semantic importance.

89 citations

Journal Article
TL;DR: The concept of local user similarity and global user similarity is introduced, based on surprisal-based vector similarity and the application of the concept of maximin distance in graph theory, to develop a collaborative filtering framework called LS&GS.
Abstract: Collaborative filtering, a classical method of information retrieval, has been widely used to help people deal with information overload. In this paper, we introduce the concepts of local user similarity and global user similarity, based on surprisal-based vector similarity and the application of the concept of maximin distance in graph theory. Surprisal-based vector similarity expresses the relationship between any two users based on the quantities of information (called surprisal) contained in their ratings. Global user similarity defines two users as similar if they can be connected through their locally similar neighbors. Based on both local user similarity and global user similarity, we develop a collaborative filtering framework called LS&GS. An empirical study using the MovieLens dataset shows that our proposed framework outperforms other state-of-the-art collaborative filtering algorithms.

89 citations
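
The LS&GS abstract above describes global user similarity as connecting two users through chains of locally similar neighbors, using the maximin distance from graph theory. The abstract does not give the exact formulation, so the following is only a minimal sketch under an assumption: the global similarity of two users is taken to be the best path between them, where a path is scored by its weakest local-similarity link. The function name `maximin_similarity` and the toy matrix are hypothetical and not from the paper.

```python
def maximin_similarity(local_sim):
    """Return a matrix of maximin (widest-path) similarities.

    local_sim is a square list of lists of local user similarities.
    ASSUMPTION: a path's score is its weakest link, and the global
    similarity of two users is the best score over all paths. This is
    an illustrative reading of the abstract, not the authors' formula.
    """
    n = len(local_sim)
    best = [row[:] for row in local_sim]  # copy so the input is not mutated
    # Floyd-Warshall variant: (min, +) is replaced by (max, min).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = min(best[i][k], best[k][j])
                if via_k > best[i][j]:
                    best[i][j] = via_k
    return best


# Toy data (made up): users 0 and 2 are barely similar locally,
# but both are close to user 1.
local = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.8],
    [0.1, 0.8, 1.0],
]
global_sim = maximin_similarity(local)
print(global_sim[0][2])
```

In this toy example, users 0 and 2 are linked through user 1, so their score rises from the direct local value to the weakest link on the path 0, 1, 2.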

Journal Article
TL;DR: A subject–action–object (SAO) based semantic technological similarity is exploited and used to relieve human experts’ work in identifying patent infringement, allowing large sets of patents to be handled with minimal effort by human experts.
Abstract: Companies should investigate possible patent infringement and cope with potential risks because patent litigation may have a tremendous financial impact. An important factor to identify the possibility of patent infringement is the technological similarity among patents, so this paper considered technological similarity as a criterion for judging the possibility of infringement. Technological similarities can be measured by transforming patent documents into abstracted forms which contain specific technological key-findings and structural relationships among technological components in the invention. Although keyword-based technological similarity has been widely adopted for patent analysis related research, it is inadequate for identifying patent infringement because a keyword vector cannot reflect specific technological key-findings and structural relationships among technological components. As a remedy, this paper exploited a subject–action–object (SAO) based semantic technological similarity. An SAO structure explicitly describes the structural relationships among technological components in the patent, and the set of SAO structures is considered to be a detailed picture of the inventor's expertise, which is the specific key-findings in the patent. Therefore, an SAO based semantic technological similarity can identify patent infringement. Semantic similarity between SAO structures is automatically measured using an SAO based semantic similarity measurement method using WordNet, and the technological relationships among patents were mapped onto a 2-dimensional space using multidimensional scaling (MDS). Furthermore, a clustering algorithm is used to automatically suggest possible patent infringement cases, allowing large sets of patents to be handled with minimal effort by human experts. The proposed method will be verified by detecting real patent infringement in prostate cancer treatment technology, and we expect this method to relieve human experts' work in identifying patent infringement.

89 citations

02 Oct 2017
TL;DR: This article evaluated different word embedding models trained on a large Portuguese corpus, including both Brazilian and European variants, on syntactic and semantic analogies and extrinsically on POS tagging and sentence semantic similarity tasks.
Abstract: Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems. In this paper, we evaluated different word embedding models trained on a large Portuguese corpus, including both Brazilian and European variants. We trained 31 word embedding models using FastText, GloVe, Wang2Vec and Word2Vec. We evaluated them intrinsically on syntactic and semantic analogies and extrinsically on POS tagging and sentence semantic similarity tasks. The obtained results suggest that word analogies are not appropriate for word embedding evaluation and that task-specific evaluations may be a better option instead; that Wang2Vec appears to be a robust model; and that the increase in performance in our evaluations with bigger models is not worth the increase in memory usage for models with more than 300 dimensions.

88 citations
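
The word embedding abstract above mentions intrinsic evaluation on analogies. As a rough illustration of what such a test involves, the sketch below answers "a is to b as c is to ?" with vector offsets and cosine similarity. This is the commonly used offset method, assumed here for illustration; the abstract does not state which scoring procedure the authors used. The function names and the two-dimensional toy vectors are hypothetical and do not come from any trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def solve_analogy(emb, a, b, c):
    """Return the word closest to emb[b] - emb[a] + emb[c].

    The three query words are excluded from the candidates, as is
    usual in analogy benchmarks.
    """
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))


# Toy 2-D embeddings (made up for illustration only).
TOY = {
    "homem": [1, 0],
    "mulher": [1, 1],
    "rei": [3, 0],
    "rainha": [3, 1],
    "fruta": [0, 1],
}
print(solve_analogy(TOY, "homem", "mulher", "rei"))
```

An analogy benchmark repeats this query over many word quadruples and reports the fraction answered correctly, which is the kind of score the abstract argues is a weak proxy for downstream task performance.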


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Unsupervised learning: 22.7K papers, 1M citations, 83% related
Feature vector: 48.8K papers, 954.4K citations, 83% related
Web service: 57.6K papers, 989K citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    202
2022    522
2021    641
2020    837
2019    866
2018    787