Topic

Semantic similarity

About: Semantic similarity is a research topic. Over its lifetime, 14,605 publications have been published within this topic, receiving 364,659 citations. The topic is also known as semantic relatedness.


Papers
Proceedings Article
04 Feb 2010
TL;DR: Users who lie close to each other in the social network show a substantial level of local lexical and topical alignment. This suggests that users with similar topical interests are more likely to be friends, and that semantic similarity measures among users based solely on their annotation metadata should be predictive of social links.
Abstract: Web 2.0 applications have attracted a considerable amount of attention because their open-ended nature allows users to create lightweight semantic scaffolding to organize and share content. To date, the interplay of the social and semantic components of social media has been only partially explored. Here we focus on Flickr and Last.fm, two social media systems in which we can relate the tagging activity of the users with an explicit representation of their social network. We show that a substantial level of local lexical and topical alignment is observable among users who lie close to each other in the social network. We introduce a null model that preserves user activity while removing local correlations, allowing us to disentangle the actual local alignment between users from statistical effects due to the assortative mixing of user activity and centrality in the social network. This analysis suggests that users with similar topical interests are more likely to be friends, and therefore semantic similarity measures among users based solely on their annotation metadata should be predictive of social links. We test this hypothesis on the Last.fm data set, confirming that the social network constructed from semantic similarity captures actual friendship more accurately than Last.fm's suggestions based on listening patterns.

181 citations
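
One common way to compute a semantic similarity between users from their annotation metadata alone is cosine similarity over tag-count vectors. The sketch below is an illustration under that assumption, not the measure used in the paper; the function name `cosine_similarity` and the three users with their tag counts are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse tag-count vectors."""
    # Only tags used by both users contribute to the dot product.
    dot = sum(count * b[tag] for tag, count in a.items() if tag in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with no tags is similar to nobody
    return dot / (norm_a * norm_b)


# Hypothetical tag usage for three users (made-up data).
alice = Counter({"jazz": 3, "blues": 1})
bob = Counter({"jazz": 2, "blues": 2})
carol = Counter({"metal": 4})

print(cosine_similarity(alice, bob))    # high: shared vocabulary
print(cosine_similarity(alice, carol))  # 0.0: no shared tags
```

Under the hypothesis in the abstract, pairs with a high score would be more likely to be linked in the social network than pairs with a low score.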

Patent
31 Mar 2004
TL;DR: Techniques are disclosed that locate implicitly defined semantic structures in a document, such as implicitly defined lists in an HTML document. These structures can be used in the calculation of distance values between terms in the document.
Abstract: Techniques are disclosed that locate implicitly defined semantic structures in a document, such as, for example, implicitly defined lists in an HTML document. The semantic structures can be used in the calculation of distance values between terms in the documents. The distance values may be used, for example, in the generation of ranking scores that indicate a relevance level of the document to a search query.

181 citations

Patent
18 Oct 2000
TL;DR: State vectors representing the semantic content of a document are superpositioned to construct a single vector, a semantic abstract for the document, which can be compared with other semantic abstracts to locate documents with similar semantic content.
Abstract: State vectors representing the semantic content of a document are created. The state vectors are superpositioned to construct a single vector representing a semantic abstract for the document. The single vector can be normalized. Once constructed, the single vector semantic abstract can be compared with semantic abstracts for other documents to measure a semantic distance between the documents, and can be used to locate documents with similar semantic content.

180 citations
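
The pipeline in the abstract above (build state vectors, superpose them, normalize, compare) can be sketched as follows. This is a minimal illustration, assuming that superposition means component-wise summation and that semantic distance is Euclidean distance between normalized vectors; the patent abstract does not specify either, and all function names and example vectors are hypothetical.

```python
from math import sqrt


def superpose(state_vectors):
    """Sum state vectors component-wise (assumed meaning of superposition)."""
    return [sum(components) for components in zip(*state_vectors)]


def normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def semantic_abstract(state_vectors):
    """Single normalized vector summarizing a document's state vectors."""
    return normalize(superpose(state_vectors))


def semantic_distance(u, v):
    """Euclidean distance between two semantic abstracts (an assumption)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Made-up two-dimensional state vectors for three documents.
doc_a = semantic_abstract([[1.0, 0.0], [1.0, 0.0]])
doc_b = semantic_abstract([[2.0, 0.0]])
doc_c = semantic_abstract([[0.0, 1.0]])

print(semantic_distance(doc_a, doc_b))  # same direction, so distance 0
print(semantic_distance(doc_a, doc_c))  # orthogonal content, larger distance
```

Normalizing before comparison means that a long document and a short document with the same mix of content end up with the same semantic abstract.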

Proceedings Article
25 Jun 2005
TL;DR: This paper presents a novel algorithm for the acquisition of Information Extraction patterns that makes the assumption that useful patterns will have similar meanings to those already identified as relevant.
Abstract: This paper presents a novel algorithm for the acquisition of Information Extraction patterns. The approach makes the assumption that useful patterns will have similar meanings to those already identified as relevant. Patterns are compared using a variation of the standard vector space model in which information from an ontology is used to capture semantic similarity. Evaluation shows this algorithm performs well when compared with a previously reported document-centric approach.

180 citations
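
One way to fold ontology information into a vector space comparison is to place a term-to-term similarity matrix between the two vectors. The sketch below assumes the form sim(a, b) = a W bᵀ / (|a| |b|), where W holds pairwise term similarities; the abstract does not give the formula, so this form, the term list, and the values in `w_ontology` are illustrative assumptions rather than the authors' data.

```python
from math import sqrt


def weighted_similarity(a, b, w):
    """Cosine-style similarity with a term-similarity matrix w in the middle."""
    n = len(a)
    numerator = sum(a[i] * w[i][j] * b[j] for i in range(n) for j in range(n))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return numerator / (norm_a * norm_b)


# Hypothetical vocabulary: index 0 = "hire", 1 = "appoint", 2 = "explode".
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Made-up ontology-derived similarities: "hire" and "appoint" are close.
w_ontology = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]

pattern_hire = [1.0, 0.0, 0.0]
pattern_appoint = [0.0, 1.0, 0.0]

print(weighted_similarity(pattern_hire, pattern_appoint, identity))    # 0.0
print(weighted_similarity(pattern_hire, pattern_appoint, w_ontology))  # 0.9
```

With the identity matrix the measure reduces to the standard cosine, which treats the two patterns as unrelated; with the ontology matrix, patterns built from different but related terms receive a non-zero score.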

01 Jan 2005
TL;DR: This article presents a method of word sense disambiguation that assigns a target word the sense that is most related to the senses of its neighboring words and explores the use of measures of similarity and relatedness that are based on finding paths in a concept network, information content derived from a large corpus, and word sense glosses.
Abstract: This article presents a method of word sense disambiguation that assigns a target word the sense that is most related to the senses of its neighboring words. We explore the use of measures of similarity and relatedness that are based on finding paths in a concept network, information content derived from a large corpus, and word sense glosses. We observe that measures of relatedness are useful sources of information for disambiguation, and we find that two gloss-based measures that we have developed are particularly flexible and effective for word sense disambiguation.

180 citations
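
The disambiguation strategy in the abstract above can be sketched with a toy gloss-overlap measure. This is a simplified illustration, not the authors' measures: the relatedness function just counts shared gloss words, each candidate sense is scored by summing its best relatedness to any sense of each neighbor (an assumed scoring rule), and the sense inventory is invented for the example.

```python
def gloss_overlap(gloss_a: str, gloss_b: str) -> int:
    """Toy relatedness: number of distinct words shared by two glosses."""
    return len(set(gloss_a.split()) & set(gloss_b.split()))


def disambiguate(target, neighbors, senses):
    """Pick the sense of `target` most related to the neighbors' senses."""
    best_sense, best_score = None, -1
    for sense, gloss in senses[target].items():
        # For each neighbor, keep only its best-matching sense.
        score = sum(
            max(gloss_overlap(gloss, g) for g in senses[word].values())
            for word in neighbors
            if senses.get(word)
        )
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical sense inventory with made-up glosses.
senses = {
    "bank": {
        "bank#finance": "institution that accepts money deposits and makes loans",
        "bank#river": "sloping land beside a body of water such as a river",
    },
    "deposit": {
        "deposit#money": "money placed in an institution for safekeeping",
        "deposit#sediment": "matter laid down by water or ice",
    },
    "loan": {
        "loan#money": "money lent by an institution and repaid with interest",
    },
}

print(disambiguate("bank", ["deposit", "loan"], senses))  # bank#finance
```

A real system would replace `gloss_overlap` with a path-based, information-content, or gloss-based measure over a concept network, and would usually filter function words before comparing glosses.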


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Unsupervised learning: 22.7K papers, 1M citations, 83% related
Feature vector: 48.8K papers, 954.4K citations, 83% related
Web service: 57.6K papers, 989K citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  202
2022  522
2021  641
2020  837
2019  866
2018  787