Topic

Semantic similarity

About: Semantic similarity is a research topic. Over its lifetime, 14,605 publications have been published within this topic, receiving 364,659 citations. The topic is also known as: semantic relatedness.

Papers

Proceedings Article

Image retagging

Dong Liu¹, Xian-Sheng Hua², Meng Wang², Hong-Jiang Zhang³

Harbin Institute of Technology¹, Microsoft², Advanced Technology Center³

25 Oct 2010

TL;DR: This paper proposes a social image "retagging" scheme that aims at assigning images with better content descriptors and shows the remarkable performance improvements brought by retagging via two applications, i.e., tag-based search and automatic annotation.

Abstract: Online social media repositories such as Flickr and Zooomr allow users to manually annotate their images with freely chosen tags, which are then used as indexing keywords to facilitate image search and other applications. However, these tags are frequently imprecise and incomplete, even though they are provided by human beings, and many of them are meaningful only to the image owners (such as the name of a dog). Thus there is still a gap between these tags and the actual content of the images, and this significantly limits tag-based applications, such as search and browsing. To tackle this issue, this paper proposes a social image "retagging" scheme that aims at assigning images with better content descriptors. The refining process, including denoising and enriching, is formulated as an optimization framework based on the consistency between "visual similarity" and "semantic similarity" in social images, that is, visually similar images tend to have similar semantic descriptors, and vice versa. An effective iterative bound optimization algorithm is applied to learn the improved tag assignment. In addition, as many tags are intrinsically not closely related to the visual content of the images, we employ a knowledge-based method to differentiate visual content-related tags from unrelated ones and then constrain the tagging vocabulary of our automatic algorithm to the content-related tags. Finally, to improve the coverage of the tags, we further enrich the tag set with appropriate synonyms and hypernyms based on an external knowledge base. Experimental results on a Flickr image collection demonstrate the effectiveness of this approach. We also show the remarkable performance improvements brought by retagging via two applications, i.e., tag-based search and automatic annotation.

116 citations

Semantic Matching of Web Services Capabilities

M. Paolucci

01 Jan 2002

TL;DR: This paper claims that location of web services should be based on the semantic match between a declarative description of the service being sought and a description of the service being offered, and that this match is outside the representation capabilities of registries such as UDDI and languages such as WSDL.

Abstract: The Web is moving from being a collection of pages toward a collection of services that interoperate through the Internet. The first step toward this interoperation is the location of other services that can help toward the solution of a problem. In this paper we claim that location of web services should be based on the semantic match between a declarative description of the service being sought, and a description of the service being offered. Furthermore, we claim that this match is outside the representation capabilities of registries such as UDDI and languages such as WSDL. We propose a solution based on DAML-S, a DAML-based language for service description, and we show how service capabilities are presented in the Profile section of a DAML-S description and how a semantic match between advertisements and requests is performed.

115 citations

Proceedings Article

Are Word Embedding-based Features Useful for Sarcasm Detection?

Aditya Joshi¹, Vaibhav Tripathi¹, Kevin Patel¹, Pushpak Bhattacharyya¹, Mark J. Carman²

Indian Institute of Technology Bombay¹, Monash University²

01 Nov 2016

TL;DR: This article explored whether prior work on sarcasm detection can be enhanced using semantic similarity/discordance between word embeddings, adding word embedding-based features to four feature sets reported in the past.

Abstract: This paper makes a simple increment to the state of the art in sarcasm detection research. Existing approaches are unable to capture subtle forms of context incongruity, which lies at the heart of sarcasm. We explore whether prior work can be enhanced using semantic similarity/discordance between word embeddings. We add word embedding-based features to four feature sets reported in the past. We also experiment with four types of word embeddings. We observe an improvement in sarcasm detection, irrespective of the word embedding used or the original feature set to which our features are added. For example, this augmentation results in an improvement in F-score of around 4% for three out of these four feature sets, and a minor degradation in the case of the fourth, when Word2Vec embeddings are used. Finally, a comparison of the four embeddings shows that Word2Vec and dependency weight-based features outperform LSA and GloVe in terms of their benefit to sarcasm detection.
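To make the idea of similarity/discordance features concrete, here is a minimal sketch. It assumes cosine similarity as the similarity measure and summarizes a sentence by the highest and lowest similarity over its word pairs. The function names, the feature names, and the two-dimensional toy vectors are all hypothetical illustrations and are not taken from the paper, whose exact feature definitions are not given in the abstract.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_features(tokens, embeddings):
    """Return the max and min pairwise cosine similarity among the known tokens.

    Hypothetical feature set: a very low minimum suggests two words in the
    sentence are semantically discordant.
    """
    known = [t for t in tokens if t in embeddings]
    scores = [cosine(embeddings[a], embeddings[b]) for a, b in combinations(known, 2)]
    if not scores:
        return {"max_sim": 0.0, "min_sim": 0.0}
    return {"max_sim": max(scores), "min_sim": min(scores)}


# Toy 2-d vectors, invented for illustration only (real embeddings have hundreds of dimensions).
TOY_EMBEDDINGS = {
    "love": [1.0, 0.0],
    "adore": [0.9, 0.1],
    "ignored": [-1.0, 0.0],
}

features = similarity_features(["i", "love", "being", "ignored"], TOY_EMBEDDINGS)
print(features)
```

In practice such values would be appended to an existing feature vector before training a classifier; tokens without an embedding are simply skipped in this sketch.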

115 citations

[...]

Jeffrey Hau¹, William E. Lee¹, John Darlington¹

Imperial College London¹

01 Jan 2005

TL;DR: This paper proposes a metric for measuring the similarity of semantic services annotated with an OWL ontology, calculated by defining the intrinsic information value of a service description based on the “inferencibility” of each of the OWL Lite constructs.

Abstract: Establishing the compatibility of services is an essential prerequisite to service composition. By formally defining the similarity of semantic services, useful information can be obtained about their compatibility. In this paper we propose a metric for measuring the similarity of semantic services annotated with an OWL ontology. Similarity is calculated by defining the intrinsic information value of a service description based on the “inferencibility” of each of the OWL Lite constructs. We apply this technique to OWL-S, an emerging standard for defining semantic service metadata, and demonstrate how to measure the similarity of OWL-S annotated services.

115 citations

Posted Content

Relevance-based Word Embedding

Hamed Zamani¹, W. Bruce Croft¹

University of Massachusetts Amherst¹

09 May 2017 - arXiv: Information Retrieval

TL;DR: This article proposed relevance-based word embedding models that learn word representations based on query-document relevance information and classify each term as belonging to the relevant or non-relevant class for each query.

Abstract: Learning a high-dimensional dense representation for vocabulary terms, also known as a word embedding, has recently attracted much attention in natural language processing and information retrieval tasks. The embedding vectors are typically learned based on term proximity in a large corpus. This means that the objective in well-known word embedding algorithms, e.g., word2vec, is to accurately predict adjacent word(s) for a given word or context. However, this objective is not necessarily equivalent to the goal of many information retrieval (IR) tasks. The primary objective in various IR tasks is to capture relevance instead of term proximity, syntactic, or even semantic similarity. This is the motivation for developing unsupervised relevance-based word embedding models that learn word representations based on query-document relevance information. In this paper, we propose two learning models with different objective functions; one learns a relevance distribution over the vocabulary set for each query, and the other classifies each term as belonging to the relevant or non-relevant class for each query. To train our models, we used over six million unique queries and the top ranked documents retrieved in response to each query, which are assumed to be relevant to the query. We extrinsically evaluate our learned word representation models using two IR tasks: query expansion and query classification. Both query expansion experiments on four TREC collections and query classification experiments on the KDD Cup 2005 dataset suggest that the relevance-based word embedding models significantly outperform state-of-the-art proximity-based embedding models, such as word2vec and GloVe.
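As a rough illustration of what "a relevance distribution over the vocabulary set for each query" could look like, the sketch below builds one from the top-ranked documents of a single query. It assumes a plain maximum-likelihood estimate (term counts normalized to sum to 1) and whitespace tokenization; the abstract does not say how the authors estimate this distribution, so this is a stand-in and not their method. The function name and the sample documents are hypothetical.

```python
from collections import Counter


def relevance_distribution(top_docs):
    """Estimate P(term | query) from the documents retrieved for one query.

    Assumption: every top-ranked document is treated as relevant, and the
    distribution is the normalized term frequency across those documents.
    """
    counts = Counter()
    for doc in top_docs:
        counts.update(doc.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: count / total for term, count in counts.items()}


# Hypothetical top-ranked documents for a query such as "jaguar".
docs = ["jaguar car dealer", "jaguar car price"]
dist = relevance_distribution(docs)
print(dist)
```

A distribution like this could serve as a per-query training target for an embedding model, in contrast to the neighboring-word targets used by proximity-based models.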

115 citations

Network Information

Performance Metrics

Papers: 15,319

Citations: 407,958

No. of papers in the topic in previous years
Year	Papers
2023	202
2022	522
2021	641
2020	837
2019	866
2018	787