scispace - formally typeset
Topic

Semantic similarity

About: Semantic similarity is a research topic. Over its lifetime, 14,605 publications have been published within this topic, receiving 364,659 citations. The topic is also known as semantic relatedness.


Papers
Journal Article
TL;DR: The authors found no difference in the recall of two-digit numbers when distractors were either numbers or words and non-words that were designed to be phonologically similar to the targets.
Abstract: Two experiments that tested whether semantic similarity between visually presented targets and auditorily presented distractors has an effect on serial recall of the visual targets are reported. In Experiment 1, we found no difference in the recall of two-digit numbers when distractors were either numbers or words and non-words that were designed to be phonologically similar to the targets. In Experiment 2 the “semantic distance” between targets and distractors had no effect on serial recall. Taken together, these experiments conceptually replicate and extend earlier results, and they establish constraints for models of the effect of unattended acoustic information on serial recall.

88 citations

Journal Article
TL;DR: A computational model of Random Forest for miRNA-disease association (RFMDA) prediction based on machine learning is developed, and the results of cross-validation and case studies indicated that RFMDA is a reliable model for predicting miRNA-disease associations.
Abstract: Since the first microRNA (miRNA) was discovered, a lot of studies have confirmed the associations between miRNAs and human complex diseases. Besides, obtaining and taking advantage of association information between miRNAs and diseases play an increasingly important role in improving the treatment level for complex diseases. However, due to the high cost of traditional experimental methods, many researchers have proposed different computational methods to predict potential associations between miRNAs and diseases. In this work, we developed a computational model of Random Forest for miRNA-disease association (RFMDA) prediction based on machine learning. The training sample set for RFMDA was constructed according to the human microRNA disease database (HMDD) version (v.)2.0, and the feature vectors to represent miRNA-disease samples were defined by integrating miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity. The Random Forest algorithm was first employed to infer miRNA-disease associations. In addition, a filter-based method was implemented to select robust features from the miRNA-disease feature set, which could efficiently distinguish related miRNA-disease pairs from unrelated miRNA-disease pairs. RFMDA achieved areas under the curve (AUCs) of 0.8891, 0.8323, and 0.8818 ± 0.0014 under global leave-one-out cross-validation, local leave-one-out cross-validation, and 5-fold cross-validation, respectively, which were higher than many previous computational models. To further evaluate the accuracy of RFMDA, we carried out three types of case studies for four human complex diseases. As a result, 43 (esophageal neoplasms), 46 (lymphoma), 47 (lung neoplasms), and 48 (breast neoplasms) of the top 50 predicted disease-related miRNAs were verified by experiments in different kinds of case studies. 
The results of cross-validation and case studies indicated that RFMDA is a reliable model for predicting miRNA-disease associations.

88 citations
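
The RFMDA abstract above lists Gaussian interaction profile kernel similarity as one of its features but does not spell out the formula. The following is a minimal sketch of the commonly used definition, assuming that each miRNA (or disease) is described by a binary row of known associations and that the bandwidth is normalized by the mean squared profile norm. The function name `gip_kernel`, the parameter `gamma_prime`, and the toy matrix are illustrative assumptions, not taken from the paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Assumed form: K[i][j] = exp(-gamma * ||p_i - p_j||^2), where
    gamma = gamma_prime / mean_i(||p_i||^2).
    """
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq_norm = sum(sq_norms) / len(sq_norms)
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm

    n = len(profiles)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared Euclidean distance between the two interaction profiles.
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical toy data: 3 miRNAs (rows) x 3 diseases (columns).
associations = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
similarity = gip_kernel(associations)
```

In this toy case the first two rows share an identical profile, so their similarity is 1, while the third row shares no associations with them and scores much lower.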

Journal Article
TL;DR: Through a series of large-scale leave-one-out cross-validation experiments, it is shown that the gene semantic similarity network can achieve not only higher coverage but also higher accuracy than the PPI network in the inference of disease genes.
Abstract: Motivation: The inference of genes that are truly associated with inherited human diseases from a set of candidates resulting from genetic linkage studies has been one of the most challenging tasks in human genetics. Although several computational approaches have been proposed to prioritize candidate genes relying on protein-protein interaction (PPI) networks, these methods can usually cover less than half of known human genes.

88 citations

Journal Article
TL;DR: This work proposes efficient unsupervised and task-specific learning objectives that scale the model to large datasets and demonstrates improvements on both language modeling and several phrase semantic similarity tasks with various phrase lengths.
Abstract: Lexical embeddings can serve as useful representations for words for a variety of NLP tasks, but learning embeddings for phrases can be challenging. While separate embeddings are learned for each word, this is infeasible for every phrase. We construct phrase embeddings by learning how to compose word embeddings using features that capture phrase structure and context. We propose efficient unsupervised and task-specific learning objectives that scale our model to large datasets. We demonstrate improvements on both language modeling and several phrase semantic similarity tasks with various phrase lengths. We make the implementation of our model and the datasets available for general use.

88 citations

Journal Article
TL;DR: A computational text analysis technique for measuring the moral loading of concepts as they are used in a corpus, using latent semantic analysis to compute the semantic similarity between concepts and moral keywords taken from the “Moral foundation Dictionary”.
Abstract: In this paper we present a computational text analysis technique for measuring the moral loading of concepts as they are used in a corpus. This method is especially useful for the study of online corpora as it allows for the rapid analysis of moral rhetoric in texts such as blogs and tweets as events unfold. We use latent semantic analysis to compute the semantic similarity between concepts and moral keywords taken from the “Moral foundation Dictionary”. This measure of semantic similarity represents the loading of these concepts on the five moral dimensions identified by moral foundation theory. We demonstrate the efficacy of this method using three different concepts and corpora.

87 citations
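
The moral-loading entry above scores a concept by its semantic similarity to a set of moral keywords. The following is a rough sketch of that idea, assuming the concept and keyword vectors have already been produced by latent semantic analysis, that similarity is measured by cosine, and that a loading is the mean similarity over one dimension's keywords. The averaging step, the function names, and the two-dimensional toy vectors are assumptions of this sketch, not details confirmed by the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_similarity(concept_vec, keyword_vecs):
    """Average cosine similarity of a concept to a list of keyword vectors."""
    total = sum(cosine_similarity(concept_vec, k) for k in keyword_vecs)
    return total / len(keyword_vecs)


# Hypothetical vectors standing in for rows of a reduced LSA space.
concept = [1.0, 0.0]
keywords = [[1.0, 0.0], [0.0, 1.0]]
loading = mean_similarity(concept, keywords)
```

Here the concept is identical in direction to the first keyword and orthogonal to the second, so the averaged loading falls halfway between 1 and 0.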


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Unsupervised learning: 22.7K papers, 1M citations, 83% related
Feature vector: 48.8K papers, 954.4K citations, 83% related
Web service: 57.6K papers, 989K citations, 82% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    202
2022    522
2021    641
2020    837
2019    866
2018    787