Topic

Semantic similarity

About: Semantic similarity is a research topic. Over its lifetime, 14,605 publications have been published within this topic, receiving 364,659 citations. The topic is also known as: semantic relatedness.


Papers published on a yearly basis


Papers


Proceedings Article

Neural CRF Model for Sentence Alignment in Text Simplification


Chao Jiang¹, Mounica Maddela, Wuwei Lan¹, Yang Zhong¹, Wei Xu¹•Institutions (1)

Ohio State University¹

05 May 2020

TL;DR: This paper proposes a neural CRF alignment model that not only leverages the sequential nature of sentences in parallel documents but also uses a neural sentence pair model to capture semantic similarity for text simplification.


Abstract: The success of a text simplification system heavily depends on the quality and quantity of complex-simple sentence pairs in the training corpus, which are extracted by aligning sentences between parallel articles. To evaluate and improve sentence alignment quality, we create two manually annotated sentence-aligned datasets from two commonly used text simplification corpora, Newsela and Wikipedia. We propose a novel neural CRF alignment model which not only leverages the sequential nature of sentences in parallel documents but also utilizes a neural sentence pair model to capture semantic similarity. Experiments demonstrate that our proposed approach outperforms all the previous work on monolingual sentence alignment task by more than 5 points in F1. We apply our CRF aligner to construct two new text simplification datasets, Newsela-Auto and Wiki-Auto, which are much larger and of better quality compared to the existing datasets. A Transformer-based seq2seq model trained on our datasets establishes a new state-of-the-art for text simplification in both automatic and human evaluation.
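
The idea of combining per-pair semantic similarity with the sequential order of sentences can be illustrated with a small linear-chain decoding sketch. This is not the authors' model: the similarity matrix below is hand-written and stands in for the output of a neural sentence pair model, and the `jump_penalty` transition score, the function names, and the toy data are all assumptions made for illustration.

```python
def greedy_align(sim):
    """Align each simple sentence to its most similar complex sentence,
    ignoring sentence order entirely."""
    return [max(range(len(row)), key=row.__getitem__) for row in sim]


def viterbi_align(sim, jump_penalty=0.2):
    """Decode one complex-sentence index per simple sentence.

    sim[i][j] is the (assumed) similarity between simple sentence i and
    complex sentence j. The transition score is a hypothetical stand-in
    for learned CRF potentials: moving to the next complex sentence is
    free, and any other jump costs jump_penalty per position skipped.
    """
    n_simple, n_complex = len(sim), len(sim[0])

    def trans(prev, cur):
        return -jump_penalty * abs(cur - prev - 1)

    best = [list(sim[0])]  # best[i][j]: best score of a path ending at j
    back = []              # back-pointers for path recovery
    for i in range(1, n_simple):
        row, ptr = [], []
        for cur in range(n_complex):
            prev = max(range(n_complex),
                       key=lambda p: best[-1][p] + trans(p, cur))
            row.append(best[-1][prev] + trans(prev, cur) + sim[i][cur])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)

    # Follow back-pointers from the best final state.
    path = [max(range(n_complex), key=lambda c: best[-1][c])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Toy similarity scores: 3 simple sentences (rows) x 3 complex sentences.
SIM = [
    [0.90, 0.10, 0.10],
    [0.50, 0.45, 0.10],  # slightly closer to sentence 0 in isolation
    [0.10, 0.10, 0.80],
]

print(greedy_align(SIM))   # [0, 0, 2]
print(viterbi_align(SIM))  # [0, 1, 2]
```

In this toy case the order-aware decoding moves the second sentence to complex sentence 1 because the small loss in similarity is outweighed by the jump penalties that the greedy path would incur.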


96 citations

Journal Article

Models of high-dimensional semantic space predict language-mediated eye movements in the visual world.


Falk Huettig¹, Philip T. Quinlan², Scott A. McDonald³, Gerry T. M. Altmann²•Institutions (3)

Max Planck Society¹, University of York², University of Edinburgh³

01 Jan 2006-Acta Psychologica

TL;DR: The data suggest that the visual world paradigm can, together with other methodologies, converge on the evidence that may help adjudicate between different theoretical accounts of the psychological semantics, and provide further evidence that language-mediated eye movements to objects in the concurrent visual environment are driven by semantic similarity rather than all-or-none categorical knowledge.


96 citations

Journal Article

Semantic priming of category relations in schizophrenia.


Beth A. Ober, Sophia Vinogradov¹, Gregory K. Shenaut²•Institutions (2)

University of California, San Francisco¹, University of California, Berkeley²

01 Apr 1995-Neuropsychology (journal)

96 citations

Journal Article

DOSim: An R package for similarity between diseases based on Disease Ontology


Jiang Li¹, Bingsheng Gong¹, Xi Chen¹, Tao Liu¹, Chao Wu¹, Fan Zhang¹, Chunquan Li¹, Xiang Li¹, Shaoqi Rao¹, Xia Li¹•Institutions (1)

Harbin Medical University¹

29 Jun 2011-BMC Bioinformatics

TL;DR: DOSim is an R-based software package that can be used to detect disease-driven gene modules and to annotate the modules for functions and pathways. It can also reflect the modular characteristics of disease-related genes and promote understanding of the complex pathogenesis of diseases.


Abstract: The construction of the Disease Ontology (DO) has helped promote the investigation of diseases and disease risk factors. DO enables researchers to analyse disease similarity by adopting semantic similarity measures, and has expanded our understanding of the relationships between different diseases and of how to classify them. Simultaneously, similarities between genes can also be analysed by their associations with similar diseases. As a result, disease heterogeneity is better understood and insights into the molecular pathogenesis of similar diseases have been gained. However, bioinformatics tools that provide easy and straightforward ways to use DO to study disease and gene similarity simultaneously are required. We have developed an R-based software package (DOSim) to compute the similarity between diseases and to measure the similarity between human genes in terms of diseases. DOSim incorporates a DO-based enrichment analysis function that can be used to explore the disease feature of an independent gene set. A multilayered enrichment analysis (GO and KEGG annotation) function that helps users explore the biological meaning implied in a newly detected gene module is also part of the DOSim package. We used the disease similarity application to demonstrate the relationship between 128 different DO cancer terms. The hierarchical clustering of these 128 different cancers showed modular characteristics. In another case study, we used the gene similarity application on 361 obesity-related genes. The results revealed the complex pathogenesis of obesity. In addition, the gene module detection and gene module multilayered annotation functions in DOSim, when applied to these 361 obesity-related genes, helped extend our understanding of the complex pathogenesis of obesity risk phenotypes and the heterogeneity of obesity-related diseases. DOSim can be used to detect disease-driven gene modules, and to annotate the modules for functions and pathways. The DOSim package can also be used to visualise DO structure. DOSim can reflect the modular characteristic of disease-related genes and promote our understanding of the complex pathogenesis of diseases. DOSim is available on the Comprehensive R Archive Network (CRAN) or http://bioinfo.hrbmu.edu.cn/dosim .


96 citations

Journal Article

Knowledge-based vector space model for text clustering


Liping Jing¹, Michael K. Ng², Joshua Zhexue Huang³•Institutions (3)

Beijing Jiaotong University¹, Hong Kong Baptist University², University of Hong Kong³

01 Oct 2010-Knowledge and Information Systems

TL;DR: A new similarity measure is defined that combines the edge-counting technique, the average distance and the position weighting method to compute the similarity of two terms from an ontology hierarchy to re-weight term frequency in the VSM.


Abstract: This paper presents a new knowledge-based vector space model (VSM) for text clustering. In the new model, semantic relationships between terms (e.g., words or concepts) are included in representing text documents as a set of vectors. The idea is to calculate the dissimilarity between two documents more effectively so that text clustering results can be enhanced. In this paper, the semantic relationship between two terms is defined by the similarity of the two terms. Such similarity is used to re-weight term frequency in the VSM. We consider and study two different similarity measures for computing the semantic relationship between two terms based on two different approaches. The first approach is based on the existing ontologies like WordNet and MeSH. We define a new similarity measure that combines the edge-counting technique, the average distance and the position weighting method to compute the similarity of two terms from an ontology hierarchy. The second approach is to make use of text corpora to construct the relationships between terms and then calculate their semantic similarities. Three clustering algorithms, bisecting k-means, feature weighting k-means and a hierarchical clustering algorithm, have been used to cluster real-world text data represented in the new knowledge-based VSM. The experimental results show that the clustering performance based on the new model was much better than that based on the traditional term-based VSM.
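
Re-weighting term frequency by term similarity can be sketched in a few lines. The formula used here (each term's weight becomes the similarity-weighted sum of all term counts) is one plausible form chosen for illustration, not necessarily the one defined in the paper. The vocabulary, the similarity values, and the function names are likewise assumptions; in practice the similarities would come from an ontology such as WordNet or from a text corpus.

```python
from math import sqrt


def reweight(tf, sim):
    """Smooth raw term counts with term-term similarity.

    tf  : raw term counts for one document, length n
    sim : n x n similarity matrix with sim[i][i] == 1.0
    The new weight of term i is sum_j sim[i][j] * tf[j] (assumed form).
    """
    n = len(tf)
    return [sum(sim[i][j] * tf[j] for j in range(n)) for i in range(n)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical vocabulary: ["car", "automobile", "banana"]
TERM_SIM = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

doc_a = [2, 0, 0]  # mentions "car" only
doc_b = [0, 3, 0]  # mentions "automobile" only

raw = cosine(doc_a, doc_b)
smoothed = cosine(reweight(doc_a, TERM_SIM), reweight(doc_b, TERM_SIM))
print(raw)       # 0.0: the documents share no terms
print(smoothed)  # close to 1: related terms now contribute
```

With the term-based representation the two documents look unrelated; after re-weighting they are nearly identical, which is the effect a clustering algorithm would then benefit from.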


95 citations


Network Information

Performance

Metrics

Papers: 15,319

Citations: 407,958

No. of papers in the topic in previous years
Year	Papers
2023	202
2022	522
2021	641
2020	837
2019	866
2018	787
