Showing papers on "Semantic similarity published in 2020"

PDF

Open Access

Posted Content•

On the Sentence Embeddings from Pre-trained Language Models

[...]

Bohan Li¹, Hao Zhou², Junxian He¹, Mingxuan Wang¹, Yiming Yang¹, Lei Li¹ - Show less +2 more•Institutions (2)

Carnegie Mellon University¹, Fudan University²

02 Nov 2020-arXiv: Computation and Language

TL;DR: This paper proposes to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective and achieves significant performance gains over the state-of-the-art sentence embeddings on a variety of semantic textual similarity tasks.

...read moreread less

Abstract: Pre-trained contextual representations like BERT have achieved great success in natural language processing. However, the sentence embeddings from the pre-trained language models without fine-tuning have been found to poorly capture semantic meaning of sentences. In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited. We first reveal the theoretical connection between the masked language model pre-training objective and the semantic similarity task theoretically, and then analyze the BERT sentence embeddings empirically. We find that BERT always induces a non-smooth anisotropic semantic space of sentences, which harms its performance of semantic similarity. To address this issue, we propose to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective. Experimental results show that our proposed BERT-flow method obtains significant performance gains over the state-of-the-art sentence embeddings on a variety of semantic textual similarity tasks. The code is available at this https URL.

...read moreread less

267 citations

Proceedings Article•DOI•

On the Sentence Embeddings from Pre-trained Language Models

[...]

Bohan Li¹, Hao Zhou², Junxian He¹, Mingxuan Wang¹, Yiming Yang¹, Lei Li¹ - Show less +2 more•Institutions (2)

Carnegie Mellon University¹, Fudan University²

24 Nov 2020

TL;DR: BERT-flow as mentioned in this paper transforms the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective.

...read moreread less

178 citations

Posted Content•

Top2Vec: Distributed Representations of Topics

[...]

Dimo Angelov

19 Aug 2020-arXiv: Computation and Language

TL;DR: This model does not require stop-word lists, stemming or lemmatization, and it automatically finds the number of topics, and the resulting topic vectors are jointly embedded with the document and word vectors with distance between them representing semantic similarity.

...read moreread less

Abstract: Topic modeling is used for discovering latent semantic structure, usually referred to as topics, in a large collection of documents. The most widely used methods are Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis. Despite their popularity they have several weaknesses. In order to achieve optimal results they often require the number of topics to be known, custom stop-word lists, stemming, and lemmatization. Additionally these methods rely on bag-of-words representation of documents which ignore the ordering and semantics of words. Distributed representations of documents and words have gained popularity due to their ability to capture semantics of words and documents. We present $\texttt{top2vec}$, which leverages joint document and word semantic embedding to find $\textit{topic vectors}$. This model does not require stop-word lists, stemming or lemmatization, and it automatically finds the number of topics. The resulting topic vectors are jointly embedded with the document and word vectors with distance between them representing semantic similarity. Our experiments demonstrate that $\texttt{top2vec}$ finds topics which are significantly more informative and representative of the corpus trained on than probabilistic generative models.

...read moreread less

130 citations

Proceedings Article•DOI•

Reducing Sentiment Bias in Language Models via Counterfactual Evaluation

[...]

Po-Sen Huang¹, Huan Zhang², Ray Jiang¹, Robert Stanforth³, Johannes Welbl⁴, Jack W. Rae¹, Vishal Maini, Dani Yogatama¹, Pushmeet Kohli¹ - Show less +5 more•Institutions (4)

Google¹, University of California, Los Angeles², Massachusetts Institute of Technology³, University College London⁴

01 Nov 2020

TL;DR: This paper quantifies sentiment bias by adopting individual and group fairness metrics from the fair machine learning literature, and proposes embedding and sentiment prediction-derived regularization on the language model’s latent representations.

...read moreread less

Abstract: Advances in language modeling architectures and the availability of large text corpora have driven progress in automatic text generation. While this results in models capable of generating coherent texts, it also prompts models to internalize social biases present in the training corpus. This paper aims to quantify and reduce a particular type of bias exhibited by language models: bias in the sentiment of generated text. Given a conditioning context (e.g., a writing prompt) and a language model, we analyze if (and how) the sentiment of the generated text is affected by changes in values of sensitive attributes (e.g., country names, occupations, genders) in the conditioning context using a form of counterfactual evaluation. We quantify sentiment bias by adopting individual and group fairness metrics from the fair machine learning literature, and demonstrate that large-scale models trained on two different corpora (news articles, and Wikipedia) exhibit considerable levels of bias. We then propose embedding and sentiment prediction-derived regularization on the language model’s latent representations. The regularizations improve fairness metrics while retaining comparable levels of perplexity and semantic similarity.

...read moreread less

109 citations

Journal Article•DOI•

Dissociable effects of prediction and integration during language comprehension: evidence from a large-scale study using brain potentials

[...]

Mante S. Nieuwland¹, Mante S. Nieuwland², Dale J. Barr³, Federica Bartolozzi¹, Federica Bartolozzi², Simon Busch-Moreno⁴, Emily Darley⁵, David I. Donaldson⁶, Heather J. Ferguson⁷, Xiao Fu⁴, Evelien Heyselaar², Evelien Heyselaar⁸, Falk Huettig², E. Matthew Husband⁹, Aine Ito¹, Aine Ito⁹, Nina Kazanina⁵, Vita Kogan¹, Zdenko Kohút¹⁰, Eugenia Kulakova⁴, Diane Mézière¹, Stephen Politzer-Ahles¹¹, Stephen Politzer-Ahles⁹, Guillaume A. Rousselet³, Shirley-Ann Rueschemeyer¹⁰, Katrien Segaert⁸, Katrien Segaert⁷, Jyrki Tuomainen⁴, Sarah Von Grebmer Zu Wolfsthurn⁵ - Show less +25 more•Institutions (11)

University of Edinburgh¹, Max Planck Society², University of Glasgow³, University College London⁴, University of Bristol⁵, University of Stirling⁶, University of Kent⁷, University of Birmingham⁸, University of Oxford⁹, University of York¹⁰, Hong Kong Polytechnic University¹¹

03 Feb 2020-Philosophical Transactions of the Royal Society B

TL;DR: The results challenge the view that the predictability-dependent N400 reflects the effects of either prediction or integration, and suggest that semantic facilitation of predictable words arises from a cascade of processes that activate and integrate word meaning with context into a sentence-level meaning.

...read moreread less

Abstract: Composing sentence meaning is easier for predictable words than for unpredictable words. Are predictable words genuinely predicted, or simply more plausible and therefore easier to integrate with sentence context? We addressed this persistent and fundamental question using data from a recent, large-scale (n = 334) replication study, by investigating the effects of word predictability and sentence plausibility on the N400, the brain's electrophysiological index of semantic processing. A spatio-temporally fine-grained mixed-effect multiple regression analysis revealed overlapping effects of predictability and plausibility on the N400, albeit with distinct spatio-temporal profiles. Our results challenge the view that the predictability-dependent N400 reflects the effects of either prediction or integration, and suggest that semantic facilitation of predictable words arises from a cascade of processes that activate and integrate word meaning with context into a sentence-level meaning. This article is part of the theme issue 'Towards mechanistic models of meaning composition'.

...read moreread less

108 citations

Proceedings Article•DOI•

tBERT: Topic Models and BERT Joining Forces for Semantic Similarity Detection

[...]

Nicole Peinelt, Dong Nguyen¹, Maria Liakata²•Institutions (2)

The Turing Institute¹, University of Warwick²

01 Jul 2020

TL;DR: This work proposes a novel topic-informed BERT-based architecture for pairwise semantic similarity detection and shows that the model improves performance over strong neural baselines across a variety of English language datasets.

...read moreread less

Abstract: Semantic similarity detection is a fundamental task in natural language understanding. Adding topic information has been useful for previous feature-engineered semantic similarity models as well as neural models for other tasks. There is currently no standard way of combining topics with pretrained contextual representations such as BERT. We propose a novel topic-informed BERT-based architecture for pairwise semantic similarity detection and show that our model improves performance over strong neural baselines across a variety of English language datasets. We find that the addition of topics to BERT helps particularly with resolving domain-specific cases.

...read moreread less

107 citations

Journal Article•DOI•

MedSTS: A resource for clinical semantic textual similarity

[...]

Yanshan Wang¹, Naveed Afzal¹, Sunyang Fu¹, Liwei Wang¹, Feichen Shen¹, Majid Rastegar-Mojarad¹, Hongfang Liu¹ - Show less +3 more•Institutions (1)

Mayo Clinic¹

01 Mar 2020

TL;DR: The MedSTS_ann corpus as mentioned in this paper is a corpus of 174,629 sentence pairs gathered from a clinical corpus at Mayo Clinic, containing 1068 sentence pairs annotated by two medical experts with semantic similarity scores of 0-5 (low to high similarity).

...read moreread less

Abstract: The adoption of electronic health records (EHRs) has enabled a wide range of applications leveraging EHR data. However, the meaningful use of EHR data largely depends on our ability to efficiently extract and consolidate information embedded in clinical text where natural language processing (NLP) techniques are essential. Semantic textual similarity (STS) that measures the semantic similarity between text snippets plays a significant role in many NLP applications. In the general NLP domain, STS shared tasks have made available a huge collection of text snippet pairs with manual annotations in various domains. In the clinical domain, STS can enable us to detect and eliminate redundant information that may lead to a reduction in cognitive burden and an improvement in the clinical decision-making process. This paper elaborates our efforts to assemble a resource for STS in the medical domain, MedSTS. It consists of a total of 174,629 sentence pairs gathered from a clinical corpus at Mayo Clinic. A subset of MedSTS (MedSTS_ann) containing 1068 sentence pairs was annotated by two medical experts with semantic similarity scores of 0–5 (low to high similarity). We further analyzed the medical concepts in the MedSTS corpus, and tested four STS systems on the MedSTS_ann corpus. In the future, we will organize a shared task by releasing the MedSTS_ann corpus to motivate the community to tackle the real world clinical problems.

...read moreread less

102 citations

Journal Article•DOI•

Evolution of Semantic Similarity -- A Survey

[...]

Dhivya Chandrasekaran¹, Vijay Mago¹•Institutions (1)

Lakehead University¹

19 Apr 2020-arXiv: Computation and Language

TL;DR: This survey article traces the evolution of semantic similarity methods beginning from traditional NLP techniques such as kernel-based methods to the most recent research work on transformer-based models, categorizing them based on their underlying principles as knowledge-based, corpus- based, deep neural network–based methods, and hybrid methods.

...read moreread less

Abstract: Estimating the semantic similarity between text data is one of the challenging and open research problems in the field of Natural Language Processing (NLP). The versatility of natural language makes it difficult to define rule-based methods for determining semantic similarity measures. In order to address this issue, various semantic similarity methods have been proposed over the years. This survey article traces the evolution of such methods, categorizing them based on their underlying principles as knowledge-based, corpus-based, deep neural network-based methods, and hybrid methods. Discussing the strengths and weaknesses of each method, this survey provides a comprehensive view of existing systems in place, for new researchers to experiment and develop innovative ideas to address the issue of semantic similarity.

...read moreread less

98 citations

Proceedings Article•DOI•

Neural CRF Model for Sentence Alignment in Text Simplification

[...]

Chao Jiang¹, Mounica Maddela, Wuwei Lan¹, Yang Zhong¹, Wei Xu¹ - Show less +1 more•Institutions (1)

Ohio State University¹

05 May 2020

TL;DR: This paper proposed a neural CRF alignment model which not only leverages the sequential nature of sentences in parallel documents, but also utilizes a neural sentence pair model to capture semantic similarity for text simplification.

...read moreread less

Abstract: The success of a text simplification system heavily depends on the quality and quantity of complex-simple sentence pairs in the training corpus, which are extracted by aligning sentences between parallel articles. To evaluate and improve sentence alignment quality, we create two manually annotated sentence-aligned datasets from two commonly used text simplification corpora, Newsela and Wikipedia. We propose a novel neural CRF alignment model which not only leverages the sequential nature of sentences in parallel documents but also utilizes a neural sentence pair model to capture semantic similarity. Experiments demonstrate that our proposed approach outperforms all the previous work on monolingual sentence alignment task by more than 5 points in F1. We apply our CRF aligner to construct two new text simplification datasets, Newsela-Auto and Wiki-Auto, which are much larger and of better quality compared to the existing datasets. A Transformer-based seq2seq model trained on our datasets establishes a new state-of-the-art for text simplification in both automatic and human evaluation.

...read moreread less

96 citations

Proceedings Article•DOI•

FusAtNet: Dual Attention based SpectroSpatial Multimodal Fusion Network for Hyperspectral and LiDAR Classification

[...]

Satyam Mohla¹, Shivam Pande¹, Biplab Banerjee¹, Subhasis Chaudhuri¹•Institutions (1)

Indian Institute of Technology Bombay¹

14 Jun 2020

TL;DR: The proposed FusAtNet framework achieves the state-of-the-art classification performance, including on the largest HSI-LiDAR dataset available, University of Houston (Data Fusion Contest - 2013), opening new avenues in multimodal feature fusion for classification.

...read moreread less

Abstract: With recent advances in sensing, multimodal data is becoming easily available for various applications, especially in remote sensing (RS), where many data types like multispectral imagery (MSI), hyperspectral imagery (HSI), LiDAR etc. are available. Effective fusion of these multisource datasets is becoming important, for these multimodality features have been shown to generate highly accurate land-cover maps. However, fusion in the context of RS is non-trivial considering the redundancy involved in the data and the large domain differences among multiple modalities. In addition, the feature extraction modules for different modalities hardly interact among themselves, which further limits their semantic relatedness. As a remedy, we propose a feature fusion and extraction framework, namely FusAtNet, for collective land-cover classification of HSIs and LiDAR data in this paper. The proposed framework effectively utilizses HSI modality to generate an attention map using "self-attention" mechanism that highlights its own spectral features. Similarly, a "cross-attention" approach is simultaneously used to harness the LiDAR derived attention map that accentuates the spatial features of HSI. These attentive spectral and spatial representations are then explored further along with the original data to obtain modality-specific feature embeddings. The modality oriented joint spectro-spatial information thus obtained, is subsequently utilized to carry out the land-cover classification task. Experimental evaluations on three HSILiDAR datasets show that the proposed method achieves the state-of-the-art classification performance, including on the largest HSI-LiDAR dataset available, University of Houston (Data Fusion Contest - 2013), opening new avenues in multimodal feature fusion for classification.

...read moreread less

93 citations

Book Chapter•DOI•

Gene Ontology Semantic Similarity Analysis Using GOSemSim.

[...]

Guangchuang Yu¹•Institutions (1)

Southern Medical University¹

01 Jan 2020-Methods of Molecular Biology

TL;DR: This chapter illustrates the use of GOSemSim on a list of regulators in preimplantation embryos with step-by-step analysis as well as instructions on interpretation and visualization of the results.

...read moreread less

Abstract: The GOSemSim package, an R-based tool within the Bioconductor project, offers several methods based on information content and graph structure for measuring semantic similarity among GO terms, gene products and gene clusters. In this chapter, I illustrate the use of GOSemSim on a list of regulators in preimplantation embryos. A step-by-step analysis was provided as well as instructions on interpretation and visualization of the results. GOSemSim is open-source and is available from https://www.bioconductor.org/packages/GOSemSim .

...read moreread less

Journal Article•DOI•

An efficient approach based on multi-sources information to predict circRNA–disease associations using deep convolutional neural network

[...]

Lei Wang¹, Zhu-Hong You¹, Yu-An Huang², De-Shuang Huang³, Keith C. C. Chan² - Show less +1 more•Institutions (3)

Chinese Academy of Sciences¹, Hong Kong Polytechnic University², Tongji University³

01 Jul 2020-Bioinformatics

TL;DR: An efficient computational method based on multi-source information combined with deep convolutional neural network to predict circRNA-disease associations that achieves the best results and can provide reliable candidates for biological experiments.

...read moreread less

Abstract: Motivation Emerging evidence indicates that circular RNA (circRNA) plays a crucial role in human disease. Using circRNA as biomarker gives rise to a new perspective regarding our diagnosing of diseases and understanding of disease pathogenesis. However, detection of circRNA-disease associations by biological experiments alone is often blind, limited to small scale, high cost and time consuming. Therefore, there is an urgent need for reliable computational methods to rapidly infer the potential circRNA-disease associations on a large scale and to provide the most promising candidates for biological experiments. Results In this article, we propose an efficient computational method based on multi-source information combined with deep convolutional neural network (CNN) to predict circRNA-disease associations. The method first fuses multi-source information including disease semantic similarity, disease Gaussian interaction profile kernel similarity and circRNA Gaussian interaction profile kernel similarity, and then extracts its hidden deep feature through the CNN and finally sends them to the extreme learning machine classifier for prediction. The 5-fold cross-validation results show that the proposed method achieves 87.21% prediction accuracy with 88.50% sensitivity at the area under the curve of 86.67% on the CIRCR2Disease dataset. In comparison with the state-of-the-art SVM classifier and other feature extraction methods on the same dataset, the proposed model achieves the best results. In addition, we also obtained experimental support for prediction results by searching published literature. As a result, 7 of the top 15 circRNA-disease pairs with the highest scores were confirmed by literature. These results demonstrate that the proposed model is a suitable method for predicting circRNA-disease associations and can provide reliable candidates for biological experiments. Availability and implementation The source code and datasets explored in this work are available at https://github.com/look0012/circRNA-Disease-association. Supplementary information Supplementary data are available at Bioinformatics online.

...read moreread less

Proceedings Article•DOI•

Boosting Few-Shot Learning With Adaptive Margin Loss

[...]

Aoxue Li¹, Weiran Huang², Xu Lan³, Jiashi Feng⁴, Zhenguo Li², Liwei Wang¹ - Show less +2 more•Institutions (4)

Peking University¹, Huawei², Queen Mary University of London³, National University of Singapore⁴

14 Jun 2020

TL;DR: The authors proposed an adaptive margin principle to improve the generalization ability of metric-based meta-learning approaches for few-shot learning problems, where semantic similarity between each pair of classes is considered to separate samples in the feature embedding space from similar classes.

...read moreread less

Abstract: Few-shot learning (FSL) has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty in learning to generalize from a few examples. This paper proposes an adaptive margin principle to improve the generalization ability of metric-based meta-learning approaches for few-shot learning problems. Specifically, we first develop a class-relevant additive margin loss, where semantic similarity between each pair of classes is considered to separate samples in the feature embedding space from similar classes. Further, we incorporate the semantic context among all classes in a sampled training task and develop a task-relevant additive margin loss to better distinguish samples from different classes. Our adaptive margin method can be easily extended to a more realistic generalized FSL setting. Extensive experiments demonstrate that the proposed method can boost the performance of current metric-based meta-learning approaches, under both the standard FSL and generalized FSL settings.

...read moreread less

Journal Article•DOI•

Hybrid query expansion using lexical resources and word embeddings for sentence retrieval in question answering

[...]

Massimo Esposito¹, Emanuele Damiano¹, Aniello Minutolo¹, Giuseppe De Pietro¹, Hamido Fujita - Show less +1 more•Institutions (1)

Indian Council of Agricultural Research¹

01 Apr 2020-Information Sciences

TL;DR: This paper proposes a hybrid Query Expansion (QE) approach, based on lexical resources and word embeddings, for QA systems, which is implemented into an existing QA system and experimentally evaluated, with respect to different possible configurations and selected baselines, for the Italian language and in the Cultural Heritage domain.

...read moreread less

Journal Article•DOI•

Measurement of Text Similarity: A Survey

[...]

Jiapeng Wang, Yihong Dong

31 Aug 2020-Information-an International Interdisciplinary Journal

TL;DR: This paper systematically combs the research status of similarity measurement, analyzes the advantages and disadvantages of current methods, develops a more comprehensive classification description system of text similarity measurement algorithms, and summarizes the future development direction.

...read moreread less

Abstract: Text similarity measurement is the basis of natural language processing tasks, which play an important role in information retrieval, automatic question answering, machine translation, dialogue systems, and document matching. This paper systematically combs the research status of similarity measurement, analyzes the advantages and disadvantages of current methods, develops a more comprehensive classification description system of text similarity measurement algorithms, and summarizes the future development direction. With the aim of providing reference for related research and application, the text similarity measurement method is described by two aspects: text distance and text representation. The text distance can be divided into length distance, distribution distance, and semantic distance; text representation is divided into string-based, corpus-based, single-semantic text, multi-semantic text, and graph-structure-based representation. Finally, the development of text similarity is also summarized in the discussion section.

...read moreread less

Proceedings Article•DOI•

Unsupervised Paraphrasing by Simulated Annealing

[...]

Xianggen Liu¹, Lili Mou², Fandong Meng³, Hao Zhou⁴, Jie Zhou⁵, Sen Song¹ - Show less +2 more•Institutions (5)

Tsinghua University¹, The Chinese University of Hong Kong², Beijing Jiaotong University³, Shanghai Jiao Tong University⁴, Tencent⁵

01 Jul 2020

TL;DR: Results show that UPSA achieves the state-of-the-art performance compared with previous unsupervised methods in terms of both automatic and human evaluations, and outperforms most existing domain-adapted supervised models, showing the generalizability of UPSA.

...read moreread less

Abstract: We propose UPSA, a novel approach that accomplishes Unsupervised Paraphrasing by Simulated Annealing. We model paraphrase generation as an optimization problem and propose a sophisticated objective function, involving semantic similarity, expression diversity, and language fluency of paraphrases. UPSA searches the sentence space towards this objective by performing a sequence of local editing. We evaluate our approach on various datasets, namely, Quora, Wikianswers, MSCOCO, and Twitter. Extensive results show that UPSA achieves the state-of-the-art performance compared with previous unsupervised methods in terms of both automatic and human evaluations. Further, our approach outperforms most existing domain-adapted supervised models, showing the generalizability of UPSA.

...read moreread less

Book Chapter•DOI•

Hard negative examples are hard, but useful

[...]

Hong Xuan¹, Abby Stylianou², Xiaotong Liu¹, Robert Pless¹•Institutions (2)

George Washington University¹, Saint Louis University²

23 Aug 2020

TL;DR: In this paper, the authors characterize the space of triplets and derive why hard negatives make triplet loss training fail, and propose a simple fix to the loss function and show that, with this fix, optimizing with hard negative examples becomes feasible.

...read moreread less

Abstract: Triplet loss is an extremely common approach to distance metric learning. Representations of images from the same class are optimized to be mapped closer together in an embedding space than representations of images from different classes. Much work on triplet losses focuses on selecting the most useful triplets of images to consider, with strategies that select dissimilar examples from the same class or similar examples from different classes. The consensus of previous research is that optimizing with the hardest negative examples leads to bad training behavior. That’s a problem – these hardest negatives are literally the cases where the distance metric fails to capture semantic similarity. In this paper, we characterize the space of triplets and derive why hard negatives make triplet loss training fail. We offer a simple fix to the loss function and show that, with this fix, optimizing with hard negative examples becomes feasible. This leads to more generalizable features, and image retrieval results that outperform state of the art for datasets with high intra-class variance. Code is available at: https://github.com/littleredxh/HardNegative.git

...read moreread less

Posted Content•

Boosting Few-Shot Learning With Adaptive Margin Loss

[...]

Aoxue Li¹, Weiran Huang², Xu Lan³, Jiashi Feng⁴, Zhenguo Li², Liwei Wang¹ - Show less +2 more•Institutions (4)

Peking University¹, Huawei², Queen Mary University of London³, National University of Singapore⁴

28 May 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: An adaptive margin principle is proposed to improve the generalization ability of metric-based meta-learning approaches for few-shot learning problems by developing a class-relevant additive margin loss, where semantic similarity between each pair of classes is considered to separate samples in the feature embedding space from similar classes.

...read moreread less

Journal Article•DOI•

A collaborative filtering recommender system using genetic algorithm

[...]

Bushra Alhijawi¹, Yousef Kilani¹•Institutions (1)

Hashemite University¹

01 Nov 2020-Information Processing and Management

TL;DR: A novel genetic-based recommender system (BLIGA) that depends on the semantic information and historical rating data and its capability to achieve more accurate predictions than the alternative methods regardless of the number of K-neighbors is presented.

...read moreread less

Abstract: This paper presents a novel genetic-based recommender system (BLIGA) that depends on the semantic information and historical rating data. The main contribution of this research lies in evaluating the possible recommendation lists instead of evaluating items then forming the recommendation list. BLIGA utilizes the genetic algorithm to find the best list of items to the active user. Thus, each individual represents a candidate recommendation list. BLIGA hierarchically evaluates the individuals using three fitness functions. The first function uses semantic information about items to estimates the strength of the semantic similarity between items. The second function estimates the similarity in satisfaction level between users. The third function depends on the predicted ratings to select the best recommendation list. BLIGA results have been compared against recommendation results from alternative collaborative filtering methods. The results demonstrate the superiority of BLIGA and its capability to achieve more accurate predictions than the alternative methods regardless of the number of K-neighbors.

...read moreread less

Posted Content•

SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization

[...]

Yang Gao¹, Wei Zhao², Steffen Eger²•Institutions (2)

Royal Holloway, University of London¹, Technische Universität Darmstadt²

07 May 2020-arXiv: Computation and Language

TL;DR: This work proposes SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo reference summary, i.e. selected salient sentences from the source documents, using contextualized embeddings and soft token alignment techniques.

...read moreread less

Abstract: We study unsupervised multi-document summarization evaluation metrics, which require neither human-written reference summaries nor human annotations (e.g. preferences, ratings, etc.). We propose SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo reference summary, i.e. selected salient sentences from the source documents, using contextualized embeddings and soft token alignment techniques. Compared to the state-of-the-art unsupervised evaluation metrics, SUPERT correlates better with human ratings by 18-39%. Furthermore, we use SUPERT as rewards to guide a neural-based reinforcement learning summarizer, yielding favorable performance compared to the state-of-the-art unsupervised summarizers. All source code is available at this https URL.

...read moreread less

Journal Article•DOI•

CGMVQA: A New Classification and Generative Model for Medical Visual Question Answering

[...]

Fuji Ren¹, Yangyang Zhou¹•Institutions (1)

University of Tokushima¹

11 Mar 2020-IEEE Access

TL;DR: The proposed CGMVQA model, including classification and answer generation capabilities, is effective in medical visual question answering and can better assist doctors in clinical analysis and diagnosis.

...read moreread less

Abstract: Medical images are playing an important role in the medical domain. A mature medical visual question answering system can aid diagnosis, but there is no satisfactory method to solve this comprehensive problem so far. Considering that there are many different types of questions, we propose a model called CGMVQA, including classification and answer generation capabilities to turn this complex problem into multiple simple problems in this paper. We adopt data augmentation on images and tokenization on texts. We use pre-trained ResNet152 to extract image features and add three kinds of embeddings together to deal with texts. We reduce the parameters of the multi-head self-attention transformer to cut the computational cost down. We adjust the masking and output layers to change the functions of the model. This model establishes new state-of-the-art results: 0.640 of classification accuracy, 0.659 of word matching and 0.678 of semantic similarity in ImageCLEF 2019 VQA-Med data set. It suggests that the CGMVQA is effective in medical visual question answering and can better assist doctors in clinical analysis and diagnosis.

...read moreread less

Proceedings Article•DOI•

SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization

[...]

Yang Gao¹, Wei Zhao², Steffen Eger²•Institutions (2)

Chinese Academy of Sciences¹, Technische Universität Darmstadt²

01 Jul 2020

TL;DR: The authors propose to measure the quality of a summary by measuring its semantic similarity with a pseudo reference summary, i.e. selected salient sentences from the source documents, using contextualized embeddings and soft token alignment techniques.

...read moreread less

Abstract: We study unsupervised multi-document summarization evaluation metrics, which require neither human-written reference summaries nor human annotations (e.g. preferences, ratings, etc.). We propose SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo reference summary, i.e. selected salient sentences from the source documents, using contextualized embeddings and soft token alignment techniques. Compared to the state-of-the-art unsupervised evaluation metrics, SUPERT correlates better with human ratings by 18- 39%. Furthermore, we use SUPERT as rewards to guide a neural-based reinforcement learning summarizer, yielding favorable performance compared to the state-of-the-art unsupervised summarizers. All source code is available at https://github.com/yg211/acl20-ref-free-eval.

...read moreread less

Journal Article•DOI•

Default network contributions to episodic and semantic processing during divergent creative thinking: A representational similarity analysis.

[...]

Roger E. Beaty¹, Qunlin Chen¹, Alexander P. Christensen², Yoed N. Kenett³, Paul J. Silvia², Mathias Benedek⁴, Daniel L. Schacter⁵ - Show less +3 more•Institutions (5)

Pennsylvania State University¹, University of North Carolina at Greensboro², University of Pennsylvania³, University of Graz⁴, Harvard University⁵

01 Apr 2020-NeuroImage

TL;DR: The findings point to dissociable contributions of episodic and semantic memory processes to creative cognition and suggest that distinct regions within the default network support specific memory-related processes during divergent thinking.

...read moreread less

Proceedings Article•DOI•

Semantic Pyramid for Image Generation

[...]

Assaf Shocher¹, Yossi Gandelsman², Inbar Mosseri², Michal Yarom², Michal Irani¹, William T. Freeman², Tali Dekel² - Show less +3 more•Institutions (2)

Weizmann Institute of Science¹, Google²

14 Jun 2020

TL;DR: In this paper, the semantic generation pyramid (SGP) model is proposed to generate images with a controllable extent of semantic similarity to a reference image, and different manipulation tasks such as semantically controlled inpainting and compositing.

...read moreread less

Abstract: We present a novel GAN-based model that utilizes the space of deep features learned by a pre-trained classification model. Inspired by classical image pyramid representations, we construct our model as a Semantic Generation Pyramid -- a hierarchical framework which leverages the continuum of semantic information encapsulated in such deep features; this ranges from low level information contained in fine features to high level, semantic information contained in deeper features. More specifically, given a set of features extracted from a reference image, our model generates diverse image samples, each with matching features at each semantic level of the classification model. We demonstrate that our model results in a versatile and flexible framework that can be used in various classic and novel image generation tasks. These include: generating images with a controllable extent of semantic similarity to a reference image, and different manipulation tasks such as semantically-controlled inpainting and compositing; all achieved with the same model, with no further training.

...read moreread less

Journal Article•DOI•

Deep hierarchical encoding model for sentence semantic matching

[...]

Wenpeng Lu¹, Xu Zhang¹, Huimin Lu², Fangfang Li•Institutions (2)

Qilu University of Technology¹, Kyushu Institute of Technology²

01 Aug 2020-Journal of Visual Communication and Image Representation

TL;DR: A hierarchical encoding model (HEM) for sentence representation is proposed, further enhanced by a hierarchical matching mechanism for sentence interaction, which significantly outperforms the existing state-of-the-art neural models on the public real-world dataset.

...read moreread less

Posted Content•

An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph

[...]

Jiarui Jin¹, Jiarui Qin², Yuchen Fang², Kounianhua Du², Weinan Zhang², Yong Yu², Zheng Zhang³, Alexander J. Smola³ - Show less +4 more•Institutions (3)

Association for Computing Machinery¹, Shanghai Jiao Tong University², Amazon.com³

01 Jul 2020-arXiv: Information Retrieval

TL;DR: This work first analyzes the significance of learning interactions in HINs and proposes a novel formulation to capture the interactive patterns between each pair of nodes through their metapath-guided neighborhoods, and is the first work providing an efficient neighborhood-based interaction model in the HIN-based recommendations.

...read moreread less

Abstract: There is an influx of heterogeneous information network (HIN) based recommender systems in recent years since HIN is capable of characterizing complex graphs and contains rich semantics. Although the existing approaches have achieved performance improvement, while practical, they still face the following problems. On one hand, most existing HIN-based methods rely on explicit path reachability to leverage path-based semantic relatedness between users and items, e.g., metapath-based similarities. These methods are hard to use and integrate since path connections are sparse or noisy, and are often of different lengths. On the other hand, other graph-based methods aim to learn effective heterogeneous network representations by compressing node together with its neighborhood information into single embedding before prediction. This weakly coupled manner in modeling overlooks the rich interactions among nodes, which introduces an early summarization issue. In this paper, we propose an end-to-end Neighborhood-based Interaction Model for Recommendation (NIRec) to address the above problems. Specifically, we first analyze the significance of learning interactions in HINs and then propose a novel formulation to capture the interactive patterns between each pair of nodes through their metapath-guided neighborhoods. Then, to explore complex interactions between metapaths and deal with the learning complexity on large-scale networks, we formulate interaction in a convolutional way and learn efficiently with fast Fourier transform. The extensive experiments on four different types of heterogeneous graphs demonstrate the performance gains of NIRec comparing with state-of-the-arts. To the best of our knowledge, this is the first work providing an efficient neighborhood-based interaction model in the HIN-based recommendations.

...read moreread less

Journal Article•DOI•

A gradient from long-term memory to novel cognition: Transitions through default mode and executive cortex.

[...]

Xiuyi Wang¹, Daniel S. Margulies², Jonathan Smallwood¹, Elizabeth Jefferies¹•Institutions (2)

University of York¹, Centre national de la recherche scientifique²

15 Oct 2020-NeuroImage

TL;DR: The brain’s response to the task gradient varied systematically along the connectivity gradient, with the strongest response in default mode network when the probe and target items were highly overlapping conceptually.

...read moreread less

Posted Content•

CODER: Knowledge infused cross-lingual medical term embedding for term normalization.

[...]

Zheng Yuan, Zhengyun Zhao, Sheng Yu

05 Nov 2020-arXiv: Computation and Language

TL;DR: CODER embeddings excellently reflect semantic similarity and relatedness of medical concepts and can be used for embedding-based medical term normalization or to provide features for machine learning.

...read moreread less

Abstract: This paper proposes CODER: contrastive learning on knowledge graphs for cross-lingual medical term representation. CODER is designed for medical term normalization by providing close vector representations for different terms that represent the same or similar medical concepts with cross-lingual support. We train CODER via contrastive learning on a medical knowledge graph (KG) named the Unified Medical Language System, where similarities are calculated utilizing both terms and relation triplets from KG. Training with relations injects medical knowledge into embeddings and aims to provide potentially better machine learning features. We evaluate CODER in zero-shot term normalization, semantic similarity, and relation classification benchmarks, which show that CODERoutperforms various state-of-the-art biomedical word embedding, concept embeddings, and contextual embeddings. Our codes and models are available at this https URL.

...read moreread less

Proceedings Article•DOI•

An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph

[...]

Jiarui Jin¹, Jiarui Qin¹, Yuchen Fang¹, Kounianhua Du¹, Weinan Zhang¹, Yong Yu¹, Zheng Zhang², Alexander J. Smola² - Show less +4 more•Institutions (2)

Shanghai Jiao Tong University¹, Amazon.com²

23 Aug 2020

TL;DR: Wang et al. as discussed by the authors proposed an end-to-end Neighborhood-based Interaction Model for Recommendation (NIRec) to capture the interactive patterns between each pair of nodes through their metapath-guided neighborhoods.

...read moreread less

Abstract: There is an influx of heterogeneous information network (HIN) based recommender systems in recent years since HIN is capable of characterizing complex graphs and contains rich semantics. Although the existing approaches have achieved performance improvement, while practical, they still face the following problems. On one hand, most existing HIN-based methods rely on explicit path reachability to leverage path-based semantic relatedness between users and items, e.g., metapath-based similarities. These methods are hard to use and integrate since path connections are sparse or noisy, and are often of different lengths. On the other hand, other graph-based methods aim to learn effective heterogeneous network representations by compressing node together with its neighborhood information into single embedding before prediction. This weakly coupled manner in modeling overlooks the rich interactions among nodes, which introduces an early summarization issue. In this paper, we propose an end-to-end Neighborhood-based Interaction Model for Recommendation (NIRec) to address above problems. Specifically, we first analyze the significance of learning interactions in HINs and then propose a novel formulation to capture the interactive patterns between each pair of nodes through their metapath-guided neighborhoods. Then, to explore complex interactions between metapaths and deal with the learning complexity on large-scale networks, we formulate interaction in a convolutional way and learn efficiently with fast Fourier transform. The extensive experiments on four different types of heterogeneous graphs demonstrate the performance gains of NIRec comparing with state-of-the-arts. To the best of our knowledge, this is the first work providing an efficient neighborhood-based interaction model in the HIN-based recommendations.

...read moreread less

Journal Article•DOI•

Weakly-Supervised Video Moment Retrieval via Semantic Completion Network

[...]

Zhijie Lin¹, Zhou Zhao¹, Zhu Zhang¹, Qi Wang², Huasheng Liu² - Show less +1 more•Institutions (2)

Zhejiang University¹, Alibaba Group²

03 Apr 2020

TL;DR: In this paper, a weakly-supervised moment retrieval framework is proposed, which requires only coarse video-level annotations for training and proposes a proposal generation module that aggregates the context information to generate and score all candidate proposals in one single pass.

...read moreread less

Abstract: Video moment retrieval is to search the moment that is most relevant to the given natural language query. Existing methods are mostly trained in a fully-supervised setting, which requires the full annotations of temporal boundary for each query. However, manually labeling the annotations is actually time-consuming and expensive. In this paper, we propose a novel weakly-supervised moment retrieval framework requiring only coarse video-level annotations for training. Specifically, we devise a proposal generation module that aggregates the context information to generate and score all candidate proposals in one single pass. We then devise an algorithm that considers both exploitation and exploration to select top-K proposals. Next, we build a semantic completion module to measure the semantic similarity between the selected proposals and query, compute reward and provide feedbacks to the proposal generation module for scoring refinement. Experiments on the ActivityCaptions and Charades-STA demonstrate the effectiveness of our proposed method.

...read moreread less

Collapse