Book Chapter DOI

SAEA: Self-Attentive Heterogeneous Sequence Learning Model for Entity Alignment

24 Sep 2020, pp. 452–467
TL;DR: Proposes SAEA, a Self-Attentive heterogeneous sequence learning model for Entity Alignment that captures long-term structural dependencies between entities, together with a degree-aware random walk that generates heterogeneous sequential data for self-attentive learning.
Abstract: We consider the problem of entity alignment in knowledge graphs. Previous works mainly focus on two aspects. One is to improve TransE-based models, which mostly consider only triple-level structural information, i.e., relation triples, or to use graph convolutional networks under the assumption that equivalent entities are usually neighbored by other equivalent entities. The other is to incorporate external features, such as attribute types, attribute values, entity names, and descriptions, to enhance the original relational model. However, long-term structural dependencies between entities have not been exploited well enough, and external resources are sometimes incomplete or unavailable. This impairs the accuracy and robustness of combinational models that use relations together with other types of information, especially when iteration is performed. To better explore structural information between entities, we propose a novel Self-Attentive heterogeneous sequence learning model for Entity Alignment (SAEA) that captures long-term structural dependencies between entities. Furthermore, considering that low-degree entities and relations appear much less often in sequences produced by traditional random walk methods, we design a degree-aware random walk to generate heterogeneous sequential data for self-attentive learning. To evaluate our proposed model, we conduct extensive experiments on real-world datasets. The experimental results show that our method outperforms various state-of-the-art entity alignment models that use relation triples only.
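
The abstract does not give the sampling details of the degree-aware walk, so the following is only a rough illustration of the idea, assuming "degree-aware" means biasing transitions toward low-degree tail entities so that rare entities and relations show up more often in the generated entity-relation sequences; the function name and the inverse-degree weighting are hypothetical.

```python
import random

def degree_aware_walk(kg, start, num_steps, alpha=1.0):
    """One heterogeneous walk over a KG stored as {head: [(relation, tail), ...]}.

    Hypothetical sketch: each neighbor is sampled with probability inversely
    proportional to the tail entity's degree (raised to alpha), so low-degree
    entities and their relations appear more often than under a uniform walk.
    """
    walk, node = [start], start
    for _ in range(num_steps):
        neighbors = kg.get(node, [])
        if not neighbors:
            break
        weights = [(1.0 + len(kg.get(t, []))) ** -alpha for _, t in neighbors]
        rel, node = random.choices(neighbors, weights=weights, k=1)[0]
        walk += [rel, node]  # heterogeneous: entities and relations interleaved
    return walk
```
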
Citations
Posted Content
TL;DR: This paper introduces an embedding-based approach which leverages a weakly aligned multilingual KG for semi-supervised cross-lingual learning using entity descriptions and shows that the performance of the proposed approach on the entity alignment task improves at each iteration of co-training, and eventually reaches a stage at which it significantly surpasses previous approaches.
Abstract: Multilingual knowledge graph (KG) embeddings provide latent semantic representations of entities and structured knowledge with cross-lingual inferences, which benefit various knowledge-driven cross-lingual NLP tasks. However, precisely learning such cross-lingual inferences is usually hindered by the low coverage of entity alignment in many KGs. Since many multilingual KGs also provide literal descriptions of entities, in this paper, we introduce an embedding-based approach which leverages a weakly aligned multilingual KG for semi-supervised cross-lingual learning using entity descriptions. Our approach performs co-training of two embedding models, i.e. a multilingual KG embedding model and a multilingual literal description embedding model. The models are trained on a large Wikipedia-based trilingual dataset where most entity alignment is unknown to training. Experimental results show that the performance of the proposed approach on the entity alignment task improves at each iteration of co-training, and eventually reaches a stage at which it significantly surpasses previous approaches. We also show that our approach has promising abilities for zero-shot entity alignment, and cross-lingual KG completion.
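
As a loose sketch of the co-training loop this abstract describes (not the paper's exact procedure): two embedding models are retrained in turn, and the alignments both agree on are promoted to training data for the next round. `train_kg`, `train_desc`, and the returned predict callables are hypothetical placeholders.

```python
def co_train(train_kg, train_desc, labeled, unlabeled, rounds=5):
    """Illustrative co-training skeleton. train_kg/train_desc are
    hypothetical callables that fit a model on the labeled pairs and
    return a predict(pairs) -> set-of-accepted-pairs function."""
    labeled, unlabeled = set(labeled), set(unlabeled)
    for _ in range(rounds):
        predict_kg = train_kg(labeled)
        predict_desc = train_desc(labeled)
        agreed = predict_kg(unlabeled) & predict_desc(unlabeled)
        labeled |= agreed      # promote pairs both models accept
        unlabeled -= agreed
    return labeled
```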

64 citations

Book Chapter DOI
11 Apr 2021
TL;DR: In this article, the authors propose a framework for entity alignment (EA), the task of discovering equivalent entities in different knowledge graphs (KGs) in order to increase knowledge coverage and quality.
Abstract: Entity alignment (EA) aims to discover the equivalent entities in different knowledge graphs (KGs). It is a pivotal step for integrating KGs to increase knowledge coverage and quality. Recent years have witnessed a rapid increase of EA frameworks. However, state-of-the-art solutions tend to rely on labeled data for model training. Additionally, they work under the closed-domain setting and cannot deal with entities that are unmatchable.

10 citations

Journal Article DOI
TL;DR: Zhang et al. as discussed by the authors proposed an unsupervised framework that performs entity alignment in the open world by mining useful features from the side information of KGs and devise an unmatchable entity prediction module to filter out un-matchable entities and produce preliminary alignment results.
Abstract: Entity alignment (EA) aims to discover the equivalent entities in different knowledge graphs (KGs). It is a pivotal step for integrating KGs to increase knowledge coverage and quality. Recent years have witnessed a rapid increase of EA frameworks. However, state-of-the-art solutions tend to rely on labeled data for model training. Additionally, they work under the closed-domain setting and cannot deal with entities that are unmatchable. To address these deficiencies, we offer an unsupervised framework that performs entity alignment in the open world. Specifically, we first mine useful features from the side information of KGs. Then, we devise an unmatchable entity prediction module to filter out unmatchable entities and produce preliminary alignment results. These preliminary results are regarded as pseudo-labeled data and forwarded to the progressive learning framework to generate structural representations, which are integrated with the side information to provide a more comprehensive view for alignment. Finally, the progressive learning framework gradually improves the quality of the structural embeddings and enhances the alignment performance. Furthermore, noticing that the pseudo-labeled data vary in quality, we introduce the concept of confidence to measure the probability of an entity pair being true and develop a confidence-based unsupervised EA framework. Our solutions do not require labeled data and can effectively filter out unmatchable entities. Comprehensive experimental evaluations validate the superiority of our proposals.
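
The confidence idea can be pictured with a trivial sketch: pseudo-labeled pairs are kept only when their estimated probability of being a true match clears a threshold. The function and threshold below are purely illustrative, not the authors' actual module.

```python
def filter_pseudo_labels(pairs, confidence, threshold=0.9):
    """Keep only entity pairs whose confidence of being a true match
    exceeds the threshold; low-confidence pairs are deferred to later,
    easier rounds of progressive learning (illustrative sketch).

    pairs: list of (e1, e2) candidates; confidence: callable pair -> [0, 1].
    """
    return [p for p in pairs if confidence(p) >= threshold]
```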

5 citations


References
Proceedings Article
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, Jeffrey Dean
05 Dec 2013
TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
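
The negative-sampling objective that replaces the hierarchical softmax fits in a few lines of NumPy; this is a minimal sketch with illustrative names, scoring one true (center, context) pair against k sampled negatives.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_loss(center, context, negatives):
    """Skip-gram with negative sampling: maximize the score of the true
    pair and minimize the scores of k sampled negative words.

    center: (d,) vector; context: (d,) vector; negatives: (k, d) matrix.
    """
    pos = np.log(sigmoid(context @ center))           # pull the true pair together
    neg = np.log(sigmoid(-negatives @ center)).sum()  # push negatives apart
    return -(pos + neg)  # minimize the negative log-likelihood
```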

24,012 citations


"SAEA: Self-Attentive Heterogeneous ..." refers methods in this paper

  • ...To deal with the difficulty of having too many output vectors that need to be updated every epoch, we use negative sampling [11] to update a sample of them....


Proceedings Article DOI
24 Aug 2014
TL;DR: DeepWalk, as mentioned in this paper, uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences; the learned representations encode social relations in a continuous vector space that is easily exploited by statistical models.
Abstract: We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk's latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk's representations can provide F1 scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk's representations are able to outperform all baseline methods while using 60% less training data. DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real world applications such as network classification, and anomaly detection.
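
The truncated walks DeepWalk feeds to the language model are simply uniform random walks of bounded length; a minimal sketch, assuming the graph is a plain adjacency dict (names illustrative):

```python
import random

def truncated_random_walk(adj, start, walk_length):
    """A uniform random walk of at most walk_length nodes; each walk is
    then treated like a sentence and fed to a skip-gram model."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:
            break  # dead end: truncate the walk early
        walk.append(random.choice(neighbors))
    return walk
```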

8,117 citations

Proceedings Article DOI
13 Aug 2016
TL;DR: Node2vec as mentioned in this paper learns a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes by using a biased random walk procedure.
Abstract: Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.
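
The biased walk can be sketched as a single transition rule governed by node2vec's return parameter p and in-out parameter q; the helper name and adjacency-dict representation below are assumptions.

```python
import random

def node2vec_step(adj, prev, curr, p=1.0, q=1.0):
    """One biased step: weight 1/p to return to the previous node, 1 to
    stay at distance one from it, 1/q to move further away. Tuning p and
    q interpolates between BFS-like and DFS-like exploration."""
    candidates = list(adj[curr])
    weights = []
    for nxt in candidates:
        if nxt == prev:
            weights.append(1.0 / p)      # return to where we came from
        elif prev is not None and nxt in adj[prev]:
            weights.append(1.0)          # triangle: still distance 1 from prev
        else:
            weights.append(1.0 / q)      # move outward
    return random.choices(candidates, weights=weights, k=1)[0]
```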

7,072 citations

Posted Content
TL;DR: A new, simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as demonstrated by applying it successfully to English constituency parsing with both large and limited training data.
Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
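
The core operation is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V; a NumPy rendering for 2-D inputs (function name and shapes illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d_k), K: (m, d_k), V: (m, d_v) -> (n, d_v)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values
```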

7,019 citations


"SAEA: Self-Attentive Heterogeneous ..." refers methods in this paper

  • ...Inspired by [20], we adopt layer normalization and the dropout strategy in the self-attention layer and feed-forward layer....


  • ...Furthermore, unlike RNN-based models, which assume that the next element in a relational path depends on the current input and hidden state, an assumption that is inappropriate for paths in KGs, we adapt the original residual connection in the Transformer to a special crossed residual module....


  • ...Therefore, stimulated by the Transformer [20], a purely self-attention based sequence model achieving state-of-the-art performance and efficiency, we seek to build a sequential alignment model based upon it....


  • ...Towards this end, inspired by the new sequential model Transformer [20], which has achieved better performance than traditional recurrent models in machine translation tasks but has not been explored in KGs for entity alignment, we propose a brand-new Self-Attentive heterogeneous sequence learning model for Entity Alignment (SAEA)....


Proceedings Article
05 Dec 2013
TL;DR: TransE is proposed, a method which models relationships by interpreting them as translations operating on the low-dimensional embeddings of the entities, which proves to be powerful since extensive experiments show that TransE significantly outperforms state-of-the-art methods in link prediction on two knowledge bases.
Abstract: We consider the problem of embedding entities and relationships of multi-relational data in low-dimensional vector spaces. Our objective is to propose a canonical model which is easy to train, contains a reduced number of parameters and can scale up to very large databases. Hence, we propose TransE, a method which models relationships by interpreting them as translations operating on the low-dimensional embeddings of the entities. Despite its simplicity, this assumption proves to be powerful since extensive experiments show that TransE significantly outperforms state-of-the-art methods in link prediction on two knowledge bases. Besides, it can be successfully trained on a large scale data set with 1M entities, 25k relationships and more than 17M training samples.
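
The translational assumption is compact enough to state directly: a true triple (h, r, t) should satisfy h + r ≈ t, so the energy below should be small for true triples and large for corrupted ones. A minimal sketch with illustrative names:

```python
import numpy as np

def transe_energy(h, r, t, p=1):
    """Dissimilarity of a triple: small when h + r ≈ t in embedding space."""
    return np.linalg.norm(h + r - t, ord=p)

def margin_ranking_loss(pos_energy, neg_energy, margin=1.0):
    """TransE-style training signal: true triples should score better
    than corrupted ones by at least `margin`."""
    return max(0.0, margin + pos_energy - neg_energy)
```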

5,109 citations


"SAEA: Self-Attentive Heterogeneous ..." refers background in this paper

  • ...TransE [3] is the most popular model in the KG embedding area; it requires each relation triple (h, r, t) to satisfy h + r ≈ t in vector space....
