Open Access Proceedings Article

Multi-modal Bayesian embeddings for learning social knowledge graphs

TL;DR
A multi-modal Bayesian embedding model, GenVector, is proposed to learn latent topics that generate word and network embeddings in a shared latent topic space; the method significantly decreases the error rate of learning social knowledge graphs in an online A/B test with live users.
Abstract
We study the extent to which online social networks can be connected to knowledge bases. The problem is referred to as learning social knowledge graphs. We propose a multi-modal Bayesian embedding model, GenVector, to learn latent topics that generate word embeddings and network embeddings simultaneously. GenVector leverages large-scale unlabeled data with embeddings and represents data of two modalities (i.e., social network users and knowledge concepts) in a shared latent topic space. Experiments on three datasets show that the proposed method clearly outperforms state-of-the-art methods. We then deploy the method on AMiner, an online academic search system, to connect a network of 38,049,189 researchers with a knowledge base of 35,415,011 concepts. Our method significantly decreases the error rate of learning social knowledge graphs in an online A/B test with live users.
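
To make the modeling idea concrete, the following is a minimal generative sketch in the spirit of the abstract: shared latent topics emit embeddings in two modalities, one for knowledge concepts and one for network users. The Gaussian emissions, the noise scale, and all sizes are illustrative assumptions, not the paper's exact specification.

# Illustrative multi-modal topic sketch (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)

K, D = 5, 16                # latent topics, embedding dimension (illustrative)
N_CONCEPTS, N_USERS = 8, 6  # items observed for one researcher "document"

# Assumption: each topic owns a Gaussian in word-embedding space and
# another in network-embedding space, tying the two modalities together.
word_means = rng.normal(size=(K, D))
net_means = rng.normal(size=(K, D))

theta = rng.dirichlet(np.ones(K))  # researcher-level topic proportions

def emit(means, n):
    # Draw a topic per item, then emit an embedding near that topic's mean.
    z = rng.choice(K, size=n, p=theta)
    return z, means[z] + rng.normal(scale=0.1, size=(n, D))

z_w, concept_embs = emit(word_means, N_CONCEPTS)  # knowledge-concept modality
z_u, user_embs = emit(net_means, N_USERS)         # social-network modality
print(theta.round(2), z_w, z_u)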

Citations
Journal Article (DOI)

Graph embedding techniques, applications, and performance: A survey

TL;DR: A comprehensive and structured analysis of various graph embedding techniques proposed in the literature, and the open-source Python library, named GEM (Graph Embedding Methods, available at https://github.com/palash1992/GEM), which provides all presented algorithms within a unified interface to foster and facilitate research on the topic.
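
As an illustration of the unified interface this summary mentions, here is a usage sketch following the pattern shown in the GEM README; the HOPE class and the learn_embedding signature are taken from that README and may differ across library versions, so treat the exact calls as assumptions to verify against the installed release.

# Hypothetical usage sketch based on the GEM README; verify signatures
# against your installed version of the library.
import networkx as nx
from gem.embedding.hope import HOPE

G = nx.karate_club_graph().to_directed()   # GEM examples use directed graphs

em = HOPE(d=4, beta=0.01)                  # d: embedding dimension
Y, t = em.learn_embedding(graph=G, edge_f=None, is_weighted=False,
                          no_python=True)
print(Y.shape)                             # one 4-dimensional vector per node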
Journal Article (DOI)

A Comprehensive Survey of Graph Embedding: Problems, Techniques, and Applications

TL;DR: A comprehensive review of the literature on graph embedding that introduces the formal definition of graph embedding and related concepts, and proposes two taxonomies organized around the challenges in different graph embedding problem settings and how existing work addresses them.
Proceedings Article (DOI)

Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec

TL;DR: A unified matrix factorization framework for skip-gram based network embedding that establishes theoretical connections between DeepWalk, LINE, PTE, node2vec, and the theory of the graph Laplacian; the resulting NetMF method offers significant improvements over DeepWalk and LINE on conventional network mining tasks.
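
The closed-form construction behind this unification is short enough to sketch: build the matrix that DeepWalk implicitly factorizes, take an element-wise truncated logarithm, and factorize it with SVD. The window size T, negative-sample count b, and embedding dimension below are illustrative choices, and the dense NumPy version is only meant for toy graphs.

# NetMF-style sketch: M = (vol(G)/(b*T)) * (sum_{r=1..T} P^r) * D^{-1},
# with P = D^{-1} A, then factorize log(max(M, 1)).
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
vol = A.sum()                          # volume of the graph
d_inv = 1.0 / A.sum(axis=1)            # inverse degrees
P = d_inv[:, None] * A                 # random-walk transition matrix

T, b, dim = 10, 1, 8                   # window, negatives, embedding size
S = np.zeros_like(A)
Pr = np.eye(len(A))
for _ in range(T):
    Pr = Pr @ P                        # r-step transition probabilities
    S += Pr
M = (vol / (b * T)) * S * d_inv[None, :]

logM = np.log(np.maximum(M, 1.0))      # element-wise truncated logarithm
U, s, _ = np.linalg.svd(logM)
emb = U[:, :dim] * np.sqrt(s[:dim])    # embeddings from a rank-dim SVD
print(emb.shape)                       # (34, 8) for the karate graph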
References
Journal Article (DOI)

Latent Dirichlet Allocation

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
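
The generative story this summary refers to is compact enough to sketch directly: per-document topic proportions drawn from a Dirichlet, a topic for each token, and a word from that topic's distribution. Vocabulary size and hyperparameters below are illustrative.

# Minimal LDA generative sketch (not inference); illustrative sizes.
import numpy as np

rng = np.random.default_rng(0)
K, V, doc_len = 3, 10, 12                      # topics, vocab size, tokens
alpha, eta = 0.5, 0.1                          # Dirichlet hyperparameters

phi = rng.dirichlet(np.full(V, eta), size=K)   # per-topic word distributions
theta = rng.dirichlet(np.full(K, alpha))       # one document's topic mixture
z = rng.choice(K, size=doc_len, p=theta)       # topic assignment per token
words = [rng.choice(V, p=phi[k]) for k in z]   # word drawn from its topic
print(z, words)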
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: Using the Skip-gram model to learn distributed vector representations that capture a large number of precise syntactic and semantic word relationships, this paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
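
The negative-sampling objective named above is easy to write down for a single (center, context) pair scored against a few sampled negatives; the random vectors below stand in for parameters that training would update by gradient descent.

# Skip-gram negative sampling, one training pair:
#   loss = -log sigma(u_ctx . v_c) - sum_neg log sigma(-u_neg . v_c)
import numpy as np

rng = np.random.default_rng(0)
dim, n_neg = 8, 5
v_center = rng.normal(size=dim)            # "input" vector of center word
u_context = rng.normal(size=dim)           # "output" vector of true context
u_negs = rng.normal(size=(n_neg, dim))     # vectors of sampled negatives

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

loss = -(np.log(sigmoid(u_context @ v_center))
         + np.log(sigmoid(-(u_negs @ v_center))).sum())
print(loss)                                # lower is better for this pair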
Proceedings Article (DOI)

DeepWalk: online learning of social representations

TL;DR: DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences; the learned representations encode social relations in a continuous vector space that is easily exploited by statistical models.
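
A sketch of DeepWalk's first stage as summarized above: truncated random walks turn a graph into node "sentences", which a skip-gram model (such as the negative-sampling sketch earlier) would then embed. Walk length and walks per node are illustrative defaults.

# Generate truncated random walks; feeding them to any skip-gram
# implementation yields DeepWalk-style node embeddings.
import random
import networkx as nx

G = nx.karate_club_graph()

def random_walk(g, start, length=10):
    walk = [start]
    while len(walk) < length:
        nbrs = list(g.neighbors(walk[-1]))
        if not nbrs:
            break                          # dead end: truncate the walk
        walk.append(random.choice(nbrs))
    return [str(n) for n in walk]          # node ids as "words"

random.seed(0)
corpus = [random_walk(G, n) for n in G.nodes() for _ in range(5)]
print(corpus[0])                           # one node "sentence"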