A Survey of Heterogeneous Information Network Analysis

doi:10.1109/TKDE.2016.2598561

Journal Article

A Survey of Heterogeneous Information Network Analysis

Chuan Shi¹, Yitong Li¹, Jiawei Zhang², Yizhou Sun³, Philip S. Yu²

Beijing University of Posts and Telecommunications¹, University of Illinois at Chicago², University of California, Los Angeles³

01 Jan 2017-IEEE Transactions on Knowledge and Data Engineering (IEEE)-Vol. 29, Iss: 1, pp 17-37

TL;DR: This article surveys heterogeneous information network (HIN) analysis: the authors introduce its basic concepts, examine its developments on different data mining tasks, discuss some advanced topics, and point out some future research directions.

Abstract: Most real systems consist of a large number of interacting, multi-typed components, while most contemporary researches model them as homogeneous information networks, without distinguishing different types of objects and links in the networks. Recently, more and more researchers begin to consider these interconnected, multi-typed data as heterogeneous information networks, and develop structural analysis approaches by leveraging the rich semantic meaning of structural types of objects and links in the networks. Compared to widely studied homogeneous information network, the heterogeneous information network contains richer structure and semantic information, which provides plenty of opportunities as well as a lot of challenges for data mining. In this paper, we provide a survey of heterogeneous information network analysis. We will introduce basic concepts of heterogeneous information network analysis, examine its developments on different data mining tasks, discuss some advanced topics, and point out some future research directions.

Citations

Proceedings Article

Heterogeneous Graph Attention Network

Xiao Wang¹, Houye Ji¹, Chuan Shi¹, Bai Wang¹, Yanfang Ye², Peng Cui³, Philip S. Yu⁴

Beijing University of Posts and Telecommunications¹, West Virginia University², Tsinghua University³, University of Illinois at Chicago⁴

13 May 2019

TL;DR: Wang et al. propose a heterogeneous graph neural network based on hierarchical attention, including node-level and semantic-level attention, which generates node embeddings by aggregating features from meta-path based neighbors in a hierarchical manner.

Abstract: Graph neural network, as a powerful graph representation technique based on deep learning, has shown superior performance and attracted considerable research interest. However, it has not been fully considered in graph neural network for heterogeneous graph which contains different types of nodes and links. The heterogeneity and rich semantic information bring great challenges for designing a graph neural network for heterogeneous graph. Recently, one of the most exciting advancements in deep learning is the attention mechanism, whose great potential has been well demonstrated in various areas. In this paper, we first propose a novel heterogeneous graph neural network based on the hierarchical attention, including node-level and semantic-level attentions. Specifically, the node-level attention aims to learn the importance between a node and its meta-path based neighbors, while the semantic-level attention is able to learn the importance of different meta-paths. With the learned importance from both node-level and semantic-level attention, the importance of node and meta-path can be fully considered. Then the proposed model can generate node embedding by aggregating features from meta-path based neighbors in a hierarchical manner. Extensive experimental results on three real-world heterogeneous graphs not only show the superior performance of our proposed model over the state-of-the-arts, but also demonstrate its potentially good interpretability for graph analysis.
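
The two attention levels described in this abstract can be illustrated with a small sketch. It is not the authors' model: the dot-product scoring, the query vectors, and the names `attend` and `hierarchical_embedding` are assumptions made for illustration, and the learned projections of a real network are left out.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(vectors, query):
    """Score each vector against `query` by dot product (an assumed scoring
    function), then return the softmax weights and the weighted sum."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return weights, pooled

def hierarchical_embedding(neighbors_by_metapath, node_query, semantic_query):
    """Node-level attention per meta-path, then semantic-level attention
    across the resulting per-meta-path embeddings."""
    per_path = {}
    for name, neighbor_features in neighbors_by_metapath.items():
        _, per_path[name] = attend(neighbor_features, node_query)
    names = list(per_path)
    weights, embedding = attend([per_path[n] for n in names], semantic_query)
    return dict(zip(names, weights)), embedding

# Hypothetical neighbor features of one node under two meta-paths.
neighbors = {"APA": [[1.0, 0.0], [0.0, 1.0]], "APCPA": [[1.0, 1.0]]}
path_weights, embedding = hierarchical_embedding(neighbors, [0.0, 0.0], [1.0, 1.0])
```

The returned `path_weights` play the role of the learned meta-path importance, which is the part of the model the abstract associates with interpretability.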

1,467 citations

Journal Article

Heterogeneous Information Network Embedding for Recommendation

Chuan Shi¹, Binbin Hu¹, Wayne Xin Zhao², Philip S. Yu³

Beijing University of Posts and Telecommunications¹, Renmin University of China², University of Illinois at Chicago³

01 Feb 2019-IEEE Transactions on Knowledge and Data Engineering

TL;DR: The authors propose HERec, a novel heterogeneous network embedding based approach for HIN based recommendation, show its capability for the cold-start problem, and find that the transformed embedding information from HINs can improve recommendation performance.

Abstract: Due to the flexibility in modelling data heterogeneity, heterogeneous information network (HIN) has been adopted to characterize complex and heterogeneous auxiliary data in recommender systems, called HIN based recommendation. It is challenging to develop effective methods for HIN based recommendation in both extraction and exploitation of the information from HINs. Most of HIN based recommendation methods rely on path based similarity, which cannot fully mine latent structure features of users and items. In this paper, we propose a novel heterogeneous network embedding based approach for HIN based recommendation, called HERec. To embed HINs, we design a meta-path based random walk strategy to generate meaningful node sequences for network embedding. The learned node embeddings are first transformed by a set of fusion functions, and subsequently integrated into an extended matrix factorization (MF) model. The extended MF model together with fusion functions are jointly optimized for the rating prediction task. Extensive experiments on three real-world datasets demonstrate the effectiveness of the HERec model. Moreover, we show the capability of the HERec model for the cold-start problem, and reveal that the transformed embedding information from HINs can improve the recommendation performance.
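
The meta-path based random walk mentioned in this abstract can be sketched as follows. This is a generic illustration rather than the HERec implementation: the function name `metapath_walk`, the dictionary-based graph, and the toy user/movie data are all hypothetical.

```python
import random

def metapath_walk(adj, node_type, start, metapath, length, rng):
    """Random walk from `start` whose node types follow `metapath` cyclically.

    adj: dict mapping node -> list of neighbor nodes.
    node_type: dict mapping node -> type label.
    metapath: type sequence with matching first and last types, e.g. ["U", "M", "U"].
    """
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the head of the meta-path")
    pattern = metapath[:-1]  # drop the repeated tail so the pattern can cycle
    walk = [start]
    while len(walk) < length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:  # dead end: no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk

# Toy user-movie graph (illustrative data only).
adj = {"u1": ["m1", "m2"], "u2": ["m1"], "m1": ["u1", "u2"], "m2": ["u1"]}
node_type = {"u1": "U", "u2": "U", "m1": "M", "m2": "M"}
walk = metapath_walk(adj, node_type, "u1", ["U", "M", "U"], 7, random.Random(0))
```

Sequences produced this way can then be fed to a skip-gram style embedding learner; that step is outside the scope of this sketch.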

768 citations

Proceedings Article

Leveraging Meta-path based Context for Top-N Recommendation with A Neural Co-Attention Model

Binbin Hu¹, Chuan Shi¹, Wayne Xin Zhao², Philip S. Yu³

Beijing University of Posts and Telecommunications¹, Renmin University of China², University of Illinois at Chicago³

19 Jul 2018

TL;DR: The authors develop a novel deep neural network with a co-attention mechanism that leverages rich meta-path based context for top-N recommendation; the model performs well in the cold-start scenario and has potentially good interpretability for the recommendation results.

Abstract: Heterogeneous information network (HIN) has been widely adopted in recommender systems due to its excellence in modeling complex context information. Although existing HIN based recommendation methods have achieved performance improvement to some extent, they have two major shortcomings. First, these models seldom learn an explicit representation for path or meta-path in the recommendation task. Second, they do not consider the mutual effect between the meta-path and the involved user-item pair in an interaction. To address these issues, we develop a novel deep neural network with the co-attention mechanism for leveraging rich meta-path based context for top-N recommendation. We elaborately design a three-way neural interaction model by explicitly incorporating meta-path based context. To construct the meta-path based context, we propose to use a priority based sampling technique to select high-quality path instances. Our model is able to learn effective representations for users, items and meta-path based context for implementing a powerful interaction function. The co-attention mechanism improves the representations for meta-path based context, users and items in a mutual enhancement way. Extensive experiments on three real-world datasets have demonstrated the effectiveness of the proposed model. In particular, the proposed model performs well in the cold-start scenario and has potentially good interpretability for the recommendation results.

482 citations

Posted Content

Graph Transformer Networks

Seongjun Yun¹, Minbyul Jeong¹, Raehyun Kim¹, Jaewoo Kang, Hyunwoo Kim¹

Korea University¹

06 Nov 2019-arXiv: Learning

TL;DR: This paper proposes Graph Transformer Networks (GTNs) that are capable of generating new graph structures, which involve identifying useful connections between unconnected nodes on the original graph, while learning effective node representation on the new graphs in an end-to-end fashion.

Abstract: Graph neural networks (GNNs) have been widely used in representation learning on graphs and achieved state-of-the-art performance in tasks such as node classification and link prediction. However, most existing GNNs are designed to learn node representations on the fixed and homogeneous graphs. The limitations especially become problematic when learning representations on a misspecified graph or a heterogeneous graph that consists of various types of nodes and edges. In this paper, we propose Graph Transformer Networks (GTNs) that are capable of generating new graph structures, which involve identifying useful connections between unconnected nodes on the original graph, while learning effective node representation on the new graphs in an end-to-end fashion. Graph Transformer layer, a core layer of GTNs, learns a soft selection of edge types and composite relations for generating useful multi-hop connections so-called meta-paths. Our experiments show that GTNs learn new graph structures, based on data and tasks without domain knowledge, and yield powerful node representation via convolution on the new graphs. Without domain-specific graph preprocessing, GTNs achieved the best performance in all three benchmark node classification tasks against the state-of-the-art methods that require pre-defined meta-paths from domain knowledge.
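
The "soft selection of edge types and composite relations" in this abstract can be pictured as a convex combination of per-edge-type adjacency matrices followed by a matrix product. The sketch below assumes that reading; the function names, the fixed logits (which would be learned in a real model), and the three-node author/paper/conference graph are illustrative assumptions, not the GTN code.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def soft_select(adjs, logits):
    """Convex combination of adjacency matrices, one matrix per edge type."""
    w = softmax(logits)
    n = len(adjs[0])
    return [[sum(wk * a[i][j] for wk, a in zip(w, adjs)) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def two_hop_metapath(adjs, logits_first, logits_second):
    """Adjacency of a soft two-hop composite relation (a length-2 meta-path)."""
    return matmul(soft_select(adjs, logits_first), soft_select(adjs, logits_second))

# Nodes: 0 = author, 1 = paper, 2 = conference (toy data).
writes = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]        # author -> paper
published_in = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]  # paper -> conference
composite = two_hop_metapath([writes, published_in], [20.0, -20.0], [-20.0, 20.0])
# composite[0][2] is close to 1: the author reaches the conference via the paper.
```

With uniform logits the same entry drops to 0.25, since each hop then spreads its weight evenly over both edge types.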

255 citations

Proceedings Article

Metapath-guided Heterogeneous Graph Neural Network for Intent Recommendation

Shaohua Fan¹, Junxiong Zhu², Xiaotian Han¹, Chuan Shi¹, Linmei Hu¹, Biyu Ma², Li Yongliang²

Beijing University of Posts and Telecommunications¹, Alibaba Group²

25 Jul 2019

TL;DR: The authors model intent recommendation as a Heterogeneous Information Network and propose a metapath-guided heterogeneous Graph Neural Network to learn the embeddings of its objects; offline experiments on real large-scale data show the superior performance of the proposed MEIRec compared to representative methods.

Abstract: With the prevalence of mobile e-commerce nowadays, a new type of recommendation services, called intent recommendation, is widely used in many mobile e-commerce Apps, such as Taobao and Amazon. Different from traditional query recommendation and item recommendation, intent recommendation is to automatically recommend user intent according to user historical behaviors without any input when users open the App. Intent recommendation becomes very popular in the past two years, because of revealing user latent intents and avoiding tedious input in mobile phones. Existing methods used in industry usually need laboring feature engineering. Moreover, they only utilize attribute and statistic information of users and queries, and fail to take full advantage of rich interaction information in intent recommendation, which may result in limited performances. In this paper, we propose to model the complex objects and rich interactions in intent recommendation as a Heterogeneous Information Network. Furthermore, we present a novel Metapath-guided Embedding method for Intent Recommendation (called MEIRec). In order to fully utilize rich structural information, we design a metapath-guided heterogeneous Graph Neural Network to learn the embeddings of objects in intent recommendation. In addition, in order to alleviate huge learning parameters in embeddings, we propose a uniform term embedding mechanism, in which embeddings of objects are made up with the same term embedding space. Offline experiments on real large-scale data show the superior performance of the proposed MEIRec, compared to representative methods. Moreover, the results of online experiments on Taobao e-commerce platform show that MEIRec not only gains a performance improvement of 1.54% on CTR metric, but also attracts up to 2.66% of new users to search queries.

235 citations

References

Journal Article

The Structure and Function of Complex Networks

Mark Newman

01 Jan 2003-Siam Review

TL;DR: Developments in this field are reviewed, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.

Abstract: Inspired by empirical studies of networked systems such as the Internet, social networks, and biological networks, researchers have in recent years developed a variety of techniques and models to help us understand or predict the behavior of these systems. Here we review developments in this field, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.

17,647 citations

Book

Social Network Analysis: Methods and Applications

Stanley Wasserman, Katherine Faust¹

University of South Carolina¹

25 Nov 1994

TL;DR: This book covers the mathematical representation of social networks in the social and behavioral sciences, structural and locational properties such as centrality and cohesive subgroups, roles and positions, and dyadic, triadic, and statistical dyadic interaction models.

Abstract: Part I. Introduction: Networks, Relations, and Structure: 1. Relations and networks in the social and behavioral sciences 2. Social network data: collection and application Part II. Mathematical Representations of Social Networks: 3. Notation 4. Graphs and matrixes Part III. Structural and Locational Properties: 5. Centrality, prestige, and related actor and group measures 6. Structural balance, clusterability, and transitivity 7. Cohesive subgroups 8. Affiliations, co-memberships, and overlapping subgroups Part IV. Roles and Positions: 9. Structural equivalence 10. Blockmodels 11. Relational algebras 12. Network positions and roles Part V. Dyadic and Triadic Methods: 13. Dyads 14. Triads Part VI. Statistical Dyadic Interaction Models: 15. Statistical analysis of single relational networks 16. Stochastic blockmodels and goodness-of-fit indices Part VII. Epilogue: 17. Future directions.

17,104 citations

Proceedings Article

The PageRank Citation Ranking: Bringing Order to the Web

Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd

11 Nov 1999

TL;DR: This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.

Abstract: The importance of a Web page is an inherently subjective matter, which depends on the reader's interests, knowledge and attitudes. But there is still much that can be said objectively about the relative importance of Web pages. This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them. We compare PageRank to an idealized random Web surfer. We show how to efficiently compute PageRank for large numbers of pages. And, we show how to apply PageRank to search and to user navigation.
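
The random-surfer view in this abstract can be illustrated with a minimal power-iteration sketch. The damping factor of 0.85, the fixed iteration count, and the choice to spread the rank of pages without out-links uniformly over all pages are assumptions for illustration; none of them is stated in the abstract.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration over a link graph given as {page: [linked pages]}.

    With probability `damping` the surfer follows a random out-link;
    otherwise the surfer jumps to a page chosen uniformly at random.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is redistributed uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)
        rank = new_rank
    return rank

# Toy web: pages "a" and "b" both link to "c", and "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph "c" ends up with the highest score, "a" second, and "b", which nobody links to, keeps only the random-jump share.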

14,400 citations

Journal Article

Normalized cuts and image segmentation

Jianbo Shi¹, Jitendra Malik²

Carnegie Mellon University¹, University of California, Berkeley²

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.

Abstract: We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.
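
The normalized cut criterion named in this abstract can be evaluated directly for a given two-way partition. The sketch below only computes the criterion, Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V); it does not perform the eigenvalue-based optimization the abstract mentions. The function name and the toy weight matrix are illustrative assumptions.

```python
def normalized_cut(weights, group_a):
    """Normalized cut value of splitting a weighted graph into A and its complement B.

    weights: symmetric matrix (list of lists) of non-negative edge weights.
    group_a: iterable of node indices forming group A.
    Assumes every group has at least one edge, so neither assoc term is zero.
    """
    n = len(weights)
    a = set(group_a)
    b = set(range(n)) - a
    if not a or not b:
        raise ValueError("both groups must be non-empty")
    # cut(A, B): total weight of edges crossing between the groups.
    cut = sum(weights[i][j] for i in a for j in b)
    # assoc(X, V): total weight from nodes in X to all nodes in the graph.
    assoc_a = sum(weights[i][j] for i in a for j in range(n))
    assoc_b = sum(weights[i][j] for i in b for j in range(n))
    return cut / assoc_a + cut / assoc_b

# Two tight pairs (0-1 and 2-3) joined by one weak edge (1-2).
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
balanced = normalized_cut(W, {0, 1})  # cuts only the weak edge
lopsided = normalized_cut(W, {0})     # isolates a single node
```

The balanced split scores far lower than the lopsided one, showing how the normalization discourages cutting off small isolated groups.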

13,789 citations

Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty¹, Andrew McCallum, Fernando Pereira

Carnegie Mellon University¹

28 Jun 2001

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

Abstract: We present conditional random fields, a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid a fundamental limitation of maximum entropy Markov models (MEMMs) and other discriminative Markov models based on directed graphical models, which can be biased towards states with few successor states. We present iterative parameter estimation algorithms for conditional random fields and compare the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

13,190 citations