Home
/
Authors
/
Qiaozhu Mei

Author

Qiaozhu Mei

Other affiliations: Vanderbilt University, University of Illinois at Urbana–Champaign, Peking University

Bio: Qiaozhu Mei is an academic researcher from University of Michigan. The author has contributed to research in topics: Topic model & Graph (abstract data type). The author has an hindex of 50, co-authored 186 publications receiving 15756 citations. Previous affiliations of Qiaozhu Mei include Vanderbilt University & University of Illinois at Urbana–Champaign.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2002

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

LINE: Large-scale Information Network Embedding

[...]

Jian Tang¹, Meng Qu², Mingzhe Wang², Ming Zhang², Jun Yan¹, Qiaozhu Mei³ - Show less +2 more•Institutions (3)

Microsoft¹, Peking University², University of Michigan³

18 May 2015

TL;DR: A novel network embedding method called the ``LINE,'' which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted, and optimizes a carefully designed objective function that preserves both the local and global network structures.

...read moreread less

Abstract: This paper studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale for real world information networks which usually contain millions of nodes. In this paper, we propose a novel network embedding method called the ``LINE,'' which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. An edge-sampling algorithm is proposed that addresses the limitation of the classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. Empirical experiments prove the effectiveness of the LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is very efficient, which is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine. The source code of the LINE is available online\footnote{\url{https://github.com/tangjianpku/LINE}}.

...read moreread less

3,492 citations

Proceedings Article•DOI•

LINE: Large-scale Information Network Embedding

[...]

Jian Tang¹, Meng Qu², Mingzhe Wang², Ming Zhang², Jun Yan¹, Qiaozhu Mei³ - Show less +2 more•Institutions (3)

Microsoft¹, Peking University², University of Michigan³

12 Mar 2015-arXiv: Learning

TL;DR: LINE as discussed by the authors proposes a network embedding method called LINE, which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted, and optimizes a carefully designed objective function that preserves both the local and global network structures.

...read moreread less

Abstract: This paper studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale for real world information networks which usually contain millions of nodes. In this paper, we propose a novel network embedding method called the "LINE," which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. An edge-sampling algorithm is proposed that addresses the limitation of the classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. Empirical experiments prove the effectiveness of the LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is very efficient, which is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine. The source code of the LINE is available online.

...read moreread less

3,447 citations

Proceedings Article•DOI•

Topic sentiment mixture: modeling facets and opinions in weblogs

[...]

Qiaozhu Mei¹, Xu Ling¹, Matthew Wondra¹, Hang Su², ChengXiang Zhai¹ - Show less +1 more•Institutions (2)

University of Illinois at Urbana–Champaign¹, Vanderbilt University²

08 May 2007

TL;DR: The proposed Topic-Sentiment Mixture (TSM) model can reveal the latent topical facets in a Weblog collection, the subtopics in the results of an ad hoc query, and their associated sentiments and could also provide general sentiment models that are applicable to any ad hoc topics.

...read moreread less

Abstract: In this paper, we define the problem of topic-sentiment analysis on Weblogs and propose a novel probabilistic model to capture the mixture of topics and sentiments simultaneously. The proposed Topic-Sentiment Mixture (TSM) model can reveal the latent topical facets in a Weblog collection, the subtopics in the results of an ad hoc query, and their associated sentiments. It could also provide general sentiment models that are applicable to any ad hoc topics. With a specifically designed HMM structure, the sentiment models and topic models estimated with TSM can be utilized to extract topic life cycles and sentiment dynamics. Empirical experiments on different Weblog datasets show that this approach is effective for modeling the topic facets and sentiments and extracting their dynamics from Weblog collections. The TSM model is quite general; it can be applied to any text collections with a mixture of topics and sentiments, thus has many potential applications, such as search result summarization, opinion tracking, and user behavior prediction.

...read moreread less

872 citations

Proceedings Article•

Rumor has it: Identifying Misinformation in Microblogs

[...]

Vahed Qazvinian¹, Emily Rosengren¹, Dragomir R. Radev¹, Qiaozhu Mei¹•Institutions (1)

University of Michigan¹

27 Jul 2011

TL;DR: This paper addresses the problem of rumor detection in microblogs and explores the effectiveness of 3 categories of features: content- based, network-based, and microblog-specific memes for correctly identifying rumors, and believes that its dataset is the first large-scale dataset on rumor detection.

...read moreread less

Abstract: A rumor is commonly defined as a statement whose true value is unverifiable. Rumors may spread misinformation (false information) or disinformation (deliberately false information) on a network of people. Identifying rumors is crucial in online social media where large amounts of information are easily spread across a large network by sources with unverified authority. In this paper, we address the problem of rumor detection in microblogs and explore the effectiveness of 3 categories of features: content-based, network-based, and microblog-specific memes for correctly identifying rumors. Moreover, we show how these features are also effective in identifying disinformers, users who endorse a rumor and further help it to spread. We perform our experiments on more than 10,000 manually annotated tweets collected from Twitter and show how our retrieval model achieves more than 0.95 in Mean Average Precision (MAP). Finally, we believe that our dataset is the first large-scale dataset on rumor detection. It can open new dimensions in analyzing online misinformation and other aspects of microblog conversations.

...read moreread less

792 citations

Proceedings Article•DOI•

PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks

[...]

Jian Tang¹, Meng Qu², Qiaozhu Mei³•Institutions (3)

Microsoft¹, Peking University², University of Michigan³

10 Aug 2015

TL;DR: A semi-supervised representation learning method for text data, which is called the predictive text embedding (PTE), which is comparable or more effective, much more efficient, and has fewer parameters to tune.

...read moreread less

Abstract: Unsupervised text embedding methods, such as Skip-gram and Paragraph Vector, have been attracting increasing attention due to their simplicity, scalability, and effectiveness. However, comparing to sophisticated deep learning architectures such as convolutional neural networks, these methods usually yield inferior results when applied to particular machine learning tasks. One possible reason is that these text embedding methods learn the representation of text in a fully unsupervised way, without leveraging the labeled information available for the task. Although the low dimensional representations learned are applicable to many different tasks, they are not particularly tuned for any task. In this paper, we fill this gap by proposing a semi-supervised representation learning method for text data, which we call the predictive text embedding (PTE). Predictive text embedding utilizes both labeled and unlabeled data to learn the embedding of text. The labeled information and different levels of word co-occurrence information are first represented as a large-scale heterogeneous text network, which is then embedded into a low dimensional space through a principled and efficient algorithm. This low dimensional embedding not only preserves the semantic closeness of words and documents, but also has a strong predictive power for the particular task. Compared to recent supervised approaches based on convolutional neural networks, predictive text embedding is comparable or more effective, much more efficient, and has fewer parameters to tune.

...read moreread less

703 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40

Collapse

Cited by

PDF

Open Access

More filters

Posted Content•

Semi-Supervised Classification with Graph Convolutional Networks

[...]

Thomas Kipf¹, Max Welling¹•Institutions (1)

University of Amsterdam¹

09 Sep 2016-arXiv: Learning

TL;DR: A scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs which outperforms related methods by a significant margin.

...read moreread less

Abstract: We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden layer representations that encode both local graph structure and features of nodes. In a number of experiments on citation networks and on a knowledge graph dataset we demonstrate that our approach outperforms related methods by a significant margin.

...read moreread less

15,696 citations

Posted Content•

Inductive Representation Learning on Large Graphs

[...]

William L. Hamilton, Rex Ying, Jure Leskovec

07 Jun 2017-arXiv: Social and Information Networks

TL;DR: GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.

...read moreread less

Abstract: Low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. However, most existing approaches require that all nodes in the graph are present during training of the embeddings; these previous approaches are inherently transductive and do not naturally generalize to unseen nodes. Here we present GraphSAGE, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. Instead of training individual embeddings for each node, we learn a function that generates embeddings by sampling and aggregating features from a node's local neighborhood. Our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and Reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.

...read moreread less

7,926 citations

Book•

Opinion Mining and Sentiment Analysis

[...]

Bo Pang¹, Lillian Lee²•Institutions (2)

Yahoo!¹, Cornell University²

08 Jul 2008

TL;DR: This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.

...read moreread less

Abstract: An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

...read moreread less

7,452 citations

Proceedings Article•DOI•

node2vec: Scalable Feature Learning for Networks

[...]

Aditya Grover¹, Jure Leskovec¹•Institutions (1)

Stanford University¹

13 Aug 2016

TL;DR: Node2vec as mentioned in this paper learns a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes by using a biased random walk procedure.

...read moreread less

Abstract: Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.

...read moreread less

7,072 citations

Journal Article•

Marketing for management.

[...]

Kent D. Seltman

01 Jan 2004-Marketing health services

TL;DR: The continuing convergence of the digital marketing and sales funnels has created a strategic continuum from digital lead generation to digital sales, which identifies the current composition of this digital continuum while providing opportunities to evaluate sales and marketing digital strategies.

...read moreread less

Abstract: MKT 6009 Marketing Internship (0 semester credit hours) Student gains experience and improves skills through appropriate developmental work assignments in a real business environment. Student must identify and submit specific business learning objectives at the beginning of the semester. The student must demonstrate exposure to the managerial perspective via involvement or observation. At semester end, student prepares an oral or poster presentation, or a written paper reflecting on the work experience. Student performance is evaluated by the work supervisor. Pass/Fail only. Prerequisites: (MAS 6102 or MBA major) and department consent required. (0-0) S MKT 6244 Digital Marketing Strategy (2 semester credit hours) Executive Education Course. The course explores three distinct areas within marketing and sales namely, digital marketing, traditional sales prospecting, and executive sales organization and strategy. The continuing convergence of the digital marketing and sales funnels has created a strategic continuum from digital lead generation to digital sales. The course identifies the current composition of this digital continuum while providing opportunities to evaluate sales and marketing digital strategies. Prerequisites: MKT 6301 and instructor consent required. (2-0) Y MKT 6301 (SYSM 6318) Marketing Management (3 semester credit hours) Overview of marketing management methods, principles and concepts including product, pricing, promotion and distribution decisions as well as segmentation, targeting and positioning. (3-0) S MKT 6309 Marketing Data Analysis and Research (3 semester credit hours) Methods employed in market research and data analysis to understand consumer behavior, customer journeys, and markets so as to enable better decision-making. Topics include understanding different sources of data, survey design, experiments, and sampling plans. The course will cover the techniques used for market sizing estimation and forecasting. In addition, the course will cover the foundational concepts and techniques used in data visualization and \"story-telling\" for clients and management. Corequisites: MKT 6301 and OPRE 6301. (3-0) Y MKT 6310 Consumer Behavior (3 semester credit hours) An exposition of the theoretical perspectives of consumer behavior along with practical marketing implication. Study of psychological, sociological and behavioral findings and frameworks with reference to consumer decision-making. Topics will include the consumer decision-making model, individual determinants of consumer behavior and environmental influences on consumer behavior and their impact on marketing. Prerequisite: MKT 6301. (3-0) Y MKT 6321 Interactive and Digital Marketing (3 semester credit hours) Introduction to the theory and practice of interactive and digital marketing. Topics covered include: online-market research, consumer behavior, conversion metrics, and segmentation considerations; ecommerce, search and display advertising, audiences, search engine marketing, email, mobile, video, social networks, and the Internet of Things. (3-0) T MKT 6322 Internet Business Models (3 semester credit hours) Topics to be covered are: consumer behavior on the Internet, advertising on the Internet, competitive strategies, market research using the Internet, brand management, managing distribution and supply chains, pricing strategies, electronic payment systems, and developing virtual organizations. Further, students learn auction theory, web content design, and clickstream analysis. Prerequisite: MKT 6301. (3-0) Y MKT 6323 Database Marketing (3 semester credit hours) Techniques to analyze, interpret, and utilize marketing databases of customers to identify a firm's best customers, understanding their needs, and targeting communications and promotions to retain such customers. Topics

...read moreread less

5,537 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse