Home
/
Authors
/
Tianbao Yang

Author

Tianbao Yang

Other affiliations: General Electric, Princeton University, Michigan State University

Bio: Tianbao Yang is an academic researcher from University of Iowa. The author has contributed to research in topics: Convex optimization & Convex function. The author has an hindex of 38, co-authored 247 publications receiving 5848 citations. Previous affiliations of Tianbao Yang include General Electric & Princeton University.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Combining link and content for community detection: a discriminative approach

[...]

Tianbao Yang¹, Rong Jin¹, Yun Chi, Shenghuo Zhu•Institutions (1)

Michigan State University¹

28 Jun 2009

TL;DR: A discriminative model for combining the link and content analysis for community detection from networked data, such as paper citation networks and Word Wide Web is proposed and introduced and hidden variables are introduced to explicitly model the popularity of nodes.

...read moreread less

Abstract: In this paper, we consider the problem of combining link and content analysis for community detection from networked data, such as paper citation networks and Word Wide Web. Most existing approaches combine link and content information by a generative model that generates both links and contents via a shared set of community memberships. These generative models have some shortcomings in that they failed to consider additional factors that could affect the community memberships and isolate the contents that are irrelevant to community memberships. To explicitly address these shortcomings, we propose a discriminative model for combining the link and content analysis for community detection. First, we propose a conditional model for link analysis and in the model, we introduce hidden variables to explicitly model the popularity of nodes. Second, to alleviate the impact of irrelevant content attributes, we develop a discriminative model for content analysis. These two models are unified seamlessly via the community memberships. We present efficient algorithms to solve the related optimization problems based on bound optimization and alternating projection. Extensive experiments with benchmark data sets show that the proposed framework significantly outperforms the state-of-the-art approaches for combining link and content analysis for community detection.

...read moreread less

394 citations

Proceedings Article•

Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison

[...]

Tianbao Yang¹, Yu-Feng Li², Mehrdad Mahdavi³, Rong Jin³, Zhi-Hua Zhou² - Show less +1 more•Institutions (3)

General Electric¹, Michigan State University², Nanjing University³

03 Dec 2012

TL;DR: It is shown that when there is a large gap in the eigen-spectrum of the kernel matrix, approaches based on the Nystrom method can yield impressively better generalization error bound than random Fourier features based approach.

...read moreread less

Abstract: Both random Fourier features and the Nystrom method have been successfully applied to efficient kernel learning. In this work, we investigate the fundamental difference between these two approaches, and how the difference could affect their generalization performances. Unlike approaches based on random Fourier features where the basis functions (i.e., cosine and sine functions) are sampled from a distribution independent from the training data, basis functions used by the Nystrom method are randomly sampled from the training examples and are therefore data dependent. By exploring this difference, we show that when there is a large gap in the eigen-spectrum of the kernel matrix, approaches based on the Nystrom method can yield impressively better generalization error bound than random Fourier features based approach. We empirically verify our theoretical findings on a wide range of large data sets.

...read moreread less

328 citations

Journal Article•DOI•

Detecting communities and their evolutions in dynamic social networks--a Bayesian approach

[...]

Tianbao Yang¹, Yun Chi, Shenghuo Zhu, Yihong Gong, Rong Jin¹ - Show less +1 more•Institutions (1)

Michigan State University¹

01 Feb 2011-Machine Learning

TL;DR: This paper proposes a dynamic stochastic block model for finding communities and their evolution in a dynamic social network that captures the evolution of communities by explicitly modeling the transition of community memberships for individual nodes in the network.

...read moreread less

Abstract: Although a large body of work is devoted to finding communities in static social networks, only a few studies examined the dynamics of communities in evolving social networks. In this paper, we propose a dynamic stochastic block model for finding communities and their evolution in a dynamic social network. The proposed model captures the evolution of communities by explicitly modeling the transition of community memberships for individual nodes in the network. Unlike many existing approaches for modeling social networks that estimate parameters by their most likely values (i.e., point estimation), in this study, we employ a Bayesian treatment for parameter estimation that computes the posterior distributions for all the unknown parameters. This Bayesian treatment allows us to capture the uncertainty in parameter values and therefore is more robust to data noise than point estimation. In addition, an efficient algorithm is developed for Bayesian inference to handle large sparse social networks. Extensive experimental studies based on both synthetic data and real-life data demonstrate that our model achieves higher accuracy and reveals more insights in the data than several state-of-the-art algorithms.

...read moreread less

316 citations

Proceedings Article•DOI•

Hetero-ConvLSTM: A Deep Learning Approach to Traffic Accident Prediction on Heterogeneous Spatio-Temporal Data

[...]

Zhuoning Yuan¹, Xun Zhou¹, Tianbao Yang¹•Institutions (1)

University of Iowa¹

19 Jul 2018

TL;DR: A Hetero-ConvL STM framework is proposed, where a few novel ideas are implemented on top of the basic ConvLSTM model, such as incorporating spatial graph features and spatial model ensemble, which makes reasonably accurate predictions and significantly improves the prediction accuracy over baseline approaches.

...read moreread less

Abstract: Predicting traffic accidents is a crucial problem to improving transportation and public safety as well as safe routing. The problem is also challenging due to the rareness of accidents in space and time and spatial heterogeneity of the environment (e.g., urban vs. rural). Most previous research on traffic accident prediction conducted by domain researchers simply applied classical prediction models on limited data without addressing the above challenges properly, thus leading to unsatisfactory performance. A small number of recent works have attempted to use deep learning for traffic accident prediction. However, they either ignore time information or use only data from a small and homogeneous study area (a city), without handling spatial heterogeneity and temporal auto-correlation properly at the same time. In this paper we perform a comprehensive study on the traffic accident prediction problem using the Convolutional Long Short-Term Memory (ConvLSTM) neural network model. A number of detailed features such as weather, environment, road condition, and traffic volume are extracted from big datasets over the state of Iowa across 8 years. To address the spatial heterogeneity challenge in the data, we propose a Hetero-ConvLSTM framework, where a few novel ideas are implemented on top of the basic ConvLSTM model, such as incorporating spatial graph features and spatial model ensemble. Extensive experiments on the 8-year data over the entire state of Iowa show that the proposed framework makes reasonably accurate predictions and significantly improves the prediction accuracy over baseline approaches.

...read moreread less

245 citations

Proceedings Article•

Online Optimization with Gradual Variations

[...]

Chao-Kai Chiang¹, Chao-Kai Chiang², Tianbao Yang³, Chia-Jung Lee², Mehrdad Mahdavi, Chi-Jen Lu², Rong Jin³, Shenghuo Zhu - Show less +4 more•Institutions (3)

National Taiwan University¹, Academia Sinica², Michigan State University³

01 Jan 2012

TL;DR: It is shown that for the linear and general smooth convex loss functions, an online algorithm modified from the gradient descend algorithm can achieve a regret which only scales as the square root of the deviation, and as an application, this can also have such a logarithmic regret for the portfolio management problem.

...read moreread less

Abstract: We study the online convex optimization problem, in which an online algorithm has to make repeated decisions with convex loss functions and hopes to achieve a small regret. We consider a natural restriction of this problem in which the loss functions have a small deviation, measured by the sum of the distances between every two consecutive loss functions, according to some distance metrics. We show that for the linear and general smooth convex loss functions, an online algorithm modified from the gradient descend algorithm can achieve a regret which only scales as the square root of the deviation. For the closely related problem of prediction with expert advice, we show that an online algorithm modified from the multiplicative update algorithm can also achieve a similar regret bound for a different measure of deviation. Finally, for loss functions which are strictly convex, we show that an online algorithm modified from the online Newton step algorithm can achieve a regret which is only logarithmic in terms of the deviation, and as an application, we can also have such a logarithmic regret for the portfolio management problem.

...read moreread less

229 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

I and i

[...]

Kevin Barraclough

08 Dec 2001-BMJ

TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.

...read moreread less

Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

...read moreread less

33,785 citations

Journal Article•DOI•

Machine learning

[...]

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

...read moreread less

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

...read moreread less

13,246 citations

Pattern Recognition and Machine Learning

[...]

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.

...read moreread less

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

...read moreread less

10,141 citations

Posted Content•

Communication-Efficient Learning of Deep Networks from Decentralized Data

[...]

H. Brendan McMahan¹, Eider Moore¹, Daniel Ramage¹, Seth Hampson, Blaise Aguera y Arcas¹ - Show less +1 more•Institutions (1)

Google¹

17 Feb 2016-arXiv: Learning

TL;DR: This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.

...read moreread less

Abstract: Modern mobile devices have access to a wealth of data suitable for learning models, which in turn can greatly improve the user experience on the device. For example, language models can improve speech recognition and text entry, and image models can automatically select good photos. However, this rich data is often privacy sensitive, large in quantity, or both, which may preclude logging to the data center and training there using conventional approaches. We advocate an alternative that leaves the training data distributed on the mobile devices, and learns a shared model by aggregating locally-computed updates. We term this decentralized approach Federated Learning. We present a practical method for the federated learning of deep networks based on iterative model averaging, and conduct an extensive empirical evaluation, considering five different model architectures and four datasets. These experiments demonstrate the approach is robust to the unbalanced and non-IID data distributions that are a defining characteristic of this setting. Communication costs are the principal constraint, and we show a reduction in required communication rounds by 10-100x as compared to synchronized stochastic gradient descent.

...read moreread less

5,936 citations

Convex Analysisの二,三の進展について

[...]

徹丸山

01 Feb 1977

5,933 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse