Author

Cheng-Tao Chu

Bio: Cheng-Tao Chu is an academic researcher affiliated with LinkedIn. The author has contributed to research on topics including support vector machines and supervised learning. The author has an h-index of 3 and has co-authored 3 publications receiving 1305 citations. Previous affiliations of Cheng-Tao Chu include Stanford University.

Papers

Proceedings Article

Map-Reduce for Machine Learning on Multicore

Cheng-Tao Chu¹, Sang K. Kim¹, Yi-an Lin¹, Yuanyuan Yu¹, Gary Bradski¹, Kunle Olukotun¹, Andrew Y. Ng¹

Stanford University¹

04 Dec 2006

TL;DR: This work shows that algorithms fitting the Statistical Query model can be written in a certain "summation form," which makes them easy to parallelize on multicore computers, and reports basically linear speedup with an increasing number of processors.

Abstract: We are at the beginning of the multicore era. Computers will have increasingly many cores (processors), but there is still no good programming framework for these architectures, and thus no simple and unified way for machine learning to take advantage of the potential speed up. In this paper, we develop a broadly applicable parallel programming method, one that is easily applied to many different learning algorithms. Our work is in distinct contrast to the tradition in machine learning of designing (often ingenious) ways to speed up a single algorithm at a time. Specifically, we show that algorithms that fit the Statistical Query model [15] can be written in a certain "summation form," which allows them to be easily parallelized on multicore computers. We adapt Google's map-reduce [7] paradigm to demonstrate this parallel speed up technique on a variety of learning algorithms including locally weighted linear regression (LWLR), k-means, logistic regression (LR), naive Bayes (NB), SVM, ICA, PCA, gaussian discriminant analysis (GDA), EM, and backpropagation (NN). Our experimental results show basically linear speedup with an increasing number of processors.
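The "summation form" idea can be illustrated with a minimal sketch, which is not the authors' code. It assumes ordinary least squares on a single feature plus an intercept, where the fit depends on the data only through a handful of sums. Each shard computes its partial sums (the map step), the partial sums are added (the reduce step), and the small normal-equation system is solved once. The names `map_shard`, `reduce_sums`, and `fit_line` are hypothetical, and the map step runs sequentially here; in a multicore setting each shard could be handed to a separate worker.

```python
from functools import reduce


def map_shard(shard):
    """Map step: partial sums over one shard of (x, y) pairs."""
    n = sx = sxx = sy = sxy = 0.0
    for x, y in shard:
        n += 1.0
        sx += x
        sxx += x * x
        sy += y
        sxy += x * y
    return (n, sx, sxx, sy, sxy)


def reduce_sums(a, b):
    """Reduce step: element-wise addition of two partial-sum tuples."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(shards):
    """Solve the 2x2 normal equations from the aggregated sums."""
    n, sx, sxx, sy, sxy = reduce(reduce_sums, map(map_shard, shards))
    det = n * sxx - sx * sx
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


# Illustrative data lying exactly on y = 2x + 1, split into three shards.
data = [(float(x), 2.0 * x + 1.0) for x in range(8)]
shards = [data[0:3], data[3:6], data[6:8]]
intercept, slope = fit_line(shards)
```

Because the reduce step is plain addition, the result does not depend on how the data is split across shards, which is what makes this form straightforward to parallelize.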

1,310 citations

Proceedings Article

Detection of word fragments in Mandarin telephone conversation.

Cheng-Tao Chu, Yun-Hsuan Sung, Yuan Zhao, Dan Jurafsky

01 Jan 2006

TL;DR: Preliminary work on detecting word fragments in Mandarin conversational telephone speech, using prosodic, voice quality, and lexical features with Decision Tree and SVM classifiers, finds that the most useful feature for Mandarin fragment detection is the identity of the neighboring character (word or morpheme).

Abstract: We describe preliminary work on the detection of word fragments in Mandarin conversational telephone speech. We extracted prosodic, voice quality, and lexical features, and trained Decision Tree and SVM classifiers. Previous research shows that glottalization features are instrumental in English fragment detection. However, we show that Mandarin fragments are quite different than English; 90% of Mandarin fragments are followed immediately by a repetition of the fragmentary word. These repetition fragments are not glottalized, and they have a very specific distribution; the 12 most frequent words (“you”, “I”, “that”, “have”, “then”, etc.) cover 50% of the tokens of these fragments. Thus rather than glottalization, we found the most useful feature for Mandarin fragment detection was the identity of the neighboring character (word or morpheme). In an oracle experiment using the true (reference) neighboring words as well as prosodic and voice quality features, we achieved 80% accuracy in Mandarin fragment detection.

11 citations

Journal Article

Session analysis of people search within a professional social network

Ru He¹, Jiong Wang², Jin Tian¹, Cheng-Tao Chu², Bradley Scott Mauney², Igor Perisic²

Iowa State University¹, LinkedIn²

01 May 2013, Journal of the Association for Information Science and Technology

TL;DR: The authors describe the rationale behind two refinements to the content-based session identification method and empirically show that, equipped with these refinements, the method achieves excellent identification performance in their domain.

Abstract: We perform session analysis for our domain of people search within a professional social network. We find that the content-based method is appropriate to serve as a basis for the session identification in our domain. However, there remain some problems reported in previous research which degrade the identification performance (such as accuracy) of the content-based method. Therefore, in this article, we propose two important refinements to address these problems. We describe the underlying rationale of our refinements and then empirically show that the content-based method equipped with our refinements is able to achieve an excellent identification performance in our domain (such as 99.820% accuracy and 99.707% F-measure in our experiments). Next, because the time-based method has extremely low computation costs, which makes it suitable for many real-world applications, we investigate the feasibility of the time-based method in our domain by evaluating its identification performance based on our refined content-based method. Our experiments demonstrate that the performance of the time-based method is potentially acceptable to many real applications in our domain. Finally, we analyze several features of the identified sessions in our domain and compare them with the corresponding ones in general web search. The results illustrate the profession-oriented characteristics of our domain.

3 citations

Cited by

Journal Article

MapReduce: simplified data processing on large clusters

Jeffrey Dean¹, Sanjay Ghemawat¹

Google¹

06 Dec 2004

TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets, which runs on a large cluster of commodity machines and is highly scalable.

Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.
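The map and reduce roles described in this abstract can be sketched with a word count. This is a minimal single-process illustration, not Google's implementation or its API: `run_mapreduce`, `map_fn`, and `reduce_fn` are hypothetical names, and the partitioning, scheduling, and fault handling that the abstract attributes to the run-time system are deliberately left out.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """Map: emit an intermediate (word, 1) pair for every word in a document."""
    for word in text.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce: merge all intermediate values that share the same key."""
    return word, sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Toy driver: run map over the inputs, group by key, then run reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in map_fn(key, value):
            groups[ikey].append(ivalue)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())


# Illustrative input: (document id, document text) pairs.
docs = [("d1", "the quick brown fox"), ("d2", "the lazy dog"), ("d3", "The fox")]
counts = run_mapreduce(docs, map_fn, reduce_fn)
```

The user-supplied code is limited to `map_fn` and `reduce_fn`; everything in `run_mapreduce` stands in for the work a real run-time system would distribute across machines.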

20,309 citations

Journal Article

MapReduce: simplified data processing on large clusters

Jeffrey Dean¹, Sanjay Ghemawat¹

Google¹

01 Jan 2008, Communications of the ACM

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.

Abstract: MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.

17,663 citations

Book

Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers

Stephen Boyd¹, Neal Parikh¹, Eric Chu¹, Borja Peleato¹, Jonathan Eckstein²

Stanford University¹, Rutgers University²

23 May 2011

TL;DR: It is argued that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas.

Abstract: Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for l1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.

17,433 citations

Proceedings Article

TensorFlow: a system for large-scale machine learning

Martín Abadi¹, Paul Barham¹, Jianmin Chen¹, Zhifeng Chen¹, Andy Davis¹, Jeffrey Dean¹, Matthieu Devin¹, Sanjay Ghemawat¹, Geoffrey Irving¹, Michael Isard¹, Manjunath Kudlur¹, Josh Levenberg¹, Rajat Monga¹, Sherry Moore¹, Derek G. Murray¹, Benoit Steiner¹, Paul A. Tucker¹, Vijay K. Vasudevan¹, Pete Warden¹, Martin Wicke¹, Yuan Yu¹, Xiaoqiang Zheng¹

Google¹

02 Nov 2016

TL;DR: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments, using dataflow graphs to represent computation, shared state, and the operations that mutate that state.

Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.

10,913 citations

Posted Content

TensorFlow: A system for large-scale machine learning

Google¹

27 May 2016, arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: The TensorFlow dataflow model is described, and the compelling performance that TensorFlow achieves for several real-world applications is demonstrated.

Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with particularly strong support for training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model in contrast to existing systems, and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.

5,542 citations
