Author

Peter Prettenhofer

Other affiliations: Graz University of Technology

Bio: Peter Prettenhofer is an academic researcher from Bauhaus University, Weimar. The author has contributed to research in the topics Web search query and Python (programming language). The author has an h-index of 10 and has co-authored 21 publications receiving 63,936 citations. Previous affiliations of Peter Prettenhofer include Graz University of Technology.

Papers

DOI•

scikit-learn/scikit-learn: Scikit-learn 0.22.1

Olivier Grisel, Andreas Mueller, Alexandre Gramfort, Gilles Louppe, Peter Prettenhofer, Mathieu Blondel, Vlad Niculae, Joel Nothman, Arnaud Joly, Jake Vanderplas, manoj kumar, Hanmin Qin, Nelle Varoquaux, Robert Layton, Loïc Estève, Jan Hendrik Metzen, Thomas J. Fan, Noel Dawe, Nicolas Hug, Rajagopalan Raghav, Guillaume Lemaitre, Johannes Schönberger, Adrin Jalali, Wei Li, Clay Woolam, Roman Yurchak, Kemal Eren, Tom Dupré la Tour, Eustache

02 Jan 2020

2 citations

DOI•

scikit-learn/scikit-learn: scikit-learn 0.23.1

Olivier Grisel, Andreas Mueller, Alexandre Gramfort, Gilles Louppe, Peter Prettenhofer, Mathieu Blondel, Vlad Niculae, Joel Nothman, Arnaud Joly, Jake Vanderplas, manoj kumar, Hanmin Qin, Thomas J. Fan, Nelle Varoquaux, Robert Layton, Loïc Estève, Jan Hendrik Metzen, Nicolas Hug, Noel Dawe, Guillaume Lemaitre, Adrin Jalali, Rajagopalan Raghav, Johannes Schönberger, Roman Yurchak, Wei Li, Clay Woolam, Kemal Eren, Tom Dupré la Tour, Eustache

19 May 2020

1 citation

Book Chapter•DOI•

An associative and adaptive network model for information retrieval in the Semantic Web

Peter Scheir, Peter Prettenhofer¹, Stefanie Lindstaedt², Chiara Ghidini³•Institutions (3)

Bauhaus University, Weimar¹, Graz University of Technology², Fondazione Bruno Kessler³

01 Jan 2010

TL;DR: This chapter investigates how to improve retrieval performance in settings where resources are sparsely annotated with semantic information. It presents an associative retrieval model for the Semantic Web and evaluates whether, and to what extent, associative retrieval techniques increase retrieval performance.

Abstract: While it is agreed that semantic enrichment of resources would lead to better search results, at present the low coverage of resources on the web with semantic information presents a major hurdle in realizing the vision of search on the Semantic Web. To address this problem, this chapter investigates how to improve retrieval performance in settings where resources are sparsely annotated with semantic information. Techniques from soft computing are employed to find relevant material that was not originally annotated with the concepts used in a query. The authors present an associative retrieval model for the Semantic Web and evaluate whether and to what extent the use of associative retrieval techniques increases retrieval performance. In addition, the authors present recent work on adapting the network structure based on relevance feedback by the user to further improve retrieval effectiveness. The evaluation of new retrieval paradigms such as retrieval based on technology for the Semantic Web presents an additional challenge since no off-the-shelf test corpora exist. Hence, this chapter gives a detailed description of the approach taken to evaluate the information retrieval service the authors have built. DOI: 10.4018/978-1-60566-992-2.ch014

1 citation

Forecasting Daily Solar Energy Production Using Robust Regression Techniques

Gilles Louppe¹, Peter Prettenhofer•Institutions (1)

University of Liège¹

05 Feb 2014

1 citation

DOI•

scikit-learn: 0.17.1 release tag for DOI

Olivier Grisel, Andreas Mueller, Fabian Pedregosa, Alexandre Gramfort, Gilles Louppe, Peter Prettenhofer, Mathieu Blondel, Vlad Niculae, Arnaud Joly, Joel Nothman, Jake Vanderplas, manoj kumar, Robert Layton, Nelle Varoquaux, Noel Dawe, Johannes Schönberger, Denis A. Engemann, Wei Li, Rajagopalan Raghav, Clay Woolam, Kemal Eren, Eustache, Alexander Fabisch, Alexandre Passos, bthirion, Virgile Fritsch, Danny Sullivan, Hamzeh Alsalhi, Maheshakya Wijewardena

17 Apr 2016

1 citation

Cited by

Proceedings Article•DOI•

XGBoost: A Scalable Tree Boosting System

Tianqi Chen¹, Carlos Guestrin¹•Institutions (1)

University of Washington¹

13 Aug 2016

TL;DR: XGBoost is a scalable end-to-end tree boosting system that introduces a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning; it is widely used to achieve state-of-the-art results on many machine learning challenges.

Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations

Proceedings Article•DOI•

XGBoost: A Scalable Tree Boosting System

Tianqi Chen¹, Carlos Guestrin¹•Institutions (1)

University of Washington¹

09 Mar 2016 - arXiv: Learning

TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.

13,333 citations

Journal Article•DOI•

SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python

Pauli Virtanen¹, Ralf Gommers, Travis E. Oliphant, Matt Haberland², Matt Haberland³, Tyler Reddy⁴, David Cournapeau, Evgeni Burovski⁵, Pearu Peterson, Warren Weckesser⁶, Jonathan Bright, Stefan van der Walt⁶, Matthew Brett⁷, Joshua Wilson, K. Jarrod Millman⁶, Nikolay Mayorov, Andrew Nelson⁸, Eric Jones, Robert Kern, Eric B. Larson⁹, CJ Carey¹⁰, Ilhan Polat, Yu Feng⁶, Eric Moore, Jake Vanderplas⁹, Denis Laxalde, Josef Perktold, Robert Cimrman¹¹, Ian Henriksen¹², Ian Henriksen¹³, E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro¹⁴, Fabian Pedregosa¹⁵, Paul van Mulbregt¹⁵, SciPy 1.0 Contributors•Institutions (15)

University of Jyväskylä¹, University of California, Los Angeles², California Polytechnic State University³, Los Alamos National Laboratory⁴, National Research University – Higher School of Economics⁵, University of California, Berkeley⁶, University of Birmingham⁷, Australian Nuclear Science and Technology Organisation⁸, University of Washington⁹, University of Massachusetts Amherst¹⁰, University of West Bohemia¹¹, University of Texas at Austin¹², Brigham Young University¹³, Universidade Federal de Minas Gerais¹⁴, Google¹⁵

23 Jul 2019 - arXiv: Mathematical Software

TL;DR: SciPy is an open source scientific computing library for the Python programming language. Its functionality spans clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics.

Abstract: SciPy is an open source scientific computing library for the Python programming language. SciPy 1.0 was released in late 2017, about 16 years after the original version 0.1 release. SciPy has become a de facto standard for leveraging scientific algorithms in the Python programming language, with more than 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories, and millions of downloads per year. This includes usage of SciPy in almost half of all machine learning projects on GitHub, and usage by high profile projects including LIGO gravitational wave analysis and creation of the first-ever image of a black hole (M87). The library includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics. In this work, we provide an overview of the capabilities and development practices of the SciPy library and highlight some recent technical developments.

12,774 citations

Proceedings Article•DOI•

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

Marco Tulio Ribeiro¹, Sameer Singh¹, Carlos Guestrin¹•Institutions (1)

University of Washington¹

13 Aug 2016

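TL;DR: The authors propose LIME, a technique that explains the predictions of any classifier by learning an interpretable model locally around the prediction. They also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem.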

Abstract: Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.

11,104 citations

Posted Content•

Inductive Representation Learning on Large Graphs

William L. Hamilton, Rex Ying, Jure Leskovec

07 Jun 2017 - arXiv: Social and Information Networks

TL;DR: GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.

Abstract: Low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. However, most existing approaches require that all nodes in the graph are present during training of the embeddings; these previous approaches are inherently transductive and do not naturally generalize to unseen nodes. Here we present GraphSAGE, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. Instead of training individual embeddings for each node, we learn a function that generates embeddings by sampling and aggregating features from a node's local neighborhood. Our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and Reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.

7,926 citations
