Jake Vanderplas

Bio: Jake Vanderplas is an academic researcher from the University of Washington. The author has contributed to research on topics including Python (programming language) and weak gravitational lensing. The author has an h-index of 30 and has co-authored 56 publications receiving 77,174 citations.

Papers published on a yearly basis (chart; years 2009 to 2020)

Papers

DOI•

mwaskom/seaborn: v0.9.0 (July 2018)

Michael Waskom, Olga Botvinnik, Drew O'Kane, Paul Hobson, Joel Ostblom, Saulius Lukauskas, David C Gemperline, Tom Augspurger, Yaroslav O. Halchenko, John B. Cole, Jordi Warmenhoven, Julian de Ruiter, Cameron Pye, Stephan Hoyer, Jake Vanderplas, Santi Villalba, Gero Kunter, Eric Quintero, Pete Bachant, Marcel Martin, Kyle Meyer, Alistair Miles, Yoav Ram, Thomas Brunner, Tal Yarkoni, Mike Lee Williams, Constantine Evans, Clark Fitzgerald, Brian, Adel Qalieh

16 Jul 2018

121 citations

Proceedings Article•

Hierarchical Probabilistic Models for Group Anomaly Detection

Liang Xiong¹, Barnabás Póczos¹, Jeff Schneider¹, Andrew J. Connolly, Jake Vanderplas²

Carnegie Mellon University¹, University of Washington²

14 Jun 2011

TL;DR: Generative models for detecting group anomalies, which are larger scale phenomena that only become apparent when groups of points are considered, are proposed.

Abstract: Statistical anomaly detection typically focuses on finding individual point anomalies. Often the most interesting or unusual things in a data set are not odd individual points, but rather larger scale phenomena that only become apparent when groups of points are considered. In this paper, we propose generative models for detecting such group anomalies. We evaluate our methods on synthetic data as well as astronomical data from the Sloan Digital Sky Survey. The empirical results show that the proposed models are effective in detecting group anomalies.

95 citations

Journal Article•DOI•

Tests of modified gravity with dwarf galaxies

Bhuvnesh Jain¹, Jake Vanderplas²

University of Pennsylvania¹, University of Washington²

24 Oct 2011-Journal of Cosmology and Astroparticle Physics

TL;DR: In this paper, the authors study observable deviations from general relativity, as predicted by modified gravity theories, in the disks of late-type dwarf galaxies and find four distinct observable effects in such disk galaxies, including a displacement of the stellar disk from the HI disk and warping of the stellar disk along the direction of the external force.

Abstract: In modified gravity theories that seek to explain cosmic acceleration, dwarf galaxies in low density environments can be subject to enhanced forces. The class of scalar-tensor theories, which includes f(R) gravity, predicts such a force enhancement (massive galaxies like the Milky Way can evade it through a screening mechanism that protects the interior of the galaxy from this “fifth” force). We study observable deviations from GR in the disks of late-type dwarf galaxies moving under gravity. The fifth force acts on the dark matter and HI gas disk, but not on the stellar disk owing to the self-screening of main sequence stars. We find four distinct observable effects in such disk galaxies: 1. A displacement of the stellar disk from the HI disk. 2. Warping of the stellar disk along the direction of the external force. 3. Enhancement of the rotation curve measured from the HI gas compared to that of the stellar disk. 4. Asymmetry in the rotation curve of the stellar disk. We estimate that the spatial effects can be up to 1 kpc and the rotation velocity effects about 10 km/s in infalling dwarf galaxies. Such deviations are measurable: we expect that with a careful analysis of a sample of nearby dwarf galaxies one can improve astrophysical constraints on gravity theories by over three orders of magnitude, and even solar system constraints by one order of magnitude. Thus effective tests of gravity along the lines suggested by Hui, Nicolis, and Stubbs (2009) and Jain (2011) can be carried out with low-redshift galaxies, though care must be exercised in understanding possible complications from astrophysical effects.

68 citations

Journal Article•DOI•

Astrophysical tests of modified gravity: the morphology and kinematics of dwarf galaxies

Vinu Vikram¹, Anna Cabré¹, Bhuvnesh Jain¹, Jake Vanderplas²

University of Pennsylvania¹, University of Washington²

01 Aug 2013-Journal of Cosmology and Astroparticle Physics

TL;DR: In this paper, the authors carried out four distinct tests using published data on the kinematics and morphology of dwarf galaxies, motivated by the theoretical work of Hui et al. (2009) and Jain & Vanderplas (2011).

Abstract: This paper is the third in a series on tests of gravity using observations of stars and nearby dwarf galaxies. We carry out four distinct tests using published data on the kinematics and morphology of dwarf galaxies, motivated by the theoretical work of Hui et al. (2009) and Jain & Vanderplas (2011). In a wide class of gravity theories a scalar field couples to matter and provides an attractive fifth force. Due to their different self-gravity, stars and gas may respond differently to the scalar force leading to several observable deviations from standard gravity. HI gas, red giant stars and main sequence stars can be displaced relative to each other, and the stellar disk can display warps or asymmetric rotation curves aligned with external potential gradients. To distinguish the effects of modified gravity from standard astrophysical phenomena, we use a control sample of galaxies that are expected to be screened from the fifth force. In all cases we find no significant deviation from the null hypothesis of general relativity. The limits obtained from dwarf galaxies are not yet competitive with the limits from cepheids obtained in our first paper, but can be improved to probe regions of parameter space that are inaccessible using other tests. We discuss how our methodology can be applied to new radio and optical observations of nearby galaxies.

59 citations

Journal Article•DOI•

Reducing the dimensionality of data: locally linear embedding of Sloan galaxy spectra

Jake Vanderplas¹, Andrew J. Connolly¹

University of Washington¹

30 Sep 2009-The Astronomical Journal

TL;DR: This paper introduces locally linear embedding (LLE), a nonlinear dimensionality reduction technique studied in the context of computer perception, to the astronomical community and applies it to the classification of emission-line spectra.

Abstract: We introduce locally linear embedding (LLE) to the astronomical community as a new classification technique, using Sloan Digital Sky Survey spectra as an example data set. LLE is a nonlinear dimensionality reduction technique that has been studied in the context of computer perception. We compare the performance of LLE to well-known spectral classification techniques, e.g., principal component analysis and line-ratio diagnostics. We find that LLE combines the strengths of both methods in a single, coherent technique, and leads to improved classification of emission-line spectra at a relatively small computational cost. We also present a data subsampling technique that preserves local information content, and proves effective for creating small, efficient training samples from large, high-dimensional data sets. Software used in this LLE-based classification is made available.

58 citations

Cited by

Proceedings Article•DOI•

XGBoost: A Scalable Tree Boosting System

Tianqi Chen¹, Carlos Guestrin¹

University of Washington¹

13 Aug 2016

TL;DR: This paper describes XGBoost, a scalable end-to-end tree boosting system widely used to achieve state-of-the-art results on many machine learning challenges, and proposes a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning.

Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations

Proceedings Article•DOI•

XGBoost: A Scalable Tree Boosting System

Tianqi Chen¹, Carlos Guestrin¹

University of Washington¹

09 Mar 2016-arXiv: Learning

TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.

13,333 citations

Journal Article•DOI•

SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python

Pauli Virtanen¹, Ralf Gommers, Travis E. Oliphant, Matt Haberland²,³, Tyler Reddy⁴, David Cournapeau, Evgeni Burovski⁵, Pearu Peterson, Warren Weckesser⁶, Jonathan Bright, Stefan van der Walt⁶, Matthew Brett⁷, Joshua Wilson, K. Jarrod Millman⁶, Nikolay Mayorov, Andrew Nelson⁸, Eric Jones, Robert Kern, Eric B. Larson⁹, CJ Carey¹⁰, Ilhan Polat, Yu Feng⁶, Eric Moore, Jake Vanderplas⁹, Denis Laxalde, Josef Perktold, Robert Cimrman¹¹, Ian Henriksen¹²,¹³, E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro¹⁴, Fabian Pedregosa¹⁵, Paul van Mulbregt¹⁵, SciPy 1.0 Contributors

University of Jyväskylä¹, California Polytechnic State University², University of California, Los Angeles³, Los Alamos National Laboratory⁴, National Research University – Higher School of Economics⁵, University of California, Berkeley⁶, University of Birmingham⁷, Australian Nuclear Science and Technology Organisation⁸, University of Washington⁹, University of Massachusetts Amherst¹⁰, University of West Bohemia¹¹, University of Texas at Austin¹², Brigham Young University¹³, Universidade Federal de Minas Gerais¹⁴, Google¹⁵

23 Jul 2019-arXiv: Mathematical Software

TL;DR: SciPy is an open source scientific computing library for the Python programming language, with functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics.

Abstract: SciPy is an open source scientific computing library for the Python programming language. SciPy 1.0 was released in late 2017, about 16 years after the original version 0.1 release. SciPy has become a de facto standard for leveraging scientific algorithms in the Python programming language, with more than 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories, and millions of downloads per year. This includes usage of SciPy in almost half of all machine learning projects on GitHub, and usage by high profile projects including LIGO gravitational wave analysis and creation of the first-ever image of a black hole (M87). The library includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics. In this work, we provide an overview of the capabilities and development practices of the SciPy library and highlight some recent technical developments.

12,774 citations

Journal Article•DOI•

Seven-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Cosmological Interpretation

Eiichiro Komatsu¹, Kendrick M. Smith², Jo Dunkley³, Charles L. Bennett⁴, B. Gold⁴, Gary Hinshaw⁵, Norman Jarosik², Davin Larson⁴, M. R. Nolta⁶, Lyman A. Page², David N. Spergel², Mark Halpern⁷, Robert S. Hill, A. Kogut⁵, Michele Limon, S. S. Meyer⁸, N. Odegard, Gregory S. Tucker⁹, Janet Weiland, Edward J. Wollack⁵, Edward L. Wright¹⁰

University of Texas at Austin¹, Princeton University², University of Oxford³, Johns Hopkins University⁴, Goddard Space Flight Center⁵, University of Toronto⁶, University of British Columbia⁷, University of Chicago⁸, Brown University⁹, University of California, Los Angeles¹⁰

11 Jan 2011-Astrophysical Journal Supplement Series

TL;DR: In this article, a combination of seven-year data from WMAP and improved astrophysical data rigorously tests the standard cosmological model and places new constraints on its basic parameters and extensions.

Abstract: The combination of seven-year data from WMAP and improved astrophysical data rigorously tests the standard cosmological model and places new constraints on its basic parameters and extensions. By combining the WMAP data with the latest distance measurements from the baryon acoustic oscillations (BAO) in the distribution of galaxies and the Hubble constant (H0) measurement, we determine the parameters of the simplest six-parameter ΛCDM model. The power-law index of the primordial power spectrum is ns = 0.968 ± 0.012 (68% CL) for this data combination, a measurement that excludes the Harrison–Zel’dovich–Peebles spectrum by 99.5% CL. The other parameters, including those beyond the minimal set, are also consistent with, and improved from, the five-year results. We find no convincing deviations from the minimal model. The seven-year temperature power spectrum gives a better determination of the third acoustic peak, which results in a better determination of the redshift of the matter-radiation equality epoch. Notable examples of improved parameters are the total mass of neutrinos, Σmν < 0.58 eV (95% CL), and the effective number of neutrino species, Neff = 4.34 (+0.86, −0.88) (68% CL), which benefit from better determinations of the third peak and H0. The limit on a constant dark energy equation of state parameter from WMAP+BAO+H0, without high-redshift Type Ia supernovae, is w = −1.10 ± 0.14 (68% CL). We detect the effect of primordial helium on the temperature power spectrum and provide a new test of big bang nucleosynthesis by measuring Yp = 0.326 ± 0.075 (68% CL). We detect, and show on the map for the first time, the tangential and radial polarization patterns around hot and cold spots of temperature fluctuations, an important test of physical processes at z = 1090 and the dominance of adiabatic scalar fluctuations. The seven-year polarization data have significantly improved: we now detect the temperature–E-mode polarization cross power spectrum at 21σ, compared with 13σ from the five-year data. With the seven-year temperature–B-mode cross power spectrum, the limit on a rotation of the polarization plane due to potential parity-violating effects has improved by 38% to Δα = −1.1° ± 1.4° (statistical) ± 1.5° (systematic) (68% CL). We report significant detections of the Sunyaev–Zel’dovich (SZ) effect at the locations of known clusters of galaxies. The measured SZ signal agrees well with the expected signal from the X-ray data on a cluster-by-cluster basis. However, it is a factor of 0.5–0.7 times the predictions from the “universal profile” of Arnaud et al., analytical models, and hydrodynamical simulations. We find, for the first time in the SZ effect, a significant difference between the cooling-flow and non-cooling-flow clusters (or relaxed and non-relaxed clusters), which can explain some of the discrepancy. This lower amplitude is consistent with the lower-than-theoretically expected SZ power spectrum recently measured by the South Pole Telescope Collaboration.

11,309 citations

Proceedings Article•DOI•

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

Marco Tulio Ribeiro¹, Sameer Singh¹, Carlos Guestrin¹

University of Washington¹

13 Aug 2016

TL;DR: In this article, the authors propose LIME, a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem.

Abstract: Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.

11,104 citations
