Open Access Journal Article

More Efficient Estimation for Logistic Regression with Optimal Subsamples

HaiYing Wang
- 01 Aug 2019 - 
- Vol. 20, Iss: 132, pp 1-59
Abstract
In this paper, we propose an improved estimation method for logistic regression based on subsamples taken according to the optimal subsampling probabilities developed in Wang et al. (2018). Both asymptotic and numerical results show that the new estimator has higher estimation efficiency. We also develop a new algorithm based on Poisson subsampling, which does not require approximating the optimal subsampling probabilities all at once. This is computationally advantageous when the available random-access memory is not enough to hold the full data. Interestingly, the asymptotic distributions also show that Poisson subsampling produces a more efficient estimator if the sampling ratio, that is, the ratio of the subsample size to the full data sample size, does not converge to zero. We also obtain the unconditional asymptotic distribution of the estimator based on Poisson subsampling. Pilot estimators are required to calculate subsampling probabilities and to correct biases in unweighted estimators; interestingly, even if the pilot estimators are inconsistent, the proposed method still produces consistent and asymptotically normal estimators.
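
The general idea of Poisson subsampling with a pilot estimator can be illustrated with a minimal sketch. It is not the paper's algorithm or estimator: the score `|y - p| * ||x||`, the normalizing constant `psi` estimated from the pilot sample, and the inverse-probability-weighted fit are all assumptions made for illustration, and every function and variable name here is hypothetical. The point it shows is that each row is kept or dropped independently, so the data can be scanned in chunks without holding all probabilities in memory at once.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, w=None, iters=50, ridge=1e-8):
    """Weighted logistic regression fitted with Newton's method."""
    n, d = X.shape
    w = np.ones(n) if w is None else w
    beta = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ beta)
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1 - p))[:, None]).T @ X + ridge * np.eye(d)
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def score(X, y, beta):
    # Assumed importance score: absolute residual times covariate norm.
    return np.abs(y - sigmoid(X @ beta)) * np.linalg.norm(X, axis=1)

def poisson_subsample(X, y, beta_pilot, psi, n_target, rng, chunk=10_000):
    """Scan the data chunk by chunk; keep each row independently."""
    N = X.shape[0]
    keep_X, keep_y, keep_pi = [], [], []
    for start in range(0, N, chunk):
        Xc, yc = X[start:start + chunk], y[start:start + chunk]
        # Inclusion probability, capped at 1; psi stands in for the
        # full-data average score, which is never computed in one go.
        pi = np.minimum(1.0, n_target * score(Xc, yc, beta_pilot) / (N * psi))
        mask = rng.random(len(yc)) < pi
        keep_X.append(Xc[mask])
        keep_y.append(yc[mask])
        keep_pi.append(pi[mask])
    return np.concatenate(keep_X), np.concatenate(keep_y), np.concatenate(keep_pi)

# Simulated full data (purely illustrative).
rng = np.random.default_rng(0)
N, d = 200_000, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, d - 1))])
beta_true = np.array([-1.0, 0.5, 0.5])
y = (rng.random(N) < sigmoid(X @ beta_true)).astype(float)

# Pilot step: a small uniform subsample gives beta_pilot and psi.
pilot_idx = rng.choice(N, size=1000, replace=False)
beta_pilot = fit_logistic(X[pilot_idx], y[pilot_idx])
psi = score(X[pilot_idx], y[pilot_idx], beta_pilot).mean()

# Poisson subsampling, then an inverse-probability-weighted fit.
Xs, ys, pis = poisson_subsample(X, y, beta_pilot, psi, n_target=5000, rng=rng)
beta_hat = fit_logistic(Xs, ys, w=1.0 / pis)
print(len(ys), beta_hat)
```

Because inclusion is decided row by row, the realized subsample size is random and only approximately equal to `n_target`. The weighted fit is used here as a simple, generic choice; the abstract describes a different approach based on unweighted estimators with a bias correction that uses the pilot estimator.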

