scispace - formally typeset
Author

Shanshan Song

Bio: Shanshan Song is an academic researcher from The Chinese University of Hong Kong. The author has contributed to research in the topics of regression analysis and estimators. The author has an h-index of 1, having co-authored 1 publication receiving 2 citations.

Papers
Journal ArticleDOI
01 Sep 2021
TL;DR: The Bahadur representation of the ALS estimator is derived, which serves as an important tool for studying the relationship between the number of sub-machines K and the sample size; the pooled estimator's consistency and asymptotic normality are established under mild conditions.
Abstract: In this paper, we study the large-scale inference for a linear expectile regression model. To mitigate the computational challenges in the classical asymmetric least squares (ALS) estimation under massive data, we propose a communication-efficient divide and conquer algorithm to combine the information from sub-machines through confidence distributions. The resulting pooled estimator has a closed-form expression, and its consistency and asymptotic normality are established under mild conditions. Moreover, we derive the Bahadur representation of the ALS estimator, which serves as an important tool to study the relationship between the number of sub-machines K and the sample size. Numerical studies including both synthetic and real data examples are presented to illustrate the finite-sample performance of our method and support the theoretical results.
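The divide-and-conquer idea in the abstract can be illustrated with a minimal sketch: fit an asymmetric least squares (expectile) regression on each sub-machine via iteratively reweighted least squares, then combine the sub-estimators. Note that the sketch below pools by simple averaging, which is an assumed simplification; the paper itself combines sub-machine information through confidence distributions. All function names here are illustrative, not from the paper.

```python
import numpy as np

def als_expectile(X, y, tau=0.5, n_iter=100, tol=1e-10):
    """Asymmetric least squares (expectile) regression fitted by
    iteratively reweighted least squares: residuals above the fit
    get weight tau, residuals below get weight 1 - tau."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS starting value
    for _ in range(n_iter):
        resid = y - X @ beta
        w = np.where(resid >= 0, tau, 1 - tau)
        XtW = X.T * w  # row-weighted design
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

def divide_and_conquer_als(X, y, K, tau=0.5):
    """Split the n observations across K sub-machines, fit ALS on
    each, and average the sub-estimators (a simple pooling sketch,
    not the paper's confidence-distribution combination)."""
    idx = np.array_split(np.arange(len(y)), K)
    betas = [als_expectile(X[i], y[i], tau) for i in idx]
    return np.mean(betas, axis=0)
```

At tau = 0.5 the expectile coincides with the conditional mean, so the fit reduces to ordinary least squares; the asymmetry only matters for tau away from 0.5.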

2 citations


Cited by
Posted Content
TL;DR: In this article, the authors use tail expectiles to estimate alternative measures to the Value at Risk (VaR), Expected Shortfall (ES), and Marginal Expected Shortfall (MES), three instruments of risk protection of utmost importance in actuarial science and statistical finance.
Abstract: We use tail expectiles to estimate alternative measures to the Value at Risk (VaR), Expected Shortfall (ES), and Marginal Expected Shortfall (MES), three instruments of risk protection of utmost importance in actuarial science and statistical finance. The concept of expectiles is a least squares analogue of quantiles. Both expectiles and quantiles were embedded in the more general class of M-quantiles as the minimizers of an asymmetric convex loss function. It has been proved very recently that the only M-quantiles that are coherent risk measures are the expectiles. Moreover, expectiles define the only coherent risk measure that is also elicitable. Elicitability corresponds to the existence of a natural backtesting methodology. The estimation of expectiles has not, however, yet received any attention from the perspective of extreme values. The first estimation method that we propose enables the use of advanced high quantile and tail index estimators. The second method joins together the least asymmetrically weighted squares estimation with the tail restrictions of extreme-value theory. A main tool is to first estimate the large expectile-based VaR, ES, and MES when they are covered by the range of the data, and then extrapolate these estimates to the very far tails. We establish the limit distributions of the proposed estimators when they are located in the range of the data or near and even beyond the maximum observed loss. We show through a detailed simulation study the good performance of the procedures, and also present concrete applications to medical insurance data and three large US investment banks.
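The defining property of an expectile used throughout this line of work can be shown in a few lines: the tau-expectile of a sample is the minimizer of an asymmetrically weighted squared loss, which a simple fixed-point iteration on its first-order condition recovers. This is a minimal in-sample sketch only; the paper's contribution is the extreme-value extrapolation beyond the range of the data, which is not reproduced here.

```python
import numpy as np

def sample_expectile(y, tau, n_iter=200, tol=1e-12):
    """tau-expectile of a one-dimensional sample, computed by
    fixed-point iteration on the first-order condition
    sum(w_i * (y_i - e)) = 0, where w_i = tau if y_i >= e
    and 1 - tau otherwise."""
    e = np.mean(y)  # the 0.5-expectile is the mean
    for _ in range(n_iter):
        w = np.where(y >= e, tau, 1 - tau)
        e_new = np.sum(w * y) / np.sum(w)  # weighted-mean update
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e
```

For tau above 0.5 the weights favor large observations, pushing the expectile into the right tail, which is what makes high-tau expectiles usable as VaR/ES-style tail risk measures.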

77 citations

Posted Content
TL;DR: A divide-and-conquer algorithm is proposed to alleviate the computational burden, shown not to sacrifice any statistical accuracy in comparison with a pooled analysis, and applied to a microarray data example that shows the empirical benefits of using more data.
Abstract: Factor modeling is an essential tool for exploring intrinsic dependence structures among high-dimensional random variables. Much progress has been made for estimating the covariance matrix from a high-dimensional factor model. However, the blessing of dimensionality has not yet been fully embraced in the literature: much of the available data is often ignored in constructing covariance matrix estimates. If our goal is to accurately estimate a covariance matrix of a set of targeted variables, shall we employ additional data, which are beyond the variables of interest, in the estimation? In this paper, we provide sufficient conditions for an affirmative answer, and further quantify its gain in terms of Fisher information and convergence rate. In fact, even an oracle-like result (as if all the factors were known) can be achieved when a sufficiently large number of variables is used. The idea of utilizing data as much as possible brings computational challenges. A divide-and-conquer algorithm is thus proposed to alleviate the computational burden, and also shown not to sacrifice any statistical accuracy in comparison with a pooled analysis. Simulation studies further confirm our advocacy for the use of full data, and demonstrate the effectiveness of the above algorithm. Our proposal is applied to a microarray data example that shows empirical benefits of using more data.
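The split-and-average pattern behind the algorithm can be sketched in its simplest form: partition the observations into blocks, estimate a covariance matrix on each block, and average the block estimates. This is only the generic divide-and-conquer skeleton under an assumed plain sample-covariance estimator; the paper's actual method operates on a high-dimensional factor-model structure, which is not reproduced here.

```python
import numpy as np

def dc_covariance(X, K):
    """Divide-and-conquer covariance sketch: split the n rows of X
    into K blocks, compute the sample covariance on each block, and
    average the K block estimates."""
    blocks = np.array_split(X, K)
    return np.mean([np.cov(b, rowvar=False) for b in blocks], axis=0)
```

Because each block estimate is unbiased, the average matches the pooled estimator in expectation while each machine only ever touches n/K rows, which is the computational point of the abstract.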

13 citations