
Showing papers in "Journal of Nonparametric Statistics in 2022"


Journal ArticleDOI
TL;DR: In this paper, the problem of nonparametric estimation of the expectile regression model for strong mixing functional time series data is investigated, and the almost complete consistency and asymptotic normality of the kernel-type estimator are established under some mild conditions.
Abstract: In this paper, the problem of the nonparametric estimation of the expectile regression model for strong mixing functional time series data is investigated. To be more precise, we establish the almost complete consistency and the asymptotic normality of the kernel-type expectile regression estimator under some mild conditions. The usefulness of our theoretical results in financial time series analysis is discussed. Further, we provide some practical algorithms to select the smoothing parameter or to construct confidence intervals using bootstrap techniques. In addition, a simulation study is carried out to assess the small-sample behaviour of the proposed approach. Finally, we give an empirical example using the daily returns of the S&P 500 stock index.
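As a rough illustration of the kind of kernel-type expectile estimator discussed above, the sketch below computes a Nadaraya-Watson-style expectile by iteratively reweighted averaging; the Gaussian kernel, bandwidth, and all names are our illustrative choices, not the paper's.

```python
import numpy as np

def kernel_expectile(x0, x, y, tau=0.5, h=0.05, n_iter=50):
    """Kernel-weighted tau-expectile of y given x = x0.

    The tau-expectile m solves
        sum_i K_h(x_i - x0) * |tau - 1{y_i < m}| * (y_i - m) = 0,
    solved here by iteratively reweighted averaging.
    """
    k = np.exp(-0.5 * ((x - x0) / h) ** 2)       # Gaussian kernel weights
    m = np.average(y, weights=k)                 # start at the kernel mean
    for _ in range(n_iter):
        a = np.where(y < m, 1.0 - tau, tau)      # asymmetric weights
        m = np.sum(k * a * y) / np.sum(k * a)
    return m

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 2000)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 2000)
m50 = kernel_expectile(0.25, x, y, tau=0.5)   # near sin(pi/2) = 1
m90 = kernel_expectile(0.25, x, y, tau=0.9)   # upper expectile, larger
```

For tau = 0.5 the estimator reduces to the usual kernel regression mean; larger tau levels trace out the upper part of the conditional distribution, which is what makes expectiles useful as risk measures for financial series.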

13 citations


Journal ArticleDOI
TL;DR: In this paper, the authors prove oracle inequalities and upper bounds for kernel density estimators on a broad class of metric spaces, including Euclidean spaces, spheres, balls, cubes and Riemannian manifolds.
Abstract: We prove oracle inequalities and upper bounds for kernel density estimators on a very broad class of metric spaces. Precisely, we consider the setting of a doubling measure metric space in the presence of a non-negative self-adjoint operator whose heat kernel enjoys Gaussian regularity. Many classical settings, like Euclidean spaces, spheres, balls and cubes, as well as general Riemannian manifolds, are contained in our framework. Moreover, the rate of convergence we achieve is the optimal one in these special cases. Finally, we provide a general methodology for constructing the proper kernels when the manifold under study is given, and we give precise examples for the case of the sphere.

6 citations


Journal ArticleDOI
TL;DR: In this paper, the authors derived asymptotic expressions for the mean integrated squared error of a class of delta sequence density estimators for circular data and proposed a Fourier series-based direct plug-in approach for bandwidth selection.
Abstract: In this paper, we derive asymptotic expressions for the mean integrated squared error of a class of delta sequence density estimators for circular data. This class includes the class of kernel density estimators usually considered in the literature, as well as a new class that is closer in spirit to the class of Parzen–Rosenblatt estimators for linear data. For these two classes of kernel density estimators, a Fourier series-based direct plug-in approach for bandwidth selection is presented. The proposed bandwidth selector has a relative convergence rate whenever the underlying density is smooth enough, and the simulation results show that it achieves very good finite-sample performance against other bandwidth selectors in the literature.
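For concreteness, a basic circular kernel density estimator with a von Mises kernel (one member of the class considered above) can be written in a few lines; the concentration parameter kappa acts as an inverse bandwidth, and the paper's plug-in selector is not reproduced here.

```python
import numpy as np
from scipy.special import i0       # modified Bessel function I_0

def vonmises_kde(grid, data, kappa=20.0):
    """Circular kernel density estimate with a von Mises kernel:

        f_hat(t) = (1/n) sum_i exp(kappa * cos(t - x_i)) / (2*pi*I0(kappa)).
    """
    d = grid[:, None] - data[None, :]
    return (np.exp(kappa * np.cos(d)) / (2 * np.pi * i0(kappa))).mean(axis=1)

rng = np.random.default_rng(1)
data = rng.vonmises(0.0, 4.0, 1000)          # circular sample, mode at 0
grid = np.linspace(-np.pi, np.pi, 361)
f = vonmises_kde(grid, data)
# the estimate integrates to ~1 over one full turn (trapezoid rule)
area = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(grid))
```

Unlike linear kernels, the von Mises kernel is periodic by construction, so no boundary correction at ±π is needed; bandwidth selection amounts to choosing kappa.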

3 citations


Journal ArticleDOI
TL;DR: In this article, a conditional survival function estimator for censored data is studied based on a double smoothing technique: both the covariate and the variable of interest (usually, the time) are smoothed.
Abstract: In this paper, a conditional survival function estimator for censored data is studied. It is based on a double smoothing technique: both the covariate and the variable of interest (usually, the time) are smoothed. Asymptotic expressions for the bias and the variance, as well as the asymptotic normality, of the smoothed survival estimator derived from Beran's estimator are obtained. A simulation study shows the performance of the smoothed Beran estimator of the conditional survival function and compares it with the estimator that smooths only the covariate. The influence of the two smoothing parameters involved in both estimators is also studied.
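A sketch of Beran's estimator with smoothing only in the covariate, the baseline against which the doubly smoothed estimator above is compared; the Gaussian kernel and all names are our illustrative choices.

```python
import numpy as np

def beran_survival(t, x0, T, delta, X, h=0.2):
    """Beran's estimator of the conditional survival function S(t | x0).

    A kernel-weighted Kaplan-Meier: observation i gets weight
    w_i proportional to K_h(X_i - x0); delta_i = 1 marks an uncensored time.
    """
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)
    w = w / w.sum()
    order = np.argsort(T)
    T, delta, w = T[order], delta[order], w[order]
    at_risk = np.cumsum(w[::-1])[::-1]           # weight still at risk at T_i
    factors = np.where((T <= t) & (delta == 1), 1.0 - w / at_risk, 1.0)
    return np.prod(factors)

rng = np.random.default_rng(1)
n = 4000
X = rng.uniform(0, 1, n)
T = rng.exponential(1.0, n)        # survival times, independent of X here
delta = np.ones(n, dtype=int)      # no censoring: reduces to a weighted KM
s = beran_survival(1.0, 0.5, T, delta, X)   # true S(1) = exp(-1) ~ 0.368
```

The double smoothing studied in the paper additionally replaces the step-function jumps in t with a kernel-smoothed version, which is what introduces the second bandwidth.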

3 citations


Journal ArticleDOI
TL;DR: In this article, constrained quantile regression is used to address the problem of quantile hyperplane crossings, and the Wald test of homoskedasticity computed with and without accounting for quantile crossings is compared.
Abstract: Quantile crossings do not occur so infrequently as to be declared virtually nonexistent; instead, researchers often have to face the quantile hyperplanes intersections issue, particularly with small and moderate sample sizes. Quantile crossings are particularly disturbing when one considers the estimation of the sparsity function. This, in fact, has a prominent role in determining the asymptotic properties of estimators and in testing the homoskedasticity of residuals. The primary goal of this study is to show that constrained quantile regression can improve conjoint results. We introduce a new method to this end. Furthermore, we carry out a comparison between the Wald test of homoskedasticity, computed by both neglecting and including quantile crossings. Real and simulated data illustrate the finite-sample performance of both versions of the test. Our experiments support the insight that considering monotonicity constraints is relatively rewarding when heteroskedasticity has to be accurately diagnosed.
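Monotone rearrangement (Chernozhukov, Fernández-Val and Galichon) is one simple post-hoc alternative to the constrained estimation studied above; the sketch below is ours and only illustrates the crossing repair itself.

```python
import numpy as np

def rearrange_quantiles(Q):
    """Remove quantile crossings by monotone rearrangement.

    Q has shape (n_points, n_levels): row i holds the estimated quantiles
    at increasing levels for the i-th covariate value; sorting each row
    restores monotonicity across levels.
    """
    return np.sort(Q, axis=1)

# estimated 25%- and 75%-quantiles at three design points; row 0 crosses
Q = np.array([[0.5, 0.3],
              [0.1, 0.4],
              [0.2, 0.2]])
Q_fixed = rearrange_quantiles(Q)     # row 0 becomes [0.3, 0.5]
```

Rearrangement fixes crossings pointwise after estimation, whereas constrained quantile regression imposes non-crossing during estimation, which is what matters when the sparsity function must be estimated from the fitted quantiles.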

2 citations


Journal ArticleDOI
TL;DR: In this article, a goodness-of-fit test for one-parameter count distributions with finite second moment is proposed, where the test statistic is derived from the distance between the probability generating function of the model under the null hypothesis and that of the random variable actually generating data, when the latter belongs to a suitable wide class of alternatives.
Abstract: A goodness-of-fit test for one-parameter count distributions with finite second moment is proposed. The test statistic is derived from the $L_1$-distance between a function of the probability generating function of the model under the null hypothesis and that of the random variable actually generating the data, when the latter belongs to a suitably wide class of alternatives. The test statistic has a rather simple form and is asymptotically normally distributed under the null hypothesis, allowing a straightforward implementation of the test. Moreover, the test is consistent not only for alternative distributions belonging to the class, but also for all alternative distributions whose probability of zero differs from that under the null hypothesis. Thus, the use of the test is proposed and investigated also for alternatives outside the class. The finite-sample properties of the test are assessed by means of an extensive simulation study.
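A stripped-down version of the idea for a Poisson null: compare the empirical probability generating function with the fitted one on [0, 1]. The paper's actual statistic applies a further transformation and has a normal limit; this sketch only computes an L1-type distance.

```python
import numpy as np

def pgf_distance_poisson(x, grid_size=201):
    """L1-type distance between the empirical PGF and the fitted Poisson PGF.

    g_n(u) = mean(u**X) is compared with g_lam(u) = exp(lam*(u - 1)) on a
    grid of u in [0, 1]; large distances speak against the Poisson null.
    """
    lam = x.mean()                                       # Poisson MLE
    u = np.linspace(0.0, 1.0, grid_size)
    g_emp = np.mean(u[:, None] ** x[None, :], axis=1)    # empirical PGF
    g_null = np.exp(lam * (u - 1.0))
    diff = np.abs(g_emp - g_null)
    return np.sum(0.5 * (diff[1:] + diff[:-1]) * np.diff(u))  # trapezoid rule

rng = np.random.default_rng(2)
d_pois = pgf_distance_poisson(rng.poisson(2.0, 2000))              # null true
d_geom = pgf_distance_poisson(rng.geometric(1.0 / 3.0, 2000) - 1)  # mean 2, not Poisson
```

Working with the PGF rather than the probabilities themselves is what keeps the statistic simple: the empirical PGF is just a sample mean at each u, so its fluctuations are easy to control.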

2 citations


Journal ArticleDOI
TL;DR: In this paper, principal asymmetric least squares (PALS) is introduced as a novel method for sufficient dimension reduction with heteroscedastic error; it addresses the limitations of classical methods by synthesising different expectile levels.
Abstract: Principal asymmetric least squares (PALS) is introduced as a novel method for sufficient dimension reduction with heteroscedastic error. Classical methods such as MAVE [Xia et al. (2002), ‘An Adaptive Estimation of Dimension Reduction Space’ (with discussion), Journal of the Royal Statistical Society Series B, 64, 363–410] and PSVM [Li et al. (2011), ‘Principal Support Vector Machines for Linear and Nonlinear Sufficient Dimension Reduction’, The Annals of Statistics, 39, 3182–3210] may not perform well in the presence of heteroscedasticity, while the new proposal addresses this limitation by synthesising different expectile levels. Through extensive numerical studies, we demonstrate the superior performance of PALS in terms of estimation accuracy over classical methods including MAVE and PSVM. For the asymptotic analysis of PALS, we develop new tools to compute the derivative of an expectation of a non-Lipschitz function.

2 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide kernel estimators of the main characteristics of a continuous-time semi-Markov process, such as the conditional and unconditional sojourn times in a state and the semi-Markov kernel.
Abstract: This paper provides kernel estimators of the main characteristics of a continuous-time semi-Markov process, such as the conditional and unconditional sojourn time distributions in a state and the semi-Markov kernel. The main goal of this paper is to establish asymptotic properties of the semi-Markov kernel estimators and of the sojourn time distribution estimators (conditional and unconditional), as well as of the estimators of the associated Radon-Nikodym derivatives, when the sample size becomes large. The approach is illustrated by considering a three-state example with detailed calculations and numerical evaluations.

2 citations


Journal ArticleDOI
TL;DR: In this paper, a nonparametric model of copula density function is proposed, which offers the following advantages: (i) it is valid for mixed random vectors; (ii) it yields a bona fide density estimate with interpretable parameters; and (iii) it plays a unifying role in our understanding of a large class of statistical methods for mixed data.
Abstract: A new nonparametric model of maximum-entropy (MaxEnt) copula density function is proposed, which offers the following advantages: (i) it is valid for mixed random vectors, where by 'mixed' we mean any combination of discrete and continuous variables, handled in a fully automated manner; (ii) it yields a bona fide density estimate with interpretable parameters, where by 'bona fide' we mean the estimate is guaranteed to be non-negative and to integrate to 1; and (iii) it plays a unifying role in our understanding of a large class of statistical methods for mixed (X,Y). Our approach utilises the modern machinery of nonparametric statistics to represent and approximate the log-copula density function via the LP-Fourier transform. Several real-data examples are also provided to explore the key theoretical and practical implications of the theory.

1 citation


Journal ArticleDOI
TL;DR: In this article, a nonparametric version of the integer-valued GARCH(1,1) model for time series of counts is considered, in which the link function is not parametrically specified, and a least squares estimator for this function is proposed.
Abstract: We consider a nonparametric version of the integer-valued GARCH(1,1) model for time series of counts. The link function in the recursion for the variances is not specified by finite-dimensional parameters. Instead we impose nonparametric smoothness conditions. We propose a least squares estimator for this function and show that it is consistent with a rate that we conjecture to be nearly optimal.
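To fix ideas, the model can be simulated with an arbitrary link f; the linear choice below recovers the classical INGARCH(1,1), while the paper leaves f unspecified up to smoothness. Function names and parameter values are our illustrative choices.

```python
import numpy as np

def simulate_ingarch(n, f, lam0=1.0, seed=0):
    """Simulate an INGARCH(1,1)-type count series with link function f:

        X_t | past ~ Poisson(lam_t),   lam_t = f(X_{t-1}, lam_{t-1}).

    The linear link recovers the classical integer-valued GARCH(1,1); the
    nonparametric version estimates f by least squares under smoothness.
    """
    rng = np.random.default_rng(seed)
    lam = np.empty(n)
    x = np.empty(n, dtype=np.int64)
    lam[0] = lam0
    x[0] = rng.poisson(lam0)
    for t in range(1, n):
        lam[t] = f(x[t - 1], lam[t - 1])
        x[t] = rng.poisson(lam[t])
    return x, lam

# linear link with omega=0.5, alpha=beta=0.3: stationary mean 0.5/0.4 = 1.25
x, lam = simulate_ingarch(20000, lambda xp, lp: 0.5 + 0.3 * xp + 0.3 * lp)
```

The recursion makes the intensity lam_t a function of the entire past, which is why least squares estimation of f must contend with the latent, unobserved lam sequence.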

1 citation


Journal ArticleDOI
TL;DR: This paper shows, under some regularity and non-restrictive assumptions on the associated kernel, that the normalising random variable converges in mean square to 1, and derives the consistency and the asymptotic normality of the proposed estimator.
Abstract: Discrete kernel smoothing is now gaining importance in nonparametric statistics. In this paper, we investigate some asymptotic properties of the normalised discrete associated-kernel estimator of a probability mass function and make comparisons. We show, under some regularity and non-restrictive assumptions on the associated kernel, that the normalising random variable converges in mean square to 1. We then derive the consistency and the asymptotic normality of the proposed estimator. Various families of discrete kernels already exhibited in the literature satisfy these conditions, including the refined CoM-Poisson kernel, which is underdispersed and of second order. Finally, the first-order binomial kernel is discussed and, surprisingly, its normalised estimator shows suitable asymptotic behaviour in simulations.
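A sketch of the normalised estimator with the binomial kernel mentioned above: the target point x uses a Binomial(x+1, (x+h)/(x+1)) kernel evaluated at the observations, and C_n below is the random normalising variable whose mean-square convergence to 1 the paper studies. Names and parameter values are illustrative.

```python
import numpy as np
from scipy.stats import binom

def binomial_kernel_pmf(support, data, h=0.1):
    """Normalised discrete associated-kernel estimator of a pmf.

    At each target x, the binomial kernel Binomial(x+1, (x+h)/(x+1)) is
    evaluated at the observations; dividing by C_n = sum_x f_tilde(x)
    yields the normalised estimator.
    """
    f_tilde = np.array([
        binom.pmf(data, x + 1, (x + h) / (x + 1)).mean() for x in support
    ])
    c_n = f_tilde.sum()              # normalising random variable C_n
    return f_tilde / c_n, c_n

rng = np.random.default_rng(3)
data = rng.poisson(2.0, 3000)
pmf_hat, c_n = binomial_kernel_pmf(np.arange(16), data, h=0.1)
# Poisson(2) has pmf ~0.2707 at x = 2; C_n should be close to 1
```

Unlike continuous kernels, the raw discrete associated-kernel estimate need not sum to one over the support, which is precisely why the behaviour of C_n matters.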

Journal ArticleDOI
TL;DR: In this paper, the authors propose a new risk ratio to describe the relation between the components of a bivariate random vector, defined as the ratio of two conditional hazard rate functions.
Abstract: Inspired by the cross-ratio proposed by Clayton, we study a new risk ratio to describe the relation between the components of a bivariate random vector. It is a ratio of conditional hazard rate functions of the components. A nonparametric estimator is proposed and its asymptotic distribution is obtained using Bernstein smoothing for the survival copula and its derivatives. The finite sample performance of the estimator is studied via simulations. The practical use of the risk ratio is illustrated in two real datasets, one on food expenditure and net income and one on the relation between maximum heart rate and age, for patients suffering from heart disease versus control patients (no heart disease). Extensions of the proposed risk ratio are given in the discussion section.

Journal ArticleDOI
TL;DR: In this article, the authors propose nonparametric estimators under stationary and nonstationary processes for time-dependent functional data, where the information contained in the covariances at different lags is used to obtain estimators that account for the time dependence.
Abstract: Data can be assumed to be continuous functions defined on an infinite-dimensional space for many phenomena. However, the infinite-dimensional data might be driven by a small number of latent variables. Hence, factor models are relevant for functional data. In this paper, we study functional factor models for time-dependent functional data. We propose nonparametric estimators under stationary and nonstationary processes. We obtain estimators that consider the time-dependence property. Specifically, we use the information contained in the covariances at different lags. We show that the proposed estimators are consistent. Through Monte Carlo simulations, we find that our methodology outperforms estimators based on functional principal components. We also apply our methodology to monthly yield curves. In general, the suitable integration of time-dependent information improves the estimation of the latent factors.

Journal ArticleDOI
TL;DR: In this paper, two-step maximum likelihood estimation (MLE) and Bayesian Information Criterion (BIC) order selection are examined for ARMA time series with a slowly varying trend.
Abstract: Maximum likelihood estimator (MLE) and Bayesian Information Criterion (BIC) order selection are examined for ARMA time series with slowly varying trend to validate the well-known detrending technique of moving average [Section 1.4, Brockwell, P.J., and Davis, R.A. (1991), Time Series: Theory and Methods, New York: Springer-Verlag]. In step one, a moving average equivalent to local linear regression is fitted to the raw data with a data-driven lag number, and subtracted from the raw data to produce a sequence of residuals. The residuals are used in step two as substitutes for the latent ARMA series in the MLE and BIC procedures. It is shown that with a second-order smooth trend and a correctly chosen lag number, the two-step MLE is oracally efficient, i.e., it is asymptotically as efficient as the would-be MLE based on the unobserved ARMA series. At the same time, the two-step BIC consistently selects the orders as well. Simulation experiments corroborate the theoretical findings.
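A hypothetical numerical sketch of the two-step scheme with AR(1) errors: step one detrends with a plain centred moving average (standing in for the local linear version), and step two uses Yule-Walker as a simple stand-in for the ARMA MLE analysed in the paper.

```python
import numpy as np

def two_step_ar1(y, q):
    """Two-step estimation for AR(1) noise around a slowly varying trend.

    Step 1: detrend with a centred moving average of lag q (Brockwell &
    Davis, Section 1.4).  Step 2: estimate the AR(1) coefficient from the
    residuals by Yule-Walker.
    """
    kernel = np.ones(2 * q + 1) / (2 * q + 1)
    trend_hat = np.convolve(y, kernel, mode='valid')   # centred MA
    resid = y[q:-q] - trend_hat
    r = resid - resid.mean()
    phi_hat = np.sum(r[1:] * r[:-1]) / np.sum(r[:-1] ** 2)
    return phi_hat, trend_hat

rng = np.random.default_rng(7)
n, phi = 5000, 0.6
trend = 2.0 * np.sin(2 * np.pi * np.arange(n) / n)     # slowly varying
e = np.empty(n)
e[0] = 0.0
for i in range(1, n):
    e[i] = phi * e[i - 1] + rng.standard_normal()
phi_hat, _ = two_step_ar1(trend + e, q=50)             # recovers phi ~ 0.6
```

The lag q must be large enough to average out the noise but small enough that the trend is nearly linear within each window; the paper's point is that with a data-driven q the second-step estimator loses no asymptotic efficiency.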

Journal ArticleDOI
TL;DR: In this article, the authors derive the uniform law of large numbers (ULLN) over a class of functions via domination conditions on random covering numbers and covering integrals, and also derive the asymptotic covariance matrix for the bivariate vector of Bernstein estimators.
Abstract: In this paper, we consider the Bernstein polynomial of the empirical distribution function under a triangular sample, which we denote by . For the recentered and normalised statistic , where x is defined on the interval , the stochastic convergence to a Brownian bridge is derived. The main technicality in proving the normality is drawn off into a stochastic equicontinuity condition. To obtain the equicontinuity, we derive the uniform law of large numbers (ULLN) over a class of functions by domination conditions of random covering numbers and covering integrals. In addition, we also derive the asymptotic covariance matrix for the bivariate vector of Bernstein estimators. Finally, numerical simulations are presented to verify the validity of our main results.

Journal ArticleDOI
TL;DR: In this article, the authors construct several new multivariate goodness-of-fit (GoF) tests based on existing univariate GoF tests, which are distribution-free under the null hypothesis.
Abstract: Using notions of depth functions in the multivariate setting, we have constructed several new multivariate goodness of fit (GoF) tests based on existing univariate GoF tests. Since the exact computation of depth is difficult, depth is estimated based on a large random sample drawn from the null distribution. It has been shown that test statistics based on estimated depth are close to those based on the true depth. Some two-sample tests based on data depth are also discussed for scale differences. These tests are distribution-free under the null hypothesis. Finite sample properties of the proposed tests are studied using several numerical examples. A real-data example is discussed to illustrate the usefulness of the proposed GoF tests.
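As a toy version of the approach, a depth can be estimated from a large null sample and the multivariate GoF problem reduced to a univariate two-sample test on depth values; Mahalanobis depth and the Kolmogorov-Smirnov test below are our stand-ins for the depth functions and univariate GoF tests considered in the paper.

```python
import numpy as np
from scipy.stats import ks_2samp

def depth_gof_test(data, null_sample):
    """Depth-based multivariate GoF test via a univariate two-sample KS test.

    Mahalanobis depth D(z) = 1 / (1 + (z-m)' S^{-1} (z-m)) is estimated
    from a large sample drawn under the null; under H0 the depths of the
    data and of the null draws are identically distributed.
    """
    m = null_sample.mean(axis=0)
    s_inv = np.linalg.inv(np.cov(null_sample, rowvar=False))
    def depth(z):
        d = z - m
        return 1.0 / (1.0 + np.einsum('ij,jk,ik->i', d, s_inv, d))
    return ks_2samp(depth(data), depth(null_sample))

rng = np.random.default_rng(2)
null = rng.standard_normal((5000, 3))          # large null sample
good = rng.standard_normal((300, 3))           # H0 true
bad = 2.0 * rng.standard_normal((300, 3))      # scale alternative
p_good = depth_gof_test(good, null).pvalue
p_bad = depth_gof_test(bad, null).pvalue
```

Because depth values are univariate, any classical distribution-free test can be applied to them, which is the mechanism behind the distribution-freeness claimed above.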

Journal ArticleDOI
TL;DR: In this paper, a U-statistic-based nonparametric test of the null hypothesis that the treatment effects are identical in different subgroups is proposed to determine whether there is treatment effect heterogeneity across different subpopulations.
Abstract: Many studies include a goal of determining whether there is treatment effect heterogeneity across different subpopulations. In this paper, we propose a U-statistic-based nonparametric test of the null hypothesis that the treatment effects are identical in different subgroups. The proposed test provides more power than the standard parametric test when the underlying distribution assumptions of the latter are violated. We apply the method to data from an economic study of programme effectiveness and find that there is treatment effect heterogeneity in different subpopulations.

Journal ArticleDOI
TL;DR: The interpoint distance outlier test (IDOT) is compared with five competing methods under four distributions, and shows the best performance for outlier detection in terms of the average number of outliers detected and the probability of correct identification.
Abstract: Based on the ordered values of the total dissimilarity of each observation from all the others, we present a nonparametric method for detection of high dimensional outliers. We provide algorithms to obtain the distribution of the test statistic based on the percentile bootstrap and offer an outlier visualisation plot as a nonparametric graphical tool for detecting outliers in a data set. We compare the interpoint distance outlier test (IDOT) with five competing methods under four distributions, and using a real data set. IDOT shows the best performance for outlier detection in terms of the average number of the outliers detected and the probability of the correct identification.
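The core ranking step, ordering observations by their total dissimilarity from all others, is easy to sketch; the percentile-bootstrap calibration and the outlier visualisation plot from the paper are omitted here.

```python
import numpy as np
from scipy.spatial.distance import cdist

def total_dissimilarity_order(X):
    """Order observations by total Euclidean distance to all other points.

    The largest totals are the outlier candidates; a cut-off for them can
    then be calibrated, e.g. by a percentile bootstrap.
    """
    totals = cdist(X, X).sum(axis=1)
    return np.argsort(totals)[::-1], totals      # most outlying first

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 10))
X[0] += 8.0                                      # plant one clear outlier
order, totals = total_dissimilarity_order(X)
```

Working with interpoint distances rather than coordinates is what lets the method scale to high-dimensional data without any distributional model.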

Journal ArticleDOI
TL;DR: In this paper, minimum average variance estimation (MAVE) based on local modal regression is proposed for partial linear single-index models, which can be robust to different error distributions or outliers.
Abstract: In this article, minimum average variance estimation (MAVE) based on local modal regression is proposed for partial linear single-index models, which can be robust to different error distributions or outliers. Asymptotic distributions of the proposed estimators are derived, which have the same convergence rate as the original MAVE based on least squares. A modal EM algorithm is provided to implement our robust estimation. Both simulation studies and a real data example are used to evaluate the finite sample performance of the proposed estimation procedure.

Journal ArticleDOI
TL;DR: In this article, a new test of axial symmetry based on integrated rank scores for directional quantile regression is proposed, which outperforms existing competitors in terms of size, power, robustness, moment conditions or computational feasibility.
Abstract: The article addresses the recently emerging inferential problem of testing axial symmetry up to a shift, which is useful even for testing certain hypotheses of exchangeability, independence, goodness-of-fit or equality of scale. In particular, it introduces a new test of axial symmetry based on integrated rank scores for directional quantile regression. The test outperforms existing competitors in terms of size, power, robustness, moment conditions or computational feasibility. All that is illustrated with a series of simulated examples.

Journal ArticleDOI
TL;DR: In this paper, a general additive-multiplicative hazards (AMH) regression model is proposed to estimate restricted mean treatment effects, where the survival time is subject to both dependent and independent censoring.
Abstract: The difference in restricted mean survival times between two groups is often of inherent interest in epidemiologic and medical studies. In this paper, we propose a general additive-multiplicative hazards (AMH) regression model to estimate restricted mean treatment effects, where the survival time is subject to both dependent and independent censoring. The AMH model specifies an additive and multiplicative form on the hazard functions for the survival and censoring times associated with covariates, and contains the proportional hazards model and the additive hazards model. Using an inverse probability of censoring weighting scheme, we obtain estimators of the regression parameters and the restricted mean treatment effects. We establish the large sample properties of the proposed estimators. Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedures, and data on primary biliary cirrhosis patients are analyzed for illustration.

Journal ArticleDOI
TL;DR: An introduction to this Special Issue on Data Science for COVID-19 is included in this paper, which contains a general overview of methods and applications of nonparametric inference and other flexible data science methods for the COVID-19 pandemic.
Abstract: An introduction to this Special Issue on Data Science for COVID-19 is included in this paper. It contains a general overview of methods and applications of nonparametric inference and other flexible data science methods for the COVID-19 pandemic. Specifically, some methods existing before the COVID-19 outbreak are surveyed, followed by an account of survival analysis methods for COVID-related times. Then, several nonparametric tools for the estimation of certain COVID rates are reviewed, along with the forecasting of the most relevant count series and some other related problems. Within this setup, the papers published in this special issue are briefly commented on in this introductory article.

Journal ArticleDOI
TL;DR: This paper proposes a novel procedure to combine dependent tests based on the notion of data depth that can automatically incorporate the underlying dependency among the tests, and is nonparametric and completely data-driven.
Abstract: Combining multiple tests has many real world applications. However, most existing methods fail to directly take into account the underlying dependency among the tests. In this paper, we propose a novel procedure to combine dependent tests based on the notion of data depth. The proposed method can automatically incorporate the underlying dependency among the tests, and is nonparametric and completely data-driven. To demonstrate its application, we apply the proposed combining method to develop a new two-sample test for data of arbitrary types, provided the data take values in a metrizable space and their information can be characterised by interpoint distances. Our simulation studies and real data analysis show that the proposed test based on the new combining method performs well across a broad range of settings and compares favourably with existing tests.

Journal ArticleDOI
TL;DR: In this article, a single-index model with covariate measurement errors is proposed, in which both the conditional mean and conditional variance functions of the response given the covariates have a single-index structure.
Abstract: In this paper, we propose a novel estimation for heteroscedastic single-index models with covariate measurement errors, in which both the conditional mean and conditional variance functions of the response given the covariates have a single-index structure. When covariates are directly observable, we show that the index parameter vector can be estimated consistently by the estimation obtained by fitting a misspecified linear quantile regression model under some mild regularity conditions. It is well known that naively treating mismeasured covariates as error-free usually leads to inconsistent estimators. To account for measurement errors in covariates, we establish a new estimation procedure based on corrected quantile loss function, and obtain the asymptotic consistency and normality of the resulting estimators. Finally, the finite sample performance of the proposed estimation method is illustrated by simulation studies and an empirical analysis of a real dataset.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the nonparametric estimation of the error density in linear regression with right-censored data and establish a point-wise law of the iterated logarithm for the kernel-type error density estimator.
Abstract: We consider the strong consistency of the nonparametric estimation of the error density in linear regression with right-censored data. The estimator is defined to be the kernel-smoothed estimator of the error density, which makes use of the Kaplan-Meier estimator of the error distribution. We establish a point-wise law of the iterated logarithm for the kernel-type error density estimator in censored linear regression.

Journal ArticleDOI
TL;DR: In this paper, a nonparametric model using a sequence of Bernstein polynomials is constructed to approximate arbitrary isotropic covariance functions valid in and related approximation properties are investigated using the popular norm and norms.
Abstract: A nonparametric model using a sequence of Bernstein polynomials is constructed to approximate arbitrary isotropic covariance functions valid in and related approximation properties are investigated using the popular norm and norms. A computationally efficient sieve maximum likelihood (sML) estimation is then developed to nonparametrically estimate the unknown isotropic covariance function valid in . Consistency of the proposed sieve ML estimator is established under the increasing domain regime. The proposed methodology is compared numerically with a couple of existing nonparametric methods as well as with commonly used parametric methods. Numerical results based on simulated data show that our approach outperforms the parametric methods in reducing bias due to model misspecification, and also the nonparametric methods in terms of having significantly lower values of the expected and norms. An application to precipitation data is illustrated as a real case study. Additional technical details and numerical illustrations are also made available.

Journal ArticleDOI
TL;DR: In this paper, a pairwise pseudo-likelihood method is proposed to recover some missing information in the conditional method; the resulting estimators are proved to be consistent, asymptotically normal and efficient.
Abstract: Semi-parametric transformation models provide a general and flexible class of models for regression analysis of failure time data and many methods have been developed for their estimation. In particular, they include the proportional hazards and proportional odds models as special cases. In this paper, we discuss the situation where one observes left-truncated and interval-censored data, for which it does not seem to exist an established method. For the problem, in contrast to the commonly used conditional approach that may not be efficient, a pairwise pseudo-likelihood method is proposed to recover some missing information in the conditional method. The proposed estimators are proved to be consistent and asymptotically efficient and normal. A simulation study is conducted to assess the empirical performance of the method and suggests that it works well in practical situations. This method is illustrated by using a set of real data arising from an HIV/AIDS cohort study.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the estimation of the covariance of a stationary Gaussian process on a multi-dimensional grid from observations taken on a general acquisition domain and derive spectral-norm risk rates for multi-taper estimators.
Abstract: We consider the estimation of the covariance of a stationary Gaussian process on a multi-dimensional grid from observations taken on a general acquisition domain. We derive spectral-norm risk rates for multi-taper estimators. When applied to one-dimensional acquisition intervals, these show that Thomson's classical multi-taper has optimal risk rates, as they match known benchmarks. We also extend existing lower risk bounds to multi-dimensional grids and conclude that multi-taper estimators associated with certain two-dimensional acquisition domains also have almost optimal risk rates.
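For the one-dimensional case, Thomson's estimator averages periodograms computed with discrete prolate spheroidal (Slepian) tapers; a minimal sketch using SciPy's DPSS routine follows, with illustrative parameter values.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_psd(x, NW=4.0, K=7):
    """Thomson's multitaper spectral estimate with K DPSS (Slepian) tapers.

    Averaging K nearly orthogonal tapered periodograms trades a little
    resolution for roughly a K-fold variance reduction; the unit-energy
    tapers make the estimate unbiased for unit-variance white noise.
    """
    tapers = dpss(len(x), NW, Kmax=K)              # shape (K, n)
    specs = np.abs(np.fft.rfft(tapers * x[None, :], axis=1)) ** 2
    return specs.mean(axis=0)

rng = np.random.default_rng(6)
x = rng.standard_normal(4096)      # white noise: flat spectrum at level 1
S = multitaper_psd(x)
```

The covariance estimator studied in the paper is obtained from such spectral estimates; the spectral-norm risk rates quantify how the taper choice and the acquisition domain affect the covariance error uniformly over frequencies.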

Journal ArticleDOI
TL;DR: In this paper, the authors proposed consistent and efficient robust different time-scales estimators to mitigate the heavy-tail effect of high-frequency financial data, which are based on minimising the Huber loss function with a suitable threshold.
Abstract: We propose consistent and efficient robust different time-scales estimators to mitigate the heavy-tail effect of high-frequency financial data. Our estimators are based on minimising the Huber loss function with a suitable threshold. We show these estimators are guaranteed to be robust to measurement noise of certain types and to jumps. With only finite fourth moments of the observed log-price data, we develop the sub-Gaussian concentration of our estimators around the volatility. We conduct simulation studies to show the finite sample performance of the proposed estimation methods. The simulation studies suggest that our methods are also robust for financial data in the presence of jumps. Empirical studies demonstrate the practical relevance and advantages of our estimators.
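The Huber M-estimation step at the heart of such robust estimators can be sketched in isolation; the threshold c and the iteratively reweighted scheme below are our illustrative choices, not the paper's calibrated threshold.

```python
import numpy as np

def huber_mean(x, c=1.5, n_iter=100):
    """Huber M-estimator of location: solve sum_i psi_c(x_i - m) = 0 with
    psi_c(u) = clip(u, -c, c), by iteratively reweighted averaging.

    The bounded influence function is what caps the effect of heavy tails
    and jumps on the resulting estimate.
    """
    m = np.median(x)
    for _ in range(n_iter):
        u = np.abs(x - m)
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        m = np.sum(w * x) / np.sum(w)
    return m

rng = np.random.default_rng(5)
x = rng.normal(1.0, 1.0, 1000)
x[:20] += 50.0                       # gross outliers, e.g. jumps
m_huber = huber_mean(x)              # stays near the true mean 1.0
```

The sample mean is dragged well away from 1.0 by the contaminated observations, while the Huber estimate moves only slightly; choosing the threshold as a function of the sample size is what yields the sub-Gaussian concentration described above.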

Journal ArticleDOI
TL;DR: In this article, a varying-coefficient single-index regression model with responses missing at random is considered; rank-based estimators of the index coefficient and the functional coefficients are studied, and their asymptotic properties are established under mild conditions.
Abstract: A varying-coefficient single-index regression model with responses missing at random is considered. Rank-based estimators of the index coefficient and the functional coefficients are studied, and their asymptotic properties (consistency and asymptotic normality) are established under mild conditions. To demonstrate the performance of the proposed approach, Monte Carlo simulation experiments are carried out and show that the proposed approach provides robust and more efficient estimators compared to its least-squares counterpart. This is demonstrated under different model error structures, including the standard normal, the t, and the contaminated model error distributions. Finally, a real data example is given to illustrate our proposed method.