
Showing papers in "Journal of Nonparametric Statistics in 2002"


Journal ArticleDOI
TL;DR: In this article, two improved methods for conditional density estimation are proposed; the first is based on locally fitting a log-linear model, in the spirit of recent work on locally parametric estimation techniques.
Abstract: We suggest two improved methods for conditional density estimation. The first is based on locally fitting a log-linear model, and is in the spirit of recent work on locally parametric techniques in...

135 citations


Journal ArticleDOI
TL;DR: It is proved that all smoothers in question are nonlinear filters in a precise sense, their fixed points are characterized, and a Potts model is adopted for segmentation.
Abstract: We discuss the interplay between local M -smoothers, Bayes smoothers and some nonlinear filters for edge-preserving signal reconstruction. We prove that all smoothers in question are nonlinear filters in a precise sense and characterize their fixed points. Then a Potts model is adopted for segmentation. For 1-d signals, an exact algorithm for the computation of maximum posterior modes is derived and applied to a phantom and to 1-d fMRI-data.

82 citations


Journal ArticleDOI
TL;DR: In this paper, the authors construct canonical monitoring processes which under the hypothesis of no change converge in distribution to independent Brownian bridges, and use these to construct natural goodness-of-fit statistics.
Abstract: Suppose that a sequence of data points follows a distribution of a certain parametric form, but that one or more of the underlying parameters may change over time. This paper addresses various natural questions in such a framework. We construct canonical monitoring processes which under the hypothesis of no change converge in distribution to independent Brownian bridges, and use these to construct natural goodness-of-fit statistics. Weighted versions of these are also studied, and optimal weight functions are derived to give maximum local power against alternatives of interest. We also discuss how our results can be used to pinpoint where and what type of changes have occurred, in the event that initial screening tests indicate that such exist. Our unified large-sample methodology is quite general and applies to all regular parametric models, including regression, Markov chain and time series situations.

76 citations
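
As a toy illustration of the monitoring idea, the sketch below builds, for a one-parameter exponential model, the standardized cumulative score process evaluated at the full-sample MLE; under the no-change null this process behaves asymptotically like a Brownian bridge, so its sup-norm can be referred to the Kolmogorov distribution. The exponential model, the helper name `sup_monitoring_stat`, and the 5% critical value 1.358 are illustrative assumptions, not the paper's general canonical construction.

```python
import numpy as np

def sup_monitoring_stat(x):
    """Standardized cumulative score process at the MLE for an Exp(theta)
    model; under the no-change null it is approximately a Brownian bridge."""
    n = len(x)
    theta_hat = 1.0 / np.mean(x)                 # full-sample MLE of the rate
    score = 1.0 / theta_hat - x                  # per-observation score at MLE
    info = 1.0 / theta_hat ** 2                  # Fisher information per obs.
    w = np.cumsum(score) / np.sqrt(n * info)     # tied down to 0 at the end
    return np.max(np.abs(w))

rng = np.random.default_rng(0)
x = np.concatenate([rng.exponential(1.0, 200), rng.exponential(2.0, 200)])
stat = sup_monitoring_stat(x)
print(stat, stat > 1.358)   # 1.358 ~ 5% point of sup|Brownian bridge|
```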


Journal ArticleDOI
TL;DR: In this article, a semiparametric family of bivariate copulas is studied and the symmetry and dependence properties of these copulas are investigated, and bounds on different measures of association (such as Kendall's Tau, Spearman's Rho) for this family are provided.
Abstract: In this paper, we study a semiparametric family of bivariate copulas. The family is generated by a univariate function, determining the symmetry (radial symmetry, joint symmetry) and dependence property (quadrant dependence, total positivity, \ldots ) of the copulas. We provide bounds on different measures of association (such as Kendall's Tau, Spearman's Rho) for this family and several choices of generating functions for which these bounds can be reached.

72 citations
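
The abstract does not spell out the generating construction, so the sketch below assumes one common semiparametric form, C(u,v) = uv + theta*phi(u)*phi(v) with phi(0) = phi(1) = 0 (the FGM copula when phi(u) = u(1-u) and theta = 1), and evaluates Kendall's tau numerically from tau = 4 E[C(U,V)] - 1.

```python
import numpy as np
from scipy.integrate import dblquad

# Hypothetical generator-based family (the abstract does not give its form):
#   C(u, v) = u*v + theta * phi(u) * phi(v),  with phi(0) = phi(1) = 0.
# phi(u) = u*(1 - u) with theta = 1 recovers the FGM copula.
theta = 1.0
phi = lambda u: u * (1.0 - u)
dphi = lambda u: 1.0 - 2.0 * u

C = lambda u, v: u * v + theta * phi(u) * phi(v)   # the copula
c = lambda u, v: 1.0 + theta * dphi(u) * dphi(v)   # its density d2C/du dv

# Kendall's tau = 4*E[C(U,V)] - 1 with (U,V) distributed according to C.
val, _ = dblquad(lambda v, u: C(u, v) * c(u, v), 0.0, 1.0, 0.0, 1.0)
print(4.0 * val - 1.0)   # ~ 2/9 = 0.222..., the known FGM value at theta = 1
```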


Journal ArticleDOI
TL;DR: It is shown that, provided discontinuities can be detected and located with sufficient accuracy, detection followed by wavelet smoothing enjoys optimal rates of convergence.
Abstract: The objective of this paper is to contribute to the methodology available for dealing with the detection and the estimation of the location of discontinuities in one-dimensional piecewise smooth regression functions observed in white Gaussian noise over an interval. Our approach is nonparametric in nature because the unknown function is not assumed to have any specific form. Our method relies upon a wavelet analysis of the observed signal and belongs to the class of "indirect" methods, where one detects and locates the change points prior to fitting the curve, and then uses one's favorite function estimation technique on each segment to recover the curve. We show that, provided discontinuities can be detected and located with sufficient accuracy, detection followed by wavelet smoothing enjoys optimal rates of convergence.

60 citations
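
A minimal sketch of the indirect strategy using PyWavelets: locate a jump from the largest finest-scale wavelet coefficient, then wavelet-denoise each segment separately. The single-jump assumption, the db4 wavelet, and the universal threshold are illustrative choices, not the paper's detection rule.

```python
import numpy as np
import pywt  # PyWavelets

def detect_then_smooth(y, wavelet="db4"):
    d1 = pywt.wavedec(y, wavelet, level=1)[-1]     # finest detail coefficients
    tau = 2 * int(np.argmax(np.abs(d1)))           # crude single-jump location
    sigma = np.median(np.abs(d1)) / 0.6745         # noise scale via the MAD

    def denoise(seg):                              # standard wavelet shrinkage
        coeffs = pywt.wavedec(seg, wavelet)
        thr = sigma * np.sqrt(2.0 * np.log(max(len(seg), 2)))
        coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
        return pywt.waverec(coeffs, wavelet)[: len(seg)]

    return tau, np.concatenate([denoise(y[:tau]), denoise(y[tau:])])

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 512)
y = np.sin(2 * np.pi * t) + 2.0 * (t > 0.5) + rng.normal(0.0, 0.3, 512)
tau, fit = detect_then_smooth(y)
print(tau)   # should land near 256
```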


Journal ArticleDOI
TL;DR: In this article, the over-dispersion that often appears when rare events are modelled by a Poisson distribution, indicating some unexplained discontinuity in the data, is modelled by a hierarchical Bayesian Poisson mixture estimated by MCMC, including reversible jump moves that vary the number of components.
Abstract: The modelling of rare events via a Poisson distribution sometimes reveals substantial over-dispersion, indicating that some unexplained discontinuity arises in the data. We suggest modelling this over-dispersion by a Poisson mixture. In a hierarchical Bayesian model, the posterior distributions of the unknown quantities in the mixture (number of components, weights, and Poisson parameters) will be estimated by MCMC algorithms, including reversible jump algorithms which permit varying the dimension of the mixture. We will focus on the difficulty of finding a weakly informative prior for the Poisson parameters: different priors will be detailed and compared. Then, the performance of the different moves created for changing dimension will be investigated. The model is extended by the introduction of covariates, with homogeneous or heterogeneous effect. Simulated data sets will be designed for the different comparisons, and the model will finally be illustrated on real data.

54 citations
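
For a fixed number of components the conjugate part of this model is easy to sample; the sketch below runs a plain Gibbs sampler for a K-component Poisson mixture with Gamma(a, b) priors on the rates and a uniform Dirichlet prior on the weights. The reversible jump moves that let K vary, and the paper's prior comparisons, are beyond this sketch; label switching is also ignored.

```python
import numpy as np

def gibbs_poisson_mixture(x, K=2, iters=2000, a=1.0, b=1.0, seed=0):
    """Fixed-K Gibbs sampler for a Poisson mixture (a sketch only; the paper
    moves between different K's with reversible jump MCMC)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    lam = rng.gamma(a, 1.0 / b, K)
    w = np.full(K, 1.0 / K)
    for _ in range(iters):
        # allocate each observation to a component (log-pmf up to x! constant)
        logp = np.log(w) + x[:, None] * np.log(lam) - lam
        p = np.exp(logp - logp.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        z = (p.cumsum(axis=1) > rng.random((n, 1))).argmax(axis=1)
        # conjugate Gamma/Dirichlet updates
        counts = np.bincount(z, minlength=K)
        sums = np.bincount(z, weights=x, minlength=K)
        lam = rng.gamma(a + sums, 1.0 / (b + counts))
        w = rng.dirichlet(1.0 + counts)
    return lam, w

rng = np.random.default_rng(2)
x = np.concatenate([rng.poisson(1.0, 300), rng.poisson(8.0, 100)]).astype(float)
print(gibbs_poisson_mixture(x, K=2))
```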


Journal ArticleDOI
TL;DR: In this paper, the bandwidth is assigned a prior distribution in the neighborhood around the point at which the density is being estimated, and the mean of the posterior distribution is used to select the local bandwidth.
Abstract: In data driven bandwidth selection procedures for density estimation such as least squares cross validation and biased cross validation, the choice of a single global bandwidth is too restrictive. It is however reasonable to assume that the bandwidth has a distribution of its own and that locally, depending on the data, the bandwidth may differ. In this approach, the bandwidth is assigned a prior distribution in the neighborhood around the point at which the density is being estimated. Assuming that the kernel function is a proper probability distribution, a Bayesian approach is employed to come up with a posterior type distribution of the bandwidth given the data. Finally, the mean of the posterior distribution is used to select the local bandwidth.

43 citations
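
A minimal sketch of this idea, under the assumption that the kernel density estimate at the point x0 acts as the likelihood of the bandwidth, so that pi(h | x0, data) is proportional to prior(h) * f_hat_h(x0); the Gaussian kernel, the grid over h, and the exponential prior are illustrative choices rather than the paper's specification.

```python
import numpy as np

def local_bayes_bandwidth(x0, data, h_grid, prior):
    """Posterior-mean local bandwidth at x0, treating the kernel density
    estimate f_hat_h(x0) as the likelihood of h (an assumed reading)."""
    diffs = (x0 - data[None, :]) / h_grid[:, None]
    fhat = np.exp(-0.5 * diffs**2).mean(axis=1) / (h_grid * np.sqrt(2 * np.pi))
    post = prior(h_grid) * fhat
    post /= post.sum()
    return np.sum(h_grid * post)     # mean of the posterior over h

rng = np.random.default_rng(3)
data = rng.normal(0.0, 1.0, 400)
h_grid = np.linspace(0.05, 1.5, 200)
prior = lambda h: np.exp(-h)         # hypothetical exponential prior on h
print(local_bayes_bandwidth(0.0, data, h_grid, prior))
```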


Journal ArticleDOI
TL;DR: In this paper, a Bayesian nonparametric approach to the estimation of a system and its components' survival functions arising from observing the failure of a series system or a competing risk model is presented.
Abstract: This article presents a Bayesian nonparametric approach to the estimation of a system and its components' survival functions arising from observing the failure of a series system or a competing risk model. A Dirichlet multivariate process is used as a prior for the vector of the components' random subsurvival functions to derive the Bayes estimator of the survival function when the cause of failure belongs to a certain risk subset. This is done as follows. First, Peterson's formula is evaluated using the Bayes estimators of the subsurvival functions corresponding to the risk subset, to obtain a plugged-in nonparametric estimator of the survival function associated with the risk subset. Then, using the product-integration approach, it is proved that this nonparametric estimator is in fact the Bayes estimator of the survival function corresponding to the risk subset under quadratic loss function and the Dirichlet multivariate process. The weak convergence and the strong consistency of the estimator are established. The special case when the system has only two components corresponds to the well-studied randomly censored model.

39 citations
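
The plug-in step can be illustrated with empirical subsurvival functions in place of their Bayes estimates: evaluating product-limit factors at the failure times whose cause lies in the risk subset yields the classical nonparametric estimator that the Bayes estimator generalizes. A sketch assuming complete observation of (time, cause) pairs and no tied times.

```python
import numpy as np

def risk_subset_survival(times, causes, subset):
    """Product-limit (plug-in) estimate of the survival function attached to
    a risk subset, i.e. Peterson's formula evaluated at the empirical
    subsurvival functions. Assumes complete data and no ties."""
    order = np.argsort(times)
    times, causes = times[order], causes[order]
    n = len(times)
    at_risk = n - np.arange(n)                      # at risk just before t_(i)
    hit = np.isin(causes, list(subset))             # failure cause in subset?
    surv = np.cumprod(np.where(hit, 1.0 - 1.0 / at_risk, 1.0))
    return times, surv

rng = np.random.default_rng(4)
t1, t2 = rng.exponential(1.0, 300), rng.exponential(2.0, 300)
times = np.minimum(t1, t2)                          # series system failure time
causes = np.where(t1 <= t2, 1, 2)                   # which component failed
ts, surv = risk_subset_survival(times, causes, {1})
```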


Journal ArticleDOI
TL;DR: In this paper, several non-traditional bootstrap intervals based on Levene statistics are proposed and compared with traditional ones; adaptive procedures using the best bootstrap interval found in this study are also investigated.
Abstract: The problem of comparing the scales of two populations arises in a variety of areas. Robust confidence intervals and tests for (the logarithm of) the ratio of the two population standard deviations have been studied over the past four decades. Levene statistics and Fligner-Killeen rank statistics are two well-known competitors. Recently, Hall and Padmanabhan (1997) improved the rank-based Fligner-Killeen procedures using bootstrap and adaptation. This paper explores potential improvement of Levene type procedures by considering bootstrap and/or adaptation methods. Several non-traditional bootstrap intervals based on Levene statistics are proposed and compared with traditional ones. Adaptive procedures using the best bootstrap interval found in this study are also investigated. All the procedures based on Levene statistics are compared with those based on Fligner-Killeen rank statistics.

32 citations
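
One such interval can be sketched as a percentile bootstrap for log(sigma_1/sigma_2) built from Levene-type spread variables |obs - median|; this is an illustrative variant, not necessarily the interval the paper recommends.

```python
import numpy as np

def boot_ci_log_sd_ratio(x, y, B=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for log(sigma_x / sigma_y) based on
    Levene-type (median-centered) spread variables. A sketch only."""
    rng = np.random.default_rng(seed)
    dx, dy = np.abs(x - np.median(x)), np.abs(y - np.median(y))
    stats = np.empty(B)
    for b in range(B):
        sx = rng.choice(dx, size=len(dx), replace=True)
        sy = rng.choice(dy, size=len(dy), replace=True)
        stats[b] = np.log(sx.mean() / sy.mean())    # scale ratio on log scale
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(5)
x, y = rng.normal(0, 1, 50), rng.normal(0, 2, 50)
print(boot_ci_log_sd_ratio(x, y))   # should roughly cover log(1/2) = -0.69
```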


Journal ArticleDOI
Lutz Dümbgen
TL;DR: In this article, it is shown how to apply several linear rank test statistics simultaneously in order to test monotonicity of f in various regions and to identify its local extrema.
Abstract: Let Y_i = f(x_i) + E_i (1 \le i \le n) with given covariates x_1 < x_2 < \cdots < x_n , an unknown regression function f and independent random errors E_i with median zero. It is shown how to apply several linear rank test statistics simultaneously in order to test monotonicity of f in various regions and to identify its local extrema.

30 citations
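
A single building block of such a procedure can be sketched as a standardized linear rank statistic for trend on one window; the paper applies many such statistics simultaneously with multiplicity control, which this sketch omits, and the normal approximation stands in for exact permutation levels.

```python
import numpy as np
from scipy.stats import rankdata, norm

def windowed_rank_trend(y, lo, hi):
    """Standardized linear rank statistic for monotone trend on [lo, hi).
    Mean 0 under the no-trend null; large |z| suggests monotonicity."""
    r = rankdata(y[lo:hi])
    m = len(r)
    c = np.arange(1, m + 1) - (m + 1) / 2.0        # centered time scores
    t = np.sum(c * r)
    var = np.sum(c**2) * np.sum((r - r.mean())**2) / (m - 1)
    z = t / np.sqrt(var)
    return z, 2 * norm.sf(abs(z))                  # normal approximation

rng = np.random.default_rng(6)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, 200)  # rises then falls
print(windowed_rank_trend(y, 0, 100))              # strong positive trend
print(windowed_rank_trend(y, 100, 200))            # strong negative trend
```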


Journal ArticleDOI
TL;DR: An algorithm which is both fast and robust is given, together with the theoretical properties of the local linear M-smoother; the new smoother gives a large improvement over the local constant M-smoother for some data sets and demonstrates good performance elsewhere on various artificial and magnetic resonance data sets.
Abstract: Local linear M-smoothing is proposed as a method for noise reduction in one-dimensional signals. It is more appropriate than conventional local linear smoothing, because it does not introduce blurring of jumps in the signal. It improves local constant M-smoothing, by avoiding boundary effects at edges and jumps. While the idea of local linear M-smoothing is straightforward, numerical issues are challenging, because of the local minima aspect that is crucial to good performance. We give an algorithm which is both fast and robust, together with the theoretical properties of the local linear M-smoother. The new M-smoother gives a large improvement for some data sets compared to the local constant M-smoother, and demonstrates good performance elsewhere on various artificial and magnetic resonance data sets.
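
A sketch of the estimator itself, assuming a Gaussian kernel and a Huber-type rho fitted by iteratively reweighted least squares at each design point. Note the caveat from the abstract: the paper's algorithm deliberately targets local minima of the M-criterion (which is what preserves jumps), whereas plain IRLS offers no such guarantee.

```python
import numpy as np

def local_linear_m_smooth(x, y, h, delta=1.0, iters=20):
    """Local linear fit with Huber weights (IRLS) at every design point."""
    fitted = np.empty_like(y, dtype=float)
    for j, x0 in enumerate(x):
        k = np.exp(-0.5 * ((x - x0) / h) ** 2)          # Gaussian kernel weights
        X = np.column_stack([np.ones_like(x), x - x0])  # local linear design
        beta = np.zeros(2)
        for _ in range(iters):
            r = np.abs(y - X @ beta)
            r = np.maximum(r, 1e-12)                    # avoid division by zero
            w = k * np.where(r <= delta, 1.0, delta / r)  # Huber psi(r)/r
            sw = np.sqrt(w)[:, None]
            beta, *_ = np.linalg.lstsq(X * sw, y * sw[:, 0], rcond=None)
        fitted[j] = beta[0]                             # local intercept = fit
    return fitted

rng = np.random.default_rng(7)
x = np.linspace(0.0, 1.0, 300)
y = (x > 0.5).astype(float) + rng.normal(0.0, 0.15, 300)
fit = local_linear_m_smooth(x, y, h=0.05)   # the jump at 0.5 stays sharper
```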

Journal ArticleDOI
TL;DR: In a recent paper, Ahmad and Li (1997) proposed a new test for symmetry of the error distribution in linear regression models and proved asymptotic normality for the distribution of the corresponding test statistic.
Abstract: In a recent paper Ahmad and Li (1997) proposed a new test for symmetry of the error distribution in linear regression models and proved asymptotic normality for the distribution of the corresponding...

Journal ArticleDOI
TL;DR: In this article, the authors considered a nonparametric regression model with random design, where the regression function m is given by m(x) = {\open E}(Y \mid X = x).
Abstract: In the nonparametric regression model with random design, where the regression function m is given by m(x) = {\open E}(Y\mid X = x), estimation of the location \theta ( mode ) of a unique maximum of m by the location \hat{\theta} of a maximum of the Nadaraya-Watson kernel estimator \hat{m} for the curve m is considered. Within this setting, we obtain consistency and asymptotic normality results for \hat{\theta} under very mild assumptions on m , the design density g of X and the kernel K . The bandwidths considered in the present work are data-dependent, of the type generated by plug-in methods. The estimation of the size of the maximum is also considered, as well as the estimation of a unique zero of the regression function. Applied to the estimation of the mode of a density, our methods yield some improvements on known results. As a by-product, we obtain some uniform consistency results for the (higher) derivatives of the Nadaraya-Watson estimator with a certain additional uniformity in the ba...
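
The estimator is simple to sketch: maximize the Nadaraya-Watson curve over a grid, returning both the location estimate and the size of the maximum. A fixed bandwidth stands in here for the paper's data-dependent plug-in choice.

```python
import numpy as np

def nw_argmax(x, y, h, grid):
    """Location and size of the maximum of m(x) = E(Y | X = x), estimated by
    maximizing the Nadaraya-Watson estimator over a grid (fixed bandwidth)."""
    k = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    m_hat = (k @ y) / k.sum(axis=1)              # NW estimate on the grid
    j = int(np.argmax(m_hat))
    return grid[j], m_hat[j]                     # (theta_hat, size of maximum)

rng = np.random.default_rng(8)
x = rng.uniform(-2.0, 2.0, 500)
y = np.exp(-x**2) + rng.normal(0.0, 0.1, 500)    # true mode at 0, height 1
print(nw_argmax(x, y, h=0.2, grid=np.linspace(-2, 2, 401)))
```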

Journal ArticleDOI
TL;DR: In this paper, a constrained discontinuity vector Gaussian process model is proposed for the retrieval of wind vectors over the ocean using satellite-bounded scatterometers, which ensures realistic fronts.
Abstract: A Bayesian procedure for the retrieval of wind vectors over the ocean using satellite borne scatterometers requires realistic prior near-surface wind field models over the oceans. We have implemented carefully chosen vector Gaussian Process models; however in some cases these models are too smooth to reproduce real atmospheric features, such as fronts. At the scale of the scatterometer observations, fronts appear as discontinuities in wind direction. Due to the nature of the retrieval problem a simple discontinuity model is not feasible, and hence we have developed a constrained discontinuity vector Gaussian Process model which ensures realistic fronts. We describe the generative model and show how to compute the data likelihood given the model. We show the results of inference using the model with Markov Chain Monte Carlo methods on both synthetic and real data.

Journal ArticleDOI
TL;DR: In this article, several results are given that allow one to decide whether a class of densities is almost surely discernible; for example, unimodality, log-concavity, and boundedness by a given constant are discernible, while continuity and square integrability are not.
Abstract: Let a class {\cal F} of densities be given. We draw an i.i.d. sample from a density f which may or may not be in {\cal F} . After every n , one must make a guess whether f \in {\cal F} or not. A class is almost surely discernible if there exists a sequence of classification rules such that, for any f , we make finitely many errors almost surely. In this paper several results are given that allow one to decide whether a class is almost surely discernible. For example, continuity and square integrability are not discernible, but unimodality, log-concavity, and boundedness by a given constant are.

Journal ArticleDOI
TL;DR: In this paper, a kernel conditional Kaplan-Meier estimator is used in the missing information principle estimating function for a one-dimensional covariate, where the covariate takes values in a finite set.
Abstract: The fitting of heteroscedastic median regression models to right censored data has been a topic of much research in survival analysis in recent years. McKeague et al. (2001) used the missing information principle to propose an estimator for the regression parameters, and derived the asymptotic properties of their estimator assuming that the covariate takes values in a finite set. In this paper the large sample properties of their estimator are derived when the covariate is continuous. A kernel conditional Kaplan-Meier estimator is used in the missing information principle estimating function. A simulation study involving a one-dimensional covariate is presented.
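
The kernel conditional Kaplan-Meier (Beran-type) estimator used inside the estimating function can be sketched as a kernel-weighted product-limit estimator; Gaussian weights and the absence of tie handling are simplifying assumptions.

```python
import numpy as np

def beran_survival(t, delta, x, x0, h, t_eval):
    """Kernel conditional Kaplan-Meier estimate of S(t_eval | X = x0)."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    w = w / w.sum()
    order = np.argsort(t)
    t, delta, w = t[order], delta[order], w[order]
    at_risk = np.cumsum(w[::-1])[::-1]                 # weight still at risk
    factors = np.where(delta == 1, 1.0 - w / at_risk, 1.0)
    surv = np.cumprod(factors)                         # S just after each t_(i)
    idx = np.searchsorted(t, t_eval, side="right") - 1
    return np.where(idx < 0, 1.0, surv[np.clip(idx, 0, None)])

rng = np.random.default_rng(9)
x = rng.uniform(0.0, 1.0, 400)
t_true = rng.exponential(1.0 + x)                      # scale increases in x
c = rng.exponential(2.0, 400)
t, delta = np.minimum(t_true, c), (t_true <= c).astype(int)
print(beran_survival(t, delta, x, x0=0.5, h=0.1, t_eval=np.array([1.0])))
# roughly exp(-1/1.5) ~ 0.51 for X near 0.5
```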

Journal ArticleDOI
TL;DR: In this article, the pointwise limit distribution results for the isotonic regression estimator at a point of discontinuity are given for independent data, f -mixing data and subordinated Gaussians.
Abstract: Pointwise limit distribution results are given for the isotonic regression estimator at a point of discontinuity. The cases treated are independent data, f-mixing data and subordinated Gauss...

Journal ArticleDOI
TL;DR: In this paper, a maxbias curve is used to describe the robustness of an estimator to a given fraction of contamination, which is a powerful tool to describe robustness.
Abstract: A maxbias curve is a powerful tool to describe the robustness of an estimator. It tells us how much an estimator can change due to a given fraction of contamination. In this paper, maxbias curves a...

Journal ArticleDOI
TL;DR: In this article, the authors considered a Cox model for right-censored survival data when a covariate is missing completely at random and a continuous surrogate covariate and a complete validation sample are available.
Abstract: We consider a Cox model for right-censored survival data when a covariate is missing completely at random and when a continuous surrogate covariate and a complete validation sample are available. The likelihood is approximated using a nonparametric estimation of the missing covariate values and we propose new estimators of the regression parameter and of the cumulative hazard function based on the approximated likelihood. We show that the estimators are n^{1/2}-consistent and asymptotically Gaussian and give conditions which ensure that the estimator of the regression parameter has the minimal asymptotic variance in the model with missing covariate data.

Journal ArticleDOI
TL;DR: In this paper, a nonparametric dispersion test for two-level fractional factorial designs is proposed, which is the only one of those studied that does not incorrectly detect a spurious dispersion effect.
Abstract: A consistent product/process will have little variability, i.e. dispersion. The widely-used unreplicated two-level fractional factorial designs can play an important role in detecting dispersion effects with a minimum expenditure of resources. In this paper we develop a nonparametric dispersion test for unreplicated two-level fractional factorial designs. The test statistic is defined, critical values are provided, and large sample approximations are given. Through simulations and examples from the literature, the test is compared to general nonparametric dispersion tests and a parametric test based on a normality assumption. These comparisons show the test to be the most robust of those studied and even superior to the normality-based test under normality in some situations. An example is given where this new test is the only one of those studied that does not incorrectly detect a spurious dispersion effect.

Journal ArticleDOI
TL;DR: In this article, it is observed that ties, right censoring, left censoring, and interval censoring can all cause ambiguity about the rank of an observation, so that traditional rank tests for uncensored data without ties cannot be applied directly.
Abstract: Ties, right censoring, left censoring, and interval censoring can all cause ambiguity about the rank of an observation. Therefore traditional rank tests for uncensored data without ties need to be ...

Journal ArticleDOI
TL;DR: In this article, a procedure for testing the goodness-of-fit of the conditional variance function of a Markov model of order 1, under stationarity and ergodicity, is presented.
Abstract: We present a procedure for testing the goodness-of-fit of the conditional variance function of a Markov model of order 1, under stationarity and ergodicity. The autoregressive parameter, the distribution of the noise and the stationary distribution of the observations are assumed to be unknown. Under the null hypothesis H_0 that the conditional variance function belongs to a class of parametric functions, we define an estimator \tilde{\theta}_n of \theta_0 , the assumed true parameter, and we establish its consistency and asymptotic normality. We define a marked empirical process A_n(\cdot) , for which we state and prove a functional limit theorem under H_0 . The asymptotic behavior of this process is studied under fixed alternatives H_1 . Based on the process A_n(\cdot) , a chi-squared test is derived. Simulation experiments show that the test is powerful against some heteroscedastic time series models.

Journal ArticleDOI
TL;DR: In this paper, the authors studied the kernel estimation of the common density of a sequence of associated variables, providing sufficient conditions for the consistency and asymptotic normality of the kernel estimator.
Abstract: Let X_n , n\in {\open N} , be a sequence of associated variables with common density function. We study the kernel estimation of this density, based on the given sequence of variables. Sufficient conditions are given for the consistency and asymptotic normality of the kernel estimator. The assumptions made require that the distribution of the pairs (X_i, X_j) decomposes as the sum of an absolutely continuous measure and another measure concentrated on the diagonal of {\open R}\times {\open R} , satisfying a further absolute continuity with respect to the Lebesgue measure on this diagonal. For the convergence in probability we find the usual convergence rate on the bandwidth, whereas for the almost sure convergence we need to require that the bandwidth does not decrease too fast and that the kernel is of bounded variation. This assumption on the kernel is also required for the asymptotic normality, together with a slightly strengthened version of the usual decrease rate on the bandwidth. The assumption of bounded...

Journal ArticleDOI
TL;DR: In this article, the problem of smoothing parameter selection when estimating the direction vector (β0) and the link function in the context of semiparametric, single index Poisson regression is addressed.
Abstract: We address the problem of smoothing parameter (h) selection when estimating the direction vector (β0) and the link function in the context of semiparametric, single index Poisson regression. The single index Poisson model (PSIM) differs from the classical nonparametric setting in two ways: first, the errors are heteroscedastic, and second, the direction parameter is unknown and has to be estimated. We propose two simple, automatic rules for simultaneously estimating β0 and h in a PSIM. The first criterion, called weighted least squares (WLS2), estimates the Kullback-Leibler risk function and has a penalty term to prevent undersmoothing in small samples. The second method, termed double smoothing (DS), is based on the estimation of an L2 approximation of the Kullback-Leibler risk and makes use of a double smoothing idea as in Wand and Gutierrez (1997). Simulations are used to investigate the behavior of various criteria in the PSIM context. Our weighted least squares and double smoothing methods outperform both a Kullback-Leibler version of cross-validation and the weighted least squares cross-validation criterion proposed by Härdle, Hall and Ichimura (1993).

Journal ArticleDOI
TL;DR: An editorial introduction to this special issue on statistical models and methods for discontinuous phenomena.
Abstract: (2002). Statistical Models and Methods for Discontinuous Phenomena. Journal of Nonparametric Statistics: Vol. 14, No. 1-2, pp. 1-5.

Journal ArticleDOI
TL;DR: In this article, a testing procedure for the presence of drift is derived, along with consistent estimators for the variance components; the procedures are applied to a long-run sequence of tapping data from a psychological study of human rhythm and motor control.
Abstract: Many statistical techniques rely on the assumption that random variables measured over time have a common mean. In many situations, this assumption is violated, the mean of the observations drifting gradually over time. For a particular modeling situation, motivated by a psychological study of human rhythm and motor control, a testing procedure for the presence of drift is derived, along with consistent estimators for the variance components. These procedures are applied to a long-run sequence of tapping data.

Journal ArticleDOI
TL;DR: In this article, two nonparametric tests for the existence of change-points in a regression function are introduced, based on the jump estimate of the regression function at a known point and the rescaled process of local variation in the neighborhood of the point.
Abstract: Two non-parametric tests for the existence of change-points in a regression function are introduced. The first test is based on the jump estimate of the regression function at a known point x_0 , whereas the second one, named the local test in the text, is based on the rescaled process of local variation in the neighborhood of x_0 . This process is proved to be asymptotically Gaussian. We derive the asymptotic distributions of the test statistics under the null hypothesis and show that their power under local alternatives tends to unity. Numerical experiments are provided to give evidence for the performance of these tests.
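
The quantity behind the first test can be sketched as the difference of one-sided local averages around the known point (written x_0 above, standing in for the paper's notation); the uniform one-sided kernel and the window width h are illustrative choices.

```python
import numpy as np

def jump_estimate(x, y, x0, h):
    """Jump of the regression function at the known point x0, estimated as
    the difference of one-sided local averages (uniform kernel sketch)."""
    right = (x > x0) & (x <= x0 + h)
    left = (x >= x0 - h) & (x < x0)
    return y[right].mean() - y[left].mean()

rng = np.random.default_rng(10)
x = np.sort(rng.uniform(0.0, 1.0, 500))
y = np.sin(x) + 1.5 * (x > 0.6) + rng.normal(0.0, 0.2, 500)
print(jump_estimate(x, y, x0=0.6, h=0.05))   # should be near 1.5
```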

Journal ArticleDOI
TL;DR: In this article, a Bayesian method for estimating the power spectra of stationary random processes is proposed, which uses piecewise polynomials with random knot locations to capture peaks and troughs in the true spectrum.
Abstract: A Bayesian method of estimating the power spectra of stationary random processes is proposed. Initially we estimate the true spectrum via the log periodogram, but owing to the inadequacies of the periodogram when the true spectrum has a high dynamic range and/or is rapidly varying, we find that improved results can be obtained using multitaper spectrum estimates (Percival and Walden, 1993). We follow the method of Denison, Mallick and Smith (1998) and estimate the spectra using piecewise polynomials with random knot locations. The methodology is shown, using simulated examples, to be successful in giving "smooth" estimates which also capture peaks and troughs in the true spectra.
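
The input to the Bayesian fit, a multitaper estimate of the log spectrum, can be sketched with SciPy's DPSS (Slepian) windows by averaging the periodograms of the tapered series. The time-bandwidth product NW = 4 and k = 7 tapers are conventional illustrative choices; the piecewise-polynomial posterior fit itself is not shown.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_log_spectrum(x, nw=4.0, k=7):
    """Average the periodograms of the series tapered by the first k DPSS
    windows, then take logs; the smoothing step would act on this output."""
    n = len(x)
    tapers = dpss(n, nw, Kmax=k)                          # shape (k, n)
    eigspecs = np.abs(np.fft.rfft(tapers * x[None, :], axis=1)) ** 2
    s_hat = eigspecs.mean(axis=0)                         # multitaper estimate
    return np.fft.rfftfreq(n), np.log(s_hat)

rng = np.random.default_rng(11)
e = rng.normal(0.0, 1.0, 1024)
x = np.convolve(e, [1.0, 0.9], mode="same")               # MA(1)-type series
freqs, log_s = multitaper_log_spectrum(x)
```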

Journal ArticleDOI
TL;DR: In this article, a new Bayesian method using a priori spatial information modeled by means of a suitable Markov random field is proposed to classify multispectral image data, where the image data for each class are assumed to be i.i.d. following a multivariate Gaussian model with unknown mean and unknown diagonal covariance matrix.
Abstract: The problem of classifying multispectral image data is studied here. We propose a new Bayesian method for this. The method uses "a priori" spatial information modeled by means of a suitable Markov random field. The image data for each class are assumed to be i.i.d. following a multivariate Gaussian model with unknown mean and unknown diagonal covariance matrix. When the prior information is not used and the variances of the Gaussian model are equal, the method reduces to the standard K-means algorithm. All the parameters appearing in the posterior model are estimated simultaneously. The prior normalizing constant is approximated on the basis of the expectation of the energy function as obtained by means of Markov Chain Monte Carlo simulations. Some experimental results suggest calculating this expectation from a "standard" function by simple multiplication by the minimum value of the energy. A local solution to the problem of maximizing the posterior distribution is obtained by using the Iterated Conditional Modes algorithm.
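
The final ICM step can be sketched for a single-band image with a Potts prior and a Gaussian likelihood; with beta = 0 the update reduces to a K-means-style assignment, mirroring the reduction noted in the abstract. The multispectral likelihood, the simultaneous parameter estimation, and the normalizing-constant approximation are omitted from this sketch.

```python
import numpy as np

def icm_segment(img, means, sigma=1.0, beta=1.0, sweeps=5):
    """Iterated Conditional Modes under a Potts prior and an isotropic
    Gaussian likelihood (single-band sketch)."""
    h, w = img.shape
    labels = np.argmin((img[..., None] - means) ** 2, axis=-1)  # no-prior init
    for _ in range(sweeps):
        for i in range(h):
            for j in range(w):
                nbrs = [labels[a, b] for a, b in
                        ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= a < h and 0 <= b < w]
                loglik = -0.5 * ((img[i, j] - means) / sigma) ** 2
                prior = beta * np.array([sum(k == l for l in nbrs)
                                         for k in range(len(means))])
                labels[i, j] = int(np.argmax(loglik + prior))   # local mode
    return labels

rng = np.random.default_rng(12)
truth = np.zeros((40, 40), dtype=int); truth[:, 20:] = 1
img = np.where(truth == 1, 2.0, 0.0) + rng.normal(0.0, 1.0, truth.shape)
seg = icm_segment(img, means=np.array([0.0, 2.0]))
print((seg == truth).mean())   # should be close to 1
```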

Journal ArticleDOI
TL;DR: In this article, a simple goodness-of-fit test of the hypothesis that a contour belongs to a given parametric family against a nonparametric alternative is proposed, and the behavior of the test under the null hypothesis and under the alternative separated from the null parametric families by a distance of order n−1/2 is analyzed.
Abstract: We consider the problem of testing hypotheses about the contours in binary images observed on the regular grid. We propose a simple goodness-of-fit test of the hypothesis that a contour belongs to a given parametric family against a nonparametric alternative. We analyze the behavior of the test under the null hypothesis, and under the alternative separated from the null parametric family by a distance of order n^{-1/2} ( n is the total number of observations and the distance is defined as the measure of symmetric difference between the sets whose boundaries are the contours of interest). Finally, we prove the lower bound showing that no test can be consistent if the distance between the hypothesis and the alternative is of order smaller than n^{-1/2} .