
Showing papers in "Electronic Journal of Statistics in 2012"


Journal ArticleDOI
TL;DR: Given two probability measures, $\mathbb{P}$ and $\mathbb{Q}$, defined on a measurable space $S$, the integral probability metric (IPM) is defined as $\gamma_{\mathcal{F}}(\mathbb{P},\mathbb{Q})=\sup\left\{\left\vert\int_{S}f\,d\mathbb{P}-\int_{S}f\,d\mathbb{Q}\right\vert:f\in\mathcal{F}\right\}$, where $\mathcal{F}$ is a class of real-valued bounded measurable functions on $S$.
Abstract: Given two probability measures, $\mathbb{P}$ and $\mathbb{Q}$ defined on a measurable space, $S$, the integral probability metric (IPM) is defined as $$\gamma_{\mathcal{F}}(\mathbb{P},\mathbb{Q})=\sup\left\{\left\vert \int_{S}f\,d\mathbb{P}-\int_{S}f\,d\mathbb{Q}\right\vert\,:\,f\in\mathcal{F}\right\},$$ where $\mathcal{F}$ is a class of real-valued bounded measurable functions on $S$. By appropriately choosing $\mathcal{F}$, various popular distances between $\mathbb{P}$ and $\mathbb{Q}$, including the Kantorovich metric, Fortet-Mourier metric, dual-bounded Lipschitz distance (also called the Dudley metric), total variation distance, and kernel distance, can be obtained. In this paper, we consider the problem of estimating $\gamma_{\mathcal{F}}$ from finite random samples drawn i.i.d. from $\mathbb{P}$ and $\mathbb{Q}$. Although the above mentioned distances cannot be computed in closed form for every $\mathbb{P}$ and $\mathbb{Q}$, we show their empirical estimators to be easily computable, and strongly consistent (except for the total-variation distance). We further analyze their rates of convergence. Based on these results, we discuss the advantages of certain choices of $\mathcal{F}$ (and therefore the corresponding IPMs) over others—in particular, the kernel distance is shown to have three favorable properties compared with the other mentioned distances: it is computationally cheaper, the empirical estimate converges at a faster rate to the population value, and the rate of convergence is independent of the dimension $d$ of the space (for $S=\mathbb{R}^{d}$). We also provide a novel interpretation of IPMs and their empirical estimators by relating them to the problem of binary classification: while the IPM between class-conditional distributions is the negative of the optimal risk associated with a binary classifier, the smoothness of an appropriate binary classifier (e.g., support vector machine, Lipschitz classifier, etc.) is inversely related to the empirical estimator of the IPM between these class-conditional distributions.
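
As an illustration of why the kernel distance is singled out as computationally cheap, here is a minimal sketch (not from the paper) of the standard plug-in estimator of the kernel IPM between two samples, assuming a Gaussian kernel with a hand-picked bandwidth; the biased V-statistic form is a simplification of mine.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gram matrix k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dists = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq_dists / (2 * bandwidth**2))

def kernel_distance(X, Y, bandwidth=1.0):
    """Biased (V-statistic) empirical estimate of the kernel IPM between P and Q."""
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    mmd_sq = Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()
    return np.sqrt(max(mmd_sq, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 5))   # sample from P
Y = rng.normal(0.5, 1.0, size=(200, 5))   # sample from Q (shifted mean)
print(kernel_distance(X, Y))
```

The estimate is a closed-form function of the Gram matrices, which is what makes the empirical kernel distance cheap relative to the linear programs behind the empirical Kantorovich and Dudley metrics.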

279 citations


Journal ArticleDOI
TL;DR: In this paper, it was shown that the posterior distribution of a parameter in misspecified LAN parametric models can be approximated by a random normal distribution, and that Bayesian credible sets are not valid confidence sets if the model is misspecified.
Abstract: We prove that the posterior distribution of a parameter in misspecified LAN parametric models can be approximated by a random normal distribution. We derive from this that Bayesian credible sets are not valid confidence sets if the model is misspecified. We obtain the result under conditions that are comparable to those in the well-specified situation: uniform testability against fixed alternatives and sufficient prior mass in neighbourhoods of the point of convergence. The rate of convergence is considered in detail, with special attention for the existence and construction of suitable test sequences. We also give a lemma to exclude testable model subsets which implies a misspecified version of Schwartz’ consistency theorem, establishing weak convergence of the posterior to a measure degenerate at the point at minimal Kullback-Leibler divergence with respect to the true distribution.

226 citations


Journal ArticleDOI
TL;DR: The identifiability of SBM is proved, while asymptotic properties of maximum-likelihood and variational estimators are provided, and the consistency of these estimators is settled, which is, to the best of the authors' knowledge, the first result of this type for variational estimators with random graphs.
Abstract: The stochastic block model (SBM) is a probabilistic model designed to describe heterogeneous directed and undirected graphs. In this paper, we address the asymptotic inference on SBM by use of maximum-likelihood and variational approaches. The identifiability of SBM is proved, while asymptotic properties of maximum-likelihood and variational estimators are provided. In particular, the consistency of these estimators is settled, which is, to the best of our knowledge, the first result of this type for variational estimators with random graphs.

207 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that GLASSO is solving the dual of the graphical lasso penalized likelihood, by block coordinate ascent, where the target of estimation is the covariance matrix, rather than the precision matrix.
Abstract: The graphical lasso [5] is an algorithm for learning the structure in an undirected Gaussian graphical model, using l1 regularization to control the number of zeros in the precision matrix Θ = Σ^{-1} [2, 11]. The R package GLASSO [5] is popular, fast, and allows one to efficiently build a path of models for different values of the tuning parameter. Convergence of GLASSO can be tricky; the converged precision matrix might not be the inverse of the estimated covariance, and occasionally it fails to converge with warm starts. In this paper we explain this behavior, and propose new algorithms that appear to outperform GLASSO. By studying the "normal equations" we see that GLASSO is solving the dual of the graphical lasso penalized likelihood, by block coordinate ascent; a result which can also be found in [2]. In this dual, the target of estimation is Σ, the covariance matrix, rather than the precision matrix Θ. We propose similar primal algorithms P-GLASSO and DP-GLASSO, that also operate by block-coordinate descent, where Θ is the optimization target. We study all of these algorithms, and in particular different approaches to solving their coordinate sub-problems. We conclude that DP-GLASSO is superior from several points of view.
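
The paper's P-GLASSO and DP-GLASSO algorithms are not reproduced here; as a hedged illustration of the underlying estimation problem (an l1-penalized sparse precision matrix), the sketch below uses scikit-learn's generic GraphicalLasso implementation on simulated chain-graph data. The tuning value alpha=0.1 and the toy precision matrix are arbitrary choices of mine.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(1)
# Sparse ground-truth precision matrix: tridiagonal, i.e. a chain graph.
p = 8
Theta = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
Sigma = np.linalg.inv(Theta)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=500)

model = GraphicalLasso(alpha=0.1).fit(X)   # alpha is the l1 tuning parameter
Theta_hat = model.precision_
print(np.round(Theta_hat, 2))
print("estimated nonzero entries:", np.sum(np.abs(Theta_hat) > 1e-4))
```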

204 citations


Journal ArticleDOI
TL;DR: In this article, an exponential-family random graph model (ERGM) is formulated for networks whose ties are counts, issues that arise when moving beyond the binary case are discussed, and the methods are applied to a network dataset whose values are counts of interactions.
Abstract: Exponential-family random graph models (ERGMs) provide a principled and flexible way to model and simulate features common in social networks, such as propensities for homophily, mutuality, and friend-of-a-friend triad closure, through choice of model terms (sufficient statistics). However, those ERGMs modeling the more complex features have, to date, been limited to binary data: presence or absence of ties. Thus, analysis of valued networks, such as those where counts, measurements, or ranks are observed, has necessitated dichotomizing them, losing information and introducing biases. In this work, we generalize ERGMs to valued networks. Focusing on modeling counts, we formulate an ERGM for networks whose ties are counts and discuss issues that arise when moving beyond the binary case. We introduce model terms that generalize and model common social network features for such data and apply these methods to a network dataset whose values are counts of interactions.

167 citations


Journal ArticleDOI
TL;DR: In this paper, the authors advocate the decomposition of goodness of fit into contributions of (groups of) regressor variables according to the Shapley value or, if regressors are exogenously grouped, the Owen value because of the attractive axioms associated with these values.
Abstract: We advocate the decomposition of goodness of fit into contributions of (groups of) regressor variables according to the Shapley value or—if regressors are exogenously grouped—the Owen value because of the attractive axioms associated with these values. A wage regression model with German data illustrates the method.
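
A brute-force sketch of the idea, assuming the goodness-of-fit measure is R² from OLS: the Shapley value of each regressor is its average marginal contribution to R² over all subsets of the other regressors. The function names and toy data are mine, and the Owen value for exogenously grouped regressors is not implemented.

```python
import numpy as np
from itertools import combinations
from math import comb

def r_squared(X, y, subset):
    """R^2 of an OLS fit of y on an intercept plus the columns in `subset`."""
    Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def shapley_r2(X, y):
    """Shapley decomposition of R^2 over the p regressors (exponential in p)."""
    p = X.shape[1]
    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            weight = 1.0 / (p * comb(p - 1, size))   # |S|!(p-|S|-1)!/p!
            for S in combinations(others, size):
                phi[j] += weight * (r_squared(X, y, list(S) + [j]) - r_squared(X, y, list(S)))
    return phi

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 3))
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=300)   # X[:, 2] is irrelevant
contrib = shapley_r2(X, y)
print(contrib, "sum =", contrib.sum(), "full R^2 =", r_squared(X, y, [0, 1, 2]))
```

The efficiency axiom is visible in the output: the contributions sum to the full-model R².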

162 citations


Journal ArticleDOI
TL;DR: In this article, the authors derived rank-based multiple contrast test procedures and simultaneous confidence intervals which take the correlation between the test statistics into account, and showed that the individual test decisions and the simultaneous confidence interval are compatible.
Abstract: We study simultaneous rank procedures for unbalanced designs with independent observations. The hypotheses are formulated in terms of purely nonparametric treatment effects. In this context, we derive rank-based multiple contrast test procedures and simultaneous confidence intervals which take the correlation between the test statistics into account. In this way, the individual test decisions and the simultaneous confidence intervals are compatible: whenever an individual hypothesis has been rejected by the multiple contrast test, the corresponding simultaneous confidence interval does not include the null, i.e. the hypothetical value of no treatment effect. The procedures allow for testing arbitrary purely nonparametric multiple linear hypotheses (e.g. many-to-one, all-pairs, changepoint, or even average comparisons). We do not assume homogeneous variances of the data; in particular, the distributions can have different shapes even under the null hypothesis. Thus, a solution to the multiple nonparametric Behrens-Fisher problem is presented in this unified framework.

141 citations


Journal ArticleDOI
TL;DR: In this paper, the class of Gaussian copula models for marginal regression analysis of non-normal dependent observations is defined; the class provides a natural extension of traditional linear regression models with normal correlated errors.
Abstract: This paper identifies and develops the class of Gaussian copula models for marginal regression analysis of non-normal dependent observations. The class provides a natural extension of traditional linear regression models with normal correlated errors. Any kind of continuous, discrete and categorical response is allowed. Dependence is conveniently modelled in terms of multivariate normal errors. Inference is performed through a likelihood approach. While the likelihood function is available in closed form for continuous responses, in the non-continuous setting numerical approximations are used. Residual analysis and a specification test are suggested for validating the adequacy of the assumed multivariate model. The methodology is implemented in an R package called gcmr. Illustrations include simulations and real data applications regarding time series, cross-design data, longitudinal studies, survival analysis and spatial regression.
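
To make the model class concrete, here is a small simulation sketch (not the gcmr package and not its likelihood machinery): marginal Poisson regressions are coupled through AR(1)-correlated multivariate normal errors via the probability integral transform, which is exactly the kind of dependence structure the class describes. The parameter values are mine.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(-1, 1, size=n)
mu = np.exp(0.5 + 1.0 * x)                  # marginal Poisson means from a log-linear model

# AR(1) correlation among the latent normal errors (e.g. a time-series design).
rho = 0.6
R = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
z = rng.multivariate_normal(np.zeros(n), R)  # correlated N(0, 1) errors
u = stats.norm.cdf(z)                        # uniform margins, still dependent
y = stats.poisson.ppf(u, mu).astype(int)     # Poisson margins, same dependence

print(y[:10], "sample mean vs model mean:", y.mean(), mu.mean())
```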

137 citations


Journal ArticleDOI
TL;DR: This paper develops the theory and computational details of a novel Markov chain Monte Carlo sampling scheme for Gaussian graphical model determination under G-Wishart prior distributions, and generalizes the maximum clique block Gibbs samplers to a class of flexible block Gibbs samplers whose convergence is proved.
Abstract: This paper proposes a new algorithm for Bayesian model determination in Gaussian graphical models under G-Wishart prior distributions. We first review recent development in sampling from G-Wishart distributions for given graphs, with a particular interest in the efficiency of the block Gibbs samplers and other competing methods. We generalize the maximum clique block Gibbs samplers to a class of flexible block Gibbs samplers and prove its convergence. This class of block Gibbs samplers substantially outperforms its competitors along a variety of dimensions. We next develop the theory and computational details of a novel Markov chain Monte Carlo sampling scheme for Gaussian graphical model determination. Our method relies on the partial analytic structure of G-Wishart distributions integrated with the exchange algorithm. Unlike existing methods, the new method requires neither proposal tuning nor evaluation of normalizing constants of G-Wishart distributions.

125 citations


Journal ArticleDOI
TL;DR: In this paper, the authors studied the minimax risks of estimation and testing over classes of sparse vectors in the Gaussian linear regression model with high dimensionality and showed that even dimension reduction techniques cannot provide satisfying results in an ultra-high dimensional setting.
Abstract: Consider the standard Gaussian linear regression model $Y=X\theta+\epsilon$, where $Y\in R^n$ is a response vector and $X\in R^{n\times p}$ is a design matrix. Much work has been devoted to building efficient estimators of $\theta$ when $p$ is much larger than $n$. In such a situation, a classical approach amounts to assuming that $\theta$ is approximately sparse. This paper studies the minimax risks of estimation and testing over classes of $k$-sparse vectors $\theta$. These bounds shed light on the limitations due to high dimensionality. The results encompass the problem of prediction (estimation of $X\theta$), the inverse problem (estimation of $\theta$) and linear testing (testing $X\theta=0$). Interestingly, an elbow effect occurs when $k\log(p/k)$ becomes large compared to $n$. Indeed, the minimax risks and hypothesis separation distances blow up in this ultra-high dimensional setting. We also prove that even dimension reduction techniques cannot provide satisfying results in an ultra-high dimensional setting. Moreover, we compute the minimax risks when the variance of the noise is unknown. The knowledge of this variance is shown to play a significant role in the optimal rates of estimation and testing. All these minimax bounds provide a characterization of statistical problems that are so difficult that no procedure can provide satisfying results.

121 citations


Journal ArticleDOI
TL;DR: In this article, a generalized Hoeffding-Sobol decomposition is used to measure the sensitivity of the output with respect to the input variables, and the estimation of these new indices is discussed.
Abstract: In this paper, we consider a regression model built on dependent variables. This regression models an input-output relationship. Under boundedness assumptions on the joint distribution function of the input variables, we show that a generalized Hoeffding-Sobol decomposition is available. This leads to new indices measuring the sensitivity of the output with respect to the input variables. We also study and discuss the estimation of these new indices.
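
For orientation, here is a sketch of the classical pick-freeze Monte Carlo estimator of first-order Sobol indices for independent inputs; the paper's contribution, the generalized Hoeffding-Sobol decomposition and the corresponding indices for dependent inputs, is not implemented here, and the test function and sample size are my own choices.

```python
import numpy as np

def first_order_sobol(f, d, n=100_000, seed=0):
    """Pick-freeze Monte Carlo estimates of the first-order Sobol indices
    of f(X) for independent U(0, 1) inputs X_1, ..., X_d."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, d))
    B = rng.uniform(size=(n, d))
    yA = f(A)
    var = yA.var()
    S = np.empty(d)
    for i in range(d):
        ABi = B.copy()
        ABi[:, i] = A[:, i]               # freeze coordinate i, resample the rest
        S[i] = np.cov(yA, f(ABi))[0, 1] / var
    return S

# Ishigami-style toy function on [0, 1]^3, mapped to [-pi, pi]^3.
def f(X):
    Xs = 2 * np.pi * X - np.pi
    return np.sin(Xs[:, 0]) + 7 * np.sin(Xs[:, 1])**2 + 0.1 * Xs[:, 2]**4 * np.sin(Xs[:, 0])

print(np.round(first_order_sobol(f, d=3), 3))
```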

Journal ArticleDOI
TL;DR: It is shown that calibrated asymmetric surrogate losses give rise to excess risk bounds, which control the expected misclassification cost in terms of the excess surrogate risk.
Abstract: Surrogate losses underlie numerous state-of-the-art binary classification algorithms, such as support vector machines and boosting. The impact of a surrogate loss on the statistical performance of an algorithm is well-understood in symmetric classification settings, where the misclassification costs are equal and the loss is a margin loss. In particular, classification-calibrated losses are known to imply desirable properties such as consistency. While numerous efforts have been made to extend surrogate loss-based algorithms to asymmetric settings, to deal with unequal misclassification costs or training data imbalance, considerably less attention has been paid to whether the modified loss is still calibrated in some sense. This article extends the theory of classification-calibrated losses to asymmetric problems. As in the symmetric case, it is shown that calibrated asymmetric surrogate losses give rise to excess risk bounds, which control the expected misclassification cost in terms of the excess surrogate risk. This theory is illustrated on the class of uneven margin losses, and the uneven hinge, squared error, exponential, and sigmoid losses are treated in detail.
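
A small numerical check of what "calibrated" means in the asymmetric setting, under my own parameterization (a cost-weighted hinge surrogate rather than the paper's uneven margin family): for each conditional probability eta, the constant score minimizing the conditional surrogate risk has the same sign as the cost-sensitive Bayes rule with threshold c0/(c0 + c1).

```python
import numpy as np

c1, c0 = 1.0, 3.0           # costs of misclassifying the positive / negative class
threshold = c0 / (c0 + c1)  # cost-sensitive Bayes threshold on eta = P(Y=1|x)

def conditional_surrogate_risk(f, eta):
    """Expected cost-weighted hinge loss of a constant score f when P(Y=1|x) = eta."""
    return eta * c1 * np.maximum(0.0, 1.0 - f) + (1 - eta) * c0 * np.maximum(0.0, 1.0 + f)

scores = np.linspace(-3, 3, 2001)
for eta in [0.1, 0.4, 0.6, 0.74, 0.76, 0.9]:
    f_star = scores[np.argmin(conditional_surrogate_risk(scores, eta))]
    bayes = 1 if eta > threshold else -1
    print(f"eta={eta:.2f}  argmin f={f_star:+.2f}  sign={np.sign(f_star):+.0f}  Bayes={bayes:+d}")
```

The sign of the surrogate minimizer flips exactly at the cost-sensitive threshold, which is the finite illustration of the calibration property the paper formalizes via excess risk bounds.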

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of constructing adaptive confidence bands, whose width contracts at an optimal rate over a range of Hölder classes, and show that the assumption of self-similarity is both necessary and sufficient for the construction of adaptive bands.
Abstract: Confidence bands are confidence sets for an unknown function f, containing all functions within some sup-norm distance of an estimator. In the density estimation, regression, and white noise models, we consider the problem of constructing adaptive confidence bands, whose width contracts at an optimal rate over a range of Hölder classes. While adaptive estimators exist, in general adaptive confidence bands do not, and to proceed we must place further conditions on f. We discuss previous approaches to this issue, and show it is necessary to restrict f to fundamentally smaller classes of functions. We then consider the self-similar functions, whose Hölder norm is similar at large and small scales. We show that such functions may be considered typical functions of a given Hölder class, and that the assumption of self-similarity is both necessary and sufficient for the construction of adaptive bands. Finally, we show that this assumption allows us to resolve the problem of undersmoothing, creating bands which are honest simultaneously for functions of any Hölder norm.

Journal ArticleDOI
TL;DR: This paper introduces a general class of statistical multiresolution estimators and develops an algorithmic framework for computing those and employs a combination of the alternating direction method of multipliers with Dykstra's algorithm for computing orthogonal projections onto intersections of convex sets and proves numerical convergence.
Abstract: In this paper we are concerned with fully automatic and locally adaptive estimation of functions in a “signal + noise”-model where the regression function may additionally be blurred by a linear operator, e.g. by a convolution. To this end, we introduce a general class of statistical multiresolution estimators and develop an algorithmic framework for computing those. By this we mean estimators that are defined as solutions of convex optimization problems with l∞-type constraints. We employ a combination of the alternating direction method of multipliers with Dykstra’s algorithm for computing orthogonal projections onto intersections of convex sets and prove numerical convergence. The capability of the proposed method is illustrated by various examples from imaging and signal detection.

Journal ArticleDOI
TL;DR: A procedure is proposed that estimates the structure of a graphical model by minimizing the temporally smoothed L1 penalized regression, which allows jointly estimating the partition boundaries of the VCVS model and the coefficient of the sparse precision matrix on each block of the partition.
Abstract: We study the problem of estimating a temporally varying coefficient and varying structure (VCVS) graphical model underlying data collected over a period of time, such as social states of interacting individuals or microarray expression profiles of gene networks, as opposed to i.i.d. data from an invariant model widely considered in current literature of structural estimation. In particular, we consider the scenario in which the model evolves in a piece-wise constant fashion. We propose a procedure that estimates the structure of a graphical model by minimizing the temporally smoothed L1 penalized regression, which allows jointly estimating the partition boundaries of the VCVS model and the coefficient of the sparse precision matrix on each block of the partition. A highly scalable proximal gradient method is proposed to solve the resultant convex optimization problem; and the conditions for sparsistent estimation and the convergence rate of both the partition boundaries and the network structure are established for the first time for such estimators.

Journal ArticleDOI
TL;DR: In this paper, the authors consider a Lasso estimator with a fully data-driven l 1 penalization, which is tuned for the estimation problem at hand, and prove sharp oracle inequalities for this estimator.
Abstract: We consider a general high-dimensional additive hazards model in a non-asymptotic setting, including regression for censored-data. In this context, we consider a Lasso estimator with a fully data-driven l1 penalization, which is tuned for the estimation problem at hand. We prove sharp oracle inequalities for this estimator. Our analysis involves a new “data-driven” Bernstein’s inequality, that is of independent interest, where the predictable variation is replaced by the optional variation.

Journal ArticleDOI
TL;DR: In this paper, an off-line multiple break detection algorithm was proposed for a general class of models, where the observations are supposed to fit a parametric causal process with distinct parameters on multiple periods.
Abstract: This paper is devoted to the off-line multiple breaks detection for a general class of models. The observations are supposed to fit a parametric causal process (such as the classical AR(∞), ARCH(∞) or TARCH(∞) models) with distinct parameters on multiple periods. The number and dates of breaks, and the different parameters on each period, are estimated using a quasi-likelihood contrast penalized by the number of distinct periods. For a convenient choice of the regularization parameter in the penalty term, the consistency of the estimator is proved when the moment order r of the process satisfies r≥2. If r≥4, the length of each estimated segment tends to infinity at the same rate as the length of the true segment and the parameter estimators on each segment are asymptotically normal. Compared to the existing literature, we allow for dependence across distinct periods. To be robust to this dependence, the chosen regularization parameter in the penalty term is larger than the one from the BIC approach. We detail our results, which notably improve the existing ones, for the AR(∞), ARCH(∞) and TARCH(∞) models. For practical applications (when n is not too large) we use a data-driven procedure based on slope estimation to choose the penalty term. The procedure is implemented using the dynamic programming algorithm. It is an O(n²) complexity algorithm that we apply to AR(1), AR(2), GARCH(1,1) and TARCH(1) processes and to FTSE index data.
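
The quasi-likelihood contrast and the causal-process theory are the paper's; as a hedged stand-in, here is the generic O(n²) dynamic-programming segmentation the abstract alludes to, with a simple Gaussian mean-change cost and a fixed penalty per segment instead of the paper's slope-heuristic calibration.

```python
import numpy as np

def segment(y, penalty):
    """Optimal partitioning: minimise sum of segment costs + penalty * (#segments)
    by O(n^2) dynamic programming; the cost of a segment is its residual sum of squares."""
    n = len(y)
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y**2)))

    def cost(i, j):  # RSS of y[i:j] around its own mean
        m = j - i
        return s2[j] - s2[i] - (s1[j] - s1[i])**2 / m

    F = np.full(n + 1, np.inf)       # F[j] = best penalized contrast for y[:j]
    F[0] = 0.0
    last = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        for i in range(j):
            val = F[i] + cost(i, j) + penalty
            if val < F[j]:
                F[j], last[j] = val, i
    # Backtrack the estimated break dates.
    breaks, j = [], n
    while j > 0:
        breaks.append(last[j])
        j = last[j]
    return sorted(b for b in breaks if b > 0)

rng = np.random.default_rng(4)
y = np.concatenate([rng.normal(0, 1, 150), rng.normal(2, 1, 100), rng.normal(-1, 1, 150)])
print(segment(y, penalty=3 * np.log(len(y))))   # expected breaks near 150 and 250
```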

Journal ArticleDOI
TL;DR: An algorithm able to process very large networks, together with consistent estimators based on it, is provided, including a bound on the probability of misclassifying at least one node that holds even when the number of classes grows.
Abstract: The Stochastic Blockmodel [16] is a mixture model for heterogeneous network data. Unlike the usual statistical framework, new nodes give additional information about the previous ones in this model. Thereby the distribution of the degrees concentrates in points conditionally on the node class. We show under a mild assumption that classification, estimation and model selection can actually be achieved with no more than the empirical degree data. We provide an algorithm able to process very large networks and consistent estimators based on it. In particular, we prove a bound on the probability of misclassifying at least one node, including when the number of classes grows.
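
A rough sketch of the degree-only idea (my own simplification, not the paper's algorithm or estimators): simulate a stochastic blockmodel, then cluster nodes by their empirical degrees with one-dimensional k-means; for well-separated connectivity profiles this already recovers the classes.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
n, sizes = 600, [300, 300]
labels = np.repeat([0, 1], sizes)
P = np.array([[0.10, 0.02],
              [0.02, 0.30]])                  # class-pair connection probabilities
probs = P[labels][:, labels]
A = (rng.uniform(size=(n, n)) < probs).astype(int)
A = np.triu(A, 1); A = A + A.T                # undirected, no self-loops

degrees = A.sum(axis=1)
pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(degrees.reshape(-1, 1))
acc = max(np.mean(pred == labels), np.mean(pred == 1 - labels))  # up to label switching
print("classification accuracy from degrees alone:", acc)
```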

Journal ArticleDOI
TL;DR: This paper proposes a method for identifying activity types, like walking, standing, and resting, from acceleration data, which decomposes movements into short components called "movelets", and builds a reference for each activity type.
Abstract: Recent technological advances provide researchers with a way of gathering real-time information on an individual’s movement through the use of wearable devices that record acceleration. In this paper, we propose a method for identifying activity types, like walking, standing, and resting, from acceleration data. Our approach decomposes movements into short components called “movelets”, and builds a reference for each activity type. Unknown activities are predicted by matching new movelets to the reference. We apply our method to data collected from a single, three-axis accelerometer and focus on activities of interest in studying physical function in elderly populations. An important technical advantage of our methods is that they allow identification of short activities, such as taking two or three steps and then stopping, as well as low-frequency, rare (compared with the whole time series) activities, such as sitting on a chair. Based on our results we provide simple and actionable recommendations for the design and implementation of large epidemiological studies that could collect accelerometry data for the purpose of predicting the time series of activities and connecting it to health outcomes.
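
A toy sketch of the movelet idea under my own simplifications (one axis, synthetic signals, plain Euclidean matching): chop the new series into short windows and label each window by its nearest reference chunk. The paper's actual construction of references and distances may differ.

```python
import numpy as np

rng = np.random.default_rng(6)
fs, win = 10, 10                         # 10 Hz sampling, 1-second "movelets"

def make_activity(kind, seconds):
    t = np.arange(seconds * fs) / fs
    if kind == "walking":                # oscillatory acceleration
        return np.sin(2 * np.pi * 2 * t) + 0.1 * rng.normal(size=t.size)
    return 0.1 * rng.normal(size=t.size) # "resting": noise around zero

# Reference movelets built from labelled training data.
references = {kind: make_activity(kind, 5).reshape(-1, win) for kind in ["walking", "resting"]}

def classify(series):
    """Label each non-overlapping window by its nearest reference movelet."""
    labels = []
    for chunk in series.reshape(-1, win):
        dists = {k: np.min(np.linalg.norm(R - chunk, axis=1)) for k, R in references.items()}
        labels.append(min(dists, key=dists.get))
    return labels

new_series = np.concatenate([make_activity("resting", 3), make_activity("walking", 3)])
print(classify(new_series))              # expect ~3 'resting' windows then ~3 'walking'
```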

Journal ArticleDOI
TL;DR: In this paper, the authors investigated posterior contraction rates for priors on multivariate functions that are constructed using tensor-product B-spline expansions and obtained procedures that adapt to the degree of smoothness of the unknown function up to the order of the splines that are used.
Abstract: We investigate posterior contraction rates for priors on multivariate functions that are constructed using tensor-product B-spline expansions. We prove that using a hierarchical prior with an appropriate prior distribution on the partition size and Gaussian prior weights on the B-spline coefficients, procedures can be obtained that adapt to the degree of smoothness of the unknown function up to the order of the splines that are used. We take a unified approach including important nonparametric statistical settings like density estimation, regression, and classification.

Journal ArticleDOI
TL;DR: In this paper, a family of nonparametric multivariate multisample tests based on depth rankings is proposed, which are adapted to the depth function that is most appropriate for the application.
Abstract: In this paper, we construct a family of nonparametric multivariate multisample tests based on depth rankings. These tests are of Kruskal-Wallis type in the sense that the samples are variously ordered. However, unlike the Kruskal-Wallis test, these tests are based upon a depth ranking using a statistical depth function such as the halfspace depth or the Mahalanobis depth, etc. The types of tests we propose are adapted to the depth function that is most appropriate for the application. Under the null hypothesis that all samples come from the same distribution, we show that the test statistic asymptotically has a chi-square distribution. Some comparisons of power are made with the Hotelling $T^2$ test, and the test of Choi and Marden (1997). Our test is particularly recommended when the data are of unknown distribution type where there is some evidence that the density contours are not elliptical. However, when the data are normally distributed, we often obtain high relative power.
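
As a rough illustration (my own simplification, not the paper's exact family of statistics): rank the pooled observations by Mahalanobis depth and apply an ordinary Kruskal-Wallis statistic to the depth values, which amounts to a rank test on the depth ranking compared against a chi-square null distribution.

```python
import numpy as np
from scipy.stats import kruskal

def mahalanobis_depth(points, data):
    """Mahalanobis depth of `points` with respect to the pooled sample `data`."""
    mu = data.mean(axis=0)
    Sinv = np.linalg.inv(np.cov(data, rowvar=False))
    diff = points - mu
    d2 = np.einsum("ij,jk,ik->i", diff, Sinv, diff)
    return 1.0 / (1.0 + d2)

rng = np.random.default_rng(7)
samples = [rng.normal(0, 1, size=(60, 3)),
           rng.normal(0, 1, size=(60, 3)),
           rng.normal(0, 2, size=(60, 3))]   # third sample differs in scale
pooled = np.vstack(samples)
depths = [mahalanobis_depth(s, pooled) for s in samples]

# Kruskal-Wallis applied to the depth values = a rank test on the depth ranking.
stat, pval = kruskal(*depths)
print(f"KW-type statistic on depths: {stat:.2f}, p-value: {pval:.4f}")
```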

Journal ArticleDOI
TL;DR: In this article, a permutation approach is proposed for matched pairs, whose distributions can have different shapes even under the null hypothesis of no treatment effect, and the limit of the studentized permutation distribution under alternatives, which can be used for the construction of $(1-\alpha)$-confidence intervals.
Abstract: We consider nonparametric ranking methods for matched pairs, whose distributions can have different shapes even under the null hypothesis of no treatment effect. Although the data may not be exchangeable under the null, we investigate a permutation approach as a valid procedure for finite sample sizes. In particular, we derive the limit of the studentized permutation distribution under alternatives, which can be used for the construction of $(1-\alpha)$-confidence intervals. Simulation studies show that the new approach is more accurate than its competitors. The procedures are illustrated using a real data set.
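
A hedged sketch of the studentized permutation idea in its simplest mean-difference form (the paper works with rank-based effects, which this does not implement): swap the two members of each pair at random, equivalently flip the sign of each paired difference, and recompute a studentized statistic to build the reference distribution.

```python
import numpy as np

def studentized_permutation_test(x, y, n_perm=10_000, seed=0):
    """Paired two-sample test: randomly flip the sign of each paired difference
    and recompute a studentized statistic, yielding a permutation p-value."""
    rng = np.random.default_rng(seed)
    d = x - y
    n = len(d)

    def t_stat(diff):
        return np.sqrt(n) * diff.mean() / diff.std(ddof=1)

    t_obs = t_stat(d)
    signs = rng.choice([-1.0, 1.0], size=(n_perm, n))
    t_perm = np.array([t_stat(s * d) for s in signs])
    return t_obs, np.mean(np.abs(t_perm) >= abs(t_obs))

rng = np.random.default_rng(8)
x = rng.normal(0.3, 1.0, size=40)            # treatment measurement
y = rng.gamma(2.0, 0.5, size=40) - 1.0       # control: different shape, mean 0
print(studentized_permutation_test(x, y))
```

The studentization is the key point: it is what makes the permutation distribution a valid reference even when the pairs are not exchangeable under the null.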

Journal ArticleDOI
TL;DR: A parsimonious multivariate stochastic volatility model that embeds GGM uncertainty in a larger hierarchical framework is developed and shown to be capable of adapting to swings in market volatility, offering improved calibration of predictive distributions.
Abstract: The Gaussian Graphical Model (GGM) is a popular tool for incorporating sparsity into joint multivariate distributions. The G-Wishart distribution, a conjugate prior for precision matrices satisfying general GGM constraints, has now been in existence for over a decade. However, due to the lack of a direct sampler, its use has been limited in hierarchical Bayesian contexts, relegating mixing over the class of GGMs mostly to situations involving standard Gaussian likelihoods. Recent work has developed methods that couple model and parameter moves, first through reversible jump methods and later by direct evaluation of conditional Bayes factors and subsequent resampling. Further, methods for avoiding prior normalizing constant calculations, a serious bottleneck and source of numerical instability, have been proposed. We review and clarify these developments and then propose a new methodology for GGM comparison that blends many recent themes. Theoretical developments and computational timing experiments reveal an algorithm that has limited computational demands and dramatically improves on computing times of existing methods. We conclude by developing a parsimonious multivariate stochastic volatility model that embeds GGM uncertainty in a larger hierarchical framework. The method is shown to be capable of adapting to swings in market volatility, offering improved calibration of predictive distributions.

Journal ArticleDOI
TL;DR: This paper reviews covariance-based Partial Least Squares methods, focusing on common features of their respective algorithms and optimization criteria, and proposes three new PLS-type algorithms for the analysis of one, two or several blocks of variables.
Abstract: In this paper I review covariance-based Partial Least Squares (PLS) methods, focusing on common features of their respective algorithms and optimization criteria. I then show how these algorithms can be adjusted for use as optimal scaling tools. Three new PLS-type algorithms are proposed for the analysis of one, two or several blocks of variables: the Non-Metric NIPALS, the Non-Metric PLS Regression and the Non-Metric PLS Path Modeling, respectively. These algorithms extend the applicability of PLS methods to data measured on different measurement scales, as well as to variables linked by non-linear relationships.
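
For readers unfamiliar with NIPALS, here is the classical (metric) NIPALS loop for PLS1 regression with a single response; the non-metric variants proposed in the paper add an optimal-scaling step that is not shown here, and the toy data are mine.

```python
import numpy as np

def nipals_pls1(X, y, n_components=2):
    """Classical (metric) NIPALS for PLS1 regression with a single response.
    Returns scores T, loadings P, weights W and y-loadings q on centred data."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    T, P, W, q = [], [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)           # weight vector
        t = X @ w                        # X-scores
        p = X.T @ t / (t @ t)            # X-loadings
        c = (y @ t) / (t @ t)            # y-loading
        X = X - np.outer(t, p)           # deflation of X
        y = y - c * t                    # deflation of y
        T.append(t); P.append(p); W.append(w); q.append(c)
    return np.array(T).T, np.array(P).T, np.array(W).T, np.array(q)

rng = np.random.default_rng(9)
X = rng.normal(size=(100, 6))
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=100)
T, P, W, q = nipals_pls1(X, y, n_components=2)
y_hat = T @ q + y.mean()                  # fitted values on the training data
print("training R^2:", 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2))
```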

Journal ArticleDOI
TL;DR: In this paper, the authors prove oracle inequalities for three different types of adaptation procedures inspired by cross-validation and aggregation, which are then applied to the construction of Lasso estimators and aggregation with exponential weights with data-driven regularization and temperature parameters.
Abstract: We prove oracle inequalities for three different types of adaptation procedures inspired by cross-validation and aggregation. These procedures are then applied to the construction of Lasso estimators and aggregation with exponential weights with data-driven regularization and temperature parameters, respectively. We also prove oracle inequalities for the cross-validation procedure itself under some convexity assumptions.
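
As a concrete, everyday instance of data-driven regularization by cross-validation (not the paper's aggregation procedures or its oracle-inequality machinery), the sketch below lets scikit-learn's LassoCV pick the Lasso penalty on simulated sparse data.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(10)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:5] = [3, -2, 1.5, 1, 2]   # sparse signal
y = X @ beta + rng.normal(size=n)

model = LassoCV(cv=5).fit(X, y)          # regularization level chosen by 5-fold CV
print("selected alpha:", model.alpha_)
print("nonzero coefficients:", np.flatnonzero(np.abs(model.coef_) > 1e-6))
```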

Journal ArticleDOI
TL;DR: In this paper, the authors nonparametrically estimate a conditional copula when the covariate takes values in more complex spaces, extending recent estimators developed for the case of a univariate covariate to multivariate and functional covariates.
Abstract: In this paper the interest is to estimate the dependence between two variables conditionally upon a covariate, through copula modelling. In recent literature nonparametric estimators for conditional copula functions in case of a univariate covariate have been proposed. The aim of this paper is to nonparametrically estimate a conditional copula when the covariate takes on values in more complex spaces. We consider multivariate covariates and functional covariates. We establish weak convergence, and bias and variance properties of the proposed nonparametric estimators. We also briefly discuss nonparametric estimation of conditional association measures such as a conditional Kendall’s tau. The case of functional covariates is of particular interest and challenge, from both a theoretical and a practical point of view. For this setting we provide an illustration with a real data example in which the covariates are spectral curves. A simulation study investigating the finite-sample performances of the discussed estimators is provided.
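
A naive sketch of a kernel-weighted conditional Kendall's tau for a single real-valued covariate, the simplest case the paper builds on; handling multivariate or functional covariates would amount to replacing the scalar kernel distance below, and the bandwidth and simulated data are my own choices.

```python
import numpy as np

def conditional_kendalls_tau(U, V, X, x0, bandwidth=0.2):
    """Kernel-weighted Kendall's tau between U and V given the covariate X = x0.
    Naive O(n^2) estimator with a Gaussian kernel in the covariate."""
    w = np.exp(-0.5 * ((X - x0) / bandwidth) ** 2)
    num, den = 0.0, 0.0
    n = len(U)
    for i in range(n):
        for j in range(i + 1, n):
            wij = w[i] * w[j]
            num += wij * np.sign((U[i] - U[j]) * (V[i] - V[j]))
            den += wij
    return num / den

rng = np.random.default_rng(11)
n = 400
X = rng.uniform(0, 1, size=n)
rho = 0.9 * X                      # dependence strengthens with the covariate
Z1 = rng.normal(size=n)
Z2 = rho * Z1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
for x0 in [0.1, 0.5, 0.9]:
    print(x0, round(conditional_kendalls_tau(Z1, Z2, X, x0), 3))
```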

Journal ArticleDOI
TL;DR: In this paper, the intensity of a Gibbs point process model is computed as a function of the model parameters, which is an intractable function for the practical application of such models.
Abstract: The intensity of a Gibbs point process model is usually an intractable function of the model parameters. This is a severe restriction on the practical application of such models. We develop a new approximation for the intensity of a stationary Gibbs point process on $\mathbb{R}^{d}$. For pairwise interaction processes, the approximation can be computed rapidly and is surprisingly accurate. The new approximation is qualitatively similar to the mean field approximation, but is far more accurate, and does not exhibit the same pathologies. It may be regarded as a counterpart of the Percus-Yevick approximation.

Journal ArticleDOI
TL;DR: In this article, the estimation of conditional quantiles when the covariate is functional and when the order of the quantiles converges to one as the sample size increases was investigated, and sufficient conditions on the rate of convergence of their order to one were provided to obtain asymptotically Gaussian distributed estimators.
Abstract: We address the estimation of conditional quantiles when the covariate is functional and when the order of the quantiles converges to one as the sample size increases. First, we investigate to what extent these large conditional quantiles can still be estimated through a functional kernel estimator of the conditional survival function. Sufficient conditions on the rate of convergence of their order to one are provided to obtain asymptotically Gaussian distributed estimators. Second, based on these results, a functional Weissman estimator is derived, permitting the estimation of large conditional quantiles of arbitrarily large order. These results are illustrated on finite sample situations.

Journal ArticleDOI
TL;DR: In this paper, a variable selection method for VC models in quantile regression using a shrinkage idea is proposed, based on the basis expansion of each varying coefficient and the regularization penalty on the Euclidean norm of the corresponding coefficient vector.
Abstract: Varying coefficient (VC) models are commonly used to study dynamic patterns in many scientific areas. In particular, VC models in quantile regression are known to provide a more complete description of the response distribution than in mean regression. In this paper, we develop a variable selection method for VC models in quantile regression using a shrinkage idea. The proposed method is based on the basis expansion of each varying coefficient and the regularization penalty on the Euclidean norm of the corresponding coefficient vector. We show that our estimator is obtained as an optimal solution to the second order cone programming (SOCP) problem and that the proposed procedure has consistency in variable selection under suitable conditions. Further, we show that the estimated relevant coefficients converge to the true functions at the univariate optimal rate. Finally, the method is illustrated with numerical simulations including the analysis of forced expiratory volume (FEV) data.