Showing papers in "arXiv: Methodology in 2009"


Journal Article
TL;DR: In this paper, a general procedure is studied to perturb a multivariate density satisfying a weak form of multivariate symmetry, and to generate a whole set of non-symmetric densities.
Abstract: A fairly general procedure is studied to perturb a multivariate density satisfying a weak form of multivariate symmetry, and to generate a whole set of non-symmetric densities. The approach is general enough to encompass a number of recent proposals in the literature, variously related to the skew normal distribution. The special case of skew elliptical densities is examined in detail, establishing connections with existing similar work. The final part of the paper specializes further to a form of multivariate skew $t$ density. Likelihood inference for this distribution is examined, and it is illustrated with numerical examples.

1,174 citations


Journal Article
TL;DR: This paper examines further probabilistic properties of the multivariate skew-normal distribution of Azzalini and Dalla Valle (1996), which extends the class of normal distributions by the addition of a shape parameter, and discusses inferential and other statistical issues.
Abstract: Azzalini & Dalla Valle (1996) have recently discussed the multivariate skew-normal distribution which extends the class of normal distributions by the addition of a shape parameter. The first part of the present paper examines further probabilistic properties of the distribution, with special emphasis on aspects of statistical relevance. Inferential and other statistical issues are discussed in the following part, with applications to some multivariate statistics problems, illustrated by numerical examples. Finally, a further extension is described which introduces a skewing factor of an elliptical density.

1,046 citations
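
As a rough illustration of how a shape parameter skews a normal density, the sketch below evaluates the univariate special case, 2 φ(x) Φ(αx), in Python. The function names and the trapezoid integrator are illustrative assumptions, not part of the paper, and the multivariate case treated in the abstract is not covered here.

```python
import math


def normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal distribution function Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def skew_normal_pdf(x, alpha):
    """Univariate skew-normal density 2 * phi(x) * Phi(alpha * x).

    alpha is the shape parameter; alpha = 0 recovers the standard normal.
    """
    return 2.0 * normal_pdf(x) * normal_cdf(alpha * x)


def skew_normal_mean(alpha):
    """Closed-form mean: delta * sqrt(2 / pi), with delta = alpha / sqrt(1 + alpha^2)."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    return delta * math.sqrt(2.0 / math.pi)


def integrate(f, lo, hi, n=20000):
    """Plain trapezoid rule, used here only as a numerical sanity check."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h
```

Integrating `skew_normal_pdf` over a wide interval should give a value close to 1 for any `alpha`, which is a quick check that the factor of 2 normalizes the perturbed density.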


Journal Article
TL;DR: The analysis of fMRI data is discussed, from the initial acquisition of the raw data to its use in locating brain activity, making inference about brain connectivity and predictions about psychological or disease states.
Abstract: In recent years there has been explosive growth in the number of neuroimaging studies performed using functional Magnetic Resonance Imaging (fMRI). The field that has grown around the acquisition and analysis of fMRI data is intrinsically interdisciplinary in nature and involves contributions from researchers in neuroscience, psychology, physics and statistics, among others. A standard fMRI study gives rise to massive amounts of noisy data with a complicated spatio-temporal correlation structure. Statistics plays a crucial role in understanding the nature of the data and obtaining relevant results that can be used and interpreted by neuroscientists. In this paper we discuss the analysis of fMRI data, from the initial acquisition of the raw data to its use in locating brain activity, making inference about brain connectivity and predictions about psychological or disease states. Along the way, we illustrate interesting and important issues where statistics already plays a crucial role. We also seek to illustrate areas where statistics has perhaps been underutilized and will have an increased role in the future.

607 citations


Journal Article
TL;DR: It is shown that the proposed methods also possess the sure screening property with vanishing false selection rate, which justifies the applicability of such a simple method in a wide spectrum.
Abstract: Ultrahigh-dimensional variable selection plays an increasingly important role in contemporary scientific discoveries and statistical research. Among others, Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849-911] propose an independent screening framework by ranking the marginal correlations. They showed that the correlation ranking procedure possesses a sure independence screening property within the context of the linear model with Gaussian covariates and responses. In this paper, we propose a more general version of the independent learning with ranking the maximum marginal likelihood estimates or the maximum marginal likelihood itself in generalized linear models. We show that the proposed methods, with Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849-911] as a very special case, also possess the sure screening property with vanishing false selection rate. The conditions under which independence learning possesses the sure screening property are surprisingly simple. This justifies the applicability of such a simple method in a wide spectrum. We quantify explicitly the extent to which the dimensionality can be reduced by independence screening, which depends on the interactions of the covariance matrix of covariates and true parameters. Simulation studies are used to illustrate the utility of the proposed approaches. In addition, we establish an exponential inequality for the quasi-maximum likelihood estimator which is useful for high-dimensional statistical learning.

538 citations
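
The correlation-ranking special case mentioned in the abstract can be sketched in a few lines of Python. This is only the linear-model version attributed to Fan and Lv (rank features by absolute marginal correlation and keep the top few), not the generalized-linear-model procedure the paper proposes. The function names and the `keep` parameter are hypothetical.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant column carries no marginal signal
    return sxy / math.sqrt(sxx * syy)


def screen_by_marginal_correlation(columns, y, keep):
    """Return the indices of the `keep` features with the largest
    absolute marginal correlation with the response y."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]
```

Each feature is scored on its own, so the cost grows linearly in the number of features, which is what makes this kind of screening attractive when the dimension is very large.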


Posted Content
TL;DR: This paper investigates a new learning formulation called structured sparsity, which is a natural extension of the standard sparsity concept in statistical learning and compressive sensing by allowing arbitrary structures on the feature set, which generalizes the group sparsity idea.
Abstract: This paper investigates a new learning formulation called structured sparsity, which is a natural extension of the standard sparsity concept in statistical learning and compressive sensing. By allowing arbitrary structures on the feature set, this concept generalizes the group sparsity idea that has become popular in recent years. A general theory is developed for learning with structured sparsity, based on the notion of coding complexity associated with the structure. It is shown that if the coding complexity of the target signal is small, then one can achieve improved performance by using coding complexity regularization methods, which generalize the standard sparse regularization. Moreover, a structured greedy algorithm is proposed to efficiently solve the structured sparsity problem. It is shown that the greedy algorithm approximately solves the coding complexity optimization problem under appropriate conditions. Experiments are included to demonstrate the advantage of structured sparsity over standard sparsity on some real applications.

457 citations


Posted Content
TL;DR: In this paper, the authors present the results of a simulation study into the properties of 12 different estimators of the Hurst parameter, H, or the fractional integration parameter, d, in long memory time series.
Abstract: We present the results of a simulation study into the properties of 12 different estimators of the Hurst parameter, H, or the fractional integration parameter, d, in long memory time series. We compare and contrast their performance on simulated Fractional Gaussian Noises and fractionally integrated series with lengths between 100 and 10,000 data points and H values between 0.55 and 0.90 or d values between 0.05 and 0.40. We apply all 12 estimators to the Campito Mountain data and estimate the accuracy of their estimates using the Beran goodness of fit test for long memory time series. MCS code: 37M10. Keywords and phrases: Strong dependence, Global dependence, Long range dependence, Hurst parameter estimators.

397 citations


Posted Content
TL;DR: In this paper, the authors developed inferentially practical, likelihood-based methods for fitting max-stable processes derived from a composite-likelihood approach, which is sufficiently reliable and versatile to permit the simultaneous modeling of marginal and dependence parameters in the spatial context at a moderate computational cost.
Abstract: The last decade has seen max-stable processes emerge as a common tool for the statistical modeling of spatial extremes. However, their application is complicated due to the unavailability of the multivariate density function, and so likelihood-based methods remain far from providing a complete and flexible framework for inference. In this article we develop inferentially practical, likelihood-based methods for fitting max-stable processes derived from a composite-likelihood approach. The procedure is sufficiently reliable and versatile to permit the simultaneous modeling of marginal and dependence parameters in the spatial context at a moderate computational cost. The utility of this methodology is examined via simulation, and illustrated by the analysis of U.S. precipitation extremes.

339 citations


Journal Article
TL;DR: This report discusses the use and misuse of citation data in the assessment of scientific research, and argues that the belief that citation statistics are inherently more accurate than peer review, because they substitute simple numbers for complex judgments, is unfounded.
Abstract: This is a report about the use and misuse of citation data in the assessment of scientific research. The idea that research assessment must be done using "simple and objective" methods is increasingly prevalent today. The "simple and objective" methods are broadly interpreted as bibliometrics, that is, citation data and the statistics derived from them. There is a belief that citation statistics are inherently more accurate because they substitute simple numbers for complex judgments, and hence overcome the possible subjectivity of peer review. But this belief is unfounded.

182 citations


Journal Article
TL;DR: In this paper, partial identification results for average and quantile effects are given for discrete regressors, under static or dynamic conditions, in fully nonparametric and in semiparametric models, with time effects.
Abstract: Nonseparable panel models are important in a variety of economic settings, including discrete choice. This paper gives identification and estimation results for nonseparable models under time homogeneity conditions that are like "time is randomly assigned" or "time is an instrument." Partial identification results for average and quantile effects are given for discrete regressors, under static or dynamic conditions, in fully nonparametric and in semiparametric models, with time effects. It is shown that the usual, linear, fixed-effects estimator is not a consistent estimator of the identified average effect, and a consistent estimator is given. A simple estimator of identified quantile treatment effects is given, providing a solution to the important problem of estimating quantile treatment effects from panel data. Bounds for overall effects in static and dynamic models are given. The dynamic bounds provide a partial identification solution to the important problem of estimating the effect of state dependence in the presence of unobserved heterogeneity. The impact of $T$, the number of time periods, is shown by deriving shrinkage rates for the identified set as $T$ grows. We also consider semiparametric, discrete-choice models and find that semiparametric panel bounds can be much tighter than nonparametric bounds. Computationally-convenient methods for semiparametric models are presented. We propose a novel inference method that applies in panel data and other settings and show that it produces uniformly valid confidence regions in large samples. We give empirical illustrations.

173 citations


Journal Article
TL;DR: In this article, the authors investigate a general framework for combining regularized regression methods with the estimation of Graphical Gaussian models, including various existing methods as well as two new approaches based on ridge regression and adaptive lasso, respectively.
Abstract: Graphical Gaussian models are popular tools for the estimation of (undirected) gene association networks from microarray data. A key issue when the number of variables greatly exceeds the number of samples is the estimation of the matrix of partial correlations. Since the (Moore-Penrose) inverse of the sample covariance matrix leads to poor estimates in this scenario, standard methods are inappropriate and adequate regularization techniques are needed. In this article, we investigate a general framework for combining regularized regression methods with the estimation of Graphical Gaussian models. This framework includes various existing methods as well as two new approaches based on ridge regression and adaptive lasso, respectively. These methods are extensively compared both qualitatively and quantitatively within a simulation study and through an application to six diverse real data sets. In addition, all proposed algorithms are implemented in the R package "parcor", available from the R repository CRAN.

129 citations
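
For readers unfamiliar with the link between partial correlations and the inverse covariance, the following Python sketch converts a precision matrix into a matrix of partial correlations using the standard identity ρ_ij = −ω_ij / sqrt(ω_ii ω_jj). It assumes a usable (for example, regularized) precision matrix is already available as a list of lists; it does not implement any of the regularized regression estimators compared in the paper, and the function name is hypothetical.

```python
import math


def partial_correlations(precision):
    """Convert a precision (inverse covariance) matrix into partial correlations.

    Off-diagonal entries use rho_ij = -omega_ij / sqrt(omega_ii * omega_jj);
    the diagonal is set to 1 by convention.
    """
    p = len(precision)
    out = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i == j:
                out[i][j] = 1.0
            else:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                out[i][j] = -precision[i][j] / scale
    return out
```

A zero off-diagonal entry in the precision matrix gives a zero partial correlation, which corresponds to a missing edge in the Gaussian graphical model.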


Posted Content
TL;DR: This paper investigates the use of the Metropolis-Hastings algorithm to compute a pseudo-posterior distribution based on the composite likelihood, and presents two methodologies for adjusting the algorithm.
Abstract: Composite likelihoods are increasingly used in applications where the full likelihood is analytically unknown or computationally prohibitive. Although the maximum composite likelihood estimator has frequentist properties akin to those of the usual maximum likelihood estimator, Bayesian inference based on composite likelihoods has yet to be explored. In this paper we investigate the use of the Metropolis-Hastings algorithm to compute a pseudo-posterior distribution based on the composite likelihood. Two methodologies for adjusting the algorithm are presented and their performance on approximating the true posterior distribution is investigated using simulated data sets and real data on spatial extremes of rainfall.
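
A minimal Python sketch of the unadjusted idea: a random-walk Metropolis-Hastings sampler whose target is a flat prior times a pairwise composite likelihood. The toy model (independent unit-variance normal observations with unknown mean), the flat prior, and all function names are assumptions made for illustration; the adjustments studied in the paper are not implemented.

```python
import math
import random


def log_normal_pdf(y, theta):
    """Log density of N(theta, 1) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (y - theta) ** 2


def pairwise_loglik(theta, data):
    """Pairwise composite log-likelihood: every pair (i, j) contributes
    the log density of both members (toy model, assumed for illustration)."""
    total = 0.0
    n = len(data)
    for i in range(n):
        for j in range(i + 1, n):
            total += log_normal_pdf(data[i], theta) + log_normal_pdf(data[j], theta)
    return total


def metropolis_hastings(log_target, start, step, n_iter, rng):
    """Random-walk Metropolis-Hastings with Gaussian proposals."""
    x = start
    lp = log_target(x)
    draws = []
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
        draws.append(x)
    return draws
```

In this toy model each observation appears in n − 1 pairs, so the composite log-likelihood is n − 1 times the full log-likelihood and the resulting pseudo-posterior is narrower than the true posterior. That overconcentration is the kind of effect that motivates adjusting the algorithm.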

Posted Content
TL;DR: In this article, a regression interpretation of the Cholesky factor of the covariance matrix is proposed, which leads to a new class of regularized covariance estimators suitable for high-dimensional problems.
Abstract: In this paper we propose a new regression interpretation of the Cholesky factor of the covariance matrix, as opposed to the well known regression interpretation of the Cholesky factor of the inverse covariance, which leads to a new class of regularized covariance estimators suitable for high-dimensional problems. Regularizing the Cholesky factor of the covariance via this regression interpretation always results in a positive definite estimator. In particular, one can obtain a positive definite banded estimator of the covariance matrix at the same computational cost as the popular banded estimator proposed by Bickel and Levina (2008b), which is not guaranteed to be positive definite. We also establish theoretical connections between banding Cholesky factors of the covariance matrix and its inverse and constrained maximum likelihood estimation under the banding constraint, and compare the numerical performance of several methods in simulations and on a sonar data example.

Journal Article
TL;DR: This paper generalizes the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties.
Abstract: We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual $d$-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or "underlying" distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a "split-and-merge" procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.

Posted Content
TL;DR: In this paper, the authors present a new method of trend extraction in the framework of the Singular Spectrum Analysis (SSA) approach, which is easy to use, does not need specification of models of time series and trend, allows trend extraction in the presence of noise and oscillations, and has only two parameters (besides the basic SSA parameter called window length).
Abstract: The paper presents a new method of trend extraction in the framework of the Singular Spectrum Analysis (SSA) approach. This method is easy to use, does not need specification of models of time series and trend, allows trend extraction in the presence of noise and oscillations, and has only two parameters (besides the basic SSA parameter called window length). One parameter manages the scale of the extracted trend and the other is a method-specific threshold value. We propose procedures for the choice of the parameters. The presented method is evaluated on a simulated time series with a polynomial trend and an oscillating component with unknown period and on the seasonally adjusted monthly data of unemployment level in Alaska for the period 1976/01–2006/09.

Posted Content
TL;DR: In this article, the authors develop modeling and inference tools for counterfactual distributions based on regression methods and derive joint functional central limit theorems and bootstrap validity results for regression-based estimators of the status quo and counterfactual outcome distributions.
Abstract: Counterfactual distributions are important ingredients for policy analysis and decomposition analysis in empirical economics. In this article we develop modeling and inference tools for counterfactual distributions based on regression methods. The counterfactual scenarios that we consider consist of ceteris paribus changes in either the distribution of covariates related to the outcome of interest or the conditional distribution of the outcome given covariates. For either of these scenarios we derive joint functional central limit theorems and bootstrap validity results for regression-based estimators of the status quo and counterfactual outcome distributions. These results allow us to construct simultaneous confidence sets for function-valued effects of the counterfactual changes, including the effects on the entire distribution and quantile functions of the outcome as well as on related functionals. These confidence sets can be used to test functional hypotheses such as no-effect, positive effect, or stochastic dominance. Our theory applies to general counterfactual changes and covers the main regression methods including classical, quantile, duration, and distribution regressions. We illustrate the results with an empirical application to wage decompositions using data for the United States. As a part of developing the main results, we introduce distribution regression as a comprehensive and flexible tool for modeling and estimating the entire conditional distribution. We show that distribution regression encompasses the Cox duration regression and represents a useful alternative to quantile regression. We establish functional central limit theorems and bootstrap validity results for the empirical distribution regression process and various related functionals.

Journal Article
TL;DR: In this paper, the authors present methods of sensitivity analysis to adjust interval estimates of treatment effect obtained using multiple linear regression, which adapts to treatment effects that may differ by subgroup, to scenarios involving omission of multiple variables, and to combinations of covariance adjustment with propensity score stratification.
Abstract: Omitted variable bias can affect treatment effect estimates obtained from observational data due to the lack of random assignment to treatment groups. Sensitivity analyses adjust these estimates to quantify the impact of potential omitted variables. This paper presents methods of sensitivity analysis to adjust interval estimates of treatment effect (both the point estimate and standard error) obtained using multiple linear regression. Central to our approach is what we term benchmarking, the use of data to establish reference points for speculation about omitted confounders. The method adapts to treatment effects that may differ by subgroup, to scenarios involving omission of multiple variables, and to combinations of covariance adjustment with propensity score stratification. We illustrate it using data from an influential study of health outcomes of patients admitted to critical care.

Posted Content
TL;DR: This review gives an overview of the historical development of statistical network modeling, introduces a number of examples that have been studied in the network literature, and then focuses on a number of prominent static and dynamic network models and their interconnections.
Abstract: Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics.

Posted Content
TL;DR: This work proposes a new SMC algorithm to compute the expectation of additive functionals recursively and shows how this allows recursive parameter estimation using an SMC implementation of an on-line version of the Expectation-Maximization algorithm which does not suffer from the particle path degeneracy problem.
Abstract: Sequential Monte Carlo (SMC) methods are a widely used set of computational tools for inference in non-linear non-Gaussian state-space models. We propose a new SMC algorithm to compute the expectation of additive functionals recursively. Essentially, it is an on-line or "forward only" implementation of a forward filtering backward smoothing SMC algorithm proposed by Doucet, Godsill and Andrieu (2000). Compared to the standard path space SMC estimator whose asymptotic variance increases quadratically with time even under favorable mixing assumptions, the non-asymptotic variance of the proposed SMC estimator only increases linearly with time. We show how this allows us to perform recursive parameter estimation using an SMC implementation of an on-line version of the Expectation-Maximization algorithm which does not suffer from the particle path degeneracy problem.

Posted Content
TL;DR: In this article, the authors considered a zero mean discrete time series and defined its discrete Fourier transform at the canonical frequencies, and constructed a Portmanteau type test statistic for testing stationarity of the time series.
Abstract: We consider a zero mean discrete time series, and define its discrete Fourier transform at the canonical frequencies. It is well known that the discrete Fourier transform is asymptotically uncorrelated at the canonical frequencies if and only if the time series is second order stationary. Exploiting this important property, we construct a Portmanteau type test statistic for testing stationarity of the time series. It is shown that under the null of stationarity, the test statistic approximately follows a chi-square distribution. To examine the power of the test statistic, the asymptotic distribution under the locally stationary alternative is established. It is shown to be a type of noncentral chi-square, where the noncentrality parameter measures the deviation from stationarity. The test is illustrated with simulations, where it is shown to have good power. Some real examples are also included to illustrate the test.

Posted Content
TL;DR: In this article, the authors consider nonparametric regression in the context of functional data, that is, when a random sample of functions is observed on a fine grid, and obtain a functional asymptotic normality result that allows simultaneous confidence bands (SCB) to be built for various estimation and inference tasks.
Abstract: We consider nonparametric regression in the context of functional data, that is, when a random sample of functions is observed on a fine grid. We obtain a functional asymptotic normality result that allows us to build simultaneous confidence bands (SCB) for various estimation and inference tasks. Two applications are proposed: an SCB procedure for the regression function and a goodness-of-fit test for curvilinear regression models. The first one improves in accuracy upon the other available methods, while the second can detect local departures from a parametric shape, as opposed to the usual goodness-of-fit tests which only track global departures. A numerical study of the SCB procedures and an illustration with a speech data set are provided.

Posted Content
TL;DR: In this article, the authors present a test for decentralized sequential hypothesis testing, which is asymptotically optimum in the case of continuous time and continuous path signals, while in discrete time this strong optimality property is preserved under proper conditions.
Abstract: We present a test for the problem of decentralized sequential hypothesis testing, which is asymptotically optimum. By selecting a suitable sampling mechanism at each sensor, communication between sensors and fusion center is asynchronous and limited to 1-bit data. The proposed SPRT-like test turns out to be order-2 asymptotically optimum in the case of continuous time and continuous path signals, while in discrete time this strong asymptotic optimality property is preserved under proper conditions. If these conditions do not hold, then we can show optimality of order-1. Simulations corroborate the excellent performance characteristics of the test of interest.
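
For orientation, the classical centralized sequential probability ratio test (Wald's SPRT) that the "SPRT-like" test builds on can be sketched as follows. This is not the decentralized, 1-bit scheme of the paper: it assumes a single observer, Gaussian observations with known standard deviation, and Wald's approximate thresholds. All names and parameters are hypothetical.

```python
import math


def sprt_gaussian(observations, mu0, mu1, sigma, alpha, beta):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1.

    alpha and beta are the target type I and type II error rates.
    Returns (decision, number_of_samples_used).
    """
    upper = math.log((1.0 - beta) / alpha)   # crossing it accepts H1
    lower = math.log(beta / (1.0 - alpha))   # crossing it accepts H0
    llr = 0.0
    for n, x in enumerate(observations, start=1):
        # Log-likelihood ratio increment for one Gaussian observation.
        llr += (mu1 - mu0) / sigma ** 2 * (x - 0.5 * (mu0 + mu1))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(observations)
```

The test stops as soon as the running log-likelihood ratio leaves the interval between the two thresholds, so the sample size is random and is typically small when the data clearly favor one hypothesis.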

Journal Article
TL;DR: In this paper, a nonparametric Pólya tree prior is proposed for two-sample hypothesis testing, which leads to an analytic expression for the marginal likelihood under the two hypotheses and hence an explicit measure of the probability of the null hypothesis.
Abstract: In this article we describe Bayesian nonparametric procedures for two-sample hypothesis testing. Namely, given two sets of samples $\mathbf{y}^{(1)} \stackrel{iid}{\sim} F^{(1)}$ and $\mathbf{y}^{(2)} \stackrel{iid}{\sim} F^{(2)}$, with $F^{(1)}, F^{(2)}$ unknown, we wish to evaluate the evidence for the null hypothesis $H_0: F^{(1)} \equiv F^{(2)}$ versus the alternative $H_1: F^{(1)} \neq F^{(2)}$. Our method is based upon a nonparametric Pólya tree prior centered either subjectively or using an empirical procedure. We show that the Pólya tree prior leads to an analytic expression for the marginal likelihood under the two hypotheses and hence an explicit measure of the probability of the null $\mathrm{Pr}(H_0 \mid \{\mathbf{y}^{(1)}, \mathbf{y}^{(2)}\})$.

Posted Content
TL;DR: This thesis discusses some extensions of cross-validation to unsupervised learning, specifically focusing on the problem of choosing how many principal components to keep. It introduces the latent factor model, defines an objective criterion, and shows how CV can be used to estimate the intrinsic dimensionality of a data set.
Abstract: Cross-validation (CV) is a popular method for model-selection. Unfortunately, it is not immediately obvious how to apply CV to unsupervised or exploratory contexts. This thesis discusses some extensions of cross-validation to unsupervised learning, specifically focusing on the problem of choosing how many principal components to keep. We introduce the latent factor model, define an objective criterion, and show how CV can be used to estimate the intrinsic dimensionality of a data set. Through both simulation and theory, we demonstrate that cross-validation is a valuable tool for unsupervised learning.

Posted Content
TL;DR: This work presents an expansion for the mean of Q under the null hypothesis that is valid when the effect and the weight for each study depend on a single parameter, but for which neither normality nor independence of the effect and weight estimators is needed.
Abstract: Meta-analysis seeks to combine the results of several experiments in order to improve the accuracy of decisions. It is common to use a test for homogeneity to determine if the results of the several experiments are sufficiently similar to warrant their combination into an overall result. Cochran's Q statistic is frequently used for this homogeneity test. It is often assumed that Q follows a chi-square distribution under the null hypothesis of homogeneity, but it has long been known that this asymptotic distribution for Q is not accurate for moderate sample sizes. Here we present formulas for the mean and variance of Q under the null hypothesis which represent O(1/n) corrections to the corresponding chi-square moments in the one parameter case. The formulas are fairly complicated, and so we provide a program (available at this http URL) for making the necessary calculations. We apply the results to the standardized mean difference (Cohen's d-statistic) and consider two approximations: a gamma distribution with estimated shape and scale parameters and the chi-square distribution with fractional degrees of freedom equal to the estimated mean of Q. We recommend the latter distribution as an approximate distribution for Q to use for testing the null hypothesis.
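
As a reminder of what is being approximated, Cochran's Q in its usual inverse-variance form can be computed as in the Python sketch below. The function name and the assumption that each study supplies an effect estimate with a known sampling variance are illustrative; the O(1/n) moment corrections described in the abstract are not implemented.

```python
def cochran_q(effects, variances):
    """Cochran's Q for homogeneity with inverse-variance weights.

    effects:   per-study effect estimates y_i
    variances: per-study sampling variances v_i (weights are w_i = 1 / v_i)
    Returns (Q, pooled_effect), where Q = sum_i w_i * (y_i - pooled)^2.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, pooled
```

Under the conventional approximation, Q is compared with a chi-square distribution with k − 1 degrees of freedom for k studies; the abstract's point is that this reference distribution is not accurate for moderate sample sizes.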

Posted Content
TL;DR: The computational results demonstrate that the proposed algorithm framework and methods are capable of solving problems of size at least a thousand and number of constraints of nearly a half million within a reasonable amount of time, and that the ASPG method generally outperforms the ANS method and glasso.
Abstract: In this paper, we consider estimating sparse inverse covariance of a Gaussian graphical model whose conditional independence is assumed to be partially known. Similarly as in [5], we formulate it as an $l_1$-norm penalized maximum likelihood estimation problem. Further, we propose an algorithm framework, and develop two first-order methods, that is, the adaptive spectral projected gradient (ASPG) method and the adaptive Nesterov's smooth (ANS) method, for solving this estimation problem. Finally, we compare the performance of these two methods on a set of randomly generated instances. Our computational results demonstrate that both methods are able to solve problems of size at least a thousand and number of constraints of nearly a half million within a reasonable amount of time, and the ASPG method generally outperforms the ANS method.

Posted Content
Jianqing Fan
TL;DR: This paper extends ISIS, without explicit definition of residuals, to a general pseudo-likelihood framework, which includes generalized linear models as a special case, and introduces a new technique to reduce the false discovery rate in the feature screening stage.
Abstract: Variable selection in high-dimensional space characterizes many contemporary problems in scientific discovery and decision making. Many frequently-used techniques are based on independence screening; examples include correlation ranking or feature selection using a two-sample t-test in high-dimensional classification. Within the context of the linear model, Fan and Lv (2008) showed that this simple correlation ranking possesses a sure independence screening property under certain conditions and that its revision, called iteratively sure independent screening (ISIS), is needed when the features are marginally unrelated but jointly related to the response variable. In this paper, we extend ISIS, without explicit definition of residuals, to a general pseudo-likelihood framework, which includes generalized linear models as a special case. Even in the least-squares setting, the new method improves ISIS by allowing variable deletion in the iterative process. Our technique allows us to select important features in high-dimensional classification where the popularly used two-sample t-method fails. A new technique is introduced to reduce the false discovery rate in the feature screening stage. Several simulated and two real data examples are presented to illustrate the methodology.
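The simple correlation ranking mentioned in this abstract can be sketched as follows. This is only the basic marginal screening step, not the iterative ISIS procedure or its pseudo-likelihood extension; the function names and the `keep` parameter are hypothetical choices for illustration.

```python
from math import sqrt


def abs_correlation(x, y):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no marginal signal
    return abs(cov) / sqrt(var_x * var_y)


def screen_by_correlation(features, response, keep):
    """Return sorted indices of the `keep` features most correlated with the response.

    `features` is a list of feature columns, each the same length as `response`.
    """
    scores = [abs_correlation(col, response) for col in features]
    ranked = sorted(range(len(features)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


if __name__ == "__main__":
    cols = [[1, 2, 3, 4], [1, -1, 1, -1], [2, 4, 6, 9]]
    print(screen_by_correlation(cols, [1, 2, 3, 4], keep=2))
```

Because the ranking looks at each feature in isolation, it can miss features that matter only jointly with others, which is the situation the abstract says motivates the iterative variant.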

Posted Content
TL;DR: This paper demonstrates how with complete sampling, conjugate closed form model selection based on product Dirichlet priors is possible, and proves that suitable homogeneity assumptions characterise the product Dirichlet prior on this class of models.
Abstract: The class of chain event graph models is a generalisation of the class of discrete Bayesian networks, retaining most of the structural advantages of the Bayesian network for model interrogation, propagation and learning, while more naturally encoding asymmetric state spaces and the order in which events happen. In this paper we demonstrate how with complete sampling, conjugate closed form model selection based on product Dirichlet priors is possible, and prove that suitable homogeneity assumptions characterise the product Dirichlet prior on this class of models. We demonstrate our techniques using two educational examples.

Journal ArticleDOI
TL;DR: The Overlapping Stochastic Block Model (OSBM) allows the vertices to belong to multiple clusters and, to some extent, generalizes the well-known stochastic block model [Nowicki and Snijders (2001)].
Abstract: Complex systems in nature and in society are often represented as networks, describing the rich set of interactions between objects of interest. Many deterministic and probabilistic clustering methods have been developed to analyze such structures. Given a network, almost all of them partition the vertices into disjoint clusters, according to their connection profile. However, recent studies have shown that these techniques were too restrictive and that most of the existing networks contained overlapping clusters. To tackle this issue, we present in this paper the Overlapping Stochastic Block Model. Our approach allows the vertices to belong to multiple clusters, and, to some extent, generalizes the well-known Stochastic Block Model [Nowicki and Snijders (2001)]. We show that the model is generically identifiable within classes of equivalence and we propose an approximate inference procedure, based on global and local variational techniques. Using toy data sets as well as the French Political Blogosphere network and the transcriptional network of Saccharomyces cerevisiae, we compare our work with other approaches.

Posted Content
TL;DR: In this paper, a Horvitz-Thompson estimator of the mean trajectory is proposed to obtain uniformly consistent estimators of both the mean function and its variance function under mild regularity conditions.
Abstract: When dealing with very large datasets of functional data, survey sampling approaches are useful in order to obtain estimators of simple functional quantities, without being obliged to store all the data. We propose here a Horvitz-Thompson estimator of the mean trajectory. In the context of a superpopulation framework, we prove under mild regularity conditions that we obtain uniformly consistent estimators of the mean function and of its variance function. With additional assumptions on the sampling design we state a functional Central Limit Theorem and deduce asymptotic confidence bands. Stratified sampling is studied in detail, and we also obtain a functional version of the usual optimal allocation rule considering a mean variance criterion. These techniques are illustrated by means of a test population of N=18902 electricity meters for which we have individual electricity consumption measures every 30 minutes over one week. We show that, compared to simple random sampling without replacement, stratification can substantially improve the accuracy of the estimators and reduce the width of the global confidence bands.
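A minimal sketch of a Horvitz-Thompson estimator of a mean trajectory, assuming each sampled curve is observed on a common time grid and that first-order inclusion probabilities are known. The function name and argument layout are hypothetical; the variance estimation and confidence bands described in the abstract are not covered.

```python
def horvitz_thompson_mean_curve(sampled_curves, inclusion_probs, population_size):
    """Estimate the population mean curve from a probability sample.

    sampled_curves  : list of curves, each a list of values on the same time grid
    inclusion_probs : inclusion probability pi_k of each sampled unit
    population_size : N, the number of units in the full population
    """
    n_points = len(sampled_curves[0])
    estimate = [0.0] * n_points
    for curve, pi_k in zip(sampled_curves, inclusion_probs):
        for t in range(n_points):
            # Each sampled unit is weighted by the inverse of its inclusion probability.
            estimate[t] += curve[t] / pi_k
    return [value / population_size for value in estimate]


if __name__ == "__main__":
    curves = [[1.0, 2.0], [3.0, 4.0]]
    print(horvitz_thompson_mean_curve(curves, [0.5, 0.5], population_size=4))
```

Under stratified sampling, the inclusion probability of a unit in stratum h would be n_h / N_h, so the same function applies once those probabilities are supplied.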

Posted Content
TL;DR: In this article, the authors proposed a local change-point analysis to estimate the volatility of financial time series when the stationarity assumption is violated, based on an adaptive pointwise selection of the largest interval of homogeneity with a given right-end point.
Abstract: This paper offers a new method for estimation and forecasting of the volatility of financial time series when the stationarity assumption is violated. Our general local parametric approach particularly applies to general varying-coefficient parametric models, such as GARCH, whose coefficients may arbitrarily vary with time. Global parametric, smooth transition, and change-point models are special cases. The method is based on an adaptive pointwise selection of the largest interval of homogeneity with a given right-end point by a local change-point analysis. We construct locally adaptive estimates that can perform this task and investigate them both from the theoretical point of view and by Monte Carlo simulations. In the particular case of GARCH estimation, the proposed method is applied to stock-index series and is shown to outperform the standard parametric GARCH model.