Showing papers in "arXiv: Methodology in 2021"


Journal Article
TL;DR: In this article, the authors investigate an alternative relative accuracy measure that avoids the bias of MAPE: the log of the accuracy ratio, log(prediction / actual). Relative accuracy is particularly relevant when the scatter in the data grows with the value of the variable (heteroscedasticity).
Abstract: Surveys show that the mean absolute percentage error (MAPE) is the most widely used measure of forecast accuracy in businesses and organizations. It is, however, biased: when used to select among competing prediction methods, it systematically selects those whose predictions are too low. This is not widely discussed and so is not generally known among practitioners. We explain why this happens. We investigate an alternative relative accuracy measure which avoids this bias: the log of the accuracy ratio, log(prediction / actual). Relative accuracy is particularly relevant if the scatter in the data grows as the value of the variable grows (heteroscedasticity). We demonstrate using simulations that for heteroscedastic data (modelled by a multiplicative error factor) the proposed metric is far superior to MAPE for model selection. Another use for accuracy measures is in fitting parameters to prediction models. Minimum MAPE models do not predict a simple statistic and so theoretical analysis is limited. We prove that when the proposed metric is used instead, the resulting least squares regression model predicts the geometric mean. This important property allows its theoretical properties to be understood.

62 citations
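
The contrast drawn in the abstract above can be illustrated with a small sketch. It assumes the simplest possible "model" (a single constant prediction) and made-up positive data; the function names and the grid search are illustrative choices, not taken from the paper.

```python
import math

def mape(pred, actuals):
    """Mean absolute percentage error of one constant prediction."""
    return sum(abs(pred - a) / a for a in actuals) / len(actuals)

def sum_sq_log_ratio(pred, actuals):
    """Sum of squared log accuracy ratios, log(prediction / actual)."""
    return sum(math.log(pred / a) ** 2 for a in actuals)

def geometric_mean(actuals):
    return math.exp(sum(math.log(a) for a in actuals) / len(actuals))

# Hypothetical data with multiplicative spread.
actuals = [1.0, 2.0, 4.0, 8.0, 16.0]

# Brute-force search over candidate constant predictions (0.50 to 20.49).
grid = [0.5 + 0.01 * i for i in range(2000)]
best_log = min(grid, key=lambda p: sum_sq_log_ratio(p, actuals))
best_mape = min(grid, key=lambda p: mape(p, actuals))

print("geometric mean:", geometric_mean(actuals))
print("best constant under squared log ratio:", best_log)
print("best constant under MAPE:", best_mape)
```

On this toy data the squared-log criterion settles at the geometric mean, while the MAPE-optimal constant is considerably lower, which is the direction of bias the abstract describes.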


Journal Article
TL;DR: This article argued that alpha adjustment is only appropriate in the case of disjunction testing, in which at least one test result must be significant in order to reject the associated joint null hypothesis.
Abstract: Scientists often adjust their significance threshold (alpha level) during null hypothesis significance testing in order to take into account multiple testing and multiple comparisons. This alpha adjustment has become particularly relevant in the context of the replication crisis in science. The present article considers the conditions in which this alpha adjustment is appropriate and the conditions in which it is inappropriate. A distinction is drawn between three types of multiple testing: disjunction testing, conjunction testing, and individual testing. It is argued that alpha adjustment is only appropriate in the case of disjunction testing, in which at least one test result must be significant in order to reject the associated joint null hypothesis. Alpha adjustment is inappropriate in the case of conjunction testing, in which all relevant results must be significant in order to reject the joint null hypothesis. Alpha adjustment is also inappropriate in the case of individual testing, in which each individual result must be significant in order to reject each associated individual null hypothesis. The conditions under which each of these three types of multiple testing is warranted are examined. It is concluded that researchers should not automatically (mindlessly) assume that alpha adjustment is necessary during multiple testing. Illustrations are provided in relation to joint studywise hypotheses and joint multiway ANOVAwise hypotheses.

60 citations
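
A rough numerical sketch of why the distinction in the abstract above matters. It assumes independent tests with every null hypothesis true, which is a simplification introduced here; the function names are illustrative.

```python
def disjunction_error_rate(alpha, k):
    """P(at least one of k independent true-null tests is significant)."""
    return 1 - (1 - alpha) ** k

def conjunction_error_rate(alpha, k):
    """P(all k independent true-null tests are significant)."""
    return alpha ** k

def bonferroni(alpha, k):
    """Per-test threshold that keeps the disjunction error rate at or below alpha."""
    return alpha / k

alpha, k = 0.05, 3
print(disjunction_error_rate(alpha, k))                  # inflated above alpha
print(disjunction_error_rate(bonferroni(alpha, k), k))   # back below alpha
print(conjunction_error_rate(alpha, k))                  # already far below alpha
```

Under these assumptions only the disjunction case exceeds the nominal level without adjustment; the conjunction case is already conservative.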


Journal Article
TL;DR: In this article, the authors discuss issues of structural and practical identifiability of partially observed differential equations which are often applied in systems biology and propose using the profile likelihood, which is a powerful approach to detect and resolve practical non-identifiability.
Abstract: We discuss issues of structural and practical identifiability of partially observed differential equations, which are often applied in systems biology. The development of mathematical methods to investigate structural non-identifiability has a long tradition. Computationally efficient methods to detect and cure it have been developed recently. Practical non-identifiability, on the other hand, has not been investigated at the same conceptually clear level. We argue that practical identifiability is more challenging than structural identifiability when it comes to modelling experimental data. We discuss that the classical approach based on the Fisher information matrix has severe shortcomings. As an alternative, we propose using the profile likelihood, which is a powerful approach to detect and resolve practical non-identifiability.

57 citations


Posted Content
TL;DR: LinDA fits linear regression models on the centered log-ratio transformed data and corrects the bias due to compositional effects, giving asymptotic FDR control for differential abundance analysis of microbiome data.
Abstract: Differential abundance analysis is at the core of statistical analysis of microbiome data. The compositional nature of microbiome sequencing data makes false positive control challenging. Here we show that the compositional effects can be addressed elegantly by a simple, yet highly flexible and scalable approach. The proposed method, LinDA, only requires fitting linear regression models on the centered log-ratio transformed data, and correcting the bias due to compositional effects. We show that LinDA enjoys asymptotic FDR control property and can be extended to mixed-effect models for correlated microbiome data. Using simulations and real examples, we demonstrate the effectiveness of LinDA.

36 citations
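
For readers unfamiliar with the centered log-ratio (CLR) transform mentioned in the abstract above, here is a minimal sketch of that single step. It is not the LinDA procedure: the regression and bias-correction stages are omitted, and the pseudocount used for zero counts is an assumption made here for illustration.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount (an illustrative choice) avoids log(0) for absent taxa.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

sample = [10, 0, 5, 85]  # hypothetical counts for four taxa
print(clr(sample))
```

Without a pseudocount the transform is unchanged when all counts are multiplied by a constant, which is why it is a natural scale for compositional data.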


Journal Article
TL;DR: This commentary emphasizes the contribution of the BCF model to the field of causal inference through discussions on two topics: 1) the difference between the PS in BCF and the Bayesian PS in a Bayesian updating approach, and 2) an alternative exposition of the role of the PS in outcome modeling based methods for the estimation of causal effects.
Abstract: Hahn et al. (2020) offer an extensive study to explicate and evaluate the performance of the BCF model in different settings and provide a detailed discussion about its utility in causal inference. It is a welcome addition to the causal machine learning literature. I will emphasize the contribution of the BCF model to the field of causal inference through discussions on two topics: 1) the difference between the PS in the BCF model and the Bayesian PS in a Bayesian updating approach, 2) an alternative exposition of the role of the PS in outcome modeling based methods for the estimation of causal effects. I will conclude with comments on avenues for future research involving BCF that will be important and much needed in the era of big data.

29 citations


Posted Content
TL;DR: This paper presented a framework for addressing external validity bias, including a synthesis of approaches for generalizability and transportability, the assumptions they require, as well as tests for the heterogeneity of treatment effects and differences between study and target populations.
Abstract: When assessing causal effects, determining the target population to which the results are intended to generalize is a critical decision. Randomized and observational studies each have strengths and limitations for estimating causal effects in a target population. Estimates from randomized data may have internal validity but are often not representative of the target population. Observational data may better reflect the target population, and hence be more likely to have external validity, but are subject to potential bias due to unmeasured confounding. While much of the causal inference literature has focused on addressing internal validity bias, both internal and external validity are necessary for unbiased estimates in a target population. This paper presents a framework for addressing external validity bias, including a synthesis of approaches for generalizability and transportability, the assumptions they require, as well as tests for the heterogeneity of treatment effects and differences between study and target populations.

22 citations


Posted Content
TL;DR: This work outlines a formal framework that covers most existing approaches for validating clustering results on validation data, and reviews classical validation techniques such as internal and external validation, stability analysis, and visual validation, and shows how they can be interpreted in terms of this framework.
Abstract: Cluster analysis refers to a wide range of data analytic techniques for class discovery and is popular in many application fields. To judge the quality of a clustering result, different cluster validation procedures have been proposed in the literature. While there is extensive work on classical validation techniques, such as internal and external validation, less attention has been given to validating and replicating a clustering result using a validation dataset. Such a dataset may be part of the original dataset, which is separated before analysis begins, or it could be an independently collected dataset. We present a systematic structured framework for validating clustering results on validation data that includes most existing validation approaches. In particular, we review classical validation techniques such as internal and external validation, stability analysis, hypothesis testing, and visual validation, and show how they can be interpreted in terms of our framework. We precisely define and formalise different types of validation of clustering results on a validation dataset and explain how each type can be implemented in practice. Furthermore, we give examples of how clustering studies from the applied literature that used a validation dataset can be classified into the framework.

21 citations


Posted Content
TL;DR: In this article, the authors provide a workflow to test the strengths and limitations of Bayes factors as a way to quantify evidence in support of scientific hypotheses, and illustrate this workflow using an example from the cognitive sciences.
Abstract: Inferences about hypotheses are ubiquitous in the cognitive sciences. Bayes factors provide one general way to compare different hypotheses by their compatibility with the observed data. Those quantifications can then also be used to choose between hypotheses. While Bayes factors provide an immediate approach to hypothesis testing, they are highly sensitive to details of the data/model assumptions. Moreover it's not clear how straightforwardly this approach can be implemented in practice, and in particular how sensitive it is to the details of the computational implementation. Here, we investigate these questions for Bayes factor analyses in the cognitive sciences. We explain the statistics underlying Bayes factors as a tool for Bayesian inferences and discuss that utility functions are needed for principled decisions on hypotheses. Next, we study how Bayes factors misbehave under different conditions. This includes a study of errors in the estimation of Bayes factors. Importantly, it is unknown whether Bayes factor estimates based on bridge sampling are unbiased for complex analyses. We are the first to use simulation-based calibration as a tool to test the accuracy of Bayes factor estimates. Moreover, we study how stable Bayes factors are against different MCMC draws. We moreover study how Bayes factors depend on variation in the data. We also look at variability of decisions based on Bayes factors and how to optimize decisions using a utility function. We outline a Bayes factor workflow that researchers can use to study whether Bayes factors are robust for their individual analysis, and we illustrate this workflow using an example from the cognitive sciences. We hope that this study will provide a workflow to test the strengths and limitations of Bayes factors as a way to quantify evidence in support of scientific hypotheses. Reproducible code is available from this https URL.

20 citations


Posted Content
TL;DR: The asymptotic analysis reveals the efficiency‐robustness trade‐off by comparing the properties of various estimators using data at different levels with and without covariate adjustment and highlights the critical role of covariates in improving estimation efficiency.
Abstract: Cluster-randomized experiments are widely used due to their logistical convenience and policy relevance. To analyze them properly, we must address the fact that the treatment is assigned at the cluster level instead of the individual level. Standard analytic strategies are regressions based on individual data, cluster averages, and cluster totals, which differ when the cluster sizes vary. These methods are often motivated by models with strong and unverifiable assumptions, and the choice among them can be subjective. Without any outcome modeling assumption, we evaluate these regression estimators and the associated robust standard errors from a design-based perspective where only the treatment assignment itself is random and controlled by the experimenter. We demonstrate that regression based on cluster averages targets a weighted average treatment effect, regression based on individual data is suboptimal in terms of efficiency, and regression based on cluster totals is consistent and more efficient with a large number of clusters. We highlight the critical role of covariates in improving estimation efficiency, and illustrate the efficiency gain via both simulation studies and data analysis. Moreover, we show that the robust standard errors are convenient approximations to the true asymptotic standard errors under the design-based perspective. Our theory holds even when the outcome models are misspecified, so it is model-assisted rather than model-based. We also extend the theory to a wider class of weighted average treatment effects.

19 citations


Posted Content
TL;DR: It is shown that analogous, albeit more complex, transformations exist in the more general linear factor model, providing a new means to identify the effect in that model, and is proved that the resulting average causal effect estimator is root-N consistent and asymptotically normal.
Abstract: We develop a new approach for identifying and estimating average causal effects in panel data under a linear factor model with unmeasured confounders. Compared to other methods tackling factor models such as synthetic controls and matrix completion, our method does not require the number of time periods to grow infinitely. Instead, we draw inspiration from the two-way fixed effect model as a special case of the linear factor model, where a simple difference-in-differences transformation identifies the effect. We show that analogous, albeit more complex, transformations exist in the more general linear factor model, providing a new means to identify the effect in that model. In fact many such transformations exist, called bridge functions, all identifying the same causal effect estimand. This poses a unique challenge for estimation and inference, which we solve by targeting the minimal bridge function using a regularized estimation approach. We prove that our resulting average causal effect estimator is root-N consistent and asymptotically normal, and we provide asymptotically valid confidence intervals. Finally, we provide extensions for the case of a linear factor model with time-varying unmeasured confounders.

16 citations
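
The special case mentioned in the abstract above, where a difference-in-differences transformation identifies the effect under a two-way fixed effects model, can be checked with a noiseless toy example. All numbers and names below are hypothetical, and the sketch does not cover the bridge-function approach for the general factor model.

```python
def difference_in_differences(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Change in the treated unit minus change in the control unit."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Hypothetical two-way fixed effects world: Y = unit effect + time effect + tau * D.
unit_effect = {"treated": 3.0, "control": 1.0}
time_effect = {"pre": 0.5, "post": 2.0}
tau = 1.5  # true treatment effect (assumed for the example)

def outcome(unit, period):
    exposed = unit == "treated" and period == "post"
    return unit_effect[unit] + time_effect[period] + (tau if exposed else 0.0)

estimate = difference_in_differences(
    outcome("treated", "pre"), outcome("treated", "post"),
    outcome("control", "pre"), outcome("control", "post"),
)
print(estimate)
```

The unit effects cancel within each unit's change and the time effects cancel across units, leaving only the treatment effect.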


Posted Content
TL;DR: A framework for Bayesian Likelihood-Free Inference based on Generalized Bayesian Inference using scoring rules (SRs) is proposed, and finite sample posterior consistency and outlier robustness of the resulting posterior are proved for the Kernel and Energy Scores.
Abstract: We propose a framework for Bayesian Likelihood-Free Inference (LFI) based on Generalized Bayesian Inference using scoring rules (SRs). SRs are used to evaluate probabilistic models given an observation; a proper SR is minimised in expectation when the model corresponds to the data generating process for the observations. Using a strictly proper SR, for which the above minimum is unique, ensures posterior consistency of our method. Further, we prove finite sample posterior consistency and outlier robustness of our posterior for the Kernel and Energy Scores. As the likelihood function is intractable for LFI, we employ consistent estimators of SRs using model simulations in a pseudo-marginal MCMC; we show the target of such chain converges to the exact SR posterior by increasing the number of simulations. Furthermore, we note popular LFI techniques such as Bayesian Synthetic Likelihood (BSL) can be seen as special cases of our framework using only proper (but not strictly so) SR. We empirically validate our consistency and outlier robustness results and show how related approaches do not enjoy these properties. Practically, we use the Energy and Kernel Scores, but our general framework sets the stage for extensions with other scoring rules.
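
As a rough illustration of estimating a scoring rule from model simulations, the sketch below computes a simulation-based Energy Score for a scalar observation. It assumes the commonly used form E|X - y| - 0.5 E|X - X'| with exponent 1; the scaling and exponent used in the paper may differ, and the function name is illustrative.

```python
def energy_score(simulations, observation):
    """Unbiased simulation-based estimate of the Energy Score (scalar case).

    Lower values indicate simulations that are closer to the observation
    relative to their own spread.
    """
    m = len(simulations)
    if m < 2:
        raise ValueError("need at least two simulations")
    # Average distance between simulations and the observation.
    fit = sum(abs(x - observation) for x in simulations) / m
    # Average distance between distinct pairs of simulations.
    spread = sum(
        abs(xi - xj)
        for i, xi in enumerate(simulations)
        for j, xj in enumerate(simulations)
        if i != j
    ) / (m * (m - 1))
    return fit - 0.5 * spread

print(energy_score([0.0, 2.0], 1.0))  # observation inside the simulated range
print(energy_score([0.0, 2.0], 5.0))  # observation far from the simulations
```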

Posted Content
TL;DR: In this paper, a unified framework of counterfactual estimation for time-series cross-sectional data is introduced, which estimates the average treatment effect on the treated by directly imputing treated counterfactuals.
Abstract: This paper introduces a unified framework of counterfactual estimation for time-series cross-sectional data, which estimates the average treatment effect on the treated by directly imputing treated counterfactuals. Examples include the fixed effects counterfactual estimator, interactive fixed effects counterfactual estimator, and matrix completion estimator. These estimators provide more reliable causal estimates than conventional two-way fixed effects models when treatment effects are heterogeneous or unobserved time-varying confounders exist. Under this framework, we propose a new dynamic treatment effects plot, as well as several diagnostic tests, to help researchers gauge the validity of the identifying assumptions. We illustrate these methods with two political economy examples and develop an open-source package, fect, in both R and Stata to facilitate implementation.

Book Chapter
TL;DR: This introductory review discusses the origins, conventions, implementation and result interpretation of three uncertainty and sensitivity analysis methods suitable for agent-based models, namely Consistency Analysis, Robustness Analysis and Latin Hypercube Analysis.
Abstract: Multiscale, agent-based mathematical models of biological systems are often associated with model uncertainty and sensitivity to parameter perturbations. Here, three uncertainty and sensitivity analysis methods that are suitable for agent-based models are discussed, namely Consistency Analysis, Robustness Analysis and Latin Hypercube Analysis. This introductory review discusses the origins, conventions, implementation and result interpretation of these methods. Information on how to implement the discussed methods in MATLAB is included.
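
The sampling step behind Latin Hypercube Analysis can be sketched briefly. The chapter itself covers MATLAB; the Python below is a stand-in written for this listing, and the function name, parameter bounds and seed are all illustrative assumptions.

```python
import random

def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points so each parameter's range is split into
    n_samples equal strata, with exactly one point per stratum."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of this parameter.
        points = [
            low + (high - low) * (i + rng.random()) / n_samples
            for i in range(n_samples)
        ]
        # Shuffle so strata are paired at random across parameters.
        rng.shuffle(points)
        columns.append(points)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

# Two hypothetical model parameters with different ranges.
samples = latin_hypercube(5, [(0.0, 1.0), (10.0, 20.0)], seed=1)
for point in samples:
    print(point)
```

Each sampled parameter set would then be passed to the agent-based model, and the outputs compared against the inputs to assess sensitivity.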

Posted Content
TL;DR: In this paper, the authors propose a conformal inference framework for nonparametric outlier detection, which yields p-values that are marginally valid but mutually dependent for different test points.
Abstract: This paper studies the construction of p-values for nonparametric outlier detection, taking a multiple-testing perspective. The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers. We propose a solution based on conformal inference, a broadly applicable framework which yields p-values that are marginally valid but mutually dependent for different test points. We prove these p-values are positively dependent and enable exact false discovery rate control, although in a relatively weak marginal sense. We then introduce a new method to compute p-values that are both valid conditionally on the training data and independent of each other for different test points; this paves the way to stronger type-I error guarantees. Our results depart from classical conformal inference as we leverage concentration inequalities rather than combinatorial arguments to establish our finite-sample guarantees. Furthermore, our techniques also yield a uniform confidence bound for the false positive rate of any outlier detection algorithm, as a function of the threshold applied to its raw statistics. Finally, the relevance of our results is demonstrated by numerical experiments on real and simulated data.
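
A minimal sketch of the marginal conformal p-value discussed in the abstract above, followed by a Benjamini-Hochberg step. It assumes that larger scores mean "more outlying" and that calibration scores come from held-out inlier data; the conditionally valid p-values introduced in the paper are not reproduced here.

```python
def conformal_p_value(calibration_scores, test_score):
    """Marginal conformal p-value: rank of the test score among inlier scores."""
    n = len(calibration_scores)
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (1 + at_least) / (n + 1)

def benjamini_hochberg(p_values, q=0.1):
    """Indices of hypotheses rejected by the BH procedure at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            k = rank  # largest rank passing its threshold
    return sorted(order[:k])

calibration = [1.0, 2.0, 3.0, 4.0]           # hypothetical inlier scores
print(conformal_p_value(calibration, 5.0))   # more extreme than all inliers
print(conformal_p_value(calibration, 0.0))   # less extreme than all inliers
print(benjamini_hochberg([0.01, 0.5, 0.02, 0.9], q=0.1))
```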

Journal Article
TL;DR: In this paper, the authors lay out the causal structure of individualized treatment effect in terms of potential outcomes and describe the required assumptions that underlie a causal interpretation of its prediction.
Abstract: Randomized trials typically estimate average relative treatment effects, but decisions on the benefit of a treatment are possibly better informed by more individualized predictions of the absolute treatment effect. In case of a binary outcome, these predictions of absolute individualized treatment effect require knowledge of the individual's risk without treatment and incorporation of a possibly differential treatment effect (i.e. varying with patient characteristics). In this paper we lay out the causal structure of individualized treatment effect in terms of potential outcomes and describe the required assumptions that underlie a causal interpretation of its prediction. Subsequently, we describe regression models and model estimation techniques that can be used to move from average to more individualized treatment effect predictions. We focus mainly on logistic regression-based methods that are both well-known and naturally provide the required probabilistic estimates. We incorporate key components from both causal inference and prediction research to arrive at individualized treatment effect predictions. While the separate components are well known, their successful amalgamation is very much an ongoing field of research. We cut the problem down to its essentials in the setting of a randomized trial, discuss the importance of a clear definition of the estimand of interest, provide insight into the required assumptions, and give guidance with respect to modeling and estimation options. Simulated data illustrates the potential of different modeling options across scenarios that vary both average treatment effect and treatment effect heterogeneity. Two applied examples illustrate individualized treatment effect prediction in randomized trial data.
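
To make the idea of an absolute individualized treatment effect concrete, the sketch below takes a logistic model with a treatment-by-covariate interaction and reports the difference in predicted risk with and without treatment. The coefficients are invented for illustration and the single covariate is an assumption; a causal reading still depends on the assumptions the paper lays out.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def predicted_risk(x, treated, coef):
    """Predicted outcome risk from a logistic model with interaction."""
    linear = (
        coef["intercept"]
        + coef["x"] * x
        + treated * (coef["treatment"] + coef["interaction"] * x)
    )
    return expit(linear)

def individualized_effect(x, coef):
    """Absolute risk difference: risk if treated minus risk if untreated."""
    return predicted_risk(x, 1, coef) - predicted_risk(x, 0, coef)

# Hypothetical fitted coefficients (not from the paper).
coef = {"intercept": -1.0, "x": 0.8, "treatment": -0.5, "interaction": -0.3}
for x in (0.0, 1.0, 2.0):
    print(x, individualized_effect(x, coef))
```

Even a constant effect on the log-odds scale produces absolute effects that vary with baseline risk, which is why the untreated risk has to be modelled.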

Posted Content
TL;DR: This paper shows that the standard confidence intervals for prediction error derived from cross-validation may have coverage far below the desired level: because each data point is used for both training and testing, the measured accuracies for each fold are correlated, and so the usual estimate of variance is too small.
Abstract: Cross-validation is a widely-used technique to estimate prediction error, but its behavior is complex and not fully understood. Ideally, one would like to think that cross-validation estimates the prediction error for the model at hand, fit to the training data. We prove that this is not the case for the linear model fit by ordinary least squares; rather it estimates the average prediction error of models fit on other unseen training sets drawn from the same population. We further show that this phenomenon occurs for most popular estimates of prediction error, including data splitting, bootstrapping, and Mallows' Cp. Next, the standard confidence intervals for prediction error derived from cross-validation may have coverage far below the desired level. Because each data point is used for both training and testing, there are correlations among the measured accuracies for each fold, and so the usual estimate of variance is too small. We introduce a nested cross-validation scheme to estimate this variance more accurately, and show empirically that this modification leads to intervals with approximately correct coverage in many examples where traditional cross-validation intervals fail. Lastly, our analysis also shows that when producing confidence intervals for prediction accuracy with simple data splitting, one should not re-fit the model on the combined data, since this invalidates the confidence intervals.

Posted Content
TL;DR: In this paper, a randomized greedy search algorithm is proposed to find a point estimate for a random partition based on a loss function and posterior Monte Carlo samples, which is embarrassingly parallel.
Abstract: We propose a randomized greedy search algorithm to find a point estimate for a random partition based on a loss function and posterior Monte Carlo samples. Given the large size and awkward discrete nature of the search space, the minimization of the posterior expected loss is challenging. Our approach is a stochastic search based on a series of greedy optimizations performed in a random order and is embarrassingly parallel. We consider several loss functions, including Binder loss and variation of information. We note that criticisms of Binder loss are the result of using equal penalties of misclassification and we show an efficient means to compute Binder loss with potentially unequal penalties. Furthermore, we extend the original variation of information to allow for unequal penalties and show no increased computational costs. We provide a reference implementation of our algorithm. Using a variety of examples, we show that our method produces clustering estimates that better minimize the expected loss and are obtained faster than existing methods.
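
The Binder loss with potentially unequal penalties, as mentioned in the abstract above, can be written down directly from its pairwise definition. This is a naive quadratic-time sketch for illustration, not the efficient computation or the search algorithm from the paper; the penalty names `a` and `b` are labels chosen here.

```python
from itertools import combinations

def binder_loss(estimate, truth, a=1.0, b=1.0):
    """Pairwise Binder loss between two partitions given as label lists.

    a: penalty for separating a pair that belongs together in `truth`.
    b: penalty for joining a pair that is apart in `truth`.
    """
    loss = 0.0
    for i, j in combinations(range(len(estimate)), 2):
        same_est = estimate[i] == estimate[j]
        same_true = truth[i] == truth[j]
        if same_true and not same_est:
            loss += a
        elif same_est and not same_true:
            loss += b
    return loss

def expected_binder_loss(estimate, samples, a=1.0, b=1.0):
    """Monte Carlo posterior expected loss over sampled partitions."""
    return sum(binder_loss(estimate, s, a, b) for s in samples) / len(samples)

posterior_samples = [[0, 0, 1], [0, 1, 1], [0, 0, 1]]  # hypothetical draws
print(expected_binder_loss([0, 0, 1], posterior_samples))
```

A point estimate would be the partition minimizing this expected loss; the search over partitions is the hard part that the paper addresses.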

Posted Content
TL;DR: In this paper, the authors compare single and multiple changepoint techniques for time series data and introduce a new distance metric specifically designed to compare two multiple-changepoint segmentation methods.
Abstract: This paper describes and compares several prominent single and multiple changepoint techniques for time series data. Due to their importance in inferential matters, changepoint research on correlated data has accelerated recently. Unfortunately, small perturbations in model assumptions can drastically alter changepoint conclusions; for example, heavy positive correlation in a time series can be misattributed to a mean shift should correlation be ignored. This paper considers both single and multiple changepoint techniques. The paper begins by examining cumulative sum (CUSUM) and likelihood ratio tests and their variants for the single changepoint problem; here, various statistics, boundary cropping scenarios, and scaling methods (e.g., scaling to an extreme value or Brownian Bridge limit) are compared. A recently developed test based on summing squared CUSUM statistics over all times is shown to have realistic Type I errors and superior detection power. The paper then turns to the multiple changepoint setting. Here, penalized likelihoods drive the discourse, with AIC, BIC, mBIC, and MDL penalties being considered. Binary and wild binary segmentation techniques are also compared. We introduce a new distance metric specifically designed to compare two multiple changepoint segmentations. Algorithmic and computational concerns are discussed and simulations are provided to support all conclusions. In the end, the multiple changepoint setting admits no clear methodological winner, performance depending on the particular scenario. Nonetheless, some practical guidance will emerge.
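
A basic version of the CUSUM statistic for a single mean shift is sketched below. It assumes independent errors and scales by the overall sample standard deviation, both simplifications made here; as the abstract warns, ignoring correlation can badly distort conclusions, and the paper's variants (cropping, alternative scalings, summed squared CUSUM) are not shown.

```python
import math

def cusum_changepoint(x):
    """Return (k, statistic) maximizing the scaled CUSUM |S_k - (k/n) S_n|.

    k is the number of observations before the estimated changepoint.
    Assumes the series is not constant, so the standard deviation is nonzero.
    """
    n = len(x)
    total = sum(x)
    mean = total / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    best_k, best_stat, partial = None, 0.0, 0.0
    for k in range(1, n):
        partial += x[k - 1]
        stat = abs(partial - k * total / n) / (sd * math.sqrt(n))
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat

series = [0.0] * 5 + [10.0] * 5  # hypothetical series with one mean shift
print(cusum_changepoint(series))
```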

Posted Content
TL;DR: In this paper, the authors investigate the joint estimation of extreme marginal expectiles of a random vector with heavy-tailed marginal distributions, in a general extremal dependence model, and use these results to derive corrected confidence regions for extreme expectiles, as well as a test for the equality of tail expectiles.
Abstract: Expectiles induce a law-invariant, coherent and elicitable risk measure that has received substantial attention in actuarial and financial risk management contexts. A number of recent papers have focused on the behaviour and estimation of extreme expectile-based risk measures and their potential for risk assessment was highlighted in financial and actuarial real data applications. Joint inference of several extreme expectiles has however been left untouched; in fact, even the inference about a marginal extreme expectile turns out to be a difficult problem in finite samples, even though an accurate idea of estimation uncertainty is crucial for the construction of confidence intervals in applications to risk management. We investigate the joint estimation of extreme marginal expectiles of a random vector with heavy-tailed marginal distributions, in a general extremal dependence model. We use these results to derive corrected confidence regions for extreme expectiles, as well as a test for the equality of tail expectiles. The methods are showcased in a finite-sample simulation study and on real financial data.

Posted Content
TL;DR: This work proposes contrastive latent variable models designed for count data to create a richer portrait of differential expression in sequencing data and develops a model-based hypothesis testing framework that can test for global and gene subset-specific changes in expression.
Abstract: High-throughput RNA-sequencing (RNA-seq) technologies are powerful tools for understanding cellular state. Often it is of interest to quantify and summarize changes in cell state that occur between experimental or biological conditions. Differential expression is typically assessed using univariate tests to measure gene-wise shifts in expression. However, these methods largely ignore changes in transcriptional correlation. Furthermore, there is a need to identify the low-dimensional structure of the gene expression shift to identify collections of genes that change between conditions. Here, we propose contrastive latent variable models designed for count data to create a richer portrait of differential expression in sequencing data. These models disentangle the sources of transcriptional variation in different conditions, in the context of an explicit model of variation at baseline. Moreover, we develop a model-based hypothesis testing framework that can test for global and gene subset-specific changes in expression. We test our model through extensive simulations and analyses with count-based gene expression data from perturbation and observational sequencing experiments. We find that our methods can effectively summarize and quantify complex transcriptional changes in case-control experimental sequencing data.

Journal Article
TL;DR: In this article, a Bayesian model-averaged meta-analysis for standardized mean differences is outlined to quantify evidence for both treatment effectiveness and across-study heterogeneity; four competing models are constructed by orthogonally combining two present-absent assumptions, one for the treatment effect and one for across-study heterogeneity.
Abstract: We outline a Bayesian model-averaged meta-analysis for standardized mean differences in order to quantify evidence for both treatment effectiveness $\delta$ and across-study heterogeneity $\tau$. We construct four competing models by orthogonally combining two present-absent assumptions, one for the treatment effect and one for across-study heterogeneity. To inform the choice of prior distributions for the model parameters, we used 50% of the Cochrane Database of Systematic Reviews to specify rival prior distributions for $\delta$ and $\tau$. The relative predictive performance of the competing models and rival prior distributions was assessed using the remaining 50% of the Cochrane Database. On average, $\mathcal{H}_1^r$ -- the model that assumes the presence of a treatment effect as well as across-study heterogeneity -- outpredicted the other models, but not by a large margin. Within $\mathcal{H}_1^r$, predictive adequacy was relatively constant across the rival prior distributions. We propose specific empirical prior distributions, both for the field in general and for each of 46 specific medical subdisciplines. An example from oral health demonstrates how the proposed prior distributions can be used to conduct a Bayesian model-averaged meta-analysis in the open-source software R and JASP. The preregistered analysis plan is available at https://osf.io/zs3df/.
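
The model-averaging arithmetic over four models can be sketched as follows. The marginal likelihood values are invented, and the labels other than the effect-plus-heterogeneity model are naming assumptions made here; this is generic Bayesian model averaging, not the paper's software or priors.

```python
def posterior_model_probs(marginal_likelihoods, prior_probs):
    """Posterior model probabilities from marginal likelihoods and priors."""
    weighted = {m: marginal_likelihoods[m] * prior_probs[m]
                for m in marginal_likelihoods}
    total = sum(weighted.values())
    return {m: w / total for m, w in weighted.items()}

def inclusion_bayes_factor(posterior, prior, models_with_feature):
    """Posterior inclusion odds divided by prior inclusion odds."""
    post_in = sum(posterior[m] for m in models_with_feature)
    prior_in = sum(prior[m] for m in models_with_feature)
    return (post_in / (1 - post_in)) / (prior_in / (1 - prior_in))

# Hypothetical labels: effect absent/present (H0/H1), fixed/random (f/r).
prior = {"H0f": 0.25, "H1f": 0.25, "H0r": 0.25, "H1r": 0.25}
marginal = {"H0f": 1.0, "H1f": 3.0, "H0r": 2.0, "H1r": 6.0}  # made-up values

posterior = posterior_model_probs(marginal, prior)
bf_effect = inclusion_bayes_factor(posterior, prior, ["H1f", "H1r"])
bf_heterogeneity = inclusion_bayes_factor(posterior, prior, ["H0r", "H1r"])
print(posterior, bf_effect, bf_heterogeneity)
```

The two inclusion Bayes factors answer separate questions, one about the treatment effect and one about heterogeneity, while averaging over uncertainty in the other.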

Posted Content
TL;DR: It is shown that SVM is a continuous relaxation of the quadratic integer program for computing the largest balanced subset, which establishes its direct relation to the cardinality matching method, and the bias of causal effect estimation arising from the trade-off between covariate balance and effective sample size is characterized.
Abstract: Support vector machine (SVM) is one of the most popular classification algorithms in the machine learning literature. We demonstrate that SVM can be used to balance covariates and estimate average causal effects under the unconfoundedness assumption. Specifically, we adapt the SVM classifier as a kernel-based weighting procedure that minimizes the maximum mean discrepancy between the treatment and control groups while simultaneously maximizing effective sample size. We also show that SVM is a continuous relaxation of the quadratic integer program for computing the largest balanced subset, establishing its direct relation to the cardinality matching method. Another important feature of SVM is that the regularization parameter controls the trade-off between covariate balance and effective sample size. As a result, the existing SVM path algorithm can be used to compute the balance-sample size frontier. We characterize the bias of causal effect estimation arising from this trade-off, connecting the proposed SVM procedure to the existing kernel balancing methods. Finally, we conduct simulation and empirical studies to evaluate the performance of the proposed methodology and find that SVM is competitive with the state-of-the-art covariate balancing methods.

Posted Content
TL;DR: In this paper, a novel procedure to perform fuzzy clustering of multivariate time series generated from different dependence models is proposed, where each series is associated to all the clusters with specific membership levels.
Abstract: A novel procedure to perform fuzzy clustering of multivariate time series generated from different dependence models is proposed. Different amounts of dissimilarity between the generating models or changes on the dynamic behaviours over time are some arguments justifying a fuzzy approach, where each series is associated to all the clusters with specific membership levels. Our procedure considers quantile-based cross-spectral features and consists of three stages: (i) each element is characterized by a vector of proper estimates of the quantile cross-spectral densities, (ii) principal component analysis is carried out to capture the main differences reducing the effects of the noise, and (iii) the squared Euclidean distance between the first retained principal components is used to perform clustering through the standard fuzzy C-means and fuzzy C-medoids algorithms. The performance of the proposed approach is evaluated in a broad simulation study where several types of generating processes are considered, including linear, nonlinear and dynamic conditional correlation models. Assessment is done in two different ways: by directly measuring the quality of the resulting fuzzy partition and by taking into account the ability of the technique to determine the overlapping nature of series located equidistant from well-defined clusters. The procedure is compared with the few alternatives suggested in the literature, substantially outperforming all of them whatever the underlying process and the evaluation scheme. Two specific applications involving air quality and financial databases illustrate the usefulness of our approach.
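
The membership levels used in stage (iii) can be sketched with the standard fuzzy C-means update, which converts squared distances to cluster centroids into degrees of membership. This is a minimal sketch, assuming the squared Euclidean distances between principal component scores and centroids have already been computed; the function name, the fuzziness parameter `m = 2`, and the toy distances are assumptions made for this example.

```python
def fuzzy_memberships(squared_distances, m=2.0):
    """Standard fuzzy C-means membership update.

    squared_distances[i][c] is the squared Euclidean distance between
    series i and the centroid of cluster c; m > 1 is the fuzziness parameter.
    """
    exponent = 1.0 / (m - 1.0)
    memberships = []
    for row in squared_distances:
        zero_clusters = [c for c, d in enumerate(row) if d == 0.0]
        if zero_clusters:
            # A series sitting exactly on one or more centroids belongs to them fully.
            share = 1.0 / len(zero_clusters)
            row_u = [share if c in zero_clusters else 0.0 for c in range(len(row))]
        else:
            row_u = [1.0 / sum((d_c / d_k) ** exponent for d_k in row) for d_c in row]
        memberships.append(row_u)
    return memberships


# Toy squared distances: the first series is close to cluster 0,
# the second is equidistant from both clusters.
toy_distances = [[1.0, 9.0], [4.0, 4.0]]
toy_memberships = fuzzy_memberships(toy_distances)
```

In this toy case the first series receives memberships of 0.9 and 0.1, while the equidistant series receives 0.5 in each cluster, which is the overlapping behaviour a fuzzy partition is meant to expose.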

Posted Content
TL;DR: In this paper, the Stein discrepancy is used as a loss function for generalised Bayesian inference, which can be used to improve robustness against possible misspecification of the likelihood.
Abstract: Generalised Bayesian inference updates prior beliefs using a loss function, rather than a likelihood, and can therefore be used to confer robustness against possible misspecification of the likelihood. Here we consider generalised Bayesian inference with a Stein discrepancy as a loss function, motivated by applications in which the likelihood contains an intractable normalisation constant. In this context, the Stein discrepancy circumvents evaluation of the normalisation constant and produces generalised posteriors that are either closed form or accessible using standard Markov chain Monte Carlo. On a theoretical level, we show consistency, asymptotic normality, and bias-robustness of the generalised posterior, highlighting how these properties are impacted by the choice of Stein discrepancy. Then, we provide numerical experiments on a range of intractable distributions, including applications to kernel-based exponential family models and non-Gaussian graphical models.

Posted Content
TL;DR: In this article, the authors consider how, at the point of writing a statistical analysis plan, to choose between three broad approaches: direct adjustment, standardisation and inverse probability of treatment weighting (IPTW), which are in their view the most promising methods.
Abstract: Background: It has long been advised to account for baseline covariates in the analysis of confirmatory randomised trials, with the main statistical justifications being that this increases power and, when a randomisation scheme balanced covariates, permits a valid estimate of experimental error. There are various methods available to account for covariates. Methods: We consider how, at the point of writing a statistical analysis plan, to choose between three broad approaches: direct adjustment, standardisation and inverse-probability-of-treatment weighting (IPTW), which are in our view the most promising methods. Using the GetTested trial, a randomised trial designed to assess the effectiveness of an electronic STI (sexually transmitted infection) testing and results service, we illustrate how a method might be chosen in advance and show some of the anticipated issues in action. Results: The choice of approach is not straightforward, particularly with models for binary outcome measures, where we focus most of our attention. We compare the properties of the three broad approaches in terms of the quantity they target (estimand), how a method performs under model misspecification, convergence issues, handling designed balance, precision of estimators, estimation of standard errors, and finally clarify some issues around handling of missing data. Conclusions: We conclude that no single approach is always best and explain why the choice will depend on the trial context but encourage trialists to consider the three methods more routinely.
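
Two of the approaches named above, standardisation and IPTW, can be compared on a toy example. The following is a minimal sketch, assuming a single binary baseline covariate, saturated (fully stratified) outcome and treatment models, and a risk-difference estimand; the counts are hypothetical and are not data from the GetTested trial.

```python
# Hypothetical counts: (covariate stratum x, arm t) -> (participants, events).
counts = {
    (0, 1): (40, 10),
    (0, 0): (60, 12),
    (1, 1): (60, 30),
    (1, 0): (40, 16),
}
STRATA = (0, 1)
ARMS = (0, 1)


def unadjusted_rd(counts):
    """Difference in crude event proportions between arms."""
    risk = {}
    for t in ARMS:
        n = sum(counts[(x, t)][0] for x in STRATA)
        events = sum(counts[(x, t)][1] for x in STRATA)
        risk[t] = events / n
    return risk[1] - risk[0]


def standardised_rd(counts):
    """Average stratum-specific risks over the whole-trial covariate distribution."""
    total = sum(n for n, _ in counts.values())
    risk = {0: 0.0, 1: 0.0}
    for x in STRATA:
        stratum_share = (counts[(x, 0)][0] + counts[(x, 1)][0]) / total
        for t in ARMS:
            n, events = counts[(x, t)]
            risk[t] += stratum_share * events / n
    return risk[1] - risk[0]


def iptw_rd(counts):
    """Weight each participant by 1 / estimated P(assigned arm | x), then compare."""
    risk = {}
    for t in ARMS:
        weighted_events = 0.0
        weighted_n = 0.0
        for x in STRATA:
            stratum_n = counts[(x, 0)][0] + counts[(x, 1)][0]
            n, events = counts[(x, t)]
            prob_arm = n / stratum_n  # estimated probability of arm t in stratum x
            weighted_events += events / prob_arm
            weighted_n += n / prob_arm
        risk[t] = weighted_events / weighted_n
    return risk[1] - risk[0]


rd_crude = unadjusted_rd(counts)
rd_std = standardised_rd(counts)
rd_iptw = iptw_rd(counts)
```

Under these assumptions the two adjusted estimators coincide (a risk difference of 0.075 against a crude value of 0.12), since both models are saturated; with non-saturated models or continuous covariates they would generally differ.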

Posted Content
TL;DR: This article proposed a marginalization method based on parametric G-computation that can be easily applied where the outcome regression is a generalized linear model or a Cox model, and showed that the marginalized covariate-adjusted estimates provide greater precision and accuracy than the conditional estimates produced by the conventional outcome regression, which are systematically biased because the measure of effect is noncollapsible.
Abstract: Population adjustment methods such as matching-adjusted indirect comparison (MAIC) are increasingly used to compare marginal treatment effects when there are cross-trial differences in effect modifiers and limited patient-level data. MAIC is based on propensity score weighting, which is sensitive to poor covariate overlap and cannot extrapolate beyond the observed covariate space. Current outcome regression-based alternatives can extrapolate but target a conditional treatment effect that is incompatible in the indirect comparison. When adjusting for covariates, one must integrate or average the conditional estimate over the relevant population to recover a compatible marginal treatment effect. We propose a marginalization method based on parametric G-computation that can be easily applied where the outcome regression is a generalized linear model or a Cox model. The approach views the covariate adjustment regression as a nuisance model and separates its estimation from the evaluation of the marginal treatment effect of interest. The method can accommodate a Bayesian statistical framework, which naturally integrates the analysis into a probabilistic framework. A simulation study provides proof-of-principle and benchmarks the method's performance against MAIC and the conventional outcome regression. Parametric G-computation achieves more precise and more accurate estimates than MAIC, particularly when covariate overlap is poor, and yields unbiased marginal treatment effect estimates under no failures of assumptions. Furthermore, the marginalized covariate-adjusted estimates provide greater precision and accuracy than the conditional estimates produced by the conventional outcome regression, which are systematically biased because the measure of effect is non-collapsible.
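
The averaging step described above, and the non-collapsibility it addresses, can be sketched for a logistic outcome model. This is a minimal sketch, assuming the regression coefficients have already been estimated; the coefficient values, the covariate sample standing in for the target population, and the function names are hypothetical and not taken from the paper.

```python
import math


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def marginal_log_odds_ratio(intercept, beta_treatment, beta_covariate, covariates):
    """Predict outcomes for everyone under each treatment, average, then contrast."""
    n = len(covariates)
    mean_treated = sum(
        expit(intercept + beta_treatment + beta_covariate * x) for x in covariates
    ) / n
    mean_control = sum(expit(intercept + beta_covariate * x) for x in covariates) / n
    return logit(mean_treated) - logit(mean_control)


# Hypothetical covariate values representing the target population.
target_covariates = [-2.0, -1.0, 0.0, 1.0, 2.0]

conditional_log_or = 1.0  # assumed fitted treatment coefficient
marginal_log_or = marginal_log_odds_ratio(
    intercept=-0.5,
    beta_treatment=conditional_log_or,
    beta_covariate=1.5,
    covariates=target_covariates,
)
```

With a prognostic covariate in the model, the marginal log odds ratio obtained here is smaller in magnitude than the conditional coefficient even though there is no confounding, which is why the two quantities should not be mixed in an indirect comparison.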

Posted Content
TL;DR: This article provides a tight upper bound for the probability of LASSO sign recovery and presents examples of irrepresentability and identifiability curves for some selected design matrices X, showing that "irrepresentability" is a much stronger condition than "identifiability", especially when the entries in each row of X are strongly correlated.
Abstract: Basis Pursuit (BP), Basis Pursuit DeNoising (BPDN), and LASSO are popular methods for identifying important predictors in the high-dimensional linear regression model Y = Xβ + e. By definition, when e = 0, BP uniquely recovers β when Xβ = Xb and β ≠ b together imply that the L1 norm of β is smaller than the L1 norm of b (identifiability condition). Furthermore, LASSO can recover the sign of β only under a much stronger irrepresentability condition. Meanwhile, it is known that the model selection properties of LASSO can be improved by hard-thresholding its estimates. This article supports these findings by proving that thresholded LASSO, thresholded BPDN and thresholded BP recover the sign of β in both the noisy and noiseless cases if and only if β is identifiable and large enough. In particular, if X has iid Gaussian entries and the number of predictors grows linearly with the sample size, then these thresholded estimators can recover the sign of β when the signal sparsity is asymptotically below the Donoho-Tanner transition curve. This is in contrast to the regular LASSO, which asymptotically recovers the sign of β only when the signal sparsity tends to 0. Numerical experiments show that the identifiability condition, unlike the irrepresentability condition, does not seem to be affected by the structure of the correlations in the X matrix.
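
The effect of hard-thresholding LASSO estimates can be sketched on a toy problem. This is a minimal sketch, assuming a small design with orthogonal columns, a plain coordinate-descent solver for the objective 0.5·||y − Xb||² + λ·||b||₁, and hand-picked values of the penalty `lam` and threshold `tau`; the data and all names are hypothetical, and the example does not reproduce the paper's experiments.

```python
def soft_threshold(z, t):
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes column j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            )
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


def thresholded_signs(beta, tau):
    """Sign pattern after setting coefficients with magnitude <= tau to zero."""
    return [(1 if b > 0 else -1) if abs(b) > tau else 0 for b in beta]


# Toy design with orthogonal columns; the true coefficients are [2, 0, -3].
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
noise = [0.3, -0.1, 0.2, 0.0]
true_beta = [2.0, 0.0, -3.0]
y = [sum(x_ij * b for x_ij, b in zip(row, true_beta)) + e for row, e in zip(X, noise)]

beta_hat = lasso_coordinate_descent(X, y, lam=0.2)
lasso_signs = thresholded_signs(beta_hat, tau=0.0)
hard_thresholded_signs = thresholded_signs(beta_hat, tau=0.5)
```

In this toy case the plain LASSO estimate keeps a small spurious coefficient on the second predictor, so its sign pattern is wrong, while thresholding at `tau = 0.5` recovers the true pattern [1, 0, -1].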

Journal ArticleDOI
TL;DR: In this article, the authors evaluate the performance of ten prominent bivariate causality indices for time series data, across four simulated model systems that have different coupling schemes and characteristics, and they recommend transfer entropy and nonlinear Granger causality as likely to be particularly robust indices for estimating bivariate causal relationships in real-world applications.
Abstract: Inferring nonlinear and asymmetric causal relationships between multivariate longitudinal data is a challenging task with wide-ranging application areas including clinical medicine, mathematical biology, economics and environmental research. A number of methods for inferring causal relationships within complex dynamic and stochastic systems have been proposed but there is not a unified consistent definition of causality in this context. We evaluate the performance of ten prominent bivariate causality indices for time series data, across four simulated model systems that have different coupling schemes and characteristics. In further experiments, we show that these methods may not always be invariant to real-world relevant transformations (data availability, standardisation and scaling, rounding error, missing data and noisy data). We recommend transfer entropy and nonlinear Granger causality as likely to be particularly robust indices for estimating bivariate causal relationships in real-world applications. Finally, we provide flexible open-access Python code for computation of these methods and for the model simulations.

Posted Content
TL;DR: This work applies methods to estimate sensitivity of the expected number of distinct clusters present in the Iris dataset to the BNP prior specification and derives local sensitivity measures for a truncated variational Bayes (VB) approximation and approximate nonlinear dependence of a VB optimum on prior parameters using a local Taylor series approximation.
Abstract: Bayesian models based on the Dirichlet process and other stick-breaking priors have been proposed as core ingredients for clustering, topic modeling, and other unsupervised learning tasks. Prior specification is, however, relatively difficult for such models, given that their flexibility implies that the consequences of prior choices are often relatively opaque. Moreover, these choices can have a substantial effect on posterior inferences. Thus, considerations of robustness need to go hand in hand with nonparametric modeling. In the current paper, we tackle this challenge by exploiting the fact that variational Bayesian methods, in addition to having computational advantages in fitting complex nonparametric models, also yield sensitivities with respect to parametric and nonparametric aspects of Bayesian models. In particular, we demonstrate how to assess the sensitivity of conclusions to the choice of concentration parameter and stick-breaking distribution for inferences under Dirichlet process mixtures and related mixture models. We provide both theoretical and empirical support for our variational approach to Bayesian sensitivity analysis.

Posted Content
TL;DR: A Bayesian change point detection method is proposed, which is one of the fastest Bayesian methodologies, and it is more robust to misspecification of the error terms than the competing methods.
Abstract: We study the use of spike and slab priors for consistent estimation of the number of change points and their locations. Leveraging recent results in the variable selection literature, we show that an estimator based on spike and slab priors achieves optimal localization rate in the multiple offline change point detection problem. Based on this estimator, we propose a Bayesian change point detection method, which is one of the fastest Bayesian methodologies, and it is more robust to misspecification of the error terms than the competing methods. We demonstrate through empirical work the good performance of our approach vis-a-vis some state-of-the-art benchmarks.