Showing papers on "Likelihood principle published in 2020"


Journal ArticleDOI
TL;DR: In this paper, a new reading of Robbins' confidence sequences is proposed; these sequences can be justified under various views of inference: they are likelihood-based, can incorporate prior information, and obey the strong likelihood principle.
Abstract: The widely claimed replicability crisis in science may lead to revised standards of significance. The customary frequentist confidence intervals, calibrated through hypothetical repetitions of the experiment that is supposed to have produced the data at hand, rely on a feeble concept of replicability. In particular, contradictory conclusions may be reached when a substantial enlargement of the study is undertaken. To redefine statistical confidence in such a way that inferential conclusions are non-contradictory, with large enough probability, under enlargements of the sample, we give a new reading of a proposal dating back to the 1960s, namely Robbins' confidence sequences. Directly bounding the probability of reaching, in the future, conclusions that contradict the current ones, Robbins' confidence sequences ensure a clear-cut form of replicability when inference is performed on accumulating data. Their main frequentist property is easy to understand and to prove. We show that Robbins' confidence sequences may be justified under various views of inference: they are likelihood-based, can incorporate prior information, and obey the strong likelihood principle. They are easy to compute, even when inference is on a parameter of interest, especially using a closed-form approximation from normal asymptotic theory.
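
Illustrative code sketch (not from the paper): a minimal Python implementation of one classical construction, Robbins' normal-mixture confidence sequence for the mean of i.i.d. normal data with known variance. The tuning constant a, the significance level, and the simulated data are illustrative choices.

```python
import numpy as np

# Sketch of Robbins' normal-mixture confidence sequence for the mean of
# i.i.d. N(mu, sigma^2) data with known sigma.  With probability >= 1 - alpha
# the intervals below cover mu simultaneously for every sample size n, so
# conclusions drawn at interim looks cannot be contradicted by later
# enlargements of the study.

def robbins_cs(x, sigma=1.0, alpha=0.05, a=1.0):
    """Running confidence sequence after each observation in x.

    a > 0 is the variance parameter of the Gaussian mixing distribution;
    it tunes how the boundary spends its width over time.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(1, len(x) + 1)
    xbar = np.cumsum(x) / n
    radius = sigma / n * np.sqrt((n + a) * np.log((n + a) / (a * alpha**2)))
    return xbar - radius, xbar + radius

rng = np.random.default_rng(0)
data = rng.normal(loc=0.3, scale=1.0, size=500)
lo, hi = robbins_cs(data, sigma=1.0, alpha=0.05)
print(f"after n=100: ({lo[99]:.3f}, {hi[99]:.3f})")
print(f"after n=500: ({lo[499]:.3f}, {hi[499]:.3f})")
```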

13 citations


Posted Content
TL;DR: The e-value and the accompanying FBST, the Full Bayesian Significance Test, constitute the core of a research program that was started at IME-USP, is being developed by over 20 researchers worldwide, and has, so far, been referenced by over 200 publications.
Abstract: This article gives a survey of the e-value, a statistical significance measure, a.k.a. the evidence rendered by observational data, X, in support of a statistical hypothesis, H, or, the other way around, the epistemic value of H given X. The e-value and the accompanying FBST, the Full Bayesian Significance Test, constitute the core of a research program that was started at IME-USP, is being developed by over 20 researchers worldwide, and has, so far, been referenced by over 200 publications. The e-value and the FBST comply with the best principles of Bayesian inference, including the likelihood principle, complete invariance, asymptotic consistency, etc. Furthermore, they exhibit powerful logical or algebraic properties in situations where one needs to compare or compose distinct hypotheses that can be formulated either in the same or in different statistical models. Moreover, they effortlessly accommodate the case of sharp or precise hypotheses, a situation where alternative methods often require ad hoc and convoluted procedures. Finally, the FBST has outstanding robustness and reliability characteristics, outperforming traditional tests of hypotheses in many practical applications of statistical modeling and operations research.
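
Illustrative code sketch (not the authors' code): a Monte Carlo computation of the FBST e-value for the sharp hypothesis H: mu = 0 in a toy normal model with known variance, a flat prior, and a flat reference function, so that the tangential set is defined by the posterior density alone. The model and numbers are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

# FBST e-value sketch: posterior mass *outside* the "tangential" set of points
# whose posterior density exceeds the maximum posterior density attained on H.
# Toy model: X_i ~ N(mu, 1) with a flat prior, so the posterior is N(xbar, 1/n).

rng = np.random.default_rng(1)
x = rng.normal(loc=0.25, scale=1.0, size=50)       # observed data (simulated)
n, xbar = len(x), x.mean()
post = norm(loc=xbar, scale=1.0 / np.sqrt(n))      # posterior for mu

s_star = post.pdf(0.0)                             # sup of posterior density on H: mu = 0
draws = post.rvs(size=200_000, random_state=rng)   # Monte Carlo draws from the posterior
tangential_mass = np.mean(post.pdf(draws) > s_star)
e_value = 1.0 - tangential_mass                    # evidence in support of H

print(f"Monte Carlo e-value: {e_value:.3f}")
# Closed form in this toy model: 2 * (1 - Phi(sqrt(n) * |xbar|))
print(f"closed form        : {2 * (1 - norm.cdf(np.sqrt(n) * abs(xbar))):.3f}")
```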

12 citations


Journal ArticleDOI
TL;DR: This paper surveys the e-value, a statistical significance measure, and its application in statistical modeling and operations research.
Abstract: This article gives a survey of the e-value, a statistical significance measure, a.k.a. the evidence rendered by observational data, X, in support of a statistical hypothesis, H, or, the other way around, the epistemic value of H given X. The e-value and the accompanying FBST, the Full Bayesian Significance Test, constitute the core of a research program that was started at IME-USP, is being developed by over 20 researchers worldwide, and has, so far, been referenced by over 200 publications. The e-value and the FBST comply with the best principles of Bayesian inference, including the likelihood principle, complete invariance, asymptotic consistency, etc. Furthermore, they exhibit powerful logical or algebraic properties in situations where one needs to compare or compose distinct hypotheses that can be formulated either in the same or in different statistical models. Moreover, they effortlessly accommodate the case of sharp or precise hypotheses, a situation where alternative methods often require ad hoc and convoluted procedures. Finally, the FBST has outstanding robustness and reliability characteristics, outperforming traditional tests of hypotheses in many practical applications of statistical modeling and operations research.

10 citations


Journal ArticleDOI
TL;DR: A new analytic alternative for item-level missingness, called two-stage maximum likelihood (TSML), is studied; it shows negligible bias, high efficiency, and good coverage, and is recommended whenever it achieves convergence.
Abstract: Psychologists use scales composed of multiple items to measure underlying constructs. Missing data on such scales often occur at the item level, whereas the model of interest to the researcher is at the composite (scale score) level. Existing analytic approaches cannot easily accommodate item-level missing data when models involve composites. A very common practice in psychology is to average all available items to produce scale scores. This approach, referred to as available-case maximum likelihood (ACML), may produce biased parameter estimates. Another approach researchers use to deal with item-level missing data is scale-level full information maximum likelihood (SL-FIML), which treats the whole scale as missing if any item is missing. SL-FIML is inefficient and may also exhibit bias. Multiple imputation (MI) produces the correct results using a simulation-based approach. We study a new analytic alternative for item-level missingness, called two-stage maximum likelihood (TSML; Savalei & Rhemtulla, Journal of Educational and Behavioral Statistics, 42(4), 405-431, 2017). The original work showed the method outperforming ACML and SL-FIML in structural equation models with parcels. The current simulation study examined the performance of ACML, SL-FIML, MI, and TSML in the context of univariate regression. We demonstrated performance issues encountered by ACML and SL-FIML when estimating regression coefficients, under both MCAR and MAR conditions. Aside from convergence issues with small sample sizes and high missingness, TSML performed similarly to MI in all conditions, showing negligible bias, high efficiency, and good coverage. This fast analytic approach is therefore recommended whenever it achieves convergence. R code and a Shiny app to perform TSML are provided.
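
Illustrative code sketch (hypothetical data): the two composite-scoring conventions described above, averaging whatever items are available (the input to ACML) versus setting the scale score to missing whenever any item is missing (the input to SL-FIML-style analyses).

```python
import numpy as np
import pandas as pd

# Hypothetical 4-item scale with item-level missingness, contrasting the two
# composite-scoring conventions from the abstract.

items = pd.DataFrame(
    {
        "item1": [4, 3, np.nan, 5],
        "item2": [5, np.nan, 2, 4],
        "item3": [4, 4, 3, np.nan],
        "item4": [3, 2, 2, 5],
    }
)

acml_score = items.mean(axis=1, skipna=True)       # average of available items
sl_fiml_score = items.mean(axis=1, skipna=False)   # NaN if any item is missing

print(pd.DataFrame({"ACML": acml_score, "SL-FIML": sl_fiml_score}))
# Only the first row has all four items, so SL-FIML keeps a single usable scale
# score here (the inefficiency noted above), while ACML keeps every row but can
# bias regression estimates under MAR missingness.
```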

6 citations


Journal ArticleDOI
01 Feb 2020
TL;DR: In this paper, the authors address testing inference on the dispersion parameter in heteroscedastic symmetric nonlinear regression models with small samples, deriving Bartlett corrections to improve the likelihood ratio and modified profile likelihood ratio tests.
Abstract: In this paper we address the issue of testing inference on the dispersion parameter in heteroscedastic symmetric nonlinear regression models considering small samples. We derive Bartlett corrections to improve the likelihood ratio as well as the modified profile likelihood ratio tests. Our results extend some of those obtained in Cordeiro (J Stat Comput Simul 74:609–620, 2004) and Ferrari et al. (J Stat Plan Inference 124:423–437, 2004), who consider a symmetric nonlinear regression model and a normal linear regression model, respectively. We also present the bootstrap and bootstrap Bartlett corrected likelihood ratio tests. Monte Carlo simulations are carried out to compare the finite-sample performances of the three corrected tests and their uncorrected versions. The numerical evidence shows that the corrected modified profile likelihood ratio test and the bootstrap and bootstrap Bartlett corrected likelihood ratio tests perform better than the others. We also present an empirical application.
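
Illustrative code sketch (not from the paper): the generic bootstrap Bartlett recipe, which rescales the likelihood ratio statistic so that its mean matches that of its chi-square reference, shown for a toy test of a normal variance rather than the heteroscedastic symmetric nonlinear regression models studied in the paper.

```python
import numpy as np
from scipy.stats import chi2

# Bootstrap Bartlett correction sketch for the toy problem
# H0: sigma^2 = sigma0^2 with N(mu, sigma^2) data.  Only the correction
# recipe matches the paper's setting; the model here is deliberately simple.

def lr_stat(x, sigma0_sq):
    s2 = np.mean((x - x.mean()) ** 2)          # unrestricted MLE of sigma^2
    r = s2 / sigma0_sq
    return len(x) * (r - 1.0 - np.log(r))      # LR statistic, 1 degree of freedom

rng = np.random.default_rng(2)
x = rng.normal(loc=1.0, scale=1.1, size=25)    # small sample, true sigma = 1.1
sigma0_sq, q = 1.0, 1

lr = lr_stat(x, sigma0_sq)

# Parametric bootstrap under the fitted null model.
boot = np.array([
    lr_stat(rng.normal(loc=x.mean(), scale=np.sqrt(sigma0_sq), size=len(x)), sigma0_sq)
    for _ in range(2000)
])
c_hat = boot.mean() / q                        # estimated Bartlett factor E[LR] / q
lr_bartlett = lr / c_hat

print(f"uncorrected p-value       : {chi2.sf(lr, q):.3f}")
print(f"bootstrap-Bartlett p-value: {chi2.sf(lr_bartlett, q):.3f}")
```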

4 citations


Posted Content
TL;DR: This note is intended to provide a brief introduction at the advanced undergraduate or beginning graduate student level, citing a few papers giving examples and containing numerous pointers to the vast literature on likelihood.
Abstract: Likelihood functions are ubiquitous in data analyses at the LHC and elsewhere in particle physics. Partly because "probability" and "likelihood" are virtual synonyms in everyday English, but crucially distinct in data analysis, there is great potential for confusion. Furthermore, each of various approaches to statistical inference (likelihoodist, Neyman-Pearson, Bayesian) uses the likelihood function in different ways. This note is intended to provide a brief introduction at the advanced undergraduate or beginning graduate student level, citing a few papers giving examples and containing numerous pointers to the vast literature on likelihood. The Likelihood Principle (routinely violated in particle physics analyses) is mentioned as an unresolved issue in the philosophical foundations of statistics.
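
Illustrative code sketch: the classic stopping-rule example behind the (strong) Likelihood Principle, in which binomial and negative-binomial designs yield proportional likelihood functions for the same data but different frequentist p-values. The specific counts (3 successes in 12 trials) are the standard textbook numbers, not taken from the note.

```python
import numpy as np
from scipy.stats import binom, nbinom

# The same data -- 3 successes and 9 failures -- can arise from a binomial
# design (n = 12 trials fixed in advance) or a negative-binomial design
# (sample until 3 successes).  The likelihood functions are proportional, so
# likelihood-based and Bayesian conclusions agree, yet the frequentist
# p-values for H0: p = 1/2 against p < 1/2 differ.

p_grid = np.linspace(0.05, 0.95, 7)
lik_binom = binom.pmf(3, 12, p_grid)      # L(p) under binomial sampling
lik_nbinom = nbinom.pmf(9, 3, p_grid)     # L(p) under negative-binomial sampling
print("likelihood ratio (constant in p):", lik_binom / lik_nbinom)

p_value_binomial = binom.cdf(3, 12, 0.5)  # P(X <= 3) with n fixed
p_value_negbinom = nbinom.sf(8, 3, 0.5)   # P(>= 9 failures before the 3rd success)
print(f"binomial design p-value         : {p_value_binomial:.4f}")
print(f"negative-binomial design p-value: {p_value_negbinom:.4f}")
```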

4 citations


Posted Content
TL;DR: Based on all of the simulations and the five case studies, the proposed support vector regression with a working-likelihood, data-driven insensitive parameter is superior and has lower computational cost.
Abstract: The insensitive parameter in support vector regression determines the set of support vectors, which greatly impacts the prediction. A data-driven approach is proposed to determine an approximate value for this insensitive parameter by minimizing a generalized loss function originating from the likelihood principle. This data-driven support vector regression also statistically standardizes samples using the scale of the noise. Nonlinear and linear numerical simulations with three types of noise ($\epsilon$-Laplacian distribution, normal distribution, and uniform distribution), together with five real benchmark data sets, are used to test the capacity of the proposed method. Based on all of the simulations and the five case studies, the proposed support vector regression using a working-likelihood, data-driven insensitive parameter is superior and has lower computational cost.
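
Illustrative code sketch (not the paper's working-likelihood estimator, which is not reproduced here): a common heuristic with the same flavour, estimating the noise scale from a pilot fit and setting the insensitive parameter proportional to it, using scikit-learn's SVR. The proportionality constant and simulated data are arbitrary choices.

```python
import numpy as np
from sklearn.svm import SVR

# Data-driven epsilon heuristic: fit a pilot SVR with no insensitivity tube,
# take a robust estimate of the residual noise scale, and set epsilon
# proportional to it before refitting.

rng = np.random.default_rng(3)
X = np.sort(rng.uniform(-3, 3, size=200))[:, None]
y = np.sin(X).ravel() + rng.laplace(scale=0.2, size=200)    # Laplacian noise

pilot = SVR(kernel="rbf", C=10.0, epsilon=0.0).fit(X, y)    # pilot fit, no tube
resid = y - pilot.predict(X)
scale = np.median(np.abs(resid - np.median(resid)))         # robust noise scale (MAD)
eps = 1.0 * scale                                           # proportionality constant is a choice

model = SVR(kernel="rbf", C=10.0, epsilon=eps).fit(X, y)
print(f"data-driven epsilon: {eps:.3f}, support vectors: {len(model.support_)}")
```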

4 citations


Posted Content
TL;DR: The proposed method has shown its merit in producing an MLE for a network dataset and model that had defied estimation using all other known methods; the fact that the MLE, unlike the MPLE, satisfies the likelihood principle is exploited here to search for improved starting values for approximation-based MLE methods.
Abstract: Much of the theory of estimation for exponential family models, which include exponential-family random graph models (ERGMs) as a special case, is well-established, and maximum likelihood estimates in particular enjoy many desirable properties. However, in the case of many ERGMs, direct calculation of MLEs is impossible, and therefore methods for approximating MLEs and/or alternative estimation methods must be employed. Many MLE approximation methods require alternative estimates as starting points. We discuss one class of such alternatives here. The MLE satisfies the so-called "likelihood principle," unlike the maximum pseudo-likelihood estimate (MPLE): because the MLE depends on the data only through the sufficient statistics, different networks with the same sufficient statistics share the same MLE, yet they may have different MPLEs. We exploit this fact here to search for improved starting values for approximation-based MLE methods. The method we propose has shown its merit in producing an MLE for a network dataset and model that had defied estimation using all other known methods.
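
Illustrative code sketch (not the authors' code): the MPLE idea for a toy ERGM with edge and triangle terms, regressing each dyad's tie indicator on the change statistics obtained by toggling that dyad while holding the rest of the network fixed. Real analyses would use dedicated ERGM software such as statnet.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# MPLE sketch: for each dyad (i, j), the change statistics are the increase in
# each sufficient statistic when the tie is toggled on.  Logistic regression of
# the observed ties on these change statistics gives the MPLE coefficients.

rng = np.random.default_rng(4)
n = 30
A = (rng.random((n, n)) < 0.15).astype(int)
A = np.triu(A, 1)
A = A + A.T                                        # undirected, no self-loops

rows_x, rows_y = [], []
for i in range(n):
    for j in range(i + 1, n):
        common = int(np.sum(A[i] * A[j]))          # change in triangle count when toggling (i, j)
        rows_x.append([1.0, common])               # change stats: edges, triangles
        rows_y.append(A[i, j])

# Large C approximates an unpenalized fit; the edges term is the explicit constant.
mple = LogisticRegression(C=1e6, fit_intercept=False).fit(rows_x, rows_y)
print("MPLE (edges, triangle):", mple.coef_.ravel())
```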

3 citations



Journal ArticleDOI
TL;DR: It is argued that Pinna and Conti's first claim, that neither simplicity nor likelihood approaches can account for these phenomena, rests on incorrect assumptions, whereas their second claim, that simplicity and likelihood are equivalent, is simply untrue.
Abstract: Pinna and Conti (Brain Sci., 2019, 9, 149, doi:10.3390/brainsci9060149) presented phenomena concerning the salience and role of contrast polarity in human visual perception, particularly in amodal completion. These phenomena are indeed illustrative thereof, but here, the focus is on their claims (1) that neither simplicity nor likelihood approaches can account for these phenomena; and (2) that simplicity and likelihood are equivalent. I argue that their first claim is based on incorrect assumptions, whereas their second claim is simply untrue.

3 citations


Journal ArticleDOI
TL;DR: In this paper, the univariate likelihood function based on a random vector is used to establish uniqueness in reconstructing the distribution of the vector in multivariate normal (MN) frameworks; this links to a reverse of Cochran's theorem, which concerns the distribution of quadratic forms in normal variables.

Posted Content
TL;DR: In this paper, a penalized maximum likelihood approach was proposed to address the problem of likelihood degeneracy in the von Mises-Fisher distribution, whereby a penalty function was incorporated.
Abstract: The von Mises-Fisher distribution is one of the most widely used probability distributions to describe directional data. Finite mixtures of von Mises-Fisher distributions have found numerous applications. However, the likelihood function for the finite mixture of von Mises-Fisher distributions is unbounded and consequently the maximum likelihood estimation is not well defined. To address the problem of likelihood degeneracy, we consider a penalized maximum likelihood approach whereby a penalty function is incorporated. We prove strong consistency of the resulting estimator. An Expectation-Maximization algorithm for the penalized likelihood function is developed and simulation studies are performed to examine its performance.
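
Illustrative code sketch (not from the paper): the likelihood degeneracy for a two-component von Mises mixture on the circle, the d = 2 member of the von Mises-Fisher family, together with a penalized log-likelihood. The penalty "-lam * sum(kappa_j)" is an illustrative choice standing in for the paper's penalty function.

```python
import numpy as np
from scipy.special import i0e, logsumexp

def vm_logpdf(x, mu, kappa):
    # log von Mises density, written with the scaled Bessel function
    # i0e(k) = exp(-k) * I0(k) so that large kappa does not overflow
    return kappa * (np.cos(x - mu) - 1.0) - np.log(2 * np.pi * i0e(kappa))

def mix_loglik(x, w, mus, kappas):
    comp = np.array([np.log(wj) + vm_logpdf(x, mj, kj)
                     for wj, mj, kj in zip(w, mus, kappas)])
    return logsumexp(comp, axis=0).sum()

def penalized_loglik(x, w, mus, kappas, lam=1.0):
    return mix_loglik(x, w, mus, kappas) - lam * np.sum(kappas)

rng = np.random.default_rng(5)
x = rng.vonmises(mu=0.0, kappa=2.0, size=100)

# Degenerate path: centre component 1 on a single observation and let its
# concentration grow.  The raw log-likelihood keeps increasing (roughly like
# 0.5 * log kappa1); the penalized version is driven down and stays bounded.
for kappa1 in (10.0, 100.0, 1000.0):
    w, mus, kappas = (0.5, 0.5), (x[0], 0.0), (kappa1, 2.0)
    print(f"kappa1={kappa1:7.1f}  loglik={mix_loglik(x, w, mus, kappas):8.2f}"
          f"  penalized={penalized_loglik(x, w, mus, kappas):10.2f}")
```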

Posted Content
TL;DR: In this article, the authors argue that the likelihood principle and the weak law of likelihood generalize naturally to settings in which experimenters are justified only in making comparative, non-numerical judgments of the form "$A$ given $B$ is more likely than $C$ given $D$."
Abstract: We argue that the likelihood principle (LP) and weak law of likelihood (LL) generalize naturally to settings in which experimenters are justified only in making comparative, non-numerical judgments of the form "$A$ given $B$ is more likely than $C$ given $D$." To do so, we first formulate qualitative analogs of those theses. Then, using a framework for qualitative conditional probability, we show that just as the LP characterizes when all Bayesians (regardless of prior) agree that two pieces of evidence are equivalent, so a qualitative/non-numerical version of the LP provides sufficient conditions for agreement among experimenters whose degrees of belief satisfy only very weak "coherence" constraints. We prove a similar result for the LL. We conclude by discussing the relevance of these results to stopping rules.
