Showing papers in "Annals of the Institute of Statistical Mathematics in 1998"


Journal ArticleDOI
TL;DR: Several space-time statistical models are constructed based on both classical empirical studies of clustering and some more speculative hypotheses, and their goodness-of-fit, as measured by AIC values, is discussed for two high-quality data sets from different tectonic regions.
Abstract: Several space-time statistical models are constructed based on both classical empirical studies of clustering and some more speculative hypotheses. Then we discuss the discrimination between models incorporating contrasting assumptions concerning the form of the space-time clusters. We also examine further practical extensions of the model to situations where the background seismicity is spatially non-homogeneous, and the clusters are non-isotropic. The goodness-of-fit of the models, as measured by AIC values, is discussed for two high quality data sets, in different tectonic regions. AIC also allows the details of the clustering structure in space to be clarified. A simulation algorithm for the models is provided, and used to confirm the numerical accuracy of the likelihood calculations. The simulated data sets show spatial distributions similar to the real ones, but differ from them in some features of space-time clustering. These differences may provide useful indicators of directions for further study.

1,060 citations


Journal ArticleDOI
TL;DR: In this article, the condition number of \(\Sigma^{-1} S\), i.e., the ratio of its extreme eigenvalues, is shown to be \(1 + O_p((q/n)^{1/2})\) as \(q \to \infty\) and \(q/n \to 0\) for two estimators S based on Tyler's M-functional of scatter.
Abstract: Let \(y_1, y_2, \ldots, y_n \in \mathbb{R}^q\) be independent, identically distributed random vectors with nonsingular covariance matrix \(\Sigma\), and let \(S = S(y_1, \ldots, y_n)\) be an estimator for \(\Sigma\). A quantity of particular interest is the condition number of \(\Sigma^{-1} S\). If the \(y_i\) are Gaussian and S is the sample covariance matrix, the condition number of \(\Sigma^{-1} S\), i.e. the ratio of its extreme eigenvalues, equals \(1 + O_p((q/n)^{1/2})\) as \(q \to \infty\) and \(q/n \to 0\). The present paper shows that the same result can be achieved with two estimators based on Tyler's (1987, Ann. Statist., 15, 234-251) M-functional of scatter, assuming only elliptical symmetry of \(\mathcal{L}(y_i)\) or less. The main tool is a linear expansion for this M-functional which holds uniformly in the dimension q. As a by-product we obtain continuous Fréchet-differentiability with respect to weak convergence.

111 citations
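
As a rough numerical illustration of the quantity discussed in the abstract above, the following sketch computes the condition number of \(\Sigma^{-1} S\) for Gaussian data with S the sample covariance matrix (the baseline case mentioned in the abstract, not the Tyler-based estimators). The function name `condition_number`, the chosen dimensions and the way `sigma` is generated are assumptions made only for this example.

```python
import numpy as np

def condition_number(sigma, s):
    """Ratio of the largest to the smallest eigenvalue of sigma^{-1} s."""
    # sigma^{-1} s has the same eigenvalues as the symmetric matrix
    # L^{-1} s L^{-T}, where sigma = L L^T is the Cholesky factorization.
    chol = np.linalg.cholesky(sigma)
    half = np.linalg.solve(chol, s)            # L^{-1} s
    sym = np.linalg.solve(chol, half.T).T      # L^{-1} s L^{-T}
    eig = np.linalg.eigvalsh((sym + sym.T) / 2)  # ascending order
    return eig[-1] / eig[0]

# Illustrative setup (all values are assumptions for the example).
rng = np.random.default_rng(0)
q, n = 5, 2000
a = rng.standard_normal((q, q))
sigma = a @ a.T + q * np.eye(q)                # a nonsingular covariance matrix
y = rng.multivariate_normal(np.zeros(q), sigma, size=n)
s = np.cov(y, rowvar=False)                    # sample covariance matrix

kappa = condition_number(sigma, s)
print(kappa, 1 + np.sqrt(q / n))               # compare with the 1 + (q/n)^{1/2} scale
```

The symmetric reformulation is used only so that a symmetric eigenvalue routine can be applied; the eigenvalues are the same as those of \(\Sigma^{-1} S\).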


Journal ArticleDOI
TL;DR: Considering the sampling error of AIC, a set of good models is constructed rather than choosing a single model, called a confidence set of models, which includes the minimum E{AIC} model at an error rate smaller than the specified significance level.
Abstract: Akaike's information criterion (AIC) is widely used to estimate the best model from a given candidate set of parameterized probabilistic models. In this paper, considering the sampling error of AIC, a set of good models is constructed rather than choosing a single model. This set is called a confidence set of models, which includes the minimum E{AIC} model at an error rate smaller than the specified significance level. The result is given as a P-value for each model, from which the confidence set is immediately obtained. A variant of Gupta's subset selection procedure is devised, in which a standardized difference of AIC is calculated for every pair of models. The critical constants are computed by the Monte-Carlo method, where the asymptotic normal approximation of AIC is used. The proposed method neither requires the full model nor assumes a hierarchical structure of models, and it has higher power than similar existing methods.

68 citations
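
To make the starting point of the abstract above concrete, here is a minimal sketch of how AIC values and their pairwise differences might be computed for a small set of candidate models. It does not implement the confidence set or the P-values of the paper; the polynomial regression setting, the simulated data and the helper `gaussian_aic` are assumptions chosen only for illustration.

```python
import numpy as np

def gaussian_aic(y, fitted, n_coef):
    """AIC = -2 * (maximized Gaussian log-likelihood) + 2 * (number of parameters)."""
    n = len(y)
    sigma2 = np.mean((y - fitted) ** 2)         # ML estimate of the error variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return -2 * loglik + 2 * (n_coef + 1)       # +1 counts the variance parameter

# Simulated data from a quadratic trend (an assumption for the example).
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 60)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.3, size=x.size)

# Candidate models: polynomials of degree 0 to 4.
aic = {}
for degree in range(5):
    coef = np.polyfit(x, y, degree)
    aic[degree] = gaussian_aic(y, np.polyval(coef, x), degree + 1)

best = min(aic.values())
delta = {d: a - best for d, a in aic.items()}   # AIC differences from the minimum
print(delta)
```

Small differences in `delta` are exactly the situation in which the sampling error of AIC matters and a single selected model may be misleading.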


Journal ArticleDOI
TL;DR: In this article, the authors introduce new exponential families, that come from the concept of weighted distribution, that include and generalize the Poisson distribution, and study the statistical properties of the families and provide a useful interpretation of the parameters.
Abstract: The main goal of this paper is to introduce new exponential families, that come from the concept of weighted distribution, that include and generalize the Poisson distribution. In these families there are distributions with index of dispersion greater than, equal to or smaller than one. This property makes them suitable to fit discrete data in overdispersion or underdispersion situations. We study the statistical properties of the families and we provide a useful interpretation of the parameters. Two classical examples are considered in order to compare the fits with some other distributions. To obtain the fits with the new family, the study of the profile log-likelihood is required.

65 citations
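
The dispersion behaviour described in the abstract above can be checked numerically with a small sketch. A weighted Poisson distribution has probabilities proportional to \(w(x)\,\theta^x / x!\). The specific weight \((x + a)^r\), the parameter values and the truncation point `n_max` are assumptions made for this example and are not taken from the abstract.

```python
import math

def weighted_poisson_pmf(theta, weight, n_max=200):
    """pmf proportional to weight(x) * theta**x / x!, truncated at n_max."""
    logs = [math.log(weight(x)) + x * math.log(theta) - math.lgamma(x + 1)
            for x in range(n_max + 1)]
    top = max(logs)                              # subtract the maximum for stability
    raw = [math.exp(v - top) for v in logs]
    total = sum(raw)
    return [v / total for v in raw]

def dispersion_index(pmf):
    """Variance divided by mean."""
    mean = sum(x * p for x, p in enumerate(pmf))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))
    return var / mean

def index_for(r, theta=1.0, a=1.0):
    # Hypothetical weight (x + a)**r; r = 0 gives back the Poisson distribution.
    return dispersion_index(weighted_poisson_pmf(theta, lambda x: (x + a) ** r))

for r in (-1, 0, 1):
    print(r, index_for(r))
```

With this assumed weight, a negative exponent gives an index above one (overdispersion), a zero exponent recovers the Poisson value of one, and a positive exponent gives an index below one (underdispersion).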


Journal ArticleDOI
TL;DR: In this article, a diagnostic model and several new diagnostic statistics are proposed for testing for varying dispersion in exponential family nonlinear models and the results of simulation studies are presented, which show that the adjusted tests keep their sizes better and are more powerful than the ordinary tests.
Abstract: A diagnostic model and several new diagnostic statistics are proposed for testing for varying dispersion in exponential family nonlinear models. A score statistic and an adjusted score statistic based on Cox and Reid (1987, J. Roy. Statist. Soc. Ser. B, 55, 467-471) are derived in normal, inverse Gaussian, and gamma nonlinear models. An adjusted likelihood ratio statistic is also given for normal and inverse Gaussian nonlinear models. The results of simulation studies are presented, which show that the adjusted tests keep their sizes better and are more powerful than the ordinary tests.

34 citations


Journal ArticleDOI
TL;DR: In this paper, the maximum size of random spheres in a volume part is to be predicted from the sectional circular distribution of spheres cut by a plane, where the size of the spheres is assumed to follow the three-parameter generalized gamma distribution.
Abstract: This paper continues earlier work by the authors (1998, Ann. Inst. Statist. Math., 50, 361–377). In the Wicksell corpuscle problem, the maximum size of random spheres in a volume part is to be predicted from the sectional circular distribution of spheres cut by a plane. The size of the spheres is assumed to follow the three-parameter generalized gamma distribution. Prediction methods based on the moment estimation are proposed and their performances are evaluated by simulation. For a practically probable case, one of these prediction methods is as good as a method previously proposed by the authors where the two shape parameters are assumed to be known.

32 citations


Journal ArticleDOI
TL;DR: The local structure of monotone and regular divergences, which include f-divergences as a particular case, is characterized by giving their Taylor expansion up to fourth order by using the invariant properties of Amari's α-connections.
Abstract: In this paper we characterize the local structure of monotone and regular divergences, which include f-divergences as a particular case, by giving their Taylor expansion up to fourth order. We extend a previous result obtained by Cencov, using the invariant properties of Amari's α-connections.

29 citations


Journal ArticleDOI
TL;DR: In this article, the authors give an approach for testing statistical hypotheses, using a general class of dissimilarity measures among k ≥ 2 distributions, and the test statistics are obtained by replacing, in the expression of the dissimilarity measure, the unknown parameters by their maximum likelihood estimators.
Abstract: Various problems in statistics have been treated by the decision rule, based on the concept of distance between distributions. The aim of this paper is to give an approach for testing statistical hypotheses, using a general class of dissimilarity measures among k ≥ 2 distributions. The test statistics are obtained by the replacement, in the expression of the dissimilarity measure, of the unknown parameters by their maximum likelihood estimators. The asymptotic distributions of the resulting test statistics are investigated and the results are applied to multinomial and multivariate normal populations.

29 citations


Journal ArticleDOI
TL;DR: In this article, a least-squares version of the empirical likelihood is proposed to overcome the extra computational difficulty that arises when additional constraints are imposed on conventional empirical likelihood to reflect additional and sought-after features of statistical analysis.
Abstract: In conventional empirical likelihood, there is exactly one structural constraint for every parameter. In some circumstances, additional constraints are imposed to reflect additional and sought-after features of statistical analysis. Such an augmented scheme uses the implicit power of empirical likelihood to produce very natural adaptive statistical methods, free of arbitrary tuning parameter choices, and does have good asymptotic properties. The price to be paid for such good properties is in extra computational difficulty. To overcome the computational difficulty, we propose a ‘least-squares’ version of the empirical likelihood. The method is illustrated by application to the case of combined empirical likelihood for the mean and the median in one sample location inference.

29 citations


Journal ArticleDOI
TL;DR: In this paper, the probability distributions of the numbers of runs of "1" of length k, for both k ≥ m and k < m, in the sequence of a {0, 1}-valued m-th order Markov chain are studied.
Abstract: Let \(X_{-m+1}, X_{-m+2}, \ldots, X_0, X_1, X_2, \ldots, X_n\) be a time-homogeneous {0, 1}-valued m-th order Markov chain. The probability distributions of numbers of runs of "1" of length k (k ≥ m) and of "1" of length k (k < m) in the sequence of a {0, 1}-valued m-th order Markov chain are studied. There are several ways of counting numbers of runs with length k. This paper studies the distributions based on four ways of counting numbers of runs, i.e., the number of non-overlapping runs of length k, the number of runs with length greater than or equal to k, the number of overlapping runs of length k and the number of runs of length exactly k.

27 citations
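
The four counting conventions named in the abstract above can be made concrete on a single observed 0/1 sequence. The sketch below only counts runs in given data; it does not compute the probability distributions studied in the paper, and the function names and the example sequence are assumptions for illustration.

```python
from itertools import groupby

def success_run_lengths(seq):
    """Lengths of the maximal runs of 1s in a 0/1 sequence."""
    return [sum(1 for _ in grp) for val, grp in groupby(seq) if val == 1]

def count_runs(seq, k):
    """Count runs of 1s of length k under the four usual conventions."""
    runs = success_run_lengths(seq)
    return {
        # a maximal run of length r contains r // k disjoint blocks of k ones
        "non_overlapping": sum(r // k for r in runs),
        # maximal runs whose length is at least k
        "at_least_k": sum(1 for r in runs if r >= k),
        # every window of k consecutive ones counts, windows may overlap
        "overlapping": sum(r - k + 1 for r in runs if r >= k),
        # maximal runs whose length is exactly k
        "exactly_k": sum(1 for r in runs if r == k),
    }

seq = [1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1]    # maximal runs of lengths 3, 5, 2
print(count_runs(seq, 2))
```

For this sequence and k = 2 the four counts are 4, 3, 7 and 1, which shows how strongly the answer depends on the convention.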


Journal ArticleDOI
TL;DR: In this article, the problem of testing a normal mean vector when observations are missing from subsets of components is considered, and three simple exact tests are proposed as alternatives to the traditional likelihood ratio test.
Abstract: The problem of testing a normal mean vector when the observations are missing from subsets of components is considered. For a data matrix with a monotone pattern, three simple exact tests are proposed as alternatives to the traditional likelihood ratio test. Numerical power comparisons between the proposed tests and the likelihood ratio test suggest that one of the proposed tests is indeed comparable to the likelihood ratio test and the other two tests perform better than the likelihood ratio test over a part of the parameter space. The results are extended to a nonmonotone pattern and illustrated using an example.

Journal ArticleDOI
TL;DR: In this article, the authors considered a linear process where the innovations Z's are i.i.d. satisfying a standard tail regularity and balance condition, vis., P(Z > z) ∼ rz-αL1(z), p(Z 0 and L1) is a slowly varying function.
Abstract: Consider a linear process $$X_t = \sum olimits_{i = 0}^\infty {c_i Z_{t - 1} } $$ where the innovations Z's are i.i.d. satisfying a standard tail regularity and balance condition, vis., P(Z > z) ∼ rz-αL1(z), P(Z 0 and L1 is a slowly varying function. It turns out that in this setup, P(X > x) ∼ px-αL(x), P(X < -x) ∼ qx-αL(x), as x →∞, where α is the same as above, p is a convex combination of r and s, p + q = 1, p, q ≥ 0 and L = $$\left\| {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{c} } \right\|_\alpha ^\alpha L_1 $$ where $$\left\| {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{c} } \right\|_\alpha = \left( {\sum {\left| {c_i } \right|^\alpha } } \right)^{1/\alpha } $$ . The quantities α and β = 2p - 1 can be regarded as tail parameters of the marginal distribution of Xt. We estimate α and β based on a finite realization X1,.., Xn of the time series. Consistency and asymptotic normality of the estimators are established. As a further application, we estimate a tail probability under the marginal distribution of the Xt. A small simulation study is included to indicate the finite sample behavior of the estimators.

Journal ArticleDOI
TL;DR: In this paper, it was shown that when samples are drawn without replacement from a finite population, the relative precision of the ranked set sampling estimator of the population mean, relative to the simple random sample estimator with the same number of units quantified, is bounded below by 1.
Abstract: Let \(X_1, X_2, \ldots, X_m, Y_1, Y_2, \ldots, Y_n\) be a simple random sample without replacement from a finite population and let \(X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}\) and \(Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}\) be the order statistics of \(X_1, X_2, \ldots, X_m\) and \(Y_1, Y_2, \ldots, Y_n\), respectively. It is shown that the joint distribution of \(X_{(i)}\) and \(X_{(j)}\) is positively likelihood ratio dependent and \(Y_{(j)}\) is negatively regression dependent on \(X_{(i)}\). Using these results, it is shown that when samples are drawn without replacement from a finite population, the relative precision of the ranked set sampling estimator of the population mean, relative to the simple random sample estimator with the same number of units quantified, is bounded below by 1.

Journal ArticleDOI
TL;DR: In this article, the authors apply the Kalman filter to the analysis of multi-unit variance components models where each unit's response profile follows a state space model and use the signal extraction approach to smooth individual profiles.
Abstract: We apply the Kalman Filter to the analysis of multi-unit variance components models where each unit's response profile follows a state space model. We use mixed model results to obtain estimates of unit-specific random effects, state disturbance terms and residual noise terms. We use the signal extraction approach to smooth individual profiles. We show how to utilize the Kalman Filter to efficiently compute the restricted loglikelihood of the model. For the important special case where each unit's response profile follows a continuous structural time series model with known transition matrix we derive an EM algorithm for the restricted maximum likelihood (REML) estimation of the variance components. We present details for the case where individual profiles are modeled as local polynomial trends or polynomial smoothing splines.
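
As a minimal sketch of the kind of likelihood computation the abstract above refers to, the following code evaluates the ordinary (not restricted) Gaussian log-likelihood of a single-unit local level model by the prediction error decomposition. The model choice, the proper initial distribution with mean `a1` and variance `p1`, and all names are assumptions for illustration; the multi-unit variance components structure and the REML estimation of the paper are not reproduced here.

```python
import numpy as np

def local_level_loglik(y, var_eps, var_eta, a1=0.0, p1=1.0):
    """Gaussian log-likelihood of a local level model via the Kalman filter.

    y_t = mu_t + eps_t,  mu_{t+1} = mu_t + eta_t,  mu_1 ~ N(a1, p1).
    """
    a, p, loglik = a1, p1, 0.0
    for obs in y:
        v = obs - a                  # one-step prediction error
        f = p + var_eps              # variance of the prediction error
        loglik += -0.5 * (np.log(2 * np.pi * f) + v * v / f)
        gain = p / f                 # Kalman gain
        a = a + gain * v             # filtered level, also the next prediction
        p = p * (1 - gain) + var_eta # filtered variance plus state disturbance
    return loglik

y = np.array([0.3, -0.1, 0.8, 1.2])
print(local_level_loglik(y, var_eps=2.0, var_eta=0.5))
```

The recursion costs a fixed amount of work per observation, whereas a direct evaluation would require factorizing the full covariance matrix of the observations.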

Journal ArticleDOI
TL;DR: In this paper, a new formulation and proof of Skovgaard's theorem for the intensity of minimum contrast estimators are presented, under conditions which are typically straightforward to check in practice.
Abstract: A number of authors have been concerned with constructing large deviation approximations to densities and probabilities associated with minimum contrast estimators (equivalently, M-estimators) using a tilting approach due to Field. These developments are an interesting and important extension of saddlepoint-type methodology. However, in the case of a multivariate parameter, the theoretical picture has remained incomplete in certain respects, as explained below. In this paper we present results which provide rigorous justification of the tilting argument, using conditions which it is feasible to check. These results include a new formulation and proof of Skovgaard's theorem for the intensity of minimum contrast estimators, but under conditions which are typically straightforward to check in practice. Our most detailed application is to multivariate location-scatter models.

Journal ArticleDOI
TL;DR: In this article, a computationally simple algorithm is considered for estimating, from only one realization, the intensity function of a Poisson process with exponential quadratic and fixed-frequency cyclic trends; the authors argue that it can be used to estimate any Poisson intensity function that has a parametric form.
Abstract: Given only one realization, we consider a computationally simple algorithm for estimating the intensity function of a Poisson process with exponential quadratic and fixed-frequency cyclic trends. We argue that the algorithm can successfully be used to estimate any Poisson intensity function provided that it has a parametric form.

Journal ArticleDOI
TL;DR: In this article, a Bayesian analysis for the Block and Basu (ACBVE) bivariate exponential distribution is proposed, and the authors also consider the use of Gibbs sampling to develop Bayesian inference for accelerated life tests.
Abstract: Metropolis algorithms along with Gibbs steps are proposed to perform a Bayesian analysis for the Block and Basu (ACBVE) bivariate exponential distribution. We also consider the use of Gibbs sampling to develop Bayesian inference for accelerated life tests assuming a power rule model and the ACBVE distribution. The methodology developed in this paper is illustrated with two examples.

Journal ArticleDOI
TL;DR: In this article, the authors derive the generalized probability generating functions of the distributions of the waiting times until the r-th occurrence among the events that given patterns occur in a sequence of i.i.d. random variables, as well as the probability generating functions of the numbers of occurrences of sub-patterns until the first occurrence of a pattern of length k in a higher order Markov chain.
Abstract: Let \(X_1, X_2, \ldots\) be a sequence of independent and identically distributed random variables, which take values in a countable set S = {0, 1, 2, ...}. By a pattern we mean a finite sequence of elements in S. For every i = 0, 1, 2, ..., we denote by \(P_i\) = "\(a_{i,1} a_{i,2} \cdots a_{i,k_i}\)" the pattern of some length \(k_i\), and \(E_i\) denotes the event that the pattern \(P_i\) occurs in the sequence \(X_1, X_2, \ldots\). In this paper, we have derived the generalized probability generating functions of the distributions of the waiting times until the r-th occurrence among the events \(\{E_i\}_{i=0}^{\infty}\). We also have derived the probability generating functions of the distributions of the number of occurrences of sub-patterns of length l (l < k) until the first occurrence of the pattern of length k in the higher order Markov chain.
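
For the simplest special case of the setting above (a single pattern, i.i.d. symbols, first occurrence only), the expected waiting time can be computed with the classical overlap formula rather than with the generating functions of the paper. The sketch below uses that swapped-in technique; the function name and the fair-coin example are assumptions for illustration.

```python
from fractions import Fraction

def expected_waiting_time(pattern, prob):
    """Expected number of i.i.d. draws until `pattern` first appears.

    Classical overlap formula: sum, over every length j at which the first
    j symbols of the pattern equal its last j symbols, of
    1 / P(first j symbols).
    """
    total = Fraction(0)
    for j in range(1, len(pattern) + 1):
        if pattern[:j] == pattern[-j:]:
            weight = Fraction(1)
            for symbol in pattern[:j]:
                weight *= prob[symbol]
            total += 1 / weight
    return total

fair = {0: Fraction(1, 2), 1: Fraction(1, 2)}
print(expected_waiting_time((1, 1), fair))   # 6
print(expected_waiting_time((1, 0), fair))   # 4
```

The two patterns have the same probability at any fixed position, yet the self-overlapping pattern "11" takes longer on average to appear, which is why the overlap structure of a pattern enters the waiting time distribution.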

Journal ArticleDOI
TL;DR: In this paper, the behavior of a sequence of independent identically distributed random variables with respect to a random threshold is investigated; three statistics connected with exceeding the threshold are introduced, and their exact and asymptotic distributions are derived.
Abstract: Behaviour of a sequence of independent identically distributed random variables with respect to a random threshold is investigated. Three statistics connected with exceeding the threshold are introduced, and their exact and asymptotic distributions are derived. Also distribution-free properties, leading to some common and some new discrete distributions, are considered. Identification of equidistribution of observations and the threshold is discussed. In this context relations between the exponential and gamma distributions are studied and a new derivation of the celebrated Laplace expansion for the standard normal distribution function is given.

Journal ArticleDOI
TL;DR: In this article, it was shown that for the curved and regular exponential family situations arising when κ is known and unknown, respectively, the MLE of the mean direction μ is the best equivariant estimator.
Abstract: The circular normal distribution, CN(μ, κ), plays a role for angular data comparable to that of a normal distribution for linear data. We establish that for the curved and for the regular exponential family situations arising when κ is known and unknown, respectively, the MLE \(\widehat{\mu}\) of the mean direction μ is the best equivariant estimator. These results are generalized for the MLE \(\widehat{\underset{\sim}{\mu}}\) of the mean direction vector \(\underset{\sim}{\mu} = (\mu_1, \ldots, \mu_p)'\) in the simultaneous estimation problem with independent CN(\(\mu_i\), κ), i = 1,..., p, populations. We further observe that \(\widehat{\underset{\sim}{\mu}}\) is admissible whether κ is known or unknown. Thus, unlike in normal theory, the Stein effect does not hold for the circular normal case. This result is generalized for the simultaneous estimation problem with directional data in q-dimensional hyperspheres following independent Langevin distributions, \(L(\underset{\sim}{\mu}_i, \kappa),\ i = 1, \ldots, p\).

Journal ArticleDOI
TL;DR: In this article, the recursive estimation of the regression function m(x) = E(Y | X = x) and its derivatives is studied under dependence conditions, and the examined method of nonparametric estimation is a recursive version of the estimator based on locally weighted polynomial fitting.
Abstract: The recursive estimation of the regression function m(x) = E(Y | X = x) and its derivatives is studied under dependence conditions. The examined method of nonparametric estimation is a recursive version of the estimator based on locally weighted polynomial fitting, which in recent articles has proved to be an attractive technique and has advantages over other popular estimation techniques. For strongly mixing processes, expressions for the bias and variance of these estimators are given and asymptotic normality is established. Finally, a simulation study illustrates the proposed estimation method.

Journal ArticleDOI
TL;DR: In this article, the problem of estimating the scale matrix and its eigenvalues in a Wishart distribution and in a multivariate F distribution is considered, and a new class of estimators which shrink the eigenvalues towards their arithmetic mean is proposed.
Abstract: In this paper, the problem of estimating the scale matrix and its eigenvalues in a Wishart distribution and in a multivariate F distribution (which arise naturally from a two-sample setting) is considered. A new class of estimators which shrink the eigenvalues towards their arithmetic mean is proposed. It is shown that the new estimator dominates the usual unbiased estimator under the squared error loss function. A simulation study was carried out to study the performance of these estimators.
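
The idea of shrinking eigenvalues towards their arithmetic mean can be sketched generically as follows. This is a plain linear shrinkage with a hypothetical tuning constant `rho`, chosen only to illustrate the mechanism; it is not the specific class of estimators proposed in the paper, and no dominance result is claimed for it.

```python
import numpy as np

def shrink_eigenvalues(s, rho):
    """Pull the eigenvalues of a symmetric matrix toward their arithmetic mean.

    rho = 0 leaves s unchanged; rho = 1 makes every eigenvalue equal to the mean.
    """
    values, vectors = np.linalg.eigh(s)
    shrunk = (1 - rho) * values + rho * values.mean()
    return (vectors * shrunk) @ vectors.T        # V diag(shrunk) V^T

# Illustrative Wishart-type matrix (all values are assumptions for the example).
rng = np.random.default_rng(2)
x = rng.standard_normal((20, 4))
s = x.T @ x / 20
s_shrunk = shrink_eigenvalues(s, rho=0.3)

print(np.linalg.eigvalsh(s))
print(np.linalg.eigvalsh(s_shrunk))
```

This form of shrinkage keeps the trace and the eigenvectors fixed while reducing the spread of the eigenvalues, which counteracts the tendency of sample eigenvalues to be more dispersed than the population ones.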

Journal ArticleDOI
TL;DR: In this article, the first order local influence approach is adopted to assess the local influence of observations on canonical correlation coefficients, canonical vectors and several relevant test statistics in canonical correlation analysis.
Abstract: The first order local influence approach is adopted in this paper to assess the local influence of observations on canonical correlation coefficients, canonical vectors and several relevant test statistics in canonical correlation analysis. This approach can detect different aspects of influence due to different perturbation schemes. In this paper, we consider two different kinds, namely, the additive perturbation scheme and the case-weights perturbation scheme. It is found that, under the additive perturbation scheme, the influence analysis of any canonical correlation coefficient can be simplified to just observing two predicted residuals. To do the influence analysis for canonical vectors, a scale invariant norm is proposed. Furthermore, by choosing proper perturbation scales on different variables, we can compare the different influential effects of perturbations on different variables under the additive perturbation scheme. An example is presented to illustrate the effectiveness of the first order local influence approach.

Journal ArticleDOI
TL;DR: In this article, exact weak and strong Bahadur-Kiefer representations of the least absolute deviation estimator for the linear regression model are considered, and their precise behavior is obtained under minimal conditions.
Abstract: We consider exact weak and strong Bahadur-Kiefer representations of the least absolute deviation estimator for the linear regression model. The precise behavior of these representations is obtained under minimal conditions.

Journal ArticleDOI
TL;DR: In this article, an exponential family of distributions which generalises the exponential distribution for censored failure time data is analysed, analogous to the way in which the class of generalised linear models generalises the normal distribution.
Abstract: We analyse an exponential family of distributions which generalises the exponential distribution for censored failure time data, analogous to the way in which the class of generalised linear models generalises the normal distribution. The parameter of the distribution depends on a linear combination of covariates via a possibly nonlinear link function, and we allow another level of heterogeneity: the data may contain "immune" individuals who are not subject to failure. Thus the data is modelled by a mixture of a distribution from the exponential family and a "mass at infinity" representing individuals who never fail. Our results include large sample distributions for parameter estimators and for hypothesis test statistics obtained by maximising the likelihood of a sample. The asymptotic distribution of the likelihood ratio test statistic for the hypothesis that there are no immunes present in the population is shown to be "non-standard"; it is a 50-50 mixture of a chi-squared distribution on 1 degree of freedom and a point mass at 0. Our analysis clearly shows how "negligibility" of individual covariate values and "sufficient followup" conditions are required for the asymptotic properties.
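
The non-standard limit described in the abstract above changes how a P-value is computed from the likelihood ratio statistic. The sketch below assumes only the stated limit, a 50-50 mixture of a point mass at 0 and a chi-squared distribution on 1 degree of freedom; the function name is an assumption for illustration.

```python
import math

def boundary_lrt_pvalue(stat):
    """P-value under a 50-50 mixture of a point mass at 0 and chi-squared(1)."""
    if stat <= 0:
        return 1.0
    # For chi-squared with 1 degree of freedom, P(X > t) = erfc(sqrt(t / 2)).
    return 0.5 * math.erfc(math.sqrt(stat / 2))

for stat in (0.0, 1.0, 2.71, 3.84):
    print(stat, boundary_lrt_pvalue(stat))
```

Under the mixture, the 5% critical value is the upper 10% point of the chi-squared distribution on 1 degree of freedom (about 2.71) rather than the usual 3.84, so using the standard chi-squared reference would make the test for the absence of immunes conservative.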

Journal ArticleDOI
TL;DR: In this paper, a test for the hypothesis that f is a linear combination of given linearly independent regression functions \(g_1, \ldots, g_d\) is proposed, based on an estimator of the minimal \(L^2\)-distance between f and the subspace spanned by the regression functions.
Abstract: Let (X, Y) denote a random vector with decomposition \(Y = f(X) + \varepsilon\) where \(f(x) = E[Y \mid X = x]\) is the regression of Y on X. In this paper we propose a test for the hypothesis that f is a linear combination of given linearly independent regression functions \(g_1, \ldots, g_d\). The test is based on an estimator of the minimal \(L^2\)-distance between f and the subspace spanned by the regression functions. More precisely, the method is based on the estimation of certain integrals of the regression function and therefore does not require an explicit estimation of the regression. For this reason the test proposed in this paper does not depend on the subjective choice of a smoothing parameter. Differences between the problem of regression diagnostics in the nonrandom and random design case are also discussed.

Journal ArticleDOI
TL;DR: In this paper, the authors obtained asymptotic representations of several variance estimators of U-statistics and studied their effects for studentizations via Edgeworth expansions up to the order \(o_p(n^{-1})\).
Abstract: In this paper we obtain asymptotic representations of several variance estimators of U-statistics and study their effects for studentizations via Edgeworth expansions. Jackknife, unbiased and Sen's variance estimators are investigated up to the order \(o_p(n^{-1})\). Substituting these estimators into studentized U-statistics, the Edgeworth expansions with remainder term \(o(n^{-1})\) are established and, inverting the expansions, the effects on confidence intervals are discussed theoretically. We also show that Hinkley's corrected jackknife variance estimator is asymptotically equivalent to the unbiased variance estimator up to the order \(o_p(n^{-1})\).
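
To fix ideas about one of the estimators named in the abstract above, here is a minimal sketch of a degree-two U-statistic and its jackknife variance estimator. The kernel (which makes the U-statistic equal to the sample variance), the data and the function names are assumptions for illustration; the unbiased, Sen and corrected jackknife estimators and the Edgeworth expansions are not covered.

```python
from itertools import combinations
import numpy as np

def u_statistic(x, kernel):
    """Average of a symmetric two-argument kernel over all pairs of observations."""
    return np.mean([kernel(a, b) for a, b in combinations(x, 2)])

def jackknife_variance(x, kernel):
    """Jackknife estimate of the variance of the U-statistic."""
    n = len(x)
    # U-statistic recomputed with each observation left out in turn
    leave_one_out = np.array(
        [u_statistic(np.delete(x, i), kernel) for i in range(n)]
    )
    return (n - 1) / n * np.sum((leave_one_out - leave_one_out.mean()) ** 2)

def variance_kernel(a, b):
    return 0.5 * (a - b) ** 2    # the U-statistic is then the sample variance

x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
u = u_statistic(x, variance_kernel)
v_jack = jackknife_variance(x, variance_kernel)
print(u, v_jack, u / np.sqrt(v_jack))   # last value: a studentized form of u
```

The brute-force evaluation over all pairs is adequate for small samples; it is written for clarity rather than speed.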

Journal ArticleDOI
TL;DR: A non-Gaussian state space model for software reliability is proposed, and the effect of the evolving program model is demonstrated by examining a set of data on evolving program failures.
Abstract: In this paper, we propose a non-Gaussian state space model to apply in software reliability. This model assumes an exponential distribution for the failure time in every test-debugging stage, conditionally on the state parameter, namely the number of faults in the program. It is a generalized JM model which can be applied to the imperfect debugging situation as well as in evolving programs. By examining a set of data on evolving program failures, the effect of the evolving program model is amply demonstrated.

Journal ArticleDOI
TL;DR: In this article, classification between two populations with both continuous and binary variables is handled by splitting the problem into locations defined by the binary variables; a plug-in covariance adjusted rule is constructed for continuous covariates with heterogeneous location conditional dispersion matrices, and its asymptotic distribution is derived.
Abstract: Classification between two populations dealing with both continuous and binary variables is handled by splitting the problem into different locations. Given the location specified by the values of the binary variables, discrimination is performed using the continuous variables. The location probability model with homoscedastic across location conditional dispersion matrices is adopted for this problem. In this paper, we consider the presence of continuous covariates with heterogeneous location conditional dispersion matrices. The continuous covariates have equal location specific mean in both populations. Conditional homoscedasticity fails when strong interaction between the continuous and binary variables is present. A plug-in covariance adjusted rule is constructed and its asymptotic distribution is derived. An asymptotic expansion for the overall error rate is given. The result is extended to include binary covariates.

Journal ArticleDOI
TL;DR: In this paper, joint distributions of the numbers of trials, failures, successes and success-runs of length l until the first consecutive k successes are obtained for a time-homogeneous {0, 1}-valued m-th order Markov chain, based on four ways of counting runs.
Abstract: Let \(X_{-m+1}, X_{-m+2}, \ldots, X_0, X_1, X_2, \ldots\) be a time-homogeneous {0, 1}-valued m-th order Markov chain. Joint distributions of the numbers of trials, failures and successes, of the numbers of trials and success-runs of length l (m ≤ l ≤ k) and of the numbers of trials and success-runs of length l (l ≤ m ≤ k) until the first consecutive k successes are obtained in the sequence \(X_1, X_2, \ldots\). There are several ways of counting numbers of runs of length l. This paper studies the joint distributions based on four ways of counting numbers of runs, i.e., the number of non-overlapping runs of length l, the number of runs of length greater than or equal to l, the number of overlapping runs of length l and the number of runs of length exactly l. Marginal distributions of them can be obtained immediately, and surprisingly their distributions are very simple.