Showing papers in "Annals of Mathematical Statistics" in 1962


Journal ArticleDOI
TL;DR: In this paper, the problems of estimating a probability density function and of determining its mode are discussed. Only estimates which are consistent and asymptotically normal are constructed.
Abstract: Given a sequence of independent identically distributed random variables with a common probability density function, the problems of estimating the probability density function and of determining its mode are discussed. Only estimates which are consistent and asymptotically normal are constructed.

10,114 citations
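
A minimal sketch of one way such a density and mode estimate can be computed. It assumes a Gaussian kernel and a hand-picked bandwidth `h`, neither of which is named in the abstract, and it locates the mode by a plain grid search. The function names and the sample values are hypothetical.

```python
import math

def gaussian_kde(sample, h):
    """Return a function x -> kernel density estimate with bandwidth h (assumed Gaussian kernel)."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return f_hat

def estimate_mode(f_hat, lo, hi, steps=2000):
    """Grid search for the point of [lo, hi] where the estimated density is largest."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=f_hat)

# Illustrative data only.
sample = [-1.2, -0.4, -0.1, 0.0, 0.2, 0.3, 0.9, 1.5]
f_hat = gaussian_kde(sample, h=0.5)
mode_hat = estimate_mode(f_hat, -3.0, 3.0)
```

The bandwidth controls the trade-off between bias and variance of the estimate, so in practice it would shrink as the sample grows rather than stay fixed as it does here.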


Journal ArticleDOI
TL;DR: For a long time I have thought I was a statistician, interested in inferences from the particular to the general. But as I have watched mathematical statistics evolve, I have had cause to wonder and to doubt.
Abstract: For a long time I have thought I was a statistician, interested in inferences from the particular to the general. But as I have watched mathematical statistics evolve, I have had cause to wonder and to doubt. And when I have pondered about why such techniques as the spectrum analysis of time series have proved so useful, it has become clear that their “dealing with fluctuations” aspects are, in many circumstances, of lesser importance than the aspects that would already have been required to deal effectively with the simpler case of very extensive data, where fluctuations would no longer be a problem. All in all, I have come to feel that my central interest is in data analysis, which I take to include, among other things: procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data.

1,569 citations


Journal ArticleDOI
TL;DR: In this paper, a generalization of the gamma distribution is proposed, which is based on Liouville's extension to Dirichlet's integral formula, and the moment generating function is shown, and cumulative probabilities are related directly to the incomplete gamma function.
Abstract: This paper concerns a generalization of the gamma distribution, the specific form being suggested by Liouville's extension to Dirichlet's integral formula [3]. In this form it also may be regarded as a special case of a function introduced by L. Amoroso [1] and R. d'Addario [2] in analyzing the distribution of economic income. (Also listed in [4] and [5].) In essence, the generalization (1) herein is accomplished by supplying a positive parameter, $p$, as an exponent in the exponential factor of the gamma distribution. The moment generating function is shown, and cumulative probabilities are related directly to the incomplete gamma function (tabulated in [6]). Distributions are given for various functions of independent "generalized gamma variates" thus defined, special attention being given to the sum of such variates. Convolution results occur in alternating series form, with coefficients whose evaluation may be tedious and lengthy. An upper bound is provided for the modulus of each term, and simplified computation methods are developed for some special cases. A corollary is derived showing that the researches of Robbins in [7] apply to a larger class of problems than was treated in [7]. Extensions of his methods lead to iterative formulae for the coefficients in series obtained for an even larger class of problems.

1,232 citations
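
A hedged sketch of the density and cumulative probabilities described above. The parameterization, with scale `a`, shape `d` and exponent `p`, is an assumption chosen for illustration (the abstract names only $p$), and the incomplete gamma function is evaluated by its power series because the Python standard library does not provide it.

```python
import math

def gen_gamma_pdf(x, a, d, p):
    """Assumed form: p * x**(d-1) * exp(-(x/a)**p) / (a**d * Gamma(d/p)) for x > 0."""
    if x <= 0.0:
        return 0.0
    return p * x ** (d - 1) * math.exp(-((x / a) ** p)) / (a ** d * math.gamma(d / p))

def reg_lower_gamma(s, z, terms=200):
    """Regularized lower incomplete gamma P(s, z) from the series
    z**s * exp(-z) * sum_k z**k / Gamma(s + k + 1)."""
    if z <= 0.0:
        return 0.0
    term = 1.0 / math.gamma(s + 1.0)
    total = term
    for k in range(1, terms):
        term *= z / (s + k)
        total += term
    return z ** s * math.exp(-z) * total

def gen_gamma_cdf(x, a, d, p):
    """Cumulative probability expressed through the incomplete gamma function."""
    return reg_lower_gamma(d / p, (x / a) ** p) if x > 0.0 else 0.0
```

With `p = 1` the density reduces to the ordinary gamma density, and with `d = p = 1` to the exponential, which gives a quick sanity check on the formulas.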



Journal ArticleDOI
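TL;DR: In this paper, the authors consider a system with a finite number of states that is observed periodically; at each observation an action is chosen from a finite set, yielding an immediate income and moving the system to a new state with a probability that depends on the current state and the action. The problem is to choose a policy which maximizes the total expected discounted income.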
Abstract: We consider a system with a finite number $S$ of states $s$, labeled by the integers $1, 2, \cdots, S$. Periodically, say once a day, we observe the current state of the system, and then choose an action $a$ from a finite set $A$ of possible actions. As a joint result of the current state $s$ and the chosen action $a$, two things happen: (1) we receive an immediate income $i(s, a)$ and (2) the system moves to a new state $s'$ with the probability of a particular new state $s'$ given by a function $q = q(s' \mid s, a)$. Finally there is specified a discount factor $\beta, 0 \leqq \beta < 1$, so that the value of unit income $n$ days in the future is $\beta^n$. Our problem is to choose a policy which maximizes our total expected income. This problem, which is an interesting special case of the general dynamic programming problem, has been solved by Howard in his excellent book [3]. The case $\beta = 1$, also studied by Howard, is substantially more difficult. We shall obtain in this case results slightly beyond those of Howard, though still not complete. Our method, which treats $\beta = 1$ as a limiting case of $\beta < 1$, seems rather simpler than Howard's.

730 citations
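
The discounted problem with $\beta < 1$ can be illustrated with value iteration. This is a standard technique swapped in for illustration, not the method of the paper or of Howard's book, and the two-state example data are hypothetical. States are indexed from 0 rather than 1.

```python
def value_iteration(income, q, beta, tol=1e-10):
    """income[s][a] is i(s, a); q[s][a][t] is q(t | s, a); 0 <= beta < 1.
    Returns (values, policy), where policy[s] is a maximizing action index."""
    n_states = len(income)
    v = [0.0] * n_states
    while True:
        new_v, policy = [], []
        for s in range(n_states):
            returns = [
                income[s][a] + beta * sum(q[s][a][t] * v[t] for t in range(n_states))
                for a in range(len(income[s]))
            ]
            best = max(range(len(returns)), key=returns.__getitem__)
            policy.append(best)
            new_v.append(returns[best])
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v, policy
        v = new_v

# Hypothetical example: action 0 stays put, action 1 switches state.
income = [[1.0, 0.0], [2.0, 0.0]]
q = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
values, policy = value_iteration(income, q, beta=0.9)
```

In this example staying in state 1 is worth 2 / (1 - 0.9) = 20, so from state 0 it pays to forgo one day's income and switch, giving 0.9 * 20 = 18.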


Journal ArticleDOI
TL;DR: In this paper, the distribution of the two-sample Cramér-von Mises criterion $T$ is considered for small sample sizes, and tables are presented to permit use of the criterion at some conventional significance levels. The limiting distribution is found to be a surprisingly good approximation to the exact distribution for moderate sample sizes.
Abstract: The Cramér-von Mises $\omega^2$ criterion for testing that a sample, $x_1, \cdots, x_N$, has been drawn from a specified continuous distribution $F(x)$ is \begin{equation*}\tag{1}\omega^2 = \int^\infty_{-\infty} \lbrack F_N(x) - F(x)\rbrack^2 dF(x),\end{equation*} where $F_N(x)$ is the empirical distribution function of the sample; that is, $F_N(x) = k/N$ if exactly $k$ observations are less than or equal to $x$ ($k = 0, 1, \cdots, N$). If there is a second sample, $y_1, \cdots, y_M$, a test of the hypothesis that the two samples come from the same (unspecified) continuous distribution can be based on the analogue of $N\omega^2$, namely \begin{equation*}\tag{2} T = \lbrack NM/(N + M)\rbrack \int^\infty_{-\infty} \lbrack F_N(x) - G_M(x)\rbrack^2 dH_{N+M}(x),\end{equation*} where $G_M(x)$ is the empirical distribution function of the second sample and $H_{N+M}(x)$ is the empirical distribution function of the two samples together [that is, $(N + M)H_{N+M}(x) = NF_N(x) + MG_M(x)$]. The limiting distribution of $N\omega^2$ as $N \rightarrow \infty$ has been tabulated [2], and it has been shown ([3], [4a], and [7]) that $T$ has the same limiting distribution as $N \rightarrow \infty, M \rightarrow \infty$, and $N/M \rightarrow \lambda$, where $\lambda$ is any finite positive constant. In this note we consider the distribution of $T$ for small values of $N$ and $M$ and present tables to permit use of the criterion at some conventional significance levels for small values of $N$ and $M$. The limiting distribution seems a surprisingly good approximation to the exact distribution for moderate sample sizes (corresponding to the same feature for $N\omega^2$ [6]). The accuracy of approximation is better than in the case of the two-sample Kolmogorov-Smirnov statistic studied by Hodges [4].

517 citations
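
A short sketch of how $T$ in (2) can be evaluated. Because $H_{N+M}$ places mass $1/(N+M)$ on each pooled observation, the integral becomes a sum of squared differences over the pooled sample. The function name is hypothetical, and the sketch assumes no tied observations, in line with the continuity assumption.

```python
from bisect import bisect_right

def cvm_two_sample(x, y):
    """T = [NM/(N+M)] * integral of (F_N - G_M)^2 dH_{N+M}, written as a sum."""
    n, m = len(x), len(y)
    xs, ys = sorted(x), sorted(y)
    total = 0.0
    for z in xs + ys:
        f = bisect_right(xs, z) / n  # F_N(z): fraction of first sample <= z
        g = bisect_right(ys, z) / m  # G_M(z): fraction of second sample <= z
        total += (f - g) ** 2
    # Each pooled point carries mass 1/(N+M), hence the squared denominator.
    return n * m / (n + m) ** 2 * total

t_stat = cvm_two_sample([1.0, 3.0], [2.0, 4.0])
```

For the two interleaved samples shown, the squared differences at the four pooled points are 0.25, 0, 0.25 and 0, so `t_stat` equals (4 / 16) * 0.5 = 0.125.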



Book ChapterDOI
TL;DR: It is now generally agreed that in testing for shift in the two-sample problem, certain tests based on ranks have considerable advantage over the classical t-test. It is also being recognized that the optimum properties of the t-test under normality are somewhat illusory and that, under realistic assumptions about extreme observations or gross errors, the t-test in practice may well be less efficient than such rank tests as the Wilcoxon or normal scores test.
Abstract: It is now coming to be generally agreed that in testing for shift in the two-sample problem, certain tests based on ranks have considerable advantage over the classical t-test. From the beginning, rank tests were recognized to have one important advantage: their significance levels are exact under the sole assumption that the samples are randomly drawn (or that the assignment of treatments to subjects is performed at random), whereas the t-test in effect is exact only when we are dealing with random samples from normal distributions. On the other hand, it was felt that this advantage had to be balanced against the various optimum properties possessed by the t-test under the assumption of normality. It is now being recognized that these optimum properties are somewhat illusory and that, under realistic assumptions about extreme observations or gross errors, the t-test in practice may well be less efficient than such rank tests as the Wilcoxon or normal scores test [6], [7].

433 citations



Journal ArticleDOI
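TL;DR: In this article, the relation between weak convergence of a sequence of measures and uniform convergence over certain classes of continuity sets (or uniform convergence of the integrals over certain classes of continuous functions) is studied, and the results are applied to obtain laws of large numbers for random functions and generalizations of the Glivenko-Cantelli lemma.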
Abstract: In this paper the relation between weak convergence of a sequence of measures and uniform convergence over certain classes of continuity sets (or uniform convergence of the integrals over certain classes of continuous functions) is studied. These results are applied to obtain laws of large numbers for random functions and generalizations of the Glivenko-Cantelli lemma.

348 citations


Journal ArticleDOI
TL;DR: In this article, the problem of negative estimates of variance components for all random effects models whose expected mean square column may be thought of as forming a mathematical tree in a certain sense is addressed.
Abstract: The usefulness of variance component techniques is frequently limited by the occurrence of negative estimates of essentially positive parameters. This paper uses a restricted maximum likelihood principle to remove this objectionable characteristic for certain experimental models. Section 2 discusses certain necessary results from the theory of non-linear programming. Section 3 derives specific formulae for estimating the variance components of the random one-way and two-way classification models. The problem of determining the precision of instruments in the two instrument case is dealt with in section 4, and a surprising though not unreasonable answer is obtained. The remaining sections provide an algorithm for solving the problem of negative estimates of variance components for all random effects models whose expected mean square column may be thought of as forming a mathematical tree in a certain sense. The algorithm is as follows: Consider the minimum mean square in the entire array; if this mean square is the root of the tree then equate it to its expectation. If the minimum mean square is not the root then pool it with its predecessor. In either case the problem is reduced to an identical one having one less variable, and hence in a finite number of steps the process will yield estimates of the variance components. These estimates are non-negative and have a maximum likelihood property.

Journal ArticleDOI
TL;DR: In this paper, the sequential sampling rule whereby experiments are performed until the uncertainty is reduced to a preassigned level is studied for various uncertainty functions and experiments, and the problem of optimally choosing the experiments to be performed sequentially from a class of available experiments is considered when the goal is either to minimize the expected uncertainty after a fixed number of experiments or to minimize the expected number of experiments needed to reduce the uncertainty to a fixed level.
Abstract: Consider a situation in which it is desired to gain knowledge about the true value of some parameter (or about the true state of the world) by means of experimentation. Let $\Omega$ denote the set of all possible values of the parameter $\theta$, and suppose that the experimenter's knowledge about the true value of $\theta$ can be expressed, at each stage of experimentation, in terms of a probability distribution $\xi$ over $\Omega$. Each distribution $\xi$ indicates a certain amount of uncertainty on the part of the experimenter about the true value of $\theta$, and it is assumed that for each $\xi$ this uncertainty can be characterized by a non-negative number. The information in an experiment is then defined as the expected difference between the uncertainty of the prior distribution over $\Omega$ and the uncertainty of the posterior distribution. In any particular situation, the selection of an appropriate uncertainty function would typically be based on the use to which the experimenter's knowledge about $\theta$ is to be put. If, for example, the actions available to the experimenter and the losses associated with these actions can be specified as in a statistical decision problem, then presumably the uncertainty function would be determined from the loss function. In Section 2 some properties of uncertainty and information functions, and their relation to statistical decision problems and loss functions, are considered. In Section 3 the sequential sampling rule whereby experiments are performed until the uncertainty is reduced to a preassigned level is studied for various uncertainty functions and experiments. This rule has been previously studied by Lindley, [8], [9], in special cases where the uncertainty function is the Shannon entropy function. 
In Sections 4 and 5 the problem of optimally choosing the experiments to be performed sequentially from a class of available experiments is considered when the goal is either to minimize the expected uncertainty after a fixed number of experiments or to minimize the expected number of experiments needed to reduce the uncertainty to a fixed level. Particular problems of this nature have been treated by Bradt and Karlin [6]. The recent work of Chernoff [7] and Albert [1] on the sequential design of experiments is also of interest in relation to these problems.
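
The definition of information as the expected drop in uncertainty from prior to posterior can be sketched for a finite parameter set and a finite set of outcomes. The Shannon entropy is used as the uncertainty function because the abstract mentions it as a special case; the function names and the likelihood tables are hypothetical.

```python
import math

def entropy(xi):
    """Shannon entropy of a distribution xi over a finite parameter set."""
    return -sum(p * math.log(p) for p in xi if p > 0.0)

def information(prior, likelihood, uncertainty=entropy):
    """likelihood[theta][x] = P(outcome x | theta).
    Returns uncertainty(prior) minus the expected uncertainty of the posterior."""
    n_theta = len(prior)
    n_outcomes = len(likelihood[0])
    expected_posterior = 0.0
    for x in range(n_outcomes):
        marginal = sum(prior[t] * likelihood[t][x] for t in range(n_theta))
        if marginal == 0.0:
            continue  # outcome cannot occur under this prior
        posterior = [prior[t] * likelihood[t][x] / marginal for t in range(n_theta)]
        expected_posterior += marginal * uncertainty(posterior)
    return uncertainty(prior) - expected_posterior

# A noisy binary experiment on two equally likely parameter values.
noisy = information([0.5, 0.5], [[0.8, 0.2], [0.2, 0.8]])
```

Passing a different `uncertainty` function, for instance one derived from a loss function, changes the notion of information without changing the rest of the computation.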

Journal ArticleDOI
TL;DR: In this paper, linear procedures for classifying an observation as coming from one of two multivariate normal distributions are studied in the case that the two distributions differ both in mean vectors and covariance matrices.
Abstract: Linear procedures for classifying an observation as coming from one of two multivariate normal distributions are studied in the case that the two distributions differ both in mean vectors and covariance matrices. We find the class of admissible linear procedures, which is the minimal complete class of linear procedures. It is shown how to construct the linear procedure which minimizes one probability of misclassification given the other and how to obtain the minimax linear procedure; Bayes linear procedures are also discussed.

Journal ArticleDOI
TL;DR: In this paper, the authors provide the mathematical theory for sampling with probability proportional to size (p.p.s.) without replacement by systematic selection on a randomly ordered list, and derive asymptotic expressions for the variance of the estimate of the population total, together with variance estimates applicable for moderate values of $N$. The reduction in variance, as compared to sampling with p.p.s. with replacement, is clearly demonstrated.
Abstract: Given a population of $N$ units, it is required to draw a random sample of $n$ distinct units in such a way that the probability for the $i$th unit to be in the sample is proportional to its "size" $x_i$ (sampling with p.p.s. without replacement). From a number of alternatives of achieving this, one well known procedure is here selected: The $N$ units in the population are listed in a random order and their $x_i$ are cumulated and a systematic selection of $n$ elements from a "random start" is then made on the cumulation. The mathematical theory associated with this procedure, not available in the literature to date, is here provided: With the help of an asymptotic theory, compact expressions for the variance of the estimate of the population total are derived together with variance estimates. These formulas are applicable for moderate values of $N$. The reduction in variance, as compared to sampling with p.p.s. with replacement, is clearly demonstrated.
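
The selection procedure described above (random listing order, cumulation of the $x_i$, systematic selection from a random start) can be sketched as follows. The function name is hypothetical, and the check that no unit is larger than the sampling interval is an assumption added here so that the $n$ selected units are distinct.

```python
import random

def pps_systematic(sizes, n, rng):
    """Draw n distinct unit indices with inclusion probability n * x_i / sum(x)."""
    total = sum(sizes)
    if any(n * x > total for x in sizes):
        raise ValueError("a unit exceeds the sampling interval; units would repeat")
    order = list(range(len(sizes)))
    rng.shuffle(order)               # list the units in a random order
    step = total / n                 # sampling interval on the cumulated sizes
    start = rng.random() * step      # random start in [0, step)
    sample, cumulated, k = [], 0.0, 0
    for unit in order:
        cumulated += sizes[unit]
        # Select the unit whose cumulated interval covers the k-th selection point.
        while k < n and start + k * step < cumulated:
            sample.append(unit)
            k += 1
    return sample

picked = pps_systematic([1, 2, 3, 4], 2, random.Random(0))
```

With sizes 1, 2, 3, 4 and `n = 2`, the inclusion probabilities are 0.2, 0.4, 0.6 and 0.8, which a simulation over many draws should reproduce approximately.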

Journal ArticleDOI
TL;DR: In this paper, a class of rank-order tests that are asymptotically most powerful for a given density is defined, and the asymptotic efficiency of rank-order tests is shown to be presumably never less than that of corresponding parametric tests of Neyman's type.
Abstract: ... $\beta = 0$ against the alternative $\beta > 0$. We suppose that the square root of the probability density $f(x)$ of the residuals $Y_i$ possesses a quadratically integrable derivative and define a class of rank-order tests, which are asymptotically most powerful for given $f$. The main result is exposed in the following succession: theorem, corollaries and examples, comments, preliminaries and proof. The proof is based on results by Hajek [6] and LeCam [8], [9]. Section 6 deals with asymptotic efficiency of rank-order tests, which is shown, on the basis of Mikulski's results [10], to be presumably never less than the asymptotic efficiency of corresponding parametric tests of Neyman's type [11]. This would extend the well-known result obtained by Chernoff and Savage [2] for the Student t-test. Furthermore, it is shown that the efficiency may be negative, i.e., asymptotic power may be less than the asymptotic size. In Section 7 we consider parallel rank-order tests of symmetry for judging paired comparisons. Section 8 is devoted to rank-order tests for densities such that $(f(x))^{1/2}$ does not possess a quadratically integrable derivative. In Section 9, we construct a test which is asymptotically most powerful simultaneously for all densities $f(x)$ such that $(f(x))^{1/2}$ possesses a quadratically integrable derivative. 1. The main theorem. Consider a sequence of random vectors $(X_{\nu 1}, \cdots, X_{\nu N_\nu})$, ... and $\beta$ is the parameter under test. We test the hypothesis $\beta = 0$ against the alternative $\beta > 0$. Assume that the density $f(x) = F'(x)$ exists and that $(f(x))^{1/2}$ possesses a quadratically integrable derivative. As ...

Journal ArticleDOI
TL;DR: In this paper, the distribution of homogeneous and non-homogeneous quadratic functions of normal variables is presented; in geometrical terms, the probability content of a fixed ellipsoid of arbitrary size, location and orientation is evaluated under a multivariate normal distribution.
Abstract: The distribution of homogeneous and non-homogeneous quadratic functions of normal variables is presented. In geometrical terms the probability content is evaluated for a fixed ellipsoid of arbitrary size, location and orientation under an underlying multivariate normal distribution.

Journal ArticleDOI
TL;DR: In this paper, the authors suggest two families of bivariate Pareto distributions with the property that both marginal distributions are of univariate Pareto form, extend these to multivariate forms, and discuss estimation of the parameters in the bivariate distributions.
Abstract: It is well known that the family of Pareto distributions with densities \begin{equation*}\tag{1.1}f(x; a, p) = \begin{cases} pa^p/x^{p+1}, & x > a > 0, \\ 0, & x \leqq a, \end{cases} \qquad p > 0,\end{equation*} provides reasonably good fits to many empirical distributions, e.g., to distributions of income and of property values. In most of these cases, ancillary information is present, which could be utilized if an appropriate multivariate Pareto distribution were available. The objects of this note are (i) to suggest two families of bivariate Pareto distributions with the property that both marginal distributions are of univariate Pareto form; (ii) to extend these to multivariate forms; and (iii) to discuss estimation of the parameters in the bivariate distributions.

Journal ArticleDOI
TL;DR: In this article, two sample rank tests for scale alternatives are considered and compared from the point of view of limiting Pitman efficiency for normal and nonnormal alternatives, and a rank test is proposed for particular alternatives which is most powerful for rectangular densities.
Abstract: This paper is concerned with two sample rank tests for scale alternatives. The two samples are assumed to have continuous distribution functions with the difference in respective location parameters (medians) known. Various rank tests are considered and compared from the point of view of limiting Pitman efficiency for normal and nonnormal alternatives. Among the tests considered is a test with efficiency one relative to the $F$-test for normal alternatives. Tables are given to facilitate its use. Small sample power and efficiency for normal alternatives are computed for equal sample sizes of 5. The small sample efficiency values differ appreciably from the limiting value; this deficiency of power appears to derive from the use of ranks per se rather than from the use of a rank test that is not optimal among rank tests. Lastly, a rank test is proposed for particular alternatives which is most powerful for rectangular densities. It is a simple test which is seen to have surprisingly good power for normal alternatives.

Journal ArticleDOI
TL;DR: In this paper, $s$th ante-dependence of a set of ordered variables is defined, and the corresponding hypothesis $D_s$ is shown, for any set of ordered variables, normal or otherwise, to be equivalent to the $(p - s)(p - s - 1)/2$ elements in the upper right corner of the inverse covariance matrix being zero. Maximum likelihood estimates and likelihood ratio tests under $D_s$ are derived for the normal case.
Abstract: For a set of variables in a given order, $s$th ante-dependence will be said to obtain if each one of the variables, given at least $s$ immediate antecedent variables in the order, is independent of all further preceding variables. If the number of variables is $p$, ante-dependence is of some order between 0 and $p - 1$. 0th ante-dependence and $(p - 1)$st ante-dependence are equivalent to complete independence and to completely arbitrary patterns of dependence, respectively, and are defined irrespective of the ordering of the variables. 1st to $(p - 2)$nd ante-dependence are defined in terms of a specific order only. If $X_1, X_2, \cdots, X_p$ are multivariate normal, $s$th ante-dependence is equivalent to each $X_i$, given $X_{i - 1}, X_{i - 2}, \cdots, X_{i - s}, \cdots, X_{i - s - z}$, being uncorrelated with $X_{i - s - z - 1}, X_{i - s - z - 2}, \cdots, X_2, X_1$ for any non-negative $z$. In other words, the partial correlation of $X_i$ and $X_{i - s - z - k}$, given all the variables $X_{i - 1}, X_{i - 2}, \cdots, X_{i - s - z}$, is zero for all $i, k$ and $z$. The hypothesis that the covariance matrix is such that all the above partial correlations vanish will be denoted by $D_s$ ($s = 0, 1, \cdots, p - 1$), so that, for the multivariate normal distribution, $D_s$ denotes the hypothesis of $s$th ante-dependence. It is shown that for any set of ordered variables, normal or otherwise, $D_s$ is equivalent to the following correspondence between the regression equations of $X_i$ on all other variables, and on $X_{i - s}, X_{i - s + 1}, \cdots, X_{i - 1}, X_{i + 1}, \cdots, X_{i + s}$ only: the multiple correlations are equal, and the regression coefficients of $X_{i - s}, X_{i - s + 1}, \cdots, X_{i + s}$ are equal in both equations, all other coefficients in the former equation being zero. It is also equivalent to the $(p - s)(p - s - 1)/2$ elements in the upper right (and also lower left) corner of the inverse covariance matrix being zero. 
Indeed, any null hypothesis on a set of elements of the inverse covariance matrix may be formulated, and tested, as a hypothesis $D_s$ if the variables can be so ordered as to put the zero elements in the upper right and lower left corners of the inverse. Maximum likelihood estimates are derived under $D_s$ for the normal case. Likelihood ratio tests of any one $D_s$ against any other follow immediately and may be expressed in terms of the sample partial correlations. Exact distributions are not investigated, but for large samples $\chi^2$ approximations are available. Thus a sequence of tests of $D_{p - 2}$ under $D_{p - 1}, D_{p - 3}$ under $D_{p - 2}, \cdots, D_0$ under $D_1$, is obtained which, in effect, forms a breakdown of the large sample test of independence, $D_0$, under the general alternative, $D_{p - 1}$. The assumptions of ante-dependence are clearly analogous to those of Markov processes and autoregressive schemes for time series; the motivation for the study and application of these models is also similar. The present model is more general in that it relaxes the usual autoregression assumptions of equal variances and, more crucial, of equal correlations between all pairs of equidistant variables (distance being meant in terms of the order of the set or time series). This greater generality requires analysis of a sample of observations for the study of ante-dependence, whereas for autoregressive schemes there are methods of analysis based on a single observation of the time series. The ante-dependence models can be generalized to the case of several variables at each stage of the ordering. This would be analogous to the study of multiple time-series. $s$-ante-dependent sets of variables may be generated by $s$ successive summations of independent variables. This may be relevant for some applications of such models. Ante-dependence models might be applicable to observations ordered in time or otherwise. 
Observations on growth of organisms up to each of several ages could be analyzed in such a manner. Where growth is recorded on several dimensions, e.g., height and weight, the analysis would proceed in terms of the multidimensional generalization of the model. Other possible fields of application include batteries of psychological tests increasing in complexity, and data on the successive location of travelling objects. A study of some such applications is now under way.

Journal ArticleDOI
TL;DR: In this article, the Bayes sequential design is obtained for an optimization problem involving the choice between two experiments $A$ and $B$: at each of $N$ stages one experiment is observed, and the procedure which chooses the experiment with higher posterior probability of being correct is proved optimal.
Abstract: The Bayes sequential design is obtained for an optimization problem involving the choice of experiments. Given are experiments $A, B$, densities $p_1, p_2$, a positive integer $N$ and a number $\xi \in \lbrack 0, 1\rbrack$. A sequence of $N$ observations is to be made such that at each stage either $A$ or $B$ is observed, the loss being 1 if the experiment with density $p_2$ is chosen, 0 otherwise. $\xi$ is the prior probability that $A$ has density $p_1$. If the mean of $p_1$ is bigger than the mean of $p_2$ one obtains a more common version of the "two-armed bandit" (see, e.g., [1]). The principal result of this paper is a proof of optimality for the procedure which at each stage chooses the experiment with higher posterior probability of being correct. Some attention is also given to the problem of computing risk functions.
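
The procedure named in the principal result can be sketched as a Bayes update of $\xi$ followed by a choice of the experiment more likely to have density $p_1$. The normal densities with unit variance, the tie-breaking rule at $\xi = 1/2$, and all function names are assumptions made for this illustration.

```python
import math
import random

def normal_pdf(mean):
    """Unit-variance normal density, an assumed stand-in for p_1 or p_2."""
    return lambda x: math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def posterior_update(xi, arm, x, p1, p2):
    """xi = P(A has density p1). If B is observed, B has p2 under that hypothesis."""
    like_h1, like_h2 = (p1(x), p2(x)) if arm == "A" else (p2(x), p1(x))
    return xi * like_h1 / (xi * like_h1 + (1.0 - xi) * like_h2)

def choose_arm(xi):
    """Pick the experiment with higher posterior probability of having density p1."""
    return "A" if xi >= 0.5 else "B"   # tie broken in favor of A (arbitrary)

def run(n_stages, xi, a_has_p1, mean1, mean2, rng):
    """Simulate N stages; the loss counts how often the p2 experiment was chosen."""
    p1, p2 = normal_pdf(mean1), normal_pdf(mean2)
    loss = 0
    for _ in range(n_stages):
        arm = choose_arm(xi)
        arm_has_p1 = a_has_p1 if arm == "A" else not a_has_p1
        loss += 0 if arm_has_p1 else 1
        x = rng.gauss(mean1 if arm_has_p1 else mean2, 1.0)
        xi = posterior_update(xi, arm, x, p1, p2)
    return loss, xi

loss, xi_final = run(50, 0.5, True, 1.0, 0.0, random.Random(1))
```

Because exactly one of the two experiments has density $p_1$, a single number $\xi$ carries the whole posterior, which is what keeps the update a one-line formula.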

Journal ArticleDOI
TL;DR: In this paper, the large-sample limiting shapes of the Bayes sequential testing regions of composite hypotheses are found explicitly, and the result obtained is related to the Sequential Probability Ratio Test in the same way that the likelihood ratio statistic for composite hypotheses is related to the Neyman-Pearson test for simple hypotheses.
Abstract: The large-sample limiting shapes of the Bayes sequential testing regions of composite hypotheses are found explicitly. The result obtained is related to the Sequential Probability Ratio Test in the same way that the likelihood ratio statistic for composite hypotheses is related to the Neyman-Pearson test for simple hypotheses.

Journal ArticleDOI
TL;DR: In this article, the problem of determining the appropriate degree of a polynomial in the index, say time, to represent the regression of the observable variable is formulated in terms used in the theory of testing hypotheses and the optimal procedure is to test in sequence whether coefficients are 0, starting with the highest (specified) degree.
Abstract: On the basis of a sample of observations, an investigator wants to determine the appropriate degree of a polynomial in the index, say time, to represent the regression of the observable variable. This multiple decision problem is formulated in terms used in the theory of testing hypotheses. Given the degree of polynomial regression, the probability of deciding a higher degree is specified and does not depend on what the actual polynomial is (except its degree). Within the class of procedures satisfying these conditions and symmetry (or two-sidedness) conditions, the probabilities of correct decisions are maximized. The optimal procedure is to test in sequence whether coefficients are 0, starting with the highest (specified) degree. The procedure holds for other linear regression functions when the independent variates are ordered. The problem and its solution can be generalized to the multivariate case and to other cases with a certain structure of sufficient statistics.

Journal ArticleDOI
TL;DR: In this article, a density on the space of sample functions of a continuous time Markov process with stationary transition probabilities is constructed; the density depends upon the infinitesimal generator $Q$ of the process, and the maximum likelihood estimate of $Q$ is shown to be strongly consistent and asymptotically normal.
Abstract: Let $\{Z(t), t > 0\}$ be a separable, continuous time Markov Process with stationary transition probabilities $P_{ij}(t), i, j = 1, 2, \cdots, M$. Under suitable regularity conditions, the matrix of transition probabilities, $P(t)$, can be expressed in the form $P(t) = \exp tQ$, where $Q$ is an $M \times M$ matrix and is called the "infinitesimal generator" for the process. In this paper, a density on the space of sample functions over $\lbrack 0, t)$ is constructed. This density depends upon $Q$. If $Q$ is unknown, the maximum likelihood estimate $\hat{Q}(k, t) = \|\hat{q}_{ij}(k, t)\|$, based upon $k$ independent realizations of the process over $\lbrack 0, t)$ can be derived. If each state has positive probability of being occupied during $\lbrack 0, t)$ and if the number of independent observations, $k$, grows larger ($t$ held fixed), then $\hat{q}_{ij}$ is strongly consistent and the joint distribution of the set $\{k^{\frac{1}{2}}(\hat{q}_{ij} - q_{ij})\}_{i \neq j}$ (suitably normalized), is asymptotically normal with zero mean and covariance equal to the identity matrix. If $k$ is held fixed (at one, say) and if $t$ grows large, then $\hat{q}_{ij}$ is again strongly consistent and the joint distribution of the set $\{t^{\frac{1}{2}}(\hat{q}_{ij} - q_{ij})\}_{i \neq j}$ (suitably normalized), is asymptotically normal with zero mean and covariance equal to the identity matrix, provided that the process $\{Z(t), t > 0\}$ is positively regular. The asymptotic variances of the $\hat{q}_{ij}$ are computed in both cases.
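
A sketch of how the estimate $\hat{Q}$ can be computed from observed paths. It uses the familiar closed form in which $\hat{q}_{ij}$ is the number of observed $i \to j$ transitions divided by the total time spent in state $i$; the abstract does not state this formula, so it is an assumption here. The path representation and the function name are hypothetical, and states are indexed from 0 rather than 1.

```python
def estimate_generator(paths, n_states):
    """paths: list of realizations, each a list of (state, holding_time) pairs;
    the last pair of a realization is the segment cut off at the end of observation."""
    counts = [[0] * n_states for _ in range(n_states)]
    occupancy = [0.0] * n_states
    for path in paths:
        for idx, (state, duration) in enumerate(path):
            occupancy[state] += duration
            if idx + 1 < len(path):
                counts[state][path[idx + 1][0]] += 1   # observed jump state -> next
    q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if occupancy[i] == 0.0:
            continue  # state never occupied: its rates cannot be estimated
        for j in range(n_states):
            if i != j:
                q[i][j] = counts[i][j] / occupancy[i]
        q[i][i] = -sum(q[i])   # rows of a generator sum to zero
    return q

# Two hypothetical realizations over [0, 2).
paths = [[(0, 1.0), (1, 0.5), (0, 0.5)], [(1, 2.0)]]
q_hat = estimate_generator(paths, 2)
```

Here state 0 is occupied for 1.5 time units with one jump to state 1, and state 1 for 2.5 units with one jump to state 0, giving off-diagonal estimates 1/1.5 and 1/2.5.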

Journal ArticleDOI
TL;DR: In this paper, it is shown that, for certain correlation structures, the multivariate normal c.d.f. can be expressed as a single integral having a product of lower-order normal c.d.f.'s in the integrand.
Abstract: As has been noted by several authors, when a multivariate normal distribution with correlation matrix $\{\rho_{ij}\}$ has a correlation structure of the form $\rho_{ij} = \alpha_i \alpha_j (i \neq j)$, where $-1 \leqq \alpha_i \leqq + 1$, its c.d.f. can be expressed as a single integral having a product of univariate normal c.d.f.'s in the integrand. The advantage of such a single integral representation is that it is easy to evaluate numerically. In this paper it is noted that the $n$-variate normal c.d.f. with correlation matrix $\{\rho_{ij}\}$ can always be written as a single integral in two ways, with an $n$-variate normal c.d.f. in the integrand and the integration extending over a doubly-infinite range, and with an $(n - 1)$-variate normal c.d.f. in the integrand and the integration extending over a singly-infinite range. We shall show that, for certain correlation structures, the multivariate normal c.d.f. in the integrand factorizes into a product of lower-order normal c.d.f.'s. The results may be useful in instances where these lower-order integrals are tabulated or can be evaluated. One important special case is $\rho_{ij} = \alpha_i \alpha_j$, previously mentioned. Another is $\rho_{ij} = \gamma_i/\gamma_j (i < j)$, where $|\gamma_i| < |\gamma_j|$ for $i < j$. Some applications of these two special cases are given.
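
For the structure $\rho_{ij} = \alpha_i \alpha_j$ the single-integral form can be checked numerically. The sketch below assumes the representation $\int \varphi(t) \prod_i \Phi\big((h_i - \alpha_i t)/\sqrt{1 - \alpha_i^2}\big)\,dt$, which follows from writing each variable as $\alpha_i Z_0 + \sqrt{1 - \alpha_i^2}\,Z_i$ with independent standard normal $Z$'s; it requires $|\alpha_i| < 1$. The function names, the integration limits and the trapezoidal rule are choices made for illustration.

```python
import math

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mvn_cdf_product_structure(h, alpha, lo=-8.0, hi=8.0, steps=4000):
    """P(X_1 <= h_1, ..., X_n <= h_n) when corr(X_i, X_j) = alpha_i * alpha_j."""
    dt = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        t = lo + k * dt
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoidal end weights
        prod = 1.0
        for h_i, a_i in zip(h, alpha):
            prod *= Phi((h_i - a_i * t) / math.sqrt(1.0 - a_i * a_i))
        total += weight * phi(t) * prod
    return total * dt

# Bivariate check with correlation 0.6 * 0.5 = 0.3 at the origin.
value = mvn_cdf_product_structure([0.0, 0.0], [0.6, 0.5])
```

In the bivariate case at the origin the result can be compared with the classical orthant probability $1/4 + \arcsin(\rho)/(2\pi)$.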

Journal ArticleDOI
TL;DR: In this article, the spectral density of a discrete stationary stochastic process is estimated for the case when the observations consist of repeated groups of $\alpha$ equally spaced observations followed by $\beta$ missed observations ($\alpha > \beta$).
Abstract: 0. Summary. Estimating the spectral density of a discrete stationary stochastic process is studied for the case when the observations consist of repeated groups of $\alpha$ equally spaced observations followed by $\beta$ missed observations ($\alpha > \beta$). The asymptotic variance of the estimate is derived for normally distributed variables. It is found that this variance depends not only on the value of the spectral density being estimated, but also on the spectral density at the harmonic frequencies brought in by the periodic method of sampling. Curves are presented for $\beta = 1$ showing the increase in the standard deviation and effective decrease in sample size as a function of $\alpha$. 1. Introduction. When observing a stationary stochastic process at equally spaced intervals of time, it is sometimes necessary to occasionally miss observations for calibration or other purposes. The difficulty of estimating the spectral density in this case is not greatly increased, but in order to determine what is lost by this method of sampling, it is necessary to determine the increase in variance. Given a sample of size $N$, $X_1, X_2, \cdots, X_N$, from a real stationary stochastic process of mean zero and continuous spectral density,

Journal ArticleDOI
TL;DR: In this paper, a treatment of the M/G/1 queue with interruptions of Poisson incidence occasioned either by server breakdown or the arrival of customers with higher priority is given.
Abstract: A treatment is given of the M/G/1 queue with interruptions of Poisson incidence occasioned either by server breakdown or the arrival of customers with higher priority. Interruption times and priority service times have arbitrary distribution. After pre-emptive interruption, ordinary service is either repeated or resumed. The time dependent behavior of the system is discussed in a complete state space and the joint density in all system variables of this space is constructed systematically from the densities associated with a set of simpler first-passage problems. Equilibrium distributions are available as limiting forms and server busy period distributions are obtained.



Journal ArticleDOI
TL;DR: In this paper, Gebelein's maximal correlation and two normalizations of Shannon's mutual information are compared as dependence measures on strictly positive probability spaces, and it is shown that there exist probability spaces for which these measures are not equivalent in the sense of preserving order.
Abstract: Renyi [19] gives a set of seven postulates which a measure of dependence for a pair of random variables should satisfy. Of the dependence measures considered by Renyi, only Gebelein's [5] maximal correlation, $S_P$, satisfies all seven postulates. Kramer [10] in considering the uncertainty principle in Fourier analysis [11] generalizes the Gebelein maximal correlation to the case of arbitrary pairs of $\sigma$-algebras; and asks whether this generalization is equivalent to Shannon's mutual information, $C_P$, [4, 9, 21] for pairs of $\sigma$-algebras--equivalent in the sense of preserving order. The object of this note is to compare $S_P$ and the two normalizations $C'_P$ and $C''_P$, of $C_P$, as dependence measures for strictly positive probability spaces (which are necessarily generated by random variables). It is found that for such spaces with the proper finiteness restrictions (a) (Thm 5.1) $0 \leqq S_P, C'_P, C''_P \leqq 1$; (b) (Thm 5.2) $S_P = 0 \text{ iff } C'_P = 0 \text{ iff } C''_P = 0$ iff the random variables are independent; (c) (Thm 5.4) $S_P = 1$ if the two generated algebras have a nontrivial intersection (the conditions are equivalent for finite algebras); $C'_P = 1$ iff one of the random variables is a function of the other; and $C''_P = 1$ iff the random variables are functions of each other; and, consequently, (d) (Thm 5.5) there exist probability spaces for which the dependence measures are not equivalent. The paper is divided into six sections. Section 1 contains the introduction and summary. Section 2 introduces the terminology, notation and preliminaries. Section 3 treats $S_P$ and the Renyi postulates. In Section 4, the basic Shannon-Feinstein-Khinchin mutual information is extended to strictly positive measure spaces, not necessarily finite. The comparison of the dependence measures and postulate modifications are given in Section 5. Finally, in Section 6 some extensions and open problems are mentioned.
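The unnormalized quantity $C_P$ can be illustrated for a finite joint distribution. The sketch below computes Shannon's mutual information in nats from a joint probability table and shows that it vanishes for an independent pair, in line with statement (b). The function name `mutual_information` and the example tables are hypothetical. The normalizations $C'_P$ and $C''_P$ and the maximal correlation $S_P$ are not defined in the abstract, so they are deliberately left out.

```python
import math


def mutual_information(joint):
    """Shannon mutual information (in nats) of a finite joint probability table.

    joint: list of rows; joint[i][j] = P(X = i, Y = j), entries summing to 1.
    """
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # cells with zero mass contribute nothing
                total += p * math.log(p / (row_marginal[i] * col_marginal[j]))
    return total


# Independent pair: the joint table is the product of its marginals.
independent = [[0.25, 0.25], [0.25, 0.25]]
# Strictly positive, dependent pair (hypothetical numbers).
dependent = [[0.4, 0.1], [0.1, 0.4]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

For two binary variables that are functions of each other, the value equals $\log 2$, the entropy of either variable, which is the kind of upper bound a normalization would rescale to 1.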