
Showing papers in "International Statistical Review in 1973"




Journal Article•DOI•
TL;DR: This monograph treats isotonic regression and estimation under order restrictions, likelihood ratio tests of the equality of ordered means in the normal case together with extensions and generalizations, estimation of distributions, isotonic tests for goodness of fit, and conditional expectation given a sigma-lattice.
Abstract: Contents: Isotonic regression; Estimation under order restrictions; Testing the equality of ordered means--likelihood ratio tests in the normal case; Testing the equality of ordered means--extensions and generalizations; Estimation of distributions; Isotonic tests for goodness of fit; Conditional expectation given a sigma-lattice
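The central computational tool throughout these chapters is the isotonic regression itself, which for a simple linear order is usually computed by the pool-adjacent-violators algorithm. The following is a minimal sketch of that algorithm for equal weights (an illustration of the standard technique, not code taken from the monograph):

```python
def isotonic_regression(y):
    """Pool-adjacent-violators algorithm for a simple linear order with
    equal weights: returns the non-decreasing fit minimizing the sum of
    squared deviations from y."""
    blocks = []  # each block is [mean of pooled values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Pool adjacent blocks while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, n2 = blocks.pop()
            m1, n1 = blocks.pop()
            blocks.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    # Expand the pooled block means back to one fitted value per observation.
    fit = []
    for mean, count in blocks:
        fit.extend([mean] * count)
    return fit

print(isotonic_regression([1.0, 3.0, 2.0, 4.0, 3.5]))  # [1.0, 2.5, 2.5, 3.75, 3.75]
```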

741 citations



Journal Article•DOI•
TL;DR: The authors describe a randomized response technique for estimating the proportion of persons having a certain characteristic A (belonging to group A) without revealing any individual respondent's group. The randomizing instrument is a deck of cards; each card carries one of two statements, occurring with relative frequencies P and 1-P respectively.
Abstract: In sample surveys it is often difficult to get reliable information from the respondents. This is very often the case when the investigation deals with phenomena that are illegal or looked upon as morally condemnable by society, for example illegal abortion or the withholding of income at self-assessment. Refusal to answer and consciously false information commonly cause large systematic errors when estimating parameters of interest. In order to diminish such errors when estimating the proportion of persons having a certain characteristic A (belonging to group A), Warner [3] suggested an interview technique which provides true information with known probability but which does not reveal to which group the respondent belongs. Different techniques to get the information have been suggested by earlier authors. The randomizing instrument to be used here is a deck of cards. On each card there is one of the following two statements, which occur with relative frequencies P and 1-P respectively: 1. I belong to group A. 2. I do not belong to group A.
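Under this design, if pi is the unknown proportion belonging to group A and P differs from 1/2, a truthful respondent answers "yes" with probability lambda = P*pi + (1-P)*(1-pi), so pi can be recovered from the observed proportion of "yes" answers. A minimal sketch of the resulting Warner-type estimator and its variance (the function name and interface are illustrative, not from the paper):

```python
def warner_estimate(n_yes, n, P):
    """Warner randomized-response estimate of the proportion pi in group A.

    n_yes -- number of 'yes' answers
    n     -- number of respondents
    P     -- relative frequency of cards carrying 'I belong to group A'
             (must differ from 0.5 for pi to be identifiable)
    """
    if P == 0.5:
        raise ValueError("P must differ from 0.5")
    lam_hat = n_yes / n                                          # observed 'yes' proportion
    pi_hat = (lam_hat - (1 - P)) / (2 * P - 1)                   # point estimate of pi
    var_hat = lam_hat * (1 - lam_hat) / (n * (2 * P - 1) ** 2)   # estimated sampling variance
    return pi_hat, var_hat

# Example: 420 'yes' answers among 1,000 respondents with P = 0.7
print(warner_estimate(420, 1000, 0.7))  # pi_hat = 0.3
```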

95 citations




Journal Article•DOI•
TL;DR: In this article, the authors examine the extent of digit preference and avoidance in the age statistics of recent African censuses, paying particular attention to identifying some of the social and economic correlates of age heaping.
Abstract: Demographers have long known that the age data compiled by national population censuses are often subject to a number of limitations, particularly in countries that can be characterized as economically underdeveloped. In those instances where age statistics are inadequate, it can generally be attributed to one or both of two basic sources: (1) failure to report ages (especially for very young children who are often not thought of as full members of society); and (2) misstatements of ages that are reported. When the former occurs, and the proportion of unknown ages is significantly large, little can be done unless the demographer has access to other data that would provide a basis for a realistic distribution of unknowns among the population with ages known. In most cases, however, the problem of unknown ages is secondary; and it is errors in the ages reported that constitute the most serious source of bias in age data. Misstatements of age arise from a number of different sources, among the most common of which are ignorance of correct age; a tendency to understate at some ages while exaggerating at others, particularly at the more advanced ages; either a conscious or subconscious preference for certain socially significant ages (such as 21 or 65 in the United States), and a corresponding avoidance of other ages (such as the "unlucky" 13); a general tendency to overselect ages ending in certain digits, such as 0, 2, 5 and 8; and finally, deliberate falsification of age reports for a variety of social, economic, political or purely individual motives. As a result of these tendencies, population age distributions often display a regular pattern of irregularity possessing such characteristics as: (a) a deficiency of infants and very young children; (b) overstatement of seniority among the very oldest ages; (c) overstatement of certain socially significant ages; and (d) overstatement at ages ending in certain preferred digits with a corresponding understatement of ages ending in other digits. This latter is the most common source of error in age reporting, and it leads to a pattern of successive "heaping" throughout an age distribution. The aim of the present paper is to examine the extent of digit preference and avoidance in the age statistics of recent African censuses, paying particular attention to identifying some of the social and economic correlates of age heaping.
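A standard summary of digit preference of this kind (given here for orientation; it is not necessarily the measure used in the paper) is Whipple's index, which compares the number of reported ages ending in 0 or 5 in the range 23-62 with one fifth of all reported ages in that range. A minimal sketch, assuming single-year age counts are available:

```python
def whipples_index(age_counts):
    """Whipple's index of preference for terminal digits 0 and 5.

    age_counts -- dict mapping single-year age to reported count.
    Returns roughly 100 when there is no heaping on digits 0 and 5,
    rising towards 500 as heaping becomes total.
    """
    ages = range(23, 63)                     # conventional age range 23-62
    total = sum(age_counts.get(a, 0) for a in ages)
    on_0_or_5 = sum(age_counts.get(a, 0) for a in ages if a % 5 == 0)
    return 100 * on_0_or_5 / (total / 5)
```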

33 citations


Journal Article•DOI•
TL;DR: Two bivariate probability models are studied: one for the joint distribution of the number of accidents and the number of fatal accidents, and one for the joint distribution of the number of accidents and the number of fatalities.
Abstract: which may lead to one or more fatalities. It is clear that the number of fatal accidents Y and the number of fatalities Z are both positively correlated with the number of accidents X. What is unclear, however, is how sometimes a major accident involving large property damage is not fatal while a minor accident may be fatal. This reflects the uncontrollable nature of those factors which cause traffic fatalities. This suggests that a probability model might be used to express the relationship between the number of accidents and the number of fatal accidents and the relationship between the number of accidents and the number of fatalities. In this investigation, two bivariate probability models will be studied: one for the joint distribution of the number of accidents and the number of fatal accidents; and one for the joint distribution of the number of accidents and the number of fatalities. In both models, the number of accidents X at a certain location during a given time interval is assumed to
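One simple way to make such a bivariate structure concrete (purely an illustration; the abstract above is cut off before the authors' actual specification) is to take the number of accidents X as Poisson and to let each accident be fatal independently with some small probability, so that the number of fatal accidents Y given X is binomial. A sketch of simulating (X, Y) pairs under these assumptions:

```python
import random

def simulate_accidents(mean_accidents, p_fatal, periods, seed=0):
    """Simulate (accidents X, fatal accidents Y) over several periods,
    assuming X ~ Poisson(mean_accidents) and Y | X ~ Binomial(X, p_fatal).
    An illustrative model only, not the paper's fitted specification."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(periods):
        # Poisson draw: count exponential inter-arrival times falling in [0, 1).
        x, t = 0, rng.expovariate(mean_accidents)
        while t < 1.0:
            x += 1
            t += rng.expovariate(mean_accidents)
        y = sum(rng.random() < p_fatal for _ in range(x))  # fatal accidents among the x
        pairs.append((x, y))
    return pairs

print(simulate_accidents(mean_accidents=4.0, p_fatal=0.05, periods=5))
```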

29 citations



Journal Article•DOI•
TL;DR: In this paper, the authors present a survey of the problem of comparing maximum likelihood estimation with other estimation procedures and the related problems that arise by reason of the existence of inconsistent maximum likelihood estimates, and of other estimators which are superefficient.
Abstract: In Part 2 of this survey the discussion is mainly confined to the problem of comparing maximum likelihood estimation with other estimation procedures and the related problems that arise by reason of the existence of inconsistent maximum likelihood estimates, and of other estimators which are superefficient, with regard to which special reference is made to the work of C. R. Rao (1961a and 1962a). Initially, however, there is a further discussion of Cramer's conditions as they relate to consistency and efficiency, and there is also a section on the general multiparameter situation. All references are contained within the bibliography at the end of Part 1 (Int. Statist. Rev., Vol. 40, No. 3, 1972, pp. 329-354), but a number of important definitions are given separately in the Appendix at the end of this paper.
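The classic example of a superefficient estimator in this literature is Hodges' estimator of a normal mean, stated here only for orientation (it is not quoted from the survey). With \bar{X}_n the mean of n observations from N(\theta, 1), define

    T_n =
    \begin{cases}
    \bar{X}_n, & |\bar{X}_n| \ge n^{-1/4},\\
    a\,\bar{X}_n, & |\bar{X}_n| < n^{-1/4},
    \end{cases}
    \qquad 0 \le a < 1.

Then \sqrt{n}\,(T_n - \theta) is asymptotically N(0, 1) for every \theta \neq 0 but asymptotically N(0, a^2) at \theta = 0, so the asymptotic variance falls below the Cramér-Rao bound at a single point.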

25 citations



Journal Article•DOI•
TL;DR: The authors consolidate a method of controlled simple random sampling which reduces the risk of obtaining a non-preferred sample to the minimum possible extent and yet provides an estimate at least as efficient as under simple random sampling.
Abstract: mechanism of stratification, it may still be necessary to further control the selection of the sample within each stratum. The techniques of cluster sampling and subsampling can be used within a stratum to minimize the selection of non-preferred samples. However, they may result in loss of precision of the estimated characteristic under study. Attempts were therefore made to propose new methods of controlled selection. Goodman and Kish were the first to suggest a technique of controlled selection. The idea of controlled selection is doubtless attractive but needs to be further developed before it can be used to obtain estimates of known precision. The authors [2] have developed a technique of controlled sampling with equal probabilities and without replacement which reduces the risk of obtaining a non-preferred sample to the minimum possible extent and yet provides an estimate which is at least as efficient as in the case of simple random sampling. To further enhance the applicability of this technique, the authors [3] have proposed some simplified procedures of controlled selection. At times, even these may appear to be time consuming. Avadhani [4] has suggested a new approach to obviate this difficulty altogether. The results based on this approach, which ultimately lead to the method of controlled simple random sampling, are consolidated in this paper in an integrated fashion and utilised to present a suitable sampling mechanism for controlled selection.

Journal Article•DOI•
TL;DR: A historical account is given of the development of the concepts of entropy, communication theory and their relation to statistical information and an attempt is made to remove misunderstandings.
Abstract: Summary A historical account is given of the development of the concepts of entropy, communication theory and their relation to statistical information. There are many misunderstandings about the nature of information as often defined in statistical theory and an attempt is made to remove them. 1. By the middle of the nineteenth century heat was accepted as a mode of molecular motion. The law which asserts the fact, the so-called First Law of Thermodynamics, has the important implication that in a conservative system wherein the energy remains constant, there is a loss of usable energy as temperatures within the system approach equality. The


Journal Article•DOI•
TL;DR: This paper deals with the problem of estimating the probability density function (p.d.f.) of a random variable (r.v.) based on a random number of observations, and investigates the asymptotic properties of a class of kernel estimates of the p.d.f. of the type introduced by Parzen (1962).
Abstract: In this paper we deal with the problem of estimating the probability density function (p.d.f.) of a random variable (r.v.) based on a random number of observations. Parzen (1962) has investigated the asymptotic properties of a class of estimates f̂_n(x) of the p.d.f. f(x) of a r.v. X based on n independent observations X1, ..., Xn. The estimate f̂_n(x) is defined by
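For reference, the Parzen (1962) class of kernel estimates referred to here has the standard form (the abstract is truncated before the displayed definition)

    \hat{f}_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h_n}\right),

where K is a kernel (weight) function integrating to one and h_n is a bandwidth sequence satisfying, under the usual regularity conditions, h_n -> 0 and n h_n -> infinity; the present paper studies the analogous estimate when the fixed sample size n is replaced by a random number of observations.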

Journal Article•DOI•
TL;DR: This article describes an interactive computer-based system for assisting investigators on a step-by-step basis in the use of a particular analytic tool: Bayesian analysis using the two-parameter normal model.
Abstract: Few scientific investigators can maintain expertise in statistical methods. They may have had substantial training in, understanding of, and competence in statistical methods; but they are unlikely to exercise these skills often enough to maintain them at a high level of proficiency. These investigators can use and are typically receptive to guidance in their statistical work provided they remain in control of their own analyses. Investigators with lesser statistical skills will benefit from even more directive guidance through the maze of detail required of good statistical practice. For all investigators, the tedium of computation or, alternatively, the maintenance of esoteric computer expertise is a regrettable hindrance to their function of extracting meaning from data. The system described here is an interactive computer-based system for assisting investigators on a step-by-step basis in the use of a particular analytic tool: Bayesian analysis using the two-parameter normal model. The example is meant to be suggestive of the kinds of computer-assisted data analysis programs that can be developed for use by scientific investigators. Programs such as these can also be used in the classroom and laboratory for teaching purposes, but beyond this, they can be used by the practising scientist in his day-to-day work.
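The underlying calculation in such a system, conjugate Bayesian updating for normal data with both mean and variance unknown, can be sketched as follows. This is a minimal illustration assuming a normal-inverse-gamma prior; the actual system's interface and parameterization are not described in the abstract.

```python
def update_normal_inverse_gamma(data, mu0, kappa0, alpha0, beta0):
    """Posterior hyperparameters for the two-parameter normal model under the
    conjugate normal-inverse-gamma prior:
        sigma^2 ~ Inv-Gamma(alpha0, beta0),  mu | sigma^2 ~ N(mu0, sigma^2 / kappa0).
    Returns (mu_n, kappa_n, alpha_n, beta_n)."""
    n = len(data)
    xbar = sum(data) / n
    ss = sum((x - xbar) ** 2 for x in data)        # within-sample sum of squares
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2
    beta_n = beta0 + ss / 2 + kappa0 * n * (xbar - mu0) ** 2 / (2 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n

# Example: a vague prior updated with five observations
print(update_normal_inverse_gamma([4.1, 3.8, 4.4, 4.0, 3.9],
                                  mu0=0.0, kappa0=0.01, alpha0=0.5, beta0=0.5))
```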

Journal Article•DOI•
TL;DR: In this paper, the major developments of sampling inspection procedures for continuous production processes are reviewed from an economic point of view, together with a discussion of some practical points for consideration in the selection of an inspection procedure.
Abstract: This paper studies the major developments of the economic design of inspection procedures for continuous production. Such procedures are mainly proposed for the aim of trouble shooting, and can be classified under two headings: itemwise inspection and grouped inspection. In the first class, more papers have been written on the surveillance of the proportion of defectives than on the surveillance of a "variable" quality characteristic, and it is seen that a majority of the papers has been influenced by Girshick and Rubin's (1952) work. Some of the more important procedures are presented. In the second class, the papers reviewed deal exclusively with the surveillance of a "variable" quality characteristic, using control charts. It is seen that, among these papers, Duncan's contributions (1956, 1971) are more important and practical. Comments are made on the advantages and limitations of each procedure. In the last section there is a discussion on some practical points for consideration in the selection of an inspection procedure. Finally, the paper points out some desirable research areas for future study. In this paper we review some major developments of sampling inspection procedures for continuous production processes, from an economic point of view. Products are being made continuously on a conveyor belt and an inspection station is located somewhere in the line, possibly at a point after a production stage that has a good chance of causing a variation in the quality. It is often not desirable to group the product artificially into batches in order to apply the batch-by-batch inspection plans because batching under such circumstances may interfere with the production efficiency, require extra storage space, lead to rejecting articles not yet produced, and the like. It is more practical to inspect items (itemwise sampling) or to inspect groups of several items (grouped sampling) at a certain rate and to diagnose the condition of the production in progress. The form of a continuous sampling inspection procedure is more complex than the batch-by-batch inspection scheme, and each such procedure has a restricted area of application.

Journal Article•DOI•
TL;DR: In this paper, a reliability programming approach is built into the intertemporal quadratic system known as the HMMS model, which is well known for its linear decision rule method for production scheduling.
Abstract: Summary Using a chance-constrained interpretation of the sales constraint, a reliability programming approach is built into the intertemporal quadratic system known as the HMMS model, which is well known for its linear decision rule method for production scheduling. Our empirical results tend to confirm the tendency for the optimal decision variables to converge very fast to their steady-state values.
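The chance-constrained interpretation mentioned here replaces a probabilistic sales constraint by a deterministic equivalent. In the simplest textbook form (given only for orientation, not as the paper's exact formulation), if sales S_t are normally distributed and inventory plus production must cover sales with reliability alpha, then

    P\bigl(I_{t-1} + P_t - S_t \ge 0\bigr) \ge \alpha
    \iff
    I_{t-1} + P_t \ge \mu_{S_t} + z_\alpha \sigma_{S_t},

where z_\alpha is the standard normal alpha-quantile, so the stochastic constraint enters the quadratic programming problem as an ordinary linear constraint.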


Journal Article•DOI•
TL;DR: The paper summarizes the four objectives of a survey evaluation programme: to measure the accuracy of survey results in order to guide users; to analyse sources of error with a view to subsequent improvement; to evaluate alternative methods of survey design; and to institute a continuing control procedure.
Abstract: Summary The paper summarizes the four objectives of a survey evaluation programme: to measure the accuracy of survey results in order to guide users; to analyse sources of error with a view to subsequent improvement; to evaluate alternative methods of survey design; to institute, in the case of a regular, periodic survey, a continuing control procedure. Much of the paper is devoted to the summarization of concrete Canadian experiences with respect to the evaluation of different specific sources of error: sampling errors, coverage errors, response errors, non-response errors, processing errors.

Journal Article•DOI•
TL;DR: This paper specifies a small model, estimates it using preliminary national accounting data for each of 21 countries, and then re-estimates the same model for the same time period using revised data, in order to present some results on how measurement errors affect alternative estimators.
Abstract: The question of what procedure to choose for estimating a given simultaneous-equation model is a basic one in applied econometrics. Considerations of computing cost, small-sample properties of estimators, and the sensitivity of different estimators to specification error are all relevant. Also relevant is the sensitivity of estimators to errors of observation or measurement. Do measurement errors in the data tend to distort some estimators more than others? How important in practice is the problem of measurement error in relation to the more widely discussed problem of choosing among alternative estimators? The purpose of this paper is to present some results bearing on both of these questions. There are two ways of approaching the questions. One is to carry out Monte Carlo experiments with artificial data. The other, and this is the one followed here, is to seek and make use of real-world data for which there is some information about actual errors of measurement. Both approaches have advantages and disadvantages. The particular advantage of the second approach is obvious: it produces results based on actual errors rather than artificial ones for which distributions must be assumed. On the other hand, a major disadvantage is the scarcity of data for which anything is known about errors. The presence of measurement error in economic statistics is widely recognized, of course, but specific information is seldom available. One case for which some information is available is the case of preliminary national accounting data. The United Nations Statistical Office compiles and publishes, in standardized form, annual national accounts estimates for a large number of countries. Each year the estimates for previous years may be modified by the national statistical agencies reporting to the U.N. as new or improved information becomes available. Undoubtedly, even the most thoroughly revised figures are still less than perfect but presumably they are better than the preliminary ones. In this sense, the differences between preliminary and revised data may be viewed as measurement errors of a kind. What we have done in this study is to specify a small model, estimate this model using preliminary national accounting data for each of 21 countries, and then re-estimate the same model for the same time period using revised data. The characteristics of the data revisions themselves and their effects on parameter estimates obtained by two-stage least squares have been reported in detail in Denton and Oksanen [4]. In the present study, we have estimated the model by ordinary least squares (OLS) and three-stage least squares (3SLS), as well as two-stage least squares (2SLS). The focus here is on (a) the relative sensitivity to

Journal Article•DOI•
TL;DR: This article surveys recent efforts to develop the theory of stochastic integral equations, using as the principal tools the methods of probability theory, functional analysis and topology.
Abstract: Our aim in this presentation is to introduce the theoretical and applied scientists to the area of Stochastic Integral Equations. We hope to convey the manner in which such equations arise, the mathematical difficulties encountered, their usefulness to life sciences and engineering, and a survey of the concerned efforts in recent years to develop the theory of stochastic integral equations using as the principal tools the methods of probability theory, functional analysis and topology. Due to the nondeterministic nature of phenomena in the general areas of the biological, engineering, oceanographic and physical sciences, the mathematical descriptions of such phenomena frequently result in random or stochastic equations. These equations arise in various ways, and in order to understand better the importance of developing the theory of such equations and its application, it is of interest to consider how they arise. Usually the mathematical models or equations used to describe physical phenomena contain certain parameters or coefficients which have specific physical interpretations, but whose values are unknown. As examples, we have the diffusion coefficient in the theory of diffusion, the volume-scattering coefficient in underwater acoustics, the coefficient of viscosity in fluid mechanics, the propagation coefficient in the theory of wave propagation, and the modulus of elasticity in the theory of elasticity, among others. The mathematical equations are solved using as the value of the parameter or coefficient the mean value of a set of observations experimentally obtained. However, if the experiment is performed repeatedly, then the mean values found will vary, and if the variation is large, the mean value actually used may be quite unsatisfactory. Thus in practice the physical constant is not really a constant, but a random variable whose behaviour is governed by some probability distribution. It is thus advantageous to view these equations as being random rather than deterministic, to search for a random solution, and to study its statistical properties. There are many other ways in which random or stochastic equations arise. Stochastic differential equations appear in the study of diffusion processes and Brownian motion (I. I. Gikhmann and A. V. Skorokhod [24]). The classical Ito random integral equation (K. Ito [27]) may be found in many texts, for example in Doob [19], which is a Stieltjes integral with respect to the Brownian motion process. Integral equations with random kernels arise in random eigenvalue problems (A. T. Bharucha-Reid [11]). Stochastic integral equations
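For orientation, the classical Ito random integral equation mentioned above can be written in its standard form (not quoted from this paper) as

    X(t) = X(0) + \int_0^t a\bigl(s, X(s)\bigr)\,ds + \int_0^t b\bigl(s, X(s)\bigr)\,dW(s),

where W is a Brownian motion process and the second integral is the Ito stochastic integral, a Stieltjes-type integral taken with respect to the Brownian motion.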


Journal Article•DOI•
TL;DR: The need to allow for the dynamic nature of adjustment patterns in economic behavior has been a primary influence on the development of distributed lag models where the results of a change in an independent variable are spread over a number of time periods.
Abstract: The need to allow for the dynamic nature of adjustment patterns in economic behaviour has been a primary influence on the development of distributed lag models where the results of a change in an independent variable are spread over a number of time periods. While there are a variety of estimators recommended for these models on the basis of their asymptotic characteristics, it has proved difficult to derive the small sample properties of these techniques analytically. Accordingly, over the past decade there have been a number of Monte Carlo studies examining their respective small sample performance patterns. The literature is fairly diverse and the experiments are distinguished by a variety of design attributes. Consequently it is difficult "to take stock" of what has been established by this research. The purpose of this paper is to provide a brief tabular review of most of the Monte Carlo studies in this area. Section II discusses the models and estimators used in each of the ten studies summarized in Table I. Section III notes the salient results of these papers and indicates areas which deserve additional attention.
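A representative specification among the distributed lag models compared in these Monte Carlo studies (shown only to illustrate the general form, not as any one study's design) is the geometric, or Koyck, lag

    y_t = \alpha + \beta \sum_{i=0}^{\infty} \lambda^{i} x_{t-i} + u_t,
    \qquad 0 < \lambda < 1,

which after the Koyck transformation becomes y_t = \alpha(1-\lambda) + \beta x_t + \lambda y_{t-1} + (u_t - \lambda u_{t-1}); the combination of a lagged dependent variable with a moving-average error in this form is what makes small-sample estimation difficult and motivates much of this literature.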

Journal Article•DOI•
TL;DR: In this paper, the Cauchy-Schwarz inequality is used to derive the relation between the statistics for individual functions and the sums of squares for linear sets of functions.
Abstract: 1. Introduction and Summary An elementary proof of some basic relationships in analysis of variance is presented in this paper. This approach is also shown to yield fundamental results in testing and simultaneous inference. The common textbook solution of linear least-squares methods derives the requisite normal equations either by partial differentiation or by appeals to vector spaces and their orthogonal bases (Scheffé, 1959, section 1.3). In either case it is necessary first to formulate a linear model for the expectations. The presentation of simultaneous confidence intervals usually involves a geometric argument which yields the relation between the statistics for individual functions and the sums of squares for linear sets of functions (Scheffé, 1953 - see also Scheffé, 1959, Appendix III). It is shown here how both these results can be derived simultaneously by an application of the Cauchy-Schwarz inequality - the Theorem of section 2. This approach hinges on the equality of the maximum of the "squares" for single functions and the minimum "sum of squares" for a set of such functions. The equalities and inequalities used in simultaneous inference follow naturally and do not require separate treatment. The present derivation is simpler mathematically than the common ones. Moreover, it does not require postulation of a model on variables and expectations but stresses the algebraic structure of the statistics. This may be useful in yielding data-analytic insights. For these reasons, the present approach is likely to be didactically preferable to the common one, at least at the introductory level. (For more advanced students the geometric approach no doubt
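The key inequality underlying this derivation, stated here in a generic form rather than in the paper's own notation, is that for any positive definite matrix V and any vector x,

    \max_{a \neq 0} \frac{(a' x)^2}{a' V a} = x' V^{-1} x,
    \qquad \text{with the maximum attained at } a \propto V^{-1} x,

which follows from the Cauchy-Schwarz inequality applied to V^{1/2}a and V^{-1/2}x. It is this identity that ties the squared statistic for a single linear function to the sum of squares for the whole linear set, and hence yields Scheffé-type simultaneous intervals.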

Journal Article•DOI•
TL;DR: The authors review two consumption function studies based on data from a large number of countries, those of Hohenbalken and Tintner and of Yang, before discussing the theoretical models employed in their own analysis.
Abstract: Most tests of recent consumption function hypotheses have been confined to a single country or, in some cases, to a very small number of countries. Two studies which do involve data drawn from a large number of countries are those of Hohenbalken and Tintner and Yang. We first consider their work, and then discuss the theoretical models employed in our analysis. The Hohenbalken-Tintner study was, in significant measure, aimed at the estimation of policy models and it did not purport to test complex consumption hypotheses. Hohenbalken and Tintner assume a simple version of the "Keynesian" consumption function, involving gross national product as an explanatory variable, although they supplement the latter with lagged gross national product and with a trend variable. The variables are in constant prices per capita. In general, the additional explanatory variables were found to be statistically nonsignificant. Yang was concerned mainly with testing basic "Keynesian" consumption theory with data from both developing countries and industrially advanced ones. Yang's analysis encompassed eighteen countries, and used United Nations data for the period 1950-1959. The variables were defined per capita in constant prices. Two forms were tested, with disposable income as an explanatory variable supplemented with a change-in-income variable. The latter was seldom significant, and assumed a negative sign in the majority of cases. (Its sign was positive in several instances including Austria, Belgium, Sweden, the United Kingdom and the United States.) Further analysis suggested that there was a direct relationship between the stability of growth in disposable income and overall goodness of fit.
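For concreteness, the Hohenbalken-Tintner specification described here amounts to an equation of the form (written out only as an illustration of the variables listed, not as their exact estimating equation)

    c_t = \alpha + \beta_0 y_t + \beta_1 y_{t-1} + \gamma t + u_t,

with c_t per-capita consumption and y_t per-capita gross national product, both in constant prices, and t a trend variable; the finding reported above is that the additional variables y_{t-1} and t were generally not statistically significant.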

Journal Article•DOI•
TL;DR: A survey of the statistical methods used in environmental studies and some unsolved statistical problems is presented in this article, where the authors present a two-man sampler of the available literature.
Abstract: "Washingtonians do not need statisticians to tell them what air pollution does to their property. Anyone who has spent a summer in the District knows how difficult it is to keep houses, cars, clothes, and window sills unsoiled." This quotation from Esposito [28] suggests that environmental problems are so obvious that the added help of statisticians in attempts to point out these problems is minimal. Some problems are not quite so obvious, and in some of them we think statisticians can be of great help. This paper surveys some of the statistical methods used in environmental studies and indicates some unsolved statistical problems. Some of the problems are pressing since regulatory bodies are called upon to set standards that more and more incorporate statistical and probabilitistic aspects (see section 2 on air pollution and noise standards) and these must be based on the soundest data evaluated in the best way, i.e. statistics. The views expressed in this paper are those of two statisticians interested in the environment and its problems. The paper should be considered a two-man sampler. Two books are available that are in a sense more complete. One is a report of the Environmental Pollution Panel [26] of the President's Science Advisory Committee chaired by J. W. Tukey. The second, the report of a Task Force on Research Planning in Environmental Health Science [101], chaired by N. Nelson lists some research needs; there is a long chapter discussing epidemiological and biometrical problems and research needs in the environmental health sciences. Both books are useful in pointing out statistical aspects of pollution and the environment. It has not been possible to fully evaluate all the literature the authors have limited expertise and at times a paper will be cited without critical comments. Since this is a survey paper we cannot go into many details some would say we do not go into sufficient detail. However, a major goal of the paper is to inform statisticians of the existence of a substantial literature. Finally, this paper is written in the U.S.A. with some resultant bias in the selection of the topics and problems. However, both authors participated in the IASPS Symposium on Statistical Aspects of Pollution Problems and came away with the feeling that the technical problems are similar the world over. One of the major components in the pollution web is overpopulation. We will not deal with this problem directly; we feel that demography is a well developed branch of applied science in which demographers have already made substantial contributions to understanding the problems from a population point of view. 41/3--B