Author

Pranab Kumar Sen

Bio: Pranab Kumar Sen is an academic researcher from the University of North Carolina at Chapel Hill. The author has contributed to research in topics: Estimator & Nonparametric statistics. The author has an h-index of 51 and has co-authored 570 publications receiving 19,997 citations. Previous affiliations of Pranab Kumar Sen include Indian Statistical Institute & Academia Sinica.


Papers
Journal ArticleDOI
TL;DR: In this article, a Kolmogorov-Smirnov type test based on the aligned observations is considered and its properties are studied for testing the hypothesis that two (symmetric) distributions may differ only in locations.

4 citations

Journal ArticleDOI
TL;DR: In this paper, a critical appraisal of the basic role of jackknifing in shrinkage estimation is made, and the usual versions of jackknifed shrinkage estimates are considered and their performance characteristics are studied.
Abstract: In multi-parameter (multivariate) estimation, the Stein rule provides minimax and admissible estimators, compromising generally on their unbiasedness. On the other hand, the primary aim of jackknifing is to reduce the bias of an estimator (without necessarily compromising on its efficacy), and, at the same time, jackknifing provides an estimator of the sampling variance of the estimator as well. In shrinkage estimation (where minimization of a suitably defined risk function is the basic goal), one may wonder how far the bias-reduction objective of jackknifing incorporates the dual objective of minimaxity (or admissibility) and estimating the risk of the estimator? A critical appraisal of this basic role of jackknifing in shrinkage estimation is made here. Restricted, semi-restricted and the usual versions of jackknifed shrinkage estimates are considered and their performance characteristics are studied. It is shown that for Pitman-type (local) alternatives, usually, jackknifing fails to prov...

4 citations
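The two roles of jackknifing named in the abstract above (bias reduction and variance estimation) can be illustrated with the classical delete-one jackknife. This is a minimal sketch of the textbook procedure for a generic estimator, not the restricted or shrinkage versions studied in the paper; the function name `jackknife` and its arguments are hypothetical.

```python
def jackknife(data, estimator):
    """Delete-one jackknife for a generic estimator (illustrative sketch).

    Returns (bias_corrected_estimate, variance_estimate).
    `estimator` is any function mapping a list of numbers to a number.
    """
    data = list(data)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")

    theta_full = estimator(data)
    # Leave-one-out estimates: recompute the estimator with observation i removed.
    loo = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n

    # Bias-reduced estimate: n * theta_full - (n - 1) * mean of leave-one-out values.
    bias_corrected = n * theta_full - (n - 1) * loo_mean
    # Jackknife estimate of the sampling variance of the estimator.
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)
    return bias_corrected, variance


def plug_in_variance(xs):
    """Biased (divide-by-n) variance, used here only as a test estimator."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)
```

Applied to the divide-by-n variance, the bias-corrected value coincides with the usual unbiased sample variance, which is a convenient sanity check of the bias-reduction step.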

Book ChapterDOI
TL;DR: In this paper, the authors present the major issues concerning the use of the Hamming distance and its generalizations in bioinformatics and present a large class of quasi U-statistics for which desirable asymptotic properties are attainable under mild regularity conditions.
Abstract: We present the major issues concerning the use of the Hamming distance and its generalizations in bioinformatics. Hamming distance type measures have enjoyed a perennial usage in the fields of biodiversity, genetics, ecology, among many others. Bioinformatics data bring new challenges to statistical procedures. Dependence among variates is presented in stochastic as well as functional forms. The classical asymptotic paradigm of very large data sets formed by a small number of variates is usually inappropriate in bioinformatics data, in which usually one expects a few observations each of very high dimension. Thus, parametric modeling may be unsuitable in many situations in bioinformatics and its shortcomings may appear hidden as spurious statistical artifacts. The Hamming distance, in its classical or modified versions, is a very powerful tool for the statistical analysis of such data. Its functional straightforwardness results in fast calculations even for very large data sets, and interpretations are easily obtained. Moreover, we can successfully employ Hamming distance based test statistics for studies concerning population heterogeneity. These statistics belong to a large class called quasi U-statistics for which desirable asymptotic properties are attainable under mild regularity conditions. The use of generalized Hamming distance is exemplified by a real DNA data set.

4 citations
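As a rough illustration of the classical Hamming distance mentioned in the abstract above, the sketch below counts mismatched positions between equal-length sequences and averages that count over all pairs in a sample. The averaging step is an ordinary pairwise (U-statistic style) summary chosen for illustration; it is an assumption and not the generalized distance or test statistic from the chapter. The function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_hamming(seqs):
    """Average Hamming distance over all unordered pairs in a sample."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)
```

Each distance costs one pass over the positions, which is consistent with the abstract's remark that the computation stays fast for high-dimensional observations.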

Journal ArticleDOI
TL;DR: In this article, Rao's (modified) scores statistics in conjunction with an alignment procedure are incorporated in the formulation of suitable randomisation tests which avoid the complications of likelihood based parametrics and exploit the Chatterjee-Sen (multivariate) permutation principle for such shape distributions.
Abstract: Based on independent samples, tests for homogeneity of distributions for compositional and directional data models are considered. Rao's (modified) scores statistics in conjunction with an alignment procedure are incorporated in the formulation of suitable randomisation tests which avoid the complications of likelihood based parametrics and exploit the Chatterjee-Sen (multivariate) permutation principle for such shape distributions. Our proposed tests amend readily to other directional data models. Finite as well as large sample size properties of the proposed tests are presented.

4 citations

Book ChapterDOI
01 Jan 2002
TL;DR: This paper addresses some basic statistical issues prevailing in the context of health related quality of life assessment problems.
Abstract: In the broad range WHO interpretation of Quality of Life, albeit made from an individualistic perspective, there are numerous qualitative factors along with some relatively more quantitative ones which are useful in the context of health related quality of life assessment problems. Though item analysis is commonly used in practice for (quantitative) risk assessment, for drawing valid conclusions, statistical reasoning is essential. Quality of life, survival time and quality adjusted life are important (population-based) measures that need to be appraised in light of statistical and health-related undercurrents. This paper addresses some basic statistical issues prevailing in this context.

4 citations


Cited by
Journal ArticleDOI
TL;DR: A nonparametric approach to the analysis of areas under correlated ROC curves is presented, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
Abstract: Methods of evaluating and comparing the performance of diagnostic tests are of increasing importance as new tests are developed and marketed. When a test is based on an observed variable that lies on a continuous or graded scale, an assessment of the overall value of the test can be made through the use of a receiver operating characteristic (ROC) curve. The curve is constructed by varying the cutpoint used to determine which values of the observed variable will be considered abnormal and then plotting the resulting sensitivities against the corresponding false positive rates. When two or more empirical curves are constructed based on tests performed on the same individuals, statistical analysis on differences between curves must take into account the correlated nature of the data. This paper presents a nonparametric approach to the analysis of areas under correlated ROC curves, by using the theory on generalized U-statistics to generate an estimated covariance matrix.

16,496 citations
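The area under an empirical ROC curve equals a two-sample Mann-Whitney type statistic, which is what makes the U-statistic theory mentioned in the abstract above applicable. The sketch below computes only that area for a single test; it does not estimate the covariance matrix for correlated curves. It assumes that larger values of the observed variable indicate abnormality, and the function name is hypothetical.

```python
def empirical_auc(abnormal, normal):
    """Empirical area under the ROC curve via pairwise comparison.

    Assumes higher scores point towards 'abnormal'. Ties count one half.
    """
    if not abnormal or not normal:
        raise ValueError("both groups must be non-empty")
    total = 0.0
    for x in abnormal:
        for y in normal:
            if x > y:
                total += 1.0      # abnormal subject correctly ranked higher
            elif x == y:
                total += 0.5      # tie contributes half
    return total / (len(abnormal) * len(normal))
```

A value of 1.0 means the two groups are perfectly separated, and 0.5 corresponds to a test that ranks a random abnormal subject above a random normal subject only half the time.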

Journal Article
TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.
Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the "why," and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations

Book
21 Mar 2002
TL;DR: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. It covers both classical and Bayesian philosophies of estimation and hypothesis testing before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced.
Abstract: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.

9,509 citations

Journal ArticleDOI
TL;DR: In this paper, it was shown that a simple FDR controlling procedure for independent test statistics can also control the false discovery rate when test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses.
Abstract: Benjamini and Hochberg suggest that the false discovery rate may be the appropriate error rate to control in many applied multiple testing problems. A simple procedure was given there as an FDR controlling procedure for independent test statistics and was shown to be much more powerful than comparable procedures which control the traditional familywise error rate. We prove that this same procedure also controls the false discovery rate when the test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses. This condition for positive dependency is general enough to cover many problems of practical interest, including the comparisons of many treatments with a single control, multivariate normal test statistics with positive correlation matrix and multivariate $t$. Furthermore, the test statistics may be discrete, and the tested hypotheses composite without posing special difficulties. For all other forms of dependency, a simple conservative modification of the procedure controls the false discovery rate. Thus the range of problems for which a procedure with proven FDR control can be offered is greatly increased.

9,335 citations
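The step-up procedure referred to in the abstract above can be sketched as follows. With `dependent=False` the code applies the standard Benjamini-Hochberg thresholds; with `dependent=True` it divides the target level by the harmonic sum 1 + 1/2 + ... + 1/m, which is the conservative modification usually attributed to this paper. The abstract itself does not spell out that constant, so treat it as an assumption of this sketch; the function name and parameters are hypothetical.

```python
def fdr_reject(pvalues, q, dependent=False):
    """Step-up FDR procedure; returns a list of booleans (True = reject).

    dependent=False: thresholds k * q / m (Benjamini-Hochberg).
    dependent=True:  thresholds k * q / (m * c), with c = sum_{i=1..m} 1/i
                     (assumed form of the conservative modification).
    """
    m = len(pvalues)
    if m == 0:
        return []
    c = sum(1.0 / k for k in range(1, m + 1)) if dependent else 1.0

    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Largest rank k whose ordered p-value lies under its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / (m * c):
            k_max = rank

    # Reject every hypothesis with rank at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

On the same p-values the corrected version can only reject the same hypotheses or fewer, which is the price paid for validity under arbitrary dependence.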

Journal ArticleDOI
TL;DR: In this article, a simple and robust estimator of regression coefficient β based on Kendall's rank correlation tau is studied, where the point estimator is the median of the set of slopes (Yj - Yi)/(tj - ti) joining pairs of points with ti ≠ tj.
Abstract: The least squares estimator of a regression coefficient β is vulnerable to gross errors and the associated confidence interval is, in addition, sensitive to non-normality of the parent distribution. In this paper, a simple and robust (point as well as interval) estimator of β based on Kendall's [6] rank correlation tau is studied. The point estimator is the median of the set of slopes (Yj - Yi)/(tj - ti) joining pairs of points with ti ≠ tj, and is unbiased. The confidence interval is also determined by two order statistics of this set of slopes. Various properties of these estimators are studied and compared with those of the least squares and some other nonparametric estimators.

8,409 citations
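The point estimator described in the abstract above, the median of all pairwise slopes over pairs with distinct t values, can be sketched directly. The confidence interval based on order statistics of the slopes is not implemented here, and the function name is hypothetical.

```python
from itertools import combinations
from statistics import median


def median_pairwise_slope(t, y):
    """Median of the slopes (y[j] - y[i]) / (t[j] - t[i]) over pairs with t[i] != t[j]."""
    if len(t) != len(y):
        raise ValueError("t and y must have the same length")
    slopes = [
        (y[j] - y[i]) / (t[j] - t[i])
        for i, j in combinations(range(len(t)), 2)
        if t[i] != t[j]  # pairs with tied t values have no defined slope
    ]
    if not slopes:
        raise ValueError("need at least one pair with distinct t values")
    return median(slopes)
```

For example, with t = [0, 1, 2, 3, 4] and y = [1, 3, 5, 7, 100], six of the ten pairwise slopes equal 2, so the median slope stays at 2 despite the gross error in the last observation, whereas a least squares fit would be pulled strongly upward.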