Author

Pranab Kumar Sen

Bio: Pranab Kumar Sen is an academic researcher from the University of North Carolina at Chapel Hill. The author has contributed to research in topics including Estimator and Nonparametric statistics. The author has an h-index of 51 and has co-authored 570 publications receiving 19,997 citations. Previous affiliations of Pranab Kumar Sen include the Indian Statistical Institute and Academia Sinica.


Papers
Journal Article
01 Nov 2007-Stroke
TL;DR: In stroke/transient ischemic attack patients, leukocyte count is independently associated with the progression of aortic atheroma over 12 months (>0.70 mm), which is associated with cardiovascular risk.
Abstract: Background and Purpose: Leukocyte count is an independent predictor of stroke. We investigated the association between leukocyte count and progression of aortic atheroma over 12 months in stroke/transient ischemic attack (TIA) patients. Methods: Consecutive ischemic stroke and TIA patients underwent 12-month sequential transesophageal echocardiography and were assessed for total and differential leukocyte counts on admission. Paired aortic plaque images were assessed for several parameters, including changes in grade, intimal-medial thickness (IMT), and cross-sectional area. Multivariate linear and logistic regressions were used to calculate the effect of leukocyte count on the change in aortic atheromas over 12 months. Results: Of the 128 participants (mean±SD age, 64.6±11.9 years; 53.1% men; 73.4% white, 24.2% black, and 2.3% Asian), 45 (35%) showed clinically significant progression of aortic atheromas (maximal change in IMT >0.70 mm over 12 months). The mean admission leukocyte c...

16 citations

Journal Article
TL;DR: In this paper, a general model incorporating staggering entry and random withdrawal and pertaining to a simple regression problem (including the two-sample location problem as a special case) is conceived, and a scheme allowing progressive censoring (continuous monitoring of experimentation from the beginning) is developed, along with some nonparametric testing procedures. The proposed tests rest on the construction of certain two-dimensional time-parameter stochastic processes from a triangular array of progressively censored linear rank statistics and their weak convergence to appropriate Gaussian functions.
Abstract: In the context of (multi-center) clinical trials and life testing problems, a general model incorporating both staggering entry and random withdrawal and pertaining to a simple regression problem (including the two-sample location problem as a special case) is conceived. Within this framework, a scheme allowing progressive censoring (continuous monitoring of experimentation from the beginning) is developed, and some nonparametric testing procedures are proposed and studied. The proposed tests rest on the construction of certain two-dimensional time-parameter stochastic processes from a triangular array of progressively censored linear rank statistics and their weak convergence to appropriate Gaussian functions. Asymptotic properties of these procedures are studied. A computer program for the numerical computation and practical administration of these testing procedures is also provided at the end.

16 citations

Journal Article
TL;DR: All publications including cluster‐randomized trials used for maternal and child health research in developing countries during the last 10 years are summarized and evaluated.
Abstract: Objective: To summarize and evaluate all publications including cluster-randomized trials used for maternal and child health research in developing countries during the last 10 years. Methods: All cluster-randomized trials published between 1998 and 2008 were reviewed, and those that met our criteria for inclusion were evaluated further. The criteria for inclusion were that the trial should have been conducted in maternal and child health care in a developing country and that the conclusions should have been made on an individual level. Methods of accounting for clustering in design and analysis were evaluated in the eligible trials. Results: Thirty-five eligible trials were identified. The majority of them were conducted in Asia, used community as randomization unit, and had fewer than 10,000 participants. To minimize confounding, 23 of the 35 trials had stratified, blocked, or paired the clusters before they were randomized, while 17 had adjusted for confounding in the analysis. Ten of the 35 trials did not account for clustering in sample size calculations, and seven did not account for the cluster-randomized design in the analysis. The number of cluster-randomized trials increased over time, and the trials generally improved in quality. Conclusions: Shortcomings exist in the sample-size calculations and in the analysis of cluster-randomized trials conducted during maternal and child health research in developing countries. Even though there has been improvement over time, further progress in the way that researchers utilize and analyse cluster-randomized trials in this field is needed.

15 citations

Journal Article
TL;DR: In this article, the Chen-Stein theorem is used to show that family-wise error rate can be controlled for cluster-dependent microRNAs under weak assumptions, and the theory is illustrated with an analysis of real data, a microRNA expression data set on Finnish (aggressive and non-aggressive) prostate cancer patients and their controls.
Abstract: New statistical procedures are introduced to analyse typical microRNA expression data sets. For each separate microRNA expression, the null hypothesis to be tested is that there is no difference between the distributions of the expression in different groups. The test statistics are then constructed with certain types of alternatives in mind. To avoid strong (parametric) distributional assumptions, the alternatives are formulated using probabilities of different orders of pairs or triples of observations coming from different groups, and the test statistics are then constructed using the corresponding several-sample U-statistics, which are natural estimates of these probabilities. Classical several-sample rank test statistics, such as the Kruskal–Wallis and Jonckheere–Terpstra tests, are special cases in our approach. Also, as the number of variables (microRNAs) is huge, we confront a serious simultaneous testing problem. Different approaches to control the family-wise error rate or the false discovery rate are briefly discussed, and it is shown how the Chen–Stein theorem can be used to show that the family-wise error rate can be controlled for cluster-dependent microRNAs under weak assumptions. The theory is illustrated with an analysis of real data, a microRNA expression data set on Finnish (aggressive and non-aggressive) prostate cancer patients and their controls.

15 citations


Cited by
Journal Article
TL;DR: A nonparametric approach to the analysis of areas under correlated ROC curves is presented, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
Abstract: Methods of evaluating and comparing the performance of diagnostic tests are of increasing importance as new tests are developed and marketed. When a test is based on an observed variable that lies on a continuous or graded scale, an assessment of the overall value of the test can be made through the use of a receiver operating characteristic (ROC) curve. The curve is constructed by varying the cutpoint used to determine which values of the observed variable will be considered abnormal and then plotting the resulting sensitivities against the corresponding false positive rates. When two or more empirical curves are constructed based on tests performed on the same individuals, statistical analysis on differences between curves must take into account the correlated nature of the data. This paper presents a nonparametric approach to the analysis of areas under correlated ROC curves, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
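The curve construction described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's method: it builds the empirical ROC points by lowering a cutpoint through the observed values and computes the area two ways (trapezoidal rule and a two-sample U-statistic). The function names are hypothetical, larger values are assumed to mean "abnormal", and the covariance estimation for correlated curves that the paper contributes is not shown.

```python
def roc_points(scores, labels):
    """Return (false positive rate, sensitivity) pairs, one per cutpoint.

    Assumption: a value is called abnormal when it is >= the cutpoint,
    and labels are 1 (diseased) or 0 (non-diseased).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    points = [(0.0, 0.0)]  # cutpoint above every observed value
    for cut in sorted(set(scores), reverse=True):
        sensitivity = sum(s >= cut for s in pos) / len(pos)
        false_pos_rate = sum(s >= cut for s in neg) / len(neg)
        points.append((false_pos_rate, sensitivity))
    return points


def auc_trapezoid(points):
    """Area under the piecewise-linear empirical curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def auc_u_statistic(scores, labels):
    """Same area as a U-statistic: the fraction of (diseased, non-diseased)
    pairs that are correctly ordered, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = sum(1.0 if p > n else 0.5 if p == n else 0.0
                for p in pos for n in neg)
    return total / (len(pos) * len(neg))


scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
print(auc_trapezoid(roc_points(scores, labels)), auc_u_statistic(scores, labels))
```

The two area computations agree on the same data, which is why U-statistic theory can be applied to the area under the empirical curve.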

16,496 citations

Journal Article
TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.
Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the "why," and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations

Book
21 Mar 2002
TL;DR: An essential textbook for any student or researcher in biology needing to design experiments or sampling programs or to analyse the resulting data. It begins with estimation and hypothesis testing, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced.
Abstract: An essential textbook for any student or researcher in biology needing to design experiments or sampling programs or to analyse the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers, and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.

9,509 citations

Journal Article
TL;DR: In this paper, it was shown that a simple FDR controlling procedure for independent test statistics can also control the false discovery rate when test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses.
Abstract: Benjamini and Hochberg suggest that the false discovery rate may be the appropriate error rate to control in many applied multiple testing problems. A simple procedure was given there as an FDR controlling procedure for independent test statistics and was shown to be much more powerful than comparable procedures which control the traditional familywise error rate. We prove that this same procedure also controls the false discovery rate when the test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses. This condition for positive dependency is general enough to cover many problems of practical interest, including the comparisons of many treatments with a single control, multivariate normal test statistics with positive correlation matrix and multivariate $t$. Furthermore, the test statistics may be discrete, and the tested hypotheses composite without posing special difficulties. For all other forms of dependency, a simple conservative modification of the procedure controls the false discovery rate. Thus the range of problems for which a procedure with proven FDR control can be offered is greatly increased.
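The abstract does not spell out either procedure, so the following is only a sketch under stated assumptions: it uses the step-up rule as it is commonly stated for the Benjamini-Hochberg procedure (reject the k smallest p-values, where k is the largest rank i with p_(i) <= i*q/m), and it takes the conservative modification to be division of q by the harmonic sum 1 + 1/2 + ... + 1/m. The function name and the `conservative` flag are hypothetical.

```python
def step_up_rejections(p_values, q=0.05, conservative=False):
    """Return a list of booleans marking which hypotheses are rejected.

    conservative=False: step-up rule at level q (independence or positive
    regression dependency assumed, per the abstract).
    conservative=True: assumed form of the modification for other forms
    of dependency, with q divided by sum_{i=1}^{m} 1/i.
    """
    m = len(p_values)
    c = sum(1.0 / i for i in range(1, m + 1)) if conservative else 1.0
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / (m * c):
            k = rank  # keep the largest rank that meets its threshold
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(step_up_rejections(p))                     # two smallest p-values rejected
print(step_up_rejections(p, conservative=True))  # only the smallest rejected
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value meets a later threshold.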

9,335 citations

Journal Article
TL;DR: In this article, a simple and robust estimator of the regression coefficient β based on Kendall's rank correlation tau is studied, where the point estimator is the median of the set of slopes $(Y_j - Y_i)/(t_j - t_i)$ joining pairs of points with $t_i \neq t_j$.
Abstract: The least squares estimator of a regression coefficient β is vulnerable to gross errors and the associated confidence interval is, in addition, sensitive to non-normality of the parent distribution. In this paper, a simple and robust (point as well as interval) estimator of β based on Kendall's [6] rank correlation tau is studied. The point estimator is the median of the set of slopes $(Y_j - Y_i)/(t_j - t_i)$ joining pairs of points with $t_i \neq t_j$, and is unbiased. The confidence interval is also determined by two order statistics of this set of slopes. Various properties of these estimators are studied and compared with those of the least squares and some other nonparametric estimators.
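The point estimator in this abstract is simple enough to sketch directly. The snippet below is an illustrative implementation only: it forms every pairwise slope between points with distinct t values and takes the median. The function name is hypothetical, and the confidence interval based on two order statistics of the slopes is not implemented.

```python
from itertools import combinations
from statistics import median


def median_slope(t, y):
    """Median of the pairwise slopes (y_j - y_i) / (t_j - t_i), t_i != t_j."""
    slopes = [
        (yj - yi) / (tj - ti)
        for (ti, yi), (tj, yj) in combinations(zip(t, y), 2)
        if ti != tj  # pairs with tied t values define no slope and are skipped
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct t values")
    return median(slopes)


# One gross error in the last observation barely moves the estimate.
t = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 100]
print(median_slope(t, y))
```

In this example six of the ten pairwise slopes equal 2, so the median stays at the slope of the uncontaminated points, whereas a least squares fit would be pulled toward the outlier.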

8,409 citations