scispace - formally typeset
Author

Pranab Kumar Sen

Bio: Pranab Kumar Sen is an academic researcher from the University of North Carolina at Chapel Hill. The author has contributed to research in the topics of Estimator and Nonparametric statistics. The author has an h-index of 51 and has co-authored 570 publications receiving 19,997 citations. Previous affiliations of Pranab Kumar Sen include the Indian Statistical Institute and Academia Sinica.


Papers
Journal ArticleDOI
TL;DR: For a broad class of one-sample rank-order statistics, weak convergence to Brownian motion processes is studied in this article, and a simple proof of the asymptotic normality of these statistics for random sample sizes is also presented.
Abstract: Analogous to the Donsker theorem on partial cumulative sums of independent random variables, for a broad class of one-sample rank order statistics, weak convergence to Brownian motion processes is studied here. A simple proof of the asymptotic normality of these statistics for random sample sizes is also presented. Some asymptotic results on renewal theory for one-sample rank order statistics are derived.

7 citations

Journal ArticleDOI
TL;DR: The limitations of conventional statistical analysis in health-related environmental risk assessment are appraised and some Bayesian perspectives are examined; difficulties with standard dose-response regression analysis and limitations of the usual dosimetric studies are also discussed.
Abstract: In the evolution of bioinformatics and environmetrics, virology, molecular biology and toxicology, along with biostochastics, play the key role; the complex task is beset by several statistical methodologic challenges. We need to identify the multitude of (often disguised) toxicants and viruses working in synergism and slow progression, assess their absorption mode as well as their in vivo biological activity and reaction, and then formulate suitable stochastic models for statistical analysis. Lack of experimental control, difficulties with standard dose-response regression analysis, and limitations of the usual dosimetric studies create impasses. Basically, information on the relationship between chemical or viral structure and in vivo biological activity needs to be incorporated adequately to depict the causal cum stochastic relationship between environmental exposures (to toxins and viruses) and specific health hazards. Limitations of conventional statistical analysis in health-related environmental risk assessment are appraised; some Bayesian perspectives are also examined.

7 citations

Journal Article
TL;DR: Because of the latent nature of a large class of toxic substances, the extreme variability of human metabolism and of exposure to toxic material, the yet unknown nature of many carcinogenic activities, and the immense difficulties in assessing effective toxicity levels, there is a genuine need for statistical appraisal at each phase.
Abstract: Toxicity abounds in nature, the environment, and our modern life-style. Toxicology relates to the study of the intake process of such toxins by human beings, their mode of propagation, biological reactions, molecular level of penetration, genotoxicity and aftermaths. Because of the latent nature of a large class of toxic substances, the extreme variability of human metabolism as well as of exposure to toxic material, the yet unknown nature of many carcinogenic activities, and the immense difficulties in the assessment of effective toxicity levels (especially in the environment), there is a genuine need for statistical appraisal at each phase. These statistical perspectives are highlighted here, along with an outline of recent statistical approaches to this much needed assessment task. From time immemorial, toxins have been recognized by human beings, and used both beneficially and destructively. Like morphine and other pain relievers, such toxins have often been used in medicine and allied fields to combat some diseases or disorders, and more often in punitive or destructive modes; the use of (slow as well as instantaneous) poisons to eliminate an undesired person has been in practice from the dawn of human civilization. Unknowingly, we may sometimes come into contact with toxic plants, fruits or material, and such contacts might have widespread and often disastrous effects. Even in the animal world, toxins are recognized by various species who might have acquired such knowledge through their ancestors. From the ancient Vedic time, toxins in various herbs and plants have been most thoroughly studied by Indian herbal physicians (known as the Vaidyas), and their acquired knowledge (Ayurveda) is still a yardstick for the (western as well as oriental) medical system; the basic difference between the two lies in the occidental use of toxic chemical compounds (inorganic toxins) instead of the organic toxins in herbs and plants.
The entire foundation of the homeopathic medicinal system is based on the impact of toxins on the human body as well as mind, and is a classical example of how one form of a toxic element can be used to nullify some other toxicity in our body. Although there may be some controversy over the doctrine of the homeopathic medicinal system, there is no doubt about the neutralizing capacities of toxins of different kinds. Toxins are characterized by their toxicity, in some form or other, in relation to their reactions on living organisms including man. In a traditional sense, toxicity relates to the poisonous effects which are associated with specific chemical compounds (such as potassium cyanide), various plants and fungi, and with our environment and ecosystem. A good deal of knowledge has been acquired on the toxicity of various chemicals, fungi and other compounds. Such toxicities are fairly quick in progression and fairly deterministic in the dose-response relation. As such, viewed from statistical perspectives, such quick-action, fairly deterministic toxicities are not of much interest. Rather, the more complex types that often progress slowly and invisibly, and in that process invite other activities, such as genotoxicity and carcinogenicity, have a far greater need for statistical appraisal, and we shall mainly confine ourselves to this aspect of toxicology. This complex of toxicity includes a wide variety of subdisciplines: plant toxicology and animal toxicology refer to toxins and their toxicity that are prevalent in plants and animals respectively; taken together, they constitute organic toxicity. Toxicity is also prevalent to varying degrees in drugs and pharmaceutical products. Toxicity may arise in the use of pest control and storage facilities for agricultural products; it can also arise in the process of cooking food, its preservation and service.
Industrial factories and plants are notorious sources of toxicity; industrial waste also often leads to significant amounts of toxicity. More significantly, environmental toxicity, resulting from air pollution or toxicants (such as industrial and automobile exhausts, environmental smoking, thinning of the ozone layer, smog containing airborne particulate matter, acid rain and others), and groundwater and subsoil contamination (due to arsenic minerals, industrial waste dumping, landfill practice, ecological disasters, nuclear waste disposal, and other sources of associated toxicants), might be labelled a significant ingredient of toxicity (though such toxicity effects are generally slow in progression and hard to isolate in detection). In the foothills of the Himalayas and the Gangetic valleys, particularly in the lower basins, arsenic contamination of groundwater and soil from subterranean sources is a growing threat to public health. Besides, organic arsenic occurs in plants, fish, crab, the human body and other organisms, though this may be generally less toxic in effect compared with arsenic minerals. Arsenic contamination of groundwater may be caused by geological and anthropogenic processes. This contamination is due to a chemical process wherein arsenous acids from buried

7 citations

Journal ArticleDOI
TL;DR: This work generalizes the mapping problem to a genuine nonparametric setup and provides a robust estimation procedure for the situation where the underlying phenotype distributions are completely unspecified, in quantitative-trait linkage studies using experimental crosses.

7 citations

01 Jan 2013
TL;DR: The theoretical underpinnings of robust procedures developed recently, where the proposed estimator is robust to outliers/influential observations as well as to heteroscedasticity are provided.
Abstract: Robust statistical methods, such as M-estimators, are needed for nonlinear regression models because of the presence of outliers/influential observations and heteroscedasticity. Outliers and influential observations are commonly observed in many applications, especially in toxicology and agricultural experiments. For example, dose response studies, which are routinely conducted in toxicology and agriculture, sometimes result in potential outliers, especially in the high dose groups. This is because response to high doses often varies among experimental units (e.g., animals). Consequently, this may result in outliers (i.e., very low values) in that group. Unlike the linear models, in nonlinear models the outliers not only impact the point estimates of the model parameters but can also severely impact the estimate of the information matrix. Note that, the information matrix in a nonlinear model is a function of the model parameters. This is not the case in linear models. In addition to outliers, heteroscedasticity is a major concern when dealing with nonlinear models. Ignoring heteroscedasticity may lead to inaccurate coverage probabilities and Type I error rates. Robustness to outliers/influential observations and to heteroscedasticity is even more important when dealing with thousands of nonlinear regression models in quantitative high throughput screening assays. Recently, these issues have been studied very extensively in the literature (references are provided in this paper), where the proposed estimator is robust to outliers/influential observations as well as to heteroscedasticity. The focus of this paper is to provide the theoretical underpinnings of robust procedures developed recently.
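The down-weighting idea behind such M-estimators can be illustrated with a toy sketch (not the paper's actual nonlinear-regression procedure): a Huber M-estimate of location computed by iteratively reweighted means, where observations far from the current fit, such as the outlying high-dose responses mentioned above, receive reduced weight. The tuning constant c = 1.345 and the MAD-based scale are conventional choices, assumed here for illustration.

```python
from statistics import median

def huber_location(x, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means.

    Points within c*s of the current estimate keep weight 1; points
    farther away are down-weighted by the factor c*s / |residual|.
    The scale s is the median absolute deviation, rescaled to be
    consistent for normal data.
    """
    m0 = median(x)
    s = median(abs(v - m0) for v in x) / 0.6745  # MAD scale
    mu = m0
    for _ in range(max_iter):
        w = [1.0 if abs(v - mu) <= c * s else c * s / abs(v - mu)
             for v in x]
        mu_new = sum(wi * vi for wi, vi in zip(w, x)) / sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

# hypothetical dose-group responses with one gross outlier
x = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
print(huber_location(x))   # stays near 10, while the mean is about 16.7
```

The outlier's weight shrinks to under one percent after the first iteration, which is exactly the bounded-influence behaviour that makes M-estimators attractive over least squares here.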

7 citations


Cited by
Journal ArticleDOI
TL;DR: A nonparametric approach to the analysis of areas under correlated ROC curves is presented, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
Abstract: Methods of evaluating and comparing the performance of diagnostic tests are of increasing importance as new tests are developed and marketed. When a test is based on an observed variable that lies on a continuous or graded scale, an assessment of the overall value of the test can be made through the use of a receiver operating characteristic (ROC) curve. The curve is constructed by varying the cutpoint used to determine which values of the observed variable will be considered abnormal and then plotting the resulting sensitivities against the corresponding false positive rates. When two or more empirical curves are constructed based on tests performed on the same individuals, statistical analysis on differences between curves must take into account the correlated nature of the data. This paper presents a nonparametric approach to the analysis of areas under correlated ROC curves, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
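The area under an empirical ROC curve coincides with a two-sample U-statistic (the Mann-Whitney form): the proportion of (diseased, healthy) score pairs that the test ranks correctly, counting ties as one half. A minimal sketch with hypothetical scores:

```python
from itertools import product

def auc_u_statistic(pos, neg):
    """Empirical ROC area as a generalized U-statistic: the fraction
    of (positive, negative) pairs ordered correctly, ties scoring 1/2."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

# hypothetical scores from one diagnostic test
diseased = [0.9, 0.8, 0.6, 0.55]
healthy = [0.7, 0.5, 0.4, 0.3]
print(auc_u_statistic(diseased, healthy))  # → 0.875
```

The paper's contribution is the covariance matrix of several such correlated AUC estimates; the kernel above is the building block that the U-statistic theory operates on.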

16,496 citations

Journal Article
TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.
Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the "why," and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations

Book
21 Mar 2002
TL;DR: An essential textbook for any student or researcher in biology needing to design experiments, sampling programs or analyse the resulting data. The text covers both classical and Bayesian philosophies before advancing to the analysis of linear and generalized linear models; topics include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot, repeated measures and covariance designs), and log-linear models, after which multivariate techniques, including classification and ordination, are introduced.
Abstract: An essential textbook for any student or researcher in biology needing to design experiments, sampling programs or analyse the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.

9,509 citations

Journal ArticleDOI
TL;DR: In this paper, it was shown that a simple FDR controlling procedure for independent test statistics can also control the false discovery rate when test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses.
Abstract: Benjamini and Hochberg suggest that the false discovery rate may be the appropriate error rate to control in many applied multiple testing problems. A simple procedure was given there as an FDR controlling procedure for independent test statistics and was shown to be much more powerful than comparable procedures which control the traditional familywise error rate. We prove that this same procedure also controls the false discovery rate when the test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses. This condition for positive dependency is general enough to cover many problems of practical interest, including the comparisons of many treatments with a single control, multivariate normal test statistics with positive correlation matrix and multivariate $t$. Furthermore, the test statistics may be discrete, and the tested hypotheses composite without posing special difficulties. For all other forms of dependency, a simple conservative modification of the procedure controls the false discovery rate. Thus the range of problems for which a procedure with proven FDR control can be offered is greatly increased.
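The step-up procedure itself is simple to state: sort the m p-values, find the largest i with p_(i) ≤ (i/m)q, and reject the i most significant hypotheses. A minimal sketch with made-up p-values (the index bookkeeping and the data are illustrative, not from the paper):

```python
def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure at FDR level q.

    Rejects the hypotheses with the k smallest p-values, where k is
    the largest rank i such that p_(i) <= (i / m) * q.
    Returns the indices of the rejected hypotheses, sorted.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * q:
            k = rank
    return sorted(order[:k])

pvals = [0.001, 0.008, 0.039, 0.041, 0.042,
         0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals, q=0.05))  # → [0, 1]
```

Note that the scan keeps the *largest* qualifying rank rather than stopping at the first failure; that step-up character is what the positive-regression-dependency condition above protects.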

9,335 citations

Journal ArticleDOI
TL;DR: In this article, a simple and robust estimator of regression coefficient β based on Kendall's rank correlation tau is studied, where the point estimator is the median of the set of slopes (Yj - Yi )/(tj-ti ) joining pairs of points with ti ≠ ti.
Abstract: The least squares estimator of a regression coefficient β is vulnerable to gross errors, and the associated confidence interval is, in addition, sensitive to non-normality of the parent distribution. In this paper, a simple and robust (point as well as interval) estimator of β based on Kendall's [6] rank correlation tau is studied. The point estimator is the median of the set of slopes (Yj − Yi)/(tj − ti) joining pairs of points with ti ≠ tj, and is unbiased. The confidence interval is also determined by two order statistics of this set of slopes. Various properties of these estimators are studied and compared with those of the least squares and some other nonparametric estimators.
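This median-of-pairwise-slopes estimator (now usually called the Theil-Sen estimator) is easy to compute directly. A minimal sketch, with a deliberately corrupted last observation to show the robustness to a gross error:

```python
from itertools import combinations
from statistics import median

def theil_sen_slope(t, y):
    """Point estimate of the regression slope beta as the median of the
    pairwise slopes (y_j - y_i) / (t_j - t_i) over pairs with t_i != t_j."""
    slopes = [(y[j] - y[i]) / (t[j] - t[i])
              for i, j in combinations(range(len(t)), 2)
              if t[i] != t[j]]
    return median(slopes)

t = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 30.0]   # last point is a gross error
print(theil_sen_slope(t, y))     # → 2.175
```

The least squares slope on the same data is about 5.99, pulled far from the trend of the first four points by the single outlier, while the median of slopes stays close to 2.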

8,409 citations