Author

Nitin R. Patel

Other affiliations: Cytel, Indian Institute of Management Ahmedabad, Harvard University

Bio: Nitin R. Patel is an academic researcher from the Massachusetts Institute of Technology. The author has contributed to research on topics including contingency tables and exact tests. The author has an h-index of 31 and has co-authored 55 publications receiving 4573 citations. Previous affiliations of Nitin R. Patel include Cytel and Indian Institute of Management Ahmedabad.

Topics: Contingency table, Exact test, Exact statistics, Monte Carlo method, Portfolio ...

Papers published on a yearly basis (chart; years with publications): 2016, 2015, 2014, 2013, 2010, 2009, 2008, 2007, 2006, 2002, 2001, 2000, 1998, 1995, 1994, 1992, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1980, 1979, 1977, 1974

Papers

Journal Article•DOI•

A Network Algorithm for Performing Fisher's Exact Test in r × c Contingency Tables

Cyrus R. Mehta¹, Nitin R. Patel²•Institutions (2)

Harvard University¹, Indian Institute of Management Ahmedabad²

01 Jun 1983-Journal of the American Statistical Association

TL;DR: In this paper, the computation of Fisher's exact test for r × c contingency tables is transformed into the problem of identifying all paths through a directed acyclic network that equal or exceed a fixed length, which considerably extends the computational feasibility of the exact test.

Abstract: An exact test of significance of the hypothesis that the row and column effects are independent in an r × c contingency table can be executed in principle by generalizing Fisher's exact treatment of the 2 × 2 contingency table. Each table in a conditional reference set of r × c tables with fixed marginal sums is assigned a generalized hypergeometric probability. The significance level is then computed by summing the probabilities of all tables that are no larger (on the probability scale) than the observed table. However, the computational effort required to generate all r × c contingency tables with fixed marginal sums severely limits the use of Fisher's exact test. A novel technique that considerably extends the bounds of computational feasibility of the exact test is proposed here. The problem is transformed into one of identifying all paths through a directed acyclic network that equal or exceed a fixed length. Some interesting new optimization theorems are developed in the process. The numer...

960 citations
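
The abstract above describes the exact test as a sum of generalized hypergeometric probabilities over every table that shares the observed margins. As a rough illustration only, the Python sketch below does this by exhaustive enumeration, which is the costly baseline the paper sets out to avoid; it is not the network algorithm. The function names (`compositions`, `enumerate_tables`, `table_probability`, `fisher_exact_rxc`) are invented for this example, and the use of exact `Fraction` arithmetic is an assumption made so that probabilities can be compared without rounding error.

```python
from fractions import Fraction
from math import factorial, prod


def compositions(total, caps):
    """Yield lists x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if len(caps) == 1:
        if 0 <= total <= caps[0]:
            yield [total]
        return
    for first in range(min(total, caps[0]) + 1):
        for tail in compositions(total - first, caps[1:]):
            yield [first] + tail


def enumerate_tables(row_sums, col_sums):
    """Yield every non-negative integer table with the given margins."""
    if len(row_sums) == 1:
        # The last row is forced by whatever is left in each column.
        if sum(col_sums) == row_sums[0]:
            yield [list(col_sums)]
        return
    for row in compositions(row_sums[0], col_sums):
        remaining = [c - x for c, x in zip(col_sums, row)]
        for rest in enumerate_tables(row_sums[1:], remaining):
            yield [row] + rest


def table_probability(table):
    """Generalized hypergeometric probability of a table given its margins."""
    row_sums = [sum(r) for r in table]
    col_sums = [sum(c) for c in zip(*table)]
    n = sum(row_sums)
    numerator = prod(factorial(r) for r in row_sums) * prod(
        factorial(c) for c in col_sums
    )
    denominator = factorial(n) * prod(factorial(x) for r in table for x in r)
    return Fraction(numerator, denominator)


def fisher_exact_rxc(observed):
    """Sum the probabilities of all tables no more probable than the observed one."""
    row_sums = [sum(r) for r in observed]
    col_sums = [sum(c) for c in zip(*observed)]
    p_observed = table_probability(observed)
    probabilities = (
        table_probability(t) for t in enumerate_tables(row_sums, col_sums)
    )
    return sum(p for p in probabilities if p <= p_observed)
```

For example, `fisher_exact_rxc([[3, 1], [1, 3]])` sums over the five 2 × 2 tables with all margins equal to 4. The number of tables grows very quickly with the sample size and table dimensions, which is the computational limit the abstract refers to.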

Journal Article•DOI•

Exact logistic regression: Theory and examples

Cyrus R. Mehta¹, Nitin R. Patel¹•Institutions (1)

Harvard University¹

15 Oct 1995-Statistics in Medicine

TL;DR: This work provides an alternative to the maximum likelihood method for making inferences about the parameters of the logistic regression model based on appropriate permutational distributions of sufficient statistics.

Abstract: We provide an alternative to the maximum likelihood method for making inferences about the parameters of the logistic regression model. The method is based on appropriate permutational distributions of sufficient statistics. It is useful for analysing small or unbalanced binary data with covariates. It also applies to small-sample clustered binary data. We illustrate the method by analysing several biomedical data sets.

469 citations

Journal Article•DOI•

Computing an Exact Confidence Interval for the Common Odds Ratio in Several 2×2 Contingency Tables

Cyrus R. Mehta¹, Nitin R. Patel², Robert Gray¹•Institutions (2)

Harvard University¹, Indian Institute of Management Ahmedabad²

01 Dec 1985-Journal of the American Statistical Association

TL;DR: A quadratic time network algorithm is provided for computing an exact confidence interval for the common odds ratio in several 2×2 independent contingency tables, shown to be a considerable improvement on an existing algorithm developed by Thomas (1975), which relies on exhaustive enumeration.

Abstract: A quadratic time network algorithm is provided for computing an exact confidence interval for the common odds ratio in several 2×2 independent contingency tables. The algorithm is shown to be a considerable improvement on an existing algorithm developed by Thomas (1975), which relies on exhaustive enumeration. Problems that would formerly have consumed several CPU hours can now be solved in a few CPU seconds. The algorithm can easily handle sparse data sets where asymptotic results are suspect. The network approach, on which the algorithm is based, is also a powerful tool for exact statistical inference in other settings.

387 citations

Journal Article•DOI•

Exact significance testing to establish treatment equivalence with ordered categorical data.

Cyrus R. Mehta, Nitin R. Patel, Anastasios A. Tsiatis

01 Sep 1984-Biometrics

TL;DR: An efficient numerical algorithm for computing the exact significance level and a simple method for obtaining the asymptotic significance level are provided for establishing the therapeutic equivalence of two treatments that are being compared on the basis of ordered categorical data.

Abstract: This communication concerns the problem of establishing the therapeutic equivalence of two treatments that are being compared on the basis of ordered categorical data. The problem is formulated as a significance test in which the null hypothesis specifies a treatment difference. An efficient numerical algorithm for computing the exact significance level is provided, along with a simple method for obtaining the asymptotic significance level. Both methods are applied to a clinical trial of a new agent versus an active control. Guidelines for when to use the exact procedure and when to rely on asymptotic theory are provided.

335 citations

Journal Article•DOI•

Computing Distributions for Exact Logistic Regression

Karim F. Hirji¹, Cyrus R. Mehta², Nitin R. Patel³•Institutions (3)

University of California, Los Angeles¹, Harvard University², Indian Institute of Management Ahmedabad³

01 Dec 1987-Journal of the American Statistical Association

TL;DR: In this paper, an efficient recursive algorithm is presented that generates the joint and conditional distributions of the sufficient statistics for logistic regression with binary response variables; until then, exact inference had been computationally feasible only in a few special situations.

Abstract: Logistic regression is a commonly used technique for the analysis of retrospective and prospective epidemiological and clinical studies with binary response variables. Usually this analysis is performed using large sample approximations. When the sample size is small or the data structure sparse, the accuracy of the asymptotic approximations is in question. On other occasions, singularity of the covariance matrix of parameter estimates precludes asymptotic analysis. Under these circumstances, use of exact inferential procedures would seem to be a prudent alternative. Cox (1970) showed that exact inference on the parameters of a logistic model with binary response requires consideration of the distribution of sufficient statistics for these parameters. To date, however, resorting to the exact method has not been computationally feasible except in a few special situations. This article presents an efficient recursive algorithm that generates the joint and conditional distributions of the sufficient...

289 citations
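
The abstract above rests on the idea that exact inference for a logistic model uses the distribution of the sufficient statistics rather than large-sample approximations. As a hedged illustration, the Python sketch below handles the simplest case, a model with an intercept and one integer-valued covariate, where the sufficient statistics are s = Σ y_i and t = Σ x_i y_i. It counts response vectors by (s, t) with a plain dynamic program that adds one observation at a time; this is an assumed, simplified stand-in and not the recursive algorithm of the paper. The function names are invented for this example.

```python
from collections import Counter
from fractions import Fraction


def joint_counts(x):
    """Count binary response vectors y by (sum(y), sum(x*y))."""
    counts = Counter({(0, 0): 1})
    for xi in x:
        updated = Counter()
        for (s, t), c in counts.items():
            updated[(s, t)] += c            # this observation has y_i = 0
            updated[(s + 1, t + xi)] += c   # this observation has y_i = 1
        counts = updated
    return counts


def conditional_null_distribution(x, s_obs):
    """Distribution of t given sum(y) == s_obs when the slope is zero."""
    conditional = {
        t: c for (s, t), c in joint_counts(x).items() if s == s_obs
    }
    total = sum(conditional.values())
    return {t: Fraction(c, total) for t, c in sorted(conditional.items())}


def exact_upper_p_value(x, y):
    """One-sided exact p-value P(T >= t_obs | S = s_obs) under a zero slope."""
    s_obs = sum(y)
    t_obs = sum(xi * yi for xi, yi in zip(x, y))
    distribution = conditional_null_distribution(x, s_obs)
    return sum(p for t, p in distribution.items() if t >= t_obs)
```

With `x = [0, 0, 1, 1]` and two responses equal to 1, the conditional distribution of t has support {0, 1, 2}. The table of counts can grow large for many observations or widely spread covariate values, which is why the abstract stresses the need for an efficient algorithm.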

Cited by

Journal Article•DOI•

Second-generation PLINK: rising to the challenge of larger and richer datasets

Christopher C. Chang, Carson C. Chow¹, Laurent C. A. M. Tellier², Shashaank Vattikuti¹, Shaun Purcell³, James J. Lee⁴•Institutions (4)

National Institutes of Health¹, University of Copenhagen², Icahn School of Medicine at Mount Sinai³, University of Minnesota⁴

25 Feb 2015-GigaScience

TL;DR: The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility, and for the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.

Abstract: Background: PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1’s primary data format. Findings: To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, O(√n)-time/constant-space Hardy-Weinberg equilibrium and Fisher’s exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0). Conclusions: The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.

7,038 citations

Journal Article•DOI•

A simulation study of the number of events per variable in logistic regression analysis.

Peter Peduzzi¹, John Concato¹˒², Elizabeth Kemper¹˒², Theodore R. Holford², Alvan R. Feinstein¹˒²•Institutions (2)

Veterans Health Administration¹, Yale University²

01 Dec 1996-Journal of Clinical Epidemiology

TL;DR: Findings indicate that a low number of events per variable (EPV) can lead to major problems: the regression coefficients were biased in both positive and negative directions, and paradoxical associations (significance in the wrong direction) were increased.

6,490 citations

Journal Article•DOI•

Second-generation PLINK: rising to the challenge of larger and richer datasets

Christopher C. Chang, Carson C. Chow¹, Laurent C. A. M. Tellier², Shashaank Vattikuti¹, Shaun Purcell³, James J. Lee⁴•Institutions (4)

National Institutes of Health¹, University of Copenhagen², Icahn School of Medicine at Mount Sinai³, University of Minnesota⁴

17 Oct 2014-arXiv: Genomics

TL;DR: PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics; the authors describe a second-generation codebase that provides faster and more scalable implementations of key functions.

Abstract: PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for even faster and more scalable implementations of key functions. In addition, GWAS and population-genetic data now frequently contain probabilistic calls, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1's primary data format. To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, O(sqrt(n))-time/constant-space Hardy-Weinberg equilibrium and Fisher's exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. This will be followed by PLINK 2.0, which will introduce (a) a new data format capable of efficiently representing probabilities, phase, and multiallelic variants, and (b) extensions of many functions to account for the new types of information. The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.

3,513 citations

Journal Article•DOI•

Microsatellite instability in cancer of the proximal colon

Stephen N. Thibodeau¹, Gary D. Bren¹, Daniel J. Schaid¹•Institutions (1)

Mayo Clinic¹

07 May 1993-Science

TL;DR: Colorectal tumor DNA was examined for somatic instability at (CA)n repeats on human chromosomes 5q, 15q, 17p, and 18q; this instability was significantly correlated with the tumor's location in the proximal colon and with increased patient survival, and inversely correlated with loss of heterozygosity.

Abstract: Colorectal tumor DNA was examined for somatic instability at (CA)n repeats on human chromosomes 5q, 15q, 17p, and 18q. Differences between tumor and normal DNA were detected in 25 of the 90 (28 percent) tumors examined. This instability appeared as either a substantial change in repeat length (often heterogeneous in nature) or a minor change (typically two base pairs). Microsatellite instability was significantly correlated with the tumor's location in the proximal colon (P = 0.003), with increased patient survival (P = 0.02), and, inversely, with loss of heterozygosity for chromosomes 5q, 17p, and 18q. These data suggest that some colorectal cancers may arise through a mechanism that does not necessarily involve loss of heterozygosity.

3,093 citations