Author

Gail Gong

Other affiliations: Carnegie Mellon University
Bio: Gail Gong is an academic researcher from Stanford University. The author has contributed to research in topics: Population & Penetrance. The author has an h-index of 14 and has co-authored 23 publications receiving 4344 citations. Previous affiliations of Gail Gong include Carnegie Mellon University.

Papers
Journal Article
TL;DR: This paper reviewed the nonparametric estimation of statistical error, mainly the bias and standard error of an estimator, or the error rate of a prediction rule, at a relaxed mathematical level, omitting most proofs, regularity conditions and technical details.
Abstract: This is an invited expository article for The American Statistician. It reviews the nonparametric estimation of statistical error, mainly the bias and standard error of an estimator, or the error rate of a prediction rule. The presentation is written at a relaxed mathematical level, omitting most proofs, regularity conditions, and technical details.

3,146 citations
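
The nonparametric standard-error estimates that the abstract above reviews can be illustrated with a minimal sketch. It assumes a small made-up sample and uses the sample mean as the estimator; the function names `bootstrap_se` and `jackknife_se` are hypothetical and are not taken from the article.

```python
import math
import random
import statistics


def bootstrap_se(data, estimator, n_boot=2000, seed=0):
    """Standard deviation of the estimator over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    replicates = [
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]
    return statistics.stdev(replicates)


def jackknife_se(data, estimator):
    """Jackknife standard error built from the n leave-one-out estimates."""
    n = len(data)
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((v - center) ** 2 for v in leave_one_out))


# Hypothetical sample, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
boot_se = bootstrap_se(sample, statistics.mean)
jack_se = jackknife_se(sample, statistics.mean)
classic_se = statistics.stdev(sample) / math.sqrt(len(sample))
print(boot_se, jack_se, classic_se)
```

For the sample mean, the jackknife value coincides with the textbook formula s/sqrt(n), which makes a convenient sanity check. The same two functions accept any other estimator, such as `statistics.median`, for which no simple formula exists.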

Journal Article
01 Jan 2007-JAMA
TL;DR: In this paper, the authors estimated the prevalence of BRCA1 mutation carriers in a population-based, multiethnic series of female breast cancer patients younger than 65 years at diagnosis who were enrolled at the Northern California site of the Breast Cancer Family Registry during the period 1996-2005.
Abstract: Context: Information on the prevalence of pathogenic BRCA1 mutation carriers in racial/ethnic minority populations is limited. Objective: To estimate BRCA1 carrier prevalence in Hispanic, African American, and Asian American female breast cancer patients compared with non-Hispanic white patients with and without Ashkenazi Jewish ancestry. Design, Setting, and Participants: We estimated race/ethnicity-specific prevalence of BRCA1 in a population-based, multiethnic series of female breast cancer patients younger than 65 years at diagnosis who were enrolled at the Northern California site of the Breast Cancer Family Registry during the period 1996-2005. Race/ethnicity and religious ancestry were based on self-report. Weighted estimates of prevalence and 95% confidence intervals (CIs) were based on Horvitz-Thompson estimating equations. Main Outcome Measure: Estimates of BRCA1 prevalence. Results: Estimates of BRCA1 prevalence were 3.5% (95% CI, 2.1%-5.8%) in Hispanic patients (n=393), 1.3% (95% CI, 0.6%-2.6%) in African American patients (n=341), and 0.5% (95% CI, 0.1%-2.0%) in Asian American patients (n=444), compared with 8.3% (95% CI, 3.1%-20.1%) in Ashkenazi Jewish patients (n=41) and 2.2% (95% CI, 0.7%-6.9%) in other non-Hispanic white patients (n=508). Prevalence was particularly high in young (<35 years) African American patients (5/30 patients [16.7%]; 95% CI, 7.1%-34.3%). 185delAG was the most common mutation in Hispanics, found in 5 of 21 carriers (24%). Conclusions: Among African American, Asian American, and Hispanic patients in the Northern California Breast Cancer Family Registry, the prevalence of BRCA1 mutation carriers was highest in Hispanics and lowest in Asian Americans. The higher carrier prevalence in Hispanics may reflect the presence of unrecognized Jewish ancestry in this population.

286 citations
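
As a worked check on the 5/30 subgroup figure in the abstract above, the sketch below computes a 95% interval on the log-odds scale and maps it back to a proportion. The abstract does not say how this particular interval was constructed, so the unweighted logit-scale Wald interval used here is an assumption, and `logit_wald_ci` is a hypothetical helper name.

```python
import math


def logit_wald_ci(successes, n, z=1.96):
    """Wald interval on the log-odds scale, mapped back to a proportion.

    Assumes 0 < successes < n, so that the log-odds are finite.
    """
    p = successes / n
    logit = math.log(p / (1 - p))
    se = math.sqrt(1 / (n * p * (1 - p)))  # delta-method SE of the log-odds

    def expit(t):
        return 1 / (1 + math.exp(-t))

    return expit(logit - z * se), expit(logit + z * se)


low, high = logit_wald_ci(5, 30)
print(f"{5 / 30:.1%} ({low:.1%} to {high:.1%})")  # 16.7% (7.1% to 34.3%)
```

Under this assumption the result agrees with the reported 7.1% to 34.3%. The other intervals in the abstract rest on weighted estimates and would not be reproduced by raw counts alone.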

Journal Article
26 Dec 2007-JAMA
TL;DR: Among African American, Asian American, and Hispanic patients in the Northern California Breast Cancer Family Registry, the prevalence of BRCA1 mutation carriers was highest in Hispanics and lowest in Asian Americans.
Abstract: Context: Information on the prevalence of pathogenic BRCA1 mutation carriers in racial/ethnic minority populations is limited. Objective: To estimate BRCA1 carrier prevalence in Hispanic, African American, and Asian American female breast cancer patients compared with non-Hispanic white patients with and without Ashkenazi Jewish ancestry. Design, Setting, and Participants: We estimated race/ethnicity-specific prevalence of BRCA1 in a population-based, multiethnic series of female breast cancer patients younger than 65 years at diagnosis who were enrolled at the Northern California site of the Breast Cancer Family Registry during the period 1996-2005. Race/ethnicity and religious ancestry were based on self-report. Weighted estimates of prevalence and 95% confidence intervals (CIs) were based on Horvitz-Thompson estimating equations. Main Outcome Measure: Estimates of BRCA1 prevalence. Results: Estimates of BRCA1 prevalence were 3.5% (95% CI, 2.1%-5.8%) in Hispanic patients (n = 393), 1.3% (95% CI, 0.6%-2.6%) in African American patients (n = 341), and 0.5% (95% CI, 0.1%-2.0%) in Asian American patients (n = 444), compared with 8.3% (95% CI, 3.1%-20.1%) in Ashkenazi Jewish patients (n = 41) and 2.2% (95% CI, 0.7%-6.9%) in other non-Hispanic white patients (n = 508). Prevalence was particularly high in young (<35 years) African American patients (5/30 patients [16.7%]; 95% CI, 7.1%-34.3%). 185delAG was the most common mutation in Hispanics, found in 5 of 21 carriers (24%). Conclusions: Among African American, Asian American, and Hispanic patients in the Northern California Breast Cancer Family Registry, the prevalence of BRCA1 mutation carriers was highest in Hispanics and lowest in Asian Americans. The higher carrier prevalence in Hispanics may reflect the presence of unrecognized Jewish ancestry in this population.

261 citations
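
The weighted prevalence estimates mentioned in the abstract above can be sketched in their simplest form: each sampled patient is weighted by the inverse of her sampling probability, and solving the weighted estimating equation for the prevalence gives a weighted proportion. The data, the sampling probabilities, and the name `weighted_prevalence` are all hypothetical; the registry's actual sampling design is not described here.

```python
def weighted_prevalence(outcomes, inclusion_probs):
    """Solve sum_i (y_i - p) / pi_i = 0 for p, which is a weighted proportion."""
    weights = [1 / pi for pi in inclusion_probs]
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)


# Hypothetical data: 1 = carrier, 0 = non-carrier, with made-up sampling probabilities.
carrier = [1, 0, 0, 0, 1, 0, 0, 0]
sampling_prob = [1.0, 1.0, 0.5, 0.5, 1.0, 0.25, 0.25, 0.5]

estimate = weighted_prevalence(carrier, sampling_prob)
unweighted = sum(carrier) / len(carrier)
print(estimate, unweighted)
```

In this toy example the carriers were all sampled with certainty while many non-carriers were undersampled, so the weighted estimate falls below the raw proportion.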

Journal Article
TL;DR: Three estimates of the excess error are considered: cross-validation, the jackknife, and the bootstrap, for a moderately complicated prediction rule, involving a variable-selection procedure based on forward logistic regression.
Abstract: Given a prediction rule based on a set of patients, what is the probability of incorrectly predicting the outcome of a new patient? Call this probability the true error. An optimistic estimate is the apparent error, or the proportion of incorrect predictions on the original set of patients, and it is the goal of this article to study estimates of the excess error, or the difference between the true and apparent errors. I consider three estimates of the excess error: cross-validation, the jackknife, and the bootstrap. Using simulations and real data, the three estimates for a specific prediction rule are compared. When the prediction rule is allowed to be complicated, overfitting becomes a real danger, and excess error estimation becomes important. The prediction rule chosen here is moderately complicated, involving a variable-selection procedure based on forward logistic regression.

245 citations
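
The apparent error and a bootstrap estimate of the excess error, as defined in the abstract above, can be sketched as follows. To stay short, the sketch swaps the article's forward logistic regression rule for a toy one-variable threshold rule and uses simulated data; `fit_rule`, `error_rate`, and `bootstrap_excess_error` are hypothetical names, and the bootstrap recipe shown (refit on each resample, then compare error on the original data with error on the resample) is one common formulation rather than necessarily the exact one used in the article.

```python
import random


def fit_rule(sample):
    """Toy prediction rule: cut halfway between the two class means."""
    xs0 = [x for x, y in sample if y == 0]
    xs1 = [x for x, y in sample if y == 1]
    if not xs0 or not xs1:  # a resample may contain only one class
        only = 1 if xs1 else 0
        return lambda x: only
    m0, m1 = sum(xs0) / len(xs0), sum(xs1) / len(xs1)
    cut = (m0 + m1) / 2
    upper = 1 if m1 >= m0 else 0  # label predicted above the cut
    return lambda x: upper if x > cut else 1 - upper


def error_rate(rule, sample):
    """Proportion of incorrect predictions on the given sample."""
    return sum(rule(x) != y for x, y in sample) / len(sample)


def bootstrap_excess_error(data, n_boot=200, seed=0):
    """Average over resamples of (error on original data - error on resample)."""
    rng = random.Random(seed)
    n = len(data)
    total = 0.0
    for _ in range(n_boot):
        boot = [data[rng.randrange(n)] for _ in range(n)]
        rule = fit_rule(boot)
        total += error_rate(rule, data) - error_rate(rule, boot)
    return total / n_boot


# Simulated patients: one predictor x and a binary outcome y.
gen = random.Random(1)
data = [(gen.gauss(0, 1), 0) for _ in range(20)] + [(gen.gauss(1, 1), 1) for _ in range(20)]

apparent = error_rate(fit_rule(data), data)
excess = bootstrap_excess_error(data)
print(apparent, excess, apparent + excess)
```

Adding the estimated excess error to the apparent error gives a corrected estimate of the true error. With a rule this simple the correction is expected to be small; it matters more as the rule becomes more complicated.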

Journal Article
TL;DR: This paper estimates BRCA1 mutation carrier prevalence in U.S. non-Hispanic Whites, specific for Ashkenazi status, using data from two population-based series of patients with invasive cancers of the breast or ovary and data on breast and ovarian cancer risks in Ashkenazi and non-Ashkenazi carriers. The estimates may be useful in guiding resource allocation for genetic testing and genetic counseling and in planning preventive interventions.
Abstract: Data from several countries indicate that 1% to 2% of Ashkenazi Jews carry a pathogenic ancestral mutation of the tumor suppressor gene BRCA1. However, the prevalence of BRCA1 mutations among non-Ashkenazi Whites is uncertain. We estimated mutation carrier prevalence in U.S. non-Hispanic Whites, specific for Ashkenazi status, using data from two population-based series of San Francisco Bay Area patients with invasive cancers of the breast or ovary, and data on breast and ovarian cancer risks in Ashkenazi and non-Ashkenazi carriers. Assuming that 90% of the BRCA1 mutations were detected, we estimate a carrier prevalence of 0.24% (95% confidence interval, 0.15-0.39%) in non-Ashkenazi Whites, and 1.2% (95% confidence interval, 0.5-2.6%) in Ashkenazim. When combined with U.S. White census counts, these prevalence estimates suggest that approximately 550,513 U.S. Whites (506,206 non-Ashkenazim and 44,307 Ashkenazim) carry germ line BRCA1 mutations. These estimates may be useful in guiding resource allocation for genetic testing and genetic counseling and in planning preventive interventions.

118 citations
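
The carrier counts quoted in the abstract above follow from multiplying each prevalence by a census count. The short check below confirms that the two subgroup counts add up to the stated total and back-calculates the population sizes they imply. Those implied sizes are an assumption-laden illustration only: they are derived from the rounded prevalences printed in the abstract, not from the census figures the authors used.

```python
# Carrier counts as stated in the abstract.
non_ashkenazi_carriers = 506_206
ashkenazi_carriers = 44_307
total_carriers = non_ashkenazi_carriers + ashkenazi_carriers

# Back-calculated from the rounded prevalences (0.24% and 1.2%), so only approximate.
implied_non_ashkenazi_pop = non_ashkenazi_carriers / 0.0024
implied_ashkenazi_pop = ashkenazi_carriers / 0.012

print(total_carriers)
print(round(implied_non_ashkenazi_pop), round(implied_ashkenazi_pop))
```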


Cited by
Journal Article
TL;DR: The recently developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. When all characters are perfectly compatible, it shows significant evidence for a group if the group is defined by three or more characters.
Abstract: The recently-developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. It involves resampling points from one's own data, with replacement, to create a series of bootstrap samples of the same size as the original data. Each of these is analyzed, and the variation among the resulting estimates taken to indicate the size of the error involved in making estimates from the original data. In the case of phylogenies, it is argued that the proper method of resampling is to keep all of the original species while sampling characters with replacement, under the assumption that the characters have been independently drawn by the systematist and have evolved independently. Majority-rule consensus trees can be used to construct a phylogeny showing all of the inferred monophyletic groups that occurred in a majority of the bootstrap samples. If a group shows up 95% of the time or more, the evidence for it is taken to be statistically significant. Existing computer programs can be used to analyze different bootstrap samples by using weights on the characters, the weight of a character being how many times it was drawn in bootstrap sampling. When all characters are perfectly compatible, as envisioned by Hennig, bootstrap sampling becomes unnecessary; the bootstrap method would show significant evidence for a group if it is defined by three or more characters.

40,349 citations
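
Two points from the abstract above lend themselves to a short sketch: a bootstrap sample of characters can be represented as integer weights, and in the perfectly compatible case three defining characters are enough to reach the 95% threshold. The sketch assumes that a group appears in a replicate whenever at least one of its defining characters is drawn; the function names are hypothetical.

```python
import random


def bootstrap_character_weights(n_chars, rng):
    """Draw n_chars characters with replacement; weight = times each was drawn."""
    weights = [0] * n_chars
    for _ in range(n_chars):
        weights[rng.randrange(n_chars)] += 1
    return weights


def support_probability(k, n_chars):
    """Chance that at least one of k defining characters enters a replicate."""
    return 1 - (1 - k / n_chars) ** n_chars


rng = random.Random(0)
weights = bootstrap_character_weights(100, rng)
print(sum(weights))  # weights always sum to the number of characters

for k in (1, 2, 3):
    print(k, round(support_probability(k, 100), 3))
```

As the number of characters grows, the probability approaches 1 - exp(-k), which is roughly 0.95 for k = 3 and below 0.95 for k = 1 or 2. That is consistent with the three-character statement in the abstract.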

Journal Article
TL;DR: In this paper, the authors provide guidance for substantive researchers on the use of structural equation modeling in practice for theory testing and development, and present a comprehensive, two-step modeling approach that employs a series of nested models and sequential chi-square difference tests.
Abstract: In this article, we provide guidance for substantive researchers on the use of structural equation modeling in practice for theory testing and development. We present a comprehensive, two-step modeling approach that employs a series of nested models and sequential chi-square difference tests. We discuss the comparative advantages of this approach over a one-step approach. Considerations in specification, assessment of fit, and respecification of measurement models using confirmatory factor analysis are reviewed. As background to the two-step approach, the distinction between exploratory and confirmatory analysis, the distinction between complementary approaches for theory testing versus predictive application, and some developments in estimation methods also are discussed.

34,720 citations

Journal Article
TL;DR: This paper reviews and empirically compares eight prominent models of user acceptance, then formulates and empirically validates the Unified Theory of Acceptance and Use of Technology (UTAUT), a unified model that integrates elements across the eight models.
Abstract: Information technology (IT) acceptance research has yielded many competing models, each with different sets of acceptance determinants. In this paper, we (1) review user acceptance literature and discuss eight prominent models, (2) empirically compare the eight models and their extensions, (3) formulate a unified model that integrates elements across the eight models, and (4) empirically validate the unified model. The eight models reviewed are the theory of reasoned action, the technology acceptance model, the motivational model, the theory of planned behavior, a model combining the technology acceptance model and the theory of planned behavior, the model of PC utilization, the innovation diffusion theory, and the social cognitive theory. Using data from four organizations over a six-month period with three points of measurement, the eight models explained between 17 percent and 53 percent of the variance in user intentions to use information technology. Next, a unified model, called the Unified Theory of Acceptance and Use of Technology (UTAUT), was formulated, with four core determinants of intention and usage, and up to four moderators of key relationships. UTAUT was then tested using the original data and found to outperform the eight individual models (adjusted R² of 69 percent). UTAUT was then confirmed with data from two new organizations with similar results (adjusted R² of 70 percent). UTAUT thus provides a useful tool for managers needing to assess the likelihood of success for new technology introductions and helps them understand the drivers of acceptance in order to proactively design interventions (including training, marketing, etc.) targeted at populations of users that may be less inclined to adopt and use new systems. The paper also makes several recommendations for future research including developing a deeper understanding of the dynamic influences studied here, refining measurement of the core constructs used in UTAUT, and understanding the organizational outcomes associated with new technology use.

27,798 citations

Book
21 Mar 2002
TL;DR: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. It begins with estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, advances to the analysis of linear and generalized linear models, and then introduces multivariate techniques, including classification and ordination.
Abstract: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.

9,509 citations

Journal Article
TL;DR: PLS-regression (PLSR) is the PLS approach in its simplest and, in chemistry and technology, most used form (two-block predictive PLS). It is a method for relating two data matrices, X and Y, by a linear multivariate model.

7,861 citations