Author

Kung Yee Liang

Bio: Kung Yee Liang is an academic researcher at Johns Hopkins University. The author has contributed to research on the topics Nuisance parameter and Estimator, has an h-index of 29, and has co-authored 50 publications that have received 20,733 citations. Previous affiliations of Kung Yee Liang include National Yang-Ming University and National Health Research Institutes.

Papers
Journal Article
TL;DR: In this article, an extension of generalized linear models to the analysis of longitudinal data is proposed, which gives consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence.
Abstract: SUMMARY This paper proposes an extension of generalized linear models to the analysis of longitudinal data. We introduce a class of estimating equations that give consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence. The estimating equations are derived without specifying the joint distribution of a subject's observations yet they reduce to the score equations for multivariate Gaussian outcomes. Asymptotic theory is presented for the general class of estimators. Specific cases in which we assume independence, m-dependence and exchangeable correlation structures from each subject are discussed. Efficiency of the proposed estimators in two simple situations is considered. The approach is closely related to quasi-likelihood. Some key words: Estimating equation; Generalized linear model; Longitudinal data; Quasi-likelihood; Repeated measures.
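
For orientation, the estimating equations described in this abstract are usually written as below. This is a sketch in conventional notation that is not taken from the abstract itself: K subjects, outcome vector Y_i with mean mu_i(beta), a diagonal matrix A_i of variance functions, a working correlation matrix R(alpha), and a scale parameter phi (here in the 1/phi convention) are all assumed symbols.

```latex
% Assumed notation: K subjects, Y_i the outcome vector of subject i,
% \mu_i(\beta) its mean, R(\alpha) the working correlation matrix.
\begin{aligned}
U(\beta) &= \sum_{i=1}^{K} D_i^{\top} V_i^{-1}\,\bigl\{Y_i - \mu_i(\beta)\bigr\} = 0,
\qquad D_i = \frac{\partial \mu_i}{\partial \beta^{\top}}, \\
V_i &= \frac{1}{\phi}\, A_i^{1/2}\, R(\alpha)\, A_i^{1/2}, \\
\widehat{\operatorname{cov}}(\hat\beta) &= M_0^{-1} M_1 M_0^{-1},
\qquad M_0 = \sum_{i=1}^{K} D_i^{\top} V_i^{-1} D_i,
\qquad M_1 = \sum_{i=1}^{K} D_i^{\top} V_i^{-1} S_i S_i^{\top} V_i^{-1} D_i,
\end{aligned}
```

with S_i = Y_i - mu_i evaluated at the estimate. The "sandwich" form of the covariance is what allows the variance estimate to remain consistent when the working correlation R(alpha) is misspecified.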

17,111 citations

Journal Article
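TL;DR: In this article, the authors derive the asymptotic distributions of maximum likelihood estimators and likelihood ratio statistics when the true parameter value may lie on the boundary of the parameter space, and show that the limiting distribution of the maximum likelihood estimator is the same as the distribution of a projection of a Gaussian random variable.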
Abstract: Large sample properties of the likelihood function when the true parameter value may be on the boundary of the parameter space are described. Specifically, the asymptotic distributions of maximum likelihood estimators and likelihood ratio statistics are derived. These results generalize the work of Moran (1971), Chant (1974), and Chernoff (1954). Some of Chant's results are shown to be incorrect. The approach used in deriving these results follows from comments made by Moran and Chant. The problem is shown to be asymptotically equivalent to the problem of estimating the restricted mean of a multivariate Gaussian distribution from a sample of size 1. In this representation the Gaussian random variable corresponds to the limit of the normalized score statistic and the estimate of the mean corresponds to the limit of the normalized maximum likelihood estimator. Thus the limiting distribution of the maximum likelihood estimator is the same as the distribution of the projection of the Gaussian random v...
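
A familiar special case of boundary asymptotics can be sketched in a few lines. The abstract does not spell this case out; the sketch assumes a single parameter tested at the boundary with no nuisance parameters on the boundary, where the likelihood ratio statistic is asymptotically a 50:50 mixture of a point mass at zero and a chi-squared distribution with one degree of freedom. The function names are hypothetical.

```python
import math


def chi2_1_sf(x: float) -> float:
    """Upper-tail probability P(chi2_1 > x) for x >= 0.

    Uses chi2_1 = Z**2 with Z standard normal, so
    P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_pvalue(lr_stat: float) -> float:
    """p-value under the assumed 0.5*chi2_0 + 0.5*chi2_1 mixture.

    A statistic of zero (or below, from numerical noise) gives p = 1,
    because the mixture puts half of its mass exactly at zero.
    """
    if lr_stat <= 0.0:
        return 1.0
    return 0.5 * chi2_1_sf(lr_stat)
```

Under this assumption the p-value is half of the naive chi-squared(1) p-value, so the naive test is conservative at the boundary.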

2,564 citations

Journal Article
TL;DR: In this paper, the authors considered extensions of logistic regression to the case where the binary outcome variable is observed repeatedly for each subject and proposed two working models that lead to consistent estimates of the regression parameters and of their variances under mild assumptions about the time dependence within each subject's data.
Abstract: SUMMARY This paper considers extensions of logistic regression to the case where the binary outcome variable is observed repeatedly for each subject. We propose two working models that lead to consistent estimates of the regression parameters and of their variances under mild assumptions about the time dependence within each subject's data. The efficiency of the proposed estimators is examined. An analysis of stress in mothers with infants is presented to illustrate the proposed method.

137 citations

Journal Article
TL;DR: In this paper, a class of conditional logistic regression models for clustered binary data is considered, including the polychotomous logistic model of Rosner (1984) as a special case.
Abstract: SUMMARY A class of conditional logistic regression models for clustered binary data is considered. This includes the polychotomous logistic model of Rosner (1984) as a special case. Properties such as the joint distribution and pairwise odds ratio are investigated. A class of easily computed estimating functions is introduced which is shown to have high efficiency compared to the computationally intensive maximum likelihood approach. An example on chronic obstructive pulmonary disease among sibs is presented for illustration.

131 citations

Journal Article
TL;DR: Results presented here demonstrate that case-control designs can be used to detect gene-environment interaction when there is both a common exposure and a highly polymorphic marker of susceptibility.
Abstract: As genetic markers become more available, case-control studies will be increasingly important in defining the role of genetic factors in disease causality. The authors estimate the minimum sample size needed to assure adequate statistical power to detect gene-environment interaction. One assumption is made: the prevalence of exposure is independent of marker genotypes among controls. Given the assumption, six parameters (three odds ratios, the prevalence of exposure, the proportion of those with the susceptible genotype, and the ratio of controls to cases) dictate the expected cell sizes in a 2 x 2 x 2 table contrasting genetic susceptibility, exposure, and disease. The three odds ratios reflect the association between disease and 1) exposure among non-susceptibles; 2) susceptible genotypes among nonexposed individuals; and 3) the gene-environment interaction itself, respectively. Given these parameters, the number of cases and controls needed to assure any particular Type I and Type II error rates can be estimated. Results presented here demonstrate that case-control designs can be used to detect gene-environment interaction when there is both a common exposure and a highly polymorphic marker of susceptibility.
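
The way six parameters determine the expected 2 x 2 x 2 table can be illustrated with a short sketch. It is not the authors' procedure: it assumes that the exposure prevalence and genotype proportion are those among controls, that the interaction is multiplicative on the odds ratio scale (so the jointly exposed and susceptible cell has odds ratio `or_e * or_g * or_int`), and that the distribution among cases is proportional to the control distribution times the odds ratio. All function and parameter names are hypothetical.

```python
def expected_cells(n_cases, controls_per_case, p_exposure, p_genotype,
                   or_e, or_g, or_int):
    """Expected case and control counts for each (exposed, susceptible) cell."""
    cells = [(0, 0), (1, 0), (0, 1), (1, 1)]

    # Controls: exposure and genotype are independent (the stated assumption).
    control_p = {}
    for exposed, susceptible in cells:
        pe = p_exposure if exposed else 1.0 - p_exposure
        pg = p_genotype if susceptible else 1.0 - p_genotype
        control_p[(exposed, susceptible)] = pe * pg

    # Odds ratio of each cell relative to the unexposed, non-susceptible cell
    # (multiplicative interaction is an assumption of this sketch).
    odds_ratio = {
        (0, 0): 1.0,
        (1, 0): or_e,
        (0, 1): or_g,
        (1, 1): or_e * or_g * or_int,
    }

    # Cases: control distribution reweighted by the odds ratio, renormalized.
    weights = {c: control_p[c] * odds_ratio[c] for c in cells}
    total = sum(weights.values())

    n_controls = n_cases * controls_per_case
    return {
        c: {
            "cases": n_cases * weights[c] / total,
            "controls": n_controls * control_p[c],
        }
        for c in cells
    }
```

For example, `expected_cells(300, 2, 0.5, 0.5, 2.0, 1.0, 1.0)` describes 300 cases and 600 controls with no interaction. Power for a chosen test of `or_int` would then be computed from such a table, which this sketch does not attempt.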

124 citations


Cited by
Journal Article
TL;DR: In this article, an extension of generalized linear models to the analysis of longitudinal data is proposed, which gives consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence.
Abstract: SUMMARY This paper proposes an extension of generalized linear models to the analysis of longitudinal data. We introduce a class of estimating equations that give consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence. The estimating equations are derived without specifying the joint distribution of a subject's observations yet they reduce to the score equations for multivariate Gaussian outcomes. Asymptotic theory is presented for the general class of estimators. Specific cases in which we assume independence, m-dependence and exchangeable correlation structures from each subject are discussed. Efficiency of the proposed estimators in two simple situations is considered. The approach is closely related to quasi-likelihood. Some key words: Estimating equation; Generalized linear model; Longitudinal data; Quasi-likelihood; Repeated measures.

17,111 citations

Journal Article
TL;DR: PAML, currently in version 4, is a package of programs for phylogenetic analyses of DNA and protein sequences using maximum likelihood (ML), which can be used to estimate parameters in models of sequence evolution and to test interesting biological hypotheses.
Abstract: PAML, currently in version 4, is a package of programs for phylogenetic analyses of DNA and protein sequences using maximum likelihood (ML). The programs may be used to compare and test phylogenetic trees, but their main strengths lie in the rich repertoire of evolutionary models implemented, which can be used to estimate parameters in models of sequence evolution and to test interesting biological hypotheses. Uses of the programs include estimation of synonymous and nonsynonymous rates (dS and dN) between two protein-coding DNA sequences, inference of positive Darwinian selection through phylogenetic comparison of protein-coding genes, reconstruction of ancestral genes and proteins for molecular restoration studies of extinct life forms, combined analysis of heterogeneous data sets from multiple gene loci, and estimation of species divergence times incorporating uncertainties in fossil calibrations. This note discusses some of the major applications of the package, which includes example data sets to demonstrate their use. The package is written in ANSI C, and runs under Windows, Mac OSX, and UNIX systems. It is available at http://abacus.gene.ucl.ac.uk/software/paml.html.

10,773 citations

Book
21 Mar 2002
TL;DR: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. It covers both classical and Bayesian philosophies of estimation and hypothesis testing before advancing to the analysis of linear and generalized linear models, including linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced.
Abstract: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.

9,509 citations

Journal Article
TL;DR: In this article, the authors examine the different methods used in the literature and explain when the different approaches yield the same (and correct) standard errors and when they diverge, and give researchers guidance for their use.
Abstract: In both corporate finance and asset pricing empirical work, researchers are often confronted with panel data. In these data sets, the residuals may be correlated across firms and across time, and OLS standard errors can be biased. Historically, the two literatures have used different solutions to this problem. Corporate finance has relied on clustered standard errors, while asset pricing has used the Fama-MacBeth procedure to estimate standard errors. This paper examines the different methods used in the literature and explains when the different methods yield the same (and correct) standard errors and when they diverge. The intent is to provide intuition as to why the different approaches sometimes give different answers and give researchers guidance for their use.
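
The gap between conventional and clustered standard errors can be seen in a minimal sketch for a single-regressor OLS fit with an intercept. It is not the paper's code: it uses the plain sandwich formula with no finite-sample correction, it does not implement the Fama-MacBeth procedure, and the function name `slope_and_ses` is hypothetical.

```python
import math
from collections import defaultdict


def slope_and_ses(x, y, clusters):
    """Return (slope, conventional OLS SE, cluster-robust SE) for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    x_dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in x_dev)

    slope = sum(d * (yi - y_bar) for d, yi in zip(x_dev, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Conventional SE: assumes independent, homoskedastic residuals.
    ols_se = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Cluster-robust SE: sum the score contributions within each cluster
    # first, so within-cluster residual correlation is not averaged away.
    scores = defaultdict(float)
    for d, e, c in zip(x_dev, resid, clusters):
        scores[c] += d * e
    cluster_se = math.sqrt(sum(s * s for s in scores.values())) / sxx

    return slope, ols_se, cluster_se
```

If every observation is its own cluster, the clustered formula reduces to the heteroskedasticity-robust one. When observations within a cluster share the same regressor value and residual, the clustered standard error grows, reflecting that the repeated observations carry less independent information.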

7,647 citations

Book
01 Jan 2006
TL;DR: In this book, detailed, worked-through examples drawn from psychology, management, and sociology studies illustrate the procedures, pitfalls, and extensions of CFA methodology.
Abstract: "With its emphasis on practical and conceptual aspects, rather than mathematics or formulas, this accessible book has established itself as the go-to resource on confirmatory factor analysis (CFA). Detailed, worked-through examples drawn from psychology, management, and sociology studies illustrate the procedures, pitfalls, and extensions of CFA methodology. The text shows how to formulate, program, and interpret CFA models using popular latent variable software packages (LISREL, Mplus, EQS, SAS/CALIS); understand the similarities and differences between CFA and exploratory factor analysis (EFA); and report results from a CFA study. It is filled with useful advice and tables that outline the procedures. The companion website offers data and program syntax files for most of the research examples, as well as links to CFA-related resources. New to This Edition *Updated throughout to incorporate important developments in latent variable modeling. *Chapter on Bayesian CFA and multilevel measurement models. *Addresses new topics (with examples): exploratory structural equation modeling, bifactor analysis, measurement invariance evaluation with categorical indicators, and a new method for scaling latent variables. *Utilizes the latest versions of major latent variable software packages"--

7,620 citations