Author

Jun Yan

Bio: Jun Yan is an academic researcher from the University of Connecticut. The author has contributed to research on topics including Estimator and Parametric statistics. The author has an h-index of 31 and has co-authored 150 publications receiving 5165 citations. Previous affiliations of Jun Yan include University of Wisconsin-Madison and University of Auckland.


Papers
Journal Article
TL;DR: The core features of the R package geepack are described, which implements the generalized estimating equations (GEE) approach for fitting marginal generalized linear models to clustered data, through an example of clustered binary data.
Abstract: This paper describes the core features of the R package geepack, which implements the generalized estimating equations (GEE) approach for fitting marginal generalized linear models to clustered data. Clustered data arise in many applications such as longitudinal data and repeated measures. The GEE approach focuses on models for the mean of the correlated observations within clusters without fully specifying the joint distribution of the observations. It has been widely used in statistical practice. This paper illustrates the application of the GEE approach with geepack through an example of clustered binary data.

1,785 citations

Journal Article
TL;DR: The R package copula provides a carefully designed and easily extensible platform for multivariate modeling with copulas in R, with methods for density/distribution evaluation, random number generation, and graphical display.
Abstract: Copulas have become a popular tool in multivariate modeling successfully applied in many fields. A good open-source implementation of copulas is much needed for more practitioners to enjoy the joy of copulas. This article presents the design, features, and some implementation details of the R package copula. The package provides a carefully designed and easily extensible platform for multivariate modeling with copulas in R. S4 classes for most frequently used elliptical copulas and Archimedean copulas are implemented, with methods for density/distribution evaluation, random number generation, and graphical display. Fitting copula-based models with maximum likelihood method is provided as template examples. With the classes and methods in the package, the package can be easily extended by user-defined copulas and margins to solve problems.

456 citations

Journal Article
TL;DR: This paper investigates generalized estimating equations for association parameters, which are frequently of interest in family studies, with emphasis on covariance estimation, and finds that the formula for the approximate jackknife variance estimator in Ziegler et al. is deficient, resulting in systematic deviations from the fully iterated jackknife variance estimator.
Abstract: This paper investigates generalized estimating equations for association parameters, which are frequently of interest in family studies, with emphasis on covariance estimation. Separate link functions are used to connect the mean, the scale, and the correlation to linear predictors involving possibly different sets of covariates, and separate estimating equations are proposed for the three sets of parameters. Simulations show that the robust 'sandwich' variance estimator and the jackknife variance estimator for the correlation parameters are generally close to the empirical variance for the sample size of 50 clusters. The results contradict Ziegler et al. and Kastner and Ziegler, where the 'sandwich' estimator obtained from the software MAREG was shown to be unsuitable for practical usage. The problem appears to arise because the MAREG variance estimator does not account for variability in estimation of the scale parameters, but may be valid with fixed scale. We also find that the formula for the approximate jackknife variance estimator in Ziegler et al. is deficient, resulting in systematic deviations from the fully iterated jackknife variance estimator. A general jackknife formula is provided and performs well in numerical studies. Data from a study on the genetics of alcoholism is used to illustrate the importance of reliable variance estimation in biomedical applications.

415 citations

Journal Article
TL;DR: The copula-based modeling of multivariate distributions with continuous margins is presented as a succession of rank-based tests: a multivariate test of randomness followed by a test of mutual independence and a series of goodness-of-fit tests.
Abstract: The copula-based modeling of multivariate distributions with continuous margins is presented as a succession of rank-based tests: a multivariate test of randomness followed by a test of mutual independence and a series of goodness-of-fit tests. All the tests under consideration are based on the empirical copula, which is a nonparametric rank-based estimator of the true unknown copula. The principles of the tests are recalled and their implementation in the copula R package is briefly described. Their use in the construction of a copula model from data is thoroughly illustrated on real insurance and financial data.

392 citations

Journal Article
TL;DR: Self-reported empathy for patients, a possibly critical factor in high-quality patient-centered care, wanes as students advance in clinical training, particularly among those entering technology-oriented specialties.
Abstract: Background: Empathy is important in the physician–patient relationship. Prior studies suggest that medical student empathy declines with clinical training. Aims: We examined the trend of empathy longitudinally; determined differences in empathy according to gender and medical specialty preferences; and determined empathy and career preference differences among students admitted through different medical school admission pathways. Method: The data for this study were collected using a longitudinal cohort design and included 2652 observations nested within 1162 individuals. Participants were medical students at a university-based medical school surveyed yearly from 2007 through 2010. Empathy was measured by the Jefferson Scale of Physician Empathy-Student Version (JSPE-S), a validated, 20-item self-administered questionnaire. Predictors of JSPE-S scores included gender, age, anticipated financial debt upon graduation and future career interest. Results: Empathy scores of students in preclinical years were high...

220 citations


Cited by
Book
01 Jan 2009

8,216 citations

Journal Article

6,278 citations

01 Jan 2016

5,249 citations

Journal Article
TL;DR: In this article, a new implementation of hurdle and zero-inflated regression models in the functions hurdle() and zeroinfl() from the package pscl is introduced, which reuses design and functionality of the basic R functions just as the underlying conceptual tools extend the classical models.
Abstract: The classical Poisson, geometric and negative binomial regression models for count data belong to the family of generalized linear models and are available at the core of the statistics toolbox in the R system for statistical computing. After reviewing the conceptual and computational features of these methods, a new implementation of hurdle and zero-inflated regression models in the functions hurdle() and zeroinfl() from the package pscl is introduced. It re-uses design and functionality of the basic R functions just as the underlying conceptual tools extend the classical models. Both hurdle and zero-inflated models are able to incorporate over-dispersion and excess zeros (two problems that typically occur in count data sets in economics and the social sciences) better than their classical counterparts. Using cross-section data on the demand for medical care, it is illustrated how the classical as well as the zero-augmented models can be fitted, inspected and tested in practice.

1,971 citations

Journal Article
TL;DR: The core features of the R package geepack are described, which implements the generalized estimating equations (GEE) approach for fitting marginal generalized linear models to clustered data, through an example of clustered binary data.
Abstract: This paper describes the core features of the R package geepack, which implements the generalized estimating equations (GEE) approach for fitting marginal generalized linear models to clustered data. Clustered data arise in many applications such as longitudinal data and repeated measures. The GEE approach focuses on models for the mean of the correlated observations within clusters without fully specifying the joint distribution of the observations. It has been widely used in statistical practice. This paper illustrates the application of the GEE approach with geepack through an example of clustered binary data.

1,785 citations