Author

Daniel Gianola

Bio: Daniel Gianola is an academic researcher from the University of California, Santa Barbara. The author has contributed to research in topics including Population and Best linear unbiased prediction. The author has an h-index of 75 and has co-authored 435 publications receiving 22,214 citations. Previous affiliations of Daniel Gianola include Institut national de la recherche agronomique and the University of Illinois at Urbana–Champaign.


Papers
Book
01 Dec 2010
TL;DR: A book-length review of probability and distribution theory, followed by a systematic development of likelihood and Bayesian methods of inference, including linear, mixed linear, and logistic models.
Abstract: Table of Contents. Preface.
Part I: Review of Probability and Distribution Theory.
1 Probability and Random Variables: 1.1 Introduction; 1.2 Univariate Discrete Distributions (1.2.1 The Bernoulli and Binomial Distributions; 1.2.2 The Poisson Distribution; 1.2.3 Binomial Distribution: Normal Approximation); 1.3 Univariate Continuous Distributions (1.3.1 The Uniform, Beta, Gamma, Normal, and Student-t Distributions); 1.4 Multivariate Probability Distributions (1.4.1 The Multinomial Distribution; 1.4.2 The Dirichlet Distribution; 1.4.3 The d-Dimensional Uniform Distribution; 1.4.4 The Multivariate Normal Distribution; 1.4.5 The Chi-square Distribution; 1.4.6 The Wishart and Inverse Wishart Distributions; 1.4.7 The Multivariate-t Distribution); 1.5 Distributions with Constrained Sample Space; 1.6 Iterated Expectations.
2 Functions of Random Variables: 2.1 Introduction; 2.2 Functions of a Single Random Variable (2.2.1 Discrete Random Variables; 2.2.2 Continuous Random Variables; 2.2.3 Approximating the Mean and Variance; 2.2.4 Delta Method); 2.3 Functions of Several Random Variables (2.3.1 Linear Transformations; 2.3.2 Approximating the Mean and Covariance Matrix).
Part II: Methods of Inference.
3 An Introduction to Likelihood Inference: 3.1 Introduction; 3.2 The Likelihood Function; 3.3 The Maximum Likelihood Estimator; 3.4 Likelihood Inference in a Gaussian Model; 3.5 Fisher's Information Measure (3.5.1 Single Parameter Case; 3.5.2 Alternative Representation of Information; 3.5.3 Mean and Variance of the Score Function; 3.5.4 Multiparameter Case; 3.5.5 Cramer-Rao Lower Bound); 3.6 Sufficiency; 3.7 Asymptotic Properties: Single Parameter Models (3.7.1 Probability of the Data Given the Parameter; 3.7.2 Consistency; 3.7.3 Asymptotic Normality and Efficiency); 3.8 Asymptotic Properties: Multiparameter Models; 3.9 Functional Invariance (3.9.1 Illustration of Functional Invariance; 3.9.2 Invariance in a Single Parameter Model; 3.9.3 Invariance in a Multiparameter Model).
4 Further Topics in Likelihood Inference: 4.1 Introduction; 4.2 Computation of Maximum Likelihood Estimates; 4.3 Evaluation of Hypotheses (4.3.1 Likelihood Ratio Tests; 4.3.2 Confidence Regions; 4.3.3 Wald's Test; 4.3.4 Score Test); 4.4 Nuisance Parameters (4.4.1 Loss of Efficiency Due to Nuisance Parameters; 4.4.2 Marginal Likelihoods; 4.4.3 Profile Likelihoods); 4.5 Analysis of a Multinomial Distribution (4.5.1 Amount of Information per Observation); 4.6 Analysis of Linear Logistic Models (4.6.1 The Logistic Distribution; 4.6.2 Likelihood Function under Bernoulli Sampling; 4.6.3 Mixed Effects Linear Logistic Model).
5 An Introduction to Bayesian Inference: 5.1 Introduction; 5.2 Bayes Theorem: Discrete Case; 5.3 Bayes Theorem: Continuous Case; 5.4 Posterior Distributions; 5.5 Bayesian Updating; 5.6 Features of Posterior Distributions (5.6.1 Posterior Probabilities; 5.6.2 Posterior Quantiles; 5.6.3 Posterior Modes; 5.6.4 Posterior Mean Vector and Covariance Matrix).
6 Bayesian Analysis of Linear Models: 6.1 Introduction; 6.2 The Linear Regression Model (6.2.1 Inference under Uniform Improper Priors; 6.2.2 Inference under Conjugate Priors; 6.2.3 Orthogonal Parameterization of the Model); 6.3 The Mixed Linear Model (6.3.1 Bayesian View of the Mixed Effects Model; 6.3.2 Joint and Conditional Posterior Distributions; 6.3.3 Marginal Distribution of Variance Components; 6.3.4 Marginal Distribution of Location Parameters).
7 The Prior Distribution and Bayesian Analysis: 7.1 Introduction; 7.2 An Illustration of the Effect of Priors on Inferences; 7.3 A Rapid Tour of Bayesian Asymptotics (7.3.1 Discrete Parameter; 7.3.2 Continuous Parameter); 7.4 Statistical Information and Entropy (7.4.1 Information; 7.4.2 Entropy of a Discrete ...).
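As a minimal numerical illustration of the likelihood machinery covered in Part II (Gaussian maximum likelihood and Fisher information, Sections 3.4-3.5), one might write the following sketch; the code and simulated data are ours, not the book's.

```python
import numpy as np

# Maximum likelihood for an i.i.d. Gaussian sample y_1..y_n (cf. Section 3.4).
# The MLEs have closed forms: mu_hat = sample mean, sigma2_hat = mean squared
# deviation (divisor n, not n - 1).
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.5, size=200)

n = y.size
mu_hat = y.mean()
sigma2_hat = ((y - mu_hat) ** 2).mean()

# Fisher information for mu is n / sigma^2 (cf. Section 3.5), so the
# asymptotic standard error of mu_hat is sqrt(sigma2_hat / n).
se_mu = np.sqrt(sigma2_hat / n)
print(f"mu_hat = {mu_hat:.3f} +/- {se_mu:.3f}, sigma2_hat = {sigma2_hat:.3f}")
```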

690 citations

Journal ArticleDOI
01 Oct 2010-Genetics
TL;DR: Parametric and semiparametric models for GS were evaluated using wheat and maize data in which different traits were measured in several environmental conditions; the results indicate that models including marker information had higher predictive ability than pedigree-based models.
Abstract: The availability of dense molecular markers has made possible the use of genomic selection (GS) for plant breeding. However, the evaluation of models for GS in real plant populations is very limited. This article evaluates the performance of parametric and semiparametric models for GS using wheat (Triticum aestivum L.) and maize (Zea mays) data in which different traits were measured in several environmental conditions. The findings, based on extensive cross-validations, indicate that models including marker information had higher predictive ability than pedigree-based models. In the wheat data set, and relative to a pedigree model, gains in predictive ability due to inclusion of markers ranged from 7.7 to 35.7%. Correlation between observed and predictive values in the maize data set achieved values up to 0.79. Estimates of marker effects were different across environmental conditions, indicating that genotype × environment interaction is an important component of genetic variability. These results indicate that GS in plant breeding can be an effective strategy for selecting among lines whose phenotypes have yet to be observed.
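As a sketch of the cross-validation protocol used to score predictive ability (correlation between observed and predicted values in held-out folds), one could proceed as below; the ridge-regression stand-in, the synthetic marker data, and the scikit-learn calls are our choices, not the article's actual models.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

# Synthetic stand-in for a genomic-selection data set: n lines, p markers.
rng = np.random.default_rng(1)
n, p = 300, 1000
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # marker genotypes 0/1/2
beta = rng.normal(0, 0.05, size=p)                   # true marker effects
y = X @ beta + rng.normal(0, 1.0, size=n)            # phenotype = signal + noise

# Predictive ability = correlation(observed, predicted) over CV folds,
# mirroring how the article scores competing models.
preds = np.empty(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = Ridge(alpha=100.0).fit(X[train], y[train])
    preds[test] = model.predict(X[test])
print("predictive ability:", np.corrcoef(y, preds)[0, 1].round(3))
```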

676 citations

Journal ArticleDOI
TL;DR: A method of evaluation of ordered categorical responses is presented, where the probability of response in a given category follows a normal integral with an argument dependent on fixed thresholds and random variables sampled from a conceptual distribution with known first and second moments.
Abstract: A method of evaluation of ordered categorical responses is presented. The probability of response in a given category follows a normal integral with an argument dependent on fixed thresholds and random variables sampled from a conceptual distribution with known first and second moments, a priori. The prior distribution and the likelihood function are combined to yield the posterior density from which inferences are made. The mode of the posterior distribution is taken as an estimator of location. Finding this mode entails solving a non-linear system; estimation equations are presented. Relationships of the procedure to "generalized linear models" and "normal scores" are discussed. A numerical example involving sire evaluation for calving ease is used to illustrate the method.
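The "normal integral" statement above is the classical threshold model; in symbols (notation ours, not copied from the paper), with Phi the standard normal CDF:

```latex
% Threshold model for an ordered categorical response Y with C categories:
% fixed thresholds t_0 = -\infty < t_1 < \dots < t_{C-1} < t_C = +\infty and
% a linear predictor \eta built from fixed effects and random effects.
\Pr(Y = j \mid \eta) = \Phi(t_j - \eta) - \Phi(t_{j-1} - \eta),
\qquad j = 1, \dots, C.
```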

605 citations

Journal ArticleDOI
01 May 2009-Genetics
TL;DR: This article adapts the Bayesian least absolute shrinkage and selection operator (LASSO) to arrive at a regression model where markers, pedigrees, and covariates other than markers are considered jointly, and results indicate that inclusion of markers in the regression further improved the predictive ability of models.
Abstract: The availability of genomewide dense markers brings opportunities and challenges to breeding programs. An important question concerns the ways in which dense markers and pedigrees, together with phenotypic records, should be used to arrive at predictions of genetic values for complex traits. If a large number of markers are included in a regression model, marker-specific shrinkage of regression coefficients may be needed. For this reason, the Bayesian least absolute shrinkage and selection operator (LASSO) (BL) appears to be an interesting approach for fitting marker effects in a regression model. This article adapts the BL to arrive at a regression model where markers, pedigrees, and covariates other than markers are considered jointly. Connections between BL and other marker-based regression models are discussed, and the sensitivity of BL with respect to the choice of prior distributions assigned to key parameters is evaluated using simulation. The proposed model was fitted to two data sets from wheat and mouse populations, and evaluated using cross-validation methods. Results indicate that inclusion of markers in the regression further improved the predictive ability of models. An R program that implements the proposed model is freely available.
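The article builds on the standard Bayesian LASSO hierarchy of Park and Casella, in which marker-specific shrinkage arises because each effect carries its own variance; the rendering below uses our notation, and the article's joint extension to pedigrees and other covariates adds terms not shown.

```latex
% Standard Bayesian LASSO prior hierarchy (notation ours): each marker
% effect \beta_j has its own variance \tau_j^2; integrating \tau_j^2 out
% yields a double-exponential (Laplace) marginal prior, i.e., LASSO-type
% marker-specific shrinkage.
\beta_j \mid \sigma^2, \tau_j^2 \sim N\!\left(0, \tau_j^2 \sigma^2\right),
\qquad
\tau_j^2 \mid \lambda \sim \mathrm{Exp}\!\left(\tfrac{\lambda^2}{2}\right),
\qquad j = 1, \dots, p.
```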

550 citations

Journal ArticleDOI
18 Dec 2009-Science
TL;DR: An experimental investigation of stress-driven grain boundary migration, manifested as grain growth in nanocrystalline aluminum thin films, indicates that shear stresses drive grain boundaries to move in a manner consistent with recent molecular dynamics simulations and theoretical predictions of coupled grain boundary migration.
Abstract: In crystalline materials, plastic deformation occurs by the motion of dislocations, and the regions between individual crystallites, called grain boundaries, act as obstacles to dislocation motion. Grain boundaries are widely envisaged to be mechanically static structures, but this report outlines an experimental investigation of stress-driven grain boundary migration manifested as grain growth in nanocrystalline aluminum thin films. Specimens fabricated with specially designed stress and strain concentrators are used to uncover the relative importance of these parameters on grain growth. In contrast to traditional descriptions of grain boundaries as stationary obstacles to dislocation-based plasticity, the results of this study indicate that shear stresses drive grain boundaries to move in a manner consistent with recent molecular dynamics simulations and theoretical predictions of coupled grain boundary migration.

520 citations


Cited by
01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently: those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers: the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 times faster than a single Cray C90 processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.
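As a toy serial illustration of the third (spatial-decomposition) idea: binning atoms into cells at least as wide as the force cutoff confines each atom's short-range neighbors to its own cell and the adjacent ones. The sketch below is ours and deliberately ignores the paper's parallel message-passing machinery.

```python
import numpy as np

# Bin atoms into cells whose side is >= the interaction cutoff, so the
# short-range neighbors of any atom can only live in the 27 cells around it.
def build_cells(pos, box, cutoff):
    ncell = max(1, int(box // cutoff))            # cells per dimension
    side = box / ncell
    cells = {}
    for i, p in enumerate(pos):
        key = tuple((p // side).astype(int) % ncell)
        cells.setdefault(key, []).append(i)
    return cells, ncell

rng = np.random.default_rng(2)
box, cutoff = 10.0, 2.5                           # Lennard-Jones-style cutoff
pos = rng.uniform(0, box, size=(500, 3))
cells, ncell = build_cells(pos, box, cutoff)

# Candidate neighbors of atom 0: its home cell plus the 26 surrounding cells,
# with periodic wrap-around at the box edges.
home = tuple((pos[0] // (box / ncell)).astype(int) % ncell)
neighbors = [j
             for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)
             for j in cells.get(((home[0] + dx) % ncell,
                                 (home[1] + dy) % ncell,
                                 (home[2] + dz) % ncell), [])]
print(len(neighbors), "candidate neighbors for atom 0")
```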

29,323 citations

28 Jul 2005
TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade host immune responses. Plasmodium falciparum erythrocyte membrane protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family of each haploid genome encodes about 60 members; initiating transcription of different var gene variants provides the molecular basis for antigenic variation.

18,940 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
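The mail-filtering example maps directly onto a supervised text classifier. A minimal sketch, assuming a bag-of-words naive Bayes model (the essay itself names no algorithm) and invented messages:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy version of the per-user mail filter described above: learn "filtering
# rules" from messages the user has already kept or rejected. The tiny data
# set and labels are invented for illustration.
kept = ["quarterly report attached", "lunch on friday?", "project deadline moved"]
rejected = ["win a free prize now", "cheap loans guaranteed", "click here to claim"]

texts = kept + rejected
labels = [0] * len(kept) + [1] * len(rejected)    # 1 = reject (filter out)

vec = CountVectorizer()
X = vec.fit_transform(texts)                      # bag-of-words counts
clf = MultinomialNB().fit(X, labels)

# As the user keeps labeling mail, refitting keeps the rules up to date.
print(clf.predict(vec.transform(["claim your free prize"])))  # -> [1]
```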

13,246 citations

Journal Article
Fumio Tajima
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations