Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments
Citations
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.
limma powers differential expression analyses for RNA-sequencing and microarray studies
Differential expression analysis for sequence count data.
References
Significance analysis of microarrays applied to the ionizing radiation response
limma: Linear Models for Microarray Data
Summaries of Affymetrix GeneChip probe level data
Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection
Variance stabilization applied to microarray data calibration and to the quantification of differential expression.
Related Papers (5)
Controlling the false discovery rate: a practical and powerful approach to multiple testing
Bioconductor: open software development for computational biology and bioinformatics
Frequently Asked Questions (17)
Q2. What is the simplest way to solve f(y) = x?
To avoid overflow or underflow in floating-point arithmetic, the authors can set $y = 1/\sqrt{x}$ when $x > 10^{7}$ and $y = 1/x$ when $x < 10^{-6}$ instead of performing the iteration.
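The shortcut can be sketched in code. The answer above does not say what f is or which iteration is meant, so the following assumes that f is the trigamma function and that the iteration is a Newton step applied to 1/f(y) = 1/x, starting from y = 0.5 + 1/x. The helper names `trigamma`, `tetragamma` and `trigamma_inverse`, the series cutoff at 6, the tolerance and the iteration cap are all illustrative choices, not taken from the source.

```python
import math


def trigamma(y):
    """Approximate psi'(y) for y > 0 using recurrence plus an asymptotic series."""
    total = 0.0
    while y < 6.0:  # shift upward: psi'(y) = psi'(y + 1) + 1 / y^2
        total += 1.0 / (y * y)
        y += 1.0
    r = 1.0 / y
    r2 = r * r
    series = r * (1.0 + 0.5 * r + r2 * (1 / 6 - r2 * (1 / 30 - r2 * (1 / 42 - r2 / 30))))
    return total + series


def tetragamma(y):
    """Approximate psi''(y) for y > 0 (term-by-term derivative of the series above)."""
    total = 0.0
    while y < 6.0:  # shift upward: psi''(y) = psi''(y + 1) - 2 / y^3
        total -= 2.0 / y ** 3
        y += 1.0
    r = 1.0 / y
    r2 = r * r
    series = -r2 * (1.0 + r + 0.5 * r2 - r2 * r2 * (1 / 6 - r2 * (1 / 6 - 0.3 * r2)))
    return total + series


def trigamma_inverse(x, tol=1e-8, max_iter=50):
    """Solve trigamma(y) = x for y, with the large-x and small-x shortcuts."""
    if x <= 0:
        raise ValueError("x must be positive")
    if x > 1e7:   # trigamma(y) behaves like 1 / y^2 as y -> 0
        return 1.0 / math.sqrt(x)
    if x < 1e-6:  # trigamma(y) behaves like 1 / y as y -> infinity
        return 1.0 / x
    y = 0.5 + 1.0 / x  # assumed starting value, at or above the root
    for _ in range(max_iter):
        tri = trigamma(y)
        step = tri * (1.0 - tri / x) / tetragamma(y)  # Newton step on 1 / trigamma
        y += step
        if -step / y < tol:  # steps are non-positive, so this is the relative change
            break
    return y
```

Under these assumptions, `trigamma_inverse(math.pi ** 2 / 6)` should return a value very close to 1, because trigamma(1) equals pi^2/6.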
Q3. What is the third step to reformulate the posterior odds statistic?
The third step is to reformulate the posterior odds statistic in terms of a moderated t-statistic in which posterior residual standard deviations are used in place of ordinary standard deviations.
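As a rough illustration of replacing ordinary standard deviations with posterior ones, the sketch below assumes the posterior variance is the degrees-of-freedom-weighted average of a prior variance and the sample variance, and that the moderated t divides the coefficient estimate by the posterior standard deviation times the square root of the unscaled variance. The function names and all numeric inputs are hypothetical.

```python
import math


def posterior_variance(s2, d, s0_sq, d0):
    """Weighted average of prior variance s0_sq (d0 df) and sample variance s2 (d df)."""
    return (d0 * s0_sq + d * s2) / (d0 + d)


def ordinary_t(beta_hat, v, s2):
    """Ordinary t-statistic using the sample standard deviation."""
    return beta_hat / (math.sqrt(s2) * math.sqrt(v))


def moderated_t(beta_hat, v, s2, d, s0_sq, d0):
    """Moderated t-statistic using the posterior standard deviation."""
    s_tilde = math.sqrt(posterior_variance(s2, d, s0_sq, d0))
    return beta_hat / (s_tilde * math.sqrt(v))


# Hypothetical gene with an unusually small sample variance.
beta_hat, v, s2, d = 1.0, 0.5, 0.01, 2
s0_sq, d0 = 0.05, 4  # hypothetical prior variance and prior degrees of freedom

t_ord = ordinary_t(beta_hat, v, s2)
t_mod = moderated_t(beta_hat, v, s2, d, s0_sq, d0)
print(f"ordinary t = {t_ord:.2f}, moderated t = {t_mod:.2f}")
```

In this made-up case the moderated statistic is smaller than the ordinary one, because the tiny sample variance is pulled up toward the prior value. With `d0 = 0` the two statistics coincide.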
Q4. What is the main goal of the Swirl experiment?
The main goal of the Swirl experiment is to identify genes with altered expression in the Swirl mutant compared to wild-type zebrafish.
Q5. What is the advantage of the moderated t statistic?
The advantage of the moderated t over the B-statistic is that $B_{gj}$ depends on the hyperparameters $v_{0j}$ and $p_j$ for all $j$ as well as on $d_0$ and $s_0^2$, whereas $\tilde{t}_{gj}$ depends only on $d_0$ and $s_0^2$.
Q6. What is the unscaled variance for the contrasts of interest?
The unscaled variance for the contrasts of interest is estimated to be $v_0 = 3.4$, meaning that the typical fold change for differentially expressed genes is estimated to be about 1.3.
Q7. What is the advantage of the moderated t inferential approach?
The moderated t inferential approach extends to accommodate tests involving two or more contrasts through the use of moderated F-statistics.
Q8. What is the estimate of v0?
Restricting to those values of $r$ for which $(r - 0.5)/(2G) < p$ also ensures that $p_{\text{target}} < 1$, so that the estimator of $v_0$ is defined.
Q9. What is the cumulative distribution function of $\tilde{t}_g$?
The cumulative distribution function of $\tilde{t}_g$ is $F(\tilde{t}_g; v_g, v_0, d_0 + d_g) = p\,F\!\left(\tilde{t}_g \left\{\frac{v_g}{v_g + v_0}\right\}^{1/2}; d_0 + d_g\right) + (1 - p)\,F(\tilde{t}_g; d_0 + d_g)$, where $F(\cdot\,; k)$ is the cumulative distribution function of the t-distribution on $k$ degrees of freedom.
Q10. What is the unscaled variance for the contrast?
The estimated unscaled variance for the contrast is $v_0 = 22.7$, meaning that the standard deviation of the log-ratio for a typical gene is $(0.0509)^{1/2}(22.7)^{1/2} = 1.07$, i.e., genes which are differentially expressed typically change by about two-fold.
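The arithmetic in this answer can be checked directly. The sketch below assumes that 0.0509 is the estimated prior variance (named `s0_sq` here) and that log-ratios are on the base-2 scale, so a standard deviation of about 1 corresponds to roughly a two-fold change. Both are assumptions about context that the answer itself does not state.

```python
import math

s0_sq = 0.0509  # assumed to be the estimated prior variance
v0 = 22.7       # estimated unscaled variance for the contrast

# Standard deviation of the log-ratio for a typical differentially expressed gene.
sd_log_ratio = math.sqrt(s0_sq) * math.sqrt(v0)

# Convert from the (assumed) log2 scale to a fold change.
fold_change = 2 ** sd_log_ratio

print(round(sd_log_ratio, 2), round(fold_change, 1))
```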
Q11. Why is the posterior variance $\tilde{s}_g^2$ offset?
This is because the posterior variance $\tilde{s}_g^2$ offsets small sample variances heavily in a relative sense, while larger sample variances are moderated to a lesser relative degree.
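One way to see this, assuming the posterior variance is the usual degrees-of-freedom-weighted average of the prior variance $s_0^2$ (with $d_0$ degrees of freedom) and the sample variance $s_g^2$ (with $d_g$ degrees of freedom), is to look at the ratio of posterior to sample variance. The formula below is a sketch under that assumption and is not quoted from the answer above.

```latex
\tilde{s}_g^2 = \frac{d_0 s_0^2 + d_g s_g^2}{d_0 + d_g}
\quad\Longrightarrow\quad
\frac{\tilde{s}_g^2}{s_g^2}
  = \frac{d_g}{d_0 + d_g} + \frac{d_0}{d_0 + d_g}\cdot\frac{s_0^2}{s_g^2}
```

The second term grows without bound as $s_g^2 \to 0$, so very small sample variances are inflated by a large factor, whereas for large $s_g^2$ the ratio settles near the constant $d_g/(d_0 + d_g)$.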
Q12. Is the covariance matrix assumed to be dependent on g?
If so, the covariance matrix is assumed to be evaluated at $\hat{\alpha}_g$, and the dependence is assumed to be such that it can be ignored to a first-order approximation. (Source: Smyth, Empirical Bayes Methods for Differential Expression. Published by The Berkeley Electronic Press, 2004.)
Q13. What is the purpose of this paper?
The purpose of this paper is to develop the hierarchical model of Lönnstedt and Speed (2002) into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples.
Q14. How many different facilities have been used to test the Limma software?
The Limma software has been tested on a wide range of microarray data sets from many different facilities and has been used routinely at the author’s institution since the middle of 2002.
Q15. What is the coefficient of contrast for design (a)?
The regression coefficient here estimates the contrast B − A on the log-scale, just as for design (a), but with two arrays there is one degree of freedom for error.
Q16. What is the current paper designed to do?
The single linear model approach assumes equal variances across all genes, whereas the current paper is designed to accommodate different variances.
Q17. What is the limit for $v_{0j}^{1/2} s_0$?
In the software package Limma, which implements the methods in this paper, the user is allowed to place limits on the possible values for $v_{0j}^{1/2} s_0$.