Journal Article

Optimal factorial designs for cDNA microarray experiments

TL;DR: The authors consider cDNA microarray experiments in which the cell populations have a factorial structure, and investigate optimal design under a baseline parametrization, where the objects of interest differ from those under the more common orthogonal parametrization.
Abstract: We consider cDNA microarray experiments when the cell populations have a factorial structure, and investigate the problem of their optimal designing under a baseline parametrization where the objects of interest differ from those under the more common orthogonal parametrization. First, analytical results are given for the $2\times 2$ factorial. Since practical applications often involve a more complex factorial structure, we next explore general factorials and obtain a collection of optimal designs in the saturated, that is, most economic, case. This, in turn, is seen to yield an approach for finding optimal or efficient designs in the practically more important nearly saturated cases. Thereafter, the findings are extended to the more intricate situation where the underlying model incorporates dye-coloring effects, and the role of dye-swapping is critically examined.
Citations
Journal Article
TL;DR: This work discusses the practice of problem solving, testing hypotheses about statistical parameters, calculating and interpreting confidence limits, tolerance limits and prediction limits, and setting up and interpreting control charts.
Abstract: The best adjective to describe this work is "sweeping." The range of subject matter is so broad that it can almost be described as containing everything except fuzzy set theory. Included are explicit discussions of the basics of probability (relegated to an appendix); the practice of problem solving; testing hypotheses about statistical parameters; calculating and interpreting confidence limits; tolerance limits and prediction limits; setting up and interpreting control charts; design of experiments; analysis of variance; line and surface fitting; and maximum likelihood procedures. If you can think of something that is not in this list, then it probably means I have overlooked it.

309 citations

Journal Article
TL;DR: An overview of the different classes of hybridization designs is provided, discussing their advantages and limitations, and the current trends in the use of different hybridization design types in contemporary research are illustrated.

42 citations

Journal Article
TL;DR: In this paper, two-level fractional factorial designs are considered under a baseline parameterization and the criterion of minimum aberration is formulated in this context and optimal designs under this criterion are investigated.
Abstract: Two-level fractional factorial designs are considered under a baseline parameterization. The criterion of minimum aberration is formulated in this context and optimal designs under this criterion are investigated. The underlying theory and the concept of isomorphism turn out to be significantly different from their counterparts under orthogonal parameterization, and this is reflected in the optimal designs obtained. Copyright 2012, Oxford University Press.

25 citations

Journal Article
TL;DR: In this article, a general theory for weighted optimality, allowing precise design selection according to expressed relative interest in different functions in the estimation space, is developed, and the results are applied to solve the $A$-optimal design problem for baseline factorial effects in unblocked experiments.
Abstract: The standard approach to finding optimal experimental designs employs conventional measures of design efficacy, such as the $A$, $E$, and $D$-criterion, that assume equal interest in all estimable functions of model parameters. This paper develops a general theory for weighted optimality, allowing precise design selection according to expressed relative interest in different functions in the estimation space. The approach employs a very general class of matrix-specified weighting schemes that produce easily interpretable weighted optimality criteria. In particular, for any set of estimable functions, and any selected corresponding weights, analogs of standard optimality criteria are found that guide design selection according to the weighted variances of estimators of those particular functions. The results are applied to solve the $A$-optimal design problem for baseline factorial effects in unblocked experiments.

20 citations

Journal Article
TL;DR: In this paper, the authors consider optimal biased chemical and spring balance weighing designs, and obtain exact $\Phi_p$-optimal designs for all p ≥ 0 in the case N ≡ 2 (mod 4), where N is the run size.
Abstract: Optimal biased chemical and spring balance weighing designs are considered. Optimal designs in either setting can be obtained from those in the other via a simple transformation. Optimal approximate designs for unbiased chemical balance, biased chemical balance, and biased spring balance are closely related and can easily be obtained from one another. These designs correspond to universally optimal exact designs for the case N ≡ 0 (mod 4), where N is the run size. While Cheng's (1980) result on the type 1 optimality of certain unbiased chemical balance weighing designs for the case N ≡ 1 (mod 4) can be extended to the biased setting, such an extension does not hold for N ≡ 2 (mod 4). We obtain exact $\Phi_p$-optimal designs in the latter case for all p ≥ 0. The results obtained in this article can also be applied to optimal main-effect plans when one is interested in the main effects but not the general mean. Under the usual orthogonal parameterization, the model matrices of main-effect plans have 1, −1 entri...

18 citations

References
Journal Article
TL;DR: Fundamental issues of how to design an experiment to ensure that the resulting data are amenable to statistical analysis are discussed.
Abstract: Microarray technology is now widely available and is being applied to address increasingly complex scientific questions. Consequently, there is a greater demand for statistical assessment of the conclusions drawn from microarray experiments. This review discusses fundamental issues of how to design an experiment to ensure that the resulting data are amenable to statistical analysis. The discussion focuses on two-color spotted cDNA microarrays, but many of the same issues apply to single-color gene-expression assays as well.

1,122 citations


"Optimal factorial designs for cDNA ..." refers background or methods in this paper

  • ...We refer to Kerr and Churchill (2001b), Yang and Speed (2002) and Churchill (2002) for very informative further discussion on the design issues....

    [...]

  • ..., a design where every treatment combination appears an even number of times) allows a dye-color assignment that ensures orthogonality to η [Kerr and Churchill (2001a)]....

    [...]

  • ...While Kerr and Churchill (2001a, 2001b) and Churchill (2002) concentrated on varietal designs, Yang and Speed (2002) discussed factorial designs in some detail....

    [...]

  • ...…work on varietal designs for microarrays includes those due to Dobbin and Simon (2002), Kerr (2003), Rosa, Steibel and Tempelman (2005), Wit, Nobile and Khanin (2005) and Altman and Hua (2006), although some of these authors, as also Churchill (2002), briefly touched upon factorial designs as well....

    [...]

  • ...Then, for the purpose of estimating the $\theta$s, the means of the r log intensity ratios arising from the slides play the role of the individual ratios considered so far, but an attempt to estimate $\sigma^2$ on the basis of the within slide variation can be vitiated by unknown correlation among the ratios arising from the same slide [Yang and Speed (2002) and Churchill (2002)]....

    [...]

Journal Article
TL;DR: This paper focuses on microarray experiments, which are used to quantify and compare gene expression on a large scale and can be costly in terms of equipment, consumables and time.
Abstract: Microarray experiments are used to quantify and compare gene expression on a large scale. As with all large-scale experiments, they can be costly in terms of equipment, consumables and time. Therefore, careful design is particularly important if the resulting experiment is to be maximally informative, given the effort and the resources. What then are the issues that need to be addressed when planning microarray experiments? Which features of an experiment have the most impact on the accuracy and precision of the resulting measurements? How do we balance the different components of experimental design to reach a decision? For example, should we replicate, and if so, how?

824 citations


"Optimal factorial designs for cDNA ..." refers background or methods or result in this paper

  • ...We consider the baseline parametrization [cf. Yang and Speed (2002); GS] according to which the main effects of $F_1$ and $F_2$ are given respectively by $\theta_{10} = \tau_{10} - \tau_{00}$ and $\theta_{01} = \tau_{01} - \tau_{00}$, (1) while the interaction effect $F_1F_2$ is given by $\theta_{11} = \tau_{11} - \tau_{10} - \tau_{01} + \tau_{00}$....

    [...]

  • ...We refer to Kerr and Churchill (2001b), Yang and Speed (2002) and Churchill (2002) for very informative further discussion on the design issues....

    [...]

  • ...While Kerr and Churchill (2001a, 2001b) and Churchill (2002) concentrated on varietal designs, Yang and Speed (2002) discussed factorial designs in some detail....

    [...]

  • ...Turning to factorial designs for microarrays under the baseline parametrization, which is the main thrust of this paper, two key references are Yang and Speed (2002) and Glonek and Solomon (2004), (hereafter abbreviated GS)....

    [...]

  • ...The following results confirm this and, hence, vindicate the proposal of Yang and Speed (2002) about dye-swapping....

    [...]
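The baseline parametrization quoted in the first excerpt above can be checked numerically. The following is a minimal sketch, not code from the paper: the function names and the dictionary keyed by the treatment labels "00", "10", "01", "11" are illustrative assumptions. It computes the three baseline effects from the treatment effects τ, and also expresses any difference τ_i − τ_j as a linear combination of (θ10, θ01, θ11), using the identity τ_ab − τ_00 = a·θ10 + b·θ01 + ab·θ11, which follows directly from the definitions.

```python
def baseline_effects(tau):
    """Baseline effects for a 2x2 factorial.

    tau: dict with keys "00", "10", "01", "11" (hypothetical layout)
    holding the treatment effects tau_00, tau_10, tau_01, tau_11.
    """
    return {
        "10": tau["10"] - tau["00"],                          # main effect of F1
        "01": tau["01"] - tau["00"],                          # main effect of F2
        "11": tau["11"] - tau["10"] - tau["01"] + tau["00"],  # interaction F1F2
    }


def contrast_in_theta(i, j):
    """Coefficients of (theta_10, theta_01, theta_11) in tau_i - tau_j."""
    def row(label):
        a, b = int(label[0]), int(label[1])
        # tau_ab - tau_00 = a*theta_10 + b*theta_01 + a*b*theta_11
        return (a, b, a * b)

    return tuple(x - y for x, y in zip(row(i), row(j)))


if __name__ == "__main__":
    tau = {"00": 1.0, "10": 3.0, "01": 2.0, "11": 7.0}
    theta = baseline_effects(tau)
    print(theta)
    # Difference between treatment combinations 11 and 10,
    # written in terms of the baseline effects.
    print(contrast_in_theta("11", "10"))
```

Note that, unlike under the orthogonal parametrization, the main effect of F1 here compares the two levels of F1 only at the baseline level of F2.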

Journal Article
TL;DR: In this paper, the authors examined experimental design issues arising with gene expression microarray technology and provided a general set of recommendations for design with microarrays, illustrated in detail for one kind of experimental objective, where they also gave the results of a computer search for good designs.
Abstract: We examine experimental design issues arising with gene expression microarray technology. Microarray experiments have multiple sources of variation, and experimental plans should ensure that effects of interest are not confounded with ancillary effects. A commonly-used design is shown to violate this principle and to be generally inefficient. We explore the connection between microarray designs and classical block design and use a family of ANOVA models as a guide to choosing a design. We combine principles of good design and A-optimality to give a general set of recommendations for design with microarrays. These recommendations are illustrated in detail for one kind of experimental objective, where we also give the results of a computer search for good designs.

701 citations

Journal Article
TL;DR: This work relates certain features of microarrays to other kinds of experimental data and argues that classical statistical techniques are appropriate and useful and advocate greater attention to experimental design issues and a more prominent role for the ideas of statistical inference in microarray studies.
Abstract: Gene expression microarrays are an innovative technology with enormous promise to help geneticists explore and understand the genome. Although the potential of this technology has been clearly demonstrated, many important and interesting statistical questions persist. We relate certain features of microarrays to other kinds of experimental data and argue that classical statistical techniques are appropriate and useful. We advocate greater attention to experimental design issues and a more prominent role for the ideas of statistical inference in microarray studies.

614 citations


"Optimal factorial designs for cDNA ..." refers background or methods or result in this paper

  • ...We refer to Kerr and Churchill (2001b), Yang and Speed (2002) and Churchill (2002) for very informative further discussion on the design issues....

    [...]

  • ...While Kerr and Churchill (2001a, 2001b) and Churchill (2002) concentrated on varietal designs, Yang and Speed (2002) discussed factorial designs in some detail....

    [...]

  • ...In a pioneering paper Kerr and Churchill (2001a) discussed the design issues in microarrays and investigated optimal varietal designs that estimate the pairwise contrasts of treatment effects for fixed genes with minimum average variance....

    [...]

  • ...For this reduced model, it is known that any even design (i.e., a design where every treatment combination appears an even number of times) allows a dye-color assignment that ensures orthogonality to η [Kerr and Churchill (2001a)]....

    [...]

  • ...The above experimental setup is structurally similar to classical paired comparison experiments; see Kerr and Churchill (2001a)....

    [...]
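The criterion mentioned in the excerpts above, the average variance of the estimated pairwise contrasts of treatment effects, can be illustrated with a small computation. This is a sketch under stated assumptions and not the search procedure of Kerr and Churchill: each slide is assumed to yield one log intensity ratio with mean τ_i − τ_j, the ratios are taken as uncorrelated with a common variance, dye effects are ignored, and all function names are hypothetical. Variances are reported in units of that common variance.

```python
from fractions import Fraction
from itertools import combinations


def information_matrix(slides, v):
    """X'X for slides given as (i, j) pairs on treatments 0..v-1.

    Treatment 0 is used as a reference, so its column is dropped.
    """
    m = [[Fraction(0)] * (v - 1) for _ in range(v - 1)]
    for i, j in slides:
        row = [0] * v
        row[i] += 1
        row[j] -= 1
        x = row[1:]  # drop the reference column
        for a in range(v - 1):
            for b in range(v - 1):
                m[a][b] += x[a] * x[b]
    return m


def invert(m):
    """Gauss-Jordan inverse in exact arithmetic; raises if singular."""
    n = len(m)
    aug = [list(r) + [Fraction(int(i == k)) for k in range(n)]
           for i, r in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("design does not connect all treatments")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def average_pairwise_variance(slides, v):
    """Average variance of the estimates of tau_i - tau_j over all pairs."""
    inv = invert(information_matrix(slides, v))
    pairs = list(combinations(range(v), 2))
    total = Fraction(0)
    for i, j in pairs:
        c = [0] * v
        c[i], c[j] = 1, -1
        c = c[1:]  # same reference column dropped as in the design matrix
        total += sum(c[a] * inv[a][b] * c[b]
                     for a in range(v - 1) for b in range(v - 1))
    return total / len(pairs)


if __name__ == "__main__":
    loop = [(0, 1), (1, 2), (2, 0)]       # each pair compared once
    repeated = [(0, 1), (0, 1), (1, 2)]   # one comparison duplicated
    print(average_pairwise_variance(loop, 3))
    print(average_pairwise_variance(repeated, 3))
```

With three treatments and three slides, the loop arrangement gives a smaller average variance than the arrangement that repeats one comparison, which is the sense in which one design is preferred over another under this criterion.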

Journal Article
TL;DR: Inference for most genes is not adversely affected by pooling, and it is recommended that pooling be done when fewer than three arrays are used in each condition, and for larger designs, pooling does not significantly improve inferences if few subjects are pooled.
Abstract: Over 15% of the data sets catalogued in the Gene Expression Omnibus Database involve RNA samples that have been pooled before hybridization. Pooling affects data quality and inference, but the exact effects are not yet known because pooling has not been systematically studied in the context of microarray experiments. Here we report on the results of an experiment designed to evaluate the utility of pooling and the impact on identifying differentially expressed genes. We find that inference for most genes is not adversely affected by pooling, and we recommend that pooling be done when fewer than three arrays are used in each condition. For larger designs, pooling does not significantly improve inferences if few subjects are pooled. The realized benefits in this case do not outweigh the price paid for loss of individual specific information. Pooling is beneficial when many subjects are pooled, provided that independent samples contribute to multiple pools.

486 citations


"Optimal factorial designs for cDNA ..." refers background or result in this paper

  • ...This reinforces the findings in Kerr (2003) in a simpler setting and suggests that, in addition to making the log intensity ratios from different slides uncorrelated, use of only biological replicates can be advantageous from the perspective of design efficiency as well; see also Kendziorski et al. (2005) and the references therein for insightful practical results in a similar context. The point just noted makes sense if the cost of biological replication is negligible compared to the cost of the assay per slide, as has been tacitly supposed in this paper. While Bueno Filho, Gilmour and Rosa (2006) mention that the number of slides is typically the most important limiting factor in microarray experiments, a more detailed discussion in this regard is available in Kerr (2003), who also dwelt on the situation where this is not the case. If the cost of biological replication is a real issue, then the design problem becomes much more complex. Instead of fixing the number of slides, as done here, one should then proceed in the spirit of Kerr (2003) to formulate the problem in terms of a cost function that incorporates the cost of the assays (slides), as well as the cost of biological replication....

    [...]

  • ...Kerr (2003) and Altman and Hua (2006), among others] with common value say, $\gamma^2$, then the log intensity ratios arising from different slides are homoscedastic with common variance $\sigma^2 = 2\gamma^2 + \delta^2$....


    [...]
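The variance formula σ² = 2γ² + δ² quoted in the excerpts above has the form one would get for a log intensity ratio built from two independent sample-level terms, each with variance γ², plus an independent slide-level term with variance δ². The excerpt is truncated before these components are defined, so that decomposition is an assumption made here for illustration, and the function names below are hypothetical. The sketch uses only the standard rule that the variance of a linear combination of independent terms is the sum of squared coefficients times the component variances.

```python
def variance_of_linear_combination(coeffs, variances):
    """Variance of sum(c_k * Z_k) for independent Z_k with the given variances."""
    return sum(c * c * v for c, v in zip(coeffs, variances))


def log_ratio_variance(gamma2, delta2):
    """Variance of (sample 1 term) - (sample 2 term) + (slide term).

    Assumed decomposition: two independent sample-level terms with
    variance gamma2 each and one independent slide-level term with
    variance delta2.
    """
    return variance_of_linear_combination([1, -1, 1], [gamma2, gamma2, delta2])


if __name__ == "__main__":
    print(log_ratio_variance(0.5, 0.2))  # equals 2 * 0.5 + 0.2 up to rounding
```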