Showing papers by "Keith A. Baggerly" published in 2001


Journal Article
TL;DR: This work derives models for the intensity of a replicated spot, when replication is performed within and between arrays, and predicts how the variance of a log ratio changes with the total intensity of the signal at the spot, independent of the identity of the gene.
Abstract: A major goal of microarray experiments is to determine which genes are differentially expressed between samples. Differential expression has been assessed by taking ratios of expression levels of different samples at a spot on the array and flagging spots (genes) where the magnitude of the fold difference exceeds some threshold. More recent work has attempted to incorporate the fact that the variability of these ratios is not constant. Most methods are variants of Student's t-test. These variants standardize the ratios by dividing by an estimate of the standard deviation of that ratio; spots with large standardized values are flagged. Estimating these standard deviations requires replication of the measurements, either within a slide or between slides, or the use of a model describing what the standard deviation should be. Starting from considerations of the kinetics driving microarray hybridization, we derive models for the intensity of a replicated spot, when replication is performed within and between ...

122 citations
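
The standardization step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' model: the function name, the replicate intensities, and the threshold of 3.0 are all hypothetical, and the standard deviation is estimated from replicates rather than from an intensity-dependent model.

```python
import math
from statistics import mean, stdev

def standardized_log_ratio(sample_a, sample_b):
    """Return (mean log2 ratio, standardized value) for paired replicate intensities.

    The standardized value is the mean log ratio divided by its estimated
    standard error, in the spirit of a one-sample t statistic.
    Requires at least two replicates with non-identical log ratios.
    """
    log_ratios = [math.log2(a / b) for a, b in zip(sample_a, sample_b)]
    m = mean(log_ratios)
    standard_error = stdev(log_ratios) / math.sqrt(len(log_ratios))
    return m, m / standard_error

# Hypothetical replicate intensities for a single spot in two samples.
a = [1200.0, 1100.0, 1350.0, 1250.0]
b = [600.0, 580.0, 640.0, 610.0]

mean_lr, t_stat = standardized_log_ratio(a, b)
flagged = abs(t_stat) > 3.0  # illustrative threshold, not from the paper
print(mean_lr, t_stat, flagged)
```

With few replicates the estimated standard deviation is itself noisy, which is the motivation the abstract gives for modeling the variance as a function of intensity instead.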


Journal Article
TL;DR: The essential elements of data acquisition, data processing and data analysis are reviewed, and issues related to the quality, validation and storage of data are discussed.

109 citations


Journal Article
TL;DR: Testing the relationship between signal and expression for the two types of microarrays the authors most commonly encounter uncovered two sources of nonlinearity: signal quenching associated with excessive dye concentrations and a nonlinear transformation of the raw data introduced by the scanner.
Abstract: A key assumption in the analysis of microarray data is that the quantified signal intensities are linearly related to the expression levels of the corresponding genes. To test this assumption, we experimentally examined the relationship between signal and expression for the two types of microarrays we most commonly encounter: radioactively labeled cDNAs on nylon membranes and fluorescently labeled cDNAs on glass slides. We uncovered two sources of nonlinearity. The first, which led to discrepancies in analysis affecting the fluorescent signals, was signal quenching associated with excessive dye concentrations. The second, affecting the radioactive signals, was a nonlinear transformation of the raw data introduced by the scanner. Correction for this transformation was made by some, but not all, image-quantification software packages. The second type of nonlinearity is more troublesome, because it could not have been predicted a priori. Both types of nonlinearities were detected by simple dilution series, which we recommend as a quality-control step.

87 citations
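
One way a dilution series can expose nonlinearity is sketched below. It assumes that a linear signal is proportional to concentration, so the least-squares slope of log signal against log concentration should be close to 1. The function name, the concentrations, and the saturating curve are hypothetical stand-ins, not the authors' data or their quenching model.

```python
import math

def loglog_slope(concentrations, signals):
    """Least-squares slope of log2(signal) against log2(concentration)."""
    xs = [math.log2(c) for c in concentrations]
    ys = [math.log2(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical two-fold dilution series.
conc = [1.0, 2.0, 4.0, 8.0, 16.0]
linear_signal = [50.0 * c for c in conc]
# Made-up saturating response that flattens at high concentration.
saturating_signal = [50.0 * c / (1.0 + c / 8.0) for c in conc]

slope_linear = loglog_slope(conc, linear_signal)
slope_saturating = loglog_slope(conc, saturating_signal)
print(slope_linear, slope_saturating)
```

A slope well below 1, as in the saturating case, suggests the signal is no longer proportional to the amount of material. A single slope can hide curvature, so plotting the series is still worthwhile.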


Journal Article
TL;DR: The derivation of a theoretical basis provides a more detailed interpretation of its behavior and renders the probability binning method more flexible, effectively using adaptive binning to locate structure in high-dimensional data.
Abstract: Background: A key problem in immunohistochemistry is assessing when two sample histograms are significantly different. One test that is commonly used for this purpose in the univariate case is the chi-squared test. Comparing multivariate distributions is qualitatively harder, as the “curse of dimensionality” means that the number of bins can grow exponentially. For the chi-squared test to be useful, data-dependent binning methods must be employed. An example of how this can be done is provided by the “probability binning” method of Roederer et al. (1, 2, 3). Methods: We derive the theoretical distribution of the probability binning statistic, giving it a more rigorous foundation. We show that the null distribution is a scaled chi-square, and show how it can be related to the standard chi-squared statistic. Results: A small simulation shows how the theoretical results can be used to (a) modify the probability binning statistic to make it more sensitive and (b) suggest variant statistics which, while still exploiting the data-dependent strengths of the probability binning procedure, may be easier to work with. Conclusions: The probability binning procedure effectively uses adaptive binning to locate structure in high-dimensional data. The derivation of a theoretical basis provides a more detailed interpretation of its behavior and renders the probability binning method more flexible.

18 citations
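
The idea of data-dependent binning in the abstract above can be illustrated in one dimension. The sketch below assumes bins chosen so that each holds roughly the same number of control events, followed by a chi-squared-like sum over bin proportions. The function names, the simulated data, and the exact form of the statistic are assumptions for illustration; they are not taken from the paper, and the multivariate case needs a recursive splitting scheme that is not shown.

```python
import bisect
import random

def equal_probability_edges(control, n_bins):
    """Interior bin edges placing about equal numbers of control events per bin."""
    ordered = sorted(control)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]

def bin_proportions(data, edges):
    """Fraction of events falling in each bin defined by the interior edges."""
    counts = [0] * (len(edges) + 1)
    for x in data:
        counts[bisect.bisect_right(edges, x)] += 1
    return [c / len(data) for c in counts]

def binned_statistic(control, test, n_bins=10):
    """Chi-squared-like sum comparing bin proportions of two samples."""
    edges = equal_probability_edges(control, n_bins)
    c = bin_proportions(control, edges)
    s = bin_proportions(test, edges)
    return sum((ci - si) ** 2 / (ci + si) for ci, si in zip(c, s) if ci + si > 0)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
control = [rng.gauss(0.0, 1.0) for _ in range(2000)]
same = [rng.gauss(0.0, 1.0) for _ in range(2000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]

stat_same = binned_statistic(control, same)
stat_shifted = binned_statistic(control, shifted)
print(stat_same, stat_shifted)
```

Because the bin edges come from the control sample itself, the statistic does not follow the standard chi-squared distribution directly, which is the gap the abstract says its derivation addresses.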


Journal Article
TL;DR: Identifying and quantifying sources of variation in high-density cDNA microarray data using 33P-labeled probes shows good agreement between stationary and moving targets.
Abstract: Identifying and quantifying sources of variation in high-density cDNA microarray data using 33P-labeled probes

2 citations