Journal•ISSN: 1465-4644

Biostatistics

Oxford University Press

About: Biostatistics is an academic journal published by Oxford University Press. The journal publishes mainly in the areas of Covariate and Estimator. Its ISSN is 1465-4644. Over its lifetime, 1269 publications have been published, receiving 84407 citations. The journal is also known as: biometry & biometrics.

Topics: Covariate, Estimator, Population, Regression analysis, Computer science

Papers

Journal Article•DOI•

Exploration, normalization, and summaries of high density oligonucleotide array probe level data

Rafael A. Irizarry¹, Bridget G. Hobbs¹, Francois Collin¹, Yasmin Beazer-Barclay¹, Kristen J. Antonellis¹, Uwe Scherf¹, Terence P. Speed¹•Institutions (1)

Johns Hopkins University¹

01 Apr 2003-Biostatistics

TL;DR: There is no obvious downside to using RMA and attaching a standard error (SE) to this quantity using a linear model which removes probe-specific affinities, and the exploratory data analyses of the probe level data motivate a new summary measure that is a robust multi-array average (RMA) of background-adjusted, normalized, and log-transformed PM values.

Abstract: SUMMARY In this paper we report exploratory analyses of high-density oligonucleotide array data from the Affymetrix GeneChip® system with the objective of improving upon currently used measures of gene expression. Our analyses make use of three data sets: a small experimental study consisting of five MGU74A mouse GeneChip® arrays; part of the data from an extensive spike-in study conducted by Gene Logic and Wyeth’s Genetics Institute involving 95 HG-U95A human GeneChip® arrays; and part of a dilution study conducted by Gene Logic involving 75 HG-U95A GeneChip® arrays. We display some familiar features of the perfect match and mismatch probe (PM and MM) values of these data, and examine the variance–mean relationship with probe-level data from probes believed to be defective, and so delivering noise only. We explain why we need to normalize the arrays to one another using probe level intensities. We then examine the behavior of the PM and MM using spike-in data and assess three commonly used summary measures: Affymetrix’s (i) average difference (AvDiff) and (ii) MAS 5.0 signal, and (iii) the Li and Wong multiplicative model-based expression index (MBEI). The exploratory data analyses of the probe level data motivate a new summary measure that is a robust multiarray average (RMA) of background-adjusted, normalized, and log-transformed PM values. We evaluate the four expression summary measures using the dilution study data, assessing their behavior in terms of bias, variance and (for MBEI and RMA) model fit. Finally, we evaluate the algorithms in terms of their ability to detect known levels of differential expression using the spike-in data. We conclude that there is no obvious downside to using RMA and attaching a standard error (SE) to this quantity using a linear model which removes probe-specific affinities.

10,711 citations
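
The abstract above describes RMA as a robust average of background-adjusted, normalized, log-transformed PM values, fitted with a linear model that removes probe-specific affinities. A minimal sketch of that summarization step follows. It assumes the input is already background-adjusted and normalized, and it assumes Tukey's median polish as the robust fitting procedure, which the abstract does not name. The function name `median_polish_summary` and its parameters are hypothetical.

```python
import math
from statistics import median


def median_polish_summary(pm, max_iter=10, tol=1e-8):
    """Summarize one probe set into a single log2 value per array.

    pm: rows = probes, columns = arrays. Values are assumed (not checked)
    to be positive, background-adjusted and normalized PM intensities.
    Fits log2(pm[i][j]) ~ overall + probe_eff[i] + array_eff[j] by
    median polish (an assumed choice of robust fit).
    """
    resid = [[math.log2(v) for v in row] for row in pm]
    n_probes, n_arrays = len(resid), len(resid[0])
    overall = 0.0
    probe_eff = [0.0] * n_probes
    array_eff = [0.0] * n_arrays

    for _ in range(max_iter):
        # Sweep out probe (row) medians.
        row_med = [median(row) for row in resid]
        for i in range(n_probes):
            probe_eff[i] += row_med[i]
            resid[i] = [r - row_med[i] for r in resid[i]]
        shift = median(array_eff)
        overall += shift
        array_eff = [a - shift for a in array_eff]

        # Sweep out array (column) medians.
        col_med = [median(resid[i][j] for i in range(n_probes))
                   for j in range(n_arrays)]
        for j in range(n_arrays):
            array_eff[j] += col_med[j]
            for i in range(n_probes):
                resid[i][j] -= col_med[j]
        shift = median(probe_eff)
        overall += shift
        probe_eff = [p - shift for p in probe_eff]

        # Stop once a full sweep no longer changes the residuals.
        if max(map(abs, row_med + col_med)) < tol:
            break

    # Expression estimate per array, with probe affinities removed.
    return [overall + a for a in array_eff]
```

For example, `median_polish_summary([[32.0, 64.0], [64.0, 128.0], [128.0, 256.0]])` returns one log2-scale value per array; the two values differ by 1 because every probe doubles between the two arrays.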

Journal Article•DOI•

Adjusting batch effects in microarray expression data using empirical Bayes methods

W. Evan Johnson¹, Cheng Li¹, Ariel Rabinovic¹•Institutions (1)

Harvard University¹

01 Jan 2007-Biostatistics

TL;DR: This paper proposes parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that are robust to outliers in small sample sizes and perform comparably to existing methods for large samples.

Abstract: SUMMARY Non-biological experimental variation or “batch effects” are commonly observed across multiple batches of microarray experiments, often rendering the task of combining data from these batches difficult. The ability to combine microarray data sets is advantageous to researchers to increase statistical power to detect biological phenomena from studies where logistical considerations restrict sample size or in studies that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes (>25) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that is robust to outliers in small sample sizes and performs comparable to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.

6,319 citations

Journal Article•DOI•

Sparse inverse covariance estimation with the graphical lasso

Jerome H. Friedman¹, Trevor Hastie¹, Robert Tibshirani¹•Institutions (1)

Stanford University¹

01 Jul 2008-Biostatistics

TL;DR: Using a coordinate descent procedure for the lasso, a simple algorithm is developed that solves a 1000-node problem in at most a minute and is 30-4000 times faster than competing methods.

Abstract: We consider the problem of estimating sparse graphs by a lasso penalty applied to the inverse covariance matrix. Using a coordinate descent procedure for the lasso, we develop a simple algorithm--the graphical lasso--that is remarkably fast: It solves a 1000-node problem (approximately 500,000 parameters) in at most a minute and is 30-4000 times faster than competing methods. It also provides a conceptual link between the exact problem and the approximation suggested by Meinshausen and Bühlmann (2006). We illustrate the method on some cell-signaling data from proteomics.

5,577 citations

Journal Article•DOI•

Circular binary segmentation for the analysis of array-based DNA copy number data.

Adam B. Olshen¹, Ennapadam Venkatraman¹, Robert Lucito², Michael Wigler²•Institutions (2)

Memorial Sloan Kettering Cancer Center¹, Cold Spring Harbor Laboratory²

01 Oct 2004-Biostatistics

TL;DR: A modification of binary segmentation, called circular binary segmentation, is developed to translate noisy intensity measurements into regions of equal DNA copy number.

Abstract: DNA sequence copy number is the number of copies of DNA at a region of a genome. Cancer progression often involves alterations in DNA copy number. Newly developed microarray technologies enable simultaneous measurement of copy number at thousands of sites in a genome. We have developed a modification of binary segmentation, which we call circular binary segmentation, to translate noisy intensity measurements into regions of equal copy number. The method is evaluated by simulation and is demonstrated on cell line data with known copy number alterations and on a breast cancer cell line data set.

2,269 citations
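
The core search step of circular binary segmentation, as described in the abstract above, can be sketched as follows. The abstract does not give the test statistic, so the form used here is an assumption: a two-sample statistic comparing the mean inside an arc `x[i:j]` with the mean of the remaining points, computed from partial sums. The sketch covers only the search for the best arc and omits the significance test and the recursive splitting. The function name `best_circular_segment` is hypothetical.

```python
import math
from itertools import accumulate


def best_circular_segment(x):
    """Return (z, i, j) for the arc x[i:j] that differs most from the rest.

    Assumed statistic (unit variance, not taken from the abstract):
        z = (mean inside - mean outside) / sqrt(1/n_inside + 1/n_outside)
    Exhaustive O(n^2) search; intended only as an illustration.
    """
    n = len(x)
    partial = [0.0] + list(accumulate(x))  # partial[k] = sum of x[:k]
    best = (0.0, None, None)
    for i in range(n):
        for j in range(i + 1, n + 1):
            inside = j - i
            outside = n - inside
            if outside == 0:
                continue  # the arc covers everything; nothing to compare
            mean_in = (partial[j] - partial[i]) / inside
            mean_out = (partial[n] - partial[j] + partial[i]) / outside
            z = (mean_in - mean_out) / math.sqrt(1.0 / inside + 1.0 / outside)
            if abs(z) > abs(best[0]):
                best = (z, i, j)
    return best
```

Called on a sequence with one raised stretch in the middle, such as `[0, 0, 0, 5, 5, 5, 0, 0, 0, 0]`, the search returns the boundaries of that stretch as `i` and `j`. A full implementation would then assess whether the maximal statistic is significant and, if so, recurse on the resulting segments.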

Journal Article•DOI•

A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis

Daniela Witten¹, Robert Tibshirani¹, Trevor Hastie¹•Institutions (1)

Stanford University¹

01 Jul 2009-Biostatistics

TL;DR: This paper presents a penalized matrix decomposition (PMD), a new framework for computing a rank-K approximation for a matrix, and establishes connections between the SCoTLASS method for sparse principal component analysis and the method of Zou and others (2006).

Abstract: SUMMARY We present a penalized matrix decomposition (PMD), a new framework for computing a rank-K approximation for a matrix. We approximate the matrix X as $\hat{X} = \sum_{k=1}^{K} d_k u_k v_k^T$, where $d_k$, $u_k$, and ...

1,540 citations
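
The rank-K form in the abstract above, a sum over k = 1..K of d_k u_k v_k^T, can be illustrated with a small sketch. Because the abstract is truncated before it states the penalties, this sketch is unpenalized: it builds the approximation by alternating power iteration and deflation, and so it is not the PMD method itself. All function names are hypothetical.

```python
import math


def _matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(r * c for r, c in zip(row, x)) for row in a]


def _unit(x):
    """Scale x to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in x))
    return x if norm == 0.0 else [v / norm for v in x]


def rank1_factor(x, iters=200):
    """Return (d, u, v) so that d * u v^T is the leading rank-1 term of x."""
    xt = [list(col) for col in zip(*x)]
    v = _unit([1.0 + j for j in range(len(x[0]))])  # arbitrary non-zero start
    u = [0.0] * len(x)
    for _ in range(iters):
        u = _unit(_matvec(x, v))   # update u with v held fixed
        v = _unit(_matvec(xt, u))  # update v with u held fixed
    d = sum(ui * wi for ui, wi in zip(u, _matvec(x, v)))  # d = u^T X v
    return d, u, v


def rank_k_approx(x, k):
    """Sum of k rank-1 terms, each fitted to the residual of the previous ones."""
    residual = [row[:] for row in x]
    approx = [[0.0] * len(x[0]) for _ in x]
    for _ in range(k):
        d, u, v = rank1_factor(residual)
        for i in range(len(x)):
            for j in range(len(x[0])):
                term = d * u[i] * v[j]
                approx[i][j] += term
                residual[i][j] -= term
    return approx
```

The paper's title refers to sparse principal components; a penalized variant would presumably constrain `u` and `v` inside the loop of `rank1_factor`, but the exact constraints are not recoverable from the truncated abstract and are left out here.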

Performance Metrics

Papers: 1,288

Citations: 84,426

No. of papers from the Journal in previous years
Year	Papers
2023	17
2022	67
2021	98
2020	87
2019	58
2018	45