Author

John F. Bithell

Bio: John F. Bithell is an academic researcher at the University of Oxford. He has contributed to research on the topics of Population and Scheduling (production processes). He has an h-index of 13 and has co-authored 31 publications receiving 2,064 citations.

Papers
Journal ArticleDOI
TL;DR: This article reviews the common algorithms for resampling and methods for constructing bootstrap confidence intervals, together with some less well known ones, highlighting their strengths and weaknesses.
Abstract: Since the early 1980s, a bewildering array of methods for constructing bootstrap confidence intervals has been proposed. In this article, we address the following questions. First, when should bootstrap confidence intervals be used? Secondly, which method should be chosen? And thirdly, how should it be implemented? In order to do this, we review the common algorithms for resampling and methods for constructing bootstrap confidence intervals, together with some less well known ones, highlighting their strengths and weaknesses. We then present a simulation study, a flow chart for choosing an appropriate method and a survival analysis example.

1,416 citations
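
As a concrete illustration of the simplest of the resampling methods the article reviews, here is a minimal Python sketch of the nonparametric percentile bootstrap; the data, statistic and number of replicates are illustrative choices, not taken from the article.

```python
import numpy as np

def percentile_bootstrap_ci(data, stat=np.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic.

    Resamples the data with replacement n_boot times, computes the
    statistic on each resample, and returns the empirical alpha/2 and
    1 - alpha/2 quantiles of the bootstrap distribution.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    boot_stats = np.array([
        stat(rng.choice(data, size=n, replace=True)) for _ in range(n_boot)
    ])
    lo, hi = np.quantile(boot_stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Illustrative data: 30 observations from a skewed distribution.
sample = np.random.default_rng(1).exponential(scale=2.0, size=30)
print(percentile_bootstrap_ci(sample))  # prints the 95% interval for the mean
```

The review compares this interval with more refined constructions (such as studentised and bias-corrected variants), which differ only in how the bootstrap distribution is converted into interval endpoints.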

Journal ArticleDOI
TL;DR: A relative risk function over a geographical region is defined and it is shown that it can be estimated effectively using kernel density estimation separately for the spatial distribution of disease cases and for a sample of controls.
Abstract: A relative risk function over a geographical region is defined and it is shown that it can be estimated effectively using kernel density estimation separately for the spatial distribution of disease cases and for a sample of controls. This procedure is demonstrated using data on childhood leukaemia in the vicinity of the Sellafield nuclear reprocessing plant in Cumbria, U.K. Various modifications to the method are proposed, including the use of an adaptive kernel. The final plot demonstrates a sharp peak at Sellafield and a reasonably smooth surface over the rest of the region, despite the small number of cases in the series.

290 citations
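
The core of the method is a ratio of two kernel density estimates, one fitted to case locations and one to control locations. The sketch below illustrates this with fixed-bandwidth Gaussian kernels from scipy and synthetic coordinates; the paper also proposes an adaptive kernel, which this sketch omits.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic (x, y) locations standing in for the geographical data;
# gaussian_kde expects arrays of shape (n_dims, n_points).
rng = np.random.default_rng(0)
controls = rng.normal(0.0, 1.0, size=(2, 500))
cases = np.hstack([rng.normal(0.0, 1.0, size=(2, 45)),
                   rng.normal(1.5, 0.2, size=(2, 5))])  # a small cluster

f_cases = gaussian_kde(cases)     # density estimate for disease cases
f_ctrls = gaussian_kde(controls)  # density estimate for sampled controls

# Relative risk surface on a grid: the ratio of the two estimates.
xs, ys = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
grid = np.vstack([xs.ravel(), ys.ravel()])
risk = (f_cases(grid) / f_ctrls(grid)).reshape(xs.shape)
```

Plotting `risk` as a contour or surface map would show elevated values where cases are over-represented relative to the control distribution, as in the Sellafield example.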

Journal ArticleDOI
TL;DR: The present re-analysis exploits the case-control matching of the study while incorporating the effects of important risk determinants, notably year of birth, trimester of exposure and number of films exposed, to obtain time-invariant estimates of the extra risk per mGy.
Abstract: The association between obstetric X-raying and childhood cancer was first identified by the Oxford Survey of Childhood Cancers in 1956. The present re-analysis exploits the case-control matching of the study while incorporating the effects of important risk determinants, notably year of birth, trimester of exposure and number of films exposed. The decline in risk over time is closely mirrored by the estimated decline in dose per film and, by constraining these two relationships to be parallel, time-invariant estimates of the extra risk per mGy are obtained. For example, it is now estimated that irradiating 10^6 foetuses with 1 mGy of X-rays would, in the absence of other causes of death, yield 175 extra cases of cancer and leukaemia in the first 15 years of life.

83 citations
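
Expressed as an absolute rate, the paper's headline figure corresponds to

$$\frac{175 \text{ extra cases}}{10^6 \text{ foetuses} \times 1\text{ mGy}} = 1.75 \times 10^{-4} \text{ cases per foetus per mGy},$$

i.e. roughly one extra childhood cancer for every 5,700 foetuses exposed to 1 mGy in utero.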

Journal ArticleDOI
TL;DR: The underlying principles of depicting disease incidence on geographical maps are considered and it is argued that the relative risk function provides a fundamental model useful for assessing different methods as a whole.
Abstract: This paper considers the underlying principles of depicting disease incidence on geographical maps and uses them to attempt a comparative classification of methods. After a discussion of the possibilities for incorporating time, we consider projection methods, some of which have been used to portray information in a manner supposed to be independent of population density. We then distinguish between non-parametric and model-based methods, including models for areal data using Bayesian ideas. Data in point form are also discussed and it is argued that the relative risk function provides a fundamental model useful for assessing different methods as a whole, some of which are known to be flawed and many of which are untested as regards their statistical properties.

82 citations

Journal ArticleDOI
TL;DR: The extension to higher dimensions permits the investigation of joint effects of several factors, while the problem of controlling for confounding variables can be handled by fitting multiplicative risk models.
Abstract: The risk associated with different levels of a quantitative factor X is often measured relative to the level corresponding to X = 0. There are situations, however, where there is no natural zero for X, for example where the risk factor is the age of an individual. In this case it is more natural to measure risk relative to an overall average for the study population. To use the whole population in this way also raises the possibility of regarding X as truly continuous, rather than as a grouped variable. This gives rise to the concept of a relative risk function. Methods for estimating such functions are discussed, concentrating for the most part on the discrete case. The extension to higher dimensions permits the investigation of joint effects of several factors, while the problem of controlling for confounding variables can be handled by fitting multiplicative risk models. Relating the latter to the log-linear model permits the estimation of adjusted relative risk functions. The method is illustrated using data on childhood cancer. The continuous case can in principle be handled in a similar way using density estimation techniques.

52 citations
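
In the discrete (grouped) case, measuring risk relative to the population average rather than to a baseline at X = 0 reduces to a simple computation, sketched below in Python with made-up counts.

```python
import numpy as np

# Illustrative grouped data: counts of cases and person-years at risk
# in categories of a quantitative factor X (e.g. age bands). The
# numbers are invented for the example.
cases = np.array([12, 30, 45, 60, 28])
pop = np.array([1000, 2100, 2900, 3500, 1500])  # person-years per band

rate = cases / pop                  # category-specific rates
avg_rate = cases.sum() / pop.sum()  # overall population rate

# Relative risk function in the discrete case: risk in each category
# relative to the population average, rather than to a level X = 0.
rel_risk = rate / avg_rate
print(rel_risk)  # values near 1 indicate risk close to the average
```

The multiplicative and log-linear modelling the paper describes generalises this by adjusting each category's rate for confounders before taking the ratio.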


Cited by
Journal ArticleDOI
TL;DR: pROC is a package for R and S+ that contains a set of tools for displaying, analyzing, smoothing and comparing ROC curves in a user-friendly, object-oriented and flexible interface.
Abstract: Receiver operating characteristic (ROC) curves are useful tools to evaluate classifiers in biomedical and bioinformatics applications. However, conclusions are often reached through inconsistent use or insufficient statistical analysis. To support researchers in their ROC curve analysis we developed pROC, a package for R and S+ that contains a set of tools for displaying, analyzing, smoothing and comparing ROC curves in a user-friendly, object-oriented and flexible interface. With data previously imported into the R or S+ environment, the pROC package builds ROC curves and includes functions for computing confidence intervals, statistical tests for comparing total or partial area under the curve or the operating points of different classifiers, and methods for smoothing ROC curves. Intermediary and final results are visualised in user-friendly interfaces. A case study based on published clinical and biomarker data shows how to perform a typical ROC analysis with pROC. pROC is a package for R and S+ specifically dedicated to ROC analysis. It proposes multiple statistical tests to compare ROC curves, and in particular partial areas under the curve, allowing proper ROC interpretation. pROC is available in two versions: in the R programming language or with a graphical user interface in the S+ statistical software. It is accessible at http://expasy.org/tools/pROC/ under the GNU General Public License. It is also distributed through the CRAN and CSAN public repositories, facilitating its installation.

8,052 citations
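
pROC itself is an R/S+ package; as a rough Python analogue, the sketch below performs the same basic steps with scikit-learn and synthetic data: build a ROC curve, compute the AUC, and bootstrap a confidence interval for it (pROC provides such intervals directly, e.g. via its ci.auc function).

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
# Synthetic biomarker: higher values in the diseased group.
y = np.concatenate([np.zeros(100), np.ones(80)]).astype(int)
scores = np.concatenate([rng.normal(0, 1, 100), rng.normal(1, 1, 80)])

fpr, tpr, thresholds = roc_curve(y, scores)  # the ROC curve itself
auc = roc_auc_score(y, scores)

# Bootstrap confidence interval for the AUC.
boot = []
n = len(y)
for _ in range(2000):
    idx = rng.integers(0, n, n)
    if len(np.unique(y[idx])) < 2:  # resample must contain both classes
        continue
    boot.append(roc_auc_score(y[idx], scores[idx]))
lo, hi = np.quantile(boot, [0.025, 0.975])
print(f"AUC = {auc:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```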

Book
18 Aug 2014

2,305 citations

Journal ArticleDOI
TL;DR: In this paper, a pre-specified meta-analysis of individual patient data from 6756 patients in nine randomised trials comparing alteplase with placebo or open control was conducted.

1,773 citations

Journal ArticleDOI
TL;DR: This paper presents a multiple instance learning-based deep learning system that uses only the reported diagnoses as labels for training, thereby avoiding expensive and time-consuming pixel-wise manual annotations, and shows that it can train accurate classification models at unprecedented scale.
Abstract: The development of decision support systems for pathology and their deployment in clinical practice have been hindered by the need for large manually annotated datasets. To overcome this problem, we present a multiple instance learning-based deep learning system that uses only the reported diagnoses as labels for training, thereby avoiding expensive and time-consuming pixel-wise manual annotations. We evaluated this framework at scale on a dataset of 44,732 whole slide images from 15,187 patients without any form of data curation. Tests on prostate cancer, basal cell carcinoma and breast cancer metastases to axillary lymph nodes resulted in areas under the curve above 0.98 for all cancer types. Its clinical application would allow pathologists to exclude 65–75% of slides while retaining 100% sensitivity. Our results show that this system has the ability to train accurate classification models at unprecedented scale, laying the foundation for the deployment of computational decision support systems in clinical practice.

1,310 citations
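
The weak-supervision idea can be sketched in a few lines: treat each slide as a bag of tile embeddings, score every tile, and let the maximum tile score stand for the whole slide, so that the slide-level diagnosis supervises tile-level learning. The PyTorch sketch below shows this max-pooling variant with illustrative dimensions; the published system is more elaborate (top-ranked tile selection plus an aggregation model), so this is only the core mechanism.

```python
import torch
import torch.nn as nn

class MaxPoolMIL(nn.Module):
    """Multiple instance learning with max-pooling aggregation.

    Each bag (e.g. a whole slide image) is a variable-length set of
    instance features (e.g. tile embeddings). Only the bag label is
    observed; the bag score is the maximum instance score, so a single
    strongly positive tile suffices to call the slide positive.
    """
    def __init__(self, in_dim=512, hidden=128):
        super().__init__()
        self.instance_scorer = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, bag):                 # bag: (n_instances, in_dim)
        scores = self.instance_scorer(bag)  # (n_instances, 1)
        return scores.max()                 # bag-level logit

model = MaxPoolMIL()
loss_fn = nn.BCEWithLogitsLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# One training step on a synthetic positive bag of 200 tiles.
bag = torch.randn(200, 512)
label = torch.tensor(1.0)
loss = loss_fn(model(bag), label)
opt.zero_grad()
loss.backward()
opt.step()
```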