Author

Douglas B. Kell

Bio: Douglas B. Kell is an academic researcher from the University of Liverpool. The author has contributed to research in topics: Dielectric & Systems biology. The author has an h-index of 111 and has co-authored 634 publications receiving 50335 citations. Previous affiliations of Douglas B. Kell include Max Planck Society & University of Wales.


Papers
Journal Article
TL;DR: The present data-driven approach for the selection of reference genes by using the easy-to-calculate and robust GC is recommended, and its utility to find tissue- and cell line-optimised housekeeping genes without any prior bias is illustrated.
Abstract: We recently introduced the Gini coefficient (GC) for assessing the expression variation of a particular gene in a dataset, as a means of selecting improved reference genes over the cohort (‘housekeeping genes’) typically used for normalisation in expression profiling studies. Those genes (transcripts) that we determined to be useable as reference genes differed greatly from previous suggestions based on hypothesis-driven approaches. A limitation of this initial study is that a single (albeit large) dataset was employed for both tissues and cell lines. We here extend this analysis to encompass seven other large datasets. Although their absolute values differ a little, the Gini values and median expression levels of the various genes are well correlated with each other between the various cell line datasets, implying that our original choice of the more ubiquitously expressed low-Gini-coefficient genes was indeed sound. In tissues, the Gini values and median expression levels of genes showed a greater variation, with the GC of genes changing with the number and types of tissues in the data sets. In all data sets, regardless of whether this was derived from tissues or cell lines, we also show that the GC is a robust measure of gene expression stability. Using the GC as a measure of expression stability we illustrate its utility to find tissue- and cell line-optimised housekeeping genes without any prior bias, that again include only a small number of previously reported housekeeping genes. We also independently confirmed this experimentally using RT-qPCR with 40 candidate GC genes in a panel of 10 cell lines. These were termed the Gini Genes. In many cases, the variation in the expression levels of classical reference genes is really quite huge (e.g. 44 fold for GAPDH in one data set), suggesting that the cure (of using them as normalising genes) may in some cases be worse than the disease (of not doing so). 
We recommend the present data-driven approach for the selection of reference genes by using the easy-to-calculate and robust GC.
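
The abstract describes the GC as easy to calculate. As a minimal sketch of that calculation, the function below uses the standard sorted-rank formula for the Gini coefficient. It is not the authors' code, and the function name and the example expression values are hypothetical.

```python
def gini_coefficient(values):
    """Gini coefficient of non-negative values (0 = perfectly even).

    Uses the standard sorted-rank formula
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with x sorted ascending and i running from 1 to n.
    """
    xs = sorted(float(v) for v in values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one value")
    if xs[0] < 0:
        raise ValueError("expression values must be non-negative")
    total = sum(xs)
    if total == 0:
        return 0.0  # all zeros: treat as perfectly even
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical expression levels of two genes across five samples.
stable_gene = [98, 101, 100, 99, 102]
variable_gene = [5, 220, 40, 900, 12]

# A lower value indicates more even expression across the samples.
print(gini_coefficient(stable_gene), gini_coefficient(variable_gene))
```

Under this formula a gene expressed at the same level in every sample scores 0, and a gene expressed in only one of n samples scores (n - 1)/n, so ranking genes by ascending GC puts the most evenly expressed candidates first.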

27 citations

Journal Article
TL;DR: A simple approach to the screening of metabolic information that will be valuable in generating metabolomic data is demonstrated; the technique was subsequently used to generate metabolic footprints from cell-free supernatants and made it possible to discriminate haploid yeast single-gene deletants (mutants).
Abstract: The importance of metabolomic data in functional genomic investigations is increasingly becoming evident, as is its utility in novel biomarker discovery. We demonstrate a simple approach to the screening of metabolic information that we believe will be valuable in generating metabolomic data. Laser desorption ionisation mass spectrometry on porous silicon was effective in detecting 22 of 30 metabolites in a mixture in the negative-ion mode and 19 of 30 metabolites in the positive-ion mode, without the employment of any prior analyte separation steps. Overall, 26 of the 30 metabolites could be covered between the positive and negative-ion modes. Although the response for the metabolites at a given concentration differed, it was possible to generate direct quantitative information for a given analyte in the mixture. This technique was subsequently used to generate metabolic footprints from cell-free supernatants and, when combined with chemometric analysis, enabled us to discriminate haploid yeast single-gene deletants (mutants). In particular, the metabolic footprint of a deletion mutant in a gene encoding a transcriptional activator (Gln3p) showed increased levels of peaks, including one corresponding to glutamate, compared to the other mutants and the wild-type strain tested, enabling its discrimination based on metabolic information.

27 citations

Journal Article
01 Apr 2005-Science
TL;DR: In this paper, the experimental data showed no correlation between NF-kappa B (RelA) expression level and oscillation dynamics, and a small change to the computational model used by Barken et al. to generate their theoretical data reduced the apparent discrepancies.
Abstract: Our experimental data show no correlation between NF-kappa B (RelA) expression level and oscillation dynamics. We show that a small change to the computational model used by Barken et al. to generate their theoretical data reduces the apparent discrepancies. Cell system differences and possible compensatory changes to normal signaling in their genetically engineered knockout cells may explain differences between the two studies.

27 citations

Journal Article
TL;DR: In this article, artificial neural networks (ANNs) were used to predict both yeast and wheatgerm content from unseen mixture data, and multivariate statistical methods such as partial least squares (PLS) and principal component regression (PCR) could also be used successfully to deconvolute such dielectric spectra.

27 citations

Journal Article
TL;DR: Artificial neural networks of the present type, with fully interconnected feedforward architectures and trained according to the backpropagation algorithm, scaled poorly as the problem size was increased.
Abstract: Here we develop the use of artificial neural networks for solving the inverse metabolic problem, in other words, given a set of steady-state metabolite levels and fluxes in a pathway of known structure to obtain the parameters of the system, in this case the enzymatic limiting rate and Michaelis constants. This requires two main procedures: first the development of a computer program with which one can model metabolism in the forward direction (i.e. given the internal and parameters to determine the steady-state fluxes and metabolite concentrations), and second, given arrays of associated parameters and variables thereby obtained, to exploit artificial neural networks to form a model capable of obtaining the parameters from the variables. We studied 2-step pathways exhibiting first-order kinetics, 2-step pathways exhibiting reversible Michaelis-Menten kinetics and then 3-step pathways (again exhibiting reversible Michaelis-Menten kinetics), modelled using the program Gepasi. Whilst it was fairly easy for the networks to learn most of the parameters in the 2-step pathway, it was found helpful for the Michaelis-Menten case to vary the concentration of the starting pathway substrate for each set of internal parameters, and to train separate networks for each parameter. Some parameters were much easier to learn than others, reverse Km and Vmax values normally being the most difficult. For the 3-step pathway learning sometimes required as much as 3 days, and occasionally convergence was not obtained. Overall, neural networks of the present type, with fully interconnected feedforward architectures and trained according to the backpropagation algorithm, scaled poorly as the problem size was increased.
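
The "forward direction" described in the abstract can be illustrated with a small sketch: given the limiting rates and Michaelis constants of a 2-step pathway with reversible Michaelis-Menten kinetics, find the steady-state intermediate concentration and flux. This is not Gepasi and not the authors' program. It assumes the common one-substrate, one-product reversible rate law, clamps the pathway substrate and product, and uses hypothetical parameter values and function names throughout.

```python
def reversible_mm_rate(s, p, vf, vr, ks, kp):
    """Reversible Michaelis-Menten rate for S <-> P.

    vf, vr: forward and reverse limiting rates (Vmax);
    ks, kp: Michaelis constants for substrate and product.
    """
    return (vf * s / ks - vr * p / kp) / (1.0 + s / ks + p / kp)


def steady_state_two_step(s, p, step1, step2, iterations=200):
    """Steady state of S <-> X <-> P with S and P held fixed.

    Returns (x, flux). The net production of X falls monotonically
    as x rises, so bisection on x finds the unique root.
    """
    def net(x):
        return (reversible_mm_rate(s, x, **step1)
                - reversible_mm_rate(x, p, **step2))

    lo, hi = 0.0, 1.0
    while net(hi) > 0:  # widen the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if net(mid) > 0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x, reversible_mm_rate(s, x, **step1)


# Hypothetical parameters for the two enzymes (arbitrary units).
step1 = dict(vf=10.0, vr=1.0, ks=1.0, kp=2.0)
step2 = dict(vf=5.0, vr=0.5, ks=0.5, kp=1.0)

x_ss, flux = steady_state_two_step(s=2.0, p=0.1, step1=step1, step2=step2)
print(x_ss, flux)
```

Repeating this forward calculation over many parameter sets yields paired arrays of parameters and steady-state variables, which is the kind of training data the abstract describes feeding to the networks for the inverse problem.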

27 citations


Cited by
28 Jul 2005
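TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and switching transcription between different var gene variants provides the molecular basis for antigenic variation.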

18,940 citations

Journal Article
TL;DR: A simple and highly efficient method to disrupt chromosomal genes in Escherichia coli in which PCR primers provide the homology to the targeted gene(s), which should be widely useful, especially in genome analysis of E. coli and other bacteria.
Abstract: We have developed a simple and highly efficient method to disrupt chromosomal genes in Escherichia coli in which PCR primers provide the homology to the targeted gene(s). In this procedure, recombination requires the phage lambda Red recombinase, which is synthesized under the control of an inducible promoter on an easily curable, low copy number plasmid. To demonstrate the utility of this approach, we generated PCR products by using primers with 36- to 50-nt extensions that are homologous to regions adjacent to the gene to be inactivated and template plasmids carrying antibiotic resistance genes that are flanked by FRT (FLP recognition target) sites. By using the respective PCR products, we made 13 different disruptions of chromosomal genes. Mutants of the arcB, cyaA, lacZYA, ompR-envZ, phnR, pstB, pstCA, pstS, pstSCAB-phoU, recA, and torSTRCAD genes or operons were isolated as antibiotic-resistant colonies after the introduction into bacteria carrying a Red expression plasmid of synthetic (PCR-generated) DNA. The resistance genes were then eliminated by using a helper plasmid encoding the FLP recombinase which is also easily curable. This procedure should be widely useful, especially in genome analysis of E. coli and other bacteria because the procedure can be done in wild-type cells.

14,389 citations

Journal Article
TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.
Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the "why," and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations

Journal Article
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal Article
TL;DR: A practical guide to the analysis and visualization features of the cBioPortal for Cancer Genomics, which makes complex cancer genomics profiles accessible to researchers and clinicians without requiring bioinformatics expertise, thus facilitating biological discoveries.
Abstract: The cBioPortal for Cancer Genomics (http://cbioportal.org) provides a Web resource for exploring, visualizing, and analyzing multidimensional cancer genomics data. The portal reduces molecular profiling data from cancer tissues and cell lines into readily understandable genetic, epigenetic, gene expression, and proteomic events. The query interface combined with customized data storage enables researchers to interactively explore genetic alterations across samples, genes, and pathways and, when available in the underlying data, to link these to clinical outcomes. The portal provides graphical summaries of gene-level data from multiple platforms, network visualization and analysis, survival analysis, patient-centric queries, and software programmatic access. The intuitive Web interface of the portal makes complex cancer genomics profiles accessible to researchers and clinicians without requiring bioinformatics expertise, thus facilitating biological discoveries. Here, we provide a practical guide to the analysis and visualization features of the cBioPortal for Cancer Genomics.

10,947 citations