scispace - formally typeset
Search or ask a question
Author

Frederick Warner

Other affiliations: Veterans Health Administration
Bio: Frederick Warner is an academic researcher from Yale University. The author has contributed to research in topics: Retrospective cohort study & Population. The author has an hindex of 11, co-authored 24 publications receiving 1987 citations. Previous affiliations of Frederick Warner include Veterans Health Administration.

Papers
More filters
Journal ArticleDOI
TL;DR: The process of iterating or diffusing the Markov matrix is seen as a generalization of some aspects of the Newtonian paradigm, in which local infinitesimal transitions of a system lead to global macroscopic descriptions by integration.
Abstract: We provide a framework for structural multiscale geometric organization of graphs and subsets of R(n). We use diffusion semigroups to generate multiscale geometries in order to organize and represent complex structures. We show that appropriately selected eigenfunctions or scaling functions of Markov matrices, which describe local transitions, lead to macroscopic descriptions at different scales. The process of iterating or diffusing the Markov matrix is seen as a generalization of some aspects of the Newtonian paradigm, in which local infinitesimal transitions of a system lead to global macroscopic descriptions by integration. We provide a unified view of ideas from data analysis, machine learning, and numerical analysis.

1,654 citations

Journal ArticleDOI
TL;DR: This article deals with the construction of fast-order N algorithms for data representation and for homogenization of heterogeneous structures.
Abstract: In the companion article, a framework for structural multiscale geometric organization of subsets of and of graphs was introduced. Here, diffusion semigroups are used to generate multiscale analyses in order to organize and represent complex structures. We emphasize the multiscale nature of these problems and build scaling functions of Markov matrices (describing local transitions) that lead to macroscopic descriptions at different scales. The process of iterating or diffusing the Markov matrix is seen as a generalization of some aspects of the Newtonian paradigm, in which local infinitesimal transitions of a system lead to global macroscopic descriptions by integration. This article deals with the construction of fast-order N algorithms for data representation and for homogenization of heterogeneous structures.

237 citations

Patent
23 Jun 2005
TL;DR: In this article, the authors present a method and computer system for representing a dataset comprising N documents by computing a diffusion geometry of the dataset comprising at least a plurality of diffusion coordinates, wherein the number is linear in proportion to N.
Abstract: The present invention is directed to a method and computer system for representing a dataset comprising N documents by computing a diffusion geometry of the dataset comprising at least a plurality of diffusion coordinates. The present method and system stores a number of diffusion coordinates, wherein the number is linear in proportion to N.

126 citations

Journal ArticleDOI
25 Jan 2022-Med
TL;DR: In this paper , the authors analyzed nasal swab PCR tests for samples collected between December 12 and 16, 2021, in Connecticut when the proportion of Delta and Omicron variants was relatively equal, and fitted an exponential curve to the estimated infections to determine the doubling times for each variant.

63 citations

Patent
19 Sep 2005
TL;DR: In this article, a plurality of patches of pixels from within the hyper-spectral images are extracted as being patches around pixels of the elements to be characterized or distinguished, and statistics of spectra for each patch of pixels are computed.
Abstract: A hyper-spectral analysis method for characterizing or distinguishing diverse elements within hyper-spectral images. A plurality of patches of pixels from within the hyper-spectral images are extracted as being patches around pixels of the elements to be characterized or distinguished. The statistics of spectra for each patch of pixels are computed. A first classifier is computed from frequency-wise standard deviation of the spectra in each patch and a set of second classifiers are computed from principal components of the spectral in each patch. A combined classifier is computed based on the output of the first classifier and at least one of the second classifiers. The elements are characterized or distinguished based on the output of at least one of the classifiers, preferably the combined classifier.

44 citations


Cited by
More filters
01 Jan 2020
TL;DR: Prolonged viral shedding provides the rationale for a strategy of isolation of infected patients and optimal antiviral interventions in the future.
Abstract: Summary Background Since December, 2019, Wuhan, China, has experienced an outbreak of coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Epidemiological and clinical characteristics of patients with COVID-19 have been reported but risk factors for mortality and a detailed clinical course of illness, including viral shedding, have not been well described. Methods In this retrospective, multicentre cohort study, we included all adult inpatients (≥18 years old) with laboratory-confirmed COVID-19 from Jinyintan Hospital and Wuhan Pulmonary Hospital (Wuhan, China) who had been discharged or had died by Jan 31, 2020. Demographic, clinical, treatment, and laboratory data, including serial samples for viral RNA detection, were extracted from electronic medical records and compared between survivors and non-survivors. We used univariable and multivariable logistic regression methods to explore the risk factors associated with in-hospital death. Findings 191 patients (135 from Jinyintan Hospital and 56 from Wuhan Pulmonary Hospital) were included in this study, of whom 137 were discharged and 54 died in hospital. 91 (48%) patients had a comorbidity, with hypertension being the most common (58 [30%] patients), followed by diabetes (36 [19%] patients) and coronary heart disease (15 [8%] patients). Multivariable regression showed increasing odds of in-hospital death associated with older age (odds ratio 1·10, 95% CI 1·03–1·17, per year increase; p=0·0043), higher Sequential Organ Failure Assessment (SOFA) score (5·65, 2·61–12·23; p Interpretation The potential risk factors of older age, high SOFA score, and d-dimer greater than 1 μg/mL could help clinicians to identify patients with poor prognosis at an early stage. Prolonged viral shedding provides the rationale for a strategy of isolation of infected patients and optimal antiviral interventions in the future. Funding Chinese Academy of Medical Sciences Innovation Fund for Medical Sciences; National Science Grant for Distinguished Young Scholars; National Key Research and Development Program of China; The Beijing Science and Technology Project; and Major Projects of National Science and Technology on New Drug Creation and Development.

4,408 citations

Journal Article
TL;DR: A semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner is proposed and properties of reproducing kernel Hilbert spaces are used to prove new Representer theorems that provide theoretical basis for the algorithms.
Abstract: We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Some transductive graph learning algorithms and standard methods including support vector machines and regularized least squares can be obtained as special cases. We use properties of reproducing kernel Hilbert spaces to prove new Representer theorems that provide theoretical basis for the algorithms. As a result (in contrast to purely graph-based approaches) we obtain a natural out-of-sample extension to novel examples and so are able to handle both transductive and truly semi-supervised settings. We present experimental evidence suggesting that our semi-supervised algorithms are able to use unlabeled data effectively. Finally we have a brief discussion of unsupervised and fully supervised learning within our general framework.

3,919 citations

BookDOI
31 Mar 2010
TL;DR: Semi-supervised learning (SSL) as discussed by the authors is the middle ground between supervised learning (in which all training examples are labeled) and unsupervised training (where no label data are given).
Abstract: In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground, between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no label data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics. This first comprehensive overview of SSL presents state-of-the-art algorithms, a taxonomy of the field, selected applications, benchmark experiments, and perspectives on ongoing and future research. Semi-Supervised Learning first presents the key assumptions and ideas underlying the field: smoothness, cluster or low-density separation, manifold structure, and transduction. The core of the book is the presentation of SSL methods, organized according to algorithmic strategies. After an examination of generative models, the book describes algorithms that implement the low-density separation assumption, graph-based methods, and algorithms that perform two-step learning. The book then discusses SSL applications and offers guidelines for SSL practitioners by analyzing the results of extensive benchmark experiments. Finally, the book looks at interesting directions for SSL research. The book closes with a discussion of the relationship between semi-supervised learning and transduction. Adaptive Computation and Machine Learning series

3,773 citations

Journal ArticleDOI
TL;DR: The field of signal processing on graphs merges algebraic and spectral graph theoretic concepts with computational harmonic analysis to process high-dimensional data on graphs as discussed by the authors, which are the analogs to the classical frequency domain and highlight the importance of incorporating the irregular structures of graph data domains when processing signals on graphs.
Abstract: In applications such as social, energy, transportation, sensor, and neuronal networks, high-dimensional data naturally reside on the vertices of weighted graphs. The emerging field of signal processing on graphs merges algebraic and spectral graph theoretic concepts with computational harmonic analysis to process such signals on graphs. In this tutorial overview, we outline the main challenges of the area, discuss different ways to define graph spectral domains, which are the analogs to the classical frequency domain, and highlight the importance of incorporating the irregular structures of graph data domains when processing signals on graphs. We then review methods to generalize fundamental operations such as filtering, translation, modulation, dilation, and downsampling to the graph setting and survey the localized, multiscale transforms that have been proposed to efficiently extract information from high-dimensional data on graphs. We conclude with a brief discussion of open issues and possible extensions.

3,475 citations

Journal ArticleDOI
TL;DR: This work presents Scanpy, a scalable toolkit for analyzing single-cell gene expression data that includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks, and AnnData, a generic class for handling annotated data matrices.
Abstract: Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells ( https://github.com/theislab/Scanpy ). Along with Scanpy, we present AnnData, a generic class for handling annotated data matrices ( https://github.com/theislab/anndata ).

3,343 citations