Journal ArticleDOI

Density Estimation for Statistics and Data Analysis

01 Oct 1987 · The Statistician (John Wiley & Sons, Ltd) · Vol. 36, Iss. 4, pp. 420-421
About: This article was published in The Statistician on 1987-10-01 and has received 5,674 citations to date. It focuses on the topic of density estimation.
Citations
Book
01 Jan 1995
TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Abstract: From the Publisher: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimization, learning and generalization in neural networks, and Bayesian techniques and their applications. Designed as a text, with over 100 exercises, this fully up-to-date work will benefit anyone involved in the fields of neural computation and pattern recognition.

19,056 citations

Journal ArticleDOI
01 Aug 1998
TL;DR: It will be shown that probabilistic methods can be used to predict topic changes in the context of the task of new event detection and provide further proof of concept for the use of language models for retrieval tasks.
Abstract: In today's world, there is no shortage of information. However, for a specific information need, only a small subset of all of the available information will be useful. The field of information retrieval (IR) is the study of methods to provide users with that small subset of information relevant to their needs and to do so in a timely fashion. Information sources can take many forms, but this thesis will focus on text-based information systems and investigate problems germane to the retrieval of written natural language documents. Central to these problems is the notion of "topic." In other words, what are documents about? However, topics depend on the semantics of documents, and retrieval systems are not endowed with knowledge of the semantics of natural language. The approach taken in this thesis will be to make use of probabilistic language models to investigate text-based information retrieval and related problems. One such problem is the prediction of topic shifts in text, the topic segmentation problem. It will be shown that probabilistic methods can be used to predict topic changes in the context of the task of new event detection. Two complementary sets of features are studied individually and then combined into a single language model. The language modeling approach allows this problem to be approached in a principled way without complex semantic modeling. Next, the problem of document retrieval in response to a user query will be investigated. Models of document indexing and document retrieval have been extensively studied over the past three decades. The integration of these two classes of models has been the goal of several researchers, but it is a very difficult problem. Much of the reason for this is that the indexing component requires inferences as to the semantics of documents. Instead, an approach to retrieval based on probabilistic language modeling will be presented. Models are estimated for each document individually. The approach to modeling is non-parametric and integrates the entire retrieval process into a single model. One advantage of this approach is that collection statistics, which are used heuristically for the assignment of concept probabilities in other probabilistic models, are used directly in the estimation of language model probabilities in this approach. The language modeling approach has been implemented and tested empirically and performs very well on standard test collections and query sets. In order to improve retrieval effectiveness, IR systems use additional techniques such as relevance feedback, unsupervised query expansion and structured queries. These and other techniques are discussed in terms of the language modeling approach and empirical results are given for several of the techniques developed. These results provide further proof of concept for the use of language models for retrieval tasks.
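The core scoring rule of this query-likelihood approach is compact enough to sketch. The snippet below is an illustrative stand-in, not the thesis's exact estimator (which is non-parametric and differs in its smoothing): it ranks a document by the log-probability of the query under the document's language model, with collection statistics entering the estimate directly. The name score_document and the mixing weight lam are assumptions for illustration.

```python
import math
from collections import Counter

def score_document(query_terms, doc_tokens, collection_counts, collection_len, lam=0.5):
    """Log-probability of the query under a smoothed document language model
    (illustrative linear smoothing; the thesis's estimator differs in detail)."""
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        # Collection statistics are used directly in the probability estimate,
        # not as a heuristic weighting; assumes each query term occurs somewhere
        # in the collection so p_coll > 0.
        p_coll = collection_counts[term] / collection_len
        score += math.log(lam * p_doc + (1.0 - lam) * p_coll)
    return score
```

Documents are then ranked by this score for a given query; a higher log-probability means the document is more likely relevant under the model.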

2,736 citations


Cites methods from "Density Estimation for Statistics a..."

  • ...Rather than making parametric assumptions, as is done in the 2-Poisson model, where it is assumed that terms follow a mixture of two Poisson distributions, as Silverman said, "the data will be allowed to speak for themselves" [16]....

    [...]
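For contrast, the parametric alternative the excerpt sets aside, a mixture of two Poisson distributions, can be fit by EM in a few lines. This is a generic, hypothetical sketch, not code from the citing thesis; the initialization and iteration count are arbitrary choices.

```python
import numpy as np

def fit_two_poisson(counts, iters=200):
    """EM for a two-component Poisson mixture: pi*Pois(lam1) + (1-pi)*Pois(lam2)."""
    x = np.asarray(counts, dtype=float)
    pi, lam1, lam2 = 0.5, x.mean() * 0.5 + 0.1, x.mean() * 1.5 + 0.1
    for _ in range(iters):
        # E-step: responsibility of component 1; the log(x!) term cancels
        # between the two components, so it is omitted.
        logp1 = np.log(pi) + x * np.log(lam1) - lam1
        logp2 = np.log(1.0 - pi) + x * np.log(lam2) - lam2
        r = 1.0 / (1.0 + np.exp(logp2 - logp1))
        # M-step: mixing weight and rates as responsibility-weighted means.
        pi = float(np.clip(r.mean(), 1e-6, 1.0 - 1e-6))
        lam1 = (r * x).sum() / r.sum()
        lam2 = ((1.0 - r) * x).sum() / (1.0 - r).sum()
    return pi, lam1, lam2
```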

Journal ArticleDOI
TL;DR: The results show the importance of taking characteristics of several regions of the recorded electropherogram into account in order to get a robust and reliable prediction of RNA integrity, especially if compared to traditional methods.
Abstract: The integrity of RNA molecules is of paramount importance for experiments that try to reflect the snapshot of gene expression at the moment of RNA extraction. Until recently, there has been no reliable standard for estimating the integrity of RNA samples and the ratio of 28S:18S ribosomal RNA, the common measure for this purpose, has been shown to be inconsistent. The advent of microcapillary electrophoretic RNA separation provides the basis for an automated high-throughput approach, in order to estimate the integrity of RNA samples in an unambiguous way. A method is introduced that automatically selects features from signal measurements and constructs regression models based on a Bayesian learning technique. Feature spaces of different dimensionality are compared in the Bayesian framework, which allows selecting a final feature combination corresponding to models with high posterior probability. This approach is applied to a large collection of electrophoretic RNA measurements recorded with an Agilent 2100 bioanalyzer to extract an algorithm that describes RNA integrity. The resulting algorithm is a user-independent, automated and reliable procedure for standardization of RNA quality control that allows the calculation of an RNA integrity number (RIN). Our results show the importance of taking characteristics of several regions of the recorded electropherogram into account in order to get a robust and reliable prediction of RNA integrity, especially if compared to traditional methods.
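The paper's central move, comparing feature spaces of different dimensionality by posterior model probability, can be illustrated with a deliberately simplified stand-in: scoring feature subsets of a linear model by BIC, which approximates the log posterior model probability under a flat model prior. The function names and the use of BIC in place of the paper's full Bayesian learning technique are assumptions for illustration.

```python
import itertools
import numpy as np

def bic_linear(X, y):
    """BIC of an ordinary least-squares fit with Gaussian noise (lower is better)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = (resid @ resid) / n
    return n * np.log(sigma2) + (k + 1) * np.log(n)

def best_feature_subset(X, y, max_size=3):
    """Exhaustively score small feature subsets; low BIC ~ high posterior probability."""
    best_score, best_idx = np.inf, None
    for size in range(1, max_size + 1):
        for idx in itertools.combinations(range(X.shape[1]), size):
            score = bic_linear(X[:, list(idx)], y)
            if score < best_score:
                best_score, best_idx = score, idx
    return best_idx, best_score
```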

2,406 citations

Book
01 Nov 1989
TL;DR: This book develops Bayesian forecasting around the Dynamic Linear Model (DLM), introducing it through the first-order polynomial and dynamic regression special cases and extending it to seasonal, multi-process, non-linear, and multivariate models.
Abstract: Contents: Introduction to the DLM: The First-Order Polynomial Model; Introduction to the DLM: The Dynamic Regression Model; The Dynamic Linear Model; Univariate Time Series DLM Theory; Model Specification and Design; Polynomial Trend Models; Seasonal Models; Regression, Autoregression, and Related Models; Illustrations and Extensions of Standard DLMs; Intervention and Monitoring; Multi-Process Models; Non-Linear Dynamic Models: Analytic and Numerical Approximations; Exponential Family Dynamic Models; Simulation-Based Methods in Dynamic Models; Multivariate Modelling and Forecasting; Distribution Theory and Linear Algebra.
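The book's entry point, the first-order polynomial DLM (a local level model), has a closed-form filtering recursion. Below is a minimal sketch using the standard recurrences R_t = C_{t-1} + W, Q_t = R_t + V, A_t = R_t / Q_t; the function name and default prior are illustrative choices.

```python
def filter_first_order_dlm(ys, v, w, m0=0.0, c0=1e6):
    """Forward filtering for the first-order polynomial DLM:
        y_t  = mu_t + nu_t,         nu_t    ~ N(0, v)
        mu_t = mu_{t-1} + omega_t,  omega_t ~ N(0, w)
    Returns the posterior mean and variance (m_t, C_t) at each time."""
    m, c = m0, c0
    history = []
    for y in ys:
        r = c + w            # prior variance:        R_t = C_{t-1} + W
        q = r + v            # forecast variance:     Q_t = R_t + V
        a = r / q            # adaptive coefficient:  A_t = R_t / Q_t
        m = m + a * (y - m)  # posterior mean:        m_t = m_{t-1} + A_t * e_t
        c = a * v            # posterior variance:    C_t = A_t * V
        history.append((m, c))
    return history
```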

2,129 citations


Cites background or methods from "Density Estimation for Statistics a..."

  • ...Conventional density estimation techniques (Silverman 1986) choose the window width h as a slowly decreasing function of n, so that the kernel components are naturally more concentrated about the locations θj for larger sample sizes....

    [...]

  • ...Useful background on posterior simulation appears in Bernardo and Smith (1994, Section 5.5), and Gelman, Carlin, Stern and Rubin (1995), Chapters 10 and 11....

    [...]

  • ...through Woodward and Goldsmith (1964), and of British Nylon Spinners, later to become part of ICI, through Ewan and Kemp (1960)....

    [...]
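The bandwidth behaviour described in the first excerpt above is exactly what Silverman's (1986) rule of thumb delivers: the window width shrinks like n^(-1/5), so kernel components concentrate around the data as the sample grows. A minimal sketch of a Gaussian kernel density estimate using that rule (the rule is from the book; the function names are illustrative):

```python
import numpy as np

def silverman_bandwidth(x):
    """Silverman's rule-of-thumb window width:
    h = 0.9 * min(std, IQR / 1.34) * n**(-1/5), a slowly decreasing function of n."""
    x = np.asarray(x, dtype=float)
    iqr = np.subtract(*np.percentile(x, [75, 25]))
    scale = min(x.std(ddof=1), iqr / 1.34)
    return 0.9 * scale * x.size ** (-0.2)

def kde(grid, x, h=None):
    """Gaussian kernel density estimate of the sample x, evaluated at the grid points."""
    grid = np.asarray(grid, dtype=float)
    x = np.asarray(x, dtype=float)
    h = silverman_bandwidth(x) if h is None else h
    z = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (x.size * h * np.sqrt(2.0 * np.pi))
```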

Proceedings ArticleDOI
20 Aug 2006
TL;DR: This work presents a method for "compressing" large, complex ensembles into smaller, faster models, usually without significant loss in performance.
Abstract: Often the best performing supervised learning models are ensembles of hundreds or thousands of base-level classifiers. Unfortunately, the space required to store this many classifiers, and the time required to execute them at run-time, prohibits their use in applications where test sets are large (e.g. Google), where storage space is at a premium (e.g. PDAs), and where computational power is limited (e.g. hearing aids). We present a method for "compressing" large, complex ensembles into smaller, faster models, usually without significant loss in performance.
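Schematically, the method labels a large pool of unlabeled (possibly synthetic) inputs with the ensemble's predicted probabilities and then trains one small model to reproduce those outputs. The sketch below uses scikit-learn estimators as stand-ins; the paper itself compresses into neural networks and generates pseudo-data with schemes such as MUNGE.

```python
from sklearn.neural_network import MLPRegressor

def compress(ensemble, unlabeled_X):
    """Train a small 'student' model to mimic a large ensemble's soft predictions."""
    soft_labels = ensemble.predict_proba(unlabeled_X)[:, 1]  # teacher's P(class = 1)
    student = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500)
    student.fit(unlabeled_X, soft_labels)  # fit to teacher outputs, not true labels
    return student
```

At run time only the student is kept, trading a small loss in accuracy for orders of magnitude less storage and compute.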

2,091 citations