Topic

Resampling

About: Resampling is a research topic. Over its lifetime, 5,428 publications have appeared within this topic, receiving 242,291 citations.


Papers
Proceedings ArticleDOI
20 Jul 2008
TL;DR: This paper presents a cluster-based resampling method to select better pseudo-relevant documents based on the relevance model, and shows higher relevance density than the baseline relevance model on all collections, resulting in better retrieval accuracy in pseudo-relevance feedback.
Abstract: Typical pseudo-relevance feedback methods assume the top-retrieved documents are relevant and use these pseudo-relevant documents to expand terms. The initial retrieval set can, however, contain a great deal of noise. In this paper, we present a cluster-based resampling method to select better pseudo-relevant documents based on the relevance model. The main idea is to use document clusters to find dominant documents for the initial retrieval set, and to repeatedly feed the documents to emphasize the core topics of a query. Experimental results on large-scale web TREC collections show significant improvements over the relevance model. For justification of the resampling approach, we examine relevance density of feedback documents. A higher relevance density will result in greater retrieval accuracy, ultimately approaching true relevance feedback. The resampling approach shows higher relevance density than the baseline relevance model on all collections, resulting in better retrieval accuracy in pseudo-relevance feedback. This result indicates that the proposed method is effective for pseudo-relevance feedback.

174 citations

Journal ArticleDOI
TL;DR: In this article, a new statistical method for regional climate simulations is introduced, which is constrained only by the parameters of a linear regression line for a characteristic climatological variable, and is evaluated by means of a cross validation experiment for the Elbe river basin.
Abstract: A new statistical method for regional climate simulations is introduced. Its simulations are constrained only by the parameters of a linear regression line for a characteristic climatological variable. Simulated series are generated by resampling from segments of observation series such that the resulting series comply with the prescribed regression parameters and possess realistic annual cycles and persistence. The resampling guarantees that the simulated series are physically consistent both with respect to the combinations of different meteorological variables and to their spatial distribution at each time step. The resampling approach is evaluated by means of a cross validation experiment for the Elbe river basin: Its simulations are compared both to an observed climatology and to data simulated by a dynamical RCM. This cross validation shows that the approach is able to reproduce the observed climatology with respect to statistics such as long-term means, persistence features (e.g., dry spells) and extreme events. The agreement of its simulations with the observational data is much closer than for the RCM data.

173 citations
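
The idea of building a simulated series from segments of an observed series, subject to a prescribed regression line, can be illustrated with a very reduced sketch. Everything here is an assumption made for illustration and not the authors' procedure: the function names (`ols_slope`, `resample_blocks`, `simulate_with_trend`), the fixed block length, and the selection rule (draw many candidate series and keep the one whose least-squares slope is closest to the target). It also handles a single variable only, whereas the abstract describes resampling that keeps several variables and locations consistent.

```python
import random

def ols_slope(y):
    """Least-squares slope of y regressed on its index 0..n-1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def resample_blocks(series, block_len, rng):
    """Concatenate randomly chosen contiguous blocks until the original length is reached."""
    n = len(series)
    out = []
    while len(out) < n:
        start = rng.randrange(0, n - block_len + 1)
        out.extend(series[start:start + block_len])
    return out[:n]

def simulate_with_trend(series, block_len, target_slope, n_candidates=200, seed=0):
    """Hypothetical selection rule: keep the candidate whose slope is closest to the target."""
    rng = random.Random(seed)
    candidates = (resample_blocks(series, block_len, rng) for _ in range(n_candidates))
    return min(candidates, key=lambda s: abs(ols_slope(s) - target_slope))

# Toy monthly series: a 12-step cycle plus a weak trend (made-up data).
observed = [float(i % 12) + 0.05 * i for i in range(120)]
simulated = simulate_with_trend(observed, block_len=12, target_slope=0.05)
```

Because whole blocks are copied, every simulated value is an observed value and short-range persistence inside a block is preserved; only the ordering of blocks changes.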

Proceedings ArticleDOI
27 Aug 2003
TL;DR: In this paper, a particle system approximation to the probability hypothesis density (PHD) is presented, where the particle weights are updated upon receiving an observation, taking into account the sensor likelihood function, and then propagated forward in time by sampling from a Markov transition density.
Abstract: We report here on the implementation of a particle systems approximation to the probability hypothesis density (PHD). The PHD of the multitarget posterior density has the property that, given any volume of state space, the integral of the PHD over that volume yields the expected number of targets present in the volume. As in the single target setting, upon receipt of an observation, the particle weights are updated, taking into account the sensor likelihood function, and then propagated forward in time by sampling from a Markov transition density. We also incorporate resampling and regularization into our implementation, introducing the new concept of cluster resampling.

173 citations
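
The update, propagate, and resample cycle mentioned in the abstract can be sketched for the simpler single-target case. This is a generic bootstrap particle filter with systematic resampling, not the PHD filter or the cluster resampling introduced in the paper. The Gaussian sensor likelihood, the random-walk transition density, and all names and parameter values (`obs_sigma`, `proc_sigma`, `filter_step`) are assumptions chosen for the example.

```python
import math
import random

def gaussian_likelihood(z, x, sigma):
    """Unnormalized likelihood of observation z given state x (assumed sensor model)."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2)

def systematic_resample(particles, weights, rng):
    """Draw len(particles) particles with probability proportional to their weights."""
    n = len(particles)
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("all particle weights are zero")
    cumulative, acc = [], 0.0
    for w in weights:
        acc += w / total
        cumulative.append(acc)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    u0 = rng.random() / n  # one random offset, then evenly spaced positions
    out, j = [], 0
    for i in range(n):
        u = u0 + i / n
        while cumulative[j] < u:
            j += 1
        out.append(particles[j])
    return out

def filter_step(particles, weights, z, rng, obs_sigma=1.0, proc_sigma=0.5):
    """One cycle: weight update, resampling, then propagation."""
    # 1. Update the weights with the sensor likelihood of the new observation.
    weights = [w * gaussian_likelihood(z, x, obs_sigma)
               for x, w in zip(particles, weights)]
    # 2. Resample so that heavily weighted particles are duplicated.
    particles = systematic_resample(particles, weights, rng)
    # 3. Propagate by sampling from the (assumed random-walk) transition density.
    particles = [x + rng.gauss(0.0, proc_sigma) for x in particles]
    n = len(particles)
    return particles, [1.0 / n] * n

rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
weights = [1.0 / len(particles)] * len(particles)
for _ in range(10):  # repeated observations of a target near x = 3
    particles, weights = filter_step(particles, weights, 3.0, rng)
estimate = sum(particles) / len(particles)
```

Resampling after every step is the simplest choice; many implementations resample only when the effective sample size drops below a threshold.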

Journal ArticleDOI
TL;DR: This paper presents a particle-based nonlinear filtering scheme, related to recent work on chainless Monte Carlo, that is designed to focus particle paths sharply so that fewer particles are required.
Abstract: We present a particle-based nonlinear filtering scheme, related to recent work on chainless Monte Carlo, designed to focus particle paths sharply so that fewer particles are required. The main features of the scheme are a representation of each new probability density function by means of a set of functions of Gaussian variables (a distinct function for each particle and step) and a resampling based on normalization factors and Jacobians. The construction is demonstrated on a standard, ill-conditioned test problem.

173 citations

Journal ArticleDOI
TL;DR: In this article, the effects of spatial autocorrelation on hyperparameter tuning and performance estimation are examined by comparing several widely used machine-learning algorithms, such as boosted regression trees (BRT), k-nearest neighbor (KNN), random forest (RF) and support vector machine (SVM), with traditional parametric algorithms such as logistic regression (GLM) and semi-parametric ones like generalized additive models (GAM) in terms of predictive performance.

173 citations
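
When observations are spatially autocorrelated, a common resampling remedy is to keep neighbouring points in the same cross-validation fold so that test points are not surrounded by near-identical training points. The sketch below shows one simple way to do this with square grid blocks. The grid-based blocking, the function names (`spatial_block_folds`, `train_test_splits`), and the `block_size` parameter are assumptions for illustration; the summary above does not say how the article partitions its data.

```python
import random

def block_of(x, y, block_size):
    """Index of the square grid block that contains the point (x, y)."""
    return (int(x // block_size), int(y // block_size))

def spatial_block_folds(coords, n_folds, block_size, seed=0):
    """Return one fold index per point; points in the same block share a fold."""
    blocks = sorted({block_of(x, y, block_size) for x, y in coords})
    rng = random.Random(seed)
    rng.shuffle(blocks)  # random assignment of whole blocks to folds
    block_to_fold = {b: i % n_folds for i, b in enumerate(blocks)}
    return [block_to_fold[block_of(x, y, block_size)] for x, y in coords]

def train_test_splits(folds, n_folds):
    """Yield (train_indices, test_indices) for each fold."""
    for k in range(n_folds):
        test = [i for i, f in enumerate(folds) if f == k]
        train = [i for i, f in enumerate(folds) if f != k]
        yield train, test

# Toy data: a 10 x 10 lattice of sample locations (made-up coordinates).
coords = [(float(x), float(y)) for x in range(10) for y in range(10)]
folds = spatial_block_folds(coords, n_folds=4, block_size=5.0)
splits = list(train_test_splits(folds, n_folds=4))
```

The same fold assignment can be reused for both hyperparameter tuning and performance estimation, so that neither step scores a model on points adjacent to its training data.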


Network Information
Related Topics (5)
Estimator
97.3K papers, 2.6M citations
89% related
Inference
36.8K papers, 1.3M citations
87% related
Sampling (statistics)
65.3K papers, 1.2M citations
86% related
Regression analysis
31K papers, 1.7M citations
86% related
Markov chain
51.9K papers, 1.3M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2025    1
2024    2
2023    377
2022    759
2021    275
2020    279