
Showing papers on "Pooling" published in 2012


Book ChapterDOI
07 Oct 2012
TL;DR: This paper introduces multiplicative second-order analogues of average and max-pooling that together with appropriate non-linearities lead to state-of-the-art performance on free-form region recognition, without any type of feature coding.
Abstract: Feature extraction, coding, and pooling are important components of many contemporary object recognition paradigms. In this paper we explore novel pooling techniques that encode the second-order statistics of local descriptors inside a region. To achieve this effect, we introduce multiplicative second-order analogues of average and max-pooling that, together with appropriate non-linearities, lead to state-of-the-art performance on free-form region recognition, without any type of feature coding. Instead of coding, we found that enriching local descriptors with additional image information leads to large performance gains, especially in conjunction with the proposed pooling methodology. We show that second-order pooling over free-form regions produces results superior to those of the winning systems in the Pascal VOC 2011 semantic segmentation challenge, with models that are 20,000 times faster.
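A minimal sketch of the pooling idea described above, assuming local descriptors are stored as rows of an (n, d) array and using a matrix-logarithm non-linearity as one plausible choice; this is illustrative, not the authors' exact pipeline.

```python
import numpy as np

def second_order_avg_pool(X, eps=1e-6):
    """Average of outer products x x^T over a region's descriptors (rows of X),
    followed by a matrix-log non-linearity (an assumed, commonly used choice)."""
    G = (X.T @ X) / len(X)                        # (d, d) averaged second-order statistic
    w, V = np.linalg.eigh(G + eps * np.eye(G.shape[0]))
    G_log = V @ np.diag(np.log(w)) @ V.T          # log-Euclidean mapping
    return G_log[np.triu_indices_from(G_log)]     # vectorize the upper triangle

def second_order_max_pool(X):
    """Element-wise max over outer products: a multiplicative analogue of max-pooling."""
    outers = X[:, :, None] * X[:, None, :]        # (n, d, d) outer products
    M = outers.max(axis=0)
    return M[np.triu_indices_from(M)]
```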

547 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: This paper shows that learning more adaptive receptive fields increases performance even with a significantly smaller codebook size at the coding layer, and adopts the idea of over-completeness to learn the optimal pooling parameters.
Abstract: In this paper we examine the effect of receptive field designs on classification accuracy in the commonly adopted pipeline of image classification. While existing algorithms usually use manually defined spatial regions for pooling, we show that learning more adaptive receptive fields increases performance even with a significantly smaller codebook size at the coding layer. To learn the optimal pooling parameters, we adopt the idea of over-completeness by starting with a large number of receptive field candidates, and train a classifier with structured sparsity to only use a sparse subset of all the features. An efficient algorithm based on incremental feature selection and retraining is proposed for fast learning. With this method, we achieve the best published performance on the CIFAR-10 dataset, using a much lower dimensional feature space than previous methods.
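As a rough illustration of the over-complete receptive-field idea, the sketch below enumerates many candidate rectangular pooling regions on a spatial grid and max-pools coded features inside each; the structured-sparsity classifier that selects a sparse subset is omitted, and the region-generation scheme is an assumption for illustration.

```python
import numpy as np

def candidate_regions(grid=4):
    """All axis-aligned sub-rectangles of a grid x grid layout (an over-complete set)."""
    return [(x0, y0, x1, y1)
            for x0 in range(grid) for y0 in range(grid)
            for x1 in range(x0 + 1, grid + 1) for y1 in range(y0 + 1, grid + 1)]

def pool_over_candidates(codes, xy, regions, grid=4):
    """Max-pool K-dim codes of descriptors (locations xy in [0,1]^2) inside each candidate."""
    cells = np.clip((xy * grid).astype(int), 0, grid - 1)
    feats = []
    for x0, y0, x1, y1 in regions:
        m = (cells[:, 0] >= x0) & (cells[:, 0] < x1) & (cells[:, 1] >= y0) & (cells[:, 1] < y1)
        feats.append(codes[m].max(axis=0) if m.any() else np.zeros(codes.shape[1]))
    # A classifier with structured sparsity would then keep only a sparse subset of regions.
    return np.concatenate(feats)
```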

284 citations


Book ChapterDOI
07 Oct 2012
TL;DR: A framework is proposed that learns object detectors using only image-level class labels, or so-called weak labels; the learned detectors are comparable in accuracy with state-of-the-art weakly supervised detection methods, and the resulting object-centric pooling significantly outperforms SPM-based pooling in image classification.
Abstract: Spatial pyramid matching (SPM) based pooling has been the dominant choice for state-of-art image classification systems. In contrast, we propose a novel object-centric spatial pooling (OCP) approach, following the intuition that knowing the location of the object of interest can be useful for image classification. OCP consists of two steps: (1) inferring the location of the objects, and (2) using the location information to pool foreground and background features separately to form the image-level representation. Step (1) is particularly challenging in a typical classification setting where precise object location annotations are not available during training. To address this challenge, we propose a framework that learns object detectors using only image-level class labels, or so-called weak labels. We validate our approach on the challenging PASCAL07 dataset. Our learned detectors are comparable in accuracy with state-of-the-art weakly supervised detection methods. More importantly, the resulting OCP approach significantly outperforms SPM-based pooling in image classification.
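A minimal sketch of the pooling step (2) above, assuming descriptor codes, their image coordinates, and an inferred object box are already available; inferring the box from weak labels, the hard part of the paper, is not shown.

```python
import numpy as np

def object_centric_pool(codes, xy, box):
    """Pool foreground (inside the inferred box) and background codes separately."""
    x0, y0, x1, y1 = box
    inside = (xy[:, 0] >= x0) & (xy[:, 0] < x1) & (xy[:, 1] >= y0) & (xy[:, 1] < y1)
    fg = codes[inside].max(axis=0) if inside.any() else np.zeros(codes.shape[1])
    bg = codes[~inside].max(axis=0) if (~inside).any() else np.zeros(codes.shape[1])
    return np.concatenate([fg, bg])   # image-level representation
```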

227 citations


Proceedings Article
18 Apr 2012
TL;DR: This work augments the traditional ConvNet architecture by learning multi-stage features and by using Lp pooling, establishing a new state-of-the-art of 95.10% accuracy on the SVHN dataset (a 48% error improvement).
Abstract: We classify digits of real-world house numbers using convolutional neural networks (ConvNets). ConvNets are hierarchical feature learning neural networks whose structure is biologically inspired. Unlike many popular vision approaches that are hand-designed, ConvNets can automatically learn a unique set of features optimized for a given task. We augment the traditional ConvNet architecture by learning multi-stage features and by using Lp pooling, and establish a new state-of-the-art of 95.10% accuracy on the SVHN dataset (a 48% error improvement). Furthermore, we analyze the benefits of different pooling methods and multi-stage features in ConvNets. The source code and a tutorial are available at eblearn.sf.net.
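Lp pooling itself is simple to state; the sketch below is a plain, unweighted version over one pooling window, where p = 1 behaves like average pooling and large p approaches max pooling. The Gaussian-weighted variant used in the paper and the eblearn implementation details are not reproduced here.

```python
import numpy as np

def lp_pool(window, p=2.0):
    """Unweighted Lp pooling of activations in one window: (mean |x|^p)^(1/p)."""
    x = np.abs(np.asarray(window, dtype=float))
    return np.mean(x ** p) ** (1.0 / p)

acts = [0.1, 0.4, 0.9, 0.2]
print(lp_pool(acts, p=1.0))   # 0.4, the plain average
print(lp_pool(acts, p=8.0))   # ~0.76, moving toward max(acts) = 0.9
```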

177 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: This work develops representations for poselet-based pose normalization using both explicit warping and implicit pooling as mechanisms and defines a pose normalized similarity or kernel function that is suitable for nearest-neighbor or kernel-based learning methods.
Abstract: The ability to normalize pose based on super-category landmarks can significantly improve models of individual categories when training data are limited. Previous methods have considered the use of volumetric or morphable models for faces and for certain classes of articulated objects. We consider methods which impose fewer representational assumptions on categories of interest, and exploit contemporary detection schemes which consider the ensemble of responses of detectors trained for specific pose-keypoint configurations. We develop representations for poselet-based pose normalization using both explicit warping and implicit pooling as mechanisms. Our method defines a pose normalized similarity or kernel function that is suitable for nearest-neighbor or kernel-based learning methods.

159 citations


Book ChapterDOI
05 Nov 2012
TL;DR: The results show that the new encoding methods can significantly improve recognition accuracy compared with classical VQ; among them, Fisher kernel encoding and sparse encoding have the best performance.
Abstract: Bag of visual words (BoVW) models have been widely and successfully used in video based action recognition. One key step in constructing BoVW representation is to encode feature with a codebook. Recently, a number of new encoding methods have been developed to improve the performance of BoVW based object recognition and scene classification, such as soft assignment encoding [1], sparse encoding [2], locality-constrained linear encoding [3] and Fisher kernel encoding [4]. However, their effects for action recognition are still unknown. The main objective of this paper is to evaluate and compare these new encoding methods in the context of video based action recognition. We also analyze and evaluate the combination of encoding methods with different pooling and normalization strategies. We carry out experiments on KTH dataset [5] and HMDB51 dataset [6]. The results show the new encoding methods can significantly improve the recognition accuracy compared with classical VQ. Among them, Fisher kernel encoding and sparse encoding have the best performance. By properly choosing pooling and normalization methods, we achieve the state-of-the-art performance on HMDB51.
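For reference, the pooling and normalization strategies compared in evaluations like this one typically reduce to a few lines; the sketch below shows sum- vs. max-pooling of per-descriptor codes followed by power and L2 normalization. The specific combinations evaluated in the paper are not reproduced here.

```python
import numpy as np

def pool_and_normalize(codes, pooling="max", power=0.5):
    """codes: (n_descriptors, K) encoded features for one video clip."""
    v = codes.max(axis=0) if pooling == "max" else codes.sum(axis=0)
    v = np.sign(v) * np.abs(v) ** power           # power (signed square-root) normalization
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v            # L2 normalization
```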

147 citations


Journal ArticleDOI
01 Apr 2012
TL;DR: The two-stage process and relevant work on existing visual quality metrics are first introduced, followed by an in-depth analysis of SVD for visual quality assessment; experiments show the proposed method outperforms eight existing relevant schemes.
Abstract: We study the use of machine learning for visual quality evaluation with comprehensive singular value decomposition (SVD)-based visual features. In this paper, the two-stage process and the relevant work in the existing visual quality metrics are first introduced followed by an in-depth analysis of SVD for visual quality assessment. Singular values and vectors form the selected features for visual quality assessment. Machine learning is then used for the feature pooling process and demonstrated to be effective. This is to address the limitations of the existing pooling techniques, like simple summation, averaging, Minkowski summation, etc., which tend to be ad hoc. We advocate machine learning for feature pooling because it is more systematic and data driven. The experiments show that the proposed method outperforms the eight existing relevant schemes. Extensive analysis and cross validation are performed with ten publicly available databases (eight for images with a total of 4042 test images and two for video with a total of 228 videos). We use all publicly accessible software and databases in this study, as well as making our own software public, to facilitate comparison in future research.

142 citations


Proceedings ArticleDOI
09 Jul 2012
TL;DR: In this paper, the validity of Dempster-Shafer theory for solving practical problems is challenged by using an emblematic example to show that the DS rule produces a counter-intuitive result.
Abstract: We challenge the validity of Dempster-Shafer Theory by using an emblematic example to show that the DS rule produces a counter-intuitive result. Further analysis reveals that the result comes from an understanding of evidence pooling which goes against the common expectation of this process. Although DS theory has attracted some interest of the scientific community working in information fusion and artificial intelligence, its validity for solving practical problems is problematic, because it is not applicable to evidence combination in general, but only to certain types of situations which still need to be clearly identified.
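The abstract does not spell out its emblematic example; the sketch below works through Dempster's combination rule on the classic two-source example usually attributed to Zadeh (an assumption, since the paper only says "an emblematic example"), which exhibits exactly this kind of counter-intuitive pooling outcome.

```python
def dempster_combine(m1, m2):
    """Dempster's rule for two mass functions over frozenset-valued hypotheses."""
    joint, conflict = {}, 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            inter = a & b
            if inter:
                joint[inter] = joint.get(inter, 0.0) + pa * pb
            else:
                conflict += pa * pb               # mass assigned to contradictory pairs
    return {h: v / (1.0 - conflict) for h, v in joint.items()}, conflict

m1 = {frozenset({"meningitis"}): 0.99, frozenset({"tumor"}): 0.01}
m2 = {frozenset({"concussion"}): 0.99, frozenset({"tumor"}): 0.01}
print(dempster_combine(m1, m2))
# -> ({frozenset({'tumor'}): 1.0}, ~0.9999): the hypothesis both sources rated
#    at 0.01 receives all the combined mass once the conflict is renormalized.
```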

111 citations


Journal ArticleDOI
TL;DR: EBMA improves prediction by pooling information from multiple forecast models to generate ensemble predictions similar to a weighted average of component forecasts, and increases the accuracy of out-of-sample forecasts relative to component models in three applications.
Abstract: We present ensemble Bayesian model averaging (EBMA) and illustrate its ability to aid scholars in the social sciences to make more accurate forecasts of future events. In essence, EBMA improves prediction by pooling information from multiple forecast models to generate ensemble predictions similar to a weighted average of component forecasts. The weight assigned to each forecast is calibrated via its performance in some validation period. The aim is not to choose some “best” model, but rather to incorporate the insights and knowledge implicit in various forecasting efforts via statistical postprocessing. After presenting the method, we show that EBMA increases the accuracy of out-of-sample forecasts relative to component models in three applied examples: predicting the occurrence of insurgencies around the Pacific Rim, forecasting vote shares in U.S. presidential elections, and predicting the votes of U.S. Supreme Court Justices.
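A minimal sketch of the pooling idea, assuming component weights are derived from validation-period skill via inverse RMSE; the actual EBMA procedure fits weights (and component variances) by EM, so this is only illustrative.

```python
import numpy as np

def ensemble_forecast(test_preds, val_preds, val_truth):
    """test_preds, val_preds: (n_models, n_obs) arrays of component forecasts."""
    rmse = np.sqrt(((val_preds - val_truth) ** 2).mean(axis=1))
    w = 1.0 / rmse
    w /= w.sum()                                  # weights calibrated on the validation period
    return w @ test_preds                         # ensemble = weighted average of components
```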

106 citations


Journal ArticleDOI
02 May 2012
TL;DR: A multi-server model that captures a performance trade-off between centralized and distributed processing is proposed and analyzed, demonstrating a surprising phase transition in the steady-state delay scaling.
Abstract: We propose and analyze a multi-server model that captures a performance trade-off between centralized and distributed processing. In our model, a fraction p of an available resource is deployed in a centralized manner (e.g., to serve a most-loaded station) while the remaining fraction 1 − p is allocated to local servers that can only serve requests addressed specifically to their respective stations. Using a fluid model approach, we demonstrate a surprising phase transition in the steady-state delay scaling, as p changes: in the limit of a large number of stations, and when any amount of centralization is available (p > 0), the average queue length in steady state scales as log_{1/(1-p)}(1/(1-λ)) when the traffic intensity λ goes to 1. This is exponentially smaller than the usual M/M/1-queue delay scaling of 1/(1-λ), obtained when all resources are fully allocated to local stations (p = 0). This indicates a strong qualitative impact of even a small degree of resource pooling. We prove convergence to a fluid limit, and ...
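A quick numeric check of the reconstructed scalings log_{1/(1-p)}(1/(1-λ)) versus 1/(1-λ); the exact constants are illustrative, but the gap as λ approaches 1 is the point.

```python
import math

lam = 0.99                                        # traffic intensity close to 1
local_only = 1 / (1 - lam)                        # p = 0, M/M/1-style scaling: 100.0
for p in (0.05, 0.5):
    centralized = math.log(1 / (1 - lam)) / math.log(1 / (1 - p))
    print(p, round(centralized, 1))               # ~89.8 for p = 0.05, ~6.6 for p = 0.5
```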

94 citations


Patent
10 May 2012
TL;DR: In this paper, the authors present an energy pooling station that combines renewable energy, utility energy and back-up power services in the form of Green and Black Energy with energy storage to create a multi-income stream.
Abstract: The present invention provides for multiple energy pooling stations to combine renewable energy, utility energy and back-up power services in the form of Green and Black Energy with energy storage to create a multi-income stream. An energy pooling station is an advanced part of an evolving "energy network" in which multiple energy pooling stations are communicating with each other to share energy credit, bank energy, and distribute the energy to customers.

Journal ArticleDOI
TL;DR: It is found that, although a simple visuomotor behavior such as short-latency ocular following responses takes advantage of the full distribution of motion signals, perceptual speed discrimination is impaired for stimuli with large bandwidths.
Abstract: Moving objects generate motion information at different scales, but it is not known how the brain pools all of this information to reconstruct object speed and whether pooling depends on the purpose for which the information will be used. Here the authors find task-dependent differences in pooling that can be explained by an adaptive gain control mechanism.

Journal ArticleDOI
TL;DR: In this paper, the forecasting performance of leading indicators for industrial production in Germany was analyzed both before and during the financial crisis, and the stability of forecasting models during the most recent financial crisis was investigated.

Book ChapterDOI
07 Oct 2012
TL;DR: A new visual representation, namely scene aligned pooling, for the task of event recognition in complex videos is proposed and can consistently improve various kinds of visual features such as different low-level color and texture features, or middle-level histogram of local descriptors such as SIFT, or space-time interest points, and high level semantic model features.
Abstract: Real-world videos often contain dynamic backgrounds and evolving people activities, especially for those web videos generated by users in unconstrained scenarios. This paper proposes a new visual representation, namely scene aligned pooling, for the task of event recognition in complex videos. Based on the observation that a video clip is often composed of shots of different scenes, the key idea of scene aligned pooling is to decompose any video features into concurrent scene components, and to construct classification models adaptive to different scenes. The experiments on two large scale real-world datasets including the TRECVID Multimedia Event Detection 2011 and the Human Motion Recognition Databases (HMDB) show that our new visual representation can consistently improve various kinds of visual features such as different low-level color and texture features, or middle-level histogram of local descriptors such as SIFT, or space-time interest points, and high level semantic model features, by a significant margin. For example, we improve the state-of-the-art accuracy on the HMDB dataset by 20% in terms of accuracy.

Book ChapterDOI
05 Nov 2012
TL;DR: This work introduces spatially local coding, an alternative way to include spatial information in the image model that performs better than all previous single-feature methods when tested on the Caltech 101 and 256 object recognition datasets.
Abstract: The spatial pyramid and its variants have been among the most popular and successful models for object recognition. In these models, local visual features are coded across elements of a visual vocabulary, and then these codes are pooled into histograms at several spatial granularities. We introduce spatially local coding, an alternative way to include spatial information in the image model. Instead of only coding visual appearance and leaving the spatial coherence to be represented by the pooling stage, we include location as part of the coding step. This is a more flexible spatial representation as compared to the fixed grids used in the spatial pyramid models and we can use a simple, whole-image region during the pooling stage. We demonstrate that combining features with multiple levels of spatial locality performs better than using just a single level. Our model performs better than all previous single-feature methods when tested on the Caltech 101 and 256 object recognition datasets.
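A minimal sketch of the coding-stage change described above: location is appended to each descriptor before vocabulary learning and coding, so that a single whole-image pooling region suffices. The scaling factor controlling spatial locality is an illustrative parameter, not the paper's exact formulation.

```python
import numpy as np

def spatially_augment(descriptors, xy, locality=1.5):
    """Append (scaled) normalized image coordinates to each local descriptor,
    so that appearance and location are coded jointly."""
    return np.hstack([descriptors, locality * xy])   # xy assumed in [0, 1]^2

# A vocabulary learned and codes assigned on these augmented descriptors are
# then pooled over one whole-image region instead of a fixed spatial pyramid.
```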

Journal ArticleDOI
TL;DR: In this paper, an interval elimination procedure for bound contraction is proposed, which uses an MILP lower bound constructed using partitioning of certain variables, similar to the one used by other approaches.
Abstract: One of the biggest challenges in solving optimization engineering problems is rooted in the nonlinearities and nonconvexities, which arise from bilinear terms corresponding to component material balances and/or concave functions used to estimate the capital cost of equipment. The procedure proposed uses an MILP lower bound constructed using partitioning of certain variables, similar to the one used by other approaches. The core of the method is to bound contract a set of variables that are not necessarily the ones being partitioned. The procedure for bound contraction consists of a novel interval elimination procedure that has several variants. Once bound contraction is exhausted, the method increases the number of intervals or resorts to a branch and bound strategy where bound contraction takes place at each node. The procedure is illustrated with examples of water management and pooling problems. © 2011 American Institute of Chemical Engineers AIChE J, 58: 2320–2335, 2012

Journal ArticleDOI
TL;DR: This work studies the ex-ante efficient allocation of a set of quality-heterogeneous objects to a number of heterogeneous risk-neutral agents, which combines both pooling and screening of values.

Journal ArticleDOI
TL;DR: A model for inventing new signals is introduced in the context of sender–receiver games with reinforcement learning and helps agents avoid pooling and partial pooling equilibria.
Abstract: A model for inventing new signals is introduced in the context of sender–receiver games with reinforcement learning. If the invention parameter is set to zero, it reduces to basic Roth–Erev learning applied to acts rather than strategies, as in Argiento et al. (Stoch. Process. Appl. 119:373–390, 2009). If every act is uniformly reinforced in every state it reduces to the Chinese Restaurant Process—also known as the Hoppe–Polya urn—applied to each act. The dynamics can move players from one signaling game to another during the learning process. Invention helps agents avoid pooling and partial pooling equilibria.

Journal ArticleDOI
TL;DR: Idaho Chlamydia trachomatis-Neisseria gonorrhoeae specimens from July 2009 were pooled by stratified specimen pooling, an approach that removes high-risk specimens from the pooling population and pools low- risk specimens to maximize pooling efficiency.
Abstract: Idaho Chlamydia trachomatis-Neisseria gonorrhoeae specimens from July 2009 were pooled by stratified specimen pooling, an approach that removes high-risk specimens from the pooling population and pools low-risk specimens to maximize pooling efficiency. This approach reduced pool positivity rates by 8%, repeated tests by 9%, and saved 47.4% in direct costs.
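As background on why pooling low-risk specimens pays off, the arithmetic below works through a simple Dorfman-style two-stage pooling scheme; the study's stratified protocol and its exact savings are more involved, so the numbers are purely illustrative.

```python
def expected_tests_per_specimen(prevalence, pool_size):
    """Two-stage pooling: test the pool once, retest individuals only if it is positive."""
    p_pool_positive = 1 - (1 - prevalence) ** pool_size
    return 1 / pool_size + p_pool_positive

print(expected_tests_per_specimen(0.02, 4))   # ~0.33 tests per specimen vs. 1.0 unpooled
print(expected_tests_per_specimen(0.15, 4))   # ~0.73: pooling high-risk specimens helps far less
```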

01 Jan 2012
TL;DR: In this paper, the authors consider a stock point for expensive, low-usage items that is operated by multiple decision makers, each faces a Poisson demand process, and the joint stock point is controlled by a continuous-review base stock policy with full backordering.
Abstract: We consider a stock point for expensive, low-usage items that is operated by multiple decision makers. Each faces a Poisson demand process, and the joint stock point is controlled by a continuous-review base stock policy with full backordering. We consider penalty costs for backorders and holding costs for stock on hand. For this model, we derive structural properties of the resulting cost function. We use these to prove not only that it is cost effective to share one stock point with all parties involved, but also that collaboration (inventory pooling) can be supported by a stable cost allocation, i.e., the core of the associated cooperative game is non-empty. These results hold under optimized and under exogenously given base stock levels. For the former case, we further identify a stable cost allocation that would be easy to implement in practice and that induces players to reveal their private information truthfully.

Journal ArticleDOI
TL;DR: In this paper, the authors consider several independent decision makers who stock expensive, low-demand spare parts for their high-tech machines, and examine the stability of such pooling arrangements, and address the issue of fairly distributing the collective holding and downtime costs over the participants.
Abstract: We consider several independent decision makers who stock expensive, low-demand spare parts for their high-tech machines. They can collaborate by full pooling of their inventories via free transshipments. We examine the stability of such pooling arrangements, and we address the issue of fairly distributing the collective holding and downtime costs over the participants, by applying concepts from cooperative game theory. We consider two settings: one where each party maintains a predetermined stocking level and one where base stock levels are optimized. For the setting with fixed stocking levels, we unravel the possibly conflicting effects of implementing a full pooling arrangement and study these effects separately to establish intuitive conditions for existence of a stable cost allocation. For the setting with optimized stocking levels, we provide a simple proportional rule that accomplishes a population monotonic allocation scheme if downtime costs are symmetric among participants. Although our whole analysis is motivated by spare parts applications, all results are also applicable to other pooled resource systems of which the steady-state behavior is equivalent to that of an Erlang loss system. © 2012 Wiley Periodicals, Inc. Naval Research Logistics, 2012

Journal ArticleDOI
TL;DR: This paper investigates image features based on two-dimensional mel-cepstrum for the purpose of IQA and proposes a new metric by formulating IQA as a pattern recognition problem, which helps to overcome the limitations of the existing pooling methods.

Journal ArticleDOI
17 Feb 2012-PLOS ONE
TL;DR: It is concluded that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.
Abstract: Association tests that pool minor alleles into a measure of burden at a locus have been proposed for case-control studies using sequence data containing rare variants. However, such pooling tests are not robust to the inclusion of neutral and protective variants, which can mask the association signal from risk variants. Early studies proposing pooling tests dismissed methods for locus-wide inference using nonnegative single-variant test statistics based on unrealistic comparisons. However, such methods are robust to the inclusion of neutral and protective variants and therefore may be more useful than previously appreciated. In fact, some recently proposed methods derived within different frameworks are equivalent to performing inference on weighted sums of squared single-variant score statistics. In this study, we compared two existing methods for locus-wide inference using nonnegative single-variant test statistics to two widely cited pooling tests under more realistic conditions. We established analytic results for a simple model with one rare risk and one rare neutral variant, which demonstrated that pooling tests were less powerful than even Bonferroni-corrected single-variant tests in most realistic situations. We also performed simulations using variants with realistic minor allele frequency and linkage disequilibrium spectra, disease models with multiple rare risk variants and extensive neutral variation, and varying rates of missing genotypes. In all scenarios considered, existing methods using nonnegative single-variant test statistics had power comparable to or greater than two widely cited pooling tests. Moreover, in disease models with only rare risk variants, an existing method based on the maximum single-variant Cochran-Armitage trend chi-square statistic in the locus had power comparable to or greater than another existing method closely related to some recently proposed methods. We conclude that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.
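A minimal sketch of the two families of tests being compared, using a simple allele-count t-statistic in place of the Cochran-Armitage trend statistic (an assumption made for brevity; real implementations handle covariates, weights, and permutation-based corrections).

```python
import numpy as np
from scipy import stats

def burden_test(genotypes, is_case):
    """Pooling test: compare the per-subject sum of minor alleles between groups."""
    burden = genotypes.sum(axis=1)               # genotypes: (n_subjects, n_variants) in {0,1,2}
    return stats.ttest_ind(burden[is_case], burden[~is_case]).pvalue

def min_single_variant_test(genotypes, is_case):
    """Locus-wide inference from single-variant statistics, Bonferroni-corrected."""
    pvals = [stats.ttest_ind(g[is_case], g[~is_case]).pvalue for g in genotypes.T]
    return min(min(p * genotypes.shape[1], 1.0) for p in pvals)
```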

Journal ArticleDOI
TL;DR: Bayesian learning is introduced in a variety of environments ranging from simple two-period to continuous-time models with stochastic production, and new mixed-strategy equilibria involving multiple pooling are possible.

Journal ArticleDOI
TL;DR: In this paper, the stochastic pooling problem is decomposed into a sequence of relaxed master problems and primal bounding problems; the solutions of the relaxed master problems yield a sequence of nondecreasing lower bounds on the optimal objective value.
Abstract: The stochastic pooling problem is a type of stochastic mixed-integer bilinear program arising in the integrated design and operation of various important industrial networks, such as gasoline blending, natural gas production and transportation, water treatment, etc. This paper presents a rigorous decomposition method for the stochastic pooling problem, which guarantees finding an ε-optimal solution with a finite number of iterations. By convexification of the bilinear terms, the stochastic pooling problem is relaxed into a lower bounding problem that is a potentially large-scale mixed-integer linear program (MILP). Solution of this lower bounding problem is then decomposed into a sequence of relaxed master problems, which are MILPs with much smaller sizes, and primal bounding problems, which are linear programs. The solutions of the relaxed master problems yield a sequence of nondecreasing lower bounds on the optimal objective value, and they also generate a sequence of integer realizations defining the primal problems which yield a sequence of nonincreasing upper bounds on the optimal objective value. The decomposition algorithm terminates finitely when the lower and upper bounds coincide (or are close enough), or infeasibility of the problem is indicated. Case studies involving two example problems and two industrial problems demonstrate the dramatic computational advantage of the proposed decomposition method over both a state-of-the-art branch-and-reduce global optimization method and explicit enumeration of integer realizations, particularly for large-scale problems.
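For context on the convexification step mentioned above: the McCormick envelope replaces each bilinear term w = x·y with four linear inequalities valid on the variable bounds, and partitioning (as in the lower bounding MILP) applies these on smaller sub-intervals. A minimal illustrative sketch:

```python
def mccormick_envelope(xL, xU, yL, yU):
    """Linear under- and over-estimators for w = x*y on [xL, xU] x [yL, yU]."""
    return [
        f"w >= {yL}*x + {xL}*y - {xL * yL}",      # under-estimators
        f"w >= {yU}*x + {xU}*y - {xU * yU}",
        f"w <= {yU}*x + {xL}*y - {xL * yU}",      # over-estimators
        f"w <= {yL}*x + {xU}*y - {xU * yL}",
    ]

for c in mccormick_envelope(0.0, 1.0, 0.0, 2.0):
    print(c)
```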

Journal ArticleDOI
13 Apr 2012
TL;DR: DataSHIELD, as described in this paper, is a tool to coordinate analyses of data that cannot be pooled; using only summary statistics from each study, it yields results identical to an individual-level meta-analysis, and it is also an efficient approach to carrying out a study-level meta-analysis when this is appropriate and the analysis can be pre-planned.
Abstract: Very large sample sizes are required for estimating effects which are known to be small, and for addressing intricate or complex statistical questions. This is often only achievable by pooling data from multiple studies, especially in genetic epidemiology where associations between individual genetic variants and phenotypes of interest are generally weak. However, the physical pooling of experimental data across a consortium is frequently prohibited by the ethico-legal constraints that govern agreements and consents for individual studies. Study level meta-analyses are frequently used so that data from multiple studies need not be pooled to conduct an analysis, though the resulting analysis is necessarily restricted by the available summary statistics. The idea of maintaining data security is also of importance in other areas and approaches to carrying out ‘secure analyses’ that do not require sharing of data from different sources have been proposed in the technometrics literature. Crucially, the algorithms for fitting certain statistical models can be manipulated so that an individual level meta-analysis can essentially be performed without the need for pooling individual-level data by combining particular summary statistics obtained individually from each study. DataSHIELD (Data Aggregation Through Anonymous Summary-statistics from Harmonised Individual levEL Databases) is a tool to coordinate analyses of data that cannot be pooled. In this paper, we focus on explaining why a DataSHIELD approach yields identical results to an individual level meta-analysis in the case of a generalised linear model, by simply using summary statistics from each study. It is also an efficient approach to carrying out a study level meta-analysis when this is appropriate and when the analysis can be pre-planned. We briefly comment on the IT requirements, together with the ethical and legal challenges which must be addressed.
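The key point, that a GLM can be fitted exactly from per-study summary statistics, can be sketched in a few lines: each IRLS iteration only needs the matrix X'WX and the vector X'Wz from every study. The code below illustrates this for a logistic model; the names and interfaces are illustrative and are not DataSHIELD's actual API.

```python
import numpy as np

def study_summaries(X, y, beta):
    """One study's contribution to an IRLS step of a logistic regression."""
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))
    w = mu * (1.0 - mu)
    z = eta + (y - mu) / w                        # working response
    return X.T @ (w[:, None] * X), X.T @ (w * z)  # X'WX and X'Wz; no individual rows leave the study

def pooled_irls_step(studies, beta):
    parts = [study_summaries(X, y, beta) for X, y in studies]
    A = sum(p[0] for p in parts)                  # pooled information matrix
    b = sum(p[1] for p in parts)
    return np.linalg.solve(A, b)                  # same update as an individual-level fit
```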

Journal ArticleDOI
TL;DR: In this article, the authors introduce new nonparametric predictors for homogeneous pooled data in the context of group testing for rare abnormalities and show that they achieve optimal rates of convergence.
Abstract: We introduce new nonparametric predictors for homogeneous pooled data in the context of group testing for rare abnormalities and show that they achieve optimal rates of convergence. In particular, when the level of pooling is moderate, then despite the cost savings, the method enjoys the same convergence rate as in the case of no pooling. In the setting of "over-pooling" the convergence rate differs from that of an optimal estimator by no more than a logarithmic factor. Our approach improves on the random-pooling nonparametric predictor, which is currently the only nonparametric method available, unless there is no pooling, in which case the two approaches are identical.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: This paper introduces a Weakly supervised Sparse Coding (WSC) to exploit the Classemes-based attribute labeling to refine the descriptor coding procedure and proposes an adaptive feature pooling scheme over “superpixels” rather than over fixed spatial pyramids, named Geometric Consistency Pooling (GCP).
Abstract: Most recently the Bag-of-Features (BoF) representation has been well advocated for image search and classification, with two decent phases named sparse coding and max pooling to compensate quantization loss as well as inject spatial layouts. But still, much information has been discarded by quantizing local descriptors with two-dimensional layouts into a one-dimensional BoF histogram. In this paper, we revisit this popular “sparse coding + max pooling” paradigm by “looking around” the local descriptor context towards an optimal BoF. First, we introduce a Weakly supervised Sparse Coding (WSC) to exploit the Classemes-based attribute labeling to refine the descriptor coding procedure. It is achieved by learning an attribute-to-word co-occurrence prior to impose a label inconsistency distortion over the l1-based coding regularizer, such that the descriptor codes can maximally preserve the image semantic similarity. Second, we propose an adaptive feature pooling scheme over “superpixels” rather than over fixed spatial pyramids, named Geometric Consistency Pooling (GCP). As an effect, local descriptors enjoying good geometric consistency are pooled together to ensure a more precise spatial layout embedding in BoF. Both of our phases are unsupervised, which differs from the existing works in supervised dictionary learning, sparse coding and feature pooling. Therefore, our approach enables potential applications like scalable visual search. We evaluate in both image classification and search benchmarks and report good improvements over the state-of-the-arts.

Journal ArticleDOI
TL;DR: A novel pooling mechanism that accounts for competition among contestants is developed: a stacking paradigm integrating conditional logit regression and log-likelihood-ratio-based forecast selection. The proposed stacking ensemble provides statistically and economically accurate forecasts.
