Topic
Test data
About: Test data is a research topic. Over the lifetime, 22460 publications have been published within this topic receiving 260060 citations.
Papers published on a yearly basis
Papers
More filters
••
TL;DR: A classifier-induced divergence measure that can be estimated from finite, unlabeled samples from the domains and shows how to choose the optimal combination of source and target error as a function of the divergence, the sample sizes of both domains, and the complexity of the hypothesis class.
Abstract: Discriminative learning methods for classification perform well when training and test data are drawn from the same distribution. Often, however, we have plentiful labeled training data from a source domain but wish to learn a classifier which performs well on a target domain with a different distribution and little or no labeled training data. In this work we investigate two questions. First, under what conditions can a classifier trained from source data be expected to perform well on target data? Second, given a small amount of labeled target data, how should we combine it during training with the large amount of labeled source data to achieve the lowest target error at test time?
We address the first question by bounding a classifier's target error in terms of its source error and the divergence between the two domains. We give a classifier-induced divergence measure that can be estimated from finite, unlabeled samples from the domains. Under the assumption that there exists some hypothesis that performs well in both domains, we show that this quantity together with the empirical source error characterize the target error of a source-trained classifier.
We answer the second question by bounding the target error of a model which minimizes a convex combination of the empirical source and target errors. Previous theoretical work has considered minimizing just the source error, just the target error, or weighting instances from the two domains equally. We show how to choose the optimal combination of source and target error as a function of the divergence, the sample sizes of both domains, and the complexity of the hypothesis class. The resulting bound generalizes the previously studied cases and is always at least as tight as a bound which considers minimizing only the target error or an equal weighting of source and target errors.
2,921 citations
•
01 Dec 1984
TL;DR: A survey of the technology of modal testing, a new method for describing the vibration properties of a structure by constructing mathematical models based on test data rather than using conventional theoretical analysis.
Abstract: A survey of the technology of modal testing, a new method for describing the vibration properties of a structure by constructing mathematical models based on test data rather than using conventional theoretical analysis. Shows how to build a detailed mathematical model of a test structure and analyze and modify the structure to improve its dynamics. Covers techniques for measuring the mode, shapes, and frequencies of practical structures from turbine blades to suspension bridges.
2,525 citations
••
TL;DR: A new approach is introduced in conjunction with the singular value decomposition technique to derive the basic formulation of minimum order realization which is an extended version of the Ho-Kalman algorithm.
Abstract: A method, called the Eigensystem Realization Algorithm (ERA), is developed for modal parameter identification and model reduction of dynamic systems from test data. A new approach is introduced in conjunction with the singular value decomposition technique to derive the basic formulation of minimum order realization which is an extended version of the Ho-Kalman algorithm. The basic formulation is then transformed into modal space for modal parameter identification. Two accuracy indicators are developed to quantitatively identify the system modes and noise modes. For illustration of the algorithm, examples are shown using simulation data and experimental data for a rectangular grid structure.
2,366 citations
••
TL;DR: ShengBTE is a software package for computing the lattice thermal conductivity of crystalline bulk materials and nanowires with diffusive boundary conditions based on a full iterative solution to the Boltzmann transport equation.
1,834 citations
••
22 Jul 2006TL;DR: This work introduces structural correspondence learning to automatically induce correspondences among features from different domains in order to adapt existing models from a resource-rich source domain to aresource-poor target domain.
Abstract: Discriminative learning methods are widely used in natural language processing. These methods work best when their training and test data are drawn from the same distribution. For many NLP tasks, however, we are confronted with new domains in which labeled data is scarce or non-existent. In such cases, we seek to adapt existing models from a resource-rich source domain to a resource-poor target domain. We introduce structural correspondence learning to automatically induce correspondences among features from different domains. We test our technique on part of speech tagging and show performance gains for varying amounts of source and target training data, as well as improvements in target domain parsing accuracy using our improved tagger.
1,672 citations