Showing papers on "Test data" published in 2006


Proceedings ArticleDOI
22 Jul 2006
TL;DR: This work introduces structural correspondence learning to automatically induce correspondences among features from different domains in order to adapt existing models from a resource-rich source domain to a resource-poor target domain.
Abstract: Discriminative learning methods are widely used in natural language processing. These methods work best when their training and test data are drawn from the same distribution. For many NLP tasks, however, we are confronted with new domains in which labeled data is scarce or non-existent. In such cases, we seek to adapt existing models from a resource-rich source domain to a resource-poor target domain. We introduce structural correspondence learning to automatically induce correspondences among features from different domains. We test our technique on part of speech tagging and show performance gains for varying amounts of source and target training data, as well as improvements in target domain parsing accuracy using our improved tagger.
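
The structural correspondence learning idea above can be sketched compactly: pick pivot features that behave similarly across domains, learn a linear predictor for each pivot from the remaining features on unlabeled data from both domains, and take an SVD of the stacked predictor weights as a shared representation. The sketch below is a simplified, hedged rendering of that pipeline; the random binary features, squared-loss pivot predictors, and dimensions are illustrative assumptions, not the authors' setup.

```python
# Condensed sketch of a structural-correspondence-style feature augmentation.
import numpy as np

rng = np.random.default_rng(0)
n_unlabeled, n_features, n_pivots, k = 500, 100, 10, 5

# Unlabeled feature vectors pooled from source and target domains (binary, sparse-ish).
X = (rng.random((n_unlabeled, n_features)) < 0.1).astype(float)
pivot_idx = np.arange(n_pivots)                 # assume the first 10 features are pivots
rest_idx = np.arange(n_pivots, n_features)

# One linear predictor per pivot (ridge-style least squares for simplicity).
A = X[:, rest_idx]
reg = 1.0 * np.eye(len(rest_idx))
W = np.zeros((len(rest_idx), n_pivots))
for j, p in enumerate(pivot_idx):
    W[:, j] = np.linalg.solve(A.T @ A + reg, A.T @ X[:, p])

# Shared structure: top-k left singular vectors of the stacked predictor weights.
U, _, _ = np.linalg.svd(W, full_matrices=False)
theta = U[:, :k]                                # projection to the correspondence space

def augment(x):
    """Original features plus the induced cross-domain representation."""
    return np.concatenate([x, x[rest_idx] @ theta])

print(augment(X[0]).shape)                      # (n_features + k,)
```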

1,672 citations


Proceedings Article
04 Dec 2006
TL;DR: A nonparametric method is presented that directly produces resampling weights without distribution estimation; it works by matching distributions between training and testing sets in feature space.
Abstract: We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces resampling weights without distribution estimation. Our method works by matching distributions between training and testing sets in feature space. Experimental results demonstrate that our method works well in practice.
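
The entry above describes computing resampling weights by matching training and test distributions in feature space. The following sketch illustrates one way such weights can be obtained with a kernel-mean-matching style objective; the RBF kernel, its bandwidth, the SLSQP solver, and the toy data are assumptions rather than the paper's exact formulation.

```python
# Re-weight training points so their kernel mean matches the test set's kernel mean.
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between rows of A and rows of B."""
    return np.exp(-gamma * cdist(A, B, "sqeuclidean"))

def matching_weights(X_train, X_test, gamma=1.0, B=10.0):
    """Resampling weights beta aligning the weighted training mean with the test mean."""
    n, m = len(X_train), len(X_test)
    K = rbf_kernel(X_train, X_train, gamma)                           # n x n
    kappa = (n / m) * rbf_kernel(X_train, X_test, gamma).sum(axis=1)  # length n

    objective = lambda b: 0.5 * b @ K @ b - kappa @ b
    # Keep the weights summing roughly to n (within 5%).
    cons = ({"type": "ineq", "fun": lambda b: 0.05 * n - abs(b.sum() - n)},)
    res = minimize(objective, x0=np.ones(n), bounds=[(0.0, B)] * n,
                   constraints=cons, method="SLSQP")
    return res.x

# Toy usage: training data is biased toward small values, test data is not.
rng = np.random.default_rng(0)
X_tr = rng.normal(-1.0, 1.0, size=(100, 2))
X_te = rng.normal(0.0, 1.0, size=(80, 2))
w = matching_weights(X_tr, X_te)
print(w.min(), w.max())   # points closer to the test distribution receive larger weight
```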

1,235 citations


Journal ArticleDOI
TL;DR: This work introduces a statistical formulation of the domain adaptation problem in terms of a simple mixture model, presents an instantiation of this framework for maximum entropy classifiers and their linear-chain counterparts, and shows improved performance on three real-world tasks across four data sets from the natural language processing domain.
Abstract: The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the "in-domain" test data is drawn from a distribution that is related, but not identical, to the "out-of-domain" distribution of the training data. We consider the common case in which labeled out-of-domain data is plentiful, but labeled in-domain data is scarce. We introduce a statistical formulation of this problem in terms of a simple mixture model and present an instantiation of this framework to maximum entropy classifiers and their linear chain counterparts. We present efficient inference algorithms for this special case based on the technique of conditional expectation maximization. Our experimental results show that our approach leads to improved performance on three real world tasks on four different data sets from the natural language processing domain.

894 citations


Patent
Paul Judge
12 Jul 2006
TL;DR: In this article, the authors present systems and methods for detecting unsolicited and threatening communications and communicating threat information related thereto: received threat information is reduced to a canonical form, features are extracted from it, and these features are combined with configuration data such as goals to produce rules.
Abstract: The present invention is directed to systems and methods for detecting unsolicited and threatening communications and communicating threat information related thereto. Threat information is received from one or more sources; such sources can include external security databases and threat information data from one or more application and/or network layer security systems. The received threat information is reduced into a canonical form. Features are extracted from the reduced threat information; these features in conjunction with configuration data such as goals are used to produce rules. In some embodiments, these rules are tested against one or more sets of test data and compared against the same or different goals; if one or more tests fail, the rules are refined until the tests succeed within an acceptable margin of error. The rules are then propagated to one or more application layer security systems.

486 citations


Book ChapterDOI
18 Sep 2006
TL;DR: The probabilistic interpretation of linear PCA is exploited together with recent results on latent variable models in Gaussian Processes in order to introduce an objective function for KPCA, and this new approach can be extended to reconstruct corrupted test data using fixed kernel feature extractors.
Abstract: Kernel Principal Component Analysis (KPCA) is a widely used technique for visualisation and feature extraction. Despite its success and flexibility, the lack of a probabilistic interpretation means that some problems, such as handling missing or corrupted data, are very hard to deal with. In this paper we exploit the probabilistic interpretation of linear PCA together with recent results on latent variable models in Gaussian Processes in order to introduce an objective function for KPCA. This in turn allows a principled approach to the missing data problem. Furthermore, this new approach can be extended to reconstruct corrupted test data using fixed kernel feature extractors. The experimental results show strong improvements over widely used heuristics.
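
For context, the widely used non-probabilistic reconstruction heuristic that this kind of work improves upon can be sketched with an off-the-shelf kernel PCA: fit a fixed kernel feature extractor on clean training data, project corrupted test points, and map them back with an approximate pre-image. The kernel, its parameters, and the corruption model below are illustrative assumptions.

```python
# Baseline KPCA "denoising" of corrupted test data via the usual pre-image heuristic.
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))
X_test = rng.normal(size=(20, 10))
# Corrupt roughly 10% of the test entries with large noise.
X_corrupted = X_test + rng.normal(scale=2.0, size=X_test.shape) * (rng.random(X_test.shape) < 0.1)

kpca = KernelPCA(n_components=5, kernel="rbf", gamma=0.1,
                 fit_inverse_transform=True)    # enables approximate pre-images
kpca.fit(X_train)                               # fixed kernel feature extractor
Z = kpca.transform(X_corrupted)                 # project the corrupted test data
X_denoised = kpca.inverse_transform(Z)          # heuristic reconstruction
print(np.mean((X_denoised - X_test) ** 2))      # reconstruction error vs. clean test data
```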

236 citations


Journal ArticleDOI
TL;DR: The range-dependent neural network (RDNN) was found to be superior to conventional ANN applications, in which only a single network is trained on the entire training data set, and both low and high observed sediment values were closely approximated by the RDNN.

197 citations


Journal ArticleDOI
Ciprian Chelba, Alex Acero
TL;DR: A novel technique for maximum “a posteriori” (MAP) adaptation of maximum entropy (MaxEnt) and maximum entropy Markov models (MEMM) is presented and automatic capitalization error rate of 1.4% is achieved on BN data.
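
A rough sketch of MAP-style adaptation in the spirit of this entry: fit in-domain (binary) logistic/MaxEnt weights under a Gaussian prior centred on the out-of-domain weights, so that scarce target data only moves parameters where it has evidence. The binary setting, prior variance, simulated data, and optimiser are assumptions, not the authors' capitalization models.

```python
# MAP adaptation of a binary logistic (MaxEnt-style) model toward out-of-domain weights.
import numpy as np
from scipy.optimize import minimize

def nll(w, X, y, w_prior, tau):
    """Negative log-likelihood plus a Gaussian prior centred at w_prior."""
    z = X @ w
    loss = np.logaddexp(0.0, -y * z).sum()          # log(1 + exp(-y*z)), stable
    penalty = 0.5 * tau * np.sum((w - w_prior) ** 2)
    return loss + penalty

def map_adapt(X_in, y_in, w_out, tau=10.0):
    res = minimize(nll, x0=w_out.copy(), args=(X_in, y_in, w_out, tau))
    return res.x

rng = np.random.default_rng(1)
w_source = rng.normal(size=5)                       # weights from plentiful source data (simulated)
X_target = rng.normal(size=(30, 5))                 # scarce in-domain data, labels in {-1, +1}
y_target = np.sign(X_target @ (w_source + 0.5) + 0.1 * rng.normal(size=30))
w_adapted = map_adapt(X_target, y_target, w_source)
print(np.round(w_adapted - w_source, 2))            # small shifts away from the prior mean
```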

162 citations


Journal ArticleDOI
TL;DR: Partial least squares discriminant analysis models were built for several LC-MS lipidomic training data sets with varying numbers of lean and obese subjects, and their modelling performance and predictability were compared using 10-fold cross-validation, a permutation test, and test data sets.
Abstract: Statistical model validation tools such as cross-validation, jack-knifing model parameters and permutation tests are meant to obtain an objective assessment of the performance and stability of a statistical model. However, little is known about the performance of these tools for megavariate data sets, having, for instance, a number of variables larger than 10 times the number of subjects. The performance is assessed for megavariate metabolomics data, but the conclusions also carry over to proteomics, transcriptomics and many other research areas. Partial least squares discriminant analysis models were built for several LC-MS lipidomic training data sets of various numbers of lean and obese subjects. The training data sets were compared on their modelling performance and their predictability using a 10-fold cross-validation, a permutation test, and test data sets. A wide range of cross-validation error rates was found (from 7.5% to 16.3% for the largest training set and from 0% to 60% for the smallest training set) and the error rate increased when the number of subjects decreased. The test error rates varied from 5% to 50%. The smaller the number of subjects compared to the number of variables, the less the outcome of validation tools such as cross-validation, jack-knifing model parameters and permutation tests can be trusted. The result depends crucially on the specific sample of subjects that is used for modelling. The validation tools cannot be used as a warning mechanism for problems due to sample size or to representativity of the sampling.
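
The validation protocol discussed above can be illustrated in a few lines: 10-fold cross-validation of a PLS-DA-style model on a "megavariate" data set (variables greatly outnumbering subjects), plus a label-permutation test. Using PLSRegression with a 0/1 response as a stand-in for PLS-DA, and the synthetic data sizes, are assumptions.

```python
# 10-fold cross-validation plus a permutation test for a PLS-DA-style classifier.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import KFold

def cv_error(X, y, n_components=2, n_splits=10, seed=0):
    errs = []
    for tr, te in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        pls = PLSRegression(n_components=n_components).fit(X[tr], y[tr])
        pred = (pls.predict(X[te]).ravel() > 0.5).astype(int)
        errs.append(np.mean(pred != y[te]))
    return np.mean(errs)

rng = np.random.default_rng(0)
n_subjects, n_vars = 40, 400                 # "megavariate": variables >> subjects
y = np.repeat([0, 1], n_subjects // 2)
X = rng.normal(size=(n_subjects, n_vars)) + 0.3 * y[:, None]

err = cv_error(X, y)
perm_errs = [cv_error(X, rng.permutation(y)) for _ in range(20)]  # permutation test
print(f"CV error: {err:.2f}, permuted-label errors around {np.mean(perm_errs):.2f}")
```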

150 citations


Journal ArticleDOI
TL;DR: In this paper, support vector machines (SVM) were used to assess the liquefaction potential from actual standard penetration test (SPT) and cone penetration test (CPT) field data.
Abstract: This paper investigates the potential of a support vector machine (SVM)-based classification approach to assess the liquefaction potential from actual standard penetration test (SPT) and cone penetration test (CPT) field data. SVMs are based on statistical learning theory and have been found to work well in comparison to neural networks in several other applications. Both CPT and SPT field data sets are used with SVMs for predicting the occurrence and non-occurrence of liquefaction based on different input parameter combinations. With the SPT and CPT test data sets, the highest accuracies of 96% and 97%, respectively, were achieved with SVMs. This suggests that SVMs can effectively be used to model the complex relationship between different soil parameters and the liquefaction potential. Several other combinations of input variables were used to assess the influence of different input parameters on liquefaction potential. The proposed approach suggests that neither the normalized cone resistance value is required with CPT data nor the calculation of the standardized SPT value with SPT data. Further, SVMs require few user-defined parameters and provide better performance in comparison to the neural network approach. Copyright © 2006 John Wiley & Sons, Ltd.
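
A minimal sketch of the classification setup described above: an SVM trained on a few soil and loading parameters to predict liquefaction occurrence. The feature names, synthetic data, and RBF kernel settings are assumptions, not the paper's field data or tuned parameters.

```python
# SVM classification of liquefaction occurrence from a few (synthetic) soil parameters.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical inputs: e.g. SPT blow count, cyclic stress ratio, depth (m).
X = rng.uniform([2, 0.05, 1], [40, 0.5, 20], size=(300, 3))
y = (X[:, 1] * 60 > X[:, 0] + 0.3 * X[:, 2]).astype(int)   # toy liquefaction rule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
model.fit(X_tr, y_tr)
print(f"test accuracy: {model.score(X_te, y_te):.2f}")
```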

145 citations


Journal ArticleDOI
TL;DR: This paper describes the technical and system building advances made in IBM's speech recognition technology over the course of the Defense Advanced Research Projects Agency (DARPA) Effective Affordable Reusable Speech-to-Text (EARS) program and presents results on English conversational telephony test data from the 2003 and 2004 NIST evaluations.
Abstract: This paper describes the technical and system building advances made in IBM's speech recognition technology over the course of the Defense Advanced Research Projects Agency (DARPA) Effective Affordable Reusable Speech-to-Text (EARS) program. At a technical level, these advances include the development of a new form of feature-based minimum phone error training (fMPE), the use of large-scale discriminatively trained full-covariance Gaussian models, the use of septaphone acoustic context in static decoding graphs, and improvements in basic decoding algorithms. At a system building level, the advances include a system architecture based on cross-adaptation and the incorporation of 2100 h of training data in every system component. We present results on English conversational telephony test data from the 2003 and 2004 NIST evaluations. The combination of technical advances and an order of magnitude more training data in 2004 reduced the error rate on the 2003 test set by approximately 21% relative (from 20.4% to 16.1%) over the most accurate system in the 2003 evaluation, and produced the most accurate results on the 2004 test sets in every speed category.

143 citations


Proceedings ArticleDOI
Erwan Brottier, Franck Fleurey, Jim Steel, Benoit Baudry, Y. Le Traon
07 Nov 2006
TL;DR: This paper presents an algorithm to automatically build test models from a metamodel, and focuses on generating input test data (called test models) for model transformations.
Abstract: In a Model-Driven Development context (MDE), model transformations allow design know-how to be memorized and reused, and thus automate parts of the design and refinement steps of a software development process. A model transformation program is a specific kind of program, in the sense that it manipulates models as its main parameters. Each model must be an instance of a "metamodel", a metamodel being the specification of a set of models. Programming a model transformation is a difficult and error-prone task, since the manipulated data are complex. In this paper, we focus on generating input test data (called test models) for model transformations. We present an algorithm to automatically build test models from a metamodel.

Journal ArticleDOI
TL;DR: Techniques such as sensitivity analyses, input variable relevances, neural interpretation diagrams, randomization tests, and partial derivatives should be used to make MLP models more transparent and thereby further ecological understanding, an important goal of the modelling process.

Proceedings ArticleDOI
01 Sep 2006
TL;DR: This paper presents a generic, DBMS-independent, and highly extensible relational data generation tool that can efficiently generate realistic test data for OLTP, OLAP, and data streaming applications.

Abstract: This paper presents a generic, DBMS-independent, and highly extensible relational data generation tool. The tool can efficiently generate realistic test data for OLTP, OLAP, and data streaming applications. The tool uses a graph model to direct the data generation. This model makes it very simple to generate data even for large database schemas with complex inter- and intra-table relationships. The model also makes it possible to generate data with very accurate characteristics.
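
The graph-directed generation idea can be sketched as follows: model tables as nodes with foreign-key edges, generate parent tables before children in topological order, and draw each foreign key from already-generated parent rows. The schema, row counts, and value generators below are made-up examples, not the tool's model.

```python
# Graph-directed relational test data generation: parents before children.
import random
from graphlib import TopologicalSorter

schema = {                         # table -> tables it references (foreign keys)
    "customer": [],
    "product": [],
    "order": ["customer"],
    "order_line": ["order", "product"],
}
row_counts = {"customer": 5, "product": 3, "order": 8, "order_line": 20}

random.seed(0)
data = {}
for table in TopologicalSorter(schema).static_order():   # referenced tables come first
    rows = []
    for i in range(row_counts[table]):
        row = {"id": i}
        for parent in schema[table]:                      # inter-/intra-table relationship
            row[f"{parent}_id"] = random.choice(data[parent])["id"]
        rows.append(row)
    data[table] = rows

print(data["order_line"][:3])
```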

01 Jan 2006
TL;DR: The computer program DIFF as discussed by the authors has been used to simulate vertical vehicle-track interaction at high frequencies, from about 20 Hz to at least 2000 Hz, with good agreement between calculated and measured vertical contact forces, both with respect to magnitude and frequency content.
Abstract: The computer program DIFF, which has been under development at CHARMEC since the late 1980s, is used to simulate vertical vehicle–track interaction at high frequencies, from about 20 Hz to at least 2000 Hz. Measured results from two field test campaigns are used to validate the vehicle–track interaction model. The first test case involves impact loads from a wheel flat, while the other case studies the influence of a corrugated rail on dynamic vertical wheel–rail contact forces. Four vehicle models and two visco-elastic track models are compared. The track models are calibrated against test data from laboratory and field tests. Input data on rail and wheel roughness are taken from field measurements. Good agreement between calculated and measured vertical contact forces is observed, both with respect to magnitude and frequency content, for most frequencies below 2000 Hz. The best agreement is obtained when using a vehicle model that accounts for both wheelsets in a bogie, instead of a single wheelset model.

Journal ArticleDOI
TL;DR: There was no evidence to support the view that classification accuracy inevitably declines as the data dimensionality increases, and it is suggested here that greater attention should be given to the collection of training and test data that represent the range of land surface variability at the spatial scale of the image.
Abstract: Classification accuracy depends on a number of factors, of which the nature of the training samples, the number of bands used, the number of classes to be identified relative to the spatial resolution of the image and the properties of the classifier are the most important. This paper evaluates the effects of these factors on classification accuracy using a test area in La Mancha, Spain. High spectral and spatial resolution DAIS data were used to compare the performance of four classification procedures (maximum likelihood, neural network, support vector machines and decision tree). There was no evidence to support the view that classification accuracy inevitably declines as the data dimensionality increases. The support vector machine classifier performed well with all test data sets. The use of the orthogonal MNF transform resulted in a decline in classification accuracy. However, the decision‐tree approach to feature selection worked well. Small increases in classifier accuracy may be obtained using mo...

Journal ArticleDOI
TL;DR: Empirical results indicate that the data mining approach coupled with the attribute selection scheme outperforms these methods.

Patent
22 Mar 2006
TL;DR: In this article, the authors offload the generation and monitoring of test packets from a Central Processing Unit (CPU) to a dedicated network integrated circuit, such as a router, bridge or switch chip associated with the CPU.
Abstract: An embodiment of the present invention offloads the generation and monitoring of test packets from a Central Processing Unit (CPU) to a dedicated network integrated circuit, such as a router, bridge or switch chip associated with the CPU. The CPU may download test routines and test data to the network IC, which then generates the test packets, identifies and handles received test packets, collects test statistics, and performs other test functions, all without loading the CPU. The CPU may be notified when certain events occur, such as when throughput or jitter thresholds for the network are exceeded.

Journal ArticleDOI
TL;DR: The Bayesian approach presented in this paper provides an effective and flexible alternative for model estimation and updating, which can be applied to both the road test data sites and other data sources of interest.
Abstract: This paper investigates an incremental pavement performance model based on experimental data from the American Association of State Highway Officials road test. Structural properties, environmental effects, and traffic loading, the three main factors dominating the characteristics of pavement performance, are incorporated into the model. Because of the limited number of variables that can be controlled and observed, unobserved heterogeneity is almost inevitable, and most existing models do not fully account for it. In this paper, the Bayesian approach is adopted for its ability to address this issue. The Bayesian approach aims to obtain probabilistic parameter distributions by combining existing knowledge (the prior) with information from the data collected. Markov chain Monte Carlo simulation is applied to estimate the parameter distributions. Given the significant variability in the parameters, heterogeneity needs to be addressed in modeling pavement performance. Furthermore, it is shown that not all of the parameters are normally distributed. It is therefore suggested that the performance model developed in this research provides a more realistic forecast than most previous models. In addition, pavement deterioration forecasts based on the Gibbs output are performed at different confidence levels with varying inspection frequencies, which can enhance the decision-making process in pavement management. In general, the Bayesian approach presented in this paper provides an effective and flexible alternative for model estimation and updating, which can be applied both to the road test data and to other data sources of interest.
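
A bare-bones sketch of the estimation strategy described above: Markov chain Monte Carlo (here a simple random-walk Metropolis sampler) drawing posterior samples of model parameters, from which forecasts at chosen confidence levels can be read off. The linear deterioration model, priors, and proposal scales are illustrative assumptions, not the paper's pavement model.

```python
# Random-walk Metropolis sampling of a toy "performance vs. cumulative load" model.
import numpy as np

rng = np.random.default_rng(0)
loads = np.linspace(0, 10, 50)
obs = 100 - 3.0 * loads + rng.normal(0, 2.0, size=loads.size)   # synthetic observations

def log_post(theta):
    intercept, slope, log_sigma = theta
    sigma = np.exp(log_sigma)
    resid = obs - (intercept + slope * loads)
    loglik = -0.5 * np.sum(resid**2) / sigma**2 - obs.size * log_sigma
    logprior = -0.5 * (intercept - 100) ** 2 / 100 - 0.5 * slope**2 / 100
    return loglik + logprior

samples, theta = [], np.array([100.0, 0.0, 0.0])
lp = log_post(theta)
for _ in range(20000):
    prop = theta + rng.normal(scale=[0.5, 0.1, 0.05])   # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:             # Metropolis accept step
        theta, lp = prop, lp_prop
    samples.append(theta.copy())

post = np.array(samples[5000:])                         # drop burn-in
print(np.percentile(post[:, 1], [2.5, 50, 97.5]))       # credible interval for the slope
```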

Proceedings ArticleDOI
08 Jul 2006
TL;DR: The approach presented in this paper relies on a tree-based representation of method call sequences by which sequence feasibility is preserved throughout the entire search process, and uses an extended distance-based fitness function to deal with runtime exceptions.
Abstract: Evolutionary algorithms have successfully been applied to software testing. Not only approaches that search for numeric test data for procedural test objects have been investigated, but also techniques for automatically generating test programs that represent object-oriented unit test cases. Compared to numeric test data, test programs optimized for object-oriented unit testing are more complex. Method call sequences that realize interesting test scenarios must be evolved. An arbitrary method call sequence is not necessarily feasible due to call dependences which exist among the methods that potentially appear in a method call sequence. The approach presented in this paper relies on a tree-based representation of method call sequences by which sequence feasibility is preserved throughout the entire search process. In contrast to other approaches in this area, neither repair of individuals nor penalty mechanisms are required. Strongly-typed genetic programming is employed to generate method call trees. In order to deal with runtime exceptions, we use an extended distance-based fitness function. We performed experiments with four test objects. The initial results are promising: high code coverages were achieved completely automatically for all of the test objects.

Patent
16 Nov 2006
TL;DR: In this article, a service provider receives fluid test data generated from multiple different entities and permits authorized users affiliated with the different entities, as well as others, to visualize information associated with that data to via the Internet using graphical computer interfaces at respective computers.
Abstract: A service provider receives fluid test data generated from multiple different entities and permits authorized users affiliated with the different entities, as well as others, to visualize information associated with that data to via the Internet using graphical computer interfaces at respective computers. The fluid test data can be gathered using portable sensor units equipped with GPS and wireless communication to transmit the fluid test data and geographical information to the service provider.

Proceedings ArticleDOI
Yuan Yuan, Zhongjie Li, Wei Sun
29 Oct 2006
TL;DR: This paper proposes a graph-search based approach to BPEL test case generation, which effectively deals with BPEL concurrency semantics.
Abstract: Business Process Execution Language for Web Services (BPEL4WS) is a kind of concurrent programming languages with several special features that raise special challenges for verification and testing. This paper proposes a graph-search based approach to BPEL test case generation, which effectively deals with BPEL concurrency semantics. This approach defines an extension of CFG (Control Flow Graph) - BPEL Flow Graph (BFG) - to represent a BPEL program in a graphical model. Then concurrent test paths can be generated by traversing the BFG model, and test data for each path can be generated using a constraint solving method. Finally test paths and data are combined into complete test cases.
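
The path-generation part of this approach can be illustrated with a plain control-flow graph: enumerate entry-to-exit test paths by depth-first traversal, with test data for each path left to a constraint solver (not shown). The graph below is a made-up example, not a BPEL Flow Graph from the paper.

```python
# Enumerate test paths of a simple flow graph by depth-first traversal.
from typing import Dict, List

flow_graph: Dict[str, List[str]] = {
    "receive": ["check_stock"],
    "check_stock": ["in_stock", "out_of_stock"],   # branching node
    "in_stock": ["ship"],
    "out_of_stock": ["notify"],
    "ship": ["reply"],
    "notify": ["reply"],
    "reply": [],                                    # exit node
}

def test_paths(graph, node="receive", path=None):
    """Enumerate entry-to-exit paths by depth-first search."""
    path = (path or []) + [node]
    if not graph[node]:                 # exit node reached
        return [path]
    paths = []
    for nxt in graph[node]:
        paths.extend(test_paths(graph, nxt, path))
    return paths

for p in test_paths(flow_graph):
    print(" -> ".join(p))
```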

Journal ArticleDOI
Subhasish Mitra, K.S. Kim
TL;DR: Experimental results on industrial designs demonstrate that this new XPAND technique achieves exponential reduction in test data volume and test time compared to traditional scan and significantly outperforms existing test compression tools.
Abstract: Combinational circuits implemented with exclusive-or gates are used for on-chip generation of deterministic test patterns from compressed seeds. Unlike major test compression techniques, this technique doesn't require test pattern generation with don't cares. Experimental results on industrial designs demonstrate that this new XPAND technique achieves exponential reduction in test data volume and test time compared to traditional scan and significantly outperforms existing test compression tools. The XPAND technique is currently being used by several industrial designs.
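
The decompression idea behind seed-based schemes like this one amounts to a linear expansion over GF(2): an on-chip XOR network turns a short compressed seed into a wide scan-load pattern. The tiny random expander below is purely illustrative; a real design chooses the XOR wiring so that the required care bits are encodable.

```python
# Toy seed-to-pattern expansion through an XOR (mod-2 linear) network.
import numpy as np

rng = np.random.default_rng(7)
n_scan_cells, seed_bits = 16, 6
expander = rng.integers(0, 2, size=(n_scan_cells, seed_bits))   # XOR wiring matrix

def expand(seed):
    """Each scan cell is the XOR (mod-2 sum) of the seed bits wired to it."""
    return expander @ seed % 2

seed = np.array([1, 0, 1, 1, 0, 1])
pattern = expand(seed)
print(pattern)                                      # 16-bit pattern from a 6-bit seed
print(f"compression ratio: {n_scan_cells / seed_bits:.1f}x")
```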

Proceedings ArticleDOI
30 Apr 2006
TL;DR: A novel built-in self-test methodology is presented for testing the inter-switch links of network-on-chip (NoC) based chips, using a high-level fault model that accounts for crosstalk effects due to inter-wire coupling.
Abstract: In this paper, we present a novel built-in self-test methodology for testing the inter-switch links of network-on-chip (NoC) based chips. This methodology uses a high-level fault model that accounts for crosstalk effects due to inter-wire coupling. The novelty of our approach lies in the progressive reuse of the NoC infrastructure to transport test data to its own components under test in a bootstrap manner, and in extensively exploiting the inherent parallelism of the data transport mechanism to reduce the test time and implicitly the test cost.
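
As an illustration of crosstalk-oriented link test patterns of the kind such a BIST scheme might apply, the sketch below drives a transition on one victim wire while all other wires (aggressors) switch the opposite way, maximising coupling effects. This maximal-aggressor style is a common choice in the literature, not necessarily the exact high-level fault model used in the paper.

```python
# Generate maximal-aggressor-style (before, after) wire values for a link of given width.
def link_test_patterns(width):
    """Yield (victim, before, after) triples, one pair of vectors per victim and direction."""
    for victim in range(width):
        for rising in (0, 1):                       # both transition directions for the victim
            before = [(1 - rising) if w == victim else rising for w in range(width)]
            after = [rising if w == victim else (1 - rising) for w in range(width)]
            yield victim, before, after

for victim, before, after in link_test_patterns(4):
    print(f"victim {victim}: {before} -> {after}")
```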

Journal ArticleDOI
TL;DR: Experimental results suggest that the developed model has good predictive ability in analyzing the electrical performance of the three-dimensional high-density microwave packaging structure.

Abstract: In this letter, the support vector machine (SVM) regression approach is introduced to model a three-dimensional (3-D) high-density microwave packaging structure. The SVM is based on the structural risk minimization principle, which leads to good generalization ability. With a 3-D vertical interconnect used as an example, the SVM regression model is developed electromagnetically with a set of training data and testing data produced by electromagnetic simulation. Experimental results suggest that the developed model has good predictive ability in analyzing the electrical performance.
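
A brief sketch of the surrogate-modelling idea: fit an SVM regression model to simulated responses of a parameterised structure and predict the response at unseen geometries. The "interconnect" response function, parameter ranges, and kernel settings below are invented stand-ins for the electromagnetic simulation data.

```python
# SVM regression as a surrogate for a (here synthetic) electromagnetic response.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical geometry parameters: via radius, pad radius, dielectric height.
X = rng.uniform([0.1, 0.3, 0.2], [0.5, 1.0, 1.0], size=(80, 3))
s21 = -2.0 * X[:, 0] + 1.5 * X[:, 1] * X[:, 2] + 0.05 * rng.normal(size=80)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
model.fit(X[:60], s21[:60])                                  # training data from "simulation"
print(f"test R^2: {model.score(X[60:], s21[60:]):.3f}")      # held-out test data
```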

Journal ArticleDOI
TL;DR: This paper presents a novel approach to automatic software test data generation, where the test data is intended to cover program branches which depend on string predicates such as string equality, string ordering and regular expression matching.
Abstract: This paper presents a novel approach to automatic software test data generation, where the test data is intended to cover program branches which depend on string predicates such as string equality, string ordering and regular expression matching. A search-based approach is assumed and some potential search operators and corresponding evaluation functions are assembled. Their performance is assessed empirically by using them to generate test data for a number of test programs. A novel approach of using search operators based on programming language string operators and parameterized by string literals from the program under test is introduced. These operators are also assessed empirically in generating test data for the test programs and are shown to provide a significant increase in performance. Copyright © 2006 John Wiley & Sons, Ltd.
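
The search-based idea described above can be sketched for a single branch guarded by a string-equality predicate: score candidate inputs with a character-level distance and improve them by simple neighbourhood search. The target predicate, alphabet, and hill-climbing details are assumptions, not the paper's operators.

```python
# Local search for a string input that satisfies an equality predicate.
import random
import string

TARGET = "admin"                        # branch guarded by: if s == "admin": ...

def fitness(s):
    """Distance from satisfying the string-equality predicate (0 means covered)."""
    diff = abs(len(s) - len(TARGET)) * 128
    diff += sum(abs(ord(a) - ord(b)) for a, b in zip(s, TARGET))
    return diff

def neighbours(s):
    """Random edit neighbours: append a char, delete the last char, or mutate one char."""
    alphabet = string.ascii_lowercase
    out = [s + random.choice(alphabet)]
    if s:
        out.append(s[:-1])
        i = random.randrange(len(s))
        out.append(s[:i] + random.choice(alphabet) + s[i + 1:])
    return out

random.seed(0)
current = "xz"
while fitness(current) > 0:
    candidate = min(neighbours(current), key=fitness)
    if fitness(candidate) < fitness(current):   # accept strict improvements only
        current = candidate
print(current)                                  # test input that takes the target branch
```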

Journal ArticleDOI
TL;DR: A knowledge transfer framework is proposed that leverages the information extracted from existing labeled data to classify spatially separate and multitemporal test data; in the absence of any labeled data in the new area, the approach is better than a direct application of the original classifier to the new data.
Abstract: Obtaining ground truth for classification of remotely sensed data is time consuming and expensive, resulting in poorly represented signatures over large areas. In addition, the spectral signatures of a given class vary with location and/or time. Therefore, successful adaptation of a classifier designed from the available labeled data to classify new hyperspectral images acquired over other geographic locations or subsequent times is difficult, if minimal additional labeled data are available. In this paper, the binary hierarchical classifier is used to propose a knowledge transfer framework that leverages the information extracted from the existing labeled data to classify spatially separate and multitemporal test data. Experimental results show that in the absence of any labeled data in the new area, the approach is better than a direct application of the original classifier on the new data. Moreover, when small amounts of the labeled data are available from the new area, the framework offers further improvements through semisupervised learning mechanisms and compares favorably with previously proposed methods

Journal ArticleDOI
TL;DR: In this paper, the authors developed a methodology for inverting tracer test data using zonation information obtained from two-dimensional radar tomograms to improve the hydraulic conductivity fields obtained from conventional inversion of tracer tests.
Abstract: We have developed a methodology for inverting tracer test data using zonation information obtained from two-dimensional radar tomograms to improve the (typically overly smooth) hydraulic conductivity fields obtained from conventional inversion of tracer test data. The method simultaneously yields two-dimensional estimates of hydraulic conductivity as well as petrophysical relationships that relate hydraulic conductivity to radar velocity; these relationships can be assumed to be stationary throughout the area of investigation or to vary as a function of zonation. Using a synthetic three-dimensional hydraulic conductivity field, we apply the developed inversion methodology and explore the impact of the strength and stationarity of the petrophysical relationship as well as the impact of errors that are often associated with radar data acquisition (such as unknown borehole deviation). We find that adding radar tomographic data to tracer test data improves hydrogeological site characterization, even in the presence of minor radar data errors. The results are contingent on the assumption that a relationship between radar velocity and hydraulic conductivity exists. Therefore the applicability of the proposed method may be limited to field sites where this condition is partially or fully satisfied.

Proceedings ArticleDOI
09 Jan 2006
TL;DR: An Integrated Propulsion Test System (IPTS) has been designed, developed and validated at Wichita State University, and a reliable database of performance data has been created.
Abstract: The recent boom in Unmanned Aerial Vehicle (UAV) and Micro Air Vehicle (MAV) aircraft development creates a strong demand for accurate small-diameter propeller performance data. Small-diameter propellers as defined in this paper (6 to 22 inches in diameter) operate at low Reynolds numbers (typically between 30,000 and 300,000), rendering performance scaling from larger counterparts inaccurate. An Integrated Propulsion Test System (IPTS) has been designed, developed and validated at Wichita State University (WSU). A large number of propellers have been tested and a reliable database of performance data has been created. This paper discusses the salient features of this measurement system and presents propeller test data for a few propellers.

Patent
Laura Ioana Apostoloiu, Xin Chen, Raymond Scott Harvey, Young Wook Lee, Kyle D. Robeson
28 Feb 2006
TL;DR: In this article, a reusable software testing framework is presented for an automated application test data processing system, which can include a test task generator and a scenario generator coupled to one another and to the framework.
Abstract: Embodiments of the present invention address deficiencies of the art in respect to software test automation and provide a method, system and apparatus for a reusable software testing framework. In one embodiment of the invention, an automated application test data processing system can include a reusable test automation framework. The system further can include a test task generator and a scenario generator coupled to one another and to the framework. In this regard, the test task generator can be configured to generate uniform logic for performing testing tasks, while the scenario generator can be configured to arrange testing tasks for a complete test scenario. Finally, a collaborative testing environment can be provided through which multiple users can interact with the scenario generator and test task generator to produce test cases of different test scenarios.

Journal ArticleDOI
TL;DR: A new approach utilizing program dependence analysis techniques and genetic algorithms (GAs) to generate test data is presented, and its effectiveness and efficiency are shown with respect to established criteria.
Abstract: The complexity of software systems has been increasing dramatically in the past decade, and software testing as a labor-intensive component is becoming more and more expensive. Testing costs often account for up to 50% of the total expense of software development; hence any techniques leading to the automatic generation of test data will have great potential to considerably reduce costs. Existing approaches of automatic test data generation have achieved some success by using evolutionary computation algorithms, but they are unable to deal with Boolean variables or enumerated types and they need to be improved in many other aspects. This paper presents a new approach utilizing program dependence analysis techniques and genetic algorithms (GAs) to generate test data. A set of experiments using the new approach is reported to show its effectiveness and efficiency based upon established criteria.
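
A miniature sketch of GA-driven test data generation in the spirit of the approach above: evolve integer inputs toward covering a specific branch using a branch-distance style fitness. The program under test, the encoding, and the GA settings are toy assumptions, and the program dependence analysis component is not modelled here.

```python
# Tiny genetic algorithm searching for inputs that cover a target branch.
import random

def branch_distance(a, b):
    """How far (a, b) is from taking the branch `if a == 2 * b and a > 100:`."""
    return abs(a - 2 * b) + max(0, 101 - a)

def evolve(pop_size=40, generations=300, seed=0):
    random.seed(seed)
    pop = [(random.randint(-500, 500), random.randint(-500, 500))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: branch_distance(*ind))   # fitness-based ranking
        if branch_distance(*pop[0]) == 0:
            return pop[0]                                  # branch covered
        parents = pop[: pop_size // 2]                     # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            (a1, _), (_, b2) = random.sample(parents, 2)
            child = (a1, b2)                               # simple crossover
            if random.random() < 0.3:                      # mutation
                child = (child[0] + random.randint(-10, 10),
                         child[1] + random.randint(-10, 10))
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda ind: branch_distance(*ind))

print(evolve())   # best (a, b) found; ideally one with a == 2*b and a > 100
```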