
Showing papers in "Technical reports in 2008"


Posted Content
TL;DR: In this paper, the problem of constructing optimal designs for model discrimination between competing regression models is considered, and various new properties of optimal designs with respect to the popular T-optimality criterion are derived, which allow an explicit determination of T-optimal designs.
Abstract: We consider the problem of constructing optimal designs for model discrimination between competing regression models. Various new properties of optimal designs with respect to the popular T-optimality criterion are derived, which in many circumstances allow an explicit determination of T-optimal designs. It is also demonstrated that in nested linear models the number of support points of T-optimal designs is usually too small to estimate all parameters in the extended model. In many cases T-optimal designs are not unique, and we give a characterization of all T-optimal designs. Finally, T-optimal designs are compared with optimal discriminating designs with respect to alternative criteria by means of a small simulation study.

59 citations
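
For readers unfamiliar with the T-optimality criterion mentioned in the entry above, the following is a minimal sketch rather than the paper's method. It assumes the standard definition of the criterion as the smallest weighted sum of squared deviations between a fixed "true" mean function and the best-fitting rival model, and it assumes a rival model that is linear in its parameters, so the inner minimization reduces to weighted least squares. The function name `t_criterion`, the quadratic "true" model, and the straight-line rival are illustrative choices, not taken from the paper.

```python
import numpy as np

def t_criterion(points, weights, true_mean, rival_basis):
    """T-criterion of a design for a rival model that is linear in its parameters.

    points, weights: support points and weights of the design (weights sum to 1).
    true_mean: callable giving the assumed true mean at each point.
    rival_basis: list of callables, the regression functions of the rival model.
    """
    x = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    target = true_mean(x)
    design_matrix = np.column_stack([f(x) for f in rival_basis])
    # Weighted least squares: scale rows by sqrt(w) and solve ordinary least squares.
    sqrt_w = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design_matrix * sqrt_w[:, None], target * sqrt_w, rcond=None)
    residual = target - design_matrix @ coef
    return float(np.sum(w * residual**2))

# Illustrative setting (an assumption): quadratic truth versus a straight-line rival on [-1, 1].
true_mean = lambda x: 1.0 + x + x**2
rival_basis = [lambda x: np.ones_like(x), lambda x: x]

t_center_heavy = t_criterion([-1.0, 0.0, 1.0], [0.25, 0.5, 0.25], true_mean, rival_basis)
t_uniform = t_criterion([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3], true_mean, rival_basis)
print(t_center_heavy, t_uniform)
```

A larger criterion value means the design separates the two models better. In this toy setting the design that puts half its weight on the center point scores higher than the uniform design on the same three points.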


Posted Content
TL;DR: This paper discusses a general technique for a large class of convex functionals to compute the minimizers iteratively, which is closely related to majorization-minimization algorithms and includes the iteratively reweighted least squares algorithm as a special case.
Abstract: The computation of robust regression estimates often relies on minimization of a convex functional on a convex set. In this paper we discuss a general technique for a large class of convex functionals to compute the minimizers iteratively, which is closely related to majorization-minimization algorithms. Our approach is based on a quadratic approximation of the functional to be minimized and includes the iteratively reweighted least squares algorithm as a special case. We prove convergence on convex function spaces for general coercive and convex functionals F and derive geometric convergence in certain unconstrained settings. The algorithm is applied to TV penalized quantile regression and is compared with a step size corrected Newton-Raphson algorithm. It is found that typically in the first steps the iteratively reweighted least squares algorithm performs significantly better, whereas the Newton type method outpaces the former only after many iterations. Finally, in the setting of bivariate regression with unimodality constraints we illustrate how this algorithm allows one to utilize highly efficient algorithms for special quadratic programs in more complex settings.

53 citations
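
The entry above names iteratively reweighted least squares (IRLS) as a special case of its technique. As a hedged illustration only, the sketch below shows textbook IRLS for least absolute deviations regression, not the generalized algorithm of the paper. The function name `irls_lad`, the residual floor `eps`, and the toy data are assumptions made for this example.

```python
import numpy as np

def irls_lad(X, y, n_iter=100, eps=1e-8):
    """Least absolute deviations fit by iteratively reweighted least squares.

    Each step majorizes |r| by a quadratic with weight 1 / max(|r|, eps),
    then solves the resulting weighted least squares problem.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # start from ordinary least squares
    for _ in range(n_iter):
        residual = y - X @ beta
        w = 1.0 / np.maximum(np.abs(residual), eps)  # floor avoids division by zero
        sqrt_w = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return beta

# Toy data (an assumption): an exact line y = 1 + 2x with one gross outlier.
x = np.arange(10.0)
y = 1.0 + 2.0 * x
y[-1] = 100.0
X = np.column_stack([np.ones_like(x), x])

beta_lad = irls_lad(X, y, n_iter=200)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print("LAD via IRLS:", beta_lad)
print("OLS:", beta_ols)
```

On this toy data the reweighted fit stays close to the line through the nine clean points, while the ordinary least squares slope is pulled strongly toward the outlier.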


Posted Content
TL;DR: In this article, a general bootstrap procedure is proposed to approximate the null distribution of nonparametric frequency domain tests about the spectral density matrix of a multivariate time series under a set of easy to verify conditions.
Abstract: We propose a general bootstrap procedure to approximate the null distribution of nonparametric frequency domain tests about the spectral density matrix of a multivariate time series. Under a set of easy to verify conditions, we establish asymptotic validity of the proposed bootstrap procedure. We apply a version of this procedure together with a new statistic in order to test the hypothesis that the spectral densities of not necessarily independent time series are equal. The test statistic proposed is based on an L2-distance between the nonparametrically estimated individual spectral densities and an overall, 'pooled' spectral density, the latter being obtained using the whole set of m time series considered. The effects of the dependence between the time series on the power behavior of the test are investigated. Some simulations are presented and a real-life data example is discussed.

50 citations


Posted Content
TL;DR: In this paper, a method based on weighted k nearest neighbors is proposed for imputing missing genotypes for single nucleotide polymorphisms (SNPs) in association studies, which can also be applied to data from whole-genome studies.
Abstract: Motivation: Missing values are a common problem in genetic association studies concerned with single nucleotide polymorphisms (SNPs). Since most statistical methods cannot handle missing values, they have to be removed prior to the actual analysis. Considering only complete observations, however, often leads to an immense loss of information. Therefore, procedures are needed that can be used to replace such missing values. In this article, we propose a method based on weighted k nearest neighbors that can be employed for imputing such missing genotypes. Results: In a comparison to other imputation approaches, our procedure called KNNcatImpute shows the lowest rates of falsely imputed genotypes when applied to the SNP data from the GENICA study, a study dedicated to the identification of genetic and gene-environment interactions associated with sporadic breast cancer. Moreover, in contrast to other imputation methods that take all variables into account when replacing missing values of a particular variable, KNNcatImpute is not restricted to association studies comprising several ten to a few hundred SNPs, but can also be applied to data from whole-genome studies, as an application to a subset of the HapMap data shows.

29 citations
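
To make the idea of weighted k nearest neighbor imputation for categorical genotypes concrete, here is a minimal sketch under stated assumptions. It is not the KNNcatImpute procedure from the entry above, whose details are not given here. The sketch assumes genotypes coded as 0, 1, 2 with -1 marking a missing value, treats rows (samples) as the neighbors, measures distance as the mismatch rate over jointly observed columns, and weights votes by inverse distance. The function name `knn_impute_categorical` and all of these choices are hypothetical.

```python
import numpy as np

def knn_impute_categorical(data, k=3, missing=-1):
    """Fill missing categorical entries by a distance-weighted vote of the k nearest rows."""
    data = np.asarray(data)
    imputed = data.copy()
    n_rows, n_cols = data.shape
    for j in range(n_cols):
        donor_rows = np.where(data[:, j] != missing)[0]
        for i in np.where(data[:, j] == missing)[0]:
            candidates = []
            for r in donor_rows:
                # Compare only columns observed in both rows.
                shared = (data[i] != missing) & (data[r] != missing)
                if shared.any():
                    dist = float(np.mean(data[i, shared] != data[r, shared]))
                    candidates.append((dist, int(r)))
            if not candidates:
                continue  # no comparable donor row; leave the entry missing
            candidates.sort()
            votes = {}
            for dist, r in candidates[:k]:
                category = int(data[r, j])
                # Closer neighbors get larger weight; the small constant avoids division by zero.
                votes[category] = votes.get(category, 0.0) + 1.0 / (dist + 1e-6)
            imputed[i, j] = max(votes, key=votes.get)
    return imputed

# Toy genotype matrix (an assumption): rows are samples, columns are SNPs.
genotypes = np.array([
    [0, 1, 2, 0],
    [0, 1, 2, 0],
    [2, 0, 0, 1],
    [0, 1, 2, -1],
])
completed = knn_impute_categorical(genotypes, k=2)
print(completed)
```

In the toy matrix the last row matches the first two rows on every observed SNP, so its missing genotype is filled with their shared value.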


Journal Article
Martin Jung1
TL;DR: In this article, the authors propose a method to solve the problem of homonymity in homonym identification, i.e., homonymization, in the context of homology.

16 citations



Posted Content
TL;DR: In this article, the authors consider inverse regression models with convolution-type operators which mediate convolution on R^d (d>=1) and prove a pointwise central limit theorem for spectral regularization estimators which can be applied to construct pointwise confidence regions.
Abstract: We consider inverse regression models with convolution-type operators which mediate convolution on R^d (d>=1) and prove a pointwise central limit theorem for spectral regularisation estimators which can be applied to construct pointwise confidence regions. Here, we cope with the unknown bias of such estimators by undersmoothing. Moreover, we prove consistency of the residual bootstrap in this setting and demonstrate the feasibility of the bootstrap confidence bands at moderate sample sizes in a simulation study.

12 citations


Posted Content
TL;DR: RFreak is an R package providing a framework for evolutionary computation by enwrapping the functionality of an evolutionary algorithm kit written in Java, and is thus further supporting the use of evolutionary computation in computational statistics.
Abstract: RFreak is an R package providing a framework for evolutionary computation. By enwrapping the functionality of an evolutionary algorithm kit written in Java, it offers an easy way to do evolutionary computation in R. In addition, application examples where an evolutionary approach is promising in computational statistics are included and described in this paper. The package is thus further supporting the use of evolutionary computation in computational statistics.

7 citations


Posted Content
TL;DR: In this paper, a geometric characterization of c-optimal designs in this context is presented, which generalizes the classical result of Elfving (1952) for c-optimality.
Abstract: We consider the common nonlinear regression model where the variance as well as the mean is a parametric function of the explanatory variables. The c-optimal design problem is investigated in the case when the parameters of both the mean and the variance function are of interest. A geometric characterization of c-optimal designs in this context is presented, which generalizes the classical result of Elfving (1952) for c-optimal designs. As in Elfving's famous characterization c-optimal designs can be described as representations of boundary points of a convex set. However, in the case where there appear parameters of interest in the variance, the structure of the Elfving set is different. Roughly speaking the Elfving set corresponding to a heteroscedastic regression model is the convex hull of a set of ellipsoids induced by the underlying model and indexed by the design space. The c-optimal designs are characterized as representations of the points where the line in direction of the vector c intersects the boundary of the new Elfving set. The theory is illustrated in several examples including pharmacokinetic models with random effects.

7 citations


Posted Content
TL;DR: In this paper, the authors consider the problem of designing experiments for estimating the slope of the expected response in a regression, where the experimenter is only interested in the slope at a particular point, and standardized minimax optimal designs are used if precise estimation of the slope over a given region is required.
Abstract: In the common linear regression model we consider the problem of designing experiments for estimating the slope of the expected response in a regression. We discuss locally optimal designs, where the experimenter is only interested in the slope at a particular point, and standardized minimax optimal designs, which could be used if precise estimation of the slope over a given region is required. General results on the number of support points of locally optimal designs are derived if the regression functions form a Chebyshev system. For polynomial regression and Fourier regression models of arbitrary degree the optimal designs for estimating the slope of the regression are determined explicitly for many cases of practical interest.

5 citations


Posted Content
TL;DR: This article constructed uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators, where the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval.
Abstract: We construct uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators. Here, the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval, since the former situation arguably arises more often in applications. First, following Bickel and Rosenblatt [Ann. Statist. 1, 1071-1095] we construct asymptotic confidence bands which are based on strong approximations and on a limit theorem for the supremum of a stationary Gaussian process. Further, we propose bootstrap confidence bands based on the residual bootstrap. A simulation study shows that the bootstrap confidence bands perform reasonably well for moderate sample sizes. Finally, we apply our method to data from a gel electrophoresis experiment with genetically engineered neuronal receptor subunits incubated with rat brain extract.

Posted Content
TL;DR: A framework combining information retrieval with machine learning and (pre-)processing for named entity recognition in order to extract events from a large document collection is presented.
Abstract: Where Information Retrieval (IR) and Text Categorization deliver a set of (ranked) documents according to a query, users of large document collections would rather like to receive answers. Question-answering from text has already been the goal of the Message Understanding Conferences. Since then, the task of text understanding has been reduced to several more tractable tasks, most prominently Named Entity Recognition (NER) and Relation Extraction. Now, pieces can be put together to form enhanced services added on an IR system. In this paper, we present a framework which combines standard IR with machine learning and (pre-)processing for NER in order to extract events from a large document collection. Some questions can already be answered by particular events. Other questions require an analysis of a set of events. Hence, the extracted events become input to another machine learning process which delivers the final output to the user's question. Our case study is the public collection of minutes of plenary sessions of the German parliament and of petitions to the German parliament.

Journal Article
TL;DR: This paper proposes and implements a novel ID-based secure ranging protocol that is inspired by existing authenticated ranging and distance–bounding protocols, and is tailored to work on existing Ultra Wide Band (UWB) ranging platforms.
Abstract: In this paper, we propose and implement a novel ID-based secure ranging protocol. Our protocol is inspired by existing authenticated ranging and distance–bounding protocols, and is tailored to work on existing Ultra Wide Band (UWB) ranging platforms. Building on the implementation of secure ranging, we further implement a secure localization protocol that enables the computation of a correct device location in the presence of an adversary. We study how various implementations of secure ranging and localization protocols impact their security and performance. We further propose modifications to these protocols to increase their security and accuracy. To the best of our knowledge, this is the first implementation of an RF Time–of–Arrival (ToA) secure localization system.

Journal Article
TL;DR: This work investigates the Skyhook positioning system, available on PCs and used on a number of mobile platforms, and demonstrates that this system is vulnerable to location spoofing and location database manipulation attacks.

Journal Article
TL;DR: The Technical Report of the Studies on Water Resources of New York State and the Great Lakes at DigitalCommons @Brockport as mentioned in this paper has been accepted for inclusion in Technical Reports by an authorized administrator.
Abstract: This Technical Report is brought to you for free and open access by the Studies on Water Resources of New York State and the Great Lakes at DigitalCommons @Brockport. It has been accepted for inclusion in Technical Reports by an authorized administrator of Digital Commons @Brockport. For more information, please contact kmyers@brockport.edu.



Posted Content
TL;DR: In this article, the authors constructed tests for shift detection in locally-stationary autoregressive time series which resist contamination by a substantial amount of outliers. Tests based on a comparison of local medians standardized by a highly robust estimate of the variability show reliable performance in a broad variety of situations if the thresholds are adjusted for possible autocorrelations.
Abstract: Tests for shift detection in locally-stationary autoregressive time series are constructed which resist contamination by a substantial amount of outliers. Tests based on a comparison of local medians standardized by a highly robust estimate of the variability show reliable performance in a broad variety of situations if the thresholds are adjusted for possible autocorrelations.