
Showing papers in "Algorithms in 2011"


Journal ArticleDOI
TL;DR: The impact on SMOS imagery of a sinusoidal RFI source is reviewed, and the problem is illustrated with actual RFI encountered by SMOS.
Abstract: The European Space Agency (ESA) successfully launched the Soil Moisture and Ocean Salinity (SMOS) mission on November 2, 2009. SMOS uses a new type of instrument, a synthetic aperture radiometer named MIRAS, that provides full-polarimetric multi-angular L-band brightness temperatures, from which regular and global maps of Sea Surface Salinity (SSS) and Soil Moisture (SM) are generated. Although SMOS operates in a restricted band (1400-1427 MHz), radio-frequency interference (RFI) appears in SMOS imagery in many areas of the world, and it is an important issue to be addressed for quality SSS and SM retrievals. The impact on SMOS imagery of a sinusoidal RFI source is reviewed, and the problem is illustrated with actual RFI encountered by SMOS. Two RFI detection and mitigation algorithms are developed (for the dual-polarization and full-polarimetric modes); the performance of the second has been quantitatively evaluated in terms of probability of detection and false alarm using a synthetic test scene, and the results are presented.

53 citations


Journal ArticleDOI
TL;DR: This paper analyzes how recommender systems can be applied to current e-learning systems to guide learners in personalized inclusive e-learning scenarios and presents three requirements to be considered: a recommendation model; an open standards-based service-oriented architecture; and a usable and accessible graphical user interface to deliver the recommendations.
Abstract: This paper analyzes how recommender systems can be applied to current e-learning systems to guide learners in personalized inclusive e-learning scenarios. Recommendations can be used to overcome current limitations of learning management systems in providing personalization and accessibility features. Recommenders can take advantage of standards-based solutions to provide inclusive support. To this end we have identified the need for developing semantic educational recommender systems, which are able to extend existing learning management systems with adaptive navigation support. In this paper we present three requirements to be considered in developing these semantic educational recommender systems, which are in line with the service-oriented approach of the third generation of learning management systems, namely: (i) a recommendation model; (ii) an open standards-based service-oriented architecture; and (iii) a usable and accessible graphical user interface to deliver the recommendations.

52 citations


Journal ArticleDOI
TL;DR: Radio Frequency Interference (RFI) detection and mitigation algorithms based on a signal’s spectrogram (frequency and time domain representation) are presented.
Abstract: Radio Frequency Interference (RFI) detection and mitigation algorithms based on a signal’s spectrogram (frequency and time domain representation) are presented. The radiometric signal’s spectrogram is treated as an image, and therefore image processing techniques are applied to detect and mitigate RFI by two-dimensional filtering. A series of Monte-Carlo simulations have been performed to evaluate the performance of a simple thresholding algorithm and a modified two-dimensional Wiener filter.
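
A minimal sketch of the general approach described in this abstract, assuming Python with NumPy/SciPy: treat the spectrogram as an image, flag time-frequency bins above a simple threshold, and smooth with a two-dimensional Wiener filter. The signal, sample rate, and threshold are illustrative, and this is not the paper's exact algorithm.

```python
import numpy as np
from scipy.signal import spectrogram, wiener

# Illustrative RFI scenario: thermal-noise-like signal plus a sinusoidal
# interferer (sample rate, tone frequency and threshold are made up).
rng = np.random.default_rng(0)
fs = 1e6                                   # assumed sample rate [Hz]
t = np.arange(200_000) / fs
x = rng.standard_normal(t.size) + 0.5 * np.sin(2 * np.pi * 150e3 * t)

# Treat the spectrogram as an image.
f, ts, Sxx = spectrogram(x, fs=fs, nperseg=256)

# Detection: flag time-frequency bins far above the median power level.
rfi_mask = Sxx > 5 * np.median(Sxx)

# Mitigation option 1: blank the flagged bins.
S_blanked = np.where(rfi_mask, np.median(Sxx), Sxx)

# Mitigation option 2: two-dimensional Wiener filtering of the spectrogram.
S_filtered = wiener(Sxx, mysize=(5, 5))

print(f"flagged {int(rfi_mask.sum())} of {Sxx.size} time-frequency bins")
```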

32 citations


Journal ArticleDOI
TL;DR: A new perspective on the smallest grammar problem is proposed by splitting it into two tasks: choosing which words will be the constituents of the grammar and searching for the smallest grammar given this set of constituents.
Abstract: The smallest grammar problem—namely, finding a smallest context-free grammar that generates exactly one sequence—is of practical and theoretical importance in fields such as Kolmogorov complexity, data compression and pattern discovery. We propose a new perspective on this problem by splitting it into two tasks: (1) choosing which words will be the constituents of the grammar and (2) searching for the smallest grammar given this set of constituents. We show how to solve the second task in polynomial time by parsing longer constituents with smaller ones. We propose new algorithms, based on classical practical algorithms, that use this optimization to find small grammars. Our algorithms consistently find smaller grammars on a classical benchmark, reducing the size by 10% in some cases. Moreover, our formulation allows us to define interesting bounds on the number of small grammars and to empirically compare different grammars of small size.
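
The second task, finding the smallest grammar once the constituents are fixed, amounts to parsing the sequence (and each constituent) optimally with the smaller constituents. A minimal dynamic-programming sketch of such an optimal parse follows; it only illustrates the idea and is not the paper's algorithm, and the helper name and examples are made up.

```python
def minimal_parse(s, constituents):
    """Minimum number of pieces needed to write s as a concatenation of
    constituents; single characters are always allowed as a fallback.
    Illustrative dynamic program, not the paper's algorithm."""
    words = set(constituents) | set(s)
    best = [0] + [float("inf")] * len(s)   # best[i] = pieces covering s[:i]
    for i in range(1, len(s) + 1):
        for w in words:
            if i >= len(w) and s[i - len(w):i] == w:
                best[i] = min(best[i], best[i - len(w)] + 1)
    return best[len(s)]

# A repeated constituent shortens the parse of a repetitive sequence:
print(minimal_parse("abababab", ["abab"]))  # -> 2
print(minimal_parse("abababab", []))        # -> 8 (character by character)
```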

16 citations


Journal ArticleDOI
TL;DR: The prefix-omission method is combined with Huffman coding, and a new variant based on Fibonacci codes is presented; the new methods are suggested to be preferable for the small files typical of dictionaries, which are usually kept in small chunks.
Abstract: The problem of compressed pattern matching, which has recently been treated in many papers dealing with free text, is extended to structured files, specifically to dictionaries, which appear in any full-text retrieval system. The prefix-omission method is combined with Huffman coding, and a new variant based on Fibonacci codes is presented. Experimental results suggest that the new methods are often preferable to earlier ones, in particular for the small files typical of dictionaries, which are usually kept in small chunks.
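
As background for the two ingredients named above, the sketch below shows a generic prefix-omission encoding of a sorted word list together with a standard Fibonacci (Zeckendorf-based) integer code; it illustrates the general techniques, not the paper's specific variant, and the sample dictionary is invented.

```python
def fibonacci_code(n):
    """Fibonacci (Zeckendorf-based) universal code of a positive integer:
    one bit per Fibonacci number up to the largest one used, then a final
    '1' so every codeword ends in '11' and is self-delimiting."""
    fibs = [1, 2]                       # F(2), F(3), ...
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    bits, remainder = [], n
    for f in reversed(fibs):            # greedy Zeckendorf representation
        if f <= remainder:
            bits.append("1")
            remainder -= f
        else:
            bits.append("0")
    code = "".join(reversed(bits))      # lowest Fibonacci number first
    return code[:code.rfind("1") + 1] + "1"

def prefix_omission(sorted_words):
    """Encode a sorted word list as (shared-prefix length, suffix) pairs."""
    encoded, prev = [], ""
    for w in sorted_words:
        k = 0
        while k < min(len(prev), len(w)) and prev[k] == w[k]:
            k += 1
        encoded.append((k, w[k:]))
        prev = w
    return encoded

words = ["car", "care", "careful", "cart", "cat"]
pairs = prefix_omission(words)
print(pairs)   # [(0, 'car'), (3, 'e'), (4, 'ful'), (3, 't'), (2, 't')]
# The resulting small integers are natural inputs for a universal code:
print([fibonacci_code(k + 1) for k, _ in pairs])   # +1 because codes start at 1
```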

15 citations


Journal ArticleDOI
TL;DR: A survey of results concerning Lempel–Ziv data compression on parallel and distributed systems is presented, starting from the theoretical approach to parallel time complexity and concluding with the practical goal of designing distributed algorithms with low communication cost.
Abstract: We present a survey of results concerning Lempel–Ziv data compression on parallel and distributed systems, starting from the theoretical approach to parallel time complexity and concluding with the practical goal of designing distributed algorithms with low communication cost. Storer’s extension for image compression is also discussed.

10 citations


Journal ArticleDOI
TL;DR: This paper defends GLS in the PPP context, investigates when PPP can occur, illustrates when PPP can be beneficial for parameter estimation, reviews optimality properties of GLS estimators, and gives an example in which PPP does occur.
Abstract: Generalized least squares (GLS) for model parameter estimation has a long and successful history dating to its development by Gauss in 1795. Alternatives can outperform GLS in some settings, and alternatives to GLS are sometimes sought when GLS exhibits curious behavior, such as in Peelle’s Pertinent Puzzle (PPP). PPP was described in 1987 in the context of estimating fundamental parameters that arise in nuclear interaction experiments. In PPP, GLS estimates fell outside the range of the data, eliciting concerns that GLS was somehow flawed. These concerns have led to suggested alternatives to GLS estimators. This paper defends GLS in the PPP context, investigates when PPP can occur, illustrates when PPP can be beneficial for parameter estimation, reviews optimality properties of GLS estimators, and gives an example in which PPP does occur.
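
The puzzle itself can be reproduced with a few lines of linear algebra. The sketch below uses the numbers commonly cited for PPP (two measurements, 1.5 and 1.0, of the same quantity, with a 10% uncorrelated and a 20% fully correlated relative uncertainty built from the measured values); the GLS estimate then falls below both measurements. The code is illustrative and not taken from the paper.

```python
import numpy as np

# Commonly cited PPP illustration: two measurements of the same quantity,
# each with a 10% independent uncertainty and a shared, fully correlated
# 20% normalization uncertainty (both expressed relative to the measured
# values, which is what triggers the puzzle). Numbers are illustrative.
y = np.array([1.5, 1.0])
rel_uncorr, rel_corr = 0.10, 0.20
cov = np.diag((rel_uncorr * y) ** 2) + np.outer(rel_corr * y, rel_corr * y)

# GLS estimate of a single common mean: x = (1' C^-1 y) / (1' C^-1 1).
ones = np.ones_like(y)
w = np.linalg.solve(cov, ones)
x_hat = (w @ y) / (w @ ones)
sd_hat = np.sqrt(1.0 / (w @ ones))

print(f"GLS estimate: {x_hat:.3f} +/- {sd_hat:.3f}")
# ~0.88 +/- 0.22, i.e. outside the data range [1.0, 1.5] -- the "puzzle".
```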

10 citations


Journal ArticleDOI
TL;DR: Stochastic context-free grammars are extended such that the probability of applying a rule can depend on the length of the subword that is eventually generated from the symbols introduced by the rule.
Abstract: In order to be able to capture effects from co-transcriptional folding, we extend stochastic context-free grammars such that the probability of applying a rule can depend on the length of the subword that is eventually generated from the symbols introduced by the rule, and we show that existing algorithms for training and for determining the most probable parse tree can easily be adapted to the extended model without loss of performance. Furthermore, we show that the extended model is suited to improve the quality of predictions of RNA secondary structures. The extended model may also be applied to other fields where stochastic context-free grammars are used, such as natural language processing. Additionally, some interesting questions in the field of formal languages arise from it.

8 citations


Journal ArticleDOI
TL;DR: It is shown that the edit distance with block moves and block deletions is NP-complete, and that it can be reduced to the problem of non-recursive block moves and block deletions within a constant factor.
Abstract: Several variants of the edit distance problem with block deletions are considered. Polynomial time optimal algorithms are presented for the edit distance with block deletions allowing character insertions and character moves, but without block moves. We show that the edit distance with block moves and block deletions is NP-complete (NP-complete problems are those whose solutions can be verified in polynomial time and to which any NP problem can be converted in polynomial time), and that it can be reduced to the problem of non-recursive block moves and block deletions within a constant factor.
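
As a rough illustration of the block-deletion operation only (not the paper's optimal algorithms, which also handle character moves), the sketch below extends the standard edit-distance dynamic program so that deleting any contiguous block of the source string costs one unit regardless of its length.

```python
def edit_distance_block_del(a, b):
    """Edit distance from a to b with unit-cost character substitutions and
    insertions, plus unit-cost deletion of any contiguous block of a.
    Simple O(|a|^2 * |b|) illustration of block deletions only; not the
    paper's optimal algorithms (which also allow character moves)."""
    n, m = len(a), len(b)
    D = [[float("inf")] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0
    for j in range(1, m + 1):
        D[0][j] = j                                  # insert b[:j]
    for i in range(1, n + 1):
        for j in range(m + 1):
            if j > 0:
                D[i][j] = min(D[i - 1][j - 1] + (a[i - 1] != b[j - 1]),  # substitute/match
                              D[i][j - 1] + 1)                           # insert b[j-1]
            # delete a[k:i] as one block, for any k < i
            D[i][j] = min(D[i][j], min(D[k][j] for k in range(i)) + 1)
    return D[n][m]

# A long deleted run counts as a single operation:
print(edit_distance_block_del("abXXXXXXcd", "abcd"))  # -> 1
print(edit_distance_block_del("kitten", "sitting"))   # -> 3
```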

8 citations


Journal ArticleDOI
TL;DR: Performance comparisons among estimators are provided, approximate Bayesian computation (ABC) using density estimation applied to simulated data realizations is introduced as an alternative to the incomplete approach, and it is shown that estimation error in the assumed covariance matrix cannot always be ignored.
Abstract: Peelle’s Pertinent Puzzle (PPP) was described in 1987 in the context of estimating fundamental parameters that arise in nuclear interaction experiments. In PPP, generalized least squares (GLS) parameter estimates fell outside the range of the data, which has raised concerns that GLS is somehow flawed and has led to suggested alternatives to GLS estimators. However, there have been no corresponding performance comparisons among methods, and one suggested approach involving simulated data realizations is statistically incomplete. Here we provide performance comparisons among estimators, introduce approximate Bayesian computation (ABC) using density estimation applied to simulated data realizations to produce an alternative to the incomplete approach, complete the incompletely specified approach, and show that estimation error in the assumed covariance matrix cannot always be ignored.
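
For readers unfamiliar with ABC, the sketch below shows its simplest rejection form on a toy Gaussian-mean problem; the prior, tolerance, and summary statistic are all invented for illustration, and the paper's density-estimation variant is more refined than this.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy inference problem: data assumed to come from N(mu, 1); infer mu from
# the sample mean. All settings (prior, tolerance, summary) are invented.
observed = rng.normal(loc=1.2, scale=1.0, size=50)
obs_summary = observed.mean()

def abc_rejection(n_draws=100_000, tol=0.05):
    """Plain ABC rejection sampling: draw parameters from the prior,
    simulate the summary statistic, keep draws whose simulated summary is
    within tol of the observed one."""
    mu = rng.uniform(-5.0, 5.0, size=n_draws)            # illustrative prior
    # mean of 50 simulated N(mu, 1) points, drawn from its exact sampling law
    sim_summary = rng.normal(loc=mu, scale=1.0 / np.sqrt(50))
    return mu[np.abs(sim_summary - obs_summary) < tol]

posterior = abc_rejection()
print(f"accepted {posterior.size} draws; "
      f"posterior mean ~ {posterior.mean():.2f}, sd ~ {posterior.std():.2f}")
```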

8 citations


Journal ArticleDOI
TL;DR: This paper proves that determining whether a given finite configuration of generalized Langton’s ant is repeatable or not is PSPACE-hard, and also proves the PSPACE-hardness of the ant’s problem on a hexagonal grid.
Abstract: Chris Langton proposed a model of an artificial life that he named “ant”: an agent, called an ant, that sits on a square of a grid and moves by turning to the left (or right) according to the black (or white) color of the square it is heading to, after which the square reverses its color. Bunimovich and Troubetzkoy proved that an ant’s trajectory is always unbounded, or equivalently, that there exists no repeatable configuration of the ant’s system. On the other hand, by introducing a new type of color on which the ant goes straight ahead and the color never changes, repeatable configurations are known to exist. In this paper, we prove that determining whether a given finite configuration of generalized Langton’s ant is repeatable or not is PSPACE-hard. We also prove the PSPACE-hardness of the ant’s problem on a hexagonal grid.
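
For reference, the classical two-color ant described above can be simulated in a few lines; the generalized model studied in the paper additionally allows colors on which the ant goes straight and which never change. The grid representation and step count below are arbitrary choices.

```python
def simulate_ant(steps=11_000):
    """Classical Langton's ant: turn right on a white cell, left on a black
    cell, flip the cell's color, then move forward one cell. (The paper's
    generalized ant adds colors on which it goes straight and nothing flips.)"""
    black = set()              # cells currently black; all others are white
    x, y = 0, 0
    dx, dy = 0, 1              # start heading "north"
    for _ in range(steps):
        if (x, y) in black:
            dx, dy = -dy, dx   # turn left
            black.remove((x, y))
        else:
            dx, dy = dy, -dx   # turn right
            black.add((x, y))
        x, y = x + dx, y + dy
    return black, (x, y)

cells, pos = simulate_ant()
print(f"after 11000 steps: {len(cells)} black cells, ant at {pos}")
# Around step 10,000 the classical ant enters its unbounded "highway",
# in line with Bunimovich and Troubetzkoy's unboundedness result.
```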

Journal ArticleDOI
TL;DR: The space complexity required by a data structure to maintain an asynchronous data stream so that it can approximate the set of frequent items over a sliding time window with sufficient accuracy is studied.
Abstract: In an asynchronous data stream, the data items may be out of order with respect to their original timestamps. This paper studies the space complexity required by a data structure to maintain such a data stream so that it can approximate the set of frequent items over a sliding time window with sufficient accuracy. Prior to our work, the best solution is given by Cormode et al. [1], who gave an O((1/ε) log W log(εB/log W) min{log W, 1/ε} log |U|)-space data structure that can approximate the frequent items within an ε error bound, where W and B are parameters of the sliding window, and U is the set of all possible item names. We give a more space-efficient data structure that only requires O((1/ε) log W log(εB/log W) log log W) space.
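
To give a feel for ε-approximate frequency estimation in small space, the sketch below implements the classical Misra-Gries summary; it ignores timestamps and windows entirely, whereas the paper's data structure additionally handles out-of-order arrivals over a sliding time window.

```python
import math
from collections import defaultdict

def misra_gries(stream, epsilon):
    """Classical Misra-Gries summary with about 1/epsilon counters: every
    estimated count undercounts the true count by at most epsilon * len(stream),
    so all items more frequent than that survive. Shown only to illustrate
    epsilon-approximate frequency estimation; unlike the paper's structure it
    has no notion of timestamps, sliding windows, or out-of-order arrival."""
    k = max(1, math.ceil(1 / epsilon))
    counters = defaultdict(int)
    for item in stream:
        if item in counters or len(counters) < k:
            counters[item] += 1
        else:
            # decrement every counter and drop the ones that reach zero
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return dict(counters)

stream = list("abracadabra") * 100
print(misra_gries(stream, epsilon=0.1))
# 'a', 'b' and 'r' (frequency above 10% of the stream) are guaranteed to appear.
```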

Journal ArticleDOI
TL;DR: This work develops an algorithm that efficiently computes the distribution of a pattern matching algorithm’s running time cost (such as the number of text character accesses) for any given pattern in a random text model.
Abstract: We propose a framework for the exact probabilistic analysis of window-based pattern matching algorithms, such as Boyer–Moore, Horspool, Backward DAWG Matching, Backward Oracle Matching, and more. In particular, we develop an algorithm that efficiently computes the distribution of a pattern matching algorithm's running time cost (such as the number of text character accesses) for any given pattern in a random text model. Text models range from simple uniform models to higher-order Markov models or hidden Markov models (HMMs). Furthermore, we provide an algorithm to compute the exact distribution of differences in running time cost of two pattern matching algorithms. Methodologically, we use extensions of finite automata which we call deterministic arithmetic automata (DAAs) and probabilistic arithmetic automata (PAAs) [Marschall 2008]. Given an algorithm, a pattern, and a text model, a PAA is constructed from which the sought distributions can be derived using dynamic programming. To our knowledge, this is the first time that substring- or suffix-based pattern matching algorithms are analyzed exactly by computing the whole distribution of running time cost. Experimentally, we compare Horspool's algorithm, Backward DAWG Matching, and Backward Oracle Matching on prototypical patterns of short length and provide statistics on the size of minimal DAAs for these computations.
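
The distribution being computed can be obtained by brute force for tiny instances: enumerate every text of a given length under an i.i.d. model, run Horspool's algorithm while counting text character accesses, and weight each count by the text's probability. The paper's PAA construction obtains the same distribution without enumerating texts; the sketch below is only this naive baseline, with an invented pattern and alphabet.

```python
from collections import defaultdict
from itertools import product

def horspool_accesses(text, pattern):
    """Run Horspool's algorithm and count text character accesses
    (one access per character comparison against the text)."""
    m, n = len(pattern), len(text)
    shift = {c: m for c in set(text) | set(pattern)}     # bad-character table
    for i, c in enumerate(pattern[:-1]):
        shift[c] = m - 1 - i
    accesses, pos = 0, 0
    while pos + m <= n:
        j = m - 1
        while j >= 0:
            accesses += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        pos += shift[text[pos + m - 1]]                  # shift by last window char
    return accesses

def access_distribution(pattern, text_len, alphabet="ab", probs=None):
    """Exact distribution of Horspool's character accesses over all texts of
    length text_len under an i.i.d. model (uniform if probs is None).
    Brute-force enumeration, feasible only for tiny text_len."""
    probs = probs or {c: 1.0 / len(alphabet) for c in alphabet}
    dist = defaultdict(float)
    for chars in product(alphabet, repeat=text_len):
        p = 1.0
        for c in chars:
            p *= probs[c]
        dist[horspool_accesses("".join(chars), pattern)] += p
    return dict(sorted(dist.items()))

print(access_distribution("ab", text_len=8))
```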

Journal ArticleDOI
TL;DR: A new wavelet-based algorithm is described, which indicates that a new measurement called the PLA index could be used to quantify the variability or predictability of blood glucose.
Abstract: Several measurements are used to describe the behavior of a diabetic patient’s blood glucose. We describe a new, wavelet-based algorithm that indicates a new measurement called a PLA index could be used to quantify the variability or predictability of blood glucose. This wavelet-based approach emphasizes the shape of a blood glucose graph. Using continuous glucose monitors (CGMs), this measurement could become a new tool to classify patients based on their blood glucose behavior and may become a new method in the management of diabetes.

Journal ArticleDOI
TL;DR: Two goodness-of-fit tests for copulas are investigated using an expansion of the projection pursuit methodology, which enables us to determine on which axis system these copulas lie, as well as their exact value in the basis formed by the axes previously determined, irrespective of their value in their canonical basis.
Abstract: Two goodness-of-fit tests for copulas are investigated. The first one deals with the case of elliptical copulas and the second one deals with independent copulas. These tests result from the expansion of the projection pursuit methodology that we introduce in the present article. This method enables us to determine on which axis system these copulas lie, as well as their exact value in the basis formed by the axes previously determined, irrespective of their value in their canonical basis. Simulations are also presented, as well as an application to real datasets.

Journal ArticleDOI
TL;DR: For fixed k ≥ 2 and a data alphabet of cardinality m, the hierarchical type class of a string of length n = k^j is formed by permuting the string under the isomorphisms of the rooted k-ary tree of depth j; infinitely many of the resulting hierarchical entropy functions are shown to be self-affine, each being the attractor of an explicit iterated function system.
Abstract: For fixed k ≥ 2 and fixed data alphabet of cardinality m, the hierarchical type class of a data string of length n = k^j for some j ≥ 1 is formed by permuting the string in all possible ways under permutations arising from the isomorphisms of the unique finite rooted tree of depth j which has n leaves and k children for each non-leaf vertex. Suppose the data strings in a hierarchical type class are losslessly encoded via binary codewords of minimal length. A hierarchical entropy function is a function on the set of m-dimensional probability distributions which describes the asymptotic compression rate performance of this lossless encoding scheme as the data length n is allowed to grow without bound. We determine infinitely many hierarchical entropy functions which are each self-affine. For each such function, an explicit iterated function system is found such that the graph of the function is the attractor of the system.
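
For k = 2 the definition can be made concrete: the tree isomorphisms amount to recursively splitting the string into halves and, at each level, optionally swapping the two halves. The small enumerator below (illustrative only, and exponential in general) lists the hierarchical type class of a short binary string and shows that it can be a strict subset of the ordinary type class.

```python
from itertools import permutations

def hierarchical_type_class(s, k=2):
    """All strings obtainable from s under isomorphisms of the complete
    k-ary tree whose leaves carry the symbols of s: split s into k equal
    blocks, recurse into each block, then permute the blocks in every way.
    Requires len(s) to be a power of k; purely illustrative enumeration."""
    if len(s) <= 1:
        return {s}
    size = len(s) // k
    block_variants = [hierarchical_type_class(s[i * size:(i + 1) * size], k)
                      for i in range(k)]
    result = set()

    def build(i, chosen):
        if i == k:
            for perm in permutations(chosen):
                result.add("".join(perm))
        else:
            for v in block_variants[i]:
                build(i + 1, chosen + [v])

    build(0, [])
    return result

print(sorted(hierarchical_type_class("0110")))
# ['0101', '0110', '1001', '1010'] -- a strict subset of the ordinary type
# class of strings with two 0s and two 1s ('0011' and '1100' are excluded).
```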

Journal ArticleDOI
TL;DR: This paper gives a 2 log2(n)-approximation algorithm for the minimum directed tour cover problem (DToCP), which is to find a directed tour cover of minimum cost.
Abstract: Given a directed graph G with non-negative cost on the arcs, a directed tour cover T of G is a cycle (not necessarily simple) in G such that either the head or the tail (or both) of every arc in G is touched by T. The minimum directed tour cover problem (DToCP), which is to find a directed tour cover of minimum cost, is NP-hard. It is thus interesting to design approximation algorithms with a performance guarantee to solve this problem. Although its undirected counterpart (ToCP) has been studied in recent years, to our knowledge, the DToCP remains widely open. In this paper, we give a 2 log2(n)-approximation algorithm for the DToCP.