
Showing papers on "Sequential algorithm published in 2015"


Journal ArticleDOI
TL;DR: In this paper, a Bayesian sequential sensor placement algorithm, based on robust information entropy, is proposed for multiple types of sensors; it is a holistic approach in which the overall performance of the various types of sensors at different locations is assessed.
Abstract: In this paper, a Bayesian sequential sensor placement algorithm, based on the robust information entropy, is proposed for multiple types of sensors. The presented methodology has two salient features. It is a holistic approach in which the overall performance of the various types of sensors at different locations is assessed. Therefore, it provides a rational and effective strategy for designing the sensor configuration, which optimizes the use of the various available resources. The sequential algorithm is very efficient due to its Bayesian nature, in which a prior distribution can be incorporated. Therefore, it avoids the unidentifiability problem that can be encountered in a sequential process that starts with a small number of sensors. The proposed algorithm is demonstrated using a shear building and a lattice tower with consideration of up to four types of sensors. Copyright © 2014 John Wiley & Sons, Ltd.

101 citations
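
A minimal Python sketch of the kind of greedy sequential placement loop this abstract describes; the entropy-based utility is abstracted into a user-supplied `info_gain` function, and the per-type budget handling is an illustrative assumption rather than the authors' exact formulation:

```python
from itertools import product

def greedy_sensor_placement(locations, sensor_types, budgets, info_gain):
    """Sequentially add the (location, sensor type) pair that maximizes the
    supplied information-gain score given the sensors already placed.
    `info_gain(placed, candidate)` stands in for an entropy-based utility."""
    placed = []
    remaining = dict(budgets)              # sensors still available per type
    free_locations = set(locations)
    while free_locations and any(n > 0 for n in remaining.values()):
        candidates = [(loc, st) for loc, st in product(free_locations, sensor_types)
                      if remaining[st] > 0]
        if not candidates:
            break
        best = max(candidates, key=lambda cand: info_gain(placed, cand))
        placed.append(best)
        free_locations.discard(best[0])
        remaining[best[1]] -= 1
    return placed

# Toy run: reward placements far from sensors that are already installed.
spread = lambda placed, cand: min((abs(cand[0] - p[0]) for p in placed), default=float("inf"))
print(greedy_sensor_placement(range(10), ["accel", "strain"], {"accel": 2, "strain": 1}, spread))
```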


Journal ArticleDOI
13 Apr 2015
TL;DR: In this paper, the authors compare two multicore thread-parallel adaptations of a state-of-the-art branch-and-bound algorithm for the maximum clique problem and provide a novel explanation as to why they are successful.
Abstract: Finding a maximum clique in a given graph is one of the fundamental NP-hard problems. We compare two multicore thread-parallel adaptations of a state-of-the-art branch-and-bound algorithm for the maximum clique problem and provide a novel explanation as to why they are successful. We show that load balance is sometimes a problem but that the interaction of parallel search order and the most likely location of solutions within the search space is often the dominating consideration. We use this explanation to propose a new low-overhead, scalable work-splitting mechanism. Our approach uses explicit early diversity to avoid strong commitment to the weakest heuristic advice and late resplitting for balance. More generally, we argue that, for branch-and-bound, parallel algorithm design should not be performed independently of the underlying sequential algorithm.

39 citations


Proceedings ArticleDOI
04 Jan 2015
TL;DR: This work shows that simple sequential randomized iterative algorithms for random permutation, list contraction, and tree contraction are highly parallel if iterations of the algorithms are run as soon as all of their dependencies have been resolved, and the resulting computations have logarithmic depth with high probability.
Abstract: We show that simple sequential randomized iterative algorithms for random permutation, list contraction, and tree contraction are highly parallel. In particular, if iterations of the algorithms are run as soon as all of their dependencies have been resolved, the resulting computations have logarithmic depth (parallel time) with high probability. Our proofs make an interesting connection between the dependence structure of two of the problems and random binary trees. Building upon this analysis, we describe linear-work, polylogarithmic-depth algorithms for the three problems. Although asymptotically no better than the many prior parallel algorithms for the given problems, their advantages include very simple and fast implementations, and returning the same result as the sequential algorithm. Experiments on a 40-core machine show reasonably good performance relative to the sequential algorithms.

39 citations
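
For reference, a sketch of the standard sequential random-permutation algorithm with the swap targets drawn up front, which, as I read the abstract, is the iteration and dependency structure the paper analyzes; the parallel scheduling of ready iterations is not shown here:

```python
import random

def random_permutation(n, seed=None):
    """Knuth/Durstenfeld shuffle with all swap targets H[i] drawn in advance.
    Iteration i only conflicts with iterations that touch position i or H[i];
    running each iteration as soon as those conflicts are resolved is the
    scheduling whose depth the paper bounds by O(log n) with high probability."""
    rng = random.Random(seed)
    A = list(range(n))
    H = [rng.randint(0, i) for i in range(n)]   # H[i] uniform on {0, ..., i}
    for i in range(n - 1, 0, -1):               # the sequential order
        A[i], A[H[i]] = A[H[i]], A[i]
    return A

print(random_permutation(10, seed=1))
```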


Journal ArticleDOI
TL;DR: It is found that the change in pore volume induced by mineral dissolution can affect fluid pressure and failure status, followed by significant changes in permeability and flow variables, showing strong interrelations between flow-reactive transport and geomechanics.

34 citations


Journal ArticleDOI
TL;DR: A new distributed algorithm based on the partition of the request space and the combination of routes is presented and tested on a set of 24 different scenarios of a large-scale problem in the city of San Francisco, proving that the distributed algorithm can be an effective way to solve high-dimensional dial-a-ride problems.
Abstract: These days, transportation and logistic problems in large cities demand smarter transportation services that provide flexibility and adaptability. A possible solution to this arising problem is to compute the best routes for each new scenario. In this problem, known in the literature as the dial-a-ride problem, a number of passengers are transported between pickup and delivery locations while trying to minimize the routing costs and respecting a set of prespecified constraints. This problem has been solved in the literature with several approaches for small to medium-sized problems. However, few efforts have dealt with the large-scale problems common in massive scenarios (big cities or highly populated regions). In this study, a new distributed algorithm based on the partition of the request space and the combination of routes is presented and tested on a set of 24 different scenarios of a large-scale problem (up to 16,000 requests or 32,000 locations) in the city of San Francisco. The results show that the distributed algorithm is not only able to solve large problem instances that the corresponding sequential algorithm cannot solve in a reasonable time, but also achieves an average improvement of 9% on the smaller problems. The results have been validated by means of statistical procedures, proving that the distributed algorithm can be an effective way to solve high-dimensional dial-a-ride problems.

32 citations


Book ChapterDOI
12 Jan 2015
TL;DR: This work leverages the machine learning models used by existing sequential algorithm selectors, such as 3S, ISAC, SATzilla and ME-ASP, modifies their selection procedures to produce a ranking of the given candidate algorithms, and selects the top n algorithms to be run in parallel on n processing units.
Abstract: In view of the increasing importance of hardware parallelism, a natural extension of per-instance algorithm selection is to select a set of algorithms to be run in parallel on a given problem instance, based on features of that instance. Here, we explore how existing algorithm selection techniques can be effectively parallelized. To this end, we leverage the machine learning models used by existing sequential algorithm selectors, such as 3S, ISAC, SATzilla and ME-ASP, and modify their selection procedures to produce a ranking of the given candidate algorithms; we then select the top n algorithms under this ranking to be run in parallel on n processing units. Furthermore, we adapt the pre-solving schedules obtained by aspeed to be effective in a parallel setting with different time budgets for each processing unit. Our empirical results demonstrate that, using 4 processing units, the best of our methods achieves a 12-fold average speedup over the best single solver on a broad set of challenging scenarios from the algorithm selection library.

31 citations
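
A hedged sketch of the selection-and-dispatch idea (not the authors' code; the scoring model, solver callables and timeout handling are placeholders): rank the candidate solvers with a per-instance model, then run the top n in parallel and keep the first answer.

```python
from concurrent.futures import ProcessPoolExecutor, FIRST_COMPLETED, wait

def select_top_n(score_model, instance_features, solvers, n):
    """Turn a per-instance scoring model (as a sequential selector would use)
    into a ranking and keep the n best solvers for parallel execution."""
    ranked = sorted(solvers, key=lambda s: score_model(s, instance_features), reverse=True)
    return ranked[:n]

def run_portfolio(selected, instance, timeout=None):
    """Launch the selected solvers in parallel; return the first result.
    Solvers must be picklable callables (e.g. module-level functions)."""
    with ProcessPoolExecutor(max_workers=len(selected)) as pool:
        futures = [pool.submit(solver, instance) for solver in selected]
        done, not_done = wait(futures, timeout=timeout, return_when=FIRST_COMPLETED)
        for f in not_done:
            f.cancel()
        return next(iter(done)).result() if done else None
```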


Journal ArticleDOI
TL;DR: A novel algorithm using a reproducing kernel for adaptive nonlinear estimation based on projection-along-subspace, selective update, and parallel projection is proposed, which yields excellent performances with small dictionary sizes.
Abstract: We propose a novel algorithm using a reproducing kernel for adaptive nonlinear estimation. The proposed algorithm is based on three ideas: projection-along-subspace, selective update, and parallel projection. The projection-along-subspace yields excellent performances with small dictionary sizes. The selective update effectively reduces the complexity without any serious degradation of performance. The parallel projection leads to fast convergence/tracking accompanied by noise robustness. A convergence analysis in the non-selective-update case is presented by using the adaptive projected subgradient method. Simulation results exemplify the benefits from the three ideas as well as showing the advantages over the state-of-the-art algorithms. The proposed algorithm bridges the quantized kernel least mean square algorithm of Chen and the sparse sequential algorithm of Dodd.

29 citations


Proceedings ArticleDOI
04 Jan 2015
TL;DR: A new criterion for the Moser-Tardos algorithm to converge is shown; it is stronger than the LLLL criterion and can yield better results than even the full Shearer criterion.
Abstract: The Lopsided Lovász Local Lemma (LLLL) is a powerful probabilistic principle which has been used in a variety of combinatorial constructions. While this principle began as a general statement about probability spaces, it has recently been transformed into a variety of polynomial-time algorithms. The resampling algorithm of Moser & Tardos is the most well-known example of this. A variety of criteria have been shown for the LLLL; the strongest possible criterion was shown by Shearer, and other criteria which are easier to use computationally have been shown by Bissacot et al., Pegden, and Kolipaka & Szegedy. We show a new criterion for the Moser-Tardos algorithm to converge. This criterion is stronger than the LLLL criterion, and in fact can yield better results even than the full Shearer criterion. This is possible because it does not apply in the same generality as the original LLLL; yet, it is strong enough to cover many applications of the LLLL in combinatorics. We show a variety of new bounds and algorithms. A noteworthy application is for k-SAT with bounded occurrences of variables. As shown by Gebauer, Szabó, and Tardos, a k-SAT instance in which every variable appears L ≤ 2^(k+1)/(e(k+1)) times is satisfiable. Although this bound is asymptotically tight (in k), we improve it to L ≤ 2^(k+1)(1 − 1/k)^k/(k − 1) − 2/k, which can be significantly stronger when k is small. We introduce a new parallel algorithm for the LLLL. While Moser & Tardos described a simple parallel algorithm for the Lovász Local Lemma, and described a simple sequential algorithm for a form of the Lopsided Lemma, they were not able to combine the two. Our new algorithm applies in nearly all settings in which the sequential algorithm works; this includes settings covered by our new stronger LLLL criterion.

25 citations
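
The sequential Moser-Tardos resampling loop the abstract builds on, written out for k-SAT as a minimal sketch (the clause encoding and the uniform resampling distribution are illustrative assumptions):

```python
import random

def moser_tardos_ksat(clauses, num_vars, seed=0):
    """Moser-Tardos resampling for k-SAT. A clause is a list of non-zero ints:
    +v means variable v must be True, -v means it must be False. While some
    clause is violated, pick one and resample only its variables."""
    rng = random.Random(seed)
    assign = [rng.random() < 0.5 for _ in range(num_vars + 1)]   # 1-indexed

    def violated(clause):
        return not any((lit > 0) == assign[abs(lit)] for lit in clause)

    while True:
        bad = [c for c in clauses if violated(c)]
        if not bad:
            return assign[1:]
        for lit in rng.choice(bad):          # resample the variables of one bad event
            assign[abs(lit)] = rng.random() < 0.5

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(moser_tardos_ksat([[1, 2], [-1, 3], [-2, -3]], num_vars=3))
```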


Journal ArticleDOI
TL;DR: It is shown that the parallel Tabu Search algorithm for graphics cards (GPUs) outperforms other existing Tabu Search approaches in terms of solution quality and the number of evaluated schedules per second.

24 citations


Journal Article
TL;DR: This paper introduces a technique to parallelize GA-based clustering by extending Hadoop MapReduce, and presents an analysis of the proposed approach to evaluate performance gains with respect to a sequential algorithm.
Abstract: Cluster analysis is used to classify similar objects into the same group. It is one of the most important data mining methods. However, it fails to perform well for big data due to its huge time complexity. For such scenarios, parallelization is a better approach. MapReduce is a popular programming model which enables parallel processing in a distributed environment. However, most clustering algorithms are not "naturally parallelizable", for instance Genetic Algorithms, owing to their sequential nature. This paper introduces a technique to parallelize GA-based clustering by extending Hadoop MapReduce. An analysis of the proposed approach to evaluate performance gains with respect to a sequential algorithm is presented. The analysis is based on a real-life large data set.

19 citations
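
A rough local sketch of the split the paper exploits (fitness evaluation as the "map", selection as the "reduce"); this uses a Python process pool rather than Hadoop, and the one-dimensional clustering fitness is an illustrative stand-in:

```python
import random
from multiprocessing import Pool

def fitness(args):
    """'Map' step: score one chromosome (a list of k cluster centers) against
    the data; fitness is the negative sum of squared distances to the closest
    center, so larger is better."""
    centers, data = args
    sse = sum(min((x - c) ** 2 for c in centers) for x in data)
    return (-sse, centers)

def ga_generation(population, data, workers=4):
    """One GA generation with fitness evaluation farmed out to worker
    processes (the step a MapReduce job would distribute over mappers);
    selection plays the role of the 'reduce'."""
    with Pool(workers) as pool:
        scored = pool.map(fitness, [(chrom, data) for chrom in population])
    scored.sort(reverse=True)                      # keep the fittest half
    survivors = [c for _, c in scored[: len(scored) // 2]]
    children = [[g + random.gauss(0, 0.1) for g in random.choice(survivors)]
                for _ in range(len(population) - len(survivors))]   # mutation only, for brevity
    return survivors + children

if __name__ == "__main__":
    data = [random.gauss(m, 0.3) for m in (0, 5, 10) for _ in range(50)]
    pop = [[random.uniform(0, 10) for _ in range(3)] for _ in range(20)]
    for _ in range(10):
        pop = ga_generation(pop, data)
    print(sorted(round(c, 2) for c in pop[0]))
```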


Proceedings ArticleDOI
07 Apr 2015
TL;DR: These algorithms improve upon the linear depth of the recent parallel algorithms by Fuentes-Sepulveda et al. and achieve up to 27x speedup over the sequential algorithm on a variety of real-world and artificial inputs.
Abstract: We present parallel algorithms for wavelet tree construction with polylogarithmic depth, improving upon the linear depth of the recent parallel algorithms by Fuentes-Sepulveda et al. We experimentally show that on a 40-core machine with two-way hyper-threading, we outperform the existing parallel algorithms by 1.3--5.6x and achieve up to 27x speedup over the sequential algorithm on a variety of real-world and artificial inputs. Our algorithms show good scalability with increasing thread count, input size and alphabet size. We also discuss extensions to variants of the standard wavelet tree.
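
For context, a plain sequential wavelet-tree construction (the baseline such parallel algorithms are compared against); this textbook-style sketch uses Python lists instead of the compact bitvectors and rank structures a real implementation would use:

```python
class WaveletTree:
    """Sequential wavelet tree over an integer alphabet [lo, hi].
    Each node stores one bit per symbol (0 = left half, 1 = right half)."""
    def __init__(self, seq, lo=None, hi=None):
        self.lo = min(seq) if lo is None else lo
        self.hi = max(seq) if hi is None else hi
        self.left = self.right = None
        if self.lo == self.hi or not seq:
            self.bits = []
            return
        mid = (self.lo + self.hi) // 2
        self.bits = [1 if x > mid else 0 for x in seq]
        self.left = WaveletTree([x for x in seq if x <= mid], self.lo, mid)
        self.right = WaveletTree([x for x in seq if x > mid], mid + 1, self.hi)

    def rank(self, symbol, i):
        """Occurrences of `symbol` in seq[:i]; one level per alphabet bit,
        with the per-level popcounts done naively here for clarity."""
        if self.lo == self.hi:
            return i
        if not self.bits:              # empty subtree: the symbol never occurs
            return 0
        mid = (self.lo + self.hi) // 2
        if symbol <= mid:
            return self.left.rank(symbol, self.bits[:i].count(0))
        return self.right.rank(symbol, self.bits[:i].count(1))

seq = [3, 1, 4, 1, 5, 2, 6, 5, 3, 5]
wt = WaveletTree(seq)
print(wt.rank(5, 8))   # two 5s among the first eight symbols
```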

Journal ArticleDOI
TL;DR: It is shown that the proposed algorithm with zero initial conditions can monotonically converge to the unique positive definite solutions of the coupled Lyapunov matrix equations if the associated Markovian jump system is stochastically stable.

Proceedings Article
01 Jan 2015
TL;DR: This paper introduces a new active multi-task learning paradigm, which selectively samples effective instances for multi-task learning and casts the selection procedure as a bandit framework, and provides an implementation of the algorithm based on a popular multi-task learning algorithm, the trace-norm regularization method.
Abstract: In multi-task learning, the multiple related tasks allow each one to benefit from the learning of the others, and labeling instances for one task can also affect the other tasks, especially when the task has a small number of labeled data. Thus labeling effective instances across different learning tasks is important for improving the generalization error of all tasks. In this paper, we propose a new active multi-task learning paradigm, which selectively samples effective instances for multi-task learning. Inspired by multi-armed bandits, which can balance the trade-off between exploitation and exploration, we introduce a new active learning strategy and cast the selection procedure as a bandit framework. We consider both the risk of the multi-task learner and the corresponding confidence bounds, and our selection tries to balance this trade-off. Our proposed method is a sequential algorithm, which at each round maintains a sampling distribution on the pool of data, queries the label for an instance according to this distribution, and updates the distribution based on the newly trained multi-task learner. We provide an implementation of our algorithm based on a popular multi-task learning algorithm, namely the trace-norm regularization method. Theoretical guarantees are developed by exploiting Rademacher complexities. Comprehensive experiments show the effectiveness and efficiency of the proposed approach.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: By benchmarking against the Cramér-Rao Bound, it is shown that the proposed algorithm achieves near-optimal performance under a variety of settings, and the algorithm is compared with the classical MUSIC and more recent Lasso algorithms in terms of estimation accuracy and computational complexity.
Abstract: We propose a fast sequential algorithm for the fundamental problem of estimating continuous-valued frequencies and amplitudes using samples of a noisy mixture of sinusoids. Each step consists of two phases: detection of a new sinusoid, and refinement of the parameters of already detected sinusoids. The detection phase is performed on an oversampled DFT grid, while the refinement phase enables continuous-valued estimation, thus avoiding basis mismatch. By benchmarking against the Cramér-Rao Bound, we show that the proposed algorithm achieves near-optimal performance under a variety of settings. We also compare our algorithm with the classical MUSIC, and more recent Lasso algorithms in terms of estimation accuracy and computational complexity.
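
A simplified numpy sketch of the detect-then-refine loop the abstract outlines, assuming complex-exponential components: the coarse peak comes from a zero-padded FFT and the frequency is then refined off-grid by a local bracket search. The joint re-refinement of previously detected sinusoids mentioned in the abstract is omitted here for brevity.

```python
import numpy as np

def estimate_sinusoids(y, k, oversample=8, refine_iters=20):
    """Greedy sequential estimation of k complex sinusoids from samples y[n].
    Detection: peak of an oversampled DFT of the residual (coarse, on-grid).
    Refinement: shrink a frequency bracket around the peak, fit the amplitude
    by least squares, subtract the component, and repeat."""
    n = len(y)
    t = np.arange(n)
    residual = y.astype(complex)
    estimates = []
    for _ in range(k):
        spec = np.fft.fft(residual, oversample * n)          # oversampled DFT grid
        f = np.argmax(np.abs(spec)) / (oversample * n)
        step = 1.0 / (oversample * n)
        for _ in range(refine_iters):                        # off-grid refinement
            cands = [f - step, f, f + step]
            f = max(cands, key=lambda fc: np.abs(np.exp(-2j * np.pi * fc * t) @ residual))
            step /= 2
        atom = np.exp(2j * np.pi * f * t)
        amp = (atom.conj() @ residual) / n                   # least-squares amplitude
        residual = residual - amp * atom
        estimates.append((f, amp))
    return estimates

rng = np.random.default_rng(0)
t = np.arange(128)
y = 1.0 * np.exp(2j * np.pi * 0.1234 * t) + 0.5 * np.exp(2j * np.pi * 0.3371 * t)
y += 0.05 * (rng.standard_normal(128) + 1j * rng.standard_normal(128))
print([(round(f, 4), round(abs(a), 2)) for f, a in estimate_sinusoids(y, 2)])
```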

Journal ArticleDOI
TL;DR: A scale-bias adjustment migration strategy for integrating base and new models, based on the similar nature of the underlying processes, is adopted, and a Bayesian sequential algorithm is proposed for obtaining the statistically most informative data about the migrated model for use in parameter estimation.

Journal ArticleDOI
TL;DR: In this article, an intuitively simple sequential algorithm for the fair division of indivisible items that are strictly ranked by two or more players is proposed, and several properties of the algorithm are analyzed.
Abstract: We propose an intuitively simple sequential algorithm (SA) for the fair division of indivisible items that are strictly ranked by two or more players. We analyze several properties of the allocatio...

Book ChapterDOI
01 Jan 2015
TL;DR: This work investigates a GPU-based implementation of parallel SaDE using NVIDIA's CUDA technology, and aims to accelerate SaDE’s computation speed while maintaining its optimization accuracy.
Abstract: Differential evolution (DE) is a powerful population-based stochastic optimization algorithm, which has demonstrated high efficacy in various scientific and engineering applications. Among numerous variants of DE, self-adaptive differential evolution (SaDE) features the automatic adaptation of the employed search strategy and its accompanying parameters via online learning of the preceding behavior of the already applied strategies and their associated parameter settings. As such, SaDE facilitates the practical use of DE by avoiding the considerable effort of identifying the most effective search strategy and its associated parameters. The original SaDE is a CPU-based sequential algorithm. However, the major algorithmic modules of SaDE are very suitable for parallelization. Given that modern GPUs have become widely affordable while enabling personal computers to carry out massively parallel computing tasks, this work investigates a GPU-based implementation of parallel SaDE using NVIDIA's CUDA technology. We aim to accelerate SaDE's computation speed while maintaining its optimization accuracy. Experimental results on several numerical optimization problems demonstrate the remarkable speedups of the proposed parallel SaDE over the original sequential SaDE across varying problem dimensions and algorithmic population sizes.
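
A small numpy sketch of one classic DE/rand/1/bin generation processed population-wide, to illustrate which modules parallelize naturally; the self-adaptive strategy and parameter learning that define SaDE are not reproduced here, and CPU numpy stands in for the CUDA kernels the paper describes.

```python
import numpy as np

def de_generation(pop, fitness, rng, F=0.5, CR=0.9):
    """One DE/rand/1/bin generation with the whole population processed at
    once. These population-wide mutation / crossover / selection steps are
    the modules that map naturally onto one-thread-per-individual GPU kernels."""
    n, d = pop.shape
    idx = np.array([rng.choice([j for j in range(n) if j != i], 3, replace=False)
                    for i in range(n)])
    a, b, c = pop[idx[:, 0]], pop[idx[:, 1]], pop[idx[:, 2]]
    mutant = a + F * (b - c)                                   # mutation
    cross = rng.random((n, d)) < CR
    cross[np.arange(n), rng.integers(0, d, n)] = True          # force at least one gene
    trial = np.where(cross, mutant, pop)                       # binomial crossover
    better = fitness(trial) < fitness(pop)                     # greedy selection (minimization)
    return np.where(better[:, None], trial, pop)

sphere = lambda x: (x ** 2).sum(axis=1)
rng = np.random.default_rng(1)
pop = rng.uniform(-5, 5, (40, 10))
for _ in range(200):
    pop = de_generation(pop, sphere, rng)
print(float(sphere(pop).min()))
```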

Proceedings ArticleDOI
29 Jun 2015
TL;DR: A parallel version of RBAT using MapReduce is developed, and the results show that scalable anonymization of large transaction datasets can be achieved with MapReduce and that the method scales nearly linearly with the number of processing nodes.
Abstract: Transaction data, such as market basket or diagnostic data, contain sensitive information about individuals. Such data are often disseminated widely to support analytic studies. This raises privacy concerns, as the confidentiality of individuals must be protected. Anonymization is an established methodology to protect transaction data, which can be applied using different algorithms. RBAT is an algorithm for anonymizing transaction data that has many desirable features. These include flexible specification of privacy requirements and the ability to preserve data utility well. However, like most anonymization methods, RBAT is a sequential algorithm that is not scalable to large datasets. This limits the applicability of RBAT in practice. To address this issue, in this paper, we develop a parallel version of RBAT using MapReduce. We partition the data across a cluster of computing nodes and implement the key operations of RBAT in parallel. Our experimental results show that scalable anonymization of large transaction datasets can be achieved using MapReduce and our method can scale nearly linearly with the number of processing nodes.

Journal ArticleDOI
TL;DR: In this article, the authors introduce an algorithm which, in the context of nonlinear regression on vector-valued explanatory variables, aims to choose those combinations of vector components that provide best prediction.
Abstract: We introduce an algorithm which, in the context of nonlinear regression on vector-valued explanatory variables, aims to choose those combinations of vector components that provide best prediction. The algorithm is constructed specifically so that it devotes attention to components that might be of relatively little predictive value by themselves, and so might be ignored by more conventional methodology for model choice, but which, in combination with other difficult-to-find components, can be particularly beneficial for prediction. The design of the algorithm is also motivated by a desire to choose vector components that become redundant once appropriate combinations of other, more relevant components are selected. Our theoretical arguments show these goals are met in the sense that, with probability converging to 1 as sample size increases, the algorithm correctly determines a small, fixed number of variables on which the regression mean, g say, depends, even if dimension diverges to infinity much faster...

Book ChapterDOI
01 Jan 2015
TL;DR: The adaptive sequential algorithm proposed here selects features that, for a given testing sample, maximize the expected reduction of uncertainty about its class, where the uncertainty is updated with the values of the already selected features observed on this testing sample.
Abstract: To support classifier construction, feature selection algorithms reduce the input dimensionality to a subset of the most informative features. Usually, such a subset is fixed and chosen in a preprocessing step before the actual classification. However, when it is difficult to find a small number of features sufficient for classification of all data samples, as in the case of heterogeneous input data, we suggest an adaptive approach in which different features are selected for every testing sample. The adaptive sequential algorithm proposed here selects features that, for a given testing sample, maximize the expected reduction of uncertainty about its class, where the uncertainty is updated with the values of the already selected features observed on this testing sample. The experiments show that, especially when the amount of training data is limited, our adaptive conditional mutual information feature selector outperforms the two most closely related information-based static and adaptive algorithms.
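
A toy sketch of per-sample sequential selection in the spirit described above, using plain empirical mutual information on the training rows consistent with the values already observed on the test sample; this is a simplified stand-in for the paper's expected-uncertainty-reduction criterion and assumes discrete features.

```python
import math
from collections import Counter

def mutual_information(rows, f):
    """Empirical I(C; X_f) over the given (features, class) training rows."""
    n = len(rows)
    pc  = Counter(c for _, c in rows)
    px  = Counter(x[f] for x, _ in rows)
    pxc = Counter((x[f], c) for x, c in rows)
    return sum(k / n * math.log((k / n) / ((px[v] / n) * (pc[c] / n)))
               for (v, c), k in pxc.items())

def adaptive_select(train, test_x, budget):
    """Per-sample sequential selection: keep only the training rows consistent
    with the feature values already observed on the test sample, and pick the
    not-yet-used feature most informative about the class on that reduced set."""
    chosen, rows = [], list(train)
    features = list(range(len(test_x)))
    for _ in range(budget):
        candidates = [f for f in features if f not in chosen]
        if not candidates:
            break
        f = max(candidates, key=lambda g: mutual_information(rows, g))
        chosen.append(f)
        # condition on the observed value; keep the old rows if none match
        rows = [(x, c) for x, c in rows if x[f] == test_x[f]] or rows
    label = Counter(c for _, c in rows).most_common(1)[0][0]
    return chosen, label

train = [((0, 1, 0), "a"), ((0, 1, 1), "a"), ((1, 0, 0), "b"), ((1, 1, 0), "b")]
print(adaptive_select(train, (1, 1, 0), budget=2))
```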

Journal ArticleDOI
TL;DR: A key feature of the proposed algorithm is that it exploits coarse-grained parallelism and the components running in parallel need not synchronize after every iteration; thus the algorithm lends itself to implementation on a geographically dispersed set of processing elements interconnected through a network.
Abstract: We propose a novel parallel algorithm for inferring topic hierarchies using HLDA. We use loosely coupled parallel tasks that do not require frequent synchronization. The parallel algorithm is well suited to be run on distributed computing systems. The proposed algorithm achieves a predictive accuracy on par with that of HLDA. The parallel algorithm exhibits a near-linear speed-up and scales well. The rapid growth of information in the digital world, especially on the web, calls for automated methods of organizing the digital information for convenient access and efficient information retrieval. Topic modeling is a branch of machine learning and probabilistic graphical modeling that helps in arranging the web pages according to their topical structure. The topic distribution over a set of documents (web pages) and the affinity of a document toward a specific topic can be revealed using topic modeling. Topic modeling algorithms are typically computationally expensive due to their iterative nature. Recent research efforts have attempted to parallelize specific topic models and are successful in their attempts. These parallel algorithms, however, have tightly coupled parallel processes which require frequent synchronization, and are also tightly coupled with the underlying topic model which is used for inferring the topic hierarchy. In this paper, we propose a parallel algorithm to infer topic hierarchies from a large-scale document corpus. A key feature of the proposed algorithm is that it exploits coarse-grained parallelism and the components running in parallel need not synchronize after every iteration; thus the algorithm lends itself to implementation on a geographically dispersed set of processing elements interconnected through a network. The parallel algorithm realizes a speed-up of 53.5 on a 32-node cluster of dual-core workstations while achieving approximately the same likelihood or predictive accuracy as the sequential algorithm with respect to the performance of information retrieval tasks.

Posted Content
TL;DR: In this paper, the authors develop a framework for bringing existing algorithms in the sequential setting to the distributed setting, achieving near-optimal approximation ratios for many settings in only a constant number of MapReduce rounds.
Abstract: A wide variety of problems in machine learning, including exemplar clustering, document summarization, and sensor placement, can be cast as constrained submodular maximization problems. A lot of recent effort has been devoted to developing distributed algorithms for these problems. However, these results suffer from a high number of rounds, suboptimal approximation ratios, or both. We develop a framework for bringing existing algorithms in the sequential setting to the distributed setting, achieving near-optimal approximation ratios for many settings in only a constant number of MapReduce rounds. Our techniques also give a fast sequential algorithm for non-monotone maximization subject to a matroid constraint.
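
For reference, the plain sequential greedy routine that such distributed frameworks typically build on (cardinality-constrained, monotone case); a sketch only, with max-coverage as the toy objective.

```python
def greedy_submodular(ground_set, f, k):
    """Standard sequential greedy for monotone submodular maximization under
    a cardinality constraint: repeatedly add the element with the largest
    marginal gain. For monotone submodular f this gives a (1 - 1/e) factor."""
    S = set()
    for _ in range(k):
        gains = {e: f(S | {e}) - f(S) for e in ground_set - S}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        S.add(best)
    return S

# Max-coverage example: f(S) = number of items covered by the chosen sets.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}
coverage = lambda S: len(set().union(*[sets[s] for s in S]))
print(greedy_submodular(set(sets), coverage, k=2))   # {'a', 'c'}
```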

Journal ArticleDOI
TL;DR: In this article, the authors study sequential prediction of real-valued, arbitrary, and unknown sequences under the squared error loss as well as the best parametric predictor out of a large, continuous class of predictors.
Abstract: We study sequential prediction of real-valued, arbitrary, and unknown sequences under the squared error loss as well as the best parametric predictor out of a large, continuous class of predictors. Inspired by recent results from computational learning theory, we refrain from any statistical assumptions and define the performance with respect to the class of general parametric predictors. In particular, we present generic lower and upper bounds on this relative performance by transforming the prediction task into a parameter learning problem. We first introduce the lower bounds on this relative performance in the mixture of experts framework, where we show that for any sequential algorithm, there always exists a sequence for which the performance of the sequential algorithm is lower bounded by zero. We then introduce a sequential learning algorithm to predict such arbitrary and unknown sequences, and calculate upper bounds on its total squared prediction error for every bounded sequence. We further show that in some scenarios, we achieve matching lower and upper bounds, demonstrating that our algorithms are optimal in a strong minimax sense such that their performances cannot be improved further. As an interesting result, we also prove that for the worst case scenario, the performance of randomized output algorithms can be achieved by sequential algorithms so that randomized output algorithms do not improve the performance.

Journal ArticleDOI
TL;DR: The proposed sequential algorithm, based on the concept of sequential minimum mean square error (MSE) estimation of the scattering-matrix coefficients, guarantees convergence, and its computational complexity is linear in the number of iterations.
Abstract: In this paper, we propose an adaptive waveform polarization method for the estimation of the target scattering matrix in the presence of clutter. The proposed sequential algorithm, based on the concept of sequential minimum mean square error (MSE) estimation of the scattering-matrix coefficients, guarantees convergence, and its computational complexity is linear in the number of iterations. The effectiveness of the proposed method is validated through numerical results, underlining the performance improvement given by joint transmission and reception (Tx/Rx) polarization optimization for the scalar system. The results also show that the vector system with transmission polarization optimization provides performance comparable to that of the scalar measurement system employing joint Tx/Rx polarization optimization. The lower computational burden highlights the advantage of the former mode.

Posted Content
TL;DR: Blossom-BP, as discussed by the authors, is a distributed version of the celebrated Edmonds' Blossom algorithm that jumps at once over many sub-steps with a single BP run and guarantees termination in O(n^2) BP runs, where n is the number of vertices in the original graph.
Abstract: Max-product Belief Propagation (BP) is a popular message-passing algorithm for computing a Maximum-A-Posteriori (MAP) assignment over a distribution represented by a Graphical Model (GM). It has been shown that BP can solve a number of combinatorial optimization problems including minimum weight matching, shortest path, network flow and vertex cover under the following common assumption: the respective Linear Programming (LP) relaxation is tight, i.e., no integrality gap is present. However, when the LP shows an integrality gap, no model has been known which can be solved systematically via sequential applications of BP. In this paper, we develop the first such algorithm, coined Blossom-BP, for solving the minimum weight matching problem over arbitrary graphs. Each step of the sequential algorithm requires applying BP over a modified graph constructed by contractions and expansions of blossoms, i.e., odd sets of vertices. Our scheme guarantees termination in O(n^2) BP runs, where n is the number of vertices in the original graph. In essence, Blossom-BP offers a distributed version of the celebrated Edmonds' Blossom algorithm by jumping at once over many sub-steps with a single BP run. Moreover, our result provides an interpretation of Edmonds' algorithm as a sequence of LPs.

Proceedings ArticleDOI
06 Dec 2015
TL;DR: This work extends the proof scheme in Pasupathy et al. (2015) to the class of kriging-based simulation-optimization algorithms, shows the parallelism between the two paradigms, and exploits the deterministic counterpart of eTSSO, the more famous Efficient Global Optimization (EGO) procedure, in order to derive eTSSO structural properties.
Abstract: Motivated by our recent extension of the Two-Stage Sequential Algorithm (eTSSO), we propose an adaptation of the framework in Pasupathy et al. (2015) for the study of convergence of kriging-based procedures. Specifically, we extend the proof scheme in Pasupathy et al. (2015) to the class of kriging-based simulation-optimization algorithms. In particular, the asymptotic convergence and the convergence rate of eTSSO are investigated by interpreting the kriging-based search as a stochastic recursion. We show the parallelism between the two paradigms and exploit the deterministic counterpart of eTSSO, the more famous Efficient Global Optimization (EGO) procedure, in order to derive eTSSO structural properties. This work represents a first step towards a general proof framework for the asymptotic convergence and convergence rate analysis of meta-model based simulation-optimization.

Journal ArticleDOI
TL;DR: Five generative models are defined and compared for several time varying features extracted from audio clips that are recorded independently and asynchronously to form a probabilistic framework for alignment of multiple and partially overlapping audio sequences.
Abstract: We formulate the alignment problem of multiple and partially overlapping audio sequences in a probabilistic framework. We define and compare five generative models for several time varying features extracted from audio clips that are recorded independently and asynchronously. For each model, we derive the associated scoring function that evaluates the quality of an alignment. The matching is then achieved with a sequential algorithm. The derived score functions are also able to identify the cases where the sequences do not overlap and handle multiple sequences where no sequence is covering the entire timeline. The simulation results on real data suggest that the approach is able to handle difficult, ambiguous scenarios and partial matchings where simple baseline methods such as correlation fail.

Journal ArticleDOI
TL;DR: A new algorithm for the on-line determination of thicknesses of deposited layers that can be used in the course of coating production with broadband optical monitoring and Simulation and computational manufacturing experiments confirm high accuracy of the proposed algorithm.
Abstract: We present a new algorithm for the on-line determination of thicknesses of deposited layers that can be used in the course of coating production with broadband optical monitoring. The proposed algorithm can be considered as a modification of the well-known sequential algorithm. The main idea of the new algorithm is to re-calculate thicknesses of some of the previously deposited layers along with the determination of the thickness of the last deposited layer. The algorithm implies analytical estimations that enable recalculating only those layer thicknesses that can be found with better accuracy than before. Simulation and computational manufacturing experiments confirm high accuracy of the proposed algorithm.

Journal ArticleDOI
TL;DR: A GPU-based parallel algorithm for scientific workflow scheduling is proposed so that the computing speed can be improved greatly and attains a speed-up factor of 20.7.
Abstract: The scientific workflow scheduling problem is a combinatorial optimization problem. In real applications, a scientific workflow generally has thousands of task nodes, and scheduling a large-scale workflow has a huge computational overhead. In this paper, a parallel algorithm for scientific workflow scheduling is proposed so that the computing speed can be improved greatly. Our method uses ant colony optimization on the GPU, where thousands of GPU threads construct solutions in parallel. The parallel ant colony algorithm for workflow scheduling was implemented in CUDA C. Scheduling problem instances of different scales were tested with both our parallel algorithm and a sequential CPU algorithm. The experimental results on an NVIDIA Tesla M2070 GPU show that our implementation for 1000 task nodes runs in 5 seconds, while a conventional sequential implementation runs in 104 seconds on an Intel Xeon X5650 CPU. Thus, our GPU-based parallel implementation attains a speed-up factor of 20.7.

Journal ArticleDOI
TL;DR: The different types of parallelism in image processing, i.e., data, task and pipeline parallelism, are presented, and three types of operators used for image processing are discussed: point operators, neighborhood operators and global operators.
Abstract: Applications built on sequential algorithms can no longer rely on technology scaling to improve performance. Image processing applications exhibit a high degree of parallelism and are an excellent fit for multi-core platforms. The major challenge of parallel processing is not only to achieve high performance but also to deliver solutions in less time with better utilization of resources. Medical imaging requires more computing power than a traditional sequential computer can provide, and it is essential that images be clear and obtained as quickly as possible. This can be achieved through parallelization, which optimizes the speed at which an image is produced. This paper presents the different types of parallelism in image processing, i.e., data, task and pipeline parallelism. It also discusses three types of operators used for image processing: point operators, neighborhood operators and global operators. Different algorithms for parallel image processing are discussed, and a medical imaging application is described using the workflow engine Taverna for scientific processing.
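
A small Python illustration of the data-parallelism case discussed above: a point operator (per-pixel gamma correction, chosen here as an example) applied to horizontal bands of an image on separate worker processes. Point operators need no communication between bands; neighborhood operators would need overlapping borders, and global operators an extra reduction step.

```python
import numpy as np
from multiprocessing import Pool

def gamma_correct(band, gamma=0.5):
    """Point operator: each output pixel depends only on the matching input
    pixel, so the result is identical however the image is partitioned."""
    return (255.0 * (band / 255.0) ** gamma).astype(np.uint8)

def parallel_point_op(image, op, workers=4):
    bands = np.array_split(image, workers, axis=0)   # data parallelism: split into row bands
    with Pool(workers) as pool:
        return np.vstack(pool.map(op, bands))

if __name__ == "__main__":
    img = (np.random.default_rng(0).random((512, 512)) * 255).astype(np.uint8)
    out = parallel_point_op(img, gamma_correct)
    print(out.shape, int(out.min()), int(out.max()))
```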