Showing papers in "arXiv: Computational Engineering, Finance, and Science in 2015"

Book Chapter

Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS

Páll Szilárd, Mark Abraham, Carsten Kutzner, Berk Hess, Erik Lindahl

02 Jun 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: GROMACS is a widely used package for biomolecular simulation that, over the last two decades, has evolved from small-scale efficiency to advanced heterogeneous acceleration and multi-level parallelism targeting some of the largest supercomputers in the world.

Abstract: GROMACS is a widely used package for biomolecular simulation, and over the last two decades it has evolved from small-scale efficiency to advanced heterogeneous acceleration and multi-level parallelism targeting some of the largest supercomputers in the world. Here, we describe some of the ways we have been able to realize this through the use of parallelization on all levels, combined with a constant focus on absolute performance. Release 4.6 of GROMACS uses SIMD acceleration on a wide range of architectures, GPU offloading acceleration, and both OpenMP and MPI parallelism within and between nodes, respectively. The recent work on acceleration made it necessary to revisit the fundamental algorithms of molecular simulation, including the concept of neighborsearching, and we discuss the present and future challenges we see for exascale simulation - in particular a very fine-grained task parallelism. We also discuss the software management, code peer review and continuous integration testing required for a project of this complexity.

403 citations

Journal Article

Diagnosis of diabetes using classification mining techniques

Aiswarya Iyer, S. Jeyalatha, Ronak Sumbaly

12 Feb 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: The research hopes to propose a quicker and more efficient technique of diagnosing the disease, leading to timely treatment of the patients, by employing Decision Tree and Naive Bayes algorithms.

Abstract: Diabetes has affected over 246 million people worldwide, a majority of them women. According to the WHO report, by 2025 this number is expected to rise to over 380 million. The disease has been named the fifth deadliest disease in the United States with no imminent cure in sight. With the rise of information technology and its continued expansion into the medical and healthcare sector, the cases of diabetes as well as their symptoms are well documented. This paper aims at finding solutions to diagnose the disease by analyzing the patterns found in the data through classification analysis, employing Decision Tree and Naive Bayes algorithms. The research hopes to propose a quicker and more efficient technique of diagnosing the disease, leading to timely treatment of the patients.

212 citations

Journal Article

Linked Component Analysis from Matrices to High Order Tensors: Applications to Biomedical Data

Guoxu Zhou¹, Qibin Zhao, Yu Zhang², Tulay Adali³, Shengli Xie¹, Andrzej Cichocki

Guangdong University of Technology¹, East China University of Science and Technology², University of Maryland, Baltimore³

29 Aug 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: It is shown how constrained multiblock tensor decomposition methods are able to extract similar or statistically dependent common features that are shared by all blocks, by incorporating the multiway nature of data.

Abstract: With the increasing availability of various sensor technologies, we now have access to large amounts of multi-block (also called multi-set, multi-relational, or multi-view) data that need to be jointly analyzed to explore their latent connections. Various component analysis methods have played an increasingly important role for the analysis of such coupled data. In this paper, we first provide a brief review of existing matrix-based (two-way) component analysis methods for the joint analysis of such data with a focus on biomedical applications. Then, we discuss their important extensions and generalization to multi-block multiway (tensor) data. We show how constrained multi-block tensor decomposition methods are able to extract similar or statistically dependent common features that are shared by all blocks, by incorporating the multiway nature of data. Special emphasis is given to the flexible common and individual feature analysis of multi-block data with the aim to simultaneously extract common and individual latent components with desired properties and types of diversity. Illustrative examples are given to demonstrate their effectiveness for biomedical data analysis.

153 citations

Posted Content

Convex Relaxations for Gas Expansion Planning

Conrado Borraz-Sánchez¹, Russell Bent¹, Scott Backhaus¹, Hassan Hijazi², Pascal Van Hentenryck

Los Alamos National Laboratory¹, Australian National University²

24 Jun 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: This study presents a convex mixed-integer second-order cone relaxation for the gas expansion planning problem under steady-state conditions that offers tight lower bounds with high computational efficiency.

Abstract: Expansion of natural gas networks is a critical process involving substantial capital expenditures with complex decision-support requirements. Given the non-convex nature of gas transmission constraints, global optimality and infeasibility guarantees can only be offered by global optimisation approaches. Unfortunately, state-of-the-art global optimisation solvers are unable to scale up to real-world size instances. In this study, we present a convex mixed-integer second-order cone relaxation for the gas expansion planning problem under steady-state conditions. The underlying model offers tight lower bounds with high computational efficiency. In addition, the optimal solution of the relaxation can often be used to derive high-quality solutions to the original problem, leading to provably tight optimality gaps and, in some cases, globally optimal solutions. The convex relaxation is based on a few key ideas, including the introduction of flux direction variables, exact McCormick relaxations, on/off constraints, and integer cuts. Numerical experiments are conducted on the traditional Belgian gas network, as well as other, larger real networks. The results demonstrate both the accuracy and computational speed of the relaxation and its ability to produce high-quality solutions.

96 citations
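
One of the ingredients named in the abstract above, the McCormick relaxation, can be illustrated on a single bilinear term. The sketch below is not the authors' formulation: the function name `mccormick_bounds` and the toy numbers are assumptions made for illustration. It bounds w = x*y over a box and shows that the bounds collapse to the exact product when x sits at one of its bounds (for instance, when x is a binary variable), which is presumably the sense in which such a relaxation can be exact.

```python
def mccormick_bounds(x, y, x_lo, x_hi, y_lo, y_hi):
    """Return (lower, upper) bounds on w = x*y implied by the McCormick envelope.

    Assumes x_lo <= x <= x_hi and y_lo <= y <= y_hi.
    """
    # Two linear underestimators of the product x*y.
    lower = max(
        x_lo * y + x * y_lo - x_lo * y_lo,
        x_hi * y + x * y_hi - x_hi * y_hi,
    )
    # Two linear overestimators of the product x*y.
    upper = min(
        x_hi * y + x * y_lo - x_hi * y_lo,
        x_lo * y + x * y_hi - x_lo * y_hi,
    )
    return lower, upper


# Toy usage (hypothetical numbers): x ranges over [0, 1], y over [-5, 5].
print(mccormick_bounds(0, 3, 0, 1, -5, 5))    # x at its lower bound
print(mccormick_bounds(1, 3, 0, 1, -5, 5))    # x at its upper bound
print(mccormick_bounds(0.5, 3, 0, 1, -5, 5))  # fractional x gives a genuine interval
```

When x is restricted to 0 or 1, the lower and upper bounds coincide with x*y, so the linear constraints represent the product without error; for fractional x they only enclose it.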

Posted Content

Big Data Analytics in Bioinformatics: A Machine Learning Perspective

Hirak J. Kashyap, Hasin Afzal Ahmed, Nazrul Hoque, Swarup Roy, Dhruba K. Bhattacharyya

15 Jun 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: The issues and challenges posed by several big data problems in bioinformatics are addressed, and an overview of the state of the art and the future research opportunities are given.

Abstract: Bioinformatics research is characterized by voluminous and incremental datasets and complex data analytics methods. The machine learning methods used in bioinformatics are iterative and parallel. These methods can be scaled to handle big data using distributed and parallel computing technologies. Usually big data tools perform computation in batch mode and are not optimized for iterative processing and high data dependency among operations. In recent years, parallel, incremental, and multi-view machine learning algorithms have been proposed. Similarly, graph-based architectures and in-memory big data tools have been developed to minimize I/O cost and optimize iterative processing. However, standard big data architectures and tools are lacking for many important bioinformatics problems, such as fast construction of co-expression and regulatory networks and salient module identification, detection of complexes over growing protein-protein interaction data, fast analysis of massive DNA, RNA, and protein sequence data, and fast querying on incremental and heterogeneous disease networks. This paper addresses the issues and challenges posed by several big data problems in bioinformatics, and gives an overview of the state of the art and the future research opportunities.

92 citations

Journal Article

Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering

Chris Gaiteri¹,², Mingming Chen³, Boleslaw K. Szymanski³, Konstantin Kuzmin³, Jierui Xie³,⁴, Changkyu Lee¹, Timothy J. Blanche¹, Elias Chaibub Neto⁵, Su-Chun Huang⁶, Thomas J. Grabowski⁶, Tara Madhyastha⁶, Vitalina Komashko

Allen Institute for Brain Science¹, Rush University Medical Center², Rensselaer Polytechnic Institute³, Samsung⁴, Sage Bionetworks⁵, University of Washington⁶

20 Jan 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: This work proposes an unorthodox clustering method called SpeakEasy which identifies communities using top-down and bottom-up approaches simultaneously and can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes.

Abstract: Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. These detected communities may also be unstable and difficult to replicate, because traditional methods are sensitive to noise and parameter settings. These aspects of traditional clustering methods limit our ability to detect biological communities, and therefore our ability to understand biological functions. To address these limitations and detect robust overlapping biological communities, we propose an unorthodox clustering method called SpeakEasy which identifies communities using top-down and bottom-up approaches simultaneously. Specifically, nodes join communities based on their local connections, as well as global information about the network structure. This method can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes. SpeakEasy shows top performance on synthetic clustering benchmarks and accurately identifies meaningful biological communities in a range of datasets, including: gene microarrays, protein interactions, sorted cell populations, electrophysiology and fMRI brain imaging.

62 citations

Posted Content

The QC Relaxation: Theoretical and Computational Results on Optimal Power Flow

Carleton Coffrin, Hassan Hijazi, Pascal Van Hentenryck

27 Feb 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: This paper's comprehensive computational results show that the QC relaxation may produce significant improvements in accuracy over the SOC relaxation at a reasonable computational cost, especially for networks with tight bounds on phase angle differences.

Abstract: Convex relaxations of the power flow equations and, in particular, the Semi-Definite Programming (SDP) and Second-Order Cone (SOC) relaxations, have attracted significant interest in recent years. The Quadratic Convex (QC) relaxation is a departure from these relaxations in the sense that it imposes constraints to preserve stronger links between the voltage variables through convex envelopes of the polar representation. This paper is a systematic study of the QC relaxation for AC Optimal Power Flow with realistic side constraints. The main theoretical result shows that the QC relaxation is stronger than the SOC relaxation and neither dominates nor is dominated by the SDP relaxation. In addition, comprehensive computational results show that the QC relaxation may produce significant improvements in accuracy over the SOC relaxation at a reasonable computational cost, especially for networks with tight bounds on phase angle differences. The QC and SOC relaxations are also shown to be significantly faster and more reliable than the SDP relaxation given the current state of the respective solvers.

52 citations

Posted Content

Leverage Financial News to Predict Stock Price Movements Using Word Embeddings and Deep Neural Networks

Yangtuo Peng, Hui Jiang¹

York University¹

24 Jun 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: The popular word embedding methods and deep neural networks are applied to leverage financial news to predict stock price movements in the market and can significantly improve the stock prediction accuracy on a standard financial database over the baseline system using only the historical price information.

Abstract: Financial news contains useful information on public companies and the market. In this paper we apply the popular word embedding methods and deep neural networks to leverage financial news to predict stock price movements in the market. Experimental results show that our proposed methods are simple but very effective, and that they can significantly improve the stock prediction accuracy on a standard financial database over the baseline system using only the historical price information.

52 citations

Journal Article

An Optimal Framework for Residential Load Aggregator

Qinran Hu, Fangxing Li

14 Jun 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: This article proposes an optimal framework for residential load aggregators (RLAs) that helps evaluate the contributions of individual residents participating in a demand response (DR) program, distribute the rewards fairly, and perform cost-effective demand reduction requests (DRRs) for load serving entities (LSEs) with minimal reward costs and without affecting residents' living comfort.

Abstract: Due to the development of intelligent demand-side management with automatic control, distributed populations of large residential loads, such as air conditioners (ACs) and electrical water heaters (EWHs), have the opportunity to provide effective demand-side ancillary services for load serving entities (LSEs) to reduce emissions and network operating costs. Most present approaches are limited by 1) the difficulty of efficiently scheduling a large number of appliances in real time, 2) the difficulty of evaluating the contributions of individual residents participating in a demand response (DR) program and of distributing the rewards fairly, and 3) the challenge of performing cost-effective demand reduction requests (DRRs) for LSEs with minimal reward costs while not affecting residents' living comfort. Therefore, this paper presents an optimal framework for residential load aggregators (RLAs) that helps solve the problems mentioned above. Under this framework, RLAs are able to realize DRRs for LSEs and generate optimal control strategies for residential appliances quickly and efficiently. For residents, the framework is designed with a probabilistic model of comfortableness, which minimizes the impact of the DR program on their daily lives. For LSEs, the framework helps minimize the total reward costs of performing DRRs. Moreover, the framework fairly and strategically distributes the financial rewards to residents, which may stimulate the potential capability of loads optimized and controlled by RLAs in demand-side management. The proposed framework has been validated on several numerical case studies.

42 citations

Posted Content

Design optimisation and resource assessment for tidal-stream renewable energy farms using a new continuous turbine approach

Simon W. Funke¹,², Stephan C. Kramer³, Matthew D. Piggott³

University of Oslo¹, Simula Research Laboratory², Imperial College London³

21 Jul 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: In this article, the authors present a new approach for optimising the design of tidal stream turbine farms, where the turbine farm is represented by a turbine density function that specifies the number of turbines per unit area and an associated continuous locally-enhanced bottom friction field.

Abstract: This paper presents a new approach for optimising the design of tidal stream turbine farms. In this approach, the turbine farm is represented by a turbine density function that specifies the number of turbines per unit area and an associated continuous locally-enhanced bottom friction field. The farm design question is formulated as a mathematical optimisation problem constrained by the shallow water equations and solved with efficient, gradient-based optimisation methods. The resulting method is accurate, computationally efficient, allows complex installation constraints, and supports different goal quantities such as to maximise power or profit. The outputs of the optimisation are the optimal number of turbines, their location within the farm, the overall farm profit, the farm's power extraction, and the installation cost. We demonstrate the capabilities of the method on a validated numerical model of the Pentland Firth, Scotland. We optimise the design of four tidal farms simultaneously, as well as individually, and study how farms in close proximity may impact upon one another.

34 citations

Proceedings Article

MATEX: A Distributed Framework for Transient Simulation of Power Distribution Networks

Hao Zhuang¹, Shih-Hung Weng², Jeng-Hau Lin¹, Chung-Kuan Cheng¹

University of California, San Diego¹, Facebook²

14 Nov 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: This paper proposes MATEX, a distributed framework for transient simulation of power distribution networks (PDNs) that uses a matrix exponential kernel with Krylov subspace approximations to solve the differential equations of linear circuits.

Abstract: We propose MATEX, a distributed framework for transient simulation of power distribution networks (PDNs). MATEX utilizes a matrix exponential kernel with Krylov subspace approximations to solve the differential equations of linear circuits. First, the whole simulation task is divided into subtasks based on decompositions of current sources, in order to reduce the computational overhead. Then these subtasks are distributed to different computing nodes and processed in parallel. Within each node, after the matrix factorization at the beginning of the simulation, the adaptive time-stepping solver runs without extra matrix re-factorizations. MATEX overcomes the stiffness hindrance of previous matrix exponential-based circuit simulators by using a rational Krylov subspace method, which leads to larger step sizes with smaller dimensions of the Krylov subspace bases and greatly accelerates the whole computation. MATEX outperforms both traditional fixed and adaptive time-stepping methods, e.g., achieving a speedup of around 13X over the trapezoidal framework with fixed time step for the IBM power grid benchmarks.

Journal Article

Immersed boundary-finite element model of fluid-structure interaction in the aortic root

Vittoria Flamini¹, Abe DeAnda², Boyce E. Griffith³

New York University¹, University of Texas Medical Branch², University of North Carolina at Chapel Hill³

09 Jan 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: This study extends earlier immersed boundary (IB) models of the aortic root by employing an incompressible hyperelastic model of the mechanics of the sinuses and ascending aorta using a constitutive law fit to experimental data from human aortic root tissue.

Abstract: It has long been recognized that aortic root elasticity helps to ensure efficient aortic valve closure, but our understanding of the functional importance of the elasticity and geometry of the aortic root continues to evolve as increasingly detailed in vivo imaging data become available. Herein, we describe fluid-structure interaction models of the aortic root, including the aortic valve leaflets, the sinuses of Valsalva, the aortic annulus, and the sinotubular junction, that employ a version of Peskin's immersed boundary (IB) method with a finite element (FE) description of the structural elasticity. We develop both an idealized model of the root with three-fold symmetry of the aortic sinuses and valve leaflets, and a more realistic model that accounts for the differences in the sizes of the left, right, and noncoronary sinuses and corresponding valve cusps. As in earlier work, we use fiber-based models of the valve leaflets, but this study extends earlier IB models of the aortic root by employing incompressible hyperelastic models of the mechanics of the sinuses and ascending aorta using a constitutive law fit to experimental data from human aortic root tissue. In vivo pressure loading is accounted for by a backwards displacement method that determines the unloaded configurations of the root models. Our models yield realistic cardiac output at physiological pressures, with low transvalvular pressure differences during forward flow, minimal regurgitation during valve closure, and realistic pressure loads when the valve is closed during diastole. Further, results from high-resolution computations demonstrate that IB models of the aortic valve are able to produce essentially grid-converged dynamics at practical grid spacings for the high-Reynolds number flows of the aortic root.

Posted Content

MOLNs: A cloud platform for interactive, reproducible and scalable spatial stochastic computational experiments in systems biology using PyURDME

Brian Drawert¹, Michael Trogdon¹, Salman Toor², Linda R. Petzold¹, Andreas Hellander¹

University of California, Santa Barbara¹, University of Helsinki²

14 Aug 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models are presented.

Abstract: Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools, a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments.

Journal Article

A domain-level DNA strand displacement reaction enumerator allowing arbitrary non-pseudoknotted secondary structures

Casey Grun, Karthik V. Sarma, Brian R. Wolfe, Seung Woo Shin, Erik Winfree

11 May 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: A new domain-level reaction enumerator, included in the DyNAMiC Workbench Integrated Development Environment, handles arbitrary non-pseudoknotted secondary structures and reaction mechanisms including association and dissociation, 3-way and 4-way branch migration, and direct as well as remote toehold activation.

Abstract: DNA strand displacement systems have proven themselves to be fertile substrates for the design of programmable molecular machinery and circuitry. Domain-level reaction enumerators provide the foundations for molecular programming languages by formalizing DNA strand displacement mechanisms and modeling interactions at the "domain" level - one level of abstraction above models that explicitly describe DNA strand sequences. Unfortunately, the most-developed models currently only treat pseudo-linear DNA structures, while many systems being experimentally and theoretically pursued exploit a much broader range of secondary structure configurations. Here, we describe a new domain-level reaction enumerator that can handle arbitrary non-pseudoknotted secondary structures and reaction mechanisms including association and dissociation, 3-way and 4-way branch migration, and direct as well as remote toehold activation. To avoid polymerization that is inherent when considering general structures, we employ a time-scale separation technique that holds in the limit of low concentrations. This also allows us to "condense" the detailed reactions by eliminating fast transients, with provable guarantees of correctness for the set of reactions and their kinetics. We hope that the new reaction enumerator will be used in new molecular programming languages, compilers, and tools for analysis and verification that treat a wider variety of mechanisms of interest to experimental and theoretical work. We have implemented this enumerator in Python, and it is included in the DyNAMiC Workbench Integrated Development Environment.

Book Chapter

The Synchrosqueezing transform for instantaneous spectral analysis

Gaurav Thakur¹

Mitre Corporation¹

01 Jan 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: An overview of the theory and stability properties of Synchrosqueezing, as well as applications of the technique to topics in cardiology, climate science, and economics are presented.

Abstract: The Synchrosqueezing transform is a time-frequency analysis method that can decompose complex signals into time-varying oscillatory components. It is a form of time-frequency reassignment that is both sparse and invertible, allowing for the recovery of the signal. This article presents an overview of the theory and stability properties of Synchrosqueezing, as well as applications of the technique to topics in cardiology, climate science, and economics.

Posted Content

Adjoint Lattice Boltzmann for Topology Optimization on multi-GPU architecture

Łukasz Łaniewski-Wołłk¹, Jacek Rokicki¹

Warsaw University of Technology¹

20 Jan 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: In this article, a discrete adjoint formulation for a wide class of Lattice Boltzmann Methods (LBM) is proposed, which is used to calculate the sensitivity of the LBM solution to several types of parameters, both global and local.

Abstract: In this paper we present a topology optimization technique applicable to a broad range of flow design problems. We also propose a discrete adjoint formulation effective for a wide class of Lattice Boltzmann Methods (LBM). This adjoint formulation is used to calculate the sensitivity of the LBM solution to several types of parameters, both global and local. The numerical scheme for solving the adjoint problem has many properties of the original system, including locality and explicit time-stepping. Thus it is possible to integrate it with the standard LBM solver, allowing for straightforward and efficient parallelization (overcoming limitations typical for discrete adjoint solvers). This approach is successfully used for channel flow to design a free-topology mixer and a heat exchanger. Both resulting geometries are very complex and maximize their objective functions while keeping viscous losses at an acceptable level.

Posted Content

Synthesising Executable Gene Regulatory Networks from Single-cell Gene Expression Data

Jasmin Fisher¹,², Ali Sinan Köksal³, Nir Piterman⁴, Steven Woodhouse¹

University of Cambridge¹, Microsoft², University of California³, University of Leicester⁴

19 May 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: In this article, the authors introduce the idea of viewing single-cell gene expression profiles as states of an asynchronous Boolean network, and frame model inference as the problem of reconstructing a Boolean network from its state space.

Abstract: Recent experimental advances in biology allow researchers to obtain gene expression profiles at single-cell resolution over hundreds, or even thousands of cells at once. These single-cell measurements provide snapshots of the states of the cells that make up a tissue, instead of the population-level averages provided by conventional high-throughput experiments. This new data therefore provides an exciting opportunity for computational modelling. In this paper we introduce the idea of viewing single-cell gene expression profiles as states of an asynchronous Boolean network, and frame model inference as the problem of reconstructing a Boolean network from its state space. We then give a scalable algorithm to solve this synthesis problem. We apply our technique to both simulated and real data. We first apply our technique to data simulated from a well established model of common myeloid progenitor differentiation. We show that our technique is able to recover the original Boolean network rules. We then apply our technique to a large dataset taken during embryonic development containing thousands of cell measurements. Our technique synthesises matching Boolean networks, and analysis of these models yields new predictions about blood development which our experimental collaborators were able to verify.
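
To make the modelling idea concrete, here is a minimal sketch of an asynchronous Boolean network and its state space. It is an illustration only, not the authors' synthesis algorithm: the two-gene mutual-repression rules and the helper names `async_successors`, `state_space`, and `explains` are hypothetical. Under asynchronous updating, each transition changes the value of exactly one gene, so a candidate rule set can be checked against observed state-to-state transitions.

```python
from itertools import product


def async_successors(state, rules):
    """Return the states reachable from `state` by updating exactly one gene."""
    successors = set()
    for i, rule in enumerate(rules):
        new_value = bool(rule(state))
        if new_value != state[i]:
            # Only gene i changes; all other genes keep their current value.
            successors.add(state[:i] + (new_value,) + state[i + 1:])
    return successors


def state_space(rules):
    """Map every Boolean state to its set of asynchronous successors."""
    n_genes = len(rules)
    return {s: async_successors(s, rules)
            for s in product((False, True), repeat=n_genes)}


def explains(rules, observed_transitions):
    """Check that each observed (source, target) pair is a one-gene update allowed by `rules`."""
    return all(target in async_successors(source, rules)
               for source, target in observed_transitions)


# Hypothetical toy network: two genes that repress each other.
toy_rules = [
    lambda s: not s[1],  # gene 0 is on exactly when gene 1 is off
    lambda s: not s[0],  # gene 1 is on exactly when gene 0 is off
]

graph = state_space(toy_rules)
steady_states = [s for s, nxt in graph.items() if not nxt]
print(steady_states)
```

In this toy case the steady states are the two states in which exactly one gene is on. A synthesis procedure in the spirit of the abstract would search for rules under which a check like `explains` holds for the measured cell states.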

Posted Content

Probabilistic Power Flow Computation via Low-Rank and Sparse Tensor Recovery

Zheng Zhang, Hung D. Nguyen, Konstantin Turitsyn, Luca Daniel

11 Aug 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: A tensor-recovery method to solve probabilistic power flow problems that generates a high-dimensional and sparse generalized polynomial-chaos expansion that provides useful statistical information and can also speed up other essential routines in power systems.

Abstract: This paper presents a tensor-recovery method to solve probabilistic power flow problems. Our approach generates a high-dimensional and sparse generalized polynomial-chaos expansion that provides useful statistical information. The result can also speed up other essential routines in power systems (e.g., stochastic planning, operations and controls). Instead of simulating a power flow equation at all quadrature points, our approach only simulates an extremely small subset of samples. We suggest a model to exploit the underlying low-rank and sparse structure of high-dimensional simulation data arrays, making our technique applicable to power systems with many random parameters. We also present a numerical method to solve the resulting nonlinear optimization problem. Our algorithm is implemented in MATLAB and is verified by several benchmarks in MATPOWER $5.1$. Accurate results are obtained for power systems with up to $50$ independent random parameters, with a speedup factor up to $9\times 10^{20}$.

Journal Article•DOI•

Numerical simulation of skin transport using Parareal

Andreas Kreienbuehl¹, Arne Naegel², Daniel Ruprecht¹, Robert Speck³, Gabriel Wittum², Rolf Krause¹•Institutions (3)

University of Lugano¹, Goethe University Frankfurt², Forschungszentrum Jülich³

12 Feb 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: In this paper, the applicability of the time-parallel Parareal algorithm to a brick and mortar setup, a precursory problem to skin permeation, is investigated.

Abstract: In-silico investigation of skin permeation is an important but also computationally demanding problem. To resolve all scales involved in full detail will not only require exascale computing capacities but also suitable parallel algorithms. This article investigates the applicability of the time-parallel Parareal algorithm to a brick and mortar setup, a precursory problem to skin permeation. The C++ library Lib4PrM implementing Parareal is combined with the UG4 simulation framework, which provides the spatial discretization and parallelization. The combination's performance is studied with respect to convergence and speedup. It is confirmed that anisotropies in the domain and jumps in diffusion coefficients only have a minor impact on Parareal's convergence. The influence of load imbalances in time due to differences in number of iterations required by the spatial solver as well as spatio-temporal weak scaling is discussed.
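
For readers unfamiliar with Parareal, the following Python sketch shows the basic iteration on a scalar test equation y' = λy. It is a minimal illustration, not the Lib4PrM or UG4 implementation: the explicit Euler propagators, the function names, and all parameters are assumptions chosen for brevity. The fine propagations inside each iteration are independent across time slices, which is where the time parallelism would come from; here they simply run in a loop.

```python
def euler(y, lam, dt, steps):
    """Integrate y' = lam * y with `steps` explicit Euler steps of size dt."""
    for _ in range(steps):
        y = y + dt * lam * y
    return y


def serial_fine(y0, lam, t_end, n_slices, fine_steps=100):
    """Reference solution: apply the fine propagator slice after slice."""
    dt = t_end / n_slices
    y = y0
    for _ in range(n_slices):
        y = euler(y, lam, dt / fine_steps, fine_steps)
    return y


def parareal(y0, lam, t_end, n_slices, n_iter, fine_steps=100):
    """Return the Parareal iterate at all slice boundaries after n_iter corrections."""
    dt = t_end / n_slices
    coarse = lambda y: euler(y, lam, dt, 1)
    fine = lambda y: euler(y, lam, dt / fine_steps, fine_steps)

    # Initial guess: one cheap serial sweep with the coarse propagator.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n]))

    for _ in range(n_iter):
        # These evaluations depend only on the previous iterate,
        # so each slice could be handled by a different processor.
        f_old = [fine(u[n]) for n in range(n_slices)]
        g_old = [coarse(u[n]) for n in range(n_slices)]
        # Serial correction sweep: coarse prediction plus fine-minus-coarse defect.
        new = [y0]
        for n in range(n_slices):
            new.append(coarse(new[n]) + f_old[n] - g_old[n])
        u = new
    return u
```

After as many iterations as there are time slices the iterate coincides with the serial fine solution up to round-off, so speedup is only possible when far fewer iterations suffice.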

Journal Article•DOI•

A Critical Survey of Deconvolution Methods for Separating cell-types in Complex Tissues

Shahin Mohammadi¹, Neta Zuckerman², Andrea Goldsmith³, Ananth Grama¹•Institutions (3)

Purdue University¹, Genentech², Stanford University³

15 Oct 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: In this paper, the authors present a survey of models, methods, and assumptions underlying deconvolution techniques, and assess different combinations of these factors and use detailed statistical measures to evaluate their effectiveness.

Abstract: Identifying concentrations of components from an observed mixture is a fundamental problem in signal processing. It has diverse applications in fields ranging from hyperspectral imaging to denoising biomedical sensors. This paper focuses on in-silico deconvolution of signals associated with complex tissues into their constitutive cell-type specific components, along with a quantitative characterization of the cell-types. Deconvolving mixed tissues/cell-types is useful in the removal of contaminants (e.g., surrounding cells) from tumor biopsies, as well as in monitoring changes in the cell population in response to treatment or infection. In these contexts, the observed signal from the mixture of cell-types is assumed to be a linear combination of the expression levels of genes in constitutive cell-types. The goal is to use known signals corresponding to individual cell-types along with a model of the mixing process to cast the deconvolution problem as a suitable optimization problem. In this paper, we present a survey of models, methods, and assumptions underlying deconvolution techniques. We investigate the choice of the different loss functions for evaluating estimation error, constraints on solutions, preprocessing and data filtering, feature selection, and regularization to enhance the quality of solutions, along with the impact of these choices on the performance of regression-based methods for deconvolution. We assess different combinations of these factors and use detailed statistical measures to evaluate their effectiveness. We identify shortcomings of current methods and avenues for further investigation. For many of the identified shortcomings, such as normalization issues and data filtering, we provide new solutions. We summarize our findings in a prescriptive step-by-step process, which can be applied to a wide range of deconvolution problems.
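
The linear mixing assumption described above can be illustrated with a short Python sketch. It is a deliberately simple stand-in for the regression-based methods the survey compares: it uses ordinary least squares and then handles the non-negativity and sum-to-one constraints crudely by clipping and renormalising. The function name and the synthetic data in the usage note are assumptions made for this example.

```python
import numpy as np


def deconvolve(mixture, signatures):
    """Estimate cell-type proportions from a mixed expression profile.

    mixture:    array of shape (genes,), expression of the mixed sample
    signatures: array of shape (genes, cell_types), reference expression per cell type
    """
    # Linear mixing model: mixture is approximately signatures @ proportions.
    coeffs, *_ = np.linalg.lstsq(signatures, mixture, rcond=None)
    # Crude constraint handling (an assumption of this sketch):
    # clip negative estimates, then rescale so the proportions sum to one.
    coeffs = np.clip(coeffs, 0.0, None)
    return coeffs / coeffs.sum()
```

With noiseless synthetic data, for example `signatures = np.random.default_rng(0).uniform(1, 10, size=(50, 3))` and `mixture = signatures @ np.array([0.6, 0.3, 0.1])`, the call `deconvolve(mixture, signatures)` recovers the mixing proportions. Real data would call for the choices of loss function, filtering, and regularization that the survey examines.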

Journal Article•DOI•

Accurate Impedance Calculation for Underground and Submarine Power Cables using MoM-SO and a Multilayer Ground Model

Utkarsh R. Patel¹, Piero Triverio¹•Institutions (1)

University of Toronto¹

17 Mar 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: A multilayer ground model for the recently proposed MoM-SO method is introduced, suitable to accurately predict ground return effects in such scenarios and delivers an accuracy comparable to the finite-element method (FEM).

Abstract: Accurate knowledge of the per-unit-length impedance of power cables is necessary to correctly predict electromagnetic transients in power systems. In particular, skin, proximity, and ground return effects must be properly estimated. In many applications, the medium that surrounds the cable is not uniform and can consist of multiple layers of different conductivity, such as dry and wet soil, water, or air. We introduce a multilayer ground model for the recently proposed MoM-SO method, suitable to accurately predict ground return effects in such scenarios. The proposed technique precisely accounts for skin, proximity, ground and tunnel effects, and is applicable to a variety of cable configurations, including underground and submarine cables. Numerical results show that the proposed method is more accurate than analytic formulas typically employed for transient analyses, and delivers an accuracy comparable to the finite element method (FEM). With respect to FEM, however, MoM-SO is over 1000 times faster, and can calculate the impedance of a submarine cable inside a three-layer medium in 0.10 s per frequency point.

Posted Content•

Reservoir characterization: a machine learning approach

Soumi Chaki

15 Jun 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: The present work describes the development of algorithms to obtain the functional relationships between predictor seismic attributes and target lithological properties and proposes regularization of the target property prior to building a prediction model.

Abstract: Reservoir Characterization (RC) can be defined as the act of building a reservoir model that incorporates all the characteristics of the reservoir that are pertinent to its ability to store hydrocarbons and also to produce them. It is a difficult problem due to non-linear and heterogeneous subsurface properties, and it is associated with a number of complex tasks such as data fusion, data mining, formulation of the knowledge base, and handling of uncertainty. The present work describes the development of algorithms to obtain the functional relationships between predictor seismic attributes and target lithological properties. Seismic attributes are available over a study area with lower vertical resolution. Conversely, well logs and lithological properties are available only at specific well locations in a study area with high vertical resolution. Sand fraction, which represents per unit sand volume within the rock, has a balanced distribution between zero and unity. The thesis addresses the issues of handling the information content mismatch between predictor and target variables and proposes regularization of the target property prior to building a prediction model. In this thesis, two Artificial Neural Network (ANN) based frameworks are proposed to model sand fraction from multiple seismic attributes, without and with well tops information respectively. The performances of the frameworks are quantified in terms of Correlation Coefficient, Root Mean Square Error, Absolute Error Mean, etc.
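
The three evaluation measures named in the abstract can be computed as in the following Python sketch. The function and key names are assumptions made for this example, and the thesis may define or normalise the measures differently.

```python
import numpy as np


def evaluation_metrics(predicted, target):
    """Return correlation coefficient, RMSE, and absolute error mean for two series."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    error = predicted - target
    return {
        # Pearson correlation between predicted and target values.
        "correlation_coefficient": float(np.corrcoef(predicted, target)[0, 1]),
        # Root mean square error.
        "rmse": float(np.sqrt(np.mean(error ** 2))),
        # Mean of the absolute errors.
        "absolute_error_mean": float(np.mean(np.abs(error))),
    }
```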

Journal Article•DOI•

An improved return-mapping scheme for nonsmooth yield surfaces: PART I - the Haigh-Westergaard coordinates

Stanislav Sysala, Martin Cermak, Tomáš Koudelka, Jaroslav Kruis, Jan Zeman, Radim Blaheta

12 Mar 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: In this article, an improved implicit return-mapping scheme for nonsmooth yield surfaces is proposed that systematically builds on a subdifferential formulation of the flow rule, so that the treatment of singular points, such as apices or edges at which the flow direction is multivalued, involves only a uniquely defined set of nonlinear equations, similarly to smooth yield surfaces.

Abstract: The paper is devoted to the numerical solution of elastoplastic constitutive initial value problems. An improved form of the implicit return-mapping scheme for nonsmooth yield surfaces is proposed that systematically builds on a subdifferential formulation of the flow rule. The main advantage of this approach is that the treatment of singular points, such as apices or edges at which the flow direction is multivalued, involves only a uniquely defined set of nonlinear equations, similarly to smooth yield surfaces. This paper (PART I) is focused on isotropic models containing: (a) yield surfaces with one or two apices (singular points) lying on the hydrostatic axis; (b) plastic pseudo-potentials that are independent of the Lode angle; (c) nonlinear isotropic hardening (optionally). It is shown that for some models the improved integration scheme also makes it possible to decide a priori on the type of return and to investigate existence, uniqueness and semismoothness of discretized constitutive operators in implicit form. Further, the semismooth Newton method is introduced to solve incremental boundary-value problems. The paper also contains numerical examples related to slope stability, with an available Matlab implementation.

Posted Content•

A discontinuous Galerkin method for cohesive zone modelling

Peter Hansbo¹, Kent Salomonsson¹•Institutions (1)

Jönköping University¹

04 Feb 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: A discontinuous finite element method for small strain elasticity that allows for cohesive zone modeling and yields a seamless transition between the discontinuous Galerkin method and classical cohesive zone modeling.

Abstract: We propose a discontinuous finite element method for small strain elasticity allowing for cohesive zone modeling. The method yields a seamless transition between the discontinuous Galerkin method and classical cohesive zone modeling. Some relevant numerical examples are presented.

Posted Content•

Reviving the Two-state Markov Chain Approach (Technical Report)

Andrzej Mizera, Jun Pang, Qixia Yuan

08 Jan 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: This paper identifies a problem of generating biased results, due to the size of the initial sample with which the approach needs to start, and proposes a few heuristics to avoid such a pitfall and conducts an extensive experimental comparison of the two-state Markov chain approach and another approach based on the Skart method.

Abstract: Probabilistic Boolean networks (PBNs) are a well-established computational framework for modelling biological systems. The steady-state dynamics of PBNs is of crucial importance in the study of such systems. However, for large PBNs, which often arise in systems biology, obtaining the steady-state distribution poses a significant challenge. In fact, statistical methods for steady-state approximation are the only viable means when dealing with large networks. In this paper, we revive the two-state Markov chain approach presented in the literature. We first identify a problem of generating biased results, due to the size of the initial sample with which the approach needs to start, and we propose a few heuristics to avoid such a pitfall. Second, we conduct an extensive experimental comparison of the two-state Markov chain approach and another approach based on the Skart method, and we show that statistically the two-state Markov chain has a better performance. Finally, we apply this approach to a large PBN model of apoptosis in hepatocytes.
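
As background for the two-state Markov chain idea, the following Python sketch compares the closed-form steady state of a two-state chain with an estimate obtained by simulation. It illustrates only the underlying chain. The way the paper abstracts a large PBN into two meta-states, chooses the initial sample size, and decides when to stop sampling is not reproduced here, and the function names and parameter values are assumptions.

```python
import random


def stationary_distribution(alpha, beta):
    """Steady state of a two-state chain with P(0 -> 1) = alpha and P(1 -> 0) = beta."""
    total = alpha + beta
    return beta / total, alpha / total


def simulate_fraction_in_state_one(alpha, beta, n_steps, seed=0):
    """Estimate the steady-state probability of state 1 from one simulated trajectory."""
    rng = random.Random(seed)
    state, visits = 0, 0
    for _ in range(n_steps):
        if state == 0:
            state = 1 if rng.random() < alpha else 0
        else:
            state = 0 if rng.random() < beta else 1
        visits += state
    return visits / n_steps
```

The accuracy of the simulated estimate depends on the trajectory length and on how strongly consecutive samples are correlated, which is why the choice of sample size matters in statistical steady-state approximation.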

Journal Article•DOI•

Unified way for computing dynamics of Bose-Einstein condensates and degenerate Fermi gases

Krzysztof Gawryluk¹, Tomasz Karpiuk¹, Mariusz Gajda, Kazimierz Rzazewski, Mirosław Brewczyk¹•Institutions (1)

University of Białystok¹

15 May 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: In this paper, the authors presented a very simple and efficient numerical scheme which can be applied to study the dynamics of bosonic systems like, for instance, spinor Bose-Einstein condensates with nonlocal interactions.

Abstract: In this work we present a very simple and efficient numerical scheme which can be applied to study the dynamics of bosonic systems such as, for instance, spinor Bose-Einstein condensates with nonlocal interactions, but which works equally well for Fermi gases. The method we use is a modification of the well-known Split Operator Method (SOM). We carefully examine this algorithm in the case of an $F=1$ spinor Bose-Einstein condensate without and with dipolar interactions and for a strongly interacting two-component Fermi gas. Our extension of the SOM has many advantages: it is fast, stable, and preserves all the physical constraints (constants of motion) to a high level of accuracy.
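
For orientation, here is a minimal Python sketch of the standard Split Operator Method for a single-component wave function in one dimension, in units where ħ = m = 1. It is the textbook scheme, not the authors' modification for spinor condensates or Fermi gases, and the grid size, function names, and harmonic potential in the usage note are assumptions made for this example.

```python
import numpy as np


def make_grid(n=256, length=20.0):
    """Return positions x, angular wavenumbers k, and the grid spacing dx."""
    x = np.linspace(-length / 2, length / 2, n, endpoint=False)
    dx = x[1] - x[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=dx)
    return x, k, dx


def split_operator_step(psi, potential, k, dt):
    """One Strang-split step: half potential kick, full kinetic step, half potential kick."""
    psi = np.exp(-0.5j * dt * potential) * psi
    # The kinetic operator k**2 / 2 is diagonal in Fourier space.
    psi = np.fft.ifft(np.exp(-0.5j * dt * k ** 2) * np.fft.fft(psi))
    return np.exp(-0.5j * dt * potential) * psi


def evolve(psi, potential, k, dt, n_steps):
    """Apply n_steps split-operator steps of size dt."""
    for _ in range(n_steps):
        psi = split_operator_step(psi, potential, k, dt)
    return psi


def norm(psi, dx):
    """Discrete approximation of the integral of |psi|^2."""
    return float(np.sum(np.abs(psi) ** 2) * dx)
```

A typical use is to build the grid with `make_grid()`, take a displaced Gaussian as the initial state, and call `evolve` with the potential `0.5 * x**2`. Because every factor in a step is a pure phase or a Fourier transform pair, the norm is conserved up to round-off, which is one of the constants of motion the abstract refers to.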

Posted Content•

Large-scale linear regression: Development of high-performance routines

Alvaro Frank, Diego Fabregat-Traver, Paolo Bientinesi

29 Apr 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: The design of efficient algorithms that exploit the structure of the OLS problems and eliminate redundant computations is illustrated, and it is shown how to effectively deal with datasets that do not fit in main memory.

Abstract: In statistics, series of ordinary least squares problems (OLS) are used to study the linear correlation among sets of variables of interest; in many studies, the number of such variables is at least in the millions, and the corresponding datasets occupy terabytes of disk space. As the availability of large-scale datasets increases regularly, so does the challenge in dealing with them. Indeed, traditional solvers, which rely on the use of "black-box" routines optimized for one single OLS, are highly inefficient and fail to provide a viable solution for big-data analyses. As a case study, in this paper we consider a linear regression consisting of two-dimensional grids of related OLS problems that arise in the context of genome-wide association analyses, and give a careful walkthrough for the development of ols-grid, a high-performance routine for shared-memory architectures; analogous steps are relevant for tailoring OLS solvers to other applications. In particular, we first illustrate the design of efficient algorithms that exploit the structure of the OLS problems and eliminate redundant computations; then, we show how to effectively deal with datasets that do not fit in main memory; finally, we discuss how to cast the computation in terms of efficient kernels and how to achieve scalability. Importantly, each design decision along the way is justified by simple performance models. ols-grid enables the solution of $10^{11}$ correlated OLS problems operating on terabytes of data in a matter of hours.
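
The idea of eliminating redundant computation across related OLS problems can be illustrated with a small Python sketch. It treats a much simpler setting than the two-dimensional grids in the paper: all problems share one design matrix and differ only in the response vector, so the factorization is computed once and reused. The function name is an assumption, and this is not the ols-grid routine.

```python
import numpy as np


def solve_many_ols(X, Y):
    """Solve min ||X b - y|| for every column y of Y, factorizing X only once.

    X: array of shape (samples, predictors), assumed to have full column rank
    Y: array of shape (samples, problems), one response vector per column
    """
    # Reduced QR factorization, shared by all problems.
    Q, R = np.linalg.qr(X)
    # One matrix product and one solve cover every right-hand side.
    # np.linalg.solve does not exploit that R is triangular; a dedicated
    # triangular solver would be the more efficient kernel.
    return np.linalg.solve(R, Q.T @ Y)
```

Compared with calling a black-box least squares routine once per column, the cost of the factorization is paid a single time and the remaining work is expressed as matrix-matrix operations.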

Posted Content•

Multiscale model reduction for shale gas transport in fractured media

I. Y. Akkutlu¹, Yalchin Efendiev¹,², Maria V. Vasilyeva¹,³•Institutions (3)

Texas A&M University¹, King Abdullah University of Science and Technology², North-Eastern Federal University³

01 Jul 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: A multiscale model reduction technique that describes shale gas transport in fractured media, which considers arbitrary fracture distributions on an unstructured grid; develops GMsFEM for nonlinear flows; and develops online basis function strategies to adaptively improve the convergence.

Abstract: In this paper, we develop a multiscale model reduction technique that describes shale gas transport in fractured media. Due to the pore-scale heterogeneities and processes, we use upscaled models to describe the matrix. We follow our previous work \cite{aes14}, where we derived an upscaled model in the form of a generalized nonlinear diffusion model to describe the effects of kerogen. To model the interaction between the matrix and the fractures, we use the Generalized Multiscale Finite Element Method (GMsFEM). In this approach, the matrix and the fracture interaction is modeled via local multiscale basis functions. We have developed the GMsFEM and applied it to linear flows with horizontal or vertical fracture orientations on a Cartesian fine grid. In this paper, we consider arbitrary fracture orientations, use a triangular fine grid, and develop GMsFEM for nonlinear flows. Moreover, we develop online basis function strategies to adaptively improve the convergence. The number of multiscale basis functions in each coarse region represents the degrees of freedom needed to achieve a certain error threshold. Our approach is adaptive in the sense that the multiscale basis functions can be added in the regions of interest. Numerical results for a two-dimensional problem are presented to demonstrate the efficiency of the proposed approach.

Posted Content•

ASTROMLSKIT: A New Statistical Machine Learning Toolkit: A Platform for Data Analytics in Astronomy

Snehanshu Saha, Surbhi Agrawal, Manikandan. R, Kakoli Bora, Swati Routh, Anand Narasimhamurthy

29 Apr 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: The focus of the paper is the applicability and efficacy of various machine learning algorithms like K Nearest Neighbor (KNN), random forest (RF), decision tree (DT), Support Vector Machine (SVM), Naïve Bayes and Linear Discriminant Analysis (LDA) in analysis and inference of the decision theoretic problems in Astronomy.

Abstract: Astroinformatics is a new impact area in the world of astronomy, occasionally called the final frontier, where astrophysicists, statisticians and computer scientists work together to tackle various data-intensive astronomical problems. Exponential growth in the data volume and increased complexity of the data add difficult questions to the existing challenges. Classical problems in Astronomy are compounded by the accumulation of an astronomical volume of complex data, rendering the task of classification and interpretation incredibly laborious. The presence of noise in the data makes analysis and interpretation even more arduous. Machine learning algorithms and data analytic techniques provide the right platform for the challenges posed by these problems. A diverse range of open problems, such as star-galaxy separation, detection and classification of exoplanets, and classification of supernovae, is discussed. The focus of the paper is the applicability and efficacy of various machine learning algorithms like K Nearest Neighbor (KNN), random forest (RF), decision tree (DT), Support Vector Machine (SVM), Naïve Bayes and Linear Discriminant Analysis (LDA) in analysis and inference of the decision theoretic problems in Astronomy. The machine learning algorithms, integrated into ASTROMLSKIT, a toolkit developed in the course of the work, have been used to analyze HabCat data and supernovae data. Accuracy has been found to be appreciably good.

Posted Content•

Pore-scale lattice Boltzmann simulation of laminar and turbulent flow through a sphere pack

Ehsan Fattahi, Christian Waluga, Barbara Wohlmuth, Ulrich Rüde, Michael Manhart, Rainer Helmig

12 Aug 2015-arXiv: Computational Engineering, Finance, and Science

TL;DR: This work first evaluates the lattice Boltzmann method with various solid-wall boundary treatments and various collision operators to assess their suitability for large-scale direct numerical simulation of porous media flow, and chooses the most efficient combination of the solid boundary condition and collision operator.

Abstract: The lattice Boltzmann method can be used to simulate flow through porous media with full geometrical resolution. With such a direct numerical simulation, it becomes possible to study fundamental effects which are difficult to assess either by developing macroscopic mathematical models or by experiments. We first evaluate the lattice Boltzmann method with various solid-wall boundary treatments and various collision operators to assess their suitability for large-scale direct numerical simulation of porous media flow. A periodic pressure drop boundary condition is used to mimic the pressure-driven flow through the simple sphere pack in a periodic domain. The evaluation of the method is done in the Darcy regime and the results are compared to a semi-analytic solution. Taking into account computational cost and accuracy, we choose the most efficient combination of the solid boundary condition and collision operator. We apply this method to perform simulations for a wide range of Reynolds numbers, from Stokes flow over seven orders of magnitude to turbulent flow. Contours and streamlines of the flow field are presented to show the flow behavior in different flow regimes. Moreover, unknown parameters of the Forchheimer, the Barree-Conway and friction factor models are evaluated numerically for the considered flow regimes.
