Showing papers in "arXiv: Quantitative Methods in 2014"

Journal Article•DOI•

Mitigation of infectious disease at school: targeted class closure vs school closure

Valerio Gemmetto¹, Alain Barrat¹,², Ciro Cattuto¹•Institutions (2)

Institute for Scientific Interchange¹, Aix-Marseille University²

29 Aug 2014-arXiv: Quantitative Methods

TL;DR: In this article, the authors consider mitigation measures that involve the targeted closure of school classes or grades based on readily available information such as the number of symptomatic infectious children in a class.

Abstract: School environments are thought to play an important role in the community spread of airborne infections (e.g., influenza) because of the high mixing rates of school children. The closure of schools has therefore been proposed as an efficient mitigation strategy, albeit with high social and economic costs; alternative, less disruptive interventions are highly desirable. The recent availability of high-resolution contact networks in school environments provides an opportunity to design micro-interventions and compare the outcomes of alternative mitigation measures. We consider mitigation measures that involve the targeted closure of school classes or grades based on readily available information such as the number of symptomatic infectious children in a class. We focus on the case of a primary school for which we have high-resolution data on the close-range interactions of children and teachers. We simulate the spread of an influenza-like illness in this population by using an SEIR model with asymptomatics and compare the outcomes of different mitigation strategies. We find that targeted class closure affords strong mitigation effects: closing a class for a fixed period of time (equal to the sum of the average infectious and latent durations) whenever two infectious individuals are detected in that class decreases the attack rate by almost 70% and strongly decreases the probability of a severe outbreak. The closure of all classes of the same grade mitigates the spread almost as much as closing the whole school. Targeted class closure strategies based on readily available information on symptomatic subjects and on limited information on mixing patterns, such as the grade structure of the school, can be almost as effective as whole-school closure, at a much lower cost. This may inform public health policies for the management and mitigation of influenza-like outbreaks in the community.

170 citations

Posted Content•

Deep Supervised and Convolutional Generative Stochastic Network for Protein Secondary Structure Prediction

Jian Zhou¹, Olga G. Troyanskaya¹•Institutions (1)

Princeton University¹

06 Mar 2014-arXiv: Quantitative Methods

TL;DR: This work presents the supervised extension of GSN, which learns a Markov chain to sample from a conditional distribution, applies it to protein structure prediction, and introduces a convolutional architecture that allows efficient learning across multiple layers of hierarchical representations.

Abstract: Predicting protein secondary structure is a fundamental problem in protein structure prediction. Here we present a new supervised generative stochastic network (GSN) based method to predict local secondary structure with deep hierarchical representations. GSN is a recently proposed deep learning technique (Bengio & Thibodeau-Laufer, 2013) for globally training deep generative models. We present the supervised extension of GSN, which learns a Markov chain to sample from a conditional distribution, and apply it to protein structure prediction. To scale the model to full-sized, high-dimensional data, like protein sequences with hundreds of amino acids, we introduce a convolutional architecture, which allows efficient learning across multiple layers of hierarchical representations. Our architecture uniquely focuses on predicting structured low-level labels informed with both low and high-level representations learned by the model. In our application this corresponds to labeling the secondary structure state of each amino-acid residue. We trained and tested the model on separate sets of non-homologous proteins sharing less than 30% sequence identity. Our model achieves 66.4% Q8 accuracy on the CB513 dataset, better than the previously reported best performance of 64.9% (Wang et al., 2011) for this challenging secondary structure prediction problem.
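The Q8 figure quoted above is a per-residue accuracy over eight secondary-structure states. As a minimal sketch of how such a score can be computed (the function name `q8_accuracy` and the toy label strings are illustrative assumptions, not taken from the paper):

```python
def q8_accuracy(predicted, reference):
    """Fraction of residues whose predicted 8-state label equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if len(reference) == 0:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical toy example: one label character per residue.
reference = "HHHHEEEETT"
predicted = "HHHEEEEETC"
score = q8_accuracy(predicted, reference)
print(f"Q8 accuracy: {score:.3f}")
```

Here two of the ten residues are mislabeled, so the score is 0.8; a value such as 66.4% is the same ratio taken over the residues of the evaluation set.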

110 citations

Posted Content•

Protein Secondary Structure Prediction with Long Short Term Memory Networks

Søren Kaae Sønderby, Ole Winther

25 Dec 2014-arXiv: Quantitative Methods

TL;DR: This work uses a bidirectional recurrent neural network with long short term memory cells for prediction of secondary structure from the amino acid sequence and reports better performance than state of the art on the secondary structure 8-class problem.

Abstract: Prediction of protein secondary structure from the amino acid sequence is a classical bioinformatics problem. Common methods use feed forward neural networks or SVMs combined with a sliding window, as these models do not naturally handle sequential data. Recurrent neural networks are a generalization of the feed forward neural network that naturally handles sequential data. We use a bidirectional recurrent neural network with long short term memory cells for prediction of secondary structure and evaluate using the CB513 dataset. On the secondary structure 8-class problem we report better performance (0.674) than state of the art (0.664). Our model includes feed forward networks between the long short term memory cells, a path that can be further explored.

108 citations

Posted Content•

Odor Landscapes in Turbulent Environments

Antonio Celani¹,², Emmanuel Villermaux³, Massimo Vergassola²,⁴•Institutions (4)

International Centre for Theoretical Physics¹, Pasteur Institute², Institut Universitaire de France³, University of California, San Diego⁴

13 Nov 2014-arXiv: Quantitative Methods

TL;DR: In this paper, a Lagrangian approach to the transport of pheromones by turbulent flows is developed and exploited to predict the statistics of odor detection during olfactory searches.

Abstract: The olfactory system of male moths is exquisitely sensitive to pheromones emitted by females and transported in the environment by atmospheric turbulence. Moths respond to minute amounts of pheromones and their behavior is sensitive to the fine-scale structure of turbulent plumes where pheromone concentration is detectable. The signal of pheromone whiffs is qualitatively known to be intermittent, yet quantitative characterization of its statistical properties is lacking. This challenging fluid dynamics problem is also relevant for entomology, neurobiology and the technological design of olfactory stimulators aimed at reproducing physiological odor signals in well-controlled laboratory conditions. Here, we develop a Lagrangian approach to the transport of pheromones by turbulent flows and exploit it to predict the statistics of odor detection during olfactory searches. The theory yields explicit probability distributions for the intensity and the duration of pheromone detections, as well as their spacing in time. Predictions are favorably tested by using numerical simulations, laboratory experiments and field data for the atmospheric surface layer. The resulting signal of odor detections lends itself to implementation with state-of-the-art technologies and quantifies the amount and the type of information that male moths can exploit during olfactory searches.

107 citations

Posted Content•

Feedback Control as a Framework for Understanding Tradeoffs in Biology

Noah J. Cowan¹, M. Mert Ankarali¹, Jonathan P. Dyhr², Manu S. Madhav³, Eatai Roth², Shahin Sefati³, Simon Sponberg², Sarah A. Stamper³, Eric S. Fortune¹, Thomas L. Daniel²•Institutions (3)

New Jersey Institute of Technology¹, University of Washington², Johns Hopkins University³

24 Feb 2014-arXiv: Quantitative Methods

TL;DR: In this article, the authors focus on four case studies of the sensorimotor dynamics of animals, each of which involves the application of principles from control theory to probe stability and feedback in an organism's response to perturbations.

Abstract: Control theory arose from a need to control synthetic systems. From regulating steam engines to tuning radios to devices capable of autonomous movement, it provided a formal mathematical basis for understanding the role of feedback in the stability (or change) of dynamical systems. It provides a framework for understanding any system with feedback regulation, including biological ones such as regulatory gene networks, cellular metabolic systems, sensorimotor dynamics of moving animals, and even ecological or evolutionary dynamics of organisms and populations. Here we focus on four case studies of the sensorimotor dynamics of animals, each of which involves the application of principles from control theory to probe stability and feedback in an organism's response to perturbations. We use examples from aquatic (electric fish station keeping and jamming avoidance), terrestrial (cockroach wall following) and aerial environments (flight control in moths) to highlight how one can use control theory to understand how feedback mechanisms interact with the physical dynamics of animals to determine their stability and response to sensory inputs and perturbations. Each case study is cast as a control problem with sensory input, neural processing, and motor dynamics, the output of which feeds back to the sensory inputs. Collectively, the interaction of these systems in a closed loop determines the behavior of the entire system.

101 citations

Posted Content•

On the representation of de Bruijn graphs

Rayan Chikhi¹, Antoine Limasset, Shaun D. Jackman, Jared T. Simpson², Paul Medvedev•Institutions (2)

Pennsylvania State University¹, Ontario Institute for Cancer Research²

21 Jan 2014-arXiv: Quantitative Methods

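TL;DR: In this article, a general data structure (DBGFM) for representing the de Bruijn graph is designed and implemented; on a human whole-genome dataset it uses 1.5 GB of space, a 46% improvement over previous approaches, and its frequency-based minimizers allow all maximal simple paths of the graph to be enumerated using only 43 MB of memory.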

Abstract: The de Bruijn graph plays an important role in bioinformatics, especially in the context of de novo assembly. However, the representation of the de Bruijn graph in memory is a computational bottleneck for many assemblers. Recent papers proposed a navigational data structure approach in order to improve memory usage. We prove several theoretical space lower bounds to show the limitation of these types of approaches. We further design and implement a general data structure (DBGFM) and demonstrate its use on a human whole-genome dataset, achieving space usage of 1.5 GB and a 46% improvement over previous approaches. As part of DBGFM, we develop the notion of frequency-based minimizers and show how it can be used to enumerate all maximal simple paths of the de Bruijn graph using only 43 MB of memory. Finally, we demonstrate that our approach can be integrated into an existing assembler by modifying the ABySS software to use DBGFM.
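For readers unfamiliar with the objects named in this abstract, the following is a naive in-memory sketch of a node-centric de Bruijn graph and of enumerating its maximal simple paths. It is not DBGFM and makes no attempt at memory efficiency; the function names are hypothetical, and reverse complements are ignored as a simplifying assumption.

```python
def kmers(reads, k):
    """Set of all length-k substrings of the reads (the graph's nodes)."""
    return {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}


def neighbours(kmer_set):
    """Successor and predecessor lists: u -> v when u[1:] == v[:-1]."""
    succ = {u: [u[1:] + c for c in "ACGT" if u[1:] + c in kmer_set] for u in kmer_set}
    pred = {u: [c + u[:-1] for c in "ACGT" if c + u[:-1] in kmer_set] for u in kmer_set}
    return succ, pred


def spell(path):
    """Concatenate overlapping k-mers along a path into one string."""
    return path[0] + "".join(node[-1] for node in path[1:])


def maximal_simple_paths(kmer_set):
    """Spell every maximal non-branching path of the graph exactly once."""
    succ, pred = neighbours(kmer_set)
    visited, paths = set(), []

    def is_start(u):
        # u continues a path only if it has a single predecessor
        # whose single successor is u itself.
        return not (len(pred[u]) == 1 and len(succ[pred[u][0]]) == 1)

    for u in sorted(kmer_set):
        if not is_start(u):
            continue
        path = [u]
        visited.add(u)
        while len(succ[path[-1]]) == 1:
            v = succ[path[-1]][0]
            if len(pred[v]) != 1 or v in visited:
                break
            path.append(v)
            visited.add(v)
        paths.append(spell(path))

    # Any node left unvisited lies on an isolated cycle with no start node.
    for u in sorted(kmer_set):
        if u in visited:
            continue
        path = [u]
        visited.add(u)
        while succ[path[-1]][0] not in visited:
            path.append(succ[path[-1]][0])
            visited.add(path[-1])
        paths.append(spell(path))
    return paths


linear = maximal_simple_paths(kmers(["ACGTT"], 3))
branched = maximal_simple_paths(kmers(["ACGTT", "ACGTA"], 3))
print(linear, branched)
```

In the branched toy input the k-mer CGT has two successors, so the shared prefix ACGT and the two one-node branches GTA and GTT are reported as separate paths.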

77 citations

Journal Article•DOI•

Synergy and redundancy in the Granger causal analysis of dynamical networks

Sebastiano Stramaglia¹,², Jesus M. Cortes², Daniele Marinazzo³•Institutions (3)

University of Bari¹, Ikerbasque², Ghent University³

20 Mar 2014-arXiv: Quantitative Methods

TL;DR: In this article, the effect of synergy and redundancy on the inference of the information flow between subsystems of a complex network is analyzed by means of Granger causality, and two different strategies (based either on informational content for the candidate driver or on selecting the variables with highest pairwise influences) are considered for partially conditioned Granger causality.

Abstract: We analyze by means of Granger causality the effect of synergy and redundancy in the inference (from time series data) of the information flow between subsystems of a complex network. Whilst we show that fully conditioned Granger causality is not affected by synergy, the pairwise analysis fails to reveal synergetic effects. In cases when the number of samples is low, thus making the fully conditioned approach infeasible, we show that partially conditioned Granger causality is an effective approach if the set of conditioning variables is properly chosen. We consider here two different strategies (based either on informational content for the candidate driver or on selecting the variables with highest pairwise influences) for partially conditioned Granger causality and show that depending on the data structure either one or the other might be valid. On the other hand, we observe that fully conditioned approaches do not work well in the presence of redundancy, thus suggesting the strategy of separating the pairwise links into two subsets: those corresponding to indirect connections of the fully conditioned Granger causality (which should thus be excluded) and links that can be ascribed to redundancy effects and, together with the results from the fully conditioned approach, provide a better description of the causality pattern in the presence of redundancy. We finally apply these methods to two different real datasets. First, analyzing electrophysiological data from an epileptic brain, we show that synergetic effects are dominant just before seizure occurrences. Second, our analysis applied to gene expression time series from HeLa culture shows that the underlying regulatory networks are characterized by both redundancy and synergy.

75 citations

Journal Article•DOI•

The Increase of the Functional Entropy of the Human Brain with Age

Ye Yao¹, Wenlian Lu¹,², Bing Xu¹,², Cuidi Li³, Ching Po Lin⁴, David Waxman², Jianfeng Feng¹,²•Institutions (4)

University of Warwick¹, Fudan University², Shanghai Jiao Tong University³, National Yang-Ming University⁴

08 Jun 2014-arXiv: Quantitative Methods

TL;DR: Analysis of fMRI data from a large dataset of individuals, using resting state BOLD signals, demonstrated that a functional entropy associated with brain activity increases with age, and the entropy of males at birth was lower than that of females.

Abstract: We use entropy to characterize intrinsic ageing properties of the human brain. Analysis of fMRI data from a large dataset of individuals, using resting state BOLD signals, demonstrated that a functional entropy associated with brain activity increases with age. During an average lifespan, the entropy, which was calculated from a population of individuals, increased by approximately 0.1 bits, due to correlations in BOLD activity becoming more widely distributed. We attribute this to the number of excitatory neurons and the excitatory conductance decreasing with age. Incorporating these properties into a computational model leads to quantitatively similar results to the fMRI data. Our dataset involved males and females and we found significant differences between them. The entropy of males at birth was lower than that of females. However, the entropies of the two sexes increase at different rates, and intersect at approximately 50 years; after this age, males have a larger entropy.

63 citations

Journal Article•DOI•

Analysis of the human diseasome reveals phenotype modules across common, genetic, and infectious diseases

Robert Hoehndorf, Paul N. Schofield, Georgios V. Gkoutos

03 Nov 2014-arXiv: Quantitative Methods

TL;DR: This paper applied a semantic text-mining approach to identify the phenotypes (signs and symptoms) associated with over 8,000 diseases and demonstrated that their method generates phenotypes that correctly identify known disease-associated genes in mice and humans with high accuracy.

Abstract: Phenotypes are the observable characteristics of an organism arising from its response to the environment. Phenotypes associated with engineered and natural genetic variation are widely recorded using phenotype ontologies in model organisms, as are signs and symptoms of human Mendelian diseases in databases such as OMIM and Orphanet. Exploiting these resources, several computational methods have been developed for integration and analysis of phenotype data to identify the genetic etiology of diseases or suggest plausible interventions. A similar resource would be highly useful not only for rare and Mendelian diseases, but also for common, complex and infectious diseases. We apply a semantic text-mining approach to identify the phenotypes (signs and symptoms) associated with over 8,000 diseases. We demonstrate that our method generates phenotypes that correctly identify known disease-associated genes in mice and humans with high accuracy. Using a phenotypic similarity measure, we generate a human disease network in which diseases that share signs and symptoms cluster together, and we use this network to identify phenotypic disease modules.

56 citations

Posted Content•

Effective Genetic Risk Prediction Using Mixed Models

David E. Golan¹, Saharon Rosset¹•Institutions (1)

Tel Aviv University¹

12 May 2014-arXiv: Quantitative Methods

TL;DR: It is confirmed that the use of random effects is most beneficial for diseases that are known to be highly polygenic: hypertension (HT) and bipolar disorder (BD).

Abstract: To date, efforts to produce high-quality polygenic risk scores from genome-wide studies of common disease have focused on estimating and aggregating the effects of multiple SNPs. Here we propose a novel statistical approach for genetic risk prediction, based on random and mixed effects models. Our approach (termed GeRSI) circumvents the need to estimate the effect sizes of numerous SNPs by treating these effects as random, producing predictions which are consistently superior to current state of the art, as we demonstrate in extensive simulation. When applying GeRSI to seven phenotypes from the WTCCC study, we confirm that the use of random effects is most beneficial for diseases that are known to be highly polygenic: hypertension (HT) and bipolar disorder (BD). For HT, there are no significant associations in the WTCCC data. The best existing model yields an AUC of 54%, while GeRSI improves it to 59%. For BD, using GeRSI improves the AUC from 55% to 62%. For individuals ranked at the top 10% of BD risk predictions, using GeRSI substantially increases the BD relative risk from 1.4 to 2.5.

50 citations

Journal Article•DOI•

Thermodynamic limits to information harvesting by sensory systems

Stefano Bo¹, Marco Del Giudice¹, Antonio Celani²•Institutions (2)

University of Turin¹, International Centre for Theoretical Physics²

21 Aug 2014-arXiv: Quantitative Methods

TL;DR: In this article, the authors derive an integral fluctuation theorem for the entropy production and a measure of the information accumulated in the memory device, and show that the amount of information is bounded by the average thermodynamic entropy produced by the process.

Abstract: In view of the relation between information and thermodynamics, we investigate how much information about an external protocol can be stored in the memory of a stochastic measurement device given an energy budget. We consider a layered device with a memory component storing information about the external environment by monitoring the history of a sensory part coupled to the environment. We derive an integral fluctuation theorem for the entropy production and a measure of the information accumulated in the memory device. Its most immediate consequence is that the amount of information is bounded by the average thermodynamic entropy produced by the process. At equilibrium no entropy is produced and therefore the memory device does not add any information about the environment to the sensory component. Consequently, if the system operates at equilibrium the addition of a memory component is superfluous. Such a device can be used to model the sensing process of a cell measuring the external concentration of a chemical compound and encoding the measurement in the amount of phosphorylated cytoplasmic proteins.

Journal Article•DOI•

Silicene-based DNA Nucleobase Sensing

Hatef Sadeghi, Steven Bailey, Colin J. Lambert

13 May 2014-arXiv: Quantitative Methods

TL;DR: It is demonstrated that when nucleobases pass through a pore, even after sampling over many orientations, changes in the electrical properties of the ribbon can be used to discriminate between bases.

Abstract: We propose a DNA sequencing scheme based on silicene nanopores. Using first principles theory, we compute the electrical properties of such pores in the absence and presence of nucleobases. Within a two-terminal geometry, we analyze the current-voltage relation in the presence of nucleobases with various orientations. We demonstrate that when nucleobases pass through a pore, even after sampling over many orientations, changes in the electrical properties of the ribbon can be used to discriminate between bases.

Posted Content•

Alignment of cryo-EM movies of individual particles by optimization of image translations

John L. Rubinstein¹, Marcus A. Brubaker¹•Institutions (1)

University of Toronto¹

24 Sep 2014-arXiv: Quantitative Methods

TL;DR: An algorithm is described that allows for individual <1 MDa particle images to be aligned without frame averaging or linear trajectories, and can be used to improve 3-D maps from single particle cryo-EM.

Abstract: Direct detector device (DDD) cameras have revolutionized single particle electron cryomicroscopy (cryo-EM). In addition to an improved camera detective quantum efficiency, acquisition of DDD movies allows for correction of movement of the specimen, due both to instabilities in the microscope specimen stage and electron beam-induced movement. Unlike specimen stage drift, beam-induced movement is not always homogeneous within an image. Local correlation in the trajectories of nearby particles suggests that beam-induced motion is due to deformation of the ice layer. Algorithms have already been described that can correct movement for large regions of frames and for > 1 MDa protein particles. Another algorithm allows individual < 1 MDa protein particle trajectories to be estimated, but requires rolling averages to be calculated from frames and fits linear trajectories for particles. Here we describe an algorithm that allows for individual < 1 MDa particle images to be aligned without frame averaging or linear trajectories. The algorithm maximizes the overall correlation of the shifted frames with the sum of the shifted frames. The optimum in this single objective function is found efficiently by making use of analytically calculated derivatives of the function. To smooth estimates of particle trajectories, rapid changes in particle positions between frames are penalized in the objective function and weighted averaging of nearby trajectories ensures local correlation in trajectories. This individual particle motion correction, in combination with weighting of Fourier components to account for increasing radiation damage in later frames, can be used to improve 3-D maps from single particle cryo-EM.
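The objective described in this abstract (correlation of each shifted frame with the sum of shifted frames, minus a penalty on rapid position changes) can be illustrated with a small sketch. The version below is an assumption-laden simplification: it uses whole-pixel shifts via `numpy.roll`, an unnormalized inner product as the correlation, and a squared-difference penalty with a hypothetical weight `smoothness`; it only evaluates the objective and does not implement the derivative-based optimization or the trajectory averaging mentioned in the abstract.

```python
import numpy as np


def alignment_objective(frames, shifts, smoothness=0.0):
    """Score a set of per-frame (dy, dx) shifts for one particle movie.

    Each frame is moved back by its shift, the moved frames are summed, and the
    score is the total inner product of every moved frame with that sum, minus
    a penalty on frame-to-frame changes in the shifts.
    """
    shifts = np.asarray(shifts, dtype=int)
    moved = np.stack([
        np.roll(frame, (-dy, -dx), axis=(0, 1))
        for frame, (dy, dx) in zip(frames, shifts)
    ])
    total = moved.sum(axis=0)
    correlation = float(np.sum(moved * total))
    jumps = np.diff(shifts, axis=0)
    penalty = smoothness * float(np.sum(jumps ** 2))
    return correlation - penalty


# Synthetic movie: one random "particle" image drifting by known integer shifts.
rng = np.random.default_rng(0)
particle = rng.normal(size=(32, 32))
true_shifts = [(0, 0), (1, 0), (2, 1), (3, 1)]
frames = [np.roll(particle, s, axis=(0, 1)) for s in true_shifts]

aligned = alignment_objective(frames, true_shifts)
unaligned = alignment_objective(frames, [(0, 0)] * len(frames))
print(aligned > unaligned)
```

With the true shifts every moved frame coincides with the original image, so the unpenalized score reaches its maximum; leaving the frames unshifted gives a lower score. Setting `smoothness` above zero trades some of that correlation for smoother trajectories.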

Posted Content•

Learning microbial interaction networks from metagenomic count data

Surojit Biswas¹, Meredith McDonald¹, Derek S. Lundberg¹, Jeffery L. Dangl¹, Vladimir Jojic¹•Institutions (1)

University of North Carolina at Chapel Hill¹

30 Nov 2014-arXiv: Quantitative Methods

TL;DR: In this paper, a Poisson-multivariate normal hierarchical model was developed to learn direct interactions from the count-based output of standard metagenomics sequencing experiments, and the model provided a structured, accurate, and distributionally reasonable way of modeling correlated count based random variables and capturing direct interactions among them.

Abstract: Many microbes associate with higher eukaryotes and impact their vitality. In order to engineer microbiomes for host benefit, we must understand the rules of community assembly and maintenance, which, in large part, demands an understanding of the direct interactions between community members. Toward this end, we've developed a Poisson-multivariate normal hierarchical model to learn direct interactions from the count-based output of standard metagenomics sequencing experiments. Our model controls for confounding predictors at the Poisson layer, and captures direct taxon-taxon interactions at the multivariate normal layer using an $\ell_1$-penalized precision matrix. We show in a synthetic experiment that our method handily outperforms state-of-the-art methods such as SparCC and the graphical lasso (glasso). In a real, in planta perturbation experiment of a nine-member bacterial community, we show our model, but not SparCC or glasso, correctly resolves a direct interaction structure among three community members that associate with Arabidopsis thaliana roots. We conclude that our method provides a structured, accurate, and distributionally reasonable way of modeling correlated count-based random variables and capturing direct interactions among them.

Posted Content•

Delay effects in the response of low grade gliomas to radiotherapy: A mathematical model and its therapeutical implications

Víctor M. Pérez-García¹, Magdalena Urszula Bogdańska², Alicia Martínez-González¹, Juan Belmonte-Beitia¹, Philippe Schucht³, Luis A. Pérez-Romasanta•Institutions (3)

University of Castilla–La Mancha¹, University of Łódź², University of Bern³

12 Jan 2014-arXiv: Quantitative Methods

TL;DR: A mathematical model is constructed describing the basic facts of glioma progression and response to radiotherapy, and radiation fractionation schemes are proposed that might be therapeutically useful by helping to evaluate tumor malignancy while at the same time reducing the toxicity associated with the treatment.

Abstract: Low grade gliomas (LGGs) are a group of primary brain tumors usually encountered in young patient populations. These tumors represent a difficult challenge because many patients survive a decade or more and may be at a higher risk for treatment-related complications. Specifically, radiation therapy is known to have a relevant effect on survival but in many cases it can be deferred to avoid side effects while maintaining its beneficial effect. However, a subset of low-grade gliomas manifests more aggressive clinical behavior and requires earlier intervention. Moreover, the effectiveness of radiotherapy depends on the tumor characteristics. Recently Pallud et al. [Neuro-oncology, 14(4):1-10, 2012] studied patients with LGGs treated with radiation therapy as a first-line therapy and found the counterintuitive result that tumors with a fast response to the therapy had a worse prognosis than those responding late. In this paper we construct a mathematical model describing the basic facts of glioma progression and response to radiotherapy. The model also provides an explanation for the observations of Pallud et al. Using the model we propose radiation fractionation schemes that might be therapeutically useful by helping to evaluate the tumor malignancy while at the same time reducing the toxicity associated with the treatment.

Posted Content•

Integrated multimodal network approach to PET and MRI based on multidimensional persistent homology

Hyekyoung Lee¹, Hye Jin Kang¹, Moo K. Chung², Seonhee Lim³, Bung Nyun Kim¹, Dong Soo Lee¹•Institutions (3)

New Generation University College¹, University of Wisconsin-Madison², UPRRP College of Natural Sciences³

17 Oct 2014-arXiv: Quantitative Methods

TL;DR: In this paper, a multidimensional persistent homology approach was proposed to visualize and discriminate the topological change of integrated brain networks by varying not only threshold but also mixing ratios between two different imaging modalities.

Abstract: Finding the underlying relationships among multiple imaging modalities in a coherent fashion is one of the challenging problems in multimodal analysis. In this study, we propose a novel multimodal network approach based on multidimensional persistent homology. In this extension of the previous threshold-free method of persistent homology, we visualize and discriminate the topological change of integrated brain networks by varying not only threshold but also mixing ratios between two different imaging modalities. Moreover, we also propose an integration method for multimodal networks, called one-dimensional projection, with a specific mixing ratio between modalities. We applied the proposed methods to PET and MRI data from 21 autism spectrum disorder (ASD) children and 10 pediatric control subjects. From the results, we found that the brain networks of ASD children and controls differ significantly, with ASD showing asymmetrical changes of connected structures between PET and MRI. The integrated MRI and PET networks showed that ASD children had weaker connections than controls within the visual cortex, between dorsal and ventral parts of the temporal pole, between frontal and parietal regions, and between the left perisylvian and other brain regions. These results provide a multidimensional homological understanding of disease-related PET and MRI networks that discloses the network association with ASD.

Posted Content•

Replicating Kernels with a Short Stride Allows Sparse Reconstructions with Fewer Independent Kernels.

Peter F. Schultz, Dylan M. Paiton, Wei Lu, Garrett T. Kenyon

17 Jun 2014-arXiv: Quantitative Methods

TL;DR: A type of DCN is implemented using a modified Locally Competitive Algorithm to investigate the relationship between the number of kernels, the stride, the receptive field size, and the quality of reconstruction, and it is found that for a given stride and number of kernels, the patch size does not significantly affect reconstruction quality.

Abstract: In sparse coding it is common to tile an image into nonoverlapping patches, and then use a dictionary to create a sparse representation of each tile independently. In this situation, the overcompleteness of the dictionary is the number of dictionary elements divided by the patch size. In deconvolutional neural networks (DCNs), dictionaries learned on nonoverlapping tiles are replaced by a family of convolution kernels. Hence adjacent points in the feature maps (V1 layers) have receptive fields in the image that are translations of each other. The translational distance is determined by the dimensions of V1 in comparison to the dimensions of the image space. We refer to this translational distance as the stride. We implement a type of DCN using a modified Locally Competitive Algorithm (LCA) to investigate the relationship between the number of kernels, the stride, the receptive field size, and the quality of reconstruction. We find, for example, that for 16x16-pixel receptive fields, using eight kernels and a stride of 2 leads to sparse reconstructions of comparable quality as using 512 kernels and a stride of 16 (the nonoverlapping case). We also find that for a given stride and number of kernels, the patch size does not significantly affect reconstruction quality. Instead, the learned convolution kernels have a natural support radius independent of the patch size.
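The two configurations compared in this abstract can be checked with a little arithmetic. The tiled definition (dictionary elements divided by patch size) is the one given above; extending it to the strided case as kernels per stride-by-stride block of pixels is an assumption made here for illustration, based on counting one coefficient per kernel at each feature-map position. The function names are hypothetical.

```python
def tiled_overcompleteness(num_elements, patch_height, patch_width):
    """Dictionary elements divided by the number of pixels in a nonoverlapping patch."""
    return num_elements / (patch_height * patch_width)


def strided_overcompleteness(num_kernels, stride):
    """Coefficients per image pixel when each kernel is replicated every `stride` pixels.

    Assumed counting: the feature map has one position per stride x stride block
    of pixels, and every position carries one coefficient per kernel.
    """
    return num_kernels / stride ** 2


# The nonoverlapping case from the abstract: 512 kernels, 16x16 patches, stride 16.
print(tiled_overcompleteness(512, 16, 16))
print(strided_overcompleteness(512, 16))
# The short-stride case from the abstract: 8 kernels, stride 2.
print(strided_overcompleteness(8, 2))
```

Under this counting, both configurations quoted in the abstract come out at 2 coefficients per pixel, even though one uses 64 times fewer independent kernels than the other.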


Journal Article•DOI•

Computational paradigm for dynamic logic-gates in neuronal activity


Amir Goldental, Shoshana Guberman, Roni Vardi, Ido Kanter

15 Apr 2014-arXiv: Quantitative Methods

TL;DR: This work proposes a new experimentally corroborated paradigm in which the truth tables of the brain's logic-gates are time dependent, i.e., dynamic logic-gates (DLGs), and demonstrates that the underlying biological mechanism is the unavoidable increase of neuronal response latencies to ongoing stimulations, which imposes a non-uniform gradual stretching of network delays.


Abstract: In 1943 McCulloch and Pitts suggested that the brain is composed of reliable logic-gates similar to the logic at the core of today's computers. This framework had a limited impact on neuroscience, since neurons exhibit far richer dynamics. Here we propose a new experimentally corroborated paradigm in which the truth tables of the brain's logic-gates are time dependent, i.e. dynamic logic-gates (DLGs). The truth tables of the DLGs depend on the history of their activity and the stimulation frequencies of their input neurons. Our experimental results are based on a procedure where conditioned stimulations were enforced on circuits of neurons embedded within a large-scale network of cortical cells in vitro. We demonstrate that the underlying biological mechanism is the unavoidable increase of neuronal response latencies to ongoing stimulations, which imposes a non-uniform gradual stretching of network delays. The limited experimental results are confirmed and extended by simulations and theoretical arguments based on identical neurons with a fixed increase of the neuronal response latency per evoked spike. We anticipate that our results will lead to a better understanding of the suitability of this computational paradigm to account for the brain's functionalities and will require the development of new systematic mathematical methods beyond those developed for traditional Boolean algebra.


Journal Article•DOI•

Conversion of cDNA differential display results (DDRT-PCR) into quantitative transcription profiles


Balakrishnan Venkatesh¹, Ursula Hettwer¹, Birger Koopmann¹, Petr Karlovsky¹•Institutions (1)

University of Göttingen¹

01 Nov 2014-arXiv: Quantitative Methods

TL;DR: A data processing procedure for the quantitative analysis of amplified cDNA fragments separated by electrophoresis is developed that provides an open-end alternative to DNA microarray analysis of the transcriptome and is expected to work equally well with DDRT-PCR and cDNA-AFLP data.


Abstract: Background: Gene expression studies on non-model organisms require open-end strategies for transcription profiling. Gel-based analysis of cDNA fragments allows the detection of alterations in gene expression for genes that have neither been sequenced nor are available in cDNA libraries. Commonly used protocols are cDNA Differential Display (DDRT-PCR) and cDNA-AFLP. Both methods have so far been used merely as qualitative gene discovery tools. Results: We developed procedures for the conversion of DDRT-PCR data into quantitative transcription profiles. Amplified cDNA fragments are separated on a DNA sequencer. Data processing consists of four steps: (i) cDNA bands in lanes corresponding to samples treated with the same primer combination are matched in order to identify fragments originating from the same transcript, (ii) intensity of bands is determined by densitometry, (iii) densitometric values are normalized, and (iv) intensity ratio is calculated for each pair of corresponding bands. Transcription profiles are represented by sets of intensity ratios (control vs. treatment) for cDNA fragments defined by primer combination and DNA mobility. We demonstrated the procedure by analyzing DDRT-PCR data on the effect of secondary metabolites of oilseed rape Brassica napus on the transcriptome of the pathogenic fungus Leptosphaeria maculans. Conclusion: We developed a data processing procedure for quantitative analysis of amplified cDNA fragments. The system utilizes common software and provides an open-end alternative to microarray analysis. The processing is expected to work equally well with DDRT-PCR and cDNA-AFLP data and be useful in research on organisms for which microarray analysis is not available or economical.
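The four processing steps listed in the abstract can be illustrated with a minimal sketch. It assumes that densitometry (step ii) has already produced one intensity per band, that bands are matched by a `(primer combination, mobility)` key, that normalization means scaling each lane to a total of 1, and that the ratio is taken as treatment over control. The abstract does not specify the normalization scheme or the ratio direction, so these choices, the function names, and the data are all hypothetical.

```python
def normalize(lane):
    """Step (iii): scale band intensities so the lane sums to 1 (assumed scheme)."""
    total = sum(lane.values())
    return {band: value / total for band, value in lane.items()}


def intensity_ratios(control, treatment):
    """Steps (i) and (iv): match bands by key, then take treatment/control ratios."""
    c, t = normalize(control), normalize(treatment)
    shared = sorted(c.keys() & t.keys())  # bands present in both lanes
    return {band: t[band] / c[band] for band in shared if c[band] > 0}


# Hypothetical densitometric values keyed by (primer combination, mobility).
control = {("P1", 210): 10.0, ("P1", 340): 30.0}
treatment = {("P1", 210): 30.0, ("P1", 340): 10.0}

for band, ratio in intensity_ratios(control, treatment).items():
    print(band, round(ratio, 3))
```

A ratio above 1 then marks a fragment that is relatively more abundant in the treated sample, and a ratio below 1 one that is less abundant.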


Journal Article•DOI•

Differentiating the Lévy walk from a composite correlated random walk


Marie Auger-Méthé, Andrew E. Derocher, Michael J. Plank, Edward A. Codling, Mark A. Lewis

17 Jun 2014-arXiv: Quantitative Methods

TL;DR: In this article, the Lévy walk and the composite correlated random walk and its associated area-restricted search behavior are compared using likelihood functions and associated statistical measures that assess the relative support for and absolute fit of each model.


Abstract: 1. Understanding how to find targets with very limited information is a topic of interest in many disciplines. In ecology, such research has often focused on the development of two movement models: i) the Lévy walk; and ii) the composite correlated random walk and its associated area-restricted search behaviour. Although the processes underlying these models differ, they can produce similar movement patterns. Due to this similarity and because of their disparate formulation, current methods cannot reliably differentiate between these two models. 2. Here, we present a method that differentiates between the two models. It consists of likelihood functions, including one for a hidden Markov model, and associated statistical measures that assess the relative support for and absolute fit of each model. 3. Using a simulation study, we show that our method can differentiate between the two search models over a range of parameter values. Using the movement data of two polar bears (Ursus maritimus), we show that the method can be applied to complex, real-world movement paths. 4. By providing the means to differentiate between the two most prominent search models in the literature, and a framework that could be extended to include other models, we facilitate further research into the strategies animals use to find resources.
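The idea of assessing relative support through likelihoods can be sketched in a much simplified form. The example below is a stand-in, not the authors' method: it compares a pure power-law step-length distribution (the hallmark of a Lévy walk) with a single exponential using maximum likelihood and AIC, and it does not implement the composite correlated random walk or the hidden Markov model likelihood. The function names, the known minimum step length `x_min`, and the simulated data are assumptions made for illustration.

```python
import math
import random


def fit_power_law(steps, x_min):
    """MLE exponent and log-likelihood for p(x) ∝ x^(-mu), x >= x_min."""
    n = len(steps)
    log_sum = sum(math.log(x / x_min) for x in steps)
    mu = 1.0 + n / log_sum
    log_lik = n * math.log(mu - 1.0) - n * math.log(x_min) - mu * log_sum
    return mu, log_lik


def fit_exponential(steps, x_min):
    """MLE rate and log-likelihood for p(x) = lam * exp(-lam * (x - x_min))."""
    n = len(steps)
    lam = n / sum(x - x_min for x in steps)
    log_lik = n * math.log(lam) - n  # because lam * sum(x - x_min) == n
    return lam, log_lik


def aic(log_lik, n_params=1):
    """Akaike information criterion; lower means more relative support."""
    return 2 * n_params - 2 * log_lik


# Simulate step lengths from a power law by inverse-transform sampling.
rng = random.Random(0)
x_min, true_mu = 1.0, 2.0
steps = [x_min * (1.0 - rng.random()) ** (-1.0 / (true_mu - 1.0))
         for _ in range(1000)]

mu_hat, ll_pl = fit_power_law(steps, x_min)
lam_hat, ll_exp = fit_exponential(steps, x_min)
print("power-law AIC:", aic(ll_pl))
print("exponential AIC:", aic(ll_exp))
```

AIC only ranks the candidates against each other. The abstract also calls for a measure of absolute fit, which this sketch leaves out.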


Posted Content•

An Automated Images-to-Graphs Framework for High Resolution Connectomics


William Gray Roncal¹, William Gray Roncal², Dean M. Kleissas¹, Joshua T. Vogelstein², Priya Manavalan², Kunal Lillaney², Michael Pekala¹, Randal Burns², R. Jacob Vogelstein¹, Carey E. Priebe², Mark A. Chevillet¹, Gregory D. Hager²•Institutions (2)

Johns Hopkins University Applied Physics Laboratory¹, Johns Hopkins University²

25 Nov 2014-arXiv: Quantitative Methods

TL;DR: This manuscript presents the first fully-automated images-to-graphs pipeline (i.e., a pipeline that begins with an imaged volume of neural tissue and produces a brain graph without any human interaction), and develops a metric to assess the quality of the output graphs.


Abstract: Reconstructing a map of neuronal connectivity is a critical challenge in contemporary neuroscience. Recent advances in high-throughput serial section electron microscopy (EM) have produced massive 3D image volumes of nanoscale brain tissue for the first time. The resolution of EM allows for individual neurons and their synaptic connections to be directly observed. Recovering neuronal networks by manually tracing each neuronal process at this scale is unmanageable, and therefore researchers are developing automated image processing modules. Thus far, state-of-the-art algorithms focus only on the solution to a particular task (e.g., neuron segmentation or synapse identification). In this manuscript we present the first fully automated images-to-graphs pipeline (i.e., a pipeline that begins with an imaged volume of neural tissue and produces a brain graph without any human interaction). To evaluate overall performance and select the best parameters and methods, we also develop a metric to assess the quality of the output graphs. We evaluate a set of algorithms and parameters, searching possible operating points to identify the best available brain graph for our assessment metric. Finally, we deploy a reference end-to-end version of the pipeline on a large, publicly available data set. This provides a baseline result and framework for community analysis and future algorithm development and testing. All code and data derivatives have been made publicly available toward eventually unlocking new biofidelic computational primitives and understanding of neuropathologies.
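The final stage of an images-to-graphs pipeline, combining a neuron segmentation with detected synapses to form a brain graph, can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' pipeline or its API: the segmentation is represented as a hypothetical `voxel_to_neuron` lookup, each synapse as a pair of pre- and postsynaptic voxel coordinates, and edge weights as synapse counts.

```python
from collections import Counter


def build_brain_graph(synapses, voxel_to_neuron):
    """Return {(pre_neuron, post_neuron): synapse_count}.

    synapses: iterable of (pre_voxel, post_voxel) coordinate pairs.
    voxel_to_neuron: maps a voxel coordinate to a neuron label
                     (a stand-in for a dense segmentation volume).
    """
    edges = Counter()
    for pre_voxel, post_voxel in synapses:
        pre = voxel_to_neuron.get(pre_voxel)
        post = voxel_to_neuron.get(post_voxel)
        if pre is None or post is None:
            continue  # synapse touches unlabeled tissue; skip it
        edges[(pre, post)] += 1
    return dict(edges)


# Hypothetical toy data: three labeled neurons and four detected synapses.
voxel_to_neuron = {(0, 0, 0): 1, (0, 0, 1): 2, (5, 5, 5): 2, (9, 9, 9): 3}
synapses = [
    ((0, 0, 0), (0, 0, 1)),
    ((0, 0, 0), (5, 5, 5)),
    ((9, 9, 9), (0, 0, 0)),
    ((7, 7, 7), (0, 0, 0)),  # presynaptic voxel is unlabeled
]
graph = build_brain_graph(synapses, voxel_to_neuron)
print(graph)  # prints: {(1, 2): 2, (3, 1): 1}
```

Errors in either input (merged neurons, missed synapses) change the resulting edges, which is why the abstract evaluates the pipeline with a metric defined on the output graph rather than on each module separately.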


Posted Content•

Identifying recombination hotspots using population genetic data


Adam Auton, Simon Myers, Gil McVean

17 Mar 2014-arXiv: Quantitative Methods

TL;DR: This work presents a method for inferring the location of recombination hotspots from patterns of linkage disequilibrium within samples of population genetic data, and shows that it has hotspot detection power of approximately 50-60%, depending on the magnitude of the hotspot.


Abstract: Motivation: Recombination rates vary considerably at the fine scale within mammalian genomes, with the majority of recombination occurring within hotspots of ~2 kb in width. We present a method for inferring the location of recombination hotspots from patterns of linkage disequilibrium within samples of population genetic data. Results: Using simulations, we show that our method has hotspot detection power of approximately 50-60%, depending on the magnitude of the hotspot. The false positive rate is between 0.24 and 0.56 false positives per Mb for data typical of humans. Availability: this http URL


Posted Content•

Extreme protraction for low grade gliomas: Theoretical proof of concept of a novel therapeutical strategy


Víctor M. Pérez-García

07 Jul 2014-arXiv: Quantitative Methods

TL;DR: In this article, the authors used mathematical models describing the growth of grade II gliomas in response to radiotherapy and found that enlarging substantially the time interval between RT fractions may lead to a better tumor control.


Abstract: Grade II gliomas are slowly growing primary brain tumors that affect mostly young patients and become fatal after a few years. Current clinical handling includes surgery as first line treatment. Cytotoxic therapies (radiotherapy RT or chemotherapy QT) are used initially only for patients having a bad prognosis. Therapies are administered following the 'maximum dose in minimum time' principle, which is the same schedule used for high grade brain tumors. Using mathematical models describing the growth of these tumors in response to radiotherapy, we find that an extreme protraction therapeutical strategy, i.e. enlarging substantially the time interval between RT fractions, may lead to a better tumor control. Explicit formulas are found providing the optimal spacing between doses, in very good agreement with the simulations of the full three-dimensional mathematical model approximating the tumor spatio-temporal dynamics. This idea, although breaking the well-established paradigm, has biological meaning since in these slowly growing tumors it may be more favourable to treat the tumor as the different tumor subpopulations move to more sensitive phases of the cell cycle.


Posted Content•

Annotating Synapses in Large EM Datasets


Stephen M. Plaza, Toufiq Parag, Gary B. Huang, Donald J. Olbris, Mathew A. Saunders, Patricia K. Rivlin

05 Sep 2014-arXiv: Quantitative Methods

TL;DR: This work introduces a large-scale, high-throughput, and semi-automated methodology to efficiently identify synapses and successfully applied this methodology to the Drosophila medulla optic lobe, annotating many more synapses than previous connectome efforts.


Abstract: Reconstructing neuronal circuits at the level of synapses is a central problem in neuroscience and becoming a focus of the emerging field of connectomics. To date, electron microscopy (EM) is the most proven technique for identifying and quantifying synaptic connections. As advances in EM make acquiring larger datasets possible, subsequent manual synapse identification (i.e., proofreading) for deciphering a connectome becomes a major time bottleneck. Here we introduce a large-scale, high-throughput, and semi-automated methodology to efficiently identify synapses. We successfully applied our methodology to the Drosophila medulla optic lobe, annotating many more synapses than previous connectome efforts. Our approaches are extensible and will make the often complicated process of synapse identification accessible to a wider community of potential proofreaders.


Journal Article•DOI•

Performance evaluation of DNA copy number segmentation methods


Morgane Pierre-Jean, Guillem Rigaill¹, Pierre Neuvial•Institutions (1)

University of Évry Val d'Essonne¹

28 Feb 2014-arXiv: Quantitative Methods

TL;DR: This study indicates that no single method is uniformly better than all others, and helps identify pros and cons of the compared methods as a function of biologically informative parameters, such as the fraction of tumor cells in the sample and the proportion of heterozygous markers.


Abstract: A number of bioinformatic or biostatistical methods are available for analyzing DNA copy number profiles measured from microarray or sequencing technologies. In the absence of rich enough gold standard data sets, the performance of these methods is generally assessed using unrealistic simulation studies, or based on small real data analyses. We have designed and implemented a framework to generate realistic DNA copy number profiles of cancer samples with known truth. These profiles are generated by resampling real SNP microarray data from genomic regions with known copy-number state. The original real data have been extracted from dilution series of tumor cell lines with matched blood samples at several concentrations. Therefore, the signal-to-noise ratio of the generated profiles can be controlled through the (known) percentage of tumor cells in the sample. In this paper, we describe this framework and illustrate some of the benefits of the proposed data generation approach on a practical use case: a comparison study between methods for segmenting DNA copy number profiles from SNP microarrays. This study indicates that no single method is uniformly better than all others. It also helps identify pros and cons of the compared methods as a function of biologically informative parameters, such as the fraction of tumor cells in the sample and the proportion of heterozygous markers. Availability: R package jointSeg: this http URL_id=1562
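The resampling idea described in the abstract, drawing observed signal values from regions of known copy-number state so that the true breakpoints are known by construction, can be sketched as below. This is a minimal illustration in Python, not the jointSeg R package or its API. The pool values, state names, and the function `simulate_profile` are hypothetical.

```python
import random


def simulate_profile(pools, segments, rng):
    """Build a profile with known truth by resampling per-state pools.

    pools: maps a copy-number state to observed signal values from
           regions known to be in that state.
    segments: list of (state, length) pairs, in genomic order.
    Returns (signal, breakpoints), where breakpoints are the indices
    at which a new segment starts (excluding 0).
    """
    signal, ends = [], []
    for state, length in segments:
        signal.extend(rng.choices(pools[state], k=length))  # with replacement
        ends.append(len(signal))
    return signal, ends[:-1]


# Hypothetical pools of log-ratio values for two copy-number states.
pools = {
    "normal": [-0.05, 0.02, 0.00, 0.04, -0.01],
    "gain": [0.45, 0.52, 0.48, 0.55],
}
segments = [("normal", 5), ("gain", 3), ("normal", 4)]

signal, breakpoints = simulate_profile(pools, segments, random.Random(1))
print(len(signal), breakpoints)  # prints: 12 [5, 8]
```

A segmentation method can then be scored by comparing its estimated breakpoints with the returned true ones. In the framework described above, the difficulty would be varied by drawing the pools from dilutions with different tumor cell percentages.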


Journal Article•DOI•

Modeling-based determination of physiological parameters of systemic VOCs by breath gas analysis, part 2


Clemens Ager, Karl Unterkofler, Pawel Mochalski, Susanne Teschl, Gerald Teschl, Chris A. Mayhew, Julian King

24 Oct 2014-arXiv: Quantitative Methods

TL;DR: In this article, the influence of inhaled concentrations on exhaled breath concentrations for VOCs with higher Henry constants was investigated and an additional compartment was added to account for upper airway influence.


Abstract: In a recent paper we presented a simple two compartment model which describes the influence of inhaled concentrations on exhaled breath concentrations for volatile organic compounds (VOCs) with small Henry constants. In this paper we extend this investigation concerning the influence of inhaled concentrations on exhaled breath concentrations for VOCs with higher Henry constants. To this end we extend our model with an additional compartment which takes into account the influence of the upper airways on exhaled breath VOC concentrations.


Journal Article•DOI•

An adaptive multi-level simulation algorithm for stochastic biological systems


Christopher Lester, Christian A. Yates¹, Michael B. Giles, Ruth E. Baker•Institutions (1)

University of Bath¹

05 Sep 2014-arXiv: Quantitative Methods

TL;DR: This work extends the multi-level method of Anderson and Higham, which estimates system statistics from paired sample paths of differing accuracy, with an adaptive time-stepping approach in which tau is chosen according to the stochastic behaviour of each sample path.


Abstract: Discrete-state, continuous-time Markov models are widely used in the modeling of biochemical reaction networks. Their complexity often precludes analytic solution, and we rely on stochastic simulation algorithms to estimate system statistics. The Gillespie algorithm is exact, but computationally costly as it simulates every single reaction. As such, approximate stochastic simulation algorithms such as the tau-leap algorithm are often used. Although these are potentially more efficient computationally, the system statistics generated suffer from significant bias unless tau is relatively small, in which case the computational time can be comparable to that of the Gillespie algorithm. The multi-level method (Anderson and Higham, Multiscale Model. Simul. 2012) tackles this problem. A base estimator is computed using many (cheap) sample paths at low accuracy. The bias inherent in this estimator is then reduced using a number of corrections. Each correction term is estimated using a collection of paired sample paths where one path of each pair is generated at a higher accuracy than the other (and is therefore more expensive). By sharing random variables between these paired paths the variance of each correction estimator can be reduced. This renders the multi-level method very efficient as only a relatively small number of paired paths are required to calculate each correction term. In the original multi-level method, each sample path is simulated using the tau-leap algorithm with a fixed value of tau. This approach can result in poor performance when the reaction activity of a system changes substantially over the timescale of interest. By introducing a novel, adaptive time-stepping approach where tau is chosen according to the stochastic behaviour of each sample path we extend the applicability of the multi-level method to such cases. We demonstrate the efficiency of our method using a number of examples.
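The trade-off between the exact Gillespie algorithm and fixed-step tau-leaping can be seen on a single decay reaction, X -> ∅, with propensity `rate * X`. The sketch below is illustrative only: it shows the two base simulators and the bias of tau-leaping, not the multi-level estimator or the adaptive choice of tau proposed in the paper. The function names, parameter values, and the simple Poisson sampler are assumptions made for the example.

```python
import math
import random


def gillespie_decay(x0, rate, t_end, rng):
    """Exact SSA: simulate every reaction event; return X(t_end)."""
    x, t = x0, 0.0
    while x > 0:
        t += rng.expovariate(rate * x)  # waiting time to the next event
        if t > t_end:
            break
        x -= 1
    return x


def poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def tau_leap_decay(x0, rate, t_end, tau, rng):
    """Fixed-step tau-leap; assumes t_end is a multiple of tau."""
    x = x0
    for _ in range(round(t_end / tau)):
        fired = poisson(rate * x * tau, rng)  # events in one leap
        x = max(x - fired, 0)  # clamp, since a leap can overshoot zero
    return x


rng = random.Random(42)
runs, x0, rate, t_end = 500, 100, 1.0, 1.0
exact_mean = sum(gillespie_decay(x0, rate, t_end, rng) for _ in range(runs)) / runs
leap_mean = sum(tau_leap_decay(x0, rate, t_end, 0.1, rng) for _ in range(runs)) / runs
print("analytic mean:", x0 * math.exp(-rate * t_end))
print("Gillespie mean:", exact_mean)
print("tau-leap mean (tau=0.1):", leap_mean)
```

For this reaction each leap multiplies the expected population by `1 - rate * tau`, so with tau = 0.1 the tau-leap mean settles near 100 × 0.9¹⁰ ≈ 34.9 rather than the exact 100 × e⁻¹ ≈ 36.8. A multi-level scheme is designed to remove this kind of bias with correction terms.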


Posted Content•

Collaborative Regression


Samuel Gross, Robert Tibshirani

22 Jan 2014-arXiv: Quantitative Methods

TL;DR: In this article, a convex method is proposed for sparse supervised canonical correlation analysis (sparse sCCA), a specific case of sparse mCCA in which one of the data sets is a vector.


Abstract: We consider the scenario where one observes an outcome variable and sets of features from multiple assays, all measured on the same set of samples. One approach that has been proposed for dealing with this type of data is "sparse multiple canonical correlation analysis" (sparse mCCA). All of the current sparse mCCA techniques are biconvex and thus have no guarantees about reaching a global optimum. We propose a method for performing sparse supervised canonical correlation analysis (sparse sCCA), a specific case of sparse mCCA when one of the datasets is a vector. Our proposal for sparse sCCA is convex and thus does not face the same difficulties as the other methods. We derive efficient algorithms for this problem, and illustrate their use on simulated and real data.


Posted Content•

Parametric Inference using Persistence Diagrams: A Case Study in Population Genetics


Kevin J. Emmett¹, Daniel I. S. Rosenbloom¹, Pablo G. Camara, Raul Rabadan¹•Institutions (1)

Columbia University¹

18 Jun 2014-arXiv: Quantitative Methods

TL;DR: This work shows that, in certain models, parametric inference can be performed using statistics defined on the computed invariants of persistent homology, and develops this idea with a model from population genetics, the coalescent with recombination.


Abstract: Persistent homology computes topological invariants from point cloud data. Recent work has focused on developing statistical methods for data analysis in this framework. We show that, in certain models, parametric inference can be performed using statistics defined on the computed invariants. We develop this idea with a model from population genetics, the coalescent with recombination. We apply our model to an influenza dataset, identifying two scales of topological structure which have a distinct biological interpretation.


Posted Content•

Information processing in living systems


Gašper Tkačik¹, William Bialek²•Institutions (2)

Institute of Science and Technology Austria¹, Princeton University²

30 Dec 2014-arXiv: Quantitative Methods

TL;DR: In this article, the authors explore examples where it has been possible to measure, directly, the flow of information in biological networks, or more generally where information theoretic ideas have been used to guide the analysis of experiments.


Abstract: Life depends as much on the flow of information as on the flow of energy. Here we review the many efforts to make this intuition precise. Starting with the building blocks of information theory, we explore examples where it has been possible to measure, directly, the flow of information in biological networks, or more generally where information theoretic ideas have been used to guide the analysis of experiments. Systems of interest range from single molecules (the sequence diversity in families of proteins) to groups of organisms (the distribution of velocities in flocks of birds), and all scales in between. Many of these analyses are motivated by the idea that biological systems may have evolved to optimize the gathering and representation of information, and we review the experimental evidence for this optimization, again across a wide range of scales.
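One of the building blocks of information theory that such analyses rest on is the mutual information between an input and an output. The sketch below computes it, in bits, from a joint probability table. It is a generic textbook calculation, not taken from the review, and the function name and the two toy distributions are assumptions chosen for illustration.

```python
import math


def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table joint[i][j] = P(x_i, y_j)."""
    px = [sum(row) for row in joint]        # marginal over rows
    py = [sum(col) for col in zip(*joint)]  # marginal over columns
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # terms with p = 0 contribute nothing
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi


# Output independent of input: no information is transmitted.
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # prints: 0.0

# Output is a perfect copy of a fair binary input: one bit is transmitted.
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))  # prints: 1.0
```

Measured joint distributions of signals and responses can be passed through the same calculation, although estimates from finite samples are biased and need correction in practice.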
