Showing papers by "Ron Weiss" published in 2011


Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations


Journal ArticleDOI
02 Sep 2011-Science
TL;DR: A scalable transcriptional/posttranscriptional synthetic regulatory circuit—a cell-type “classifier”— is shown that selectively identifies HeLa cancer cells and triggers apoptosis without affecting non-HeLa cell types.
Abstract: Engineered biological systems that integrate multi-input sensing, sophisticated information processing, and precisely regulated actuation in living cells could be useful in a variety of applications. For example, anticancer therapies could be engineered to detect and respond to complex cellular conditions in individual cells with high specificity. Here, we show a scalable transcriptional/posttranscriptional synthetic regulatory circuit--a cell-type "classifier"--that senses expression levels of a customizable set of endogenous microRNAs and triggers a cellular response only if the expression levels match a predetermined profile of interest. We demonstrate that a HeLa cancer cell classifier selectively identifies HeLa cells and triggers apoptosis without affecting non-HeLa cell types. This approach also provides a general platform for programmed responses to other complex cell states.

637 citations
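
The decision rule described in the abstract above (respond only if the sensed microRNA levels match a predetermined profile) can be sketched as simple Boolean logic. This is an illustrative abstraction only, not the authors' circuit: the marker names (`miR-A`, `miR-B`, `miR-C`), the single shared threshold, and the function `matches_profile` are all hypothetical.

```python
from typing import Dict, Set


def matches_profile(
    levels: Dict[str, float],
    high: Set[str],
    low: Set[str],
    threshold: float,
) -> bool:
    """Return True only if every 'high' marker is at or above the threshold
    and every 'low' marker is below it (a logical AND over all inputs).

    Markers missing from `levels` are treated as not expressed (0.0).
    """
    highs_ok = all(levels.get(m, 0.0) >= threshold for m in high)
    lows_ok = all(levels.get(m, 0.0) < threshold for m in low)
    return highs_ok and lows_ok


# Hypothetical profile: two markers expected to be high, one expected to be low.
HIGH = {"miR-A", "miR-B"}
LOW = {"miR-C"}

# Hypothetical normalized expression levels for two cells.
target_cell = {"miR-A": 0.9, "miR-B": 0.8, "miR-C": 0.1}
other_cell = {"miR-A": 0.9, "miR-B": 0.8, "miR-C": 0.7}

target_hit = matches_profile(target_cell, HIGH, LOW, threshold=0.5)
other_hit = matches_profile(other_cell, HIGH, LOW, threshold=0.5)
```

In this toy model only `target_cell` satisfies the whole profile; a single mismatched marker in `other_cell` is enough to suppress the response.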


PatentDOI
TL;DR: The integrated system was tested by co-culturing PAO1 cells on semisolid agar plates together with engineered sentinel E. coli capable of secreting FlgM-CoPy when induced by 3OC12HSL, and optical microscopy results show that the engineered E. …
Abstract: Aspects of the invention relate to compositions and methods for using recombinant cells to sense and destroy specific pathogens.

164 citations


Journal ArticleDOI
05 Aug 2011-PLOS ONE
TL;DR: This work presents a platform that enables synthetic biologists to express desired behavior using a convenient high-level biologically-oriented programming language, Proto, and features biologically relevant compiler optimizations, providing an important foundation for the development of sophisticated biological systems.
Abstract: Background: The field of synthetic biology promises to revolutionize our ability to engineer biological systems, providing important benefits for a variety of applications. Recent advances in DNA synthesis and automated DNA assembly technologies suggest that it is now possible to construct synthetic systems of significant complexity. However, while a variety of novel genetic devices and small engineered gene networks have been successfully demonstrated, the regulatory complexity of synthetic systems that have been reported recently has somewhat plateaued due to a variety of factors, including the complexity of biology itself and the lag in our ability to design and optimize sophisticated biological circuitry. Methodology/Principal Findings: To address the gap between DNA synthesis and circuit design capabilities, we present a platform that enables synthetic biologists to express desired behavior using a convenient high-level biologically-oriented programming language, Proto. The high level specification is compiled, using a regulatory motif based mechanism, to a gene network, optimized, and then converted to a computational simulation for numerical verification. Through several example programs we illustrate the automated process of biological system design with our platform, and show that our compiler optimizations can yield significant reductions in the number of genes (~50%) and latency of the optimized engineered gene networks. Conclusions/Significance: Our platform provides a convenient and accessible tool for the automated design of sophisticated synthetic biological systems, bridging an important gap between DNA synthesis and circuit design capabilities. Our platform is user-friendly and features biologically relevant compiler optimizations, providing an important foundation for the development of sophisticated biological systems.

119 citations


Journal ArticleDOI
TL;DR: A data-driven algorithm for automatically identifying repeated patterns in music which analyzes a feature matrix using shift-invariant probabilistic latent component analysis is described, resulting in an algorithm that is competitive with other state-of-the-art segmentation algorithms based on hidden Markov models and self similarity matrices.
Abstract: We describe a data-driven algorithm for automatically identifying repeated patterns in music which analyzes a feature matrix using shift-invariant probabilistic latent component analysis. We utilize sparsity constraints to automatically identify the number of patterns and their lengths, parameters that would normally need to be fixed in advance, as well as to control the structure of the decomposition. The proposed analysis is applied to beat-synchronous chromagrams in order to concurrently extract recurrent harmonic motifs and their locations within a song. We demonstrate how the analysis can be used to accurately identify riffs in popular music and explore the relationship between the derived parameters and a song's underlying metrical structure. Finally, we show how this analysis can be used for long-term music structure segmentation, resulting in an algorithm that is competitive with other state-of-the-art segmentation algorithms based on hidden Markov models and self similarity matrices.

47 citations


Book ChapterDOI
TL;DR: This chapter outlines work that has been done in developing design principles for robust synthetic circuits, as well as sharing the experiences designing and constructing gene circuits.
Abstract: Phenotypic robustness is a highly sought after goal for synthetic biology. There are many well-studied examples of robust systems in biology, and for the advancement of synthetic biology, particularly in performance-critical applications, fundamental understanding of how robustness is both achieved and maintained is very important. A synthetic circuit may fail to behave as expected for a multitude of reasons, and since many of these failures are difficult to predict a priori, a better understanding of a circuit’s behavior as well as its possible failures are needed. In this chapter, we outline work that has been done in developing design principles for robust synthetic circuits, as well as sharing our experiences designing and constructing gene circuits.

32 citations


Journal ArticleDOI
TL;DR: The proposed algorithm effectively combines information derived from low level perceptual cues, similar to those used by the human auditory system, with higher level information related to speaker identity to derive an EM algorithm for finding the maximum likelihood parameters of the joint model.

24 citations


Patent
22 Jul 2011
TL;DR: High-input detector modules and multi-input biological classifier circuits and systems are provided that integrate sophisticated sensing, information processing, and actuation in living cells and permit new directions in basic biology, biotechnology and medicine.
Abstract: Provided herein are high-input detector modules and multi-input biological classifier circuits and systems that integrate sophisticated sensing, information processing, and actuation in living cells and permit new directions in basic biology, biotechnology and medicine. The multi-input biological classifier circuits described herein comprise synthetic, scaleable transcriptional/post-transcriptional regulatory circuits that are designed to interrogate the status of a cell by simultaneously sensing expression levels of multiple endogenous inputs, such as microRNAs. The classifier circuits then compute whether to trigger a desired output or response if the expression levels match a pre-determined profile of interest.

15 citations


Proceedings ArticleDOI
01 May 2011
TL;DR: This paper presents three showcase applications at the forefront of research of bio and nano communication networks and provides an interdisciplinary and holistic view of such novel communication systems and highlights future challenges and promises.
Abstract: In recent years, the importance of interconnects on top-down engineered lithography-based electronic chips has outrun the importance of transistors as a dominant factor of performance. The major challenges in traditional chips are related to delays of non-scalable global interconnects and reliability in general, which leads to the observation that simple scaling will no longer satisfy performance requirements as feature sizes continue to shrink. In addition, the advent of massive-scale multicore architectures, novel silicon and non-silicon manufacturing techniques (such as self-assembly), and an increasing interest in biological components for computing force us to rethink, re-evaluate, and re-design the communication infrastructure and the communication paradigms in the era of nano- and biotechnology. In this paper we present three showcase applications at the forefront of research of bio and nano communication networks. We focus on (1) the signaling and reliability in synthetic bio-circuits, (2) the pattern formation in distributed synthetic bio-networks, and (3) unstructured nanowire NOC. We provide an interdisciplinary and holistic view of such novel communication systems and highlight future challenges and promises.

7 citations


Proceedings ArticleDOI
22 May 2011
TL;DR: Under this measure, the best-performing imputation algorithm reconstructs masked sections by choosing the nearest neighbor to the surrounding observations within the song, which is consistent with the large amount of repetition found in pop music.
Abstract: Building models of the structure in musical signals raises the question of how to evaluate and compare different modeling approaches. One possibility is to use the model to impute deliberately-removed patches of missing data, then to compare the model's predictions with the part that was removed. We analyze a corpus of popular music audio represented as beat-synchronous chroma features, and compare imputation based on simple linear prediction to more complex models including nearest neighbor selection and shift-invariant probabilistic latent component analysis. Simple linear models perform best according to Euclidean distance, despite producing stationary results which are not musically meaningful. We therefore investigate alternate evaluation measures and observe that an entropy difference metric correlates better with our expectations for musically consistent reconstructions. Under this measure, the best-performing imputation algorithm reconstructs masked sections by choosing the nearest neighbor to the surrounding observations within the song. This result is consistent with the large amount of repetition found in pop music.

6 citations
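
The nearest-neighbor imputation idea from the abstract above (fill a masked section with the segment whose surroundings best match the surroundings of the gap, searching within the same song) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `impute_nearest_neighbor`, the fixed `context` width, the squared Euclidean distance, and the toy two-dimensional "chroma" frames are all illustrative choices.

```python
from typing import List, Sequence

Frame = Sequence[float]


def _sq_dist(a: List[Frame], b: List[Frame]) -> float:
    """Squared Euclidean distance between two equal-length lists of frames."""
    return sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb))


def impute_nearest_neighbor(
    frames: List[Frame], start: int, length: int, context: int
) -> List[Frame]:
    """Fill frames[start:start+length] by copying the segment elsewhere in the
    song whose surrounding context best matches the context around the gap."""
    end = start + length
    if start - context < 0 or end + context > len(frames):
        raise ValueError("not enough context around the masked section")

    # Observed frames immediately before and after the masked section.
    target = frames[start - context:start] + frames[end:end + context]

    best_pos, best_dist = None, float("inf")
    for pos in range(context, len(frames) - length - context + 1):
        # Skip candidates whose window (including context) overlaps the gap,
        # so masked frames are never read.
        if pos - context < end and pos + length + context > start:
            continue
        candidate = frames[pos - context:pos] + frames[pos + length:pos + length + context]
        dist = _sq_dist(target, candidate)
        if dist < best_dist:
            best_pos, best_dist = pos, dist

    if best_pos is None:
        raise ValueError("no candidate segment outside the masked section")
    return [list(f) for f in frames[best_pos:best_pos + length]]


# Toy "song": a four-frame pattern repeated three times, with two frames masked.
pattern = [[1, 0], [0, 1], [1, 1], [0, 0]]
song = [list(f) for f in pattern * 3]
song[5] = [9, 9]  # placeholder values standing in for the masked frames
song[6] = [9, 9]

filled = impute_nearest_neighbor(song, start=5, length=2, context=1)
```

Because the toy song repeats exactly, the gap is filled with the matching frames from another repetition, which mirrors the abstract's observation that this strategy benefits from repetition in pop music.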


ReportDOI
16 Dec 2011
TL;DR: In order to interpret the results obtained from the proposed array detector, a Bayesian-based computational method for extracting the identities and amounts of compounds in a mixture is developed, which is of broad use for any array based detector system.
Abstract: The overall aim of the project is to develop a robust platform for an array based detector that could sense, distinguish and quantify diverse collections of environmental analytes. We have previously developed cell based reporters that afford the ability to recognize a large number of chemicals, built around G-protein coupled receptors (GPCRs), which provide high diversity and broad specificity. To render this detector system able to function in real time, we are applying synthetic biology approaches to engineer cells with a fast, phosphorylation based memory circuit. This solves two problems. First, the readout is based on protein phosphorylation and thus occurs within seconds. Second, the response, once established, remains fixed, so that the readout can be analyzed without a transient loss of signal. In order to interpret the results we obtain from the proposed array detector, we have developed a Bayesian-based computational method for extracting the identities and amounts of compounds in a mixture. Applying our computational approach to results obtained with a prototype GPCR-based array, we were able to extract the identity and amounts of compounds in complex mixtures. This provides validation of the method, which could be of broad use for any array based detector system.

Patent
23 Sep 2011
TL;DR: Compositions and related methods are described for isolating and assembling DNA molecules without intermediate cloning steps.
Abstract: Aspects herein relate to compositions, and related methods, for isolating and assembling DNA molecules without intermediate cloning steps.