Author

Ron Weiss

Bio: Ron Weiss is an academic researcher at the Massachusetts Institute of Technology. The author has contributed to research on topics including synthetic biology and speech synthesis. The author has an h-index of 82 and has co-authored 292 publications that have received 89,189 citations. Previous affiliations of Ron Weiss include the French Institute for Research in Computer Science and Automation and Google.


Papers
Journal Article
TL;DR: One-pot evaluation enabled by poly-transfection accelerates and simplifies the design of genetic systems, providing a new high-information strategy for interrogating biology.
Abstract: Biological research is relying on increasingly complex genetic systems and circuits to perform sophisticated operations in living cells. Performing these operations often requires simultaneous delivery of many genes, and optimizing the stoichiometry of these genes can yield drastic improvements in performance. However, sufficiently sampling the large design space of gene expression stoichiometries in mammalian cells using current methods is cumbersome, complex, or expensive. We present a 'poly-transfection' method as a simple yet high-throughput alternative that enables comprehensive evaluation of genetic systems in a single, readily-prepared transfection sample. Each cell in a poly-transfection represents an independent measurement at a distinct gene expression stoichiometry, fully leveraging the single-cell nature of transfection experiments. We first benchmark poly-transfection against co-transfection, showing that titration curves for commonly-used regulators agree between the two methods. We then use poly-transfections to efficiently generate new insights, for example in CRISPRa and synthetic miRNA systems. Finally, we use poly-transfection to rapidly engineer a difficult-to-optimize miRNA-based cell classifier for discriminating cancerous cells. One-pot evaluation enabled by poly-transfection accelerates and simplifies the design of genetic systems, providing a new high-information strategy for interrogating biology.

28 citations

01 May 2015
TL;DR: Knockdown of miR200 in Medalist +C fibroblasts and iPSCs rescued checkpoint protein expression and reduced DNA damage, and miR200 is proposed as a potential therapeutic target for treating complications of diabetes.

28 citations

Posted Content
TL;DR: This paper proposes a novel approach to utilizing text-only data, by training a spelling correction (SC) model to explicitly correct errors made by the end-to-end model.
Abstract: Attention-based sequence-to-sequence models for speech recognition jointly train an acoustic model, language model (LM), and alignment mechanism using a single neural network and require only parallel audio-text pairs. Thus, the language model component of the end-to-end model is only trained on transcribed audio-text pairs, which leads to performance degradation, especially on rare words. While a variety of work has looked at incorporating an external LM trained on text-only data into the end-to-end framework, none of it has taken into account the characteristic error distribution of the model. In this paper, we propose a novel approach to utilizing text-only data, by training a spelling correction (SC) model to explicitly correct those errors. On the LibriSpeech dataset, we demonstrate that the proposed model results in an 18.6% relative improvement in WER over the baseline model when directly correcting the top ASR hypothesis, and a 29.0% relative improvement when further rescoring an expanded n-best list using an external LM.

28 citations

Journal Article
TL;DR: A new form of engineering cell–cell communication based on manipulating metabolic pathways is described in E. coli and the model predicted the simultaneous effects of variations in ΔpH across the cell membrane and in NRI∼P binding affinity to the glnAp2 promoter.
Abstract: In the nascent field of synthetic biology (1), the engineering of novel cell–cell communication capabilities will become critical. Synthetic biology involves the creation of artificial gene and metabolic networks to program new cell and organism behaviors. Recent accomplishments have demonstrated that cells can be engineered to carry out novel tasks (refs. 2–10; R.W. and S. Basu, www.hpcaconf.org/hpca8), and hint that someday we will be able to program cell behaviors with the same ease and capability that we now program computers. However, to achieve higher level functions, we need to program individual cells to coordinate their activities. Previous efforts achieved such coordination by taking existing quorum sensing (QS) components from a source host, Vibrio fischeri, and integrating them into a target host, Escherichia coli (5, 7, 8). The work of Bulter et al. (11) in this issue of PNAS describes a new form of engineering cell–cell communication based on manipulating metabolic pathways. Specifically, the authors engineered the nitrogen regulation system and the acetate pathway in E. coli. As the engineered cells grow, they broadcast their presence by producing and secreting acetate. This intercellular signal is then detected by neighboring cells that respond by elevating the expression of a GFP. Because the acetate level correlates with cell density, this process confers QS behavior on the cells without using traditional QS biochemistry. Fig. 1 shows the circuit operation. First, during normal cell metabolism, amino acid biosynthesis results in acetate as a byproduct. Acetate, which functions as the communication signal, is readily converted between several forms in the cells. The protonated form of acetate, acetic acid, diffuses out of the cell to the growth media and then into neighboring cells. When in the cytoplasm, acetic acid is deprotonated to acetate, which in turn can be phosphorylated to acetyl phosphate by acetate kinase. 
Acetyl phosphate transfers its phosphate group to NRI, a transcriptional regulator of glnAp2 promoter (12). As a result, NRI∼P dimerizes, binds the glnAp2 promoter at two enhancer regions, and activates transcription of GFP in the engineered system (11). Through the above processes, the cytoplasmic level of NRI∼P reflects the extracellular acetate concentration, which in turn correlates to cell density. Thus, GFP expression reports cell density by using acetate as a QS signal.
Fig. 1. Basic operation of the engineered acetate QS circuit. Ac∼P, acetyl phosphate; –OAc, acetate; HOAc, acetic acid; A.A., amino acid; NRI, nitrogen regulator protein I; gfp, gene for green fluorescent protein; AckA, acetate kinase; PglnAp2, ...
Although the nitrogen regulation system and acetate pathway exist naturally in E. coli, harnessing these mechanisms to QS requires several modifications. For example, the natural system is highly sensitive to oxygen; acetate production increases significantly under anaerobic conditions. By using a pta– strain (13), where pta is one of the main genes in the acetate pathway, the authors removed the influence of oxygen; the mutant showed similar acetate production under aerobic or anaerobic conditions (11). Another benefit to using this mutant is slower degradation of acetate (14) that increases the dynamic response range to acetate (15). The authors also used a mathematical model to guide them in forward engineering and fine tuning of the system through additional genetic mutations and changes in environmental conditions. The model predicted the simultaneous effects of variations in ΔpH across the cell membrane and in NRI∼P binding affinity to the glnAp2 promoter. According to the model, decreasing the medium pH will shift the acetate/acetic acid equilibrium toward the latter, resulting in higher intercellular acetate concentration. Of the two forms, only acetic acid permeates the cell membrane. 
Therefore, decreasing the pH should enhance the system sensitivity. The model also predicted that the detection sensitivity can be fine-tuned by changing the binding affinity of the NRI∼P dimer to the glnAp2 enhancer regions. Stronger binding affinities increase the sensitivity to acetate, and weaker affinities have the opposite effect. These observations led to experiments under different pH conditions and the construction of three different versions of the enhancer domains with varying binding affinities. The experimental results with various pH and NRI∼P binding affinities confirmed the model predictions. First, to quantify the system's response under different pH, fluorescence was measured in a variety of extracellular acetate concentrations. As predicted, GFP expression in response to any given extracellular acetate concentration was significantly higher at lower pH. Accordingly, cells grown in lower pH were able to respond to lower cell densities. Experiments with a stronger enhancer region further improved the ability to detect lower cell densities. Conversely, a weaker enhancer resulted in higher detection thresholds. The experimentation along these two axes of control validated that the system could be fine tuned and optimized based on model predictions to achieve a particular task, a basic tenet of synthetic biology (ref. 6; R.W. and S. Basu). To date, most synthetic biology efforts have focused on implementing novel, nonnative behavior at the single-cell level (refs. 2–4, 6, and 9–11; R.W. and S. Basu). These explorations have taught us lessons about the operating principles of biological systems and are enabling the creation of bioengineered systems with new functionalities. However, to fully exploit the potential of synthetic biology, we must also explore the realm of cell–cell communication. Such endeavors will allow us to realize sophisticated applications that simply cannot be accomplished by single cells acting alone. Fig. 2 presents some of the issues central to the engineering of cell–cell communication. Any intercellular network consists of sender and receiver elements that communicate via the synthesis, transmission, and reception of signals. In many cases, the sender and receiver elements coexist in the same cell. For example, in the work of Bulter et al. (11), amino acid synthesis functions as the sender element because it produces the acetate signal. This signal is transmitted to other cells as well as the originating cell, and is finally received by the NRI/glnAp2 pathway. This design demonstrates the general requirements that the signal must be readily synthesized and secreted, and, ideally, controllable (e.g., the effect of pH).
Fig. 2. Issues in architecting cell–cell communication.
Having a diverse library of such signals affords the engineer flexibility and power. Different signals may be suitable for different environmental conditions (e.g., pH, temperature, light, and solid-phase media density). Multiple signals also allow architecting complex interactions that involve simultaneous coexisting communication dialogs between cells. Appropriate signals may take different forms. These include traditional cell–cell signaling molecules such as acyl-homoserine lactones (AHL) used for QS in Gram-negative bacteria (16), peptides used in Gram-positive bacterial QS (17), or yeast pheromones (18). Another class of molecules includes metabolic byproducts such as acetate (11) or lactate (see below) that are typically not considered as cell–cell communication molecules but, as the present study demonstrates, can be used effectively for such purposes. Nonbiochemical signals are also conceivable, such as light or physical contact. One can envision the use of different classes of signals simultaneously, such as AHL and acetate. This approach will likely reduce crosstalk that often exists between related signals [e.g., different AHLs (19)]. 
On the receiver side, the signal has to be detected. The detection is accomplished by cell surface or cytoplasmic receptors. For example, epidermal growth factors in eukaryotes bind cell surface receptors, whereas AHLs diffuse freely into the cell and bind cytoplasmic receptors. A promising approach is taken by Hellinga's group (20) to design novel receptors by “rational mutagenesis” of existing E. coli periplasmic binding protein, allowing them to bind other substrates such as lactate, TNT, and serotonin. This approach can expand the choice of useful metabolites and can be integrated with engineered organisms that have been modified to secrete such metabolites, e.g., lactate (21). Finally, once the receptor detects a signal, it needs to be processed. This signal processing may include amplification, threshold detection, digitization, or combination with other signals. Biochemical networks can perform such signal processing by regulating transcription and translation, controlling phosphorylation, and coordinating metabolic activities. The work of Bulter et al. (11) suggests a new paradigm for finding and exploiting preexisting nontraditional cell–cell signaling. As we improve our understanding of the crucial role cell–cell communication plays in biology and discover the rich complexities of communication protocols between cells, it becomes increasingly important to be able to engineer these protocols. Such engineering capabilities will benefit both the understanding of natural systems and the design of complex novel behaviors that require coordination between cells.
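The pH dependence described in the commentary above can be made concrete with the Henderson–Hasselbalch relation. This is a textbook sketch, not the model used by Bulter et al.; the symbol $f_{\mathrm{HOAc}}$ (the protonated fraction) and the value $\mathrm{p}K_a \approx 4.76$ for acetic acid are assumptions brought in for illustration and do not appear in the source.

```latex
\mathrm{pH} = \mathrm{p}K_a + \log_{10}\frac{[\mathrm{OAc}^-]}{[\mathrm{HOAc}]}
\quad\Longrightarrow\quad
f_{\mathrm{HOAc}} = \frac{[\mathrm{HOAc}]}{[\mathrm{HOAc}] + [\mathrm{OAc}^-]}
                  = \frac{1}{1 + 10^{\,\mathrm{pH} - \mathrm{p}K_a}}
```

Under these assumptions, lowering the medium pH from 7 to 6 raises the membrane-permeable fraction from roughly 0.6% to roughly 5%, which is consistent with the qualitative prediction that lower pH enhances sensitivity.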

28 citations

Patent
Bo Li, Ron Weiss, Michiel Bacchiani, Tara N. Sainath, Kevin W. Wilson
20 Dec 2017
TL;DR: This patent discloses adaptive neural beamforming for multichannel speech recognition, in which a trained recurrent neural network generates a first set of filter parameters for a first filter and a second set of filter parameters for a second filter, each based on both the first and second audio data channels.
Abstract: FIELD: information technology. SUBSTANCE: the invention discloses means for adaptive shaping of neuron pattern for multichannel speech recognition. A first channel of audio data corresponding to a speech fragment and a second audio data channel corresponding to said speech fragment are received. A first set of filter parameters for a first filter, based on the first audio data channel and the second audio data channel, and a second set of filter parameters for a second filter, based on the first audio data channel and the second audio data channel, are generated using a trained recurrent neural network. A single combined audio data channel is generated by combining the audio data of the first channel, which has been filtered using the first filter, and the audio data of the second channel, which has been filtered using the second filter. The audio data of the single combined channel are input into a neural network trained as an acoustic model. EFFECT: high efficiency of speech recognition. 20 cl, 5 dwg
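The filter-and-combine step in this abstract can be sketched as follows. This is a minimal illustration and not the patented method: in the patent the filter parameters are predicted by a trained recurrent neural network, whereas here the taps are fixed, hypothetical values, and the names `fir_filter` and `filter_and_sum` are invented for the example.

```python
def fir_filter(signal, taps):
    """Causal FIR filter: y[t] = sum over k of taps[k] * signal[t - k]."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if t - k >= 0:
                acc += h * signal[t - k]
        out.append(acc)
    return out


def filter_and_sum(channels, filters):
    """Filter each channel with its own taps, then sum sample by sample."""
    filtered = [fir_filter(x, h) for x, h in zip(channels, filters)]
    return [sum(samples) for samples in zip(*filtered)]


# Hypothetical two-channel input; a real system would use microphone audio.
channel_1 = [1.0, 0.0, 0.0, 0.0]
channel_2 = [0.0, 1.0, 0.0, 0.0]

# Hypothetical fixed taps standing in for the network-predicted parameters.
taps_1 = [0.5, 0.5]
taps_2 = [1.0]

combined = filter_and_sum([channel_1, channel_2], [taps_1, taps_2])
print(combined)
```

The single list `combined` plays the role of the combined audio channel that the abstract says is passed on to the acoustic model.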

26 citations


Cited by
28 Jul 2005
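TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade host immune responses. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and switching transcription among different var gene variants provides the molecular basis for antigenic variation.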

18,940 citations

Proceedings Article
13 Aug 2016
TL;DR: XGBoost is a scalable end-to-end tree boosting system that introduces a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, and is widely used to achieve state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations

Journal Article
01 Apr 1998
TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Abstract: In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

14,696 citations

Proceedings Article
11 Nov 1999
TL;DR: This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.
Abstract: The importance of a Web page is an inherently subjective matter, which depends on the reader's interests, knowledge and attitudes. But there is still much that can be said objectively about the relative importance of Web pages. This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them. We compare PageRank to an idealized random Web surfer. We show how to efficiently compute PageRank for large numbers of pages. And, we show how to apply PageRank to search and to user navigation.
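The random-surfer idea in this abstract can be sketched with a short power iteration. This is a minimal illustration under stated assumptions, not the paper's implementation: the damping factor of 0.85, the uniform redistribution of rank from pages without outgoing links, the fixed iteration count, and the toy graph `toy_web` are all choices made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> rank, with ranks summing to 1.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Assumption: rank held by pages with no out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                # Each page splits its rank evenly among the pages it links to.
                new_rank[q] += damping * rank[p] / len(outs)
        rank = new_rank
    return rank


# Hypothetical four-page web: D links out but nothing links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page C receives links from three pages and ends up with the highest rank, while D, which no page links to, keeps only the baseline share that the random jump provides.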

14,400 citations

Proceedings Article
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

13,333 citations