scispace - formally typeset
Search or ask a question
Author

Ron Weiss

Bio: Ron Weiss is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Synthetic biology & Speech synthesis. The author has an hindex of 82, co-authored 292 publications receiving 89189 citations. Previous affiliations of Ron Weiss include French Institute for Research in Computer Science and Automation & Google.


Papers
More filters
Posted Content
11 Jun 2020
TL;DR: This paper proposes a completely unsupervised method, mixture invariant training (MixIT), that requires only single-channel acoustic mixtures and shows that MixIT can achieve competitive performance compared to supervised methods on speech separation.
Abstract: In recent years, rapid progress has been made on the problem of single-channel sound separation using supervised training of deep neural networks. In such supervised approaches, the model is trained to predict the component sources from synthetic mixtures created by adding up isolated ground-truth sources. The reliance on this synthetic training data is problematic because good performance depends upon the degree of match between the training data and real-world audio, especially in terms of the acoustic conditions and distribution of sources. The acoustic properties can be challenging to accurately simulate, and the distribution of sound types may be hard to replicate. In this paper, we propose a completely unsupervised method, mixture invariant training (MixIT), that requires only single-channel acoustic mixtures. In MixIT, training examples are constructed by mixing together existing mixtures, and the model separates them into a variable number of latent sources, such that the separated sources can be remixed to approximate the original mixtures. We show that MixIT can achieve competitive performance compared to supervised methods on speech separation. Using MixIT in a semi-supervised learning setting enables unsupervised domain adaptation and learning from large amounts of real-world data without ground-truth source waveforms. In particular, we significantly improve reverberant speech separation performance by incorporating reverberant mixtures, train a speech enhancement system from noisy mixtures, and improve universal sound separation by incorporating a large amount of in-the-wild data.

39 citations

Journal ArticleDOI
TL;DR: It is found that designing modules for synthetic heterogeneity can be complex, and in general requires a framework for non-linear and multifactorial analysis, so a ‘phenotypic sensitivity analysis’ method is adapted to determine how functional module behaviors combine to achieve optimal system performance.
Abstract: Synthetic biology efforts have largely focused on small engineered gene networks, yet understanding how to integrate multiple synthetic modules and interface them with endogenous pathways remains a challenge. Here we present the design, system integration, and analysis of several large scale synthetic gene circuits for artificial tissue homeostasis. Diabetes therapy represents a possible application for engineered homeostasis, where genetically programmed stem cells maintain a steady population of β-cells despite continuous turnover. We develop a new iterative process that incorporates modular design principles with hierarchical performance optimization targeted for environments with uncertainty and incomplete information. We employ theoretical analysis and computational simulations of multicellular reaction/diffusion models to design and understand system behavior, and find that certain features often associated with robustness (e.g., multicellular synchronization and noise attenuation) are actually detrimental for tissue homeostasis. We overcome these problems by engineering a new class of genetic modules for ‘synthetic cellular heterogeneity’ that function to generate beneficial population diversity. We design two such modules (an asynchronous genetic oscillator and a signaling throttle mechanism), demonstrate their capacity for enhancing robust control, and provide guidance for experimental implementation with various computational techniques. We found that designing modules for synthetic heterogeneity can be complex, and in general requires a framework for non-linear and multifactorial analysis. Consequently, we adapt a ‘phenotypic sensitivity analysis’ method to determine how functional module behaviors combine to achieve optimal system performance. We ultimately combine this analysis with Bayesian network inference to extract critical, causal relationships between a module's biochemical rate-constants, its high level functional behavior in isolation, and its impact on overall system performance once integrated.

38 citations

Book ChapterDOI
02 May 1994
TL;DR: A system that provides query based associative access to the contents of distributed information servers using content labels to permit users to learn about available resources and to formulate queries with adequate discriminatory power is described.
Abstract: We describe a system that provides query based associative access to the contents of distributed information servers. In typical distributed information systems there are so many objects that underconstrained queries can produce large result sets and extraordinary processing costs. To deal with this scaling problem we use content labels to permit users to learn about available resources and to formulate queries with adequate discriminatory power. We have implemented associative access to a distributed set of information servers in the content routing system. A content routing system is a hierarchy of servers called content routers that present a single query based image of a distributed information system. We have successfully used a prototype content routing system to locate documents on several user file systems and on a large number of information servers.

37 citations

Proceedings ArticleDOI
01 May 2001
TL;DR: Fault Recovery, an important availability feature in Oracle, designed to expedite recovery from unplanned outages is highlighted; Fast-Start allows the administrator to configure a running system to impose predictable bounds on the time required for crash recovery.
Abstract: Availability requirements for database systems are more stringent than ever before with the widespread use of databases as the foundation for ebusiness. This paper highlights Fast-Start™ Fault Recovery, an important availability feature in Oracle, designed to expedite recovery from unplanned outages. Fast-Start allows the administrator to configure a running system to impose predictable bounds on the time required for crash recovery. For instance, fast-start allows fine-grained control over the duration of the roll-forward phase of crash recovery by adaptively varying the rate of checkpointing with minimal impact on online performance. Persistent transaction locking in Oracle allows normal online processing to be resumed while the rollback phase of recovery is still in progress, and fast-start allows quick and transparent rollback of changes made by uncommitted transactions prior to a crash.

37 citations

Journal Article
01 May 2018-Nature
TL;DR: The design of a modular platform for intrabody-based protein sensing-actuation devices with transcriptional output triggered by detection of intracellular proteins in mammalian cells is reported.
Abstract: Understanding and reshaping cellular behaviors with synthetic gene networks requires the ability to sense and respond to changes in the intracellular environment. Intracellular proteins are involved in almost all cellular processes, and thus can provide important information about changes in cellular conditions such as infections, mutations, or disease states. Here we report the design of a modular platform for intrabody-based protein sensing-actuation devices with transcriptional output triggered by detection of intracellular proteins in mammalian cells. We demonstrate reporter activation response (fluorescence, apoptotic gene) to proteins involved in hepatitis C virus (HCV) infection, human immunodeficiency virus (HIV) infection, and Huntington’s disease, and show sensor-based interference with HIV-1 downregulation of HLA-I in infected T cells. Our method provides a means to link varying cellular conditions with robust control of cellular behavior for scientific and therapeutic applications.Synthetic biology principles are often used to design circuits that tune gene expression in response to changes in intracellular environments. Here the authors design a modular platform for intracellular protein sensing devices with transcriptional output.

36 citations


Cited by
More filters
28 Jul 2005
TL;DR: PfPMP1)与感染红细胞、树突状组胞以及胎盘的单个或多个受体作用,在黏附及免疫逃避中起关键的作�ly.
Abstract: 抗原变异可使得多种致病微生物易于逃避宿主免疫应答。表达在感染红细胞表面的恶性疟原虫红细胞表面蛋白1(PfPMP1)与感染红细胞、内皮细胞、树突状细胞以及胎盘的单个或多个受体作用,在黏附及免疫逃避中起关键的作用。每个单倍体基因组var基因家族编码约60种成员,通过启动转录不同的var基因变异体为抗原变异提供了分子基础。

18,940 citations

Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost as discussed by the authors proposes a sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning to achieve state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations

Journal ArticleDOI
01 Apr 1998
TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Abstract: In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

14,696 citations

Proceedings Article
11 Nov 1999
TL;DR: This paper describes PageRank, a mathod for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.
Abstract: The importance of a Web page is an inherently subjective matter, which depends on the readers interests, knowledge and attitudes. But there is still much that can be said objectively about the relative importance of Web pages. This paper describes PageRank, a mathod for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them. We compare PageRank to an idealized random Web surfer. We show how to efficiently compute PageRank for large numbers of pages. And, we show how to apply PageRank to search and to user navigation.

14,400 citations

Proceedings ArticleDOI
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

13,333 citations