Showing papers by "Ron Weiss" published in 2021

Proceedings Article

WaveGrad: Estimating Gradients for Waveform Generation

Nanxin Chen¹, Yu Zhang², Heiga Zen², Ron Weiss², Mohammad Norouzi², William Chan²

03 May 2021

TL;DR: WaveGrad offers a natural way to trade inference speed for sample quality by adjusting the number of refinement steps, and bridges the gap between non-autoregressive and autoregressive models in terms of audio quality.

Abstract: This paper introduces WaveGrad, a conditional model for waveform generation which estimates gradients of the data density. The model is built on prior work on score matching and diffusion probabilistic models. It starts from a Gaussian white noise signal and iteratively refines the signal via a gradient-based sampler conditioned on the mel-spectrogram. WaveGrad offers a natural way to trade inference speed for sample quality by adjusting the number of refinement steps, and bridges the gap between non-autoregressive and autoregressive models in terms of audio quality. We find that it can generate high fidelity audio samples using as few as six iterations. Experiments reveal WaveGrad to generate high fidelity audio, outperforming adversarial non-autoregressive baselines and matching a strong likelihood-based autoregressive baseline using fewer sequential operations. Audio samples are available at https://wavegrad-iclr2021.github.io/.

351 citations
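
The refinement loop described in the WaveGrad abstract (start from Gaussian noise, then refine over a chosen number of steps) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a standard DDPM-style ancestral sampler, a made-up linear noise schedule, and an oracle noise predictor (`oracle_eps`, hypothetical) that stands in for the trained, mel-spectrogram-conditioned network. The function name `refine` and all constants are assumptions for illustration only.

```python
import math
import random


def refine(target, n_steps, seed=0):
    """Toy DDPM-style sampler: refine Gaussian noise toward `target` in n_steps."""
    rng = random.Random(seed)
    # Hypothetical linear noise schedule (not the schedule used in the paper).
    betas = [1e-4 + (0.05 - 1e-4) * i / max(n_steps - 1, 1) for i in range(n_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    def oracle_eps(x, t):
        # Stand-in for the trained network: the exact noise that would map
        # `target` to `x` at step t. A real model would predict this from
        # the noisy waveform and the conditioning features.
        scale = math.sqrt(alpha_bars[t])
        denom = math.sqrt(1.0 - alpha_bars[t])
        return [(xi - scale * yi) / denom for xi, yi in zip(x, target)]

    # Start from Gaussian white noise.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for t in reversed(range(n_steps)):
        eps = oracle_eps(x, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        x = [(xi - coef * e) / math.sqrt(alphas[t]) for xi, e in zip(x, eps)]
        if t > 0:
            # Re-inject a smaller amount of noise on every step but the last.
            sigma = math.sqrt((1.0 - alpha_bars[t - 1]) / (1.0 - alpha_bars[t]) * betas[t])
            x = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
    return x
```

Because the noise predictor here is an oracle, the sketch recovers the target for any `n_steps`; with a learned predictor, the number of steps is the knob that trades inference speed against sample quality, as the abstract describes.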

Journal Article

Rethinking organoid technology through bioengineering

Elena Garreta¹, Roger D. Kamm², Susana M. Chuva de Sousa Lopes³, Madeline A. Lancaster⁴, Ron Weiss², Xavier Trepat, Insoo Hyun⁵,⁶, Nuria Montserrat¹,⁷

University of Barcelona¹, Massachusetts Institute of Technology², Leiden University Medical Center³, Laboratory of Molecular Biology⁴, Harvard University⁵, Case Western Reserve University⁶, Catalan Institution for Research and Advanced Studies⁷

01 Feb 2021-Nature Materials

TL;DR: An overview of bioengineering technologies that can be harnessed to facilitate the culture, self-organization and functionality of human pluripotent stem cell-derived organoids is provided.

Abstract: In recent years considerable progress has been made in the development of faithful procedures for the differentiation of human pluripotent stem cells (hPSCs). An important step in this direction has also been the derivation of organoids. This technology generally relies on traditional three-dimensional culture techniques that exploit cell-autonomous self-organization responses of hPSCs with minimal control over the external inputs supplied to the system. The convergence of stem cell biology and bioengineering offers the possibility to provide these stimuli in a controlled fashion, resulting in the development of naturally inspired approaches to overcome major limitations of this nascent technology. Based on the current developments, we emphasize the achievements and ongoing challenges of bringing together hPSC organoid differentiation, bioengineering and ethics. This Review underlines the need for providing engineering solutions to gain control of self-organization and functionality of hPSC-derived organoids. We expect that this knowledge will guide the community to generate higher-grade hPSC-derived organoids for further applications in developmental biology, drug screening, disease modelling and personalized medicine. This Review provides an overview of bioengineering technologies that can be harnessed to facilitate the culture, self-organization and functionality of human pluripotent stem cell-derived organoids.

130 citations

Proceedings Article

Parallel Tacotron: Non-Autoregressive and Controllable TTS

Isaac Elias¹, Heiga Zen¹, Jonathan Shen¹, Yu Zhang¹, Ye Jia¹, Ron Weiss¹, Yonghui Wu¹

Google¹

06 Jun 2021

TL;DR: Parallel Tacotron is a non-autoregressive neural text-to-speech model augmented with a variational autoencoder-based residual encoder; it is highly parallelizable during both training and inference.

Abstract: Although neural end-to-end text-to-speech models can synthesize highly natural speech, there is still room for improvements to its efficiency and naturalness. This paper proposes a non-autoregressive neural text-to-speech model augmented with a variational autoencoder-based residual encoder. This model, called Parallel Tacotron, is highly parallelizable during both training and inference, allowing efficient synthesis on modern parallel hardware. The use of the variational autoencoder relaxes the one-to-many mapping nature of the text-to-speech problem and improves naturalness. To further improve the naturalness, we use lightweight convolutions, which can efficiently capture local contexts, and introduce an iterative spectrogram loss inspired by iterative refinement. Experimental results show that Parallel Tacotron matches a strong autoregressive baseline in subjective evaluations with significantly decreased inference time.

35 citations

Journal Article

Context-aware synthetic biology by controller design: Engineering the mammalian cell.

Nika Shakiba¹, Ross D. Jones¹, Ron Weiss¹, Domitilla Del Vecchio¹

Massachusetts Institute of Technology¹

16 Jun 2021-Cell systems

TL;DR: In this article, the authors describe control systems approaches for achieving context-aware devices that are robust to context effects, and then consider cell fate programing as a case study to explore the potential impact of context-aware devices for regenerative medicine applications.

Abstract: The rise of systems biology has ushered a new paradigm: the view of the cell as a system that processes environmental inputs to drive phenotypic outputs. Synthetic biology provides a complementary approach, allowing us to program cell behavior through the addition of synthetic genetic devices into the cellular processor. These devices, and the complex genetic circuits they compose, are engineered using a design-prototype-test cycle, allowing for predictable device performance to be achieved in a context-dependent manner. Within mammalian cells, context effects impact device performance at multiple scales, including the genetic, cellular, and extracellular levels. In order for synthetic genetic devices to achieve predictable behaviors, approaches to overcome context dependence are necessary. Here, we describe control systems approaches for achieving context-aware devices that are robust to context effects. We then consider cell fate programing as a case study to explore the potential impact of context-aware devices for regenerative medicine applications.

26 citations

Proceedings Article

Wave-Tacotron: Spectrogram-Free End-to-End Text-to-Speech Synthesis

Ron Weiss¹, RJ Skerry-Ryan¹, Eric Battenberg¹, Soroosh Mariooryad¹, Diederik P. Kingma¹

Google¹

06 Jun 2021

TL;DR: The authors proposed a sequence-to-sequence neural network which directly generates speech waveforms from text inputs by incorporating a normalizing flow into the autoregressive decoder loop, which can be optimized directly with maximum likelihood, without using intermediate, hand-designed features.

Abstract: We describe a sequence-to-sequence neural network which directly generates speech waveforms from text inputs. The architecture extends the Tacotron model by incorporating a normalizing flow into the autoregressive decoder loop. Output waveforms are modeled as a sequence of non-overlapping fixed-length blocks, each one containing hundreds of samples. The interdependencies of waveform samples within each block are modeled using the normalizing flow, enabling parallel training and synthesis. Longer-term dependencies are handled autoregressively by conditioning each flow on preceding blocks. This model can be optimized directly with maximum likelihood, without using intermediate, hand-designed features nor additional loss terms. Contemporary state-of-the-art text-to-speech (TTS) systems use a cascade of separately learned models: one (such as Tacotron) which generates intermediate features (such as spectrograms) from text, followed by a vocoder (such as WaveRNN) which generates waveform samples from the intermediate features. The proposed system, in contrast, does not use a fixed intermediate representation, and learns all parameters end-to-end. Experiments show that the proposed model generates speech with quality approaching a state-of-the-art neural TTS system, with significantly improved generation speed.

20 citations

Journal Article

An engineered protein-phosphorylation toggle network with implications for endogenous network discovery.

Deepak Mishra¹, Tristan Bepler¹, Brian Teague², Bonnie Berger¹, Jim Broach³, Ron Weiss

Massachusetts Institute of Technology¹, University of Wisconsin–Stout², Pennsylvania State University³

02 Jul 2021-Science

TL;DR: In this article, a bistable toggle switch was created in Saccharomyces cerevisiae using a cross-repression topology comprising 11 protein-protein phosphorylation elements.

Abstract: Synthetic biological networks comprising fast, reversible reactions could enable engineering of new cellular behaviors that are not possible with slower regulation. Here, we created a bistable toggle switch in Saccharomyces cerevisiae using a cross-repression topology comprising 11 protein-protein phosphorylation elements. The toggle is ultrasensitive, can be induced to switch states in seconds, and exhibits long-term bistability. Motivated by our toggle's architecture and size, we developed a computational framework to search endogenous protein pathways for other large and similar bistable networks. Our framework helped us to identify and experimentally verify five formerly unreported endogenous networks that exhibit bistability. Building synthetic protein-protein networks will enable bioengineers to design fast sensing and processing systems, allow sophisticated regulation of cellular processes, and aid discovery of endogenous networks with particular functions.

16 citations

Journal Article

In Vivo Validation of a Reversible Small Molecule-Based Switch for Synthetic Self-Amplifying mRNA Regulation.

Séan Mc Cafferty¹, Joyca De Temmerman¹, Tasuku Kitada, Jacob R. Becraft, Ron Weiss², Darrell J. Irvine, Mathias Devreese¹, Siegrid De Baere¹, Francis Combes¹, Niek N. Sanders¹

Ghent University¹, Massachusetts Institute of Technology²

03 Mar 2021-Molecular Therapy

TL;DR: The in vivo utility of a synthetic self-amplifying mRNA (RNA replicon) whose expression can be turned off using a genetic switch that responds to oral administration of trimethoprim (TMP), an FDA-approved small-molecule drug is validated.

10 citations

Proceedings Article

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Nanxin Chen¹, Yu Zhang², Heiga Zen³, Ron Weiss², Mohammad Norouzi³, Najim Dehak¹, William Chan³

Johns Hopkins University¹, Massachusetts Institute of Technology², Google³

17 Jun 2021

TL;DR: This article proposed WaveGrad 2, a non-autoregressive generative model for text-to-speech synthesis, which is trained to estimate the gradient of the log conditional density of the waveform given a phoneme sequence.

Abstract: This paper introduces WaveGrad 2, a non-autoregressive generative model for text-to-speech synthesis. WaveGrad 2 is trained to estimate the gradient of the log conditional density of the waveform given a phoneme sequence. The model takes an input phoneme sequence, and through an iterative refinement process, generates an audio waveform. This contrasts to the original WaveGrad vocoder which conditions on mel-spectrogram features, generated by a separate model. The iterative refinement process starts from Gaussian noise, and through a series of refinement steps (e.g., 50 steps), progressively recovers the audio sequence. WaveGrad 2 offers a natural way to trade-off between inference speed and sample quality, through adjusting the number of refinement steps. Experiments show that the model can generate high fidelity audio, approaching the performance of a state-of-the-art neural TTS system. We also report various ablation studies over different model configurations. Audio samples are available at this https URL.

10 citations

Posted Content

Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation.

Scott Wisdom¹, Aren Jansen¹, Ron Weiss¹, Hakan Erdogan¹, John R. Hershey¹

Google¹

01 Jun 2021-arXiv: Audio and Speech Processing

TL;DR: In this article, the authors introduce sparsity losses that favor fewer output sources and a covariance loss that discourages correlated outputs to combat over-separation in the mixture invariant training (MixIT) method.

Abstract: Supervised neural network training has led to significant progress on single-channel sound separation. This approach relies on ground truth isolated sources, which precludes scaling to widely available mixture data and limits progress on open-domain tasks. The recent mixture invariant training (MixIT) method enables training on in-the-wild data; however, it suffers from two outstanding problems. First, it produces models which tend to over-separate, producing more output sources than are present in the input. Second, the exponential computational complexity of the MixIT loss limits the number of feasible output sources. These problems interact: increasing the number of output sources exacerbates over-separation. In this paper we address both issues. To combat over-separation we introduce new losses: sparsity losses that favor fewer output sources and a covariance loss that discourages correlated outputs. We also experiment with a semantic classification loss by predicting weak class labels for each mixture. To extend MixIT to larger numbers of sources, we introduce an efficient approximation using a fast least-squares solution, projected onto the MixIT constraint set. Our experiments show that the proposed losses curtail over-separation and improve overall performance. The best performance is achieved using larger numbers of output sources, enabled by our efficient MixIT loss, combined with sparsity losses to prevent over-separation. On the FUSS test set, we achieve over 13 dB in multi-source SI-SNR improvement, while boosting single-source reconstruction SI-SNR by over 17 dB.

8 citations
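
The exponential cost of the exact MixIT loss mentioned in the abstract above can be seen in a minimal brute-force sketch. It assumes the usual MixIT formulation in which every estimated source is assigned to exactly one reference mixture and the best assignment is kept; mean squared error is used here as a simple stand-in for the SNR-based loss, and the names `mse` and `mixit_loss` are hypothetical. This is the exhaustive baseline only, not the paper's least-squares approximation.

```python
from itertools import product


def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mixit_loss(mixtures, sources):
    """Exhaustive MixIT-style loss.

    Tries every assignment of each estimated source to one reference
    mixture, remixes the sources accordingly, and returns the lowest
    total loss together with the assignment that achieved it.
    """
    n_mix = len(mixtures)
    n_samples = len(mixtures[0])
    best_loss, best_assign = float("inf"), None
    # n_mix ** len(sources) candidates: 2 ** M for two reference mixtures.
    for assign in product(range(n_mix), repeat=len(sources)):
        remix = [[0.0] * n_samples for _ in range(n_mix)]
        for src, m in zip(sources, assign):
            for t, value in enumerate(src):
                remix[m][t] += value
        loss = sum(mse(remix[m], mixtures[m]) for m in range(n_mix))
        if loss < best_loss:
            best_loss, best_assign = loss, assign
    return best_loss, best_assign
```

With two reference mixtures and M output sources the loop visits 2^M assignments, so each added output source doubles the work, which is the scaling problem the efficient approximation in the paper is meant to avoid.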

Posted Content

Robust and tunable signal processing in mammalian cells via engineered covalent modification cycles

Ross D. Jones¹, Yili Qian¹, Ilia K¹, Benjamin X. Wang¹, Michael T. Laub¹, Domitilla Del Vecchio¹, Ron Weiss¹

Massachusetts Institute of Technology¹

31 Mar 2021-bioRxiv

TL;DR: In this article, the authors used proteins derived from bacterial two-component signaling pathways to develop synthetic phosphorylation-based and feedback-controlled devices in mammalian cells with such properties.

Abstract: Rewired and synthetic signaling networks can impart cells with new functionalities and enable efforts in engineering cell therapies and directing cell development. However, there is a need for tools to build synthetic signaling networks that are tunable, can precisely regulate target gene expression, and are robust to perturbations within the complex context of mammalian cells. Here, we use proteins derived from bacterial two-component signaling pathways to develop synthetic phosphorylation-based and feedback-controlled devices in mammalian cells with such properties. First, we isolate kinase and phosphatase proteins from the bifunctional histidine kinase EnvZ. We then use these proteins to engineer a synthetic covalent modification cycle, in which the kinase and phosphatase competitively regulate phosphorylation of the cognate response regulator OmpR, enabling analog tuning of OmpR-driven gene expression. Further, we show that the phosphorylation cycle can be extended by connecting phosphatase expression to small molecule and miRNA inputs in the cell, with the latter enabling cell-type specific signaling responses and accurate cell type classification. Finally, we implement a tunable negative feedback controller by co-expressing the kinase-driven output gene with the small molecule-tunable phosphatase. This negative feedback substantially reduces cell-to-cell noise in output expression and mitigates the effects of cell context perturbations due to off-target regulation and resource competition. Our work thus lays the foundation for establishing tunable, precise, and robust control over cell behavior with synthetic signaling networks.

7 citations

Journal Article

Quantitative characterization of recombinase-based digitizer circuits enables predictable amplification of biological signals

Katherine A. Kiwimagi¹, Justin H. Letendre², Benjamin H. Weinberg², Junmin Wang², Mingzhe Chen¹, Leandro Watanabe³, Chris J. Myers³, Jacob Beal⁴, Wilson W. Wong², Ron Weiss¹

Massachusetts Institute of Technology¹, Boston University², University of Utah³, BBN Technologies⁴

15 Jul 2021

TL;DR: In this article, a combination of signal-to-noise ratio (SNR), area under a receiver operating characteristic curve (AUC), and fold change (FC) was used to quantitatively define digitizer performance and predict responses to different input signals.

Abstract: Many synthetic gene circuits are restricted to single-use applications or require iterative refinement for incorporation into complex systems. One example is the recombinase-based digitizer circuit, which has been used to improve weak or leaky biological signals. Here we present a workflow to quantitatively define digitizer performance and predict responses to different input signals. Using a combination of signal-to-noise ratio (SNR), area under a receiver operating characteristic curve (AUC), and fold change (FC), we evaluate three small-molecule inducible digitizer designs demonstrating FC up to 508x and SNR up to 3.77 dB. To study their behavior further and improve modularity, we develop a mixed phenotypic/mechanistic model capable of predicting digitizer configurations that amplify a synNotch cell-to-cell communication signal (Δ SNR up to 2.8 dB). We hope the metrics and modeling approaches here will facilitate incorporation of these digitizers into other systems while providing an improved workflow for gene circuit characterization.

Journal Article

Incomplete Cell Sorting Creates Engineerable Structures with Long-Term Stability

Jesse Tordoff¹, Matej Krajnc², Nicholas Walczak³, Matthew Lima¹, Jacob Beal³, Stanislav Y. Shvartsman², Ron Weiss¹

Massachusetts Institute of Technology¹, Princeton University², BBN Technologies³

20 Jan 2021

TL;DR: By varying the number of highly adhesive and less adhesive cells in multicellular aggregates, this work finds the cell-type ratio and total cell count control pattern formation, with resulting structures maintained for several days.

Abstract: Adhesion-mediated cell sorting has long been considered an organizing principle in developmental biology. While most computational models have emphasized the dynamics of segregation to fully sorted structures, cell sorting can also generate a plethora of transient, incompletely sorted states. The timescale of such states in experimental systems is unclear: if they are long-lived, they can be harnessed by development or engineered in synthetic tissues. Here, we use experiments and computational modeling to demonstrate how such structures can be systematically designed by quantitative control of cell composition. By varying the number of highly adhesive and less adhesive cells in multicellular aggregates, we find the cell-type ratio and total cell count control pattern formation, with resulting structures maintained for several days. Our work takes a step toward mapping the design space of self-assembling structures in development and provides guidance to the emerging field of shape engineering with synthetic biology.

Proceedings Article

Multitask Training with Text Data for End-to-End Speech Recognition

Peidong Wang¹, Tara N. Sainath², Ron Weiss³

Ohio State University¹, Google², Massachusetts Institute of Technology³

30 Aug 2021

TL;DR: The authors proposed a multitask training method for attention-based end-to-end speech recognition models to better incorporate language level information, which leads to an 11% relative performance improvement over the baseline and is comparable to language model shallow fusion.

Abstract: We propose a multitask training method for attention-based end-to-end speech recognition models to better incorporate language level information. We regularize the decoder in a sequence-to-sequence architecture by multitask training it on both the speech recognition task and a next-token prediction language modeling task. Trained on either the 100 hour subset of LibriSpeech or the full 960 hour dataset, the proposed method leads to an 11% relative performance improvement over the baseline and is comparable to language model shallow fusion, without requiring an additional neural network during decoding. Analyses of sample output sentences and the word error rate on rare words demonstrate that the proposed method can incorporate language level information effectively.

Posted Content

Meeting Measurement Precision Requirements for Effective Engineering of Genetic Regulatory Networks

Jacob Beal¹, Brian Teague², John T. Sexton³, Sebastian M. Castillo-Hair³, Nicholas A. DeLateur², Meher Samineni¹, Jeffery J. Tabor³, Ron Weiss²

BBN Technologies¹, Massachusetts Institute of Technology², Rice University³

10 Oct 2021-bioRxiv

TL;DR: In this article, the authors derived an Engineering Error Inequality that provides a quantitative mathematical bound on the relationship between predictability of results, model accuracy, measurement precision, and device characteristics, recommending a target standard deviation of 1.5-fold.

Abstract: Reliable, predictable engineering of cellular behavior is one of the key goals of synthetic biology. As the field matures, biological engineers will become increasingly reliant on computer models that allow for the rapid exploration of design space prior to the more costly construction and characterization of candidate designs. The efficacy of such models, however, depends on the accuracy of their predictions, the precision of the measurements used to parameterize the models, and the tolerance of biological devices for imperfections in modeling and measurement. To better understand this relationship, we have derived an Engineering Error Inequality that provides a quantitative mathematical bound on the relationship between predictability of results, model accuracy, measurement precision, and device characteristics. We apply this relation to estimate measurement precision requirements for engineering genetic regulatory networks given current model and device characteristics, recommending a target standard deviation of 1.5-fold. We then compare these requirements with the results of an interlaboratory study to validate that these requirements can be met via flow cytometry with matched instrument channels and an independent calibrant. Based on these results, we recommend a set of best practices for quality control of flow cytometry data and discuss how these might be extended to other measurement modalities and applied to support further development of genetic regulatory network engineering.

Book Chapter

TASBE Image Analytics: A Processing Pipeline for Quantifying Cell Organization from Fluorescent Microscopy.

Nicholas Walczak¹, Jacob Beal¹, Jesse Tordoff², Ron Weiss²

BBN Technologies¹, Massachusetts Institute of Technology²

01 Jan 2021-Methods of Molecular Biology

TL;DR: TASBE Image Analytics is a software pipeline for automatically segmenting collections of cells using the fluorescence channels of microscopy images; collections of cells can be grouped into spatially disjoint segments, and the movement or development of these segments tracked over time.

Abstract: Laboratory automation now commonly allows high-throughput sample preparation, culturing, and acquisition of microscopy images, but quantitative image analysis is often still a painstaking and subjective process. This is a problem especially significant for work on programmed morphogenesis, where the spatial organization of cells and cell types is of paramount importance. To address the challenges of quantitative analysis for such experiments, we have developed TASBE Image Analytics, a software pipeline for automatically segmenting collections of cells using the fluorescence channels of microscopy images. With TASBE Image Analytics, collections of cells can be grouped into spatially disjoint segments, the movement or development of these segments tracked over time, and rich statistical data output in a standardized format for analysis. Processing is readily configurable, rapid, and produces results that closely match hand annotation by humans for all but the smallest and dimmest segments. TASBE Image Analytics can thus provide the analysis necessary to complete the design-build-test-learn cycle for high-throughput experiments in programmed morphogenesis, as validated by our application of this pipeline to process experiments on shape formation with engineered CHO and HEK293 cells.

Patent

Methods of glycoengineering proteoglycans with distinct glycan structures

Michelle C. Y. Chang¹, Leonid A. Gaydukov¹, Giyoung Jung¹, Nevin M. Summers¹, Timothy K. Lu¹, Ron Weiss¹, John Scarcelli¹, Richard J. Cornell¹, Jeffrey K. Marshall¹, Bruno Figueroa¹, Wen Allen Tseng¹

Massachusetts Institute of Technology¹

22 Apr 2021

TL;DR: In this paper, methods of generating proteoglycans with distinct glycan structures in engineered, non-naturally occurring eukaryotic cells are presented, making accessible a dynamic range of protein glycosylation.

Abstract: Disclosed herein are methods of generating proteoglycans with distinct glycan structures in engineered, non-naturally occurring eukaryotic cells. These methods make accessible a dynamic range of protein glycosylation. Compositions of engineered, non-naturally occurring cells capable of generating these proteoglycans are also disclosed herein.

Patent

Multichannel speech recognition using neural networks

Ehsan Variani¹, Kevin W. Wilson¹, Ron Weiss¹, Tara N. Sainath¹, Arun Narayanan¹

Google¹

13 Jul 2021

Posted Content

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Nanxin Chen¹, Yu Zhang², Heiga Zen³, Ron Weiss², Mohammad Norouzi³, Najim Dehak¹, William Chan³

Johns Hopkins University¹, Massachusetts Institute of Technology², Google³

17 Jun 2021-arXiv: Audio and Speech Processing

TL;DR: This paper proposed WaveGrad 2, a non-autoregressive generative model for text-to-speech synthesis, which is trained to estimate the gradient of the log conditional density of the waveform given a phoneme sequence.