
Showing papers by "Andrew D. Ellington" published in 2021


Journal ArticleDOI
14 Jan 2021
TL;DR: This Primer describes recombineering and MAGE, their optimal use, their diverse applications and methods for pairing them with other genetic editing tools, and looks forward to the future of genetic engineering.
Abstract: Recombination-mediated genetic engineering, also known as recombineering, is the genomic incorporation of homologous single-stranded or double-stranded DNA into bacterial genomes. Recombineering and its derivative methods have radically improved genome engineering capabilities, perhaps none more so than multiplex automated genome engineering (MAGE). MAGE is representative of a set of highly multiplexed single-stranded DNA-mediated technologies. First described in Escherichia coli, both MAGE and recombineering are being rapidly translated into diverse prokaryotes and even into eukaryotic cells. Together, this modern set of tools offers the promise of radically improving the scope and throughput of experimental biology by providing powerful new methods to ease the genetic manipulation of model and non-model organisms. In this Primer, we describe recombineering and MAGE, their optimal use, their diverse applications and methods for pairing them with other genetic editing tools. We then look forward to the future of genetic engineering. This Primer by Wannier and colleagues summarizes the methodology, analysis and utility of recombineering and multiplex automated genome engineering (MAGE) in microbial species. In addition, this Primer examines advanced techniques that pair MAGE with other tools to improve editing efficiency.

36 citations


Journal ArticleDOI
19 May 2021
TL;DR: In this paper, the authors describe a method for converting nonspecific read-out of loop-mediated isothermal amplification (LAMP) assays into sequence-specific readout using oligonucleotide strand displacement (OSD) probes.
Abstract: Isothermal nucleic acid amplification tests (iNATs), such as loop-mediated isothermal amplification (LAMP), are good alternatives to PCR-based amplification assays, especially for point-of-care and low-resource use, in part because they can be carried out with relatively simple instrumentation. However, iNATs can often generate spurious amplicons, especially in the absence of target sequences, resulting in false-positive results. This is especially true if signals are based on non-sequence-specific probes, such as intercalating dyes or pH changes. In addition, pathogens often prove to be moving, evolving targets and can accumulate mutations that will lead to inefficient primer binding and thus false-negative results. Multiplex assays targeting different regions of the analyte and logical signal readout using sequence-specific probes can help to reduce both false negatives and false positives. Here, we describe rapid conversion of three previously described SARS-CoV-2 LAMP assays that relied on a non-sequence-specific readout into individual and multiplex one-pot assays that can be visually read using sequence-specific oligonucleotide strand exchange (OSD) probes. We describe both fluorescence-based and Boolean logic-gated colorimetric lateral flow readout methods and demonstrate detection of SARS-CoV-2 virions in crude human saliva.
IMPORTANCE: One of the key approaches to treatment and control of infectious diseases, such as COVID-19, is accurate and rapid diagnostics that is widely deployable in a timely and scalable manner. To achieve this, it is essential to go beyond the traditional gold standard of quantitative PCR (qPCR) that is often faced with difficulties in scaling due to the complexity of infrastructure and human resource requirements. Isothermal nucleic acid amplification methods, such as loop-mediated isothermal amplification (LAMP), have been long pursued as ideal, low-tech alternatives for rapid, portable testing. However, isothermal approaches often suffer from false signals due to employment of nonspecific readout methods. We describe general principles for rapidly converting nonspecifically read LAMP assays into assays that are read in a sequence-specific manner by using oligonucleotide strand displacement (OSD) probes. We also demonstrate that inclusion of OSD probes in LAMP assays maintains the simplicity of one-pot assays and a visual yes/no readout by using fluorescence or colorimetric lateral-flow dipsticks while providing accurate sequence-specific readout and the ability to logically query multiplex amplicons for redundancy or copresence. These principles not only yielded high-surety isothermal assays for SARS-CoV-2 but might also aid in the design of more sophisticated molecular assays for other analytes.

33 citations


Posted ContentDOI
21 Aug 2021-bioRxiv
TL;DR: In this article, the authors used directed evolution to expand the effector specificity of the camphor-responsive TetR-family regulator CamR from Pseudomonas putida using a novel negative selection coupled with a high-throughput positive screen.
Abstract: Prokaryotic transcription factors can be repurposed as analytical and synthetic tools for precise chemical measurement and regulation. Monoterpenes encompass a broad chemical family that are commercially valuable as flavors, cosmetics, and fragrances, but have proven difficult to measure, especially in cells. Herein, we develop genetically-encoded, generalist monoterpene biosensors by using directed evolution to expand the effector specificity of the camphor-responsive TetR-family regulator CamR from Pseudomonas putida. Using a novel negative selection coupled with a high-throughput positive screen (Seamless Enrichment of Ligand-Inducible Sensors, SELIS), we evolve CamR biosensors that can recognize four distinct monoterpenes: borneol, fenchol, eucalyptol, and camphene. Different evolutionary trajectories surprisingly yielded common mutations, emphasizing the utility of CamR as a platform for creating generalist biosensors. Systematic promoter optimization driving the reporter increased the system’s signal-to-noise ratio to 150-fold. These sensors can serve as a starting point for the high-throughput screening and dynamic regulation of bicyclic monoterpene production strains.

21 citations


Journal ArticleDOI
TL;DR: In this paper, a new orthogonal translation machinery was developed to site-specifically incorporate 3,4-dihydroxyphenylalanine (L-DOPA) into recombinant proteins.
Abstract: The catechol group of 3,4-dihydroxyphenylalanine (L-DOPA) derived from L-tyrosine oxidation is a key post-translational modification (PTM) in many protein biomaterials and has potential as a bioorthogonal handle for precision protein conjugation applications such as antibody-drug conjugates. Despite this potential, indiscriminate enzymatic modification of exposed tyrosine residues or complete replacement of tyrosine using auxotrophic hosts remains the preferred method of introducing the catechol moiety into proteins, which precludes many protein engineering applications. We have developed new orthogonal translation machinery to site-specifically incorporate L-DOPA into recombinant proteins and a new fluorescent biosensor to selectively monitor L-DOPA incorporation in vivo. We show simultaneous biosynthesis and incorporation of L-DOPA and apply this translation machinery to engineer a novel metalloprotein containing a DOPA-Fe chromophore.

12 citations


Journal ArticleDOI
TL;DR: In this paper, a short fusion domain from the actin-binding protein villin was added to the DNA polymerase I from Geobacillus stearothermophilus (also known as Bst DNAP) to improve both stability and purification of the enzyme.
Abstract: The DNA polymerase I from Geobacillus stearothermophilus (also known as Bst DNAP) is widely used in isothermal amplification reactions, where its strand displacement ability is prized. More robust versions of this enzyme should be enabled for diagnostic applications, especially for carrying out higher temperature reactions that might proceed more quickly. To this end, we appended a short fusion domain from the actin-binding protein villin that improved both stability and purification of the enzyme. In parallel, we have developed a machine learning algorithm that assesses the relative fit of individual amino acids to their chemical microenvironments at any position in a protein and applied this algorithm to predict sequence substitutions in Bst DNAP. The top predicted variants had greatly improved thermotolerance (heating prior to assay), and upon combination, the mutations showed additive thermostability, with denaturation temperatures up to 2.5 °C higher than the parental enzyme. The increased thermostability of the enzyme allowed faster loop-mediated isothermal amplification assays to be carried out at 73 °C, where both Bst DNAP and its improved commercial counterpart Bst 2.0 are inactivated. Overall, this is one of the first examples of the application of machine learning approaches to the thermostabilization of an enzyme.

12 citations


Journal ArticleDOI
TL;DR: In this paper, the design, chemical synthesis, and flexizyme-catalyzed transfer RNA (tRNA) acylation of a variety of fluorescent amino acids (FAAs) are reported.

9 citations


Journal ArticleDOI
01 Jun 2021-PLOS ONE
TL;DR: In this paper, a simplified method for preparing cellular reagents that requires only a common bacterial incubator to grow and subsequently dry enzyme-expressing bacteria at 37°C with the aid of inexpensive chemical desiccants is presented.
Abstract: We recently developed 'cellular' reagents (lyophilized bacteria overexpressing proteins of interest) that can replace commercial pure enzymes in typical diagnostic and molecular biology reactions. To make cellular reagent technology widely accessible and amenable to local production with minimal instrumentation, we now report a significantly simplified method for preparing cellular reagents that requires only a common bacterial incubator to grow and subsequently dry enzyme-expressing bacteria at 37°C with the aid of inexpensive chemical desiccants. We demonstrate application of such dried cellular reagents in common molecular and synthetic biology processes, such as PCR, qPCR, reverse transcription, isothermal amplification, and Golden Gate DNA assembly, in building easy-to-use testing kits, and in rapid reagent production for meeting extraordinary diagnostic demands such as those being faced in the ongoing SARS-CoV-2 pandemic. Furthermore, we demonstrate feasibility of local production by successfully implementing this minimized procedure and preparing cellular reagents in several countries, including the United Kingdom, Cameroon, and Ghana. Our results demonstrate possibilities for readily scalable local and distributed reagent production, and further instantiate the opportunities available via synthetic biology in general.

9 citations


Posted ContentDOI
12 Oct 2021-bioRxiv
TL;DR: In this article, a structure-based, deep learning algorithm was used to engineer an extremely robust and highly active PET hydrolase, FAST-PETase, which exhibits superior PET-hydrolytic activity relative to both wild-type and engineered alternatives, and possesses enhanced thermostability and pH tolerance.
Abstract: Plastic waste poses an ecological challenge. While current plastic waste management largely relies on unsustainable, energy-intensive, or even hazardous physicochemical and mechanical processes, enzymatic degradation offers a green and sustainable route for plastic waste recycling. Poly(ethylene terephthalate) (PET) has been extensively used in packaging and for the manufacture of fabrics and single-use containers, accounting for 12% of global solid waste. The practical application of PET hydrolases has been hampered by their lack of robustness and the requirement for high processing temperatures. Here, we use a structure-based, deep learning algorithm to engineer an extremely robust and highly active PET hydrolase. Our best resulting mutant (FAST-PETase: Functional, Active, Stable, and Tolerant PETase) exhibits superior PET-hydrolytic activity relative to both wild-type and engineered alternatives (including a leaf-branch compost cutinase and its mutant) and possesses enhanced thermostability and pH tolerance. We demonstrate that whole, untreated, post-consumer PET from 51 different plastic products can all be completely degraded by FAST-PETase within one week, and in as little as 24 hours at 50 °C. Finally, we demonstrate two paths for closed-loop PET recycling and valorization. First, we re-synthesize virgin PET from the monomers recovered after enzymatic depolymerization. Second, we enable in situ microbially-enabled valorization using a Pseudomonas strain together with FAST-PETase to degrade PET and utilize the evolved monomers as a carbon source for growth and polyhydroxyalkanoate production. Collectively, our results demonstrate the substantial improvements enabled by deep learning and a viable route for enzymatic plastic recycling at the industrial scale.

8 citations


Posted ContentDOI
08 Apr 2021-bioRxiv
TL;DR: In this article, the discovery of SARS-CoV-2 neutralizing antibodies isolated from COVID-19 patients using a high-throughput platform was reported, and the antibodies were identified from unpaired donor B-cell and serum repertoires using yeast surface display, proteomics, and public light chain screening.
Abstract: The ongoing evolution of SARS-CoV-2 into more easily transmissible and infectious variants has sparked concern over the continued effectiveness of existing therapeutic antibodies and vaccines. Hence, together with increased genomic surveillance, methods to rapidly develop and assess effective interventions are critically needed. Here we report the discovery of SARS-CoV-2 neutralizing antibodies isolated from COVID-19 patients using a high-throughput platform. Antibodies were identified from unpaired donor B-cell and serum repertoires using yeast surface display, proteomics, and public light chain screening. Cryo-EM and functional characterization of the antibodies identified N3-1, an antibody that binds avidly (Kd,app = 68 pM) to the receptor binding domain (RBD) of the spike protein and robustly neutralizes the virus in vitro. This antibody likely binds all three RBDs of the trimeric spike protein with a single IgG. Importantly, N3-1 equivalently binds spike proteins from emerging SARS-CoV-2 variants of concern, neutralizes UK variant B.1.1.7, and binds SARS-CoV spike with nanomolar affinity. Taken together, the strategies described herein will prove broadly applicable in interrogating adaptive immunity and developing rapid response biological countermeasures to emerging pathogens.

7 citations


Journal ArticleDOI
TL;DR: The need to better understand how selection acts on patterns of synonymous codon usage across the genome is highlighted and the recoded bacteriophage ΦX174 provides a convenient system to investigate the genetic determinants of virulence.
Abstract: Natural selection acting on synonymous mutations in protein-coding genes influences genome composition and evolution. In viruses, introducing synonymous mutations in genes encoding structural proteins can drastically reduce viral growth, providing a means to generate potent, live-attenuated vaccine candidates. However, an improved understanding of what compositional features are under selection and how combinations of synonymous mutations affect viral growth is needed to predictably attenuate viruses and make them resistant to reversion. We systematically recoded all nonoverlapping genes of the bacteriophage ΦX174 with codons rarely used in its Escherichia coli host. The fitness of recombinant viruses decreases as additional deoptimizing mutations are made to the genome, although not always linearly, and not consistently across genes. Combining deoptimizing mutations may reduce viral fitness more or less than expected from the effect size of the constituent mutations and we point out difficulties in untangling correlated compositional features. We test our model by optimizing the same genes and find that the relationship between codon usage and fitness does not hold for optimization, suggesting that wild-type ΦX174 is at a fitness optimum. This work highlights the need to better understand how selection acts on patterns of synonymous codon usage across the genome and provides a convenient system to investigate the genetic determinants of virulence.

7 citations



Journal ArticleDOI
TL;DR: The Statement of Ethics in Engineering Biology Research (SOBR) was developed to guide researchers as they incorporate the consideration of long-term ethical implications of their work into every phase of the research lifecycle.
Abstract: Engineering biology is being applied toward solving or mitigating some of the greatest challenges facing society. As with many other rapidly advancing technologies, the development of these powerful tools must be considered in the context of ethical uses for personal, societal, and/or environmental advancement. Researchers have a responsibility to consider the diverse outcomes that may result from the knowledge and innovation they contribute to the field. Together, we developed a Statement of Ethics in Engineering Biology Research to guide researchers as they incorporate the consideration of long-term ethical implications of their work into every phase of the research lifecycle. Herein, we present and contextualize this Statement of Ethics and its six guiding principles. Our goal is to facilitate ongoing reflection and collaboration among technical researchers, social scientists, policy makers, and other stakeholders to support best outcomes in engineering biology innovation and development.

Journal ArticleDOI
TL;DR: In this paper, a 3D convolutional neural network (3D CNN) was used to predict the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding a site of interest.
Abstract: One fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding a site of interest. We find that the network can predict wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely going to be correct or not. Predictions of consensus are less accurate and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mis-predictions of the wild type may identify sites that are primed for mutation and likely targets for protein engineering.

Journal ArticleDOI
29 May 2021-Life
TL;DR: In this paper, the ambiguities between the biotic and the abiotic at the origin of life were explored, and the role of grayness extends into later transitions as well.
Abstract: In the search for life beyond Earth, distinguishing the living from the non-living is paramount. However, this distinction is often elusive, as the origin of life is likely a stepwise evolutionary process, not a singular event. Regardless of the favored origin of life model, an inherent "grayness" blurs the theorized threshold defining life. Here, we explore the ambiguities between the biotic and the abiotic at the origin of life. The role of grayness extends into later transitions as well. By recognizing the limitations posed by grayness, life detection researchers will be better able to develop methods sensitive to prebiotic chemical systems and life with alternative biochemistries.


Posted ContentDOI
13 Jul 2021-bioRxiv
TL;DR: A unique combined screening and selection approach that quickly refines the affinities and specificities of generalist transcription factors is developed, and, using RamR as a starting point, highly specific and sensitive biosensors for the alkaloids tetrahydropapaverine, papaverine, glaucine, rotundine, and noscapine are evolved.
Abstract: A key bottleneck in the microbial production of therapeutic plant metabolites is identifying enzymes that can greatly improve yield. The facile identification of genetically encoded biosensors can overcome this limitation and become part of a general method for engineering scaled production. We have developed a unique combined screening and selection approach that quickly refines the affinities and specificities of generalist transcription factors, and using RamR as a starting point we evolve highly specific (>100-fold preference) and sensitive (EC50 < 30 μM) biosensors for the alkaloids tetrahydropapaverine, papaverine, glaucine, rotundine, and noscapine. High resolution structures reveal multiple evolutionary avenues for the fungible effector binding site, and the creation of new pockets for different chemical moieties. These sensors further enabled the evolution of a streamlined pathway for tetrahydropapaverine, an immediate precursor to four modern pharmaceuticals, collapsing multiple methylation steps into a single evolved enzyme. Our methods for evolving biosensors now enable the rapid engineering of pathways for therapeutic alkaloids.

Posted ContentDOI
04 Nov 2021-bioRxiv
TL;DR: The most cited aptamers were examined using a standardized methodology in order to categorize the extent to which the sequences themselves were apparently improperly reproduced, both in the literature and presumably in experiments beyond their discovery.
Abstract: Aptamers have been the subject of more than 144,000 papers to date. However, there has been a growing concern that errors in reporting aptamer research limit the reliability of these reagents for research and other applications. These observations noting inconsistencies in the use of our RNA anti-lysozyme aptamer served as an impetus for our systematic review of the reporting of aptamer sequences in the literature. Our detailed examination of literature citing the RNA anti-lysozyme aptamer revealed that 93% of the 61 publications reviewed reported unexplained altered sequences with 86% of those using DNA variants. The ten most cited aptamers were examined using a standardized methodology in order to categorize the extent to which the sequences themselves were apparently improperly reproduced, both in the literature and presumably in experiments beyond their discovery. Our review of 800 aptamer publications spanned decades, multiple journals, and research groups, and revealed that 44% of the papers reported unexplained sequence alterations. We identified ten common categories of sequence alterations including deletions, substitutions, additions, among others. The robust data set we have produced elucidates a source of irreproducibility and unreliability in our field and can be used as a starting point for building evidence-based best practices in publication standards to elevate the rigor and reproducibility of aptamer research.

Journal ArticleDOI
TL;DR: This white paper argues for a more universal approach to life detection, recommending that missions look for signatures shared by all possible types of life rather than signatures that could be specific to Terran life.
Abstract: This white paper argues for a more universal approach to life detection. We recommend that life detection missions focus on looking for signatures of life deemed to be shared by all possible types of life, independent of their specific biochemistries, rather than looking for signatures of life that could arguably be specific to Terran life.

Journal ArticleDOI
TL;DR: It is shown that common markers used in many plant transformation systems function as expected in common dandelion including fluorescent proteins, GUS, and anthocyanin regulation, as well as resistance to kanamycin, Basta, and hygromycin.
Abstract: Taraxacum officinale, or the common dandelion, is a widespread perennial species recognized worldwide as a common lawn and garden weed. Common dandelion is also cultivated for use in teas, as edible greens, and for use in traditional medicine. It produces latex and is closely related to the Russian dandelion, T. kok-saghyz, which is being developed as a rubber crop. Additionally, the vast majority of extant common dandelions reproduce asexually through apomictically derived seeds, an important goal for many major crops in modern agriculture. As such, there is increasing interest in the molecular control of important pathways as well as basic molecular biology and reproduction of common dandelion. Here we present an improved Agrobacterium-based genetic transformation and regeneration protocol, a protocol for generation and transformation of protoplasts using free DNA, and a protocol for leaf Agrobacterium infiltration for transient gene expression. These protocols use easily obtainable leaf explants from soil-grown plants and reagents common to most molecular plant laboratories. We show that common markers used in many plant transformation systems function as expected in common dandelion including fluorescent proteins, GUS, and anthocyanin regulation, as well as resistance to kanamycin, Basta, and hygromycin. Reproducible, stable and transient transformation methods are presented that will allow for needed molecular structure and function studies of genes and proteins in T. officinale.


Posted ContentDOI
14 May 2021-bioRxiv
TL;DR: In this paper, the yeast Saccharomyces cerevisiae was engineered to produce cholesterol in place of the fungal sterol ergosterol, so that exogenous G protein-coupled receptors (GPCRs) such as the human mu opioid receptor could be functionally coupled to its pheromone response pathway and used as biosensors.
Abstract: The yeast Saccharomyces cerevisiae is a powerful tool for studying G protein-coupled receptors (GPCRs) as they can be functionally coupled to its pheromone response pathway. However, some exogenous GPCRs, including the mu opioid receptor, are non-functional in yeast, which may be due to the presence of the fungal sterol ergosterol instead of the animal sterol cholesterol. We engineered yeast to produce cholesterol and introduced the human mu opioid receptor, creating an opioid biosensor capable of detecting the peptide DAMGO at an EC50 of 62 nM and the opiate morphine at an EC50 of 882 nM. Furthermore, introducing mu, delta, and kappa opioid receptors from diverse vertebrates consistently yielded active opioid biosensors that both recapitulated expected agonist binding profiles with EC50s as low as 2.5 nM and were inhibited by the antagonist naltrexone. Additionally, clinically relevant human mu opioid receptor alleles, or variants with terminal mutations, resulted in biosensors that largely displayed the expected changes in activity. We also tested mu opioid receptor-based biosensors with systematically adjusted biosynthetic intermediates of cholesterol, enabling us to relate sterol profiles with biosensor sensitivity. Finally, our cholesterol-producing biosensor background was applied to other human GPCRs, resulting in SSTR5, 5-HTR4, FPR1 and NPY1R signaling with varying degrees of cholesterol dependence. Our sterol-optimized platform will be a valuable tool in generating human GPCR-based biosensors, aiding in ongoing receptor deorphanization efforts, and providing a framework for high-throughput screening of receptors and effectors.
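Note: For readers unfamiliar with EC50, the short Python sketch below illustrates what the reported values mean under a simple Hill dose-response model. The model, the function name `hill_response`, and the Hill coefficient of 1 are illustrative assumptions and are not taken from the abstract; only the two EC50 values (62 nM for DAMGO, 882 nM for morphine) come from it.

```python
def hill_response(conc_nm: float, ec50_nm: float, hill_n: float = 1.0) -> float:
    """Fraction of maximal response at ligand concentration conc_nm.

    Assumes a plain Hill model (hypothetical for this biosensor):
        response = c^n / (EC50^n + c^n)
    so the response is exactly half-maximal when c == EC50.
    """
    if conc_nm < 0 or ec50_nm <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    if conc_nm == 0:
        return 0.0
    return conc_nm**hill_n / (ec50_nm**hill_n + conc_nm**hill_n)


# EC50 values quoted in the abstract (nM); everything else is assumed.
EC50_NM = {"DAMGO": 62.0, "morphine": 882.0}

if __name__ == "__main__":
    for ligand, ec50 in EC50_NM.items():
        for conc in (ec50 / 10, ec50, ec50 * 10):
            frac = hill_response(conc, ec50)
            print(f"{ligand}: {conc:g} nM -> {frac:.2f} of maximal response")
```

Under this assumed model, a lower EC50 means the sensor reaches half-maximal output at a lower ligand concentration, which is why the 62 nM DAMGO response is described as more sensitive than the 882 nM morphine response.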

Posted ContentDOI
15 Apr 2021-bioRxiv
TL;DR: In this article, a logical redesign of Bst DNA polymerase (Bst DNAP) was proposed by using multimodal application of several independent and orthogonal rational engineering methods such as domain addition, supercharging, and machine learning predictions of amino acid substitutions.
Abstract: DNA polymerase from Geobacillus stearothermophilus, Bst DNA polymerase (Bst DNAP), is a versatile enzyme with robust strand-displacing activity that enables loop-mediated isothermal amplification (LAMP). Despite its exclusive usage in LAMP assay, its properties remain open to improvement. Here, we describe logical redesign of Bst DNAP by using multimodal application of several independent and orthogonal rational engineering methods such as domain addition, supercharging, and machine learning predictions of amino acid substitutions. The resulting Br512g3 enzyme is not only thermostable and extremely robust but it also displays improved reverse transcription activity and the ability to carry out ultrafast LAMP at 74 °C. Our study illustrates a new enzyme engineering strategy as well as contributes a novel engineered strand displacing DNA polymerase of high value to diagnostics and other fields.

Journal ArticleDOI
TL;DR: In this paper, the authors demonstrate that strategically positioned mismatches between circuit components can reduce unprogrammed hybridization reactions and therefore greatly diminish leakage, and present a three-layer catalytic hairpin assembly cascade that can operate in a single tube and yield 3.7 × 10^4-fold signal amplification in only 4 hours.
Abstract: Signal amplification is ubiquitous in biology and engineering. Protein enzymes, such as DNA polymerases, can routinely achieve >10^6-fold signal increase, making them powerful tools for signal enhancement. Considerable signal amplification can also be achieved using nonenzymatic, cascaded nucleic acid strand exchange reactions. However, the practical application of such kinetically trapped circuits has so far proven difficult due to uncatalyzed leakage of the cascade. We now demonstrate that strategically positioned mismatches between circuit components can reduce unprogrammed hybridization reactions and therefore greatly diminish leakage. In consequence, we were able to synthesize a three-layer catalytic hairpin assembly cascade that could operate in a single tube and that yielded 3.7 × 10^4-fold signal amplification in only 4 h, a greatly improved performance relative to previous cascades. This advance should facilitate the implementation of nonenzymatic signal amplification in molecular diagnostics, as well as inform the design of a wide variety of increasingly intricate nucleic acid computation circuits.

Posted ContentDOI
19 Aug 2021-bioRxiv
TL;DR: In this article, the authors used 3D convolutional neural networks (3D CNNs) to predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding a site of interest.
Abstract: The fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding a site of interest. We find that the network can predict wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely going to be correct or not. Predictions of consensus are less accurate, and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mis-predictions of the wild type may identify sites that are primed for mutation and likely targets for protein engineering.

Journal ArticleDOI
TL;DR: Escherichia coli that utilize 4- and 5-fluoroindole and replace tryptophan with fluorinated derivatives throughout their proteomes have been evolved.
Abstract: Escherichia coli that utilize 4- and 5-fluoroindole and replace tryptophan with fluorinated derivatives throughout their proteomes have been evolved.

Posted ContentDOI
21 Jul 2021-bioRxiv
TL;DR: In this article, the authors examined the genome-wide distribution of Tus and found that only the six innermost Ter sites (TerA-E and G) were significantly bound by Tus proteins, and also found that a single ectopic insertion of TerB in its nonpermissive orientation could not be achieved, advocating against a need for back-up Ter sites.
Abstract: In Escherichia coli, DNA replication termination is orchestrated by two clusters of Ter sites forming a DNA replication fork trap when bound by Tus proteins. The formation of a locked Tus-Ter complex is essential for halting incoming DNA replication forks. However, the absence of replication fork arrest at some Ter sites raised questions about their significance. In this study, we examined the genome-wide distribution of Tus and found that only the six innermost Ter sites (TerA-E and G) were significantly bound by Tus. We also found that a single ectopic insertion of TerB in its non-permissive orientation could not be achieved, advocating against a need for back-up Ter sites. Finally, examination of the genomes of a variety of Enterobacterales revealed a new replication fork trap architecture exclusively found outside the Enterobacteriaceae family. Taken together, our data enabled the delineation of a narrow prototypical Tus-dependent DNA replication fork trap consisting of only two Ter sites.

Posted ContentDOI
17 Nov 2021-bioRxiv
TL;DR: In this paper, a panel of three mutually orthogonal promoters that can be acted on by artificial gRNAs bound by CRISPR regulators were designed, and guide RNA expression targeting these promoters was in turn controlled by either Pol III (U6) or ethylene-inducible Pol II promoters, implementing for the first time a fully artificial Orthogonal Control System.
Abstract: Background: The construction and application of synthetic genetic circuits is frequently improved if gene expression can be orthogonally controlled, relative to the host. In plants, orthogonality can be achieved via the use of CRISPR-based transcription factors that are programmed to act on natural or synthetic promoters. The construction of complex gene circuits can require multiple, orthogonal regulatory interactions, and this in turn requires that the full programmability of CRISPR elements be adapted to non-natural and non-standard promoters that have few constraints on their design. Therefore, we have developed synthetic promoter elements in which regions upstream of the minimal 35S CaMV promoter are designed from scratch to interact via programmed gRNAs with dCas9 fusions that allow activation of gene expression. Results: A panel of three, mutually orthogonal promoters that can be acted on by artificial gRNAs bound by CRISPR regulators were designed. Guide RNA expression targeting these promoters was in turn controlled by either Pol III (U6) or ethylene-inducible Pol II promoters, implementing for the first time a fully artificial Orthogonal Control System (OCS). Following demonstration of the complete orthogonality of the designs, the OCS was tied to cellular metabolism by putting gRNA expression under the control of an endogenous plant signaling molecule, ethylene. The ability to form complex circuitry was demonstrated via the ethylene-driven, ratiometric expression of fluorescent proteins in single plants. Conclusions: The design of synthetic promoters is highly generalizable to large tracts of sequence space, allowing Orthogonal Control Systems of increasing complexity to potentially be generated at will. The ability to tie in several different basal features of plant molecular biology (Pol II and Pol III promoters, ethylene regulation) to the OCS demonstrates multiple opportunities for engineering at the system level. Moreover, given the fungibility of the core 35S CaMV promoter elements, the derived synthetic promoters can potentially be utilized across a variety of plant species.