Author

Johannes Andries Roubos

Bio: Johannes Andries Roubos is an academic researcher from DSM. The author has contributed to research in topics: Fuzzy logic & CRISPR. The author has an h-index of 27 and has co-authored 82 publications receiving 4,111 citations. Previous affiliations of Johannes Andries Roubos include Delft University of Technology & Netherlands Bioinformatics Centre.


Papers
Journal Article
Herman Jan Pel, Johannes H. de Winde, David B. Archer, Paul S. Dyer, Gerald Hofmann, Peter J. Schaap, Geoffrey Turner, Ronald P. de Vries, Richard Albang, Kaj Albermann, Mikael Rørdam Andersen, Jannick Dyrløv Bendtsen, Jacques A.E. Benen, Marco A. van den Berg, Stefaan Breestraat, Mark X. Caddick, Roland Contreras, Michael Cornell, Pedro M. Coutinho, Etienne Danchin, Alfons J. M. Debets, Peter J. T. Dekker, Piet W.M. van Dijck, Alard Van Dijk, Lubbert Dijkhuizen, Arnold J. M. Driessen, Christophe d'Enfert, Steven Geysens, Coenie Goosen, Gert S.P. Groot, Piet W. J. de Groot, Thomas Guillemette, Bernard Henrissat, Marga Herweijer, Johannes Petrus Theodorus Wilhelmus Van Den Hombergh, Cees A. M. J. J. van den Hondel, René T. J. M. van der Heijden, Rachel M. van der Kaaij, Frans M. Klis, Harrie J. Kools, Christian P. Kubicek, Patricia Ann van Kuyk, Jürgen Lauber, Xin Lu, Marc J. E. C. van der Maarel, Rogier Meulenberg, Hildegard Henna Menke, Martin Mortimer, Jens Nielsen, Stephen G. Oliver, Maurien M.A. Olsthoorn, K. Pal, Noël Nicolaas Maria Elisabeth Van Peij, Arthur F. J. Ram, Ursula Rinas, Johannes Andries Roubos, Cornelis Maria Jacobus Sagt, Monika Schmoll, Jibin Sun, David W. Ussery, János Varga, Wouter Vervecken, Peter J.J. Van De Vondervoort, Holger Wedler, Han A. B. Wösten, An-Ping Zeng, Albert J. J. van Ooyen, Jaap Visser, Hein Stam
TL;DR: The filamentous fungus Aspergillus niger is widely exploited by the fermentation industry for the production of enzymes and organic acids, particularly citric acid, and the sequenced genome revealed a large number of major facilitator superfamily transporters and fungal zinc binuclear cluster transcription factors.
Abstract: The filamentous fungus Aspergillus niger is widely exploited by the fermentation industry for the production of enzymes and organic acids, particularly citric acid. We sequenced the 33.9-megabase genome of A. niger CBS 513.88, the ancestor of currently used enzyme production strains. A high level of synteny was observed with other aspergilli sequenced. Strong function predictions were made for 6,506 of the 14,165 open reading frames identified. A detailed description of the components of the protein secretion pathway was made and striking differences in the hydrolytic enzyme spectra of aspergilli were observed. A reconstructed metabolic network comprising 1,069 unique reactions illustrates the versatile metabolism of A. niger. Noteworthy is the large number of major facilitator superfamily transporters and fungal zinc binuclear cluster transcription factors, and the presence of putative gene clusters for fumonisin and ochratoxin A synthesis.

1,161 citations

Journal Article
TL;DR: Genes predicted to encode transporters were strongly overrepresented among the genes transcriptionally upregulated under conditions that stimulate penicillin G production, illustrating potential for future genomics-driven metabolic engineering.
Abstract: Industrial penicillin production with the filamentous fungus Penicillium chrysogenum is based on an unprecedented effort in microbial strain improvement. To gain more insight into penicillin synthesis, we sequenced the 32.19 Mb genome of P. chrysogenum Wisconsin 54-1255 and identified numerous genes responsible for key steps in penicillin production. DNA microarrays were used to compare the transcriptomes of the sequenced strain and a penicillin G high-producing strain, grown in the presence and absence of the side-chain precursor phenylacetic acid. Transcription of genes involved in biosynthesis of valine, cysteine and alpha-aminoadipic acid (precursors for penicillin biosynthesis), as well as of genes encoding microbody proteins, was increased in the high-producing strain. Some gene products were shown to be directly controlling beta-lactam output. Many key cellular transport processes involving penicillins and intermediates remain to be characterized at the molecular level. Genes predicted to encode transporters were strongly overrepresented among the genes transcriptionally upregulated under conditions that stimulate penicillin G production, illustrating potential for future genomics-driven metabolic engineering.

457 citations

Journal Article
TL;DR: In this article, the authors performed whole-genome sequencing of the Aspergillus niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality.
Abstract: The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole-genome sequencing of the acidogenic A. niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence, and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was used to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 Mb of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis supported up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases, and protein transporters in the protein producing CBS 513.88 strain. Our results and data sets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi.

308 citations

29 Apr 2011
TL;DR: In this paper, the authors performed whole-genome sequencing of the Aspergillus niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality.
Abstract: The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole-genome sequencing of the acidogenic A. niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence, and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was used to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 Mb of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis supported up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases, and protein transporters in the protein producing CBS 513.88 strain. Our results and data sets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi.

306 citations

Journal Article
01 Mar 2003
TL;DR: An iterative approach for developing fuzzy classifiers is proposed: the initial model is derived from the data, feature selection and rule-base simplification are then applied to reduce the model, and a genetic algorithm is used for parameter optimization.
Abstract: The automatic design of fuzzy rule-based classification systems based on labeled data is considered. It is recognized that both classification performance and interpretability are of major importance and effort is made to keep the resulting rule bases small and comprehensible. For this purpose, an iterative approach for developing fuzzy classifiers is proposed. The initial model is derived from the data and subsequently, feature selection and rule-base simplification are applied to reduce the model, while a genetic algorithm is used for parameter optimization. An application to the Wine data classification problem is shown.

193 citations


Cited by
01 Jun 2012
TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler (for single-cell data) and on the popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

01 Jan 2011
TL;DR: The flood of diverse genome-wide data calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data, but the sheer volume and scope of the data pose a significant challenge to the development of such tools.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations

Journal Article
TL;DR: This work assembled 89 scaffolds to generate 34 Mbp of nearly contiguous T. reesei genome sequence comprising 9,129 predicted gene models, providing a roadmap for constructing enhanced T. reesei strains for industrial applications such as biofuel production.
Abstract: Trichoderma reesei is the main industrial source of cellulases and hemicellulases used to depolymerize biomass to simple sugars that are converted to chemical intermediates and biofuels, such as ethanol. We assembled 89 scaffolds (sets of ordered and oriented contigs) to generate 34 Mbp of nearly contiguous T. reesei genome sequence comprising 9,129 predicted gene models. Unexpectedly, considering the industrial utility and effectiveness of the carbohydrate-active enzymes of T. reesei, its genome encodes fewer cellulases and hemicellulases than any other sequenced fungus able to hydrolyze plant cell wall polysaccharides. Many T. reesei genes encoding carbohydrate-active enzymes are distributed nonrandomly in clusters that lie between regions of synteny with other Sordariomycetes. Numerous genes encoding biosynthetic pathways for secondary metabolites may promote survival of T. reesei in its competitive soil habitat, but genome analysis provided little mechanistic insight into its extraordinary capacity for protein secretion. Our analysis, coupled with the genome sequence data, provides a roadmap for constructing enhanced T. reesei strains for industrial applications such as biofuel production.

1,085 citations

Journal Article
TL;DR: The role of big data in supporting smart manufacturing is discussed, a historical perspective on the data lifecycle in manufacturing is given, and a conceptual framework is proposed.

937 citations

Journal Article
01 Apr 2016-Science
TL;DR: Principles from electronic design automation (EDA) are applied to enable increased circuit complexity and to simplify the incorporation of synthetic gene regulation into genetic engineering projects, and it is demonstrated that engineering principles can be applied to identify and suppress errors that complicate the compositions of larger systems.
Abstract: INTRODUCTION: Cells respond to their environment, make decisions, build structures, and coordinate tasks. Underlying these processes are computational operations performed by networks of regulatory proteins that integrate signals and control the timing of gene expression. Harnessing this capability is critical for biotechnology projects that require decision-making, control, sensing, or spatial organization. It has been shown that cells can be programmed using synthetic genetic circuits composed of regulators organized to generate a desired operation. However, the construction of even simple circuits is time-intensive and unreliable. RATIONALE: Electronic design automation (EDA) was developed to aid engineers in the design of semiconductor-based electronics. In an effort to accelerate genetic circuit design, we applied principles from EDA to enable increased circuit complexity and to simplify the incorporation of synthetic gene regulation into genetic engineering projects. We used the hardware description language Verilog to enable a user to describe a circuit function. The user also specifies the sensors, actuators, and “user constraints file” (UCF), which defines the organism, gate technology, and valid operating conditions. Cello (www.cellocad.org) uses this information to automatically design a DNA sequence encoding the desired circuit. This is done via a set of algorithms that parse the Verilog text, create the circuit diagram, assign gates, balance constraints to build the DNA, and simulate performance. RESULTS: Cello designs circuits by drawing upon a library of Boolean logic gates. Here, the gate technology consists of NOT/NOR logic based on repressors. Gate connection is simplified by defining the input and output signals as RNA polymerase (RNAP) fluxes. We found that the gates need to be insulated from their genetic context to function reliably in the context of different circuits. Each gate is isolated using strong terminators to block RNAP leakage, and input interchangeability is improved using ribozymes and promoter spacers. These parts are varied for each gate to avoid breakage due to recombination. Measuring the load of each gate and incorporating this into the optimization algorithms further reduces evolutionary pressure. Cello was applied to the design of 60 circuits for Escherichia coli, where the circuit function was specified using Verilog code and transformed to a DNA sequence. The DNA sequences were built as specified with no additional tuning, requiring 880,000 base pairs of DNA assembly. Of these, 45 circuits performed correctly in every output state (up to 10 regulators and 55 parts). Across all circuits, 92% of the 412 output states functioned as predicted. CONCLUSION: Our work constitutes a hardware description language for programming living cells. This required the co-development of design algorithms with gates that are sufficiently simple and robust to be connected by automated algorithms. We demonstrate that engineering principles can be applied to identify and suppress errors that complicate the compositions of larger systems. This approach leads to highly repetitive and modular genetics, in stark contrast to the encoding of natural regulatory networks. The use of a hardware-independent language and the creation of additional UCFs will allow a single design to be transformed into DNA for different organisms, genetic endpoints, operating conditions, and gate technologies.

813 citations