Author

Aalt D. J. van Dijk

Bio: Aalt D. J. van Dijk is an academic researcher from Wageningen University and Research Centre. The author has contributed to research on topics including Arabidopsis and protein function prediction. The author has an h-index of 34 and has co-authored 88 publications receiving 5042 citations. Previous affiliations of Aalt D. J. van Dijk include University of Florence and Utrecht University.


Papers
Journal ArticleDOI
Predrag Radivojac, Wyatt T. Clark, Tal Ronnen Oron, Alexandra M. Schnoes, Tobias Wittkop, Artem Sokolov, Kiley Graim, Christopher S. Funk, Karin Verspoor, Asa Ben-Hur, Gaurav Pandey, Jeffrey M. Yunes, Ameet Talwalkar, Susanna Repo, Michael L Souza, Damiano Piovesan, Rita Casadio, Zheng Wang, Jianlin Cheng, Hai Fang, Julian Gough, Patrik Koskinen, Petri Törönen, Jussi Nokso-Koivisto, Liisa Holm, Domenico Cozzetto, Daniel W. A. Buchan, Kevin Bryson, David T. Jones, Bhakti Limaye, Harshal Inamdar, Avik Datta, Sunitha K Manjari, Rajendra Joshi, Meghana Chitale, Daisuke Kihara, Andreas Martin Lisewski, Serkan Erdin, Eric Venner, Olivier Lichtarge, Robert Rentzsch, Haixuan Yang, Alfonso E. Romero, Prajwal Bhat, Alberto Paccanaro, Tobias Hamp, Rebecca Kaßner, Stefan Seemayer, Esmeralda Vicedo, Christian Schaefer, Dominik Achten, Florian Auer, Ariane Boehm, Tatjana Braun, Maximilian Hecht, Mark Heron, Peter Hönigschmid, Thomas A. Hopf, Stefanie Kaufmann, Michael Kiening, Denis Krompass, Cedric Landerer, Yannick Mahlich, Manfred Roos, Jari Björne, Tapio Salakoski, Andrew Wong, Hagit Shatkay, Fanny Gatzmann, Ingolf Sommer, Mark N. Wass, Michael J.E. Sternberg, Nives Škunca, Fran Supek, Matko Bošnjak, Panče Panov, Sašo Džeroski, Tomislav Šmuc, Yiannis A. I. Kourmpetis, Aalt D. J. van Dijk, Cajo J. F. ter Braak, Yuanpeng Zhou, Qingtian Gong, Xinran Dong, Weidong Tian, Marco Falda, Paolo Fontana, Enrico Lavezzo, Barbara Di Camillo, Stefano Toppo, Liang Lan, Nemanja Djuric, Yuhong Guo, Slobodan Vucetic, Amos Marc Bairoch, Michal Linial, Patricia C. Babbitt, Steven E. Brenner, Christine A. Orengo, Burkhard Rost, Sean D. Mooney, Iddo Friedberg
TL;DR: Today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets, and there is considerable need for improvement of currently available tools.
Abstract: Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.

859 citations

Journal ArticleDOI
01 Dec 2007-Proteins
TL;DR: HADDOCK2.0 incorporates considerable improvements and new features over earlier versions of HADDOCK, including two additional ab initio docking modes based on either random patch definition or center-of-mass restraints.
Abstract: Here we present version 2.0 of HADDOCK, which incorporates considerable improvements and new features. HADDOCK is now able to model not only protein-protein complexes but also other kinds of biomolecular complexes and multi-component (N > 2) systems. In the absence of any experimental and/or predicted information to drive the docking, HADDOCK now offers two additional ab initio docking modes based on either random patch definition or center-of-mass restraints. The docking protocol has been considerably improved, supporting, among other things, solvated docking, automatic definition of semi-flexible regions, and inclusion of a desolvation energy term in the scoring scheme. The performance of HADDOCK2.0 is evaluated on the targets of rounds 4-11, run in a semi-automated mode using the original information we used in our CAPRI submissions. This enables a direct assessment of the progress made since the previous versions. Although HADDOCK performed very well in CAPRI (65% and 71% success rates, overall and for unbound targets only, respectively), a substantial improvement was achieved with HADDOCK2.0.

542 citations

Journal ArticleDOI
Yuxiang Jiang, Tal Ronnen Oron, Wyatt T. Clark, Asma R. Bankapur, and 153 more authors (59 institutions)
TL;DR: The second critical assessment of functional annotation (CAFA2), a timed challenge to assess computational methods that automatically assign protein function, evaluated 126 methods from 56 research groups; the top-performing methods outperformed those from CAFA1.
Abstract: BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific and that different performance metrics can be used to probe the nature of accurate predictions, and it highlighted the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.

330 citations

Journal ArticleDOI
TL;DR: It is reported that two members of CYP711 enzymes can catalyze two distinct steps in SL biosynthesis, identifying the first enzymes involved in B-C ring closure and a subsequent structural diversification step of SLs.
Abstract: Strigolactones (SLs) are a class of phytohormones and rhizosphere signaling compounds with high structural diversity. Three enzymes, carotenoid isomerase DWARF27 and carotenoid cleavage dioxygenases CCD7 and CCD8, were previously shown to convert all-trans-β-carotene to carlactone (CL), the SL precursor. However, how CL is metabolized to SLs has remained elusive. Here, by reconstituting the SL biosynthetic pathway in Nicotiana benthamiana, we show that a rice homolog of Arabidopsis More Axillary Growth 1 (MAX1) encodes a cytochrome P450 CYP711 subfamily member that acts as a CL oxidase to stereoselectively convert CL into ent-2'-epi-5-deoxystrigol (B-C lactone ring formation), the presumed precursor of rice SLs. A protein encoded by a second rice MAX1 homolog then catalyzes the conversion of ent-2'-epi-5-deoxystrigol to orobanchol. We therefore report that two members of CYP711 enzymes can catalyze two distinct steps in SL biosynthesis, identifying the first enzymes involved in B-C ring closure and a subsequent structural diversification step of SLs.

289 citations

Journal ArticleDOI
TL;DR: The results provide strong indications that higher-order complex formation is a general and essential molecular mechanism for plant MADS box protein functioning, and they attribute a pivotal role to the SEP3 'glue' protein in mediating multimerization.
Abstract: Plant MADS box proteins play important roles in a plethora of developmental processes. In order to regulate specific sets of target genes, MADS box proteins dimerize and are thought to assemble into multimeric complexes. In this study a large-scale yeast three-hybrid screen is utilized to provide insight into the higher-order complex formation capacity of the Arabidopsis MADS box family. SEPALLATA3 (SEP3) has been shown to mediate complex formation and, therefore, special attention is paid to this factor in this study. In total, 106 multimeric complexes were identified; in more than half of these at least one SEP protein was present. Besides the known complexes involved in determining floral organ identity, various complexes consisting of combinations of proteins known to play a role in floral organ identity specification and flowering time determination were discovered. The capacity to form this latter type of complex suggests that homeotic factors play essential roles in down-regulation of the MADS box genes involved in floral timing in the flower via negative auto-regulatory loops. Furthermore, various novel complexes were identified that may be important for the direct regulation of the floral transition process. A subsequent detailed analysis of the APETALA3, PISTILLATA, and SEP3 proteins in living plant cells suggests the formation of a multimeric complex in vivo. Overall, these results provide strong indications that higher-order complex formation is a general and essential molecular mechanism for plant MADS box protein functioning and attribute a pivotal role to the SEP3 'glue' protein in mediating multimerization.

261 citations


Cited by
Proceedings ArticleDOI
13 Aug 2016
TL;DR: node2vec learns a mapping of nodes to a low-dimensional feature space that maximizes the likelihood of preserving the network neighborhoods of nodes, using a biased random walk procedure to explore diverse neighborhoods efficiently.
Abstract: Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.

7,072 citations
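
The biased random walk mentioned in the node2vec abstract above can be illustrated with a minimal sketch. The abstract does not spell out the bias; the return parameter `p` and in-out parameter `q` used here follow the second-order walk described in the node2vec paper, and the function name `biased_walk`, the adjacency-dictionary input, and the unweighted-graph simplification are assumptions made for illustration only. This is not the authors' reference implementation, and it omits the embedding-learning step entirely.

```python
import random

def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Sample one second-order biased walk on an unweighted graph.

    adj maps each node to the set of its neighbours (hypothetical input format).
    A step back to the previous node gets weight 1/p, a step to a node that is
    also adjacent to the previous node gets weight 1, and a step further away
    gets weight 1/q.
    """
    rng = rng or random.Random(0)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted for reproducible sampling order
        if not nbrs:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)   # return to where we came from
            elif nxt in adj[prev]:
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move outward
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk
```

With a small `q` the walk tends to move outward (closer to depth-first exploration), while a large `q` keeps it near the starting region (closer to breadth-first exploration); a large `p` discourages immediate backtracking.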

Journal ArticleDOI
TL;DR: NCBI's new Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, and more on statistical predictions in the absence of external evidence.
Abstract: Recent technological advances have opened unprecedented opportunities for large-scale sequencing and analysis of populations of pathogenic species in disease outbreaks, as well as for large-scale diversity studies aimed at expanding our knowledge across the whole domain of prokaryotes. To meet the challenge of timely interpretation of structure, function and meaning of this vast genetic information, a comprehensive approach to automatic genome annotation is critically needed. In collaboration with Georgia Tech, NCBI has developed a new approach to genome annotation that combines alignment based methods with methods of predicting protein-coding and RNA genes and other functional elements directly from sequence. A new gene finding tool, GeneMarkS+, uses the combined evidence of protein and RNA placement by homology as an initial map of annotation to generate and modify ab initio gene predictions across the whole genome. Thus, the new NCBI's Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, while it relies more on statistical predictions in the absence of external evidence. The pipeline provides a framework for generation and analysis of annotation on the full breadth of prokaryotic taxonomy. For additional information on PGAP see https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ and the NCBI Handbook, https://www.ncbi.nlm.nih.gov/books/NBK174280/.

3,902 citations

01 Jan 2011
TL;DR: The sheer volume and scope of diverse genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations

Posted Content
TL;DR: In node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks, a flexible notion of a node's network neighborhood is defined and a biased random walk procedure is designed, which efficiently explores diverse neighborhoods.
Abstract: Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.

2,174 citations

Journal ArticleDOI
TL;DR: Important new components of jasmonate signalling, including its receptor, were identified, providing deeper insight into the role of jasmonate signalling pathways in stress responses and development.

1,868 citations