Author
Enrico Lavezzo
Other affiliations: Edmund Mach Foundation
Bio: Enrico Lavezzo is an academic researcher from University of Padua. The author has contributed to research in topics: Population & Medicine. The author has an hindex of 32, co-authored 72 publications receiving 6283 citations. Previous affiliations of Enrico Lavezzo include Edmund Mach Foundation.
Topics: Population, Medicine, Genome, Protein function prediction, Biology
Papers
More filters
••
TL;DR: It is shown that a relatively recent (>50 million years ago) genome-wide duplication has resulted in the transition from nine ancestral chromosomes to 17 chromosomes in the Pyreae, which partly support the monophyly of the ancestral paleohexaploidy of eudicots.
Abstract: We report a high-quality draft genome sequence of the domesticated apple (Malus × domestica). We show that a relatively recent (>50 million years ago) genome-wide duplication (GWD) has resulted in the transition from nine ancestral chromosomes to 17 chromosomes in the Pyreae. Traces of older GWDs partly support the monophyly of the ancestral paleohexaploidy of eudicots. Phylogenetic reconstruction of Pyreae and the genus Malus, relative to major Rosaceae taxa, identified the progenitor of the cultivated apple as M. sieversii. Expansion of gene families reported to be involved in fruit development may explain formation of the pome, a Pyreae-specific false fruit that develops by proliferation of the basal part of the sepals, the receptacle. In apple, a subclade of MADS-box genes, normally involved in flower and fruit development, is expanded to include 15 members, as are other gene families involved in Rosaceae-specific metabolism, such as transport and assimilation of sorbitol.
1,718 citations
••
TL;DR: Light is shed on the frequency of asymptomatic SARS-CoV-2 infection, their infectivity (as measured by the viral load) and the insights into its transmission dynamics and the efficacy of the implemented control measures are provided.
Abstract: On 21 February 2020, a resident of the municipality of Vo', a small town near Padua (Italy), died of pneumonia due to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection1. This was the first coronavirus disease 19 (COVID-19)-related death detected in Italy since the detection of SARS-CoV-2 in the Chinese city of Wuhan, Hubei province2. In response, the regional authorities imposed the lockdown of the whole municipality for 14 days3. Here we collected information on the demography, clinical presentation, hospitalization, contact network and the presence of SARS-CoV-2 infection in nasopharyngeal swabs for 85.9% and 71.5% of the population of Vo' at two consecutive time points. From the first survey, which was conducted around the time the town lockdown started, we found a prevalence of infection of 2.6% (95% confidence interval (CI): 2.1-3.3%). From the second survey, which was conducted at the end of the lockdown, we found a prevalence of 1.2% (95% CI: 0.8-1.8%). Notably, 42.5% (95% CI: 31.5-54.6%) of the confirmed SARS-CoV-2 infections detected across the two surveys were asymptomatic (that is, did not have symptoms at the time of swab testing and did not develop symptoms afterwards). The mean serial interval was 7.2 days (95% CI: 5.9-9.6). We found no statistically significant difference in the viral load of symptomatic versus asymptomatic infections (P = 0.62 and 0.74 for E and RdRp genes, respectively, exact Wilcoxon-Mann-Whitney test). This study sheds light on the frequency of asymptomatic SARS-CoV-2 infection, their infectivity (as measured by the viral load) and provides insights into its transmission dynamics and the efficacy of the implemented control measures.
882 citations
••
Indiana University1, Buck Institute for Research on Aging2, University of California, San Francisco3, University of California, Santa Cruz4, Colorado State University5, University of Colorado Denver6, Icahn School of Medicine at Mount Sinai7, University of California, Berkeley8, European Bioinformatics Institute9, University of Bologna10, University of Missouri11, University of Bristol12, University of Helsinki13, University College London14, Centre for Development of Advanced Computing15, Purdue University16, Baylor College of Medicine17, Royal Holloway, University of London18, Technische Universität München19, University of Turku20, Queen's University21, University UCINF22, Max Planck Society23, Imperial College London24, Wageningen University and Research Centre25, Nestlé26, Fudan University27, University of Padua28, Temple University29, University of Geneva30, Swiss Institute of Bioinformatics31, Hebrew University of Jerusalem32, Miami University33
TL;DR: Today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets, and there is considerable need for improvement of currently available tools.
Abstract: Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.
859 citations
••
TL;DR: The second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function, was conducted by as mentioned in this paper. But the results of the CAFA2 assessment are limited.
Abstract: BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.
330 citations
••
TL;DR: New light is shed on the frequency of asymptomatic SARS-CoV-2 infection and their infectivity (as measured by the viral load) and new insights are provided into its transmission dynamics, the duration of viral load detectability and the efficacy of the implemented control measures.
Abstract: On the 21st of February 2020 a resident of the municipality of Vo, a small town near Padua, died of pneumonia due to SARS-CoV-2 infection. This was the first COVID-19 death detected in Italy since the emergence of SARS-CoV-2 in the Chinese city of Wuhan, Hubei province. In response, the regional authorities imposed the lockdown of the whole municipality for 14 days. We collected information on the demography, clinical presentation, hospitalization, contact network and presence of SARS-CoV-2 infection in nasopharyngeal swabs for 85.9% and 71.5% of the population of Vo at two consecutive time points. On the first survey, which was conducted around the time the town lockdown started, we found a prevalence of infection of 2.6% (95% confidence interval (CI) 2.1-3.3%). On the second survey, which was conducted at the end of the lockdown, we found a prevalence of 1.2% (95% CI 0.8-1.8%). Notably, 43.2% (95% CI 32.2-54.7%) of the confirmed SARS-CoV-2 infections detected across the two surveys were asymptomatic. The mean serial interval was 6.9 days (95% CI 2.6-13.4). We found no statistically significant difference in the viral load (as measured by genome equivalents inferred from cycle threshold data) of symptomatic versus asymptomatic infections (p-values 0.6 and 0.2 for E and RdRp genes, respectively, Exact Wilcoxon-Mann-Whitney test). Contact tracing of the newly infected cases and transmission chain reconstruction revealed that most new infections in the second survey were infected in the community before the lockdown or from asymptomatic infections living in the same household. This study sheds new light on the frequency of asymptomatic SARS-CoV-2 infection and their infectivity (as measured by the viral load) and provides new insights into its transmission dynamics, the duration of viral load detectability and the efficacy of the implemented control measures.
319 citations
Cited by
More filters
01 Jun 2012
TL;DR: SPAdes as mentioned in this paper is a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler and on popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.
10,124 citations
••
13 Aug 2016TL;DR: Node2vec as mentioned in this paper learns a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes by using a biased random walk procedure.
Abstract: Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.
7,072 citations
01 Aug 2000
TL;DR: Assessment of medical technology in the context of commercialization with Bioentrepreneur course, which addresses many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.
4,833 citations
••
TL;DR: The new NCBI's Prokaryotic Genome Annotation Pipeline (PGAP) relies less on sequence similarity when confident comparative data are available, while it relies more on statistical predictions in the absence of external evidence.
Abstract: Recent technological advances have opened unprecedented opportunities for large-scale sequencing and analysis of populations of pathogenic species in disease outbreaks, as well as for large-scale diversity studies aimed at expanding our knowledge across the whole domain of prokaryotes. To meet the challenge of timely interpretation of structure, function and meaning of this vast genetic information, a comprehensive approach to automatic genome annotation is critically needed. In collaboration with Georgia Tech, NCBI has developed a new approach to genome annotation that combines alignment based methods with methods of predicting protein-coding and RNA genes and other functional elements directly from sequence. A new gene finding tool, GeneMarkS+, uses the combined evidence of protein and RNA placement by homology as an initial map of annotation to generate and modify ab initio gene predictions across the whole genome. Thus, the new NCBI's Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, while it relies more on statistical predictions in the absence of external evidence. The pipeline provides a framework for generation and analysis of annotation on the full breadth of prokaryotic taxonomy. For additional information on PGAP see https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ and the NCBI Handbook, https://www.ncbi.nlm.nih.gov/books/NBK174280/.
3,902 citations