Author

Cátia Vaz

Instituto Superior de Engenharia de Lisboa

Other affiliations: Instituto Politécnico Nacional, INESC-ID, Instituto Superior Técnico

Bio: Cátia Vaz is an academic researcher at Instituto Superior de Engenharia de Lisboa whose research topics include web services and process calculus. The author has an h-index of 9 and has co-authored 21 publications that have received 1,080 citations. Previous affiliations of Cátia Vaz include Instituto Politécnico Nacional and INESC-ID.

Topics: Web service, Process calculus, Hamming distance, Population, Correctness

Papers

Journal Article•DOI•

PHYLOViZ: phylogenetic inference and data visualization for sequence based typing methods

Alexandre P. Francisco¹,², Cátia Vaz¹, Pedro T. Monteiro²,³, José Melo-Cristino², Mário Ramirez², João A. Carriço¹,²

INESC-ID¹, University of Lisbon², Instituto Gulbenkian de Ciência³

08 May 2012-BMC Bioinformatics

TL;DR: PHYLOViZ is platform independent Java software that allows the integrated analysis of sequence-based typing methods, including SNP data generated from whole genome sequence approaches, and associated epidemiological data.

Abstract: With the decrease of DNA sequencing costs, sequence-based typing methods are rapidly becoming the gold standard for epidemiological surveillance. These methods provide reproducible and comparable results needed for a global scale bacterial population analysis, while retaining their usefulness for local epidemiological surveys. Online databases that collect the generated allelic profiles and associated epidemiological data are available, but this wealth of data remains underused and is frequently poorly annotated since no user-friendly tool exists to analyze and explore it. PHYLOViZ is platform independent Java software that allows the integrated analysis of sequence-based typing methods, including SNP data generated from whole genome sequence approaches, and associated epidemiological data. goeBURST and its Minimum Spanning Tree expansion are used for visualizing the possible evolutionary relationships between isolates. The results can be displayed as an annotated graph overlaying the query results of any other epidemiological data available. PHYLOViZ is user-friendly software that allows the combined analysis of multiple data sources for microbial epidemiological and population studies. It is freely available at http://www.phyloviz.net.

452 citations

Journal Article•DOI•

GrapeTree : visualization of core genomic relationships among 100,000 bacterial pathogens

Zhemin Zhou¹, Nabil-Fareed Alikhan¹, Martin J. Sergeant¹, Nina Luhmann¹, Cátia Vaz², Alexandre P. Francisco²,³, João A. Carriço⁴, Mark Achtman¹

University of Warwick¹, INESC-ID², Instituto Superior Técnico³, Instituto de Medicina Molecular⁴

26 Jul 2018-Genome Research

TL;DR: GrapeTree is a stand-alone package for investigating phylogenetic trees plus associated metadata and is also integrated into EnteroBase to facilitate cutting edge navigation of genomic relationships among bacterial pathogens.

Abstract: Current methods struggle to reconstruct and visualize the genomic relationships of large numbers of bacterial genomes. GrapeTree facilitates the analyses of large numbers of allelic profiles by a static "GrapeTree Layout" algorithm that supports interactive visualizations of large trees within a web browser window. GrapeTree also implements a novel minimum spanning tree algorithm (MSTree V2) to reconstruct genetic relationships despite high levels of missing data. GrapeTree is a stand-alone package for investigating phylogenetic trees plus associated metadata and is also integrated into EnteroBase to facilitate cutting edge navigation of genomic relationships among bacterial pathogens.

448 citations

Journal Article•DOI•

PHYLOViZ 2.0: providing scalable data integration and visualization for multiple phylogenetic inference methods

Marta Nascimento¹,², Adriano Sousa³, Mário Ramirez⁴, Alexandre P. Francisco¹,², João A. Carriço⁴, Cátia Vaz²,³

Instituto Superior Técnico¹, INESC-ID², Instituto Superior de Engenharia de Lisboa³, Instituto de Medicina Molecular⁴

01 Jan 2017-Bioinformatics

TL;DR: PHYLOViZ 2.0 is presented, an extension of the PHYLOViZ tool, a platform independent Java tool that allows phylogenetic inference and data visualization for large datasets of sequence based typing methods, including Single Nucleotide Polymorphism (SNP) and whole genome/core genome Multilocus Sequence Typing (wg/cgMLST) analysis.

Abstract: Summary: High Throughput Sequencing provides a cost effective means of generating high resolution data for hundreds or even thousands of strains, and is rapidly superseding methodologies based on a few genomic loci. The wealth of genomic data deposited on public databases such as Sequence Read Archive/European Nucleotide Archive provides a powerful resource for evolutionary analysis and epidemiological surveillance. However, many of the analysis tools currently available do not scale well to these large datasets, nor provide the means to fully integrate ancillary data. Here we present PHYLOViZ 2.0, an extension of PHYLOViZ tool, a platform independent Java tool that allows phylogenetic inference and data visualization for large datasets of sequence based typing methods, including Single Nucleotide Polymorphism (SNP) and whole genome/core genome Multilocus Sequence Typing (wg/cgMLST) analysis. PHYLOViZ 2.0 incorporates new data analysis algorithms and new visualization modules, as well as the capability of saving projects for subsequent work or for dissemination of results. Availability and Implementation: http://www.phyloviz.net/ (licensed under GPLv3). Contact: cvaz@inesc-id.pt Supplementary information: Supplementary data are available at Bioinformatics online.

257 citations

Posted Content•DOI•

GrapeTree: Visualization of core genomic relationships among 100,000 bacterial pathogens

University of Warwick¹, INESC-ID², Instituto Superior Técnico³, University of Lisbon⁴

09 Nov 2017-bioRxiv

TL;DR: GrapeTree implements a novel minimum spanning tree algorithm to reconstruct genetic relationships despite missing data together with a static “GrapeTree Layout” algorithm to render interactive visualisations of large trees.

Abstract: Current methods struggle to reconstruct and visualise the genomic relationships of ≥100,000 bacterial genomes. GrapeTree facilitates the analyses of allelic profiles from 10,000s of core genomes within a web browser window. GrapeTree implements a novel minimum spanning tree algorithm to reconstruct genetic relationships despite missing data together with a static “GrapeTree Layout” algorithm to render interactive visualisations of large trees. GrapeTree is a stand-alone package for investigating Newick trees plus associated metadata and is also integrated into EnteroBase to facilitate cutting edge navigation of genomic relationships among >160,000 genomes from bacterial pathogens. The GrapeTree package was released under the GPL v3.0 Licence.

170 citations

Journal Article•DOI•

PHYLOViZ Online: web-based tool for visualization, phylogenetic inference, analysis and sharing of minimum spanning trees.

Bruno Ribeiro-Gonçalves¹, Alexandre P. Francisco², Cátia Vaz², Mário Ramirez¹, João A. Carriço¹

Instituto de Medicina Molecular¹, Instituto Superior Técnico²

29 Apr 2016-Nucleic Acids Research

TL;DR: PHYLOViZ Online offers a RESTful API for programmatic access to data and algorithms, allowing it to be seamlessly integrated into any third party web service or software.

Abstract: High-throughput sequencing methods generated allele and single nucleotide polymorphism information for thousands of bacterial strains that are publicly available in online repositories and created the possibility of generating similar information for hundreds to thousands of strains more in a single study. Minimum spanning tree analysis of allelic data offers a scalable and reproducible methodological alternative to traditional phylogenetic inference approaches, useful in epidemiological investigations and population studies of bacterial pathogens. PHYLOViZ Online was developed to allow users to do these analyses without software installation and to enable easy accessing and sharing of data and analyses results from any Internet enabled computer. PHYLOViZ Online also offers a RESTful API for programmatic access to data and algorithms, allowing it to be seamlessly integrated into any third party web service or software. PHYLOViZ Online is freely available at https://online.phyloviz.net.

113 citations
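
The minimum spanning tree analysis of allelic data mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that each isolate is described by an allelic profile (one allele number per locus) and that the distance between two profiles is the Hamming distance, i.e. the number of loci at which they differ. The function names and the four example profiles are hypothetical. This is plain Prim's algorithm, not the goeBURST tie-breaking rules used by PHYLOViZ and not GrapeTree's MSTree V2.

```python
def hamming(a, b):
    """Number of loci at which two allelic profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of loci")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles):
    """Prim's algorithm on the complete graph of allelic profiles.

    `profiles` maps an isolate name to a tuple of allele numbers.
    Returns a list of (parent, child, distance) edges.
    """
    names = list(profiles)
    if not names:
        return []
    root = names[0]
    # best[n] = (distance, tree node): cheapest known link from n to the tree
    best = {n: (hamming(profiles[root], profiles[n]), root) for n in names[1:]}
    edges = []
    while best:
        # Pick the closest isolate not yet in the tree; break ties by name.
        child = min(best, key=lambda n: (best[n][0], n))
        dist, parent = best.pop(child)
        edges.append((parent, child, dist))
        # The new tree node may offer a shorter link to the remaining isolates.
        for n in best:
            d = hamming(profiles[child], profiles[n])
            if d < best[n][0]:
                best[n] = (d, child)
    return edges


# Hypothetical seven-locus profiles, for illustration only.
profiles = {
    "ST1": (1, 1, 1, 1, 1, 1, 1),
    "ST2": (1, 1, 1, 1, 1, 1, 2),
    "ST3": (1, 1, 1, 1, 1, 3, 2),
    "ST4": (4, 4, 1, 1, 1, 1, 1),
}
tree = minimum_spanning_tree(profiles)
for parent, child, dist in tree:
    print(f"{parent} -> {child} (differs at {dist} loci)")
```

Ties between equally distant profiles are broken here by name only, which is a simplification; real typing tools apply biologically motivated tie-breaking rules.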

Cited by

SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing (7th Annual SFAF Meeting, 2012)

Glenn Tesler

01 Jun 2012

TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).

Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

10,124 citations

Journal Article•DOI•

Evolution of Protein Molecules

S. Jeffery

01 Apr 1979-Biochemical Society Transactions

3,734 citations

Journal Article•

FastTree: Computing Large Minimum-Evolution Trees with Profiles instead of a Distance Matrix

Morgan N. Price, Paramvir S. Dehal, Adam P. Arkin

18 Jun 2009-Lawrence Berkeley National Laboratory

TL;DR: FastTree uses sequence profiles of internal nodes in the tree to implement neighbor-joining, uses heuristics to quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.

Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and an alphabet of a distinct characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O(N L a + N sqrt(N)) memory and O(N sqrt(N) log(N) L a) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

2,436 citations
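
The O(N^2) space cost of a distance matrix quoted in the FastTree abstract above can be made concrete with a short calculation. The sketch below assumes a full N x N matrix stored at 2 bytes per entry. That storage width is an assumption chosen for illustration, not something the abstract states, although it happens to land close to the 50 gigabytes quoted for 158,022 sequences. The function name is hypothetical.

```python
def distance_matrix_bytes(n_sequences, bytes_per_entry=2):
    """Bytes needed for a full N x N pairwise distance matrix.

    bytes_per_entry=2 is an assumed storage width, not taken from the paper.
    """
    return n_sequences * n_sequences * bytes_per_entry


n = 158_022  # number of distinct 16S rRNA sequences quoted in the abstract
total = distance_matrix_bytes(n)
print(f"{total:,} bytes, about {total / 1e9:.1f} GB")
# prints "49,941,904,968 bytes, about 49.9 GB"
```

Doubling N quadruples this figure, which is the practical reason the abstract gives for storing profiles of internal nodes instead of the matrix.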

Journal Article•DOI•

Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications.

Keith A. Jolley¹, James E. Bray¹, Martin C. J. Maiden¹

University of Oxford¹

24 Sep 2018

TL;DR: Developments in the BIGSdb software made from publication to June 2018 are described and it is shown how the platform realises microbial population genomics for a wide range of applications.

Abstract: The PubMLST.org website hosts a collection of open-access, curated databases that integrate population sequence data with provenance and phenotype information for over 100 different microbial species and genera. Although the PubMLST website was conceived as part of the development of the first multi-locus sequence typing (MLST) scheme in 1998 the software it uses, the Bacterial Isolate Genome Sequence database (BIGSdb, published in 2010), enables PubMLST to include all levels of sequence data, from single gene sequences up to and including complete, finished genomes. Here we describe developments in the BIGSdb software made from publication to June 2018 and show how the platform realises microbial population genomics for a wide range of applications. The system is based on the gene-by-gene analysis of microbial genomes, with each deposited sequence annotated and curated to identify the genes present and systematically catalogue their variation. Originally intended as a means of characterising isolates with typing schemes, the synthesis of sequences and records of genetic variation with provenance and phenotype data permits highly scalable (whole genome sequence data for tens of thousands of isolates) means of addressing a wide range of functional questions, including: the prediction of antimicrobial resistance; likely cross-reactivity with vaccine antigens; and the functional activities of different variants that lead to key phenotypes. There are no limitations to the number of sequences, genetic loci, allelic variants or schemes (combinations of loci) that can be included, enabling each database to represent an expanding catalogue of the genetic variation of the population in question. 
In addition to providing web-accessible analyses and links to third-party analysis and visualisation tools, the BIGSdb software includes a RESTful application programming interface (API) that enables access to all the underlying data for third-party applications and data analysis pipelines.

1,349 citations

Journal Article•DOI•

The π-Calculus: A theory of mobile processes

Raheel Ahmad

01 Jan 2008-Scalable Computing: Practice and Experience

TL;DR: The calculus' contribution to analyzing mobile processes is a major topic, and it is dealt with extensively starting from part three, and how π-calculus can be employed in studying practical, modern software engineering concepts such as object-oriented programming is shown.

Abstract: The π-Calculus: A theory of mobile processes, by Davide Sangiorgi and David Walker. Formal methods have formed the foundation of Computer Science since its inception. Although these formal methods initially dealt with processes and systems on an individual basis, the paradigm has shifted with the dawn of the age of computer networks. When dealing with systems with interconnected, communicating, dependent, cooperative, and competitive components, the older outlook of analyzing and developing singular systems, and the tools that went with it, were hardly suitable. This led to the development of theories and tools that would support the new paradigm. On the tools end, the development has been widespread and satisfactory: programming languages, development frameworks, databases, and even end-user software products such as word processors have gained network-awareness. However, on the theoretical end, the development has been far less satisfactory. The major work was done by Robin Milner, Joachim Parrow, and David Walker, who developed the formalism known as π-calculus in 1989. π-calculus is a process calculus that treats communication between its components as the basic form of computation. It has been quite successful as a foundation of several other calculi in the field and, as Milner puts it, it has become common to express ideas about interactions and mobility in variants of the calculus.

Introduction: The current book serves as a comprehensive reference to π-calculus. Besides Milner's own book on the subject, this is the only other book-length publication on the topic. In many ways, it is much more comprehensive than Milner's: a much wider range of topics is dealt with, and in more detail as well.

Contents: The book is split into seven parts. The first part presents the basic theory of π-calculus. However, basic does not mean concise: every topic is discussed in great detail. The section on bisimulation is particularly intensive and provides several insights about the motivation for the theory. Part two discusses several variants of the original calculus. These variants are obtained by varying several characteristics of the calculus, such as whether a process can communicate with more than one process at a time. A number of interesting properties of the language are studied by the authors when discussing these variants. As can be understood from the title, the calculus' contribution to analyzing mobile processes is a major topic, and it is dealt with extensively starting from part three. The basics are introduced by way of a sophisticated typing system whose application in specifying complex processes is presented in part four. Part five looks at higher-order π-calculus, in which composed systems are considered first-class citizens. Part six is one of the best in the book and discusses the relation between π-calculus and lambda-calculus, which is an older and more basic calculus. Finally, part seven shows how π-calculus can be employed in studying practical, modern software engineering concepts such as object-oriented programming.

Impressions: One of my disappointments with this book is how often the reader is left perplexed by some of the theoretical developments, especially in part three. π-calculus is a complicated topic, even for the graduate student audience to which this book is directed, and the authors would have done much better by reducing the number of topics and instead focusing on more lucid and detailed explanations. There are several experimental digressions throughout the book which, although interesting, take away from some of the momentum of sequential study. For example, topics such as the comparison and encoding of one language into another could easily be moved to a separate section in order to make the content more suitable for self-study. Another issue is how little effort is made to connect the theoretical to the practical. The main reason why formal methods have not been adopted in mainstream software development practices is that it is often unclear to developers how formalisms can contribute to the software engineering process. The book would have served its purpose much better if it had spent part of each chapter discussing the practical application of that chapter's content. For example, congruence checking and bisimulation can be incredibly exciting topics for programmers to learn if they can see practical applications of such powerful techniques. Beyond the above criticism, the book is absolutely indispensable to students and researchers in the field of formal methods. Currently it serves as the primary reference for anyone who wishes to learn the various aspects of π-calculus in detail. Raheel Ahmad

484 citations
