Mareike Fischer

Other affiliations: University of Veterinary Medicine Vienna, University of Canterbury

Bio: Mareike Fischer is an academic researcher at the University of Greifswald. The author has contributed to research on the topics Phylogenetic tree and Maximum parsimony. The author has an h-index of 11 and has co-authored 88 publications receiving 404 citations. Previous affiliations of Mareike Fischer include the University of Veterinary Medicine Vienna and the University of Canterbury.

Papers published on a yearly basis (chart, 2007 to 2023)

Papers

Journal Article•DOI•

On the Maximum Parsimony Distance Between Phylogenetic Trees

Mareike Fischer¹, Steven Kelk²•Institutions (2)

University of Greifswald¹, Maastricht University²

01 Mar 2016-Annals of Combinatorics

TL;DR: This article shows that the Maximum Parsimony (MP) distance is a metric and provides a lower bound to the well-known Subtree Prune and Regraft (SPR) distance, shows that to compute the MP distance it is sufficient to consider only characters that are convex on one of the trees, and proves several additional structural properties of the distance.

Abstract: Within the field of phylogenetics there is great interest in distance measures to quantify the dissimilarity of two trees. Here, based on an idea of Bruen and Bryant, we propose and analyze a new distance measure: the Maximum Parsimony (MP) distance. This is based on the difference of the parsimony scores of a single character on both trees under consideration, and the goal is to find the character which maximizes this difference. In this article we show that this new distance is a metric and provides a lower bound to the well-known Subtree Prune and Regraft (SPR) distance. We also show that to compute the MP distance it is sufficient to consider only characters that are convex on one of the trees, and prove several additional structural properties of the distance. On the complexity side, we prove that calculating the MP distance is in general NP-hard, and identify an interesting island of tractability in which the distance can be calculated in polynomial time.
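The definition in the abstract can be illustrated with a small brute-force sketch. It is not the authors' method. It assumes rooted binary trees written as nested pairs of leaf names, uses Fitch's algorithm for the parsimony score, and takes the absolute difference of the two scores so that the result is symmetric. The names `fitch_score`, `leaves` and `mp_distance` are hypothetical, and enumerating every character is exponential, so this only works for a handful of taxa.

```python
from itertools import product


def fitch_score(tree, character):
    """Parsimony score of `character` (dict: leaf -> state) on a rooted
    binary tree given as a leaf name (str) or a pair (left, right)."""
    def visit(node):
        if isinstance(node, str):
            return {character[node]}, 0
        left_states, left_cost = visit(node[0])
        right_states, right_cost = visit(node[1])
        common = left_states & right_states
        if common:
            # Children agree on at least one state: no extra change needed.
            return common, left_cost + right_cost
        # Children disagree: one change is charged at this node.
        return left_states | right_states, left_cost + right_cost + 1
    return visit(tree)[1]


def leaves(tree):
    """List the leaf names of a nested-pair tree."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def mp_distance(tree1, tree2):
    """Brute-force MP distance: maximise |score on tree1 - score on tree2|
    over all characters. A character on n taxa never needs more than n
    states, so n states are enumerated (n**n characters in total)."""
    taxa = sorted(leaves(tree1))
    if taxa != sorted(leaves(tree2)):
        raise ValueError("trees must have the same leaf set")
    best = 0
    for states in product(range(len(taxa)), repeat=len(taxa)):
        character = dict(zip(taxa, states))
        diff = abs(fitch_score(tree1, character) - fitch_score(tree2, character))
        best = max(best, diff)
    return best


t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
print(mp_distance(t1, t2))
```

For these two four-taxon trees the character with a = b = 0 and c = d = 1 needs one change on the first tree and two on the second, which is the kind of witness character the search looks for.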

31 citations

Journal Article•DOI•

Sequence length bounds for resolving a deep phylogenetic divergence.

Mareike Fischer¹, Mike Steel¹•Institutions (1)

University of Canterbury¹

21 Jan 2009-Journal of Theoretical Biology

TL;DR: In this paper, an idealised form of the problem was analyzed, where the terminal edges of a symmetric four-taxon tree are some factor (λ) times the length of the interior edge.

31 citations

Journal Article•DOI•

On Computing the Maximum Parsimony Score of a Phylogenetic Network

Mareike Fischer¹, Leo van Iersel², Steven Kelk, Celine Scornavacca•Institutions (2)

University of Greifswald¹, Delft University of Technology²

24 Mar 2015-SIAM Journal on Discrete Mathematics

TL;DR: Two different definitions of maximum parsimony on networks, “hardwired” and “softwired,” are discussed and the complexity of computing them given a network topology and a character is examined, showing that both the hardwired and the softwired parsimony scores can be computed efficiently using integer linear programming.

Abstract: Phylogenetic networks are used to display the relationship among different species whose evolution is not treelike, which is the case, for instance, in the presence of hybridization events or horizontal gene transfers. Tree inference methods such as maximum parsimony need to be modified in order to be applicable to networks. In this paper, we discuss two different definitions of maximum parsimony on networks, “hardwired” and “softwired,” and examine the complexity of computing them given a network topology and a character. By exploiting a link with the problem Multiterminal Cut, we show that computing the hardwired parsimony score for 2-state characters is polynomial-time solvable, while for characters with more states this problem becomes NP-hard but is still approximable and fixed parameter tractable in the parsimony score. On the other hand we show that, for the softwired definition, obtaining even weak approximation guarantees is already difficult for binary characters and restricted network topologies, and fixed-parameter tractable algorithms in the parsimony score are unlikely. On the positive side we show that computing the softwired parsimony score is fixed-parameter tractable in the level of the network, a natural parameter describing how tangled reticulate activity is in the network. Finally, we show that both the hardwired and the softwired parsimony scores can be computed efficiently using integer linear programming. The software has been made freely available.
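As a rough illustration of the hardwired score, the sketch below brute-forces a tiny network. It assumes the usual hardwired definition, which the abstract does not spell out: internal nodes are labelled so as to minimise the number of network edges whose endpoints carry different states. The function name `hardwired_score`, the edge-list encoding and the example network are hypothetical. This is neither the authors' ILP nor the Multiterminal Cut reduction.

```python
from itertools import product


def hardwired_score(edges, leaf_states):
    """Minimum number of edges with differing endpoint states, over all
    labelings of the internal nodes (assumed hardwired definition).

    edges       -- list of (parent, child) node-name pairs
    leaf_states -- dict mapping each leaf name to its character state
    """
    nodes = {v for edge in edges for v in edge}
    internal = sorted(nodes - set(leaf_states))
    # States that do not occur on any leaf cannot lower the score,
    # so only the observed states are tried on internal nodes.
    states = sorted(set(leaf_states.values()))
    best = None
    for assignment in product(states, repeat=len(internal)):
        labeling = {**leaf_states, **dict(zip(internal, assignment))}
        changes = sum(labeling[u] != labeling[v] for u, v in edges)
        if best is None or changes < best:
            best = changes
    return best


# Hypothetical network: h is a reticulation node with parents u and v.
network = [
    ("r", "u"), ("r", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "d"),
    ("h", "b"),
]
print(hardwired_score(network, {"a": 0, "b": 1, "d": 0}))
```

The search is exponential in the number of internal nodes, so it is only a way to check small cases by hand.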

28 citations

Posted Content•

Sequence length bounds for resolving a deep phylogenetic divergence

Mareike Fischer¹, Mike Steel¹•Institutions (1)

University of Canterbury¹

16 Jun 2008-arXiv: Populations and Evolution

TL;DR: Analyses an idealised form of the problem in which the terminal edges of a symmetric four-taxon tree are some factor (λ) times the length of the interior edge, and determines an order λ² lower bound on the growth rate of the sequence length required to resolve the tree.

Abstract: In evolutionary biology, genetic sequences carry with them a trace of the underlying tree that describes their evolution from a common ancestral sequence. The question of how many sequence sites are required to recover this evolutionary relationship accurately depends on the model of sequence evolution, the substitution rate, divergence times and the method used to infer phylogenetic history. A particularly challenging problem for phylogenetic methods arises when a rapid divergence event occurred in the distant past. We analyse an idealised form of this problem in which the terminal edges of a symmetric four-taxon tree are some factor ($p$) times the length of the interior edge. We determine an order $p^2$ lower bound on the growth rate for the sequence length required to resolve the tree (independent of any particular branch length). We also show that this rate of sequence length growth can be achieved by existing methods (including the simple 'maximum parsimony' method), and compare these order $p^2$ bounds with an order $p$ growth rate for a model that describes low-homoplasy evolution. In the final section, we provide a generic bound on the sequence length requirement for a more general class of Markov processes.

26 citations

Journal Article•DOI•

On the Complexity of Computing MP Distance Between Binary Phylogenetic Trees

Steven Kelk¹, Mareike Fischer²•Institutions (2)

Maastricht University¹, University of Greifswald²

01 Dec 2017-Annals of Combinatorics

TL;DR: In this paper, it was shown that computing the MP distance on two binary phylogenetic trees is NP-hard even if only two states are available, and a simple Integer Linear Program (ILP) formulation was given for small trees and for larger trees when only a small number of character states were available.

Abstract: Within the field of phylogenetics there is great interest in distance measures to quantify the dissimilarity of two trees. Recently, a new distance measure has been proposed: the Maximum Parsimony (MP) distance. This is based on the difference of the parsimony scores of a single character on both trees under consideration, and the goal is to find the character which maximizes this difference. Here we show that computation of MP distance on two binary phylogenetic trees is NP-hard. This is a highly nontrivial extension of an earlier NP-hardness proof for two multifurcating phylogenetic trees, and it is particularly relevant given the prominence of binary trees in the phylogenetics literature. As a corollary to the main hardness result we show that computation of MP distance is also hard on binary trees if the number of states available is bounded. In fact, via a different reduction we show that it is hard even if only two states are available. Finally, as a first response to this hardness we give a simple Integer Linear Program (ILP) formulation which is capable of computing the MP distance exactly for small trees (and for larger trees when only a small number of character states are available) and which is used to computationally verify several auxiliary results required by the hardness proofs.

21 citations

Cited by

Journal Article•DOI•

Evolution of Protein Molecules

S. Jeffery

01 Apr 1979-Biochemical Society Transactions

3,734 citations

Journal Article•

Evolution of protein molecules. I. Protein synthesis

Gajdos A

26 Feb 1972-La Nouvelle presse médicale

331 citations

Journal Article•DOI•

To Include or Not to Include: The Impact of Gene Filtering on Species Tree Estimation Methods.

Erin K. Molloy¹, Tandy Warnow¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Mar 2018-Systematic Biology

TL;DR: This work examines how incomplete lineage sorting, phylogenetic signal of individual loci, and missing data affect the absolute and the relative accuracy of species tree estimation methods and shows how these properties affect methods' responses to gene filtering strategies.

Abstract: With the increasing availability of whole genome data, many species trees are being constructed from hundreds to thousands of loci. Although concatenation analysis using maximum likelihood is a standard approach for estimating species trees, it does not account for gene tree heterogeneity, which can occur due to many biological processes, such as incomplete lineage sorting. Coalescent species tree estimation methods, many of which are statistically consistent in the presence of incomplete lineage sorting, include Bayesian methods that coestimate the gene trees and the species tree, summary methods that compute the species tree by combining estimated gene trees, and site-based methods that infer the species tree from site patterns in the alignments of different loci. Due to concerns that poor quality loci will reduce the accuracy of estimated species trees, many recent phylogenomic studies have removed or filtered genes on the basis of phylogenetic signal and/or missing data prior to inferring species trees; little is known about the performance of species tree estimation methods when gene filtering is performed. We examine how incomplete lineage sorting, phylogenetic signal of individual loci, and missing data affect the absolute and the relative accuracy of species tree estimation methods and show how these properties affect methods' responses to gene filtering strategies. In particular, summary methods (ASTRAL-II, ASTRID, and MP-EST), a site-based coalescent method (SVDquartets within PAUP*), and an unpartitioned concatenation analysis using maximum likelihood (RAxML) were evaluated on a heterogeneous collection of simulated multilocus data sets, and the following trends were observed. 
Filtering genes based on gene tree estimation error improved the accuracy of the summary methods when levels of incomplete lineage sorting were low to moderate but did not benefit the summary methods under higher levels of incomplete lineage sorting, unless gene tree estimation error was also extremely high (a model condition with few replicates). Neither SVDquartets nor concatenation analysis using RAxML benefited from filtering genes on the basis of gene tree estimation error. Finally, filtering genes based on missing data was either neutral (i.e., did not impact accuracy) or else reduced the accuracy of all five methods. By providing insight into the consequences of gene filtering, we offer recommendations for estimating species tree in the presence of incomplete lineage sorting and reconcile seemingly conflicting observations made in prior studies regarding the impact of gene filtering.

167 citations

Journal Article•DOI•

Phylogenetic Signal and Noise: Predicting the Power of a Data Set to Resolve Phylogeny

Jeffrey P. Townsend¹, Zhuo T. Su¹, Yonas I. Tekle¹•Institutions (1)

Yale University¹

01 Oct 2012-Systematic Biology

TL;DR: This work develops and implements a Monte Carlo approach to estimating the power to resolve internodes, derives a nearly equivalent, faster deterministic calculation, and finds that the predicted power of resolution for the loci analyzed is consistent with the historic use of the genes in phylogenetics.

Abstract: A principal objective for phylogenetic experimental design is to predict the power of a data set to resolve nodes in a phylogenetic tree. However, proactively assessing the potential for phylogenetic noise compared with signal in a candidate data set has been a formidable challenge. Understanding the impact of collection of additional sequence data to resolve recalcitrant internodes at diverse historical times will facilitate increasingly accurate and cost-effective phylogenetic research. Here, we derive theory based on the fundamental unit of the phylogenetic tree, the quartet, that applies estimates of the state space and the rates of evolution of characters in a data set to predict phylogenetic signal and phylogenetic noise and therefore to predict the power to resolve internodes. We develop and implement a Monte Carlo approach to estimating power to resolve as well as deriving a nearly equivalent faster deterministic calculation. These approaches are applied to describe the distribution of potential signal, polytomy, or noise for two example data sets, one recent (cytochrome c oxidase I and 28S ribosomal rRNA sequences from Diplazontinae parasitoid wasps) and one deep (eight nuclear genes and a phylogenomic sequence for diverse microbial eukaryotes including Stramenopiles, Alveolata, and Rhizaria). The predicted power of resolution for the loci analyzed is consistent with the historic use of the genes in phylogenetics.

128 citations

Proceedings Article•DOI•

Separations in query complexity using cheat sheets

Scott Aaronson¹, Shalev Ben-David¹, Robin Kothari¹•Institutions (1)

Massachusetts Institute of Technology¹

19 Jun 2016

TL;DR: A power 2.5 separation between bounded-error randomized and quantum query complexity for a total Boolean function is shown, refuting the widely believed conjecture that the best such separation could only be quadratic (from Grover's algorithm).

Abstract: We show a power 2.5 separation between bounded-error randomized and quantum query complexity for a total Boolean function, refuting the widely believed conjecture that the best such separation could only be quadratic (from Grover's algorithm). We also present a total function with a power 4 separation between quantum query complexity and approximate polynomial degree, showing severe limitations on the power of the polynomial method. Finally, we exhibit a total function with a quadratic gap between quantum query complexity and certificate complexity, which is optimal (up to log factors). These separations are shown using a new, general technique that we call the cheat sheet technique, which builds upon the techniques of Ambainis et al. [STOC 2016]. The technique is based on a generic transformation that converts any (possibly partial) function into a new total function with desirable properties for showing separations. The framework also allows many known separations, including some recent breakthrough results of Ambainis et al. [STOC 2016], to be shown in a unified manner.

91 citations
