Author

Jesper Jansson

Bio: Jesper Jansson is an academic researcher from Hong Kong Polytechnic University. The author has contributed to research in topics: Time complexity & Phylogenetic network. The author has an h-index of 25 and has co-authored 145 publications receiving 1971 citations. Previous affiliations of Jesper Jansson include Ochanomizu University & National University of Singapore.


Papers
Proceedings ArticleDOI
07 Jan 2007
TL;DR: The first succinct tree representation supporting every one of the fundamental operations previously proposed for BP or DFUDS, along with some new operations, in constant time is given; its size surpasses the information-theoretic lower bound and matches the entropy of the tree based on the distribution of node degrees.
Abstract: There exist two well-known succinct representations of ordered trees: BP (balanced parenthesis) [Munro, Raman 2001] and DFUDS (depth first unary degree sequence) [Benoit et al. 2005]. Both have size 2n + o(n) bits for n-node trees, which asymptotically matches the information-theoretic lower bound. Many fundamental operations on trees can be done in constant time on word RAM, for example finding the parent, the first child, the next sibling, the number of descendants, etc. However, there has been no single representation supporting every existing operation in constant time; BP does not support i-th child, while DFUDS does not support lca (lowest common ancestor). In this paper, we give the first succinct tree representation supporting every one of the fundamental operations previously proposed for BP or DFUDS along with some new operations in constant time. Moreover, its size surpasses the information-theoretic lower bound and matches the entropy of the tree based on the distribution of node degrees. We call this an ultra-succinct data structure. As a consequence, a tree in which every internal node has exactly two children can be represented in n + o(n) bits. We also show applications for ultra-succinct compressed suffix trees and labeled trees.
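To make the two encodings named in the abstract concrete, the following minimal sketch builds the BP and DFUDS parenthesis sequences for a small ordered tree. It is an illustration only, not the data structure from the paper: the function names (`bp_encode`, `dfuds_encode`, `find_close`, `subtree_size`) and the child-list input format are assumptions made here, and the navigation helper scans linearly rather than using the o(n)-bit auxiliary indexes that give constant-time operations.

```python
from typing import List


def bp_encode(children: List[List[int]], root: int = 0) -> str:
    """Balanced-parentheses encoding: '(' on entering a node, ')' on leaving it."""
    out = []
    stack = [(root, False)]  # (node, True if we are leaving the node)
    while stack:
        node, leaving = stack.pop()
        if leaving:
            out.append(")")
            continue
        out.append("(")
        stack.append((node, True))
        # Push children in reverse so they are visited left to right.
        for child in reversed(children[node]):
            stack.append((child, False))
    return "".join(out)


def dfuds_encode(children: List[List[int]], root: int = 0) -> str:
    """DFUDS encoding: one extra '(' followed, in preorder, by each node's
    degree written in unary (d copies of '(' and then one ')')."""
    out = ["("]
    stack = [root]
    while stack:
        node = stack.pop()
        out.append("(" * len(children[node]) + ")")
        for child in reversed(children[node]):
            stack.append(child)
    return "".join(out)


def find_close(bp: str, i: int) -> int:
    """Index of the ')' matching the '(' at position i (naive linear scan)."""
    depth = 0
    for j in range(i, len(bp)):
        depth += 1 if bp[j] == "(" else -1
        if depth == 0:
            return j
    raise ValueError("unbalanced sequence")


def subtree_size(bp: str, i: int) -> int:
    """Number of nodes in the subtree whose opening parenthesis is at i."""
    return (find_close(bp, i) - i + 1) // 2


# Hypothetical example tree: node 0 has children 1 and 2; node 1 has child 3.
example_children = [[1, 2], [3], [], []]
example_bp = bp_encode(example_children)
example_dfuds = dfuds_encode(example_children)
```

Both sequences use exactly 2n parentheses for an n-node tree, which is where the 2n term in the 2n + o(n) bound comes from.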

92 citations

Journal ArticleDOI
TL;DR: This paper considers the problem of determining whether a given set $\T$ of rooted triplets can be merged without conflicts into a galled phylogenetic network and, if so, constructing such a network; it proves that the problem becomes NP-hard if extended to nondense inputs.
Abstract: This paper considers the problem of determining whether a given set $\T$ of rooted triplets can be merged without conflicts into a galled phylogenetic network and, if so, constructing such a network. When the input $\T$ is dense, we solve the problem in $O(|\T|)$ time, which is optimal since the size of the input is $\Theta(|\T|)$. In comparison, the previously fastest algorithm for this problem runs in $O(|\T|^2)$ time. We also develop an optimal $O(|\T|)$-time algorithm for enumerating all simple phylogenetic networks leaf-labeled by $L$ that are consistent with $\T$, where $L$ is the set of leaf labels in $\T$, which is used by our main algorithm. Next, we prove that the problem becomes NP-hard if extended to nondense inputs, even for the special case of simple phylogenetic networks. We also show that for every positive integer $n$, there exists some set $\T$ of rooted triplets on $n$ leaves such that any galled network can be consistent with at most $0.4883 \cdot |\T|$ of the rooted triplets in $\T$. On the other hand, we provide a polynomial-time approximation algorithm that always outputs a galled network consistent with at least a factor of $\frac{5}{12}$ ($> 0.4166$) of the rooted triplets in $\T$.
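As a small illustration of what "consistent with a rooted triplet" means, the sketch below checks triplets against a rooted tree rather than a galled network, so it is a simplified setting and not the algorithm from the paper. It assumes the standard definition that a triplet ab|c is consistent with a tree when the lowest common ancestor of a and b is a proper descendant of the lowest common ancestor of a and c. The parent-map representation and all function names are assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Parent = Dict[str, Optional[str]]  # node -> parent, root maps to None
Triplet = Tuple[str, str, str]     # (a, b, c) stands for the triplet ab|c


def path_to_root(parent: Parent, v: str) -> List[str]:
    """Nodes from v up to the root, inclusive."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def lca(parent: Parent, u: str, v: str) -> str:
    """Lowest common ancestor by a naive walk (fine for small trees)."""
    seen = set(path_to_root(parent, u))
    for w in path_to_root(parent, v):
        if w in seen:
            return w
    raise ValueError("nodes are not in the same tree")


def depth(parent: Parent, v: str) -> int:
    return len(path_to_root(parent, v)) - 1


def is_consistent(parent: Parent, triplet: Triplet) -> bool:
    """True if ab|c holds in the tree: lca(a, b) lies strictly below lca(a, c).
    Both are ancestors of a, so comparing depths is enough."""
    a, b, c = triplet
    return depth(parent, lca(parent, a, b)) > depth(parent, lca(parent, a, c))


def consistent_fraction(parent: Parent, triplets: List[Triplet]) -> float:
    """Fraction of the given triplets that the tree is consistent with."""
    return sum(is_consistent(parent, t) for t in triplets) / len(triplets)


# Hypothetical tree: root r has children x and c; x has children a and b.
example_tree: Parent = {"r": None, "x": "r", "c": "r", "a": "x", "b": "x"}
example_triplets = [("a", "b", "c"), ("a", "c", "b")]
example_fraction = consistent_fraction(example_tree, example_triplets)
```

The `consistent_fraction` helper mirrors the quantity bounded in the abstract's approximation results, the share of triplets in $\T$ that a single structure can satisfy.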

90 citations

Journal ArticleDOI
TL;DR: The maximum agreement phylogenetic subnetwork problem (MASN) is introduced, the problem is proved to be NP-hard even if restricted to three phylogenetic networks, and an O(n^2)-time algorithm is given for the special case of two level-1 phylogenetic networks.

87 citations

Journal ArticleDOI
TL;DR: If no restrictions are placed on the hybrid nodes in the solution, the problem is trivially solved in polynomial time by a simple sorting network-based construction.

85 citations

Journal ArticleDOI
TL;DR: This paper proves that unless P=NP, MinRTI cannot be approximated within a ratio of c·ln n for some constant c>0 in polynomial time, and provides a deterministic construction of a triplet set having a similar property, which is subsequently used to prove that both MaxRTC and MinRTI are NP-hard even if restricted to minimally dense instances.

71 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Aug 2000
TL;DR: A bioentrepreneur course on assessing medical technology in the context of commercialization, covering many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

Journal ArticleDOI
TL;DR: Dendroscope 3 is a new program for working with rooted phylogenetic trees and networks that provides a number of methods for drawing and comparing rooted phylogenetic networks, and for computing them from rooted trees.
Abstract: Dendroscope 3 is a new program for working with rooted phylogenetic trees and networks. It provides a number of methods for drawing and comparing rooted phylogenetic networks, and for computing them from rooted trees. The program can be used interactively or in command-line mode. The program is written in Java, use of the software is free, and installers for all 3 major operating systems can be downloaded from www.dendroscope.org. [Phylogenetic trees; phylogenetic networks; software.].

1,396 citations

Book
02 Jan 1991

1,377 citations

Journal ArticleDOI
TL;DR: This work surveys the problem of comparing labeled trees based on simple local operations of deleting, inserting, and relabeling nodes and presents one or more of the central algorithms for solving the problem.

831 citations