Detecting history of species using mining of motifs in Phylogenetic Networks

doi:10.1145/2677855.2677935

Proceedings Article•DOI•

Shamita Malik¹, Dolly Sharma²•Institutions (2)

Guru Gobind Singh Indraprastha University¹, Amity University²

27 Oct 2014-pp 80

TL;DR: A new algorithm, a first rudimentary step toward spotting the sample set, provides a promising new model for robust reasoning about substructure and ancestry in the wildlife trade and establishes a clear evolutionary connection among many different problem sets.

Abstract: Wildlife DNA forensics research, which relies on the collection and analysis of biological samples, has developed continuously over many years. However, there has been little progress on computational algorithms that could make finding the origin of a species easier and faster. Computational algorithms based on phylogenetic networks can provide evidence to assist wildlife law enforcement and species conservation. Our new algorithm, a first rudimentary step toward spotting the sample set, provides a promising new model for robust reasoning about substructure and ancestry in the wildlife trade. Our findings establish a clear evolutionary connection among many different problem sets.

Citations

Book•

Data-Intensive Workflow Management: For Clouds and Data-Intensive and Scalable Computing Environments

Daniel de Oliveira¹, Ji Liu², Esther Pacitti²•Institutions (2)

Federal Fluminense University¹, University of Montpellier²

13 May 2019

TL;DR: Workflows may be defined as abstractions used to model the coherent flow of activities in the context of an in silico scientific experiment.

Abstract: Workflows may be defined as abstractions used to model the coherent flow of activities in the context of an in silico scientific experiment. They are employed in many domains of science su...

16 citations

Dissertation•DOI•

Modelos de programação inteira para o problema de busca de motivos em redes biológicas (Integer programming models for the motif search problem in biological networks)

Ricardo Molinari dos Prazeres

03 Oct 2022

TL;DR: This work presents integer programming models for the motif search problem in biological networks, proposes a branch-and-cut approach for the general case (when G is an arbitrary graph), and presents an enumeration algorithm based on this approach.

Abstract: Prazeres, Ricardo Molinari dos. Integer programming models for the motif search problem in biological networks. 2022. 81 p. Dissertation (Master of Science) – School of Arts, Sciences and Humanities, University of São Paulo, São Paulo, 2022. There are several variants of the motif search problem in the literature, with many applications in bioinformatics. In the variant called motif search in graphs, proposed in 2006, we are given a colored graph G and a multiset of colors M (called the motif), and we seek a connected induced subgraph of G that contains the colors of M. When the given motif cannot be found in G, we seek an "approximate match" for it under some approximation criteria. The 2006 work proved that the problem is NP-hard even when G is restricted to trees, and proposed an exact enumeration algorithm that enumerates only small motifs (containing at most 4 vertices). In 2018, an integer programming approach was proposed for the special case where G is restricted to trees. In the present work, we present integer programming models for this problem and propose a branch-and-cut approach for the general case (when G is an arbitrary graph). The solution connectivity constraints are added to the model as cutting planes. With a small adaptation of this approach, we obtain an enumeration algorithm. The presented approach was able to solve instances from protein-protein interaction networks containing, after pre-processing, approximately 3,000 proteins (vertices) and 4,100 interactions between them (edges).
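
To make the problem statement in this abstract concrete, here is a minimal brute-force sketch of motif search in a colored graph. It is not the dissertation's integer programming or branch-and-cut method; it simply enumerates vertex subsets. It assumes the common exact-match reading, in which the colors of the chosen vertices must equal the multiset M, and the names `find_motif`, `is_connected`, `adj`, and `color` are hypothetical, as is the toy graph.

```python
from collections import Counter
from itertools import combinations


def is_connected(vertices, adj):
    """Return True if the subgraph induced by `vertices` is connected."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            # Only follow edges that stay inside the chosen vertex set.
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def find_motif(adj, color, motif):
    """Return one vertex list whose colors equal `motif` and whose induced
    subgraph is connected, or None if no such subset exists.

    Assumption: exact multiset match. Runs in exponential time, so it is
    only suitable for very small graphs.
    """
    target = Counter(motif)
    k = sum(target.values())
    for subset in combinations(sorted(adj), k):
        if Counter(color[v] for v in subset) != target:
            continue
        if is_connected(subset, adj):
            return list(subset)
    return None


# Hypothetical toy instance: the path a - b - c - d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
color = {"a": "red", "b": "blue", "c": "red", "d": "green"}

match = find_motif(adj, color, ["red", "green"])
```

On this toy path, the vertices a and d carry the right colors for the motif {red, green} but are not adjacent, so the connectivity check rejects them and the search moves on to c and d. Connectivity is the same requirement that the abstract says is enforced through cutting planes in the integer programming models.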

Book Chapter•DOI•

Analysing the Genetic Diversity of Commonly Occurring Diseases

Shamita Malik¹, Sunil Kumar Khatri, Dolly Sharma²•Institutions (2)

Guru Gobind Singh Indraprastha University¹, Shiv Nadar University²

01 Jan 2018

TL;DR: This paper tests and analyses the transmission of commonly occurring diseases against more realistic models, to understand how the diseases came into existence and how they migrated, which helps with their treatment and with drug discovery.

Abstract: It is generally believed that all organisms on Earth converge on a common gene pool. Current species have passed through an evolutionary process that is still underway. The theoretical assumptions about the common descent of all organisms rest on four simple facts: first, they had wide geographical dispersal; second, the different life forms were not remarkably unique and did not possess mutually exclusive characteristics; third, some of their attributes which apparently served no purpose had an uncanny similarity with some of their lost functional traits; and last, based on their common attributes, these organisms can be put together into a well-defined, hierarchical and coherent group, like a family tree. Phylogenetic networks are the main tools used to represent biological relationships between different species. Biologists, mathematicians, statisticians, computer scientists and others have designed various models for the reconstruction of evolutionary networks and developed numerous algorithms for efficient prediction and analysis. Although these problems have been studied for a very long time, the computational models built to solve them fail to give accurate results on real biological data, which could be due to the premises on which the models are based. The objective of this paper is to test and analyse the transmission of commonly occurring diseases so that it fits more realistic models. These problems matter not only because we need to know how the diseases came into existence and how they migrated, but also because the answers help with treatment and drug discovery.

References

Journal Article•DOI•

Maximum likelihood from incomplete data via the EM algorithm

Arthur P. Dempster¹, Nan M. Laird¹, Donald B. Rubin¹•Institutions (1)

Harvard University¹

01 Sep 1977-Journal of the Royal Statistical Society: Series B (Methodological)

49,597 citations

Journal Article•DOI•

Evolutionary trees from DNA sequences: A maximum likelihood approach

Joseph Felsenstein¹•Institutions (1)

University of Washington¹

01 Jan 1981-Journal of Molecular Evolution

TL;DR: A computationally feasible method for finding such maximum likelihood estimates is developed, and a computer program is available that allows the testing of hypotheses about the constancy of evolutionary rates by likelihood ratio tests.

Abstract: The application of maximum likelihood techniques to the estimation of evolutionary trees from nucleic acid sequence data is discussed. A computationally feasible method for finding such maximum likelihood estimates is developed, and a computer program is available. This method has advantages over the traditional parsimony algorithms, which can give misleading results if rates of evolution differ in different lineages. It also allows the testing of hypotheses about the constancy of evolutionary rates by likelihood ratio tests, and gives rough indication of the error of the estimate of the tree.

13,111 citations

Journal Article•DOI•

Numerical Recipes in C: The Art of Scientific Computing

Mary C. Seiler, Fritz A. Seiler

01 Sep 1989-Risk Analysis

11,285 citations

Journal Article•DOI•

Application of Phylogenetic Networks in Evolutionary Studies

Daniel H. Huson¹, David Bryant•Institutions (1)

University of Tübingen¹

01 Feb 2006-Molecular Biology and Evolution

TL;DR: This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted and outlines the beginnings of a comprehensive statistical framework for applying split network methods.

Abstract: The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a tree-like manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees.

7,273 citations

Numerical Recipes in FORTRAN - The Art of Scientific Computing - Second Edition

William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery

01 Jan 1989

TL;DR: The second edition of Numerical Recipes in FORTRAN, a reference text on numerical methods for scientific computing.

Abstract: Keywords: computer science; numerical recipes. Note: includes a CD-ROM. Reference record created on 2004-09-07, modified on 2016-08-08.

4,920 citations