Showing papers by "Wing-Kin Sung" published in 2017


Journal Article
TL;DR: In the Drosophila larval brain, Chro promotes neural stem cell (NSC) reactivation and prevents activated NSCs from re-entering quiescence; it does so by regulating the expression of key transcription factors in the nucleus.
Abstract: The switch between quiescence and proliferation is central for neurogenesis and its alteration is linked to neurodevelopmental disorders such as microcephaly. However, intrinsic mechanisms that reactivate Drosophila larval neural stem cells (NSCs) to exit from quiescence are not well established. Here we show that the spindle matrix complex containing Chromator (Chro) functions as a key intrinsic regulator of NSC reactivation downstream of extrinsic insulin/insulin-like growth factor signalling. Chro also prevents NSCs from re-entering quiescence at later stages. NSC-specific in vivo profiling has identified many downstream targets of Chro, including a temporal transcription factor Grainy head (Grh) and a neural stem cell quiescence-inducing factor Prospero (Pros). We show that spindle matrix proteins promote the expression of Grh and repress that of Pros in NSCs to govern their reactivation. Our data demonstrate that nuclear Chro critically regulates gene expression in NSCs at the transition from quiescence to proliferation. The spindle matrix proteins, including Chro, are known to regulate mitotic spindle assembly in the cytoplasm. Here the authors show that in Drosophila larval brain, Chro promotes neural stem cell (NSC) reactivation and prevents activated NSCs from entering quiescence, and that Chro carries out such a role by regulating the expression of key transcription factors in the nucleus.

24 citations


Journal Article
TL;DR: The performance of BatVI was compared with existing methods VirusFinder and VirusSeq using both simulated and real-life datasets of liver cancer patients and it was able to predict almost twice the number of true positives compared to other methods while maintaining a false positive rate less than 1%.
Abstract: The study of virus integrations in the human genome is important since virus integrations were shown to be associated with diseases. In the literature, few methods have been proposed that predict virus integrations using next generation sequencing datasets. Although they work, they are slow and are not very sensitive. This paper introduces a new method BatVI to predict viral integrations. Our method uses a fast screening method to filter out chimeric reads containing possible viral integrations. Next, sensitive alignments of these candidate chimeric reads are called by BLAST. Chimeric reads that are co-localized in the human genome are clustered. Finally, by assembling the chimeric reads in each cluster, high-confidence virus integration sites are extracted. We compared the performance of BatVI with existing methods VirusFinder and VirusSeq using both simulated and real-life datasets of liver cancer patients. BatVI ran an order of magnitude faster and was able to predict almost twice the number of true positives compared to other methods while maintaining a false positive rate less than 1%. For the liver cancer datasets, BatVI uncovered novel integrations to two important genes TERT and MLL4, which were missed by previous studies. Through gene expression data, we verified the correctness of these additional integrations. BatVI can be downloaded from http://biogpu.ddns.comp.nus.edu.sg/~ksung/batvi/index.html .
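
The clustering step described in this abstract, where chimeric reads that are co-localized in the human genome are grouped, can be illustrated with a minimal sketch. The abstract does not say how co-localization is defined, so the fixed distance window `max_gap`, the `(chrom, pos)` input representation, and the function name `cluster_breakpoints` are all assumptions made for illustration; this is not BatVI's actual implementation.

```python
from collections import defaultdict

def cluster_breakpoints(breakpoints, max_gap=500):
    """Group (chrom, pos) breakpoints into clusters.

    Two breakpoints on the same chromosome fall into the same cluster when
    they are neighbours in sorted order and at most max_gap bases apart.
    max_gap is a hypothetical parameter, not taken from the paper.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in breakpoints:
        by_chrom[chrom].append(pos)

    clusters = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] <= max_gap:
                current.append(pos)  # close enough: extend the open cluster
            else:
                clusters.append((chrom, current))  # gap too large: start anew
                current = [pos]
        clusters.append((chrom, current))
    return clusters

# Made-up coordinates, for illustration only.
reads = [("chr5", 1000), ("chr5", 1100), ("chr19", 7000), ("chr5", 9000)]
clusters = cluster_breakpoints(reads)
```

Each resulting cluster would then be passed to an assembly step, which the abstract mentions but this sketch does not attempt.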

18 citations


Journal Article
TL;DR: The current data substantiate knowledge on the role of CDH17 in the biology of HCC and suggest that components of the CDH17/β-catenin axis may serve as therapeutic targets in CDH17 over-expressing HCC patients.
Abstract: Hepatocellular carcinoma (HCC) is the most common type of liver cancer worldwide. Previously, we reported that cadherin-17 (CDH17) and its related CDH17/β-catenin axis may be responsible for inducing HCC in a subset of patients exhibiting CDH17 over-expression. Here we aimed at obtaining a better understanding of the CDH17-related HCC biology and to obtain further indications for the design of targeted therapies in CDH17 over-expressing HCC patients. We found that SPINK1 acts as a downstream effector of the CDH17/β-catenin axis in HCC. In addition, we found that SPINK1 expression exhibited a positive correlation with CDH17 expression in human HCCs and was over-expressed in up to 70% of the tumors. We identified SPINK1 as a downstream effector of the CDH17/β-catenin axis using a spectrum of in vitro assays, including gene expression modulation and inhibitor assays, bioinformatics analyses and luciferase reporter assays. These in vitro results were validated in primary human HCCs, including the observation that alteration in β-catenin expression (a core component of the CDH17/β-catenin axis) in tumors affects SPINK1 serum levels in HCC patients. Similar to CDH17, SPINK1 expression in HCC cells was found to be associated with specific tumor-related properties via activating the c-Raf/MEK/ERK pathway. Our current data substantiate our knowledge on the role of CDH17 in the biology of HCC and suggest that components of the CDH17/β-catenin axis may serve as therapeutic targets in CDH17 over-expressing HCC patients.

12 citations


Book Chapter
16 Dec 2017
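TL;DR: Two new algorithms for the k-mappability problem with \(k=1\) are presented, requiring worst-case time \(\mathcal {O}(mn)\) and \(\mathcal {O}(n \log n \log \log n)\), respectively, and space \(\mathcal {O}(n)\), improving on the previously fastest \(\mathcal {O}(mn \log n/\log \log n)\)-time algorithm.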
Abstract: In the k-mappability problem, we are given a string x of length n and integers m and k, and we are asked to count, for each length-m factor y of x, the number of other factors of length m of x that are at Hamming distance at most k from y. We focus here on the version of the problem where \(k=1\). The fastest known algorithm for \(k=1\) requires time \(\mathcal {O}(mn \log n/\log \log n)\) and space \(\mathcal {O}(n)\). We present two new algorithms that require worst-case time \(\mathcal {O}(mn)\) and \(\mathcal {O}(n \log n \log \log n)\), respectively, and space \(\mathcal {O}(n)\), thus greatly improving the state of the art. Moreover, we present another algorithm that requires average-case time and space \(\mathcal {O}(n)\) for integer alphabets of size \(\sigma \) if \(m=\varOmega (\log _\sigma n)\). Notably, we show that this algorithm is generalizable for arbitrary k, requiring average-case time \(\mathcal {O}(kn)\) and space \(\mathcal {O}(n)\) if \(m=\varOmega (k\log _\sigma n)\).
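
To make the problem definition above concrete, here is a brute-force sketch that computes the k-mappability counts directly from the definition. It is not one of the algorithms from the paper: it runs in \(\mathcal {O}(n^2 m)\) time and serves only as a reference. The function name is hypothetical, and the sketch assumes that factors are distinguished by their starting position, so two occurrences of the same substring count as different factors.

```python
def k_mappability_naive(x, m, k):
    """For each length-m factor of x, count the other length-m factors
    (identified by starting position) within Hamming distance k of it."""
    factors = [x[i:i + m] for i in range(len(x) - m + 1)]
    counts = []
    for i, y in enumerate(factors):
        count = 0
        for j, z in enumerate(factors):
            if i == j:
                continue  # a factor is not compared with itself
            mismatches = sum(a != b for a, b in zip(y, z))
            if mismatches <= k:
                count += 1
        counts.append(count)
    return counts

# Factors of "abab" with m = 2 are "ab", "ba", "ab".
result = k_mappability_naive("abab", m=2, k=1)
```

A reference implementation like this is mainly useful for checking faster algorithms on small inputs.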

6 citations


Journal Article
TL;DR: A fast algorithm for finding the Adams consensus tree of a set of conflicting phylogenetic trees with identical leaf labels is presented; for two input trees, an even faster algorithm is given that relies on an extension of the wavelet tree-based technique of Bose et al. for orthogonal range counting on a grid.
Abstract: This article presents a fast algorithm for finding the Adams consensus tree of a set of conflicting phylogenetic trees with identical leaf labels. Its worst-case running time is \(O(k n \log n)\), where k is the number of input trees and n is the size of the leaf label set; in comparison, the original algorithm of Adams has a worst-case running time of \(O(k n^2)\). To achieve subquadratic running time, the centroid path decomposition technique is applied in a novel way that traverses the input trees by following a centroid path in each of them in unison. For \(k = 2\), an even faster algorithm running in \(O(n \cdot \frac{\log n}{\log \log n})\) time is provided, which relies on an extension of the wavelet tree-based technique of Bose et al. for orthogonal range counting on a grid. Our extended wavelet tree data structure also supports truncated range maximum/minimum queries efficiently.

6 citations


Book
24 May 2017
TL;DR: Algorithms for Next-Generation Sequencing is a resource for students and researchers in bioinformatics and computational biology and for biologists seeking to process and manage the data generated by next-generation sequencing; it can also serve as a textbook or a self-study resource.
Abstract: Advances in sequencing technology have allowed scientists to study the human genome in greater depth and on a larger scale than ever before – as many as hundreds of millions of short reads in the course of a few days. But what are the best ways to deal with this flood of data? Algorithms for Next-Generation Sequencing is an invaluable tool for students and researchers in bioinformatics and computational biology, biologists seeking to process and manage the data generated by next-generation sequencing, and as a textbook or a self-study resource. In addition to offering an in-depth description of the algorithms for processing sequencing data, it also presents useful case studies describing the applications of this technology.

5 citations


Book Chapter
03 May 2017
TL;DR: In this paper, a detailed characterization of how the computational complexity of the consistency problem changes under various restrictions is presented, and the main result is an efficient algorithm for dense inputs satisfying \(R^{-} = \emptyset \) whose running time is linear in the size of the input and therefore optimal.
Abstract: The \(\mathcal {R}^{+-} \mathcal {F}^{+-}\) Consistency problem takes as input two sets \(R^{+}\) and \(R^{-}\) of resolved triplets and two sets \(F^{+}\) and \(F^{-}\) of fan triplets, and asks for a distinctly leaf-labeled tree that contains all elements in \(R^{+} \cup F^{+}\) and no elements in \(R^{-} \cup F^{-}\) as embedded subtrees, if such a tree exists. This paper presents a detailed characterization of how the computational complexity of the problem changes under various restrictions. Our main result is an efficient algorithm for dense inputs satisfying \(R^{-} = \emptyset \) whose running time is linear in the size of the input and therefore optimal.

4 citations


Book Chapter
05 Jun 2017
TL;DR: An \(O(n \log n)\)-time algorithm for computing the rooted triplet distance between two input galled trees is presented, improving on the previously fastest algorithm, which runs in \(O(n^{2.687})\) time, where n is the cardinality of the leaf label set.
Abstract: The previously fastest algorithm for computing the rooted triplet distance between two input galled trees (i.e., phylogenetic networks whose cycles are vertex-disjoint) runs in \(O(n^{2.687})\) time, where n is the cardinality of the leaf label set. Here, we present an \(O(n \log n)\)-time solution. Our strategy is to transform the input so that the answer can be obtained by applying an existing \(O(n \log n)\)-time algorithm for the simpler case of two phylogenetic trees a constant number of times.
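
As background for the quantity being computed, the sketch below evaluates the rooted triplet distance by brute force for the simpler case of two phylogenetic trees: it counts the three-leaf subsets whose induced subtrees differ between the two trees. It does not handle galled trees and is not the \(O(n \log n)\) method from the paper; it takes cubic time or worse. The nested-tuple tree encoding and all function names are assumptions chosen for illustration.

```python
from itertools import combinations

def leaf_clusters(tree):
    """Return (leaf set, clusters) for a tree given as nested tuples.
    A cluster is the leaf set below one internal node."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves, clusters = frozenset(), []
    for child in tree:
        child_leaves, child_clusters = leaf_clusters(child)
        leaves |= child_leaves
        clusters.extend(child_clusters)
    clusters.append(leaves)
    return leaves, clusters

def triplet_topology(clusters, a, b, c):
    """Return the pair of leaves grouped apart from the third leaf,
    or None if the three leaves form a fan (unresolved) triplet."""
    for x, y, z in ((a, b, c), (a, c, b), (b, c, a)):
        if any(x in s and y in s and z not in s for s in clusters):
            return frozenset([x, y])
    return None

def rooted_triplet_distance(t1, t2):
    """Count three-leaf subsets on which the two trees disagree.
    Assumes both trees have the same leaf label set."""
    leaves, c1 = leaf_clusters(t1)
    _, c2 = leaf_clusters(t2)
    return sum(
        triplet_topology(c1, *t) != triplet_topology(c2, *t)
        for t in combinations(sorted(leaves), 3)
    )

t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "b"), "d"), "c")
distance = rooted_triplet_distance(t1, t2)
```

Here the two trees agree on the triplets {a, b, c} and {a, b, d} and disagree on {a, c, d} and {b, c, d}.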

2 citations


Book Chapter
17 Jul 2017
TL;DR: The Hopcroft–Karp algorithm can find a maximum bipartite matching of a bipartite graph G in \(O(\sqrt{n} m)\) time, where n and m are the number of nodes and edges, respectively, in G.
Abstract: Maximum bipartite matching is a fundamental problem in computer science with many applications. The Hopcroft–Karp algorithm can find a maximum bipartite matching of a bipartite graph G in \(O(\sqrt{n} m)\) time where n and m are the number of nodes and edges, respectively, in the bipartite graph G. However, when G is dense (i.e., \(m=O(n^2)\)), the Hopcroft–Karp algorithm runs in \(O(n^{2.5})\) time.
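
For readers unfamiliar with the problem, the following sketch computes a maximum bipartite matching with the simple augmenting-path method (often called Kuhn's algorithm). This is deliberately not Hopcroft–Karp: it runs in \(O(nm)\) time and is shown only because it is short. The adjacency-list input format and the function name are assumptions for illustration.

```python
def max_bipartite_matching(adj, n_right):
    """adj[u] lists the right-side nodes adjacent to left-side node u.
    Returns (matching size, match_right), where match_right[v] is the
    left node matched to right node v, or -1 if v is unmatched."""
    match_right = [-1] * n_right

    def try_augment(u, visited):
        # Search for an augmenting path that starts at left node u.
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # Use v if it is free, or if its current partner can move elsewhere.
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    size = 0
    for u in range(len(adj)):
        if try_augment(u, set()):
            size += 1
    return size, match_right

# Left nodes 0..2, right nodes 0..2.
size, match_right = max_bipartite_matching([[0, 1], [0], [1, 2]], n_right=3)
```

Hopcroft–Karp improves on this by finding many shortest augmenting paths per phase, which is what yields the \(O(\sqrt{n} m)\) bound quoted above.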

2 citations


Posted Content
30 May 2017
TL;DR: This paper focuses on two of the most well-known and widely used consensus tree methods, the greedy consensus tree and the frequency difference consensus tree, and improves their running times to $\tilde O(k n^{1.5})$ and $\tilde O(k n)$ respectively.
Abstract: A consensus tree is a phylogenetic tree that captures the similarity between a set of conflicting phylogenetic trees. The problem of computing a consensus tree is a major step in phylogenetic tree reconstruction. It also finds applications in predicting a species tree from a set of gene trees. This paper focuses on two of the most well-known and widely used consensus tree methods: the greedy consensus tree and the frequency difference consensus tree. Given $k$ conflicting trees each with $n$ leaves, the previous fastest algorithms for these problems were $O(k n^2)$ for the greedy consensus tree [J. ACM 2016] and $\tilde O(\min \{ k n^2, k^2n\})$ for the frequency difference consensus tree [ACM TCBB 2016]. We improve these running times to $\tilde O(k n^{1.5})$ and $\tilde O(k n)$ respectively.

2 citations


Posted Content
TL;DR: In this paper, the authors improve the running times for computing the greedy consensus tree and the frequency difference consensus tree of a set of conflicting phylogenetic trees to $\tilde O(k n^{1.5})$ and $\tilde O(k n)$, respectively.
Abstract: A consensus tree is a phylogenetic tree that captures the similarity between a set of conflicting phylogenetic trees. The problem of computing a consensus tree is a major step in phylogenetic tree reconstruction. It also finds applications in predicting a species tree from a set of gene trees. This paper focuses on two of the most well-known and widely used consensus tree methods: the greedy consensus tree and the frequency difference consensus tree. Given $k$ conflicting trees each with $n$ leaves, the previous fastest algorithms for these problems were $O(k n^2)$ for the greedy consensus tree [J. ACM 2016] and $\tilde O(\min \{ k n^2, k^2n\})$ for the frequency difference consensus tree [ACM TCBB 2016]. We improve these running times to $\tilde O(k n^{1.5})$ and $\tilde O(k n)$ respectively.

Posted Content
TL;DR: Two new algorithms for the k-mappability problem with k = 1 are presented that require worst-case time O(mn) and O(n log^2 n), respectively, and space O(n), thus greatly improving the state of the art.
Abstract: In the k-mappability problem, we are given a string x of length n and integers m and k, and we are asked to count, for each length-m factor y of x, the number of other factors of length m of x that are at Hamming distance at most k from y. We focus here on the version of the problem where k = 1. The fastest known algorithm for k = 1 requires time O(mn log n / log log n) and space O(n). We present two algorithms that require worst-case time O(mn) and O(n log^2 n), respectively, and space O(n), thus greatly improving the state of the art. Moreover, we present an algorithm that requires average-case time and space O(n) for integer alphabets if m = Ω(log n / log σ), where σ is the alphabet size.