Showing papers by "Wing-Kin Sung" published in 2015

Open Access

Journal Article

Recurrent Fusion Genes in Gastric Cancer: CLDN18-ARHGAP26 Induces Loss of Epithelial Integrity

Fei Yao¹,², Jaya P. Kausalya³, Yee Yen Sia¹,², Audrey S. M. Teo¹,², Wah Heng Lee², Alicia G.M. Ong³, Zhenshui Zhang², Joanna H.J. Tan², Guoliang Li⁴, Denis Bertrand², Xingliang Liu², Huay Mei Poh², Peiyong Guan¹,², Feng Zhu¹, Thushangi Nadeera Pathiraja¹,², Pramila N. Ariyaratne², Jaideepraj Rao⁵, Xing Yi Woo², Shaojiang Cai², Fabianus Hendriyan Mulawadi², Wan Ting Poh², Lavanya Veeravalli², Chee Seng Chan², Seong Soo Lim², See Ting Leong², Say Chuan Neo², Poh Sum D Choi², Elaine G.Y. Chew², Niranjan Nagarajan², Pierre-Étienne Jacques⁶, Jimmy Bok Yan So¹,⁷, Xiaoan Ruan², Khay Guan Yeoh¹,⁷, Patrick Tan, Wing-Kin Sung¹,², Walter Hunziker¹,³, Yijun Ruan, Axel M. Hillmer¹,²•Institutions (7)

National University of Singapore¹, Genome Institute of Singapore², Institute of Molecular and Cell Biology³, Huazhong Agricultural University⁴, Tan Tock Seng Hospital⁵, Université de Sherbrooke⁶, University Health System⁷

14 Jul 2015-Cell Reports

TL;DR: Using DNA paired-end-tag (DNA-PET) whole-genome sequencing, 15 gastric cancers from Southeast Asians were analyzed, and recurrent fusions between CLDN18, a tight junction gene, and ARHGAP26, a gene encoding a RHOA inhibitor, were found.

114 citations

Journal Article

TP53 intron 1 hotspot rearrangements are specific to sporadic osteosarcoma and can cause Li-Fraumeni syndrome

Sebastian Ribi¹, Daniel Baumhoer², Kristy Lee³, Edison⁴, Audrey S.M. Teo¹, Babita Madan⁴, Kang Zhang⁵, Wendy Kohlmann³, Fei Yao¹, Wah Heng Lee¹, Qiangze Hoi¹, Shaojiang Cai¹, Xing Yi Woo¹, Patrick Tan¹,⁴, Gernot Jundt², Jan Smida, Michaela Nathrath⁶, Wing-Kin Sung¹,⁴, Joshua D. Schiffman³, David M. Virshup⁴, Axel M. Hillmer¹•Institutions (6)

Genome Institute of Singapore¹, University Hospital of Basel², Huntsman Cancer Institute³, National University of Singapore⁴, University of California, San Diego⁵, Technische Universität München⁶

28 Feb 2015-Oncotarget

TL;DR: In this paper, the authors used whole-genome sequencing of osteosarcoma (OS) to find features of TP53 intron 1 rearrangements suggesting a unique mechanism correlated with transcription.

Abstract: Somatic mutations of TP53 are among the most common in cancer and germline mutations of TP53 (usually missense) can cause Li-Fraumeni syndrome (LFS). Recently, recurrent genomic rearrangements in intron 1 of TP53 have been described in osteosarcoma (OS), a highly malignant neoplasm of bone belonging to the spectrum of LFS tumors. Using whole-genome sequencing of OS, we found features of TP53 intron 1 rearrangements suggesting a unique mechanism correlated with transcription. Screening of 288 OS and 1,090 tumors of other types revealed evidence for TP53 rearrangements in 46 (16%) OS, while none were detected in other tumor types, indicating this rearrangement to be highly specific to OS. We revisited a four-generation LFS family where no TP53 mutation had been identified and found a 445 kb inversion spanning from the TP53 intron 1 towards the centromere. The inversion segregated with tumors in the LFS family. Cancers in this family had loss of heterozygosity, retaining the rearranged allele and resulting in TP53 expression loss. In conclusion, intron 1 rearrangements cause p53-driven malignancies by both germline and somatic mechanisms and provide an important mechanism of TP53 inactivation in LFS, which might in part explain the diagnostic gap of formerly classified "TP53 wild-type" LFS.

55 citations

Journal Article

Linked Dynamic Tries with Applications to LZ-Compression in Sublinear Time and Space

Jesper Jansson¹, Kunihiko Sadakane², Wing-Kin Sung³•Institutions (3)

Kyoto University¹, National Institute of Informatics², National University of Singapore³

01 Apr 2015-Algorithmica

TL;DR: A new technique for maintaining a dynamic trie $T$ of size at most $2^w$ nodes under the unit-cost RAM model with a fixed word size $w$ is proposed, based on the idea of partitioning $T$ into a set of linked small tries, each of which can be maintained efficiently.

Abstract: The dynamic trie is a fundamental data structure with applications in many areas of computer science. This paper proposes a new technique for maintaining a dynamic trie $T$ of size at most $2^w$ nodes under the unit-cost RAM model with a fixed word size $w$. It is based on the idea of partitioning $T$ into a set of linked small tries, each of which can be maintained efficiently. Our method is not only space-efficient, but also allows the longest common prefix between any query pattern $P$ and the strings currently stored in $T$ to be computed in $o(|P|)$ time for small alphabets, and allows any leaf to be inserted into or deleted from $T$ in $o(\log |T|)$ time. To demonstrate the usefulness of our new data structure, we apply it to LZ-compression. Significantly, we obtain the first algorithm for generating the LZ78 encoding of a given string of length $n$ over an alphabet of size $\sigma$ in sublinear ($o(n)$) time and sublinear ($o(n \log \sigma)$ bits) working space for small alphabets ($\sigma = 2^{o(\log n \frac{\log\log\log n}{(\log\log n)^{2}})}$). Moreover, the working space for our new algorithm is asymptotically less than or equal to the space for storing the output compressed text, regardless of the alphabet size.
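
To make the connection between tries and LZ78 concrete, here is a minimal Python sketch of textbook LZ78 encoding driven by a plain dictionary-based trie. It is not the linked-dynamic-trie method of the paper and runs in ordinary linear time rather than sublinear time; the function names `lz78_encode` and `lz78_decode` are illustrative choices, not taken from the source.

```python
def lz78_encode(text):
    """Textbook LZ78: emit (phrase_index, next_char) pairs.

    The dictionary of phrases is stored as a trie; trie[i] maps a
    character to the id of the child node. Node 0 is the empty phrase.
    """
    trie = [{}]
    output = []
    node = 0
    for ch in text:
        child = trie[node].get(ch)
        if child is not None:
            # Current phrase plus ch is already known: keep extending it.
            node = child
        else:
            # New phrase: record it as a fresh leaf and restart at the root.
            trie[node][ch] = len(trie)
            trie.append({})
            output.append((node, ch))
            node = 0
    if node != 0:
        # Input ended inside a known phrase; flush it with an empty char.
        output.append((node, ""))
    return output


def lz78_decode(pairs):
    """Invert lz78_encode by rebuilding the phrase list."""
    phrases = [""]
    pieces = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        pieces.append(phrase)
    return "".join(pieces)


if __name__ == "__main__":
    encoded = lz78_encode("abababa")
    print(encoded)
    print(lz78_decode(encoded))
```

Each new phrase is inserted as a single leaf below an existing node, which is why fast leaf insertion and longest-common-prefix queries on a dynamic trie translate directly into faster LZ78 encoding.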

26 citations

Journal Article

Xenopus tropicalis Genome Re-Scaffolding and Re-Annotation Reach the Resolution Required for In Vivo ChIA-PET Analysis

Nicolas Buisine¹, Xiaoan Ruan², Patrice Bilesimo¹, Alexis Grimaldi¹, Gladys Alfama¹, Pramila N. Ariyaratne², Fabianus Hendriyan Mulawadi², Jieqi Chen², Wing-Kin Sung², Edison T. Liu²,³, Barbara A. Demeneix¹, Yijun Ruan²,³, Laurent M. Sachs¹•Institutions (3)

Centre national de la recherche scientifique¹, Genome Institute of Singapore², University of Connecticut³

08 Sep 2015-PLOS ONE

TL;DR: This work improves the quality of Xenopus tropicalis genomic resources, reaching the standard required for ChIA-PET analysis of transcriptional networks, and considers that the workflow proposed offers useful conceptual and methodological guidance and can readily be applied to other non-conventional models that have low-resolution genome data.

Abstract: Genome-wide functional analyses require high-resolution genome assembly and annotation. We applied ChIA-PET to analyze gene regulatory networks, including 3D chromosome interactions, underlying thyroid hormone (TH) signaling in the frog Xenopus tropicalis. As the available versions of the Xenopus tropicalis assembly and annotation lacked the resolution required for ChIA-PET, we improved the genome assembly version 4.1 and its annotations using data derived from paired-end tag (PET) sequencing technologies and approaches (e.g., DNA-PET [gPET], RNA-PET, etc.). The large-insert (~10 kb, ~17 kb) paired-end DNA-PET with high-throughput NGS sequencing not only significantly improved genome assembly quality, but also strongly reduced genome "fragmentation", reducing total scaffold numbers by ~60%. Next, RNA-PET technology, designed and developed for the detection of full-length transcripts and fusion mRNA in whole-transcriptome studies (ENCODE consortia), was applied to capture the 5' and 3' ends of transcripts. These amendments in assembly and annotation were essential prerequisites for the ChIA-PET analysis of TH transcription regulation. Their application revealed complex regulatory configurations of target genes and the structures of the regulatory networks underlying physiological responses. Our work allowed us to improve the quality of Xenopus tropicalis genomic resources, reaching the standard required for ChIA-PET analysis of transcriptional networks. We consider that the workflow proposed offers useful conceptual and methodological guidance and can readily be applied to other non-conventional models that have low-resolution genome data.

21 citations

Journal Article

An $O(m \log m)$-Time Algorithm for Detecting Superbubbles

Wing-Kin Sung¹, Kunihiko Sadakane², Tetsuo Shibuya², Abha Belorkar¹, Iana Pyrogova¹•Institutions (2)

National University of Singapore¹, University of Tokyo²

01 Jul 2015-IEEE/ACM Transactions on Computational Biology and Bioinformatics

TL;DR: An $O(m \log m)$-time algorithm is given for detecting superbubbles, a complex generalization of bubbles used for analyzing assembly graphs, in a graph with $m$ edges.

Abstract: In genome assembly graphs, motifs such as tips, bubbles, and cross links are studied in order to find sequencing errors and to understand the nature of the genome. Superbubble, a complex generalization of bubbles, was recently proposed as an important subgraph class for analyzing assembly graphs. At present, a quadratic time algorithm is known. This paper gives an $O(m \log m)$-time algorithm to solve this problem for a graph with $m$ edges.
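
The abstract does not spell out what a superbubble is. The Python sketch below assumes the commonly stated four-condition definition for a vertex pair (s, t): reachability, matching, acyclicity, and minimality. It is a brute-force check for a single pair, not the $O(m \log m)$ algorithm of the paper, and the function names are illustrative.

```python
from collections import defaultdict


def _reach(adj, start, stop):
    """Vertices reachable from start when the search never leaves stop."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        if v == stop:
            continue
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def _acyclic(adj, nodes):
    """Kahn's algorithm on the subgraph induced by nodes."""
    indeg = {v: 0 for v in nodes}
    for v in nodes:
        for w in adj[v]:
            if w in nodes:
                indeg[w] += 1
    queue = [v for v in nodes if indeg[v] == 0]
    removed = 0
    while queue:
        v = queue.pop()
        removed += 1
        for w in adj[v]:
            if w in nodes:
                indeg[w] -= 1
                if indeg[w] == 0:
                    queue.append(w)
    return removed == len(nodes)


def _candidate(fwd, rev, s, t):
    """Return the interior set if (s, t) passes the first three conditions."""
    inside = _reach(fwd, s, t)          # forward from s, not passing through t
    if t not in inside:                 # reachability
        return None
    if inside != _reach(rev, t, s):     # matching (backward from t, not through s)
        return None
    return inside if _acyclic(fwd, inside) else None   # acyclicity


def is_superbubble(edges, s, t):
    """Brute-force test of the assumed superbubble definition for one pair."""
    if s == t:
        return False
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)
    inside = _candidate(fwd, rev, s, t)
    if inside is None:
        return False
    # Minimality: no earlier exit t' inside the candidate also qualifies.
    return all(_candidate(fwd, rev, s, x) is None for x in inside - {s, t})
```

For a simple bubble with edges s→a, s→b, a→t, b→t, the pair (s, t) passes; adding an edge from a to a vertex outside the bubble breaks the matching condition.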

19 citations

Journal Article

BatAlign: an incremental method for accurate alignment of sequencing reads

Jing Quan Lim¹, Chandana Tennakoon¹, Peiyong Guan¹, Wing-Kin Sung¹•Institutions (1)

National University of Singapore¹

18 Sep 2015-Nucleic Acids Research

TL;DR: BatAlign is an algorithm that integrates two strategies called ‘Reverse-Alignment’ and ‘Deep-Scan’ to improve the accuracy of read alignment, and it obtained the highest F-measures in read alignments on mismatch-aberrant, indel-aberrant, concordantly/discordantly paired and SV-spanning data sets.

Abstract: Structural variations (SVs) play a crucial role in genetic diversity. However, the alignments of reads near or across SVs are made inaccurate by the presence of polymorphisms. BatAlign is an algorithm that integrates two strategies called ‘Reverse-Alignment’ and ‘Deep-Scan’ to improve the accuracy of read alignment. In our experiments, BatAlign obtained the highest F-measures in read alignments on mismatch-aberrant, indel-aberrant, concordantly/discordantly paired and SV-spanning data sets. On real data, the alignments of BatAlign recovered 4.3% more PCR-validated SVs with 73.3% fewer calls. These results suggest that BatAlign is effective in detecting SVs and other polymorphic variants accurately using high-throughput data. BatAlign is publicly available at https://goo.gl/a6phxB.

11 citations

Proceedings Article

On Finding the Adams Consensus Tree

Jesper Jansson¹, Zhaoxian Li², Wing-Kin Sung²,³•Institutions (3)

Kyoto University¹, National University of Singapore², Agency for Science, Technology and Research³

01 Jan 2015

TL;DR: In this paper, the authors presented a fast algorithm for finding the Adams consensus tree of a set of conflicting phylogenetic trees with identical leaf labels, for the first time improving the time complexity of a widely used algorithm invented by Adams in 1972.

Abstract: This paper presents a fast algorithm for finding the Adams consensus tree of a set of conflicting phylogenetic trees with identical leaf labels, for the first time improving the time complexity of a widely used algorithm invented by Adams in 1972 [1]. Our algorithm applies the centroid path decomposition technique [9] in a new way to traverse the input trees' centroid paths in unison, and runs in $O(k n \log n)$ time, where $k$ is the number of input trees and $n$ is the size of the leaf label set. (In comparison, the old algorithm from 1972 has a worst-case running time of $O(k n^2)$.) For the special case of $k = 2$, an even faster algorithm running in $O(n \cdot \frac{\log n}{\log\log n})$ time is provided, which relies on an extension of the wavelet tree-based technique by Bose et al. [6] for orthogonal range counting on a grid. Our extended wavelet tree data structure also supports truncated range maximum queries efficiently and may be of independent interest to algorithm designers.
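
For orientation, the Adams consensus tree can be computed by a simple recursion: intersect the leaf partitions found at the roots of all input trees, then recurse on each resulting block. The Python sketch below follows that naive formulation and does not use the centroid path decomposition or wavelet tree techniques of the paper. The tree encoding (a leaf is a string, an internal node is a tuple of children) and the function names are assumptions made for illustration.

```python
def leaf_set(tree):
    """Set of leaf labels below a node (leaf = str, internal node = tuple)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def restrict(tree, keep):
    """Restrict a tree to the leaves in keep, suppressing unary nodes."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(c, keep) for c in tree) if r is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)


def adams_consensus(trees, labels=None):
    """Naive recursive Adams consensus of trees over the same leaf labels."""
    if labels is None:
        labels = leaf_set(trees[0])
    trees = [restrict(t, labels) for t in trees]
    if len(labels) == 1:
        return trees[0]
    # Product of the partitions induced by the root's children in every tree.
    blocks = [labels]
    for tree in trees:
        parts = [leaf_set(child) for child in tree]
        blocks = [b & p for b in blocks for p in parts if b & p]
    blocks.sort(key=sorted)  # deterministic child order for readability
    return tuple(adams_consensus(trees, b) for b in blocks)


if __name__ == "__main__":
    t1 = ((("a", "b"), "c"), "d")
    t2 = ((("a", "b"), "d"), "c")
    print(adams_consensus([t1, t2]))
```

In the usage example the two input trees agree on the cluster {a, b} but disagree on where c and d attach, so the consensus keeps (a, b) and leaves c and d unresolved at the root.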

2 citations

An $O(m \log m)$-Time Algorithm for Detecting Superbubbles

Wing-Kin Sung, Kunihiko Sadakane, Tetsuo Shibuya, Abha Belorkar, Iana Pyrogova

01 Jan 2015

TL;DR: In this paper, the authors propose an $O(m \log m)$-time algorithm for detecting superbubbles in a graph with $m$ edges.

Abstract: In genome assembly graphs, motifs such as tips, bubbles, and cross links are studied in order to find sequencing errors and to understand the nature of the genome. Superbubble, a complex generalization of bubbles, was recently proposed as an important subgraph class for analyzing assembly graphs. At present, a quadratic time algorithm is known. This paper gives an $O(m \log m)$-time algorithm to solve this problem for a graph with $m$ edges.
