Author

Ming-Yang Kao

Bio: Ming-Yang Kao is an academic researcher from Northwestern University. The author has contributed to research in topics: Time complexity & Planar graph. The author has an h-index of 37 and has co-authored 202 publications receiving 4438 citations. Previous affiliations of Ming-Yang Kao include Tufts University & Indiana University.


Papers
Journal ArticleDOI
TL;DR: This paper introduces a graph search called the scan-first search, and shows that a certificate with at most $k(n - 1)$ edges can be computed by executing scan-first search k times in sequence on subgraphs of G.
Abstract: Given a graph $G = (V,E)$, a certificate of k-vertex connectivity is an edge subset $E' \subset E$ such that the subgraph $(V,E')$ is k-vertex connected if and only if G is k-vertex connected. Let n and m denote the number of vertices and edges. A certificate is called sparse if it contains $O(kn)$ edges. For undirected graphs, this paper introduces a graph search called the scan-first search, and shows that a certificate with at most $k(n - 1)$ edges can be computed by executing scan-first search k times in sequence on subgraphs of G. For each of the parallel, distributed, and sequential models of computation, the complexity of scan-first search matches the best complexity of any graph search on that model. In particular, the parallel scan-first search runs in $O(\log n)$ time using $C(n,m)$ processors on a CRCW PRAM, where $C(n,m)$ is the number of processors needed to find a spanning tree in each connected component in $O(\log n)$ time, and the parallel certificate algorithm runs in $O(k\log n)$ time us...

70 citations
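
The construction described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes breadth-first search as the scan-first search (BFS scans a vertex by marking all of its unmarked neighbours, which fits the scan-first pattern), and the function names `scan_first_forest` and `sparse_certificate` are hypothetical.

```python
from collections import deque

def scan_first_forest(n, edges):
    """Return a spanning forest of the graph on vertices 0..n-1.

    Assumption: BFS is used as the scan-first search. Scanning a vertex
    marks every unmarked neighbour and records the edge that reached it.
    """
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    marked = [False] * n
    forest = []
    for root in range(n):
        if marked[root]:
            continue
        marked[root] = True
        queue = deque([root])
        while queue:
            u = queue.popleft()  # scan u
            for v in adj[u]:
                if not marked[v]:
                    marked[v] = True
                    forest.append((min(u, v), max(u, v)))
                    queue.append(v)
    return forest

def sparse_certificate(n, edges, k):
    """Union of k forests, each found on the edges not yet used."""
    remaining = {(min(u, v), max(u, v)) for u, v in edges}
    certificate = []
    for _ in range(k):
        forest = scan_first_forest(n, sorted(remaining))
        certificate.extend(forest)
        remaining -= set(forest)
    return certificate

# Complete graph on 5 vertices, k = 2: at most k(n - 1) = 8 edges are kept.
K5 = [(u, v) for u in range(5) for v in range(u + 1, 5)]
print(len(sparse_certificate(5, K5, 2)))
```

Each forest has at most n - 1 edges, so the union of k forests respects the $k(n - 1)$ bound quoted above.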

Journal ArticleDOI
TL;DR: In this article, it was shown that if G is triangulated, it can be encoded in $\frac{4}{3}m-1$ bits, improving on the best previous bound of about 1.53m bits.
Abstract: Let G be an embedded planar undirected graph that has n vertices, m edges, and f faces but has no self-loop or multiple edge. If G is triangulated, we can encode it using $\frac{4}{3}m-1$ bits, improving on the best previous bound of about 1.53m bits. In case exponential time is acceptable, roughly 1.08m bits have been known to suffice. If G is triconnected, we use at most $(2.5+2\log{3})\min\{n,f\}-7$ bits, which is at most 2.835m bits and smaller than the best previous bound of 3m bits. Both of our schemes take O(n) time for encoding and decoding.

62 citations
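
The step from the $(2.5+2\log{3})\min\{n,f\}-7$ bound to the 2.835m figure in the abstract above follows from Euler's formula. The derivation below is a sketch under two assumptions that the abstract does not state explicitly: the logarithm is base 2, and G is connected (which triconnectivity implies).

```latex
\begin{align*}
n + f &= m + 2 && \text{(Euler's formula for a connected plane graph)} \\
\min\{n, f\} &\le \frac{n + f}{2} = \frac{m}{2} + 1 \\
2.5 + 2\log_2 3 &\approx 2.5 + 3.1699 = 5.6699 < 5.67 \\
(2.5 + 2\log_2 3)\min\{n, f\} - 7
  &< 5.67\left(\frac{m}{2} + 1\right) - 7
   = 2.835\,m - 1.33 < 2.835\,m
\end{align*}
```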

Journal ArticleDOI
TL;DR: A new decomposition theorem is presented for maximum weight bipartite matchings, and it is shown how to compute the weight of a maximum weight matching of G - {u} for all nodes u in O(W) time.
Abstract: Let G be a bipartite graph with positive integer weights on the edges and without isolated nodes. Let n, N, and W be the node count, the largest edge weight, and the total weight of G. Let k(x, y) be $\log x / \log(x^2/y)$. We present a new decomposition theorem for maximum weight bipartite matchings and use it to design an $O(\sqrt{n}W / k(n, W/N))$-time algorithm for computing a maximum weight matching of G. This algorithm bridges a long-standing gap between the best known time complexity of computing a maximum weight matching and that of computing a maximum cardinality matching. Given G and a maximum weight matching of G, we can further compute the weight of a maximum weight matching of G - {u} for all nodes u in O(W) time.

60 citations
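
To make the running-time expression in the abstract above concrete, the sketch below evaluates k(x, y) and the quantity $\sqrt{n}W / k(n, W/N)$ for given parameters. It only evaluates the formula (constants hidden by the O-notation are ignored), and the function names are hypothetical. Since a bipartite graph on n nodes has at most $n^2/4$ edges, $W/N \le n^2/4$, so the logarithm in the denominator of k is positive.

```python
import math

def k(x, y):
    """k(x, y) = log x / log(x^2 / y), as defined in the abstract."""
    return math.log(x) / math.log(x * x / y)

def matching_time_bound(n, W, N):
    """Evaluate sqrt(n) * W / k(n, W / N), ignoring constant factors.

    n: node count, W: total edge weight, N: largest edge weight.
    """
    return math.sqrt(n) * W / k(n, W / N)

# With unit weights (N = 1) the total weight W equals the edge count m,
# and the expression becomes sqrt(n) * m * log(n^2 / m) / log n.
n, m = 1024, 4096
print(matching_time_bound(n, W=m, N=1))
```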

Journal ArticleDOI
TL;DR: A recent development in microarray research entails the unbiased coverage, or tiling, of genomic DNA for the large-scale identification of transcribed sequences and regulatory elements, and two algorithms for finding an optimal tile path composed of longer sequence tiles are developed.
Abstract: A recent development in microarray research entails the unbiased coverage, or tiling, of genomic DNA for the large-scale identification of transcribed sequences and regulatory elements. A central issue in designing tiling arrays is that of arriving at a single-copy tile path, as significant sequence cross-hybridization can result from the presence of non-unique probes on the array. Due to the fragmentation of genomic DNA caused by the widespread distribution of repetitive elements, the problem of obtaining adequate sequence coverage increases with the sizes of subsequence tiles that are to be included in the design. This becomes increasingly problematic when considering complex eukaryotic genomes that contain many thousands of interspersed repeats. The general problem of sequence tiling can be framed as finding an optimal partitioning of non-repetitive subsequences over a prescribed range of tile sizes, on a DNA sequence comprising repetitive and non-repetitive regions. Exact solutions to the tiling problem become computationally infeasible when applied to large genomes, but successive optimizations are developed that allow their practical implementation. These include an efficient method for determining the degree of similarity of many oligonucleotide sequences over large genomes, and two algorithms for finding an optimal tile path composed of longer sequence tiles. The first algorithm, a dynamic programming approach, finds an optimal tiling in linear time and space; the second applies a heuristic search to reduce the space complexity to a constant requirement. A Web resource has also been developed, accessible at http://tiling.gersteinlab.org, to generate optimal tile paths from user-provided DNA sequences.

60 citations
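
The dynamic programming idea mentioned in the abstract above can be illustrated on a simplified version of the problem. The sketch below is not the paper's algorithm: it assumes the non-repetitive regions are already known by length, that tiles may not overlap, and that the goal is to maximise the number of bases covered by tiles whose sizes lie in a prescribed range. All names are hypothetical.

```python
def max_tiled_length(segment_length, min_tile, max_tile):
    """Most bases of one non-repetitive segment coverable by
    non-overlapping tiles with sizes in [min_tile, max_tile]."""
    best = [0] * (segment_length + 1)  # best[i]: optimum for the first i bases
    for i in range(1, segment_length + 1):
        best[i] = best[i - 1]  # leave base i uncovered
        for size in range(min_tile, min(max_tile, i) + 1):
            # end a tile of this size exactly at base i
            best[i] = max(best[i], best[i - size] + size)
    return best[segment_length]

def max_coverage(segment_lengths, min_tile, max_tile):
    """Tiles cannot span repeats, so segments are solved independently."""
    return sum(max_tiled_length(length, min_tile, max_tile)
               for length in segment_lengths)

# Three non-repetitive segments separated by repeats, tiles of 4 to 6 bases.
print(max_coverage([10, 7, 3], min_tile=4, max_tile=6))
```

This version costs time proportional to the segment length times the width of the tile-size range; the linear time and space bounds quoted in the abstract refer to the authors' own formulation.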

Journal ArticleDOI
TL;DR: A fast methodology for encoding graphs with information-theoretically minimum numbers of bits is proposed, applicable to general classes of graphs; this paper focuses on planar graphs.
Abstract: We propose a fast methodology for encoding graphs with information-theoretically minimum numbers of bits. Specifically, a graph with property $\pi$ is called a $\pi$-graph. If $\pi$ satisfies certain properties, then an n-node m-edge $\pi$-graph G can be encoded by a binary string X such that (1) G and X can be obtained from each other in O(n log n) time, and (2) X has at most $\beta(n)+o(\beta(n))$ bits for any continuous superadditive function $\beta(n)$ so that there are at most $2^{\beta(n)+o(\beta(n))}$ distinct $n$-node $\pi$-graphs. The methodology is applicable to general classes of graphs; this paper focuses on planar graphs. Examples of such $\pi$ include all conjunctions over the following groups of properties: (1) G is a planar graph or a plane graph; (2) $G$ is directed or undirected; (3) $G$ is triangulated, triconnected, biconnected, merely connected, or not required to be connected; (4) the nodes of G are labeled with labels from $\{1,\ldots, \ell_1\}$ for $\ell_1\leq n$; (5) the edges of G are labeled with labels from $\{1,\ldots, \ell_2\}$ for $\ell_2\leq m$; and (6) each node (respectively, edge) of G has at most $\ell_3=O(1)$ self-loops (respectively, $\ell_4=O(1)$ multiple edges). Moreover, $\ell_3$ and $\ell_4$ are not required to be O(1) for the cases of $\pi$ being a plane triangulation. These examples are novel applications of small cycle separators of planar graphs and are the only nontrivial classes of graphs, other than rooted trees, with known polynomial-time information-theoretically optimal coding schemes.

59 citations


Cited by
Journal ArticleDOI

3,734 citations

Journal ArticleDOI
03 Jun 2011-Science
TL;DR: This work experimentally demonstrated several digital logic circuits, culminating in a four-bit square-root circuit that comprises 130 DNA strands, which enables fast and reliable function in large circuits with roughly constant switching time and linear signal propagation delays.
Abstract: To construct sophisticated biochemical circuits from scratch, one needs to understand how simple the building blocks can be and how robustly such circuits can scale up. Using a simple DNA reaction mechanism based on a reversible strand displacement process, we experimentally demonstrated several digital logic circuits, culminating in a four-bit square-root circuit that comprises 130 DNA strands. These multilayer circuits include thresholding and catalysis within every logical operation to perform digital signal restoration, which enables fast and reliable function in large circuits with roughly constant switching time and linear signal propagation delays. The design naturally incorporates other crucial elements for large-scale circuitry, such as general debugging tools, parallel circuit preparation, and an abstraction hierarchy supported by an automated circuit compiler.

1,249 citations

Journal ArticleDOI
TL;DR: A new de novo sequencing software package, PEAKS, is described, to extract amino acid sequence information without the use of databases, using a new model and a new algorithm to efficiently compute the best peptide sequences whose fragment ions can best interpret the peaks in the MS/MS spectrum.
Abstract: A number of different approaches have been described to identify proteins from tandem mass spectrometry (MS/MS) data. The most common approaches rely on the available databases to match experimental MS/MS data. These methods suffer from several drawbacks and cannot be used for the identification of proteins from unknown genomes. In this communication, we describe a new de novo sequencing software package, PEAKS, to extract amino acid sequence information without the use of databases. PEAKS uses a new model and a new algorithm to efficiently compute the best peptide sequences whose fragment ions can best interpret the peaks in the MS/MS spectrum. The output of the software gives amino acid sequences with confidence scores for the entire sequences, as well as an additional novel positional scoring scheme for portions of the sequences. The performance of PEAKS is compared with Lutefisk, a well-known de novo sequencing software, using quadrupole-time-of-flight (Q-TOF) data obtained for several tryptic peptides from standard proteins.

1,239 citations

Journal ArticleDOI
21 Jul 2011-Nature
TL;DR: It is suggested that DNA strand displacement cascades could be used to endow autonomous chemical systems with the capability of recognizing patterns of molecular events, making decisions and responding to the environment.
Abstract: The impressive capabilities of the mammalian brain, ranging from perception, pattern recognition and memory formation to decision making and motor activity control, have inspired their re-creation in a wide range of artificial intelligence systems for applications such as face recognition, anomaly detection, medical diagnosis and robotic vehicle control. Yet before neuron-based brains evolved, complex biomolecular circuits provided individual cells with the ‘intelligent’ behaviour required for survival. However, the study of how molecules can ‘think’ has not produced an equal variety of computational models and applications of artificial chemical systems. Although biomolecular systems have been hypothesized to carry out neural-network-like computations in vivo and the synthesis of artificial chemical analogues has been proposed theoretically, experimental work has so far fallen short of fully implementing even a single neuron. Here, building on the richness of DNA computing and strand displacement circuitry, we show how molecular systems can exhibit autonomous brain-like behaviours. Using a simple DNA gate architecture that allows experimental scale-up of multilayer digital circuits, we systematically transform arbitrary linear threshold circuits (an artificial neural network model) into DNA strand displacement cascades that function as small neural networks. Our approach even allows us to implement a Hopfield associative memory with four fully connected artificial neurons that, after training in silico, remembers four single-stranded DNA patterns and recalls the most similar one when presented with an incomplete pattern. Our results suggest that DNA strand displacement cascades could be used to endow autonomous chemical systems with the capability of recognizing patterns of molecular events, making decisions and responding to the environment.

884 citations
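
A linear threshold gate, the circuit element named in the abstract above, outputs 1 when the weighted sum of its binary inputs reaches a threshold. The sketch below shows the abstract computational model only, not the DNA implementation; the function name and the example weights are illustrative assumptions.

```python
def threshold_gate(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Three-input majority: unit weights and a threshold of 2.
for bits in [(0, 0, 1), (1, 0, 1), (1, 1, 1)]:
    print(bits, threshold_gate(bits, weights=(1, 1, 1), threshold=2))
```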