Author

Ming-Yang Kao

Bio: Ming-Yang Kao is an academic researcher from Northwestern University. The author has contributed to research in topics: Time complexity & Planar graph. The author has an h-index of 37 and has co-authored 202 publications receiving 4438 citations. Previous affiliations of Ming-Yang Kao include Tufts University & Indiana University.


Papers
Proceedings Article
Zhichun Li, Manan Sanghi, Yan Chen, Ming-Yang Kao, B. Chavez
21 May 2006
TL;DR: Hamsa is proposed, a network-based automated signature generation system for polymorphic worms which is fast, noise-tolerant and attack-resilient, and significantly outperforms Polygraph in terms of efficiency, accuracy, and attack resilience.
Abstract: Zero-day polymorphic worms pose a serious threat to the security of Internet infrastructures. Given their rapid propagation, it is crucial to detect them at edge networks and automatically generate signatures in the early stages of infection. Most existing approaches for automatic signature generation need host information and are thus not applicable for deployment on high-speed network links. In this paper, we propose Hamsa, a network-based automated signature generation system for polymorphic worms which is fast, noise-tolerant and attack-resilient. Essentially, we propose a realistic model to analyze the invariant content of polymorphic worms which allows us to make analytical attack-resilience guarantees for the signature generation algorithm. Evaluation based on a range of polymorphic worms and polymorphic engines demonstrates that Hamsa significantly outperforms Polygraph (J. Newsome et al., 2005) in terms of efficiency, accuracy, and attack resilience.

313 citations

Proceedings Article
01 Feb 2000
TL;DR: The de novo peptide sequencing problem, which asks for the peptide sequence given tandem mass spectral data of k ions, is solved by implicitly transforming the spectral data into an NC-spectrum graph G = (V, E) with |V| = 2k + 2; the approach can also be used to discover a modified amino acid in O(|V||E|) time.
Abstract: Tandem mass spectrometry fragments a large number of molecules of the same peptide sequence into charged molecules of prefix and suffix peptide subsequences and then measures mass/charge ratios of ...

242 citations

Journal Article
TL;DR: This paper studies the tile complexity of self-assembly under several natural generalizations of the tile self-assembly model and, for thin rectangles with length N and width k, provides a lower bound of $\Omega(\frac{N^{1/k}}{k})$ for the standard model.
Abstract: In this paper, we study the complexity of self-assembly under models that are natural generalizations of the tile self-assembly model. In particular, we extend Rothemund and Winfree's study of the tile complexity of tile self-assembly [Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, Portland, OR, 2000, pp. 459--468]. They provided a lower bound of $\Omega(\frac{\log N}{\log\log N})$ on the tile complexity of assembling an $N\times N$ square for almost all N. Adleman et al. [Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, Heraklion, Greece, 2001, pp. 740--748] gave a construction which achieves this bound. We consider whether the tile complexity for self-assembly can be reduced through several natural generalizations of the model. One of our results is a tile set of size $O(\sqrt{\log N})$ which assembles an $N\times N$ square in a model which allows flexible glue strength between nonequal glues. This result is matched for almost all N by a lower bound dictated by Kolmogorov complexity. For three other generalizations, we show that the $\Omega(\frac{\log N}{\log\log N})$ lower bound applies to $N\times N$ squares. At the same time, we demonstrate that there are some other shapes for which these generalizations allow reduced tile sets. Specifically, for thin rectangles with length N and width k, we provide a tighter lower bound of $\Omega(\frac{N^{1/k}}{k})$ for the standard model, yet we also give a construction which achieves $O(\frac{\log N}{\log\log N})$ complexity in a model in which the temperature of the tile system is adjusted during assembly. We also investigate the problem of verifying whether a given tile system uniquely assembles into a given shape; we show that this problem is NP-hard for three of the generalized models.

225 citations

Journal Article
Ting Chen, Ming-Yang Kao, Matthew Tepel, John Rush, George M. Church
TL;DR: In this paper, the authors proposed a dynamic programming-based method to reconstruct the peptide sequence from a given tandem mass spectral data of k ions by implicitly transforming the spectral data into an NC-spectrum graph G = (V, E).
Abstract: Tandem mass spectrometry fragments a large number of molecules of the same peptide sequence into charged molecules of prefix and suffix peptide subsequences and then measures mass/charge ratios of these ions. The de novo peptide sequencing problem is to reconstruct the peptide sequence from a given tandem mass spectral data of k ions. By implicitly transforming the spectral data into an NC-spectrum graph G = (V, E) where |V| = 2k + 2, we can solve this problem in O(|V||E|) time and O(|V|^2) space using dynamic programming. For an ideal noise-free spectrum with only b- and y-ions, we improve the algorithm to O(|V| + |E|) time and O(|V|) space. Our approach can be further used to discover a modified amino acid in O(|V||E|) time. The algorithms have been implemented and tested on experimental data.

224 citations
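
The graph idea in the peptide sequencing abstract above can be illustrated with a much simpler setting. The sketch below is not the NC-spectrum graph algorithm from the paper: it assumes an ideal spectrum that contains only prefix masses, uses nominal integer residue masses for a handful of amino acids, and connects two masses only when they differ by exactly one residue. The names `RESIDUE_MASS` and `sequence_from_prefix_masses` are hypothetical and chosen for this illustration.

```python
# Nominal (integer) residue masses for a few amino acids; illustration only.
RESIDUE_MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101}


def sequence_from_prefix_masses(prefix_masses):
    """Return a peptide whose prefix masses explain the most peaks, or None.

    Nodes are the observed prefix masses (plus 0). An edge u -> v exists when
    v - u equals the mass of one residue. The answer is read off a path from
    0 to the largest mass that visits the most nodes.
    """
    nodes = sorted(set(prefix_masses) | {0})
    node_set = set(nodes)
    target = nodes[-1]
    # best[v] = (number of peaks on the path to v, sequence spelled so far)
    best = {0: (1, "")}
    for v in nodes:  # ascending mass order is a topological order of the DAG
        if v not in best:
            continue  # v is not reachable from mass 0
        count, seq = best[v]
        for residue, mass in RESIDUE_MASS.items():
            u = v + mass
            if u in node_set and (u not in best or best[u][0] < count + 1):
                best[u] = (count + 1, seq + residue)
    return best[target][1] if target in best else None


if __name__ == "__main__":
    # Prefix masses of the peptide G-A-V-S under the nominal masses above.
    print(sequence_from_prefix_masses([0, 57, 128, 227, 314]))
```

Because edges span a single residue only, a missing peak breaks the path in this sketch; handling noise, missing ions, and the pairing of prefix and suffix ions is what the dynamic programming described in the abstract addresses.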

Journal Article
TL;DR: This paper studies the w-lane cow-path problem and presents the first randomized algorithm for it. The algorithm is shown to be optimal for two paths (w = 2), with evidence that it is also optimal for larger values of w.
Abstract: Searching for a goal is a central and extensively studied problem in computer science. In classical searching problems, the cost of a search function is simply the number of queries made to an oracle that knows the position of the goal. In many robotics problems, as well as in problems from other areas, we want to charge a cost proportional to the distance between queries (e.g., the time required to travel between two query points). With this cost function in mind, the abstract problem known as the w-lane cow-path problem was designed. There are known optimal deterministic algorithms for the cow-path problem; we give the first randomized algorithm in this paper. We show that our algorithm is optimal for two paths (w = 2) and give evidence that it is optimal for larger values of w. Subsequent to the preliminary version of this paper, Kao et al. (in "Proceedings, 5th ACM-SIAM Symposium on Discrete Algorithms," pp. 372-381, 1994) have shown that our algorithm is indeed optimal for all w ≥ 2. Our randomized algorithm gives expected performance that is almost twice as good as is possible with a deterministic algorithm. For the performance of our algorithm, we also derive the asymptotic growth with respect to w; despite similar complexity results for related problems, it appears that this growth has never been analyzed.

149 citations
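
As a point of comparison for the cow-path abstract above, the classic deterministic doubling strategy for w = 2 can be sketched as follows. This is not the randomized algorithm from the paper; it is the well-known deterministic baseline, whose travelled distance stays below 9 times the distance to the goal. The function name `doubling_search_cost` and the convention that the goal lies at distance at least 1 from the origin are assumptions made for this illustration.

```python
def doubling_search_cost(goal: float) -> float:
    """Distance travelled by the doubling strategy on a line until `goal` is hit.

    `goal` is a signed position: positive means the right lane, negative the
    left lane. The searcher starts at 0, walks out 1 to the right, returns,
    walks out 2 to the left, returns, then 4 to the right, and so on.
    """
    if abs(goal) < 1:
        raise ValueError("this sketch assumes the goal is at distance >= 1")
    travelled = 0.0
    reach = 1.0      # how far the current excursion goes
    direction = 1    # +1 = right lane, -1 = left lane
    while True:
        in_this_lane = (goal > 0) == (direction > 0)
        if in_this_lane and abs(goal) <= reach:
            return travelled + abs(goal)
        travelled += 2 * reach  # walk out and come back to the origin
        reach *= 2
        direction = -direction


if __name__ == "__main__":
    # Goals placed just past a turning point are the costly cases.
    for goal in (1.0, -1.0, 4.0001, -8.0001, 1024.0001):
        cost = doubling_search_cost(goal)
        print(goal, cost, cost / abs(goal))
```

The abstract's claim that the randomized algorithm is almost twice as good as any deterministic one is measured against this kind of worst-case ratio.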


Cited by
Journal Article

3,734 citations

Journal Article
03 Jun 2011-Science
TL;DR: This work experimentally demonstrated several digital logic circuits, culminating in a four-bit square-root circuit that comprises 130 DNA strands, which enables fast and reliable function in large circuits with roughly constant switching time and linear signal propagation delays.
Abstract: To construct sophisticated biochemical circuits from scratch, one needs to understand how simple the building blocks can be and how robustly such circuits can scale up. Using a simple DNA reaction mechanism based on a reversible strand displacement process, we experimentally demonstrated several digital logic circuits, culminating in a four-bit square-root circuit that comprises 130 DNA strands. These multilayer circuits include thresholding and catalysis within every logical operation to perform digital signal restoration, which enables fast and reliable function in large circuits with roughly constant switching time and linear signal propagation delays. The design naturally incorporates other crucial elements for large-scale circuitry, such as general debugging tools, parallel circuit preparation, and an abstraction hierarchy supported by an automated circuit compiler.

1,249 citations

Journal Article
TL;DR: A new de novo sequencing software package, PEAKS, is described, to extract amino acid sequence information without the use of databases, using a new model and a new algorithm to efficiently compute the best peptide sequences whose fragment ions can best interpret the peaks in the MS/MS spectrum.
Abstract: A number of different approaches have been described to identify proteins from tandem mass spectrometry (MS/MS) data. The most common approaches rely on the available databases to match experimental MS/MS data. These methods suffer from several drawbacks and cannot be used for the identification of proteins from unknown genomes. In this communication, we describe a new de novo sequencing software package, PEAKS, to extract amino acid sequence information without the use of databases. PEAKS uses a new model and a new algorithm to efficiently compute the best peptide sequences whose fragment ions can best interpret the peaks in the MS/MS spectrum. The output of the software gives amino acid sequences with confidence scores for the entire sequences, as well as an additional novel positional scoring scheme for portions of the sequences. The performance of PEAKS is compared with Lutefisk, a well-known de novo sequencing software, using quadrupole-time-of-flight (Q-TOF) data obtained for several tryptic peptides from standard proteins.

1,239 citations

Journal Article
21 Jul 2011-Nature
TL;DR: It is suggested that DNA strand displacement cascades could be used to endow autonomous chemical systems with the capability of recognizing patterns of molecular events, making decisions and responding to the environment.
Abstract: The impressive capabilities of the mammalian brain—ranging from perception, pattern recognition and memory formation to decision making and motor activity control—have inspired their re-creation in a wide range of artificial intelligence systems for applications such as face recognition, anomaly detection, medical diagnosis and robotic vehicle control. Yet before neuron-based brains evolved, complex biomolecular circuits provided individual cells with the ‘intelligent’ behaviour required for survival. However, the study of how molecules can ‘think’ has not produced an equal variety of computational models and applications of artificial chemical systems. Although biomolecular systems have been hypothesized to carry out neural-network-like computations in vivo and the synthesis of artificial chemical analogues has been proposed theoretically, experimental work has so far fallen short of fully implementing even a single neuron. Here, building on the richness of DNA computing and strand displacement circuitry, we show how molecular systems can exhibit autonomous brain-like behaviours. Using a simple DNA gate architecture that allows experimental scale-up of multilayer digital circuits, we systematically transform arbitrary linear threshold circuits (an artificial neural network model) into DNA strand displacement cascades that function as small neural networks. Our approach even allows us to implement a Hopfield associative memory with four fully connected artificial neurons that, after training in silico, remembers four single-stranded DNA patterns and recalls the most similar one when presented with an incomplete pattern. Our results suggest that DNA strand displacement cascades could be used to endow autonomous chemical systems with the capability of recognizing patterns of molecular events, making decisions and responding to the environment.

884 citations
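
The linear threshold circuits mentioned in the abstract above are the abstract model that the DNA cascades implement. A minimal software sketch of that model, assuming binary inputs and integer weights, is shown below; it says nothing about how the paper realizes weights or thresholds chemically. The function names `threshold_gate` and `xor_circuit` are hypothetical.

```python
def threshold_gate(inputs, weights, threshold):
    """Linear threshold gate: output 1 when the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def xor_circuit(a, b):
    """Two-layer linear threshold circuit computing XOR of two bits."""
    both = threshold_gate([a, b], [1, 1], 2)               # layer 1: AND
    return threshold_gate([a, b, both], [1, 1, -2], 1)     # layer 2: OR, vetoed by AND


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_circuit(a, b))
```

XOR cannot be computed by a single threshold gate, so the example also shows why multilayer circuits matter in this model.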