Author
Michael G. Thomason
Other affiliations: Duke University, Union Carbide
Bio: Michael G. Thomason is an academic researcher from the University of Tennessee. The author has contributed to research on topics including Markov chains and Markov models. The author has an h-index of 19 and has co-authored 54 publications receiving 1,965 citations. Previous affiliations of Michael G. Thomason include Duke University and Union Carbide.
Papers
TL;DR: This paper describes a method for statistical testing based on a Markov chain model of software usage that allows test input sequences to be generated from multiple probability distributions, making it more general than many existing techniques.
Abstract: Statistical testing of software establishes a basis for statistical inference about a software system's expected field quality. This paper describes a method for statistical testing based on a Markov chain model of software usage. The significance of the Markov chain is twofold. First, it allows test input sequences to be generated from multiple probability distributions, making it more general than many existing techniques. Analytical results associated with Markov chains facilitate informative analysis of the sequences before they are generated, indicating how the test is likely to unfold. Second, the test input sequences generated from the chain and applied to the software are themselves a stochastic model and are used to create a second Markov chain to encapsulate the history of the test, including any observed failure information. The influence of the failures is assessed through analytical computations on this chain. We also derive a stopping criterion for the testing process based on a comparison of the sequence generating properties of the two chains.
433 citations
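As a rough illustration of the idea in the abstract above, the following minimal sketch generates test input sequences by random-walking a Markov chain usage model, then estimates a second chain from the generated sequences. The state names, transition probabilities, and function names are hypothetical, and the frequency-count estimate is a simplification rather than the paper's own construction (which also records failure information).

```python
import random
from collections import Counter, defaultdict

# Hypothetical usage model: each state maps to (next_state, probability) pairs.
USAGE_MODEL = {
    "Invoke": [("Menu", 1.0)],
    "Menu": [("Edit", 0.6), ("Save", 0.3), ("Terminate", 0.1)],
    "Edit": [("Menu", 0.7), ("Save", 0.3)],
    "Save": [("Menu", 0.5), ("Terminate", 0.5)],
}


def generate_test_sequence(model, rng, start="Invoke", end="Terminate"):
    """Random-walk the chain from the start state to the absorbing end state."""
    state = start
    sequence = [state]
    while state != end:
        next_states, weights = zip(*model[state])
        state = rng.choices(next_states, weights=weights, k=1)[0]
        sequence.append(state)
    return sequence


def estimate_chain(sequences):
    """Estimate transition probabilities from observed transition counts."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


rng = random.Random(0)  # fixed seed so the run is repeatable
sequences = [generate_test_sequence(USAGE_MODEL, rng) for _ in range(200)]
testing_chain = estimate_chain(sequences)
```

As more sequences are generated, the estimated probabilities in `testing_chain` should drift toward those of `USAGE_MODEL`; comparing the two chains is the kind of check a stopping criterion could build on.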
Book•
01 Jun 1978
TL;DR: This book provides an introduction to basic concepts and techniques of syntactic pattern recognition and emphasizes fundamental and practical material rather than strictly theoretical topics.
Abstract: This book provides an introduction to basic concepts and techniques of syntactic pattern recognition. The presentation emphasizes fundamental and practical material rather than strictly theoretical topics, and numerous examples illustrate the principles. The subject is developed according to the following topics: introduction (background, patterns and pattern classes, approaches to pattern recognition, elements of a pattern recognition system, concluding remarks); elements of formal language theory (introduction; string grammars and languages; examples of pattern languages and grammars; equivalent context-free grammars; syntax-directed translations; deterministic, nondeterministic, and stochastic systems; concluding remarks); higher-dimensional grammars (introduction; tree grammars; web grammars; plex grammars; shape grammars; concluding remarks); recognition and translation of syntactic structures (introduction; string language recognizers; automata for simple syntax-directed translation; parsing in string languages; recognition of imperfect strings; tree automata; concluding remarks); stochastic grammars, languages, and recognizers (introduction; stochastic grammars and languages; consistency of stochastic context-free grammars; stochastic recognizers; stochastic syntax-directed translations; modified Cocke-Younger-Kasami parsing algorithm for stochastic errors of changed symbols; concluding remarks); and grammatical inference (introduction; inference of regular grammars; inference of context-free grammars; inference of tree grammars; inference of stochastic grammars; concluding remarks). 155 references, 93 figures, 4 tables. (RWR)
296 citations
TL;DR: In this paper, some sufficient conditions for convergence under “max(min)” products of the powers of a square fuzzy matrix and of a fuzzy state process are established.
Abstract: A Boolean matrix is a matrix with elements having values of either 1 or 0; a fuzzy matrix is a matrix with elements having values in the closed interval [0, 1]. Fuzzy matrices occur in the modeling of various fuzzy systems, with products usually determined by the “max(min)” rule arising from fuzzy set theory. In this paper, some sufficient conditions for convergence under “max(min)” products of the powers of a square fuzzy matrix and of a fuzzy state process are established.
229 citations
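The "max(min)" product mentioned in the abstract above replaces the sum and product of ordinary matrix multiplication with max and min. The sketch below, with hypothetical function names and example matrices chosen only for illustration, computes successive powers of a square fuzzy matrix and reports whether they reach a fixed point. It does not implement the paper's sufficient conditions; it only shows that powers may converge for one matrix and oscillate for another.

```python
def maxmin_product(a, b):
    """Max(min) product: entry (i, j) is the max over k of min(a[i][k], b[k][j])."""
    inner = len(b)
    cols = len(b[0])
    return [
        [max(min(row[k], b[k][j]) for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def power_fixed_point(a, max_steps=50):
    """Return (k, A^k) for the first k with A^(k+1) == A^k, else None."""
    current = a
    for k in range(1, max_steps + 1):
        nxt = maxmin_product(current, a)
        if nxt == current:
            return k, current
        current = nxt
    return None


# Entries stay within the original value set, so exact comparison is safe.
converging = [[0.7, 0.3], [0.4, 0.6]]   # A^2 equals A
oscillating = [[0.2, 0.8], [0.5, 0.3]]  # powers alternate with period 2

print(power_fixed_point(converging))   # (1, [[0.7, 0.3], [0.4, 0.6]])
print(power_fixed_point(oscillating))  # None
```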
01 Jan 1982
TL;DR: This paper reviews concepts of syntactic pattern recognition with emphasis on syntax-directed translations and discusses active research areas which include methods of grammatical inference, probabilistic systems, approaches to error correction, and techniques of combining syntax with semantics.
Abstract: This paper reviews concepts of syntactic pattern recognition with emphasis on syntax-directed translations. Examples of recent work on hybrid and hierarchical systems are cited. There is a brief discussion of active research areas which include methods of grammatical inference, probabilistic systems, approaches to error correction, and techniques of combining syntax with semantics.
134 citations
28 Jun 2004
TL;DR: This analysis focuses on the performance of individual codes for finite systems, and addresses several important heretofore unanswered questions about employing LDPC codes in real-world systems.
Abstract: As peer-to-peer and widely distributed storage systems proliferate, the need to perform efficient erasure coding, instead of replication, is crucial to performance and efficiency. Low-density parity-check (LDPC) codes have arisen as alternatives to standard erasure codes, such as Reed-Solomon codes, trading off vastly improved decoding performance for inefficiencies in the amount of data that must be acquired to perform decoding. The scores of papers written on LDPC codes typically analyze their collective and asymptotic behavior. Unfortunately, their practical application requires the generation and analysis of individual codes for finite systems. This paper attempts to illuminate the practical considerations of LDPC codes for peer-to-peer and distributed storage systems. The three main types of LDPC codes are detailed, and a huge variety of codes are generated, then analyzed using simulation. This analysis focuses on the performance of individual codes for finite systems, and addresses several important heretofore unanswered questions about employing LDPC codes in real-world systems.
126 citations
Cited by
Book•
01 Jan 2004
TL;DR: This book provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.
Abstract: Kernel methods provide a powerful and unified framework for pattern discovery, motivating algorithms that can act on general types of data (e.g. strings, vectors or text) and look for general types of relations (e.g. rankings, classifications, regressions, clusters). The application areas range from neural networks and pattern recognition to machine learning and data mining. This book, developed from lectures and tutorials, fulfils two major roles: firstly it provides practitioners with a large toolkit of algorithms, kernels and solutions ready to use for standard pattern discovery problems in fields such as bioinformatics, text analysis, image analysis. Secondly it provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.
6,050 citations
TL;DR: This work surveys the current techniques to cope with the problem of string matching that allows errors, and focuses on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms.
Abstract: We survey the current techniques to cope with the problem of string matching that allows errors. This is becoming a more and more relevant issue for many fast growing areas such as information retrieval and computational biology. We focus on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms and their complexities. We present a number of experiments to compare the performance of the different algorithms and show which are the best choices. We conclude with some directions for future work and open problems.
2,723 citations
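For readers unfamiliar with online approximate string matching under edit distance, the sketch below shows one classical dynamic-programming formulation (often attributed to Sellers): the usual edit-distance recurrence, except that a match may start at any text position at zero cost. The function name and the example strings are illustrative assumptions, not taken from the entry above.

```python
def approximate_match_ends(pattern, text, k):
    """Return 1-based text positions where a substring ending there
    is within edit distance k of pattern."""
    m = len(pattern)
    # col[i] = best edit distance between pattern[:i] and some text suffix
    # ending at the current position; before any text is read, col[i] = i.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]  # value of the cell up and to the left
        col[0] = 0          # a match may begin anywhere in the text
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(
                prev_diag + cost,  # match or substitution
                old + 1,           # extra character in the text
                col[i - 1] + 1,    # pattern character missing from the text
            )
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


print(approximate_match_ends("survey", "surgery", 2))  # [5, 6, 7]
```

This version runs in time proportional to the product of the pattern and text lengths and keeps only one column in memory.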
TL;DR: It is shown that there is a fundamental tradeoff between storage and repair bandwidth, which is theoretically characterized using flow arguments on an appropriately constructed graph, and regenerating codes are introduced that can achieve any point in this optimal tradeoff.
Abstract: Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a single node failure is for a new node to reconstruct the whole encoded data object to generate just one encoded block. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to communicate functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.
1,919 citations
TL;DR: This paper identifies some promising techniques for image retrieval according to standard principles and examines implementation procedures for each technique and discusses its advantages and disadvantages.
Abstract: More and more images have been generated in digital form around the world. There is a growing interest in finding images in large collections or from remote databases. In order to find an image, the image has to be described or represented by certain features. Shape is an important visual feature of an image. Searching for images using shape features has attracted much attention. There are many shape representation and description techniques in the literature. In this paper, we classify and review these important techniques. We examine implementation procedures for each technique and discuss its advantages and disadvantages. Some recent research results are also included and discussed in this paper. Finally, we identify some promising techniques for image retrieval according to standard principles.
1,910 citations