
Showing papers on "String (computer science)" published in 1989


Journal ArticleDOI
TL;DR: An enhanced analysis feature set consisting of both instantaneous and transitional spectral information is used and the hidden-Markov-model (HMM)-based connected-digit recognizer in speaker-trained, multispeaker, and speaker-independent modes is tested.
Abstract: The authors use an enhanced analysis feature set consisting of both instantaneous and transitional spectral information and test the hidden-Markov-model (HMM)-based connected-digit recognizer in speaker-trained, multispeaker, and speaker-independent modes. For the evaluation, both a 50-talker connected-digit database recorded over local, dialed-up telephone lines, and the Texas Instruments, 225-adult-talker, connected-digits database are used. Using these databases, the performance achieved was 0.35, 1.65, and 1.75% string error rates for known-length strings, for speaker-trained, multispeaker, and speaker-independent modes, respectively, and 0.78, 2.85, and 2.94% string error rates for unknown-length strings of up to seven digits in length for the three modes. Several experiments were carried out to determine the best set of conditions (e.g., training, recognition, parameters, etc.) for recognition of digits. The results and the interpretation of these experiments are described.

205 citations


Patent
26 May 1989
TL;DR: In this article, a method and device for coded entry of Chinese character text data into a word processing, display, printing, telecommunication, etc. system is presented, where an electronic input keyboard is used that has keys marked with phonetic notations suitable to represent Chinese speech sounds, as well as a set of "character position keys," operated by the following encoding rules: (1) the text is divided into blocks of characters, where one block may contain one or more characters, each block to be encoded by one uninterrupted typing sequence; (2) if the pronunciation of
Abstract: A method and device for coded entry of Chinese character text data into a word processing, display, printing, telecommunication, etc. system. In the principal embodiment of the invention, an electronic input keyboard is used that has keys marked with phonetic notations suitable to represent Chinese speech sounds, as well as a set of "character position keys," operated by the following encoding rules: (1) the text is divided into blocks of characters, where one block may contain one or more characters, each block to be encoded by one uninterrupted typing sequence; (2) if the pronunciation of a block is unique, encoding is done simply by entering on the keyboard the phonetic data of the character(s) making up the block; (3) if the pronunciation of a block is not unique, first the phonetic data of a string of characters making up a longer block is entered, the pronunciation of that longer block being unique and the block to be encoded being a part of the longer block, and then by using the "character position keys" the operator enters the "position data," that is, the position(s) which the character(s) of the block to be encoded occupy within that longer block. In an alternative embodiment, part or all of the phonetic data of the characters are entered into the encoding apparatus, not by keyboard means, but by the use of an acoustic speech sound analyzer.

204 citations


Patent
06 Oct 1989
TL;DR: An apparatus and method are described for converting an input data character stream into a variable length encoded data stream in a data compression system, using a shift register in which each input character is compared simultaneously with every stored entry.
Abstract: An apparatus and method are disclosed for converting an input data character stream into a variable length encoded data stream in a data compression system. The data compression system includes a shift register means. The shift register means has a plurality of entries and each entry of the shift register means is for storing a data character of the input data stream. The method for converting the input data character stream includes the following steps. Performing a search in the shift register means for a data string which matches the input data string. The step for performing the search includes the steps of broadcasting each input data character of the input data stream to each entry of the shift register means and comparing each input data character simultaneously with the previously stored contents of each entry of said shift register means. If the matching data string is found within the shift register means, the next step includes encoding the longest matching data string by appending to the encoded data stream a tag indicating the matching data string and a string substitution code. If the matching data string is not found within the shift register means, the next step includes encoding the first character of the input data string by appending to the encoded data stream a raw data tag and the first character of the input data string.

159 citations
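
The encoding rule in the abstract above (emit a tagged reference to the longest match found in the history, otherwise a raw tag plus one character) can be illustrated with a small software sketch. This is only an illustration under stated assumptions: the patent compares each character against all shift-register entries in parallel in hardware, whereas the code below scans the window sequentially, and the names `encode`, `decode`, `window_size`, and `min_match` are hypothetical.

```python
def encode(data: str, window_size: int = 16, min_match: int = 2):
    """Return tokens: ("raw", ch) or ("match", offset, length).

    window_size stands in for the number of shift-register entries and
    min_match is an assumed threshold; neither value comes from the patent.
    """
    tokens = []
    i = 0
    while i < len(data):
        start = max(0, i - window_size)
        best_len, best_off = 0, 0
        for j in range(start, i):
            length = 0
            # Extend the match, keeping it entirely inside the history.
            while (i + length < len(data)
                   and j + length < i
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            tokens.append(("match", best_off, best_len))  # tag + substitution code
            i += best_len
        else:
            tokens.append(("raw", data[i]))  # raw data tag + first character
            i += 1
    return tokens


def decode(tokens):
    """Invert encode() by copying matched strings out of the output so far."""
    out = []
    for token in tokens:
        if token[0] == "raw":
            out.append(token[1])
        else:
            _, offset, length = token
            start = len(out) - offset
            out.extend(out[start:start + length])
    return "".join(out)
```

For example, `encode("abcabcabc")` produces three raw tokens followed by two match tokens, and `decode` restores the original string.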


Patent
03 Nov 1989
TL;DR: A directional drill bit is coupled to the lower end of a drill string through a universal joint, which allows the bit to pivot relative to the string axis.
Abstract: A directional drilling apparatus and method in which the drill bit is coupled to the lower end of a drill string through a universal joint which allows the bit to pivot relative to the string axis. The bit is contra-nutated in an orbit of fixed radius and at a rate equal to string rotation but in the opposite direction. This speed-controlled and phase-controlled bit nutation keeps the bit heading off-axis in a fixed direction. The invention enables directional drilling while the drill string rotates normally.

152 citations


Journal ArticleDOI
TL;DR: In this paper, a similarity measure based on the 2D string longest common subsequence is defined, and an algorithm for similarity retrieval is proposed.

141 citations


Patent
Alan Clark
08 Dec 1989
TL;DR: A data compression system is described in which a dictionary stores strings of characters and an encoder matches the longest of the stored strings with a current string of a data stream input to the encoder.
Abstract: A data compression system in which a dictionary stores strings of characters and an encoder matches the longest of the stored strings with a current string of a data stream input to the encoder. The index of the longest matched stored string is output by the encoder and the dictionary is updated by a new string consisting of the previous match concatenated with the first two characters only of the present match. If the present match has only one or two characters, it is added without reduction.

140 citations


Journal ArticleDOI
TL;DR: This work describes several approximation algorithms that produce solutions that are always within a factor of two of optimum with respect to the overlap measure of the shortest common superstring problem (SCS).
Abstract: The object of the shortest common superstring problem (SCS) is to find the shortest possible string that contains every string in a given set as substrings. As the problem is NP-complete, approximation algorithms are of interest. The value of an approximate solution to SCS is normally taken to be its length, and we seek algorithms that make the length as small as possible. A different measure is given by the sum of the overlaps between consecutive strings in a candidate solution. When considering this measure, the object is to find solutions that make it as large as possible. These two measures offer different ways of viewing the problem. While the two viewpoints are equivalent with respect to optimal solutions, they differ with respect to approximate solutions. We describe several approximation algorithms that produce solutions that are always within a factor of two of optimum with respect to the overlap measure. We also describe an efficient implementation of one of these, using McCreight's compact suffix tree construction algorithm. The worst-case running time is O(m log n) for small alphabets, where m is the sum of the lengths of all the strings in the set and n is the number of strings. For large alphabets, the algorithm can be implemented in O(m log m) time by using Sleator and Tarjan's lexicographic splay tree data structure.

139 citations
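
To make the overlap measure from the abstract above concrete, here is a minimal sketch of a greedy merging heuristic: repeatedly join the two strings with the largest suffix-prefix overlap. Whether this corresponds to any of the algorithms in the paper is an assumption, and the naive pairwise search below does not use suffix trees, so it does not achieve the running times quoted in the abstract. The function names are hypothetical.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Build a common superstring by always merging the best-overlapping pair."""
    # Drop duplicates and strings already contained in another string.
    items = list(dict.fromkeys(strings))
    items = [s for s in items if not any(s != t and s in t for t in items)]
    while len(items) > 1:
        best = (-1, 0, 1)  # (overlap length, left index, right index)
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        merged = items[i] + items[j][k:]
        items = [s for idx, s in enumerate(items) if idx not in (i, j)]
        items.append(merged)
    return items[0] if items else ""
```

With the input `["abcd", "cdef", "bcde"]` the sketch merges `abcd` with `bcde` (overlap 3) and then the result with `cdef` (overlap 3), giving `abcdef`. The result is always a superstring of the inputs, but it is not guaranteed to be the shortest one.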


Proceedings ArticleDOI
26 Jun 1989
TL;DR: An algorithm for generating strings from logical form encodings that improves upon previous algorithms in that it places fewer restrictions on the class of grammars to which it is applicable: it allows semantically nonmonotonic grammars, yet unlike top-down methods, it also permits left-recursion.
Abstract: We present an algorithm for generating strings from logical form encodings that improves upon previous algorithms in that it places fewer restrictions on the class of grammars to which it is applicable. In particular, unlike an Earley deduction generator (Shieber, 1988), it allows use of semantically nonmonotonic grammars, yet unlike top-down methods, it also permits left-recursion. The enabling design feature of the algorithm is its implicit traversal of the analysis tree for the string being generated in a semantic-head-driven fashion.

108 citations


Journal ArticleDOI
TL;DR: A technique for synthesizing systolic arrays which have non-uniform data flow governed by control signals is presented and it is shown how to derive the control signals in such arrays by applying similar pipelining transformations to these linear conditional expressions.
Abstract: We present a technique for synthesizing systolic arrays which have non-uniform data flow governed by control signals. The starting point for the synthesis is an Affine Recurrence Equation, a generalization of the simple recurrences encountered in mathematics. A large class of programs, including most (single and multiple) nested-loop programs can be described by such recurrences. In this paper we extend our earlier work (Rajopadhye and Fujimoto 1986) in two principal directions. Firstly, we characterize a class of transformations called data pipelining and show that they yield recurrences that have linear conditional expressions governing the computation. Secondly, we discuss the synthesis of systolic arrays that have non-uniform data flow governed by control signals. We show how to derive the control signals in such arrays by applying similar pipelining transformations to these linear conditional expressions. The approach is illustrated by deriving the Guibas-Kung-Thompson architecture for computing the cost of optimal string parenthesization.

94 citations


Proceedings ArticleDOI
01 Dec 1989
TL;DR: This chapter focuses on a polynomial-time algorithm for learning k-variable pattern languages in the learning model introduced by Valiant, where a pattern is a string of constant and variable symbols.
Abstract: This chapter focuses on a polynomial-time algorithm for learning k-variable pattern languages in the learning model introduced by Valiant for each constant k. A pattern is a string of constant and variable symbols. For any constant k, the algorithm learns a k-variable target pattern p by producing a polynomial-sized disjunction of patterns, each of between 0 and k variables. The algorithm allows empty substitutions and can be extended to handle restricted homomorphisms on the substitution strings. It is assumed that the algorithm has access to a random source of negative examples, generated according to an arbitrary distribution, and a random source of positive examples of the target pattern p in which the k-tuple of substitution strings is drawn not from an arbitrary distribution but from any product distribution.

83 citations


Patent
20 Apr 1989
TL;DR: A method for generating a chain code expression examines the convolution information for a bit which identifies an adjacent configuration pixel; the position of the identified bit is then the next code in the chain.
Abstract: A method for processing images is useful for generating a chain code representation of a configuration of pixels in an image. The method stores a pixel representation in a first frame memory. For each pixel, "convolution information" is stored in a second frame memory. Convolution information indicates whether neighborhood pixels are part of the configuration. In one embodiment, convolution information for an object pixel is a bit string. The location of a bit in the string corresponds to a direction of displacement from the object pixel. The value of a bit in the string indicates whether a neighborhood pixel located in the corresponding direction is part of the configuration. A first method for generating a chain code expression examines the convolution information for a bit which identifies an adjacent configuration pixel. Examination begins from a bit location determined from a prior code in the chain. The position of the identified bit is then the next code in the chain. An alternate method uses a look-up table. The address of the look-up table is the convolution information expression, and the content of the table is a chain code.
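
The first method in the abstract above can be sketched in software for the simple case of a thin, non-branching, open curve. Everything specific here is an assumption made for illustration: the direction numbering (0 = east, then counter-clockwise), the rule for where the scan starts after a prior code, and the names `DIRS`, `neighbor_bits`, and `chain_code`. The patent's frame memories are replaced by a Python set and dictionary.

```python
# Assumed direction numbering as (row, col) offsets: 0 = E, 1 = NE, 2 = N, ...
DIRS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def neighbor_bits(pixels):
    """Map each configuration pixel to an 8-bit value; bit d is set when the
    neighbor in direction d is also part of the configuration."""
    info = {}
    for (r, c) in pixels:
        bits = 0
        for d, (dr, dc) in enumerate(DIRS):
            if (r + dr, c + dc) in pixels:
                bits |= 1 << d
        info[(r, c)] = bits
    return info


def chain_code(pixels, start):
    """Trace a thin open curve from start and return its list of direction codes."""
    info = neighbor_bits(pixels)
    pos, prev, codes = start, None, []
    for _ in range(len(pixels) - 1):  # at most one step per remaining pixel
        # Start the scan at a bit position derived from the prior code and
        # skip the reverse direction so the trace never steps straight back.
        first = 0 if prev is None else (prev + 5) % 8
        count = 8 if prev is None else 7
        for step in range(count):
            d = (first + step) % 8
            if (info[pos] >> d) & 1:
                break
        else:
            return codes  # no forward neighbor: end of the curve
        codes.append(d)
        pos = (pos[0] + DIRS[d][0], pos[1] + DIRS[d][1])
        prev = d
    return codes
```

For the pixel set `{(0, 0), (0, 1), (0, 2), (1, 3), (2, 3)}` traced from `(0, 0)`, the sketch returns `[0, 0, 7, 6]`: two steps east, one step south-east, one step south. Closed contours and branching shapes need a more careful tracing rule than this.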

Journal ArticleDOI
TL;DR: It is shown that it is possible to improve the average time of the Boyer‐Moore string matching algorithm using more space by applying a transformation that virtually increases the size of the alphabet in use.
Abstract: We show that it is possible to improve the average time of the Boyer-Moore string matching algorithm using more space. This is accomplished by applying a transformation that virtually increases the size of the alphabet in use. The improvement is such that for long patterns it is possible to obtain an algorithm more than 50 per cent faster than the original one. We include experimental results on random and English text. Some improvements for searching on English text are also discussed.

Patent
31 Aug 1989
TL;DR: A method for detecting and correcting errors in a string of information signals is proposed, including errors that are properly spelled but unintended words, such as "HOUSE" written for "HORSE".
Abstract: A method of detecting and correcting an error in a string of information signals. When each information signal represents a word, the method detects and corrects spelling errors. The method detects and corrects an error which is a properly spelled word, but which is the wrong (not intended) word. For example, the method is capable of detecting and correcting a misspelling of "HORSE" as "HOUSE". In the spelling error detection and correction method, a first word in an input string of words is changed to form a second word different from a first word to form a candidate string of words. The spellings of the first word and the second word are in the spelling dictionary. The probability of occurrence of the input string of words is compared to the product of the probability of occurrence of the candidate string of words multiplied by the probability of misrepresenting the candidate string of words as the input string of words. If the former is greater than or equal to the latter, no correction is made. If the former is less than the latter, the candidate string of words is selected as a spelling correction.
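
The decision rule at the end of the abstract above compares P(input string) against P(candidate string) multiplied by P(candidate misrepresented as input). A minimal sketch of that comparison follows; the function name, the dictionary-based probability lookup, and all numeric values are made up for illustration and do not come from the patent.

```python
def select_string(input_words, candidate_words, p_string, p_misrepresent):
    """Keep the input unless the candidate is the more probable explanation.

    p_string maps a tuple of words to its probability of occurrence;
    p_misrepresent is the probability of the candidate being written as the input.
    """
    p_input = p_string[tuple(input_words)]
    p_candidate = p_string[tuple(candidate_words)] * p_misrepresent
    if p_input >= p_candidate:
        return list(input_words)      # no correction is made
    return list(candidate_words)      # candidate selected as the correction


# Hypothetical probabilities, for illustration only.
P_STRING = {
    ("the", "house", "galloped"): 1e-9,
    ("the", "horse", "galloped"): 1e-6,
}
```

With these made-up numbers and `p_misrepresent = 0.01`, the candidate side is about 1e-8, which exceeds 1e-9, so "the horse galloped" is selected. With `p_misrepresent = 1e-4` the candidate side drops to about 1e-10 and the input is left unchanged.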

Journal ArticleDOI
TL;DR: A simple abstract model for a class of discrete control processes, motivated in part by recent work about the behavior of imperfect random sources in computer algorithms, that produces a string of bits which is a “success” or “failure” depending on whether the string produced belongs to a prespecified set L.
Abstract: We consider a simple abstract model for a class of discrete control processes, motivated in part by recent work about the behavior of imperfect random sources in computer algorithms. The process produces a string of n bits and is a “success” or “failure” depending on whether the string produced belongs to a prespecified set L. In an uninfluenced process each bit is chosen by a fair coin toss, and hence the probability of success is |L|/2^n. A player, called the controller, is introduced who has the ability to intervene in the process by specifying the value of some of the bits of the string. We answer the following questions for both worst and average case: (1) how much can the player increase the probability of success given a fixed number of interventions? (2) in terms of |L|, what is the expected number of interventions needed to guarantee success? In particular our results imply that if |L|/2^n = 1/Ω(n), where Ω(n) tends to infinity with n (so the probability of success with no interventions is o(1)), then with O(√n log Ω(n)) interventions the probability of success is 1 − o(1). Our main results and the proof techniques are related to well-known results of Kruskal, Katona and Harper in extremal set theory.

Journal ArticleDOI
TL;DR: A new connectionist paradigm, the optimum path paradigm (OPP), and its application to the problem of string instrument fingering, whose generality goes beyond musical applications (Sayegh and Manzor-Coats 1988).
Abstract: This paper introduces a new connectionist paradigm, the optimum path paradigm (OPP) and its application to the problem of string instrument fingering. The optimization approach on a Viterbi network is the feedforward phase that maps input to output once the values of the weights are known. The approach taken is a natural one that proceeds from the formulation of the problem, goes through the rule-based and the optimization approaches, and leads to the final learning phase. A discussion of how the optimum path paradigm relates to other connectionist paradigms is also presented. The paper will address the specific problem of fingering for string instruments (Yampolsky 1967; Gilardino 1975), in particular the fingering of homophonic music written for the classical guitar (Gilardino 1975; Sayegh 1987; 1988). The method presented is general, however, and can be used in a variety of musical applications, where the written transcription often carries only partial information about the sound to be rendered, the remaining information depending on context and/or interpretation. The reason for restricting the study to a particular application is the desire to introduce another level of generality, illustrating different approaches that can be taken. One such approach, the optimization approach, leads naturally to a new connectionist learning paradigm, the optimum path paradigm, whose generality goes beyond musical applications (Sayegh and Manzor-Coats 1988). The connectionist aspect of the problem is introduced in a very natural way. In the early phases of the treatment, it is present at the level of constraint propagation (Waltz 1975) or of connectionism (Feldman 1985) and is very closely tied to the nature of the problem. The most interesting aspect is that it then goes one step beyond, culminating in the learning aspect that is absent in similar treatments, although essential for a viable connectionist paradigm.

Proceedings ArticleDOI
04 Oct 1989
TL;DR: A method that facilitates the rapid retrieval of a given image sequence from a large database is presented. The authors exploit the fact that much of the information stored is redundant and extend the two-dimensional string methodology to image sequences.
Abstract: A method that facilitates the rapid retrieval of a given image sequence from a large database is presented. The authors exploit the fact that much of the information stored is redundant. They extend the two-dimensional string methodology to image sequences. This permits queries on the relative positions of objects within video sequences, including changes in position over time.

Journal ArticleDOI
TL;DR: The main result is that there exist sequential splicing systems with recursively unsolvable membership problem and the technique of the proof is to embed Turing machine computations in the languages.
Abstract: The notion of splicing system has been used to abstract the process of DNA digestion by restriction enzymes and subsequent religation. A splicing system language is the formal language of all DNA strings producible by such a process. The membership problem is to devise an algorithm (if possible) to answer the question of whether or not a given DNA string belongs to a splicing system language given by initial strings and enzymes. In this paper the concept of a sequential splicing system is introduced. A sequential splicing system differs from a splicing system in that the latter allows arbitrarily many copies of any string in the initial set whereas the sequential splicing system may restrict the initial number of copies of some strings. The main result is that there exist sequential splicing systems with recursively unsolvable membership problem. The technique of the proof is to embed Turing machine computations in the languages.

Journal ArticleDOI
01 Apr 1989
TL;DR: This work surveys several algorithms for searching a string in a piece of text and includes theoretical and empirical results, as well as the actual code of each algorithm.
Abstract: We survey several algorithms for searching a string in a piece of text. We include theoretical and empirical results, as well as the actual code of each algorithm. An extensive bibliography is also included.

01 Jan 1989
TL;DR: This paper showed that for any language that has a perfect zero-knowledge proof system, its complement has a short interactive protocol, which implies that there are not any perfect zero knowledge protocols for NP-complete languages unless the polynomial time hierarchy collapses.
Abstract: A Perfect Zero-Knowledge interactive proof system convinces a verifier that a string is in a language without revealing any additional knowledge in an information-theoretic sense. We show that for any language that has a perfect zero-knowledge proof system, its complement has a short interactive protocol. This result implies that there are not any perfect zero-knowledge protocols for NP-complete languages unless the polynomial time hierarchy collapses. This paper demonstrates that knowledge complexity can be used to show that a language is easy to prove.

PatentDOI
TL;DR: In this article, a sustaining device is provided for prolonging the vibration of a string of a musical instrument having a first magnetic pickup that is responsive to the vibrations of the string.
Abstract: A sustaining device is provided for prolonging the vibration of a string of a stringed musical instrument having a first magnetic pickup means responsive to the vibration of the string. The sustaining device includes a magnetic string driver capable of inducing a vibration in the string. A first amplifier amplifies the output of the pickup to a level that provides sufficient energy to the driver to prolong the vibration of the string. A switch is coupled to the driver for selecting the mode of operation of the driver between the pickup mode of operation wherein the driver functions as a second magnetic pickup, and a driver mode of operation wherein the driver functions as a magnetic string driver. An output changing device is provided that is responsive to the switch for changing the output of at least one of the first pickup and a driver in response to a change in the mode of operation of the driver.

Patent
20 Sep 1989
TL;DR: A data storage device for storing data strings each including units of data has a plurality of memory sections each for storing therein a table, a plurality of processor elements one provided for each of the tables and a controlling unit having an internal memory in which data strings are stored.
Abstract: A data storage device for storing data strings each including units of data has a plurality of memory sections each for storing therein a table, a plurality of processor elements one provided for each of the tables and a controlling unit having an internal memory in which data strings are stored. The table contains a plurality of records each including a unit of data, a first index data representative of the number of units of data of a data string which the unit of data constitutes and a second index data unique to each individual data string. The processor elements access in parallel their associated tables under control of the control unit for data storage and data retrieval. The first and second index data are generated by the controller for the purpose of data storage. For retrieval of data strings: a key data string is divided into units of data; retrieval keys are produced each from a combination of one of the divided units of data and a third index data representing the number of units of data of the key data string; second index data are retrieved in parallel in the tables of the data storage device with the retrieval keys so that a single second index data common to all of the retrieved second index data is found among them; and a data string corresponding to the found common second index data is outputted from the internal memory of the storage device.

Patent
Keiji Kojima, Yusuke Mishina
17 Oct 1989
TL;DR: A method of search in which a plurality of candidate character strings likely to coincide with a designated key word are detected from a text character string, and it is decided whether the candidate character strings detected include a character string coincident with the key word character string.
Abstract: The present invention relates to a method of search in which a plurality of candidate character strings likely to coincide with a designated key word are detected from a text character string, and it is decided whether the candidate character strings detected include a character string coincident with the key word character string. Further, the sequence of detection of character string candidates is determined in such a manner that a portion having a succession of characters coincident with any of a plurality of characters included in the key word character string is selected as a candidate character string.

Proceedings ArticleDOI
01 Jan 1989
TL;DR: It is shown that recurrent nets trained with the RTRL (real-time recurrent learning) algorithm are able to learn tasks that Elman nets appear unable to learn, and they learn a more stringent form of the task that does not require knowledge of the string boundaries to be used.
Abstract: It is shown that recurrent nets trained with the RTRL (real-time recurrent learning) algorithm are able to learn tasks that Elman nets appear unable to learn. Moreover, they learn a more stringent form of the task that does not require knowledge of the string boundaries to be used. This is of potential importance in cases in which this information is not known beforehand but must be learned by the network, such as speech recognition. Although the recurrent nets learn the prediction tasks, they do so with great difficulty, requiring many string presentations to reach criteria. This is a significant problem because of the large amount of computation required by the recurrent algorithm. Some of the tasks have taken 30 hours of CPU time on a Sun-4/280 to be learned. This time could be greatly reduced if advantage could be taken of the very high degree of parallelism in the RTRL algorithm.

Patent
01 Dec 1989
TL;DR: The branch history table stores a plurality of pairs, each pair including a branch destination address and a set of an address of a branch instruction and a value obtained by subtracting a given value from the address as discussed by the authors.
Abstract: An instruction fetching device includes one or both of a cache device and a branch history table. The cache device stores a plurality of pairs, each pair including an instruction string divided into minimum unit instructions and an address of the instruction string. At the time of reading an instruction, an instruction string is selected and output by every minimum unit instruction from at least two pairs. The branch history table stores a plurality of pairs, each pair including a branch destination address and a set of an address of a branch instruction and a value obtained by subtracting a given value from the address. At the time of reading an instruction, first, a pair having an address of a branch instruction which address is the nearest to a head address of an instruction string to be read out is selected from a plurality of pairs, each pair including an address of a branch instruction which address is in a predetermined address range including the instruction string to be read out, or each pair of the plurality of pairs including a value obtained by subtracting a given value from the address of the branch instruction. Then, a branch destination address is selected and output from the pair which is selected first from the plurality of pairs.

Journal ArticleDOI
TL;DR: A technique for implementing a static transition table of a string pattern matching machine which locates all occurrences of a finite number of keywords in a string is described; retrieval from the reduced data structure can be speeded up by a finite straight program without loops.
Abstract: A technique for implementing a static transition table of a string pattern matching machine which locates all occurrences of a finite number of keywords in a string is described. The approach is based on S.C. Johnson's (1975) storage and retrieval method of the transition table of a finite-state machine. By restricting the transition table of the finite-state machine to that of the string pattern-matching machine, the triple arrays of Johnson's data structure can be reduced to two arrays. The retrieval program of the reduced data structure can be speeded up by a finite straight program without loops.

Patent
19 Jun 1989
TL;DR: An improved method of generating a compressed representation of a source data string, each symbol of which is taken from a finite set of m+1 symbols, a_0 to a_m, is presented.
Abstract: An improved method of generating a compressed representation of a source data string, each symbol of which is taken from a finite set of m+1 symbols, a_0 to a_m. The method is based on an arithmetic coding procedure wherein the source data string is recursively generated as successive subintervals within a predetermined interval. The width of each subinterval is theoretically equal to the width of the previous subinterval multiplied by the probability of the current symbol. The improvement derives from approximating the width of the previous subinterval so that the approximation can be achieved by a single SHIFT and ADD operation using a suitable shift register.
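
The theoretical recursion in the abstract above (each subinterval's width equals the previous width multiplied by the current symbol's probability) can be worked through exactly with rational arithmetic. This sketch shows only that idealized recursion; it does not reproduce the patent's SHIFT-and-ADD approximation of the width. The function name, the symbol ordering, and the probability table are assumptions chosen for the example.

```python
from fractions import Fraction


def subinterval(message, probs):
    """Return (low, width) of the subinterval of [0, 1) that encodes message.

    probs maps each symbol to its probability; symbols are ordered by the
    dictionary's insertion order, which is an arbitrary choice for this sketch.
    """
    low, width = Fraction(0), Fraction(1)
    order = list(probs)
    for symbol in message:
        # Total probability of all symbols ordered before the current one.
        before = order[:order.index(symbol)]
        cumulative = sum((probs[s] for s in before), Fraction(0))
        low += width * cumulative
        width *= probs[symbol]  # new width = previous width * P(symbol)
    return low, width


# Hypothetical three-symbol source.
PROBS = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
```

For the message `"ab"` the interval narrows from width 1 to 1/2 after `a`, then to 1/8 after `b`, starting at 1/4. Any number in [1/4, 3/8) therefore identifies the message under this model.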

Book ChapterDOI
14 Jun 1989
TL;DR: The parser is an extension of Earley's algorithm, which was originally developed for context free string grammars, and is able to recognize not only complete structures generated by a plex grammar but also partial ones.
Abstract: Plex grammars according to [13], generating two-dimensional plex structures, are a generalization of string grammars. In this paper we describe a parser for context free plex grammars. The parser is an extension of Earley's algorithm, which was originally developed for context free string grammars. Our parser is able to recognize not only complete structures generated by a plex grammar but also partial ones. The algorithm has been implemented and tested on a number of examples. The time complexity of the parser is exponential in general, but there exist subclasses of plex languages for which the parser has a polynomial time complexity.

Patent
Albert Pierson Vreeland
26 Dec 1989
TL;DR: A method for detecting identical consecutive runs of data is described. The logical operation exclusive OR (XOR) is performed on each pair of adjacent bytes in a source string of data bytes in one machine instruction, resulting in a comparison data string.
Abstract: A method of detecting certain runs of data, such as identical consecutive runs of data, is disclosed. The logical operation exclusive OR (XOR) is performed on each pair of adjacent bytes in a source string of data bytes in a single machine instruction, resulting in a comparison data string of bytes. The comparison data string is then sequentially searched for a byte which matches a predetermined byte. In the case of a search, the predetermined byte may be any value, which value is determined by the XOR of two adjacent bytes that are to be found. In the case of compression, as used in this invention, the predetermined byte has a value of zero, indicating that two adjacent bytes in the source string are identical. The sequential search occurs in a single Translate and Test (TRT) machine instruction. Once an all-zero byte is located, the subsequent byte in the comparison data string is examined. If the subsequent byte is also an all-zero byte, three identical, consecutive bytes in the source string have been located and are considered to be compressible. Additional bytes in the identical, consecutive run of bytes are located by further searching. If the subsequent byte in the comparison data string is not also an all-zero byte, the search for an all-zero byte is restarted with the byte following the subsequent byte in the comparison data string.
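
The XOR-based run detection in the abstract above can be sketched as follows. In the patent the XOR and the search each run as a single machine instruction (the search being Translate and Test); the code below emulates both with ordinary Python loops, and the names `find_runs` and `min_run` are hypothetical.

```python
def find_runs(data: bytes, min_run: int = 3):
    """Return (start, length) for each run of at least min_run identical bytes."""
    # XOR every pair of adjacent bytes; a zero byte marks two equal neighbors.
    comparison = bytes(a ^ b for a, b in zip(data, data[1:]))
    runs = []
    i = 0
    while i < len(comparison):
        if comparison[i] == 0:
            j = i
            while j < len(comparison) and comparison[j] == 0:
                j += 1
            length = j - i + 1  # k consecutive zeros mean k + 1 identical bytes
            if length >= min_run:
                runs.append((i, length))
            i = j
        else:
            i += 1
    return runs
```

For `b"abbbbcddde"` the comparison string has zeros at positions 1 to 3 and 6 to 7, so the sketch reports `[(1, 4), (6, 3)]`: four `b` bytes starting at index 1 and three `d` bytes starting at index 6. A pair of identical bytes, as in `b"aabb"`, yields a single zero and is not reported.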

Journal ArticleDOI
TL;DR: There is a finite automaton which accepts one but not the other and has a number of states much less than the length of either string and this considerably strengthens the best previously known result.

Patent
22 Dec 1989
TL;DR: A selective call system is provided in which string searching or text matching operations are performed on data from data generating devices such as condition sensors in a security system. When a predetermined string is identified, an address of a selective call receiver such as a pager is generated and that receiver is called.
Abstract: A selective call system is provided in which string searching or text matching operations are performed on data from data generating devices such as condition sensors in a security system. When a predetermined string is identified, an address of a selective call receiver such as a pager is generated and that receiver is called. The invention is applicable to other applications where data is generated which includes predetermined character strings.