Showing papers on "String (computer science)" published in 1986


Journal ArticleDOI
TL;DR: This paper extracts a single hybrid approach with a rich language that mixes algebra and logic and a natural class of models of concurrent processes. Its core is a notion of partial string, obtained by viewing a string as a linearly ordered multiset and relaxing the linearity constraint, which yields partially ordered multisets, or pomsets.
Abstract: Concurrency has been expressed variously in terms of formal languages (typically via the shuffle operator), partial orders, and temporal logic, inter alia. In this paper we extract from these three approaches a single hybrid approach having a rich language that mixes algebra and logic and having a natural class of models of concurrent processes. The heart of the approach is a notion of partial string derived from the view of a string as a linearly ordered multiset by relaxing the linearity constraint, thereby permitting partially ordered multisets or pomsets. Just as sets of strings form languages, so do sets of pomsets form processes. We introduce a number of operations useful for specifying concurrent processes and demonstrate their utility on some basic examples. Although none of the operations is particularly oriented to nets it is nevertheless possible to use them to express processes constructed as a net of subprocesses, and more generally as a system consisting of components. The general benefits of the approach are that it is conceptually straightforward, involves fewer artificial constructs than many competing models of concurrency, yet is applicable to a considerably wider range of types of systems, including systems with buses and ethernets, analog systems, and real-time systems.

658 citations


PatentDOI
TL;DR: A text locating system that recognizes spoken words and uses them as a search string; it can be used in a text editing system that lets a user switch between a dictation mode, which inserts recognized words into text, and a search mode, which uses them to search for new cursor locations.
Abstract: A text locating system recognizes spoken utterances, uses the recognized words as a search string, and searches text for words matching that search string. The probability that a given vocabulary word is selected as a search word is altered both by limiting the recognizable vocabulary to words in the text to be searched, and by altering the probability that individual recognizable words will be selected as a function of the number of times they occur in that text. The system performs incremental searches by adding successively recognized words to the search string and searching for the next occurrence of the string in response to each such addition. The invention can be used in a text editing system which enables a user to switch between a dictation mode, which inserts recognized words into text, and a search mode, which uses them to search for new cursor locations. Broadly speaking, the invention provides a computer system which recognizes spoken words; which has a data structure representing words; which uses that data structure for a purpose other than speech recognition; and which alters the probability that a given vocabulary word will be recognized as a function of the frequency of that word in the data structure.

314 citations


Proceedings Article
01 Aug 1986
TL;DR: The idea is to generate a string of symbols using an L-system, and to interpret this string as a sequence of commands which control a "turtle", which can be used to create a variety of fractal curves.
Abstract: A new method for generating pictures is presented and illustrated with examples. The idea is to generate a string of symbols using an L-system, and to interpret this string as a sequence of commands which control a "turtle". Suitable generalizations of the notions of the L-system and of a turtle are introduced. The resulting mathematical model can be used to create a variety of (finite approximations of) fractal curves, ranging from Koch curves, to classic space-filling curves, to relatively realistic-looking pictures of plants and trees. All these pictures are defined in a uniform and compact way.

286 citations
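
The string-rewriting and turtle-interpretation idea in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's own formulation: the function names `expand` and `trace`, the command alphabet (`F` = move forward, `+` = turn left, `-` = turn right), and the Koch-curve rule `F -> F+F--F+F` with a 60 degree turn are all choices made here for the example.

```python
import math

def expand(axiom, rules, depth):
    """Rewrite every symbol in parallel, `depth` times (symbols without a rule are kept)."""
    s = axiom
    for _ in range(depth):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

def trace(commands, angle_deg, step=1.0):
    """Interpret a command string as turtle moves and return the visited points."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for ch in commands:
        if ch == "F":            # move forward one step, drawing a segment
            x += step * math.cos(math.radians(heading))
            y += step * math.sin(math.radians(heading))
            points.append((x, y))
        elif ch == "+":          # turn left
            heading += angle_deg
        elif ch == "-":          # turn right
            heading -= angle_deg
    return points

# Assumed example: one Koch-curve production, expanded twice.
koch_rules = {"F": "F+F--F+F"}
koch = expand("F", koch_rules, 2)
path = trace(koch, 60)
```

Each rewriting pass replaces one segment with four, so `path` holds 17 points after two passes; feeding the points to any plotting routine draws the finite approximation of the curve.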


Journal ArticleDOI
TL;DR: Shannon's self-information of a string is generalized to its complexity relative to the class of finite-state-machine (FSM) defined sources; the definition is justified in part by a theorem stating that, asymptotically, the mean complexity provides a tight lower bound for the mean length of all so-called regular codes.
Abstract: Shannon's self-information of a string is generalized to its complexity relative to the class of finite-state-machine (FSM) defined sources. Unlike an earlier generalization, the new one is valid for both short and long strings. The definition is justified in part by a theorem stating that, asymptotically, the mean complexity provides a tight lower bound for the mean length of all so-called regular codes. This also generalizes Shannon's noiseless coding theorem. For a large subclass of FSM sources a simple algorithm is described for computing the complexity.

210 citations


PatentDOI
TL;DR: A speech recognition system which can perform multiple recognition passes on each word, which may also be used as an interactive transcription system for prerecorded speech and can operate on either discrete utterances or continuous speech.
Abstract: A speech recognition system which can perform multiple recognition passes on each word. If the recognizer is correct in its first pass, the operator may abort later passes by either pressing a key or speaking the next word. Otherwise, the operator may either wait for a second recognition pass to be performed against a larger vocabulary, or may specify one or more initial letters causing the second recognition pass to be performed against a vocabulary substantially restricted to words starting with those initial letters. Each time the user adds an additional letter to the initial string, any previous recognition is aborted and the re-recognition process is started anew with the new string. If the user types a control character after the initial string, then the string itself is used as the output of the recognizer. In one embodiment, a language model limits a relatively small vocabulary used in the first pass to the words most likely to occur given the language context of the dictated word. The system may also be used as an interactive transcription system for prerecorded speech and can operate on either discrete utterances or continuous speech. When used with prerecorded speech, the system displays the best scoring words of a recognition to the user, and, when the user chooses a desired word from such a display, the system employs the portion of prerecorded speech matched against the chosen word to help determine where in that prerecorded speech the system should look for the next word to recognize.

171 citations


Patent
Victor S. Miller1, Mark N. Wegman1
11 Aug 1986
TL;DR: A data compression method for communications between a host computing system and a number of remote terminals modifies the method of Lempel and Ziv by adding new character and new string extensions to improve the compression ratio, and a least-recently-used deletion routine to keep the encoding tables at a fixed size.
Abstract: Communications between a host computing system and a number of remote terminals are enhanced by a data compression method that modifies the data compression method of Lempel and Ziv in two ways: new character and new string extensions are added to improve the compression ratio, and a least-recently-used deletion routine limits the encoding tables to a fixed size, significantly improving data transmission efficiency.

162 citations


Journal ArticleDOI
TL;DR: Based on the Boyer–Moore–Galil approach, a new algorithm is proposed which requires a number of character comparisons bounded by 2n, regardless of the number of occurrences of the pattern in the textstring.
Abstract: Based on the Boyer–Moore–Galil approach, a new algorithm is proposed which requires a number of character comparisons bounded by 2n, regardless of the number of occurrences of the pattern in the textstring. Preprocessing is only slightly more involved and still requires a time linear in the pattern size.

148 citations
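
For readers unfamiliar with the Boyer-Moore family that this entry builds on, the sketch below shows the basic idea of skipping ahead in the text using a precomputed shift table. It is deliberately a different, simpler relative (the Horspool variant, which uses only a bad-character table), not the algorithm proposed in the paper, and it does not achieve the 2n comparison bound stated in the abstract. The function name `horspool_search` is an assumption made for this example.

```python
def horspool_search(text, pattern):
    """Return the start index of every occurrence of `pattern` in `text`."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Preprocessing, linear in the pattern size: for each character of the
    # pattern except the last, the distance from its last occurrence to the
    # end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Slide the window according to the text character aligned with the
        # last pattern position; characters absent from the table allow a
        # full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return hits
```

For example, `horspool_search("abracadabra", "abra")` returns `[0, 7]`.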


Journal ArticleDOI
15 Jun 1986
TL;DR: This paper proposes using Gray codes instead of binary codes to map record signatures to buckets, develops a mathematical model, derives formulas giving the average performance of both methods, and shows that the proposed method achieves 0% - 50% relative savings over the binary codes.
Abstract: [...] of this string decides the bucket in which the record is stored. In this paper we propose to use Gray codes instead of binary codes, in order to map record signatures to buckets. In Gray codes, successive codewords differ in the value of exactly one bit position; thus, successive buckets hold records with similar record signatures. The proposed method achieves better clustering of similar records and avoids some of the (expensive) random disk accesses, replacing them with sequential ones. We develop a mathematical model, derive formulas giving the average performance of both methods and show that the proposed method achieves 0% - 50% relative savings over the binary codes. We also discuss how Gray codes could be applied to some retrieval methods designed for range queries, such as the grid file [Nievergelt84a] and the approach based on the so-called z-ordering [Orenstein84a].

123 citations
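
The property the abstract above relies on, that successive Gray codewords differ in exactly one bit, can be checked with a short Python sketch. The binary-reflected Gray code used here is an assumption (the abstract does not say which Gray code the authors use), and the helper names `to_gray` and `from_gray` are invented for this example.

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray codeword for the non-negative integer n."""
    return n ^ (n >> 1)

def from_gray(g: int) -> int:
    """Inverse of to_gray: recover the rank of a Gray codeword."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# The eight 3-bit codewords, in Gray order.
codes = [format(to_gray(i), "03b") for i in range(8)]
# codes == ['000', '001', '011', '010', '110', '111', '101', '100']

# Number of differing bit positions between each pair of successive codewords.
differences = [bin(to_gray(i) ^ to_gray(i + 1)).count("1") for i in range(7)]
# differences == [1, 1, 1, 1, 1, 1, 1]
```

In plain binary order, by contrast, the step from 011 to 100 flips all three bits, which is the kind of jump between dissimilar neighbours that the Gray ordering avoids.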


Proceedings Article
01 Jan 1986

93 citations


Journal ArticleDOI
TL;DR: An OPM/L data compression scheme suggested by Ziv and Lempel, LZ77, is applied to text compression and a slightly modified version suggested by Storer and Szymanski, LZSS, is found to achieve compression ratios as good as most existing schemes for a wide range of texts.
Abstract: An OPM/L data compression scheme suggested by Ziv and Lempel, LZ77, is applied to text compression. A slightly modified version suggested by Storer and Szymanski, LZSS, is found to achieve compression ratios as good as most existing schemes for a wide range of texts. LZSS decoding is very fast, and comparatively little memory is required for encoding and decoding. Although the time complexity of LZ77 and LZSS encoding is O(M) for a text of M characters, straightforward implementations are very slow. The time consuming step of these algorithms is a search for the longest string match. Here a binary search tree is used to find the longest string match, and experiments show that this results in a dramatic increase in encoding speed. The binary tree algorithm can be used to speed up other OPM/L schemes, and other applications where a longest string match is required. Although the LZSS scheme imposes a limit on the length of a match, the binary tree algorithm will work without any limit.

88 citations
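
To make the "search for the longest string match" concrete, here is a minimal LZSS-style sketch in Python. It uses the straightforward linear scan of the window that the abstract above calls very slow, not the binary search tree the paper proposes. The token format (a literal character, or an `(offset, length)` pair), the parameter values `window=4096`, `max_len=18` and `min_len=3`, and all function names are assumptions made for illustration.

```python
def longest_match(data, pos, window=4096, max_len=18):
    """Brute-force search of the preceding window for the longest match at `pos`."""
    best_off, best_len = 0, 0
    for cand in range(max(0, pos - window), pos):
        length = 0
        # A match may run past `pos` and overlap the text being encoded.
        while (length < max_len and pos + length < len(data)
               and data[cand + length] == data[pos + length]):
            length += 1
        if length > best_len:
            best_off, best_len = pos - cand, length
    return best_off, best_len

def lzss_encode(data, min_len=3):
    """Emit literals, or (offset, length) pairs when a match is long enough."""
    out, pos = [], 0
    while pos < len(data):
        off, length = longest_match(data, pos)
        if length >= min_len:
            out.append((off, length))
            pos += length
        else:
            out.append(data[pos])
            pos += 1
    return out

def lzss_decode(tokens):
    """Rebuild the text by copying literals and back-references."""
    buf = []
    for tok in tokens:
        if isinstance(tok, tuple):
            off, length = tok
            for _ in range(length):
                buf.append(buf[-off])   # one symbol at a time, so overlaps work
        else:
            buf.append(tok)
    return "".join(buf)
```

Decoding is a plain copy loop, which is consistent with the abstract's remark that decoding is very fast; all of the cost sits in `longest_match`, the step the paper's binary tree is meant to accelerate.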


Patent
09 Jun 1986
TL;DR: A method and apparatus for data and word processing in Chinese, based on a Phonetic Chinese Language (PCL) in which each ideogram is unambiguously represented by a phonetic word of no more than four characters; a method for unambiguously separating a polysyllabic PCL character string into separate words makes a polysyllabic dictionary unnecessary.
Abstract: A method and apparatus for data processing and word processing in the Chinese language. A Phonetic Chinese Language (PCL) is defined in which any ideogram can be unambiguously represented by a Phonetic Chinese Word (PCW) no more than four characters in length, each word being composed of letters selected from a defined set of letters that can each be uniquely represented by a 7-bit digital code. Each PCW represents one and only one ideogram and provides the full sound and tone information required to pronounce it. Ambiguities caused by homonyms and homotones are avoided. PCL words are translated into their corresponding ideograms and vice versa by means of a stored monosyllabic dictionary. A method for unambiguously separating a polysyllabic PCL character string into separate words is also provided, which makes it unnecessary to employ a polysyllabic dictionary. Also disclosed is a method of forming an alphagrammic listing from PCL character strings by separating the strings into separate characters and listing them in alphabetical order, provided that homotones and identical ideograms are grouped together even if strict alphabetical ordering of the string would have separated them. The disclosure also includes a keyboard adapted for efficiently entering PCL characters for processing.

PatentDOI
Kenneth Church1
TL;DR: In this paper, text is analyzed to determine the language source of words therein by successively selecting letter sequences of each word in the text and at least one signal representative of the probability that the text word corresponds to a particular language source is generated responsive to the selected letter sequences.
Abstract: In text-to-speech generating arrangements, text is analyzed to determine the language source of words therein by successively selecting letter sequences of each word in the text. At least one signal representative of the probability that the text word corresponds to a particular language source is generated responsive to the selected letter sequences and a language source is selected for converting the text word to a phonetic string responsive to the probability representative signals.

PatentDOI
TL;DR: In this article, the unknown utterance is analyzed as a sequence of phonemes, then each phoneme is labeled to form a string of labels, and the shortest label interval which is recognized as a word is assigned a storage stack where similar sounding candidate words are stored.
Abstract: Continuous speech recognition is improved by use of a known vocabulary and context probabilities. First, the unknown utterance is analyzed as a sequence of phonemes, then each phoneme is labelled to form a string of labels. The shortest label interval which is recognized as a word is assigned a storage stack where similar-sounding candidate words are stored. Multiple stack decoding, and likelihood envelope criteria for word path extension decisions, are further features of the system.

Proceedings Article
01 Jan 1986
TL;DR: A new method for algorithmically generating musical scores is presented and illustrated with examples; the idea is to produce a string of symbols using an L-system, and to interpret this string as a sequence of notes.
Abstract: A new method for algorithmically generating musical scores is presented and illustrated with examples. The idea is to produce a string of symbols using an L-system, and to interpret this string as a sequence of notes. The proposed musical interpretation of L-systems is closely related to their graphical interpretation, which in turn associates L-systems to fractals.

Book ChapterDOI
01 Apr 1986
TL;DR: A deterministic parallel algorithm to compute algebraic expressions in log n time using n/log(n) processors on a parallel random access machine without write conflicts (P-RAM) with no free preprocessing is described.
Abstract: We describe a deterministic parallel algorithm to compute algebraic expressions in log n time using n/log(n) processors on a parallel random access machine without write conflicts (P-RAM) with no free preprocessing. The input to our algorithm is a string, given by an array, of the expression. Such a form for the input enables a consecutive numbering of the operands in the expression in log(n) time with n/log(n) processors. This corresponds to a consecutive numbering of leaves in the tree of the expression which further enables a suitable partitioning of the leaves into small segments. We improve the result of Miller and Reif (1985), who described an optimal parallel randomized algorithm. Our algorithm can be used to construct optimal parallel algorithms for the recognition of two nontrivial subclasses of context-free languages: bracket and input-driven languages. These languages are the most complicated context-free languages known to be recognizable in deterministic logarithmic space. This strengthens the result of Mattheyses and Fiduccia (1982) who constructed an almost optimal parallel algorithm for Dyck languages, since Dyck languages are a proper subclass of input-driven languages.


Patent
Ichikawa Yoshio1
14 Jan 1986
TL;DR: A radio paging system for transmitting address signals and messages, as well as information common to a plurality of receivers, and a receiver applicable to such a system is described in this paper.
Abstract: A radio paging system for transmitting address signals and messages, as well as information common to a plurality of receivers, and a receiver applicable to such a system. The radio paging system includes: an encoder which produces a first code, followed by a string of address codes each of which is followed by message codes, and a second code followed by a string of information codes, a transmitter for transmitting a string of codes produced by the encoder; and a receiver which upon detection of the first code produces an alert signal and displays the message codes when the address code agrees with an address code stored in memory, and which, upon detection of the second code, displays the information codes which follow.

Patent
21 Jan 1986
TL;DR: In this paper, a decorative string set used for Christmas lighting is disclosed, which comprises various embodiments of clamping means which provide positive retention of incandescent lamps to their respective electrical sockets.
Abstract: A decorative string set used for Christmas lighting is disclosed. The string set comprises various embodiments of clamping means which provide positive retention of incandescent lamps to their respective electrical sockets.

Journal ArticleDOI
TL;DR: A unified system for automatically recognizing fluently spoken digit strings based on whole-word reference units is presented, which can use either hidden Markov model (HMM) technology or template-based technology and contains features from both approaches.

Journal ArticleDOI
TL;DR: It is shown that, given a homogeneous trellis automaton, one can construct an equivalent one (stable or superstable) that allows the input string to be fed to any sufficiently long row of processors.
Abstract: Systolic trellis automata are models of hexagonally connected and triangularly shaped systolic arrays. This paper studies the problems of stability, decidability, and complexity for them. The original definition of systolic trellis automata requires that an input string is fed to a specific row of processors. Here it is shown that given a homogeneous trellis automaton we can construct an equivalent one (stable or superstable) which allows the input string to be fed to any sufficiently long row of processors. Moreover, some closure and decidability results for trellis automata are established and the computational complexity of languages accepted by trellis automata is investigated.

Book ChapterDOI
Gabriele Rohr1
01 Jan 1986
TL;DR: The recent use of icons in user interfaces of application software, together with a few attempts to design visual programming languages, raises the question of how useful it is to introduce visual concepts into the understanding of computer systems.
Abstract: The recent use of icons in user interfaces of application software, together with a few attempts to design visual programming languages, raises the question of how useful it is to introduce visual concepts into the understanding of computer systems. Usually these visual symbols or icons are integrated with something like a visual language, which serves together with special manipulation devices as a kind of command language. This form is known as a "graphical interface." Graphical interfaces can be defined in more detail by expressing all possible spatial functions (e.g., moving a string, etc.) as direct manipulation on visual objects and places and showing all property changes visually.

Journal ArticleDOI
TL;DR: With this description and generation method, structural characters can be encoded without much memory and natural-looking patterns can be generated; educational and graphical applications are also considered.
Abstract: We report a pattern description and generation method of structural characters, and show some examples of Chinese and Korean character patterns actually generated. We also consider its educational and graphical applications. In this method, any character is regarded as a composite pattern constructed by several simpler subpatterns, and is described in terms of them by introducing three kinds of positional relationships among them. A composite pattern can become a subpattern, too. We call these patterns blocks. Syntactic grammar is defined to encode a pattern expression by two code strings, that is, a string of blocks and a string of production rules. They are used in generating patterns, namely, derivation of pattern expression from the code strings is defined as a process of pattern generation. By this description and generation method, we can encode structural characters without much memory, and generate natural shapes of patterns.

Patent
Funatsu Shigehiro1
28 Mar 1986
TL;DR: In this method of controlling the logical simulation of a logic circuit whose logic elements can be simulated by forming a shift register string, shift-in signals are set into the shift register string and internal states are extracted from it without serial shift operations, by monitoring a simulation table which has a plurality of addresses for the logic elements.
Abstract: In a method of controlling a logical simulation of a logic circuit comprising a plurality of logic elements which can be simulated by forming a shift register string, the logical simulation is carried out without serial shift operation of the shift register string on setting shift-in signals into the shift register string and/or on extracting internal states from the shift register string. Setting of the shift-in signals and extraction of the internal states are possible by monitoring a simulation table which has a plurality of addresses for the logic elements.

Journal ArticleDOI
TL;DR: Two algorithms are proposed that can recognize general context-free languages without restriction on the length of the input string; they essentially speed up the dynamic programming procedure by exploiting the high degree of pipelining and parallelism of VLSI architecture.


Journal ArticleDOI
TL;DR: The author considers the problem of constructing partial systems, where the program of a partial system is obtained by selecting only those code segments of the complete program that implement the capabilities needed.
Abstract: The author considers the problem of constructing partial systems, where the program of a partial system is obtained by selecting only those code segments of the complete program that implement the capabilities needed. A heuristic for determining fragments of a program system, which can serve as the building blocks for the programs of partial systems, is presented. The notion of 'B-program' is introduced: a B-program contains, in addition to the fragments themselves, substitute code for each fragment and control information specifying the set of partial systems the fragment is relevant for. A representation of B-programs as a string is given such that generating a partial system consists in scanning this string and selecting substrings. A formal model for this type of program generation is developed. B-program reduction is dealt with; transformations for the elimination of superfluous vertices are presented; and the issue of uniqueness and the problem of constructing a minimal reduced B-program are discussed.

01 Jul 1986
TL;DR: This manual describes the English language syntactic analyzer developed by the PROTEUS Project at New York University, and the version of Restriction Language which is used to write grammars for this analyzer.
Abstract: This manual describes the English language syntactic analyzer developed by the PROTEUS Project at New York University, and the version of Restriction Language which is used to write grammars for this analyzer. This system is a direct descendant of the Linguistic String Parser, developed by the Linguistic String Project at New York University (since 1973 in collaboration with the Computer Science Department). In particular, we have tried to maintain as much commonality as possible in the Restriction Language used for stating grammars. In developing our new implementation, we have had three objectives: 1) use LISP. The current Linguistic String Parser is implemented in FORTRAN. It is therefore quite efficient but is hard to interface to AI applications, which are usually best developed in LISP. The PROTEUS system has been entirely implemented in LISP. 2) remain small and modular. The Linguistic String Parser gradually became so large and complex that further modification was difficult. Through redesign and the elimination of some features, we have sought to return to a simpler, more easily modifiable system. 3) accommodate different analysis algorithms. One aspect of our current research is the study of alternative analysis strategies. We have therefore tried to develop a system which could accommodate different analysis algorithms. In particular, we have designed the grammar formalism to work with both top-down and bottom-up analyzers.

Patent
12 Jun 1986
TL;DR: In this article, a basic instruction for moving a string of bytes in a word has been devised, and the control necessary to optimize the operations is then in the compiler instead of the hardware.
Abstract: A basic instruction for moving a string of bytes in a word has been devised. Because the operations in the instruction are basic, very few variations are necessary to accommodate diversity of lengths and variables. These operations are imbedded in a single code sequence; the compiler can therefore generate exactly the minimum sequence necessary to perform the operations and can precompute many of the operands at compile time, typically completing the instruction within a single cycle time. The control necessary to optimize the operations is then in the compiler instead of the hardware.

Patent
11 Jul 1986
TL;DR: A method and apparatus are described for adjusting clock pulses of time-equidistant digital line scanning values of a moving-image signal present in a composite color video signal to a clock used in a digital transmission link by means of a moving-image coder.
Abstract: A method and apparatus are described for adjusting clock pulses of time-equidistant digital line scanning values of a moving-image signal present in a composite color video signal to a clock used in a digital transmission link by means of a moving-image coder. In the transmitting part (10) of the moving-image coder, self-identifying stuffing characters are inserted into the string of image scanning values if clock deviations are found. The identification may be carried out by means of an additional bit or, alternatively, by a special code word when this is first removed from the supply of image scanning values. In the receiving part (30) of the moving-image coder, the stuffing character is recognized and removed and the original character string is restored.

Patent
Jan T. Galkowski1
10 Feb 1986
TL;DR: Hierarchical information is annotated by selectively adding two spatial representing characters: each instance of the first corresponds to a unit step to the right, and each instance of the second represents a unit indentation to the left. The annotated information is then scanned to produce two byproduct strings, which are stored.
Abstract: To encode hierarchical information the invention recognizes that there is implicit information which must be made explicit. Thus the hierarchical information is annotated by selectively adding to it two spatial representing characters: each instance of a first spatial representing character corresponds to a unit step to the right, each instance of a second spatial representing character represents a unit indentation to the left, and the first instance of the second spatial representing character in a string of the second spatial representing characters also represents a carriage return and line feed. The annotated hierarchical information is then scanned to produce two byproduct strings. The first byproduct string is merely the sequence of first and second spatial representing characters, in the order in which they appear, to which is added two place holding characters, one representing an alpha-numeric string, and the second representing a string of one or more blank or null characters. The second byproduct string is merely the concatenation of the alpha-numeric strings appearing in the hierarchical information. The first and second byproduct strings are then stored. The method is reversible so that the encoded and stored hierarchical data structure (including its contents) can be retrieved and reconstituted. The encoded stored hierarchy (and its contents) can also be rapidly searched in its encoded form.