
Showing papers on "String (computer science)" published in 1984


Journal ArticleDOI
Hermann Ney1
TL;DR: The algorithm to be developed is essentially identical to one presented by Vintsyuk and later by Bridle and Brown, but the notation and the presentation have been clarified and the computational expenditure per word is independent of the number of words in the input string.
Abstract: This paper is of tutorial nature and describes a one-stage dynamic programming algorithm for the problem of connected word recognition. The algorithm to be developed is essentially identical to one presented by Vintsyuk [1] and later by Bridle and Brown [2]; but the notation and the presentation have been clarified. The derivation used for optimally time synchronizing a test pattern, consisting of a sequence of connected words, is straightforward and simple in comparison with other approaches decomposing the pattern matching problem into several levels. The approach presented relies basically on parameterizing the time warping path by a single index and on exploiting certain path constraints both in the word interior and at the word boundaries. The resulting algorithm turns out to be significantly more efficient than those proposed by Sakoe [3] as well as Myers and Rabiner [4], while providing the same accuracy in estimating the best possible matching string. Its most important feature is that the computational expenditure per word is independent of the number of words in the input string. Thus, it is well suited for recognizing comparatively long word sequences and for real-time operation. Furthermore, there is no need to specify the maximum number of words in the input string. The practical implementation of the algorithm is discussed; it requires no heuristic rules and no overhead. The algorithm can be modified to deal with syntactic constraints in terms of a finite state syntax.

364 citations


Patent
18 Jun 1984
TL;DR: In this paper, a data compressor compresses an input stream of data character signals by storing in a string table strings encountered in the input stream, where each string comprises a prefix string and an extension character where the extension character is the last character in the string and the prefix string comprises all but the extension character.
Abstract: A data compressor compresses an input stream of data character signals by storing in a string table strings of data character signals encountered in the input stream. The compressor searches the input stream to determine the longest match to a stored string. Each stored string comprises a prefix string and an extension character where the extension character is the last character in the string and the prefix string comprises all but the extension character. Each string has a code signal associated therewith and a string is stored in the string table by, at least implicitly, storing the code signal for the string, the code signal for the string prefix and the extension character. When the longest match between the input data character stream and the stored strings is determined, the code signal for the longest match is transmitted as the compressed code signal for the encountered string of characters and an extension string is stored in the string table. The prefix of the extended string is the longest match and the extension character of the extended string is the next input data character signal following the longest match. Searching through the string table and entering extended strings therein is effected by a limited search hashing procedure. Decompression is effected by a decompressor that receives the compressed code signals and generates a string table similar to that constructed by the compressor to effect lookup of received code signals so as to recover the data character signals comprising a stored string. The decompressor string table is updated by storing a string having a prefix in accordance with a prior received code signal and an extension character in accordance with the first character of the currently recovered string.

356 citations
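
The string-table scheme in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `lzw_compress` and `lzw_decompress` are hypothetical, the table is seeded with the 256 single-byte characters, codes are plain integers rather than packed code signals, and a Python `dict` stands in for the limited search hashing procedure the abstract mentions.

```python
def lzw_compress(data: str) -> list:
    """Emit one code per longest match; store (longest match + next char) as a new string."""
    # Assumption: every input character has a code point below 256.
    table = {chr(i): i for i in range(256)}
    next_code = 256
    prefix = ""          # longest match found so far
    codes = []
    for ch in data:
        candidate = prefix + ch
        if candidate in table:
            prefix = candidate                # keep extending the match
        else:
            codes.append(table[prefix])       # transmit code of the longest match
            table[candidate] = next_code      # extended string: prefix + extension character
            next_code += 1
            prefix = ch
    if prefix:
        codes.append(table[prefix])
    return codes


def lzw_decompress(codes: list) -> str:
    """Rebuild the same table from the code stream and recover the characters."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the string being defined right now.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid code in stream")
        out.append(entry)
        # New string: prior recovered string + first character of the current one.
        table[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)
```

For example, `lzw_compress("ABABABA")` yields four codes for seven characters, and `lzw_decompress` recovers the original text from them without ever receiving the table itself.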


Journal ArticleDOI
TL;DR: An O(n log n) algorithm is presented to find all repetitions in a string of length n, which uses a variation of the Knuth-Morris-Pratt algorithm to find all partial occurrences of a pattern within a text string.

282 citations


Journal ArticleDOI
TL;DR: An algorithm that produces the shortest edit sequence transforming one string into another is presented and is optimal in the sense that it generates a minimal covering set of common substrings of one string with respect to another.
Abstract: The string-to-string correction problem is to find a minimal sequence of edit operations for changing a given string into another given string. Extant algorithms compute a longest common subsequence (LCS) of the two strings and then regard the characters not included in the LCS as the differences. However, an LCS does not necessarily include all possible matches, and therefore does not produce the shortest edit sequence. An algorithm that produces the shortest edit sequence transforming one string into another is presented. The algorithm is optimal in the sense that it generates a minimal covering set of common substrings of one string with respect to another. Two improvements of the basic algorithm are developed. The first improvement performs well on strings with few replicated symbols. The second improvement runs in time and space linear to the size of the input. Efficient algorithms for regenerating a string from an edit sequence are also presented.

239 citations
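
One plausible reading of "a covering set of common substrings" is a greedy scan: at each position of the target, copy the longest substring that also occurs somewhere in the source, and fall back to adding a single character when nothing matches. The sketch below follows that reading. The names `block_moves` and `apply_edits` and the `("move", position, length)` / `("add", char)` edit format are assumptions made for illustration, and the search is the naive quadratic one, not the linear-time improvement the abstract describes.

```python
def block_moves(source: str, target: str) -> list:
    """Greedy edit sequence: longest block copied from source, else a single added char."""
    edits = []
    q = 0  # current position in target
    while q < len(target):
        best_pos, best_len = -1, 0
        # Naive search for the longest prefix of target[q:] occurring in source.
        for p in range(len(source)):
            length = 0
            while (p + length < len(source)
                   and q + length < len(target)
                   and source[p + length] == target[q + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = p, length
        if best_len > 0:
            edits.append(("move", best_pos, best_len))
            q += best_len
        else:
            edits.append(("add", target[q]))
            q += 1
    return edits


def apply_edits(source: str, edits: list) -> str:
    """Regenerate the target string from the source and an edit sequence."""
    parts = []
    for edit in edits:
        if edit[0] == "move":
            _, pos, length = edit
            parts.append(source[pos:pos + length])
        else:
            parts.append(edit[1])
    return "".join(parts)
```

With `source = "abcdef"` and `target = "defabc"`, the sketch produces two block moves, whereas an LCS-based difference would keep only one of the two three-character blocks and treat the other as changed.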


Patent
04 Dec 1984
TL;DR: In this article, an elevated platform is provided on which a person may move to a location near the raised upper end of the drill string for assisting in making the connection, with that platform preferably being mounted to swing between a retracted position in which it remains during the drilling operation and an active position projecting to a point near the top of the string during the connecting process.
Abstract: Methods and apparatus for facilitating the drilling of a well with a top drive unit having a motor which is connected to the upper end of the drill string and moves upwardly and downwardly therewith. The methods and apparatus of the invention enable the drill string to be pulled upwardly off of the bottom of the well each time an additional length of drill pipe is added to the string, with the connection between that added length and the upper end of the string being made at an elevated location spaced above the rig floor. In one form of the invention, an elevated platform is provided on which a person may move to a location near the raised upper end of the string for assisting in making the connection, with that platform preferably being mounted to swing between a retracted position in which it remains during the drilling operation and an active position in which it projects to a point near the top of the string during the connecting process. In another form of the invention, the back-up tool for preventing rotation of the upper end of the string may be controlled remotely and be removable relative to the string between an active position in which it can grip the elevated upper end of the string and a retracted position in which it remains during drilling.

119 citations


PatentDOI
TL;DR: A guitar controller for an electronic music synthesizer utilizes a multiplexed string energization and fret acquisition system wherein a high impedance buffer allows voltages to be detected off the strings at the various frets without drawing current through the frets or fret/string contacts.
Abstract: A guitar controller for an electronic music synthesizer utilizes a multiplexed string energization and fret acquisition system wherein a high impedance buffer allows voltages to be detected off the strings at the various frets without drawing current through the frets or fret/string contacts. Unique string bend and string vibration sensors and expression auxiliary sensors are additionally disclosed.

80 citations


Journal ArticleDOI
TL;DR: A parallel Earley's recognition algorithm in terms of an ``X*'' operator is presented, which can be executed on a triangular-shape VLSI array by restricting the input context-free grammar to be λ-free, and which gives the correct error count.
Abstract: Earley's algorithm has been commonly used for the parsing of general context-free languages and the error-correcting parsing in syntactic pattern recognition. The time complexity for parsing is O(n³). This paper presents a parallel Earley's recognition algorithm in terms of an ``X*'' operator. By restricting the input context-free grammar to be λ-free, the parallel algorithm can be executed on a triangular-shape VLSI array. This array system has an efficient way of moving data to the right place at the right time. Simulation results show that this system can recognize a string with length n in 2n + 1 system time. We also present a parallel parse-extraction algorithm, a complete parsing algorithm, and an error-correcting recognition algorithm. The parallel complete parsing algorithm has been simulated on a processor array which is similar to the triangular VLSI array. For an input string of length n the processor array will give the correct right-parse at system time 2n + 1 if the string is accepted. The error-correcting recognition algorithm has also been simulated on a triangular VLSI array. This array recognizes an erroneous string of length n in time 2n + 1 and gives the correct error count. These parallel algorithms are especially useful for syntactic pattern recognition.

72 citations


Patent
06 Aug 1984
TL;DR: In this article, an input data string including repetitive data more in number than the specified value is transformed into a data string having a format including the first region where non-compressed data are placed, the second region including a datum representative of a compressed data string section which has undergone the compression process and information indicative of the number of repetitive data.
Abstract: Method of data compression and restoration wherein an input data string including repetitive data more in number than the specified value is transformed into a data string having a format including the first region where non-compressed data are placed, the second region including a datum representative of a data string section which has undergone the compression process and information indicative of the number of repetitive data, i.e., the length of the data string section, and control information inserted at the front and back of the first region indicative of the number of data included in the first region, said transformed data string being recorded on the recording medium, and, for data reproduction, the first and second regions are identified on the basis of the control information read out on the recording medium so that the compressed data string section is transformed back to the original data string in the form of repetitive data.

71 citations
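
The two-region format in this abstract can be illustrated with a small run-length sketch. Everything concrete here is an assumption made for illustration: the names `rle_encode` and `rle_decode`, the default threshold of 3, and the use of Python tuples in place of the recorded byte layout. In particular, the control information the patent places at the front and back of the non-compressed region is represented only implicitly, by the length of each literal `bytes` object.

```python
def rle_encode(data: bytes, threshold: int = 3) -> list:
    """Split data into literal regions and (value, count) run regions.

    Only runs longer than `threshold` are compressed; shorter runs stay literal.
    """
    regions = []
    literal = bytearray()
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        run_length = j - i
        if run_length > threshold:
            if literal:
                # Close the pending non-compressed region before the run.
                regions.append(("literal", bytes(literal)))
                literal = bytearray()
            regions.append(("run", data[i], run_length))
        else:
            literal.extend(data[i:j])
        i = j
    if literal:
        regions.append(("literal", bytes(literal)))
    return regions


def rle_decode(regions: list) -> bytes:
    """Expand run regions back into repetitive data and concatenate all regions."""
    out = bytearray()
    for region in regions:
        if region[0] == "run":
            _, value, count = region
            out.extend(bytes([value]) * count)
        else:
            out.extend(region[1])
    return bytes(out)
```

Keeping short runs in the literal region reflects the abstract's condition that only repetitions "more in number than the specified value" are compressed, since a run record costs more than it saves on very short runs.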


Patent
05 Mar 1984
TL;DR: In this article, a purse-string instrument with arc shaped jaws and handles is described, by which it is possible to place mechanically, in surgery, purse string sutures in very narrow and deep areas of the organism.
Abstract: A new purse-string instrument having arc shaped jaws and handles is described, by which it is possible to place mechanically, in surgery, purse-string sutures in very narrow and deep areas of the organism.

58 citations


Patent
29 Oct 1984
TL;DR: A serdes device includes circuitry for loading or reading bit configurations into or out of strings of latches of variable length nk+r, where n is the number of bits in a byte, k is the number of whole bytes and r is the number of residual bits, with r being smaller than n.
Abstract: A serdes device includes circuitry for loading or reading bit configurations into or out of strings of latches of variable length nk+r, where n is the number of bits in a byte, k is the number of whole bytes and r is the number of residual bits, with r being smaller than n. Under the control of a service processor (8), there is formed a ring comprised of the latches of the serializer/deserializer register (14), the latches of the string considered (3 or 4) and a selected number (n-r) of latches of an extension register (16). The bytes to be loaded are sequentially sent to register (14), starting with the byte that contains the residual bits, and n bits are shifted out after loading each successive byte, so that after k+1 shifts the desired configuration will be contained in the string. For reading the contents of a string (for example, string 3), n bits are shifted, register (14) is read out, then k shifts of n bits each are performed, with register (14) being read out after each shift.

53 citations


Journal ArticleDOI
TL;DR: Loops are defined directly on an instruction execution string rather than on a directed graph model of a program, and a simple efficient algorithm to find the loops and subloops thus defined is presented.
Abstract: Loops are directly defined on an instruction execution string rather than on a directed graph model of a program derived from the source code. A simple efficient algorithm to find the loops and subloops thus defined is presented.

Patent
Kazunori Muraki1
30 Apr 1984
TL;DR: In this paper, a pragmatic table is used to link the pivot words into pairs by relation symbols in compliance with the dominant and dependant pairs and the source surface data therefor.
Abstract: In a machine translating system wherein word units of a source language are translated to word units of a target language through pivot words of a pivot language, a pragmatic table keeps the pivot words as dominant and dependant pairs together with a relation symbol specifying a semantic relationship between the pivot words of each pair in the pivot language and with surface data representative of the relationship in surface structures of the source and target languages. On analyzing an input word unit string into a source semantic structure, the pragmatic table is referenced to link the pivot words into pairs by relation symbols in compliance with the dominant and dependant pairs and the source surface data therefor. The semantic structure is therefore allowed by the pragmatic table. On mapping the semantic structure to a pivot representation and the representation to a target semantic structure in consideration of wording of the target language, the pragmatic table is similarly referred to. The representation and the target semantic structure are thereby checked against the dominant and dependant pairs and the relation symbols therefor. On generating an output word unit string from the target semantic structure, the pragmatic table is likewise referenced to check whether or not the output string is allowed by the dominant and dependant pairs and the target surface data therefor.

Patent
11 Dec 1984
TL;DR: In this paper, a radio frequency probing technique is proposed for determining whether an rf signal injected into one termination of a series-connected string of photovoltaic (PV) modules has been conducted to successive positions in the string.
Abstract: A radio frequency (rf) probing technique is disclosed for determining whether an rf signal injected into one termination of a series-connected string of photovoltaic (PV) modules has been conducted to successive positions in the string. A signal generator is provided for generating a carrier signal which is modulated by an audio frequency tone. The generator is capacitively coupled to one of the terminations of the module string to inject the modulated carrier signal into the string. A receiver is tuned to the carrier frequency and adapted to demodulate the audio tone. Rf signals are coupled to the receiver by a probe including a shielded plate electrode, which may be disposed adjacent the active surfaces of successive ones of the modules in order to detect the presence or absence of the rf signal at that point in the string. An alternate technique is disclosed, whereby a measurement of the capacitance-to-ground of the string is compared against a reference capacitance value of a fault-free system.


Patent
01 Mar 1984
TL;DR: In this paper, a method and apparatus for encoding and decoding a variable length augmented code for use in the transmission of sequential information as an indefinite length string of data is described, and a fixed length depleted code is also disclosed.
Abstract: The specification discloses a method and apparatus for encoding and decoding a variable length augmented code for use in the transmission of sequential information as an indefinite length string of data. Both binary and alternate character code sets are discussed for transmitting and translating information. The variable length code symbols are self synchronizing, and will automatically reestablish synchronization within two characters if a bit or number of bits is lost through noise or faulty transmission. The resynchronization is automatic and occurs by virtue of the construction of the variable length augmented codes. In addition, a method and means of creating a fixed length depleted code for use in digital processors and digital storage media is also disclosed. Inasmuch as most digital processors utilize fixed length words, it is desirable to be able to convert the variable length augmented code into a fixed length depleted code, and to be able to reconvert from the depleted code back to the augmented code without necessity of resorting to an extensive lookup table for each of the characters. In creating the augmented set of self synchronizing variable length code symbols, the original character set C0 is augmented q times until $|C_q| = 2^q(n-1)+1$, wherein n represents the number of distinct elements in the original character set C0 that was augmented, and $|C_q|$ is equal to the number of symbols derived in the final augmented set Cq, and is equal to or greater than the desired number of characters to be used in the data handling and communication.

PatentDOI
TL;DR: The specification discloses a method of transforming input symbolic data to output symbolic data for use in text-to-speech and other environments by sequentially applying a set of rules defining a desired mapping of byte values.
Abstract: The specification discloses a method of transforming input symbolic data to output symbolic data for use in text-to-speech and other environments. A string of digital byte values representing the input symbolic data is stored in a first buffer memory location in rules processor (10). A set of rules defining a desired mapping of byte values is stored in a rules storage (12), along with a set of user special symbols. The rules are sequentially mapped to transform the stored byte values in accordance with the rules and the special symbols from a first buffer memory location to a second buffer memory location.

Book ChapterDOI
J.K. O'Regan1
TL;DR: Experiments on recognizing words and letter strings, in which the information within a string or a word is either mainly at the beginning or not, support an account in which the eye's behavior is entirely determined by the amount of processing being done in the vicinity of the location fixated.
Abstract: There is increasing evidence in the literature that eye movement patterns reflect ongoing processing in reading. However, linguistic processing is insufficiently understood for precise predictions to be made about how the eye should behave. The present paper looks at the simpler activity of recognizing words and letter strings in an attempt to understand the eye's scanning behavior there. Experiments are described in which the position of information within a string or a word is either mainly at the beginning or not. A satisfactory account of the data can be put forward in which it is assumed that the eye's behavior is entirely determined by the amount of processing being done in the vicinity of the location fixated.

Patent
27 Dec 1984
TL;DR: In a text-to-speech system, digital ASCII text is examined for syllable boundaries to help accent generation for speech synthesis and hyphenation in word processing; allophone code characters in the form of a byte string are compared with prestored rules.
Abstract: In a text to speech system, digital text ASCII is examined for syllable boundaries to help accent generation for speech synthesis, and hyphenation in word processing. Digital allophone code characters in the form of a byte string are compared with prestored rules. Byte string segments which may comprise consonant clusters, and left and right adjacent environments thereof are examined.

Patent
05 Jan 1984
TL;DR: A central processor for use in a data processing system that is adapted for processing sequences of characters is described in this article, where information identifying a string of characters to be examined, including the memory location for the first character and the total number of characters in the sequence, is placed in working registers of the central processor.
Abstract: A central processor for use in a data processing system that is adapted for processing sequences of characters. Information identifying a string of characters to be examined, including the memory location for the first character in the sequence and the total number of characters in the sequence, is placed in working registers of the central processor. Other working registers in the central processor receive information corresponding to a predetermined characteristic, which may be a specific character or information identifying another character string. One of several character string instructions then can be processed. In response to a typical character string instruction, the central processor retrieves each character from the memory and compares it with the predetermined characteristic. Processing continues until either the predetermined characteristic is detected or all the characters in the character string are examined. During processing, the central processor controls an arithmetic-logic condition code during each comparison. When the processing terminates, the condition code indicates whether the character string contained the predetermined characteristic.

Patent
20 Apr 1984
TL;DR: In this paper, the authors proposed to improve recording density on a recording medium by referring to the data before and after 2-bit data and to the code word before a 5-bit code word when converting binary data into the 5-bit code word.
Abstract: PURPOSE: To set the minimum value to 4 and the maximum value to a prescribed value or less, and to improve recording density on a recording medium, by referring to the data before and after the 2-bit data and to the code word before the 5-bit code word when converting binary data into a 5-bit code word for each 2 bits of data. CONSTITUTION: Original data from an input terminal 1 and a clock from a clock input terminal 2 are inputted to a 12-bit serial/parallel shift register 3, which shifts the input data sequentially and applies its outputs Q1-Q12 to an encoder 4. Encoding algorithms decided by a condition deciding circuit 9, to which the outputs of a 10-bit serial/parallel shift register 8 are inputted, are applied to the encoder 4. The encoder 4 applies outputs obtained by encoding the input data by the prescribed algorithm to a 5-bit parallel/serial register 7. While referring to the data before and after the 2-bit data and the code word before the 5-bit code word, the minimum value of the number of continuous "1"s and the succeeding "0"s in the converted 5-bit code word string is set to 4 and the maximum value is set to the prescribed value or less.

Patent
17 Aug 1984
TL;DR: In this article, a measurement system for collecting data relating to the total, partial or distributed flow of a river or like course is disclosed, where a generally planar array of flow velocity sensors is erected transversely with respect to the general direction of flow.
Abstract: A measurement system for collecting data relating to the total, partial or distributed flow of a river or like course is disclosed. A generally planar array of flow velocity sensors is erected transversely with respect to the general direction of flow. The array comprises a number of vertically extending strings of sensors which are anchored at one end of the bed of the course and the free end of the string is supported by a buoyant float. The sensors in a string are attached at fixed intervals along the string so that each always resides at the corresponding fixed proportion of total depth up to the maximum which can be accommodated by the length of the string. Alternative sensor designs are also discussed.

Proceedings ArticleDOI
TL;DR: This paper presents an expressive model, suitable for paper and electronic documents, that is based on a graph-like structure that has proven useful for specifying a wide variety of document objects, and is the basis for an implemented document preparation system.
Abstract: Underlying every document processing system is a model of the document. For many applications a simple model, such as a long string of characters, is adequate. However, more expressive models are desirable for more demanding applications that involve complex textual material and also nontextual objects, such as mathematical notation, tables, and figures. In this paper we present an expressive model, suitable for paper and electronic documents, that is based on a graph-like structure. The principal concepts are the notions of abstract and concrete objects, hierarchical composition of ordered and unordered objects, sharing of components, and reference links. This model has proven useful for specifying a wide variety of document objects, and is the basis for an implemented document preparation system.

Patent
04 May 1984
TL;DR: In this paper, a character generator responds to a non-display character with the cyclic redundancy check (CRC) code for the bits emitted by the generator in response to the string of display characters used in the diagnostic routine.
Abstract: A diagnostic system for testing the video output and secondary storage facilities of a computer system. The refresh memory of the video subsystem is loaded with the codes of a character string including the complete set of display characters and the code for a character that is not displayable. A character generator responds to this non-display character with the cyclic redundancy check (CRC) code for the bits emitted by the generator in response to the string of display characters used in the diagnostic routine. A slowed-down replica of video output from the character generator is fed as a stream of bits to a slow speed serial communications port and then to the processor. The processor generates the CRC code in response to the stream of bits from the character generator and it compares this generated code with the CRC code received from the character generator. Similarly, for testing the secondary storage subsystem, certain predetermined data is stored in a portion of a primary memory unit. This data, simulating an MFM-encoded recording of known information, is transmitted by the central processor to the control unit for the secondary storage system, and is looped back through the controller for data separation and decoding and is then returned to the central processor. The returned signal is compared with the unencoded form of the data provided to the controller.

Book ChapterDOI
11 Apr 1984
TL;DR: The main result is a polynomial time algorithm constructing descriptive patterns of maximal length for the general case of patterns containing variable symbols from any finite set a priori fixed.
Abstract: Assume a finite alphabet of constant symbols and a disjoint infinite alphabet of variable symbols. A pattern p is a non-empty string of constant and variable symbols. The language L(p) is the set of all words over the alphabet of constant symbols generated from p by substituting some non-empty words for the variables in p. A sample S is a finite set of words over the same alphabet. A pattern p is descriptive of a sample S if and only if it is possible to generate all elements of S from p and, moreover, there is no other pattern q also able to generate S such that L(q) is a proper subset of L(p). The problem of finding a pattern being descriptive of a given sample is studied. It is known that the problem of finding a pattern of maximal length is NP-hard. Until now, a polynomial-time algorithm has been known only for the special case of patterns containing only one variable symbol. The main result is a polynomial time algorithm constructing descriptive patterns of maximal length for the general case of patterns containing variable symbols from any finite set a priori fixed.
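
To make the definition of L(p) concrete, the sketch below tests whether a word can be generated from a pattern by substituting non-empty words for the variables, with every occurrence of a variable receiving the same word. It only illustrates the definitions and is not the paper's construction of descriptive patterns. The encoding is an assumption: a pattern is a Python list in which non-empty strings are constant symbols and integers are variable symbols, and the names `generates` and `generates_sample` are hypothetical. The backtracking search can take exponential time in the worst case.

```python
def generates(pattern: list, word: str) -> bool:
    """Return True if `word` is in L(pattern) under non-empty substitutions."""

    def match(i: int, pos: int, binding: dict) -> bool:
        if i == len(pattern):
            return pos == len(word)
        symbol = pattern[i]
        if isinstance(symbol, str):
            # Constant symbol: must appear literally at this position.
            return word.startswith(symbol, pos) and match(i + 1, pos + len(symbol), binding)
        if symbol in binding:
            # Variable already bound: every occurrence gets the same word.
            value = binding[symbol]
            return word.startswith(value, pos) and match(i + 1, pos + len(value), binding)
        # Unbound variable: try every non-empty substring starting at pos.
        for end in range(pos + 1, len(word) + 1):
            binding[symbol] = word[pos:end]
            if match(i + 1, end, binding):
                return True
            del binding[symbol]
        return False

    return match(0, 0, {})


def generates_sample(pattern: list, sample: set) -> bool:
    """Return True if every word of the sample can be generated from the pattern."""
    return all(generates(pattern, word) for word in sample)
```

For instance, the pattern `[0, "a", 0]` (variable, constant `a`, same variable) generates `"bab"` and `"abaab"` but not `"ba"`, because both occurrences of the variable must receive the same non-empty word.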

01 Sep 1984
TL;DR: This paper answers the $k=3$ case of the open question of whether a $k$-head one-way deterministic finite automaton can perform string-matching, and proves the first result showing that checking is easier than generating.
Abstract: New techniques for obtaining lower bounds on string-matching problems are developed and we prove the following new results. String-matching cannot be performed by a three-head one-way deterministic finite automaton. This answers the $k=3$ case of the open question, due to Galil and Seiferas [GS], whether a $k$-head one-way deterministic finite automaton can perform string-matching. String-matching by a $k$-head two-way DFA with $k-1$ heads blind (can only see two end symbols) is studied, and tight upper and lower bounds are provided. Probabilistically moving a string on one tape (requiring $n^{2}$ time) is harder than probabilistically matching two strings on one tape. Notice that this is not true for deterministic or even nondeterministic TMs. This is the first result showing that checking is easier than generating.

Journal ArticleDOI
TL;DR: It is shown that 1 − (1/c) + (1/c²) is an accurate approximation for the ratio KMP/Naive, provided both Pattern and Text are random strings over an alphabet of size c.

Journal ArticleDOI
TL;DR: One- and two-dimensional pattern matching oriented systolic array processors are presented that support the detection of all repetitions in a string x and the statistics of all substrings of x with and without overlap.
Abstract: One- and two-dimensional pattern matching oriented systolic array processors are presented that support, respectively, the detection of all repetitions in a string x and the statistics of all substrings of x with and without overlap. The time is linear in the length of x in both applications, whereas the number of processors is linear and quadratic, respectively.

Book ChapterDOI
09 Apr 1984
TL;DR: This work proves a (reasonably) tight lower bound for the minimum value of CE_t(·), when the minimum is taken over all n-bit strings which consist of m ones and n − m zeros, and is the best result known concerning the security of RSA's least significant bit.
Abstract: We consider the following problem: Let s be an n-bit string with m ones and n − m zeros. Denote by CE_t(s) the number of pairs of equal bits which are within distance t apart in the string s. What is the minimum value of CE_t(·), when the minimum is taken over all n-bit strings which consist of m ones and n − m zeros?
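
The quantity in this abstract can be computed directly for small cases. The sketch below rests on one interpretive assumption: "within distance t apart" is read as positions i < j with j − i ≤ t. The function names `count_equal_pairs` and `minimum_equal_pairs` are hypothetical, and the minimum is found by exhaustive search over all placements of the m ones, which is feasible only for small n and says nothing about the paper's lower-bound argument.

```python
from itertools import combinations


def count_equal_pairs(s: str, t: int) -> int:
    """Count pairs i < j with j - i <= t and s[i] == s[j] (assumed reading of CE_t)."""
    n = len(s)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, min(i + t, n - 1) + 1)
        if s[i] == s[j]
    )


def minimum_equal_pairs(n: int, m: int, t: int) -> int:
    """Brute-force minimum of count_equal_pairs over n-bit strings with exactly m ones."""
    best = None
    for ones in combinations(range(n), m):
        bits = ["0"] * n
        for position in ones:
            bits[position] = "1"
        value = count_equal_pairs("".join(bits), t)
        if best is None or value < best:
            best = value
    return best
```

With n = 4, m = 2 and t = 2 the minimum is 1, attained by `"0110"`: any three consecutive bits contain two equal ones, so the count cannot reach zero.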

Book ChapterDOI
03 Sep 1984
TL;DR: An O(nL) algorithm is given for the following problem: is every string in Γ+ uniquely decipherable, or does the equation cx = dy, where c, d ∈ Γ, c ≠ d, and x, y ∈ Γ*, have a solution?
Abstract: We consider the following problem: Given a set Γ = {c1,...,cn} of nonempty strings over a fixed, finite alphabet Σ, is every string in Γ+ uniquely decipherable, or does the equation cx = dy, where c, d ∈ Γ, c ≠ d, and x,y ∈ Γ*, have a solution? We give an O(n L) algorithm for this problem, where L = |c1| + ... + |cn|, and use this algorithm to investigate the impact of structural properties of Γ on the complexity of testing unique decipherability. We then give an O(L log(n)) unique decipherability test for sets Γ which may be linearly ordered by the prefix relation.
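
For small code sets, unique decipherability can be checked with the classical Sardinas-Patterson test, sketched below. This is a different and simpler technique than the O(nL) algorithm of the abstract and makes no attempt to match its running time. The function name `is_uniquely_decipherable` is hypothetical, and the code words are assumed to be non-empty Python strings.

```python
def is_uniquely_decipherable(code: set) -> bool:
    """Sardinas-Patterson test: track dangling suffixes until one is itself a code word."""

    def dangling(prefixes: set, words: set) -> set:
        # Non-empty w such that p + w == word for some p in prefixes, word in words.
        return {
            word[len(p):]
            for p in prefixes
            for word in words
            if len(word) > len(p) and word.startswith(p)
        }

    # Start with suffixes left over when one code word is a proper prefix of another.
    pending = list(dangling(code, code))
    seen = set(pending)
    while pending:
        suffix = pending.pop()
        if suffix in code:
            # Two different factorizations meet: cx = dy has a solution.
            return False
        successors = dangling(code, {suffix}) | dangling({suffix}, code)
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return True
```

As a check, `{"0", "01", "10"}` is rejected because `"010"` factors both as `0·10` and as `01·0`, while `{"0", "01", "11"}` is accepted. The loop terminates because every dangling suffix is a suffix of some code word, so only finitely many can ever be added to `seen`.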

Patent
03 May 1984
TL;DR: In this paper, a run-length-limited encoding scheme is proposed to encode strings of data including a sequence of synchronizing bits having a value of binary zero for providing corresponding encoded bits which, when recorded on a magnetic storage medium, provide a maximum number of flux transitions.
Abstract: A method of run-length-limited encoding strings of data including a sequence of synchronizing bits having a value of binary zero for providing corresponding encoded bits which, when recorded on a magnetic storage medium, provide a maximum number of flux transitions. The steps of this method include serially receiving such a string of input bits, dividing the string of input bits into unique bit groups, replacing each group with a corresponding collection of encoded bits conforming with the limitations of the run-length-limited encoding scheme, including replacing each divided group of three input binary zeros with a collection of six encoded bits having two binary ones separated by two binary zeros, and serially transmitting these collections of encoded bits for recording the same on the medium in the same sequence as the corresponding input bit groups are received.