
Showing papers on "String (computer science) published in 1998"


Journal ArticleDOI
TL;DR: Unsupervised Self-Organizing Maps (SOMs), as well as supervised learning by Learning Vector Quantization (LVQ), can be defined for string variables too, when the SOM and LVQ algorithms are expressed as batch versions.

283 citations


Patent
13 Apr 1998
TL;DR: In this paper, a computerized method is proposed that selectively accepts access requests from a client computer connected to a server computer by a network: the server computer receives an access request from the client computer and generates a predetermined number of random characters.
Abstract: A computerized method selectively accepts access requests from a client computer connected to a server computer by a network. The server computer receives an access request from the client computer. In response, the server computer generates a predetermined number of random characters. The random characters are used to form a string in the server computer. The string is randomly modified either visually or audibly to form a riddle. The original string becomes the correct answer to the riddle. The server computer renders the riddle on an output device of the client computer. In response, the client computer sends an answer to the server. Hopefully, the answer is a user's guess for the correct answer. The server determines if the guess is the correct answer, and if so, the access request is accepted. If the correct answer is not received within a predetermined amount of time, the connection between the client and server computer is terminated by the server on the assumption that an automated agent is operating in the client on behalf of the user.
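The challenge-response flow described above can be sketched as follows. The "visual modification" here (interleaving noise characters) and the helper names are illustrative assumptions, not the patent's actual distortion technique:

```python
import random
import string
import time

def make_riddle(length=6, rng=None):
    """Generate a random string (the correct answer) and a modified riddle.

    Interleaving '*' noise characters is a toy stand-in for the patent's
    visual or audible distortion step.
    """
    rng = rng or random.Random()
    answer = "".join(rng.choice(string.ascii_uppercase) for _ in range(length))
    riddle = "*".join(answer)  # a human is told to ignore the '*' characters
    return riddle, answer

def check_answer(answer, guess, issued_at, timeout_s=60.0, now=None):
    """Accept the access request only if the guess is correct and timely."""
    now = time.time() if now is None else now
    if now - issued_at > timeout_s:
        return False  # assume an automated agent; server drops the connection
    return guess == answer
```

For example, a correct guess arriving after the timeout window is still rejected, on the assumption that an automated agent is operating on the client.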

281 citations


Patent
11 Feb 1998
TL;DR: In this paper, a tokenizer is proposed to generate information retrieval tokens that characterize the semantic relationship expressed in the input string, which can be used for both constructing an index representing target documents and processing a query against that index.
Abstract: The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms, each of which has an 'is a' relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms. The tokenizer is preferably used to generate tokens for both constructing an index representing target documents and processing a query against that index.
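The hypernym-substitution step can be sketched as follows; the hypernym table entries are invented examples, not data from the patent:

```python
from itertools import product

# Toy "is a" hypernym table; the entries are illustrative assumptions.
HYPERNYMS = {
    "dog": ["animal"],
    "chased": ["pursued"],
}

def logical_forms(primary):
    """Yield the primary logical form plus every alternative form obtained
    by replacing selected words with their identified hypernyms."""
    choices = [[word] + HYPERNYMS.get(word, []) for word in primary]
    for combo in product(*choices):
        yield tuple(combo)
```

For the primary form ("dog", "chased", "cat") this yields four forms: the primary one plus three alternatives with one or both words generalized, each of which would then be emitted as an index token.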

268 citations


Patent
15 Dec 1998
TL;DR: In this paper, a three-field text string class is employed for data entered in a language which does not employ the Latin alphabet or Latin character set, but does employ a character set which may be readily sound-mapped to the Latin character set.
Abstract: A three-field text string class is employed for data entered in a language which does not employ the Latin alphabet or Latin character set, but does employ a character set which may be readily sound-mapped to the Latin character set. The entered text is stored in a first field of the text string class, while an automatically transliterated representation of the entered data is stored in a second field. The transliteration is generated utilizing a character-mapping resource file table specific to the language in which the text was entered and the language employing the Latin character set. The contents of the second field thus provide a recognizable representation of the text string to users unfamiliar with the character set of the language in which the text was entered. The second field's contents also provide a pronunciation key for the entered text string for nonspeakers of the source language. An abstract object name entered in Cyrillic characters may thus be recognized and properly pronounced by a user who only speaks English.
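A minimal sketch of the first two fields, assuming a tiny hand-made Cyrillic-to-Latin sound map in place of the patent's per-language resource file table:

```python
# Toy Cyrillic-to-Latin sound map covering one word; a real resource file
# table would map the full character set of the source language.
CYR_TO_LAT = {"с": "s", "т": "t", "р": "r", "о": "o", "к": "k", "а": "a"}

class TransliteratedString:
    """Two of the three fields of the text string class: the text as
    entered, and its automatically transliterated Latin representation."""
    def __init__(self, text, mapping=None):
        mapping = CYR_TO_LAT if mapping is None else mapping
        self.original = text  # first field: the text as entered
        self.transliterated = "".join(  # second field: sound-mapped form
            mapping.get(ch, ch) for ch in text
        )
```

Here `TransliteratedString("строка").transliterated` yields "stroka" ("строка" is Russian for "string"), which an English-only user can recognize and pronounce.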

225 citations


Journal ArticleDOI
TL;DR: This paper considers the following incremental version of comparing two sequences A and B to determine their longest common subsequence (LCS) or the edit distance between them, and obtains O(nk) algorithms for the longest prefix approximate match problem, the approximate overlap problem, and cyclic string comparison.
Abstract: The problem of comparing two sequences A and B to determine their longest common subsequence (LCS) or the edit distance between them has been much studied. In this paper we consider the following incremental version of these problems: given an appropriate encoding of a comparison between A and B, can one incrementally compute the answer for A and bB, and the answer for A and Bb, with equal efficiency, where b is an additional symbol? Our main result is a theorem exposing a surprising relationship between the dynamic programming solutions for two such "adjacent" problems. Given a threshold k on the number of differences to be permitted in an alignment, the theorem leads directly to an O(k) algorithm for incrementally computing a new solution from an old one, in contrast to the O(k^2) time required to compute a solution from scratch. We further show, with a series of applications, that this algorithm is indeed more powerful than its nonincremental counterpart, by solving the applications with greater asymptotic efficiency than heretofore possible. For example, we obtain O(nk) algorithms for the longest prefix approximate match problem, the approximate overlap problem, and cyclic string comparison.
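For reference, the from-scratch dynamic program that the incremental algorithm improves on looks like this (a standard textbook edit-distance computation, not the paper's data structure):

```python
def edit_distance(a, b):
    """Classic O(|a|*|b|) dynamic program for edit distance. The paper's
    contribution is updating an existing comparison in O(k) time when a
    single symbol b is prepended (bB) or appended (Bb), instead of
    recomputing rows like this from scratch."""
    prev = list(range(len(b) + 1))  # row for the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]
```

Appending one symbol to B adds a single column to this table; the theorem shows how the affected entries can be updated in O(k) given a threshold k on permitted differences.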

216 citations


Patent
28 Aug 1998
TL;DR: An LED light string employs a plurality of LEDs wired in series-parallel blocks, as discussed by the authors; each block is coupled in parallel, and the parallel connection is coupled across a supply voltage through an electrical interface.
Abstract: An LED light string employs a plurality of LEDs wired in a series-parallel block. Further, each series-parallel block may be coupled in parallel, the parallel connection coupled across a supply voltage through an electrical interface. LEDs of the light string may comprise either a single color LED or an LED including multiple sub-dies, each sub-die of a different color. LED series-parallel blocks of the light string may be operated in continuous, periodic or pseudo-random state. The LED light string may provide polarized connectors to couple LED light strings end-to-end and in parallel with the supply voltage. The electrical interface may have one or more parallel outputs and a switch so as to operate multiple LED light strings in continuous, periodic or pseudo-random states. The LED light string may be adapted so as to employ LEDs of different drive voltages in each series section of the series-parallel block. Fiber optic bundles may be coupled to individual LEDs to diffuse LED light output in a predetermined manner.

191 citations


Patent
26 Jun 1998
TL;DR: In this article, a system and method of allowing a user to control a computer application with spoken commands includes the steps of processing the spoken commands with a Speech Recognition application into candidate word phrases, and parsing at least one candidate word phrase with a Context Free Grammar (CFG) parser into a parse tree.
Abstract: A system and method of allowing a user to control a computer application with spoken commands include the steps of processing the spoken commands with a Speech Recognition application into candidate word phrases, and parsing at least one candidate word phrase with a Context Free Grammar (CFG) parser into a parse tree. A plurality of predefined rewrite rules, grouped into a plurality of phases, are applied to the parse tree, for rewriting the parse tree. Each of the plurality of rewrite rules includes a pattern matching portion, for matching at least a part of the parse tree, and a rewrite component, for rewriting the matched part. A command string is produced by traversing nodes of the modified parse tree. The command string is sent to an interpreter application or directly to the computer application.

191 citations


Journal ArticleDOI
TL;DR: This paper gives the first nontrivial compressed matching algorithm for the classic adaptive compression scheme, the LZ77 algorithm, which is known to compress more than other dictionary compression schemes, such as LZ78 and LZW, though for strings with constant per bit entropy, all these schemes compress optimally in the limit.
Abstract: String matching and compression are two widely studied areas of computer science. The theory of string matching has a long association with compression algorithms. Data structures from string matching can be used to derive fast implementations of many important compression schemes, most notably the Lempel-Ziv (LZ77) algorithm. Intuitively, once a string has been compressed, and therefore its repetitive nature has been elucidated, one might be tempted to exploit this knowledge to speed up string matching. The Compressed Matching Problem is that of performing string matching in a compressed text, without uncompressing it. More formally, let T be a text, let Z be the compressed string representing T, and let P be a pattern. The Compressed Matching Problem is that of deciding if P occurs in T, given only P and Z. Compressed matching algorithms have been given for several compression schemes such as LZW. In this paper we give the first nontrivial compressed matching algorithm for the classic adaptive compression scheme, the LZ77 algorithm. In practice, the LZ77 algorithm is known to compress more than other dictionary compression schemes, such as LZ78 and LZW, though for strings with constant per-bit entropy, all these schemes compress optimally in the limit. However, for strings with o(1) per-bit entropy, while it was recently shown that LZ77 gives compression to within a constant factor of optimal, schemes such as LZ78 and LZW may deviate from optimality by an exponential factor. Asymptotically, compressed matching is only relevant if |Z| = o(|T|), i.e., if the compression ratio |T|/|Z| is more than a constant. These results show that LZ77 is the appropriate compression method in such settings. We present an LZ77 compressed matching algorithm which runs in time O(n log^2(u/n) + p), where n = |Z|, u = |T|, and p = |P|. Compare with the naive "decompression" algorithm, which takes time Θ(u+p) to decide if P occurs in T. Writing u+p as (n·u)/n + p, we see that we have improved the complexity, replacing the compression factor u/n by a factor log^2(u/n). Our algorithm is competitive in the sense that O(n log^2(u/n) + p) = O(u+p), and opportunistic in the sense that O(n log^2(u/n) + p) = o(u+p) if n = o(u) and p = o(u).
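The Θ(u+p) baseline the paper compares against can be made concrete. Below is a toy decompress-then-search routine over a simplified LZ77 token format of (offset, length, next_char) triples; the token format is an illustrative simplification, not the paper's encoding:

```python
def decompress_lz77(tokens):
    """Decode a simplified LZ77 token stream of (offset, length, next_char)
    triples into the text T. Copies are taken from already-decoded output
    and may overlap it (which is how LZ77 expresses runs)."""
    out = []
    for offset, length, ch in tokens:
        start = len(out) - offset
        for k in range(length):
            out.append(out[start + k])  # copy, possibly overlapping
        if ch is not None:
            out.append(ch)              # literal following the copy
    return "".join(out)

def naive_compressed_match(pattern, tokens):
    """The Theta(u+p) baseline: decompress Z fully into T, then search."""
    return pattern in decompress_lz77(tokens)
```

The paper's algorithm answers the same question in O(n log^2(u/n) + p) time without materializing T at all.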

179 citations


01 May 1998
TL;DR: This thesis introduces a general method for computing the set of reachable states of an infinite-state system, based on the concept of a meta-transition, which is a mathematical object that can be associated with the model, and whose purpose is to make it possible to compute in a finite amount of time an infinite set of reachable states.
Abstract: In this thesis, we introduce a general method for computing the set of reachable states of an infinite-state system. The basic idea, inspired by well-known state-space exploration methods for finite-state systems, is to propagate reachability from the initial state of the system in order to determine exactly which are the reachable states. Of course, the problem being in general undecidable, our goal is not to obtain an algorithm which is guaranteed to produce results, but one that often produces results on practically relevant cases. Our approach is based on the concept of a meta-transition, which is a mathematical object that can be associated with the model, and whose purpose is to make it possible to compute in a finite amount of time an infinite set of reachable states. Different methods for creating meta-transitions are studied. We also study the properties that can be verified by state-space exploration, in particular linear-time temporal properties. The state-space exploration technique that we introduce relies on a symbolic representation system for the sets of data values manipulated during exploration. This representation system has to satisfy a number of conditions. We give a generic way of obtaining a suitable representation system, which consists of encoding each data value as a string of symbols over some finite alphabet, and representing a set of values by a finite-state automaton accepting the language of the encodings of the values in the set. Finally, we particularize the general representation technique to two important domains: unbounded FIFO buffers, and unbounded integer variables. For each of those domains, we give detailed algorithms for performing the required operations on represented sets of values.

165 citations


Proceedings Article
01 Dec 1998
TL;DR: It is shown how combined phoneme/stress prediction is better than separate prediction processes, and still better when including in the model the last phonemes transcribed and part of speech information.
Abstract: This paper presents trainable methods for generating letter-to-sound rules from a given lexicon, for use in pronouncing out-of-vocabulary words and as a method for lexicon compression. As the relationship between a string of letters and a string of phonemes representing its pronunciation is not trivial for many languages, we discuss two alignment procedures, one fully automatic and one hand-seeded, which produce reasonable alignments of letters to phones (or epsilon). Top Down Induction Tree models are trained on the aligned entries. We show how combined phoneme/stress prediction is better than separate prediction processes, and better still when the model includes the last phonemes transcribed and part-of-speech information. For the lexicons we have tested, our models have a word accuracy (including stress) of 78% for OALD, 62% for CMU and 94% for BRULEX, allowing substantial reduction in the size of these lexicons.

148 citations


Journal ArticleDOI
TL;DR: No position in any word can be the beginning of the rightmost occurrence of more than two squares, from which the maximum number of distinct primitive rooted squares in a word of length n is deduced.

Journal ArticleDOI
TL;DR: The presented approach produces reliable estimates of formant frequencies across a wide range of sounds and speakers, and the estimated formant frequencies were used in a number of variants for recognition.
Abstract: This paper presents a new method for estimating formant frequencies. The formant model is based on a digital resonator. Each resonator represents a segment of the short-time power spectrum. The complete spectrum is modeled by a set of digital resonators connected in parallel. An algorithm based on dynamic programming produces both the model parameters and the segment boundaries that optimally match the spectrum. We used this method in experimental tests that were carried out on the TI digit string data base. The main results of the experimental tests are: (1) the presented approach produces reliable estimates of formant frequencies across a wide range of sounds and speakers; and (2) the estimated formant frequencies were used in a number of variants for recognition. The best set-up resulted in a string error rate of 4.2% on the adult corpus of the TI digit string data base.

Book ChapterDOI
TL;DR: A program transformation technique is used, namely partial evaluation, to automatically transform a DSL program into a compiled program, given only an interpreter.
Abstract: Implementation. The abstract machine is then given an implementation (typically, a library), or possibly many, to account for different operational contexts. The valuation function can be implemented as an interpreter based on an abstract machine implementation, or as a compiler to abstract machine instructions. Partial evaluation. While interpreting is more flexible, compiling is more efficient. To get the best of both worlds, we use a program transformation technique, namely, partial evaluation, to automatically transform a DSL program into a compiled program, given only an interpreter. Each of the above methodology steps is further detailed in a separate section of this paper. 1.6 A Working Example. To illustrate our approach, an example of a DSL is used throughout the paper. We introduce a simple electronic mail processing application as a working example. Conceptually this application enables users to specify automatic treatments of incoming messages depending on their nature and contents: dispatching messages to people or folders, filtering spam, offering a shell escape (e.g., to feed an electronic agenda), replying to messages when absent, etc. This example is inspired by a Unix program called slocal which offers users a way of processing inbound mail. With slocal, user-defined treatments are expressed in the form of rules. Each rule consists of a string to be searched in a message field (e.g., Subject, From) and an action to be performed if the string

Journal ArticleDOI
TL;DR: In this article, the authors gather descriptive information about orchestra programs that can be used as baseline data when considering the needs of school string programs, and use it to evaluate the performance of orchestra programs.
Abstract: The objective of this study was to gather descriptive information about orchestra programs that can be used as baseline data when considering the needs of school string programs. Of the 1,345 surve...

Patent
01 Jun 1998
TL;DR: In this paper, an apparatus and method are disclosed for obtaining samples of pristine formation or formation fluid, using a work string designed for performing other downhole work such as drilling, workover operations, or re-entry operations.
Abstract: An apparatus and method are disclosed for obtaining samples of pristine formation or formation fluid, using a work string designed for performing other downhole work such as drilling, workover operations, or re-entry operations. An extendable element extends against the formation wall to obtain the pristine formation or fluid sample. While the test tool is in standby condition, the extendable element is withdrawn within the work string, protected by other structure from damage during operation of the work string. The apparatus is used to sense or sample downhole conditions while using a work string, and the measurements or samples taken can be used to adjust working fluid properties without withdrawing the work string from the bore hole. When the extendable element is a packer, the apparatus can be used to prevent a kick from reaching the surface, adjust the density of the drilling fluid, and thereafter continue use of the work string.

Journal ArticleDOI
TL;DR: Applications that can be implemented efficiently and effectively using sets of n‐grams include spelling error detection and correction, query expansion, information retrieval with serial, inverted and signature files, dictionary look‐up, text compression, and language identification.
Abstract: This paper provides an introduction to the use of n‐grams in textual information systems, where an n‐gram is a string of n, usually adjacent, characters extracted from a section of continuous text. Applications that can be implemented efficiently and effectively using sets of n‐grams include spelling error detection and correction, query expansion, information retrieval with serial, inverted and signature files, dictionary look‐up, text compression, and language identification.
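The core operations are simple enough to sketch directly. The Dice-coefficient similarity shown here is one common way n-gram sets are used for the spelling-correction and language-identification applications listed above (a generic technique, not a specific algorithm from this paper):

```python
def ngrams(text, n=3):
    """The set of adjacent character n-grams of a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def dice_similarity(a, b, n=3):
    """Dice coefficient over n-gram sets: 2|A∩B| / (|A|+|B|).
    High overlap suggests a likely spelling correction or, applied to
    whole documents, a matching language profile."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))
```

For instance, "string" and "strong" share only the trigram "str" out of four trigrams each, giving a similarity of 0.25.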

Patent
Hirotaka Shiiyama1
17 Mar 1998
TL;DR: In this paper, an image feature amount extraction unit segments an image into a plurality of blocks and calculates the feature amount of each block, and then adds a label to each block in accordance with the feature amounts acquired for the block and arranges the labels on the basis of a predetermined block sequence to generate a label string.
Abstract: An image feature amount extraction unit segments an image into a plurality of blocks and calculates the feature amount of each block. A feature amount label string generation unit adds a label to each block in accordance with the feature amount acquired for the block and arranges the labels on the basis of a predetermined block sequence to generate a label string. In retrieval, the similarity between the label string of a specified image and that of an image to be compared is calculated, and an image whose similarity exceeds a predetermined value is output as a retrieval result. In this way, similar image retrieval can be performed in consideration of the arrangement of feature amounts of images, and simultaneously, similar image retrieval can be performed while absorbing any difference due to a variation in photographing condition or the like.

Book ChapterDOI
28 Mar 1998
TL;DR: A new technique is presented that allows the application of the well known and established interprocedural analysis theory to loops and is implemented in the Program Analyzer Generator PAG, which is used to demonstrate the findings by applying the techniques to several real world programs.
Abstract: Programs spend most of their time in loops and procedures. Therefore, most program transformations and the necessary static analyses deal with these. It has long been recognized that different execution contexts for procedures may induce different execution properties. There are well-established techniques for interprocedural analysis, like the call string approach. Loops have not received similar attention in the areas of data flow analysis and abstract interpretation: all executions are treated in the same way, although typically the first and later executions may exhibit very different properties. In this paper a new technique is presented that allows the application of the well-known and established interprocedural analysis theory to loops. It turns out that the call string approach has limited flexibility in its possibilities to group several calling contexts together for the analysis. An extension to overcome this problem is presented that relies on a similar approach but gives more useful results in practice. The classical and the new techniques are implemented in our Program Analyzer Generator PAG, which is used to demonstrate our findings by applying the techniques to several real-world programs.

Patent
Tomohiro Miyahira1, Eiichi Tazoe1
31 Aug 1998
TL;DR: In this article, an n-gram statistical analysis is employed to acquire frequently appearing character strings of n characters or more, and individual character strings having n characters is replaced by character translation codes of 1 byte each.
Abstract: A n-gram statistical analysis is employed to acquire frequently appearing character strings of n characters or more, and individual character strings having n characters or more are replaced by character translation codes of 1 byte each. The correlation between the original character strings having n characters and the character translation codes is registered in a character translation code table. Assume that a character string of three characters, i.e., a character string of three bytes, “sta,” is registered as 1-byte code “e5” and that a character string of four characters, i.e., a character string of four bytes, “tion,” is registered as 1-byte code “f1.” Then, the word “station,” which consists of a character string of seven characters, i.e., seven bytes, is represented by the 2-byte code “e5 f1,” so that this contributes to a compression of five bytes.
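The replacement step can be sketched with the table from the example above ("sta" → e5, "tion" → f1); the greedy longest-match strategy is an assumption for illustration, not necessarily the patent's matching rule:

```python
# Translation-code table from the example above.
CODE_TABLE = {"sta": b"\xe5", "tion": b"\xf1"}

def compress(text, table=None):
    """Replace registered character strings by their 1-byte translation
    codes, preferring longer matches; unregistered characters pass
    through as single bytes."""
    table = CODE_TABLE if table is None else table
    keys = sorted(table, key=len, reverse=True)  # longest match first
    out = bytearray()
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out += table[key]
                i += len(key)
                break
        else:
            out += text[i].encode("latin-1")  # literal single byte
            i += 1
    return bytes(out)
```

As in the abstract, `compress("station")` produces the 2-byte sequence e5 f1, compressing seven bytes down to two.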

Proceedings ArticleDOI
08 Nov 1998
TL;DR: A new algorithm for suffix tree construction is presented in which almost all disk accesses are choreographed to go via the sort and scan primitives; it is the first optimal algorithm known for either the single-disk or multiple-disk external memory model.
Abstract: The suffix tree of a string is the fundamental data structure of string processing. Recent focus on massive data sets has sparked interest in overcoming the memory bottlenecks of known algorithms for building suffix trees. Our main contribution is a new algorithm for suffix tree construction in which we choreograph almost all disk accesses to be via the sort and scan primitives. This algorithm achieves optimal results in a variety of sequential and parallel computational models. Two of our results are: In the traditional external memory model, in which only the number of disk accesses is counted, we achieve an optimal algorithm, both for single and multiple disk cases. This is the first optimal algorithm known for either model. Traditional disk page access counting does not differentiate between random page accesses and block transfers involving several consecutive pages. This difference is routinely exploited by expert programmers to get fast algorithms on real machines. We adopt a simple accounting scheme and show that our algorithm achieves the same optimal tradeoff for block versus random page accesses as the one we establish for sorting.
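As a toy in-memory analogue of routing suffix-structure construction through the sort primitive, one can build the related suffix array with nothing but a comparison sort. This is not the paper's algorithm (it is O(n² log n) and ignores disk access entirely); it only illustrates what "construction via sorting" means:

```python
def suffix_array(s):
    """Indices of all suffixes of s in lexicographic order, obtained by
    handing the suffixes to a single sort. The paper's contribution is
    doing the analogous work for suffix trees with an optimal number of
    external-memory sort and scan passes."""
    return sorted(range(len(s)), key=lambda i: s[i:])
```

For "banana" the sorted suffixes are "a", "ana", "anana", "banana", "na", "nana", giving the index order [5, 3, 1, 0, 4, 2].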

Proceedings ArticleDOI
TL;DR: In this paper, it was shown that N/2+sqrt(N) calls to the oracle are sufficient to guess the whole content of the binary oracle (being an N bit string) with probability greater than 95%.
Abstract: Consider a quantum computer in combination with a binary oracle of domain size N. It is shown how N/2+sqrt(N) calls to the oracle are sufficient to guess the whole content of the oracle (being an N-bit string) with probability greater than 95%. This contrasts with the power of classical computers, which would require N calls to achieve the same task. From this result it follows that any function with the N bits of the oracle as input can be calculated using N/2+sqrt(N) queries if we allow a small probability of error. It is also shown that this error probability can be made arbitrarily small by using N/2+O(sqrt(N)) oracle queries. The second part of the article considers 'approximate interrogation': the case when only a certain fraction of the N oracle bits is requested. In this scenario, too, the quantum algorithm outperforms the classical protocols. An example is given where a quantum procedure with N/10 queries returns a string of which 80% of the bits are correct. Any classical protocol would need 6N/10 queries to establish such a correctness ratio.

Journal ArticleDOI
TL;DR: The qualitative dynamics of a catalytic self-organizing system of binary strings that is inspired by the chemical information processing metaphor is examined, and every variation is performed by the objects themselves in their machine form.
Abstract: We examine the qualitative dynamics of a catalytic self-organizing system of binary strings that is inspired by the chemical information processing metaphor. A string is interpreted in two different ways: either (a) as raw data or (b) as a machine that is able to process another string as data in order to produce a third one. This article focuses on the phenomena of evolution whose appearance is notable because no explicit mutation, recombination, or artificial selection operators are introduced. We call the system self-evolving because every variation is performed by the objects themselves in their machine form.

Journal ArticleDOI
Olivier Danvy1
01 Nov 1998
TL;DR: It is shown how changing the representation of the control string makes it possible to program printf in ML (which does not allow dependent types); the result is well typed and perceptibly more efficient than the corresponding library functions in Standard ML of New Jersey and in Caml.
Abstract: A string-formatting function such as printf in C seemingly requires dependent types, because its control string determines the rest of its arguments. We show how changing the representation of the control string makes it possible to program printf in ML (which does not allow dependent types). The result is well typed and perceptibly more efficient than the corresponding library functions in Standard ML of New Jersey and in Caml.
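The change of representation can be mimicked even in an untyped language: each format directive becomes a function, and composing directives fixes the arity of the resulting formatter, so no "%d"-style control string is parsed at run time. The combinator names below are ours, and Python of course loses the static typing that is the point of the ML version; this only sketches the representation shift:

```python
def sprintf(fmt):
    """Run a combinator-built format; the result is curried over exactly
    the arguments the directives demand."""
    return fmt(lambda acc: acc)("")

def lit(s):
    """A literal piece of text; consumes no argument."""
    return lambda k: lambda acc: k(acc + s)

def num():
    """A %d-style directive; consumes one number argument."""
    return lambda k: lambda acc: lambda n: k(acc + str(n))

def seq(d1, d2):
    """Sequential composition of two directives."""
    return lambda k: d1(d2(k))
```

For example, `sprintf(seq(lit("x = "), num()))` is a one-argument function, while `sprintf(seq(num(), seq(lit(" + "), num())))` expects two arguments: the directive structure, not a parsed string, determines the arity.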

Patent
27 May 1998
TL;DR: In this paper, a device and method are presented for converting product-specific identification numbers associated with bar code indicia on pharmaceutical products to an industry standard identification number; the device can include a removable member for interchanging and updating bar code indicia information rather than reprogramming the device.
Abstract: A device and method is provided for converting product-specific identification numbers associated with bar code indicia on pharmaceutical products to an industry standard identification number. The process involves reading a bar code indicia, converting the indicia into an input string and standardizing the input string by means of adding or subtracting characters in accordance with rules based on the bar code type and length of the input string. By means of the invention pharmaceutical products of two different sources may be compared to determine if they contain the same drug as determined by the standard identification number. The device can include a removable member for interchanging and updating bar code indicia information rather than reprogramming the device.
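The standardize-then-compare flow can be sketched as below. The two rules shown (stripping UPC-A framing digits, zero-padding to 11 digits) are illustrative placeholders; the patent's actual rules depend on the specific bar code types and lengths involved:

```python
def standardize(symbology, digits):
    """Normalize a scanned identification number to an 11-digit standard
    form by adding or removing characters based on bar code type and
    length (illustrative rules, not the patent's tables)."""
    if symbology == "UPC-A" and len(digits) == 12:
        digits = digits[1:-1]      # drop number-system digit and check digit
    if len(digits) < 11:
        digits = digits.zfill(11)  # left-pad with zeros to standard length
    return digits

def same_drug(scan_a, scan_b):
    """Two products match if their standardized identifiers agree,
    even when scanned from different bar code types."""
    return standardize(*scan_a) == standardize(*scan_b)
```

This captures the stated use case: products from two different sources compare equal exactly when their standardized identification numbers coincide.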

Patent
Alan M. Rooke1
07 Dec 1998
TL;DR: In this paper, a microcontroller (12) provides the address bit and device data start bits for the first peripheral device (PD) in a string of PDs (20-24 or 30-34 ).
Abstract: A microcontroller (12) provides the address bit and device data start bits for the first peripheral device (PD) in a string of PDs (20-24 or 30-34). Each PD includes a string of flipflops (72-86). If the complete address matches, then the first PD (20) shifts out the device data start bits for the next PD (22) in that string. The address bit for the next PD is provided by the pull-up or pull-down device (52) hard-wired on the SI of the next PD. The device data start bits for the next PD are provided by the right-most flipflops of the previous PD. If the address bit and device data start bits match the contents of the device registers of the next PD, then that next PD enables SYSCLK to shift out the contents of its flipflops.

Journal ArticleDOI
TL;DR: An off-line system for the recognition of handwritten numeral strings, based on a cascade of two recognition methods, is presented; it compares favorably to other published methods.

Patent
26 Jun 1998
TL;DR: In this paper, the alignment probabilities between various source word and target word pairs are ascertained by evaluating incrementally statistical translation performance of various target word strings, deciding on an optimum target word string, and outputting the latter.
Abstract: For translating a word-organized source text into a word-organized target text through mapping of source words on target words, both a translation model and a language model are used. In particular, alignment probabilities are ascertained between various source word and target word pairs, while preemptively assuming that the alignment between such word pairs is monotone through at least substantial substrings of a particular sentence. This is done by incrementally evaluating the statistical translation performance of various target word strings, deciding on an optimum target word string, and outputting the latter.

Book ChapterDOI
24 Aug 1998
TL;DR: This paper considers one algorithmic problem from each of the areas of information retrieval and data compression, and presents highly efficient (linear or near-linear time) algorithms for both problems, relying on augmenting the suffix tree, a fundamental data structure in string algorithmics.
Abstract: Information retrieval and data compression are the two main application areas where the rich theory of string algorithmics plays a fundamental role. In this paper, we consider one algorithmic problem from each of these areas and present highly efficient (linear or near linear time) algorithms for both problems. Our algorithms rely on augmenting the suffix tree, a fundamental data structure in string algorithmics. The augmentations are nontrivial and they form the technical crux of this paper. In particular, they consist of adding extra edges to suffix trees, resulting in Directed Acyclic Graphs (DAGs). Our algorithms construct these "suffix DAGs" and manipulate them to solve the two problems efficiently.

Patent
27 Mar 1998
TL;DR: The Self Implementing Modules (SIMs) as mentioned in this paper are parametric modules that implement themselves at the time the design is elaborated, targeting a specified FPGA according to specified parameters that may, for example, include the required timing, data width, number of taps for a FIR filter and so forth.
Abstract: The invention provides parametric modules called Self Implementing Modules (SIMs) for use in programmable logic devices such as FPGAs. The invention further provides tools and methods for generating and using SIMs. SIMs implement themselves at the time the design is elaborated, targeting a specified FPGA according to specified parameters that may, for example, include the required timing, data width, number of taps for a FIR filter, and so forth. In one embodiment, the SIM parameters may be symbolic expressions, which may comprise strings or string expressions, logical (Boolean) expressions, or a combination of these data types. The variables in these expressions are either parameters of the SIM or parameters of the “parent” of the SIM. Parametric expressions are parsed and evaluated at the time the SIM is elaborated; i.e., at run-time, usually when the design is mapped, placed, and routed in a specific FPGA. The use of parametric expressions interpreted at elaboration time allows dynamic inheritance and synthesis of actual parameter values, rather than the static value inheritance commonly found in programming languages such as C++ and Java.

Proceedings Article
01 Jul 1998
TL;DR: A model for strings of characters that is loosely based on the Lempel Ziv model with the addition that a repeated substring can be an approximate match to the original substring is described; this is close to the situation of DNA, for example.
Abstract: We describe a model for strings of characters that is loosely based on the Lempel Ziv model with the addition that a repeated substring can be an approximate match to the original substring; this is close to the situation of DNA, for example. Typically there are many explanations for a given string under the model, some optimal and many suboptimal. Rather than commit to one optimal explanation, we sum the probabilities over all explanations under the model because this gives the probability of the data under the model. The model has a small number of parameters and these can be estimated from the given string by an expectationmaximization (EM) algorithm. Each iteration of the EM algorithm takes O(n 2) time and a few iterations are typically sufficient. O(n 2) complexity is impractical for strings of more than a few tens of thousands of characters and a faster approximation algorithm is also given. The model is further extended to include approximate reverse complementary repeats when analyzing DNA strings. Tests include the recovery of parameter estimates from known sources and applications to real DNA strings.