Topic

String (computer science)

About: String (computer science) is a research topic. Over its lifetime, 19,430 publications have been published on this topic, receiving 333,247 citations. The topic is also known as: str & s.


Papers
Proceedings Article
18 May 1993
TL;DR: The question of how complex a leaf language must be in order to characterize some given class C is investigated, which leads to the examination of the closure of different language classes under bit-reducibility.
Abstract: For a nondeterministic polynomial-time Turing machine M and an input string x, the leaf string of M on x is the 0-1 sequence of leaf values (0 for reject, 1 for accept) of the computation tree of M with input x. The set A is said to be bit-reducible to B if there exists an M as above such that every input x is in A if and only if the leaf string of M on x is in B. A class C is definable via leaf language B if C is the class of all languages that are bit-reducible to B. The question of how complex a leaf language must be in order to characterize some given class C is investigated. This question leads to the examination of the closure of different language classes under bit-reducibility. The question is settled for subclasses of regular languages, context-free languages, and a number of time- and space-bounded classes, resulting in a number of surprising characterizations for PSPACE.
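
The leaf-string definition above can be illustrated with a toy sketch. It assumes a hypothetical machine that guesses a truth assignment on each branch and accepts when a CNF formula is satisfied; the clause encoding (signed integers) and both function names are illustrative and do not come from the paper. The leaf language used here, "strings containing at least one 1", is the standard one that characterizes NP.

```python
from itertools import product

def leaf_string(formula, num_vars):
    """Leaf string of a toy machine: one leaf per assignment, left to right.

    `formula` is a list of clauses; a clause is a list of nonzero ints,
    where k means variable k and -k means its negation (an assumed encoding).
    """
    bits = []
    for assignment in product([False, True], repeat=num_vars):
        satisfied = all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in formula
        )
        bits.append("1" if satisfied else "0")
    return "".join(bits)

def accepts_np_leaf_language(leaves):
    """Leaf language B = strings with at least one accepting leaf."""
    return "1" in leaves

# (x1 or x2) and (not x1): only the branch x1=False, x2=True accepts.
print(leaf_string([[1, 2], [-1]], 2))  # 0100
```

Under these assumptions, a formula is satisfiable exactly when its leaf string lies in B, which is one instance of bit-reducibility to B.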

98 citations

Journal Article
TL;DR: This article introduces the first compressed suffix tree representation that requires only sublinear space on top of the compressed text size, and supports a wide set of navigational operations in almost logarithmic time.
Abstract: Suffix trees are by far the most important data structure in stringology, with a myriad of applications in fields like bioinformatics and information retrieval. Classical representations of suffix trees require Θ(n log n) bits of space, for a string of size n. This is considerably more than the n log₂ σ bits needed for the string itself, where σ is the alphabet size. The size of suffix trees has been a barrier to their wider adoption in practice. Recent compressed suffix tree representations require just the space of the compressed string plus Θ(n) extra bits. This is already spectacular, but the linear extra bits are still unsatisfactory when σ is small as in DNA sequences. In this article, we introduce the first compressed suffix tree representation that breaks this Θ(n)-bit space barrier. The Fully Compressed Suffix Tree (FCST) representation requires only sublinear space on top of the compressed text size, and supports a wide set of navigational operations in almost logarithmic time. This includes extracting arbitrary text substrings, so the FCST replaces the text using almost the same space as the compressed text. An essential ingredient of FCSTs is the lowest common ancestor (LCA) operation. We reveal important connections between LCAs and suffix tree navigation. We also describe how to make FCSTs dynamic, that is, support updates to the text. The dynamic FCST also supports several operations. In particular, it can build the static FCST within optimal space and polylogarithmic time per symbol. Our theoretical results are also validated experimentally, showing that FCSTs are very effective in practice as well.
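
The space comparison and the role of LCAs can be made concrete with a small sketch. This is not the FCST: it uses a naive, uncompressed suffix array, and all function names are hypothetical. It relies only on two standard facts: the text needs n log₂ σ bits while n pointers of log₂ n bits each need n log₂ n bits, and the string depth of the LCA of two suffix-tree leaves equals the length of the longest common prefix of the two suffixes.

```python
import math

def text_bits(n, sigma):
    """Bits for the plain text: n * log2(sigma)."""
    return n * math.log2(sigma)

def pointer_bits(n):
    """Bits for n pointers of log2(n) bits each, the Theta(n log n) term."""
    return n * math.log2(n)

def suffix_array(text):
    """Naive suffix array: start positions sorted by suffix (quadratic or worse)."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def lcp(text, i, j):
    """Length of the longest common prefix of the suffixes starting at i and j.

    In a suffix tree this is the string depth of the LCA of the two leaves.
    """
    n = 0
    while i + n < len(text) and j + n < len(text) and text[i + n] == text[j + n]:
        n += 1
    return n

print(text_bits(1024, 4), pointer_bits(1024))  # 2048.0 10240.0
print(suffix_array("banana$"))                 # [6, 5, 3, 1, 0, 4, 2]
print(lcp("banana$", 1, 3))                    # 3
```

For a DNA-like alphabet (σ = 4) the pointers alone cost five times the text at n = 1024, and the ratio grows with n, which is the gap the abstract describes.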

98 citations

Patent
25 Aug 2005
TL;DR: A method, apparatus, and system are provided for performing real-time or near-real-time localization of data; the method comprises monitoring an input string and comparing a semantic associated with the input string to a semantic associated with at least one stored string.
Abstract: A method, apparatus, and system are provided for performing real-time or near-real-time localization of data. The method comprises monitoring an input string and comparing a semantic associated with the input string to a semantic associated with at least one stored string. The method further comprises providing the stored string as an alternative to the input string.

97 citations

Proceedings Article
15 Jun 2009
TL;DR: This work presents a decision procedure that solves systems of equations over regular language variables: given such a system of constraints, the algorithm finds satisfying assignments for the variables in the system.
Abstract: Reasoning about string variables, in particular program inputs, is an important aspect of many program analyses and testing frameworks. Program inputs invariably arrive as strings, and are often manipulated using high-level string operations such as equality checks, regular expression matching, and string concatenation. It is difficult to reason about these operations because they are not well-integrated into current constraint solvers. We present a decision procedure that solves systems of equations over regular language variables. Given such a system of constraints, our algorithm finds satisfying assignments for the variables in the system. We define this problem formally and render a mechanized correctness proof of the core of the algorithm. We evaluate its scalability and practical utility by applying it to the problem of automatically finding inputs that cause SQL injection vulnerabilities.
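
As a rough illustration of what solving a constraint over a regular language variable can look like, the sketch below handles only the simplest case: one variable that must lie in two regular languages at once. It uses the textbook product construction with breadth-first search, not the paper's algorithm, and the DFA dictionary layout and the name `find_witness` are assumptions made for this example.

```python
from collections import deque

def find_witness(dfa_a, dfa_b, alphabet):
    """Shortest string accepted by both DFAs, or None if there is none.

    A DFA is a dict with keys "start", "accept" (a set) and "delta"
    (a dict mapping (state, char) to state); missing entries reject.
    """
    start = (dfa_a["start"], dfa_b["start"])
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (qa, qb), word = queue.popleft()
        if qa in dfa_a["accept"] and qb in dfa_b["accept"]:
            return word
        for ch in alphabet:
            na = dfa_a["delta"].get((qa, ch))
            nb = dfa_b["delta"].get((qb, ch))
            if na is None or nb is None:
                continue  # one automaton has no move: dead branch
            if (na, nb) not in seen:
                seen.add((na, nb))
                queue.append(((na, nb), word + ch))
    return None

# Constraint 1: the input contains a quote character.
contains_quote = {
    "start": 0, "accept": {1},
    "delta": {(0, "a"): 0, (0, "'"): 1, (1, "a"): 1, (1, "'"): 1},
}
# Constraint 2: the input starts with the letter a.
starts_with_a = {
    "start": 0, "accept": {1},
    "delta": {(0, "a"): 1, (1, "a"): 1, (1, "'"): 1},
}
print(find_witness(contains_quote, starts_with_a, "a'"))  # a'
```

A returned string is a satisfying assignment for the variable; `None` means the two constraints are jointly unsatisfiable. The quote character is chosen only to echo the SQL injection setting mentioned in the abstract.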

97 citations

Posted Content
TL;DR: These results lead to algorithms for assertion checking and for checking functional equivalence of two programs, written possibly in different programming styles, for commonly used routines such as insert, delete, and reverse.
Abstract: We introduce streaming data string transducers that map input data strings to output data strings in a single left-to-right pass in linear time. Data strings are (unbounded) sequences of data values, tagged with symbols from a finite set, over a potentially infinite data domain that supports only the operations of equality and ordering. The transducer uses a finite set of states, a finite set of variables ranging over the data domain, and a finite set of variables ranging over data strings. At every step, it can make decisions based on the next input symbol, updating its state, remembering the input data value in its data variables, and updating data-string variables by concatenating data-string variables and new symbols formed from data variables, while avoiding duplication. We establish that the problems of checking functional equivalence of two streaming transducers, and of checking whether a streaming transducer satisfies pre/post verification conditions specified by streaming acceptors over input/output data-strings, are in PSPACE. We identify a class of imperative and a class of functional programs, manipulating lists of data items, which can be effectively translated to streaming data-string transducers. The imperative programs dynamically modify a singly-linked heap by changing next-pointers of heap-nodes and by adding new nodes. The main restriction specifies how the next-pointers can be used for traversal. We also identify an expressively equivalent fragment of functional programs that traverse a list using syntactically restricted recursive calls. Our results lead to algorithms for assertion checking and for checking functional equivalence of two programs, written possibly in different programming styles, for commonly used routines such as insert, delete, and reverse.
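
The single-pass model described above can be sketched for two of the routines the abstract names, reverse and delete. The sketch assumes a data string is a Python list of (tag, value) pairs; the function names and the convention that the first element carries the key to delete are hypothetical. Each update uses every data-string variable at most once on its right-hand side, which mimics the no-duplication restriction, although Python itself copies lists on concatenation.

```python
def reverse_pass(data_string):
    """Reverse a data string in one left-to-right pass.

    State: a single control state; one data-string variable `out`.
    """
    out = []
    for tag, value in data_string:
        out = [(tag, value)] + out  # prepend the new symbol; `out` used once
    return out

def delete_key_pass(data_string):
    """Drop every element equal to the key carried by the first element.

    Finite control: "read_key" then "copy". One data variable `key`,
    compared only by equality; one data-string variable `out`.
    """
    state = "read_key"
    key = None
    out = []
    for tag, value in data_string:
        if state == "read_key":
            key = value          # remember the input data value
            state = "copy"
        elif value != key:
            out = out + [(tag, value)]  # append; `out` used once
    return out

print(reverse_pass([("a", 3), ("b", 1), ("a", 2)]))
# [('a', 2), ('b', 1), ('a', 3)]
print(delete_key_pass([("k", 2), ("a", 1), ("a", 2), ("a", 3)]))
# [('a', 1), ('a', 3)]
```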

97 citations


Network Information
Related Topics (5)
Time complexity
36K papers, 879.5K citations
88% related
Tree (data structure)
44.9K papers, 749.6K citations
86% related
Graph (abstract data type)
69.9K papers, 1.2M citations
85% related
Computational complexity theory
30.8K papers, 711.2K citations
82% related
Supervised learning
20.8K papers, 710.5K citations
80% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    2
2021    491
2020    704
2019    759
2018    816
2017    806